ListInferenceRecommendationsJobSteps

class SageMaker.Paginator.ListInferenceRecommendationsJobSteps
paginator = client.get_paginator('list_inference_recommendations_job_steps')
paginate(**kwargs)

Creates an iterator that will paginate through responses from SageMaker.Client.list_inference_recommendations_job_steps().

See also: AWS API Documentation

Request Syntax

response_iterator = paginator.paginate(
    JobName='string',
    Status='PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED',
    StepType='BENCHMARK',
    PaginationConfig={
        'MaxItems': 123,
        'PageSize': 123,
        'StartingToken': 'string'
    }
)
Parameters:
  • JobName (string) –

    [REQUIRED]

    The name for the Inference Recommender job.

  • Status (string) – A filter to return benchmarks of a specified status. If this field is left empty, then all benchmarks are returned.

  • StepType (string) –

    A filter to return details about the specified type of subtask.

    BENCHMARK: Evaluate the performance of your model on different instance types.

  • PaginationConfig (dict) –

    A dictionary that provides parameters to control pagination.

    • MaxItems (integer) –

      The total number of items to return. If the total number of items available is more than the value specified in MaxItems, then a NextToken will be provided in the output that you can use to resume pagination.

    • PageSize (integer) –

      The size of each page.

    • StartingToken (string) –

      A token to specify where to start paginating. This is the NextToken from a previous response.
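The three PaginationConfig fields interact as shown in the toy sketch below. This is not botocore's actual implementation; it uses a plain list index as the pagination token purely to illustrate the semantics of MaxItems, PageSize, and StartingToken:

```python
def paginate(items, page_size=None, max_items=None, starting_token=None):
    """Toy illustration of PaginationConfig semantics (not botocore's code).

    `items` stands in for the service's full result set; the "token" here is
    simply a stringified index into that list.
    """
    start = int(starting_token) if starting_token is not None else 0
    # MaxItems caps the total yielded across all pages, not per page.
    end = len(items) if max_items is None else min(len(items), start + max_items)
    page_size = page_size or (end - start)
    pos = start
    while pos < end:
        page = items[pos:min(pos + page_size, end)]
        pos += len(page)
        # A NextToken is surfaced whenever items remain beyond what was
        # yielded, so a later call can resume via StartingToken.
        next_token = str(pos) if pos < len(items) else None
        yield page, next_token
```

For example, with seven items, PageSize=3, and MaxItems=5, the sketch yields pages of 3 and 2 items plus a token for resuming at the sixth item.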

Return type:

dict

Returns:

Response Syntax

{
    'Steps': [
        {
            'StepType': 'BENCHMARK',
            'JobName': 'string',
            'Status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED',
            'InferenceBenchmark': {
                'Metrics': {
                    'CostPerHour': ...,
                    'CostPerInference': ...,
                    'MaxInvocations': 123,
                    'ModelLatency': 123,
                    'CpuUtilization': ...,
                    'MemoryUtilization': ...,
                    'ModelSetupTime': 123
                },
                'EndpointConfiguration': {
                    'EndpointName': 'string',
                    'VariantName': 'string',
                    'InstanceType': 'ml.t2.medium'|'ml.t2.large'|'ml.t2.xlarge'|'ml.t2.2xlarge'|'ml.m4.xlarge'|'ml.m4.2xlarge'|'ml.m4.4xlarge'|'ml.m4.10xlarge'|'ml.m4.16xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.12xlarge'|'ml.m5.24xlarge'|'ml.m5d.large'|'ml.m5d.xlarge'|'ml.m5d.2xlarge'|'ml.m5d.4xlarge'|'ml.m5d.12xlarge'|'ml.m5d.24xlarge'|'ml.c4.large'|'ml.c4.xlarge'|'ml.c4.2xlarge'|'ml.c4.4xlarge'|'ml.c4.8xlarge'|'ml.p2.xlarge'|'ml.p2.8xlarge'|'ml.p2.16xlarge'|'ml.p3.2xlarge'|'ml.p3.8xlarge'|'ml.p3.16xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.18xlarge'|'ml.c5d.large'|'ml.c5d.xlarge'|'ml.c5d.2xlarge'|'ml.c5d.4xlarge'|'ml.c5d.9xlarge'|'ml.c5d.18xlarge'|'ml.g4dn.xlarge'|'ml.g4dn.2xlarge'|'ml.g4dn.4xlarge'|'ml.g4dn.8xlarge'|'ml.g4dn.12xlarge'|'ml.g4dn.16xlarge'|'ml.r5.large'|'ml.r5.xlarge'|'ml.r5.2xlarge'|'ml.r5.4xlarge'|'ml.r5.12xlarge'|'ml.r5.24xlarge'|'ml.r5d.large'|'ml.r5d.xlarge'|'ml.r5d.2xlarge'|'ml.r5d.4xlarge'|'ml.r5d.12xlarge'|'ml.r5d.24xlarge'|'ml.inf1.xlarge'|'ml.inf1.2xlarge'|'ml.inf1.6xlarge'|'ml.inf1.24xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.p4d.24xlarge'|'ml.c7g.large'|'ml.c7g.xlarge'|'ml.c7g.2xlarge'|'ml.c7g.4xlarge'|'ml.c7g.8xlarge'|'ml.c7g.12xlarge'|'ml.c7g.16xlarge'|'ml.m6g.large'|'ml.m6g.xlarge'|'ml.m6g.2xlarge'|'ml.m6g.4xlarge'|'ml.m6g.8xlarge'|'ml.m6g.12xlarge'|'ml.m6g.16xlarge'|'ml.m6gd.large'|'ml.m6gd.xlarge'|'ml.m6gd.2xlarge'|'ml.m6gd.4xlarge'|'ml.m6gd.8xlarge'|'ml.m6gd.12xlarge'|'ml.m6gd.16xlarge'|'ml.c6g.large'|'ml.c6g.xlarge'|'ml.c6g.2xlarge'|'ml.c6g.4xlarge'|'ml.c6g.8xlarge'|'ml.c6g.12xlarge'|'ml.c6g.16xlarge'|'ml.c6gd.large'|'ml.c6gd.xlarge'|'ml.c6gd.2xlarge'|'ml.c6gd.4xlarge'|'ml.c6gd.8xlarge'|'ml.c6gd.12xlarge'|'ml.c6gd.16xlarge'|'ml.c6gn.large'|'ml.c6gn.xlarge'|'ml.c6gn.2xlarge'|'ml.c6gn.4xlarge'|'ml.c6gn.8xlarge'|'ml.c6gn.12xlarge'|'ml.c6gn.16xlarge'|'ml.r6g.large'|'ml.r6g.xlarge'|'ml.r6g.2xlarge'|'ml.r6g.4xlarge'|'ml.r6g.8xlarge'|'ml.r6g.12xlarge'|'ml.r6g.16xlarge'|'ml.r6gd.large'|'ml.r6gd.xlarge'|'ml.r6gd.2xlarge'|'ml.r6gd.4xlarge'|'ml.r6gd.8xlarge'|'ml.r6gd.12xlarge'|'ml.r6gd.16xlarge'|'ml.p4de.24xlarge'|'ml.trn1.2xlarge'|'ml.trn1.32xlarge'|'ml.inf2.xlarge'|'ml.inf2.8xlarge'|'ml.inf2.24xlarge'|'ml.inf2.48xlarge',
                    'InitialInstanceCount': 123,
                    'ServerlessConfig': {
                        'MemorySizeInMB': 123,
                        'MaxConcurrency': 123,
                        'ProvisionedConcurrency': 123
                    }
                },
                'ModelConfiguration': {
                    'InferenceSpecificationName': 'string',
                    'EnvironmentParameters': [
                        {
                            'Key': 'string',
                            'ValueType': 'string',
                            'Value': 'string'
                        },
                    ],
                    'CompilationJobName': 'string'
                },
                'FailureReason': 'string',
                'EndpointMetrics': {
                    'MaxInvocations': 123,
                    'ModelLatency': 123
                },
                'InvocationEndTime': datetime(2015, 1, 1),
                'InvocationStartTime': datetime(2015, 1, 1)
            }
        },
    ]
}
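Given a page shaped like the response above, the nested fields are plain dictionary lookups. The values in this sketch are illustrative, not real benchmark output:

```python
# A trimmed, illustrative page in the shape documented above.
page = {
    "Steps": [
        {
            "StepType": "BENCHMARK",
            "JobName": "my-recommendation-job",  # placeholder name
            "Status": "COMPLETED",
            "InferenceBenchmark": {
                "Metrics": {
                    "CostPerHour": 0.25,
                    "CostPerInference": 0.0000125,
                    "MaxInvocations": 400,
                    "ModelLatency": 85,
                },
                "EndpointConfiguration": {
                    "InstanceType": "ml.c5.large",
                    "InitialInstanceCount": 1,
                },
            },
        },
        {
            "StepType": "BENCHMARK",
            "JobName": "my-recommendation-job",
            "Status": "FAILED",
            "InferenceBenchmark": {"FailureReason": "Endpoint creation failed."},
        },
    ]
}

def summarize(page):
    """Map each completed benchmark's instance type to its cost per hour."""
    costs = {}
    for step in page["Steps"]:
        if step["Status"] != "COMPLETED":
            # Failed benchmarks carry FailureReason instead of full metrics.
            continue
        bench = step["InferenceBenchmark"]
        instance = bench["EndpointConfiguration"]["InstanceType"]
        costs[instance] = bench["Metrics"]["CostPerHour"]
    return costs
```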

Response Structure

  • (dict) –

    • Steps (list) –

      A list of all subtask details in Inference Recommender.

      • (dict) –

        A returned array object for the Steps response field in the ListInferenceRecommendationsJobSteps API command.

        • StepType (string) –

          The type of the subtask.

          BENCHMARK: Evaluate the performance of your model on different instance types.

        • JobName (string) –

          The name of the Inference Recommender job.

        • Status (string) –

          The current status of the benchmark.

        • InferenceBenchmark (dict) –

          The details for a specific benchmark.

          • Metrics (dict) –

            The metrics of recommendations.

            • CostPerHour (float) –

              Defines the cost per hour for the instance.

            • CostPerInference (float) –

              Defines the cost per inference for the instance.

            • MaxInvocations (integer) –

              The expected maximum number of requests per minute for the instance.

            • ModelLatency (integer) –

              The expected model latency at maximum invocation per minute for the instance.

            • CpuUtilization (float) –

              The expected CPU utilization at maximum invocations per minute for the instance.

              NaN indicates that the value is not available.

            • MemoryUtilization (float) –

              The expected memory utilization at maximum invocations per minute for the instance.

              NaN indicates that the value is not available.

            • ModelSetupTime (integer) –

              The time it takes to launch new compute resources for a serverless endpoint. The time can vary depending on the model size, how long it takes to download the model, and the start-up time of the container.

              NaN indicates that the value is not available.

          • EndpointConfiguration (dict) –

            The endpoint configuration made by Inference Recommender during a recommendation job.

            • EndpointName (string) –

              The name of the endpoint made during a recommendation job.

            • VariantName (string) –

              The name of the production variant (deployed model) made during a recommendation job.

            • InstanceType (string) –

              The instance type recommended by Amazon SageMaker Inference Recommender.

            • InitialInstanceCount (integer) –

              The number of instances recommended to launch initially.

            • ServerlessConfig (dict) –

              Specifies the serverless configuration for an endpoint variant.

              • MemorySizeInMB (integer) –

                The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.

              • MaxConcurrency (integer) –

                The maximum number of concurrent invocations your serverless endpoint can process.

              • ProvisionedConcurrency (integer) –

                The amount of provisioned concurrency to allocate for the serverless endpoint. Should be less than or equal to MaxConcurrency.

                Note

                This field is not supported for serverless endpoint recommendations for Inference Recommender jobs. For more information about creating an Inference Recommender job, see CreateInferenceRecommendationsJob.

          • ModelConfiguration (dict) –

            Defines the model configuration. Includes the specification name and environment parameters.

            • InferenceSpecificationName (string) –

              The inference specification name in the model package version.

            • EnvironmentParameters (list) –

              Defines the environment parameters, which include keys, value types, and values.

              • (dict) –

                An environment parameter suggested by the Amazon SageMaker Inference Recommender.

                • Key (string) –

                  The environment key suggested by the Amazon SageMaker Inference Recommender.

                • ValueType (string) –

                  The value type suggested by the Amazon SageMaker Inference Recommender.

                • Value (string) –

                  The value suggested by the Amazon SageMaker Inference Recommender.

            • CompilationJobName (string) –

              The name of the compilation job used to create the recommended model artifacts.

          • FailureReason (string) –

            The reason why a benchmark failed.

          • EndpointMetrics (dict) –

            The metrics for an existing endpoint compared in an Inference Recommender job.

            • MaxInvocations (integer) –

              The expected maximum number of requests per minute for the instance.

            • ModelLatency (integer) –

              The expected model latency at maximum invocations per minute for the instance.

          • InvocationEndTime (datetime) –

            A timestamp that shows when the benchmark completed.

          • InvocationStartTime (datetime) –

            A timestamp that shows when the benchmark started.