ListInferenceRecommendationsJobSteps

class SageMaker.Paginator.ListInferenceRecommendationsJobSteps
paginator = client.get_paginator('list_inference_recommendations_job_steps')
paginate(**kwargs)

Creates an iterator that will paginate through responses from SageMaker.Client.list_inference_recommendations_job_steps().

See also: AWS API Documentation

Request Syntax

response_iterator = paginator.paginate(
    JobName='string',
    Status='PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED',
    StepType='BENCHMARK',
    PaginationConfig={
        'MaxItems': 123,
        'PageSize': 123,
        'StartingToken': 'string'
    }
)
Parameters
  • JobName (string) --

    [REQUIRED]

    The name for the Inference Recommender job.

  • Status (string) -- A filter to return benchmarks of a specified status. If this field is left empty, then all benchmarks are returned.
  • StepType (string) --

    A filter to return details about the specified type of subtask.

    BENCHMARK : Evaluate the performance of your model on different instance types.
  • PaginationConfig (dict) --

    A dictionary that provides parameters to control pagination.

    • MaxItems (integer) --

      The total number of items to return. If the total number of items available is more than the value specified in MaxItems, then a NextToken is provided in the output that you can use to resume pagination.

    • PageSize (integer) --

      The size of each page.

    • StartingToken (string) --

      A token to specify where to start paginating. This is the NextToken from a previous response.
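A short usage sketch may help. The job name ``my-recommender-job`` and the helper names below are illustrative, not part of the API:

```python
def build_paginate_kwargs(job_name, status=None):
    """Assemble keyword arguments for paginate(). Status is optional;
    omitting it returns benchmarks in every state."""
    kwargs = {"JobName": job_name}
    if status is not None:
        kwargs["Status"] = status
    return kwargs


def iter_job_steps(job_name, status=None):
    """Yield every step dict across all response pages of
    list_inference_recommendations_job_steps."""
    # boto3 is imported inside the function so the pure helper above
    # can be exercised without AWS credentials or the SDK installed.
    import boto3

    client = boto3.client("sagemaker")
    paginator = client.get_paginator("list_inference_recommendations_job_steps")
    for page in paginator.paginate(**build_paginate_kwargs(job_name, status)):
        yield from page.get("Steps", [])
```

In practice you would iterate it directly, e.g. ``for step in iter_job_steps("my-recommender-job", status="COMPLETED"): ...``; the paginator issues as many service calls as needed, subject to PaginationConfig.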

Return type

dict

Returns

Response Syntax

{
    'Steps': [
        {
            'StepType': 'BENCHMARK',
            'JobName': 'string',
            'Status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED',
            'InferenceBenchmark': {
                'Metrics': {
                    'CostPerHour': ...,
                    'CostPerInference': ...,
                    'MaxInvocations': 123,
                    'ModelLatency': 123,
                    'CpuUtilization': ...,
                    'MemoryUtilization': ...
                },
                'EndpointConfiguration': {
                    'EndpointName': 'string',
                    'VariantName': 'string',
                    'InstanceType': 'ml.t2.medium'|'ml.t2.large'|'ml.t2.xlarge'|'ml.t2.2xlarge'|'ml.m4.xlarge'|'ml.m4.2xlarge'|'ml.m4.4xlarge'|'ml.m4.10xlarge'|'ml.m4.16xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.12xlarge'|'ml.m5.24xlarge'|'ml.m5d.large'|'ml.m5d.xlarge'|'ml.m5d.2xlarge'|'ml.m5d.4xlarge'|'ml.m5d.12xlarge'|'ml.m5d.24xlarge'|'ml.c4.large'|'ml.c4.xlarge'|'ml.c4.2xlarge'|'ml.c4.4xlarge'|'ml.c4.8xlarge'|'ml.p2.xlarge'|'ml.p2.8xlarge'|'ml.p2.16xlarge'|'ml.p3.2xlarge'|'ml.p3.8xlarge'|'ml.p3.16xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.18xlarge'|'ml.c5d.large'|'ml.c5d.xlarge'|'ml.c5d.2xlarge'|'ml.c5d.4xlarge'|'ml.c5d.9xlarge'|'ml.c5d.18xlarge'|'ml.g4dn.xlarge'|'ml.g4dn.2xlarge'|'ml.g4dn.4xlarge'|'ml.g4dn.8xlarge'|'ml.g4dn.12xlarge'|'ml.g4dn.16xlarge'|'ml.r5.large'|'ml.r5.xlarge'|'ml.r5.2xlarge'|'ml.r5.4xlarge'|'ml.r5.12xlarge'|'ml.r5.24xlarge'|'ml.r5d.large'|'ml.r5d.xlarge'|'ml.r5d.2xlarge'|'ml.r5d.4xlarge'|'ml.r5d.12xlarge'|'ml.r5d.24xlarge'|'ml.inf1.xlarge'|'ml.inf1.2xlarge'|'ml.inf1.6xlarge'|'ml.inf1.24xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.p4d.24xlarge'|'ml.c7g.large'|'ml.c7g.xlarge'|'ml.c7g.2xlarge'|'ml.c7g.4xlarge'|'ml.c7g.8xlarge'|'ml.c7g.12xlarge'|'ml.c7g.16xlarge'|'ml.m6g.large'|'ml.m6g.xlarge'|'ml.m6g.2xlarge'|'ml.m6g.4xlarge'|'ml.m6g.8xlarge'|'ml.m6g.12xlarge'|'ml.m6g.16xlarge'|'ml.m6gd.large'|'ml.m6gd.xlarge'|'ml.m6gd.2xlarge'|'ml.m6gd.4xlarge'|'ml.m6gd.8xlarge'|'ml.m6gd.12xlarge'|'ml.m6gd.16xlarge'|'ml.c6g.large'|'ml.c6g.xlarge'|'ml.c6g.2xlarge'|'ml.c6g.4xlarge'|'ml.c6g.8xlarge'|'ml.c6g.12xlarge'|'ml.c6g.16xlarge'|'ml.c6gd.large'|'ml.c6gd.xlarge'|'ml.c6gd.2xlarge'|'ml.c6gd.4xlarge'|'ml.c6gd.8xlarge'|'ml.c6gd.12xlarge'|'ml.c6gd.16xlarge'|'ml.c6gn.large'|'ml.c6gn.xlarge'|'ml.c6gn.2xlarge'|'ml.c6gn.4xlarge'|'ml.c6gn.8xlarge'|'ml.c6gn.12xlarge'|'ml.c6gn.16xlarge'|'ml.r6g.large'|'ml.r6g.xlarge'|'ml.r6g.2xlarge'|'ml.r6g.4xlarge'|'ml.r6g.8xlarge'|'ml.r6g.12xlarge'|'ml.r6g.16xlarge'|'ml.r6gd.large'|'ml.r6gd.xlarge'|'ml.r6gd.2xlarge'|'ml.r6gd.4xlarge'|'ml.r6gd.8xlarge'|'ml.r6gd.12xlarge'|'ml.r6gd.16xlarge'|'ml.p4de.24xlarge',
                    'InitialInstanceCount': 123
                },
                'ModelConfiguration': {
                    'InferenceSpecificationName': 'string',
                    'EnvironmentParameters': [
                        {
                            'Key': 'string',
                            'ValueType': 'string',
                            'Value': 'string'
                        },
                    ],
                    'CompilationJobName': 'string'
                },
                'FailureReason': 'string',
                'EndpointMetrics': {
                    'MaxInvocations': 123,
                    'ModelLatency': 123
                }
            }
        },
    ],
}

Response Structure

  • (dict) --

    • Steps (list) --

      A list of all subtask details in Inference Recommender.

      • (dict) --

        An element of the Steps array returned by the ListInferenceRecommendationsJobSteps API command.

        • StepType (string) --

          The type of the subtask.

          BENCHMARK : Evaluate the performance of your model on different instance types.

        • JobName (string) --

          The name of the Inference Recommender job.

        • Status (string) --

          The current status of the benchmark.

        • InferenceBenchmark (dict) --

          The details for a specific benchmark.

          • Metrics (dict) --

            The metrics of recommendations.

            • CostPerHour (float) --

              Defines the cost per hour for the instance.

            • CostPerInference (float) --

              Defines the cost per inference for the instance.

            • MaxInvocations (integer) --

              The expected maximum number of requests per minute for the instance.

            • ModelLatency (integer) --

              The expected model latency at maximum invocations per minute for the instance.

            • CpuUtilization (float) --

              The expected CPU utilization at maximum invocations per minute for the instance.

              NaN indicates that the value is not available.

            • MemoryUtilization (float) --

              The expected memory utilization at maximum invocations per minute for the instance.

              NaN indicates that the value is not available.

          • EndpointConfiguration (dict) --

            The endpoint configuration made by Inference Recommender during a recommendation job.

            • EndpointName (string) --

              The name of the endpoint made during a recommendation job.

            • VariantName (string) --

              The name of the production variant (deployed model) made during a recommendation job.

            • InstanceType (string) --

              The instance type recommended by Amazon SageMaker Inference Recommender.

            • InitialInstanceCount (integer) --

              The number of instances recommended to launch initially.

          • ModelConfiguration (dict) --

            Defines the model configuration. Includes the specification name and environment parameters.

            • InferenceSpecificationName (string) --

              The inference specification name in the model package version.

            • EnvironmentParameters (list) --

              Defines the environment parameters, which include keys, value types, and values.

              • (dict) --

                An environment parameter suggested by the Amazon SageMaker Inference Recommender.

                • Key (string) --

                  The environment key suggested by the Amazon SageMaker Inference Recommender.

                • ValueType (string) --

                  The value type suggested by the Amazon SageMaker Inference Recommender.

                • Value (string) --

                  The value suggested by the Amazon SageMaker Inference Recommender.

            • CompilationJobName (string) --

              The name of the compilation job used to create the recommended model artifacts.

          • FailureReason (string) --

            The reason why a benchmark failed.

          • EndpointMetrics (dict) --

            The metrics for an existing endpoint compared in an Inference Recommender job.

            • MaxInvocations (integer) --

              The expected maximum number of requests per minute for the instance.

            • ModelLatency (integer) --

              The expected model latency at maximum invocations per minute for the instance.
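The structure above can be consumed with a small helper. This is a sketch under the response shape documented here: it keeps only completed benchmark steps and maps a NaN CpuUtilization (which the API uses to mark an unavailable value) to ``None``; the function name ``summarize_benchmarks`` is illustrative:

```python
import math


def summarize_benchmarks(pages):
    """Collect (InstanceType, CostPerHour, CpuUtilization) for each
    COMPLETED benchmark step across an iterable of response pages.
    A NaN CpuUtilization means the value was unavailable; it is
    reported as None."""
    rows = []
    for page in pages:
        for step in page.get("Steps", []):
            if step.get("Status") != "COMPLETED":
                continue
            bench = step.get("InferenceBenchmark", {})
            metrics = bench.get("Metrics", {})
            cpu = metrics.get("CpuUtilization")
            if cpu is not None and math.isnan(cpu):
                cpu = None  # NaN indicates the value is not available
            rows.append({
                "InstanceType": bench.get("EndpointConfiguration", {}).get("InstanceType"),
                "CostPerHour": metrics.get("CostPerHour"),
                "CpuUtilization": cpu,
            })
    return rows
```

Because ``paginator.paginate(...)`` yields page dicts, its return value can be passed straight in as ``pages``.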