ListInferenceRecommendationsJobSteps#
- class SageMaker.Paginator.ListInferenceRecommendationsJobSteps#
paginator = client.get_paginator('list_inference_recommendations_job_steps')
- paginate(**kwargs)#
Creates an iterator that will paginate through responses from
SageMaker.Client.list_inference_recommendations_job_steps().
See also: AWS API Documentation
Request Syntax
response_iterator = paginator.paginate(
    JobName='string',
    Status='PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED',
    StepType='BENCHMARK',
    PaginationConfig={
        'MaxItems': 123,
        'PageSize': 123,
        'StartingToken': 'string'
    }
)
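For orientation, a minimal usage sketch is below. The job name `my-recommender-job` is a hypothetical placeholder, and the region and helper name are this sketch's own choices, not part of the API:

```python
def iter_benchmark_steps(job_name, region="us-east-1"):
    """Yield every benchmark step for an Inference Recommender job,
    transparently following pagination across all pages."""
    import boto3  # deferred so the helper can be defined without boto3 installed

    client = boto3.client("sagemaker", region_name=region)
    paginator = client.get_paginator("list_inference_recommendations_job_steps")
    for page in paginator.paginate(JobName=job_name, StepType="BENCHMARK"):
        yield from page["Steps"]

# Example (requires AWS credentials and a completed job):
# for step in iter_benchmark_steps("my-recommender-job"):
#     print(step["Status"])
```

The generator form means pages are fetched lazily, one API call at a time, rather than all at once.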
- Parameters:
JobName (string) –
[REQUIRED]
The name for the Inference Recommender job.
Status (string) – A filter to return benchmarks of a specified status. If this field is left empty, then all benchmarks are returned.
StepType (string) –
A filter to return details about the specified type of subtask.
BENCHMARK: Evaluate the performance of your model on different instance types.
PaginationConfig (dict) –
A dictionary that provides parameters to control pagination.
MaxItems (integer) –
The total number of items to return. If the total number of items available is more than the value specified in max-items, then a NextToken will be provided in the output that you can use to resume pagination.
PageSize (integer) –
The size of each page.
StartingToken (string) –
A token to specify where to start paginating. This is the NextToken from a previous response.
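All three pagination controls are optional, so callers often assemble them conditionally. A small helper (hypothetical, not part of boto3) that builds the `paginate()` keyword arguments and omits any unset keys:

```python
def pagination_kwargs(job_name, max_items=None, page_size=None, starting_token=None):
    """Build the keyword arguments for paginate(), including a
    PaginationConfig dict only when at least one knob is set."""
    config = {}
    if max_items is not None:
        config["MaxItems"] = max_items
    if page_size is not None:
        config["PageSize"] = page_size
    if starting_token is not None:
        config["StartingToken"] = starting_token

    kwargs = {"JobName": job_name}
    if config:
        kwargs["PaginationConfig"] = config
    return kwargs

# paginator.paginate(**pagination_kwargs("my-job", max_items=50, page_size=25))
```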
- Return type:
dict
- Returns:
Response Syntax
{
    'Steps': [
        {
            'StepType': 'BENCHMARK',
            'JobName': 'string',
            'Status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED',
            'InferenceBenchmark': {
                'Metrics': {
                    'CostPerHour': ...,
                    'CostPerInference': ...,
                    'MaxInvocations': 123,
                    'ModelLatency': 123,
                    'CpuUtilization': ...,
                    'MemoryUtilization': ...
                },
                'EndpointConfiguration': {
                    'EndpointName': 'string',
                    'VariantName': 'string',
                    'InstanceType': 'ml.t2.medium'|'ml.t2.large'|'ml.t2.xlarge'|'ml.t2.2xlarge'|'ml.m4.xlarge'|'ml.m4.2xlarge'|'ml.m4.4xlarge'|'ml.m4.10xlarge'|'ml.m4.16xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.12xlarge'|'ml.m5.24xlarge'|'ml.m5d.large'|'ml.m5d.xlarge'|'ml.m5d.2xlarge'|'ml.m5d.4xlarge'|'ml.m5d.12xlarge'|'ml.m5d.24xlarge'|'ml.c4.large'|'ml.c4.xlarge'|'ml.c4.2xlarge'|'ml.c4.4xlarge'|'ml.c4.8xlarge'|'ml.p2.xlarge'|'ml.p2.8xlarge'|'ml.p2.16xlarge'|'ml.p3.2xlarge'|'ml.p3.8xlarge'|'ml.p3.16xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.18xlarge'|'ml.c5d.large'|'ml.c5d.xlarge'|'ml.c5d.2xlarge'|'ml.c5d.4xlarge'|'ml.c5d.9xlarge'|'ml.c5d.18xlarge'|'ml.g4dn.xlarge'|'ml.g4dn.2xlarge'|'ml.g4dn.4xlarge'|'ml.g4dn.8xlarge'|'ml.g4dn.12xlarge'|'ml.g4dn.16xlarge'|'ml.r5.large'|'ml.r5.xlarge'|'ml.r5.2xlarge'|'ml.r5.4xlarge'|'ml.r5.12xlarge'|'ml.r5.24xlarge'|'ml.r5d.large'|'ml.r5d.xlarge'|'ml.r5d.2xlarge'|'ml.r5d.4xlarge'|'ml.r5d.12xlarge'|'ml.r5d.24xlarge'|'ml.inf1.xlarge'|'ml.inf1.2xlarge'|'ml.inf1.6xlarge'|'ml.inf1.24xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.p4d.24xlarge'|'ml.c7g.large'|'ml.c7g.xlarge'|'ml.c7g.2xlarge'|'ml.c7g.4xlarge'|'ml.c7g.8xlarge'|'ml.c7g.12xlarge'|'ml.c7g.16xlarge'|'ml.m6g.large'|'ml.m6g.xlarge'|'ml.m6g.2xlarge'|'ml.m6g.4xlarge'|'ml.m6g.8xlarge'|'ml.m6g.12xlarge'|'ml.m6g.16xlarge'|'ml.m6gd.large'|'ml.m6gd.xlarge'|'ml.m6gd.2xlarge'|'ml.m6gd.4xlarge'|'ml.m6gd.8xlarge'|'ml.m6gd.12xlarge'|'ml.m6gd.16xlarge'|'ml.c6g.large'|'ml.c6g.xlarge'|'ml.c6g.2xlarge'|'ml.c6g.4xlarge'|'ml.c6g.8xlarge'|'ml.c6g.12xlarge'|'ml.c6g.16xlarge'|'ml.c6gd.large'|'ml.c6gd.xlarge'|'ml.c6gd.2xlarge'|'ml.c6gd.4xlarge'|'ml.c6gd.8xlarge'|'ml.c6gd.12xlarge'|'ml.c6gd.16xlarge'|'ml.c6gn.large'|'ml.c6gn.xlarge'|'ml.c6gn.2xlarge'|'ml.c6gn.4xlarge'|'ml.c6gn.8xlarge'|'ml.c6gn.12xlarge'|'ml.c6gn.16xlarge'|'ml.r6g.large'|'ml.r6g.xlarge'|'ml.r6g.2xlarge'|'ml.r6g.4xlarge'|'ml.r6g.8xlarge'|'ml.r6g.12xlarge'|'ml.r6g.16xlarge'|'ml.r6gd.large'|'ml.r6gd.xlarge'|'ml.r6gd.2xlarge'|'ml.r6gd.4xlarge'|'ml.r6gd.8xlarge'|'ml.r6gd.12xlarge'|'ml.r6gd.16xlarge'|'ml.p4de.24xlarge',
                    'InitialInstanceCount': 123
                },
                'ModelConfiguration': {
                    'InferenceSpecificationName': 'string',
                    'EnvironmentParameters': [
                        {
                            'Key': 'string',
                            'ValueType': 'string',
                            'Value': 'string'
                        },
                    ],
                    'CompilationJobName': 'string'
                },
                'FailureReason': 'string',
                'EndpointMetrics': {
                    'MaxInvocations': 123,
                    'ModelLatency': 123
                }
            }
        },
    ]
}
Response Structure
(dict) –
Steps (list) –
A list of all subtask details in Inference Recommender.
(dict) –
A returned array object for the Steps response field in the ListInferenceRecommendationsJobSteps API command.
StepType (string) –
The type of the subtask.
BENCHMARK: Evaluate the performance of your model on different instance types.
JobName (string) –
The name of the Inference Recommender job.
Status (string) –
The current status of the benchmark.
InferenceBenchmark (dict) –
The details for a specific benchmark.
Metrics (dict) –
The metrics of recommendations.
CostPerHour (float) –
Defines the cost per hour for the instance.
CostPerInference (float) –
Defines the cost per inference for the instance.
MaxInvocations (integer) –
The expected maximum number of requests per minute for the instance.
ModelLatency (integer) –
The expected model latency at maximum invocations per minute for the instance.
CpuUtilization (float) –
The expected CPU utilization at maximum invocations per minute for the instance.
NaN indicates that the value is not available.
MemoryUtilization (float) –
The expected memory utilization at maximum invocations per minute for the instance.
NaN indicates that the value is not available.
EndpointConfiguration (dict) –
The endpoint configuration made by Inference Recommender during a recommendation job.
EndpointName (string) –
The name of the endpoint made during a recommendation job.
VariantName (string) –
The name of the production variant (deployed model) made during a recommendation job.
InstanceType (string) –
The instance type recommended by Amazon SageMaker Inference Recommender.
InitialInstanceCount (integer) –
The number of instances recommended to launch initially.
ModelConfiguration (dict) –
Defines the model configuration. Includes the specification name and environment parameters.
InferenceSpecificationName (string) –
The inference specification name in the model package version.
EnvironmentParameters (list) –
Defines the environment parameters that includes key, value types, and values.
(dict) –
A list of environment parameters suggested by the Amazon SageMaker Inference Recommender.
Key (string) –
The environment key suggested by the Amazon SageMaker Inference Recommender.
ValueType (string) –
The value type suggested by the Amazon SageMaker Inference Recommender.
Value (string) –
The value suggested by the Amazon SageMaker Inference Recommender.
CompilationJobName (string) –
The name of the compilation job used to create the recommended model artifacts.
FailureReason (string) –
The reason why a benchmark failed.
EndpointMetrics (dict) –
The metrics for an existing endpoint compared in an Inference Recommender job.
MaxInvocations (integer) –
The expected maximum number of requests per minute for the instance.
ModelLatency (integer) –
The expected model latency at maximum invocations per minute for the instance.
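To show how the nested response structure above is typically consumed, here is a hedged sketch of a helper that flattens one entry of the Steps list into a compact summary. The input field names follow the documented response; the helper name and output keys are this sketch's own choices, and NaN utilization values (which signal "not available") are mapped to None:

```python
import math

def summarize_step(step):
    """Flatten one Steps entry (shape per the Response Syntax above)
    into a small summary dict for reporting."""
    benchmark = step.get("InferenceBenchmark", {})
    metrics = benchmark.get("Metrics", {})

    cpu = metrics.get("CpuUtilization")
    if cpu is not None and math.isnan(cpu):
        cpu = None  # NaN indicates the value is not available

    return {
        "job": step.get("JobName"),
        "status": step.get("Status"),
        "instance": benchmark.get("EndpointConfiguration", {}).get("InstanceType"),
        "cost_per_hour": metrics.get("CostPerHour"),
        "cpu_utilization": cpu,
    }
```

A caller might feed each element of `response["Steps"]` through this helper before logging or tabulating benchmark results.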