get_scaling_configuration_recommendation
- SageMaker.Client.get_scaling_configuration_recommendation(**kwargs)
- Starts an Amazon SageMaker Inference Recommender autoscaling recommendation job. Returns recommendations for autoscaling policies that you can apply to your SageMaker endpoint.
- See also: AWS API Documentation
- Request Syntax
    response = client.get_scaling_configuration_recommendation(
        InferenceRecommendationsJobName='string',
        RecommendationId='string',
        EndpointName='string',
        TargetCpuUtilizationPerCore=123,
        ScalingPolicyObjective={
            'MinInvocationsPerMinute': 123,
            'MaxInvocationsPerMinute': 123
        }
    )
- Parameters:
- InferenceRecommendationsJobName (string) – - [REQUIRED] - The name of a previously completed Inference Recommender job. 
- RecommendationId (string) – - The recommendation ID of a previously completed inference recommendation. This ID should come from one of the recommendations returned by the job specified in the InferenceRecommendationsJobName field. Specify either this field or the EndpointName field.
- EndpointName (string) – - The name of an endpoint benchmarked during a previously completed inference recommendation job. This name should come from one of the recommendations returned by the job specified in the InferenceRecommendationsJobName field. Specify either this field or the RecommendationId field.
- TargetCpuUtilizationPerCore (integer) – - The percentage of CPU utilization per core that you want an instance to reach before autoscaling. The default value is 50%.
- ScalingPolicyObjective (dict) – - An object where you specify the anticipated traffic pattern for an endpoint.
- MinInvocationsPerMinute (integer) – - The minimum number of expected requests to your endpoint per minute.
- MaxInvocationsPerMinute (integer) – - The maximum number of expected requests to your endpoint per minute. 
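The parameter rules above can be sketched as a small request builder. This is a minimal sketch, not part of the SDK: `build_scaling_recommendation_request` is a hypothetical helper, and it assumes that "specify either this field or the other" means exactly one of `RecommendationId` and `EndpointName` is supplied. The job and endpoint names are placeholders.

```python
from typing import Any, Dict, Optional


def build_scaling_recommendation_request(
    job_name: str,
    recommendation_id: Optional[str] = None,
    endpoint_name: Optional[str] = None,
    target_cpu_utilization_per_core: Optional[int] = None,
    min_invocations_per_minute: Optional[int] = None,
    max_invocations_per_minute: Optional[int] = None,
) -> Dict[str, Any]:
    """Build kwargs for get_scaling_configuration_recommendation (hypothetical helper)."""
    # Assumption: "either ... or" in the parameter docs is read as exactly one.
    if (recommendation_id is None) == (endpoint_name is None):
        raise ValueError("Specify exactly one of recommendation_id or endpoint_name")

    # InferenceRecommendationsJobName is the only required parameter.
    request: Dict[str, Any] = {"InferenceRecommendationsJobName": job_name}
    if recommendation_id is not None:
        request["RecommendationId"] = recommendation_id
    else:
        request["EndpointName"] = endpoint_name

    # Optional fields are left out when unset so the service default applies.
    if target_cpu_utilization_per_core is not None:
        request["TargetCpuUtilizationPerCore"] = target_cpu_utilization_per_core

    objective: Dict[str, int] = {}
    if min_invocations_per_minute is not None:
        objective["MinInvocationsPerMinute"] = min_invocations_per_minute
    if max_invocations_per_minute is not None:
        objective["MaxInvocationsPerMinute"] = max_invocations_per_minute
    if objective:
        request["ScalingPolicyObjective"] = objective
    return request


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   client = boto3.client("sagemaker")
#   response = client.get_scaling_configuration_recommendation(
#       **build_scaling_recommendation_request("my-job", endpoint_name="my-endpoint")
#   )
```

Building the keyword arguments separately keeps the mutual-exclusion check testable without calling AWS.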
 
 
- Return type:
- dict 
- Returns:
- Response Syntax
    {
        'InferenceRecommendationsJobName': 'string',
        'RecommendationId': 'string',
        'EndpointName': 'string',
        'TargetCpuUtilizationPerCore': 123,
        'ScalingPolicyObjective': {
            'MinInvocationsPerMinute': 123,
            'MaxInvocationsPerMinute': 123
        },
        'Metric': {
            'InvocationsPerInstance': 123,
            'ModelLatency': 123
        },
        'DynamicScalingConfiguration': {
            'MinCapacity': 123,
            'MaxCapacity': 123,
            'ScaleInCooldown': 123,
            'ScaleOutCooldown': 123,
            'ScalingPolicies': [
                {
                    'TargetTracking': {
                        'MetricSpecification': {
                            'Predefined': {
                                'PredefinedMetricType': 'string'
                            },
                            'Customized': {
                                'MetricName': 'string',
                                'Namespace': 'string',
                                'Statistic': 'Average'|'Minimum'|'Maximum'|'SampleCount'|'Sum'
                            }
                        },
                        'TargetValue': 123.0
                    }
                },
            ]
        }
    }
- Response Structure
- (dict) –
- InferenceRecommendationsJobName (string) – - The name of a previously completed Inference Recommender job.
- RecommendationId (string) – - The recommendation ID of a previously completed inference recommendation. 
- EndpointName (string) – - The name of an endpoint benchmarked during a previously completed Inference Recommender job. 
- TargetCpuUtilizationPerCore (integer) – - The percentage of CPU utilization per core that you want an instance to reach before autoscaling, as specified in the request. The default value is 50%.
- ScalingPolicyObjective (dict) – - An object representing the anticipated traffic pattern for an endpoint that you specified in the request.
- MinInvocationsPerMinute (integer) – - The minimum number of expected requests to your endpoint per minute.
- MaxInvocationsPerMinute (integer) – - The maximum number of expected requests to your endpoint per minute. 
 
- Metric (dict) – - An object with a list of metrics that were benchmarked during the previously completed Inference Recommender job.
- InvocationsPerInstance (integer) – - The number of invocations sent to a model, normalized by InstanceCount in each ProductionVariant. 1/numberOfInstances is sent as the value on each request, where numberOfInstances is the number of active instances for the ProductionVariant behind the endpoint at the time of the request.
- ModelLatency (integer) – - The interval of time taken by a model to respond as viewed from SageMaker. This interval includes the local communication times taken to send the request and to fetch the response from the container of a model and the time taken to complete the inference in the container. 
 
- DynamicScalingConfiguration (dict) – - An object with the recommended values for you to specify when creating an autoscaling policy.
- MinCapacity (integer) – - The recommended minimum capacity to specify for your autoscaling policy.
- MaxCapacity (integer) – - The recommended maximum capacity to specify for your autoscaling policy. 
- ScaleInCooldown (integer) – - The recommended scale in cooldown time for your autoscaling policy. 
- ScaleOutCooldown (integer) – - The recommended scale out cooldown time for your autoscaling policy. 
- ScalingPolicies (list) – - An object of the scaling policies for each metric.
- (dict) – - An object containing a recommended scaling policy.
- Note: This is a Tagged Union structure. Only one of the following top level keys will be set: TargetTracking. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
    'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
- TargetTracking (dict) – - A target tracking scaling policy. Includes support for predefined or customized metrics.
- MetricSpecification (dict) – - An object containing information about a metric.
- Note: This is a Tagged Union structure. Only one of the following top level keys will be set: Predefined, Customized. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:
    'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
- Predefined (dict) – - Information about a predefined metric.
- PredefinedMetricType (string) – - The metric type. You can only apply SageMaker metric types to SageMaker endpoints.
 
- Customized (dict) – - Information about a customized metric.
- MetricName (string) – - The name of the customized metric.
- Namespace (string) – - The namespace of the customized metric. 
- Statistic (string) – - The statistic of the customized metric. 
 
 
- TargetValue (float) – - The recommended target value to specify for the metric when creating a scaling policy. 
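Because ScalingPolicies and MetricSpecification are Tagged Union structures, code that reads the response has to check which top level key is present. The following is a minimal sketch of that check over a plain response dict; `summarize_scaling_policies` is a hypothetical helper, and the output shape (`kind`, `metric`, `target_value`) is an illustrative choice rather than anything the SDK defines.

```python
from typing import Any, Dict, List


def summarize_scaling_policies(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten DynamicScalingConfiguration.ScalingPolicies (hypothetical helper)."""
    config = response.get("DynamicScalingConfiguration", {})
    summaries: List[Dict[str, Any]] = []

    for policy in config.get("ScalingPolicies", []):
        # Unknown union members arrive under SDK_UNKNOWN_MEMBER.
        if "SDK_UNKNOWN_MEMBER" in policy:
            name = policy["SDK_UNKNOWN_MEMBER"].get("name")
            summaries.append({"kind": "unknown", "metric": name, "target_value": None})
            continue

        tracking = policy.get("TargetTracking")
        if tracking is None:
            continue

        spec = tracking.get("MetricSpecification", {})
        if "Predefined" in spec:
            kind = "predefined"
            metric = spec["Predefined"].get("PredefinedMetricType")
        elif "Customized" in spec:
            kind = "customized"
            custom = spec["Customized"]
            metric = "{}/{} ({})".format(
                custom.get("Namespace"),
                custom.get("MetricName"),
                custom.get("Statistic"),
            )
        else:
            # Either an unknown member or an empty specification.
            kind = "unknown"
            metric = spec.get("SDK_UNKNOWN_MEMBER", {}).get("name")

        summaries.append(
            {"kind": kind, "metric": metric, "target_value": tracking.get("TargetValue")}
        )
    return summaries
```

Treating unknown members as data instead of raising lets older client code keep working if the service later adds a new policy or metric type.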
 
 
 
 
 
 
- Exceptions
- SageMaker.Client.exceptions.ResourceNotFound
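The exception listed above is exposed on the client object, so it can be caught without importing a separate class. This is a minimal sketch: `get_scaling_recommendation_or_none` is a hypothetical wrapper, and returning `None` when the job or recommendation is not found is an illustrative choice, not SDK behavior.

```python
from typing import Any, Dict, Optional


def get_scaling_recommendation_or_none(
    client: Any, **request: Any
) -> Optional[Dict[str, Any]]:
    """Call the API and map ResourceNotFound to None (hypothetical wrapper)."""
    try:
        return client.get_scaling_configuration_recommendation(**request)
    except client.exceptions.ResourceNotFound:
        # The referenced job, recommendation, or endpoint was not found.
        return None


# Usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   client = boto3.client("sagemaker")
#   response = get_scaling_recommendation_or_none(
#       client,
#       InferenceRecommendationsJobName="my-job",
#       EndpointName="my-endpoint",
#   )
```

Any other error, such as a validation or permissions failure, still propagates to the caller.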