Bedrock / Client / create_inference_profile
create_inference_profile#
- Bedrock.Client.create_inference_profile(**kwargs)#
Creates an application inference profile to track metrics and costs when invoking a model. To create an application inference profile for a foundation model in one region, specify the ARN of the model in that region. To create an application inference profile for a foundation model across multiple regions, specify the ARN of the system-defined inference profile that contains the regions that you want to route requests to. For more information, see Increase throughput and resilience with cross-region inference in Amazon Bedrock. in the Amazon Bedrock User Guide.
See also: AWS API Documentation
Request Syntax
response = client.create_inference_profile( inferenceProfileName='string', description='string', clientRequestToken='string', modelSource={ 'copyFrom': 'string' }, tags=[ { 'key': 'string', 'value': 'string' }, ] )
- Parameters:
inferenceProfileName (string) –
[REQUIRED]
A name for the inference profile.
description (string) – A description for the inference profile.
clientRequestToken (string) –
A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency.
This field is autopopulated if not provided.
modelSource (dict) –
[REQUIRED]
The foundation model or system-defined inference profile that the inference profile will track metrics and costs for.
Note
This is a Tagged Union structure. Only one of the following top level keys can be set:
copyFrom
.copyFrom (string) –
The ARN of the model or system-defined inference profile that is the source for the inference profile.
tags (list) –
An array of objects, each of which contains a tag and its value. For more information, see Tagging resources in the Amazon Bedrock User Guide.
(dict) –
Definition of the key/value pair for a tag.
key (string) – [REQUIRED]
Key for the tag.
value (string) – [REQUIRED]
Value for the tag.
- Return type:
dict
- Returns:
Response Syntax
{ 'inferenceProfileArn': 'string', 'status': 'ACTIVE' }
Response Structure
(dict) –
inferenceProfileArn (string) –
The ARN of the inference profile that you created.
status (string) –
The status of the inference profile.
ACTIVE
means that the inference profile is ready to be used.
Exceptions
Bedrock.Client.exceptions.ResourceNotFoundException
Bedrock.Client.exceptions.AccessDeniedException
Bedrock.Client.exceptions.ValidationException
Bedrock.Client.exceptions.ConflictException
Bedrock.Client.exceptions.InternalServerException
Bedrock.Client.exceptions.TooManyTagsException
Bedrock.Client.exceptions.ServiceQuotaExceededException
Bedrock.Client.exceptions.ThrottlingException