create_training_dataset
- CleanRoomsML.Client.create_training_dataset(**kwargs)
- Defines the information necessary to create a training dataset. In Clean Rooms ML, the TrainingDataset is metadata that points to a Glue table, which is read only during AudienceModel creation.
- See also: AWS API Documentation
- Request Syntax
    response = client.create_training_dataset(
        name='string',
        roleArn='string',
        trainingData=[
            {
                'type': 'INTERACTIONS',
                'inputConfig': {
                    'schema': [
                        {
                            'columnName': 'string',
                            'columnTypes': [
                                'USER_ID'|'ITEM_ID'|'TIMESTAMP'|'CATEGORICAL_FEATURE'|'NUMERICAL_FEATURE',
                            ]
                        },
                    ],
                    'dataSource': {
                        'glueDataSource': {
                            'tableName': 'string',
                            'databaseName': 'string',
                            'catalogId': 'string'
                        }
                    }
                }
            },
        ],
        tags={
            'string': 'string'
        },
        description='string'
    )
- Parameters:
- name (string) – [REQUIRED] The name of the training dataset. This name must be unique in your account and region.
- roleArn (string) – [REQUIRED] The ARN of the IAM role that Clean Rooms ML can assume to read the data referred to in the dataSource field of each dataset. Passing a role across AWS accounts is not allowed. If you pass a role that isn’t in your account, you get an AccessDeniedException error.
- trainingData (list) – [REQUIRED] A list of Dataset objects, each of which specifies the dataset type and the details of its location and schema. You must provide a role that has read access to these tables.
  - (dict) – Defines where the training dataset is located, what type of data it contains, and how to access the data.
    - type (string) – [REQUIRED] The type of information found in the dataset.
    - inputConfig (dict) – [REQUIRED] A DatasetInputConfig object that defines the data source and schema mapping.
      - schema (list) – [REQUIRED] The schema information for the training data.
        - (dict) – Metadata for a column.
          - columnName (string) – [REQUIRED] The name of a column.
          - columnTypes (list) – [REQUIRED] The data type of the column.
            - (string) –
 
 
 
      - dataSource (dict) – [REQUIRED] A DataSource object that specifies the Glue data source for the training data.
        - glueDataSource (dict) – [REQUIRED] A GlueDataSource object that defines the catalog ID, database name, and table name for the training data.
          - tableName (string) – [REQUIRED] The Glue table that contains the training data.
          - databaseName (string) – [REQUIRED] The Glue database that contains the training data.
          - catalogId (string) – The Glue catalog that contains the training data.
 
 
 
 
 
- tags (dict) – The optional metadata that you apply to the resource to help you categorize and organize your resources. Each tag consists of a key and an optional value, both of which you define.
- The following basic restrictions apply to tags:
- Maximum number of tags per resource - 50.
- For each resource, each tag key must be unique, and each tag key can have only one value. 
- Maximum key length - 128 Unicode characters in UTF-8. 
- Maximum value length - 256 Unicode characters in UTF-8. 
- If your tagging schema is used across multiple services and resources, remember that other services may have restrictions on allowed characters. Generally allowed characters are: letters, numbers, and spaces representable in UTF-8, and the following characters: + - = . _ : / @. 
- Tag keys and values are case sensitive. 
- Do not use aws:, AWS:, or any upper or lowercase combination of it as a prefix for keys, because it is reserved for AWS use. You cannot edit or delete tag keys with this prefix. Values can have this prefix. If a tag value has aws as its prefix but the key does not, Clean Rooms ML considers it a user tag and counts it against the limit of 50 tags. Tags whose key has the aws prefix do not count against your tags-per-resource limit.
  - (string) – (tag key)
    - (string) – (tag value)
 
 
- description (string) – The description of the training dataset. 
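The tag restrictions listed above can be checked on the client side before the request is sent. The following is a minimal sketch, not part of the boto3 API: the function name `validate_tags` and the constants are hypothetical, and it assumes that "Unicode characters" can be approximated by Python's `len()` on a `str` (code points). The service remains the authority on what it accepts.

```python
# Hypothetical limits copied from the tag restrictions in this section.
MAX_TAGS = 50
MAX_KEY_LEN = 128
MAX_VALUE_LEN = 256


def validate_tags(tags):
    """Return a list of problems found in a tags dict; an empty list means none found."""
    problems = []
    countable = 0
    for key, value in tags.items():
        # Keys must not start with "aws:" in any upper/lowercase combination.
        if key.lower().startswith("aws:"):
            problems.append(f"key {key!r} uses the reserved aws: prefix")
            continue
        countable += 1
        # Assumption: len() on str (code points) approximates the documented limit.
        if len(key) > MAX_KEY_LEN:
            problems.append(f"key {key[:20]!r}... exceeds {MAX_KEY_LEN} characters")
        if len(value) > MAX_VALUE_LEN:
            problems.append(f"value for key {key[:20]!r} exceeds {MAX_VALUE_LEN} characters")
    if countable > MAX_TAGS:
        problems.append(f"{countable} tags exceed the limit of {MAX_TAGS}")
    return problems
```

Because `tags` is a Python dict, the rule that each key is unique and has only one value is satisfied by construction. A value that starts with `aws:` is allowed, so it is not flagged here.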
 
- Return type:
- dict 
- Returns:
- Response Syntax
    {
        'trainingDatasetArn': 'string'
    }
- Response Structure
  - (dict) –
    - trainingDatasetArn (string) – The Amazon Resource Name (ARN) of the training dataset resource.
 
 
- Exceptions
- CleanRoomsML.Client.exceptions.ConflictException
- CleanRoomsML.Client.exceptions.ValidationException
- CleanRoomsML.Client.exceptions.AccessDeniedException
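To tie the request syntax, response, and exceptions together, here is a minimal sketch of one way to call this method. The helper names `build_training_data` and `create_dataset` are hypothetical, as is the choice to wrap the three documented exceptions in a `RuntimeError`. The `client` argument is assumed to be a client created with `boto3.client("cleanroomsml")`. The table, database, and role values in the usage comment are placeholders.

```python
def build_training_data(table_name, database_name, schema, catalog_id=None):
    """Build one trainingData entry.

    `schema` maps a column name to a list of column types, for example
    {"user_id": ["USER_ID"], "item_id": ["ITEM_ID"]}.
    """
    glue_source = {"tableName": table_name, "databaseName": database_name}
    if catalog_id is not None:
        # catalogId is the only glueDataSource field not marked [REQUIRED].
        glue_source["catalogId"] = catalog_id
    return {
        "type": "INTERACTIONS",
        "inputConfig": {
            "schema": [
                {"columnName": name, "columnTypes": list(types)}
                for name, types in schema.items()
            ],
            "dataSource": {"glueDataSource": glue_source},
        },
    }


def create_dataset(client, name, role_arn, training_data, description=None, tags=None):
    """Call create_training_dataset and return the new dataset's ARN."""
    kwargs = {"name": name, "roleArn": role_arn, "trainingData": training_data}
    # Only pass the optional parameters when they are set.
    if description is not None:
        kwargs["description"] = description
    if tags:
        kwargs["tags"] = tags
    try:
        response = client.create_training_dataset(**kwargs)
    except (
        client.exceptions.ConflictException,
        client.exceptions.ValidationException,
        client.exceptions.AccessDeniedException,
    ) as err:
        # Hypothetical handling: surface the documented exceptions uniformly.
        raise RuntimeError(f"create_training_dataset failed: {err}") from err
    return response["trainingDatasetArn"]


# Usage (placeholders, requires AWS credentials and boto3):
#   import boto3
#   client = boto3.client("cleanroomsml")
#   entry = build_training_data("my_table", "my_database", {"user_id": ["USER_ID"]})
#   arn = create_dataset(client, "my-dataset", "arn:aws:iam::111122223333:role/MyRole", [entry])
```

Passing the client in as an argument keeps the helpers easy to exercise with a stub, and it leaves Region and credential configuration to the caller.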