create_dataset

Personalize.Client.create_dataset(**kwargs)

Creates an empty dataset and adds it to the specified dataset group. Use CreateDatasetImportJob to import your training data into the dataset.

There are three types of datasets:

  • Interactions

  • Items

  • Users

Each dataset type has an associated schema with required field types. Only the Interactions dataset is required to train a model (also referred to as creating a solution).

A dataset can be in one of the following states:

  • CREATE PENDING > CREATE IN_PROGRESS > ACTIVE -or- CREATE FAILED

  • DELETE PENDING > DELETE IN_PROGRESS

To get the status of the dataset, call DescribeDataset.
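The status flow above can be polled with a short loop. The following is a minimal sketch, not an official waiter: the helper name `wait_for_dataset_active` and its `delay_seconds` and `max_attempts` parameters are assumptions made for illustration, and `client` is assumed to be a Personalize client whose `describe_dataset` response carries the status under `dataset.status`.

```python
import time


def wait_for_dataset_active(client, dataset_arn, delay_seconds=10, max_attempts=30):
    """Poll describe_dataset until the dataset is ACTIVE or creation fails.

    Hypothetical helper: the name and the retry parameters are illustrative.
    """
    for _ in range(max_attempts):
        response = client.describe_dataset(datasetArn=dataset_arn)
        status = response["dataset"]["status"]
        if status == "ACTIVE":
            return status
        if status == "CREATE FAILED":
            raise RuntimeError(f"Dataset creation failed: {dataset_arn}")
        # Still CREATE PENDING or CREATE IN_PROGRESS: wait and ask again.
        time.sleep(delay_seconds)
    raise TimeoutError(f"Dataset not ACTIVE after {max_attempts} attempts: {dataset_arn}")
```

With a real client this would be called as `wait_for_dataset_active(boto3.client('personalize'), dataset_arn)`; the delay and attempt count should be tuned to your own workload.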

Related APIs

See also: AWS API Documentation

Request Syntax

response = client.create_dataset(
    name='string',
    schemaArn='string',
    datasetGroupArn='string',
    datasetType='string',
    tags=[
        {
            'tagKey': 'string',
            'tagValue': 'string'
        },
    ]
)

Parameters:

  • name (string) –

    [REQUIRED]

    The name for the dataset.

  • schemaArn (string) –

    [REQUIRED]

    The ARN of the schema to associate with the dataset. The schema defines the dataset fields.

  • datasetGroupArn (string) –

    [REQUIRED]

    The Amazon Resource Name (ARN) of the dataset group to add the dataset to.

  • datasetType (string) –

    [REQUIRED]

    The type of dataset.

    One of the following (case insensitive) values:

    • Interactions

    • Items

    • Users

  • tags (list) –

    A list of tags to apply to the dataset.

    • (dict) –

      The optional metadata that you apply to resources to help you categorize and organize them. Each tag consists of a key and an optional value, both of which you define. For more information, see Tagging Personalize resources.

      • tagKey (string) – [REQUIRED]

        One part of a key-value pair that makes up a tag. A key is a general label that acts like a category for more specific tag values.

      • tagValue (string) – [REQUIRED]

        The value part of a key-value pair that makes up a tag. A value acts as a descriptor within a tag category (key).
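Because `tags` is a list of dicts that each carry both `tagKey` and `tagValue`, a plain Python mapping has to be reshaped before it is passed in. A minimal sketch of that conversion follows; the helper name `to_tag_list` is an assumption for illustration and is not part of boto3.

```python
def to_tag_list(tags):
    """Convert a {key: value} mapping into the list-of-dicts shape used by `tags`.

    Hypothetical helper: both tagKey and tagValue are emitted for every entry,
    since the parameter reference marks both as required.
    """
    return [{"tagKey": str(key), "tagValue": str(value)} for key, value in tags.items()]
```

The result can be passed directly as the `tags` argument, for example `tags=to_tag_list({'project': 'demo'})`.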

Return type:

dict

Returns:

Response Syntax

{
    'datasetArn': 'string'
}

Response Structure

  • (dict) –

    • datasetArn (string) –

      The ARN of the dataset.
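Putting the request and response together, a call might look like the sketch below. The wrapper name `create_interactions_dataset` is an assumption for illustration; it takes the client as an argument, passes the four required parameters, and returns the `datasetArn` from the response.

```python
def create_interactions_dataset(client, name, schema_arn, dataset_group_arn):
    """Create an Interactions dataset and return its ARN.

    Hypothetical wrapper: `client` is assumed to be a Personalize client,
    and the ARNs must refer to an existing schema and dataset group.
    """
    response = client.create_dataset(
        name=name,
        schemaArn=schema_arn,
        datasetGroupArn=dataset_group_arn,
        datasetType="Interactions",
    )
    # The response is a dict with a single key, 'datasetArn'.
    return response["datasetArn"]
```

With boto3 installed and credentials configured, the client would typically come from `boto3.client('personalize')`.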

Exceptions

  • Personalize.Client.exceptions.InvalidInputException

  • Personalize.Client.exceptions.ResourceNotFoundException

  • Personalize.Client.exceptions.ResourceAlreadyExistsException

  • Personalize.Client.exceptions.LimitExceededException

  • Personalize.Client.exceptions.ResourceInUseException

  • Personalize.Client.exceptions.TooManyTagsException
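The exceptions listed above are exposed on the client under `client.exceptions`, so they can be caught without extra imports. The following is a minimal sketch of one possible policy, treating an existing dataset as a non-fatal condition; the function name `create_dataset_if_absent` and the choice to return `None` in that case are assumptions for illustration.

```python
def create_dataset_if_absent(client, **request):
    """Call create_dataset, returning None if the dataset already exists.

    Hypothetical helper: `request` holds the keyword arguments for
    create_dataset. Every other exception is left to propagate.
    """
    try:
        return client.create_dataset(**request)["datasetArn"]
    except client.exceptions.ResourceAlreadyExistsException:
        # A dataset with this name or type is already in the dataset group.
        return None
```

Whether the remaining exceptions, such as LimitExceededException or InvalidInputException, should be retried or surfaced depends on the caller, so this sketch does not handle them.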