create_data_integration_flow

SupplyChain.Client.create_data_integration_flow(**kwargs)

Enables you to programmatically create a data pipeline that ingests data from source systems, such as Amazon S3 buckets, into a predefined Amazon Web Services Supply Chain dataset (for example, product or inbound_order) or a temporary dataset, applying the data transformation query provided with the API.

See also: AWS API Documentation

Request Syntax

response = client.create_data_integration_flow(
    instanceId='string',
    name='string',
    sources=[
        {
            'sourceType': 'S3'|'DATASET',
            'sourceName': 'string',
            's3Source': {
                'bucketName': 'string',
                'prefix': 'string',
                'options': {
                    'fileType': 'CSV'|'PARQUET'|'JSON'
                }
            },
            'datasetSource': {
                'datasetIdentifier': 'string',
                'options': {
                    'loadType': 'INCREMENTAL'|'REPLACE',
                    'dedupeRecords': True|False
                }
            }
        },
    ],
    transformation={
        'transformationType': 'SQL'|'NONE',
        'sqlTransformation': {
            'query': 'string'
        }
    },
    target={
        'targetType': 'S3'|'DATASET',
        's3Target': {
            'bucketName': 'string',
            'prefix': 'string',
            'options': {
                'fileType': 'CSV'|'PARQUET'|'JSON'
            }
        },
        'datasetTarget': {
            'datasetIdentifier': 'string',
            'options': {
                'loadType': 'INCREMENTAL'|'REPLACE',
                'dedupeRecords': True|False
            }
        }
    },
    tags={
        'string': 'string'
    }
)
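
For orientation, here is a minimal invocation sketch assembled from the request syntax above. The instance ID, bucket name, prefix, dataset ARN, SQL query, and tag values are illustrative placeholders rather than real resources.

import boto3

client = boto3.client('supplychain')

# All identifiers below are placeholders for illustration only.
response = client.create_data_integration_flow(
    instanceId='example-instance-id',
    name='csv-to-product-dataset',
    sources=[
        {
            'sourceType': 'S3',
            'sourceName': 'raw_products',
            's3Source': {
                'bucketName': 'example-ingest-bucket',
                'prefix': 'incoming/products',
                'options': {'fileType': 'CSV'}
            }
        },
    ],
    transformation={
        'transformationType': 'SQL',
        'sqlTransformation': {
            # sourceName above is referenced as the table alias in the query.
            'query': 'SELECT * FROM raw_products'
        }
    },
    target={
        'targetType': 'DATASET',
        'datasetTarget': {
            # Placeholder ARN of a predefined Supply Chain dataset.
            'datasetIdentifier': 'arn:aws:scn:us-east-1:123456789012:instance/example-instance-id/namespaces/asc/datasets/product',
            'options': {'loadType': 'REPLACE', 'dedupeRecords': True}
        }
    },
    tags={'team': 'supply-chain-data'}
)
print(response['instanceId'], response['name'])
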
Parameters:
  • instanceId (string) –

    [REQUIRED]

    The Amazon Web Services Supply Chain instance identifier.

  • name (string) –

    [REQUIRED]

    Name of the DataIntegrationFlow.

  • sources (list) –

    [REQUIRED]

    The source configurations for DataIntegrationFlow.

    • (dict) –

      The DataIntegrationFlow source parameters.

      • sourceType (string) – [REQUIRED]

        The DataIntegrationFlow source type.

      • sourceName (string) – [REQUIRED]

        The DataIntegrationFlow source name, which can be used as a table alias in the SQL transformation query.

      • s3Source (dict) –

        The S3 DataIntegrationFlow source.

        • bucketName (string) – [REQUIRED]

          The bucketName of the S3 source objects.

        • prefix (string) – [REQUIRED]

          The prefix of the S3 source objects.

        • options (dict) –

          The other options of the S3 DataIntegrationFlow source.

          • fileType (string) –

            The Amazon S3 file type in S3 options.

      • datasetSource (dict) –

        The dataset DataIntegrationFlow source.

        • datasetIdentifier (string) – [REQUIRED]

          The ARN of the dataset.

        • options (dict) –

          The dataset DataIntegrationFlow source options.

          • loadType (string) –

            The dataset data load type in dataset options.

          • dedupeRecords (boolean) –

            The dataset load option to remove duplicates.

  • transformation (dict) –

    [REQUIRED]

    The transformation configurations for DataIntegrationFlow.

    • transformationType (string) – [REQUIRED]

      The DataIntegrationFlow transformation type.

    • sqlTransformation (dict) –

      The SQL DataIntegrationFlow transformation configuration.

      • query (string) – [REQUIRED]

        The transformation SQL query body, based on SparkSQL. Each source's sourceName is available as a table alias in the query (see the sketch after the parameter list).

  • target (dict) –

    [REQUIRED]

    The target configurations for DataIntegrationFlow.

    • targetType (string) – [REQUIRED]

      The DataIntegrationFlow target type.

    • s3Target (dict) –

      The S3 DataIntegrationFlow target.

      • bucketName (string) – [REQUIRED]

        The bucketName of the S3 target objects.

      • prefix (string) – [REQUIRED]

        The prefix of the S3 target objects.

      • options (dict) –

        The S3 DataIntegrationFlow target options.

        • fileType (string) –

          The Amazon S3 file type in S3 options.

    • datasetTarget (dict) –

      The dataset DataIntegrationFlow target.

      • datasetIdentifier (string) – [REQUIRED]

        The dataset ARN.

      • options (dict) –

        The dataset DataIntegrationFlow target options.

        • loadType (string) –

          The dataset data load type in dataset options.

        • dedupeRecords (boolean) –

          The dataset load option to remove duplicates.

  • tags (dict) –

    The tags of the DataIntegrationFlow to be created.

    • (string) –

      • (string) –
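
Because each sourceName is exposed as a table alias to the SQL transformation, a flow with more than one source can join them in the query, as in the following sketch. The source names, dataset ARN, and column names are assumptions for illustration only, not a documented schema.

# Sketch only: two sources whose names become table aliases in the SparkSQL
# query. Column names and the dataset ARN are illustrative placeholders.
sources = [
    {
        'sourceType': 'DATASET',
        'sourceName': 'product',
        'datasetSource': {
            'datasetIdentifier': 'arn:aws:scn:us-east-1:123456789012:instance/example-instance-id/namespaces/asc/datasets/product'
        }
    },
    {
        'sourceType': 'S3',
        'sourceName': 'orders_raw',
        's3Source': {
            'bucketName': 'example-ingest-bucket',
            'prefix': 'incoming/orders'
        }
    },
]

transformation = {
    'transformationType': 'SQL',
    'sqlTransformation': {
        'query': (
            'SELECT o.id, o.product_id, o.quantity, p.description '
            'FROM orders_raw o '
            'JOIN product p ON o.product_id = p.product_id'
        )
    }
}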

Return type:

dict

Returns:

Response Syntax

{
    'instanceId': 'string',
    'name': 'string'
}

Response Structure

  • (dict) –

    The response parameters for CreateDataIntegrationFlow.

    • instanceId (string) –

      The Amazon Web Services Supply Chain instance identifier.

    • name (string) –

      The name of the DataIntegrationFlow created.

Exceptions

  • SupplyChain.Client.exceptions.ServiceQuotaExceededException

  • SupplyChain.Client.exceptions.ThrottlingException

  • SupplyChain.Client.exceptions.ResourceNotFoundException

  • SupplyChain.Client.exceptions.AccessDeniedException

  • SupplyChain.Client.exceptions.ValidationException

  • SupplyChain.Client.exceptions.InternalServerException

  • SupplyChain.Client.exceptions.ConflictException
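
As a sketch of how these exceptions might be handled, the call can be wrapped as shown below. The handling comments are assumptions about typical causes (for example, ConflictException when a flow with the same name already exists) and should be confirmed against the AWS API documentation.

import boto3
from botocore.exceptions import ClientError

client = boto3.client('supplychain')

def create_flow(**kwargs):
    """Sketch: surface the modeled exceptions for create_data_integration_flow."""
    try:
        return client.create_data_integration_flow(**kwargs)
    except client.exceptions.ConflictException:
        # Assumption: the request conflicted with existing state, such as a
        # flow with the same name already existing on the instance.
        raise
    except client.exceptions.ValidationException:
        # The request failed input validation; inspect the error message.
        raise
    except ClientError:
        # Throttling, access-denied, quota, not-found, and internal errors.
        raise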