create_data_source

DataZone.Client.create_data_source(**kwargs)

Creates an Amazon DataZone data source.

See also: AWS API Documentation

Request Syntax

response = client.create_data_source(
    assetFormsInput=[
        {
            'content': 'string',
            'formName': 'string',
            'typeIdentifier': 'string',
            'typeRevision': 'string'
        },
    ],
    clientToken='string',
    configuration={
        'glueRunConfiguration': {
            'autoImportDataQualityResult': True|False,
            'dataAccessRole': 'string',
            'relationalFilterConfigurations': [
                {
                    'databaseName': 'string',
                    'filterExpressions': [
                        {
                            'expression': 'string',
                            'type': 'INCLUDE'|'EXCLUDE'
                        },
                    ],
                    'schemaName': 'string'
                },
            ]
        },
        'redshiftRunConfiguration': {
            'dataAccessRole': 'string',
            'redshiftCredentialConfiguration': {
                'secretManagerArn': 'string'
            },
            'redshiftStorage': {
                'redshiftClusterSource': {
                    'clusterName': 'string'
                },
                'redshiftServerlessSource': {
                    'workgroupName': 'string'
                }
            },
            'relationalFilterConfigurations': [
                {
                    'databaseName': 'string',
                    'filterExpressions': [
                        {
                            'expression': 'string',
                            'type': 'INCLUDE'|'EXCLUDE'
                        },
                    ],
                    'schemaName': 'string'
                },
            ]
        }
    },
    description='string',
    domainIdentifier='string',
    enableSetting='ENABLED'|'DISABLED',
    environmentIdentifier='string',
    name='string',
    projectIdentifier='string',
    publishOnImport=True|False,
    recommendation={
        'enableBusinessNameGeneration': True|False
    },
    schedule={
        'schedule': 'string',
        'timezone': 'UTC'|'AFRICA_JOHANNESBURG'|'AMERICA_MONTREAL'|'AMERICA_SAO_PAULO'|'ASIA_BAHRAIN'|'ASIA_BANGKOK'|'ASIA_CALCUTTA'|'ASIA_DUBAI'|'ASIA_HONG_KONG'|'ASIA_JAKARTA'|'ASIA_KUALA_LUMPUR'|'ASIA_SEOUL'|'ASIA_SHANGHAI'|'ASIA_SINGAPORE'|'ASIA_TAIPEI'|'ASIA_TOKYO'|'AUSTRALIA_MELBOURNE'|'AUSTRALIA_SYDNEY'|'CANADA_CENTRAL'|'CET'|'CST6CDT'|'ETC_GMT'|'ETC_GMT0'|'ETC_GMT_ADD_0'|'ETC_GMT_ADD_1'|'ETC_GMT_ADD_10'|'ETC_GMT_ADD_11'|'ETC_GMT_ADD_12'|'ETC_GMT_ADD_2'|'ETC_GMT_ADD_3'|'ETC_GMT_ADD_4'|'ETC_GMT_ADD_5'|'ETC_GMT_ADD_6'|'ETC_GMT_ADD_7'|'ETC_GMT_ADD_8'|'ETC_GMT_ADD_9'|'ETC_GMT_NEG_0'|'ETC_GMT_NEG_1'|'ETC_GMT_NEG_10'|'ETC_GMT_NEG_11'|'ETC_GMT_NEG_12'|'ETC_GMT_NEG_13'|'ETC_GMT_NEG_14'|'ETC_GMT_NEG_2'|'ETC_GMT_NEG_3'|'ETC_GMT_NEG_4'|'ETC_GMT_NEG_5'|'ETC_GMT_NEG_6'|'ETC_GMT_NEG_7'|'ETC_GMT_NEG_8'|'ETC_GMT_NEG_9'|'EUROPE_DUBLIN'|'EUROPE_LONDON'|'EUROPE_PARIS'|'EUROPE_STOCKHOLM'|'EUROPE_ZURICH'|'ISRAEL'|'MEXICO_GENERAL'|'MST7MDT'|'PACIFIC_AUCKLAND'|'US_CENTRAL'|'US_EASTERN'|'US_MOUNTAIN'|'US_PACIFIC'
    },
    type='string'
)
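
As a rough sketch of how the syntax above reduces to a concrete request, the helper below assembles keyword arguments for an Amazon Web Services Glue data source using only a handful of keys. The helper name, the example identifiers, the "GLUE" type string, and the "*" filter expression are assumptions made for illustration; none of them are confirmed by this page.

```python
def build_glue_request(domain_id, project_id, environment_id, name, database_name):
    """Return keyword arguments for create_data_source with a Glue configuration."""
    return {
        "domainIdentifier": domain_id,
        "projectIdentifier": project_id,
        "environmentIdentifier": environment_id,
        "name": name,
        # Assumption: "GLUE" is the type string for Glue-backed data sources.
        "type": "GLUE",
        "configuration": {
            "glueRunConfiguration": {
                "relationalFilterConfigurations": [
                    {
                        "databaseName": database_name,
                        "filterExpressions": [
                            # Assumption: "*" is accepted as a match-all expression.
                            {"expression": "*", "type": "INCLUDE"},
                        ],
                    }
                ]
            }
        },
        "enableSetting": "ENABLED",
        "publishOnImport": False,
    }


# All identifiers below are hypothetical placeholders.
request = build_glue_request(
    domain_id="dzd_example",
    project_id="project_example",
    environment_id="env_example",
    name="sales-glue",
    database_name="sales_db",
)

# With credentials configured, the call itself would look like:
#   import boto3
#   client = boto3.client("datazone")
#   response = client.create_data_source(**request)
```

Building the arguments as a plain dict keeps the request easy to inspect or log before it is sent.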
Parameters:
  • assetFormsInput (list) –

    The metadata forms that are to be attached to the assets that this data source works with.

    • (dict) –

      The details of a metadata form.

      • content (string) –

        The content of the metadata form.

      • formName (string) – [REQUIRED]

        The name of the metadata form.

      • typeIdentifier (string) –

        The ID of the metadata form type.

      • typeRevision (string) –

        The revision of the metadata form type.

  • clientToken (string) –

    A unique, case-sensitive identifier that is provided to ensure the idempotency of the request.

    This field is autopopulated if not provided.

  • configuration (dict) –

    Specifies the configuration of the data source. It can be set to either glueRunConfiguration or redshiftRunConfiguration.

    Note

    This is a Tagged Union structure. Only one of the following top level keys can be set: glueRunConfiguration, redshiftRunConfiguration.

    • glueRunConfiguration (dict) –

      The configuration of the Amazon Web Services Glue data source.

      • autoImportDataQualityResult (boolean) –

        Specifies whether to automatically import data quality metrics as part of the data source run.

      • dataAccessRole (string) –

        The data access role included in the configuration details of the Amazon Web Services Glue data source.

      • relationalFilterConfigurations (list) – [REQUIRED]

        The relational filter configurations included in the configuration details of the Amazon Web Services Glue data source.

        • (dict) –

          The relational filter configuration for the data source.

          • databaseName (string) – [REQUIRED]

            The database name specified in the relational filter configuration for the data source.

          • filterExpressions (list) –

            The filter expressions specified in the relational filter configuration for the data source.

            • (dict) –

              A filter expression in Amazon DataZone.

              • expression (string) – [REQUIRED]

                The search filter expression.

              • type (string) – [REQUIRED]

                The search filter expression type.

          • schemaName (string) –

            The schema name specified in the relational filter configuration for the data source.

    • redshiftRunConfiguration (dict) –

      The configuration of the Amazon Redshift data source.

      • dataAccessRole (string) –

        The data access role included in the configuration details of the Amazon Redshift data source.

      • redshiftCredentialConfiguration (dict) – [REQUIRED]

        The details of the credentials required to access an Amazon Redshift cluster.

        • secretManagerArn (string) – [REQUIRED]

          The ARN of a secret manager for an Amazon Redshift cluster.

      • redshiftStorage (dict) – [REQUIRED]

        The details of the Amazon Redshift storage as part of the configuration of an Amazon Redshift data source run.

        Note

        This is a Tagged Union structure. Only one of the following top level keys can be set: redshiftClusterSource, redshiftServerlessSource.

        • redshiftClusterSource (dict) –

          The details of the Amazon Redshift cluster source.

          • clusterName (string) – [REQUIRED]

            The name of an Amazon Redshift cluster.

        • redshiftServerlessSource (dict) –

          The details of the Amazon Redshift Serverless workgroup source.

          • workgroupName (string) – [REQUIRED]

            The name of the Amazon Redshift Serverless workgroup.

      • relationalFilterConfigurations (list) – [REQUIRED]

        The relational filter configurations included in the configuration details of the Amazon Redshift data source.

        • (dict) –

          The relational filter configuration for the data source.

          • databaseName (string) – [REQUIRED]

            The database name specified in the relational filter configuration for the data source.

          • filterExpressions (list) –

            The filter expressions specified in the relational filter configuration for the data source.

            • (dict) –

              A filter expression in Amazon DataZone.

              • expression (string) – [REQUIRED]

                The search filter expression.

              • type (string) – [REQUIRED]

                The search filter expression type.

          • schemaName (string) –

            The schema name specified in the relational filter configuration for the data source.

  • description (string) – The description of the data source.

  • domainIdentifier (string) –

    [REQUIRED]

    The ID of the Amazon DataZone domain where the data source is created.

  • enableSetting (string) – Specifies whether the data source is enabled.

  • environmentIdentifier (string) –

    [REQUIRED]

    The unique identifier of the Amazon DataZone environment to which the data source publishes assets.

  • name (string) –

    [REQUIRED]

    The name of the data source.

  • projectIdentifier (string) –

    [REQUIRED]

    The identifier of the Amazon DataZone project in which you want to add this data source.

  • publishOnImport (boolean) – Specifies whether the assets that this data source creates in the inventory are also automatically published to the catalog.

  • recommendation (dict) –

    Specifies whether business name generation is enabled for this data source.

    • enableBusinessNameGeneration (boolean) –

      Specifies whether automatic business name generation is enabled as part of the recommendation configuration.

  • schedule (dict) –

    The schedule of the data source runs.

    • schedule (string) –

      The schedule of the data source runs.

    • timezone (string) –

      The timezone of the data source run.

  • type (string) –

    [REQUIRED]

    The type of the data source.
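
Both configuration and redshiftStorage are tagged unions, so exactly one member may be set in each. The sketch below shows one possible local check for that rule while building an Amazon Redshift configuration. The helper names and example values are hypothetical, and the check is only a convenience that surfaces the mistake before any request is sent.

```python
def require_single_member(union, allowed, label):
    """Return the one allowed key present in `union`, or raise ValueError."""
    present = [key for key in allowed if key in union]
    unknown = [key for key in union if key not in allowed]
    if unknown or len(present) != 1:
        raise ValueError(
            f"{label} must set exactly one of {sorted(allowed)}; got {sorted(union)}"
        )
    return present[0]


def build_redshift_configuration(secret_arn, database_name, schema_name,
                                 *, cluster_name=None, workgroup_name=None):
    """Build the `configuration` argument for a Redshift data source."""
    storage = {}
    if cluster_name is not None:
        storage["redshiftClusterSource"] = {"clusterName": cluster_name}
    if workgroup_name is not None:
        storage["redshiftServerlessSource"] = {"workgroupName": workgroup_name}
    # redshiftStorage is a tagged union: a cluster or a workgroup, never both.
    require_single_member(
        storage, ("redshiftClusterSource", "redshiftServerlessSource"), "redshiftStorage"
    )
    return {
        "redshiftRunConfiguration": {
            "redshiftCredentialConfiguration": {"secretManagerArn": secret_arn},
            "redshiftStorage": storage,
            "relationalFilterConfigurations": [
                {"databaseName": database_name, "schemaName": schema_name}
            ],
        }
    }


# Hypothetical values for illustration only.
configuration = build_redshift_configuration(
    secret_arn="arn:aws:secretsmanager:us-east-1:111122223333:secret:example",
    database_name="dev",
    schema_name="public",
    workgroup_name="analytics-wg",
)
member = require_single_member(
    configuration, ("glueRunConfiguration", "redshiftRunConfiguration"), "configuration"
)
print(member)
```

Passing both cluster_name and workgroup_name, or neither, raises ValueError instead of producing an invalid redshiftStorage value.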

Return type:

dict

Returns:

Response Syntax

{
    'assetFormsOutput': [
        {
            'content': 'string',
            'formName': 'string',
            'typeName': 'string',
            'typeRevision': 'string'
        },
    ],
    'configuration': {
        'glueRunConfiguration': {
            'accountId': 'string',
            'autoImportDataQualityResult': True|False,
            'dataAccessRole': 'string',
            'region': 'string',
            'relationalFilterConfigurations': [
                {
                    'databaseName': 'string',
                    'filterExpressions': [
                        {
                            'expression': 'string',
                            'type': 'INCLUDE'|'EXCLUDE'
                        },
                    ],
                    'schemaName': 'string'
                },
            ]
        },
        'redshiftRunConfiguration': {
            'accountId': 'string',
            'dataAccessRole': 'string',
            'redshiftCredentialConfiguration': {
                'secretManagerArn': 'string'
            },
            'redshiftStorage': {
                'redshiftClusterSource': {
                    'clusterName': 'string'
                },
                'redshiftServerlessSource': {
                    'workgroupName': 'string'
                }
            },
            'region': 'string',
            'relationalFilterConfigurations': [
                {
                    'databaseName': 'string',
                    'filterExpressions': [
                        {
                            'expression': 'string',
                            'type': 'INCLUDE'|'EXCLUDE'
                        },
                    ],
                    'schemaName': 'string'
                },
            ]
        }
    },
    'createdAt': datetime(2015, 1, 1),
    'description': 'string',
    'domainId': 'string',
    'enableSetting': 'ENABLED'|'DISABLED',
    'environmentId': 'string',
    'errorMessage': {
        'errorDetail': 'string',
        'errorType': 'ACCESS_DENIED_EXCEPTION'|'CONFLICT_EXCEPTION'|'INTERNAL_SERVER_EXCEPTION'|'RESOURCE_NOT_FOUND_EXCEPTION'|'SERVICE_QUOTA_EXCEEDED_EXCEPTION'|'THROTTLING_EXCEPTION'|'VALIDATION_EXCEPTION'
    },
    'id': 'string',
    'lastRunAt': datetime(2015, 1, 1),
    'lastRunErrorMessage': {
        'errorDetail': 'string',
        'errorType': 'ACCESS_DENIED_EXCEPTION'|'CONFLICT_EXCEPTION'|'INTERNAL_SERVER_EXCEPTION'|'RESOURCE_NOT_FOUND_EXCEPTION'|'SERVICE_QUOTA_EXCEEDED_EXCEPTION'|'THROTTLING_EXCEPTION'|'VALIDATION_EXCEPTION'
    },
    'lastRunStatus': 'REQUESTED'|'RUNNING'|'FAILED'|'PARTIALLY_SUCCEEDED'|'SUCCESS',
    'name': 'string',
    'projectId': 'string',
    'publishOnImport': True|False,
    'recommendation': {
        'enableBusinessNameGeneration': True|False
    },
    'schedule': {
        'schedule': 'string',
        'timezone': 'UTC'|'AFRICA_JOHANNESBURG'|'AMERICA_MONTREAL'|'AMERICA_SAO_PAULO'|'ASIA_BAHRAIN'|'ASIA_BANGKOK'|'ASIA_CALCUTTA'|'ASIA_DUBAI'|'ASIA_HONG_KONG'|'ASIA_JAKARTA'|'ASIA_KUALA_LUMPUR'|'ASIA_SEOUL'|'ASIA_SHANGHAI'|'ASIA_SINGAPORE'|'ASIA_TAIPEI'|'ASIA_TOKYO'|'AUSTRALIA_MELBOURNE'|'AUSTRALIA_SYDNEY'|'CANADA_CENTRAL'|'CET'|'CST6CDT'|'ETC_GMT'|'ETC_GMT0'|'ETC_GMT_ADD_0'|'ETC_GMT_ADD_1'|'ETC_GMT_ADD_10'|'ETC_GMT_ADD_11'|'ETC_GMT_ADD_12'|'ETC_GMT_ADD_2'|'ETC_GMT_ADD_3'|'ETC_GMT_ADD_4'|'ETC_GMT_ADD_5'|'ETC_GMT_ADD_6'|'ETC_GMT_ADD_7'|'ETC_GMT_ADD_8'|'ETC_GMT_ADD_9'|'ETC_GMT_NEG_0'|'ETC_GMT_NEG_1'|'ETC_GMT_NEG_10'|'ETC_GMT_NEG_11'|'ETC_GMT_NEG_12'|'ETC_GMT_NEG_13'|'ETC_GMT_NEG_14'|'ETC_GMT_NEG_2'|'ETC_GMT_NEG_3'|'ETC_GMT_NEG_4'|'ETC_GMT_NEG_5'|'ETC_GMT_NEG_6'|'ETC_GMT_NEG_7'|'ETC_GMT_NEG_8'|'ETC_GMT_NEG_9'|'EUROPE_DUBLIN'|'EUROPE_LONDON'|'EUROPE_PARIS'|'EUROPE_STOCKHOLM'|'EUROPE_ZURICH'|'ISRAEL'|'MEXICO_GENERAL'|'MST7MDT'|'PACIFIC_AUCKLAND'|'US_CENTRAL'|'US_EASTERN'|'US_MOUNTAIN'|'US_PACIFIC'
    },
    'status': 'CREATING'|'FAILED_CREATION'|'READY'|'UPDATING'|'FAILED_UPDATE'|'RUNNING'|'DELETING'|'FAILED_DELETION',
    'type': 'string',
    'updatedAt': datetime(2015, 1, 1)
}

Response Structure

  • (dict) –

    • assetFormsOutput (list) –

      The metadata forms attached to the assets that this data source creates.

      • (dict) –

        The details of a metadata form.

        • content (string) –

          The content of the metadata form.

        • formName (string) –

          The name of the metadata form.

        • typeName (string) –

          The name of the metadata form type.

        • typeRevision (string) –

          The revision of the metadata form type.

    • configuration (dict) –

      Specifies the configuration of the data source. It can be set to either glueRunConfiguration or redshiftRunConfiguration.

      Note

      This is a Tagged Union structure. Only one of the following top level keys will be set: glueRunConfiguration, redshiftRunConfiguration. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

      'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
      
      • glueRunConfiguration (dict) –

        The configuration of the Amazon Web Services Glue data source.

        • accountId (string) –

          The Amazon Web Services account ID included in the configuration details of the Amazon Web Services Glue data source.

        • autoImportDataQualityResult (boolean) –

          Specifies whether to automatically import data quality metrics as part of the data source run.

        • dataAccessRole (string) –

          The data access role included in the configuration details of the Amazon Web Services Glue data source.

        • region (string) –

          The Amazon Web Services region included in the configuration details of the Amazon Web Services Glue data source.

        • relationalFilterConfigurations (list) –

          The relational filter configurations included in the configuration details of the Amazon Web Services Glue data source.

          • (dict) –

            The relational filter configuration for the data source.

            • databaseName (string) –

              The database name specified in the relational filter configuration for the data source.

            • filterExpressions (list) –

              The filter expressions specified in the relational filter configuration for the data source.

              • (dict) –

                A filter expression in Amazon DataZone.

                • expression (string) –

                  The search filter expression.

                • type (string) –

                  The search filter expression type.

            • schemaName (string) –

              The schema name specified in the relational filter configuration for the data source.

      • redshiftRunConfiguration (dict) –

        The configuration of the Amazon Redshift data source.

        • accountId (string) –

          The ID of the Amazon Web Services account included in the configuration details of the Amazon Redshift data source.

        • dataAccessRole (string) –

          The data access role included in the configuration details of the Amazon Redshift data source.

        • redshiftCredentialConfiguration (dict) –

          The details of the credentials required to access an Amazon Redshift cluster.

          • secretManagerArn (string) –

            The ARN of a secret manager for an Amazon Redshift cluster.

        • redshiftStorage (dict) –

          The details of the Amazon Redshift storage as part of the configuration of an Amazon Redshift data source run.

          Note

          This is a Tagged Union structure. Only one of the following top level keys will be set: redshiftClusterSource, redshiftServerlessSource. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

          'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
          
          • redshiftClusterSource (dict) –

            The details of the Amazon Redshift cluster source.

            • clusterName (string) –

              The name of an Amazon Redshift cluster.

          • redshiftServerlessSource (dict) –

            The details of the Amazon Redshift Serverless workgroup source.

            • workgroupName (string) –

              The name of the Amazon Redshift Serverless workgroup.

        • region (string) –

          The Amazon Web Services region included in the configuration details of the Amazon Redshift data source.

        • relationalFilterConfigurations (list) –

          The relational filter configurations included in the configuration details of the Amazon Redshift data source.

          • (dict) –

            The relational filter configuration for the data source.

            • databaseName (string) –

              The database name specified in the relational filter configuration for the data source.

            • filterExpressions (list) –

              The filter expressions specified in the relational filter configuration for the data source.

              • (dict) –

                A filter expression in Amazon DataZone.

                • expression (string) –

                  The search filter expression.

                • type (string) –

                  The search filter expression type.

            • schemaName (string) –

              The schema name specified in the relational filter configuration for the data source.

    • createdAt (datetime) –

      The timestamp of when the data source was created.

    • description (string) –

      The description of the data source.

    • domainId (string) –

      The ID of the Amazon DataZone domain in which the data source is created.

    • enableSetting (string) –

      Specifies whether the data source is enabled.

    • environmentId (string) –

      The unique identifier of the Amazon DataZone environment to which the data source publishes assets.

    • errorMessage (dict) –

      Specifies the error message that is returned if the operation cannot be successfully completed.

      • errorDetail (string) –

        The details of the error message that is returned if the operation cannot be successfully completed.

      • errorType (string) –

        The type of the error message that is returned if the operation cannot be successfully completed.

    • id (string) –

      The unique identifier of the data source.

    • lastRunAt (datetime) –

      The timestamp that specifies when the data source was last run.

    • lastRunErrorMessage (dict) –

      Specifies the error message that is returned if the operation cannot be successfully completed.

      • errorDetail (string) –

        The details of the error message that is returned if the operation cannot be successfully completed.

      • errorType (string) –

        The type of the error message that is returned if the operation cannot be successfully completed.

    • lastRunStatus (string) –

      The status of the last run of this data source.

    • name (string) –

      The name of the data source.

    • projectId (string) –

      The ID of the Amazon DataZone project to which the data source is added.

    • publishOnImport (boolean) –

      Specifies whether the assets that this data source creates in the inventory are also automatically published to the catalog.

    • recommendation (dict) –

      Specifies whether business name generation is enabled for this data source.

      • enableBusinessNameGeneration (boolean) –

        Specifies whether automatic business name generation is enabled as part of the recommendation configuration.

    • schedule (dict) –

      The schedule of the data source runs.

      • schedule (string) –

        The schedule of the data source runs.

      • timezone (string) –

        The timezone of the data source run.

    • status (string) –

      The status of the data source.

    • type (string) –

      The type of the data source.

    • updatedAt (datetime) –

      The timestamp of when the data source was updated.
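
Because the configuration in the response is a tagged union that may also carry SDK_UNKNOWN_MEMBER, code that reads it usually has to branch on which key is present. The following is a minimal sketch of that branching over a response-shaped dict. The function names and the sample values are hypothetical, and the sample is hand-written rather than real service output.

```python
def configuration_member(response):
    """Return (member_name, member_body) for the response's configuration union."""
    configuration = response.get("configuration") or {}
    if "SDK_UNKNOWN_MEMBER" in configuration:
        # The SDK did not recognise the member; only its name is available.
        return configuration["SDK_UNKNOWN_MEMBER"]["name"], None
    for key in ("glueRunConfiguration", "redshiftRunConfiguration"):
        if key in configuration:
            return key, configuration[key]
    return None, None


def describe(response):
    """Summarise name, status, configuration member, and any error in one line."""
    member, _ = configuration_member(response)
    line = f"{response.get('name')} [{response.get('status')}] via {member}"
    error = response.get("errorMessage")
    if error:
        line += f": {error.get('errorType')} ({error.get('errorDetail')})"
    return line


# Hand-written sample shaped like the response syntax above.
sample = {
    "id": "ds-example",
    "name": "sales-glue",
    "status": "CREATING",
    "configuration": {
        "glueRunConfiguration": {
            "relationalFilterConfigurations": [{"databaseName": "sales_db"}]
        }
    },
}
print(describe(sample))
```

Returning None for the body of an unknown member makes it explicit that only the member's name can be relied on in that case.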

Exceptions

  • DataZone.Client.exceptions.InternalServerException

  • DataZone.Client.exceptions.ResourceNotFoundException

  • DataZone.Client.exceptions.AccessDeniedException

  • DataZone.Client.exceptions.ThrottlingException

  • DataZone.Client.exceptions.ServiceQuotaExceededException

  • DataZone.Client.exceptions.ConflictException

  • DataZone.Client.exceptions.ValidationException

  • DataZone.Client.exceptions.UnauthorizedException
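
The exceptions above are exposed as classes on client.exceptions, so they can be caught individually. The sketch below retries only on ThrottlingException and lets every other exception propagate. It reuses one clientToken across attempts on the assumption that this keeps retried requests idempotent. The function name, attempt count, and delays are hypothetical choices, and the SDK applies its own retry policy independently of this loop.

```python
import time
import uuid


def create_with_retry(client, request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call create_data_source, retrying with exponential backoff on throttling."""
    kwargs = dict(request)
    # Generate the token once so every attempt sends the same clientToken.
    kwargs.setdefault("clientToken", str(uuid.uuid4()))
    for attempt in range(1, max_attempts + 1):
        try:
            return client.create_data_source(**kwargs)
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                raise  # out of attempts: surface the throttling error
            sleep(base_delay * 2 ** (attempt - 1))


# Usage, assuming configured credentials and a prepared `request` dict:
#   import boto3
#   client = boto3.client("datazone")
#   try:
#       response = create_with_retry(client, request)
#   except client.exceptions.ConflictException:
#       ...  # for example, report the conflicting data source
#   except client.exceptions.ValidationException:
#       ...  # fix the request; retrying will not help
```

The sleep parameter is injectable so the backoff can be exercised in tests without real delays.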