create_id_mapping_workflow

EntityResolution.Client.create_id_mapping_workflow(**kwargs)

Creates an IdMappingWorkflow object which stores the configuration of the data processing job to be run. Each IdMappingWorkflow must have a unique workflow name. To modify an existing workflow, use the UpdateIdMappingWorkflow API.

See also: AWS API Documentation

Request Syntax

response = client.create_id_mapping_workflow(
    description='string',
    idMappingTechniques={
        'idMappingType': 'PROVIDER'|'RULE_BASED',
        'providerProperties': {
            'intermediateSourceConfiguration': {
                'intermediateS3Path': 'string'
            },
            'providerConfiguration': {...}|[...]|123|123.4|'string'|True|None,
            'providerServiceArn': 'string'
        },
        'ruleBasedProperties': {
            'attributeMatchingModel': 'ONE_TO_ONE'|'MANY_TO_MANY',
            'recordMatchingModel': 'ONE_SOURCE_TO_ONE_TARGET'|'MANY_SOURCE_TO_ONE_TARGET',
            'ruleDefinitionType': 'SOURCE'|'TARGET',
            'rules': [
                {
                    'matchingKeys': [
                        'string',
                    ],
                    'ruleName': 'string'
                },
            ]
        }
    },
    inputSourceConfig=[
        {
            'inputSourceARN': 'string',
            'schemaName': 'string',
            'type': 'SOURCE'|'TARGET'
        },
    ],
    outputSourceConfig=[
        {
            'KMSArn': 'string',
            'outputS3Path': 'string'
        },
    ],
    roleArn='string',
    tags={
        'string': 'string'
    },
    workflowName='string'
)
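
The request syntax above can be filled in for a rule-based workflow roughly as follows. This is a minimal sketch, not a verified configuration: the helper name `build_rule_based_request`, the rule name `email-match`, the matching key `Email`, and every ARN and S3 path passed in are hypothetical placeholders, and whether the service accepts this particular combination of values is an assumption.

```python
def build_rule_based_request(workflow_name, source_arn, target_arn,
                             schema_name, output_s3_path, role_arn):
    """Assemble keyword arguments for a RULE_BASED ID mapping workflow.

    All argument values are supplied by the caller; nothing here contacts AWS.
    """
    return {
        "workflowName": workflow_name,
        "description": "Example rule-based ID mapping workflow",
        "roleArn": role_arn,
        "idMappingTechniques": {
            "idMappingType": "RULE_BASED",
            "ruleBasedProperties": {
                "attributeMatchingModel": "ONE_TO_ONE",
                "recordMatchingModel": "ONE_SOURCE_TO_ONE_TARGET",
                "ruleDefinitionType": "SOURCE",
                # Hypothetical rule: the key must exist in your schema mapping.
                "rules": [
                    {"ruleName": "email-match", "matchingKeys": ["Email"]},
                ],
            },
        },
        "inputSourceConfig": [
            {"inputSourceARN": source_arn, "schemaName": schema_name,
             "type": "SOURCE"},
            {"inputSourceARN": target_arn, "type": "TARGET"},
        ],
        "outputSourceConfig": [
            {"outputS3Path": output_s3_path},
        ],
    }
```

With boto3 installed and credentials configured, the resulting dictionary could be passed as `boto3.client("entityresolution").create_id_mapping_workflow(**request)`.
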
Parameters:
  • description (string) – A description of the workflow.

  • idMappingTechniques (dict) –

    [REQUIRED]

    An object which defines the ID mapping technique and any additional configurations.

    • idMappingType (string) – [REQUIRED]

      The type of ID mapping.

    • providerProperties (dict) –

      An object which defines any additional configurations required by the provider service.

      • intermediateSourceConfiguration (dict) –

        The Amazon S3 location that temporarily stores your data while it processes. Your information won’t be saved permanently.

        • intermediateS3Path (string) – [REQUIRED]

          The Amazon S3 location (bucket and prefix). For example: s3://provider_bucket/DOC-EXAMPLE-BUCKET

      • providerConfiguration (document) –

        The required configuration fields to use with the provider service.

      • providerServiceArn (string) – [REQUIRED]

        The ARN of the provider service.

    • ruleBasedProperties (dict) –

      An object which defines any additional configurations required by rule-based matching.

      • attributeMatchingModel (string) – [REQUIRED]

        The comparison type. You can either choose ONE_TO_ONE or MANY_TO_MANY as the attributeMatchingModel.

        If you choose MANY_TO_MANY, the system can match attributes across the sub-types of an attribute type. For example, if the value of the Email field of Profile A matches the value of the BusinessEmail field of Profile B, the two profiles are matched on the Email attribute type.

        If you choose ONE_TO_ONE, the system can only match attributes if the sub-types are an exact match. For example, for the Email attribute type, the system will only consider it a match if the value of the Email field of Profile A matches the value of the Email field of Profile B.

      • recordMatchingModel (string) – [REQUIRED]

        The type of matching record that is allowed to be used in an ID mapping workflow.

        If the value is set to ONE_SOURCE_TO_ONE_TARGET, only one record in the source can be matched to the same record in the target.

        If the value is set to MANY_SOURCE_TO_ONE_TARGET, multiple records in the source can be matched to one record in the target.

      • ruleDefinitionType (string) – [REQUIRED]

        The set of rules you can use in an ID mapping workflow. The limitations specified for the source or target to define the match rules must be compatible.

      • rules (list) –

        The rules that can be used for ID mapping.

        • (dict) –

          An object containing RuleName and MatchingKeys.

          • matchingKeys (list) – [REQUIRED]

            A list of MatchingKeys. The MatchingKeys must have been defined in the SchemaMapping. Two records are considered to match according to this rule if all of the MatchingKeys match.

            • (string) –

          • ruleName (string) – [REQUIRED]

            A name for the matching rule.

  • inputSourceConfig (list) –

    [REQUIRED]

    A list of InputSource objects, which have the fields InputSourceARN, SchemaName, and Type.

    • (dict) –

      An object containing InputSourceARN, SchemaName, and Type.

      • inputSourceARN (string) – [REQUIRED]

        An AWS Glue table Amazon Resource Name (ARN) or a matching workflow ARN for the input source table.

      • schemaName (string) –

        The name of the schema to be retrieved.

      • type (string) –

        The type of ID namespace. There are two types: SOURCE and TARGET.

        The SOURCE contains configurations for sourceId data that will be processed in an ID mapping workflow.

        The TARGET contains a configuration of targetId which all sourceIds will resolve to.

  • outputSourceConfig (list) –

    A list of IdMappingWorkflowOutputSource objects, each of which contains the fields OutputS3Path and KMSArn.

    • (dict) –

      The output source for the ID mapping workflow.

      • KMSArn (string) –

        Customer KMS ARN for encryption at rest. If not provided, the system uses an Entity Resolution managed KMS key.

      • outputS3Path (string) – [REQUIRED]

        The S3 path to which Entity Resolution will write the output table.

  • roleArn (string) – The Amazon Resource Name (ARN) of the IAM role. Entity Resolution assumes this role to create resources on your behalf as part of workflow execution.

  • tags (dict) –

    The tags used to organize, track, or control access for this resource.

    • (string) –

      • (string) –

  • workflowName (string) –

    [REQUIRED]

    The name of the workflow. There can’t be multiple IdMappingWorkflows with the same name.
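
The [REQUIRED] markers in the parameter list above can be checked before a call is made. The sketch below is one way to do that; the helper name `missing_required` and its dotted-path output format are hypothetical, and it only mirrors the markers on this page. botocore performs its own parameter validation before sending a request, so this is illustrative rather than necessary.

```python
def missing_required(kwargs):
    """Return dotted paths of documented [REQUIRED] fields absent from kwargs."""
    missing = []

    def need(container, key, prefix):
        # Record the full path when a required key is not present.
        if key not in container:
            missing.append(prefix + key)

    # Top-level required parameters.
    for key in ("idMappingTechniques", "inputSourceConfig", "workflowName"):
        need(kwargs, key, "")

    techniques = kwargs.get("idMappingTechniques", {})
    if "idMappingTechniques" in kwargs:
        need(techniques, "idMappingType", "idMappingTechniques.")

    # Nested fields are only required when their parent object is supplied.
    provider = techniques.get("providerProperties")
    if provider is not None:
        prefix = "idMappingTechniques.providerProperties."
        need(provider, "providerServiceArn", prefix)
        intermediate = provider.get("intermediateSourceConfiguration")
        if intermediate is not None:
            need(intermediate, "intermediateS3Path",
                 prefix + "intermediateSourceConfiguration.")

    rule_based = techniques.get("ruleBasedProperties")
    if rule_based is not None:
        prefix = "idMappingTechniques.ruleBasedProperties."
        for key in ("attributeMatchingModel", "recordMatchingModel",
                    "ruleDefinitionType"):
            need(rule_based, key, prefix)
        for i, rule in enumerate(rule_based.get("rules", [])):
            for key in ("matchingKeys", "ruleName"):
                need(rule, key, f"{prefix}rules[{i}].")

    for i, source in enumerate(kwargs.get("inputSourceConfig", [])):
        need(source, "inputSourceARN", f"inputSourceConfig[{i}].")
    for i, output in enumerate(kwargs.get("outputSourceConfig", [])):
        need(output, "outputS3Path", f"outputSourceConfig[{i}].")

    return missing
```

An empty list means every field marked [REQUIRED] on this page is present; it says nothing about whether the values themselves are valid.
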

Return type:

dict

Returns:

Response Syntax

{
    'description': 'string',
    'idMappingTechniques': {
        'idMappingType': 'PROVIDER'|'RULE_BASED',
        'providerProperties': {
            'intermediateSourceConfiguration': {
                'intermediateS3Path': 'string'
            },
            'providerConfiguration': {...}|[...]|123|123.4|'string'|True|None,
            'providerServiceArn': 'string'
        },
        'ruleBasedProperties': {
            'attributeMatchingModel': 'ONE_TO_ONE'|'MANY_TO_MANY',
            'recordMatchingModel': 'ONE_SOURCE_TO_ONE_TARGET'|'MANY_SOURCE_TO_ONE_TARGET',
            'ruleDefinitionType': 'SOURCE'|'TARGET',
            'rules': [
                {
                    'matchingKeys': [
                        'string',
                    ],
                    'ruleName': 'string'
                },
            ]
        }
    },
    'inputSourceConfig': [
        {
            'inputSourceARN': 'string',
            'schemaName': 'string',
            'type': 'SOURCE'|'TARGET'
        },
    ],
    'outputSourceConfig': [
        {
            'KMSArn': 'string',
            'outputS3Path': 'string'
        },
    ],
    'roleArn': 'string',
    'workflowArn': 'string',
    'workflowName': 'string'
}

Response Structure

  • (dict) –

    • description (string) –

      A description of the workflow.

    • idMappingTechniques (dict) –

      An object which defines the ID mapping technique and any additional configurations.

      • idMappingType (string) –

        The type of ID mapping.

      • providerProperties (dict) –

        An object which defines any additional configurations required by the provider service.

        • intermediateSourceConfiguration (dict) –

          The Amazon S3 location that temporarily stores your data while it processes. Your information won’t be saved permanently.

          • intermediateS3Path (string) –

            The Amazon S3 location (bucket and prefix). For example: s3://provider_bucket/DOC-EXAMPLE-BUCKET

        • providerConfiguration (document) –

          The required configuration fields to use with the provider service.

        • providerServiceArn (string) –

          The ARN of the provider service.

      • ruleBasedProperties (dict) –

        An object which defines any additional configurations required by rule-based matching.

        • attributeMatchingModel (string) –

          The comparison type. You can either choose ONE_TO_ONE or MANY_TO_MANY as the attributeMatchingModel.

          If you choose MANY_TO_MANY, the system can match attributes across the sub-types of an attribute type. For example, if the value of the Email field of Profile A matches the value of the BusinessEmail field of Profile B, the two profiles are matched on the Email attribute type.

          If you choose ONE_TO_ONE, the system can only match attributes if the sub-types are an exact match. For example, for the Email attribute type, the system will only consider it a match if the value of the Email field of Profile A matches the value of the Email field of Profile B.

        • recordMatchingModel (string) –

          The type of matching record that is allowed to be used in an ID mapping workflow.

          If the value is set to ONE_SOURCE_TO_ONE_TARGET, only one record in the source can be matched to the same record in the target.

          If the value is set to MANY_SOURCE_TO_ONE_TARGET, multiple records in the source can be matched to one record in the target.

        • ruleDefinitionType (string) –

          The set of rules you can use in an ID mapping workflow. The limitations specified for the source or target to define the match rules must be compatible.

        • rules (list) –

          The rules that can be used for ID mapping.

          • (dict) –

            An object containing RuleName and MatchingKeys.

            • matchingKeys (list) –

              A list of MatchingKeys. The MatchingKeys must have been defined in the SchemaMapping. Two records are considered to match according to this rule if all of the MatchingKeys match.

              • (string) –

            • ruleName (string) –

              A name for the matching rule.

    • inputSourceConfig (list) –

      A list of InputSource objects, which have the fields InputSourceARN, SchemaName, and Type.

      • (dict) –

        An object containing InputSourceARN, SchemaName, and Type.

        • inputSourceARN (string) –

          An AWS Glue table Amazon Resource Name (ARN) or a matching workflow ARN for the input source table.

        • schemaName (string) –

          The name of the schema to be retrieved.

        • type (string) –

          The type of ID namespace. There are two types: SOURCE and TARGET.

          The SOURCE contains configurations for sourceId data that will be processed in an ID mapping workflow.

          The TARGET contains a configuration of targetId which all sourceIds will resolve to.

    • outputSourceConfig (list) –

      A list of IdMappingWorkflowOutputSource objects, each of which contains the fields OutputS3Path and KMSArn.

      • (dict) –

        The output source for the ID mapping workflow.

        • KMSArn (string) –

          Customer KMS ARN for encryption at rest. If not provided, the system uses an Entity Resolution managed KMS key.

        • outputS3Path (string) –

          The S3 path to which Entity Resolution will write the output table.

    • roleArn (string) –

      The Amazon Resource Name (ARN) of the IAM role. Entity Resolution assumes this role to create resources on your behalf as part of workflow execution.

    • workflowArn (string) –

      The Amazon Resource Name (ARN) that Entity Resolution generated for the IdMappingWorkflow.

    • workflowName (string) –

      The name of the workflow.
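
The response structure above is a plain dictionary, so the fields of interest can be read directly. The following sketch pulls out a few of them; the helper name `summarize_workflow`, the keys of the summary it returns, and the sample values (including the ARN) are made up for illustration. Optional fields are read with `.get` on the assumption that they may be absent from a response.

```python
def summarize_workflow(response):
    """Extract a few commonly needed fields from a create response."""
    techniques = response.get("idMappingTechniques", {})
    rules = techniques.get("ruleBasedProperties", {}).get("rules", [])
    return {
        "arn": response["workflowArn"],
        "name": response["workflowName"],
        "type": techniques.get("idMappingType"),
        "rule_names": [rule["ruleName"] for rule in rules],
        "output_paths": [
            out["outputS3Path"]
            for out in response.get("outputSourceConfig", [])
        ],
    }


# Hand-written sample shaped like the Response Syntax; not real service output.
sample_response = {
    "workflowArn": "arn:aws:entityresolution:us-east-1:111122223333:idmappingworkflow/example",
    "workflowName": "example",
    "idMappingTechniques": {
        "idMappingType": "RULE_BASED",
        "ruleBasedProperties": {
            "attributeMatchingModel": "ONE_TO_ONE",
            "recordMatchingModel": "ONE_SOURCE_TO_ONE_TARGET",
            "ruleDefinitionType": "SOURCE",
            "rules": [{"ruleName": "email-match", "matchingKeys": ["Email"]}],
        },
    },
    "inputSourceConfig": [{"inputSourceARN": "arn:src", "type": "SOURCE"}],
    "outputSourceConfig": [{"outputS3Path": "s3://example-bucket/out/"}],
}

summary = summarize_workflow(sample_response)
```
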

Exceptions

  • EntityResolution.Client.exceptions.ThrottlingException

  • EntityResolution.Client.exceptions.InternalServerException

  • EntityResolution.Client.exceptions.AccessDeniedException

  • EntityResolution.Client.exceptions.ExceedsLimitException

  • EntityResolution.Client.exceptions.ConflictException

  • EntityResolution.Client.exceptions.ValidationException
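
The exception classes listed above are available on the client object as `client.exceptions.<Name>`. As a minimal sketch, the helper below retries only on ThrottlingException with exponential backoff and lets every other exception propagate. The helper name `create_with_retry`, the attempt count, and the delays are hypothetical choices, not service recommendations, and botocore's built-in retry configuration may already cover this case.

```python
import time


def create_with_retry(client, kwargs, max_attempts=3, base_delay=1.0,
                      sleep=time.sleep):
    """Call create_id_mapping_workflow, retrying when the call is throttled.

    `client` is expected to be an Entity Resolution client, for example one
    returned by boto3.client("entityresolution"). `sleep` is injectable so
    the backoff can be observed without actually waiting.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return client.create_id_mapping_workflow(**kwargs)
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the throttling error.
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Other listed exceptions, such as ValidationException or ConflictException, are deliberately not retried here because repeating the same request is unlikely to change the outcome.
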