get_data_quality_rule_recommendation_run

Glue.Client.get_data_quality_rule_recommendation_run(**kwargs)

Gets the specified recommendation run that was used to generate rules.

See also: AWS API Documentation

Request Syntax

response = client.get_data_quality_rule_recommendation_run(
    RunId='string'
)
Parameters
RunId (string) --

[REQUIRED]

The unique run identifier associated with this run.

Return type
dict
Returns
Response Syntax
{
    'RunId': 'string',
    'DataSource': {
        'GlueTable': {
            'DatabaseName': 'string',
            'TableName': 'string',
            'CatalogId': 'string',
            'ConnectionName': 'string',
            'AdditionalOptions': {
                'string': 'string'
            }
        }
    },
    'Role': 'string',
    'NumberOfWorkers': 123,
    'Timeout': 123,
    'Status': 'STARTING'|'RUNNING'|'STOPPING'|'STOPPED'|'SUCCEEDED'|'FAILED'|'TIMEOUT',
    'ErrorString': 'string',
    'StartedOn': datetime(2015, 1, 1),
    'LastModifiedOn': datetime(2015, 1, 1),
    'CompletedOn': datetime(2015, 1, 1),
    'ExecutionTime': 123,
    'RecommendedRuleset': 'string',
    'CreatedRulesetName': 'string'
}

Response Structure

  • (dict) --
    • RunId (string) --

      The unique run identifier associated with this run.

    • DataSource (dict) --

      The data source (an Glue table) associated with this run.

      • GlueTable (dict) --

        An Glue table.

        • DatabaseName (string) --

          A database name in the Glue Data Catalog.

        • TableName (string) --

          A table name in the Glue Data Catalog.

        • CatalogId (string) --

          A unique identifier for the Glue Data Catalog.

        • ConnectionName (string) --

          The name of the connection to the Glue Data Catalog.

        • AdditionalOptions (dict) --

          Additional options for the table. Currently there are two keys supported:

          • pushDownPredicate : to filter on partitions without having to list and read all the files in your dataset.
          • catalogPartitionPredicate : to use server-side partition pruning using partition indexes in the Glue Data Catalog.
          • (string) --
            • (string) --
    • Role (string) --

      An IAM role supplied to encrypt the results of the run.

    • NumberOfWorkers (integer) --

      The number of G.1X workers to be used in the run. The default is 5.

    • Timeout (integer) --

      The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

    • Status (string) --

      The status for this run.

    • ErrorString (string) --

      The error strings that are associated with the run.

    • StartedOn (datetime) --

      The date and time when this run started.

    • LastModifiedOn (datetime) --

      A timestamp. The last point in time when this data quality rule recommendation run was modified.

    • CompletedOn (datetime) --

      The date and time when this run was completed.

    • ExecutionTime (integer) --

      The amount of time (in seconds) that the run consumed resources.

    • RecommendedRuleset (string) --

      When a start rule recommendation run completes, it creates a recommended ruleset (a set of rules). This member has those rules in Data Quality Definition Language (DQDL) format.

    • CreatedRulesetName (string) --

      The name of the ruleset that was created by the run.

Exceptions

  • Glue.Client.exceptions.EntityNotFoundException
  • Glue.Client.exceptions.InvalidInputException
  • Glue.Client.exceptions.OperationTimeoutException
  • Glue.Client.exceptions.InternalServiceException