start_ml_labeling_set_generation_task_run

Glue.Client.start_ml_labeling_set_generation_task_run(**kwargs)

Starts the active learning workflow for your machine learning transform to improve the transform’s quality by generating label sets and adding labels.

When the StartMLLabelingSetGenerationTaskRun task run finishes, Glue will have generated a “labeling set”, that is, a set of questions for humans to answer.

In the case of the FindMatches transform, these questions are of the form, “What is the correct way to group these rows together into groups composed entirely of matching records?”

After the labeling process is finished, you can upload your labels with a call to StartImportLabelsTaskRun. After StartImportLabelsTaskRun finishes, all future runs of the machine learning transform will use the new and improved labels and perform a higher-quality transformation.
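The workflow described above can be sketched roughly as follows. This is a minimal illustration, not an official recipe: the helper names wait_for_labeling_set and import_labels, the polling interval, and the retry limit are all assumptions made for the example. The client argument is expected to be a boto3 Glue client, and the sketch relies on the get_ml_task_run and start_import_labels_task_run operations of that client.

```python
import time


def wait_for_labeling_set(client, transform_id, output_s3_path,
                          poll_seconds=30, max_polls=120):
    """Start labeling set generation and poll until the task run ends.

    `client` is assumed to be a boto3 Glue client, for example
    boto3.client("glue"). Returns the TaskRunId once the run succeeds
    and raises RuntimeError if it fails, stops, or times out.
    """
    response = client.start_ml_labeling_set_generation_task_run(
        TransformId=transform_id,
        OutputS3Path=output_s3_path,
    )
    task_run_id = response["TaskRunId"]

    # Poll the task run status; poll_seconds and max_polls are
    # illustrative defaults, not values recommended by the service.
    for _ in range(max_polls):
        run = client.get_ml_task_run(
            TransformId=transform_id,
            TaskRunId=task_run_id,
        )
        status = run["Status"]
        if status == "SUCCEEDED":
            return task_run_id
        if status in ("FAILED", "STOPPED", "TIMEOUT"):
            detail = run.get("ErrorString", "no error string returned")
            raise RuntimeError(
                f"Task run {task_run_id} ended with status {status}: {detail}"
            )
        time.sleep(poll_seconds)

    raise RuntimeError(f"Gave up waiting for task run {task_run_id}")


def import_labels(client, transform_id, input_s3_path, replace_all=False):
    """Upload the completed labels and return the import TaskRunId."""
    response = client.start_import_labels_task_run(
        TransformId=transform_id,
        InputS3Path=input_s3_path,
        ReplaceAllLabels=replace_all,
    )
    return response["TaskRunId"]


# Hypothetical usage (identifiers and paths are placeholders):
#   import boto3
#   glue = boto3.client("glue")
#   wait_for_labeling_set(glue, "tfm-example", "s3://example-bucket/labeling/")
#   ... humans label the generated file ...
#   import_labels(glue, "tfm-example", "s3://example-bucket/labels/labeled.csv")
```

Taking the client as a parameter keeps the helpers independent of how credentials and regions are configured.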

See also: AWS API Documentation

Request Syntax

response = client.start_ml_labeling_set_generation_task_run(
    TransformId='string',
    OutputS3Path='string'
)

Parameters:
  • TransformId (string) –

    [REQUIRED]

    The unique identifier of the machine learning transform.

  • OutputS3Path (string) –

    [REQUIRED]

    The Amazon Simple Storage Service (Amazon S3) path where the labeling set is generated.

Return type:

dict

Returns:

Response Syntax

{
    'TaskRunId': 'string'
}

Response Structure

  • (dict) –

    • TaskRunId (string) –

      The unique run identifier that is associated with this task run.

Exceptions

  • Glue.Client.exceptions.EntityNotFoundException

  • Glue.Client.exceptions.InvalidInputException

  • Glue.Client.exceptions.OperationTimeoutException

  • Glue.Client.exceptions.InternalServiceException

  • Glue.Client.exceptions.ConcurrentRunsExceededException
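One possible way to handle the exceptions listed above is sketched below. The helper name start_labeling_set_generation and the outcome labels it returns ("ok", "not_found", "invalid_input", "retry_later") are assumptions made for this example; grouping the timeout, internal service, and concurrent-runs errors as retryable is a judgment call, not documented service behavior. The exception classes are read from client.exceptions, as exposed by a boto3 Glue client.

```python
def start_labeling_set_generation(client, transform_id, output_s3_path):
    """Start a labeling set generation task run and classify the outcome.

    Returns a (label, task_run_id) tuple. The labels are hypothetical
    names chosen for this sketch; task_run_id is None unless the label
    is "ok".
    """
    exc = client.exceptions
    try:
        response = client.start_ml_labeling_set_generation_task_run(
            TransformId=transform_id,
            OutputS3Path=output_s3_path,
        )
    except exc.EntityNotFoundException:
        # The transform ID does not refer to an existing transform.
        return ("not_found", None)
    except exc.InvalidInputException:
        # A parameter was rejected, for example a malformed S3 path.
        return ("invalid_input", None)
    except (exc.OperationTimeoutException,
            exc.InternalServiceException,
            exc.ConcurrentRunsExceededException):
        # Assumed here to be transient; the caller may retry later.
        return ("retry_later", None)
    return ("ok", response["TaskRunId"])
```

Any other exception, such as a credentials or network error, propagates to the caller unchanged.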