describe_pii_entities_detection_job
- Comprehend.Client.describe_pii_entities_detection_job(**kwargs)
Gets the properties associated with a PII entities detection job. For example, you can use this operation to get the job status.
See also: AWS API Documentation
Request Syntax
response = client.describe_pii_entities_detection_job(
    JobId='string'
)
- Parameters:
JobId (string) –
[REQUIRED]
The identifier that Amazon Comprehend generated for the job. The operation returns this identifier in its response.
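As a minimal sketch of the call above, the helper below reads only the job status from the response. The function name `get_job_status` is hypothetical and not part of boto3; the `client` argument is assumed to be a Comprehend client created with `boto3.client("comprehend")`.

```python
def get_job_status(client, job_id):
    """Return the JobStatus string for one PII entities detection job.

    `client` is assumed to be a boto3 Comprehend client; `get_job_status`
    itself is an illustrative helper, not a boto3 function.
    """
    response = client.describe_pii_entities_detection_job(JobId=job_id)
    return response["PiiEntitiesDetectionJobProperties"]["JobStatus"]


# Typical use (needs AWS credentials and a configured Region):
#   import boto3
#   comprehend = boto3.client("comprehend")
#   status = get_job_status(comprehend, job_id)  # job_id from the start call
```

Passing the client in as an argument keeps the helper easy to exercise with a stand-in object during testing.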
- Return type:
dict
- Returns:
Response Syntax
{
    'PiiEntitiesDetectionJobProperties': {
        'JobId': 'string',
        'JobArn': 'string',
        'JobName': 'string',
        'JobStatus': 'SUBMITTED'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOP_REQUESTED'|'STOPPED',
        'Message': 'string',
        'SubmitTime': datetime(2015, 1, 1),
        'EndTime': datetime(2015, 1, 1),
        'InputDataConfig': {
            'S3Uri': 'string',
            'InputFormat': 'ONE_DOC_PER_FILE'|'ONE_DOC_PER_LINE',
            'DocumentReaderConfig': {
                'DocumentReadAction': 'TEXTRACT_DETECT_DOCUMENT_TEXT'|'TEXTRACT_ANALYZE_DOCUMENT',
                'DocumentReadMode': 'SERVICE_DEFAULT'|'FORCE_DOCUMENT_READ_ACTION',
                'FeatureTypes': [
                    'TABLES'|'FORMS',
                ]
            }
        },
        'OutputDataConfig': {
            'S3Uri': 'string',
            'KmsKeyId': 'string'
        },
        'RedactionConfig': {
            'PiiEntityTypes': [
                'BANK_ACCOUNT_NUMBER'|'BANK_ROUTING'|'CREDIT_DEBIT_NUMBER'|'CREDIT_DEBIT_CVV'|'CREDIT_DEBIT_EXPIRY'|'PIN'|'EMAIL'|'ADDRESS'|'NAME'|'PHONE'|'SSN'|'DATE_TIME'|'PASSPORT_NUMBER'|'DRIVER_ID'|'URL'|'AGE'|'USERNAME'|'PASSWORD'|'AWS_ACCESS_KEY'|'AWS_SECRET_KEY'|'IP_ADDRESS'|'MAC_ADDRESS'|'ALL'|'LICENSE_PLATE'|'VEHICLE_IDENTIFICATION_NUMBER'|'UK_NATIONAL_INSURANCE_NUMBER'|'CA_SOCIAL_INSURANCE_NUMBER'|'US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER'|'UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER'|'IN_PERMANENT_ACCOUNT_NUMBER'|'IN_NREGA'|'INTERNATIONAL_BANK_ACCOUNT_NUMBER'|'SWIFT_CODE'|'UK_NATIONAL_HEALTH_SERVICE_NUMBER'|'CA_HEALTH_NUMBER'|'IN_AADHAAR'|'IN_VOTER_NUMBER',
            ],
            'MaskMode': 'MASK'|'REPLACE_WITH_PII_ENTITY_TYPE',
            'MaskCharacter': 'string'
        },
        'LanguageCode': 'en'|'es'|'fr'|'de'|'it'|'pt'|'ar'|'hi'|'ja'|'ko'|'zh'|'zh-TW',
        'DataAccessRoleArn': 'string',
        'Mode': 'ONLY_REDACTION'|'ONLY_OFFSETS'
    }
}
Response Structure
(dict) –
PiiEntitiesDetectionJobProperties (dict) –
Provides information about a PII entities detection job.
JobId (string) –
The identifier assigned to the PII entities detection job.
JobArn (string) –
The Amazon Resource Name (ARN) of the PII entities detection job. It is a unique, fully qualified identifier for the job. It includes the Amazon Web Services account, Amazon Web Services Region, and the job ID. The format of the ARN is as follows:
arn:<partition>:comprehend:<region>:<account-id>:pii-entities-detection-job/<job-id>
The following is an example job ARN:
arn:aws:comprehend:us-west-2:111122223333:pii-entities-detection-job/1234abcd12ab34cd56ef1234567890ab
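The ARN format above can be split into its parts with plain string handling. This is a sketch only: `JobArn` and `parse_job_arn` are hypothetical names introduced for illustration, and the parser assumes the ARN follows exactly the documented layout.

```python
from typing import NamedTuple


class JobArn(NamedTuple):
    """Components of a PII entities detection job ARN (illustrative type)."""
    partition: str
    region: str
    account_id: str
    job_id: str


def parse_job_arn(arn: str) -> JobArn:
    """Split arn:<partition>:comprehend:<region>:<account-id>:pii-entities-detection-job/<job-id>."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "comprehend":
        raise ValueError(f"not a Comprehend ARN: {arn!r}")
    resource_type, _, job_id = parts[5].partition("/")
    if resource_type != "pii-entities-detection-job" or not job_id:
        raise ValueError(f"not a PII entities detection job ARN: {arn!r}")
    return JobArn(parts[1], parts[3], parts[4], job_id)


example = parse_job_arn(
    "arn:aws:comprehend:us-west-2:111122223333:"
    "pii-entities-detection-job/1234abcd12ab34cd56ef1234567890ab"
)
print(example.region, example.job_id)
```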
JobName (string) –
The name that you assigned the PII entities detection job.
JobStatus (string) –
The current status of the PII entities detection job. If the status is FAILED, the Message field shows the reason for the failure.
Message (string) –
A description of the status of a job.
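One common way to use JobStatus and Message together is to poll until the job stops changing. The sketch below is an assumption-laden illustration: `wait_for_job` is a hypothetical helper, the polling interval and retry count are arbitrary, and treating COMPLETED, FAILED, and STOPPED as the final states is inferred from the status values listed in the response syntax rather than stated by this page.

```python
import time

# Assumption: these statuses are final; SUBMITTED, IN_PROGRESS and
# STOP_REQUESTED are treated as still changing.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "STOPPED"}


def wait_for_job(client, job_id, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll the job until it reaches a final status and return its properties.

    Raises RuntimeError with the Message field if the job FAILED, and
    TimeoutError if no final status is seen within max_polls calls.
    """
    for attempt in range(max_polls):
        response = client.describe_pii_entities_detection_job(JobId=job_id)
        props = response["PiiEntitiesDetectionJobProperties"]
        status = props["JobStatus"]
        if status == "FAILED":
            reason = props.get("Message", "no message returned")
            raise RuntimeError(f"Job {job_id} failed: {reason}")
        if status in TERMINAL_STATUSES:
            return props
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before asking again
    raise TimeoutError(f"Job {job_id} not finished after {max_polls} polls")
```

The `sleep` parameter exists only so the wait can be replaced in tests; production callers can leave it at the default.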
SubmitTime (datetime) –
The time that the PII entities detection job was submitted for processing.
EndTime (datetime) –
The time that the PII entities detection job completed.
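Because SubmitTime and EndTime are returned as datetime objects, the elapsed time of a job can be computed by subtraction. A small sketch, assuming EndTime may be missing from the properties while the job is still running; `job_duration` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def job_duration(props: dict) -> Optional[timedelta]:
    """Return EndTime - SubmitTime, or None if either timestamp is missing."""
    submit_time = props.get("SubmitTime")
    end_time = props.get("EndTime")
    if submit_time is None or end_time is None:
        return None
    return end_time - submit_time


# Example with made-up timestamps:
sample = {
    "SubmitTime": datetime(2015, 1, 1, 12, 0, tzinfo=timezone.utc),
    "EndTime": datetime(2015, 1, 1, 12, 7, 30, tzinfo=timezone.utc),
}
print(job_duration(sample))
```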
InputDataConfig (dict) –
The input properties for a PII entities detection job.
S3Uri (string) –
The Amazon S3 URI for the input data. The URI must be in the same Region as the API endpoint that you are calling. The URI can point to a single input file or it can provide the prefix for a collection of data files.
For example, if you use the URI S3://bucketName/prefix and the prefix is a single file, Amazon Comprehend uses that file as input. If more than one file begins with the prefix, Amazon Comprehend uses all of them as input.
InputFormat (string) –
Specifies how the text in an input file should be processed:
ONE_DOC_PER_FILE - Each file is considered a separate document. Use this option when you are processing large documents, such as newspaper articles or scientific papers.
ONE_DOC_PER_LINE - Each line in a file is considered a separate document. Use this option when you are processing many short documents, such as text messages.
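The prefix behavior of the input S3Uri described above can be illustrated locally. This sketch does not call Amazon S3: `split_s3_uri` and `select_input_keys` are hypothetical helpers, and the key list is made-up data standing in for the objects in a bucket.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split an S3 URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)  # urlparse lowercases the scheme, so S3:// works
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def select_input_keys(uri, keys):
    """Return the object keys that begin with the prefix in the URI."""
    _, prefix = split_s3_uri(uri)
    return [key for key in keys if key.startswith(prefix)]


# Made-up object keys for illustration.
keys = ["docs/a.txt", "docs/ab.txt", "docs/b.txt"]
print(select_input_keys("s3://bucketName/docs/a", keys))
```

With a prefix that names exactly one object, the selection contains only that file; with a shorter prefix, every key that begins with it is included.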
DocumentReaderConfig (dict) –
Provides configuration parameters to override the default actions for extracting text from PDF documents and image files.
DocumentReadAction (string) –
This field defines the Amazon Textract API operation that Amazon Comprehend uses to extract text from PDF files and image files. Enter one of the following values:
TEXTRACT_DETECT_DOCUMENT_TEXT - The Amazon Comprehend service uses the DetectDocumentText API operation.
TEXTRACT_ANALYZE_DOCUMENT - The Amazon Comprehend service uses the AnalyzeDocument API operation.
DocumentReadMode (string) –
Determines the text extraction actions for PDF files. Enter one of the following values:
SERVICE_DEFAULT - Use the Amazon Comprehend service defaults for PDF files.
FORCE_DOCUMENT_READ_ACTION - Amazon Comprehend uses the Textract API specified by DocumentReadAction for all PDF files, including digital PDF files.
FeatureTypes (list) –
Specifies the type of Amazon Textract features to apply. If you chose TEXTRACT_ANALYZE_DOCUMENT as the read action, you must specify one or both of the following values:
TABLES - Returns information about any tables that are detected in the input document.
FORMS - Returns information and the data from any forms that are detected in the input document.
(string) –
Specifies the type of Amazon Textract features to apply. If you chose TEXTRACT_ANALYZE_DOCUMENT as the read action, you must specify one or both of the following values:
TABLES - Returns additional information about any tables that are detected in the input document.
FORMS - Returns additional information about any forms that are detected in the input document.
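The FeatureTypes rule above (at least one of TABLES or FORMS when the read action is TEXTRACT_ANALYZE_DOCUMENT) can be checked on a DocumentReaderConfig dictionary before or after a job runs. This is a local sketch; `check_document_reader_config` is a hypothetical helper and does not reproduce the service's own validation.

```python
ALLOWED_FEATURE_TYPES = {"TABLES", "FORMS"}


def check_document_reader_config(config):
    """Return a list of problems found in a DocumentReaderConfig dict."""
    problems = []
    action = config.get("DocumentReadAction")
    features = config.get("FeatureTypes", [])
    if action == "TEXTRACT_ANALYZE_DOCUMENT" and not features:
        problems.append("TEXTRACT_ANALYZE_DOCUMENT requires TABLES, FORMS, or both")
    unknown = sorted(set(features) - ALLOWED_FEATURE_TYPES)
    if unknown:
        problems.append(f"unknown FeatureTypes: {unknown}")
    return problems


print(check_document_reader_config({
    "DocumentReadAction": "TEXTRACT_ANALYZE_DOCUMENT",
    "DocumentReadMode": "SERVICE_DEFAULT",
    "FeatureTypes": ["TABLES"],
}))
```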
OutputDataConfig (dict) –
The output data configuration that you supplied when you created the PII entities detection job.
S3Uri (string) –
When you use the PiiOutputDataConfig object with asynchronous operations, you specify the Amazon S3 location where you want to write the output data.
For a PII entity detection job, the output file is plain text, not a compressed archive. The output file name is the same as the input file, with .out appended at the end.
KmsKeyId (string) –
ID for the Amazon Web Services Key Management Service (KMS) key that Amazon Comprehend uses to encrypt the output results from an analysis job.
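The output naming rule for OutputDataConfig (input file name with .out appended) can be written as a one-line helper. This sketch derives only the file name; the folder layout under the output S3Uri is not described on this page and is deliberately left out. `expected_output_name` is a hypothetical name.

```python
import posixpath


def expected_output_name(input_key):
    """Return the input object's file name with '.out' appended."""
    return posixpath.basename(input_key) + ".out"


print(expected_output_name("docs/notes.txt"))
```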
RedactionConfig (dict) –
Provides configuration parameters for PII entity redaction.
This parameter is required if you set the Mode parameter to ONLY_REDACTION. In that case, you must provide a RedactionConfig definition that includes the PiiEntityTypes parameter.
PiiEntityTypes (list) –
An array of the types of PII entities that Amazon Comprehend detects in the input text for your request.
(string) –
MaskMode (string) –
Specifies whether the PII entity is redacted with the mask character or the entity type.
MaskCharacter (string) –
A character that replaces each character in the redacted PII entity.
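To make the difference between the two MaskMode values concrete, the sketch below applies them to a local string. It is an illustration, not the service's implementation: `redact` is a hypothetical function, the entity dictionaries are modelled on the BeginOffset, EndOffset, and Type fields of Comprehend PII entities, and the bracketed form used for REPLACE_WITH_PII_ENTITY_TYPE is an assumption about how the replacement is rendered.

```python
def redact(text, entities, mask_mode="MASK", mask_character="*"):
    """Redact non-overlapping entity spans in text according to mask_mode."""
    pieces = []
    cursor = 0
    for entity in sorted(entities, key=lambda e: e["BeginOffset"]):
        begin, end = entity["BeginOffset"], entity["EndOffset"]
        pieces.append(text[cursor:begin])  # keep text before the entity
        if mask_mode == "MASK":
            # One mask character per character of the entity.
            pieces.append(mask_character * (end - begin))
        elif mask_mode == "REPLACE_WITH_PII_ENTITY_TYPE":
            # Assumed rendering: the entity type in square brackets.
            pieces.append(f"[{entity['Type']}]")
        else:
            raise ValueError(f"unknown MaskMode: {mask_mode!r}")
        cursor = end
    pieces.append(text[cursor:])  # keep text after the last entity
    return "".join(pieces)


text = "Call Jane at 555-0100"
entities = [
    {"Type": "NAME", "BeginOffset": 5, "EndOffset": 9},
    {"Type": "PHONE", "BeginOffset": 13, "EndOffset": 21},
]
print(redact(text, entities))
print(redact(text, entities, mask_mode="REPLACE_WITH_PII_ENTITY_TYPE"))
```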
LanguageCode (string) –
The language code of the input documents.
DataAccessRoleArn (string) –
The Amazon Resource Name (ARN) of the IAM role that grants Amazon Comprehend read access to your input data.
Mode (string) –
Specifies whether the output provides the locations (offsets) of PII entities or a file in which PII entities are redacted.
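The relationship between Mode and RedactionConfig described earlier (ONLY_REDACTION requires a RedactionConfig that includes PiiEntityTypes) can be expressed as a small consistency check on the returned properties. `redaction_settings_consistent` is a hypothetical helper for illustration only.

```python
def redaction_settings_consistent(props):
    """Return False if Mode is ONLY_REDACTION but PiiEntityTypes is missing or empty."""
    mode = props.get("Mode")
    redaction_config = props.get("RedactionConfig") or {}
    if mode == "ONLY_REDACTION":
        return bool(redaction_config.get("PiiEntityTypes"))
    # ONLY_OFFSETS does not need a RedactionConfig.
    return True


print(redaction_settings_consistent({
    "Mode": "ONLY_REDACTION",
    "RedactionConfig": {"PiiEntityTypes": ["NAME", "EMAIL"], "MaskMode": "MASK", "MaskCharacter": "*"},
}))
```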
Exceptions
Comprehend.Client.exceptions.InvalidRequestException
Comprehend.Client.exceptions.JobNotFoundException
Comprehend.Client.exceptions.TooManyRequestsException
Comprehend.Client.exceptions.InternalServerException
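The exceptions listed above are available on the client's `exceptions` attribute and can be caught individually. A minimal sketch that turns JobNotFoundException into a None result and lets the other exceptions propagate; `describe_job_or_none` is a hypothetical helper, and `client` is assumed to be a boto3 Comprehend client.

```python
def describe_job_or_none(client, job_id):
    """Return the job properties, or None if the job does not exist."""
    try:
        response = client.describe_pii_entities_detection_job(JobId=job_id)
    except client.exceptions.JobNotFoundException:
        # Unknown job ID: report absence instead of raising.
        return None
    return response["PiiEntitiesDetectionJobProperties"]


# Typical use (needs AWS credentials and a configured Region):
#   import boto3
#   comprehend = boto3.client("comprehend")
#   props = describe_job_or_none(comprehend, job_id)
```

Whether a missing job should be an error or a None result depends on the caller; the same pattern applies to TooManyRequestsException if a retry is preferred.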