A low-level client representing Amazon Transcribe Service:
import boto3
client = boto3.client('transcribe')
These are the available methods:
Check if an operation can be paginated.
Creates a new custom vocabulary that you can use to change the way Amazon Transcribe handles transcription of an audio file.
See also: AWS API Documentation
Request Syntax
response = client.create_vocabulary(
    VocabularyName='string',
    LanguageCode='en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    Phrases=[
        'string',
    ],
    VocabularyFileUri='string'
)
[REQUIRED]
The name of the vocabulary. The name must be unique within an AWS account. The name is case-sensitive.
[REQUIRED]
The language code of the vocabulary entries.
An array of strings that contains the vocabulary entries.
The S3 location of the text file that contains the definition of the custom vocabulary. The URI must be in the same region as the API endpoint that you are calling. The general form is
https://s3.<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>
For example:
https://s3.us-east-1.amazonaws.com/examplebucket/vocab.txt
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.
For more information about custom vocabularies, see Custom Vocabularies.
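As a minimal sketch of the general URI form above, the helper below assembles a VocabularyFileUri value from its parts. The function name is hypothetical, and percent-encoding the object key is an assumption rather than a documented requirement.

```python
from urllib.parse import quote


def vocabulary_file_uri(region, bucket, key):
    """Build an HTTPS S3 URI in the general form shown above (hypothetical helper)."""
    # Assumption: the object key is percent-encoded. quote() leaves "/"
    # unescaped by default, so key prefixes stay intact.
    return f"https://s3.{region}.amazonaws.com/{bucket}/{quote(key)}"


print(vocabulary_file_uri("us-east-1", "examplebucket", "vocab.txt"))
```

The result can be passed as VocabularyFileUri to create_vocabulary, provided the region matches the API endpoint being called.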
dict
Response Syntax
{
    'VocabularyName': 'string',
    'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    'VocabularyState': 'PENDING'|'READY'|'FAILED',
    'LastModifiedTime': datetime(2015, 1, 1),
    'FailureReason': 'string'
}
Response Structure
(dict) --
VocabularyName (string) --
The name of the vocabulary.
LanguageCode (string) --
The language code of the vocabulary entries.
VocabularyState (string) --
The processing state of the vocabulary. When the VocabularyState field contains READY the vocabulary is ready to be used in a StartTranscriptionJob request.
LastModifiedTime (datetime) --
The date and time that the vocabulary was created.
FailureReason (string) --
If the VocabularyState field is FAILED, this field contains information about why the job failed.
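Because a new vocabulary starts in the PENDING state, callers usually poll until it becomes READY or FAILED. The following is a sketch only: the helper name, the delay, and the attempt limit are assumptions, and `client` is expected to be a Transcribe client such as the one returned by boto3.client('transcribe').

```python
import time


def wait_for_vocabulary(client, vocabulary_name, delay=5.0, max_attempts=60,
                        sleep=time.sleep):
    """Poll get_vocabulary until the vocabulary is READY (hypothetical helper)."""
    for _ in range(max_attempts):
        response = client.get_vocabulary(VocabularyName=vocabulary_name)
        state = response["VocabularyState"]
        if state == "READY":
            return response
        if state == "FAILED":
            # FailureReason explains why processing failed.
            raise RuntimeError(response.get("FailureReason", "vocabulary processing failed"))
        sleep(delay)  # still PENDING; wait before asking again
    raise TimeoutError(
        f"vocabulary {vocabulary_name!r} still PENDING after {max_attempts} polls"
    )
```

A typical call would follow create_vocabulary, for example wait_for_vocabulary(client, 'my-vocabulary'), where the vocabulary name is a placeholder.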
Creates a new vocabulary filter that you can use to filter words, such as profane words, from the output of a transcription job.
See also: AWS API Documentation
Request Syntax
response = client.create_vocabulary_filter(
    VocabularyFilterName='string',
    LanguageCode='en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    Words=[
        'string',
    ],
    VocabularyFilterFileUri='string'
)
[REQUIRED]
The vocabulary filter name. The name must be unique within the account that contains it.
[REQUIRED]
The language code of the words in the vocabulary filter. All words in the filter must be in the same language. The vocabulary filter can only be used with transcription jobs in the specified language.
The words to use in the vocabulary filter. Only use characters from the character set defined for custom vocabularies. For a list of character sets, see Character Sets for Custom Vocabularies.
If you provide a list of words in the Words parameter, you can't use the VocabularyFilterFileUri parameter.
The Amazon S3 location of a text file used as input to create the vocabulary filter. Only use characters from the character set defined for custom vocabularies. For a list of character sets, see Character Sets for Custom Vocabularies.
The specified file must be less than 50 KB of UTF-8 characters.
If you provide the location of a list of words in the VocabularyFilterFileUri parameter, you can't use the Words parameter.
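Since Words and VocabularyFilterFileUri cannot be combined, it can help to assemble the request arguments in one place and reject the invalid combination before calling the service. This is a sketch under that assumption; the helper name is hypothetical.

```python
def vocabulary_filter_request(name, language_code, words=None, file_uri=None):
    """Assemble keyword arguments for create_vocabulary_filter (hypothetical helper)."""
    if words is not None and file_uri is not None:
        # The two input sources are mutually exclusive.
        raise ValueError("use either Words or VocabularyFilterFileUri, not both")
    request = {"VocabularyFilterName": name, "LanguageCode": language_code}
    if words is not None:
        request["Words"] = list(words)
    if file_uri is not None:
        request["VocabularyFilterFileUri"] = file_uri
    return request
```

The returned dictionary can be unpacked into the call, as in client.create_vocabulary_filter(**request).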
dict
Response Syntax
{
    'VocabularyFilterName': 'string',
    'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    'LastModifiedTime': datetime(2015, 1, 1)
}
Response Structure
(dict) --
VocabularyFilterName (string) --
The name of the vocabulary filter.
LanguageCode (string) --
The language code of the words in the collection.
LastModifiedTime (datetime) --
The date and time that the vocabulary filter was modified.
Deletes a transcription job generated by Amazon Transcribe Medical and any related information.
See also: AWS API Documentation
Request Syntax
response = client.delete_medical_transcription_job(
    MedicalTranscriptionJobName='string'
)
[REQUIRED]
The name of the medical transcription job to delete.
Deletes a previously submitted transcription job along with any other generated results such as the transcription, models, and so on.
See also: AWS API Documentation
Request Syntax
response = client.delete_transcription_job(
    TranscriptionJobName='string'
)
[REQUIRED]
The name of the transcription job to be deleted.
Deletes a vocabulary from Amazon Transcribe.
See also: AWS API Documentation
Request Syntax
response = client.delete_vocabulary(
    VocabularyName='string'
)
[REQUIRED]
The name of the vocabulary to delete.
Removes a vocabulary filter.
See also: AWS API Documentation
Request Syntax
response = client.delete_vocabulary_filter(
    VocabularyFilterName='string'
)
[REQUIRED]
The name of the vocabulary filter to remove.
Generate a presigned URL given a client, its method, and arguments.
The presigned URL
Returns information about a transcription job from Amazon Transcribe Medical. To see the status of the job, check the TranscriptionJobStatus field. If the status is COMPLETED, the job is finished. You can find the results of the completed job in the TranscriptFileUri field.
See also: AWS API Documentation
Request Syntax
response = client.get_medical_transcription_job(
    MedicalTranscriptionJobName='string'
)
[REQUIRED]
The name of the medical transcription job.
{
    'MedicalTranscriptionJob': {
        'MedicalTranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string'
        },
        'StartTime': datetime(2015, 1, 1),
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string',
        'Settings': {
            'ShowSpeakerLabels': True|False,
            'MaxSpeakerLabels': 123,
            'ChannelIdentification': True|False,
            'ShowAlternatives': True|False,
            'MaxAlternatives': 123
        },
        'Specialty': 'PRIMARYCARE',
        'Type': 'CONVERSATION'|'DICTATION'
    }
}
Response Structure
An object that contains the results of the medical transcription job.
The name for a given medical transcription job.
The completion status of a medical transcription job.
The language code for the language spoken in the source audio file. US English (en-US) is the only supported language for medical transcriptions. Any other value you enter for language code results in a BadRequestException error.
The sample rate, in Hertz, of the source audio containing medical information.
If you don't specify the sample rate, Amazon Transcribe Medical determines it for you. If you choose to specify the sample rate, it must match the rate detected by Amazon Transcribe Medical. In most cases, you should leave the MediaSampleRateHertz field blank and let Amazon Transcribe Medical determine the sample rate.
The format of the input media file.
Describes the input media file in a transcription request.
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
s3://<bucket-name>/<keyprefix>/<objectkey>
For example:
s3://examplebucket/example.mp4
s3://examplebucket/mediadocs/example.mp4
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.
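To illustrate the s3:// form above, the sketch below splits a media URI into its bucket name and object key. The helper name is hypothetical, and it relies on the standard library parser, which would treat "?" or "#" in a key as a query or fragment.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split s3://<bucket-name>/<objectkey> into (bucket, key) (hypothetical helper)."""
    parsed = urlparse(uri)
    key = parsed.path.lstrip("/")
    if parsed.scheme != "s3" or not parsed.netloc or not key:
        raise ValueError(f"not an s3://<bucket-name>/<objectkey> URI: {uri!r}")
    # Limitation: keys containing "?" or "#" are not handled by this sketch.
    return parsed.netloc, key


print(split_s3_uri("s3://examplebucket/mediadocs/example.mp4"))
```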
An object that contains the MedicalTranscript. The MedicalTranscript contains the TranscriptFileUri.
The S3 object location of the medical transcript.
Use this URI to access the medical transcript. This URI points to the S3 bucket you created to store the medical transcript.
A timestamp that shows when the job started processing.
A timestamp that shows when the job was created.
A timestamp that shows when the job was completed.
If the TranscriptionJobStatus field is FAILED, this field contains information about why the job failed.
The FailureReason field contains one of the following values:
An object that contains the optional settings for the medical transcription job.
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException.
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
Instructs Amazon Transcribe Medical to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe Medical also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of the item. The alternative transcriptions also come with confidence scores provided by Amazon Transcribe Medical.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException.
Determines whether alternative transcripts are generated along with the transcript that has the highest confidence. If you set the ShowAlternatives field to true, you must also set the maximum number of alternatives to return in the MaxAlternatives field.
The maximum number of alternatives that you tell the service to return. If you specify the MaxAlternatives field, you must set the ShowAlternatives field to true.
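The dependencies between these Settings fields can be checked locally before a request is sent. The sketch below is an illustration of the rules described above, not the service's own validation; the function name is hypothetical, and "set" is interpreted as the boolean field being true.

```python
def settings_problems(settings):
    """Return a list of rule violations found in a Settings dict (hypothetical helper)."""
    problems = []
    show_speakers = settings.get("ShowSpeakerLabels", False)
    if show_speakers and "MaxSpeakerLabels" not in settings:
        problems.append("ShowSpeakerLabels is true but MaxSpeakerLabels is missing")
    if "MaxSpeakerLabels" in settings and not show_speakers:
        problems.append("MaxSpeakerLabels is set but ShowSpeakerLabels is not true")
    # Assumption: the conflict applies when both flags are true.
    if show_speakers and settings.get("ChannelIdentification", False):
        problems.append("ShowSpeakerLabels and ChannelIdentification cannot both be set")
    show_alternatives = settings.get("ShowAlternatives", False)
    if show_alternatives and "MaxAlternatives" not in settings:
        problems.append("ShowAlternatives is true but MaxAlternatives is missing")
    if "MaxAlternatives" in settings and not show_alternatives:
        problems.append("MaxAlternatives is set but ShowAlternatives is not true")
    return problems
```

An empty list means none of the rules above were violated; the service may still reject a request for other reasons.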
The medical specialty of any clinicians providing a dictation or having a conversation. PRIMARYCARE is the only available setting for this object. This specialty enables you to generate transcriptions for the following medical fields:
The type of speech in the transcription job. CONVERSATION is generally used for patient-physician dialogues. DICTATION is the setting for physicians speaking their notes after seeing a patient. For more information, see how-it-works-med
Create a paginator for an operation.
Returns information about a transcription job. To see the status of the job, check the TranscriptionJobStatus field. If the status is COMPLETED, the job is finished and you can find the results at the location specified in the TranscriptFileUri field. If you enable content redaction, the redacted transcript appears in RedactedTranscriptFileUri.
See also: AWS API Documentation
Request Syntax
response = client.get_transcription_job(
    TranscriptionJobName='string'
)
[REQUIRED]
The name of the job.
{
    'TranscriptionJob': {
        'TranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string',
            'RedactedTranscriptFileUri': 'string'
        },
        'StartTime': datetime(2015, 1, 1),
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string',
        'Settings': {
            'VocabularyName': 'string',
            'ShowSpeakerLabels': True|False,
            'MaxSpeakerLabels': 123,
            'ChannelIdentification': True|False,
            'ShowAlternatives': True|False,
            'MaxAlternatives': 123,
            'VocabularyFilterName': 'string',
            'VocabularyFilterMethod': 'remove'|'mask'
        },
        'JobExecutionSettings': {
            'AllowDeferredExecution': True|False,
            'DataAccessRoleArn': 'string'
        },
        'ContentRedaction': {
            'RedactionType': 'PII',
            'RedactionOutput': 'redacted'|'redacted_and_unredacted'
        }
    }
}
Response Structure
An object that contains the results of the transcription job.
The name of the transcription job.
The status of the transcription job.
The language code for the input speech.
The sample rate, in Hertz, of the audio track in the input media file.
The format of the input media file.
An object that describes the input media for the transcription job.
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
s3://<bucket-name>/<keyprefix>/<objectkey>
For example:
s3://examplebucket/example.mp4
s3://examplebucket/mediadocs/example.mp4
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide.
An object that describes the output of the transcription job.
The S3 object location of the transcript.
Use this URI to access the transcript. If you specified an S3 bucket in the OutputBucketName field when you created the job, this is the URI of that bucket. If you chose to store the transcript in Amazon Transcribe, this is a shareable URL that provides secure access to that location.
The S3 object location of the redacted transcript.
Use this URI to access the redacted transcript. If you specified an S3 bucket in the OutputBucketName field when you created the job, this is the URI of that bucket. If you chose to store the transcript in Amazon Transcribe, this is a shareable URL that provides secure access to that location.
A timestamp that shows when the job started processing.
A timestamp that shows when the job was created.
A timestamp that shows when the job was completed.
If the TranscriptionJobStatus field is FAILED, this field contains information about why the job failed.
The FailureReason field can contain one of the following values:
Optional settings for the transcription job. Use these settings to turn on speaker recognition, to set the maximum number of speakers that should be identified and to specify a custom vocabulary to use when processing the transcription job.
The name of a vocabulary to use when processing the transcription job.
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException.
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
Instructs Amazon Transcribe to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of the item including the confidence that Amazon Transcribe has in the transcription.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException.
Determines whether the transcription contains alternative transcriptions. If you set the ShowAlternatives field to true, you must also set the maximum number of alternatives to return in the MaxAlternatives field.
The number of alternative transcriptions that the service should return. If you specify the MaxAlternatives field, you must set the ShowAlternatives field to true.
The name of the vocabulary filter to use when transcribing the audio. The filter that you specify must have the same language code as the transcription job.
Set to mask to remove filtered text from the transcript and replace it with three asterisks ("***") as placeholder text. Set to remove to remove filtered text from the transcript without using placeholder text.
Provides information about how a transcription job is executed.
Indicates whether a job should be queued by Amazon Transcribe when the concurrent execution limit is exceeded. When the AllowDeferredExecution field is true, jobs are queued and executed when the number of executing jobs falls below the concurrent execution limit. If the field is false, Amazon Transcribe returns a LimitExceededException exception.
If you specify the AllowDeferredExecution field, you must specify the DataAccessRoleArn field.
The Amazon Resource Name (ARN) of a role that has access to the S3 bucket that contains the input files. Amazon Transcribe assumes this role to read queued media files. If you have specified an output S3 bucket for the transcription results, this role should have access to the output bucket as well.
If you specify the AllowDeferredExecution field, you must specify the DataAccessRoleArn field.
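The pairing rule for AllowDeferredExecution and DataAccessRoleArn can be enforced when the JobExecutionSettings dictionary is built. This is a sketch only; the helper name is hypothetical and the role ARN in the example is a placeholder.

```python
def job_execution_settings(allow_deferred_execution=None, data_access_role_arn=None):
    """Build a JobExecutionSettings dict (hypothetical helper)."""
    if allow_deferred_execution is not None and not data_access_role_arn:
        # AllowDeferredExecution must be accompanied by DataAccessRoleArn.
        raise ValueError("AllowDeferredExecution requires DataAccessRoleArn")
    settings = {}
    if allow_deferred_execution is not None:
        settings["AllowDeferredExecution"] = bool(allow_deferred_execution)
    if data_access_role_arn:
        settings["DataAccessRoleArn"] = data_access_role_arn
    return settings


# Placeholder ARN for illustration only.
print(job_execution_settings(True, "arn:aws:iam::123456789012:role/ExampleRole"))
```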
An object that describes content redaction settings for the transcription job.
Request parameter that defines the entities to be redacted. The only accepted value is PII.
The output transcript file stored in either the default S3 bucket or in a bucket you specify.
When you choose redacted, Amazon Transcribe outputs only the redacted transcript.
When you choose redacted_and_unredacted, Amazon Transcribe outputs both the redacted and unredacted transcripts.
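As a sketch of how the response structure above might be consumed, the helper below reads a get_transcription_job response and returns the transcript locations once the job has completed. The function name and the shape of the returned dictionary are assumptions made for illustration.

```python
def transcript_uris(response):
    """Return transcript URIs from a get_transcription_job response (hypothetical helper).

    Returns None while the job is QUEUED or IN_PROGRESS.
    """
    job = response["TranscriptionJob"]
    status = job["TranscriptionJobStatus"]
    if status == "FAILED":
        raise RuntimeError(job.get("FailureReason", "transcription job failed"))
    if status != "COMPLETED":
        return None
    transcript = job.get("Transcript", {})
    return {
        "transcript": transcript.get("TranscriptFileUri"),
        # Present only when content redaction was enabled for the job.
        "redacted": transcript.get("RedactedTranscriptFileUri"),
    }
```

It would typically be called as transcript_uris(client.get_transcription_job(TranscriptionJobName='my-job')), where the job name is a placeholder.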
Gets information about a vocabulary.
See also: AWS API Documentation
Request Syntax
response = client.get_vocabulary(
    VocabularyName='string'
)
[REQUIRED]
The name of the vocabulary to return information about. The name is case-sensitive.
{
    'VocabularyName': 'string',
    'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    'VocabularyState': 'PENDING'|'READY'|'FAILED',
    'LastModifiedTime': datetime(2015, 1, 1),
    'FailureReason': 'string',
    'DownloadUri': 'string'
}
Response Structure
The name of the vocabulary to return.
The language code of the vocabulary entries.
The processing state of the vocabulary.
The date and time that the vocabulary was last modified.
If the VocabularyState field is FAILED, this field contains information about why the job failed.
The S3 location where the vocabulary is stored. Use this URI to get the contents of the vocabulary. The URI is available for a limited time.
Returns information about a vocabulary filter.
See also: AWS API Documentation
Request Syntax
response = client.get_vocabulary_filter(
    VocabularyFilterName='string'
)
[REQUIRED]
The name of the vocabulary filter for which to return information.
{
    'VocabularyFilterName': 'string',
    'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    'LastModifiedTime': datetime(2015, 1, 1),
    'DownloadUri': 'string'
}
Response Structure
The name of the vocabulary filter.
The language code of the words in the vocabulary filter.
The date and time that the contents of the vocabulary filter were updated.
The URI of the list of words in the vocabulary filter. You can use this URI to get the list of words.
Returns an object that can wait for some condition.
Lists medical transcription jobs with a specified status or substring that matches their names.
See also: AWS API Documentation
Request Syntax
response = client.list_medical_transcription_jobs(
    Status='QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
    JobNameContains='string',
    NextToken='string',
    MaxResults=123
)
dict
Response Syntax
{
    'Status': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
    'NextToken': 'string',
    'MedicalTranscriptionJobSummaries': [
        {
            'MedicalTranscriptionJobName': 'string',
            'CreationTime': datetime(2015, 1, 1),
            'StartTime': datetime(2015, 1, 1),
            'CompletionTime': datetime(2015, 1, 1),
            'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
            'TranscriptionJobStatus': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
            'FailureReason': 'string',
            'OutputLocationType': 'CUSTOMER_BUCKET'|'SERVICE_BUCKET',
            'Specialty': 'PRIMARYCARE',
            'Type': 'CONVERSATION'|'DICTATION'
        },
    ]
}
Response Structure
(dict) --
Status (string) --
The requested status of the medical transcription jobs returned.
NextToken (string) --
The ListMedicalTranscriptionJobs operation returns a page of jobs at a time. The maximum size of the page is set by the MaxResults parameter. If the number of jobs exceeds what can fit on a page, Amazon Transcribe Medical returns a token in the NextToken field. Include the token in the next request to the ListMedicalTranscriptionJobs operation to return the next page of jobs.
MedicalTranscriptionJobSummaries (list) --
A list of objects containing summary information for a transcription job.
(dict) --
Provides summary information about a transcription job.
MedicalTranscriptionJobName (string) --
The name of a medical transcription job.
CreationTime (datetime) --
A timestamp that shows when the medical transcription job was created.
StartTime (datetime) --
A timestamp that shows when the job began processing.
CompletionTime (datetime) --
A timestamp that shows when the job was completed.
LanguageCode (string) --
The language of the transcript in the source audio file.
TranscriptionJobStatus (string) --
The status of the medical transcription job.
FailureReason (string) --
If the TranscriptionJobStatus field is FAILED, a description of the error.
OutputLocationType (string) --
Indicates the location of the transcription job's output.
The CUSTOMER_BUCKET is the S3 location provided in the OutputBucketName field when the
Specialty (string) --
The medical specialty of the transcription job. Primary care is the only valid value.
Type (string) --
The speech of the clinician in the input audio.
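The NextToken handshake described for ListMedicalTranscriptionJobs can be wrapped in a generator that keeps requesting pages until no token is returned. This is a minimal sketch; the function name is hypothetical and `client` is assumed to be a Transcribe client.

```python
def iter_medical_job_summaries(client, **filters):
    """Yield every medical job summary across all pages (hypothetical helper)."""
    token = None
    while True:
        kwargs = dict(filters)
        if token:
            kwargs["NextToken"] = token  # continue from the previous page
        page = client.list_medical_transcription_jobs(**kwargs)
        yield from page.get("MedicalTranscriptionJobSummaries", [])
        token = page.get("NextToken")
        if not token:
            break  # no further pages
```

For example, iter_medical_job_summaries(client, Status='FAILED', MaxResults=50) would walk all failed jobs, 50 per request.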
Lists transcription jobs with the specified status.
See also: AWS API Documentation
Request Syntax
response = client.list_transcription_jobs(
    Status='QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
    JobNameContains='string',
    NextToken='string',
    MaxResults=123
)
dict
Response Syntax
{
    'Status': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
    'NextToken': 'string',
    'TranscriptionJobSummaries': [
        {
            'TranscriptionJobName': 'string',
            'CreationTime': datetime(2015, 1, 1),
            'StartTime': datetime(2015, 1, 1),
            'CompletionTime': datetime(2015, 1, 1),
            'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
            'TranscriptionJobStatus': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
            'FailureReason': 'string',
            'OutputLocationType': 'CUSTOMER_BUCKET'|'SERVICE_BUCKET',
            'ContentRedaction': {
                'RedactionType': 'PII',
                'RedactionOutput': 'redacted'|'redacted_and_unredacted'
            }
        },
    ]
}
Response Structure
(dict) --
Status (string) --
The requested status of the jobs returned.
NextToken (string) --
The ListTranscriptionJobs operation returns a page of jobs at a time. The maximum size of the page is set by the MaxResults parameter. If there are more jobs in the list than the page size, Amazon Transcribe returns a token in the NextToken field. Include the token in the next request to the ListTranscriptionJobs operation to return the next page of jobs.
TranscriptionJobSummaries (list) --
A list of objects containing summary information for a transcription job.
(dict) --
Provides a summary of information about a transcription job.
TranscriptionJobName (string) --
The name of the transcription job.
CreationTime (datetime) --
A timestamp that shows when the job was created.
StartTime (datetime) --
A timestamp that shows when the job started processing.
CompletionTime (datetime) --
A timestamp that shows when the job was completed.
LanguageCode (string) --
The language code for the input speech.
TranscriptionJobStatus (string) --
The status of the transcription job. When the status is COMPLETED, use the GetTranscriptionJob operation to get the results of the transcription.
FailureReason (string) --
If the TranscriptionJobStatus field is FAILED, a description of the error.
OutputLocationType (string) --
Indicates the location of the output of the transcription job.
If the value is CUSTOMER_BUCKET, the location is the S3 bucket specified in the OutputBucketName field when the transcription job was started with the StartTranscriptionJob operation.
If the value is SERVICE_BUCKET, the output is stored by Amazon Transcribe and can be retrieved using the URI in the GetTranscriptionJob response's TranscriptFileUri field.
ContentRedaction (dict) --
The content redaction settings of the transcription job.
RedactionType (string) --
Request parameter that defines the entities to be redacted. The only accepted value is PII.
RedactionOutput (string) --
The output transcript file stored in either the default S3 bucket or in a bucket you specify.
When you choose redacted, Amazon Transcribe outputs only the redacted transcript.
When you choose redacted_and_unredacted, Amazon Transcribe outputs both the redacted and unredacted transcripts.
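The client's can_paginate and get_paginator methods can be combined with ListTranscriptionJobs as sketched below. Whether a paginator is registered for this operation depends on the installed SDK version, so the sketch checks first and otherwise follows NextToken by hand. The helper names are hypothetical.

```python
def transcription_job_pages(client, **filters):
    """Yield list_transcription_jobs response pages (hypothetical helper)."""
    operation = "list_transcription_jobs"
    if client.can_paginate(operation):
        # Let the SDK's paginator follow NextToken when one is available.
        yield from client.get_paginator(operation).paginate(**filters)
        return
    token = None
    while True:
        kwargs = dict(filters)
        if token:
            kwargs["NextToken"] = token
        page = client.list_transcription_jobs(**kwargs)
        yield page
        token = page.get("NextToken")
        if not token:
            return


def completed_job_names(client):
    """Collect the names of all COMPLETED jobs across pages."""
    return [
        summary["TranscriptionJobName"]
        for page in transcription_job_pages(client, Status="COMPLETED")
        for summary in page["TranscriptionJobSummaries"]
    ]
```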
Returns a list of vocabularies that match the specified criteria. If no criteria are specified, returns the entire list of vocabularies.
See also: AWS API Documentation
Request Syntax
response = client.list_vocabularies(
    NextToken='string',
    MaxResults=123,
    StateEquals='PENDING'|'READY'|'FAILED',
    NameContains='string'
)
dict
Response Syntax
{
    'Status': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
    'NextToken': 'string',
    'Vocabularies': [
        {
            'VocabularyName': 'string',
            'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
            'LastModifiedTime': datetime(2015, 1, 1),
            'VocabularyState': 'PENDING'|'READY'|'FAILED'
        },
    ]
}
Response Structure
(dict) --
Status (string) --
The requested vocabulary state.
NextToken (string) --
The ListVocabularies operation returns a page of vocabularies at a time. The maximum size of the page is set by the MaxResults parameter. If there are more vocabularies in the list than the page size, Amazon Transcribe returns a token in the NextToken field. Include the token in the next request to the ListVocabularies operation to return the next page of vocabularies.
Vocabularies (list) --
A list of objects that describe the vocabularies that match the search criteria in the request.
(dict) --
Provides information about a custom vocabulary.
VocabularyName (string) --
The name of the vocabulary.
LanguageCode (string) --
The language code of the vocabulary entries.
LastModifiedTime (datetime) --
The date and time that the vocabulary was last modified.
VocabularyState (string) --
The processing state of the vocabulary. If the state is READY, you can use the vocabulary in a StartTranscriptionJob request.
Gets information about vocabulary filters.
See also: AWS API Documentation
Request Syntax
response = client.list_vocabulary_filters(
    NextToken='string',
    MaxResults=123,
    NameContains='string'
)
dict
Response Syntax
{
    'NextToken': 'string',
    'VocabularyFilters': [
        {
            'VocabularyFilterName': 'string',
            'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
            'LastModifiedTime': datetime(2015, 1, 1)
        },
    ]
}
Response Structure
(dict) --
NextToken (string) --
The ListVocabularyFilters operation returns a page of vocabulary filters at a time. The maximum size of the page is set by the MaxResults parameter. If there are more filters in the list than the page size, Amazon Transcribe returns a token in the NextToken field. Include the token in the next request to the ListVocabularyFilters operation to return the next page of filters.
VocabularyFilters (list) --
The list of vocabulary filters. It contains at most MaxResults number of filters. If there are more filters, call the ListVocabularyFilters operation again with the NextToken parameter in the request set to the value of the NextToken field in the response.
(dict) --
Provides information about a vocabulary filter.
VocabularyFilterName (string) --
The name of the vocabulary filter. The name must be unique in the account that holds the filter.
LanguageCode (string) --
The language code of the words in the vocabulary filter.
LastModifiedTime (datetime) --
The date and time that the vocabulary filter was last updated.
Start a batch job to transcribe medical speech to text.
See also: AWS API Documentation
Request Syntax
response = client.start_medical_transcription_job(
    MedicalTranscriptionJobName='string',
    LanguageCode='en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    MediaSampleRateHertz=123,
    MediaFormat='mp3'|'mp4'|'wav'|'flac',
    Media={
        'MediaFileUri': 'string'
    },
    OutputBucketName='string',
    OutputEncryptionKMSKeyId='string',
    Settings={
        'ShowSpeakerLabels': True|False,
        'MaxSpeakerLabels': 123,
        'ChannelIdentification': True|False,
        'ShowAlternatives': True|False,
        'MaxAlternatives': 123
    },
    Specialty='PRIMARYCARE',
    Type='CONVERSATION'|'DICTATION'
)
[REQUIRED]
The name of the medical transcription job. You can't use the strings "." or ".." by themselves as the job name. The name must also be unique within an AWS account.
[REQUIRED]
The language code for the language spoken in the input media file. US English (en-US) is the only valid value for medical transcription jobs. Any other value you enter for language code results in a BadRequestException error.
The sample rate, in Hertz, of the audio track in the input media file.
If you do not specify the media sample rate, Amazon Transcribe Medical determines the sample rate. If you specify the sample rate, it must match the rate detected by Amazon Transcribe Medical. In most cases, you should leave the MediaSampleRateHertz field blank and let Amazon Transcribe Medical determine the sample rate.
[REQUIRED]
Describes the input media file in a transcription request.
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
s3://<bucket-name>/<keyprefix>/<objectkey>
For example:
s3://examplebucket/example.mp4
s3://examplebucket/mediadocs/example.mp4
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide .
[REQUIRED]
The Amazon S3 location where the transcription is stored.
You must set OutputBucketName for Amazon Transcribe Medical to store the transcription results. Your transcript appears in the S3 location you specify. When you call the GetMedicalTranscriptionJob operation, it returns this location in the TranscriptFileUri field. The S3 bucket must have permissions that allow Amazon Transcribe Medical to put files in the bucket. For more information, see Permissions Required for IAM User Roles .
You can specify an AWS Key Management Service (KMS) key to encrypt the output of your transcription using the OutputEncryptionKMSKeyId parameter. If you don't specify a KMS key, Amazon Transcribe Medical uses the default Amazon S3 key for server-side encryption of transcripts that are placed in your S3 bucket.
The Amazon Resource Name (ARN) of the AWS Key Management Service (KMS) key used to encrypt the output of the transcription job. The user calling the StartMedicalTranscriptionJob operation must have permission to use the specified KMS key.
You can use either of the following to identify a KMS key in the current account:
You can use either of the following to identify a KMS key in the current account or another account:
If you don't specify an encryption key, the output of the medical transcription job is encrypted with the default Amazon S3 key (SSE-S3).
If you specify a KMS key to encrypt your output, you must also specify an output location in the OutputBucketName parameter.
Optional settings for the medical transcription job.
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
Instructs Amazon Transcribe Medical to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe Medical also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of the item. The alternative transcriptions also come with confidence scores provided by Amazon Transcribe Medical.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
Determines whether alternative transcripts are generated along with the transcript that has the highest confidence. If you set the ShowAlternatives field to true, you must also set the maximum number of alternatives to return in the MaxAlternatives field.
The maximum number of alternatives that you tell the service to return. If you specify the MaxAlternatives field, you must set the ShowAlternatives field to true.
[REQUIRED]
The medical specialty of any clinician speaking in the input media.
[REQUIRED]
The type of clinician speech in the input audio. CONVERSATION refers to conversations clinicians have with patients. DICTATION refers to medical professionals dictating their notes about a patient encounter.
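The required parameters above can be assembled in one place before the call is made. The following is a sketch under stated assumptions: `build_medical_job_request` is a hypothetical helper, not part of boto3, and the bucket and job names in the usage comment are placeholders. It only encodes constraints stated in this section (the job name rule, the en-US language code, the PRIMARYCARE specialty, and the two Type values).

```python
def build_medical_job_request(job_name, media_uri, output_bucket,
                              speech_type="CONVERSATION", kms_key_id=None):
    """Build keyword arguments for start_medical_transcription_job."""
    if job_name in (".", ".."):
        raise ValueError("the job name cannot be '.' or '..' by itself")
    if speech_type not in ("CONVERSATION", "DICTATION"):
        raise ValueError("Type must be CONVERSATION or DICTATION")

    request = {
        "MedicalTranscriptionJobName": job_name,
        "LanguageCode": "en-US",           # the only value accepted for medical jobs
        "Media": {"MediaFileUri": media_uri},
        "OutputBucketName": output_bucket,  # required for medical jobs
        "Specialty": "PRIMARYCARE",
        "Type": speech_type,
    }
    if kms_key_id:
        # Optional; without it the output is encrypted with the default S3 key.
        request["OutputEncryptionKMSKeyId"] = kms_key_id
    return request


# Hypothetical usage (requires AWS credentials and real S3 locations):
#   import boto3
#   client = boto3.client("transcribe")
#   kwargs = build_medical_job_request(
#       "visit-0001", "s3://examplebucket/example.mp4", "exampleoutputbucket",
#       speech_type="DICTATION")
#   response = client.start_medical_transcription_job(**kwargs)
```

MediaSampleRateHertz is deliberately left out, following the advice above to let the service detect the sample rate.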
dict
Response Syntax
{
    'MedicalTranscriptionJob': {
        'MedicalTranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string'
        },
        'StartTime': datetime(2015, 1, 1),
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string',
        'Settings': {
            'ShowSpeakerLabels': True|False,
            'MaxSpeakerLabels': 123,
            'ChannelIdentification': True|False,
            'ShowAlternatives': True|False,
            'MaxAlternatives': 123
        },
        'Specialty': 'PRIMARYCARE',
        'Type': 'CONVERSATION'|'DICTATION'
    }
}
Response Structure
(dict) --
MedicalTranscriptionJob (dict) --
A batch job submitted to transcribe medical speech to text.
MedicalTranscriptionJobName (string) --
The name for a given medical transcription job.
TranscriptionJobStatus (string) --
The completion status of a medical transcription job.
LanguageCode (string) --
The language code for the language spoken in the source audio file. US English (en-US) is the only supported language for medical transcriptions. Any other value you enter for language code results in a BadRequestException error.
MediaSampleRateHertz (integer) --
The sample rate, in Hertz, of the source audio containing medical information.
If you don't specify the sample rate, Amazon Transcribe Medical determines it for you. If you choose to specify the sample rate, it must match the rate detected by Amazon Transcribe Medical. In most cases, you should leave the MediaSampleRateHertz field blank and let Amazon Transcribe Medical determine the sample rate.
MediaFormat (string) --
The format of the input media file.
Media (dict) --
Describes the input media file in a transcription request.
MediaFileUri (string) --
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
s3://<bucket-name>/<keyprefix>/<objectkey>
For example:
s3://examplebucket/example.mp4
s3://examplebucket/mediadocs/example.mp4
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide .
Transcript (dict) --
An object that contains the MedicalTranscript . The MedicalTranscript contains the TranscriptFileUri .
TranscriptFileUri (string) --
The S3 object location of the medical transcript.
Use this URI to access the medical transcript. This URI points to the S3 bucket you created to store the medical transcript.
StartTime (datetime) --
A timestamp that shows when the job started processing.
CreationTime (datetime) --
A timestamp that shows when the job was created.
CompletionTime (datetime) --
A timestamp that shows when the job was completed.
FailureReason (string) --
If the TranscriptionJobStatus field is FAILED , this field contains information about why the job failed.
The FailureReason field contains one of the following values:
Settings (dict) --
Optional settings for the medical transcription job.
ShowSpeakerLabels (boolean) --
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
MaxSpeakerLabels (integer) --
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
ChannelIdentification (boolean) --
Instructs Amazon Transcribe Medical to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe Medical also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of the item. The alternative transcriptions also come with confidence scores provided by Amazon Transcribe Medical.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
ShowAlternatives (boolean) --
Determines whether alternative transcripts are generated along with the transcript that has the highest confidence. If you set the ShowAlternatives field to true, you must also set the maximum number of alternatives to return in the MaxAlternatives field.
MaxAlternatives (integer) --
The maximum number of alternatives that you tell the service to return. If you specify the MaxAlternatives field, you must set the ShowAlternatives field to true.
Specialty (string) --
The medical specialty of any clinicians providing a dictation or having a conversation. PRIMARYCARE is the only available setting for this object. This specialty enables you to generate transcriptions for the following medical fields:
Type (string) --
The type of speech in the transcription job. CONVERSATION is generally used for patient-physician dialogues. DICTATION is the setting for physicians speaking their notes after seeing a patient. For more information, see how-it-works-med
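Because the job runs as a batch, the response to the start call usually reports a QUEUED or IN_PROGRESS status, and the transcript location becomes useful only once the job reaches COMPLETED. One way to wait for that is to poll GetMedicalTranscriptionJob. The sketch below is an assumption-laden illustration: `wait_for_medical_job`, `poll_seconds`, `max_polls`, and the injectable `sleep` are hypothetical names, and the polling interval is arbitrary.

```python
import time


def wait_for_medical_job(client, job_name, poll_seconds=10, max_polls=60,
                         sleep=time.sleep):
    """Poll until the job is COMPLETED or FAILED; return the transcript URI."""
    for _ in range(max_polls):
        response = client.get_medical_transcription_job(
            MedicalTranscriptionJobName=job_name)
        job = response["MedicalTranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return job["Transcript"]["TranscriptFileUri"]
        if status == "FAILED":
            # FailureReason explains why the job failed.
            raise RuntimeError(job.get("FailureReason", "unknown failure"))
        sleep(poll_seconds)  # still QUEUED or IN_PROGRESS
    raise TimeoutError(f"job {job_name!r} did not finish after {max_polls} polls")


# Hypothetical usage (requires AWS credentials):
#   import boto3
#   uri = wait_for_medical_job(boto3.client("transcribe"), "visit-0001")
```

Passing `sleep` as a parameter lets a caller substitute a no-op when exercising the loop without waiting.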
Starts an asynchronous job to transcribe speech to text.
See also: AWS API Documentation
Request Syntax
response = client.start_transcription_job(
    TranscriptionJobName='string',
    LanguageCode='en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    MediaSampleRateHertz=123,
    MediaFormat='mp3'|'mp4'|'wav'|'flac',
    Media={
        'MediaFileUri': 'string'
    },
    OutputBucketName='string',
    OutputEncryptionKMSKeyId='string',
    Settings={
        'VocabularyName': 'string',
        'ShowSpeakerLabels': True|False,
        'MaxSpeakerLabels': 123,
        'ChannelIdentification': True|False,
        'ShowAlternatives': True|False,
        'MaxAlternatives': 123,
        'VocabularyFilterName': 'string',
        'VocabularyFilterMethod': 'remove'|'mask'
    },
    JobExecutionSettings={
        'AllowDeferredExecution': True|False,
        'DataAccessRoleArn': 'string'
    },
    ContentRedaction={
        'RedactionType': 'PII',
        'RedactionOutput': 'redacted'|'redacted_and_unredacted'
    }
)
[REQUIRED]
The name of the job. Note that you can't use the strings "." or ".." by themselves as the job name. The name must also be unique within an AWS account.
[REQUIRED]
The language code for the language used in the input media file.
The sample rate, in Hertz, of the audio track in the input media file.
If you do not specify the media sample rate, Amazon Transcribe determines the sample rate. If you specify the sample rate, it must match the sample rate detected by Amazon Transcribe. In most cases, you should leave the MediaSampleRateHertz field blank and let Amazon Transcribe determine the sample rate.
[REQUIRED]
An object that describes the input media for a transcription job.
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
s3://<bucket-name>/<keyprefix>/<objectkey>
For example:
s3://examplebucket/example.mp4
s3://examplebucket/mediadocs/example.mp4
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide .
The location where the transcription is stored.
If you set the OutputBucketName , Amazon Transcribe puts the transcript in the specified S3 bucket. When you call the GetTranscriptionJob operation, the operation returns this location in the TranscriptFileUri field. If you enable content redaction, the redacted transcript appears in RedactedTranscriptFileUri . If you enable content redaction and choose to output an unredacted transcript, that transcript's location still appears in the TranscriptFileUri . The S3 bucket must have permissions that allow Amazon Transcribe to put files in the bucket. For more information, see Permissions Required for IAM User Roles .
You can specify an AWS Key Management Service (KMS) key to encrypt the output of your transcription using the OutputEncryptionKMSKeyId parameter. If you don't specify a KMS key, Amazon Transcribe uses the default Amazon S3 key for server-side encryption of transcripts that are placed in your S3 bucket.
If you don't set the OutputBucketName , Amazon Transcribe generates a pre-signed URL, a shareable URL that provides secure access to your transcription, and returns it in the TranscriptFileUri field. Use this URL to download the transcription.
The Amazon Resource Name (ARN) of the AWS Key Management Service (KMS) key used to encrypt the output of the transcription job. The user calling the StartTranscriptionJob operation must have permission to use the specified KMS key.
You can use either of the following to identify a KMS key in the current account:
You can use either of the following to identify a KMS key in the current account or another account:
If you don't specify an encryption key, the output of the transcription job is encrypted with the default Amazon S3 key (SSE-S3).
If you specify a KMS key to encrypt your output, you must also specify an output location in the OutputBucketName parameter.
A Settings object that provides optional settings for a transcription job.
The name of a vocabulary to use when processing the transcription job.
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
Instructs Amazon Transcribe to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of the item including the confidence that Amazon Transcribe has in the transcription.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
Determines whether the transcription contains alternative transcriptions. If you set the ShowAlternatives field to true, you must also set the maximum number of alternatives to return in the MaxAlternatives field.
The number of alternative transcriptions that the service should return. If you specify the MaxAlternatives field, you must set the ShowAlternatives field to true.
The name of the vocabulary filter to use when transcribing the audio. The filter that you specify must have the same language code as the transcription job.
Set to mask to remove filtered text from the transcript and replace it with three asterisks ("***") as placeholder text. Set to remove to remove filtered text from the transcript without using placeholder text.
Provides information about how a transcription job is executed. Use this field to indicate that the job can be queued for deferred execution if the concurrency limit is reached and there are no slots available to immediately run the job.
Indicates whether a job should be queued by Amazon Transcribe when the concurrent execution limit is exceeded. When the AllowDeferredExecution field is true, jobs are queued and executed when the number of executing jobs falls below the concurrent execution limit. If the field is false, Amazon Transcribe returns a LimitExceededException exception.
If you specify the AllowDeferredExecution field, you must specify the DataAccessRoleArn field.
The Amazon Resource Name (ARN) of a role that has access to the S3 bucket that contains the input files. Amazon Transcribe assumes this role to read queued media files. If you have specified an output S3 bucket for the transcription results, this role should have access to the output bucket as well.
If you specify the AllowDeferredExecution field, you must specify the DataAccessRoleArn field.
An object that contains the request parameters for content redaction.
Request parameter that defines the entities to be redacted. The only accepted value is PII .
The output transcript file stored in either the default S3 bucket or in a bucket you specify.
When you choose redacted , Amazon Transcribe outputs only the redacted transcript.
When you choose redacted_and_unredacted , Amazon Transcribe outputs both the redacted and unredacted transcripts.
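Several of the parameters above depend on one another, and a request that violates those rules is rejected by the service. A client-side check can surface the conflicts before any call is made. The sketch below encodes only the rules stated in this section; `check_job_options` and its argument names are hypothetical, and the service remains the authority on what it accepts.

```python
def check_job_options(settings=None, job_execution=None,
                      output_bucket=None, kms_key_id=None):
    """Return a list of problems; an empty list means no conflict was found."""
    settings = settings or {}
    job_execution = job_execution or {}
    problems = []

    if settings.get("ShowSpeakerLabels") and settings.get("ChannelIdentification"):
        problems.append("ShowSpeakerLabels and ChannelIdentification cannot both be set")
    if settings.get("ShowSpeakerLabels") and "MaxSpeakerLabels" not in settings:
        problems.append("ShowSpeakerLabels requires MaxSpeakerLabels")
    if "MaxSpeakerLabels" in settings and not settings.get("ShowSpeakerLabels"):
        problems.append("MaxSpeakerLabels requires ShowSpeakerLabels to be true")
    if settings.get("ShowAlternatives") and "MaxAlternatives" not in settings:
        problems.append("ShowAlternatives requires MaxAlternatives")
    if "MaxAlternatives" in settings and not settings.get("ShowAlternatives"):
        problems.append("MaxAlternatives requires ShowAlternatives to be true")
    if "AllowDeferredExecution" in job_execution and "DataAccessRoleArn" not in job_execution:
        problems.append("AllowDeferredExecution requires DataAccessRoleArn")
    if kms_key_id and not output_bucket:
        problems.append("OutputEncryptionKMSKeyId requires OutputBucketName")
    return problems


# Hypothetical usage before calling client.start_transcription_job(...):
#   issues = check_job_options(settings={"ShowSpeakerLabels": True})
#   if issues:
#       raise ValueError("; ".join(issues))
```

Returning a list instead of raising on the first conflict lets a caller report every problem at once.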
dict
Response Syntax
{
    'TranscriptionJob': {
        'TranscriptionJobName': 'string',
        'TranscriptionJobStatus': 'QUEUED'|'IN_PROGRESS'|'FAILED'|'COMPLETED',
        'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
        'MediaSampleRateHertz': 123,
        'MediaFormat': 'mp3'|'mp4'|'wav'|'flac',
        'Media': {
            'MediaFileUri': 'string'
        },
        'Transcript': {
            'TranscriptFileUri': 'string',
            'RedactedTranscriptFileUri': 'string'
        },
        'StartTime': datetime(2015, 1, 1),
        'CreationTime': datetime(2015, 1, 1),
        'CompletionTime': datetime(2015, 1, 1),
        'FailureReason': 'string',
        'Settings': {
            'VocabularyName': 'string',
            'ShowSpeakerLabels': True|False,
            'MaxSpeakerLabels': 123,
            'ChannelIdentification': True|False,
            'ShowAlternatives': True|False,
            'MaxAlternatives': 123,
            'VocabularyFilterName': 'string',
            'VocabularyFilterMethod': 'remove'|'mask'
        },
        'JobExecutionSettings': {
            'AllowDeferredExecution': True|False,
            'DataAccessRoleArn': 'string'
        },
        'ContentRedaction': {
            'RedactionType': 'PII',
            'RedactionOutput': 'redacted'|'redacted_and_unredacted'
        }
    }
}
Response Structure
(dict) --
TranscriptionJob (dict) --
An object containing details of the asynchronous transcription job.
TranscriptionJobName (string) --
The name of the transcription job.
TranscriptionJobStatus (string) --
The status of the transcription job.
LanguageCode (string) --
The language code for the input speech.
MediaSampleRateHertz (integer) --
The sample rate, in Hertz, of the audio track in the input media file.
MediaFormat (string) --
The format of the input media file.
Media (dict) --
An object that describes the input media for the transcription job.
MediaFileUri (string) --
The S3 object location of the input media file. The URI must be in the same region as the API endpoint that you are calling. The general form is:
s3://<bucket-name>/<keyprefix>/<objectkey>
For example:
s3://examplebucket/example.mp4
s3://examplebucket/mediadocs/example.mp4
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide .
Transcript (dict) --
An object that describes the output of the transcription job.
TranscriptFileUri (string) --
The S3 object location of the transcript.
Use this URI to access the transcript. If you specified an S3 bucket in the OutputBucketName field when you created the job, this is the URI of that bucket. If you chose to store the transcript in Amazon Transcribe, this is a shareable URL that provides secure access to that location.
RedactedTranscriptFileUri (string) --
The S3 object location of the redacted transcript.
Use this URI to access the redacted transcript. If you specified an S3 bucket in the OutputBucketName field when you created the job, this is the URI of that bucket. If you chose to store the transcript in Amazon Transcribe, this is a shareable URL that provides secure access to that location.
StartTime (datetime) --
A timestamp that shows when the job started processing.
CreationTime (datetime) --
A timestamp that shows when the job was created.
CompletionTime (datetime) --
A timestamp that shows when the job was completed.
FailureReason (string) --
If the TranscriptionJobStatus field is FAILED , this field contains information about why the job failed.
The FailureReason field can contain one of the following values:
Settings (dict) --
Optional settings for the transcription job. Use these settings to turn on speaker recognition, to set the maximum number of speakers that should be identified and to specify a custom vocabulary to use when processing the transcription job.
VocabularyName (string) --
The name of a vocabulary to use when processing the transcription job.
ShowSpeakerLabels (boolean) --
Determines whether the transcription job uses speaker recognition to identify different speakers in the input audio. Speaker recognition labels individual speakers in the audio file. If you set the ShowSpeakerLabels field to true, you must also set the maximum number of speaker labels in the MaxSpeakerLabels field.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
MaxSpeakerLabels (integer) --
The maximum number of speakers to identify in the input audio. If there are more speakers in the audio than this number, multiple speakers are identified as a single speaker. If you specify the MaxSpeakerLabels field, you must set the ShowSpeakerLabels field to true.
ChannelIdentification (boolean) --
Instructs Amazon Transcribe to process each audio channel separately and then merge the transcription output of each channel into a single transcription.
Amazon Transcribe also produces a transcription of each item detected on an audio channel, including the start time and end time of the item and alternative transcriptions of the item including the confidence that Amazon Transcribe has in the transcription.
You can't set both ShowSpeakerLabels and ChannelIdentification in the same request. If you set both, your request returns a BadRequestException .
ShowAlternatives (boolean) --
Determines whether the transcription contains alternative transcriptions. If you set the ShowAlternatives field to true, you must also set the maximum number of alternatives to return in the MaxAlternatives field.
MaxAlternatives (integer) --
The number of alternative transcriptions that the service should return. If you specify the MaxAlternatives field, you must set the ShowAlternatives field to true.
VocabularyFilterName (string) --
The name of the vocabulary filter to use when transcribing the audio. The filter that you specify must have the same language code as the transcription job.
VocabularyFilterMethod (string) --
Set to mask to remove filtered text from the transcript and replace it with three asterisks ("***") as placeholder text. Set to remove to remove filtered text from the transcript without using placeholder text.
JobExecutionSettings (dict) --
Provides information about how a transcription job is executed.
AllowDeferredExecution (boolean) --
Indicates whether a job should be queued by Amazon Transcribe when the concurrent execution limit is exceeded. When the AllowDeferredExecution field is true, jobs are queued and executed when the number of executing jobs falls below the concurrent execution limit. If the field is false, Amazon Transcribe returns a LimitExceededException exception.
If you specify the AllowDeferredExecution field, you must specify the DataAccessRoleArn field.
DataAccessRoleArn (string) --
The Amazon Resource Name (ARN) of a role that has access to the S3 bucket that contains the input files. Amazon Transcribe assumes this role to read queued media files. If you have specified an output S3 bucket for the transcription results, this role should have access to the output bucket as well.
If you specify the AllowDeferredExecution field, you must specify the DataAccessRoleArn field.
ContentRedaction (dict) --
An object that describes content redaction settings for the transcription job.
RedactionType (string) --
Request parameter that defines the entities to be redacted. The only accepted value is PII .
RedactionOutput (string) --
The output transcript file stored in either the default S3 bucket or in a bucket you specify.
When you choose redacted , Amazon Transcribe outputs only the redacted transcript.
When you choose redacted_and_unredacted , Amazon Transcribe outputs both the redacted and unredacted transcripts.
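Depending on the ContentRedaction settings, the Transcript object may carry TranscriptFileUri, RedactedTranscriptFileUri, or both. The following sketch collects whichever locations are present in a TranscriptionJob dict. `transcript_locations` and the "unredacted"/"redacted" labels are illustrative names, and the function assumes the Transcript entries may be absent while a job has not yet completed.

```python
def transcript_locations(job):
    """Map 'unredacted' and 'redacted' to the URIs found in a TranscriptionJob dict."""
    transcript = job.get("Transcript") or {}
    found = {}
    if transcript.get("TranscriptFileUri"):
        found["unredacted"] = transcript["TranscriptFileUri"]
    if transcript.get("RedactedTranscriptFileUri"):
        found["redacted"] = transcript["RedactedTranscriptFileUri"]
    return found


# Hypothetical usage with a response from this client:
#   locations = transcript_locations(response["TranscriptionJob"])
#   redacted_uri = locations.get("redacted")
```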
Updates an existing vocabulary with new values. The UpdateVocabulary operation overwrites all of the existing information with the values that you provide in the request.
See also: AWS API Documentation
Request Syntax
response = client.update_vocabulary(
    VocabularyName='string',
    LanguageCode='en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    Phrases=[
        'string',
    ],
    VocabularyFileUri='string'
)
[REQUIRED]
The name of the vocabulary to update. The name is case-sensitive.
[REQUIRED]
The language code of the vocabulary entries.
An array of strings containing the vocabulary entries.
The S3 location of the text file that contains the definition of the custom vocabulary. The URI must be in the same region as the API endpoint that you are calling. The general form is
https://s3.<aws-region>.amazonaws.com/<bucket-name>/<keyprefix>/<objectkey>
For example:
https://s3.us-east-1.amazonaws.com/examplebucket/vocab.txt
For more information about S3 object names, see Object Keys in the Amazon S3 Developer Guide .
For more information about custom vocabularies, see Custom Vocabularies .
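The general form of VocabularyFileUri given above can be produced with simple string formatting. This is a small sketch: `vocabulary_file_uri` is a hypothetical helper, and it performs no validation of region or bucket names.

```python
def vocabulary_file_uri(region, bucket, key):
    """Format an S3 location as https://s3.<aws-region>.amazonaws.com/<bucket-name>/<key>."""
    # The key may include a prefix such as "mediadocs/vocab.txt".
    return f"https://s3.{region}.amazonaws.com/{bucket}/{key.lstrip('/')}"


print(vocabulary_file_uri("us-east-1", "examplebucket", "vocab.txt"))
# prints https://s3.us-east-1.amazonaws.com/examplebucket/vocab.txt

# Hypothetical usage (requires AWS credentials and an existing vocabulary):
#   client.update_vocabulary(
#       VocabularyName="example-vocabulary",
#       LanguageCode="en-US",
#       VocabularyFileUri=vocabulary_file_uri("us-east-1", "examplebucket", "vocab.txt"))
```

The region passed here should be the same region as the API endpoint being called, as noted above.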
dict
Response Syntax
{
    'VocabularyName': 'string',
    'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    'LastModifiedTime': datetime(2015, 1, 1),
    'VocabularyState': 'PENDING'|'READY'|'FAILED'
}
Response Structure
(dict) --
VocabularyName (string) --
The name of the vocabulary that was updated.
LanguageCode (string) --
The language code of the vocabulary entries.
LastModifiedTime (datetime) --
The date and time that the vocabulary was updated.
VocabularyState (string) --
The processing state of the vocabulary. When the VocabularyState field contains READY , the vocabulary is ready to be used in a StartTranscriptionJob request.
Updates a vocabulary filter with a new list of filtered words.
See also: AWS API Documentation
Request Syntax
response = client.update_vocabulary_filter(
    VocabularyFilterName='string',
    Words=[
        'string',
    ],
    VocabularyFilterFileUri='string'
)
[REQUIRED]
The name of the vocabulary filter to update.
The words to use in the vocabulary filter. Only use characters from the character set defined for custom vocabularies. For a list of character sets, see Character Sets for Custom Vocabularies .
If you provide a list of words in the Words parameter, you can't use the VocabularyFilterFileUri parameter.
The Amazon S3 location of a text file used as input to create the vocabulary filter. Only use characters from the character set defined for custom vocabularies. For a list of character sets, see Character Sets for Custom Vocabularies .
The specified file must be less than 50 KB of UTF-8 characters.
If you provide the location of a list of words in the VocabularyFilterFileUri parameter, you can't use the Words parameter.
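Since Words and VocabularyFilterFileUri cannot be used together, a request builder can reject the combination early. The sketch below assumes nothing beyond that rule; `build_filter_update` is a hypothetical helper, and it does not check the character set or the 50 KB file limit.

```python
def build_filter_update(filter_name, words=None, file_uri=None):
    """Build keyword arguments for update_vocabulary_filter."""
    if words and file_uri:
        raise ValueError("use either Words or VocabularyFilterFileUri, not both")
    request = {"VocabularyFilterName": filter_name}
    if words:
        request["Words"] = list(words)
    if file_uri:
        request["VocabularyFilterFileUri"] = file_uri
    return request


# Hypothetical usage (requires AWS credentials and an existing filter):
#   kwargs = build_filter_update("example-filter", words=["alpha", "beta"])
#   response = client.update_vocabulary_filter(**kwargs)
```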
dict
Response Syntax
{
    'VocabularyFilterName': 'string',
    'LanguageCode': 'en-US'|'es-US'|'en-AU'|'fr-CA'|'en-GB'|'de-DE'|'pt-BR'|'fr-FR'|'it-IT'|'ko-KR'|'es-ES'|'en-IN'|'hi-IN'|'ar-SA'|'ru-RU'|'zh-CN'|'nl-NL'|'id-ID'|'ta-IN'|'fa-IR'|'en-IE'|'en-AB'|'en-WL'|'pt-PT'|'te-IN'|'tr-TR'|'de-CH'|'he-IL'|'ms-MY'|'ja-JP'|'ar-AE',
    'LastModifiedTime': datetime(2015, 1, 1)
}
Response Structure
(dict) --
VocabularyFilterName (string) --
The name of the updated vocabulary filter.
LanguageCode (string) --
The language code of the words in the vocabulary filter.
LastModifiedTime (datetime) --
The date and time that the vocabulary filter was updated.
The available paginators are: