kendra.Client
A low-level client representing AWSKendraFrontendService
Amazon Kendra is a service for indexing large document sets.
import boto3
client = boto3.client('kendra')
These are the available methods:
associate_entities_to_experience()
associate_personas_to_entities()
batch_delete_document()
batch_get_document_status()
batch_put_document()
can_paginate()
clear_query_suggestions()
close()
create_access_control_configuration()
create_data_source()
create_experience()
create_faq()
create_index()
create_query_suggestions_block_list()
create_thesaurus()
delete_access_control_configuration()
delete_data_source()
delete_experience()
delete_faq()
delete_index()
delete_principal_mapping()
delete_query_suggestions_block_list()
delete_thesaurus()
describe_access_control_configuration()
describe_data_source()
describe_experience()
describe_faq()
describe_index()
describe_principal_mapping()
describe_query_suggestions_block_list()
describe_query_suggestions_config()
describe_thesaurus()
disassociate_entities_from_experience()
disassociate_personas_from_entities()
get_paginator()
get_query_suggestions()
get_snapshots()
get_waiter()
list_access_control_configurations()
list_data_source_sync_jobs()
list_data_sources()
list_entity_personas()
list_experience_entities()
list_experiences()
list_faqs()
list_groups_older_than_ordering_id()
list_indices()
list_query_suggestions_block_lists()
list_tags_for_resource()
list_thesauri()
put_principal_mapping()
query()
start_data_source_sync_job()
stop_data_source_sync_job()
submit_feedback()
tag_resource()
untag_resource()
update_access_control_configuration()
update_data_source()
update_experience()
update_index()
update_query_suggestions_block_list()
update_query_suggestions_config()
update_thesaurus()
associate_entities_to_experience(**kwargs)
Grants users or groups in your IAM Identity Center identity source access to your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.associate_entities_to_experience(
Id='string',
IndexId='string',
EntityList=[
{
'EntityId': 'string',
'EntityType': 'USER'|'GROUP'
},
]
)
Id (string) --
[REQUIRED]
The identifier of your Amazon Kendra experience.
IndexId (string) --
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
EntityList (list) --
[REQUIRED]
Lists users or groups in your IAM Identity Center identity source.
(dict) --
Provides the configuration information for users or groups in your IAM Identity Center identity source to grant access to your Amazon Kendra experience.
EntityId (string) --
The identifier of a user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
EntityType (string) --
Specifies whether you are configuring a User or a Group.
Return type: dict
Response Syntax
{
'FailedEntityList': [
{
'EntityId': 'string',
'ErrorMessage': 'string'
},
]
}
Response Structure
(dict) --
FailedEntityList (list) --
Lists the users or groups in your IAM Identity Center identity source that failed to properly configure with your Amazon Kendra experience.
(dict) --
Information on the users or groups in your IAM Identity Center identity source that failed to properly configure with your Amazon Kendra experience.
EntityId (string) --
The identifier of the user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
ErrorMessage (string) --
The reason the user or group in your IAM Identity Center identity source failed to properly configure with your Amazon Kendra experience.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ResourceAlreadyExistException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
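The call above reports per-entity failures in the response rather than raising, so callers usually need to inspect FailedEntityList. The following is a minimal sketch of that pattern, assuming a client created with boto3.client('kendra'); the helper name grant_experience_access and the tuple-based input format are illustrative and not part of the API.

```python
def grant_experience_access(client, experience_id, index_id, entities):
    """Associate users or groups with an experience and return the failures.

    `entities` is a list of (entity_id, entity_type) tuples, where
    entity_type is 'USER' or 'GROUP'. This helper is hypothetical; only
    the associate_entities_to_experience call comes from the API.
    """
    response = client.associate_entities_to_experience(
        Id=experience_id,
        IndexId=index_id,
        EntityList=[
            {"EntityId": entity_id, "EntityType": entity_type}
            for entity_id, entity_type in entities
        ],
    )
    # Map each failed entity ID to the reason reported by the service.
    return {
        failed["EntityId"]: failed["ErrorMessage"]
        for failed in response.get("FailedEntityList", [])
    }
```

An empty dictionary means every entity in the request was accepted; any other result lists the entities to retry or correct.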
associate_personas_to_entities(**kwargs)
Defines the specific permissions of users or groups in your IAM Identity Center identity source with access to your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.associate_personas_to_entities(
Id='string',
IndexId='string',
Personas=[
{
'EntityId': 'string',
'Persona': 'OWNER'|'VIEWER'
},
]
)
Id (string) --
[REQUIRED]
The identifier of your Amazon Kendra experience.
IndexId (string) --
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
Personas (list) --
[REQUIRED]
The personas that define the specific permissions of users or groups in your IAM Identity Center identity source. The available personas or access roles are Owner and Viewer. For more information on these personas, see Providing access to your search page.
(dict) --
Provides the configuration information for users or groups in your IAM Identity Center identity source for access to your Amazon Kendra experience. Specific permissions are defined for each user or group once they are granted access to your Amazon Kendra experience.
EntityId (string) --
The identifier of a user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
Persona (string) --
The persona that defines the specific permissions of the user or group in your IAM Identity Center identity source. The available personas or access roles are Owner and Viewer. For more information on these personas, see Providing access to your search page.
Return type: dict
Response Syntax
{
'FailedEntityList': [
{
'EntityId': 'string',
'ErrorMessage': 'string'
},
]
}
Response Structure
(dict) --
FailedEntityList (list) --
Lists the users or groups in your IAM Identity Center identity source that failed to properly configure with your Amazon Kendra experience.
(dict) --
Information on the users or groups in your IAM Identity Center identity source that failed to properly configure with your Amazon Kendra experience.
EntityId (string) --
The identifier of the user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
ErrorMessage (string) --
The reason the user or group in your IAM Identity Center identity source failed to properly configure with your Amazon Kendra experience.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ResourceAlreadyExistException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
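As a rough illustration of the request shape above, the sketch below assigns a persona to each entity and returns whatever the service lists in FailedEntityList. The helper name assign_personas, the VALID_PERSONAS constant, and the dictionary input are assumptions made for the example; the client-side check simply mirrors the two persona values shown in the request syntax.

```python
VALID_PERSONAS = {"OWNER", "VIEWER"}  # values taken from the request syntax


def assign_personas(client, experience_id, index_id, persona_by_entity):
    """Assign a persona to each entity; return the entries that failed.

    `persona_by_entity` maps an entity ID to 'OWNER' or 'VIEWER'.
    This helper is hypothetical and not part of the Kendra API.
    """
    unknown = set(persona_by_entity.values()) - VALID_PERSONAS
    if unknown:
        # Fail early instead of waiting for a ValidationException.
        raise ValueError(f"unsupported personas: {sorted(unknown)}")

    response = client.associate_personas_to_entities(
        Id=experience_id,
        IndexId=index_id,
        Personas=[
            {"EntityId": entity_id, "Persona": persona}
            for entity_id, persona in persona_by_entity.items()
        ],
    )
    return response.get("FailedEntityList", [])
```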
batch_delete_document(**kwargs)
Removes one or more documents from an index. The documents must have been added with the BatchPutDocument API.
The documents are deleted asynchronously. You can see the progress of the deletion by using Amazon Web Services CloudWatch. Any error messages related to the processing of the batch are sent to your CloudWatch log.
See also: AWS API Documentation
Request Syntax
response = client.batch_delete_document(
IndexId='string',
DocumentIdList=[
'string',
],
DataSourceSyncJobMetricTarget={
'DataSourceId': 'string',
'DataSourceSyncJobId': 'string'
}
)
IndexId (string) --
[REQUIRED]
The identifier of the index that contains the documents to delete.
DocumentIdList (list) --
[REQUIRED]
One or more identifiers for documents to delete from the index.
DataSourceSyncJobMetricTarget (dict) --
Maps a particular data source sync job to a particular data source.
DataSourceId (string) --
The ID of the data source that is running the sync job.
DataSourceSyncJobId (string) --
The ID of the sync job that is running on the data source.
If the ID of a sync job is not provided and there is a sync job running, then the ID of this sync job is used and metrics are generated for this sync job.
If the ID of a sync job is not provided and there is no sync job running, then no metrics are generated and documents are indexed/deleted at the index level without sync job metrics included.
Return type: dict
Response Syntax
{
'FailedDocuments': [
{
'Id': 'string',
'ErrorCode': 'InternalError'|'InvalidRequest',
'ErrorMessage': 'string'
},
]
}
Response Structure
(dict) --
FailedDocuments (list) --
A list of documents that could not be removed from the index. Each entry contains an error message that indicates why the document couldn't be removed from the index.
(dict) --
Provides information about documents that could not be removed from an index by the BatchDeleteDocument API.
Id (string) --
The identifier of the document that couldn't be removed from the index.
ErrorCode (string) --
The error code for why the document couldn't be removed from the index.
ErrorMessage (string) --
An explanation for why the document couldn't be removed from the index.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
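Because deletion is asynchronous, the response only tells you which documents could not be submitted for removal. A minimal sketch of collecting those failures follows; the helper name delete_documents is illustrative, and the client is assumed to come from boto3.client('kendra').

```python
def delete_documents(client, index_id, document_ids):
    """Submit documents for deletion and return the ones that were rejected.

    Hypothetical helper: only batch_delete_document is an API call.
    """
    response = client.batch_delete_document(
        IndexId=index_id,
        DocumentIdList=list(document_ids),
    )
    # Each failed entry carries the document ID, an error code, and a message.
    return {
        failed["Id"]: (failed["ErrorCode"], failed["ErrorMessage"])
        for failed in response.get("FailedDocuments", [])
    }
```

An empty result does not mean the documents are already gone from the index; it only means none were rejected when the request was accepted.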
batch_get_document_status(**kwargs)
Returns the indexing status for one or more documents submitted with the BatchPutDocument API.
When you use the BatchPutDocument API, documents are indexed asynchronously. You can use the BatchGetDocumentStatus API to get the current status of a list of documents so that you can determine if they have been successfully indexed.
You can also use the BatchGetDocumentStatus API to check the status of the BatchDeleteDocument API. When a document is deleted from the index, Amazon Kendra returns NOT_FOUND as the status.
See also: AWS API Documentation
Request Syntax
response = client.batch_get_document_status(
IndexId='string',
DocumentInfoList=[
{
'DocumentId': 'string',
'Attributes': [
{
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
]
},
]
)
IndexId (string) --
[REQUIRED]
The identifier of the index that contains the documents whose status you want to check. The index ID is returned by the CreateIndex API.
DocumentInfoList (list) --
[REQUIRED]
A list of DocumentInfo objects that identify the documents for which to get the status. You identify the documents by their document ID and optional attributes.
(dict) --
Identifies a document for which to retrieve status information.
DocumentId (string) --
The identifier of the document.
Attributes (list) --
Attributes that identify a specific version of a document to check.
The only valid attributes are version, dataSourceId, and jobExecutionId.
The attributes follow these rules:
dataSourceId and jobExecutionId must be used together.
version is ignored if dataSourceId and jobExecutionId are not provided.
If dataSourceId and jobExecutionId are provided, but version is not, the version defaults to "0".
(dict) --
A document attribute or metadata field. To create custom document attributes, see Custom attributes.
Key (string) --
The identifier for the attribute.
Value (dict) --
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Return type: dict
Response Syntax
{
'Errors': [
{
'DocumentId': 'string',
'ErrorCode': 'InternalError'|'InvalidRequest',
'ErrorMessage': 'string'
},
],
'DocumentStatusList': [
{
'DocumentId': 'string',
'DocumentStatus': 'NOT_FOUND'|'PROCESSING'|'INDEXED'|'UPDATED'|'FAILED'|'UPDATE_FAILED',
'FailureCode': 'string',
'FailureReason': 'string'
},
]
}
Response Structure
(dict) --
Errors (list) --
A list of documents that Amazon Kendra couldn't get the status for. The list includes the ID of the document and the reason that the status couldn't be found.
(dict) --
Provides a response when the status of a document could not be retrieved.
DocumentId (string) --
The identifier of the document whose status could not be retrieved.
ErrorCode (string) --
Indicates the source of the error.
ErrorMessage (string) --
States that the API could not get the status of a document. This could be because the request is not valid or there is a system error.
DocumentStatusList (list) --
The status of documents. The status indicates if the document is waiting to be indexed, is in the process of indexing, has completed indexing, or failed indexing. If a document failed indexing, the status provides the reason why.
(dict) --
Provides information about the status of documents submitted for indexing.
DocumentId (string) --
The identifier of the document.
DocumentStatus (string) --
The current status of a document.
If the document was submitted for deletion, the status is NOT_FOUND after the document is deleted.
FailureCode (string) --
Indicates the source of the error.
FailureReason (string) --
Provides detailed information about why the document couldn't be indexed. Use this information to correct the error before you resubmit the document for indexing.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
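Since indexing is asynchronous, a common pattern is to poll this operation until no document is still PROCESSING. The sketch below shows one way to do that under some assumptions: the helper name wait_for_documents, the delay and attempt limits, and the choice to treat every status other than PROCESSING as final are all illustrative and not prescribed by the API.

```python
import time


def wait_for_documents(client, index_id, document_ids,
                       delay=5.0, max_attempts=12, sleep=time.sleep):
    """Poll document status until nothing is PROCESSING or attempts run out.

    Returns (final, pending): `final` maps document IDs to their last
    reported status (or an error message), `pending` is the set of IDs
    still processing. Hypothetical helper, not part of the Kendra API.
    """
    pending = set(document_ids)
    final = {}
    for attempt in range(max_attempts):
        response = client.batch_get_document_status(
            IndexId=index_id,
            DocumentInfoList=[{"DocumentId": d} for d in sorted(pending)],
        )
        for status in response.get("DocumentStatusList", []):
            # Assumption: any status other than PROCESSING is treated as final.
            if status["DocumentStatus"] != "PROCESSING":
                final[status["DocumentId"]] = status["DocumentStatus"]
                pending.discard(status["DocumentId"])
        for error in response.get("Errors", []):
            # The status itself could not be retrieved for these documents.
            final[error["DocumentId"]] = "ERROR: " + error["ErrorMessage"]
            pending.discard(error["DocumentId"])
        if not pending:
            break
        if attempt < max_attempts - 1:
            sleep(delay)
    return final, pending
```

Passing sleep as a parameter keeps the helper easy to exercise in tests without real waiting.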
batch_put_document(**kwargs)
Adds one or more documents to an index.
The BatchPutDocument API enables you to ingest inline documents or a set of documents stored in an Amazon S3 bucket. Use this API to ingest your text and unstructured text into an index, add custom attributes to the documents, and to attach an access control list to the documents added to the index.
The documents are indexed asynchronously. You can see the progress of the batch using Amazon Web Services CloudWatch. Any error messages related to processing the batch are sent to your Amazon Web Services CloudWatch log.
For an example of ingesting inline documents using Python and Java SDKs, see Adding files directly to an index.
See also: AWS API Documentation
Request Syntax
response = client.batch_put_document(
IndexId='string',
RoleArn='string',
Documents=[
{
'Id': 'string',
'Title': 'string',
'Blob': b'bytes',
'S3Path': {
'Bucket': 'string',
'Key': 'string'
},
'Attributes': [
{
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
],
'AccessControlList': [
{
'Name': 'string',
'Type': 'USER'|'GROUP',
'Access': 'ALLOW'|'DENY',
'DataSourceId': 'string'
},
],
'HierarchicalAccessControlList': [
{
'PrincipalList': [
{
'Name': 'string',
'Type': 'USER'|'GROUP',
'Access': 'ALLOW'|'DENY',
'DataSourceId': 'string'
},
]
},
],
'ContentType': 'PDF'|'HTML'|'MS_WORD'|'PLAIN_TEXT'|'PPT',
'AccessControlConfigurationId': 'string'
},
],
CustomDocumentEnrichmentConfiguration={
'InlineConfigurations': [
{
'Condition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'Target': {
'TargetDocumentAttributeKey': 'string',
'TargetDocumentAttributeValueDeletion': True|False,
'TargetDocumentAttributeValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'DocumentContentDeletion': True|False
},
],
'PreExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'PostExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'RoleArn': 'string'
}
)
IndexId (string) --
[REQUIRED]
The identifier of the index to add the documents to. You need to create the index first using the CreateIndex API.
RoleArn (string) --
The Amazon Resource Name (ARN) of a role that is allowed to run the BatchPutDocument API. For more information, see IAM Roles for Amazon Kendra.
Documents (list) --
[REQUIRED]
One or more documents to add to the index.
Documents are subject to file size limits. For more information about file size and transaction per second quotas, see Quotas.
(dict) --
A document in an index.
Id (string) --
An identifier of the document in the index.
Note, each document ID must be unique per index. You cannot create a data source to index your documents with their unique IDs and then use the BatchPutDocument API to index the same documents, or vice versa. You can delete a data source and then use the BatchPutDocument API to index the same documents, or vice versa.
Title (string) --
The title of the document.
Blob (bytes) --
The contents of the document.
Documents passed to the Blob parameter must be base64 encoded. Your code might not need to encode the document file bytes if you're using an Amazon Web Services SDK to call Amazon Kendra APIs. If you are calling the Amazon Kendra endpoint directly using REST, you must base64 encode the contents before sending.
S3Path (dict) --
Information required to find a specific file in an Amazon S3 bucket.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
Attributes (list) --
Custom attributes to apply to the document. Use the custom attributes to provide additional information for searching, to provide facets for refining searches, and to provide additional information in the query response.
For example, 'DataSourceId' and 'DataSourceSyncJobId' are custom attributes that provide information on the synchronization of documents running on a data source. Note, 'DataSourceSyncJobId' could be an optional custom attribute as Amazon Kendra will use the ID of a running sync job.
(dict) --
A document attribute or metadata field. To create custom document attributes, see Custom attributes.
Key (string) --
The identifier for the attribute.
Value (dict) --
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
AccessControlList (list) --
Information on principals (users and/or groups) and which documents they should have access to. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
(dict) --
Provides user and group information for user context filtering.
Name (string) --
The name of the user or group.
Type (string) --
The type of principal.
Access (string) --
Whether to allow or deny document access to the principal.
DataSourceId (string) --
The identifier of the data source the principal should access documents from.
HierarchicalAccessControlList (list) --
The list of principal lists that define the hierarchy for which documents users should have access to.
(dict) --
Information to define the hierarchy for which documents users should have access to.
PrincipalList (list) --
A list of principal lists that define the hierarchy for which documents users should have access to. Each hierarchical list specifies which user or group has allow or deny access for each document.
(dict) --
Provides user and group information for user context filtering.
Name (string) --
The name of the user or group.
Type (string) --
The type of principal.
Access (string) --
Whether to allow or deny document access to the principal.
DataSourceId (string) --
The identifier of the data source the principal should access documents from.
ContentType (string) --
The file type of the document in the Blob field.
AccessControlConfigurationId (string) --
The identifier of the access control configuration that you want to apply to the document.
CustomDocumentEnrichmentConfiguration (dict) --
Configuration information for altering your document metadata and content during the document ingestion process when you use the BatchPutDocument API.
For more information on how to create, modify and delete document metadata, or make other content alterations when you ingest documents into Amazon Kendra, see Customizing document metadata during the ingestion process.
InlineConfigurations (list) --
Configuration information to alter document attributes or metadata fields and content when ingesting documents into Amazon Kendra.
(dict) --
Provides the configuration information for applying basic logic to alter document metadata and content when ingesting documents into Amazon Kendra. To apply advanced logic that goes beyond what you can do with basic logic, see HookConfiguration.
For more information, see Customizing document metadata during the ingestion process.
Condition (dict) --
Configuration of the condition used for the target document attribute or metadata field when ingesting documents into Amazon Kendra.
ConditionDocumentAttributeKey (string) --
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
Operator (string) --
The condition operator.
For example, you can use 'Contains' to partially match a string.
ConditionOnValue (dict) --
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Target (dict) --
Configuration of the target document attribute or metadata field when ingesting documents into Amazon Kendra. You can also include a value.
TargetDocumentAttributeKey (string) --
The identifier of the target document attribute or metadata field.
For example, 'Department' could be an identifier for the target attribute or metadata field that includes the department names associated with the documents.
TargetDocumentAttributeValueDeletion (boolean) --
TRUE to delete the existing target value for your specified target attribute key. You cannot create a target value and set this to TRUE. To create a target value (TargetDocumentAttributeValue), set this to FALSE.
TargetDocumentAttributeValue (dict) --
The target value you want to create for the target attribute.
For example, 'Finance' could be the target value for the target attribute key 'Department'.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
DocumentContentDeletion (boolean) --
TRUE to delete content if the condition used for the target attribute is met.
PreExtractionHookConfiguration (dict) --
Configuration information for invoking a Lambda function on the original or raw documents before extracting their metadata and text. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
InvocationCondition (dict) --
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
ConditionDocumentAttributeKey (string) --
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
Operator (string) --
The condition operator.
For example, you can use 'Contains' to partially match a string.
ConditionOnValue (dict) --
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
LambdaArn (string) --
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
S3Bucket (string) --
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
PostExtractionHookConfiguration (dict) --
Configuration information for invoking a Lambda function on the structured documents with their metadata and text extracted. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
InvocationCondition (dict) --
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
ConditionDocumentAttributeKey (string) --
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
Operator (string) --
The condition operator.
For example, you can use 'Contains' to partially match a string.
ConditionOnValue (dict) --
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
LambdaArn (string) --
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
S3Bucket (string) --
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
RoleArn (string) --
The Amazon Resource Name (ARN) of a role with permission to run PreExtractionHookConfiguration and PostExtractionHookConfiguration for altering document metadata and content during the document ingestion process. For more information, see IAM roles for Amazon Kendra.
Return type: dict
Response Syntax
{
'FailedDocuments': [
{
'Id': 'string',
'ErrorCode': 'InternalError'|'InvalidRequest',
'ErrorMessage': 'string'
},
]
}
Response Structure
(dict) --
FailedDocuments (list) --
A list of documents that were not added to the index because the document failed a validation check. Each document contains an error message that indicates why the document couldn't be added to the index.
If there was an error adding a document to an index, the error is reported in your Amazon Web Services CloudWatch log. For more information, see Monitoring Amazon Kendra with Amazon CloudWatch Logs.
(dict) --
Provides information about a document that could not be indexed.
Id (string) --
The identifier of the document.
ErrorCode (string) --
The type of error that caused the document to fail to be indexed.
ErrorMessage (string) --
A description of the reason why the document could not be indexed.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.InternalServerException
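To make the Documents structure more concrete, here is a minimal sketch of building one inline plain-text document and submitting it. The helper names build_inline_document and put_documents are illustrative. 'Department' and 'Finance' are borrowed from the examples in the parameter descriptions above, while the 'PublishedAt' attribute key is purely hypothetical; any custom attribute is assumed to already exist in the index configuration. The raw bytes are passed to Blob on the assumption that the SDK takes care of encoding, as the Blob description suggests.

```python
from datetime import datetime


def build_inline_document(doc_id: str, title: str, text: str,
                          department: str, published: datetime) -> dict:
    """Build one entry for the Documents list (hypothetical helper)."""
    if published.utcoffset() is None:
        # The DateValue description asks for the time zone to be included.
        raise ValueError("published must be a timezone-aware datetime")
    return {
        "Id": doc_id,
        "Title": title,
        # Assumption: boto3 handles the base64 encoding of blob parameters.
        "Blob": text.encode("utf-8"),
        "ContentType": "PLAIN_TEXT",
        "Attributes": [
            {"Key": "Department", "Value": {"StringValue": department}},
            # 'PublishedAt' is a made-up custom attribute for illustration.
            {"Key": "PublishedAt", "Value": {"DateValue": published}},
        ],
    }


def put_documents(client, index_id, documents):
    """Submit documents and return {document_id: error_message} for failures."""
    response = client.batch_put_document(IndexId=index_id, Documents=documents)
    return {
        failed["Id"]: failed["ErrorMessage"]
        for failed in response.get("FailedDocuments", [])
    }
```

FailedDocuments covers only documents that failed validation when the request was submitted; errors that occur later during asynchronous indexing are reported in the CloudWatch log.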
can_paginate(operation_name)
Check if an operation can be paginated.
operation_name (string) -- The operation name. This is the same name as the method name on the client. For example, if the method name is create_foo, and you'd normally invoke the operation as client.create_foo(**kwargs), if the create_foo operation can be paginated, you can use the call client.get_paginator("create_foo").
Returns: True if the operation can be paginated, False otherwise.
clear_query_suggestions(**kwargs)
Clears existing query suggestions from an index.
This deletes existing suggestions only, not the queries in the query log. After you clear suggestions, Amazon Kendra learns new suggestions based on new queries added to the query log from the time you cleared suggestions. If you do not see any new suggestions, then please allow Amazon Kendra to collect enough queries to learn new suggestions.
ClearQuerySuggestions is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.clear_query_suggestions(
IndexId='string'
)
IndexId (string) --
[REQUIRED]
The identifier of the index you want to clear query suggestions from.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
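The exceptions listed above are exposed on the client object, so they can be caught without importing anything service-specific. A minimal sketch, assuming a client from boto3.client('kendra'); the helper name clear_suggestions and its boolean return convention are illustrative only.

```python
def clear_suggestions(client, index_id):
    """Clear query suggestions; return False if the index does not exist.

    Hypothetical helper: only clear_query_suggestions and the exception
    class on client.exceptions come from the API.
    """
    try:
        client.clear_query_suggestions(IndexId=index_id)
    except client.exceptions.ResourceNotFoundException:
        # The index ID is unknown; other exceptions propagate to the caller.
        return False
    return True
```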
close()
Closes underlying endpoint connections.
create_access_control_configuration(**kwargs)
Creates an access configuration for your documents. This includes user and group access information for your documents. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
You can use this to re-configure your existing document level access control without indexing all of your documents again. For example, your index contains top-secret company documents that only certain employees or users should access. One of these users leaves the company or switches to a team that should be blocked from accessing top-secret documents. The user still has access to top-secret documents because the user had access when your documents were previously indexed. You can create a specific access control configuration for the user with deny access. You can later update the access control configuration to allow access if the user returns to the company and re-joins the 'top-secret' team. You can re-configure access control for your documents as circumstances change.
To apply your access control configuration to certain documents, you call the BatchPutDocument API with the AccessControlConfigurationId included in the Document object. If you use an S3 bucket as a data source, you update the .metadata.json with the AccessControlConfigurationId and synchronize your data source. Amazon Kendra currently only supports access control configuration for S3 data sources and documents indexed using the BatchPutDocument API.
See also: AWS API Documentation
Request Syntax
response = client.create_access_control_configuration(
IndexId='string',
Name='string',
Description='string',
AccessControlList=[
{
'Name': 'string',
'Type': 'USER'|'GROUP',
'Access': 'ALLOW'|'DENY',
'DataSourceId': 'string'
},
],
HierarchicalAccessControlList=[
{
'PrincipalList': [
{
'Name': 'string',
'Type': 'USER'|'GROUP',
'Access': 'ALLOW'|'DENY',
'DataSourceId': 'string'
},
]
},
],
ClientToken='string'
)
IndexId (string) --
[REQUIRED]
The identifier of the index to create an access control configuration for your documents.
Name (string) --
[REQUIRED]
A name for the access control configuration.
Description (string) --
A description for the access control configuration.
AccessControlList (list) --
Information on principals (users and/or groups) and which documents they should have access to. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
(dict) --
Provides user and group information for user context filtering.
Name (string) --
The name of the user or group.
Type (string) --
The type of principal.
Access (string) --
Whether to allow or deny document access to the principal.
DataSourceId (string) --
The identifier of the data source the principal should access documents from.
HierarchicalAccessControlList (list) --
The list of principal lists that define the hierarchy for which documents users should have access to.
(dict) --
Information to define the hierarchy for which documents users should have access to.
PrincipalList (list) --
A list of principal lists that define the hierarchy for which documents users should have access to. Each hierarchical list specifies which user or group has allow or deny access for each document.
(dict) --
Provides user and group information for user context filtering.
Name (string) --
The name of the user or group.
Type (string) --
The type of principal.
Access (string) --
Whether to allow or deny document access to the principal.
DataSourceId (string) --
The identifier of the data source the principal should access documents from.
ClientToken (string) --
A token that you provide to identify the request to create an access control configuration. Multiple calls to the CreateAccessControlConfiguration API with the same client token will create only one access control configuration.
This field is autopopulated if not provided.
Return type: dict
Response Syntax
{
'Id': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier of the access control configuration for your documents in an index.
Exceptions
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.InternalServerException
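As a hedged sketch of how the request above might be assembled for the "deny one user" scenario, the helpers below build the keyword arguments and return the new configuration's Id. The helper names and argument values are hypothetical; `client` is assumed to be a Kendra client created with `boto3.client('kendra')`.

```python
import uuid


def build_deny_user_request(index_id, config_name, user_name, client_token=None):
    """Build kwargs for create_access_control_configuration that deny one user."""
    return {
        "IndexId": index_id,
        "Name": config_name,
        "AccessControlList": [
            {"Name": user_name, "Type": "USER", "Access": "DENY"},
        ],
        # Reusing the same token on a retry creates only one configuration.
        "ClientToken": client_token or str(uuid.uuid4()),
    }


def create_deny_user_config(client, index_id, config_name, user_name):
    """Create the configuration and return its identifier."""
    request = build_deny_user_request(index_id, config_name, user_name)
    response = client.create_access_control_configuration(**request)
    return response["Id"]
```

Keeping the request builder separate from the call makes it easy to reuse the same ClientToken if the call has to be retried.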
create_data_source(**kwargs)
Creates a data source connector that you want to use with an Amazon Kendra index.
You specify a name, data source connector type and description for your data source. You also specify configuration information for the data source connector.
CreateDataSource
is a synchronous operation. The operation returns 200 if the data source was successfully created. Otherwise, an exception is raised.
Amazon S3 and custom data sources are the only supported data sources in the Amazon Web Services GovCloud (US-West) region.
For an example of creating an index and data source using the Python SDK, see Getting started with Python SDK. For an example of creating an index and data source using the Java SDK, see Getting started with Java SDK.
See also: AWS API Documentation
Request Syntax
response = client.create_data_source(
Name='string',
IndexId='string',
Type='S3'|'SHAREPOINT'|'DATABASE'|'SALESFORCE'|'ONEDRIVE'|'SERVICENOW'|'CUSTOM'|'CONFLUENCE'|'GOOGLEDRIVE'|'WEBCRAWLER'|'WORKDOCS'|'FSX'|'SLACK'|'BOX'|'QUIP'|'JIRA'|'GITHUB'|'ALFRESCO'|'TEMPLATE',
Configuration={
'S3Configuration': {
'BucketName': 'string',
'InclusionPrefixes': [
'string',
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'DocumentsMetadataConfiguration': {
'S3Prefix': 'string'
},
'AccessControlListConfiguration': {
'KeyPath': 'string'
}
},
'SharePointConfiguration': {
'SharePointVersion': 'SHAREPOINT_2013'|'SHAREPOINT_2016'|'SHAREPOINT_ONLINE'|'SHAREPOINT_2019',
'Urls': [
'string',
],
'SecretArn': 'string',
'CrawlAttachments': True|False,
'UseChangeLog': True|False,
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'DocumentTitleFieldName': 'string',
'DisableLocalGroups': True|False,
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'AuthenticationType': 'HTTP_BASIC'|'OAUTH2',
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
}
},
'DatabaseConfiguration': {
'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL',
'ConnectionConfiguration': {
'DatabaseHost': 'string',
'DatabasePort': 123,
'DatabaseName': 'string',
'TableName': 'string',
'SecretArn': 'string'
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'ColumnConfiguration': {
'DocumentIdColumnName': 'string',
'DocumentDataColumnName': 'string',
'DocumentTitleColumnName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ChangeDetectingColumns': [
'string',
]
},
'AclConfiguration': {
'AllowedGroupsColumnName': 'string'
},
'SqlConfiguration': {
'QueryIdentifiersEnclosingOption': 'DOUBLE_QUOTES'|'NONE'
}
},
'SalesforceConfiguration': {
'ServerUrl': 'string',
'SecretArn': 'string',
'StandardObjectConfigurations': [
{
'Name': 'ACCOUNT'|'CAMPAIGN'|'CASE'|'CONTACT'|'CONTRACT'|'DOCUMENT'|'GROUP'|'IDEA'|'LEAD'|'OPPORTUNITY'|'PARTNER'|'PRICEBOOK'|'PRODUCT'|'PROFILE'|'SOLUTION'|'TASK'|'USER',
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
],
'KnowledgeArticleConfiguration': {
'IncludedStates': [
'DRAFT'|'PUBLISHED'|'ARCHIVED',
],
'StandardKnowledgeArticleTypeConfiguration': {
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'CustomKnowledgeArticleTypeConfigurations': [
{
'Name': 'string',
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
]
},
'ChatterFeedConfiguration': {
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'IncludeFilterTypes': [
'ACTIVE_USER'|'STANDARD_USER',
]
},
'CrawlAttachments': True|False,
'StandardObjectAttachmentConfiguration': {
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
]
},
'OneDriveConfiguration': {
'TenantDomain': 'string',
'SecretArn': 'string',
'OneDriveUsers': {
'OneDriveUserList': [
'string',
],
'OneDriveUserS3Path': {
'Bucket': 'string',
'Key': 'string'
}
},
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'DisableLocalGroups': True|False
},
'ServiceNowConfiguration': {
'HostUrl': 'string',
'SecretArn': 'string',
'ServiceNowBuildVersion': 'LONDON'|'OTHERS',
'KnowledgeArticleConfiguration': {
'CrawlAttachments': True|False,
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
],
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'FilterQuery': 'string'
},
'ServiceCatalogConfiguration': {
'CrawlAttachments': True|False,
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
],
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AuthenticationType': 'HTTP_BASIC'|'OAUTH2'
},
'ConfluenceConfiguration': {
'ServerUrl': 'string',
'SecretArn': 'string',
'Version': 'CLOUD'|'SERVER',
'SpaceConfiguration': {
'CrawlPersonalSpaces': True|False,
'CrawlArchivedSpaces': True|False,
'IncludeSpaces': [
'string',
],
'ExcludeSpaces': [
'string',
],
'SpaceFieldMappings': [
{
'DataSourceFieldName': 'DISPLAY_URL'|'ITEM_TYPE'|'SPACE_KEY'|'URL',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'PageConfiguration': {
'PageFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'CONTENT_STATUS'|'CREATED_DATE'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'MODIFIED_DATE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'BlogConfiguration': {
'BlogFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'PUBLISH_DATE'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AttachmentConfiguration': {
'CrawlAttachments': True|False,
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'CONTENT_TYPE'|'CREATED_DATE'|'DISPLAY_URL'|'FILE_SIZE'|'ITEM_TYPE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
'AuthenticationType': 'HTTP_BASIC'|'PAT'
},
'GoogleDriveConfiguration': {
'SecretArn': 'string',
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ExcludeMimeTypes': [
'string',
],
'ExcludeUserAccounts': [
'string',
],
'ExcludeSharedDrives': [
'string',
]
},
'WebCrawlerConfiguration': {
'Urls': {
'SeedUrlConfiguration': {
'SeedUrls': [
'string',
],
'WebCrawlerMode': 'HOST_ONLY'|'SUBDOMAINS'|'EVERYTHING'
},
'SiteMapsConfiguration': {
'SiteMaps': [
'string',
]
}
},
'CrawlDepth': 123,
'MaxLinksPerPage': 123,
'MaxContentSizePerPageInMegaBytes': ...,
'MaxUrlsPerMinuteCrawlRate': 123,
'UrlInclusionPatterns': [
'string',
],
'UrlExclusionPatterns': [
'string',
],
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
'AuthenticationConfiguration': {
'BasicAuthentication': [
{
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
]
}
},
'WorkDocsConfiguration': {
'OrganizationId': 'string',
'CrawlComments': True|False,
'UseChangeLog': True|False,
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'FsxConfiguration': {
'FileSystemId': 'string',
'FileSystemType': 'WINDOWS',
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'SecretArn': 'string',
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'SlackConfiguration': {
'TeamId': 'string',
'SecretArn': 'string',
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'SlackEntityList': [
'PUBLIC_CHANNEL'|'PRIVATE_CHANNEL'|'GROUP_MESSAGE'|'DIRECT_MESSAGE',
],
'UseChangeLog': True|False,
'CrawlBotMessage': True|False,
'ExcludeArchived': True|False,
'SinceCrawlDate': 'string',
'LookBackPeriod': 123,
'PrivateChannelFilter': [
'string',
],
'PublicChannelFilter': [
'string',
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'BoxConfiguration': {
'EnterpriseId': 'string',
'SecretArn': 'string',
'UseChangeLog': True|False,
'CrawlComments': True|False,
'CrawlTasks': True|False,
'CrawlWebLinks': True|False,
'FileFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'TaskFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'CommentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WebLinkFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'QuipConfiguration': {
'Domain': 'string',
'SecretArn': 'string',
'CrawlFileComments': True|False,
'CrawlChatRooms': True|False,
'CrawlAttachments': True|False,
'FolderIds': [
'string',
],
'ThreadFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'MessageFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'JiraConfiguration': {
'JiraAccountUrl': 'string',
'SecretArn': 'string',
'UseChangeLog': True|False,
'Project': [
'string',
],
'IssueType': [
'string',
],
'Status': [
'string',
],
'IssueSubEntityFilter': [
'COMMENTS'|'ATTACHMENTS'|'WORKLOGS',
],
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'CommentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'IssueFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ProjectFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WorkLogFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'GitHubConfiguration': {
'SaaSConfiguration': {
'OrganizationName': 'string',
'HostUrl': 'string'
},
'OnPremiseConfiguration': {
'HostUrl': 'string',
'OrganizationName': 'string',
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
}
},
'Type': 'SAAS'|'ON_PREMISE',
'SecretArn': 'string',
'UseChangeLog': True|False,
'GitHubDocumentCrawlProperties': {
'CrawlRepositoryDocuments': True|False,
'CrawlIssue': True|False,
'CrawlIssueComment': True|False,
'CrawlIssueCommentAttachment': True|False,
'CrawlPullRequest': True|False,
'CrawlPullRequestComment': True|False,
'CrawlPullRequestCommentAttachment': True|False
},
'RepositoryFilter': [
'string',
],
'InclusionFolderNamePatterns': [
'string',
],
'InclusionFileTypePatterns': [
'string',
],
'InclusionFileNamePatterns': [
'string',
],
'ExclusionFolderNamePatterns': [
'string',
],
'ExclusionFileTypePatterns': [
'string',
],
'ExclusionFileNamePatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'GitHubRepositoryConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubCommitConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueDocumentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueCommentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueAttachmentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestCommentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestDocumentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestDocumentAttachmentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AlfrescoConfiguration': {
'SiteUrl': 'string',
'SiteId': 'string',
'SecretArn': 'string',
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'CrawlSystemFolders': True|False,
'CrawlComments': True|False,
'EntityFilter': [
'wiki'|'blog'|'documentLibrary',
],
'DocumentLibraryFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'BlogFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WikiFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'TemplateConfiguration': {
'Template': {...}|[...]|123|123.4|'string'|True|None
}
},
VpcConfiguration={
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
Description='string',
Schedule='string',
RoleArn='string',
Tags=[
{
'Key': 'string',
'Value': 'string'
},
],
ClientToken='string',
LanguageCode='string',
CustomDocumentEnrichmentConfiguration={
'InlineConfigurations': [
{
'Condition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'Target': {
'TargetDocumentAttributeKey': 'string',
'TargetDocumentAttributeValueDeletion': True|False,
'TargetDocumentAttributeValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'DocumentContentDeletion': True|False
},
],
'PreExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'PostExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'RoleArn': 'string'
}
)
[REQUIRED]
A name for the data source connector.
[REQUIRED]
The identifier of the index you want to use with the data source connector.
[REQUIRED]
The type of data source repository. For example, SHAREPOINT
.
Configuration information to connect to your data source repository.
You can't specify the Configuration
parameter when the Type
parameter is set to CUSTOM
. If you do, you receive a ValidationException
exception.
The Configuration
parameter is required for all other data sources.
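The rule above for Configuration and CUSTOM can be checked on the client side before the request is sent. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, and the local ValueError only mirrors the ValidationException that the service would otherwise return.

```python
def build_create_data_source_kwargs(name, index_id, source_type, configuration=None):
    """Build kwargs for create_data_source, enforcing the Configuration rule locally."""
    kwargs = {"Name": name, "IndexId": index_id, "Type": source_type}
    if source_type == "CUSTOM":
        # Configuration is not allowed for custom data sources.
        if configuration is not None:
            raise ValueError("Configuration must be omitted when Type is CUSTOM")
    else:
        # Every other data source type requires Configuration.
        if configuration is None:
            raise ValueError("Configuration is required when Type is not CUSTOM")
        kwargs["Configuration"] = configuration
    return kwargs
```

The resulting dictionary can be passed as `client.create_data_source(**kwargs)`, with any other parameters your data source needs added to it first.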
Provides the configuration information to connect to an Amazon S3 bucket as your data source.
The name of the bucket that contains the documents.
A list of S3 prefixes for the documents that should be included in the index.
A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed.
Some examples are:
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed.
Some examples are:
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
A prefix used to filter metadata configuration files in the Amazon Web Services S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix
to include only the desired metadata files.
Provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources.
Path to the Amazon S3 bucket that contains the ACL files.
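The S3 settings described above might be combined as in the following sketch. The helper name is hypothetical and every bucket name, prefix, pattern, and path you pass in is your own placeholder; only the dictionary keys come from the request syntax.

```python
def build_s3_configuration(bucket, prefixes=(), include=(), exclude=(),
                           metadata_prefix=None, acl_key_path=None):
    """Build the Configuration value for an S3 data source."""
    s3 = {"BucketName": bucket}
    if prefixes:
        s3["InclusionPrefixes"] = list(prefixes)
    if include:
        s3["InclusionPatterns"] = list(include)
    if exclude:
        # Exclusion patterns win over inclusion prefixes and patterns.
        s3["ExclusionPatterns"] = list(exclude)
    if metadata_prefix is not None:
        s3["DocumentsMetadataConfiguration"] = {"S3Prefix": metadata_prefix}
    if acl_key_path is not None:
        s3["AccessControlListConfiguration"] = {"KeyPath": acl_key_path}
    return {"S3Configuration": s3}
```

Optional keys are left out entirely when they are not supplied, so the request carries only the settings you chose.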
Provides the configuration information to connect to Microsoft SharePoint as your data source.
The version of Microsoft SharePoint that you use.
The Microsoft SharePoint site URLs for the documents you want to index.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password required to connect to the SharePoint instance. If you use SharePoint Server, you also need to provide the server domain name as part of the credentials. For more information, see Using a Microsoft SharePoint Data Source.
You can also provide OAuth authentication credentials of user name, password, client ID, and client secret. For more information, see Using a SharePoint data source.
TRUE
to index document attachments.
TRUE
to use the SharePoint change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in SharePoint.
A list of regular expression patterns to include certain documents in your SharePoint. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The regex applies to the display URL of the SharePoint document.
A list of regular expression patterns to exclude certain documents in your SharePoint. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The regex applies to the display URL of the SharePoint document.
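To reason about how the inclusion and exclusion patterns interact, the precedence rule above can be approximated locally. This is only a sketch: it assumes Python `re` syntax and unanchored matching, which may differ from how Amazon Kendra evaluates patterns, and the function name and URLs are hypothetical.

```python
import re


def is_indexed(display_url, inclusion_patterns=(), exclusion_patterns=()):
    """Approximate the documented include/exclude rule for one display URL."""
    # An exclusion match always wins, even if an inclusion pattern also matches.
    if any(re.search(p, display_url) for p in exclusion_patterns):
        return False
    # With inclusion patterns present, only matching documents are indexed.
    if inclusion_patterns:
        return any(re.search(p, display_url) for p in inclusion_patterns)
    return True
```

Running candidate patterns against a few known display URLs in this way can catch an over-broad exclusion pattern before a sync job runs.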
Configuration information for an Amazon Virtual Private Cloud to connect to your Microsoft SharePoint. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
A list of DataSourceToIndexFieldMapping
objects that map SharePoint data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to SharePoint fields. For more information, see Mapping data source fields. The SharePoint data source field names must exist in your SharePoint custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
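The two-step order described above (create the index field with UpdateIndex, then map a data source field to it) might look like the following sketch. The helper names and field names are hypothetical, `client` is assumed to be a Kendra client created with `boto3.client('kendra')`, and the STRING_VALUE default is an assumption about the kind of field you want.

```python
def build_index_field_update(index_id, field_name, field_type="STRING_VALUE"):
    """Build kwargs for update_index that add one custom index field."""
    return {
        "Id": index_id,
        "DocumentMetadataConfigurationUpdates": [
            {"Name": field_name, "Type": field_type},
        ],
    }


def build_field_mapping(source_field, index_field):
    """Build one DataSourceToIndexFieldMapping entry."""
    return {"DataSourceFieldName": source_field, "IndexFieldName": index_field}


def add_custom_field_and_mapping(client, index_id, source_field, index_field):
    """Create the index field first, then return the mapping that targets it."""
    client.update_index(**build_index_field_update(index_id, index_field))
    return build_field_mapping(source_field, index_field)
```

The returned mapping can then be placed in the FieldMappings list of the data source configuration.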
The Microsoft SharePoint attribute field that contains the title of the document.
TRUE
to disable local groups information.
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to SharePoint Server if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
The name of the S3 bucket that contains the file.
The name of the file.
Whether you want to connect to SharePoint using basic authentication of user name and password, or OAuth authentication of user name, password, client ID, and client secret. You can use OAuth authentication for SharePoint Online.
Configuration information to connect to your Microsoft SharePoint site URLs via a web proxy. You can use this option for SharePoint Server.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication of user name and password. To store web proxy credentials, you use a secret in Secrets Manager.
It is recommended that you follow security best practices when configuring your web proxy. This includes setting up throttling, setting up logging and monitoring, and applying security patches on a regular basis. If you use your web proxy with multiple data sources, sync jobs that run at the same time could put a heavy load on your proxy. It is recommended that you prepare your proxy beforehand for any security and load requirements.
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
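The host and port examples above can be derived from a URL with the standard library. The following is a minimal sketch, assuming a hypothetical helper name and a caller-supplied secret ARN; it falls back to the standard port for the scheme when the URL does not state one.

```python
from urllib.parse import urlsplit

_DEFAULT_PORTS = {"http": 80, "https": 443}


def build_proxy_configuration(url, secret_arn=None):
    """Build a ProxyConfiguration dict from a website URL."""
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError(f"no host name in URL: {url!r}")
    # Use the explicit port if present, else the scheme's standard port.
    port = parts.port or _DEFAULT_PORTS.get(parts.scheme)
    if port is None:
        raise ValueError(f"cannot infer a port for scheme {parts.scheme!r}")
    config = {"Host": parts.hostname, "Port": port}
    if secret_arn is not None:
        # Credentials are optional; the value is a Secrets Manager secret ARN.
        config["Credentials"] = secret_arn
    return config
```

For https://a.example.com/page1.html this yields the host "a.example.com" and port 443, matching the example in the text.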
Provides the configuration information to connect to a database as your data source.
The type of database engine that runs the database.
Configuration information that's required to connect to a database.
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
The port that the database uses for connections.
The name of the database containing the document data.
The name of the table that contains the document data.
The Amazon Resource Name (ARN) of credentials stored in Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source. For more information about Secrets Manager, see What Is Secrets Manager in the Secrets Manager user guide.
Provides the configuration information to connect to an Amazon VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Information about where the index should get the document information from the database.
The column that provides the document's identifier.
The column that contains the contents of the document.
The column that contains the title of the document.
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex
API.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
One to five columns that indicate when a document in the database has changed.
Information about the database column that provides information for user context filtering.
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext
field of the Query
API.
Provides information about how Amazon Kendra uses quote marks around SQL identifiers when querying a database data source.
Determines whether Amazon Kendra encloses SQL identifiers for tables and column names in double quotes (") when making a database query.
By default, Amazon Kendra passes SQL identifiers the way that they are entered into the data source configuration. It does not change the case of identifiers or enclose them in quotes.
PostgreSQL internally converts uppercase characters to lowercase characters in identifiers unless they are quoted. Choosing this option encloses identifiers in quotes so that PostgreSQL does not convert the characters' case.
For MySQL databases, you must enable the ansi_quotes
option when you set this field to DOUBLE_QUOTES
.
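The database settings above might be assembled as in the following sketch, which also checks the one-to-five limit on change-detecting columns locally. The helper name, column names, and connection values are hypothetical; only the dictionary keys and enum values come from the request syntax.

```python
def build_database_configuration(engine, connection, id_column, data_column,
                                 change_columns, quote_identifiers=False):
    """Build the Configuration value for a database data source."""
    if not 1 <= len(change_columns) <= 5:
        raise ValueError("ChangeDetectingColumns takes one to five columns")
    return {
        "DatabaseConfiguration": {
            "DatabaseEngineType": engine,
            "ConnectionConfiguration": connection,
            "ColumnConfiguration": {
                "DocumentIdColumnName": id_column,
                "DocumentDataColumnName": data_column,
                "ChangeDetectingColumns": list(change_columns),
            },
            "SqlConfiguration": {
                # DOUBLE_QUOTES keeps PostgreSQL from lowercasing identifiers.
                "QueryIdentifiersEnclosingOption": (
                    "DOUBLE_QUOTES" if quote_identifiers else "NONE"
                ),
            },
        }
    }
```

Quoting is typically worth enabling when table or column names contain uppercase characters; for MySQL, remember that the ansi_quotes option must be enabled as well.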
Provides the configuration information to connect to Salesforce as your data source.
The instance URL for the Salesforce site that you want to index.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key/value pairs required to connect to your Salesforce instance. The secret must contain a JSON structure with the following keys:
Configuration of the Salesforce standard objects that Amazon Kendra indexes.
Provides the configuration information for indexing a single standard object.
The name of the standard object.
The name of the field in the standard object table that contains the document contents.
The name of the field in the standard object table that contains the document title.
Maps attributes or field names of the standard object to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Configuration information for the knowledge article types that Amazon Kendra indexes. Amazon Kendra indexes standard knowledge articles and the standard fields of knowledge articles, or the custom fields of custom knowledge articles, but not both.
Specifies the document states that should be included when Amazon Kendra indexes knowledge articles. You must specify at least one state.
Configuration information for standard Salesforce knowledge articles.
The name of the field that contains the document data to index.
The name of the field that contains the document title.
Maps attributes or field names of the knowledge article to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Configuration information for custom Salesforce knowledge articles.
Provides the configuration information for indexing Salesforce custom articles.
The name of the configuration.
The name of the field in the custom knowledge article that contains the document data to index.
The name of the field in the custom knowledge article that contains the document title.
Maps attributes or field names of the custom knowledge article to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Configuration information for Salesforce chatter feeds.
The name of the column in the Salesforce FeedItem table that contains the content to index. Typically this is the Body
column.
The name of the column in the Salesforce FeedItem table that contains the title of the document. This is typically the Title
column.
Maps fields from a Salesforce chatter feed into Amazon Kendra index fields.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Filters the documents in the feed based on status of the user. When you specify ACTIVE_USER
, only documents from users who have an active account are indexed. When you specify STANDARD_USER
, only documents for Salesforce standard users are indexed. You can specify both.
Indicates whether Amazon Kendra should index attachments to Salesforce objects.
Configuration information for processing attachments to Salesforce standard objects.
The name of the field used for the document title.
One or more objects that map fields in attachments to Amazon Kendra index fields.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain documents in your Salesforce. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the name of the attached file.
A list of regular expression patterns to exclude certain documents in your Salesforce. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the name of the attached file.
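The precedence rule for inclusion and exclusion patterns can be illustrated with a small local check. This is a sketch under assumptions, not Amazon Kendra's implementation: the function name is hypothetical, Python's `re.search` stands in for whatever regular expression matching the service applies, and an empty inclusion list is assumed to place no restriction.

```python
import re


def is_indexed(name, inclusion_patterns=(), exclusion_patterns=()):
    """Return True if `name` would be indexed under the stated precedence rule."""
    # Exclusion takes precedence: any exclusion match rejects the file.
    if any(re.search(p, name) for p in exclusion_patterns):
        return False
    # Assumption: an empty inclusion list places no restriction.
    if not inclusion_patterns:
        return True
    return any(re.search(p, name) for p in inclusion_patterns)


print(is_indexed("report.pdf", [r"\.pdf$"], [r"^draft"]))        # True
print(is_indexed("draft-report.pdf", [r"\.pdf$"], [r"^draft"]))  # False
print(is_indexed("notes.txt", [r"\.pdf$"], [r"^draft"]))         # False
```

The second call shows the case the text singles out: the file name matches both lists, and the exclusion pattern wins.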
Provides the configuration information to connect to Microsoft OneDrive as your data source.
The Azure Active Directory domain of the organization.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password to connect to OneDrive. The user name should be the application ID for the OneDrive application, and the password is the application key for the OneDrive application.
A list of user accounts whose documents should be indexed.
A list of users whose documents should be indexed. Specify the user names in email format, for example, username@tenantdomain
. If you need to index the documents of more than 100 users, use the OneDriveUserS3Path
field to specify the location of a file containing a list of users.
The S3 bucket location of a file containing a list of users whose documents should be indexed.
The name of the S3 bucket that contains the file.
The name of the file.
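The choice between an inline user list and an S3 file can be sketched as follows. The 100-user threshold and the OneDriveUserS3Path name come from the text above; the key names OneDriveUserList, Bucket, and Key are assumed from the Kendra request syntax, and the function name and example values are hypothetical.

```python
MAX_INLINE_USERS = 100  # limit stated above for the inline user list


def one_drive_users(users, bucket=None, key=None):
    """Choose between an inline user list and an S3 file reference."""
    users = list(users)
    if len(users) <= MAX_INLINE_USERS:
        return {"OneDriveUserList": users}
    if not (bucket and key):
        raise ValueError("more than 100 users: supply the S3 bucket and file name")
    # The file in the bucket is expected to hold the list of users.
    return {"OneDriveUserS3Path": {"Bucket": bucket, "Key": key}}


# Hypothetical user in the email format described above.
print(one_drive_users(["username@tenantdomain"]))
```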
A list of regular expression patterns to include certain documents in your OneDrive. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the file name.
A list of regular expression patterns to exclude certain documents in your OneDrive. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the file name.
A list of DataSourceToIndexFieldMapping
objects that map OneDrive data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to OneDrive fields. For more information, see Mapping data source fields. The OneDrive data source field names must exist in your OneDrive custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
TRUE
to disable local groups information.
Provides the configuration information to connect to ServiceNow as your data source.
The ServiceNow instance that the data source connects to. The host endpoint should look like the following: {instance}.service-now.com.
The Amazon Resource Name (ARN) of the Secrets Manager secret that contains the user name and password required to connect to the ServiceNow instance. You can also provide OAuth authentication credentials of user name, password, client ID, and client secret. For more information, see Using a ServiceNow data source.
The identifier of the release that the ServiceNow host is running. If the host is not running the LONDON
release, use OTHERS
.
Configuration information for crawling knowledge articles in the ServiceNow site.
TRUE
to index attachments to knowledge articles.
A list of regular expression patterns to include certain attachments of knowledge articles in your ServiceNow. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the field specified in the PatternTargetField
.
A list of regular expression patterns to exclude certain attachments of knowledge articles in your ServiceNow. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the field specified in the PatternTargetField
.
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
The name of the ServiceNow field that is mapped to the index document title field.
Maps attributes or field names of knowledge articles to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to ServiceNow fields. For more information, see Mapping data source fields. The ServiceNow data source field names must exist in your ServiceNow custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A query that selects the knowledge articles to index. The query can return articles from multiple knowledge bases, and the knowledge bases can be public or private.
The query string must be one generated by the ServiceNow console. For more information, see Specifying documents to index with a query.
Configuration information for crawling service catalogs in the ServiceNow site.
TRUE
to index attachments to service catalog items.
A list of regular expression patterns to include certain attachments of catalogs in your ServiceNow. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the file name of the attachment.
A list of regular expression patterns to exclude certain attachments of catalogs in your ServiceNow. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the file name of the attachment.
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
The name of the ServiceNow field that is mapped to the index document title field.
Maps attributes or field names of catalogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to ServiceNow fields. For more information, see Mapping data source fields. The ServiceNow data source field names must exist in your ServiceNow custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
The type of authentication used to connect to the ServiceNow instance. If you choose HTTP_BASIC
, Amazon Kendra is authenticated using the user name and password provided in the Secrets Manager secret in the SecretArn
field. If you choose OAUTH2
, Amazon Kendra is authenticated using the credentials of client ID, client secret, user name and password.
When you use OAUTH2
authentication, you must generate a token and a client secret using the ServiceNow console. For more information, see Using a ServiceNow data source.
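The top-level ServiceNow settings described above might be assembled and checked locally along these lines. This is a sketch under assumptions: SecretArn, LONDON, OTHERS, HTTP_BASIC, and OAUTH2 appear in the text, while the key names HostUrl, ServiceNowBuildVersion, and AuthenticationType are assumed from the Kendra request syntax. The function name, the character set allowed in the instance name, and the ARN in the usage line are hypothetical.

```python
import re

# Host format described above: {instance}.service-now.com
# Assumption: instance names use letters, digits, and hyphens.
_HOST_RE = re.compile(r"^[A-Za-z0-9-]+\.service-now\.com$")


def servicenow_configuration(host_url, secret_arn,
                             release="OTHERS", auth_type="HTTP_BASIC"):
    """Build the top-level ServiceNow settings as a plain dict."""
    if not _HOST_RE.match(host_url):
        raise ValueError("host should look like {instance}.service-now.com")
    if release not in ("LONDON", "OTHERS"):
        raise ValueError("release must be LONDON or OTHERS")
    if auth_type not in ("HTTP_BASIC", "OAUTH2"):
        raise ValueError("auth_type must be HTTP_BASIC or OAUTH2")
    return {
        "HostUrl": host_url,
        "SecretArn": secret_arn,
        "ServiceNowBuildVersion": release,
        "AuthenticationType": auth_type,
    }


# Placeholder ARN, for illustration only.
config = servicenow_configuration(
    "example.service-now.com",
    "arn:aws:secretsmanager:us-east-1:123456789012:secret:example",
    auth_type="OAUTH2",
)
```

With OAUTH2, the secret itself would also need the client ID and client secret; the sketch only records which authentication type is requested.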
Provides the configuration information to connect to Confluence as your data source.
The URL of your Confluence instance. Use the full URL of the server. For example, https://server.example.com:port/ . You can also use an IP address, for example, https://192.168.1.113/ .
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password required to connect to the Confluence instance. If you use Confluence Cloud, you use a generated API token as the password.
You can also provide authentication credentials in the form of a personal access token. For more information, see Using a Confluence data source.
The version or the type of Confluence installation to connect to.
Configuration information for indexing Confluence spaces.
TRUE
to index personal spaces. You can add restrictions to items in personal spaces. If personal spaces are indexed, queries without user context information may return restricted items from a personal space in their results. For more information, see Filtering on user context.
TRUE
to index archived spaces.
A list of space keys for Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are indexed. Spaces that aren't in the list aren't indexed. A space in the list must exist. Otherwise, Amazon Kendra logs an error when the data source is synchronized. If a space is in both the IncludeSpaces
and the ExcludeSpaces
list, the space is excluded.
A list of space keys of Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are not indexed. If a space is in both the ExcludeSpaces
and the IncludeSpaces
list, the space is excluded.
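The interaction between IncludeSpaces and ExcludeSpaces can be sketched as a simple predicate. The function name and space keys are hypothetical, and treating an empty IncludeSpaces list as "no restriction" is an assumption of this sketch rather than something the text states.

```python
def space_is_indexed(space_key, include_spaces=(), exclude_spaces=()):
    """Return True if a Confluence space would be indexed."""
    # A space that appears in both lists is excluded.
    if space_key in exclude_spaces:
        return False
    # Assumption: with no IncludeSpaces list, every remaining space is eligible.
    if not include_spaces:
        return True
    # Spaces that are not in the include list are not indexed.
    return space_key in include_spaces


print(space_is_indexed("DOCS", ["DOCS", "ENG"], ["ENG"]))  # True
print(space_is_indexed("ENG", ["DOCS", "ENG"], ["ENG"]))   # False
print(space_is_indexed("HR", ["DOCS", "ENG"], ["ENG"]))    # False
```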
Maps attributes or field names of Confluence spaces to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the SpaceFieldMappings
parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence spaces to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
The format for date fields in the data source. If the field specified in DataSourceFieldName
is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for indexing Confluence pages.
Maps attributes or field names of Confluence pages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the PageFieldMappings
parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence pages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
The format for date fields in the data source. If the field specified in DataSourceFieldName
is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for indexing Confluence blogs.
Maps attributes or field names of Confluence blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the BlogFieldMappings
parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
The format for date fields in the data source. If the field specified in DataSourceFieldName
is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for indexing attachments to Confluence blogs and pages.
TRUE
to index attachments of pages and blogs in Confluence.
Maps attributes or field names of Confluence attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the AttachmentFieldMappings
parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
You must first create the index field using the UpdateIndex
API.
The format for date fields in the data source. If the field specified in DataSourceFieldName
is a date field you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for an Amazon Virtual Private Cloud to connect to your Confluence. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
A list of regular expression patterns to include certain blog posts, pages, spaces, or attachments in your Confluence. Content that matches the patterns is included in the index. Content that doesn't match the patterns is excluded from the index. If content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the content isn't included in the index.
A list of regular expression patterns to exclude certain blog posts, pages, spaces, or attachments in your Confluence. Content that matches the patterns is excluded from the index. Content that doesn't match the patterns is included in the index. If content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the content isn't included in the index.
Configuration information to connect to your Confluence URL instance via a web proxy. You can use this option for Confluence Server.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication of user name and password. To store web proxy credentials, you use a secret in Secrets Manager.
It is recommended that you follow best security practices when configuring your web proxy. This includes setting up throttling, setting up logging and monitoring, and applying security patches on a regular basis. If you use your web proxy with multiple data sources, sync jobs that occur at the same time could strain the load on your proxy. It is recommended you prepare your proxy beforehand for any security and load requirements.
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
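The host name and port examples above can be reproduced with the standard library. This is a minimal sketch: the key names Host, Port, and Credentials are assumed from the Kendra request syntax, the function names are hypothetical, and falling back to the scheme's standard port when the URL names none is a choice made here for illustration.

```python
from urllib.parse import urlsplit

# Standard ports used when the URL does not name one explicitly.
_DEFAULT_PORTS = {"https": 443, "http": 80}


def host_and_port(url):
    """Split a URL into the host name and port."""
    parts = urlsplit(url)
    port = parts.port or _DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


def proxy_configuration(url, credentials_arn=None):
    """Build a proxy settings dict; credentials are optional."""
    host, port = host_and_port(url)
    config = {"Host": host, "Port": port}
    if credentials_arn:
        config["Credentials"] = credentials_arn
    return config


print(host_and_port("https://a.example.com/page1.html"))  # ('a.example.com', 443)
```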
Whether you want to connect to Confluence using basic authentication of user name and password, or a personal access token. You can use a personal access token for Confluence Server.
Provides the configuration information to connect to Google Drive as your data source.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the credentials required to connect to Google Drive. For more information, see Using a Google Workspace Drive data source.
A list of regular expression patterns to include certain items in your Google Drive, including shared drives and users' My Drives. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
A list of regular expression patterns to exclude certain items in your Google Drive, including shared drives and users' My Drives. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
Maps Google Drive data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Google Drive fields. For more information, see Mapping data source fields. The Google Drive data source field names must exist in your Google Drive custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of MIME types to exclude from the index. All documents matching the specified MIME type are excluded.
For a list of MIME types, see Using a Google Workspace Drive data source.
A list of email addresses of the users. Documents owned by these users are excluded from the index. Documents shared with excluded users are indexed unless they are excluded in another way.
A list of identifiers of shared drives to exclude from the index. All files and folders stored on the shared drive are excluded.
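The three Google Drive exclusion lists described above can be read as one predicate over a document. The sketch below is illustrative only: the function and parameter names are hypothetical, the example values are made up, and the regular expression patterns described earlier are deliberately left out.

```python
def drive_item_is_excluded(owner, mime_type, shared_drive_id=None, *,
                           exclude_user_accounts=(),
                           exclude_mime_types=(),
                           exclude_shared_drives=()):
    """Return True if any of the three exclusion lists rules the item out."""
    # Documents owned by an excluded user are excluded.
    if owner in exclude_user_accounts:
        return True
    # Documents of an excluded MIME type are excluded.
    if mime_type in exclude_mime_types:
        return True
    # Everything stored on an excluded shared drive is excluded.
    return shared_drive_id is not None and shared_drive_id in exclude_shared_drives


# Hypothetical values, for illustration only.
print(drive_item_is_excluded(
    "alice@example.com", "application/pdf",
    exclude_user_accounts=["bob@example.com"],
))  # False
```

Ownership is the only user-related input: a document that is merely shared with an excluded user stays eligible, as the text notes.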
Provides the configuration information required for Amazon Kendra Web Crawler.
Specifies the seed or starting point URLs of the websites or the sitemap URLs of the websites you want to crawl.
You can include website subdomains. You can list up to 100 seed URLs and up to three sitemap URLs.
You can only crawl websites that use the secure communication protocol, Hypertext Transfer Protocol Secure (HTTPS). If you receive an error when crawling a website, it could be that the website is blocked from crawling.
When selecting websites to index, you must adhere to the Amazon Acceptable Use Policy and all other Amazon terms. Remember that you must only use Amazon Kendra Web Crawler to index your own webpages, or webpages that you have authorization to index.
Configuration of the seed or starting point URLs of the websites you want to crawl.
You can choose to crawl only the website host names, or the website host names with subdomains, or the website host names with subdomains and other domains that the webpages link to.
You can list up to 100 seed URLs.
The list of seed or starting point URLs of the websites you want to crawl.
The list can include a maximum of 100 seed URLs.
You can choose one of the following modes:
HOST_ONLY
– crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.
SUBDOMAINS
– crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.
EVERYTHING
– crawl the website host names with subdomains and other domains that the webpages link to.
The default mode is set to HOST_ONLY
.
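The three crawl modes can be illustrated with a host comparison. This is a simplified sketch, not the crawler's actual logic: the function name is hypothetical, and EVERYTHING is modeled as accepting any host, on the assumption that the host was reached through a link from a crawled page.

```python
def host_in_scope(seed_host, candidate_host, mode="HOST_ONLY"):
    """Return True if `candidate_host` falls within the crawl mode's scope."""
    if mode not in ("HOST_ONLY", "SUBDOMAINS", "EVERYTHING"):
        raise ValueError("unknown crawl mode: %r" % mode)
    # The seed host itself is in scope in every mode.
    if candidate_host == seed_host:
        return True
    if mode == "SUBDOMAINS":
        # A subdomain ends with ".<seed host>".
        return candidate_host.endswith("." + seed_host)
    # HOST_ONLY rejects every other host; EVERYTHING accepts linked hosts.
    return mode == "EVERYTHING"


print(host_in_scope("abc.example.com", "a.abc.example.com", "HOST_ONLY"))   # False
print(host_in_scope("abc.example.com", "a.abc.example.com", "SUBDOMAINS"))  # True
print(host_in_scope("abc.example.com", "other.example.org", "EVERYTHING"))  # True
```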
Configuration of the sitemap URLs of the websites you want to crawl.
Only URLs belonging to the same website host names are crawled. You can list up to three sitemap URLs.
The list of sitemap URLs of the websites you want to crawl.
The list can include a maximum of three sitemap URLs.
Specifies the number of levels in a website that you want to crawl.
The first level begins from the website seed or starting point URL. For example, if a website has 3 levels – index level (i.e. seed in this example), sections level, and subsections level – and you are only interested in crawling information up to the sections level (i.e. levels 0-1), you can set your depth to 1.
The default crawl depth is set to 2.
The maximum number of URLs on a webpage to include when crawling a website. This number is per webpage.
As a website’s webpages are crawled, any URLs the webpages link to are also crawled. URLs on a webpage are crawled in order of appearance.
The default maximum links per page is 100.
The maximum size (in MB) of a webpage or attachment to crawl.
Files larger than this size (in MB) are skipped and not crawled.
The default maximum size of a webpage or attachment is set to 50 MB.
The maximum number of URLs crawled per website host per minute.
A minimum of one URL is required.
The default maximum number of URLs crawled per website host per minute is 300.
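The URL limits and default crawl settings described above might be gathered into one configuration as follows. This is a sketch under assumptions: the limits and default values come from the text, while the key names (Urls, SeedUrlConfiguration, SeedUrls, WebCrawlerMode, SiteMapsConfiguration, SiteMaps, CrawlDepth, MaxLinksPerPage, MaxContentSizePerPageInMegaBytes, MaxUrlsPerMinuteCrawlRate) are assumed from the Kendra request syntax. The function name and example URLs are hypothetical, and the defaults are written out explicitly only to make them visible.

```python
def web_crawler_configuration(seed_urls, sitemap_urls=(), mode="HOST_ONLY"):
    """Build a web crawler settings dict, enforcing the limits stated above."""
    seed_urls, sitemap_urls = list(seed_urls), list(sitemap_urls)
    if len(seed_urls) > 100:
        raise ValueError("at most 100 seed URLs are allowed")
    if len(sitemap_urls) > 3:
        raise ValueError("at most three sitemap URLs are allowed")
    for url in seed_urls + sitemap_urls:
        # Only HTTPS websites can be crawled.
        if not url.startswith("https://"):
            raise ValueError("not an HTTPS URL: %s" % url)
    urls = {}
    if seed_urls:
        urls["SeedUrlConfiguration"] = {"SeedUrls": seed_urls, "WebCrawlerMode": mode}
    if sitemap_urls:
        urls["SiteMapsConfiguration"] = {"SiteMaps": sitemap_urls}
    return {
        "Urls": urls,
        "CrawlDepth": 2,                           # default crawl depth
        "MaxLinksPerPage": 100,                    # default links per page
        "MaxContentSizePerPageInMegaBytes": 50.0,  # default size limit in MB
        "MaxUrlsPerMinuteCrawlRate": 300,          # default per-host rate
    }


# Hypothetical seed URL, for illustration only.
config = web_crawler_configuration(["https://abc.example.com/"])
```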
A list of regular expression patterns to include certain URLs to crawl. URLs that match the patterns are included in the index. URLs that don't match the patterns are excluded from the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL isn't included in the index.
A list of regular expression patterns to exclude certain URLs to crawl. URLs that match the patterns are excluded from the index. URLs that don't match the patterns are included in the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL isn't included in the index.
Configuration information required to connect to your internal websites via a web proxy.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication. To store web proxy credentials, you use a secret in Secrets Manager.
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
Configuration information required to connect to websites using authentication.
You can connect to websites using basic authentication of user name and password. You use a secret in Secrets Manager to store your authentication credentials.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
The list of configuration information that's required to connect to and crawl a website host using basic authentication credentials.
The list includes the name and port number of the website host.
Provides the configuration information to connect to websites that require basic user authentication.
The name of the website host you want to connect to using authentication credentials.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to using authentication credentials.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
You use a secret if basic authentication credentials are required to connect to a website. The secret stores your credentials of user name and password.
Provides the configuration information to connect to Amazon WorkDocs as your data source.
The identifier of the directory corresponding to your Amazon WorkDocs site repository.
You can find the organization ID in the Directory Service by going to Active Directory, then Directories. Your Amazon WorkDocs site directory has an ID, which is the organization ID. You can also set up a new Amazon WorkDocs directory in the Directory Service console and enable an Amazon WorkDocs site for the directory in the Amazon WorkDocs console.
TRUE
to include comments on documents in your index. Including comments in your index means each comment is a document that can be searched on.
The default is set to FALSE
.
TRUE
to use the Amazon WorkDocs change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Amazon WorkDocs.
A list of regular expression patterns to include certain files in your Amazon WorkDocs site repository. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Amazon WorkDocs site repository. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of DataSourceToIndexFieldMapping
objects that map Amazon WorkDocs data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Amazon WorkDocs fields. For more information, see Mapping data source fields. The Amazon WorkDocs data source field names must exist in your Amazon WorkDocs custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
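The Amazon WorkDocs options described above could be collected as in the sketch below. The key names OrganizationId, CrawlComments, and UseChangeLog are assumed from the Kendra request syntax, the function name is hypothetical, and the directory ID is a placeholder. Only the FALSE default for comments is taken from the text; the change log key is left out unless the caller sets it, because no default is stated.

```python
def workdocs_configuration(organization_id, crawl_comments=False, use_change_log=None):
    """Build a WorkDocs settings dict from the options described above."""
    config = {
        "OrganizationId": organization_id,
        # Comments are not indexed unless requested (stated default is FALSE).
        "CrawlComments": crawl_comments,
    }
    # No default is stated for the change log, so the key is optional here.
    if use_change_log is not None:
        config["UseChangeLog"] = use_change_log
    return config


# "d-0000000000" is a placeholder directory ID, not a real one.
config = workdocs_configuration("d-0000000000", use_change_log=True)
```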
Provides the configuration information to connect to Amazon FSx as your data source.
The identifier of the Amazon FSx file system.
You can find your file system ID on the file system dashboard in the Amazon FSx console. For information on how to create a file system in Amazon FSx console, using Windows File Server as an example, see Amazon FSx Getting started guide.
The Amazon FSx file system type. Windows is currently the only supported type.
Configuration information for an Amazon Virtual Private Cloud to connect to your Amazon FSx. Your Amazon FSx instance must reside inside your VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Amazon FSx file system. Windows is currently the only supported type. The secret must contain a JSON structure with the following keys:
A list of regular expression patterns to include certain files in your Amazon FSx file system. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Amazon FSx file system. Files that match the patterns are excluded from the index. Files that don't match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of DataSourceToIndexFieldMapping
objects that map Amazon FSx data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Amazon FSx fields. For more information, see Mapping data source fields. The Amazon FSx data source field names must exist in your Amazon FSx custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
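As a rough sketch, the Amazon FSx settings described above might be assembled into a request dictionary like the one below. The key names follow the boto3 request shape as best understood here and should be checked against the current reference; every identifier, ARN, and pattern is a placeholder, and `field_mapping` is a hypothetical helper.

```python
def field_mapping(source_field, index_field, date_format=None):
    """Build one DataSourceToIndexFieldMapping-style dictionary."""
    mapping = {"DataSourceFieldName": source_field, "IndexFieldName": index_field}
    if date_format is not None:
        # Only relevant when the source field holds a date (assumed usage).
        mapping["DateFieldFormat"] = date_format
    return mapping


fsx_configuration = {
    "FileSystemId": "fs-EXAMPLE",  # placeholder file system ID
    "FileSystemType": "WINDOWS",   # the only type the text lists as supported
    "VpcConfiguration": {
        "SubnetIds": ["subnet-EXAMPLE"],
        "SecurityGroupIds": ["sg-EXAMPLE"],
    },
    "SecretArn": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:EXAMPLE",
    "InclusionPatterns": [r".*\.docx$"],
    "ExclusionPatterns": [r".*/archive/.*"],
    # The index field is assumed to have been created already via UpdateIndex.
    "FieldMappings": [field_mapping("department", "department")],
}
```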
Provides the configuration information to connect to Slack as your data source.
The identifier of the team in the Slack workspace. For example, T0123456789 .
You can find your team ID in the URL of the main page of your Slack workspace. When you log in to Slack via a browser, you are directed to the URL of the main page. For example, https://app.slack.com/client/T0123456789 /....
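If the workspace URL is already at hand, the team ID can be pulled out programmatically. The sketch below assumes the path always has the form `/client/<team-id>/...` shown in the example above; `slack_team_id` is a hypothetical helper, and the trailing path segment in the usage line is made up.

```python
from urllib.parse import urlparse


def slack_team_id(url):
    """Extract the team ID from a Slack workspace URL of the form /client/<team-id>/..."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "client":
        return parts[1]
    raise ValueError(f"no team ID found in {url!r}")


print(slack_team_id("https://app.slack.com/client/T0123456789/some-channel"))
```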
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Slack workspace team. The secret must contain a JSON structure with the following keys:
Configuration information for an Amazon Virtual Private Cloud to connect to your Slack. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Specify whether to index public channels, private channels, group messages, and direct messages. You can specify one or more of these options.
TRUE
to use the Slack change log to determine which documents require updating in the index. Depending on the Slack change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Slack.
TRUE
to index bot messages from your Slack workspace team.
TRUE
to exclude archived messages in your Slack workspace team from the index.
The date to start crawling your data from your Slack workspace team. The date must follow this format: yyyy-mm-dd
.
The number of hours for change log to look back from when you last synchronized your data. You can look back up to 7 days or 168 hours.
Change log updates your index only if new content was added since you last synced your data. Updated or deleted content from before you last synced does not get updated in your index. To capture updated or deleted content before you last synced, set the LookBackPeriod
to the number of hours you want change log to look back.
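Because the look-back period is expressed in hours and capped at 7 days (168 hours), a small conversion helper can guard against out-of-range values before a request is sent. This is a sketch only; `look_back_period` is a hypothetical helper, not part of boto3.

```python
MAX_LOOK_BACK_HOURS = 168  # 7 days, the limit stated above


def look_back_period(days=0, hours=0):
    """Convert days and hours to a LookBackPeriod value in hours."""
    total = days * 24 + hours
    if not 0 <= total <= MAX_LOOK_BACK_HOURS:
        raise ValueError(
            f"look-back period must be between 0 and {MAX_LOOK_BACK_HOURS} hours, got {total}"
        )
    return total


print(look_back_period(days=1, hours=12))  # 36
```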
The list of private channel names from your Slack workspace team. You use this if you want to index specific private channels, not all private channels. You can also use regular expression patterns to filter private channels.
The list of public channel names to index from your Slack workspace team. You use this if you want to index specific public channels, not all public channels. You can also use regular expression patterns to filter public channels.
A list of regular expression patterns to include certain attached files in your Slack workspace team. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain attached files in your Slack workspace team. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of DataSourceToIndexFieldMapping
objects that map Slack data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Slack fields. For more information, see Mapping data source fields. The Slack data source field names must exist in your Slack custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
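Put together, a Slack configuration might look roughly like the following. Key names and enumeration values follow the boto3 request shape as best understood here and should be verified against the current reference; the team ID, ARN, and channel names are placeholders.

```python
from datetime import date

slack_configuration = {
    "TeamId": "T0123456789",
    "SecretArn": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:EXAMPLE",
    # One or more of the channel/message kinds described above.
    "SlackEntityList": ["PUBLIC_CHANNEL", "PRIVATE_CHANNEL"],
    "UseChangeLog": True,
    "CrawlBotMessage": False,
    "ExcludeArchived": True,
    # date.isoformat() yields the required yyyy-mm-dd form.
    "SinceCrawlDate": date(2023, 1, 15).isoformat(),
    "LookBackPeriod": 48,  # hours, at most 168
    # Channel names or regular expression patterns (placeholders).
    "PublicChannelFilter": ["general", "eng-.*"],
}

print(slack_configuration["SinceCrawlDate"])  # 2023-01-15
```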
Provides the configuration information to connect to Box as your data source.
The identifier of the Box Enterprise platform. You can find the enterprise ID in the Box Developer Console settings or when you create an app in Box and download your authentication credentials. For example, 801234567 .
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Box platform. The secret must contain a JSON structure with the following keys:
You create an application in Box to generate the keys or credentials required for the secret. For more information, see Using a Box data source.
TRUE
to use the Box change log to determine which documents require updating in the index. Depending on the data source change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents.
TRUE
to index comments.
TRUE
to index the contents of tasks.
TRUE
to index web links.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Box files to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Box tasks to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Box comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Box web links to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain files and folders in your Box platform. Files and folders that match the patterns are included in the index. Files and folders that don't match the patterns are excluded from the index. If a file or folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file or folder isn't included in the index.
A list of regular expression patterns to exclude certain files and folders from your Box platform. Files and folders that match the patterns are excluded from the index. Files and folders that don't match the patterns are included in the index. If a file or folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file or folder isn't included in the index.
Configuration information for an Amazon VPC to connect to your Box. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Provides the configuration information to connect to Quip as your data source.
The Quip site domain. For example, https://quip-company.quipdomain.com/browse . The domain in this example is "quipdomain".
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs that are required to connect to your Quip. The secret must contain a JSON structure with the following keys:
TRUE
to index file comments.
TRUE
to index the contents of chat rooms.
TRUE
to index attachments.
The identifiers of the Quip folders you want to index. You can find the folder ID in your browser URL when you access your folder in Quip. For example, https://quip-company.quipdomain.com/zlLuOVNSarTL/folder-name . The folder ID in this example is "zlLuOVNSarTL".
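The folder ID can also be read from the URL in code. The sketch below assumes the folder ID is always the first path segment, as in the example above; `quip_folder_id` is a hypothetical helper, and it would misread URLs with a different layout (such as a `/browse` page).

```python
from urllib.parse import urlparse


def quip_folder_id(url):
    """Return the first path segment of a Quip folder URL, assumed to be the folder ID."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if not parts:
        raise ValueError(f"no folder ID found in {url!r}")
    return parts[0]


print(quip_folder_id("https://quip-company.quipdomain.com/zlLuOVNSarTL/folder-name"))
# zlLuOVNSarTL
```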
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Quip threads to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Quip messages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Quip attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain files in your Quip file system. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence, and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Quip file system. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence, and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud (VPC) to connect to your Quip. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Provides the configuration information to connect to Jira as your data source.
The URL of the Jira account. For example, company.atlassian.net or https://jira.company.com . You can find your Jira account URL in the URL of your profile page for Jira desktop.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Jira data source. The secret must contain a JSON structure with the following keys:
TRUE
to use the Jira change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Jira.
Specify which projects to crawl in your Jira data source. You can specify one or more Jira project IDs.
Specify which issue types to crawl in your Jira data source. You can specify one or more of these options to crawl.
Specify which statuses to crawl in your Jira data source. You can specify one or more of these options to crawl.
Specify whether to crawl comments, attachments, and work logs. You can specify one or more of these options.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira issues to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira projects to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira work logs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain file paths, file names, and file types in your Jira data source. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain file paths, file names, and file types in your Jira data source. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud to connect to your Jira. Your Jira account must reside inside your VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
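A Jira configuration built from the options above might be sketched as follows. Key names and the sub-entity values (`COMMENTS`, `ATTACHMENTS`, `WORKLOGS`) follow the boto3 request shape as best understood here and should be verified; the project ID, issue types, statuses, and ARN are placeholders, and `issue_sub_entities` is a hypothetical helper.

```python
def issue_sub_entities(comments=False, attachments=False, work_logs=False):
    """Translate boolean choices into the list of sub-entities to crawl."""
    selected = []
    if comments:
        selected.append("COMMENTS")
    if attachments:
        selected.append("ATTACHMENTS")
    if work_logs:
        selected.append("WORKLOGS")
    return selected


jira_configuration = {
    "JiraAccountUrl": "https://jira.company.com",
    "SecretArn": "arn:aws:secretsmanager:REGION:ACCOUNT_ID:secret:EXAMPLE",
    "UseChangeLog": True,
    "Project": ["PROJECT_ID"],          # placeholder project ID
    "IssueType": ["Bug", "Story"],      # placeholder issue types
    "Status": ["Open", "In Progress"],  # placeholder statuses
    "IssueSubEntityFilter": issue_sub_entities(comments=True, attachments=True),
    "InclusionPatterns": [r".*\.pdf$"],
}

print(jira_configuration["IssueSubEntityFilter"])  # ['COMMENTS', 'ATTACHMENTS']
```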
Provides the configuration information to connect to GitHub as your data source.
Configuration information to connect to GitHub Enterprise Cloud (SaaS).
The name of the organization of the GitHub Enterprise Cloud (SaaS) account you want to connect to. You can find your organization name by logging in to GitHub desktop and selecting Your organizations under your profile picture dropdown.
The GitHub host URL or API endpoint URL. For example, https://api.github.com .
Configuration information to connect to GitHub Enterprise Server (on premises).
The GitHub host URL or API endpoint URL. For example, https://on-prem-host-url/api/v3/
The name of the organization of the GitHub Enterprise Server (on premises) account you want to connect to. You can find your organization name by logging in to GitHub desktop and selecting Your organizations under your profile picture dropdown.
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to GitHub if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
The name of the S3 bucket that contains the file.
The name of the file.
The type of GitHub service you want to connect to—GitHub Enterprise Cloud (SaaS) or GitHub Enterprise Server (on premises).
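The two connection modes are mutually exclusive in practice, so it can help to build them through one function. The following is a sketch under stated assumptions: the key names and the `SAAS` / `ON_PREMISE` type values follow the boto3 request shape as best understood here, `github_connection` is a hypothetical helper, and the organization, bucket, and certificate file names are placeholders.

```python
def github_connection(kind, organization, host_url, cert_bucket=None, cert_key=None):
    """Return the connection-related keys of a GitHub configuration."""
    if kind == "SAAS":
        return {
            "Type": "SAAS",
            "SaaSConfiguration": {
                "OrganizationName": organization,
                "HostUrl": host_url,
            },
        }
    if kind == "ON_PREMISE":
        on_premise = {"HostUrl": host_url, "OrganizationName": organization}
        # The SSL certificate is optional; include it only when both parts are given.
        if cert_bucket and cert_key:
            on_premise["SslCertificateS3Path"] = {"Bucket": cert_bucket, "Key": cert_key}
        return {"Type": "ON_PREMISE", "OnPremiseConfiguration": on_premise}
    raise ValueError(f"unknown GitHub service type: {kind!r}")


cloud = github_connection("SAAS", "example-org", "https://api.github.com")
server = github_connection(
    "ON_PREMISE",
    "example-org",
    "https://on-prem-host-url/api/v3/",
    cert_bucket="example-bucket",
    cert_key="certs/github.pem",
)
```

The resulting dictionary would then be merged with the remaining GitHub settings (secret ARN, crawl properties, filters) before being sent.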
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your GitHub. The secret must contain a JSON structure with the following keys:
TRUE
to use the GitHub change log to determine which documents require updating in the index. Depending on the GitHub change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in GitHub.
Configuration information to include certain types of GitHub content. You can index repository files only, or also include issues and pull requests, comments, and comment attachments.
TRUE
to index all files within a repository.
TRUE
to index all issues within a repository.
TRUE
to index all comments on issues.
TRUE
to include all comment attachments for issues.
TRUE
to index all pull requests within a repository.
TRUE
to index all comments on pull requests.
TRUE
to include all comment attachments for pull requests.
A list of names of the specific repositories you want to index.
A list of regular expression patterns to include certain folder names in your GitHub repository or repositories. Folder names that match the patterns are included in the index. Folder names that don't match the patterns are excluded from the index. If a folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the folder isn't included in the index.
A list of regular expression patterns to include certain file types in your GitHub repository or repositories. File types that match the patterns are included in the index. File types that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to include certain file names in your GitHub repository or repositories. File names that match the patterns are included in the index. File names that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain folder names in your GitHub repository or repositories. Folder names that match the patterns are excluded from the index. Folder names that don't match the patterns are included in the index. If a folder matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the folder isn't included in the index.
A list of regular expression patterns to exclude certain file types in your GitHub repository or repositories. File types that match the patterns are excluded from the index. File types that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain file names in your GitHub repository or repositories. File names that match the patterns are excluded from the index. File names that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud to connect to your GitHub. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
A list of DataSourceToIndexFieldMapping
objects that map GitHub repository attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of GitHub commits to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of GitHub issues to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of GitHub issue comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of GitHub issue attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of GitHub pull request comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of GitHub pull requests to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of GitHub pull request attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Provides the configuration information to connect to Alfresco as your data source.
The URL of the Alfresco site. For example, https://hostname:8080 .
The identifier of the Alfresco site. For example, my-site .
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Alfresco data source. The secret must contain a JSON structure with the following keys:
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to Alfresco if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
The name of the S3 bucket that contains the file.
The name of the file.
TRUE
to index shared files.
TRUE
to index comments of blogs and other content.
Specify whether to index document libraries, wikis, or blogs. You can specify one or more of these options.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Alfresco document libraries to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Alfresco blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping
objects that map attributes or field names of Alfresco wikis to Amazon Kendra index field names. To create custom fields, use the UpdateIndex
API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex
API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain files in your Alfresco data source. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Alfresco data source. Files that match the patterns are excluded from the index. Files that don't match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud to connect to your Alfresco. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Provides a template for the configuration information to connect to your data source.
The template schema used for the data source, where template schemas are supported.
Configuration information for an Amazon Virtual Private Cloud to connect to your data source. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Sets the frequency for Amazon Kendra to check the documents in your data source repository and update the index. If you don't set a schedule, Amazon Kendra will not periodically update the index. You can call the StartDataSourceSyncJob API to update the index.
You can't specify the Schedule parameter when the Type parameter is set to CUSTOM. If you do, you receive a ValidationException exception.
The Amazon Resource Name (ARN) of a role with permission to access the data source and required resources. For more information, see IAM roles for Amazon Kendra.
You can't specify the RoleArn parameter when the Type parameter is set to CUSTOM. If you do, you receive a ValidationException exception.
The RoleArn parameter is required for all other data sources.
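The Schedule and RoleArn rules above can be checked on the client side before calling create_data_source. The following is a minimal sketch under that reading of the rules; check_data_source_params is a hypothetical helper, not part of boto3, and the service remains the authority on validation.

```python
def check_data_source_params(params):
    """Return a list of problems with Schedule/RoleArn for the given Type."""
    problems = []
    if params.get("Type") == "CUSTOM":
        # Neither parameter may be sent for CUSTOM data sources.
        for name in ("Schedule", "RoleArn"):
            if name in params:
                problems.append(f"{name} must not be set when Type is CUSTOM")
    elif "RoleArn" not in params:
        # Every other data source type needs a role.
        problems.append("RoleArn is required when Type is not CUSTOM")
    return problems


# Hypothetical request: a CUSTOM data source that wrongly carries a schedule.
print(check_data_source_params({"Type": "CUSTOM", "Schedule": "cron(0 12 * * ? *)"}))
```

An empty list means the three parameters are consistent with each other; it says nothing about the rest of the request.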
A list of key-value pairs that identify the data source connector. You can use the tags to identify and organize your resources and to control access to resources.
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
The value associated with the tag. The value may be an empty string but it can't be null.
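The tag rules above (allowed characters, unique keys, non-null values) can be expressed as a small pre-flight check. This is a sketch only: validate_tags is a hypothetical helper, treating "not case sensitive" as case-insensitive uniqueness is an assumption, and so is rejecting empty keys.

```python
ALLOWED_SYMBOLS = set("_.:/=+-@")


def _has_only_allowed_characters(text):
    # Unicode letters, digits, white space, and the listed symbols.
    return all(
        ch.isalpha() or ch.isdigit() or ch.isspace() or ch in ALLOWED_SYMBOLS
        for ch in text
    )


def validate_tags(tags):
    """Raise ValueError for the first tag that breaks the documented rules."""
    seen_keys = set()
    for tag in tags:
        key = tag.get("Key")
        value = tag.get("Value")
        # Assumption: an empty key is rejected.
        if not key or not _has_only_allowed_characters(key):
            raise ValueError(f"invalid tag key: {key!r}")
        # The value may be an empty string but not None.
        if value is None or not _has_only_allowed_characters(value):
            raise ValueError(f"invalid value for tag {key!r}: {value!r}")
        # Assumption: "not case sensitive" means Team and team collide.
        folded = key.casefold()
        if folded in seen_keys:
            raise ValueError(f"duplicate tag key: {key!r}")
        seen_keys.add(folded)
    return tags


tags = validate_tags([{"Key": "cost-center", "Value": ""}])
```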
A token that you provide to identify the request to create a data source connector. Multiple calls to the CreateDataSource API with the same client token will create only one data source connector.
This field is autopopulated if not provided.
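Because repeated calls with the same client token create only one resource, a caller that retries may want to fix the token once and reuse it. A minimal sketch, assuming a UUID is an acceptable token value; with_client_token and the parameter values are hypothetical.

```python
import uuid


def with_client_token(params, token=None):
    """Return a copy of params carrying a ClientToken."""
    request = dict(params)
    if "ClientToken" not in request:
        # Generate the token once; later retries reuse the same request dict.
        request["ClientToken"] = token or str(uuid.uuid4())
    return request


request = with_client_token(
    {"Name": "example-source", "IndexId": "example-index-id", "Type": "CUSTOM"}
)
# Sending `request` unchanged on a retry repeats the same ClientToken,
# e.g. client.create_data_source(**request) with a real boto3 client.
```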
Configuration information for altering document metadata and content during the document ingestion process.
For more information on how to create, modify and delete document metadata, or make other content alterations when you ingest documents into Amazon Kendra, see Customizing document metadata during the ingestion process.
Configuration information to alter document attributes or metadata fields and content when ingesting documents into Amazon Kendra.
Provides the configuration information for applying basic logic to alter document metadata and content when ingesting documents into Amazon Kendra. To apply advanced logic that goes beyond what basic logic can do, see HookConfiguration.
For more information, see Customizing document metadata during the ingestion process.
Configuration of the condition used for the target document attribute or metadata field when ingesting documents into Amazon Kendra.
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
The condition operator.
For example, you can use 'Contains' to partially match a string.
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
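A short sketch of the time zone requirement, using only the standard library. to_kendra_date is a hypothetical helper; it parses the example string from the text and rejects values without a UTC offset. Passing the resulting datetime as the date value relies on boto3's usual handling of timestamp fields, which is an assumption here.

```python
from datetime import datetime, timezone


def to_kendra_date(value):
    """Parse an ISO 8601 string and insist on an explicit UTC offset."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError(f"{value!r} has no time zone offset")
    return parsed


example = to_kendra_date("2012-03-25T12:30:10+01:00")
print(example.isoformat())                           # 2012-03-25T12:30:10+01:00
print(example.astimezone(timezone.utc).isoformat())  # 2012-03-25T11:30:10+00:00
```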
Configuration of the target document attribute or metadata field when ingesting documents into Amazon Kendra. You can also include a value.
The identifier of the target document attribute or metadata field.
For example, 'Department' could be an identifier for the target attribute or metadata field that includes the department names associated with the documents.
TRUE to delete the existing target value for your specified target attribute key. You cannot create a target value and set this to TRUE. To create a target value (TargetDocumentAttributeValue), set this to FALSE.
The target value you want to create for the target attribute.
For example, 'Finance' could be the target value for the target attribute key 'Department'.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
TRUE to delete content if the condition used for the target attribute is met.
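The condition and target described above can be assembled into one inline rule. The sketch below reuses the examples from the text ('Source_URI', 'Contains', 'financial', 'Department', 'Finance'). build_inline_rule is a hypothetical helper, and the dictionary keys are assumed to match the create_data_source request syntax for inline configurations; verify them against that syntax before use.

```python
def build_inline_rule(condition_key, operator, condition_value,
                      target_key, target_value=None, delete_target_value=False):
    """Build one inline enrichment rule (string values only)."""
    if condition_key == "_document_body":
        raise ValueError("_document_body is not supported as a condition key")
    if delete_target_value and target_value is not None:
        # Creating a target value and deleting it are mutually exclusive.
        raise ValueError("cannot create a target value and delete it at once")
    target = {
        "TargetDocumentAttributeKey": target_key,
        "TargetDocumentAttributeValueDeletion": delete_target_value,
    }
    if target_value is not None:
        target["TargetDocumentAttributeValue"] = {"StringValue": target_value}
    return {
        "Condition": {
            "ConditionDocumentAttributeKey": condition_key,
            "Operator": operator,
            "ConditionOnValue": {"StringValue": condition_value},
        },
        "Target": target,
    }


# If Source_URI contains "financial", set Department to "Finance".
rule = build_inline_rule("Source_URI", "Contains", "financial",
                         "Department", target_value="Finance")
```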
Configuration information for invoking a Lambda function in Lambda on the original or raw documents before extracting their metadata and text. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
The condition operator.
For example, you can use 'Contains' to partially match a string.
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
Configuration information for invoking a Lambda function in Lambda on the structured documents with their metadata and text extracted. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
The condition operator.
For example, you can use 'Contains' to partially match a string.
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
The Amazon Resource Name (ARN) of a role with permission to run PreExtractionHookConfiguration and PostExtractionHookConfiguration for altering document metadata and content during the document ingestion process. For more information, see IAM roles for Amazon Kendra.
dict
Response Syntax
{
'Id': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier of the data source connector.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ResourceAlreadyExistException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
create_experience(**kwargs)
Creates an Amazon Kendra experience such as a search application. For more information on creating a search application experience, including using the Python and Java SDKs, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.create_experience(
Name='string',
IndexId='string',
RoleArn='string',
Configuration={
'ContentSourceConfiguration': {
'DataSourceIds': [
'string',
],
'FaqIds': [
'string',
],
'DirectPutContent': True|False
},
'UserIdentityConfiguration': {
'IdentityAttributeName': 'string'
}
},
Description='string',
ClientToken='string'
)
[REQUIRED]
A name for your Amazon Kendra experience.
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
The Amazon Resource Name (ARN) of a role with permission to access the Query API, QuerySuggestions API, SubmitFeedback API, and IAM Identity Center that stores your user and group information. For more information, see IAM roles for Amazon Kendra.
Configuration information for your Amazon Kendra experience. This includes ContentSourceConfiguration, which specifies the data source IDs and/or FAQ IDs, and UserIdentityConfiguration, which specifies the user or group information to grant access to your Amazon Kendra experience.
The identifiers of your data sources and FAQs. Alternatively, you can specify that you want to use documents indexed via the BatchPutDocument API. This is the content you want to use for your Amazon Kendra experience.
The identifiers of the data sources you want to use for your Amazon Kendra experience.
The identifiers of the FAQs that you want to use for your Amazon Kendra experience.
TRUE to use documents you indexed directly using the BatchPutDocument API.
The IAM Identity Center field name that contains the identifiers of your users, such as their emails.
The IAM Identity Center field name that contains the identifiers of your users, such as their emails. This is used for user context filtering and for granting access to your Amazon Kendra experience. You must set up IAM Identity Center with Amazon Kendra. You must include your users and groups in your Access Control List when you ingest documents into your index. For more information, see Getting started with an IAM Identity Center identity source.
A token that you provide to identify the request to create your Amazon Kendra experience. Multiple calls to the CreateExperience API with the same client token create only one Amazon Kendra experience.
This field is autopopulated if not provided.
dict
Response Syntax
{
'Id': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier for your created Amazon Kendra experience.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
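As an illustration of the create_experience request shape, the sketch below assembles the nested Configuration from the Request Syntax above. build_experience_request and all identifier values are hypothetical; only the keyword and key names come from the syntax block.

```python
def build_experience_request(name, index_id, role_arn=None,
                             data_source_ids=(), faq_ids=(),
                             direct_put_content=False,
                             identity_attribute_name=None):
    """Assemble keyword arguments for client.create_experience()."""
    content = {"DirectPutContent": direct_put_content}
    if data_source_ids:
        content["DataSourceIds"] = list(data_source_ids)
    if faq_ids:
        content["FaqIds"] = list(faq_ids)
    configuration = {"ContentSourceConfiguration": content}
    if identity_attribute_name:
        configuration["UserIdentityConfiguration"] = {
            "IdentityAttributeName": identity_attribute_name
        }
    request = {"Name": name, "IndexId": index_id, "Configuration": configuration}
    if role_arn:
        request["RoleArn"] = role_arn
    return request


request = build_experience_request(
    "example-experience",
    "example-index-id",
    data_source_ids=["example-data-source-id"],
    identity_attribute_name="Email",
)
# With a real client: response = client.create_experience(**request)
```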
create_faq(**kwargs)
Creates a new set of frequently asked questions (FAQs) and answers.
Adding FAQs to an index is an asynchronous operation.
For an example of adding an FAQ to an index using Python and Java SDKs, see Using your FAQ file.
See also: AWS API Documentation
Request Syntax
response = client.create_faq(
IndexId='string',
Name='string',
Description='string',
S3Path={
'Bucket': 'string',
'Key': 'string'
},
RoleArn='string',
Tags=[
{
'Key': 'string',
'Value': 'string'
},
],
FileFormat='CSV'|'CSV_WITH_HEADER'|'JSON',
ClientToken='string',
LanguageCode='string'
)
[REQUIRED]
The identifier of the index for the FAQ.
[REQUIRED]
A name for the FAQ.
[REQUIRED]
The path to the FAQ file in S3.
The name of the S3 bucket that contains the file.
The name of the file.
[REQUIRED]
The Amazon Resource Name (ARN) of a role with permission to access the S3 bucket that contains the FAQs. For more information, see IAM Roles for Amazon Kendra.
A list of key-value pairs that identify the FAQ. You can use the tags to identify and organize your resources and to control access to resources.
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
The value associated with the tag. The value may be an empty string but it can't be null.
The format of the FAQ input file. You can choose between a basic CSV format, a CSV format that includes custom attributes in a header, and a JSON format that includes custom attributes.
The format must match the format of the file stored in the S3 bucket identified in the S3Path parameter.
For more information, see Adding questions and answers.
A token that you provide to identify the request to create a FAQ. Multiple calls to the CreateFaq API with the same client token will create only one FAQ.
This field is autopopulated if not provided.
dict
Response Syntax
{
'Id': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier of the FAQ.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
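A minimal sketch of calling create_faq with the required parameters and an explicit file format. The wrapper function is hypothetical and takes the client as an argument, so it can be exercised with either a real boto3 Kendra client or a test double; the bucket, key, and role values in the usage comment are placeholders.

```python
FAQ_FILE_FORMATS = {"CSV", "CSV_WITH_HEADER", "JSON"}


def create_faq(client, index_id, name, bucket, key, role_arn, file_format):
    """Create an FAQ from a file in S3 and return the new FAQ identifier."""
    if file_format not in FAQ_FILE_FORMATS:
        raise ValueError(f"unsupported FileFormat: {file_format!r}")
    response = client.create_faq(
        IndexId=index_id,
        Name=name,
        S3Path={"Bucket": bucket, "Key": key},
        RoleArn=role_arn,
        # Must match the actual format of the file at S3Path.
        FileFormat=file_format,
    )
    return response["Id"]


# Usage with a real client (needs AWS credentials, so not run here):
#   import boto3
#   faq_id = create_faq(boto3.client("kendra"), "example-index-id", "example-faq",
#                       "example-bucket", "faqs/example.csv",
#                       "arn:aws:iam::123456789012:role/example", "CSV")
```

Since adding FAQs is asynchronous, the returned identifier refers to an FAQ that may not be ready yet.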
create_index(**kwargs)
Creates an Amazon Kendra index. Index creation is an asynchronous operation. To determine if index creation has completed, check the Status field returned from a call to DescribeIndex. The Status field is set to ACTIVE when the index is ready to use.
Once the index is active, you can index your documents using the BatchPutDocument API or using one of the supported data sources.
For an example of creating an index and data source using the Python SDK, see Getting started with Python SDK. For an example of creating an index and data source using the Java SDK, see Getting started with Java SDK.
See also: AWS API Documentation
Request Syntax
response = client.create_index(
Name='string',
Edition='DEVELOPER_EDITION'|'ENTERPRISE_EDITION',
RoleArn='string',
ServerSideEncryptionConfiguration={
'KmsKeyId': 'string'
},
Description='string',
ClientToken='string',
Tags=[
{
'Key': 'string',
'Value': 'string'
},
],
UserTokenConfigurations=[
{
'JwtTokenTypeConfiguration': {
'KeyLocation': 'URL'|'SECRET_MANAGER',
'URL': 'string',
'SecretManagerArn': 'string',
'UserNameAttributeField': 'string',
'GroupAttributeField': 'string',
'Issuer': 'string',
'ClaimRegex': 'string'
},
'JsonTokenTypeConfiguration': {
'UserNameAttributeField': 'string',
'GroupAttributeField': 'string'
}
},
],
UserContextPolicy='ATTRIBUTE_FILTER'|'USER_TOKEN',
UserGroupResolutionConfiguration={
'UserGroupResolutionMode': 'AWS_SSO'|'NONE'
}
)
[REQUIRED]
A name for the index.
The Amazon Kendra edition to use for the index. Choose DEVELOPER_EDITION for indexes intended for development, testing, or proof of concept. Use ENTERPRISE_EDITION for your production databases. Once you set the edition for an index, it can't be changed.
The Edition parameter is optional. If you don't supply a value, the default is ENTERPRISE_EDITION.
For more information on quota limits for enterprise and developer editions, see Quotas.
[REQUIRED]
An Identity and Access Management (IAM) role that gives Amazon Kendra permissions to access your Amazon CloudWatch logs and metrics. This is also the role you use when you call the BatchPutDocument API to index documents from an Amazon S3 bucket.
The identifier of the KMS customer managed key (CMK) that's used to encrypt data indexed by Amazon Kendra. Amazon Kendra doesn't support asymmetric CMKs.
The identifier of the KMS key. Amazon Kendra doesn't support asymmetric keys.
A token that you provide to identify the request to create an index. Multiple calls to the CreateIndex API with the same client token will create only one index.
This field is autopopulated if not provided.
A list of key-value pairs that identify the index. You can use the tags to identify and organize your resources and to control access to resources.
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
The value associated with the tag. The value may be an empty string but it can't be null.
The user token configuration.
Provides the configuration information for a token.
Information about the JWT token type configuration.
The location of the key.
The signing key URL.
The Amazon Resource Name (ARN) of the secret.
The user name attribute field.
The group attribute field.
The issuer of the token.
The regular expression that identifies the claim.
Information about the JSON token type configuration.
The user name attribute field.
The group attribute field.
The user context policy.
ATTRIBUTE_FILTER
All indexed content is searchable and displayable for all users. If you want to filter search results on user context, you can use the attribute filters of _user_id and _group_ids or you can provide user and group information in UserContext.
USER_TOKEN
Enables token-based user access control to filter search results on user context. All documents with no access control and all documents accessible to the user will be searchable and displayable.
Enables fetching access levels of groups and users from an IAM Identity Center (successor to Single Sign-On) identity source. To configure this, see UserGroupResolutionConfiguration.
The identity store provider (mode) you want to use to fetch access levels of groups and users. IAM Identity Center (successor to Single Sign-On) is currently the only available mode. Your users and groups must exist in an IAM Identity Center identity source in order to use this mode.
dict
Response Syntax
{
'Id': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier of the index. Use this identifier when you query an index, set up a data source, or index a document.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceAlreadyExistException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
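Because index creation is asynchronous, callers typically poll describe_index until Status is ACTIVE, as the create_index description says. The sketch below shows one way to do that. wait_for_index_active is a hypothetical helper, the delay and attempt limits are arbitrary, and treating a FAILED status as terminal is an assumption not stated in this section.

```python
import time


def wait_for_index_active(client, index_id, delay_seconds=30,
                          max_attempts=60, sleep=time.sleep):
    """Poll describe_index until the index reports ACTIVE."""
    for _ in range(max_attempts):
        status = client.describe_index(Id=index_id)["Status"]
        if status == "ACTIVE":
            return status
        if status == "FAILED":  # assumption: creation will not recover
            raise RuntimeError(f"index {index_id} failed to create")
        sleep(delay_seconds)
    raise TimeoutError(f"index {index_id} not ACTIVE after {max_attempts} checks")


# Usage with a real client (needs AWS credentials, so not run here):
#   import boto3
#   client = boto3.client("kendra")
#   index_id = client.create_index(Name="example", RoleArn="...")["Id"]
#   wait_for_index_active(client, index_id)
```

The sleep parameter exists only so the loop can be tested without waiting.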
create_query_suggestions_block_list(**kwargs)
Creates a block list to exclude certain queries from suggestions.
Any query that contains words or phrases specified in the block list is blocked or filtered out from being shown as a suggestion.
You need to provide the file location of your block list text file in your S3 bucket. In your text file, enter each block word or phrase on a separate line.
For information on the current quota limits for block lists, see Quotas for Amazon Kendra.
CreateQuerySuggestionsBlockList is currently not supported in the Amazon Web Services GovCloud (US-West) region.
For an example of creating a block list for query suggestions using the Python SDK, see Query suggestions block list.
See also: AWS API Documentation
Request Syntax
response = client.create_query_suggestions_block_list(
IndexId='string',
Name='string',
Description='string',
SourceS3Path={
'Bucket': 'string',
'Key': 'string'
},
ClientToken='string',
RoleArn='string',
Tags=[
{
'Key': 'string',
'Value': 'string'
},
]
)
[REQUIRED]
The identifier of the index you want to create a query suggestions block list for.
[REQUIRED]
A user-friendly name for the block list.
For example, the block list named 'offensive-words' includes all offensive words that could appear in user queries and need to be blocked from suggestions.
A user-friendly description for the block list.
For example, the description "List of all offensive words that can appear in user queries and need to be blocked from suggestions."
[REQUIRED]
The S3 path to your block list text file in your S3 bucket.
Each block word or phrase should be on a separate line in a text file.
For information on the current quota limits for block lists, see Quotas for Amazon Kendra.
The name of the S3 bucket that contains the file.
The name of the file.
A token that you provide to identify the request to create a query suggestions block list.
This field is autopopulated if not provided.
[REQUIRED]
The IAM (Identity and Access Management) role used by Amazon Kendra to access the block list text file in your S3 bucket.
You need permissions to the role ARN (Amazon Resource Name). The role needs S3 read permissions for your file in S3 and must grant STS (Security Token Service) assume-role permissions to Amazon Kendra.
A tag that you can assign to a block list that categorizes the block list.
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
The value associated with the tag. The value may be an empty string but it can't be null.
dict
Response Syntax
{
'Id': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier of the created block list.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
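The block list file described above is plain text with one block word or phrase per line. A sketch of producing that content follows; render_block_list is a hypothetical helper, and collapsing whitespace and dropping duplicates are conveniences chosen here, not documented requirements.

```python
def render_block_list(phrases):
    """Return block list text: one word or phrase per line."""
    seen = set()
    lines = []
    for phrase in phrases:
        cleaned = " ".join(phrase.split())  # trim and collapse whitespace
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            lines.append(cleaned)
    return "".join(line + "\n" for line in lines)


text = render_block_list(["foo", "  bar   baz ", "", "foo"])
print(text, end="")
# Upload `text` to S3, then pass its location as SourceS3Path to
# create_query_suggestions_block_list.
```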
create_thesaurus(**kwargs)
Creates a thesaurus for an index. The thesaurus contains a list of synonyms in Solr format.
For an example of adding a thesaurus file to an index, see Adding custom synonyms to an index.
See also: AWS API Documentation
Request Syntax
response = client.create_thesaurus(
IndexId='string',
Name='string',
Description='string',
RoleArn='string',
Tags=[
{
'Key': 'string',
'Value': 'string'
},
],
SourceS3Path={
'Bucket': 'string',
'Key': 'string'
},
ClientToken='string'
)
[REQUIRED]
The identifier of the index for the thesaurus.
[REQUIRED]
A name for the thesaurus.
[REQUIRED]
An IAM role that gives Amazon Kendra permissions to access the thesaurus file specified in SourceS3Path.
A list of key-value pairs that identify the thesaurus. You can use the tags to identify and organize your resources and to control access to resources.
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
The value associated with the tag. The value may be an empty string but it can't be null.
[REQUIRED]
The path to the thesaurus file in S3.
The name of the S3 bucket that contains the file.
The name of the file.
A token that you provide to identify the request to create a thesaurus. Multiple calls to the CreateThesaurus
API with the same client token will create only one thesaurus.
This field is autopopulated if not provided.
dict
Response Syntax
{
'Id': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier of the thesaurus.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
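For orientation, the Solr synonym format mentioned in the create_thesaurus description lists equivalent terms separated by commas and writes one-way mappings with "=>". The sketch below renders such a file. render_thesaurus is a hypothetical helper, the terms are invented, and which Solr features Amazon Kendra accepts is not stated in this section, so treat the output as illustrative.

```python
def render_thesaurus(equivalent_groups=(), mappings=None):
    """Render synonym rules as Solr-style text, one rule per line."""
    lines = []
    for group in equivalent_groups:
        # Equivalent synonyms: every term expands to all the others.
        lines.append(", ".join(group))
    for source, targets in (mappings or {}).items():
        # Explicit mapping: the left side is replaced by the right side.
        lines.append(f"{source} => {', '.join(targets)}")
    return "".join(line + "\n" for line in lines)


text = render_thesaurus(
    equivalent_groups=[["AWS", "Amazon Web Services"]],
    mappings={"k8s": ["kubernetes"]},
)
print(text, end="")
```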
delete_access_control_configuration(**kwargs)
Deletes an access control configuration that you created for your documents in an index. This includes user and group access information for your documents. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
See also: AWS API Documentation
Request Syntax
response = client.delete_access_control_configuration(
IndexId='string',
Id='string'
)
[REQUIRED]
The identifier of the index for an access control configuration.
[REQUIRED]
The identifier of the access control configuration you want to delete.
dict
Response Syntax
{}
Response Structure
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
delete_data_source(**kwargs)
Deletes an Amazon Kendra data source connector. An exception is not thrown if the data source is already being deleted. While the data source is being deleted, the Status field returned by a call to the DescribeDataSource API is set to DELETING. For more information, see Deleting Data Sources.
See also: AWS API Documentation
Request Syntax
response = client.delete_data_source(
Id='string',
IndexId='string'
)
[REQUIRED]
The identifier of the data source connector you want to delete.
[REQUIRED]
The identifier of the index used with the data source connector.
None
Exceptions
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.InternalServerException
delete_experience(**kwargs)
Deletes your Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.delete_experience(
Id='string',
IndexId='string'
)
[REQUIRED]
The identifier of the Amazon Kendra experience you want to delete.
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
dict
Response Syntax
{}
Response Structure
Exceptions
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.InternalServerException
delete_faq(**kwargs)
Removes an FAQ from an index.
See also: AWS API Documentation
Request Syntax
response = client.delete_faq(
Id='string',
IndexId='string'
)
[REQUIRED]
The identifier of the FAQ you want to remove.
[REQUIRED]
The identifier of the index for the FAQ.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
delete_index(**kwargs)
Deletes an existing Amazon Kendra index. An exception is not thrown if the index is already being deleted. While the index is being deleted, the Status field returned by a call to the DescribeIndex API is set to DELETING.
See also: AWS API Documentation
Request Syntax
response = client.delete_index(
Id='string'
)
[REQUIRED]
The identifier of the index you want to delete.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
delete_principal_mapping(**kwargs)
Deletes a group so that all users and sub groups that belong to the group can no longer access documents only available to that group.
For example, after deleting the group "Summer Interns", all interns who belonged to that group no longer see intern-only documents in their search results.
If you want to delete or replace users or sub groups of a group, you need to use the PutPrincipalMapping operation. For example, if a user in the group "Engineering" leaves the engineering team and another user takes their place, you provide an updated list of users or sub groups that belong to the "Engineering" group when calling PutPrincipalMapping. You can update your internal list of users or sub groups and input this list when calling PutPrincipalMapping.
DeletePrincipalMapping is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.delete_principal_mapping(
    IndexId='string',
    DataSourceId='string',
    GroupId='string',
    OrderingId=123
)
Parameters
IndexId (string) --
[REQUIRED]
The identifier of the index you want to delete a group from.
DataSourceId (string) --
The identifier of the data source you want to delete a group from.
A group can be tied to multiple data sources. You can delete a group from accessing documents in a certain data source. For example, the groups "Research", "Engineering", and "Sales and Marketing" are all tied to the company's documents stored in the data sources Confluence and Salesforce. You want to delete "Research" and "Engineering" groups from Salesforce, so that these groups cannot access customer-related documents stored in Salesforce. Only "Sales and Marketing" should access documents in the Salesforce data source.
GroupId (string) --
[REQUIRED]
The identifier of the group you want to delete.
OrderingId (integer) --
The timestamp identifier you specify to ensure Amazon Kendra does not override the latest DELETE action with previous actions. The highest number ID, which is the ordering ID, is the latest action you want to process and apply on top of other actions with lower number IDs. This prevents previous actions with lower number IDs from possibly overriding the latest action.
The ordering ID can be the UNIX time of the last update you made to a group members list. You would then provide this list when calling PutPrincipalMapping. This ensures your DELETE action for that updated group with the latest members list doesn't get overwritten by earlier DELETE actions for the same group which are yet to be processed.
The default ordering ID is the current UNIX time in milliseconds that the action was received by Amazon Kendra.
Returns: None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
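As an illustration of the OrderingId guidance above, the following sketch builds a DeletePrincipalMapping request stamped with the current UNIX time in milliseconds. The helper name `delete_group`, its argument names, and the choice to omit DataSourceId when it is not given are assumptions made for this example; `client` is expected to be a `boto3.client('kendra')` instance.

```python
import time


def delete_group(client, index_id, group_id, data_source_id=None, ordering_id=None):
    """Delete a principal mapping, stamping the request with an ordering ID.

    `client` is assumed to be a boto3 Kendra client. The helper name and
    argument names are hypothetical and not part of boto3.
    """
    if ordering_id is None:
        # Use the current UNIX time in milliseconds so that this DELETE
        # outranks earlier actions that carry lower ordering IDs.
        ordering_id = int(time.time() * 1000)

    params = {
        "IndexId": index_id,
        "GroupId": group_id,
        "OrderingId": ordering_id,
    }
    # DataSourceId is optional; send it only when the deletion should be
    # scoped to a single data source.
    if data_source_id is not None:
        params["DataSourceId"] = data_source_id

    client.delete_principal_mapping(**params)
    return params
```

Returning the parameters makes it easy to log the ordering ID that was sent, which can help when reasoning about which action Amazon Kendra should treat as the latest.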
delete_query_suggestions_block_list(**kwargs)
Deletes a block list used for query suggestions for an index.
A deleted block list might not take effect right away. Amazon Kendra needs to refresh the entire suggestions list to add back the queries that were previously blocked.
DeleteQuerySuggestionsBlockList is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.delete_query_suggestions_block_list(
    IndexId='string',
    Id='string'
)
Parameters
IndexId (string) --
[REQUIRED]
The identifier of the index for the block list.
Id (string) --
[REQUIRED]
The identifier of the block list you want to delete.
Returns: None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
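A minimal sketch of calling this operation so that a missing block list is not treated as a failure. The helper name `delete_block_list_if_present` is hypothetical; it relies only on the `ResourceNotFoundException` class listed above, accessed through `client.exceptions`, and assumes `client` is a `boto3.client('kendra')` instance.

```python
def delete_block_list_if_present(client, index_id, block_list_id):
    """Delete a query suggestions block list.

    Returns True if the delete request was accepted and False if the block
    list was not found. The helper name is hypothetical.
    """
    try:
        client.delete_query_suggestions_block_list(
            IndexId=index_id,
            Id=block_list_id,
        )
    except client.exceptions.ResourceNotFoundException:
        # The block list does not exist, for example because it was
        # already deleted by an earlier call.
        return False
    return True
```

Other exceptions, such as `ValidationException` or `AccessDeniedException`, are deliberately left to propagate, since they usually indicate a problem the caller needs to fix.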
delete_thesaurus(**kwargs)
Deletes an existing Amazon Kendra thesaurus.
See also: AWS API Documentation
Request Syntax
response = client.delete_thesaurus(
    Id='string',
    IndexId='string'
)
Parameters
Id (string) --
[REQUIRED]
The identifier of the thesaurus you want to delete.
IndexId (string) --
[REQUIRED]
The identifier of the index for the thesaurus.
Returns: None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
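Because `ThrottlingException` is among the listed exceptions, a caller may want to retry at the call site. The sketch below shows one way to do that with exponential backoff. The helper name, the attempt count, and the delays are assumptions for illustration, and botocore has its own configurable retry behaviour that may already cover this case.

```python
import time


def delete_thesaurus_with_retry(client, index_id, thesaurus_id,
                                max_attempts=3, base_delay=1.0,
                                sleep=time.sleep):
    """Call delete_thesaurus, retrying when the request is throttled.

    Returns the number of attempts that were made. The helper name and the
    default retry settings are hypothetical.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            client.delete_thesaurus(Id=thesaurus_id, IndexId=index_id)
            return attempt
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the throttling error.
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` argument is injectable so that the waiting behaviour can be replaced in tests.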
describe_access_control_configuration(**kwargs)
Gets information about an access control configuration that you created for your documents in an index. This includes user and group access information for your documents. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
See also: AWS API Documentation
Request Syntax
response = client.describe_access_control_configuration(
    IndexId='string',
    Id='string'
)
Parameters
IndexId (string) --
[REQUIRED]
The identifier of the index for an access control configuration.
Id (string) --
[REQUIRED]
The identifier of the access control configuration you want to get information on.
Return type: dict
Response Syntax
{
    'Name': 'string',
    'Description': 'string',
    'ErrorMessage': 'string',
    'AccessControlList': [
        {
            'Name': 'string',
            'Type': 'USER'|'GROUP',
            'Access': 'ALLOW'|'DENY',
            'DataSourceId': 'string'
        },
    ],
    'HierarchicalAccessControlList': [
        {
            'PrincipalList': [
                {
                    'Name': 'string',
                    'Type': 'USER'|'GROUP',
                    'Access': 'ALLOW'|'DENY',
                    'DataSourceId': 'string'
                },
            ]
        },
    ]
}
Response Structure
(dict) --
Name (string) --
The name for the access control configuration.
Description (string) --
The description for the access control configuration.
ErrorMessage (string) --
The error message containing details if there are issues processing the access control configuration.
AccessControlList (list) --
Information on principals (users and/or groups) and which documents they should have access to. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
(dict) --
Provides user and group information for user context filtering.
Name (string) --
The name of the user or group.
Type (string) --
The type of principal.
Access (string) --
Whether to allow or deny document access to the principal.
DataSourceId (string) --
The identifier of the data source the principal should access documents from.
HierarchicalAccessControlList (list) --
The list of principal lists that define the hierarchy for which documents users should have access to.
(dict) --
Information to define the hierarchy for which documents users should have access to.
PrincipalList (list) --
A list of principal lists that define the hierarchy for which documents users should have access to. Each hierarchical list specifies which user or group has allow or deny access for each document.
(dict) --
Provides user and group information for user context filtering.
Name (string) --
The name of the user or group.
Type (string) --
The type of principal.
Access (string) --
Whether to allow or deny document access to the principal.
DataSourceId (string) --
The identifier of the data source the principal should access documents from.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
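The response structure above can be summarised with a few lines of code. The sketch below groups the principals in `AccessControlList` by their `Access` value. The function name and the `TYPE:name` label format are assumptions for illustration, and the input is expected to be the dict returned by `describe_access_control_configuration`.

```python
def summarize_access_control(response):
    """Group principals from AccessControlList by 'ALLOW' or 'DENY'.

    `response` is assumed to have the shape shown under Response Syntax.
    The label format "TYPE:name" is a choice made for this example.
    """
    summary = {"ALLOW": [], "DENY": []}
    for principal in response.get("AccessControlList", []):
        label = f"{principal['Type']}:{principal['Name']}"
        summary[principal["Access"]].append(label)
    return summary
```

A typical use would be `summarize_access_control(client.describe_access_control_configuration(IndexId=..., Id=...))`, with identifiers supplied by the caller.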
describe_data_source(**kwargs)
Gets information about an Amazon Kendra data source connector.
See also: AWS API Documentation
Request Syntax
response = client.describe_data_source(
    Id='string',
    IndexId='string'
)
Parameters
Id (string) --
[REQUIRED]
The identifier of the data source connector.
IndexId (string) --
[REQUIRED]
The identifier of the index used with the data source connector.
Return type: dict
Response Syntax
{
'Id': 'string',
'IndexId': 'string',
'Name': 'string',
'Type': 'S3'|'SHAREPOINT'|'DATABASE'|'SALESFORCE'|'ONEDRIVE'|'SERVICENOW'|'CUSTOM'|'CONFLUENCE'|'GOOGLEDRIVE'|'WEBCRAWLER'|'WORKDOCS'|'FSX'|'SLACK'|'BOX'|'QUIP'|'JIRA'|'GITHUB'|'ALFRESCO'|'TEMPLATE',
'Configuration': {
'S3Configuration': {
'BucketName': 'string',
'InclusionPrefixes': [
'string',
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'DocumentsMetadataConfiguration': {
'S3Prefix': 'string'
},
'AccessControlListConfiguration': {
'KeyPath': 'string'
}
},
'SharePointConfiguration': {
'SharePointVersion': 'SHAREPOINT_2013'|'SHAREPOINT_2016'|'SHAREPOINT_ONLINE'|'SHAREPOINT_2019',
'Urls': [
'string',
],
'SecretArn': 'string',
'CrawlAttachments': True|False,
'UseChangeLog': True|False,
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'DocumentTitleFieldName': 'string',
'DisableLocalGroups': True|False,
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'AuthenticationType': 'HTTP_BASIC'|'OAUTH2',
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
}
},
'DatabaseConfiguration': {
'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL',
'ConnectionConfiguration': {
'DatabaseHost': 'string',
'DatabasePort': 123,
'DatabaseName': 'string',
'TableName': 'string',
'SecretArn': 'string'
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'ColumnConfiguration': {
'DocumentIdColumnName': 'string',
'DocumentDataColumnName': 'string',
'DocumentTitleColumnName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ChangeDetectingColumns': [
'string',
]
},
'AclConfiguration': {
'AllowedGroupsColumnName': 'string'
},
'SqlConfiguration': {
'QueryIdentifiersEnclosingOption': 'DOUBLE_QUOTES'|'NONE'
}
},
'SalesforceConfiguration': {
'ServerUrl': 'string',
'SecretArn': 'string',
'StandardObjectConfigurations': [
{
'Name': 'ACCOUNT'|'CAMPAIGN'|'CASE'|'CONTACT'|'CONTRACT'|'DOCUMENT'|'GROUP'|'IDEA'|'LEAD'|'OPPORTUNITY'|'PARTNER'|'PRICEBOOK'|'PRODUCT'|'PROFILE'|'SOLUTION'|'TASK'|'USER',
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
],
'KnowledgeArticleConfiguration': {
'IncludedStates': [
'DRAFT'|'PUBLISHED'|'ARCHIVED',
],
'StandardKnowledgeArticleTypeConfiguration': {
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'CustomKnowledgeArticleTypeConfigurations': [
{
'Name': 'string',
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
]
},
'ChatterFeedConfiguration': {
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'IncludeFilterTypes': [
'ACTIVE_USER'|'STANDARD_USER',
]
},
'CrawlAttachments': True|False,
'StandardObjectAttachmentConfiguration': {
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
]
},
'OneDriveConfiguration': {
'TenantDomain': 'string',
'SecretArn': 'string',
'OneDriveUsers': {
'OneDriveUserList': [
'string',
],
'OneDriveUserS3Path': {
'Bucket': 'string',
'Key': 'string'
}
},
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'DisableLocalGroups': True|False
},
'ServiceNowConfiguration': {
'HostUrl': 'string',
'SecretArn': 'string',
'ServiceNowBuildVersion': 'LONDON'|'OTHERS',
'KnowledgeArticleConfiguration': {
'CrawlAttachments': True|False,
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
],
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'FilterQuery': 'string'
},
'ServiceCatalogConfiguration': {
'CrawlAttachments': True|False,
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
],
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AuthenticationType': 'HTTP_BASIC'|'OAUTH2'
},
'ConfluenceConfiguration': {
'ServerUrl': 'string',
'SecretArn': 'string',
'Version': 'CLOUD'|'SERVER',
'SpaceConfiguration': {
'CrawlPersonalSpaces': True|False,
'CrawlArchivedSpaces': True|False,
'IncludeSpaces': [
'string',
],
'ExcludeSpaces': [
'string',
],
'SpaceFieldMappings': [
{
'DataSourceFieldName': 'DISPLAY_URL'|'ITEM_TYPE'|'SPACE_KEY'|'URL',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'PageConfiguration': {
'PageFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'CONTENT_STATUS'|'CREATED_DATE'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'MODIFIED_DATE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'BlogConfiguration': {
'BlogFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'PUBLISH_DATE'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AttachmentConfiguration': {
'CrawlAttachments': True|False,
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'CONTENT_TYPE'|'CREATED_DATE'|'DISPLAY_URL'|'FILE_SIZE'|'ITEM_TYPE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
'AuthenticationType': 'HTTP_BASIC'|'PAT'
},
'GoogleDriveConfiguration': {
'SecretArn': 'string',
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ExcludeMimeTypes': [
'string',
],
'ExcludeUserAccounts': [
'string',
],
'ExcludeSharedDrives': [
'string',
]
},
'WebCrawlerConfiguration': {
'Urls': {
'SeedUrlConfiguration': {
'SeedUrls': [
'string',
],
'WebCrawlerMode': 'HOST_ONLY'|'SUBDOMAINS'|'EVERYTHING'
},
'SiteMapsConfiguration': {
'SiteMaps': [
'string',
]
}
},
'CrawlDepth': 123,
'MaxLinksPerPage': 123,
'MaxContentSizePerPageInMegaBytes': ...,
'MaxUrlsPerMinuteCrawlRate': 123,
'UrlInclusionPatterns': [
'string',
],
'UrlExclusionPatterns': [
'string',
],
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
'AuthenticationConfiguration': {
'BasicAuthentication': [
{
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
]
}
},
'WorkDocsConfiguration': {
'OrganizationId': 'string',
'CrawlComments': True|False,
'UseChangeLog': True|False,
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'FsxConfiguration': {
'FileSystemId': 'string',
'FileSystemType': 'WINDOWS',
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'SecretArn': 'string',
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'SlackConfiguration': {
'TeamId': 'string',
'SecretArn': 'string',
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'SlackEntityList': [
'PUBLIC_CHANNEL'|'PRIVATE_CHANNEL'|'GROUP_MESSAGE'|'DIRECT_MESSAGE',
],
'UseChangeLog': True|False,
'CrawlBotMessage': True|False,
'ExcludeArchived': True|False,
'SinceCrawlDate': 'string',
'LookBackPeriod': 123,
'PrivateChannelFilter': [
'string',
],
'PublicChannelFilter': [
'string',
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'BoxConfiguration': {
'EnterpriseId': 'string',
'SecretArn': 'string',
'UseChangeLog': True|False,
'CrawlComments': True|False,
'CrawlTasks': True|False,
'CrawlWebLinks': True|False,
'FileFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'TaskFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'CommentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WebLinkFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'QuipConfiguration': {
'Domain': 'string',
'SecretArn': 'string',
'CrawlFileComments': True|False,
'CrawlChatRooms': True|False,
'CrawlAttachments': True|False,
'FolderIds': [
'string',
],
'ThreadFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'MessageFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'JiraConfiguration': {
'JiraAccountUrl': 'string',
'SecretArn': 'string',
'UseChangeLog': True|False,
'Project': [
'string',
],
'IssueType': [
'string',
],
'Status': [
'string',
],
'IssueSubEntityFilter': [
'COMMENTS'|'ATTACHMENTS'|'WORKLOGS',
],
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'CommentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'IssueFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ProjectFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WorkLogFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'GitHubConfiguration': {
'SaaSConfiguration': {
'OrganizationName': 'string',
'HostUrl': 'string'
},
'OnPremiseConfiguration': {
'HostUrl': 'string',
'OrganizationName': 'string',
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
}
},
'Type': 'SAAS'|'ON_PREMISE',
'SecretArn': 'string',
'UseChangeLog': True|False,
'GitHubDocumentCrawlProperties': {
'CrawlRepositoryDocuments': True|False,
'CrawlIssue': True|False,
'CrawlIssueComment': True|False,
'CrawlIssueCommentAttachment': True|False,
'CrawlPullRequest': True|False,
'CrawlPullRequestComment': True|False,
'CrawlPullRequestCommentAttachment': True|False
},
'RepositoryFilter': [
'string',
],
'InclusionFolderNamePatterns': [
'string',
],
'InclusionFileTypePatterns': [
'string',
],
'InclusionFileNamePatterns': [
'string',
],
'ExclusionFolderNamePatterns': [
'string',
],
'ExclusionFileTypePatterns': [
'string',
],
'ExclusionFileNamePatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'GitHubRepositoryConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubCommitConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueDocumentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueCommentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueAttachmentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestCommentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestDocumentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestDocumentAttachmentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AlfrescoConfiguration': {
'SiteUrl': 'string',
'SiteId': 'string',
'SecretArn': 'string',
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'CrawlSystemFolders': True|False,
'CrawlComments': True|False,
'EntityFilter': [
'wiki'|'blog'|'documentLibrary',
],
'DocumentLibraryFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'BlogFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WikiFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'TemplateConfiguration': {
'Template': {...}|[...]|123|123.4|'string'|True|None
}
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'Description': 'string',
'Status': 'CREATING'|'DELETING'|'FAILED'|'UPDATING'|'ACTIVE',
'Schedule': 'string',
'RoleArn': 'string',
'ErrorMessage': 'string',
'LanguageCode': 'string',
'CustomDocumentEnrichmentConfiguration': {
'InlineConfigurations': [
{
'Condition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'Target': {
'TargetDocumentAttributeKey': 'string',
'TargetDocumentAttributeValueDeletion': True|False,
'TargetDocumentAttributeValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'DocumentContentDeletion': True|False
},
],
'PreExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'PostExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'RoleArn': 'string'
}
}
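Since the `Configuration` dict has one key per connector type, it can help to pick out only the parts that are populated. The following sketch does that and also checks the `Status` field. The function names are hypothetical, and the assumption that unused connector blocks are absent or empty in a real response is not confirmed by this reference.

```python
def active_connector_settings(response):
    """Return (key, settings) pairs for populated connector configurations.

    `response` is assumed to be the dict returned by describe_data_source.
    Empty or missing configuration blocks are skipped.
    """
    configuration = response.get("Configuration", {})
    return [(key, value) for key, value in configuration.items() if value]


def is_ready(response):
    """Return True when the data source reports the ACTIVE status."""
    return response.get("Status") == "ACTIVE"
```

For an S3 data source, for example, the first function would be expected to return a single pair whose key is `S3Configuration`.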
Response Structure
(dict) --
Id (string) --
The identifier of the data source connector.
IndexId (string) --
The identifier of the index used with the data source connector.
Name (string) --
The name for the data source connector.
Type (string) --
The type of the data source. For example, SHAREPOINT.
Configuration (dict) --
Configuration details for the data source connector. This shows how the data source is configured. The configuration options for a data source depend on the data source provider.
S3Configuration (dict) --
Provides the configuration information to connect to an Amazon S3 bucket as your data source.
BucketName (string) --
The name of the bucket that contains the documents.
InclusionPrefixes (list) --
A list of S3 prefixes for the documents that should be included in the index.
InclusionPatterns (list) --
A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed.
Some examples are:
ExclusionPatterns (list) --
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed.
Some examples are:
DocumentsMetadataConfiguration (dict) --
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
S3Prefix (string) --
A prefix used to filter metadata configuration files in the Amazon Web Services S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
AccessControlListConfiguration (dict) --
Provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources.
KeyPath (string) --
Path to the Amazon S3 bucket that contains the ACL files.
SharePointConfiguration (dict) --
Provides the configuration information to connect to Microsoft SharePoint as your data source.
SharePointVersion (string) --
The version of Microsoft SharePoint that you use.
Urls (list) --
The Microsoft SharePoint site URLs for the documents you want to index.
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password required to connect to the SharePoint instance. If you use SharePoint Server, you also need to provide the server domain name as part of the credentials. For more information, see Using a Microsoft SharePoint Data Source.
You can also provide OAuth authentication credentials of user name, password, client ID, and client secret. For more information, see Using a SharePoint data source.
CrawlAttachments (boolean) --
TRUE to index document attachments.
UseChangeLog (boolean) --
TRUE to use the SharePoint change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in SharePoint.
InclusionPatterns (list) --
A list of regular expression patterns to include certain documents in your SharePoint. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The regex applies to the display URL of the SharePoint document.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain documents in your SharePoint. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The regex applies to the display URL of the SharePoint document.
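The precedence rule described for InclusionPatterns and ExclusionPatterns can be sketched as follows. This is only an illustration using Python's `re` module. The function name, the use of `re.search` rather than a full match, and the treatment of an empty inclusion list as "include everything" are assumptions, and Amazon Kendra's own regular expression dialect may differ.

```python
import re


def is_indexed(display_url, inclusion_patterns, exclusion_patterns):
    """Decide whether a document's display URL would pass the filters.

    Illustrative only: matching semantics here are Python's, not
    necessarily Amazon Kendra's.
    """
    # An exclusion match always wins, even if an inclusion pattern matches.
    if any(re.search(p, display_url) for p in exclusion_patterns):
        return False
    # Assumption: with no inclusion patterns, nothing is filtered out here.
    if not inclusion_patterns:
        return True
    # Otherwise the URL must match at least one inclusion pattern.
    return any(re.search(p, display_url) for p in inclusion_patterns)
```

With an inclusion pattern for a document library and an exclusion pattern such as `\.tmp$`, a `.tmp` file inside that library is rejected because the exclusion pattern takes precedence.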
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud to connect to your Microsoft SharePoint. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map SharePoint data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to SharePoint fields. For more information, see Mapping data source fields. The SharePoint data source field names must exist in your SharePoint custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
DocumentTitleFieldName (string) --
The Microsoft SharePoint attribute field that contains the title of the document.
DisableLocalGroups (boolean) --
TRUE to disable local groups information.
SslCertificateS3Path (dict) --
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to SharePoint Server if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
AuthenticationType (string) --
Whether you want to connect to SharePoint using basic authentication of user name and password, or OAuth authentication of user name, password, client ID, and client secret. You can use OAuth authentication for SharePoint Online.
ProxyConfiguration (dict) --
Configuration information to connect to your Microsoft SharePoint site URLs via a web proxy. You can use this option for SharePoint Server.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication of user name and password. To store web proxy credentials, you use a secret in Secrets Manager.
It is recommended that you follow best security practices when configuring your web proxy. This includes setting up throttling, setting up logging and monitoring, and applying security patches on a regular basis. If you use your web proxy with multiple data sources, sync jobs that occur at the same time could strain the load on your proxy. It is recommended you prepare your proxy beforehand for any security and load requirements.
Host (string) --
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
Port (integer) --
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Credentials (string) --
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
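The Host and Port values described for ProxyConfiguration can be derived from a URL with the standard library. The sketch below reproduces the example from the text, where https://a.example.com/page1.html yields the host "a.example.com" and port 443. The function name and the small default-port table are assumptions for illustration.

```python
from urllib.parse import urlsplit

# Standard ports used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def proxy_host_and_port(url):
    """Return the Host and Port entries for a website URL.

    Raises KeyError for schemes other than http and https when the URL
    carries no explicit port. The helper name is hypothetical.
    """
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return {"Host": parts.hostname, "Port": port}
```

The optional `Credentials` entry is left out here because it holds a Secrets Manager ARN that only the caller can supply.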
DatabaseConfiguration (dict) --
Provides the configuration information to connect to a database as your data source.
DatabaseEngineType (string) --
The type of database engine that runs the database.
ConnectionConfiguration (dict) --
Configuration information that's required to connect to a database.
DatabaseHost (string) --
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
DatabasePort (integer) --
The port that the database uses for connections.
DatabaseName (string) --
The name of the database containing the document data.
TableName (string) --
The name of the table that contains the document data.
SecretArn (string) --
The Amazon Resource Name (ARN) of credentials stored in Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source. For more information about Secrets Manager, see What Is Secrets Manager in the Secrets Manager user guide.
VpcConfiguration (dict) --
Provides the configuration information to connect to an Amazon VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
ColumnConfiguration (dict) --
Information about where the index should get the document information from the database.
DocumentIdColumnName (string) --
The column that provides the document's identifier.
DocumentDataColumnName (string) --
The column that contains the contents of the document.
DocumentTitleColumnName (string) --
The column that contains the title of the document.
FieldMappings (list) --
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex API.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ChangeDetectingColumns (list) --
One to five columns that indicate when a document in the database has changed.
AclConfiguration (dict) --
Information about the database column that provides information for user context filtering.
AllowedGroupsColumnName (string) --
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query API.
SqlConfiguration (dict) --
Provides information about how Amazon Kendra uses quote marks around SQL identifiers when querying a database data source.
QueryIdentifiersEnclosingOption (string) --
Determines whether Amazon Kendra encloses SQL identifiers for tables and column names in double quotes (") when making a database query.
By default, Amazon Kendra passes SQL identifiers the way that they are entered into the data source configuration. It does not change the case of identifiers or enclose them in quotes.
PostgreSQL internally converts uppercase characters to lower case characters in identifiers unless they are quoted. Choosing this option encloses identifiers in quotes so that PostgreSQL does not convert the character's case.
For MySQL databases, you must enable the ansi_quotes option when you set this field to DOUBLE_QUOTES.
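As an illustration of how the DatabaseConfiguration pieces fit together, here is a sketch that assembles the structure as a plain Python dict. The host, database, table, and column names and the secret ARN are placeholders, and the enum values RDS_POSTGRESQL and DOUBLE_QUOTES are assumed to be accepted by the service; check them against the request syntax before use. The helpers field_mapping and change_detecting_columns are hypothetical.

```python
def field_mapping(source_name, index_name, date_format=None):
    """Build one FieldMappings entry; DateFieldFormat is only added when given."""
    mapping = {"DataSourceFieldName": source_name, "IndexFieldName": index_name}
    if date_format:
        mapping["DateFieldFormat"] = date_format
    return mapping


def change_detecting_columns(columns):
    """Enforce the documented limit of one to five change-detecting columns."""
    columns = list(columns)
    if not 1 <= len(columns) <= 5:
        raise ValueError("ChangeDetectingColumns needs between one and five columns")
    return columns


database_configuration = {
    "DatabaseEngineType": "RDS_POSTGRESQL",  # assumed enum value
    "ConnectionConfiguration": {
        "DatabaseHost": "db.example.internal",  # placeholder host
        "DatabasePort": 5432,
        "DatabaseName": "docs",
        "TableName": "articles",
        # Placeholder ARN of a secret holding a user/password pair.
        "SecretArn": "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-db",
    },
    "ColumnConfiguration": {
        "DocumentIdColumnName": "doc_id",
        "DocumentDataColumnName": "body",
        "DocumentTitleColumnName": "title",
        "FieldMappings": [field_mapping("department", "department")],
        "ChangeDetectingColumns": change_detecting_columns(["updated_at"]),
    },
    "AclConfiguration": {"AllowedGroupsColumnName": "allowed_groups"},
    # Quote identifiers so PostgreSQL keeps their case, as described above.
    "SqlConfiguration": {"QueryIdentifiersEnclosingOption": "DOUBLE_QUOTES"},
}
```

VpcConfiguration is left out of the sketch; it would carry SubnetIds and SecurityGroupIds lists when the database is reachable only inside a VPC.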
SalesforceConfiguration (dict) --
Provides the configuration information to connect to Salesforce as your data source.
ServerUrl (string) --
The instance URL for the Salesforce site that you want to index.
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key/value pairs required to connect to your Salesforce instance. The secret must contain a JSON structure with the following keys:
StandardObjectConfigurations (list) --
Configuration of the Salesforce standard objects that Amazon Kendra indexes.
(dict) --
Provides the configuration information for indexing a single standard object.
Name (string) --
The name of the standard object.
DocumentDataFieldName (string) --
The name of the field in the standard object table that contains the document contents.
DocumentTitleFieldName (string) --
The name of the field in the standard object table that contains the document title.
FieldMappings (list) --
Maps attributes or field names of the standard object to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
KnowledgeArticleConfiguration (dict) --
Configuration information for the knowledge article types that Amazon Kendra indexes. Amazon Kendra indexes standard knowledge articles and the standard fields of knowledge articles, or the custom fields of custom knowledge articles, but not both.
IncludedStates (list) --
Specifies the document states that should be included when Amazon Kendra indexes knowledge articles. You must specify at least one state.
StandardKnowledgeArticleTypeConfiguration (dict) --
Configuration information for standard Salesforce knowledge articles.
DocumentDataFieldName (string) --
The name of the field that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field that contains the document title.
FieldMappings (list) --
Maps attributes or field names of the knowledge article to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
CustomKnowledgeArticleTypeConfigurations (list) --
Configuration information for custom Salesforce knowledge articles.
(dict) --
Provides the configuration information for indexing Salesforce custom articles.
Name (string) --
The name of the configuration.
DocumentDataFieldName (string) --
The name of the field in the custom knowledge article that contains the document data to index.
DocumentTitleFieldName (string) --
The name of the field in the custom knowledge article that contains the document title.
FieldMappings (list) --
Maps attributes or field names of the custom knowledge article to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ChatterFeedConfiguration (dict) --
Configuration information for Salesforce chatter feeds.
DocumentDataFieldName (string) --
The name of the column in the Salesforce FeedItem table that contains the content to index. Typically this is the Body column.
DocumentTitleFieldName (string) --
The name of the column in the Salesforce FeedItem table that contains the title of the document. This is typically the Title column.
FieldMappings (list) --
Maps fields from a Salesforce chatter feed into Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
IncludeFilterTypes (list) --
Filters the documents in the feed based on the status of the user. When you specify ACTIVE_USERS, only documents from users who have an active account are indexed. When you specify STANDARD_USER, only documents for Salesforce standard users are indexed. You can specify both.
CrawlAttachments (boolean) --
Indicates whether Amazon Kendra should index attachments to Salesforce objects.
StandardObjectAttachmentConfiguration (dict) --
Configuration information for processing attachments to Salesforce standard objects.
DocumentTitleFieldName (string) --
The name of the field used for the document title.
FieldMappings (list) --
One or more objects that map fields in attachments to Amazon Kendra index fields.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
IncludeAttachmentFilePatterns (list) --
A list of regular expression patterns to include certain documents in your Salesforce. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the name of the attached file.
ExcludeAttachmentFilePatterns (list) --
A list of regular expression patterns to exclude certain documents in your Salesforce. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the name of the attached file.
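The KnowledgeArticleConfiguration rules above (at least one included state, and either standard or custom article types but not both) can be checked before a request is sent. This is a sketch with a hypothetical helper, knowledge_article_configuration; the state value PUBLISHED and the field names are assumptions for illustration.

```python
def knowledge_article_configuration(included_states, standard=None, custom=None):
    """Build a KnowledgeArticleConfiguration-shaped dict, enforcing the rules above."""
    included_states = list(included_states)
    if not included_states:
        raise ValueError("IncludedStates must contain at least one state")
    if standard is not None and custom:
        raise ValueError("specify standard or custom article types, not both")
    config = {"IncludedStates": included_states}
    if standard is not None:
        config["StandardKnowledgeArticleTypeConfiguration"] = standard
    if custom:
        config["CustomKnowledgeArticleTypeConfigurations"] = list(custom)
    return config


# Hypothetical field names; real ones depend on your Salesforce instance.
standard_articles = {
    "DocumentDataFieldName": "Summary",
    "DocumentTitleFieldName": "Title",
}

article_config = knowledge_article_configuration(
    ["PUBLISHED"],  # assumed state value
    standard=standard_articles,
)
```

The resulting dict would sit under the KnowledgeArticleConfiguration key of SalesforceConfiguration.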
OneDriveConfiguration (dict) --
Provides the configuration information to connect to Microsoft OneDrive as your data source.
TenantDomain (string) --
The Azure Active Directory domain of the organization.
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password to connect to OneDrive. The user name should be the application ID for the OneDrive application, and the password is the application key for the OneDrive application.
OneDriveUsers (dict) --
A list of user accounts whose documents should be indexed.
OneDriveUserList (list) --
A list of users whose documents should be indexed. Specify the user names in email format, for example, username@tenantdomain. If you need to index the documents of more than 100 users, use the OneDriveUserS3Path field to specify the location of a file containing a list of users.
OneDriveUserS3Path (dict) --
The S3 bucket location of a file containing a list of users whose documents should be indexed.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
InclusionPatterns (list) --
A list of regular expression patterns to include certain documents in your OneDrive. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the file name.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain documents in your OneDrive. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the file name.
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map OneDrive data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to OneDrive fields. For more information, see Mapping data source fields. The OneDrive data source field names must exist in your OneDrive custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
DisableLocalGroups (boolean) --
TRUE to disable local groups information.
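The OneDriveUsers description above implies a choice between an inline list and an S3 file once the user count passes 100. A minimal sketch of that choice follows; the helper onedrive_users and the bucket and key names are hypothetical, and the helper does not upload the file itself.

```python
MAX_INLINE_USERS = 100  # threshold stated in the OneDriveUserList description


def onedrive_users(users, s3_bucket=None, s3_key=None):
    """Return a OneDriveUsers-shaped dict, switching to S3 above 100 users."""
    users = list(users)
    if len(users) <= MAX_INLINE_USERS:
        return {"OneDriveUserList": users}
    if not (s3_bucket and s3_key):
        raise ValueError("more than 100 users: supply an S3 bucket and key")
    # The file at this location is expected to contain the user list.
    return {"OneDriveUserS3Path": {"Bucket": s3_bucket, "Key": s3_key}}


small = onedrive_users(["alice@example.com", "bob@example.com"])
large = onedrive_users(
    [f"user{i}@example.com" for i in range(101)],
    s3_bucket="example-bucket",
    s3_key="onedrive/users.txt",
)
```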
ServiceNowConfiguration (dict) --
Provides the configuration information to connect to ServiceNow as your data source.
HostUrl (string) --
The ServiceNow instance that the data source connects to. The host endpoint should look like the following: {instance}.service-now.com.
SecretArn (string) --
The Amazon Resource Name (ARN) of the Secrets Manager secret that contains the user name and password required to connect to the ServiceNow instance. You can also provide OAuth authentication credentials of user name, password, client ID, and client secret. For more information, see Using a ServiceNow data source.
ServiceNowBuildVersion (string) --
The identifier of the release that the ServiceNow host is running. If the host is not running the LONDON release, use OTHERS.
KnowledgeArticleConfiguration (dict) --
Configuration information for crawling knowledge articles in the ServiceNow site.
CrawlAttachments (boolean) --
TRUE to index attachments to knowledge articles.
IncludeAttachmentFilePatterns (list) --
A list of regular expression patterns to include certain attachments of knowledge articles in your ServiceNow. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the field specified in the PatternTargetField.
ExcludeAttachmentFilePatterns (list) --
A list of regular expression patterns to exclude certain attachments of knowledge articles in your ServiceNow. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the field specified in the PatternTargetField.
DocumentDataFieldName (string) --
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Maps attributes or field names of knowledge articles to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to ServiceNow fields. For more information, see Mapping data source fields. The ServiceNow data source field names must exist in your ServiceNow custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
FilterQuery (string) --
A query that selects the knowledge articles to index. The query can return articles from multiple knowledge bases, and the knowledge bases can be public or private.
The query string must be one generated by the ServiceNow console. For more information, see Specifying documents to index with a query.
ServiceCatalogConfiguration (dict) --
Configuration information for crawling service catalogs in the ServiceNow site.
CrawlAttachments (boolean) --
TRUE to index attachments to service catalog items.
IncludeAttachmentFilePatterns (list) --
A list of regular expression patterns to include certain attachments of catalogs in your ServiceNow. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the file name of the attachment.
ExcludeAttachmentFilePatterns (list) --
A list of regular expression patterns to exclude certain attachments of catalogs in your ServiceNow. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the file name of the attachment.
DocumentDataFieldName (string) --
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
DocumentTitleFieldName (string) --
The name of the ServiceNow field that is mapped to the index document title field.
FieldMappings (list) --
Maps attributes or field names of catalogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to ServiceNow fields. For more information, see Mapping data source fields. The ServiceNow data source field names must exist in your ServiceNow custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
AuthenticationType (string) --
The type of authentication used to connect to the ServiceNow instance. If you choose HTTP_BASIC, Amazon Kendra is authenticated using the user name and password provided in the Secrets Manager secret in the SecretArn field. If you choose OAUTH2, Amazon Kendra is authenticated using the credentials of client ID, client secret, user name and password.
When you use OAUTH2 authentication, you must generate a token and a client secret using the ServiceNow console. For more information, see Using a ServiceNow data source.
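The top-level ServiceNowConfiguration fields can be assembled and sanity-checked locally before they are sent. The sketch below uses a hypothetical helper, servicenow_configuration, and a placeholder secret ARN; the allowed values are the ones named in the descriptions above (LONDON or OTHERS, HTTP_BASIC or OAUTH2).

```python
BUILD_VERSIONS = {"LONDON", "OTHERS"}
AUTHENTICATION_TYPES = {"HTTP_BASIC", "OAUTH2"}


def servicenow_configuration(instance, secret_arn,
                             build_version="OTHERS",
                             authentication_type="HTTP_BASIC"):
    """Build the top-level ServiceNowConfiguration fields for one instance."""
    if build_version not in BUILD_VERSIONS:
        raise ValueError(f"unknown build version: {build_version!r}")
    if authentication_type not in AUTHENTICATION_TYPES:
        raise ValueError(f"unknown authentication type: {authentication_type!r}")
    return {
        # Host endpoint format from the HostUrl description.
        "HostUrl": f"{instance}.service-now.com",
        "SecretArn": secret_arn,
        "ServiceNowBuildVersion": build_version,
        "AuthenticationType": authentication_type,
    }


servicenow = servicenow_configuration(
    "example",
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:example-snow",  # placeholder
    authentication_type="OAUTH2",
)
```

With OAUTH2, the secret is expected to hold the client ID and client secret in addition to the user name and password.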
ConfluenceConfiguration (dict) --
Provides the configuration information to connect to Confluence as your data source.
ServerUrl (string) --
The URL of your Confluence instance. Use the full URL of the server. For example, https://server.example.com:port/ . You can also use an IP address, for example, https://192.168.1.113/ .
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password required to connect to the Confluence instance. If you use Confluence Cloud, you use a generated API token as the password.
You can also provide authentication credentials in the form of a personal access token. For more information, see Using a Confluence data source.
Version (string) --
The version or the type of Confluence installation to connect to.
SpaceConfiguration (dict) --
Configuration information for indexing Confluence spaces.
CrawlPersonalSpaces (boolean) --
TRUE to index personal spaces. You can add restrictions to items in personal spaces. If personal spaces are indexed, queries without user context information may return restricted items from a personal space in their results. For more information, see Filtering on user context.
CrawlArchivedSpaces (boolean) --
TRUE to index archived spaces.
IncludeSpaces (list) --
A list of space keys for Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are indexed. Spaces that aren't in the list aren't indexed. A space in the list must exist. Otherwise, Amazon Kendra logs an error when the data source is synchronized. If a space is in both the IncludeSpaces and the ExcludeSpaces list, the space is excluded.
ExcludeSpaces (list) --
A list of space keys of Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are not indexed. If a space is in both the ExcludeSpaces and the IncludeSpaces list, the space is excluded.
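The interaction between IncludeSpaces and ExcludeSpaces can be modelled locally to predict which spaces end up indexed. This is only a sketch of the rule stated above, not Amazon Kendra's implementation; the helper spaces_to_index is hypothetical, and treating an empty IncludeSpaces list as "all spaces are candidates" is an assumption that the text does not confirm.

```python
def spaces_to_index(all_spaces, include_spaces=(), exclude_spaces=()):
    """Return the space keys that remain after applying include/exclude lists."""
    included = set(include_spaces)
    excluded = set(exclude_spaces)
    if included:
        # Spaces that aren't in the include list aren't indexed.
        candidates = [key for key in all_spaces if key in included]
    else:
        # Assumption: no include list means every space is a candidate.
        candidates = list(all_spaces)
    # A space in both lists is excluded.
    return [key for key in candidates if key not in excluded]


remaining = spaces_to_index(
    ["ENG", "HR", "OPS"],
    include_spaces=["ENG", "HR"],
    exclude_spaces=["HR"],
)
print(remaining)
```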
SpaceFieldMappings (list) --
Maps attributes or field names of Confluence spaces to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the SpaceFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Maps attributes or field names of Confluence spaces to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
PageConfiguration (dict) --
Configuration information for indexing Confluence pages.
PageFieldMappings (list) --
Maps attributes or field names of Confluence pages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the PageFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Maps attributes or field names of Confluence pages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
BlogConfiguration (dict) --
Configuration information for indexing Confluence blogs.
BlogFieldMappings (list) --
Maps attributes or field names of Confluence blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the BlogFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Maps attributes or field names of Confluence blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
DataSourceFieldName (string) --
The name of the field in the data source.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
AttachmentConfiguration (dict) --
Configuration information for indexing attachments to Confluence blogs and pages.
CrawlAttachments (boolean) --
TRUE to index attachments of pages and blogs in Confluence.
AttachmentFieldMappings (list) --
Maps attributes or field names of Confluence attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the AttachmentFieldMappings parameter, you must specify at least one field mapping.
(dict) --
Maps attributes or field names of Confluence attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
DataSourceFieldName (string) --
The name of the field in the data source.
You must first create the index field using the UpdateIndex API.
DateFieldFormat (string) --
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
IndexFieldName (string) --
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud to connect to your Confluence. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
InclusionPatterns (list) --
A list of regular expression patterns to include certain blog posts, pages, spaces, or attachments in your Confluence. Content that matches the patterns is included in the index. Content that doesn't match the patterns is excluded from the index. If content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the content isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain blog posts, pages, spaces, or attachments in your Confluence. Content that matches the patterns is excluded from the index. Content that doesn't match the patterns is included in the index. If content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the content isn't included in the index.
ProxyConfiguration (dict) --
Configuration information to connect to your Confluence URL instance via a web proxy. You can use this option for Confluence Server.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication of user name and password. To store web proxy credentials, you use a secret in Secrets Manager.
It is recommended that you follow best security practices when configuring your web proxy. This includes setting up throttling, setting up logging and monitoring, and applying security patches on a regular basis. If you use your web proxy with multiple data sources, sync jobs that occur at the same time could strain the load on your proxy. It is recommended you prepare your proxy beforehand for any security and load requirements.
Host (string) --
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
Port (integer) --
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Credentials (string) --
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
AuthenticationType (string) --
Whether you want to connect to Confluence using basic authentication of user name and password, or a personal access token. You can use a personal access token for Confluence Server.
GoogleDriveConfiguration (dict) --
Provides the configuration information to connect to Google Drive as your data source.
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the credentials required to connect to Google Drive. For more information, see Using a Google Workspace Drive data source.
InclusionPatterns (list) --
A list of regular expression patterns to include certain items in your Google Drive, including shared drives and users' My Drives. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain items in your Google Drive, including shared drives and users' My Drives. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
FieldMappings (list) --
Maps Google Drive data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Google Drive fields. For more information, see Mapping data source fields. The Google Drive data source field names must exist in your Google Drive custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ExcludeMimeTypes (list) --
A list of MIME types to exclude from the index. All documents matching the specified MIME type are excluded.
For a list of MIME types, see Using a Google Workspace Drive data source.
ExcludeUserAccounts (list) --
A list of email addresses of the users. Documents owned by these users are excluded from the index. Documents shared with excluded users are indexed unless they are excluded in another way.
ExcludeSharedDrives (list) --
A list of identifiers of shared drives to exclude from the index. All files and folders stored on the shared drive are excluded.
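The precedence rule shared by InclusionPatterns and ExclusionPatterns (exclusion wins when both match) can be tried out locally before a sync job runs. The sketch below is an approximation, not the service's matcher: the helper is_indexed is hypothetical, it uses Python's re.search rather than whatever regex dialect and anchoring Amazon Kendra applies, and it assumes that an empty inclusion list includes everything.

```python
import re


def is_indexed(name, inclusion_patterns=(), exclusion_patterns=()):
    """Predict whether an item name survives the include/exclude patterns."""
    # Exclusion takes precedence, so check it first.
    if any(re.search(pattern, name) for pattern in exclusion_patterns):
        return False
    if inclusion_patterns:
        return any(re.search(pattern, name) for pattern in inclusion_patterns)
    # Assumption: with no inclusion patterns, every item is a candidate.
    return True


include = [r"\.pdf$"]
exclude = [r"^draft"]

results = {
    name: is_indexed(name, include, exclude)
    for name in ["report.pdf", "draft-report.pdf", "notes.txt"]
}
print(results)
```

Here "draft-report.pdf" matches both lists and is therefore left out, while "notes.txt" is left out because it matches no inclusion pattern.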
WebCrawlerConfiguration (dict) --
Provides the configuration information required for Amazon Kendra Web Crawler.
Urls (dict) --
Specifies the seed or starting point URLs of the websites or the sitemap URLs of the websites you want to crawl.
You can include website subdomains. You can list up to 100 seed URLs and up to three sitemap URLs.
You can only crawl websites that use the secure communication protocol, Hypertext Transfer Protocol Secure (HTTPS). If you receive an error when crawling a website, it could be that the website is blocked from crawling.
When selecting websites to index, you must adhere to the Amazon Acceptable Use Policy and all other Amazon terms. Remember that you must only use Amazon Kendra Web Crawler to index your own webpages, or webpages that you have authorization to index.
SeedUrlConfiguration (dict) --
Configuration of the seed or starting point URLs of the websites you want to crawl.
You can choose to crawl only the website host names, or the website host names with subdomains, or the website host names with subdomains and other domains that the webpages link to.
You can list up to 100 seed URLs.
SeedUrls (list) --
The list of seed or starting point URLs of the websites you want to crawl.
The list can include a maximum of 100 seed URLs.
WebCrawlerMode (string) --
You can choose one of the following modes:
HOST_ONLY – crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.
SUBDOMAINS – crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.
EVERYTHING – crawl the website host names with subdomains and other domains that the webpages link to.
The default mode is set to HOST_ONLY.
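As a rough illustration of how the three modes differ, the sketch below decides whether a URL's host falls in scope for a seed host. The function `in_scope` is hypothetical and only mirrors the examples given above. Treating EVERYTHING as "always in scope" is a simplification that assumes the URL was reached by following links from a crawled page.

```python
from urllib.parse import urlsplit


def in_scope(url, seed_host, mode="HOST_ONLY"):
    """Approximate the documented WebCrawlerMode scoping for one URL."""
    host = urlsplit(url).hostname or ""
    if mode == "HOST_ONLY":
        return host == seed_host
    if mode == "SUBDOMAINS":
        # The seed host itself, or any host ending in ".<seed host>".
        return host == seed_host or host.endswith("." + seed_host)
    if mode == "EVERYTHING":
        # Simplification: any linked domain is accepted.
        return True
    raise ValueError(f"unknown WebCrawlerMode: {mode}")


seed = "abc.example.com"
print(in_scope("https://abc.example.com/docs", seed, "HOST_ONLY"))
print(in_scope("https://a.abc.example.com/docs", seed, "HOST_ONLY"))
print(in_scope("https://a.abc.example.com/docs", seed, "SUBDOMAINS"))
```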
SiteMapsConfiguration (dict) --
Configuration of the sitemap URLs of the websites you want to crawl.
Only URLs belonging to the same website host names are crawled. You can list up to three sitemap URLs.
SiteMaps (list) --
The list of sitemap URLs of the websites you want to crawl.
The list can include a maximum of three sitemap URLs.
CrawlDepth (integer) --
Specifies the number of levels in a website that you want to crawl.
The first level begins from the website seed or starting point URL. For example, if a website has 3 levels – index level (i.e. seed in this example), sections level, and subsections level – and you are only interested in crawling information up to the sections level (i.e. levels 0-1), you can set your depth to 1.
The default crawl depth is set to 2.
MaxLinksPerPage (integer) --
The maximum number of URLs on a webpage to include when crawling a website. This number is per webpage.
As a website’s webpages are crawled, any URLs the webpages link to are also crawled. URLs on a webpage are crawled in order of appearance.
The default maximum links per page is 100.
MaxContentSizePerPageInMegaBytes (float) --
The maximum size (in MB) of a webpage or attachment to crawl.
Files larger than this size (in MB) are not crawled.
The default maximum size of a webpage or attachment is set to 50 MB.
MaxUrlsPerMinuteCrawlRate (integer) --
The maximum number of URLs crawled per website host per minute.
A minimum of one URL is required.
The default maximum number of URLs crawled per website host per minute is 300.
UrlInclusionPatterns (list) --
A list of regular expression patterns to include certain URLs to crawl. URLs that match the patterns are included in the index. URLs that don't match the patterns are excluded from the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL isn't included in the index.
UrlExclusionPatterns (list) --
A list of regular expression patterns to exclude certain URLs to crawl. URLs that match the patterns are excluded from the index. URLs that don't match the patterns are included in the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL isn't included in the index.
ProxyConfiguration (dict) --
Configuration information required to connect to your internal websites via a web proxy.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication. To store web proxy credentials, you use a secret in Secrets Manager.
Host (string) --
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
Port (integer) --
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Credentials (string) --
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
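The host name and port examples above can be reproduced with the standard library. This is a minimal sketch: `host_and_port` is a hypothetical helper, and falling back to 443 or 80 when the URL carries no explicit port is an assumption based on the standard ports for each scheme.

```python
from urllib.parse import urlsplit

# Standard default ports, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"https": 443, "http": 80}


def host_and_port(url):
    """Split a URL into the Host and Port values described above."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.hostname, port


host, port = host_and_port("https://a.example.com/page1.html")
# Credentials is optional, so it is left out of this sketch.
proxy_configuration = {"Host": host, "Port": port}
print(proxy_configuration)
```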
AuthenticationConfiguration (dict) --
Configuration information required to connect to websites using authentication.
You can connect to websites using basic authentication of user name and password. You use a secret in Secrets Manager to store your authentication credentials.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
BasicAuthentication (list) --
The list of configuration information that's required to connect to and crawl a website host using basic authentication credentials.
The list includes the name and port number of the website host.
(dict) --
Provides the configuration information to connect to websites that require basic user authentication.
Host (string) --
The name of the website host you want to connect to using authentication credentials.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
Port (integer) --
The port number of the website host you want to connect to using authentication credentials.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Credentials (string) --
Your secret ARN, which you can create in Secrets Manager.
You use a secret if basic authentication credentials are required to connect to a website. The secret stores your credentials of user name and password.
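Putting the web crawler fields together, the following sketch assembles a WebCrawlerConfiguration dictionary and checks the limits stated above (at most 100 seed URLs, at most three sitemap URLs, HTTPS only) before anything is sent to the service. The builder function and the example URLs are hypothetical. The numeric values simply restate the documented defaults.

```python
def build_web_crawler_configuration(seed_urls, site_maps=(), mode="HOST_ONLY", crawl_depth=2):
    """Assemble a WebCrawlerConfiguration dict, enforcing the documented limits."""
    if len(seed_urls) > 100:
        raise ValueError("at most 100 seed URLs are allowed")
    if len(site_maps) > 3:
        raise ValueError("at most three sitemap URLs are allowed")
    for url in [*seed_urls, *site_maps]:
        if not url.startswith("https://"):
            raise ValueError(f"only HTTPS websites can be crawled: {url}")

    urls = {}
    if seed_urls:
        urls["SeedUrlConfiguration"] = {"SeedUrls": list(seed_urls), "WebCrawlerMode": mode}
    if site_maps:
        urls["SiteMapsConfiguration"] = {"SiteMaps": list(site_maps)}

    return {
        "Urls": urls,
        "CrawlDepth": crawl_depth,                 # documented default: 2
        "MaxLinksPerPage": 100,                    # documented default
        "MaxContentSizePerPageInMegaBytes": 50.0,  # documented default
        "MaxUrlsPerMinuteCrawlRate": 300,          # documented default
    }


config = build_web_crawler_configuration(
    ["https://abc.example.com"],
    site_maps=["https://abc.example.com/sitemap.xml"],
    mode="SUBDOMAINS",
)
print(config["Urls"])
```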
WorkDocsConfiguration (dict) --
Provides the configuration information to connect to Amazon WorkDocs as your data source.
OrganizationId (string) --
The identifier of the directory corresponding to your Amazon WorkDocs site repository.
You can find the organization ID in the Directory Service by going to Active Directory, then Directories. Your Amazon WorkDocs site directory has an ID, which is the organization ID. You can also set up a new Amazon WorkDocs directory in the Directory Service console and enable an Amazon WorkDocs site for the directory in the Amazon WorkDocs console.
CrawlComments (boolean) --
TRUE to include comments on documents in your index. Including comments in your index means each comment is a document that can be searched on.
The default is set to FALSE.
UseChangeLog (boolean) --
TRUE to use the Amazon WorkDocs change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Amazon WorkDocs.
InclusionPatterns (list) --
A list of regular expression patterns to include certain files in your Amazon WorkDocs site repository. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain files in your Amazon WorkDocs site repository. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Amazon WorkDocs data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Amazon WorkDocs fields. For more information, see Mapping data source fields. The Amazon WorkDocs data source field names must exist in your Amazon WorkDocs custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
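A WorkDocsConfiguration with field mappings might be assembled as in the sketch below. The helper `field_mappings`, the organization ID, and the field names are all hypothetical placeholders. Only the dictionary keys come from the parameter list above.

```python
def field_mappings(pairs, date_formats=None):
    """Build a FieldMappings list from {data_source_field: index_field}."""
    date_formats = date_formats or {}
    mappings = []
    for source_name, index_name in pairs.items():
        mapping = {"DataSourceFieldName": source_name, "IndexFieldName": index_name}
        # DateFieldFormat is only attached when the caller supplies one.
        if source_name in date_formats:
            mapping["DateFieldFormat"] = date_formats[source_name]
        mappings.append(mapping)
    return mappings


work_docs_configuration = {
    "OrganizationId": "d-0123456789",   # hypothetical directory ID
    "CrawlComments": False,              # documented default
    "UseChangeLog": True,
    "ExclusionPatterns": [r".*\.tmp$"],  # hypothetical pattern
    "FieldMappings": field_mappings({"owner": "workdocs_owner"}),  # hypothetical names
}
print(work_docs_configuration["FieldMappings"])
```

The index field named in each mapping is expected to exist already, since the text above requires creating custom fields with UpdateIndex first.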
FsxConfiguration (dict) --
Provides the configuration information to connect to Amazon FSx as your data source.
FileSystemId (string) --
The identifier of the Amazon FSx file system.
You can find your file system ID on the file system dashboard in the Amazon FSx console. For information on how to create a file system in Amazon FSx console, using Windows File Server as an example, see Amazon FSx Getting started guide.
FileSystemType (string) --
The Amazon FSx file system type. Windows is currently the only supported type.
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud to connect to your Amazon FSx. Your Amazon FSx instance must reside inside your VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Amazon FSx file system. Windows is currently the only supported type. The secret must contain a JSON structure with the following keys:
InclusionPatterns (list) --
A list of regular expression patterns to include certain files in your Amazon FSx file system. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain files in your Amazon FSx file system. Files that match the patterns are excluded from the index. Files that don't match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Amazon FSx data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Amazon FSx fields. For more information, see Mapping data source fields. The Amazon FSx data source field names must exist in your Amazon FSx custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
SlackConfiguration (dict) --
Provides the configuration information to connect to Slack as your data source.
TeamId (string) --
The identifier of the team in the Slack workspace. For example, T0123456789.
You can find your team ID in the URL of the main page of your Slack workspace. When you log in to Slack via a browser, you are directed to the URL of the main page. For example, https://app.slack.com/client/T0123456789/....
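Pulling the team ID out of a workspace URL can be sketched as below. The helper `slack_team_id` is hypothetical, and the pattern assumes that team IDs look like the example above: a capital T followed by uppercase letters and digits, directly after `/client/`.

```python
import re

# Assumption: the team ID is the path segment after /client/ and starts with "T".
TEAM_ID_IN_URL = re.compile(r"/client/(T[A-Z0-9]+)")


def slack_team_id(workspace_url):
    """Extract the team ID from a Slack workspace main-page URL."""
    match = TEAM_ID_IN_URL.search(workspace_url)
    if match is None:
        raise ValueError(f"no team ID found in {workspace_url!r}")
    return match.group(1)


print(slack_team_id("https://app.slack.com/client/T0123456789/C0000000000"))
```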
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Slack workspace team. The secret must contain a JSON structure with the following keys:
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud to connect to your Slack. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
SlackEntityList (list) --
Specify whether to index public channels, private channels, group messages, and direct messages. You can specify one or more of these options.
UseChangeLog (boolean) --
TRUE to use the Slack change log to determine which documents require updating in the index. Depending on the Slack change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Slack.
CrawlBotMessage (boolean) --
TRUE to index bot messages from your Slack workspace team.
ExcludeArchived (boolean) --
TRUE to exclude archived messages from the index for your Slack workspace team.
SinceCrawlDate (string) --
The date to start crawling your data from your Slack workspace team. The date must follow this format: yyyy-mm-dd.
LookBackPeriod (integer) --
The number of hours for the change log to look back from when you last synchronized your data. You can look back up to 7 days or 168 hours.
The change log updates your index only if new content was added since you last synced your data. Updated or deleted content from before you last synced does not get updated in your index. To capture updated or deleted content before you last synced, set the LookBackPeriod to the number of hours you want the change log to look back.
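The two time-related Slack settings can be prepared with small helpers like the ones below. Both function names are hypothetical. The sketch reads "yyyy-mm-dd" as year, month, day, and enforces the 168-hour ceiling stated above.

```python
from datetime import date, timedelta

MAX_LOOK_BACK_HOURS = 168  # 7 days, the documented upper limit


def look_back_hours(days):
    """Convert a number of days into a LookBackPeriod value in hours."""
    hours = int(timedelta(days=days).total_seconds() // 3600)
    if not 0 <= hours <= MAX_LOOK_BACK_HOURS:
        raise ValueError(f"LookBackPeriod must be between 0 and {MAX_LOOK_BACK_HOURS} hours")
    return hours


def since_crawl_date(day):
    """Format a date as yyyy-mm-dd for SinceCrawlDate."""
    return day.strftime("%Y-%m-%d")


slack_time_settings = {
    "SinceCrawlDate": since_crawl_date(date(2023, 1, 5)),
    "LookBackPeriod": look_back_hours(2),
}
print(slack_time_settings)
```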
PrivateChannelFilter (list) --
The list of private channel names from your Slack workspace team. You use this if you want to index specific private channels, not all private channels. You can also use regular expression patterns to filter private channels.
PublicChannelFilter (list) --
The list of public channel names to index from your Slack workspace team. You use this if you want to index specific public channels, not all public channels. You can also use regular expression patterns to filter public channels.
InclusionPatterns (list) --
A list of regular expression patterns to include certain attached files in your Slack workspace team. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain attached files in your Slack workspace team. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
FieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map Slack data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Slack fields. For more information, see Mapping data source fields. The Slack data source field names must exist in your Slack custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
BoxConfiguration (dict) --
Provides the configuration information to connect to Box as your data source.
EnterpriseId (string) --
The identifier of the Box Enterprise platform. You can find the enterprise ID in the Box Developer Console settings or when you create an app in Box and download your authentication credentials. For example, 801234567.
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Box platform. The secret must contain a JSON structure with the following keys:
You create an application in Box to generate the keys or credentials required for the secret. For more information, see Using a Box data source.
UseChangeLog (boolean) --
TRUE to use the Box change log to determine which documents require updating in the index. Depending on the data source change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents.
CrawlComments (boolean) --
TRUE to index comments.
CrawlTasks (boolean) --
TRUE to index the contents of tasks.
CrawlWebLinks (boolean) --
TRUE to index web links.
FileFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box files to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
TaskFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box tasks to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
CommentFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
WebLinkFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box web links to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
InclusionPatterns (list) --
A list of regular expression patterns to include certain files and folders in your Box platform. Files and folders that match the patterns are included in the index. Files and folders that don't match the patterns are excluded from the index. If a file or folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file or folder isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain files and folders from your Box platform. Files and folders that match the patterns are excluded from the index. Files and folders that don't match the patterns are included in the index. If a file or folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file or folder isn't included in the index.
VpcConfiguration (dict) --
Configuration information for an Amazon VPC to connect to your Box. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
QuipConfiguration (dict) --
Provides the configuration information to connect to Quip as your data source.
Domain (string) --
The Quip site domain. For example, https://quip-company.quipdomain.com/browse. The domain in this example is "quipdomain".
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs that are required to connect to your Quip. The secret must contain a JSON structure with the following keys:
CrawlFileComments (boolean) --
TRUE to index file comments.
CrawlChatRooms (boolean) --
TRUE to index the contents of chat rooms.
CrawlAttachments (boolean) --
TRUE to index attachments.
FolderIds (list) --
The identifiers of the Quip folders you want to index. You can find the folder ID in your browser URL when you access your folder in Quip. For example, https://quip-company.quipdomain.com/zlLuOVNSarTL/folder-name. The folder ID in this example is "zlLuOVNSarTL".
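Reading the folder ID out of a browser URL can be sketched as follows. The helper `quip_folder_id` is hypothetical, and it assumes the folder ID is always the first path segment, as in the example URL above.

```python
from urllib.parse import urlsplit


def quip_folder_id(folder_url):
    """Return the first path segment of a Quip folder URL as the folder ID."""
    segments = [s for s in urlsplit(folder_url).path.split("/") if s]
    if not segments:
        raise ValueError("URL has no path segment to use as a folder ID")
    return segments[0]


folder_ids = [quip_folder_id("https://quip-company.quipdomain.com/zlLuOVNSarTL/folder-name")]
print(folder_ids)
```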
ThreadFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Quip threads to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
MessageFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Quip messages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
AttachmentFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Quip attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
InclusionPatterns (list) --
A list of regular expression patterns to include certain files in your Quip file system. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence, and the file isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain files in your Quip file system. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence, and the file isn't included in the index.
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud (VPC) to connect to your Quip. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
JiraConfiguration (dict) --
Provides the configuration information to connect to Jira as your data source.
JiraAccountUrl (string) --
The URL of the Jira account. For example, company.atlassian.net or https://jira.company.com. You can find your Jira account URL in the URL of your profile page for Jira desktop.
SecretArn (string) --
The Amazon Resource Name (ARN) of a secret in Secrets Manager that contains the key-value pairs required to connect to your Jira data source. The secret must contain a JSON structure with the following keys:
UseChangeLog (boolean) --
TRUE to use the Jira change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Jira.
Project (list) --
Specify which projects to crawl in your Jira data source. You can specify one or more Jira project IDs.
IssueType (list) --
Specify which issue types to crawl in your Jira data source. You can specify one or more of these options to crawl.
Status (list) --
Specify which statuses to crawl in your Jira data source. You can specify one or more of these options to crawl.
IssueSubEntityFilter (list) --
Specify whether to crawl comments, attachments, and work logs. You can specify one or more of these options.
AttachmentFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
CommentFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
IssueFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira issues to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
ProjectFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira projects to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
WorkLogFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira work logs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
InclusionPatterns (list) --
A list of regular expression patterns to include certain file paths, file names, and file types in your Jira data source. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain file paths, file names, and file types in your Jira data source. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud to connect to your Jira. Your Jira account must reside inside your VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
GitHubConfiguration (dict) --
Provides the configuration information to connect to GitHub as your data source.
SaaSConfiguration (dict) --
Configuration information to connect to GitHub Enterprise Cloud (SaaS).
OrganizationName (string) --
The name of the organization of the GitHub Enterprise Cloud (SaaS) account you want to connect to. You can find your organization name by logging into GitHub desktop and selecting Your organizations under your profile picture dropdown.
HostUrl (string) --
The GitHub host URL or API endpoint URL. For example, https://api.github.com.
OnPremiseConfiguration (dict) --
Configuration information to connect to GitHub Enterprise Server (on premises).
HostUrl (string) --
The GitHub host URL or API endpoint URL. For example, https://on-prem-host-url/api/v3/
OrganizationName (string) --
The name of the organization of the GitHub Enterprise Server (on premises) account you want to connect to. You can find your organization name by logging into GitHub desktop and selecting Your organizations under your profile picture dropdown.
SslCertificateS3Path (dict) --
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to GitHub if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
Type (string) --
The type of GitHub service you want to connect to—GitHub Enterprise Cloud (SaaS) or GitHub Enterprise Server (on premises).
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your GitHub. The secret must contain a JSON structure with the following keys:
UseChangeLog (boolean) --
TRUE to use the GitHub change log to determine which documents require updating in the index. Depending on the GitHub change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in GitHub.
GitHubDocumentCrawlProperties (dict) --
Configuration information to include certain types of GitHub content. You can configure to index repository files only, or also include issues and pull requests, comments, and comment attachments.
CrawlRepositoryDocuments (boolean) --
TRUE to index all files within a repository.
CrawlIssue (boolean) --
TRUE to index all issues within a repository.
CrawlIssueComment (boolean) --
TRUE to index all comments on issues.
CrawlIssueCommentAttachment (boolean) --
TRUE to include all comment attachments for issues.
CrawlPullRequest (boolean) --
TRUE to index all pull requests within a repository.
CrawlPullRequestComment (boolean) --
TRUE to index all comments on pull requests.
CrawlPullRequestCommentAttachment (boolean) --
TRUE to include all comment attachments for pull requests.
RepositoryFilter (list) --
A list of names of the specific repositories you want to index.
InclusionFolderNamePatterns (list) --
A list of regular expression patterns to include certain folder names in your GitHub repository or repositories. Folder names that match the patterns are included in the index. Folder names that don't match the patterns are excluded from the index. If a folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the folder isn't included in the index.
InclusionFileTypePatterns (list) --
A list of regular expression patterns to include certain file types in your GitHub repository or repositories. File types that match the patterns are included in the index. File types that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
InclusionFileNamePatterns (list) --
A list of regular expression patterns to include certain file names in your GitHub repository or repositories. File names that match the patterns are included in the index. File names that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
ExclusionFolderNamePatterns (list) --
A list of regular expression patterns to exclude certain folder names in your GitHub repository or repositories. Folder names that match the patterns are excluded from the index. Folder names that don't match the patterns are included in the index. If a folder matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the folder isn't included in the index.
ExclusionFileTypePatterns (list) --
A list of regular expression patterns to exclude certain file types in your GitHub repository or repositories. File types that match the patterns are excluded from the index. File types that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
ExclusionFileNamePatterns (list) --
A list of regular expression patterns to exclude certain file names in your GitHub repository or repositories. File names that match the patterns are excluded from the index. File names that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
VpcConfiguration (dict) --
Configuration information of an Amazon Virtual Private Cloud to connect to your GitHub. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
GitHubRepositoryConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map GitHub repository attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
GitHubCommitConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub commits to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
GitHubIssueDocumentConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub issues to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
GitHubIssueCommentConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub issue comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
GitHubIssueAttachmentConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub issue attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
GitHubPullRequestCommentConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub pull request comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
GitHubPullRequestDocumentConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub pull requests to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
GitHubPullRequestDocumentAttachmentConfigurationFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub pull request attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
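As a rough illustration of how the GitHubDocumentCrawlProperties flags might be read back, the sketch below lists which content types are enabled. The helper name enabled_github_content is hypothetical, and it takes the GitHubConfiguration dict directly rather than assuming where that dict sits in a full response.

```python
# Flag names as listed under GitHubDocumentCrawlProperties above.
CRAWL_FLAGS = (
    "CrawlRepositoryDocuments",
    "CrawlIssue",
    "CrawlIssueComment",
    "CrawlIssueCommentAttachment",
    "CrawlPullRequest",
    "CrawlPullRequestComment",
    "CrawlPullRequestCommentAttachment",
)


def enabled_github_content(github_configuration):
    """Return the crawl flags set to True in a GitHubConfiguration dict."""
    # A missing properties dict or a missing flag is treated as "not enabled".
    properties = github_configuration.get("GitHubDocumentCrawlProperties", {})
    return [flag for flag in CRAWL_FLAGS if properties.get(flag)]
```

Treating absent flags as disabled is a choice made for this sketch; the service's own defaults are not stated in this section.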
AlfrescoConfiguration (dict) --
Provides the configuration information to connect to Alfresco as your data source.
SiteUrl (string) --
The URL of the Alfresco site. For example, https://hostname:8080.
SiteId (string) --
The identifier of the Alfresco site. For example, my-site.
SecretArn (string) --
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Alfresco data source. The secret must contain a JSON structure with the following keys:
SslCertificateS3Path (dict) --
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to Alfresco if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
CrawlSystemFolders (boolean) --
TRUE to index shared files.
CrawlComments (boolean) --
TRUE to index comments of blogs and other content.
EntityFilter (list) --
Specify whether to index document libraries, wikis, or blogs. You can specify one or more of these options.
DocumentLibraryFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Alfresco document libraries to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
BlogFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Alfresco blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
WikiFieldMappings (list) --
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Alfresco wikis to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
(dict) --
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
DataSourceFieldName (string) --
The name of the column or attribute in the data source.
DateFieldFormat (string) --
The type of data stored in the column or attribute.
IndexFieldName (string) --
The name of the field in the index.
InclusionPatterns (list) --
A list of regular expression patterns to include certain files in your Alfresco data source. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
ExclusionPatterns (list) --
A list of regular expression patterns to exclude certain files in your Alfresco data source. Files that match the patterns are excluded from the index. Files that don't match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud to connect to your Alfresco. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
TemplateConfiguration (dict) --
Provides a template for the configuration information to connect to your data source.
Template (document) --
The template schema used for the data source, where template schemas are supported.
VpcConfiguration (dict) --
Configuration information for an Amazon Virtual Private Cloud to connect to your data source. For more information, see Configuring a VPC.
SubnetIds (list) --
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
SecurityGroupIds (list) --
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
CreatedAt (datetime) --
The Unix timestamp of when the data source connector was created.
UpdatedAt (datetime) --
The Unix timestamp of when the data source connector was last updated.
Description (string) --
The description for the data source connector.
Status (string) --
The current status of the data source connector. When the status is ACTIVE, the data source is ready to use. When the status is FAILED, the ErrorMessage field contains the reason that the data source failed.
Schedule (string) --
The schedule for Amazon Kendra to update the index.
RoleArn (string) --
The Amazon Resource Name (ARN) of the role with permission to access the data source and required resources.
ErrorMessage (string) --
When the Status field value is FAILED, the ErrorMessage field contains a description of the error that caused the data source to fail.
LanguageCode (string) --
The code for a language. This shows a supported language for all documents in the data source. English is supported by default. For more information on supported languages, including their codes, see Adding documents in languages other than English.
CustomDocumentEnrichmentConfiguration (dict) --
Configuration information for altering document metadata and content during the document ingestion process when you describe a data source.
For more information on how to create, modify and delete document metadata, or make other content alterations when you ingest documents into Amazon Kendra, see Customizing document metadata during the ingestion process.
InlineConfigurations (list) --
Configuration information to alter document attributes or metadata fields and content when ingesting documents into Amazon Kendra.
(dict) --
Provides the configuration information for applying basic logic to alter document metadata and content when ingesting documents into Amazon Kendra. To apply advanced logic that goes beyond what basic logic allows, see HookConfiguration.
For more information, see Customizing document metadata during the ingestion process.
Condition (dict) --
Configuration of the condition used for the target document attribute or metadata field when ingesting documents into Amazon Kendra.
ConditionDocumentAttributeKey (string) --
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
Operator (string) --
The condition operator.
For example, you can use 'Contains' to partially match a string.
ConditionOnValue (dict) --
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
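The time zone requirement for DateValue can be checked before a value is sent. The sketch below uses Python's standard datetime.fromisoformat; the helper name parse_date_value is hypothetical, and rejecting offset-free values is a local safeguard for illustration, not a documented service behavior.

```python
from datetime import datetime


def parse_date_value(text):
    """Parse an ISO 8601 string and reject values without a UTC offset."""
    parsed = datetime.fromisoformat(text)
    # A naive datetime (no offset) is ambiguous, so refuse it up front.
    if parsed.utcoffset() is None:
        raise ValueError(f"{text!r} has no time zone offset")
    return parsed


# The example value from the description above carries a +01:00 offset.
example = parse_date_value("2012-03-25T12:30:10+01:00")
```

The returned object is a timezone-aware datetime, which matches the (datetime) type shown for DateValue.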
Target (dict) --
Configuration of the target document attribute or metadata field when ingesting documents into Amazon Kendra. You can also include a value.
TargetDocumentAttributeKey (string) --
The identifier of the target document attribute or metadata field.
For example, 'Department' could be an identifier for the target attribute or metadata field that includes the department names associated with the documents.
TargetDocumentAttributeValueDeletion (boolean) --
TRUE to delete the existing target value for your specified target attribute key. You cannot create a target value and set this to TRUE. To create a target value (TargetDocumentAttributeValue), set this to FALSE.
TargetDocumentAttributeValue (dict) --
The target value you want to create for the target attribute.
For example, 'Finance' could be the target value for the target attribute key 'Department'.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
DocumentContentDeletion (boolean) --
TRUE to delete content if the condition used for the target attribute is met.
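The shape of one InlineConfigurations entry, and two of the constraints stated above, can be sketched as follows. The function build_inline_configuration is a hypothetical helper, it only covers string values, and the checks are local conveniences rather than the service's own validation.

```python
def build_inline_configuration(condition_key, operator, condition_value,
                               target_key, target_value=None,
                               delete_target_value=False):
    """Build one InlineConfigurations entry using string values only."""
    # The description above says _document_body is not supported as a condition key.
    if condition_key == "_document_body":
        raise ValueError("_document_body cannot be used as a condition key")
    # A target value cannot be created while value deletion is set to TRUE.
    if delete_target_value and target_value is not None:
        raise ValueError("cannot set a target value and delete it at the same time")

    target = {
        "TargetDocumentAttributeKey": target_key,
        "TargetDocumentAttributeValueDeletion": delete_target_value,
    }
    if target_value is not None:
        target["TargetDocumentAttributeValue"] = {"StringValue": target_value}

    return {
        "Condition": {
            "ConditionDocumentAttributeKey": condition_key,
            "Operator": operator,
            "ConditionOnValue": {"StringValue": condition_value},
        },
        "Target": target,
    }
```

For instance, build_inline_configuration("Source_URI", "Contains", "financial", "Department", "Finance") mirrors the examples given in the field descriptions: documents whose Source_URI contains "financial" get the Department value "Finance".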
PreExtractionHookConfiguration (dict) --
Configuration information for invoking a Lambda function on the original or raw documents before extracting their metadata and text. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
InvocationCondition (dict) --
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
ConditionDocumentAttributeKey (string) --
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
Operator (string) --
The condition operator.
For example, you can use 'Contains' to partially match a string.
ConditionOnValue (dict) --
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
LambdaArn (string) --
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
S3Bucket (string) --
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
PostExtractionHookConfiguration (dict) --
Configuration information for invoking a Lambda function on the structured documents with their metadata and text extracted. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
InvocationCondition (dict) --
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
ConditionDocumentAttributeKey (string) --
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
Operator (string) --
The condition operator.
For example, you can use 'Contains' to partially match a string.
ConditionOnValue (dict) --
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
LambdaArn (string) --
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
S3Bucket (string) --
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
RoleArn (string) --
The Amazon Resource Name (ARN) of a role with permission to run PreExtractionHookConfiguration and PostExtractionHookConfiguration for altering document metadata and content during the document ingestion process. For more information, see IAM roles for Amazon Kendra.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
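The Status and ErrorMessage fields described above lend themselves to a small check on the response dict. The helper name data_source_health and the wording of the returned strings are hypothetical; only the ACTIVE and FAILED semantics come from the field descriptions.

```python
def data_source_health(response):
    """Summarize the Status and ErrorMessage fields of a describe_data_source response."""
    status = response.get("Status")
    if status == "FAILED":
        # ErrorMessage explains the failure when Status is FAILED.
        message = response.get("ErrorMessage", "no error message returned")
        return f"FAILED: {message}"
    if status == "ACTIVE":
        return "ACTIVE: ready to use"
    # Any other status is reported as-is without interpretation.
    return f"{status}: not ready yet"
```

In practice the response would come from client.describe_data_source, and the call could be wrapped in a try/except for the exceptions listed above, such as client.exceptions.ResourceNotFoundException.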
describe_experience(**kwargs)
Gets information about your Amazon Kendra experience, such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.describe_experience(
Id='string',
IndexId='string'
)
Parameters
Id (string) --
[REQUIRED]
The identifier of the Amazon Kendra experience you want to get information on.
IndexId (string) --
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
Return type
dict
Response Syntax
{
'Id': 'string',
'IndexId': 'string',
'Name': 'string',
'Endpoints': [
{
'EndpointType': 'HOME',
'Endpoint': 'string'
},
],
'Configuration': {
'ContentSourceConfiguration': {
'DataSourceIds': [
'string',
],
'FaqIds': [
'string',
],
'DirectPutContent': True|False
},
'UserIdentityConfiguration': {
'IdentityAttributeName': 'string'
}
},
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'Description': 'string',
'Status': 'CREATING'|'ACTIVE'|'DELETING'|'FAILED',
'RoleArn': 'string',
'ErrorMessage': 'string'
}
Response Structure
(dict) --
Id (string) --
Shows the identifier of your Amazon Kendra experience.
IndexId (string) --
Shows the identifier of the index for your Amazon Kendra experience.
Name (string) --
Shows the name of your Amazon Kendra experience.
Endpoints (list) --
Shows the endpoint URLs for your Amazon Kendra experiences. The URLs are unique and fully hosted by Amazon Web Services.
(dict) --
Provides the configuration information for the endpoint for your Amazon Kendra experience.
EndpointType (string) --
The type of endpoint for your Amazon Kendra experience. The type currently available is HOME, which is a unique and fully hosted URL to the home page of your Amazon Kendra experience.
Endpoint (string) --
The endpoint of your Amazon Kendra experience.
Configuration (dict) --
Shows the configuration information for your Amazon Kendra experience. This includes ContentSourceConfiguration, which specifies the data source IDs and/or FAQ IDs, and UserIdentityConfiguration, which specifies the user or group information to grant access to your Amazon Kendra experience.
ContentSourceConfiguration (dict) --
The identifiers of your data sources and FAQs. Or, you can specify that you want to use documents indexed via the BatchPutDocument API. This is the content you want to use for your Amazon Kendra experience.
DataSourceIds (list) --
The identifiers of the data sources you want to use for your Amazon Kendra experience.
FaqIds (list) --
The identifiers of the FAQs that you want to use for your Amazon Kendra experience.
DirectPutContent (boolean) --
TRUE to use documents you indexed directly using the BatchPutDocument API.
UserIdentityConfiguration (dict) --
The IAM Identity Center field name that contains the identifiers of your users, such as their emails.
IdentityAttributeName (string) --
The IAM Identity Center field name that contains the identifiers of your users, such as their emails. This is used for user context filtering and for granting access to your Amazon Kendra experience. You must set up IAM Identity Center with Amazon Kendra. You must include your users and groups in your Access Control List when you ingest documents into your index. For more information, see Getting started with an IAM Identity Center identity source.
CreatedAt (datetime) --
Shows the date-time your Amazon Kendra experience was created.
UpdatedAt (datetime) --
Shows the date-time your Amazon Kendra experience was last updated.
Description (string) --
Shows the description for your Amazon Kendra experience.
Status (string) --
The current processing status of your Amazon Kendra experience. When the status is ACTIVE, your Amazon Kendra experience is ready to use. When the status is FAILED, the ErrorMessage field contains the reason that this failed.
RoleArn (string) --
Shows the Amazon Resource Name (ARN) of a role with permission to access the Query API, the QuerySuggestions API, the SubmitFeedback API, and the IAM Identity Center that stores your user and group information.
ErrorMessage (string) --
The reason your Amazon Kendra experience could not properly process.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
describe_faq
(**kwargs)¶Gets information about an FAQ list.
See also: AWS API Documentation
Request Syntax
response = client.describe_faq(
Id='string',
IndexId='string'
)
Id (string) --
[REQUIRED]
The identifier of the FAQ you want to get information on.
IndexId (string) --
[REQUIRED]
The identifier of the index for the FAQ.
dict
Response Syntax
{
'Id': 'string',
'IndexId': 'string',
'Name': 'string',
'Description': 'string',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'S3Path': {
'Bucket': 'string',
'Key': 'string'
},
'Status': 'CREATING'|'UPDATING'|'ACTIVE'|'DELETING'|'FAILED',
'RoleArn': 'string',
'ErrorMessage': 'string',
'FileFormat': 'CSV'|'CSV_WITH_HEADER'|'JSON',
'LanguageCode': 'string'
}
Response Structure
(dict) --
Id (string) --
The identifier of the FAQ.
IndexId (string) --
The identifier of the index for the FAQ.
Name (string) --
The name that you gave the FAQ when it was created.
Description (string) --
The description of the FAQ that you provided when it was created.
CreatedAt (datetime) --
The date and time that the FAQ was created.
UpdatedAt (datetime) --
The date and time that the FAQ was last updated.
S3Path (dict) --
Information required to find a specific file in an Amazon S3 bucket.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
Status (string) --
The status of the FAQ. It is ready to use when the status is ACTIVE
.
RoleArn (string) --
The Amazon Resource Name (ARN) of the role that provides access to the S3 bucket containing the input files for the FAQ.
ErrorMessage (string) --
If the Status
field is FAILED
, the ErrorMessage
field contains the reason why the FAQ failed.
FileFormat (string) --
The file format used by the input files for the FAQ.
LanguageCode (string) --
The code for a language. This shows a supported language for the FAQ document. English is supported by default. For more information on supported languages, including their codes, see Adding documents in languages other than English.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
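The Status and ErrorMessage fields above lend themselves to a simple polling loop. The following is a minimal sketch, not part of the boto3 API: the helper name `wait_for_faq`, the delay, and the attempt limit are all assumptions chosen for illustration. It takes any object with a `describe_faq` method, such as the client returned by `boto3.client('kendra')`.

```python
import time


def wait_for_faq(client, index_id, faq_id, delay=5.0, max_attempts=60, sleep=time.sleep):
    """Poll describe_faq until the FAQ is ACTIVE (hypothetical helper).

    Raises RuntimeError if the FAQ reports FAILED, and TimeoutError if
    it is still not ACTIVE after max_attempts polls.
    """
    for _ in range(max_attempts):
        response = client.describe_faq(Id=faq_id, IndexId=index_id)
        status = response["Status"]
        if status == "ACTIVE":
            return response
        if status == "FAILED":
            # ErrorMessage carries the reason when Status is FAILED.
            raise RuntimeError(response.get("ErrorMessage", "FAQ failed"))
        sleep(delay)
    raise TimeoutError(f"FAQ {faq_id} was not ACTIVE after {max_attempts} attempts")


# Usage (requires AWS credentials and real identifiers):
#   import boto3
#   wait_for_faq(boto3.client("kendra"), "my-index-id", "my-faq-id")
```

The `sleep` argument is injectable only so the loop can be exercised without real waiting.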
describe_index
(**kwargs)¶Gets information about an existing Amazon Kendra index.
See also: AWS API Documentation
Request Syntax
response = client.describe_index(
Id='string'
)
Id (string) --
[REQUIRED]
The identifier of the index you want to get information on.
dict
Response Syntax
{
'Name': 'string',
'Id': 'string',
'Edition': 'DEVELOPER_EDITION'|'ENTERPRISE_EDITION',
'RoleArn': 'string',
'ServerSideEncryptionConfiguration': {
'KmsKeyId': 'string'
},
'Status': 'CREATING'|'ACTIVE'|'DELETING'|'FAILED'|'UPDATING'|'SYSTEM_UPDATING',
'Description': 'string',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'DocumentMetadataConfigurations': [
{
'Name': 'string',
'Type': 'STRING_VALUE'|'STRING_LIST_VALUE'|'LONG_VALUE'|'DATE_VALUE',
'Relevance': {
'Freshness': True|False,
'Importance': 123,
'Duration': 'string',
'RankOrder': 'ASCENDING'|'DESCENDING',
'ValueImportanceMap': {
'string': 123
}
},
'Search': {
'Facetable': True|False,
'Searchable': True|False,
'Displayable': True|False,
'Sortable': True|False
}
},
],
'IndexStatistics': {
'FaqStatistics': {
'IndexedQuestionAnswersCount': 123
},
'TextDocumentStatistics': {
'IndexedTextDocumentsCount': 123,
'IndexedTextBytes': 123
}
},
'ErrorMessage': 'string',
'CapacityUnits': {
'StorageCapacityUnits': 123,
'QueryCapacityUnits': 123
},
'UserTokenConfigurations': [
{
'JwtTokenTypeConfiguration': {
'KeyLocation': 'URL'|'SECRET_MANAGER',
'URL': 'string',
'SecretManagerArn': 'string',
'UserNameAttributeField': 'string',
'GroupAttributeField': 'string',
'Issuer': 'string',
'ClaimRegex': 'string'
},
'JsonTokenTypeConfiguration': {
'UserNameAttributeField': 'string',
'GroupAttributeField': 'string'
}
},
],
'UserContextPolicy': 'ATTRIBUTE_FILTER'|'USER_TOKEN',
'UserGroupResolutionConfiguration': {
'UserGroupResolutionMode': 'AWS_SSO'|'NONE'
}
}
Response Structure
(dict) --
Name (string) --
The name of the index.
Id (string) --
The identifier of the index.
Edition (string) --
The Amazon Kendra edition used for the index. You decide the edition when you create the index.
RoleArn (string) --
The Amazon Resource Name (ARN) of the IAM role that gives Amazon Kendra permission to write to your Amazon CloudWatch logs.
ServerSideEncryptionConfiguration (dict) --
The identifier of the KMS customer master key (CMK) that is used to encrypt your data. Amazon Kendra doesn't support asymmetric CMKs.
KmsKeyId (string) --
The identifier of the KMS key. Amazon Kendra doesn't support asymmetric keys.
Status (string) --
The current status of the index. When the value is ACTIVE, the index is ready for use. If the Status field value is FAILED, the ErrorMessage field contains a message that explains why.
Description (string) --
The description for the index.
CreatedAt (datetime) --
The Unix datetime that the index was created.
UpdatedAt (datetime) --
The Unix datetime that the index was last updated.
DocumentMetadataConfigurations (list) --
Configuration information for document metadata or fields. Document metadata are fields or attributes associated with your documents. For example, the company department name associated with each document.
(dict) --
Specifies the properties, such as relevance tuning and searchability, of an index field.
Name (string) --
The name of the index field.
Type (string) --
The data type of the index field.
Relevance (dict) --
Provides tuning parameters to determine how the field affects the search results.
Freshness (boolean) --
Indicates that this field determines how "fresh" a document is. For example, if document 1 was created on November 5, and document 2 was created on October 31, document 1 is "fresher" than document 2. You can only set the Freshness field on one DATE type field. Only applies to DATE fields.
Importance (integer) --
The relative importance of the field in the search. Larger numbers provide more of a boost than smaller numbers.
Duration (string) --
Specifies the time period that the boost applies to. For example, to make the boost apply to documents with the field value within the last month, you would use "2628000s". Once the field value is beyond the specified range, the effect of the boost drops off. The higher the importance, the faster the effect drops off. If you don't specify a value, the default is 3 months. The value of the field is a numeric string followed by the character "s", for example "86400s" for one day, or "604800s" for one week.
Only applies to DATE fields.
RankOrder (string) --
Determines how values should be interpreted.
When the RankOrder field is ASCENDING, higher numbers are better. For example, a document with a rating score of 10 is higher ranking than a document with a rating score of 1.
When the RankOrder field is DESCENDING, lower numbers are better. For example, in a task tracking application, a priority 1 task is more important than a priority 5 task.
Only applies to LONG and DOUBLE fields.
ValueImportanceMap (dict) --
A list of values that should be given a different boost when they appear in the result list. For example, if you are boosting a field called "department," query terms that match the department field are boosted in the result. However, you can add entries from the department field to boost documents with those values higher.
For example, you can add entries to the map with names of departments. If you add "HR",5 and "Legal",3 those departments are given special attention when they appear in the metadata of a document. When those terms appear they are given the specified importance instead of the regular importance for the boost.
Search (dict) --
Provides information about how the field is used during a search.
Facetable (boolean) --
Indicates that the field can be used to create search facets, a count of results for each value in the field. The default is false.
Searchable (boolean) --
Determines whether the field is used in the search. If the Searchable field is true, you can use relevance tuning to manually tune how Amazon Kendra weights the field in the search. The default is true for string fields and false for number and date fields.
Displayable (boolean) --
Determines whether the field is returned in the query response. The default is true.
Sortable (boolean) --
Determines whether the field can be used to sort the results of a query. If you specify sorting on a field that does not have Sortable set to true, Amazon Kendra returns an exception. The default is false.
IndexStatistics (dict) --
Provides information about the number of FAQ questions and answers and the number of text documents indexed.
FaqStatistics (dict) --
The number of question and answer topics in the index.
IndexedQuestionAnswersCount (integer) --
The total number of FAQ questions and answers contained in the index.
TextDocumentStatistics (dict) --
The number of text documents indexed.
IndexedTextDocumentsCount (integer) --
The number of text documents indexed.
IndexedTextBytes (integer) --
The total size, in bytes, of the indexed documents.
ErrorMessage (string) --
When the Status field value is FAILED, the ErrorMessage field contains a message that explains why.
CapacityUnits (dict) --
For Enterprise Edition indexes, you can choose to use additional capacity to meet the needs of your application. This contains the capacity units used for the index. A query or document storage capacity of zero indicates that the index is using the default capacity. For more information on the default capacity for an index and adjusting this, see Adjusting capacity.
StorageCapacityUnits (integer) --
The amount of extra storage capacity for an index. A single capacity unit provides 30 GB of storage space or 100,000 documents, whichever is reached first. You can add up to 100 extra capacity units.
QueryCapacityUnits (integer) --
The amount of extra query capacity for an index and GetQuerySuggestions capacity.
A single extra capacity unit for an index provides 0.1 queries per second or approximately 8,000 queries per day. You can add up to 100 extra capacity units.
GetQuerySuggestions capacity is five times the provisioned query capacity for an index, or the base capacity of 2.5 calls per second, whichever is higher. For example, the base capacity for an index is 0.1 queries per second, and GetQuerySuggestions capacity has a base of 2.5 calls per second. If you add another 0.1 queries per second to total 0.2 queries per second for an index, the GetQuerySuggestions capacity is 2.5 calls per second (higher than five times 0.2 queries per second).
UserTokenConfigurations (list) --
The user token configuration for the Amazon Kendra index.
(dict) --
Provides the configuration information for a token.
JwtTokenTypeConfiguration (dict) --
Information about the JWT token type configuration.
KeyLocation (string) --
The location of the key.
URL (string) --
The signing key URL.
SecretManagerArn (string) --
The Amazon Resource Name (ARN) of the secret.
UserNameAttributeField (string) --
The user name attribute field.
GroupAttributeField (string) --
The group attribute field.
Issuer (string) --
The issuer of the token.
ClaimRegex (string) --
The regular expression that identifies the claim.
JsonTokenTypeConfiguration (dict) --
Information about the JSON token type configuration.
UserNameAttributeField (string) --
The user name attribute field.
GroupAttributeField (string) --
The group attribute field.
UserContextPolicy (string) --
The user context policy for the Amazon Kendra index.
UserGroupResolutionConfiguration (dict) --
Whether you have enabled the configuration for fetching access levels of groups and users from an IAM Identity Center (successor to Single Sign-On) identity source.
UserGroupResolutionMode (string) --
The identity store provider (mode) you want to use to fetch access levels of groups and users. IAM Identity Center (successor to Single Sign-On) is currently the only available mode. Your users and groups must exist in an IAM Identity Center identity source in order to use this mode.
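The capacity arithmetic and the duration string format described in this response structure can be worked through in a few lines. This is an illustrative sketch only: the constant and function names are assumptions, and the numbers are the ones quoted in the text above (0.1 queries per second base, 0.1 per extra unit, 2.5 calls per second base for GetQuerySuggestions).

```python
from datetime import timedelta

BASE_QUERIES_PER_SECOND = 0.1       # base query capacity quoted in the text
QUERIES_PER_EXTRA_UNIT = 0.1        # added by each extra query capacity unit
BASE_SUGGESTIONS_PER_SECOND = 2.5   # base GetQuerySuggestions capacity


def provisioned_query_capacity(extra_query_units: int) -> float:
    """Queries per second for an index with the given extra capacity units."""
    return BASE_QUERIES_PER_SECOND + QUERIES_PER_EXTRA_UNIT * extra_query_units


def suggestions_capacity(extra_query_units: int) -> float:
    """Five times the provisioned query capacity, or 2.5, whichever is higher."""
    return max(5 * provisioned_query_capacity(extra_query_units),
               BASE_SUGGESTIONS_PER_SECOND)


def duration_string(delta: timedelta) -> str:
    """Format a timedelta as the numeric-string-plus-"s" form used by Duration."""
    return f"{int(delta.total_seconds())}s"
```

With one extra unit the index has 0.2 queries per second, five times that is 1.0, and the base of 2.5 still wins, which matches the worked example in the QueryCapacityUnits description.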
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
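As a rough illustration of reading the describe_index response, the sketch below pulls out the status, the indexed counts, and the names of fields marked Sortable. The helper name `summarize_index` and the shape of the returned dictionary are assumptions made for this example, not part of boto3.

```python
def summarize_index(client, index_id):
    """Build a small summary from a describe_index response (hypothetical helper)."""
    response = client.describe_index(Id=index_id)
    stats = response.get("IndexStatistics", {})
    summary = {
        "name": response.get("Name"),
        "status": response["Status"],
        "faq_count": stats.get("FaqStatistics", {}).get("IndexedQuestionAnswersCount", 0),
        "document_count": stats.get("TextDocumentStatistics", {}).get("IndexedTextDocumentsCount", 0),
        # Sorting on a field that is not Sortable makes Amazon Kendra return
        # an exception, so it helps to know these names before querying.
        "sortable_fields": [
            field["Name"]
            for field in response.get("DocumentMetadataConfigurations", [])
            if field.get("Search", {}).get("Sortable")
        ],
    }
    if response["Status"] == "FAILED":
        # ErrorMessage explains the failure when Status is FAILED.
        summary["error"] = response.get("ErrorMessage")
    return summary
```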
describe_principal_mapping
(**kwargs)¶Describes the processing of PUT
and DELETE
actions for mapping users to their groups. This includes information on the status of actions currently processing or yet to be processed, when actions were last updated, when actions were received by Amazon Kendra, the latest action that should process and apply after other actions, and useful error messages if an action could not be processed.
DescribePrincipalMapping
is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.describe_principal_mapping(
IndexId='string',
DataSourceId='string',
GroupId='string'
)
IndexId (string) --
[REQUIRED]
The identifier of the index required to check the processing of PUT and DELETE actions for mapping users to their groups.
DataSourceId (string) --
The identifier of the data source to check the processing of PUT and DELETE actions for mapping users to their groups.
GroupId (string) --
[REQUIRED]
The identifier of the group required to check the processing of PUT and DELETE actions for mapping users to their groups.
dict
Response Syntax
{
'IndexId': 'string',
'DataSourceId': 'string',
'GroupId': 'string',
'GroupOrderingIdSummaries': [
{
'Status': 'FAILED'|'SUCCEEDED'|'PROCESSING'|'DELETING'|'DELETED',
'LastUpdatedAt': datetime(2015, 1, 1),
'ReceivedAt': datetime(2015, 1, 1),
'OrderingId': 123,
'FailureReason': 'string'
},
]
}
Response Structure
(dict) --
IndexId (string) --
Shows the identifier of the index to see information on the processing of PUT
and DELETE
actions for mapping users to their groups.
DataSourceId (string) --
Shows the identifier of the data source to see information on the processing of PUT
and DELETE
actions for mapping users to their groups.
GroupId (string) --
Shows the identifier of the group to see information on the processing of PUT
and DELETE
actions for mapping users to their groups.
GroupOrderingIdSummaries (list) --
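Shows the following information on the processing of PUT and DELETE actions for mapping users to their groups: the status, which can be PROCESSING, SUCCEEDED, DELETING, DELETED, or FAILED; when the action was last updated; when Amazon Kendra received the action; the ordering ID; and the failure reason, if the action could not be processed.
(dict) --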
Summary information on the processing of PUT
and DELETE
actions for mapping users to their groups.
Status (string) --
The current processing status of actions for mapping users to their groups. The status can be PROCESSING, SUCCEEDED, DELETING, DELETED, or FAILED.
LastUpdatedAt (datetime) --
The last date-time an action was updated. An action can be a PUT
or DELETE
action for mapping users to their groups.
ReceivedAt (datetime) --
The date-time an action was received by Amazon Kendra. An action can be a PUT
or DELETE
action for mapping users to their groups.
OrderingId (integer) --
The order in which actions should complete processing. An action can be a PUT
or DELETE
action for mapping users to their groups.
FailureReason (string) --
The reason an action could not be processed. An action can be a PUT
or DELETE
action for mapping users to their groups.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
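The GroupOrderingIdSummaries list can be inspected with ordinary list operations. The sketch below assumes that a higher OrderingId means a later action, which is how the text above describes the ordering but is still an assumption here; both function names are hypothetical.

```python
def latest_action(summaries):
    """Return the summary with the highest OrderingId, or None for an empty list."""
    return max(summaries, key=lambda summary: summary["OrderingId"], default=None)


def failed_actions(summaries):
    """Return (OrderingId, FailureReason) pairs for every FAILED action."""
    return [
        (summary["OrderingId"], summary.get("FailureReason"))
        for summary in summaries
        if summary["Status"] == "FAILED"
    ]


# Usage, given a client from boto3.client("kendra"):
#   response = client.describe_principal_mapping(IndexId="...", GroupId="...")
#   print(latest_action(response["GroupOrderingIdSummaries"]))
```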
describe_query_suggestions_block_list
(**kwargs)¶Gets information about a block list used for query suggestions for an index.
This is used to check the current settings that are applied to a block list.
DescribeQuerySuggestionsBlockList
is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.describe_query_suggestions_block_list(
IndexId='string',
Id='string'
)
IndexId (string) --
[REQUIRED]
The identifier of the index for the block list.
Id (string) --
[REQUIRED]
The identifier of the block list you want to get information on.
dict
Response Syntax
{
'IndexId': 'string',
'Id': 'string',
'Name': 'string',
'Description': 'string',
'Status': 'ACTIVE'|'CREATING'|'DELETING'|'UPDATING'|'ACTIVE_BUT_UPDATE_FAILED'|'FAILED',
'ErrorMessage': 'string',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'SourceS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'ItemCount': 123,
'FileSizeBytes': 123,
'RoleArn': 'string'
}
Response Structure
(dict) --
IndexId (string) --
The identifier of the index for the block list.
Id (string) --
The identifier of the block list.
Name (string) --
The name of the block list.
Description (string) --
The description for the block list.
Status (string) --
The current status of the block list. When the value is ACTIVE
, the block list is ready for use.
ErrorMessage (string) --
The error message containing details if there are issues processing the block list.
CreatedAt (datetime) --
The date-time a block list for query suggestions was created.
UpdatedAt (datetime) --
The date-time a block list for query suggestions was last updated.
SourceS3Path (dict) --
Shows the current S3 path to your block list text file in your S3 bucket.
Each block word or phrase should be on a separate line in a text file.
For information on the current quota limits for block lists, see Quotas for Amazon Kendra.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
ItemCount (integer) --
The current number of valid, non-empty words or phrases in the block list text file.
FileSizeBytes (integer) --
The current size of the block list text file in S3.
RoleArn (string) --
The IAM (Identity and Access Management) role used by Amazon Kendra to access the block list text file in S3.
The role needs S3 read permissions to your file in S3 and needs to give STS (Security Token Service) assume role permissions to Amazon Kendra.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
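The SourceS3Path description says each blocked word or phrase goes on its own line of a text file. A minimal sketch of producing such a file body follows; the function name is hypothetical, and dropping blank entries is an assumption based on ItemCount counting only valid, non-empty entries.

```python
def build_block_list_text(phrases):
    """Return text with one non-empty, trimmed phrase per line."""
    cleaned = [phrase.strip() for phrase in phrases]
    lines = [phrase for phrase in cleaned if phrase]
    return "".join(line + "\n" for line in lines)
```

The resulting string can be written to a local file or uploaded to the S3 location that the block list points to.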
describe_query_suggestions_config
(**kwargs)¶Gets information on the settings of query suggestions for an index.
This is used to check the current settings applied to query suggestions.
DescribeQuerySuggestionsConfig
is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.describe_query_suggestions_config(
IndexId='string'
)
IndexId (string) --
[REQUIRED]
The identifier of the index with query suggestions that you want to get information on.
dict
Response Syntax
{
'Mode': 'ENABLED'|'LEARN_ONLY',
'Status': 'ACTIVE'|'UPDATING',
'QueryLogLookBackWindowInDays': 123,
'IncludeQueriesWithoutUserInformation': True|False,
'MinimumNumberOfQueryingUsers': 123,
'MinimumQueryCount': 123,
'LastSuggestionsBuildTime': datetime(2015, 1, 1),
'LastClearTime': datetime(2015, 1, 1),
'TotalSuggestionsCount': 123
}
Response Structure
(dict) --
Mode (string) --
Whether query suggestions are currently in ENABLED mode or LEARN_ONLY mode.
By default, Amazon Kendra enables query suggestions. LEARN_ONLY turns off query suggestions for your users. You can change the mode using the UpdateQuerySuggestionsConfig API.
Status (string) --
Whether the status of query suggestions settings is currently ACTIVE or UPDATING.
ACTIVE means the current settings apply and UPDATING means your changed settings are in the process of applying.
QueryLogLookBackWindowInDays (integer) --
How recent your queries are in your query log time window (in days).
IncludeQueriesWithoutUserInformation (boolean) --
TRUE to use all queries, otherwise use only queries that include user information to generate the query suggestions.
MinimumNumberOfQueryingUsers (integer) --
The minimum number of unique users who must search a query in order for the query to be eligible to suggest to your users.
MinimumQueryCount (integer) --
The minimum number of times a query must be searched in order for the query to be eligible to suggest to your users.
LastSuggestionsBuildTime (datetime) --
The date-time query suggestions for an index were last updated.
LastClearTime (datetime) --
The date-time query suggestions for an index were last cleared.
After you clear suggestions, Amazon Kendra learns new suggestions based on new queries added to the query log from the time you cleared suggestions. Amazon Kendra only considers reoccurrences of a query from the time you cleared suggestions.
TotalSuggestionsCount (integer) --
The current total count of query suggestions for an index.
This count can change when you update your query suggestions settings, if you filter out certain queries from suggestions using a block list, and as the query log accumulates more queries for Amazon Kendra to learn from.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
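The two minimum thresholds in the response can be read as a pair of simple comparisons. The sketch below is a simplified reading for illustration, not Amazon Kendra's actual selection logic, which runs inside the service and may apply further criteria; the function name and its first two arguments are assumptions.

```python
def meets_suggestion_thresholds(query_count, querying_users, config):
    """Check a query's counts against a describe_query_suggestions_config response.

    query_count and querying_users are figures you would have to gather
    yourself; the service does not return them.
    """
    return (
        query_count >= config["MinimumQueryCount"]
        and querying_users >= config["MinimumNumberOfQueryingUsers"]
    )
```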
describe_thesaurus
(**kwargs)¶Gets information about an existing Amazon Kendra thesaurus.
See also: AWS API Documentation
Request Syntax
response = client.describe_thesaurus(
Id='string',
IndexId='string'
)
Id (string) --
[REQUIRED]
The identifier of the thesaurus you want to get information on.
IndexId (string) --
[REQUIRED]
The identifier of the index for the thesaurus.
dict
Response Syntax
{
'Id': 'string',
'IndexId': 'string',
'Name': 'string',
'Description': 'string',
'Status': 'CREATING'|'ACTIVE'|'DELETING'|'UPDATING'|'ACTIVE_BUT_UPDATE_FAILED'|'FAILED',
'ErrorMessage': 'string',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'RoleArn': 'string',
'SourceS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'FileSizeBytes': 123,
'TermCount': 123,
'SynonymRuleCount': 123
}
Response Structure
(dict) --
Id (string) --
The identifier of the thesaurus.
IndexId (string) --
The identifier of the index for the thesaurus.
Name (string) --
The thesaurus name.
Description (string) --
The thesaurus description.
Status (string) --
The current status of the thesaurus. When the value is ACTIVE
, queries are able to use the thesaurus. If the Status
field value is FAILED
, the ErrorMessage
field provides more information.
If the status is ACTIVE_BUT_UPDATE_FAILED
, it means that Amazon Kendra could not ingest the new thesaurus file. The old thesaurus file is still active.
ErrorMessage (string) --
When the Status
field value is FAILED
, the ErrorMessage
field provides more information.
CreatedAt (datetime) --
The Unix datetime that the thesaurus was created.
UpdatedAt (datetime) --
The Unix datetime that the thesaurus was last updated.
RoleArn (string) --
An IAM role that gives Amazon Kendra permissions to access the thesaurus file specified in SourceS3Path.
SourceS3Path (dict) --
Information required to find a specific file in an Amazon S3 bucket.
Bucket (string) --
The name of the S3 bucket that contains the file.
Key (string) --
The name of the file.
FileSizeBytes (integer) --
The size of the thesaurus file in bytes.
TermCount (integer) --
The number of unique terms in the thesaurus file. For example, for the synonym rules a,b,c and a=>d, the term count would be 4.
SynonymRuleCount (integer) --
The number of synonym rules in the thesaurus file.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
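The TermCount example (the rules a,b,c and a=>d giving 4 unique terms) can be reproduced with a small counting sketch. This only illustrates how that figure arises; it is not Amazon Kendra's parser, and it assumes terms are separated by commas and by the `=>` arrow only.

```python
import re


def count_unique_terms(rules):
    """Count distinct terms across synonym rules such as 'a,b,c' or 'a=>d'."""
    terms = set()
    for rule in rules:
        # Split on the comma and on the "=>" mapping arrow.
        for term in re.split(r",|=>", rule):
            term = term.strip()
            if term:
                terms.add(term)
    return len(terms)
```

For `["a,b,c", "a=>d"]` the distinct terms are a, b, c, and d, so the count is 4.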
disassociate_entities_from_experience
(**kwargs)¶Prevents users or groups in your IAM Identity Center identity source from accessing your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.disassociate_entities_from_experience(
Id='string',
IndexId='string',
EntityList=[
{
'EntityId': 'string',
'EntityType': 'USER'|'GROUP'
},
]
)
Id (string) --
[REQUIRED]
The identifier of your Amazon Kendra experience.
IndexId (string) --
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
EntityList (list) --
[REQUIRED]
Lists users or groups in your IAM Identity Center identity source.
(dict) --
Provides the configuration information for users or groups in your IAM Identity Center identity source to grant access to your Amazon Kendra experience.
EntityId (string) --
The identifier of a user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
EntityType (string) --
Specifies whether you are configuring a User or a Group.
dict
Response Syntax
{
'FailedEntityList': [
{
'EntityId': 'string',
'ErrorMessage': 'string'
},
]
}
Response Structure
(dict) --
FailedEntityList (list) --
Lists the users or groups in your IAM Identity Center identity source that failed to properly remove access to your Amazon Kendra experience.
(dict) --
Information on the users or groups in your IAM Identity Center identity source that failed to properly configure with your Amazon Kendra experience.
EntityId (string) --
The identifier of the user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
ErrorMessage (string) --
The reason the user or group in your IAM Identity Center identity source failed to properly configure with your Amazon Kendra experience.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
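Because the call reports per-entity failures in FailedEntityList rather than raising for each one, a caller usually wants to collect them. The following sketch assumes a hypothetical wrapper named `revoke_access` that takes (entity ID, entity type) pairs.

```python
def revoke_access(client, experience_id, index_id, entities):
    """Disassociate entities and return {EntityId: ErrorMessage} for failures.

    entities is an iterable of (entity_id, entity_type) pairs, where
    entity_type is 'USER' or 'GROUP'.
    """
    entity_list = [
        {"EntityId": entity_id, "EntityType": entity_type}
        for entity_id, entity_type in entities
    ]
    response = client.disassociate_entities_from_experience(
        Id=experience_id,
        IndexId=index_id,
        EntityList=entity_list,
    )
    return {
        failed["EntityId"]: failed["ErrorMessage"]
        for failed in response.get("FailedEntityList", [])
    }
```

An empty dictionary means every listed user or group was removed from the experience.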
disassociate_personas_from_entities
(**kwargs)¶Removes the specific permissions of users or groups in your IAM Identity Center identity source with access to your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.disassociate_personas_from_entities(
Id='string',
IndexId='string',
EntityIds=[
'string',
]
)
Id (string) --
[REQUIRED]
The identifier of your Amazon Kendra experience.
IndexId (string) --
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
EntityIds (list) --
[REQUIRED]
The identifiers of users or groups in your IAM Identity Center identity source. For example, user IDs could be user emails.
dict
Response Syntax
{
'FailedEntityList': [
{
'EntityId': 'string',
'ErrorMessage': 'string'
},
]
}
Response Structure
(dict) --
FailedEntityList (list) --
Lists the users or groups in your IAM Identity Center identity source that failed to properly remove access to your Amazon Kendra experience.
(dict) --
Information on the users or groups in your IAM Identity Center identity source that failed to properly configure with your Amazon Kendra experience.
EntityId (string) --
The identifier of the user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
ErrorMessage (string) --
The reason the user or group in your IAM Identity Center identity source failed to properly configure with your Amazon Kendra experience.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
get_paginator
(operation_name)¶Create a paginator for an operation.
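operation_name (string) --
The operation name. This is the same name as the method name on the client. For example, if the method name is create_foo, and you'd normally invoke the operation as client.create_foo(**kwargs), if the create_foo operation can be paginated, you can use the call client.get_paginator("create_foo").
Raises OperationNotPageableError if the operation is not pageable. You can use the client.can_paginate method to check if an operation is pageable.
get_query_suggestions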
(**kwargs)¶Fetches the queries that are suggested to your users.
GetQuerySuggestions
is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.get_query_suggestions(
IndexId='string',
QueryText='string',
MaxSuggestionsCount=123
)
IndexId (string) --
[REQUIRED]
The identifier of the index you want to get query suggestions from.
QueryText (string) --
[REQUIRED]
The text of a user's query to generate query suggestions.
A query is suggested if the query prefix matches what a user starts to type as their query.
Amazon Kendra does not show any suggestions if a user types fewer than two characters or more than 60 characters. A query must also have at least one search result and contain at least one word of more than four characters.
dict
Response Syntax
{
'QuerySuggestionsId': 'string',
'Suggestions': [
{
'Id': 'string',
'Value': {
'Text': {
'Text': 'string',
'Highlights': [
{
'BeginOffset': 123,
'EndOffset': 123
},
]
}
}
},
]
}
Response Structure
(dict) --
QuerySuggestionsId (string) --
The identifier for a list of query suggestions for an index.
Suggestions (list) --
A list of query suggestions for an index.
(dict) --
A single query suggestion.
Id (string) --
The UUID (universally unique identifier) of a single query suggestion.
Value (dict) --
The value for the UUID (universally unique identifier) of a single query suggestion.
The value is the text string of a suggestion.
Text (dict) --
The SuggestionTextWithHighlights
structure that contains the query suggestion text and highlights.
Text (string) --
The query suggestion text to display to the user.
Highlights (list) --
The beginning and end of the query suggestion text that should be highlighted.
(dict) --
The text highlights for a single query suggestion.
BeginOffset (integer) --
The zero-based location in the response string where the highlight starts.
EndOffset (integer) --
The zero-based location in the response string where the highlight ends.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
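The Highlights offsets can be applied to the suggestion text with plain string slicing. The sketch below makes two assumptions that the text above does not settle: EndOffset is treated as exclusive, and highlights are assumed not to overlap. The function names and the default `<b>` markers are illustrative only.

```python
def mark_highlights(text, highlights, open_tag="<b>", close_tag="</b>"):
    """Wrap each highlighted span of text in open_tag and close_tag."""
    parts = []
    cursor = 0
    for highlight in sorted(highlights, key=lambda h: h["BeginOffset"]):
        begin, end = highlight["BeginOffset"], highlight["EndOffset"]
        parts.append(text[cursor:begin])                      # text before the span
        parts.append(open_tag + text[begin:end] + close_tag)  # the span itself
        cursor = end
    parts.append(text[cursor:])                               # trailing text
    return "".join(parts)


def within_suggestion_length(query_text):
    """True if the text has between 2 and 60 characters, the range quoted above."""
    return 2 <= len(query_text) <= 60


# Usage, given a response from client.get_query_suggestions(...):
#   for suggestion in response["Suggestions"]:
#       value = suggestion["Value"]["Text"]
#       print(mark_highlights(value["Text"], value["Highlights"]))
```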
get_snapshots
(**kwargs)¶Retrieves search metrics data. The data provides a snapshot of how your users interact with your search application and how effective the application is.
See also: AWS API Documentation
Request Syntax
response = client.get_snapshots(
IndexId='string',
Interval='THIS_MONTH'|'THIS_WEEK'|'ONE_WEEK_AGO'|'TWO_WEEKS_AGO'|'ONE_MONTH_AGO'|'TWO_MONTHS_AGO',
MetricType='QUERIES_BY_COUNT'|'QUERIES_BY_ZERO_CLICK_RATE'|'QUERIES_BY_ZERO_RESULT_RATE'|'DOCS_BY_CLICK_COUNT'|'AGG_QUERY_DOC_METRICS'|'TREND_QUERY_DOC_METRICS',
NextToken='string',
MaxResults=123
)
IndexId (string) --
[REQUIRED]
The identifier of the index to get search metrics data.
Interval (string) --
[REQUIRED]
The time interval or time window to get search metrics data. The time interval uses the time zone of your index. You can view data in the following time windows:
THIS_WEEK: The current week, starting on the Sunday and ending on the day before the current date.
ONE_WEEK_AGO: The previous week, starting on the Sunday and ending on the following Saturday.
TWO_WEEKS_AGO: The week before the previous week, starting on the Sunday and ending on the following Saturday.
THIS_MONTH: The current month, starting on the first day of the month and ending on the day before the current date.
ONE_MONTH_AGO: The previous month, starting on the first day of the month and ending on the last day of the month.
TWO_MONTHS_AGO: The month before the previous month, starting on the first day of the month and ending on the last day of the month.
MetricType (string) --
[REQUIRED]
The metric you want to retrieve. You can specify only one metric per call.
For more information about the metrics you can view, see Gaining insights with search analytics.
dict
Response Syntax
{
'SnapShotTimeFilter': {
'StartTime': datetime(2015, 1, 1),
'EndTime': datetime(2015, 1, 1)
},
'SnapshotsDataHeader': [
'string',
],
'SnapshotsData': [
[
'string',
],
],
'NextToken': 'string'
}
Response Structure
(dict) --
SnapShotTimeFilter (dict) --
The date-time for the beginning and end of the time window for the search metrics data.
StartTime (datetime) --
The UNIX datetime of the beginning of the time range.
EndTime (datetime) --
The UNIX datetime of the end of the time range.
SnapshotsDataHeader (list) --
The column headers for the search metrics data.
SnapshotsData (list) --
The search metrics data. The data returned depends on the metric type you requested.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token, which you can use in a later request to retrieve the next set of search metrics data.
Exceptions
kendra.Client.exceptions.InvalidRequestException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
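The response returns column names (SnapshotsDataHeader) separately from the rows (SnapshotsData), so it is often convenient to pair them. The following is a minimal sketch, not part of the boto3 API: the helper name snapshot_rows is hypothetical, and the example response is hand-written with invented column names and values.

```python
def snapshot_rows(response):
    """Pair each row of SnapshotsData with the names in SnapshotsDataHeader."""
    header = response.get("SnapshotsDataHeader", [])
    return [dict(zip(header, row)) for row in response.get("SnapshotsData", [])]


# Hand-written response in the documented shape. The column names and values
# are invented for illustration and are not real Kendra output.
example_response = {
    "SnapshotsDataHeader": ["query_text", "count"],
    "SnapshotsData": [["vpn setup", "12"], ["expense policy", "7"]],
    "NextToken": "example-token",
}

rows = snapshot_rows(example_response)
for row in rows:
    print(row)
```

With a real client, the dict returned by client.get_snapshots(...) would be passed in place of example_response.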
get_waiter
(waiter_name)¶Returns an object that can wait for some condition.
list_access_control_configurations
(**kwargs)¶Lists one or more access control configurations for an index. This includes user and group access information for your documents. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
See also: AWS API Documentation
Request Syntax
response = client.list_access_control_configurations(
IndexId='string',
NextToken='string',
MaxResults=123
)
[REQUIRED]
The identifier of the index for the access control configuration.
dict
Response Syntax
{
'NextToken': 'string',
'AccessControlConfigurations': [
{
'Id': 'string'
},
]
}
Response Structure
(dict) --
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token, which you can use in the subsequent request to retrieve the next set of access control configurations.
AccessControlConfigurations (list) --
The details of your access control configurations.
(dict) --
Summary information on an access control configuration that you created for your documents in an index.
Id (string) --
The identifier of the access control configuration.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
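Because a truncated response carries a NextToken, collecting every configuration means calling the method repeatedly. The sketch below shows one way to follow the token. The function name and the StubKendraClient class are assumptions made for illustration; the stub only stands in for boto3.client('kendra') so that the example runs without AWS access.

```python
def iter_access_control_configuration_ids(client, index_id, page_size=50):
    """Yield every access control configuration Id for an index, following NextToken."""
    kwargs = {"IndexId": index_id, "MaxResults": page_size}
    while True:
        response = client.list_access_control_configurations(**kwargs)
        for item in response.get("AccessControlConfigurations", []):
            yield item["Id"]
        token = response.get("NextToken")
        if not token:
            break
        # Pass the token back to request the next page.
        kwargs["NextToken"] = token


class StubKendraClient:
    """Hypothetical stand-in for boto3.client('kendra'); returns two canned pages."""

    def list_access_control_configurations(self, **kwargs):
        if "NextToken" not in kwargs:
            return {
                "AccessControlConfigurations": [{"Id": "acl-1"}, {"Id": "acl-2"}],
                "NextToken": "page-2",
            }
        return {"AccessControlConfigurations": [{"Id": "acl-3"}]}


ids = list(iter_access_control_configuration_ids(StubKendraClient(), "example-index-id"))
print(ids)
```

The same loop shape applies to the other list_* methods in this section that accept and return NextToken.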
list_data_source_sync_jobs
(**kwargs)¶Gets statistics about synchronizing a data source connector.
See also: AWS API Documentation
Request Syntax
response = client.list_data_source_sync_jobs(
Id='string',
IndexId='string',
NextToken='string',
MaxResults=123,
StartTimeFilter={
'StartTime': datetime(2015, 1, 1),
'EndTime': datetime(2015, 1, 1)
},
StatusFilter='FAILED'|'SUCCEEDED'|'SYNCING'|'INCOMPLETE'|'STOPPING'|'ABORTED'|'SYNCING_INDEXING'
)
[REQUIRED]
The identifier of the data source connector.
[REQUIRED]
The identifier of the index used with the data source connector.
When specified, the synchronization jobs returned in the list are limited to jobs between the specified dates.
The UNIX datetime of the beginning of the time range.
The UNIX datetime of the end of the time range.
Only returns synchronization jobs with the Status field equal to the specified status.
dict
Response Syntax
{
'History': [
{
'ExecutionId': 'string',
'StartTime': datetime(2015, 1, 1),
'EndTime': datetime(2015, 1, 1),
'Status': 'FAILED'|'SUCCEEDED'|'SYNCING'|'INCOMPLETE'|'STOPPING'|'ABORTED'|'SYNCING_INDEXING',
'ErrorMessage': 'string',
'ErrorCode': 'InternalError'|'InvalidRequest',
'DataSourceErrorCode': 'string',
'Metrics': {
'DocumentsAdded': 'string',
'DocumentsModified': 'string',
'DocumentsDeleted': 'string',
'DocumentsFailed': 'string',
'DocumentsScanned': 'string'
}
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
History (list) --
A history of synchronization jobs for the data source connector.
(dict) --
Provides information about a data source synchronization job.
ExecutionId (string) --
An identifier for the synchronization job.
StartTime (datetime) --
The UNIX datetime that the synchronization job started.
EndTime (datetime) --
The UNIX datetime that the synchronization job completed.
Status (string) --
The execution status of the synchronization job. When the Status field is set to SUCCEEDED, the synchronization job is done. If the status code is set to FAILED, the ErrorCode and ErrorMessage fields give you the reason for the failure.
ErrorMessage (string) --
If the Status field is set to FAILED, the ErrorMessage field contains a description of the error that caused the synchronization to fail.
ErrorCode (string) --
If the Status field is set to FAILED, the ErrorCode field indicates the reason the synchronization failed.
DataSourceErrorCode (string) --
If the reason that the synchronization failed is due to an error with the underlying data source, this field contains a code that identifies the error.
Metrics (dict) --
Metrics for the synchronization job, reported as counts of documents added, modified, deleted, failed, and scanned.
DocumentsAdded (string) --
The number of documents added from the data source up to now in the data source sync.
DocumentsModified (string) --
The number of documents modified in the data source up to now in the data source sync run.
DocumentsDeleted (string) --
The number of documents deleted from the data source up to now in the data source sync run.
DocumentsFailed (string) --
The number of documents that failed to sync from the data source up to now in the data source sync run.
DocumentsScanned (string) --
The current number of documents crawled by the current sync job in the data source.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of jobs.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
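The Metrics counters in each History entry are typed as strings, so they need converting before any arithmetic. The following sketch totals the counters and collects failed jobs. The function name summarize_sync_history and the sample data are hypothetical, and the code assumes each counter string holds a base-10 integer.

```python
METRIC_NAMES = (
    "DocumentsAdded",
    "DocumentsModified",
    "DocumentsDeleted",
    "DocumentsFailed",
    "DocumentsScanned",
)


def summarize_sync_history(history):
    """Return (failed_jobs, totals) for a History list from list_data_source_sync_jobs."""
    totals = dict.fromkeys(METRIC_NAMES, 0)
    failed_jobs = []
    for job in history:
        metrics = job.get("Metrics", {})
        for name in METRIC_NAMES:
            # Counters arrive as strings; a missing counter is treated as 0.
            totals[name] += int(metrics.get(name) or 0)
        if job.get("Status") == "FAILED":
            failed_jobs.append(
                (job["ExecutionId"], job.get("ErrorCode"), job.get("ErrorMessage"))
            )
    return failed_jobs, totals


# Hand-written History entries in the documented shape (values are invented).
example_history = [
    {
        "ExecutionId": "job-1",
        "Status": "SUCCEEDED",
        "Metrics": {"DocumentsAdded": "10", "DocumentsModified": "2", "DocumentsScanned": "12"},
    },
    {
        "ExecutionId": "job-2",
        "Status": "FAILED",
        "ErrorCode": "InvalidRequest",
        "ErrorMessage": "example message",
        "Metrics": {"DocumentsAdded": "1", "DocumentsFailed": "3", "DocumentsScanned": "4"},
    },
]

failed_jobs, totals = summarize_sync_history(example_history)
print(failed_jobs)
print(totals)
```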
list_data_sources
(**kwargs)¶Lists the data source connectors that you have created.
See also: AWS API Documentation
Request Syntax
response = client.list_data_sources(
IndexId='string',
NextToken='string',
MaxResults=123
)
[REQUIRED]
The identifier of the index used with one or more data source connectors.
dict
Response Syntax
{
'SummaryItems': [
{
'Name': 'string',
'Id': 'string',
'Type': 'S3'|'SHAREPOINT'|'DATABASE'|'SALESFORCE'|'ONEDRIVE'|'SERVICENOW'|'CUSTOM'|'CONFLUENCE'|'GOOGLEDRIVE'|'WEBCRAWLER'|'WORKDOCS'|'FSX'|'SLACK'|'BOX'|'QUIP'|'JIRA'|'GITHUB'|'ALFRESCO'|'TEMPLATE',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'Status': 'CREATING'|'DELETING'|'FAILED'|'UPDATING'|'ACTIVE',
'LanguageCode': 'string'
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
SummaryItems (list) --
An array of summary information for one or more data source connectors.
(dict) --
Summary information for an Amazon Kendra data source.
Name (string) --
The name of the data source.
Id (string) --
The identifier for the data source.
Type (string) --
The type of the data source.
CreatedAt (datetime) --
The UNIX datetime that the data source was created.
UpdatedAt (datetime) --
The UNIX datetime that the data source was last updated.
Status (string) --
The status of the data source. When the status is ACTIVE, the data source is ready to use.
LanguageCode (string) --
The code for a language. This shows a supported language for all documents in the data source. English is supported by default. For more information on supported languages, including their codes, see Adding documents in languages other than English.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of data source connectors.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.InternalServerException
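Since a data source is ready to use only when its Status is ACTIVE, callers often filter SummaryItems before starting work. A small sketch follows; the helper name ready_data_sources and the sample items are assumptions for illustration.

```python
def ready_data_sources(summary_items, source_type=None):
    """Return (Id, Name) pairs for ACTIVE data sources, optionally limited to one Type."""
    return [
        (item["Id"], item["Name"])
        for item in summary_items
        if item.get("Status") == "ACTIVE"
        and (source_type is None or item.get("Type") == source_type)
    ]


# Hand-written SummaryItems in the documented shape (values are invented).
example_items = [
    {"Name": "wiki", "Id": "ds-1", "Type": "CONFLUENCE", "Status": "ACTIVE"},
    {"Name": "reports", "Id": "ds-2", "Type": "S3", "Status": "CREATING"},
    {"Name": "archive", "Id": "ds-3", "Type": "S3", "Status": "ACTIVE"},
]

print(ready_data_sources(example_items))
print(ready_data_sources(example_items, source_type="S3"))
```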
list_entity_personas
(**kwargs)¶Lists specific permissions of users and groups with access to your Amazon Kendra experience.
See also: AWS API Documentation
Request Syntax
response = client.list_entity_personas(
Id='string',
IndexId='string',
NextToken='string',
MaxResults=123
)
[REQUIRED]
The identifier of your Amazon Kendra experience.
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
dict
Response Syntax
{
'SummaryItems': [
{
'EntityId': 'string',
'Persona': 'OWNER'|'VIEWER',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1)
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
SummaryItems (list) --
An array of summary information for one or more users or groups.
(dict) --
Summary information for users or groups in your IAM Identity Center identity source. This applies to users and groups with specific permissions that define their level of access to your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
EntityId (string) --
The identifier of a user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
Persona (string) --
The persona that defines the specific permissions of the user or group in your IAM Identity Center identity source. The available personas or access roles are Owner and Viewer. For more information on these personas, see Providing access to your search page.
CreatedAt (datetime) --
The date-time the summary information was created.
UpdatedAt (datetime) --
The date-time the summary information was last updated.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token, which you can use in a later request to retrieve the next set of users or groups.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.InternalServerException
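To see at a glance who holds which persona, the SummaryItems can be grouped by the Persona value. This is a minimal sketch; the helper name entities_by_persona and the sample entity IDs are hypothetical.

```python
from collections import defaultdict


def entities_by_persona(summary_items):
    """Group EntityId values by their Persona ('OWNER' or 'VIEWER')."""
    grouped = defaultdict(list)
    for item in summary_items:
        grouped[item["Persona"]].append(item["EntityId"])
    return dict(grouped)


# Hand-written SummaryItems in the documented shape (values are invented).
example_personas = [
    {"EntityId": "alice@example.com", "Persona": "OWNER"},
    {"EntityId": "bob@example.com", "Persona": "VIEWER"},
    {"EntityId": "carol@example.com", "Persona": "VIEWER"},
]

grouped = entities_by_persona(example_personas)
print(grouped)
```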
list_experience_entities
(**kwargs)¶Lists users or groups in your IAM Identity Center identity source that are granted access to your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.list_experience_entities(
Id='string',
IndexId='string',
NextToken='string'
)
[REQUIRED]
The identifier of your Amazon Kendra experience.
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
dict
Response Syntax
{
'SummaryItems': [
{
'EntityId': 'string',
'EntityType': 'USER'|'GROUP',
'DisplayData': {
'UserName': 'string',
'GroupName': 'string',
'IdentifiedUserName': 'string',
'FirstName': 'string',
'LastName': 'string'
}
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
SummaryItems (list) --
An array of summary information for one or more users or groups.
(dict) --
Summary information for users or groups in your IAM Identity Center identity source with granted access to your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
EntityId (string) --
The identifier of a user or group in your IAM Identity Center identity source. For example, a user ID could be an email.
EntityType (string) --
Shows the type as User or Group.
DisplayData (dict) --
Information about the user entity.
UserName (string) --
The name of the user.
GroupName (string) --
The name of the group.
IdentifiedUserName (string) --
The user name of the user.
FirstName (string) --
The first name of the user.
LastName (string) --
The last name of the user.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token, which you can use in a later request to retrieve the next set of users or groups.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.InternalServerException
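DisplayData carries different fields for users and groups, so a display string has to be chosen per EntityType. The sketch below shows one possible fallback order; the order itself, the helper name display_label, and the sample values are assumptions rather than anything the API prescribes.

```python
def display_label(item):
    """Pick a human-readable label for one SummaryItems entry."""
    data = item.get("DisplayData", {})
    if item.get("EntityType") == "GROUP":
        return data.get("GroupName") or item["EntityId"]
    # For users, prefer "First Last", then UserName, then the raw EntityId.
    full_name = " ".join(
        part for part in (data.get("FirstName"), data.get("LastName")) if part
    )
    return full_name or data.get("UserName") or item["EntityId"]


user = {
    "EntityId": "u-1",
    "EntityType": "USER",
    "DisplayData": {"UserName": "ada", "FirstName": "Ada", "LastName": "Lovelace"},
}
group = {"EntityId": "g-1", "EntityType": "GROUP", "DisplayData": {"GroupName": "Research"}}
bare = {"EntityId": "u-2", "EntityType": "USER"}

print(display_label(user), display_label(group), display_label(bare))
```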
list_experiences
(**kwargs)¶Lists one or more Amazon Kendra experiences. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.list_experiences(
IndexId='string',
NextToken='string',
MaxResults=123
)
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
dict
Response Syntax
{
'SummaryItems': [
{
'Name': 'string',
'Id': 'string',
'CreatedAt': datetime(2015, 1, 1),
'Status': 'CREATING'|'ACTIVE'|'DELETING'|'FAILED',
'Endpoints': [
{
'EndpointType': 'HOME',
'Endpoint': 'string'
},
]
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
SummaryItems (list) --
An array of summary information for one or more Amazon Kendra experiences.
(dict) --
Summary information for your Amazon Kendra experience. You can create an Amazon Kendra experience such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
Name (string) --
The name of your Amazon Kendra experience.
Id (string) --
The identifier of your Amazon Kendra experience.
CreatedAt (datetime) --
The date-time your Amazon Kendra experience was created.
Status (string) --
The processing status of your Amazon Kendra experience.
Endpoints (list) --
The endpoint URLs for your Amazon Kendra experiences. The URLs are unique and fully hosted by Amazon Web Services.
(dict) --
Provides the configuration information for the endpoint for your Amazon Kendra experience.
EndpointType (string) --
The type of endpoint for your Amazon Kendra experience. The type currently available is HOME, which is a unique and fully hosted URL to the home page of your Amazon Kendra experience.
Endpoint (string) --
The endpoint of your Amazon Kendra experience.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token, which you can use in a later request to retrieve the next set of Amazon Kendra experiences.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.InternalServerException
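Each experience lists its endpoints as dicts with an EndpointType and an Endpoint URL. A short sketch for collecting the HOME URL of every ACTIVE experience follows; the helper name home_endpoints and the example URL are hypothetical.

```python
def home_endpoints(summary_items):
    """Map experience Name to its HOME endpoint URL, for ACTIVE experiences only."""
    result = {}
    for experience in summary_items:
        if experience.get("Status") != "ACTIVE":
            continue
        for endpoint in experience.get("Endpoints", []):
            if endpoint.get("EndpointType") == "HOME":
                result[experience["Name"]] = endpoint["Endpoint"]
    return result


# Hand-written SummaryItems in the documented shape (values are invented).
example_experiences = [
    {
        "Name": "hr-search",
        "Id": "exp-1",
        "Status": "ACTIVE",
        "Endpoints": [{"EndpointType": "HOME", "Endpoint": "https://example.invalid/hr"}],
    },
    {"Name": "legal-search", "Id": "exp-2", "Status": "CREATING", "Endpoints": []},
]

print(home_endpoints(example_experiences))
```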
list_faqs
(**kwargs)¶Gets a list of FAQ lists associated with an index.
See also: AWS API Documentation
Request Syntax
response = client.list_faqs(
IndexId='string',
NextToken='string',
MaxResults=123
)
[REQUIRED]
The index that contains the FAQ lists.
dict
Response Syntax
{
'NextToken': 'string',
'FaqSummaryItems': [
{
'Id': 'string',
'Name': 'string',
'Status': 'CREATING'|'UPDATING'|'ACTIVE'|'DELETING'|'FAILED',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'FileFormat': 'CSV'|'CSV_WITH_HEADER'|'JSON',
'LanguageCode': 'string'
},
]
}
Response Structure
(dict) --
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of FAQs.
FaqSummaryItems (list) --
Information about the FAQs associated with the specified index.
(dict) --
Summary information for frequently asked questions and answers included in an index.
Id (string) --
The identifier of the FAQ.
Name (string) --
The name that you assigned the FAQ when you created or updated the FAQ.
Status (string) --
The current status of the FAQ. When the status is ACTIVE, the FAQ is ready for use.
CreatedAt (datetime) --
The UNIX datetime that the FAQ was added to the index.
UpdatedAt (datetime) --
The UNIX datetime that the FAQ was last updated.
FileFormat (string) --
The file type used to create the FAQ.
LanguageCode (string) --
The code for a language. This shows a supported language for the FAQ document as part of the summary information for FAQs. English is supported by default. For more information on supported languages, including their codes, see Adding documents in languages other than English.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
list_groups_older_than_ordering_id
(**kwargs)¶Provides a list of groups that are mapped to users before a given ordering or timestamp identifier.
ListGroupsOlderThanOrderingId is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.list_groups_older_than_ordering_id(
IndexId='string',
DataSourceId='string',
OrderingId=123,
NextToken='string',
MaxResults=123
)
[REQUIRED]
The identifier of the index for getting a list of groups mapped to users before a given ordering or timestamp identifier.
[REQUIRED]
The timestamp identifier used for the latest PUT or DELETE action for mapping users to their groups.
dict
Response Syntax
{
'GroupsSummaries': [
{
'GroupId': 'string',
'OrderingId': 123
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
GroupsSummaries (list) --
Summary information for the list of groups that are mapped to users before a given ordering or timestamp identifier.
(dict) --
Summary information for groups.
GroupId (string) --
The identifier of the group you want group summary information on.
OrderingId (integer) --
The timestamp identifier used for the latest PUT or DELETE action.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of groups that are mapped to users before a given ordering or timestamp identifier.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
list_indices
(**kwargs)¶Lists the Amazon Kendra indexes that you created.
See also: AWS API Documentation
Request Syntax
response = client.list_indices(
NextToken='string',
MaxResults=123
)
dict
Response Syntax
{
'IndexConfigurationSummaryItems': [
{
'Name': 'string',
'Id': 'string',
'Edition': 'DEVELOPER_EDITION'|'ENTERPRISE_EDITION',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'Status': 'CREATING'|'ACTIVE'|'DELETING'|'FAILED'|'UPDATING'|'SYSTEM_UPDATING'
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
IndexConfigurationSummaryItems (list) --
An array of summary information on the configuration of one or more indexes.
(dict) --
Summary information on the configuration of an index.
Name (string) --
The name of the index.
Id (string) --
An identifier for the index. Use this to identify the index when you are using APIs such as Query, DescribeIndex, UpdateIndex, and DeleteIndex.
Edition (string) --
Indicates whether the index is an Enterprise Edition index or a Developer Edition index.
CreatedAt (datetime) --
The Unix timestamp when the index was created.
UpdatedAt (datetime) --
The Unix timestamp when the index was last updated by the UpdateIndex API.
Status (string) --
The current status of the index. When the status is ACTIVE, the index is ready to search.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of indexes.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
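Most other calls need an index Id rather than a name, so a common step is to look up the Id from IndexConfigurationSummaryItems. The sketch below returns every ACTIVE match as a list because it does not assume index names are unique; the helper name active_index_ids and the sample items are hypothetical.

```python
def active_index_ids(summary_items, name):
    """Return the Ids of all ACTIVE indexes whose Name equals `name`."""
    return [
        item["Id"]
        for item in summary_items
        if item.get("Name") == name and item.get("Status") == "ACTIVE"
    ]


# Hand-written IndexConfigurationSummaryItems (values are invented).
example_indexes = [
    {"Name": "docs", "Id": "idx-1", "Edition": "ENTERPRISE_EDITION", "Status": "ACTIVE"},
    {"Name": "docs", "Id": "idx-2", "Edition": "DEVELOPER_EDITION", "Status": "CREATING"},
    {"Name": "wiki", "Id": "idx-3", "Edition": "DEVELOPER_EDITION", "Status": "ACTIVE"},
]

print(active_index_ids(example_indexes, "docs"))
```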
list_query_suggestions_block_lists
(**kwargs)¶Lists the block lists used for query suggestions for an index.
For information on the current quota limits for block lists, see Quotas for Amazon Kendra.
ListQuerySuggestionsBlockLists is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.list_query_suggestions_block_lists(
IndexId='string',
NextToken='string',
MaxResults=123
)
[REQUIRED]
The identifier of the index for a list of all block lists that exist for that index.
For information on the current quota limits for block lists, see Quotas for Amazon Kendra.
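If the previous response was incomplete (because there is more data to retrieve), Amazon Kendra returns a pagination token in the response. You can use this pagination token to retrieve the next set of block lists (BlockListSummaryItems).
dict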
Response Syntax
{
'BlockListSummaryItems': [
{
'Id': 'string',
'Name': 'string',
'Status': 'ACTIVE'|'CREATING'|'DELETING'|'UPDATING'|'ACTIVE_BUT_UPDATE_FAILED'|'FAILED',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1),
'ItemCount': 123
},
],
'NextToken': 'string'
}
Response Structure
(dict) --
BlockListSummaryItems (list) --
Summary items for a block list.
This includes summary items on the block list ID, block list name, when the block list was created, when the block list was last updated, and the count of block words/phrases in the block list.
For information on the current quota limits for block lists, see Quotas for Amazon Kendra.
(dict) --
Summary information on a query suggestions block list.
This includes information on the block list ID, block list name, when the block list was created, when the block list was last updated, and the count of block words/phrases in the block list.
For information on the current quota limits for block lists, see Quotas for Amazon Kendra.
Id (string) --
The identifier of a block list.
Name (string) --
The name of the block list.
Status (string) --
The status of the block list.
CreatedAt (datetime) --
The date-time the query suggestions block list was created.
UpdatedAt (datetime) --
The date-time the block list was last updated.
ItemCount (integer) --
The number of items in the block list file.
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of block lists.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
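The Status values include FAILED and ACTIVE_BUT_UPDATE_FAILED, which usually deserve a follow-up, and ItemCount gives the size of each list. The following sketch reports both. The helper name block_list_report and the sample items are hypothetical, and treating those two statuses as "needs attention" is an assumption of this example.

```python
NEEDS_ATTENTION = ("FAILED", "ACTIVE_BUT_UPDATE_FAILED")


def block_list_report(summary_items):
    """Return (names needing attention, total ItemCount across all block lists)."""
    flagged = [
        item["Name"] for item in summary_items if item.get("Status") in NEEDS_ATTENTION
    ]
    total_items = sum(item.get("ItemCount", 0) for item in summary_items)
    return flagged, total_items


# Hand-written BlockListSummaryItems (values are invented).
example_block_lists = [
    {"Id": "bl-1", "Name": "profanity", "Status": "ACTIVE", "ItemCount": 120},
    {"Id": "bl-2", "Name": "competitors", "Status": "ACTIVE_BUT_UPDATE_FAILED", "ItemCount": 15},
    {"Id": "bl-3", "Name": "draft", "Status": "CREATING", "ItemCount": 0},
]

flagged, total_items = block_list_report(example_block_lists)
print(flagged, total_items)
```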
list_tags_for_resource
(**kwargs)¶Gets a list of tags associated with a specified resource. Indexes, FAQs, and data sources can have tags associated with them.
See also: AWS API Documentation
Request Syntax
response = client.list_tags_for_resource(
ResourceARN='string'
)
[REQUIRED]
The Amazon Resource Name (ARN) of the index, FAQ, or data source to get a list of tags for.
dict
Response Syntax
{
'Tags': [
{
'Key': 'string',
'Value': 'string'
},
]
}
Response Structure
(dict) --
Tags (list) --
A list of tags associated with the index, FAQ, or data source.
(dict) --
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
Key (string) --
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
Value (string) --
The value associated with the tag. The value may be an empty string but it can't be null.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceUnavailableException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
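Tags come back as a list of Key/Value dicts, which is awkward for lookups. A minimal sketch of converting the response into a plain mapping follows; the helper name tags_to_dict and the sample tags are hypothetical.

```python
def tags_to_dict(response):
    """Convert the Tags list of a list_tags_for_resource response into a dict."""
    return {tag["Key"]: tag["Value"] for tag in response.get("Tags", [])}


# Hand-written response in the documented shape (values are invented).
example_tags_response = {
    "Tags": [
        {"Key": "team", "Value": "search"},
        {"Key": "stage", "Value": ""},  # an empty string is a permitted value
    ]
}

tags = tags_to_dict(example_tags_response)
print(tags)
```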
list_thesauri
(**kwargs)¶Lists the thesauri for an index.
See also: AWS API Documentation
Request Syntax
response = client.list_thesauri(
IndexId='string',
NextToken='string',
MaxResults=123
)
[REQUIRED]
The identifier of the index with one or more thesauri.
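If the previous response was incomplete (because there is more data to retrieve), Amazon Kendra returns a pagination token in the response. You can use this pagination token to retrieve the next set of thesauri (ThesaurusSummaryItems).
dict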
Response Syntax
{
'NextToken': 'string',
'ThesaurusSummaryItems': [
{
'Id': 'string',
'Name': 'string',
'Status': 'CREATING'|'ACTIVE'|'DELETING'|'UPDATING'|'ACTIVE_BUT_UPDATE_FAILED'|'FAILED',
'CreatedAt': datetime(2015, 1, 1),
'UpdatedAt': datetime(2015, 1, 1)
},
]
}
Response Structure
(dict) --
NextToken (string) --
If the response is truncated, Amazon Kendra returns this token that you can use in the subsequent request to retrieve the next set of thesauri.
ThesaurusSummaryItems (list) --
An array of summary information for a thesaurus or multiple thesauri.
(dict) --
Summary information for a thesaurus.
Id (string) --
The identifier of the thesaurus.
Name (string) --
The name of the thesaurus.
Status (string) --
The status of the thesaurus.
CreatedAt (datetime) --
The Unix datetime that the thesaurus was created.
UpdatedAt (datetime) --
The Unix datetime that the thesaurus was last updated.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
put_principal_mapping
(**kwargs)¶Maps users to their groups so that you only need to provide the user ID when you issue the query.
You can also map sub groups to groups. For example, the group "Company Intellectual Property Teams" includes sub groups "Research" and "Engineering". These sub groups include their own list of users or people who work in these teams. Only users who work in research and engineering, and therefore belong in the intellectual property group, can see top-secret company documents in their search results.
This is useful for user context filtering, where search results are filtered based on the user or their group access to documents. For more information, see Filtering on user context.
If more than five PUT actions for a group are currently processing, a validation exception is thrown.
PutPrincipalMapping is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.put_principal_mapping(
IndexId='string',
DataSourceId='string',
GroupId='string',
GroupMembers={
'MemberGroups': [
{
'GroupId': 'string',
'DataSourceId': 'string'
},
],
'MemberUsers': [
{
'UserId': 'string'
},
],
'S3PathforGroupMembers': {
'Bucket': 'string',
'Key': 'string'
}
},
OrderingId=123,
RoleArn='string'
)
[REQUIRED]
The identifier of the index in which you want to map users to their groups.
The identifier of the data source for which you want to map users to their groups.
This is useful if a group is tied to multiple data sources, but you only want the group to access documents of a certain data source. For example, the groups "Research", "Engineering", and "Sales and Marketing" are all tied to the company's documents stored in the data sources Confluence and Salesforce. However, "Sales and Marketing" team only needs access to customer-related documents stored in Salesforce.
[REQUIRED]
The identifier of the group to which you want to map users.
[REQUIRED]
The list that contains your users or sub groups that belong to the same group.
For example, the group "Company" includes the user "CEO" and the sub groups "Research", "Engineering", and "Sales and Marketing".
If you have more than 1000 users and/or sub groups for a single group, you need to provide the path to the S3 file that lists your users and sub groups for a group. Your sub groups can contain more than 1000 users, but the list of sub groups that belong to a group (and/or users) must be no more than 1000.
A list of sub groups that belong to a group. For example, the sub groups "Research", "Engineering", and "Sales and Marketing" all belong to the group "Company".
The sub groups that belong to a group.
The identifier of the sub group you want to map to a group.
The identifier of the data source for the sub group you want to map to a group.
A list of users that belong to a group. For example, a list of interns all belong to the "Interns" group.
The users that belong to a group.
The identifier of the user you want to map to a group.
If you have more than 1000 users and/or sub groups for a single group, you need to provide the path to the S3 file that lists your users and sub groups for a group. Your sub groups can contain more than 1000 users, but the list of sub groups that belong to a group (and/or users) must be no more than 1000.
You can download this example S3 file that uses the correct format for listing group members. Note, dataSourceId is optional. The value of type for a group is always GROUP and for a user it is always USER.
The name of the S3 bucket that contains the file.
The name of the file.
The timestamp identifier you specify to ensure Amazon Kendra does not override the latest PUT action with previous actions. The highest number ID, which is the ordering ID, is the latest action you want to process and apply on top of other actions with lower number IDs. This prevents previous actions with lower number IDs from possibly overriding the latest action.
The ordering ID can be the UNIX time of the last update you made to a group members list. You would then provide this list when calling PutPrincipalMapping. This ensures your PUT action for that updated group with the latest members list doesn't get overwritten by earlier PUT actions for the same group which are yet to be processed.
The default ordering ID is the current UNIX time in milliseconds that the action was received by Amazon Kendra.
The Amazon Resource Name (ARN) of a role that has access to the S3 file that contains your list of users or sub groups that belong to a group.
For more information, see IAM roles for Amazon Kendra.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.InternalServerException
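The parameter descriptions above imply two checks before calling put_principal_mapping: an inline member list is limited to 1000 users and sub groups, and an OrderingId can be derived from the time the member list was last updated. The following is a sketch under stated assumptions. Both helper names are hypothetical, and the OrderingId is expressed in milliseconds to match the default that Amazon Kendra applies.

```python
from datetime import datetime, timezone

MAX_INLINE_MEMBERS = 1000  # limit stated in the GroupMembers description


def ordering_id_from(last_updated):
    """Convert a timezone-aware datetime to UNIX time in milliseconds."""
    return int(last_updated.timestamp() * 1000)


def build_principal_mapping_kwargs(index_id, group_id, user_ids=(), sub_group_ids=(),
                                   ordering_id=None):
    """Build keyword arguments for client.put_principal_mapping(**kwargs)."""
    if len(user_ids) + len(sub_group_ids) > MAX_INLINE_MEMBERS:
        raise ValueError("more than 1000 members: use S3PathforGroupMembers instead")
    members = {}
    if user_ids:
        members["MemberUsers"] = [{"UserId": user_id} for user_id in user_ids]
    if sub_group_ids:
        members["MemberGroups"] = [{"GroupId": group} for group in sub_group_ids]
    kwargs = {"IndexId": index_id, "GroupId": group_id, "GroupMembers": members}
    if ordering_id is not None:
        # When omitted, Amazon Kendra applies its own default ordering ID.
        kwargs["OrderingId"] = ordering_id
    return kwargs


updated = datetime(2024, 1, 1, tzinfo=timezone.utc)
mapping_kwargs = build_principal_mapping_kwargs(
    "example-index-id",
    "Company",
    user_ids=["CEO"],
    sub_group_ids=["Research", "Engineering"],
    ordering_id=ordering_id_from(updated),
)
print(mapping_kwargs)
```

With a real client, the result would be passed as client.put_principal_mapping(**mapping_kwargs).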
query
(**kwargs)¶Searches an active index. Use this API to search your documents using a query. The Query API enables you to do faceted search and to filter results based on document attributes.
It also enables you to provide user context that Amazon Kendra uses to enforce document access control in the search results.
Amazon Kendra searches your index for text content and question and answer (FAQ) content. By default, the response contains three types of results (ANSWER, QUESTION_ANSWER, and DOCUMENT).
You can specify that the query return only one type of result using the QueryResultTypeFilter parameter.
Each query returns the 100 most relevant results.
See also: AWS API Documentation
Request Syntax
response = client.query(
IndexId='string',
QueryText='string',
AttributeFilter={
'AndAllFilters': [
{'... recursive ...'},
],
'OrAllFilters': [
{'... recursive ...'},
],
'NotFilter': {'... recursive ...'},
'EqualsTo': {
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'ContainsAll': {
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'ContainsAny': {
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'GreaterThan': {
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'GreaterThanOrEquals': {
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LessThan': {
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LessThanOrEquals': {
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
}
},
Facets=[
{
'DocumentAttributeKey': 'string',
'Facets': {'... recursive ...'},
'MaxResults': 123
},
],
RequestedDocumentAttributes=[
'string',
],
QueryResultTypeFilter='DOCUMENT'|'QUESTION_ANSWER'|'ANSWER',
DocumentRelevanceOverrideConfigurations=[
{
'Name': 'string',
'Relevance': {
'Freshness': True|False,
'Importance': 123,
'Duration': 'string',
'RankOrder': 'ASCENDING'|'DESCENDING',
'ValueImportanceMap': {
'string': 123
}
}
},
],
PageNumber=123,
PageSize=123,
SortingConfiguration={
'DocumentAttributeKey': 'string',
'SortOrder': 'DESC'|'ASC'
},
UserContext={
'Token': 'string',
'UserId': 'string',
'Groups': [
'string',
],
'DataSourceGroups': [
{
'GroupId': 'string',
'DataSourceId': 'string'
},
]
},
VisitorId='string',
SpellCorrectionConfiguration={
'IncludeQuerySpellCheckSuggestions': True|False
}
)
[REQUIRED]
The identifier of the index to search. The identifier is returned in the response from the CreateIndex
API.
Enables filtered searches based on document attributes. You can only provide one attribute filter; however, the AndAllFilters, NotFilter, and OrAllFilters parameters contain a list of other filters.
The AttributeFilter parameter enables you to create a set of filtering rules that a document must satisfy to be included in the query results.
Performs a logical AND
operation on all supplied filters.
Provides filtering the query results based on document attributes or metadata fields.
When you use the AndAllFilters or OrAllFilters filters, you can use 2 layers under the first attribute filter. For example, you can use:
<AndAllFilters>
<OrAllFilters>
<EqualsTo>
If you use more than 2 layers, you receive a ValidationException with the message "AttributeFilter cannot have a depth of more than 2."
If you use more than 10 attribute filters in a given list for AndAllFilters or OrAllFilters, you receive a ValidationException with the message "AttributeFilter cannot have a length of more than 10".
Performs a logical OR
operation on all supplied filters.
Provides filtering the query results based on document attributes or metadata fields.
When you use the AndAllFilters or OrAllFilters filters, you can use 2 layers under the first attribute filter. For example, you can use:
<AndAllFilters>
<OrAllFilters>
<EqualsTo>
If you use more than 2 layers, you receive a ValidationException with the message "AttributeFilter cannot have a depth of more than 2."
If you use more than 10 attribute filters in a given list for AndAllFilters or OrAllFilters, you receive a ValidationException with the message "AttributeFilter cannot have a length of more than 10".
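The depth and length limits described for AndAllFilters and OrAllFilters can be checked locally before a request is sent. The following is a minimal sketch, not part of boto3: the helper names (filter_depth, check_attribute_filter) are hypothetical, and the way layers are counted (a leaf such as EqualsTo is depth 0, and NotFilter is assumed to add one layer) is an interpretation of the text above. The service's own ValidationException remains authoritative.

```python
# Hypothetical local pre-check; the service remains the source of truth.
COMBINERS = ("AndAllFilters", "OrAllFilters")
MAX_DEPTH = 2    # from the "depth of more than 2" message
MAX_LENGTH = 10  # from the "length of more than 10" message


def filter_depth(attribute_filter):
    """Return the nesting depth; a leaf such as EqualsTo has depth 0."""
    depth = 0
    for key in COMBINERS:
        children = attribute_filter.get(key, [])
        if len(children) > MAX_LENGTH:
            raise ValueError(f"{key} cannot have a length of more than {MAX_LENGTH}")
        for child in children:
            depth = max(depth, 1 + filter_depth(child))
    if "NotFilter" in attribute_filter:
        # Assumption: NotFilter counts as one layer, like the list combiners.
        depth = max(depth, 1 + filter_depth(attribute_filter["NotFilter"]))
    return depth


def check_attribute_filter(attribute_filter):
    """Return the filter unchanged, or raise ValueError if it looks too deep."""
    if filter_depth(attribute_filter) > MAX_DEPTH:
        raise ValueError(f"AttributeFilter cannot have a depth of more than {MAX_DEPTH}")
    return attribute_filter


# The <AndAllFilters><OrAllFilters><EqualsTo> shape from the text above.
two_layers = {
    "AndAllFilters": [
        {
            "OrAllFilters": [
                {"EqualsTo": {"Key": "department", "Value": {"StringValue": "HR"}}},
            ]
        }
    ]
}
check_attribute_filter(two_layers)
```

A filter that passes this check can be supplied as the AttributeFilter argument of client.query().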
Performs a logical NOT
operation on all supplied filters.
Performs an equals operation on two document attributes or metadata fields.
The identifier for the attribute.
The value of the attribute.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Returns true when a document contains all of the specified document attributes or metadata fields. This filter is only applicable to StringListValue
metadata.
The identifier for the attribute.
The value of the attribute.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Returns true when a document contains any of the specified document attributes or metadata fields. This filter is only applicable to StringListValue
metadata.
The identifier for the attribute.
The value of the attribute.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Performs a greater than operation on two document attributes or metadata fields. Use with a document attribute of type Date
or Long
.
The identifier for the attribute.
The value of the attribute.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Performs a greater than or equals operation on two document attributes or metadata fields. Use with a document attribute of type Date or Long.
The identifier for the attribute.
The value of the attribute.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Performs a less than operation on two document attributes or metadata fields. Use with a document attribute of type Date
or Long
.
The identifier for the attribute.
The value of the attribute.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Performs a less than or equals operation on two document attributes or metadata fields. Use with a document attribute of type Date
or Long
.
The identifier for the attribute.
The value of the attribute.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
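The comparison filters above combine naturally into a date range. The sketch below builds such a range with GreaterThanOrEquals and LessThan and insists on timezone-aware datetime values, in line with the note about including the time zone. The helper name date_range_filter is hypothetical, and "_created_at" is only assumed to be a Date attribute in the target index.

```python
from datetime import datetime, timezone


def date_range_filter(key, start, end):
    """Match documents whose Date attribute lies in the range [start, end)."""
    for value in (start, end):
        if value.tzinfo is None:
            raise ValueError("use timezone-aware datetimes")
    return {
        "AndAllFilters": [
            {"GreaterThanOrEquals": {"Key": key, "Value": {"DateValue": start}}},
            {"LessThan": {"Key": key, "Value": {"DateValue": end}}},
        ]
    }


# "_created_at" is assumed here to be a Date attribute in the index.
attribute_filter = date_range_filter(
    "_created_at",
    datetime(2012, 3, 1, tzinfo=timezone.utc),
    datetime(2012, 4, 1, tzinfo=timezone.utc),
)
```

The resulting dictionary is intended to be passed as AttributeFilter in a query() call.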
An array of document attributes. Amazon Kendra returns a count for each attribute key specified. This helps your users narrow their search.
Information about a document attribute. You can use document attributes as facets.
For example, the document attribute or facet "Department" includes the values "HR", "Engineering", and "Accounting". You can display these values in the search results so that documents can be searched by department.
You can display up to 10 facet values per facet for a query. If you want to increase this limit, contact Support.
The unique key for the document attribute.
An array of document attributes that are nested facets within a facet.
For example, the document attribute or facet "Department" includes a value called "Engineering". In addition, the document attribute or facet "SubDepartment" includes the values "Frontend" and "Backend" for documents assigned to "Engineering". You can display nested facets in the search results so that documents can be searched not only by department but also by a sub department within a department. This helps your users further narrow their search.
You can only have one nested facet within a facet. If you want to increase this limit, contact Support.
Maximum number of facet values per facet. The default is 10. You can use this to limit the number of facet values to less than 10. If you want to increase the default, contact Support.
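As an illustration of the Facets parameter, the sketch below builds a facet request with one nested facet, mirroring the "Department" and "SubDepartment" example above. The helper name facet is hypothetical, and both attribute keys are assumed to exist as facetable fields in the index.

```python
def facet(key, nested_key=None, max_results=None):
    """Build one entry for the Facets parameter of query()."""
    spec = {"DocumentAttributeKey": key}
    if nested_key is not None:
        # Only one nested facet per facet, per the description above.
        spec["Facets"] = [{"DocumentAttributeKey": nested_key}]
    if max_results is not None:
        spec["MaxResults"] = max_results
    return spec


# Assumed attribute names, taken from the example in the text.
facets = [facet("Department", nested_key="SubDepartment", max_results=5)]
```

The list can then be passed as Facets=facets alongside IndexId and QueryText.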
An array of document attributes to include in the response. You can limit the response to include certain document attributes. By default all document attributes are included in the response.
Overrides relevance tuning configurations of fields or attributes set at the index level.
If you use this API to override the relevance tuning configured at the index level, but there is no relevance tuning configured at the index level, then Amazon Kendra does not apply any relevance tuning.
If there is relevance tuning configured at the index level, but you do not use this API to override any relevance tuning in the index, then Amazon Kendra uses the relevance tuning that is configured at the index level.
If there is relevance tuning configured for fields at the index level, but you use this API to override only some of these fields, then for the fields you did not override, the importance is set to 1.
Overrides the document relevance properties of a custom index field.
The name of the index field.
Provides information for tuning the relevance of a field in a search. When a query includes terms that match the field, the results are given a boost in the response based on these tuning parameters.
Indicates that this field determines how "fresh" a document is. For example, if document 1 was created on November 5, and document 2 was created on October 31, document 1 is "fresher" than document 2. You can only set the Freshness
field on one DATE
type field. Only applies to DATE
fields.
The relative importance of the field in the search. Larger numbers provide more of a boost than smaller numbers.
Specifies the time period that the boost applies to. For example, to make the boost apply to documents with the field value within the last month, you would use "2628000s". Once the field value is beyond the specified range, the effect of the boost drops off. The higher the importance, the faster the effect drops off. If you don't specify a value, the default is 3 months. The value of the field is a numeric string followed by the character "s", for example "86400s" for one day, or "604800s" for one week.
Only applies to DATE
fields.
Determines how values should be interpreted.
When the RankOrder
field is ASCENDING
, higher numbers are better. For example, a document with a rating score of 10 is higher ranking than a document with a rating score of 1.
When the RankOrder
field is DESCENDING
, lower numbers are better. For example, in a task tracking application, a priority 1 task is more important than a priority 5 task.
Only applies to LONG
and DOUBLE
fields.
A list of values that should be given a different boost when they appear in the result list. For example, if you are boosting a field called "department," query terms that match the department field are boosted in the result. However, you can add entries from the department field to boost documents with those values higher.
For example, you can add entries to the map with names of departments. If you add "HR",5 and "Legal",3 those departments are given special attention when they appear in the metadata of a document. When those terms appear they are given the specified importance instead of the regular importance for the boost.
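The Duration format and the ValueImportanceMap described above can be assembled with a few small helpers. This is a sketch under stated assumptions: the helper names are hypothetical, "_last_updated_at" is assumed to be a DATE field in the index, and the importance numbers are arbitrary illustration values.

```python
from datetime import timedelta


def duration_string(period):
    """Format a timedelta as the '<seconds>s' string used by Duration."""
    seconds = int(period.total_seconds())
    if seconds <= 0:
        raise ValueError("period must be positive")
    return f"{seconds}s"


def freshness_override(field_name, period, importance):
    """Build an override that boosts recent values of a DATE field."""
    return {
        "Name": field_name,
        "Relevance": {
            "Freshness": True,
            "Importance": importance,
            "Duration": duration_string(period),
        },
    }


def value_boost_override(field_name, importance, value_importance):
    """Build an override that gives specific field values their own importance."""
    return {
        "Name": field_name,
        "Relevance": {
            "Importance": importance,
            "ValueImportanceMap": dict(value_importance),
        },
    }


overrides = [
    # Assumed field name; boost documents updated within the last week.
    freshness_override("_last_updated_at", timedelta(weeks=1), 8),
    # The "HR",5 and "Legal",3 example from the text above.
    value_boost_override("department", 2, {"HR": 5, "Legal": 3}),
]
```

The list is meant to be supplied as DocumentRelevanceOverrideConfigurations in a query() call.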
PageSize
parameter. By default, Amazon Kendra returns the first page of results. Use this parameter to get result pages after the first one.
Provides information that determines how the results of the query are sorted. You can set the field that Amazon Kendra should sort the results on, and specify whether the results should be sorted in ascending or descending order. In the case of ties in sorting the results, the results are sorted by relevance.
If you don't provide sorting configuration, the results are sorted by the relevance that Amazon Kendra determines for the result.
The name of the document attribute used to sort the response. You can use any field that has the Sortable
flag set to true.
You can also sort by any of the following built-in attributes:
The order that the results should be returned in. In case of ties, the relevance assigned to the result by Amazon Kendra is used as the tie-breaker.
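Paging and sorting are often used together. The sketch below walks result pages with PageNumber and PageSize and stops at the 100-result cap stated for this operation. The generator name iter_query_results is hypothetical, page numbering is assumed to start at 1, and client is any object exposing query(), such as the boto3 Kendra client.

```python
def iter_query_results(client, index_id, query_text, page_size=10, limit=100, **extra):
    """Yield result items page by page, stopping after `limit` items."""
    fetched = 0
    page = 1  # assumption: page numbers start at 1
    while fetched < limit:
        response = client.query(
            IndexId=index_id,
            QueryText=query_text,
            PageNumber=page,
            PageSize=page_size,
            **extra,
        )
        items = response.get("ResultItems", [])
        if not items:
            return
        for item in items:
            yield item
            fetched += 1
            if fetched >= limit:
                return
        if fetched >= response.get("TotalNumberOfResults", 0):
            return
        page += 1
```

Extra keyword arguments are forwarded unchanged, so a sort order can be requested with, for example, SortingConfiguration={'DocumentAttributeKey': '_created_at', 'SortOrder': 'DESC'}, assuming that attribute is sortable in the index. Choosing a page size that divides the cap evenly avoids requesting a final page that would cross it.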
The user context token or user and group information.
The user context token for filtering search results for a user. It must be a JWT or a JSON token.
The identifier of the user you want to filter search results based on their access to documents.
The list of groups you want to filter search results based on the groups' access to documents.
The list of data source groups you want to filter search results based on groups' access to documents in that data source.
Data source information for user context filtering.
The identifier of the group you want to add to your list of groups. This is for filtering search results based on the groups' access to documents.
The identifier of the data source group you want to add to your list of data source groups. This is for filtering search results based on the groups' access to documents in that data source.
VisitorId should be a unique identifier, such as a GUID. Don't use personally identifiable information, such as the user's email address, as the VisitorId.
Enables suggested spell corrections for queries.
TRUE
to suggest spell corrections for queries.
dict
Response Syntax
{
'QueryId': 'string',
'ResultItems': [
{
'Id': 'string',
'Type': 'DOCUMENT'|'QUESTION_ANSWER'|'ANSWER',
'Format': 'TABLE'|'TEXT',
'AdditionalAttributes': [
{
'Key': 'string',
'ValueType': 'TEXT_WITH_HIGHLIGHTS_VALUE',
'Value': {
'TextWithHighlightsValue': {
'Text': 'string',
'Highlights': [
{
'BeginOffset': 123,
'EndOffset': 123,
'TopAnswer': True|False,
'Type': 'STANDARD'|'THESAURUS_SYNONYM'
},
]
}
}
},
],
'DocumentId': 'string',
'DocumentTitle': {
'Text': 'string',
'Highlights': [
{
'BeginOffset': 123,
'EndOffset': 123,
'TopAnswer': True|False,
'Type': 'STANDARD'|'THESAURUS_SYNONYM'
},
]
},
'DocumentExcerpt': {
'Text': 'string',
'Highlights': [
{
'BeginOffset': 123,
'EndOffset': 123,
'TopAnswer': True|False,
'Type': 'STANDARD'|'THESAURUS_SYNONYM'
},
]
},
'DocumentURI': 'string',
'DocumentAttributes': [
{
'Key': 'string',
'Value': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
],
'ScoreAttributes': {
'ScoreConfidence': 'VERY_HIGH'|'HIGH'|'MEDIUM'|'LOW'|'NOT_AVAILABLE'
},
'FeedbackToken': 'string',
'TableExcerpt': {
'Rows': [
{
'Cells': [
{
'Value': 'string',
'TopAnswer': True|False,
'Highlighted': True|False,
'Header': True|False
},
]
},
],
'TotalNumberOfRows': 123
}
},
],
'FacetResults': [
{
'DocumentAttributeKey': 'string',
'DocumentAttributeValueType': 'STRING_VALUE'|'STRING_LIST_VALUE'|'LONG_VALUE'|'DATE_VALUE',
'DocumentAttributeValueCountPairs': [
{
'DocumentAttributeValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
},
'Count': 123,
'FacetResults': {'... recursive ...'}
},
]
},
],
'TotalNumberOfResults': 123,
'Warnings': [
{
'Message': 'string',
'Code': 'QUERY_LANGUAGE_INVALID_SYNTAX'
},
],
'SpellCorrectedQueries': [
{
'SuggestedQueryText': 'string',
'Corrections': [
{
'BeginOffset': 123,
'EndOffset': 123,
'Term': 'string',
'CorrectedTerm': 'string'
},
]
},
]
}
Response Structure
(dict) --
QueryId (string) --
The identifier for the search. You use QueryId
to identify the search when using the feedback API.
ResultItems (list) --
The results of the search.
(dict) --
A single query result.
A query result contains information about a document returned by the query. This includes the original location of the document, a list of attributes assigned to the document, and relevant text from the document that satisfies the query.
Id (string) --
The identifier for the query result.
Type (string) --
The type of document within the response. For example, a response could include a question-answer that's relevant to the query.
Format (string) --
If the Type
of document within the response is ANSWER
, then it is either a TABLE
answer or TEXT
answer. If it's a table answer, a table excerpt is returned in TableExcerpt
. If it's a text answer, a text excerpt is returned in DocumentExcerpt
.
AdditionalAttributes (list) --
One or more additional attributes associated with the query result.
(dict) --
An attribute returned from an index query.
Key (string) --
The key that identifies the attribute.
ValueType (string) --
The data type of the Value
property.
Value (dict) --
An object that contains the attribute value.
TextWithHighlightsValue (dict) --
The text associated with the attribute and information about the highlight to apply to the text.
Text (string) --
The text to display to the user.
Highlights (list) --
The beginning and end of the text that should be highlighted.
(dict) --
Provides information that you can use to highlight a search result so that your users can quickly identify terms in the response.
BeginOffset (integer) --
The zero-based location in the response string where the highlight starts.
EndOffset (integer) --
The zero-based location in the response string where the highlight ends.
TopAnswer (boolean) --
Indicates whether the response is the best response. True if this is the best response; otherwise, false.
Type (string) --
The highlight type.
DocumentId (string) --
The identifier for the document.
DocumentTitle (dict) --
The title of the document. Contains the text of the title and information for highlighting the relevant terms in the title.
Text (string) --
The text to display to the user.
Highlights (list) --
The beginning and end of the text that should be highlighted.
(dict) --
Provides information that you can use to highlight a search result so that your users can quickly identify terms in the response.
BeginOffset (integer) --
The zero-based location in the response string where the highlight starts.
EndOffset (integer) --
The zero-based location in the response string where the highlight ends.
TopAnswer (boolean) --
Indicates whether the response is the best response. True if this is the best response; otherwise, false.
Type (string) --
The highlight type.
DocumentExcerpt (dict) --
An extract of the text in the document. Contains information about highlighting the relevant terms in the excerpt.
Text (string) --
The text to display to the user.
Highlights (list) --
The beginning and end of the text that should be highlighted.
(dict) --
Provides information that you can use to highlight a search result so that your users can quickly identify terms in the response.
BeginOffset (integer) --
The zero-based location in the response string where the highlight starts.
EndOffset (integer) --
The zero-based location in the response string where the highlight ends.
TopAnswer (boolean) --
Indicates whether the response is the best response. True if this is the best response; otherwise, false.
Type (string) --
The highlight type.
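The BeginOffset and EndOffset values can be used to wrap highlighted terms in markup before display. The sketch below is one way to do that; the function name mark_highlights and the tag defaults are hypothetical. It assumes that EndOffset is exclusive (so text[begin:end] is the highlighted term) and that highlights do not overlap.

```python
def mark_highlights(text_with_highlights, open_tag="<b>", close_tag="</b>"):
    """Return the Text with each highlighted span wrapped in tags."""
    text = text_with_highlights["Text"]
    # Work from the end of the string so earlier offsets stay valid.
    spans = sorted(
        text_with_highlights.get("Highlights", []),
        key=lambda span: span["BeginOffset"],
        reverse=True,
    )
    for span in spans:
        begin, end = span["BeginOffset"], span["EndOffset"]
        # Assumption: EndOffset is exclusive.
        text = text[:begin] + open_tag + text[begin:end] + close_tag + text[end:]
    return text


excerpt = {
    "Text": "Amazon Kendra indexes documents",
    "Highlights": [
        {"BeginOffset": 7, "EndOffset": 13, "TopAnswer": False, "Type": "STANDARD"},
        {"BeginOffset": 22, "EndOffset": 31, "TopAnswer": False, "Type": "STANDARD"},
    ],
}
marked = mark_highlights(excerpt)
```

The same function applies to DocumentTitle, DocumentExcerpt, and TextWithHighlightsValue, since all three share the Text and Highlights shape.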
DocumentURI (string) --
The URI of the original location of the document.
DocumentAttributes (list) --
An array of document attributes assigned to a document in the search results. For example, the document author (_author) or the source URI (_source_uri) of the document.
(dict) --
A document attribute or metadata field. To create custom document attributes, see Custom attributes.
Key (string) --
The identifier for the attribute.
Value (dict) --
The value of the attribute.
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
ScoreAttributes (dict) --
Indicates the confidence that Amazon Kendra has that a result matches the query that you provided. Each result is placed into a bin that indicates the confidence: VERY_HIGH, HIGH, MEDIUM, or LOW. You can use the score to determine if a response meets the confidence needed for your application.
The field is only set to LOW when the Type field is set to DOCUMENT and Amazon Kendra is not confident that the result matches the query.
ScoreConfidence (string) --
A relative ranking for how well the response matches the query.
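An application that wants only sufficiently confident results can filter on ScoreConfidence. The sketch below does this with a simple ordering of the enum values; the function name at_least is hypothetical, and treating NOT_AVAILABLE as the lowest level is an assumption rather than documented behavior.

```python
# Assumption: NOT_AVAILABLE is ranked below LOW for filtering purposes.
CONFIDENCE_ORDER = ["NOT_AVAILABLE", "LOW", "MEDIUM", "HIGH", "VERY_HIGH"]


def at_least(result_items, minimum="HIGH"):
    """Keep only the result items whose confidence is `minimum` or better."""
    floor = CONFIDENCE_ORDER.index(minimum)
    kept = []
    for item in result_items:
        level = item.get("ScoreAttributes", {}).get("ScoreConfidence", "NOT_AVAILABLE")
        if CONFIDENCE_ORDER.index(level) >= floor:
            kept.append(item)
    return kept
```

Typical use would be at_least(response['ResultItems'], 'MEDIUM') on a query() response.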
FeedbackToken (string) --
A token that identifies a particular result from a particular query. Use this token to provide click-through feedback for the result. For more information, see Submitting feedback.
TableExcerpt (dict) --
An excerpt from a table within a document.
Rows (list) --
A list of rows in the table excerpt.
(dict) --
Information about a row in a table excerpt.
Cells (list) --
A list of table cells in a row.
(dict) --
Provides information about a table cell in a table excerpt.
Value (string) --
The actual value or content within a table cell. A table cell could contain a date value of a year, or a string value of text, for example.
TopAnswer (boolean) --
TRUE
if the response of the table cell is the top answer. This is the cell value or content with the highest confidence score or is the most relevant to the query.
Highlighted (boolean) --
TRUE
means that the table cell has a high enough confidence and is relevant to the query, so the value or content should be highlighted.
Header (boolean) --
TRUE
means that the table cell should be treated as a header.
TotalNumberOfRows (integer) --
A count of the number of rows in the original table within the document.
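A TableExcerpt can be turned into plain rows for rendering, and the top-answer cell located, with a short helper. The sketch below uses hypothetical function names and a made-up sample excerpt; it relies only on the Rows, Cells, Value, and TopAnswer fields described above.

```python
def table_rows(table_excerpt):
    """Return the excerpt as a list of rows, each a list of cell values."""
    return [
        [cell.get("Value", "") for cell in row.get("Cells", [])]
        for row in table_excerpt.get("Rows", [])
    ]


def top_answer_cell(table_excerpt):
    """Return (row_index, column_index, value) of the first TopAnswer cell, or None."""
    for r, row in enumerate(table_excerpt.get("Rows", [])):
        for c, cell in enumerate(row.get("Cells", [])):
            if cell.get("TopAnswer"):
                return r, c, cell.get("Value")
    return None


# Made-up sample shaped like the Response Syntax above.
sample_excerpt = {
    "Rows": [
        {"Cells": [{"Value": "Year", "Header": True}, {"Value": "Revenue", "Header": True}]},
        {"Cells": [{"Value": "2021"}, {"Value": "42", "TopAnswer": True, "Highlighted": True}]},
    ],
    "TotalNumberOfRows": 12,
}
rows = table_rows(sample_excerpt)
best = top_answer_cell(sample_excerpt)
```

Because TotalNumberOfRows counts rows in the original table, it can exceed the number of rows in the excerpt, as in the sample.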
FacetResults (list) --
Contains the facet results. A FacetResult
contains the counts for each attribute key that was specified in the Facets
input parameter.
(dict) --
The facet values for the documents in the response.
DocumentAttributeKey (string) --
The key for the facet values. This is the same as the DocumentAttributeKey
provided in the query.
DocumentAttributeValueType (string) --
The data type of the facet value. This is the same as the type defined for the index field when it was created.
DocumentAttributeValueCountPairs (list) --
An array of key/value pairs, where the key is the value of the attribute and the count is the number of documents that share the key value.
(dict) --
Provides the count of documents that match a particular attribute when doing a faceted search.
DocumentAttributeValue (dict) --
The value of the attribute. For example, "HR".
StringValue (string) --
A string, such as "department".
StringListValue (list) --
A list of strings. The default maximum length or number of strings is 10.
LongValue (integer) --
A long integer value.
DateValue (datetime) --
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
Count (integer) --
The number of documents in the response that have the attribute value for the key.
FacetResults (list) --
Contains the results of a document attribute that is a nested facet. A FacetResult
contains the counts for each facet nested within a facet.
For example, the document attribute or facet "Department" includes a value called "Engineering". In addition, the document attribute or facet "SubDepartment" includes the values "Frontend" and "Backend" for documents assigned to "Engineering". You can display nested facets in the search results so that documents can be searched not only by department but also by a sub department within a department. The counts for documents that belong to "Frontend" and "Backend" within "Engineering" are returned for a query.
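Nested FacetResults can be flattened into (path, count) rows for display. The sketch below is one possible traversal; the function name facet_counts is hypothetical, the sample data is made up, and only string-valued facets are handled (an assumption made to keep the example short).

```python
def facet_counts(facet_results, parents=()):
    """Flatten FacetResults into a list of (path, count) tuples.

    Each path is a tuple of (attribute_key, value) pairs from outermost to innermost.
    """
    rows = []
    for facet_result in facet_results:
        key = facet_result["DocumentAttributeKey"]
        for pair in facet_result.get("DocumentAttributeValueCountPairs", []):
            # Assumption: the facet value is a plain string.
            label = pair["DocumentAttributeValue"].get("StringValue")
            path = parents + ((key, label),)
            rows.append((path, pair["Count"]))
            rows.extend(facet_counts(pair.get("FacetResults", []), path))
    return rows


# Made-up sample following the Department / SubDepartment example above.
sample_facets = [
    {
        "DocumentAttributeKey": "Department",
        "DocumentAttributeValueCountPairs": [
            {
                "DocumentAttributeValue": {"StringValue": "Engineering"},
                "Count": 7,
                "FacetResults": [
                    {
                        "DocumentAttributeKey": "SubDepartment",
                        "DocumentAttributeValueCountPairs": [
                            {"DocumentAttributeValue": {"StringValue": "Frontend"}, "Count": 3},
                            {"DocumentAttributeValue": {"StringValue": "Backend"}, "Count": 4},
                        ],
                    }
                ],
            },
            {"DocumentAttributeValue": {"StringValue": "HR"}, "Count": 2},
        ],
    }
]
flat = facet_counts(sample_facets)
```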
TotalNumberOfResults (integer) --
The total number of items found by the search; however, you can only retrieve up to 100 items. For example, if the search found 192 items, you can only retrieve the first 100 of the items.
Warnings (list) --
A list of warning codes and their messages on problems with your query.
Amazon Kendra currently only supports one type of warning, which is a warning on invalid syntax used in the query. For examples of invalid query syntax, see Searching with advanced query syntax.
(dict) --
The warning code and message that explains a problem with a query.
Message (string) --
The message that explains the problem with the query.
Code (string) --
The code used to show the type of warning for the query.
SpellCorrectedQueries (list) --
A list of information related to suggested spell corrections for a query.
(dict) --
A query with suggested spell corrections.
SuggestedQueryText (string) --
The query with the suggested spell corrections.
Corrections (list) --
The corrected misspelled word or words in a query.
(dict) --
A corrected misspelled word in a query.
BeginOffset (integer) --
The zero-based location in the response string or text where the corrected word starts.
EndOffset (integer) --
The zero-based location in the response string or text where the corrected word ends.
Term (string) --
The string or text of a misspelled word in a query.
CorrectedTerm (string) --
The string or text of a corrected misspelled word in a query.
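When spell-check suggestions are enabled, the SpellCorrectedQueries list can be reduced to a "did you mean" suggestion. The sketch below uses hypothetical helper names and a made-up response fragment; it reads only the SuggestedQueryText, Term, and CorrectedTerm fields described above.

```python
def suggested_query(response):
    """Return the first suggested query text, or None if there is none."""
    suggestions = response.get("SpellCorrectedQueries", [])
    return suggestions[0]["SuggestedQueryText"] if suggestions else None


def corrections(response):
    """Return (misspelled_term, corrected_term) pairs across all suggestions."""
    return [
        (fix["Term"], fix["CorrectedTerm"])
        for suggestion in response.get("SpellCorrectedQueries", [])
        for fix in suggestion.get("Corrections", [])
    ]


# Made-up fragment shaped like the Response Syntax above.
sample_response = {
    "SpellCorrectedQueries": [
        {
            "SuggestedQueryText": "vacation policy",
            "Corrections": [
                {"BeginOffset": 0, "EndOffset": 8, "Term": "vacaton", "CorrectedTerm": "vacation"},
            ],
        }
    ]
}
did_you_mean = suggested_query(sample_response)
```

An application could show did_you_mean to the user and rerun query() with that text if the user accepts it.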
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.InternalServerException
start_data_source_sync_job
(**kwargs)¶Starts a synchronization job for a data source connector. If a synchronization job is already in progress, Amazon Kendra returns a ResourceInUseException
exception.
See also: AWS API Documentation
Request Syntax
response = client.start_data_source_sync_job(
Id='string',
IndexId='string'
)
[REQUIRED]
The identifier of the data source connector to synchronize.
[REQUIRED]
The identifier of the index used with the data source connector.
dict
Response Syntax
{
'ExecutionId': 'string'
}
Response Structure
(dict) --
ExecutionId (string) --
Identifies a particular synchronization job.
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ResourceInUseException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
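Since a second start request fails with ResourceInUseException while a job is running, callers may want to treat that case as "already syncing" rather than as an error. The sketch below shows one way to do so; the function name start_sync_if_idle is hypothetical, and client is expected to be a Kendra client (or any object with the same method and exceptions attribute).

```python
def start_sync_if_idle(client, data_source_id, index_id):
    """Start a sync job and return its ExecutionId, or None if one is already running."""
    try:
        response = client.start_data_source_sync_job(
            Id=data_source_id,
            IndexId=index_id,
        )
    except client.exceptions.ResourceInUseException:
        # A synchronization job is already in progress for this data source.
        return None
    return response["ExecutionId"]
```

Other exceptions in the list above are deliberately left to propagate so that the caller can decide how to handle them.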
stop_data_source_sync_job
(**kwargs)¶Stops a synchronization job that is currently running. You can't stop a scheduled synchronization job.
See also: AWS API Documentation
Request Syntax
response = client.stop_data_source_sync_job(
Id='string',
IndexId='string'
)
[REQUIRED]
The identifier of the data source connector for which to stop the synchronization jobs.
[REQUIRED]
The identifier of the index used with the data source connector.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
submit_feedback
(**kwargs)¶Enables you to provide feedback to Amazon Kendra to improve the performance of your index.
SubmitFeedback
is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.submit_feedback(
IndexId='string',
QueryId='string',
ClickFeedbackItems=[
{
'ResultId': 'string',
'ClickTime': datetime(2015, 1, 1)
},
],
RelevanceFeedbackItems=[
{
'ResultId': 'string',
'RelevanceValue': 'RELEVANT'|'NOT_RELEVANT'
},
]
)
[REQUIRED]
The identifier of the index that was queried.
[REQUIRED]
The identifier of the specific query for which you are submitting feedback. The query ID is returned in the response to the Query
API.
Tells Amazon Kendra that a particular search result link was chosen by the user.
Gathers information about when a particular result was clicked by a user. Your application uses the SubmitFeedback
API to provide click information.
The identifier of the search result that was clicked.
The Unix timestamp of the date and time that the result was clicked.
Tells Amazon Kendra whether a particular item was relevant or not relevant to the search.
Provides feedback on how relevant a document is to a search. Your application uses the SubmitFeedback
API to provide relevance information.
The identifier of the search result that the user provided relevance feedback for.
Whether the document was relevant or not relevant to the search.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceUnavailableException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
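The two feedback lists for submit_feedback can be built from application events with small helpers. This is a sketch with hypothetical function names; it assumes that ResultId corresponds to the Id of an item in the ResultItems list of the query() response, and it uses a timezone-aware datetime for ClickTime.

```python
from datetime import datetime, timezone


def click_feedback(result_ids, clicked_at=None):
    """Build ClickFeedbackItems for results the user clicked."""
    when = clicked_at or datetime.now(timezone.utc)
    return [{"ResultId": result_id, "ClickTime": when} for result_id in result_ids]


def relevance_feedback(judgements):
    """Build RelevanceFeedbackItems from a mapping of result id to True/False."""
    return [
        {
            "ResultId": result_id,
            "RelevanceValue": "RELEVANT" if relevant else "NOT_RELEVANT",
        }
        for result_id, relevant in judgements.items()
    ]
```

The lists would then be passed as ClickFeedbackItems and RelevanceFeedbackItems, together with the IndexId and the QueryId returned by query().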
tag_resource
(**kwargs)¶Adds the specified tag to the specified index, FAQ, or data source resource. If the tag already exists, the existing value is replaced with the new value.
See also: AWS API Documentation
Request Syntax
response = client.tag_resource(
ResourceARN='string',
Tags=[
{
'Key': 'string',
'Value': 'string'
},
]
)
[REQUIRED]
The Amazon Resource Name (ARN) of the index, FAQ, or data source to tag.
[REQUIRED]
A list of tags (key/value pairs) to add to the index, FAQ, or data source. If a tag already exists, the existing value is replaced with the new value.
A list of key/value pairs that identify an index, FAQ, or data source. Tag keys and values can consist of Unicode letters, digits, white space, and any of the following symbols: _ . : / = + - @.
The key for the tag. Keys are not case sensitive and must be unique for the index, FAQ, or data source.
The value associated with the tag. The value may be an empty string but it can't be null.
dict
Response Syntax
{}
Response Structure
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceUnavailableException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
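Tags are often kept as a plain dictionary in application code, while tag_resource expects a list of Key/Value pairs. The sketch below converts one to the other; the function name to_tag_list is hypothetical. It replaces None with an empty string (values can't be null) and rejects keys that differ only by case (keys are not case sensitive).

```python
def to_tag_list(tags):
    """Convert a {key: value} mapping into the Tags list used by tag_resource."""
    seen = set()
    result = []
    for key, value in tags.items():
        folded = key.lower()
        if folded in seen:
            raise ValueError(f"duplicate tag key (case-insensitive): {key}")
        seen.add(folded)
        # An empty string is allowed as a value; null is not.
        result.append({"Key": key, "Value": "" if value is None else str(value)})
    return result


tag_list = to_tag_list({"team": "search", "stage": None})
```

The result is intended for the Tags argument, next to the ResourceARN of the index, FAQ, or data source.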
untag_resource
(**kwargs)¶Removes a tag from an index, FAQ, or a data source.
See also: AWS API Documentation
Request Syntax
response = client.untag_resource(
ResourceARN='string',
TagKeys=[
'string',
]
)
[REQUIRED]
The Amazon Resource Name (ARN) of the index, FAQ, or data source to remove the tag from.
[REQUIRED]
A list of tag keys to remove from the index, FAQ, or data source. If a tag key does not exist on the resource, it is ignored.
dict
Response Syntax
{}
Response Structure
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceUnavailableException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
update_access_control_configuration
(**kwargs)¶Updates an access control configuration for your documents in an index. This includes user and group access information for your documents. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
You can update an access control configuration you created without indexing all of your documents again. For example, your index contains top-secret company documents that only certain employees or users should access. You created an 'allow' access control configuration for one user who recently joined the 'top-secret' team, switching from a team with 'deny' access to top-secret documents. However, the user suddenly returns to their previous team and should no longer have access to top secret documents. You can update the access control configuration to re-configure access control for your documents as circumstances change.
You call the BatchPutDocument API to apply the updated access control configuration, with the AccessControlConfigurationId included in the Document object. If you use an S3 bucket as a data source, you synchronize your data source to apply the AccessControlConfigurationId in the .metadata.json file. Amazon Kendra currently only supports access control configuration for S3 data sources and documents indexed using the BatchPutDocument API.
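The BatchPutDocument step described above might look like the following sketch. The helper names, document ID, text, and configuration ID are hypothetical; only the `batch_put_document` call and the `Document` keys shown come from the API.

```python
def build_document(document_id, text, access_control_configuration_id):
    """Build one Document entry that references an access control configuration.

    Hypothetical helper; the IDs passed in are placeholders.
    """
    return {
        "Id": document_id,
        "Blob": text.encode("utf-8"),
        "ContentType": "PLAIN_TEXT",
        "AccessControlConfigurationId": access_control_configuration_id,
    }


def apply_configuration(client, index_id, documents):
    """Re-submit documents so the updated configuration takes effect."""
    return client.batch_put_document(IndexId=index_id, Documents=list(documents))


document = build_document("doc-001", "Quarterly plan", "EXAMPLE-CONFIG-ID")
```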
See also: AWS API Documentation
Request Syntax
response = client.update_access_control_configuration(
IndexId='string',
Id='string',
Name='string',
Description='string',
AccessControlList=[
{
'Name': 'string',
'Type': 'USER'|'GROUP',
'Access': 'ALLOW'|'DENY',
'DataSourceId': 'string'
},
],
HierarchicalAccessControlList=[
{
'PrincipalList': [
{
'Name': 'string',
'Type': 'USER'|'GROUP',
'Access': 'ALLOW'|'DENY',
'DataSourceId': 'string'
},
]
},
]
)
[REQUIRED]
The identifier of the index for an access control configuration.
[REQUIRED]
The identifier of the access control configuration you want to update.
Information you want to update on principals (users and/or groups) and which documents they should have access to. This is useful for user context filtering, where search results are filtered based on the user or their group access to documents.
Provides user and group information for user context filtering.
The name of the user or group.
The type of principal.
Whether to allow or deny document access to the principal.
The identifier of the data source the principal should access documents from.
The updated list of principal lists that define the hierarchy for which documents users should have access to.
Information to define the hierarchy for which documents users should have access to.
A list of principal lists that define the hierarchy for which documents users should have access to. Each hierarchical list specifies which user or group has allow or deny access for each document.
Provides user and group information for user context filtering.
The name of the user or group.
The type of principal.
Whether to allow or deny document access to the principal.
The identifier of the data source the principal should access documents from.
dict
Response Syntax
{}
Response Structure
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.InternalServerException
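As an illustration of the request shape, the sketch below builds an `AccessControlList` that denies one user and passes it to `update_access_control_configuration`. The helper names, the user name, and the IDs are assumptions made for the example.

```python
def principal(name, principal_type, access, data_source_id=None):
    """Build one Principal entry, checking the two documented enums."""
    if principal_type not in ("USER", "GROUP"):
        raise ValueError("Type must be 'USER' or 'GROUP'")
    if access not in ("ALLOW", "DENY"):
        raise ValueError("Access must be 'ALLOW' or 'DENY'")
    entry = {"Name": name, "Type": principal_type, "Access": access}
    if data_source_id is not None:
        entry["DataSourceId"] = data_source_id
    return entry


def build_update_request(index_id, configuration_id, principals):
    """Assemble keyword arguments; IndexId and Id are the required parameters."""
    return {
        "IndexId": index_id,
        "Id": configuration_id,
        "AccessControlList": list(principals),
    }


def update_configuration(client, index_id, configuration_id, principals):
    """Send the update using an existing boto3 Kendra client."""
    return client.update_access_control_configuration(
        **build_update_request(index_id, configuration_id, principals)
    )


# Placeholder IDs and user name: the user who left the team is now denied.
request = build_update_request(
    "EXAMPLE-INDEX-ID",
    "EXAMPLE-CONFIG-ID",
    [principal("user@example.com", "USER", "DENY")],
)
```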
update_data_source
(**kwargs)¶Updates an existing Amazon Kendra data source connector.
See also: AWS API Documentation
Request Syntax
response = client.update_data_source(
Id='string',
Name='string',
IndexId='string',
Configuration={
'S3Configuration': {
'BucketName': 'string',
'InclusionPrefixes': [
'string',
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'DocumentsMetadataConfiguration': {
'S3Prefix': 'string'
},
'AccessControlListConfiguration': {
'KeyPath': 'string'
}
},
'SharePointConfiguration': {
'SharePointVersion': 'SHAREPOINT_2013'|'SHAREPOINT_2016'|'SHAREPOINT_ONLINE'|'SHAREPOINT_2019',
'Urls': [
'string',
],
'SecretArn': 'string',
'CrawlAttachments': True|False,
'UseChangeLog': True|False,
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'DocumentTitleFieldName': 'string',
'DisableLocalGroups': True|False,
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'AuthenticationType': 'HTTP_BASIC'|'OAUTH2',
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
}
},
'DatabaseConfiguration': {
'DatabaseEngineType': 'RDS_AURORA_MYSQL'|'RDS_AURORA_POSTGRESQL'|'RDS_MYSQL'|'RDS_POSTGRESQL',
'ConnectionConfiguration': {
'DatabaseHost': 'string',
'DatabasePort': 123,
'DatabaseName': 'string',
'TableName': 'string',
'SecretArn': 'string'
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'ColumnConfiguration': {
'DocumentIdColumnName': 'string',
'DocumentDataColumnName': 'string',
'DocumentTitleColumnName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ChangeDetectingColumns': [
'string',
]
},
'AclConfiguration': {
'AllowedGroupsColumnName': 'string'
},
'SqlConfiguration': {
'QueryIdentifiersEnclosingOption': 'DOUBLE_QUOTES'|'NONE'
}
},
'SalesforceConfiguration': {
'ServerUrl': 'string',
'SecretArn': 'string',
'StandardObjectConfigurations': [
{
'Name': 'ACCOUNT'|'CAMPAIGN'|'CASE'|'CONTACT'|'CONTRACT'|'DOCUMENT'|'GROUP'|'IDEA'|'LEAD'|'OPPORTUNITY'|'PARTNER'|'PRICEBOOK'|'PRODUCT'|'PROFILE'|'SOLUTION'|'TASK'|'USER',
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
],
'KnowledgeArticleConfiguration': {
'IncludedStates': [
'DRAFT'|'PUBLISHED'|'ARCHIVED',
],
'StandardKnowledgeArticleTypeConfiguration': {
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'CustomKnowledgeArticleTypeConfigurations': [
{
'Name': 'string',
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
]
},
'ChatterFeedConfiguration': {
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'IncludeFilterTypes': [
'ACTIVE_USER'|'STANDARD_USER',
]
},
'CrawlAttachments': True|False,
'StandardObjectAttachmentConfiguration': {
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
]
},
'OneDriveConfiguration': {
'TenantDomain': 'string',
'SecretArn': 'string',
'OneDriveUsers': {
'OneDriveUserList': [
'string',
],
'OneDriveUserS3Path': {
'Bucket': 'string',
'Key': 'string'
}
},
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'DisableLocalGroups': True|False
},
'ServiceNowConfiguration': {
'HostUrl': 'string',
'SecretArn': 'string',
'ServiceNowBuildVersion': 'LONDON'|'OTHERS',
'KnowledgeArticleConfiguration': {
'CrawlAttachments': True|False,
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
],
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'FilterQuery': 'string'
},
'ServiceCatalogConfiguration': {
'CrawlAttachments': True|False,
'IncludeAttachmentFilePatterns': [
'string',
],
'ExcludeAttachmentFilePatterns': [
'string',
],
'DocumentDataFieldName': 'string',
'DocumentTitleFieldName': 'string',
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AuthenticationType': 'HTTP_BASIC'|'OAUTH2'
},
'ConfluenceConfiguration': {
'ServerUrl': 'string',
'SecretArn': 'string',
'Version': 'CLOUD'|'SERVER',
'SpaceConfiguration': {
'CrawlPersonalSpaces': True|False,
'CrawlArchivedSpaces': True|False,
'IncludeSpaces': [
'string',
],
'ExcludeSpaces': [
'string',
],
'SpaceFieldMappings': [
{
'DataSourceFieldName': 'DISPLAY_URL'|'ITEM_TYPE'|'SPACE_KEY'|'URL',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'PageConfiguration': {
'PageFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'CONTENT_STATUS'|'CREATED_DATE'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'MODIFIED_DATE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'BlogConfiguration': {
'BlogFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'DISPLAY_URL'|'ITEM_TYPE'|'LABELS'|'PUBLISH_DATE'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AttachmentConfiguration': {
'CrawlAttachments': True|False,
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'AUTHOR'|'CONTENT_TYPE'|'CREATED_DATE'|'DISPLAY_URL'|'FILE_SIZE'|'ITEM_TYPE'|'PARENT_ID'|'SPACE_KEY'|'SPACE_NAME'|'URL'|'VERSION',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
'AuthenticationType': 'HTTP_BASIC'|'PAT'
},
'GoogleDriveConfiguration': {
'SecretArn': 'string',
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ExcludeMimeTypes': [
'string',
],
'ExcludeUserAccounts': [
'string',
],
'ExcludeSharedDrives': [
'string',
]
},
'WebCrawlerConfiguration': {
'Urls': {
'SeedUrlConfiguration': {
'SeedUrls': [
'string',
],
'WebCrawlerMode': 'HOST_ONLY'|'SUBDOMAINS'|'EVERYTHING'
},
'SiteMapsConfiguration': {
'SiteMaps': [
'string',
]
}
},
'CrawlDepth': 123,
'MaxLinksPerPage': 123,
'MaxContentSizePerPageInMegaBytes': ...,
'MaxUrlsPerMinuteCrawlRate': 123,
'UrlInclusionPatterns': [
'string',
],
'UrlExclusionPatterns': [
'string',
],
'ProxyConfiguration': {
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
'AuthenticationConfiguration': {
'BasicAuthentication': [
{
'Host': 'string',
'Port': 123,
'Credentials': 'string'
},
]
}
},
'WorkDocsConfiguration': {
'OrganizationId': 'string',
'CrawlComments': True|False,
'UseChangeLog': True|False,
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'FsxConfiguration': {
'FileSystemId': 'string',
'FileSystemType': 'WINDOWS',
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'SecretArn': 'string',
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'SlackConfiguration': {
'TeamId': 'string',
'SecretArn': 'string',
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'SlackEntityList': [
'PUBLIC_CHANNEL'|'PRIVATE_CHANNEL'|'GROUP_MESSAGE'|'DIRECT_MESSAGE',
],
'UseChangeLog': True|False,
'CrawlBotMessage': True|False,
'ExcludeArchived': True|False,
'SinceCrawlDate': 'string',
'LookBackPeriod': 123,
'PrivateChannelFilter': [
'string',
],
'PublicChannelFilter': [
'string',
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'FieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'BoxConfiguration': {
'EnterpriseId': 'string',
'SecretArn': 'string',
'UseChangeLog': True|False,
'CrawlComments': True|False,
'CrawlTasks': True|False,
'CrawlWebLinks': True|False,
'FileFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'TaskFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'CommentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WebLinkFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'QuipConfiguration': {
'Domain': 'string',
'SecretArn': 'string',
'CrawlFileComments': True|False,
'CrawlChatRooms': True|False,
'CrawlAttachments': True|False,
'FolderIds': [
'string',
],
'ThreadFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'MessageFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'JiraConfiguration': {
'JiraAccountUrl': 'string',
'SecretArn': 'string',
'UseChangeLog': True|False,
'Project': [
'string',
],
'IssueType': [
'string',
],
'Status': [
'string',
],
'IssueSubEntityFilter': [
'COMMENTS'|'ATTACHMENTS'|'WORKLOGS',
],
'AttachmentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'CommentFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'IssueFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'ProjectFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WorkLogFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'GitHubConfiguration': {
'SaaSConfiguration': {
'OrganizationName': 'string',
'HostUrl': 'string'
},
'OnPremiseConfiguration': {
'HostUrl': 'string',
'OrganizationName': 'string',
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
}
},
'Type': 'SAAS'|'ON_PREMISE',
'SecretArn': 'string',
'UseChangeLog': True|False,
'GitHubDocumentCrawlProperties': {
'CrawlRepositoryDocuments': True|False,
'CrawlIssue': True|False,
'CrawlIssueComment': True|False,
'CrawlIssueCommentAttachment': True|False,
'CrawlPullRequest': True|False,
'CrawlPullRequestComment': True|False,
'CrawlPullRequestCommentAttachment': True|False
},
'RepositoryFilter': [
'string',
],
'InclusionFolderNamePatterns': [
'string',
],
'InclusionFileTypePatterns': [
'string',
],
'InclusionFileNamePatterns': [
'string',
],
'ExclusionFolderNamePatterns': [
'string',
],
'ExclusionFileTypePatterns': [
'string',
],
'ExclusionFileNamePatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
'GitHubRepositoryConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubCommitConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueDocumentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueCommentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubIssueAttachmentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestCommentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestDocumentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'GitHubPullRequestDocumentAttachmentConfigurationFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
]
},
'AlfrescoConfiguration': {
'SiteUrl': 'string',
'SiteId': 'string',
'SecretArn': 'string',
'SslCertificateS3Path': {
'Bucket': 'string',
'Key': 'string'
},
'CrawlSystemFolders': True|False,
'CrawlComments': True|False,
'EntityFilter': [
'wiki'|'blog'|'documentLibrary',
],
'DocumentLibraryFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'BlogFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'WikiFieldMappings': [
{
'DataSourceFieldName': 'string',
'DateFieldFormat': 'string',
'IndexFieldName': 'string'
},
],
'InclusionPatterns': [
'string',
],
'ExclusionPatterns': [
'string',
],
'VpcConfiguration': {
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
}
},
'TemplateConfiguration': {
'Template': {...}|[...]|123|123.4|'string'|True|None
}
},
VpcConfiguration={
'SubnetIds': [
'string',
],
'SecurityGroupIds': [
'string',
]
},
Description='string',
Schedule='string',
RoleArn='string',
LanguageCode='string',
CustomDocumentEnrichmentConfiguration={
'InlineConfigurations': [
{
'Condition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'Target': {
'TargetDocumentAttributeKey': 'string',
'TargetDocumentAttributeValueDeletion': True|False,
'TargetDocumentAttributeValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'DocumentContentDeletion': True|False
},
],
'PreExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'PostExtractionHookConfiguration': {
'InvocationCondition': {
'ConditionDocumentAttributeKey': 'string',
'Operator': 'GreaterThan'|'GreaterThanOrEquals'|'LessThan'|'LessThanOrEquals'|'Equals'|'NotEquals'|'Contains'|'NotContains'|'Exists'|'NotExists'|'BeginsWith',
'ConditionOnValue': {
'StringValue': 'string',
'StringListValue': [
'string',
],
'LongValue': 123,
'DateValue': datetime(2015, 1, 1)
}
},
'LambdaArn': 'string',
'S3Bucket': 'string'
},
'RoleArn': 'string'
}
)
[REQUIRED]
The identifier of the data source connector you want to update.
[REQUIRED]
The identifier of the index used with the data source connector.
Configuration information you want to update for the data source connector.
Provides the configuration information to connect to an Amazon S3 bucket as your data source.
The name of the bucket that contains the documents.
A list of S3 prefixes for the documents that should be included in the index.
A list of glob patterns for documents that should be indexed. If a document that matches an inclusion pattern also matches an exclusion pattern, the document is not indexed.
Some examples are:
A list of glob patterns for documents that should not be indexed. If a document that matches an inclusion prefix or inclusion pattern also matches an exclusion pattern, the document is not indexed.
Some examples are:
Document metadata files that contain information such as the document access control information, source URI, document author, and custom attributes. Each metadata file contains metadata about a single document.
A prefix used to filter metadata configuration files in the Amazon Web Services S3 bucket. The S3 bucket might contain multiple metadata files. Use S3Prefix to include only the desired metadata files.
Provides the path to the S3 bucket that contains the user context filtering files for the data source. For the format of the file, see Access control for S3 data sources.
Path to the Amazon S3 bucket that contains the ACL files.
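A minimal sketch of an `S3Configuration` update, assuming an existing S3 data source. The bucket name, prefixes, pattern, and helper names are hypothetical; optional keys are simply omitted when not supplied.

```python
def build_s3_configuration(bucket_name, inclusion_prefixes=None,
                           exclusion_patterns=None, metadata_prefix=None,
                           acl_key_path=None):
    """Build the Configuration argument for an S3 data source."""
    s3 = {"BucketName": bucket_name}
    if inclusion_prefixes:
        s3["InclusionPrefixes"] = list(inclusion_prefixes)
    if exclusion_patterns:
        s3["ExclusionPatterns"] = list(exclusion_patterns)
    if metadata_prefix:
        s3["DocumentsMetadataConfiguration"] = {"S3Prefix": metadata_prefix}
    if acl_key_path:
        s3["AccessControlListConfiguration"] = {"KeyPath": acl_key_path}
    return {"S3Configuration": s3}


def update_s3_data_source(client, data_source_id, index_id, configuration):
    """Send the update using an existing boto3 Kendra client."""
    return client.update_data_source(
        Id=data_source_id, IndexId=index_id, Configuration=configuration
    )


# Placeholder bucket, prefixes, and glob pattern.
configuration = build_s3_configuration(
    "example-docs-bucket",
    inclusion_prefixes=["handbook/"],
    exclusion_patterns=["*.tmp"],
    metadata_prefix="metadata/",
)
```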
Provides the configuration information to connect to Microsoft SharePoint as your data source.
The version of Microsoft SharePoint that you use.
The Microsoft SharePoint site URLs for the documents you want to index.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password required to connect to the SharePoint instance. If you use SharePoint Server, you also need to provide the server domain name as part of the credentials. For more information, see Using a Microsoft SharePoint Data Source.
You can also provide OAuth authentication credentials of user name, password, client ID, and client secret. For more information, see Using a SharePoint data source.
TRUE to index document attachments.
TRUE to use the SharePoint change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in SharePoint.
A list of regular expression patterns to include certain documents in your SharePoint. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The regex applies to the display URL of the SharePoint document.
A list of regular expression patterns to exclude certain documents in your SharePoint. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The regex applies to the display URL of the SharePoint document.
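The precedence rule for inclusion and exclusion patterns can be sketched locally as below. This is an illustration only, not Amazon Kendra's matcher: the use of `re.search` (unanchored matching) and the choice to include every document when no inclusion patterns are given are both assumptions, and the URLs are made up.

```python
import re


def is_indexed(display_url, inclusion_patterns, exclusion_patterns):
    """Return True if a document would be indexed under the stated rule.

    Exclusion takes precedence over inclusion. Treating an empty inclusion
    list as "include everything" is an assumption of this sketch.
    """
    if any(re.search(p, display_url) for p in exclusion_patterns):
        return False
    if inclusion_patterns:
        return any(re.search(p, display_url) for p in inclusion_patterns)
    return True


include = [r"/sites/hr/"]
exclude = [r"\.tmp$"]

kept = is_indexed("https://example.sharepoint.com/sites/hr/policy.docx", include, exclude)
dropped = is_indexed("https://example.sharepoint.com/sites/hr/draft.tmp", include, exclude)
```

Here `draft.tmp` matches both lists, so the exclusion pattern wins and the document is not indexed.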
Configuration information for an Amazon Virtual Private Cloud to connect to your Microsoft SharePoint. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
A list of DataSourceToIndexFieldMapping objects that map SharePoint data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to SharePoint fields. For more information, see Mapping data source fields. The SharePoint data source field names must exist in your SharePoint custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
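Because index fields must exist before they are mapped, a typical sequence is to call `update_index` and then pass the mappings in the data source configuration. The sketch below assumes hypothetical field names (`Department`, `department`) and helper names; the `DocumentMetadataConfigurationUpdates` parameter belongs to `update_index`.

```python
def custom_field_update(name, value_type="STRING_VALUE"):
    """Build one entry for update_index(DocumentMetadataConfigurationUpdates=...)."""
    return {"Name": name, "Type": value_type}


def field_mapping(data_source_field, index_field, date_format=None):
    """Build one DataSourceToIndexFieldMapping entry."""
    mapping = {
        "DataSourceFieldName": data_source_field,
        "IndexFieldName": index_field,
    }
    if date_format is not None:
        mapping["DateFieldFormat"] = date_format
    return mapping


def create_field_then_map(client, index_id, data_source_field, index_field):
    """Create the index field first, then return the mapping that uses it."""
    client.update_index(
        Id=index_id,
        DocumentMetadataConfigurationUpdates=[custom_field_update(index_field)],
    )
    return field_mapping(data_source_field, index_field)


# Hypothetical field names for illustration.
mapping = field_mapping("Department", "department")
```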
The Microsoft SharePoint attribute field that contains the title of the document.
TRUE to disable local groups information.
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to SharePoint Server if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
The name of the S3 bucket that contains the file.
The name of the file.
Whether you want to connect to SharePoint using basic authentication of user name and password, or OAuth authentication of user name, password, client ID, and client secret. You can use OAuth authentication for SharePoint Online.
Configuration information to connect to your Microsoft SharePoint site URLs via a web proxy. You can use this option for SharePoint Server.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication of user name and password. To store web proxy credentials, you use a secret in Secrets Manager.
It is recommended that you follow best security practices when configuring your web proxy. This includes setting up throttling, setting up logging and monitoring, and applying security patches on a regular basis. If you use your web proxy with multiple data sources, sync jobs that occur at the same time could strain your proxy. It is recommended that you prepare your proxy beforehand for any security and load requirements.
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
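The host name and port in the example above can be derived from a URL with the standard library. This is a sketch: the helper name `proxy_configuration` and the default-port table are assumptions, and the secret ARN argument is optional, as described above.

```python
from urllib.parse import urlsplit

# Standard default ports, used when the URL does not name a port explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def proxy_configuration(url, credentials_secret_arn=None):
    """Build a ProxyConfiguration dict from a website URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS[parts.scheme]
    config = {"Host": parts.hostname, "Port": port}
    if credentials_secret_arn is not None:
        config["Credentials"] = credentials_secret_arn
    return config


proxy = proxy_configuration("https://a.example.com/page1.html")
# proxy == {"Host": "a.example.com", "Port": 443}
```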
Provides the configuration information to connect to a database as your data source.
The type of database engine that runs the database.
Configuration information that's required to connect to a database.
The name of the host for the database. Can be either a string (host.subdomain.domain.tld) or an IPv4 or IPv6 address.
The port that the database uses for connections.
The name of the database containing the document data.
The name of the table that contains the document data.
The Amazon Resource Name (ARN) of credentials stored in Secrets Manager. The credentials should be a user/password pair. For more information, see Using a Database Data Source. For more information about Secrets Manager, see What Is Secrets Manager in the Secrets Manager user guide.
Provides the configuration information to connect to an Amazon VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Information about where the index should get the document information from the database.
The column that provides the document's identifier.
The column that contains the contents of the document.
The column that contains the title of the document.
An array of objects that map database column names to the corresponding fields in an index. You must first create the fields in the index using the UpdateIndex API.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
One to five columns that indicate when a document in the database has changed.
Information about the database column that provides information for user context filtering.
A list of groups, separated by semi-colons, that filters a query response based on user context. The document is only returned to users that are in one of the groups specified in the UserContext field of the Query API.
Provides information about how Amazon Kendra uses quote marks around SQL identifiers when querying a database data source.
Determines whether Amazon Kendra encloses SQL identifiers for tables and column names in double quotes (") when making a database query.
By default, Amazon Kendra passes SQL identifiers the way that they are entered into the data source configuration. It does not change the case of identifiers or enclose them in quotes.
PostgreSQL internally converts uppercase characters to lower case characters in identifiers unless they are quoted. Choosing this option encloses identifiers in quotes so that PostgreSQL does not convert the character's case.
For MySQL databases, you must enable the ansi_quotes option when you set this field to DOUBLE_QUOTES.
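To illustrate what the enclosing option changes, the sketch below renders an identifier the way each setting describes. It is a local illustration of the documented behavior, not Amazon Kendra's query builder; the helper name and the column name `DocTitle` are hypothetical.

```python
def render_identifier(name, enclosing_option="NONE"):
    """Render a SQL identifier according to QueryIdentifiersEnclosingOption."""
    if enclosing_option == "DOUBLE_QUOTES":
        # Quoted identifiers keep their case in PostgreSQL.
        return '"' + name + '"'
    if enclosing_option == "NONE":
        # Passed through unchanged; PostgreSQL would fold it to lower case.
        return name
    raise ValueError("expected 'DOUBLE_QUOTES' or 'NONE'")


# Fragment of a DatabaseConfiguration that selects double-quoted identifiers.
sql_configuration = {
    "SqlConfiguration": {"QueryIdentifiersEnclosingOption": "DOUBLE_QUOTES"}
}

quoted = render_identifier("DocTitle", "DOUBLE_QUOTES")
plain = render_identifier("DocTitle")
```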
Provides the configuration information to connect to Salesforce as your data source.
The instance URL for the Salesforce site that you want to index.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key/value pairs required to connect to your Salesforce instance. The secret must contain a JSON structure with the following keys:
Configuration of the Salesforce standard objects that Amazon Kendra indexes.
Provides the configuration information for indexing a single standard object.
The name of the standard object.
The name of the field in the standard object table that contains the document contents.
The name of the field in the standard object table that contains the document title.
Maps attributes or field names of the standard object to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Configuration information for the knowledge article types that Amazon Kendra indexes. Amazon Kendra indexes standard knowledge articles and the standard fields of knowledge articles, or the custom fields of custom knowledge articles, but not both.
Specifies the document states that should be included when Amazon Kendra indexes knowledge articles. You must specify at least one state.
Configuration information for standard Salesforce knowledge articles.
The name of the field that contains the document data to index.
The name of the field that contains the document title.
Maps attributes or field names of the knowledge article to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Configuration information for custom Salesforce knowledge articles.
Provides the configuration information for indexing Salesforce custom articles.
The name of the configuration.
The name of the field in the custom knowledge article that contains the document data to index.
The name of the field in the custom knowledge article that contains the document title.
Maps attributes or field names of the custom knowledge article to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Salesforce fields. For more information, see Mapping data source fields. The Salesforce data source field names must exist in your Salesforce custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
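The two constraints stated for knowledge articles (at least one included state; standard or custom configuration, but not both) can be checked before sending a request. The sketch below is a hypothetical client-side validator; the field name `Body__c` and the helper name are placeholders.

```python
VALID_STATES = {"DRAFT", "PUBLISHED", "ARCHIVED"}


def build_knowledge_article_configuration(included_states, standard=None, custom=None):
    """Build a KnowledgeArticleConfiguration dict with basic checks."""
    states = list(included_states)
    if not states:
        raise ValueError("specify at least one state")
    unknown = set(states) - VALID_STATES
    if unknown:
        raise ValueError("unknown states: %s" % sorted(unknown))
    if standard is not None and custom is not None:
        raise ValueError("use standard or custom configuration, not both")
    config = {"IncludedStates": states}
    if standard is not None:
        config["StandardKnowledgeArticleTypeConfiguration"] = standard
    if custom is not None:
        config["CustomKnowledgeArticleTypeConfigurations"] = list(custom)
    return config


# Placeholder field name for a standard knowledge article.
article_config = build_knowledge_article_configuration(
    ["PUBLISHED"],
    standard={"DocumentDataFieldName": "Body__c"},
)
```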
Configuration information for Salesforce chatter feeds.
The name of the column in the Salesforce FeedItem table that contains the content to index. Typically this is the Body column.
The name of the column in the Salesforce FeedItem table that contains the title of the document. This is typically the Title column.
Maps fields from a Salesforce chatter feed into Amazon Kendra index fields.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Filters the documents in the feed based on the status of the user. When you specify ACTIVE_USER, only documents from users who have an active account are indexed. When you specify STANDARD_USER, only documents for Salesforce standard users are indexed. You can specify both.
Indicates whether Amazon Kendra should index attachments to Salesforce objects.
Configuration information for processing attachments to Salesforce standard objects.
The name of the field used for the document title.
One or more objects that map fields in attachments to Amazon Kendra index fields.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain documents in your Salesforce. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the name of the attached file.
A list of regular expression patterns to exclude certain documents in your Salesforce. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the name of the attached file.
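The precedence rule described above, where an exclusion match always wins, can be sketched with Python's re module. This is an illustration only: it assumes search-style matching anywhere in the file name, and it assumes that when both lists are given a name matching neither is left out. The regular expression engine and matching semantics that Amazon Kendra actually uses may differ, and is_indexed is a hypothetical helper.

```python
import re


def is_indexed(name, inclusion_patterns=(), exclusion_patterns=()):
    """Return True if a file name would be indexed under the stated rules."""
    # An exclusion match takes precedence over any inclusion match.
    if any(re.search(p, name) for p in exclusion_patterns):
        return False
    # With inclusion patterns present, only matching names are kept.
    if inclusion_patterns:
        return any(re.search(p, name) for p in inclusion_patterns)
    # With no inclusion patterns, everything not excluded is kept.
    return True


include = [r"\.pdf$"]
exclude = [r"^draft"]
results = {
    "report.pdf": is_indexed("report.pdf", include, exclude),
    "draft-report.pdf": is_indexed("draft-report.pdf", include, exclude),
    "notes.txt": is_indexed("notes.txt", include, exclude),
}
```

Here draft-report.pdf matches both lists and is left out, which is the case the description calls out explicitly.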
Provides the configuration information to connect to Microsoft OneDrive as your data source.
The Azure Active Directory domain of the organization.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password to connect to OneDrive. The user name should be the application ID for the OneDrive application, and the password is the application key for the OneDrive application.
A list of user accounts whose documents should be indexed.
A list of users whose documents should be indexed. Specify the user names in email format, for example, username@tenantdomain. If you need to index the documents of more than 100 users, use the OneDriveUserS3Path field to specify the location of a file containing a list of users.
The S3 bucket location of a file containing a list of users whose documents should be indexed.
The name of the S3 bucket that contains the file.
The name of the file.
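The choice between an inline user list and an S3 file can be sketched as follows. This is a minimal example under assumptions: the key names OneDriveUserList, Bucket, and Key are assumed from the request structure, one_drive_users is a hypothetical helper, and the bucket and file names are placeholders.

```python
MAX_INLINE_USERS = 100  # limit stated in the description above


def one_drive_users(users, bucket=None, key=None):
    """Return an inline user list, or an S3 path when there are too many users."""
    if len(users) <= MAX_INLINE_USERS:
        return {"OneDriveUserList": list(users)}
    if not (bucket and key):
        raise ValueError("more than 100 users: an S3 bucket and file name are required")
    return {"OneDriveUserS3Path": {"Bucket": bucket, "Key": key}}


small = one_drive_users(["username@tenantdomain"])
many_users = [f"user{i}@tenantdomain" for i in range(101)]
large = one_drive_users(many_users, bucket="example-bucket", key="users.txt")
```

The file referenced by the S3 path is expected to contain the user list; uploading it is outside this sketch.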
A list of regular expression patterns to include certain documents in your OneDrive. Documents that match the patterns are included in the index. Documents that don't match the patterns are excluded from the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the file name.
A list of regular expression patterns to exclude certain documents in your OneDrive. Documents that match the patterns are excluded from the index. Documents that don't match the patterns are included in the index. If a document matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the document isn't included in the index.
The pattern is applied to the file name.
A list of DataSourceToIndexFieldMapping objects that map OneDrive data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to OneDrive fields. For more information, see Mapping data source fields. The OneDrive data source field names must exist in your OneDrive custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
TRUE to disable local groups information.
Provides the configuration information to connect to ServiceNow as your data source.
The ServiceNow instance that the data source connects to. The host endpoint should look like the following: {instance}.service-now.com.
The Amazon Resource Name (ARN) of the Secrets Manager secret that contains the user name and password required to connect to the ServiceNow instance. You can also provide OAuth authentication credentials of user name, password, client ID, and client secret. For more information, see Using a ServiceNow data source.
The identifier of the release that the ServiceNow host is running. If the host is not running the LONDON release, use OTHERS.
Configuration information for crawling knowledge articles in the ServiceNow site.
TRUE to index attachments to knowledge articles.
A list of regular expression patterns to include certain attachments of knowledge articles in your ServiceNow. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the field specified in the PatternTargetField.
A list of regular expression patterns to exclude certain attachments of knowledge articles in your ServiceNow. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the field specified in the PatternTargetField.
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
The name of the ServiceNow field that is mapped to the index document title field.
Maps attributes or field names of knowledge articles to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to ServiceNow fields. For more information, see Mapping data source fields. The ServiceNow data source field names must exist in your ServiceNow custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A query that selects the knowledge articles to index. The query can return articles from multiple knowledge bases, and the knowledge bases can be public or private.
The query string must be one generated by the ServiceNow console. For more information, see Specifying documents to index with a query.
Configuration information for crawling service catalogs in the ServiceNow site.
TRUE to index attachments to service catalog items.
A list of regular expression patterns to include certain attachments of catalogs in your ServiceNow. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the file name of the attachment.
A list of regular expression patterns to exclude certain attachments of catalogs in your ServiceNow. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
The regex is applied to the file name of the attachment.
The name of the ServiceNow field that is mapped to the index document contents field in the Amazon Kendra index.
The name of the ServiceNow field that is mapped to the index document title field.
Maps attributes or field names of catalogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to ServiceNow fields. For more information, see Mapping data source fields. The ServiceNow data source field names must exist in your ServiceNow custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
The type of authentication used to connect to the ServiceNow instance. If you choose HTTP_BASIC, Amazon Kendra is authenticated using the user name and password provided in the Secrets Manager secret in the SecretArn field. If you choose OAUTH2, Amazon Kendra is authenticated using the credentials of client ID, client secret, user name and password.
When you use OAUTH2 authentication, you must generate a token and a client secret using the ServiceNow console. For more information, see Using a ServiceNow data source.
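A possible way to assemble the connection-level ServiceNow settings is sketched below. The key names HostUrl, SecretArn, ServiceNowBuildVersion, and AuthenticationType are assumed from the request structure, servicenow_configuration is a hypothetical helper, and the host and ARN values are placeholders.

```python
VALID_AUTH_TYPES = ("HTTP_BASIC", "OAUTH2")


def servicenow_configuration(host_url, secret_arn, release="LONDON", auth_type="HTTP_BASIC"):
    """Build the connection part of a ServiceNow configuration dict."""
    if auth_type not in VALID_AUTH_TYPES:
        raise ValueError(f"unsupported authentication type: {auth_type}")
    # Any release other than LONDON is reported as OTHERS, as described above.
    build_version = "LONDON" if release == "LONDON" else "OTHERS"
    return {
        "HostUrl": host_url,
        "SecretArn": secret_arn,
        "ServiceNowBuildVersion": build_version,
        "AuthenticationType": auth_type,
    }


config = servicenow_configuration(
    "example.service-now.com",  # placeholder instance host
    "arn:aws:secretsmanager:us-east-1:111122223333:secret:example",  # placeholder ARN
    release="TOKYO",
    auth_type="OAUTH2",
)
```

With OAUTH2, the referenced secret would also need the client ID and client secret; the sketch does not inspect the secret's contents.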
Provides the configuration information to connect to Confluence as your data source.
The URL of your Confluence instance. Use the full URL of the server. For example, https://server.example.com:port/. You can also use an IP address, for example, https://192.168.1.113/.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the user name and password required to connect to the Confluence instance. If you use Confluence Cloud, you use a generated API token as the password.
You can also provide authentication credentials in the form of a personal access token. For more information, see Using a Confluence data source.
The version or the type of Confluence installation to connect to.
Configuration information for indexing Confluence spaces.
TRUE to index personal spaces. You can add restrictions to items in personal spaces. If personal spaces are indexed, queries without user context information may return restricted items from a personal space in their results. For more information, see Filtering on user context.
TRUE to index archived spaces.
A list of space keys for Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are indexed. Spaces that aren't in the list aren't indexed. A space in the list must exist. Otherwise, Amazon Kendra logs an error when the data source is synchronized. If a space is in both the IncludeSpaces and the ExcludeSpaces list, the space is excluded.
A list of space keys of Confluence spaces. If you include a key, the blogs, documents, and attachments in the space are not indexed. If a space is in both the ExcludeSpaces and the IncludeSpaces list, the space is excluded.
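The interaction between the two lists can be sketched as a small function. It is an illustration only: spaces_to_index is a hypothetical helper, the space keys are made up, and treating an empty include list as "all spaces are candidates" is an assumption that the descriptions above do not state.

```python
def spaces_to_index(all_spaces, include_spaces=(), exclude_spaces=()):
    """Return the space keys that remain after applying both lists."""
    included = set(include_spaces)
    excluded = set(exclude_spaces)
    # Assumption: with no include list, every space is a candidate.
    candidates = [s for s in all_spaces if not included or s in included]
    # A space present in both lists is excluded.
    return [s for s in candidates if s not in excluded]


all_spaces = ["ENG", "HR", "OPS"]  # hypothetical space keys
selected = spaces_to_index(all_spaces, include_spaces=["ENG", "HR"], exclude_spaces=["HR"])
```

In this example HR appears in both lists and is dropped, while OPS is dropped because it is not in the include list.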
Maps attributes or field names of Confluence spaces to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the SpaceFieldMappings parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence spaces to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for indexing Confluence pages.
Maps attributes or field names of Confluence pages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the PageFieldMappings parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence pages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for indexing Confluence blogs.
Maps attributes or field names of Confluence blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the BlogFieldMappings parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for indexing attachments to Confluence blogs and pages.
TRUE to index attachments of pages and blogs in Confluence.
Maps attributes or field names of Confluence attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
If you specify the AttachmentFieldMappings parameter, you must specify at least one field mapping.
Maps attributes or field names of Confluence attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Confluence fields. For more information, see Mapping data source fields. The Confluence data source field names must exist in your Confluence custom metadata.
The name of the field in the data source.
You must first create the index field using the UpdateIndex API.
The format for date fields in the data source. If the field specified in DataSourceFieldName is a date field, you must specify the date format. If the field is not a date field, an exception is thrown.
The name of the index field to map to the Confluence data source field. The index field type must match the Confluence field type.
Configuration information for an Amazon Virtual Private Cloud to connect to your Confluence. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
A list of regular expression patterns to include certain blog posts, pages, spaces, or attachments in your Confluence. Content that matches the patterns is included in the index. Content that doesn't match the patterns is excluded from the index. If content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the content isn't included in the index.
A list of regular expression patterns to exclude certain blog posts, pages, spaces, or attachments in your Confluence. Content that matches the patterns is excluded from the index. Content that doesn't match the patterns is included in the index. If content matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the content isn't included in the index.
Configuration information to connect to your Confluence URL instance via a web proxy. You can use this option for Confluence Server.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication of user name and password. To store web proxy credentials, you use a secret in Secrets Manager.
It is recommended that you follow best security practices when configuring your web proxy. This includes setting up throttling, setting up logging and monitoring, and applying security patches on a regular basis. If you use your web proxy with multiple data sources, sync jobs that occur at the same time could strain the load on your proxy. It is recommended you prepare your proxy beforehand for any security and load requirements.
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
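The host name and port in the examples above can be derived from a URL with Python's standard urllib.parse module. The sketch below assumes the key names Host, Port, and Credentials for the proxy configuration structure; proxy_configuration is a hypothetical helper and the default-port table is an assumption covering only HTTP and HTTPS.

```python
from urllib.parse import urlsplit

# Standard ports used when the URL does not name one explicitly.
DEFAULT_PORTS = {"https": 443, "http": 80}


def proxy_configuration(url, credentials_arn=None):
    """Derive the website host name and port from a URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    config = {"Host": parts.hostname, "Port": port}
    # Credentials are optional and only added when a secret ARN is given.
    if credentials_arn:
        config["Credentials"] = credentials_arn
    return config


proxy = proxy_configuration("https://a.example.com/page1.html")
```

For the example URL this yields the host a.example.com and port 443, matching the values given in the description.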
Whether you want to connect to Confluence using basic authentication of user name and password, or a personal access token. You can use a personal access token for Confluence Server.
Provides the configuration information to connect to Google Drive as your data source.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the credentials required to connect to Google Drive. For more information, see Using a Google Workspace Drive data source.
A list of regular expression patterns to include certain items in your Google Drive, including shared drives and users' My Drives. Items that match the patterns are included in the index. Items that don't match the patterns are excluded from the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
A list of regular expression patterns to exclude certain items in your Google Drive, including shared drives and users' My Drives. Items that match the patterns are excluded from the index. Items that don't match the patterns are included in the index. If an item matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the item isn't included in the index.
Maps Google Drive data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Google Drive fields. For more information, see Mapping data source fields. The Google Drive data source field names must exist in your Google Drive custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of MIME types to exclude from the index. All documents matching the specified MIME type are excluded.
For a list of MIME types, see Using a Google Workspace Drive data source.
A list of email addresses of the users. Documents owned by these users are excluded from the index. Documents shared with excluded users are indexed unless they are excluded in another way.
A list of identifiers of shared drives to exclude from the index. All files and folders stored on the shared drive are excluded.
Provides the configuration information required for Amazon Kendra Web Crawler.
Specifies the seed or starting point URLs of the websites or the sitemap URLs of the websites you want to crawl.
You can include website subdomains. You can list up to 100 seed URLs and up to three sitemap URLs.
You can only crawl websites that use the secure communication protocol, Hypertext Transfer Protocol Secure (HTTPS). If you receive an error when crawling a website, it could be that the website is blocked from crawling.
When selecting websites to index, you must adhere to the Amazon Acceptable Use Policy and all other Amazon terms. Remember that you must only use Amazon Kendra Web Crawler to index your own webpages, or webpages that you have authorization to index.
Configuration of the seed or starting point URLs of the websites you want to crawl.
You can choose to crawl only the website host names, or the website host names with subdomains, or the website host names with subdomains and other domains that the webpages link to.
You can list up to 100 seed URLs.
The list of seed or starting point URLs of the websites you want to crawl.
The list can include a maximum of 100 seed URLs.
You can choose one of the following modes:
HOST_ONLY – crawl only the website host names. For example, if the seed URL is "abc.example.com", then only URLs with host name "abc.example.com" are crawled.
SUBDOMAINS – crawl the website host names with subdomains. For example, if the seed URL is "abc.example.com", then "a.abc.example.com" and "b.abc.example.com" are also crawled.
EVERYTHING – crawl the website host names with subdomains and other domains that the webpages link to.
The default mode is set to HOST_ONLY.
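The three crawl modes can be sketched as a host-matching check. This is an illustration, not the crawler's actual logic: host_in_scope is a hypothetical helper, it compares host names only, and for EVERYTHING it assumes the candidate host was reached through a link on a crawled page.

```python
VALID_MODES = ("HOST_ONLY", "SUBDOMAINS", "EVERYTHING")


def host_in_scope(seed_host, candidate_host, mode="HOST_ONLY"):
    """Return True if candidate_host would be crawled for the given seed host."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown crawl mode: {mode}")
    if mode == "EVERYTHING":
        # Assumes the candidate was linked from a crawled page.
        return True
    if candidate_host == seed_host:
        return True
    # Subdomains of the seed host are in scope only in SUBDOMAINS mode.
    return mode == "SUBDOMAINS" and candidate_host.endswith("." + seed_host)


seed = "abc.example.com"
host_only = host_in_scope(seed, "a.abc.example.com")
subdomains = host_in_scope(seed, "a.abc.example.com", "SUBDOMAINS")
everything = host_in_scope(seed, "other.example.org", "EVERYTHING")
```

With the seed abc.example.com, the subdomain a.abc.example.com is skipped in the default mode and crawled in SUBDOMAINS mode.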
Configuration of the sitemap URLs of the websites you want to crawl.
Only URLs belonging to the same website host names are crawled. You can list up to three sitemap URLs.
The list of sitemap URLs of the websites you want to crawl.
The list can include a maximum of three sitemap URLs.
Specifies the number of levels in a website that you want to crawl.
The first level begins from the website seed or starting point URL. For example, if a website has 3 levels – index level (i.e. seed in this example), sections level, and subsections level – and you are only interested in crawling information up to the sections level (i.e. levels 0-1), you can set your depth to 1.
The default crawl depth is set to 2.
The maximum number of URLs on a webpage to include when crawling a website. This number is per webpage.
As a website’s webpages are crawled, any URLs the webpages link to are also crawled. URLs on a webpage are crawled in order of appearance.
The default maximum links per page is 100.
The maximum size (in MB) of a webpage or attachment to crawl.
Files larger than this size (in MB) are skipped and not crawled.
The default maximum size of a webpage or attachment is set to 50 MB.
The maximum number of URLs crawled per website host per minute.
A minimum of one URL is required.
The default maximum number of URLs crawled per website host per minute is 300.
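The limits and defaults described above can be collected into one configuration dictionary. This is a sketch under assumptions: the key names (Urls, SeedUrlConfiguration, SeedUrls, WebCrawlerMode, CrawlDepth, MaxLinksPerPage, MaxContentSizePerPageInMegaBytes, MaxUrlsPerMinuteCrawlRate) are assumed from the request structure, web_crawler_configuration is a hypothetical helper, and the checks mirror only the constraints stated in the text.

```python
def web_crawler_configuration(seed_urls, mode="HOST_ONLY", crawl_depth=2,
                              max_links_per_page=100, max_content_size_mb=50.0,
                              max_urls_per_minute=300):
    """Build a web crawler configuration dict using the documented defaults."""
    if len(seed_urls) > 100:
        raise ValueError("at most 100 seed URLs are allowed")
    if any(not url.startswith("https://") for url in seed_urls):
        raise ValueError("only HTTPS websites can be crawled")
    if max_urls_per_minute < 1:
        raise ValueError("the crawl rate must be at least one URL per minute")
    return {
        "Urls": {
            "SeedUrlConfiguration": {
                "SeedUrls": list(seed_urls),
                "WebCrawlerMode": mode,
            }
        },
        "CrawlDepth": crawl_depth,
        "MaxLinksPerPage": max_links_per_page,
        "MaxContentSizePerPageInMegaBytes": max_content_size_mb,
        "MaxUrlsPerMinuteCrawlRate": max_urls_per_minute,
    }


crawler = web_crawler_configuration(["https://abc.example.com"])
```

Omitting a value in a real request would let the service apply its own default; the sketch spells the defaults out so they are visible in one place.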
A list of regular expression patterns to include certain URLs to crawl. URLs that match the patterns are included in the index. URLs that don't match the patterns are excluded from the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL isn't included in the index.
A list of regular expression patterns to exclude certain URLs to crawl. URLs that match the patterns are excluded from the index. URLs that don't match the patterns are included in the index. If a URL matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the URL isn't included in the index.
Configuration information required to connect to your internal websites via a web proxy.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
Web proxy credentials are optional and you can use them to connect to a web proxy server that requires basic authentication. To store web proxy credentials, you use a secret in Secrets Manager.
The name of the website host you want to connect to via a web proxy server.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to via a web proxy server.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
The credentials are optional. You use a secret if web proxy credentials are required to connect to a website host. Amazon Kendra currently supports basic authentication to connect to a web proxy server. The secret stores your credentials.
Configuration information required to connect to websites using authentication.
You can connect to websites using basic authentication of user name and password. You use a secret in Secrets Manager to store your authentication credentials.
You must provide the website host name and port number. For example, the host name of https://a.example.com/page1.html is "a.example.com" and the port is 443, the standard port for HTTPS.
The list of configuration information that's required to connect to and crawl a website host using basic authentication credentials.
The list includes the name and port number of the website host.
Provides the configuration information to connect to websites that require basic user authentication.
The name of the website host you want to connect to using authentication credentials.
For example, the host name of https://a.example.com/page1.html is "a.example.com".
The port number of the website host you want to connect to using authentication credentials.
For example, the port for https://a.example.com/page1.html is 443, the standard port for HTTPS.
Your secret ARN, which you can create in Secrets Manager.
You use a secret if basic authentication credentials are required to connect to a website. The secret stores your credentials of user name and password.
Provides the configuration information to connect to Amazon WorkDocs as your data source.
The identifier of the directory corresponding to your Amazon WorkDocs site repository.
You can find the organization ID in the Directory Service by going to Active Directory, then Directories. Your Amazon WorkDocs site directory has an ID, which is the organization ID. You can also set up a new Amazon WorkDocs directory in the Directory Service console and enable an Amazon WorkDocs site for the directory in the Amazon WorkDocs console.
TRUE to include comments on documents in your index. Including comments in your index means each comment is a document that can be searched on.
The default is set to FALSE.
TRUE to use the Amazon WorkDocs change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Amazon WorkDocs.
A list of regular expression patterns to include certain files in your Amazon WorkDocs site repository. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Amazon WorkDocs site repository. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of DataSourceToIndexFieldMapping objects that map Amazon WorkDocs data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Amazon WorkDocs fields. For more information, see Mapping data source fields. The Amazon WorkDocs data source field names must exist in your Amazon WorkDocs custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Provides the configuration information to connect to Amazon FSx as your data source.
The identifier of the Amazon FSx file system.
You can find your file system ID on the file system dashboard in the Amazon FSx console. For information on how to create a file system in Amazon FSx console, using Windows File Server as an example, see Amazon FSx Getting started guide.
The Amazon FSx file system type. Windows is currently the only supported type.
Configuration information for an Amazon Virtual Private Cloud to connect to your Amazon FSx. Your Amazon FSx instance must reside inside your VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Amazon FSx file system. Windows is currently the only supported type. The secret must contain a JSON structure with the following keys:
A list of regular expression patterns to include certain files in your Amazon FSx file system. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Amazon FSx file system. Files that match the patterns are excluded from the index. Files that don't match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of DataSourceToIndexFieldMapping objects that map Amazon FSx data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Amazon FSx fields. For more information, see Mapping data source fields. The Amazon FSx data source field names must exist in your Amazon FSx custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
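A minimal sketch of building such mappings follows. It refuses any mapping whose index field has not been created yet, mirroring the requirement to create fields with UpdateIndex first. The key names `DataSourceFieldName` and `IndexFieldName` are recalled from the Kendra request syntax and should be verified. The helper `build_field_mappings`, the field names, and the locally supplied set of existing index fields are hypothetical. The data-type entry is left out of the sketch.

```python
def build_field_mappings(pairs, existing_index_fields):
    """Build mapping dicts, refusing index fields that have not been created yet."""
    mappings = []
    for data_source_field, index_field in pairs:
        if index_field not in existing_index_fields:
            raise ValueError(
                f"index field {index_field!r} does not exist; create it first"
            )
        mappings.append(
            {
                "DataSourceFieldName": data_source_field,  # assumed key name
                "IndexFieldName": index_field,  # assumed key name
            }
        )
    return mappings


# Hypothetical field names, for illustration only.
field_mappings = build_field_mappings(
    [("department", "department_name"), ("owner", "document_owner")],
    existing_index_fields={"department_name", "document_owner"},
)
```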
Provides the configuration information to connect to Slack as your data source.
The identifier of the team in the Slack workspace. For example, T0123456789 .
You can find your team ID in the URL of the main page of your Slack workspace. When you log in to Slack via a browser, you are directed to the URL of the main page. For example, https://app.slack.com/client/T0123456789/....
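If you have the workspace URL at hand, the team ID can be pulled out of it programmatically. The sketch below assumes the URL has the shape shown in the example above, with the team ID as the path segment that follows `client`. The function name `slack_team_id_from_url` is hypothetical.

```python
from urllib.parse import urlparse


def slack_team_id_from_url(url):
    """Return the path segment that follows 'client' in a Slack workspace URL."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if len(segments) < 2 or segments[0] != "client":
        raise ValueError(f"not a Slack workspace URL: {url!r}")
    return segments[1]


team_id = slack_team_id_from_url("https://app.slack.com/client/T0123456789")
print(team_id)  # T0123456789
```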
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Slack workspace team. The secret must contain a JSON structure with the following keys:
Configuration information for an Amazon Virtual Private Cloud to connect to your Slack. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Specify whether to index public channels, private channels, group messages, and direct messages. You can specify one or more of these options.
TRUE to use the Slack change log to determine which documents require updating in the index. Depending on the Slack change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Slack.
TRUE to index bot messages from your Slack workspace team.
TRUE to exclude archived messages from the index for your Slack workspace team.
The date to start crawling your data from your Slack workspace team. The date must follow this format: yyyy-mm-dd.
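One way to make sure the crawl start date has the required shape is to build or validate it with the standard library before sending it. This is a sketch. The function name `format_since_crawl_date` is hypothetical, and it treats yyyy-mm-dd as four-digit year, two-digit month, and two-digit day.

```python
from datetime import date, datetime


def format_since_crawl_date(value):
    """Return value as a yyyy-mm-dd string, validating strings that are passed in."""
    if isinstance(value, date):
        return value.strftime("%Y-%m-%d")
    # strptime raises ValueError for malformed or impossible dates.
    return datetime.strptime(value, "%Y-%m-%d").strftime("%Y-%m-%d")


print(format_since_crawl_date(date(2023, 1, 5)))  # 2023-01-05
```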
The number of hours for the change log to look back from when you last synchronized your data. You can look back up to 7 days, or 168 hours.
The change log updates your index only if new content was added since you last synced your data. Content that was updated or deleted before your last sync is not updated in your index. To capture updated or deleted content from before your last sync, set the LookBackPeriod to the number of hours you want the change log to look back.
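Because the look-back period is expressed in hours with a ceiling of 168, a small conversion helper can prevent out-of-range values. The sketch below is an assumption-based convenience, not part of the API, and the name `look_back_hours` is hypothetical.

```python
MAX_LOOK_BACK_HOURS = 168  # 7 days, the limit stated above


def look_back_hours(days=0, hours=0):
    """Convert days and hours to a look-back period in hours, enforcing the limit."""
    total = days * 24 + hours
    if not 0 <= total <= MAX_LOOK_BACK_HOURS:
        raise ValueError(
            f"look-back period must be between 0 and {MAX_LOOK_BACK_HOURS} hours, got {total}"
        )
    return total


print(look_back_hours(days=2, hours=12))  # 60
```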
The list of private channel names from your Slack workspace team. You use this if you want to index specific private channels, not all private channels. You can also use regular expression patterns to filter private channels.
The list of public channel names to index from your Slack workspace team. You use this if you want to index specific public channels, not all public channels. You can also use regular expression patterns to filter public channels.
A list of regular expression patterns to include certain attached files in your Slack workspace team. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain attached files in your Slack workspace team. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of DataSourceToIndexFieldMapping objects that map Slack data source attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Slack fields. For more information, see Mapping data source fields. The Slack data source field names must exist in your Slack custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Provides the configuration information to connect to Box as your data source.
The identifier of the Box Enterprise platform. You can find the enterprise ID in the Box Developer Console settings or when you create an app in Box and download your authentication credentials. For example, 801234567 .
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Box platform. The secret must contain a JSON structure with the following keys:
You create an application in Box to generate the keys or credentials required for the secret. For more information, see Using a Box data source.
TRUE to use the Box change log to determine which documents require updating in the index. Depending on the data source change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents.
TRUE to index comments.
TRUE to index the contents of tasks.
TRUE to index web links.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box files to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box tasks to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Box web links to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Box fields. For more information, see Mapping data source fields. The Box field names must exist in your Box custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain files and folders in your Box platform. Files and folders that match the patterns are included in the index. Files and folders that don't match the patterns are excluded from the index. If a file or folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file or folder isn't included in the index.
A list of regular expression patterns to exclude certain files and folders from your Box platform. Files and folders that match the patterns are excluded from the index. Files and folders that don't match the patterns are included in the index. If a file or folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file or folder isn't included in the index.
Configuration information for an Amazon VPC to connect to your Box. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Provides the configuration information to connect to Quip as your data source.
The Quip site domain. For example, https://quip-company.quipdomain.com/browse . The domain in this example is "quipdomain".
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs that are required to connect to your Quip. The secret must contain a JSON structure with the following keys:
TRUE to index file comments.
TRUE to index the contents of chat rooms.
TRUE to index attachments.
The identifiers of the Quip folders you want to index. You can find the folder ID in your browser URL when you access your folder in Quip. For example, https://quip-company.quipdomain.com/zlLuOVNSarTL/folder-name . The folder ID in this example is "zlLuOVNSarTL".
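The folder ID can be read from a folder URL with a few lines of code. The sketch below assumes the URL follows the shape of the example above, where the folder ID is the first path segment. The function name `quip_folder_id_from_url` is hypothetical.

```python
from urllib.parse import urlparse


def quip_folder_id_from_url(url):
    """Return the first path segment of a Quip folder URL as the folder ID."""
    segments = [segment for segment in urlparse(url).path.split("/") if segment]
    if not segments:
        raise ValueError(f"no folder ID found in {url!r}")
    return segments[0]


folder_id = quip_folder_id_from_url(
    "https://quip-company.quipdomain.com/zlLuOVNSarTL/folder-name"
)
print(folder_id)  # zlLuOVNSarTL
```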
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Quip threads to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Quip messages to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Quip attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Quip fields. For more information, see Mapping data source fields. The Quip field names must exist in your Quip custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain files in your Quip file system. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence, and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Quip file system. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence, and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud (VPC) to connect to your Quip. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Provides the configuration information to connect to Jira as your data source.
The URL of the Jira account. For example, company.atlassian.net or https://jira.company.com . You can find your Jira account URL in the URL of your profile page for Jira desktop.
The Amazon Resource Name (ARN) of a secret in Secrets Manager that contains the key-value pairs required to connect to your Jira data source. The secret must contain a JSON structure with the following keys:
TRUE to use the Jira change log to determine which documents require updating in the index. Depending on the change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in Jira.
Specify which projects to crawl in your Jira data source. You can specify one or more Jira project IDs.
Specify which issue types to crawl in your Jira data source. You can specify one or more of these options to crawl.
Specify which statuses to crawl in your Jira data source. You can specify one or more of these options to crawl.
Specify whether to crawl comments, attachments, and work logs. You can specify one or more of these options.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira issues to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira projects to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Jira work logs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Jira fields. For more information, see Mapping data source fields. The Jira data source field names must exist in your Jira custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain file paths, file names, and file types in your Jira data source. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain file paths, file names, and file types in your Jira data source. Files that match the patterns are excluded from the index. Files that don’t match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud to connect to your Jira. Your Jira account must reside inside your VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Provides the configuration information to connect to GitHub as your data source.
Configuration information to connect to GitHub Enterprise Cloud (SaaS).
The name of the organization of the GitHub Enterprise Cloud (SaaS) account you want to connect to. You can find your organization name by logging into GitHub desktop and selecting Your organizations under your profile picture dropdown.
The GitHub host URL or API endpoint URL. For example, https://api.github.com .
Configuration information to connect to GitHub Enterprise Server (on premises).
The GitHub host URL or API endpoint URL. For example, https://on-prem-host-url/api/v3/
The name of the organization of the GitHub Enterprise Server (on premises) account you want to connect to. You can find your organization name by logging into GitHub desktop and selecting Your organizations under your profile picture dropdown.
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to GitHub if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
The name of the S3 bucket that contains the file.
The name of the file.
The type of GitHub service you want to connect to: GitHub Enterprise Cloud (SaaS) or GitHub Enterprise Server (on premises).
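The choice between the two connection types can be sketched as a helper that fills in only the block matching the selected type. The key names (`Type`, `SaaSConfiguration`, `OnPremiseConfiguration`, `OrganizationName`, `HostUrl`, `SslCertificateS3Path`, `Bucket`, `Key`) and the type values `SAAS` and `ON_PREMISE` are recalled from the Kendra request syntax and should be verified before use. The helper name, organization name, and bucket details are hypothetical.

```python
def build_github_connection(service_type, organization_name, host_url, ssl_certificate=None):
    """Return the connection part of a GitHub configuration for the chosen service type."""
    if service_type == "SAAS":
        return {
            "Type": "SAAS",
            "SaaSConfiguration": {
                "OrganizationName": organization_name,
                "HostUrl": host_url,
            },
        }
    if service_type == "ON_PREMISE":
        on_premise = {"OrganizationName": organization_name, "HostUrl": host_url}
        if ssl_certificate is not None:
            # ssl_certificate is a (bucket name, file name) pair for the S3 object.
            bucket, key = ssl_certificate
            on_premise["SslCertificateS3Path"] = {"Bucket": bucket, "Key": key}
        return {"Type": "ON_PREMISE", "OnPremiseConfiguration": on_premise}
    raise ValueError(f"unsupported GitHub service type: {service_type!r}")


# Hypothetical organization name, for illustration only.
connection = build_github_connection("SAAS", "example-org", "https://api.github.com")
```

Keeping the two blocks mutually exclusive in one helper avoids accidentally sending both in the same request.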
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your GitHub. The secret must contain a JSON structure with the following keys:
TRUE to use the GitHub change log to determine which documents require updating in the index. Depending on the GitHub change log's size, it may take longer for Amazon Kendra to use the change log than to scan all of your documents in GitHub.
Configuration information to include certain types of GitHub content. You can configure to index repository files only, or also include issues and pull requests, comments, and comment attachments.
TRUE to index all files within a repository.
TRUE to index all issues within a repository.
TRUE to index all comments on issues.
TRUE to include all comment attachments for issues.
TRUE to index all pull requests within a repository.
TRUE to index all comments on pull requests.
TRUE to include all comment attachments for pull requests.
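The seven flags above can be generated from a short list of the content types you want, which keeps every flag explicit. This is a sketch under assumptions. The flag names are recalled from the Kendra request syntax and should be checked, while the short labels and the helper `build_crawl_properties` are hypothetical.

```python
# Assumed flag names, keyed by a hypothetical short label for each content type.
CRAWL_FLAGS = {
    "repository_files": "CrawlRepositoryDocuments",
    "issues": "CrawlIssue",
    "issue_comments": "CrawlIssueComment",
    "issue_comment_attachments": "CrawlIssueCommentAttachment",
    "pull_requests": "CrawlPullRequest",
    "pull_request_comments": "CrawlPullRequestComment",
    "pull_request_comment_attachments": "CrawlPullRequestCommentAttachment",
}


def build_crawl_properties(*content_types):
    """Return every crawl flag, set to True only for the requested content types."""
    unknown = set(content_types) - set(CRAWL_FLAGS)
    if unknown:
        raise ValueError(f"unknown content types: {sorted(unknown)}")
    return {flag: label in content_types for label, flag in CRAWL_FLAGS.items()}


crawl_properties = build_crawl_properties("repository_files", "issues", "issue_comments")
```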
A list of names of the specific repositories you want to index.
A list of regular expression patterns to include certain folder names in your GitHub repository or repositories. Folder names that match the patterns are included in the index. Folder names that don't match the patterns are excluded from the index. If a folder matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the folder isn't included in the index.
A list of regular expression patterns to include certain file types in your GitHub repository or repositories. File types that match the patterns are included in the index. File types that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to include certain file names in your GitHub repository or repositories. File names that match the patterns are included in the index. File names that don't match the patterns are excluded from the index. If a file matches both an inclusion and exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain folder names in your GitHub repository or repositories. Folder names that match the patterns are excluded from the index. Folder names that don't match the patterns are included in the index. If a folder matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the folder isn't included in the index.
A list of regular expression patterns to exclude certain file types in your GitHub repository or repositories. File types that match the patterns are excluded from the index. File types that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain file names in your GitHub repository or repositories. File names that match the patterns are excluded from the index. File names that don't match the patterns are included in the index. If a file matches both an exclusion and inclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud to connect to your GitHub. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
A list of DataSourceToIndexFieldMapping objects that map GitHub repository attributes or field names to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub commits to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub issues to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub issue comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub issue attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub pull request comments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub pull requests to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of GitHub pull request attachments to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to GitHub fields. For more information, see Mapping data source fields. The GitHub data source field names must exist in your GitHub custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
Provides the configuration information to connect to Alfresco as your data source.
The URL of the Alfresco site. For example, https://hostname:8080 .
The identifier of the Alfresco site. For example, my-site .
The Amazon Resource Name (ARN) of a Secrets Manager secret that contains the key-value pairs required to connect to your Alfresco data source. The secret must contain a JSON structure with the following keys:
The path to the SSL certificate stored in an Amazon S3 bucket. You use this to connect to Alfresco if you require a secure SSL connection.
You can simply generate a self-signed X509 certificate on any computer using OpenSSL. For an example of using OpenSSL to create an X509 certificate, see Create and sign an X509 certificate.
The name of the S3 bucket that contains the file.
The name of the file.
TRUE to index shared files.
TRUE to index comments of blogs and other content.
Specify whether to index document libraries, wikis, or blogs. You can specify one or more of these options.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Alfresco document libraries to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Alfresco blogs to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of DataSourceToIndexFieldMapping objects that map attributes or field names of Alfresco wikis to Amazon Kendra index field names. To create custom fields, use the UpdateIndex API before you map to Alfresco fields. For more information, see Mapping data source fields. The Alfresco data source field names must exist in your Alfresco custom metadata.
Maps a column or attribute in the data source to an index field. You must first create the fields in the index using the UpdateIndex API.
The name of the column or attribute in the data source.
The type of data stored in the column or attribute.
The name of the field in the index.
A list of regular expression patterns to include certain files in your Alfresco data source. Files that match the patterns are included in the index. Files that don't match the patterns are excluded from the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
A list of regular expression patterns to exclude certain files in your Alfresco data source. Files that match the patterns are excluded from the index. Files that don't match the patterns are included in the index. If a file matches both an inclusion pattern and an exclusion pattern, the exclusion pattern takes precedence and the file isn't included in the index.
Configuration information for an Amazon Virtual Private Cloud to connect to your Alfresco. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Provides a template for the configuration information to connect to your data source.
The template schema used for the data source, where template schemas are supported.
Configuration information for an Amazon Virtual Private Cloud to connect to your data source. For more information, see Configuring a VPC.
A list of identifiers for subnets within your Amazon VPC. The subnets should be able to connect to each other in the VPC, and they should have outgoing access to the Internet through a NAT device.
A list of identifiers of security groups within your Amazon VPC. The security groups should enable Amazon Kendra to connect to the data source.
Configuration information you want to update for altering document metadata and content during the document ingestion process.
For more information on how to create, modify and delete document metadata, or make other content alterations when you ingest documents into Amazon Kendra, see Customizing document metadata during the ingestion process.
Configuration information to alter document attributes or metadata fields and content when ingesting documents into Amazon Kendra.
Provides the configuration information for applying basic logic to alter document metadata and content when ingesting documents into Amazon Kendra. To apply advanced logic that goes beyond what basic logic allows, see HookConfiguration.
For more information, see Customizing document metadata during the ingestion process.
Configuration of the condition used for the target document attribute or metadata field when ingesting documents into Amazon Kendra.
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
The condition operator.
For example, you can use 'Contains' to partially match a string.
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
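As a small illustration of the time zone requirement, the documented example value can be produced from a timezone-aware `datetime` using only the standard library. This assumes the date value is prepared in Python before it is handed to the client. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# The documented example uses an offset of +01:00 (Central European Time).
cet = timezone(timedelta(hours=1))
date_value = datetime(2012, 3, 25, 12, 30, 10, tzinfo=cet)

iso_text = date_value.isoformat()
print(iso_text)  # 2012-03-25T12:30:10+01:00

# Parsing the string back yields an equal, timezone-aware datetime.
parsed = datetime.fromisoformat(iso_text)
```

A naive `datetime` (one built without `tzinfo`) would serialize without the `+01:00` suffix, which is the case the note above warns about.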
Configuration of the target document attribute or metadata field when ingesting documents into Amazon Kendra. You can also include a value.
The identifier of the target document attribute or metadata field.
For example, 'Department' could be an identifier for the target attribute or metadata field that includes the department names associated with the documents.
TRUE to delete the existing target value for your specified target attribute key. You cannot create a target value and set this to TRUE. To create a target value (TargetDocumentAttributeValue), set this to FALSE.
The target value you want to create for the target attribute.
For example, 'Finance' could be the target value for the target attribute key 'Department'.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
TRUE to delete content if the condition used for the target attribute is met.
Configuration information for invoking a Lambda function in Lambda on the original or raw documents before extracting their metadata and text. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
The condition operator.
For example, you can use 'Contains' to partially match a string.
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
Configuration information for invoking a Lambda function in Lambda on the structured documents with their metadata and text extracted. You can use a Lambda function to apply advanced logic for creating, modifying, or deleting document metadata and content. For more information, see Advanced data manipulation.
The condition used for when a Lambda function should be invoked.
For example, you can specify a condition that if there are empty date-time values, then Amazon Kendra should invoke a function that inserts the current date-time.
The identifier of the document attribute used for the condition.
For example, 'Source_URI' could be an identifier for the attribute or metadata field that contains source URIs associated with the documents.
Amazon Kendra currently does not support _document_body as an attribute key used for the condition.
The condition operator.
For example, you can use 'Contains' to partially match a string.
The value used by the operator.
For example, you can specify the value 'financial' for strings in the 'Source_URI' field that partially match or contain this value.
A string, such as "department".
A list of strings. The default maximum length or number of strings is 10.
A long integer value.
A date expressed as an ISO 8601 string.
It is important for the time zone to be included in the ISO 8601 date-time format. For example, 2012-03-25T12:30:10+01:00 is the ISO 8601 date-time format for March 25th 2012 at 12:30PM (plus 10 seconds) in Central European Time.
The Amazon Resource Name (ARN) of a role with permission to run a Lambda function during ingestion. For more information, see IAM roles for Amazon Kendra.
Stores the original, raw documents or the structured, parsed documents before and after altering them. For more information, see Data contracts for Lambda functions.
The Amazon Resource Name (ARN) of a role with permission to run PreExtractionHookConfiguration and PostExtractionHookConfiguration for altering document metadata and content during the document ingestion process. For more information, see IAM roles for Amazon Kendra.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
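The inline enrichment settings described for update_data_source can be assembled as a plain dictionary before the call is made. The sketch below builds the 'Source_URI' contains 'financial' rule from the examples above and sets 'Department' to 'Finance'. The helper name `build_inline_rule` is hypothetical, only string values are handled, and the key names are assumed to follow the CustomDocumentEnrichmentConfiguration request syntax for this method.

```python
def build_inline_rule(condition_key, operator, condition_value,
                      target_key, target_value=None, delete_target_value=False):
    """Build one inline enrichment rule as a plain dict (string values only)."""
    if delete_target_value and target_value is not None:
        # The text above says a target value cannot be created while deletion is TRUE.
        raise ValueError("cannot set a target value and delete it in the same rule")
    target = {
        "TargetDocumentAttributeKey": target_key,
        "TargetDocumentAttributeValueDeletion": delete_target_value,
    }
    if target_value is not None:
        target["TargetDocumentAttributeValue"] = {"StringValue": target_value}
    return {
        "Condition": {
            "ConditionDocumentAttributeKey": condition_key,
            "Operator": operator,
            "ConditionOnValue": {"StringValue": condition_value},
        },
        "Target": target,
    }


rule = build_inline_rule("Source_URI", "Contains", "financial", "Department", "Finance")
enrichment = {"InlineConfigurations": [rule]}

# With credentials configured, the dict could be passed along these lines
# (the Id and IndexId values are placeholders):
# client.update_data_source(Id="data-source-id", IndexId="index-id",
#                           CustomDocumentEnrichmentConfiguration=enrichment)
```

Building the rule in a helper keeps the mutual exclusion between creating and deleting a target value in one place, so an invalid combination fails locally instead of as a ValidationException.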
update_experience(**kwargs)
Updates your Amazon Kendra experience, such as a search application. For more information on creating a search application experience, see Building a search experience with no code.
See also: AWS API Documentation
Request Syntax
response = client.update_experience(
    Id='string',
    Name='string',
    IndexId='string',
    RoleArn='string',
    Configuration={
        'ContentSourceConfiguration': {
            'DataSourceIds': [
                'string',
            ],
            'FaqIds': [
                'string',
            ],
            'DirectPutContent': True|False
        },
        'UserIdentityConfiguration': {
            'IdentityAttributeName': 'string'
        }
    },
    Description='string'
)
[REQUIRED]
The identifier of your Amazon Kendra experience you want to update.
[REQUIRED]
The identifier of the index for your Amazon Kendra experience.
The Amazon Resource Name (ARN) of a role with permission to access the Query API, QuerySuggestions API, SubmitFeedback API, and IAM Identity Center that stores your user and group information. For more information, see IAM roles for Amazon Kendra.
Configuration information you want to update for your Amazon Kendra experience.
The identifiers of your data sources and FAQs. Or, you can specify that you want to use documents indexed via the BatchPutDocument API. This is the content you want to use for your Amazon Kendra experience.
The identifier of the data sources you want to use for your Amazon Kendra experience.
The identifier of the FAQs that you want to use for your Amazon Kendra experience.
TRUE to use documents you indexed directly using the BatchPutDocument API.
The IAM Identity Center field name that contains the identifiers of your users, such as their emails.
The IAM Identity Center field name that contains the identifiers of your users, such as their emails. This is used for user context filtering and for granting access to your Amazon Kendra experience. You must set up IAM Identity Center with Amazon Kendra. You must include your users and groups in your Access Control List when you ingest documents into your index. For more information, see Getting started with an IAM Identity Center identity source.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
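A minimal sketch of assembling the update_experience arguments from the request syntax above. The helper name `experience_update_kwargs` is hypothetical, and it only includes the keys that are supplied. Whether the service leaves omitted keys unchanged is not stated in this section, so treat that as an assumption to verify.

```python
def experience_update_kwargs(experience_id, index_id, data_source_ids=None,
                             faq_ids=None, direct_put_content=None,
                             identity_attribute=None):
    """Return keyword arguments for client.update_experience as a plain dict."""
    content = {}
    if data_source_ids is not None:
        content["DataSourceIds"] = list(data_source_ids)
    if faq_ids is not None:
        content["FaqIds"] = list(faq_ids)
    if direct_put_content is not None:
        content["DirectPutContent"] = bool(direct_put_content)

    configuration = {}
    if content:
        configuration["ContentSourceConfiguration"] = content
    if identity_attribute is not None:
        configuration["UserIdentityConfiguration"] = {
            "IdentityAttributeName": identity_attribute
        }

    kwargs = {"Id": experience_id, "IndexId": index_id}  # both are marked REQUIRED
    if configuration:
        kwargs["Configuration"] = configuration
    return kwargs


# Placeholder identifiers; "email" stands in for an IAM Identity Center field name.
kwargs = experience_update_kwargs("experience-id", "index-id",
                                  data_source_ids=["data-source-id"],
                                  identity_attribute="email")
# response = client.update_experience(**kwargs)
```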
update_index(**kwargs)
Updates an existing Amazon Kendra index.
See also: AWS API Documentation
Request Syntax
response = client.update_index(
    Id='string',
    Name='string',
    RoleArn='string',
    Description='string',
    DocumentMetadataConfigurationUpdates=[
        {
            'Name': 'string',
            'Type': 'STRING_VALUE'|'STRING_LIST_VALUE'|'LONG_VALUE'|'DATE_VALUE',
            'Relevance': {
                'Freshness': True|False,
                'Importance': 123,
                'Duration': 'string',
                'RankOrder': 'ASCENDING'|'DESCENDING',
                'ValueImportanceMap': {
                    'string': 123
                }
            },
            'Search': {
                'Facetable': True|False,
                'Searchable': True|False,
                'Displayable': True|False,
                'Sortable': True|False
            }
        },
    ],
    CapacityUnits={
        'StorageCapacityUnits': 123,
        'QueryCapacityUnits': 123
    },
    UserTokenConfigurations=[
        {
            'JwtTokenTypeConfiguration': {
                'KeyLocation': 'URL'|'SECRET_MANAGER',
                'URL': 'string',
                'SecretManagerArn': 'string',
                'UserNameAttributeField': 'string',
                'GroupAttributeField': 'string',
                'Issuer': 'string',
                'ClaimRegex': 'string'
            },
            'JsonTokenTypeConfiguration': {
                'UserNameAttributeField': 'string',
                'GroupAttributeField': 'string'
            }
        },
    ],
    UserContextPolicy='ATTRIBUTE_FILTER'|'USER_TOKEN',
    UserGroupResolutionConfiguration={
        'UserGroupResolutionMode': 'AWS_SSO'|'NONE'
    }
)
[REQUIRED]
The identifier of the index you want to update.
The document metadata configuration you want to update for the index. Document metadata are fields or attributes associated with your documents. For example, the company department name associated with each document.
Specifies the properties, such as relevance tuning and searchability, of an index field.
The name of the index field.
The data type of the index field.
Provides tuning parameters to determine how the field affects the search results.
Indicates that this field determines how "fresh" a document is. For example, if document 1 was created on November 5, and document 2 was created on October 31, document 1 is "fresher" than document 2. You can only set the Freshness field on one DATE type field. Only applies to DATE fields.
The relative importance of the field in the search. Larger numbers provide more of a boost than smaller numbers.
Specifies the time period that the boost applies to. For example, to make the boost apply to documents with the field value within the last month, you would use "2628000s". Once the field value is beyond the specified range, the effect of the boost drops off. The higher the importance, the faster the effect drops off. If you don't specify a value, the default is 3 months. The value of the field is a numeric string followed by the character "s", for example "86400s" for one day, or "604800s" for one week.
Only applies to DATE fields.
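The Duration format described above (a number of seconds followed by "s") is easy to get wrong by hand, so a small helper can derive it from a `timedelta`. The function name `duration_string` is hypothetical. It only formats the value and does not check any service-side limits.

```python
from datetime import timedelta


def duration_string(period):
    """Format a timedelta as the '<seconds>s' string used by the Duration field."""
    seconds = int(period.total_seconds())
    if seconds <= 0:
        raise ValueError("the boost period must be positive")
    return f"{seconds}s"


print(duration_string(timedelta(days=1)))   # 86400s
print(duration_string(timedelta(weeks=1)))  # 604800s
```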
Determines how values should be interpreted.
When the RankOrder field is ASCENDING, higher numbers are better. For example, a document with a rating score of 10 is higher ranking than a document with a rating score of 1.
When the RankOrder field is DESCENDING, lower numbers are better. For example, in a task tracking application, a priority 1 task is more important than a priority 5 task.
Only applies to LONG and DOUBLE fields.
A list of values that should be given a different boost when they appear in the result list. For example, if you are boosting a field called "department," query terms that match the department field are boosted in the result. However, you can add entries from the department field to boost documents with those values higher.
For example, you can add entries to the map with names of departments. If you add "HR",5 and "Legal",3 those departments are given special attention when they appear in the metadata of a document. When those terms appear they are given the specified importance instead of the regular importance for the boost.
Provides information about how the field is used during a search.
Indicates that the field can be used to create search facets, a count of results for each value in the field. The default is false.
Determines whether the field is used in the search. If the Searchable field is true, you can use relevance tuning to manually tune how Amazon Kendra weights the field in the search. The default is true for string fields and false for number and date fields.
Determines whether the field is returned in the query response. The default is true.
Determines whether the field can be used to sort the results of a query. If you specify sorting on a field that does not have Sortable set to true, Amazon Kendra returns an exception. The default is false.
Sets the number of additional document storage and query capacity units that should be used by the index. You can change the capacity of the index up to 5 times per day, or make 5 API calls.
If you are using extra storage units, you can't reduce the storage capacity below what is required to meet the storage needs for your index.
The amount of extra storage capacity for an index. A single capacity unit provides 30 GB of storage space or 100,000 documents, whichever is reached first. You can add up to 100 extra capacity units.
The amount of extra query capacity for an index and GetQuerySuggestions capacity.
A single extra capacity unit for an index provides 0.1 queries per second or approximately 8,000 queries per day. You can add up to 100 extra capacity units.
GetQuerySuggestions capacity is five times the provisioned query capacity for an index, or the base capacity of 2.5 calls per second, whichever is higher. For example, the base capacity for an index is 0.1 queries per second, and GetQuerySuggestions capacity has a base of 2.5 calls per second. If you add another 0.1 queries per second to total 0.2 queries per second for an index, the GetQuerySuggestions capacity is 2.5 calls per second (higher than five times 0.2 queries per second).
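The capacity arithmetic above can be written out as a short calculation. The constants are taken from the text (0.1 queries per second as the base used in the example, 0.1 per extra unit, and a 2.5 calls per second floor). The function name `suggestions_capacity` is hypothetical, and the sketch is a reading aid rather than a statement of how the service meters capacity.

```python
BASE_QPS = 0.1              # base query capacity used in the example above
QPS_PER_EXTRA_UNIT = 0.1    # each extra query capacity unit adds 0.1 queries per second
SUGGESTIONS_BASE = 2.5      # base GetQuerySuggestions calls per second
SUGGESTIONS_MULTIPLIER = 5  # "five times the provisioned query capacity"


def suggestions_capacity(extra_query_units):
    """Return (index queries per second, GetQuerySuggestions calls per second)."""
    total_qps = BASE_QPS + QPS_PER_EXTRA_UNIT * extra_query_units
    return total_qps, max(SUGGESTIONS_BASE, SUGGESTIONS_MULTIPLIER * total_qps)


qps, suggestions = suggestions_capacity(1)
print(round(qps, 2), suggestions)  # 0.2 2.5
```

With one extra unit, five times 0.2 is only 1.0, so the 2.5 floor applies. Under these constants the multiplier starts to dominate once the index exceeds 0.5 queries per second.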
The user token configuration.
Provides the configuration information for a token.
Information about the JWT token type configuration.
The location of the key.
The signing key URL.
The Amazon Resource Name (ARN) of the secret.
The user name attribute field.
The group attribute field.
The issuer of the token.
The regular expression that identifies the claim.
Information about the JSON token type configuration.
The user name attribute field.
The group attribute field.
Enables fetching access levels of groups and users from an IAM Identity Center (successor to Single Sign-On) identity source. To configure this, see UserGroupResolutionConfiguration.
The identity store provider (mode) you want to use to fetch access levels of groups and users. IAM Identity Center (successor to Single Sign-On) is currently the only available mode. Your users and groups must exist in an IAM Identity Center identity source in order to use this mode.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ServiceQuotaExceededException
kendra.Client.exceptions.InternalServerException
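As an illustration of the DocumentMetadataConfigurationUpdates structure, the sketch below describes a string field with the "HR" and "Legal" boosts from the example above. The field name "department", the importance value, and the helper `update_index_kwargs` are illustrative choices, not values taken from a real index.

```python
ALLOWED_TYPES = {"STRING_VALUE", "STRING_LIST_VALUE", "LONG_VALUE", "DATE_VALUE"}

department_field = {
    "Name": "department",  # hypothetical custom field
    "Type": "STRING_VALUE",
    "Relevance": {
        "Importance": 2,  # regular boost for the field
        "ValueImportanceMap": {"HR": 5, "Legal": 3},  # per-value boosts from the example
    },
    "Search": {
        "Facetable": True,
        "Searchable": True,
        "Displayable": True,
        "Sortable": False,
    },
}


def update_index_kwargs(index_id, fields):
    """Return keyword arguments for client.update_index after a basic type check."""
    for field in fields:
        if field["Type"] not in ALLOWED_TYPES:
            raise ValueError(f"unsupported field type: {field['Type']}")
    return {"Id": index_id, "DocumentMetadataConfigurationUpdates": list(fields)}


kwargs = update_index_kwargs("index-id", [department_field])  # placeholder index ID
# response = client.update_index(**kwargs)
```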
update_query_suggestions_block_list(**kwargs)
Updates a block list used for query suggestions for an index.
Updates to a block list might not take effect right away. Amazon Kendra needs to refresh the entire suggestions list to apply any updates to the block list. Other changes not related to the block list apply immediately.
If a block list is updating, then you need to wait for the first update to finish before submitting another update.
Amazon Kendra supports partial updates, so you only need to provide the fields you want to update.
UpdateQuerySuggestionsBlockList is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.update_query_suggestions_block_list(
    IndexId='string',
    Id='string',
    Name='string',
    Description='string',
    SourceS3Path={
        'Bucket': 'string',
        'Key': 'string'
    },
    RoleArn='string'
)
[REQUIRED]
The identifier of the index for the block list.
[REQUIRED]
The identifier of the block list you want to update.
The S3 path where your block list text file sits in S3.
If you update your block list and provide the same path to the block list text file in S3, then Amazon Kendra reloads the file to refresh the block list. Amazon Kendra does not automatically refresh your block list. You need to call the UpdateQuerySuggestionsBlockList API to refresh your block list.
If you update your block list, then Amazon Kendra asynchronously refreshes all query suggestions with the latest content in the S3 file. This means changes might not take effect immediately.
The name of the S3 bucket that contains the file.
The name of the file.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
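Because partial updates are supported for the block list, one way to call the method is to pass only the fields that change. The helper below, with the hypothetical name `block_list_update_kwargs`, drops anything left as None and requires the S3 bucket and key to be given together.

```python
def block_list_update_kwargs(index_id, block_list_id, name=None, description=None,
                             bucket=None, key=None, role_arn=None):
    """Return keyword arguments containing only the fields to be updated."""
    if (bucket is None) != (key is None):
        raise ValueError("bucket and key must be provided together")
    kwargs = {"IndexId": index_id, "Id": block_list_id}  # both are marked REQUIRED
    if name is not None:
        kwargs["Name"] = name
    if description is not None:
        kwargs["Description"] = description
    if bucket is not None:
        kwargs["SourceS3Path"] = {"Bucket": bucket, "Key": key}
    if role_arn is not None:
        kwargs["RoleArn"] = role_arn
    return kwargs


# Re-sending the same S3 path asks Amazon Kendra to reload the file, per the text above.
# The identifiers, bucket, and key are placeholders.
kwargs = block_list_update_kwargs("index-id", "block-list-id",
                                  bucket="example-bucket", key="block-list.txt")
# response = client.update_query_suggestions_block_list(**kwargs)
```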
update_query_suggestions_config(**kwargs)
Updates the settings of query suggestions for an index.
Amazon Kendra supports partial updates, so you only need to provide the fields you want to update.
If an update is currently processing, you need to wait for it to finish before making another update.
Updates to query suggestions settings might not take effect right away. The time for your updated settings to take effect depends on the updates made and the number of search queries in your index.
You can still enable/disable query suggestions at any time.
UpdateQuerySuggestionsConfig is currently not supported in the Amazon Web Services GovCloud (US-West) region.
See also: AWS API Documentation
Request Syntax
response = client.update_query_suggestions_config(
    IndexId='string',
    Mode='ENABLED'|'LEARN_ONLY',
    QueryLogLookBackWindowInDays=123,
    IncludeQueriesWithoutUserInformation=True|False,
    MinimumNumberOfQueryingUsers=123,
    MinimumQueryCount=123
)
[REQUIRED]
The identifier of the index with query suggestions you want to update.
Set the mode to ENABLED or LEARN_ONLY.
By default, Amazon Kendra enables query suggestions. LEARN_ONLY mode allows you to turn off query suggestions. You can update this at any time.
In LEARN_ONLY mode, Amazon Kendra continues to learn from new queries to keep suggestions up to date for when you are ready to switch to ENABLED mode again.
How recent your queries are in your query log time window.
The time window is the number of days from current day to past days.
By default, Amazon Kendra sets this to 180.
TRUE to include queries without user information (i.e. all queries, irrespective of the user), otherwise FALSE to only include queries with user information.
If you pass user information to Amazon Kendra along with the queries, you can set this flag to FALSE and instruct Amazon Kendra to only consider queries with user information.
If you set this to FALSE, Amazon Kendra only considers queries searched at least MinimumQueryCount times across MinimumNumberOfQueryingUsers unique users for suggestions.
If you set this to TRUE, Amazon Kendra ignores all user information and learns from all queries.
The minimum number of unique users who must search a query in order for the query to be eligible to suggest to your users.
Increasing this number might decrease the number of suggestions. However, this ensures a query is searched by many users and is truly popular to suggest to users.
How you tune this setting depends on your specific needs.
The minimum number of times a query must be searched in order to be eligible to suggest to your users.
Decreasing this number increases the number of suggestions. However, this affects the quality of suggestions as it sets a low bar for a query to be considered popular to suggest to users.
How you tune this setting depends on your specific needs.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.InternalServerException
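To get a feel for how MinimumQueryCount, MinimumNumberOfQueryingUsers and IncludeQueriesWithoutUserInformation interact, the rule described above can be modelled locally. This is a simplified reading of the parameter descriptions, not Amazon Kendra's actual learning logic. The function `eligible_queries` and the log format (pairs of query text and an optional user ID) are assumptions made for the illustration.

```python
from collections import defaultdict


def eligible_queries(log, min_query_count, min_users, include_without_user_info):
    """Return the set of queries that pass the thresholds in this simplified model."""
    counts = defaultdict(int)
    users = defaultdict(set)
    for query, user_id in log:
        if user_id is None and not include_without_user_info:
            continue  # FALSE: only queries that carry user information are considered
        counts[query] += 1
        if user_id is not None:
            users[query].add(user_id)

    eligible = set()
    for query, count in counts.items():
        if count < min_query_count:
            continue
        # Assumption: the unique-user threshold only applies when user information is used.
        if not include_without_user_info and len(users[query]) < min_users:
            continue
        eligible.add(query)
    return eligible


log = [("vpn setup", "u1"), ("vpn setup", "u2"), ("vpn setup", None),
       ("payroll", "u1"), ("payroll", "u1")]
print(sorted(eligible_queries(log, 2, 2, include_without_user_info=False)))  # ['vpn setup']
print(sorted(eligible_queries(log, 2, 2, include_without_user_info=True)))   # ['payroll', 'vpn setup']
```

In the first call "payroll" is searched twice but by a single user, so it misses the unique-user threshold. In the second call user information is ignored and both queries qualify on count alone.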
update_thesaurus(**kwargs)
Updates a thesaurus for an index.
See also: AWS API Documentation
Request Syntax
response = client.update_thesaurus(
    Id='string',
    Name='string',
    IndexId='string',
    Description='string',
    RoleArn='string',
    SourceS3Path={
        'Bucket': 'string',
        'Key': 'string'
    }
)
[REQUIRED]
The identifier of the thesaurus you want to update.
[REQUIRED]
The identifier of the index for the thesaurus.
SourceS3Path.
Information required to find a specific file in an Amazon S3 bucket.
The name of the S3 bucket that contains the file.
The name of the file.
None
Exceptions
kendra.Client.exceptions.ValidationException
kendra.Client.exceptions.ResourceNotFoundException
kendra.Client.exceptions.ThrottlingException
kendra.Client.exceptions.AccessDeniedException
kendra.Client.exceptions.ConflictException
kendra.Client.exceptions.InternalServerException
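The exception classes listed for these update methods are reachable through `client.exceptions`, so a caller can react to throttling explicitly. The wrapper below is a minimal sketch. The name `call_with_retry`, the attempt count, and the delays are arbitrary choices. botocore also has its own retry configuration, which this example does not replace.

```python
import time


def call_with_retry(client, operation, max_attempts=3, base_delay=0.5, **kwargs):
    """Call client.<operation>(**kwargs), retrying when the service throttles."""
    for attempt in range(1, max_attempts + 1):
        try:
            return getattr(client, operation)(**kwargs)
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                raise  # give up and surface the last throttling error
            time.sleep(base_delay * 2 ** (attempt - 1))  # simple exponential backoff


# Possible usage with a real client (identifiers are placeholders):
# import boto3
# client = boto3.client('kendra')
# call_with_retry(client, "update_thesaurus", Id="thesaurus-id", IndexId="index-id",
#                 Description="Updated synonyms")
```

Only ThrottlingException is retried here. Errors such as ValidationException or ResourceNotFoundException indicate a problem with the request itself and are left to propagate.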
The available paginators are: