ingest_knowledge_base_documents
- AgentsforBedrock.Client.ingest_knowledge_base_documents(**kwargs)
- Ingests documents directly into the knowledge base that is connected to the data source. The dataSourceType specified in the content for each document must match the type of the data source that you specify in the header. For more information, see Ingest documents into a knowledge base in real-time in the Amazon Bedrock User Guide.
- See also: AWS API Documentation
- Request Syntax
    response = client.ingest_knowledge_base_documents(
        clientToken='string',
        dataSourceId='string',
        documents=[
            {
                'content': {
                    'custom': {
                        'customDocumentIdentifier': {
                            'id': 'string'
                        },
                        'inlineContent': {
                            'byteContent': {
                                'data': b'bytes',
                                'mimeType': 'string'
                            },
                            'textContent': {
                                'data': 'string'
                            },
                            'type': 'BYTE'|'TEXT'
                        },
                        's3Location': {
                            'bucketOwnerAccountId': 'string',
                            'uri': 'string'
                        },
                        'sourceType': 'IN_LINE'|'S3_LOCATION'
                    },
                    'dataSourceType': 'CUSTOM'|'S3',
                    's3': {
                        's3Location': {
                            'uri': 'string'
                        }
                    }
                },
                'metadata': {
                    'inlineAttributes': [
                        {
                            'key': 'string',
                            'value': {
                                'booleanValue': True|False,
                                'numberValue': 123.0,
                                'stringListValue': [
                                    'string',
                                ],
                                'stringValue': 'string',
                                'type': 'BOOLEAN'|'NUMBER'|'STRING'|'STRING_LIST'
                            }
                        },
                    ],
                    's3Location': {
                        'bucketOwnerAccountId': 'string',
                        'uri': 'string'
                    },
                    'type': 'IN_LINE_ATTRIBUTE'|'S3_LOCATION'
                }
            },
        ],
        knowledgeBaseId='string'
    )
- Parameters:
- clientToken (string) – A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If this token matches a previous request, Amazon Bedrock ignores the request, but does not return an error. For more information, see Ensuring idempotency. This field is autopopulated if not provided.
- dataSourceId (string) – [REQUIRED] The unique identifier of the data source connected to the knowledge base that you’re adding documents to.
- documents (list) – [REQUIRED] A list of objects, each of which contains information about the documents to add.
- (dict) – Contains information about a document to ingest into a knowledge base and metadata to associate with it.
- content (dict) – [REQUIRED] Contains the content of the document.
- custom (dict) – Contains information about the content to ingest into a knowledge base connected to a custom data source.
- customDocumentIdentifier (dict) – [REQUIRED] A unique identifier for the document.
- id (string) – [REQUIRED] The identifier of the document to ingest into a custom data source.
- inlineContent (dict) – Contains information about content defined inline to ingest into a knowledge base.
- byteContent (dict) – Contains information about content defined inline in bytes.
- data (bytes) – [REQUIRED] The base64-encoded string of the content.
- mimeType (string) – [REQUIRED] The MIME type of the content. For a list of MIME types, see Media Types. The following MIME types are supported:
- text/plain
- text/html 
- text/csv 
- text/vtt 
- message/rfc822 
- application/xhtml+xml 
- application/pdf 
- application/msword 
- application/vnd.ms-word.document.macroenabled.12 
- application/vnd.ms-word.template.macroenabled.12 
- application/vnd.ms-excel 
- application/vnd.ms-excel.addin.macroenabled.12 
- application/vnd.ms-excel.sheet.macroenabled.12 
- application/vnd.ms-excel.template.macroenabled.12 
- application/vnd.ms-excel.sheet.binary.macroenabled.12 
- application/vnd.ms-spreadsheetml 
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet 
- application/vnd.openxmlformats-officedocument.spreadsheetml.template 
- application/vnd.openxmlformats-officedocument.wordprocessingml.document 
- application/vnd.openxmlformats-officedocument.wordprocessingml.template 
 
 
- textContent (dict) – Contains information about content defined inline in text.
- data (string) – [REQUIRED] The text of the content.
- type (string) – [REQUIRED] The type of inline content to define.
- s3Location (dict) – Contains information about the Amazon S3 location of the file from which to ingest data.
- bucketOwnerAccountId (string) – The identifier of the Amazon Web Services account that owns the S3 bucket containing the content to ingest.
- uri (string) – [REQUIRED] The S3 URI of the file containing the content to ingest.
- sourceType (string) – [REQUIRED] The source of the data to ingest.
- dataSourceType (string) – [REQUIRED] The type of data source that is connected to the knowledge base to which to ingest this document.
- s3 (dict) – Contains information about the content to ingest into a knowledge base connected to an Amazon S3 data source.
- s3Location (dict) – [REQUIRED] The S3 location of the file containing the content to ingest.
- uri (string) – [REQUIRED] The location’s URI. For example, s3://my-bucket/chunk-processor/.
 
 
 
- metadata (dict) – Contains the metadata to associate with the document.
- inlineAttributes (list) – An array of objects, each of which defines a metadata attribute to associate with the content to ingest. You define the attributes inline.
- (dict) – Contains information about a metadata attribute.
- key (string) – [REQUIRED] The key of the metadata attribute.
- value (dict) – [REQUIRED] Contains the value of the metadata attribute.
- booleanValue (boolean) – The value of the Boolean metadata attribute.
- numberValue (float) – The value of the numeric metadata attribute.
- stringListValue (list) – An array of strings that define the value of the metadata attribute.
- (string) –
- stringValue (string) – The value of the string metadata attribute.
- type (string) – [REQUIRED] The type of the metadata attribute.
- s3Location (dict) – The Amazon S3 location of the file containing metadata to associate with the content to ingest.
- bucketOwnerAccountId (string) – The identifier of the Amazon Web Services account that owns the S3 bucket containing the content to ingest.
- uri (string) – [REQUIRED] The S3 URI of the file containing the content to ingest.
- type (string) – [REQUIRED] The type of the source from which to add metadata.
- knowledgeBaseId (string) – [REQUIRED] The unique identifier of the knowledge base to ingest the documents into.
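As a minimal sketch of how the parameters above might be assembled for a custom data source with inline text, the helpers below build the nested dictionaries in plain Python. The function names (metadata_attribute, custom_text_document, build_request) are hypothetical and not part of boto3, and generating a UUID as the clientToken is an assumption about what makes a reasonable idempotency token.

```python
import uuid


def metadata_attribute(key, value):
    """Wrap a Python value in the inlineAttributes entry shape (hypothetical helper)."""
    # bool is checked first because bool is a subclass of int in Python.
    if isinstance(value, bool):
        typed = {"type": "BOOLEAN", "booleanValue": value}
    elif isinstance(value, (int, float)):
        typed = {"type": "NUMBER", "numberValue": float(value)}
    elif isinstance(value, str):
        typed = {"type": "STRING", "stringValue": value}
    elif isinstance(value, (list, tuple)) and all(isinstance(v, str) for v in value):
        typed = {"type": "STRING_LIST", "stringListValue": list(value)}
    else:
        raise TypeError(f"unsupported metadata value for {key!r}: {value!r}")
    return {"key": key, "value": typed}


def custom_text_document(doc_id, text, attributes=None):
    """Build one entry of `documents` for a CUSTOM data source with inline text."""
    document = {
        "content": {
            "dataSourceType": "CUSTOM",
            "custom": {
                "customDocumentIdentifier": {"id": doc_id},
                "sourceType": "IN_LINE",
                "inlineContent": {
                    "type": "TEXT",
                    "textContent": {"data": text},
                },
            },
        }
    }
    if attributes:
        document["metadata"] = {
            "type": "IN_LINE_ATTRIBUTE",
            "inlineAttributes": [
                metadata_attribute(k, v) for k, v in attributes.items()
            ],
        }
    return document


def build_request(knowledge_base_id, data_source_id, documents, client_token=None):
    """Assemble keyword arguments for ingest_knowledge_base_documents."""
    return {
        "knowledgeBaseId": knowledge_base_id,
        "dataSourceId": data_source_id,
        "documents": list(documents),
        # Assumption: a fresh UUID per logical request; reuse it when retrying.
        "clientToken": client_token or str(uuid.uuid4()),
    }
```

The resulting dictionary could then be passed as client.ingest_knowledge_base_documents(**request), where client is assumed to come from boto3.client('bedrock-agent') and the identifiers are your own.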
 
- Return type:
- dict 
- Returns:
- Response Syntax
    {
        'documentDetails': [
            {
                'dataSourceId': 'string',
                'identifier': {
                    'custom': {
                        'id': 'string'
                    },
                    'dataSourceType': 'CUSTOM'|'S3',
                    's3': {
                        'uri': 'string'
                    }
                },
                'knowledgeBaseId': 'string',
                'status': 'INDEXED'|'PARTIALLY_INDEXED'|'PENDING'|'FAILED'|'METADATA_PARTIALLY_INDEXED'|'METADATA_UPDATE_FAILED'|'IGNORED'|'NOT_FOUND'|'STARTING'|'IN_PROGRESS'|'DELETING'|'DELETE_IN_PROGRESS',
                'statusReason': 'string',
                'updatedAt': datetime(2015, 1, 1)
            },
        ]
    }
- Response Structure
- (dict) –
- documentDetails (list) – A list of objects, each of which contains information about the documents that were ingested.
- (dict) – Contains the details for a document that was ingested or deleted.
- dataSourceId (string) – The identifier of the data source connected to the knowledge base that the document was ingested into or deleted from.
- identifier (dict) – Contains information that identifies the document.
- custom (dict) – Contains information that identifies the document in a custom data source.
- id (string) – The identifier of the document to ingest into a custom data source.
- dataSourceType (string) – The type of data source connected to the knowledge base that contains the document.
- s3 (dict) – Contains information that identifies the document in an S3 data source.
- uri (string) – The location’s URI. For example, s3://my-bucket/chunk-processor/.
 
 
- knowledgeBaseId (string) – The identifier of the knowledge base that the document was ingested into or deleted from.
- status (string) – The ingestion status of the document. The following statuses are possible:
- STARTING – You submitted the ingestion job containing the document.
- PENDING – The document is waiting to be ingested. 
- IN_PROGRESS – The document is being ingested. 
- INDEXED – The document was successfully indexed. 
- PARTIALLY_INDEXED – The document was partially indexed. 
- METADATA_PARTIALLY_INDEXED – You submitted metadata for an existing document and it was partially indexed. 
- METADATA_UPDATE_FAILED – You submitted a metadata update for an existing document but it failed. 
- FAILED – The document failed to be ingested. 
- NOT_FOUND – The document wasn’t found. 
- IGNORED – The document was ignored during ingestion. 
- DELETING – You submitted the delete job containing the document. 
- DELETE_IN_PROGRESS – The document is being deleted. 
 
- statusReason (string) – The reason for the status. Appears alongside the status IGNORED.
- updatedAt (datetime) – The date and time at which the document was last updated.
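One way to act on the response structure above is to group the returned documents by their status so that pending and failed items can be handled separately. The sketch below is illustrative only: document_key and group_by_status are hypothetical helper names, and the sample response is made-up data in the documented shape.

```python
from collections import defaultdict


def document_key(detail):
    """Return the custom id or S3 URI that identifies a documentDetails entry."""
    identifier = detail["identifier"]
    if identifier["dataSourceType"] == "CUSTOM":
        return identifier["custom"]["id"]
    return identifier["s3"]["uri"]


def group_by_status(response):
    """Map each status string to the list of document keys that reported it."""
    groups = defaultdict(list)
    for detail in response.get("documentDetails", []):
        groups[detail["status"]].append(document_key(detail))
    return dict(groups)


# Made-up response in the documented shape, for illustration only.
sample_response = {
    "documentDetails": [
        {
            "dataSourceId": "DS",
            "knowledgeBaseId": "KB",
            "identifier": {"dataSourceType": "CUSTOM", "custom": {"id": "doc-1"}},
            "status": "STARTING",
        },
        {
            "dataSourceId": "DS",
            "knowledgeBaseId": "KB",
            "identifier": {
                "dataSourceType": "S3",
                "s3": {"uri": "s3://my-bucket/chunk-processor/"},
            },
            "status": "IGNORED",
            "statusReason": "example reason",
        },
    ]
}

summary = group_by_status(sample_response)
```

Because ingestion continues after the call returns, a status such as STARTING or PENDING in this summary would typically need to be checked again later rather than treated as final.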
 
 
 
 
- Exceptions
- AgentsforBedrock.Client.exceptions.ThrottlingException
- AgentsforBedrock.Client.exceptions.AccessDeniedException
- AgentsforBedrock.Client.exceptions.ValidationException
- AgentsforBedrock.Client.exceptions.InternalServerException
- AgentsforBedrock.Client.exceptions.ResourceNotFoundException
- AgentsforBedrock.Client.exceptions.ServiceQuotaExceededException
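The exceptions listed above could be handled with a small retry wrapper such as the sketch below. It is an assumption, not documented behavior, that ThrottlingException and InternalServerException are worth retrying while the others are not. The helper names (error_code, call_with_backoff) are hypothetical. The code reads the error code from the response attribute that botocore's ClientError carries, so it does not need to import boto3 itself. Note that botocore also has its own configurable retry handling, which may make a wrapper like this unnecessary.

```python
import random
import time

# Assumption: only these error codes are treated as transient.
RETRYABLE_CODES = {"ThrottlingException", "InternalServerException"}


def error_code(exc):
    """Extract the service error code from a ClientError-like exception."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code")


def call_with_backoff(call, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Invoke `call()` and retry transient errors with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything non-transient, or the final failed attempt.
            if error_code(exc) not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            sleep(delay)
```

A call might then look like call_with_backoff(lambda: client.ingest_knowledge_base_documents(**request)), reusing the same clientToken across attempts so that a retried request is not ingested twice.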