detect_labels

Rekognition.Client.detect_labels(**kwargs)

Detects instances of real-world entities within an image (JPEG or PNG) provided as input. This includes objects like flower, tree, and table; events like wedding, graduation, and birthday party; and concepts like landscape, evening, and nature.

For an example, see Analyzing images stored in an Amazon S3 bucket in the Amazon Rekognition Developer Guide.

You pass the input image as base64-encoded image bytes or as a reference to an image in an Amazon S3 bucket. If you use the AWS CLI to call Amazon Rekognition operations, passing image bytes is not supported. The image must be either a PNG or JPEG formatted file.
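As a minimal sketch of the two input styles, the helper below builds the Image argument either from a local file or from an S3 reference. The name build_image_param and its arguments are hypothetical and not part of boto3. The sketch assumes the SDK takes care of encoding the raw bytes placed in Bytes.

```python
from pathlib import Path
from typing import Optional


def build_image_param(
    path: Optional[str] = None,
    bucket: Optional[str] = None,
    key: Optional[str] = None,
    version: Optional[str] = None,
) -> dict:
    """Build the Image argument for detect_labels (illustrative helper)."""
    if path is not None:
        # Local file: pass the raw bytes of a JPEG or PNG file.
        return {"Bytes": Path(path).read_bytes()}
    if bucket and key:
        # S3 reference: Version is only added when one is supplied.
        s3_object = {"Bucket": bucket, "Name": key}
        if version:
            s3_object["Version"] = version
        return {"S3Object": s3_object}
    raise ValueError("Provide either a local path or an S3 bucket and key")
```

With a client created elsewhere, the result could be passed as client.detect_labels(Image=build_image_param(bucket='mybucket', key='myphoto')).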

Optional Parameters

You can specify one or both of the GENERAL_LABELS and IMAGE_PROPERTIES feature types when calling the DetectLabels API. Including GENERAL_LABELS ensures that the response includes the labels detected in the input image, while including IMAGE_PROPERTIES ensures that the response includes information about the image quality and color.

When using GENERAL_LABELS and/or IMAGE_PROPERTIES, you can provide filtering criteria to the Settings parameter. You can filter with sets of individual labels or with label categories. You can specify inclusive filters, exclusive filters, or a combination of inclusive and exclusive filters. For more information on filtering, see Detecting Labels in an Image.

When getting labels, you can specify MinConfidence to control the confidence threshold for the labels returned. The default is 55%. You can also add the MaxLabels parameter to limit the number of labels returned. Both the default and the upper limit are 1000 labels. These arguments are only valid when supplying GENERAL_LABELS as a feature type.
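The optional parameters described above can be assembled as in the following sketch. The helper build_detect_labels_kwargs and its argument names are hypothetical. Only the dictionary keys (Features, Settings, GeneralLabels, ImageProperties, MinConfidence, MaxLabels) come from the request syntax of this operation.

```python
from typing import Optional, Sequence


def build_detect_labels_kwargs(
    image: dict,
    include_labels: Sequence[str] = (),
    exclude_categories: Sequence[str] = (),
    min_confidence: Optional[float] = None,
    max_labels: Optional[int] = None,
    max_dominant_colors: Optional[int] = None,
) -> dict:
    """Assemble keyword arguments for detect_labels (illustrative helper)."""
    features = ["GENERAL_LABELS"]
    general = {}
    if include_labels:
        general["LabelInclusionFilters"] = list(include_labels)
    if exclude_categories:
        general["LabelCategoryExclusionFilters"] = list(exclude_categories)

    settings = {}
    if general:
        settings["GeneralLabels"] = general
    if max_dominant_colors is not None:
        # Asking for dominant colors implies the IMAGE_PROPERTIES feature.
        features.append("IMAGE_PROPERTIES")
        settings["ImageProperties"] = {"MaxDominantColors": max_dominant_colors}

    kwargs = {"Image": image, "Features": features}
    if settings:
        kwargs["Settings"] = settings
    # MinConfidence and MaxLabels are only valid with GENERAL_LABELS,
    # which this sketch always requests.
    if min_confidence is not None:
        kwargs["MinConfidence"] = min_confidence
    if max_labels is not None:
        kwargs["MaxLabels"] = max_labels
    return kwargs
```

The returned dictionary is meant to be unpacked into the call, for example client.detect_labels(**kwargs).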

Response Elements

For each object, scene, and concept the API returns one or more labels. The API returns the following types of information about labels:

Name - The name of the detected label.
Confidence - The level of confidence in the label assigned to a detected object.
Parents - The ancestor labels for a detected label. DetectLabels returns a hierarchical taxonomy of detected labels. For example, a detected car might be assigned the label Car. The label Car has two parent labels: Vehicle (its parent) and Transportation (its grandparent). The response includes all ancestors for a label, where every ancestor is a unique label. In the previous example, Car, Vehicle, and Transportation are returned as unique labels in the response.
Aliases - Possible aliases for the label.
Categories - The label categories that the detected label belongs to.
BoundingBox - Bounding boxes are described for all instances of detected common object labels, returned in an array of Instance objects. An Instance object contains a BoundingBox object, describing the location of the label on the input image. It also includes the confidence for the accuracy of the detected bounding box.
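The Response Structure for this operation describes BoundingBox values as ratios of the overall image width and height. The sketch below walks the Instances of each label and converts a box to pixel coordinates. The function names are hypothetical, and the image dimensions are assumed to be known by the caller, since the response does not include them.

```python
def box_to_pixels(box: dict, image_width: int, image_height: int) -> tuple:
    """Convert a ratio-based BoundingBox to (left, top, width, height) pixels."""
    left = round(box["Left"] * image_width)
    top = round(box["Top"] * image_height)
    width = round(box["Width"] * image_width)
    height = round(box["Height"] * image_height)
    return left, top, width, height


def iter_instances(response: dict):
    """Yield (label name, instance confidence, bounding box) for every instance."""
    for label in response.get("Labels", []):
        # Labels without bounding boxes simply have no Instances to iterate.
        for instance in label.get("Instances", []):
            yield label["Name"], instance["Confidence"], instance["BoundingBox"]
```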

The API returns the following information regarding the image, as part of the ImageProperties structure:

Quality - Information about the Sharpness, Brightness, and Contrast of the input image, scored from 0 to 100. Image quality is returned for the entire image, as well as the background and the foreground.
Dominant Color - An array of the dominant colors in the image.
Foreground - Information about the sharpness, brightness, and dominant colors of the input image’s foreground.
Background - Information about the sharpness, brightness, and dominant colors of the input image’s background.
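As a sketch of how the ImageProperties structure might be read, the helper below returns the overall quality scores and the most prominent dominant colors. The name summarize_image_properties and the choice of three colors are assumptions made for illustration. The keys used (Quality, DominantColors, HexCode, PixelPercent) are taken from the response syntax.

```python
def summarize_image_properties(response: dict, top_n: int = 3) -> dict:
    """Return overall quality scores and the top dominant colors by pixel share."""
    props = response.get("ImageProperties", {})
    # Sort explicitly so the result does not depend on the order in the response.
    colors = sorted(
        props.get("DominantColors", []),
        key=lambda color: color["PixelPercent"],
        reverse=True,
    )
    return {
        "quality": props.get("Quality", {}),
        "top_colors": [(c["HexCode"], c["PixelPercent"]) for c in colors[:top_n]],
    }
```

ImageProperties is expected to be present only when IMAGE_PROPERTIES was requested as a feature, so the helper falls back to empty values otherwise.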

The list of returned labels will include at least one label for every detected object, along with information about that label. In the following example, suppose the input image has a lighthouse, the sea, and a rock. The response includes all three labels, one for each object, as well as the confidence in the label:

{Name: lighthouse, Confidence: 98.4629}

{Name: rock, Confidence: 79.2097}

{Name: sea, Confidence: 75.061}

The list of labels can include multiple labels for the same object. For example, if the input image shows a flower such as a tulip, the operation might return the following three labels:

{Name: flower, Confidence: 99.0562}

{Name: plant, Confidence: 99.0562}

{Name: tulip, Confidence: 99.0562}

In this example, the detection algorithm more precisely identifies the flower as a tulip.
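One way to pick out the most precise labels is to use the Parents field: a label that no other returned label lists as an ancestor is a leaf of the returned hierarchy. The following is a sketch of that idea, not a documented Rekognition feature. The function name is hypothetical, and the parent relationships in the usage comment are assumed for illustration.

```python
def most_specific_labels(labels: list) -> list:
    """Return names of labels that are not an ancestor of any other returned label."""
    parent_names = {
        parent["Name"]
        for label in labels
        for parent in label.get("Parents", [])
    }
    return [label["Name"] for label in labels if label["Name"] not in parent_names]


# Assumed hierarchy for the tulip example: tulip -> flower -> plant.
sample = [
    {"Name": "flower", "Parents": [{"Name": "plant"}]},
    {"Name": "plant", "Parents": []},
    {"Name": "tulip", "Parents": [{"Name": "flower"}, {"Name": "plant"}]},
]
print(most_specific_labels(sample))
```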

Note

If the object detected is a person, the operation doesn’t provide the same facial details that the DetectFaces operation provides.

This is a stateless API operation. That is, the operation doesn’t persist any data.

This operation requires permissions to perform the rekognition:DetectLabels action.

Request Syntax

response = client.detect_labels(
    Image={
        'Bytes': b'bytes',
        'S3Object': {
            'Bucket': 'string',
            'Name': 'string',
            'Version': 'string'
        }
    },
    MaxLabels=123,
    MinConfidence=...,
    Features=[
        'GENERAL_LABELS'|'IMAGE_PROPERTIES',
    ],
    Settings={
        'GeneralLabels': {
            'LabelInclusionFilters': [
                'string',
            ],
            'LabelExclusionFilters': [
                'string',
            ],
            'LabelCategoryInclusionFilters': [
                'string',
            ],
            'LabelCategoryExclusionFilters': [
                'string',
            ]
        },
        'ImageProperties': {
            'MaxDominantColors': 123
        }
    }
)

Parameters:

Image (dict) –
[REQUIRED]

The input image as base64-encoded bytes or an S3 object. If you use the AWS CLI to call Amazon Rekognition operations, passing image bytes is not supported. Images stored in an S3 bucket do not need to be base64-encoded.

If you are using an AWS SDK to call Amazon Rekognition, you might not need to base64-encode image bytes passed using the Bytes field. For more information, see Images in the Amazon Rekognition Developer Guide.
- Bytes (bytes) –
  
  Blob of image bytes up to 5 MB. Note that the maximum image size you can pass to DetectCustomLabels is 4 MB.
- S3Object (dict) –
  
  Identifies an S3 object as the image source.
  - Bucket (string) –
    
    Name of the S3 bucket.
  - Name (string) –
    
    S3 object key name.
  - Version (string) –
    
    If versioning is enabled on the bucket, you can specify the object version.
MaxLabels (integer) – Maximum number of labels you want the service to return in the response. The service returns the specified number of highest confidence labels. Only valid when GENERAL_LABELS is specified as a feature type in the Features input parameter.
MinConfidence (float) –
Specifies the minimum confidence level for the labels to return. Amazon Rekognition doesn’t return any labels with confidence lower than this specified value.

If MinConfidence is not specified, the operation returns labels with confidence values greater than or equal to 55 percent. Only valid when GENERAL_LABELS is specified as a feature type in the Features input parameter.
Features (list) –
A list of the types of analysis to perform. Specifying GENERAL_LABELS uses the label detection feature, while specifying IMAGE_PROPERTIES returns information regarding image color and quality. If no option is specified, GENERAL_LABELS is used by default.
- (string) –
Settings (dict) –
The filters to be applied to returned detected labels and image properties. Specified filters can be inclusive, exclusive, or a combination of both. Filters can be used for individual labels or label categories. The exact label names or label categories must be supplied. For a full list of labels and label categories, see Detecting labels.
- GeneralLabels (dict) –
  
  Contains the specified filters for GENERAL_LABELS.
  - LabelInclusionFilters (list) –
    
    The labels that should be included in the return from DetectLabels.
    - (string) –
  - LabelExclusionFilters (list) –
    
    The labels that should be excluded from the return from DetectLabels.
    - (string) –
  - LabelCategoryInclusionFilters (list) –
    
    The label categories that should be included in the return from DetectLabels.
    - (string) –
  - LabelCategoryExclusionFilters (list) –
    
    The label categories that should be excluded from the return from DetectLabels.
    - (string) –
- ImageProperties (dict) –
  
  Contains the chosen number of maximum dominant colors in an image.
  - MaxDominantColors (integer) –
    
    The maximum number of dominant colors to return when detecting labels in an image. The default value is 10.

Return type:

dict

Returns:

Response Syntax

{
    'Labels': [
        {
            'Name': 'string',
            'Confidence': ...,
            'Instances': [
                {
                    'BoundingBox': {
                        'Width': ...,
                        'Height': ...,
                        'Left': ...,
                        'Top': ...
                    },
                    'Confidence': ...,
                    'DominantColors': [
                        {
                            'Red': 123,
                            'Blue': 123,
                            'Green': 123,
                            'HexCode': 'string',
                            'CSSColor': 'string',
                            'SimplifiedColor': 'string',
                            'PixelPercent': ...
                        },
                    ]
                },
            ],
            'Parents': [
                {
                    'Name': 'string'
                },
            ],
            'Aliases': [
                {
                    'Name': 'string'
                },
            ],
            'Categories': [
                {
                    'Name': 'string'
                },
            ]
        },
    ],
    'OrientationCorrection': 'ROTATE_0'|'ROTATE_90'|'ROTATE_180'|'ROTATE_270',
    'LabelModelVersion': 'string',
    'ImageProperties': {
        'Quality': {
            'Brightness': ...,
            'Sharpness': ...,
            'Contrast': ...
        },
        'DominantColors': [
            {
                'Red': 123,
                'Blue': 123,
                'Green': 123,
                'HexCode': 'string',
                'CSSColor': 'string',
                'SimplifiedColor': 'string',
                'PixelPercent': ...
            },
        ],
        'Foreground': {
            'Quality': {
                'Brightness': ...,
                'Sharpness': ...,
                'Contrast': ...
            },
            'DominantColors': [
                {
                    'Red': 123,
                    'Blue': 123,
                    'Green': 123,
                    'HexCode': 'string',
                    'CSSColor': 'string',
                    'SimplifiedColor': 'string',
                    'PixelPercent': ...
                },
            ]
        },
        'Background': {
            'Quality': {
                'Brightness': ...,
                'Sharpness': ...,
                'Contrast': ...
            },
            'DominantColors': [
                {
                    'Red': 123,
                    'Blue': 123,
                    'Green': 123,
                    'HexCode': 'string',
                    'CSSColor': 'string',
                    'SimplifiedColor': 'string',
                    'PixelPercent': ...
                },
            ]
        }
    }
}

Response Structure

(dict) –
- Labels (list) –
  
  An array of labels for the real-world objects detected.
  - (dict) –
    
    Structure containing details about the detected label, including the name, detected instances, parent labels, and level of confidence.
    - Name (string) –
      
      The name (label) of the object or scene.
    - Confidence (float) –
      
      Level of confidence.
    - Instances (list) –
      
      If Label represents an object, Instances contains the bounding boxes for each instance of the detected object. Bounding boxes are returned for common object labels such as people, cars, furniture, apparel or pets.
      - (dict) –
        
        An instance of a label returned by Amazon Rekognition Image (DetectLabels) or by Amazon Rekognition Video (GetLabelDetection).
        - BoundingBox (dict) –
          
          The position of the label instance on the image.
          - Width (float) –
            
            Width of the bounding box as a ratio of the overall image width.
          - Height (float) –
            
            Height of the bounding box as a ratio of the overall image height.
          - Left (float) –
            
            Left coordinate of the bounding box as a ratio of overall image width.
          - Top (float) –
            
            Top coordinate of the bounding box as a ratio of overall image height.
        - Confidence (float) –
          
          The confidence that Amazon Rekognition has in the accuracy of the bounding box.
        - DominantColors (list) –
          
          The dominant colors found in an individual instance of a label.
          - (dict) –
            
            A description of the dominant colors in an image.
            - Red (integer) –
              
              The Red RGB value for a dominant color.
            - Blue (integer) –
              
              The Blue RGB value for a dominant color.
            - Green (integer) –
              
              The Green RGB value for a dominant color.
            - HexCode (string) –
              
              The Hex code equivalent of the RGB values for a dominant color.
            - CSSColor (string) –
              
              The CSS color name of a dominant color.
            - SimplifiedColor (string) –
              
              One of 12 simplified color names applied to a dominant color.
            - PixelPercent (float) –
              
              The percentage of image pixels that have a given dominant color.
    - Parents (list) –
      
      The parent labels for a label. The response includes all ancestor labels.
      - (dict) –
        
        A parent label for a label. A label can have 0, 1, or more parents.
        - Name (string) –
          
          The name of the parent label.
    - Aliases (list) –
      
      A list of potential aliases for a given label.
      - (dict) –
        
        A potential alias for a given label.
        - Name (string) –
          
          The name of an alias for a given label.
    - Categories (list) –
      
      A list of the categories associated with a given label.
      - (dict) –
        
        The category that applies to a given label.
        - Name (string) –
          
          The name of a category that applies to a given label.
- OrientationCorrection (string) –
  
  The value of OrientationCorrection is always null.
  
  If the input image is in .jpeg format, it might contain exchangeable image file format (Exif) metadata that includes the image’s orientation. Amazon Rekognition uses this orientation information to perform image correction. The bounding box coordinates are translated to represent object locations after the orientation information in the Exif metadata is used to correct the image orientation. Images in .png format don’t contain Exif metadata.
  
  Amazon Rekognition doesn’t perform image correction for images in .png format and .jpeg images without orientation information in the image Exif metadata. The bounding box coordinates aren’t translated and represent the object locations before the image is rotated.
- LabelModelVersion (string) –
  
  Version number of the label detection model that was used to detect labels.
- ImageProperties (dict) –
  
  Information about the properties of the input image, such as brightness, sharpness, contrast, and dominant colors.
  - Quality (dict) –
    
    Information about the quality of the image as defined by brightness, sharpness, and contrast. The higher the value, the greater the brightness, sharpness, and contrast, respectively.
    - Brightness (float) –
      
      The brightness of an image provided for label detection.
    - Sharpness (float) –
      
      The sharpness of an image provided for label detection.
    - Contrast (float) –
      
      The contrast of an image provided for label detection.
  - DominantColors (list) –
    
    Information about the dominant colors found in an image, described with RGB values, CSS color name, simplified color name, and PixelPercentage (the percentage of image pixels that have a particular color).
    - (dict) –
      
      A description of the dominant colors in an image.
      - Red (integer) –
        
        The Red RGB value for a dominant color.
      - Blue (integer) –
        
        The Blue RGB value for a dominant color.
      - Green (integer) –
        
        The Green RGB value for a dominant color.
      - HexCode (string) –
        
        The Hex code equivalent of the RGB values for a dominant color.
      - CSSColor (string) –
        
        The CSS color name of a dominant color.
      - SimplifiedColor (string) –
        
        One of 12 simplified color names applied to a dominant color.
      - PixelPercent (float) –
        
        The percentage of image pixels that have a given dominant color.
  - Foreground (dict) –
    
    Information about the properties of an image’s foreground, including the foreground’s quality and dominant colors.
    - Quality (dict) –
      
      The quality of the image foreground as defined by brightness and sharpness.
      - Brightness (float) –
        
        The brightness of an image provided for label detection.
      - Sharpness (float) –
        
        The sharpness of an image provided for label detection.
      - Contrast (float) –
        
        The contrast of an image provided for label detection.
    - DominantColors (list) –
      
      The dominant colors found in the foreground of an image, defined with RGB values, CSS color name, simplified color name, and PixelPercentage (the percentage of image pixels that have a particular color).
      - (dict) –
        
        A description of the dominant colors in an image.
        - Red (integer) –
          
          The Red RGB value for a dominant color.
        - Blue (integer) –
          
          The Blue RGB value for a dominant color.
        - Green (integer) –
          
          The Green RGB value for a dominant color.
        - HexCode (string) –
          
          The Hex code equivalent of the RGB values for a dominant color.
        - CSSColor (string) –
          
          The CSS color name of a dominant color.
        - SimplifiedColor (string) –
          
          One of 12 simplified color names applied to a dominant color.
        - PixelPercent (float) –
          
          The percentage of image pixels that have a given dominant color.
  - Background (dict) –
    
    Information about the properties of an image’s background, including the background’s quality and dominant colors.
    - Quality (dict) –
      
      The quality of the image background as defined by brightness and sharpness.
      - Brightness (float) –
        
        The brightness of an image provided for label detection.
      - Sharpness (float) –
        
        The sharpness of an image provided for label detection.
      - Contrast (float) –
        
        The contrast of an image provided for label detection.
    - DominantColors (list) –
      
      The dominant colors found in the background of an image, defined with RGB values, CSS color name, simplified color name, and PixelPercentage (the percentage of image pixels that have a particular color).
      - (dict) –
        
        A description of the dominant colors in an image.
        - Red (integer) –
          
          The Red RGB value for a dominant color.
        - Blue (integer) –
          
          The Blue RGB value for a dominant color.
        - Green (integer) –
          
          The Green RGB value for a dominant color.
        - HexCode (string) –
          
          The Hex code equivalent of the RGB values for a dominant color.
        - CSSColor (string) –
          
          The CSS color name of a dominant color.
        - SimplifiedColor (string) –
          
          One of 12 simplified color names applied to a dominant color.
        - PixelPercent (float) –
          
          The percentage of image pixels that have a given dominant color.

Exceptions

Rekognition.Client.exceptions.InvalidS3ObjectException
Rekognition.Client.exceptions.InvalidParameterException
Rekognition.Client.exceptions.ImageTooLargeException
Rekognition.Client.exceptions.AccessDeniedException
Rekognition.Client.exceptions.InternalServerError
Rekognition.Client.exceptions.ThrottlingException
Rekognition.Client.exceptions.ProvisionedThroughputExceededException
Rekognition.Client.exceptions.InvalidImageFormatException
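The exception classes listed above are available on the client through its exceptions attribute. As a sketch under that assumption, the wrapper below retries only the two throttling-related errors with exponential backoff and lets every other exception propagate. The function name, attempt count, and delay values are hypothetical, and botocore applies its own retry configuration independently of this wrapper.

```python
import time


def detect_labels_with_retry(
    client, max_attempts: int = 3, base_delay: float = 1.0, **kwargs
) -> dict:
    """Call client.detect_labels, retrying throttling errors with backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    retryable = (
        client.exceptions.ThrottlingException,
        client.exceptions.ProvisionedThroughputExceededException,
    )
    for attempt in range(1, max_attempts + 1):
        try:
            return client.detect_labels(**kwargs)
        except retryable:
            if attempt == max_attempts:
                # Out of attempts: surface the last throttling error.
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError("unreachable")
```

Errors such as InvalidImageFormatException or ImageTooLargeException are not retried here, since repeating the same request would not be expected to change the outcome.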

Examples

This operation detects labels in the supplied image.

response = client.detect_labels(
    Image={
        'S3Object': {
            'Bucket': 'mybucket',
            'Name': 'myphoto',
        },
    },
    MaxLabels=123,
    MinConfidence=70,
)

print(response)

Expected Output:

{
    'Labels': [
        {
            'Confidence': 99.25072479248047,
            'Name': 'People',
        },
        {
            'Confidence': 99.25074005126953,
            'Name': 'Person',
        },
    ],
    'ResponseMetadata': {
        '...': '...',
    },
}
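Rather than printing the whole response, the labels can be reduced to name and confidence pairs. The sketch below does this for the expected output shown above. The helper name labels_by_confidence is hypothetical.

```python
def labels_by_confidence(response: dict) -> list:
    """Return (name, confidence) pairs, highest confidence first."""
    labels = response.get("Labels", [])
    ordered = sorted(labels, key=lambda label: label["Confidence"], reverse=True)
    # Round only for display; sorting uses the full-precision values.
    return [(label["Name"], round(label["Confidence"], 2)) for label in ordered]


sample_response = {
    "Labels": [
        {"Confidence": 99.25072479248047, "Name": "People"},
        {"Confidence": 99.25074005126953, "Name": "Person"},
    ],
}
print(labels_by_confidence(sample_response))
# [('Person', 99.25), ('People', 99.25)]
```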