CloudWatchEvidently / Client / get_experiment_results



Retrieves the results of a running or completed experiment. No results are available until there have been 100 events for each variation and at least 10 minutes have passed since the start of the experiment. To increase the statistical power, Evidently performs an additional offline p-value analysis at the end of the experiment. Offline p-value analysis can detect statistical significance in some cases where the anytime p-values used during the experiment do not find statistical significance.

Experiment results are available up to 63 days after the start of the experiment. They are not available after that because of CloudWatch data retention policies.
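The timing constraints above can be checked before calling the API. The following is a minimal sketch, not part of the SDK: the helper name `results_may_be_available` is hypothetical, and it covers only the time-based conditions. The requirement of 100 events per variation can only be observed through the response itself.

```python
from datetime import datetime, timedelta, timezone

# Limits taken from the description above.
MIN_AGE = timedelta(minutes=10)  # results need at least 10 minutes of runtime
MAX_AGE = timedelta(days=63)     # results are retained up to 63 days after the start


def results_may_be_available(experiment_start, now=None):
    """Return True if `now` falls inside the window in which results can exist.

    Hypothetical helper: it checks elapsed time only, not the event count.
    Both datetimes are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    age = now - experiment_start
    return MIN_AGE <= age <= MAX_AGE
```

A `False` result means a call would not return usable data yet (or any longer); a `True` result does not guarantee that enough events have been recorded.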

See also: AWS API Documentation

Request Syntax

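response = client.get_experiment_results(
    baseStat='Mean',
    endTime=datetime(2015, 1, 1),
    experiment='string',
    metricNames=[
        'string',
    ],
    period=123,
    project='string',
    reportNames=[
        'BayesianInference',
    ],
    resultStats=[
        'BaseStat'|'TreatmentEffect'|'ConfidenceInterval'|'PValue',
    ],
    startTime=datetime(2015, 1, 1),
    treatmentNames=[
        'string',
    ],
)

Parameters:
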
  • baseStat (string) – The statistic used to calculate experiment results. Currently the only valid value is mean, which uses the mean of the collected values as the statistic.

  • endTime (datetime) – The date and time that the experiment ended, if it is completed. This must be no more than 30 days after the experiment start time.

  • experiment (string) –

    [REQUIRED]

    The name of the experiment to retrieve the results of.

  • metricNames (list) –

    [REQUIRED]

    The names of the experiment metrics that you want to see the results of.

    • (string) –

  • period (integer) – In seconds, the amount of time to aggregate results together.

  • project (string) –

    [REQUIRED]

    The name or ARN of the project that contains the experiment that you want to see the results of.

  • reportNames (list) –

    The names of the report types that you want to see. Currently, BayesianInference is the only valid value.

    • (string) –

  • resultStats (list) –

    The statistics that you want to see in the returned results.

    • PValue specifies to use p-values for the results. In hypothesis testing, a p-value is the probability of seeing a result at least as extreme as the one observed if the null hypothesis were true. A general practice is to reject the null hypothesis and declare that the results are statistically significant when the p-value is less than 0.05.

    • ConfidenceInterval specifies a confidence interval for the results. The confidence interval represents the range of values for the chosen metric that is likely to contain the true difference between the baseStat of a variation and the baseline. Evidently returns the 95% confidence interval.

    • TreatmentEffect is the difference in the statistic specified by the baseStat parameter between each variation and the default variation.

    • BaseStat returns the statistical values collected for the metric for each variation, using the statistic specified in the baseStat parameter. Therefore, if baseStat is mean, this returns the mean of the values collected for each variation.

    • (string) –

  • startTime (datetime) – The date and time that the experiment started.

  • treatmentNames (list) –

    [REQUIRED]

    The names of the experiment treatments that you want to see the results for.

    • (string) –
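The parameters above can be assembled into a keyword-argument dictionary before the call. This is a sketch under stated assumptions rather than a definitive implementation: `build_results_request` is a hypothetical helper, the capitalized value 'Mean' for baseStat is assumed from the enum spelling used elsewhere on this page, and the 30-day check is a local reading of the endTime constraint.

```python
from datetime import datetime, timedelta


def build_results_request(project, experiment, metric_names, treatment_names,
                          start_time=None, end_time=None, period=None):
    """Assemble keyword arguments for get_experiment_results.

    Hypothetical helper. Keys match the parameter names listed above;
    optional parameters are left out when they are not supplied.
    """
    # Local check of the documented endTime constraint (assumed interpretation).
    if start_time is not None and end_time is not None:
        if end_time - start_time > timedelta(days=30):
            raise ValueError("endTime must be within 30 days of startTime")

    kwargs = {
        "project": project,
        "experiment": experiment,
        "metricNames": list(metric_names),
        "treatmentNames": list(treatment_names),
        "baseStat": "Mean",  # assumed spelling of the only valid value
        "resultStats": ["BaseStat", "TreatmentEffect",
                        "ConfidenceInterval", "PValue"],
        "reportNames": ["BayesianInference"],
    }
    if start_time is not None:
        kwargs["startTime"] = start_time
    if end_time is not None:
        kwargs["endTime"] = end_time
    if period is not None:
        kwargs["period"] = period  # aggregation period in seconds
    return kwargs
```

With a client created by `boto3.client("evidently")`, the result would be passed as `client.get_experiment_results(**kwargs)`.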

Return type: dict

Returns:

Response Syntax

{
    'details': 'string',
    'reports': [
        {
            'content': 'string',
            'metricName': 'string',
            'reportName': 'BayesianInference',
            'treatmentName': 'string'
        },
    ],
    'resultsData': [
        {
            'metricName': 'string',
            'resultStat': 'Mean'|'TreatmentEffect'|'ConfidenceIntervalUpperBound'|'ConfidenceIntervalLowerBound'|'PValue',
            'treatmentName': 'string',
            'values': [
                123.0,
            ]
        },
    ],
    'timestamps': [
        datetime(2015, 1, 1),
    ]
}

Response Structure

  • (dict) –

    • details (string) –

      If the experiment doesn’t yet have enough events to provide valid results, this field is returned with the message "Not enough events to generate results". If there are enough events to provide valid results, this field is not returned.

    • reports (list) –

      An array of structures that include the reports that you requested.

      • (dict) –

        A structure that contains results of an experiment.

        • content (string) –

          The content of the report.

        • metricName (string) –

          The name of the metric that is analyzed in this experiment report.

        • reportName (string) –

          The type of analysis used for this report.

        • treatmentName (string) –

          The name of the variation that this report pertains to.

    • resultsData (list) –

      An array of structures that include experiment results including metric names and values.

      • (dict) –

        A structure that contains experiment results for one metric that is monitored in the experiment.

        • metricName (string) –

          The name of the metric.

        • resultStat (string) –

          The experiment statistic that these results pertain to.

        • treatmentName (string) –

          The treatment, or variation, that returned the values in this structure.

        • values (list) –

          The values for the metricName that were recorded in the experiment.

          • (float) –

    • timestamps (list) –

      The timestamps of each result returned.

      • (datetime) –
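The response structure above can be flattened into per-series time lines. The sketch below is illustrative only: `tabulate_results` and `is_significant` are hypothetical helpers, and the code assumes that values[i] corresponds to timestamps[i] and that timestamps are in ascending order, neither of which is stated on this page.

```python
def tabulate_results(response):
    """Group values by (metricName, treatmentName, resultStat).

    Returns (details, table). `details` is None unless the service reported
    that there are not enough events. `table` maps each key to a list of
    (timestamp, value) pairs, assuming values line up with timestamps.
    """
    details = response.get("details")
    timestamps = response.get("timestamps", [])
    table = {}
    for series in response.get("resultsData", []):
        key = (series["metricName"], series["treatmentName"], series["resultStat"])
        table[key] = list(zip(timestamps, series.get("values", [])))
    return details, table


def is_significant(table, metric, treatment, alpha=0.05):
    """Return True if the last PValue for the metric/treatment is below alpha.

    Uses the 0.05 convention from the resultStats description by default.
    """
    points = table.get((metric, treatment, "PValue"), [])
    return bool(points) and points[-1][1] < alpha
```

Checking `details` first avoids treating an empty `resultsData` list as a real result.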


Exceptions

  • CloudWatchEvidently.Client.exceptions.ThrottlingException

  • CloudWatchEvidently.Client.exceptions.ValidationException

  • CloudWatchEvidently.Client.exceptions.ConflictException

  • CloudWatchEvidently.Client.exceptions.ResourceNotFoundException

  • CloudWatchEvidently.Client.exceptions.AccessDeniedException
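The exceptions listed above are exposed on the client object as `client.exceptions.<Name>`, so they can be caught without extra imports. The following is a minimal sketch of one possible handling policy, not a recommended standard: `get_results_with_retry` is a hypothetical helper, and the choice to retry only on throttling and to return None when the resource is missing is an assumption made for illustration.

```python
import time


def get_results_with_retry(client, max_attempts=3, base_delay=1.0,
                           sleep=time.sleep, **kwargs):
    """Call get_experiment_results, retrying only on ThrottlingException.

    Hypothetical helper. Returns None if the project or experiment is not
    found; all other exceptions propagate to the caller.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return client.get_experiment_results(**kwargs)
        except client.exceptions.ThrottlingException:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
        except client.exceptions.ResourceNotFoundException:
            return None
```

boto3 clients also apply their own retry configuration, so a wrapper like this is only worth adding when the built-in behavior is not enough. The `sleep` argument is injectable so the backoff can be exercised without waiting.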