Experience Insights

Analyze how your experiments and personalizations are serving your goals.

To use Experience Insights, first complete the steps in the Insights setup section.

Experience Insights tracks the performance of your Personalizations and Experiments. It gives marketers and data analysts the insights they need to make data-driven decisions. The tool provides statistical analysis without creating yet another source of truth, and it integrates directly into your current data workflow.

Experience Insights provides clear measures such as Conversion Rate, Average Conversion Value, and Total Users/Profiles for every Experiment or Personalization. On top of that, you can compare the performance of individual variants. Every Ninetailed Experience with a set holdout/control group is eligible for performance comparisons. We use Bayesian inference to compare variants on measures such as Uplift and Probability to Be Best, so you can see which variant performs best.

The following steps guide you through everything you need to evaluate the performance of your Experiences and take action where required.

Step 1: Select Your Experience

To access Experience Insights, open any Ninetailed Experience entry in your Content Source, such as Contentful or Contentstack.

Step 2: Select Your View

Once you have selected a specific Experience, click on the Insights tab to see insights. Before we dive into performance analysis, let's make sure we have selected the correct view.

Time Period

Besides the standard 30-day interval, you are free to choose any reporting time frame for your Experiences.


Metrics

Metrics are conversion events or goals. By default, Ninetailed tracks the performance of all metrics that you have set up across all of your Experiences. Depending on the type and scope of the metric, some measures will change accordingly.

Make sure you have set up your desired metrics before the start of an Experiment or Personalization. You can add metrics at a later stage, but their data will only be available from the time they were added to Ninetailed.

Step 3: Performing Analysis

Finally, analyze the results to derive action items. The main table lists each variant with its respective measures.

To account for behavioral anomalies, we recommend running each Experience for at least 2 weeks before making any decisions based on performance analysis.


Variants

Each Experience typically has at least two variants: a “control” and a “variant 1”.

  • Control - Also known as “baseline” or “holdout,” this shows the values for users who have not been presented with the Personalization or variant that is being experimented with. It serves as the comparison point for any Personalization or variant.

  • Variant 1 - This represents the variant or Personalization that you are running. It is compared against the baseline (or against other variants if you are running a multi-variant Experiment).

If you are running an Experiment with multiple variants, the table shows one additional row of values per variant.


Measures

At the heart of Experience Insights are measures that track the actual performance of goals across Experiences. We use Bayesian inference to provide dynamic statistical significance.

Conversion Rate or Conversion Value

Conversions measure how many sessions or users have reached the given metric goal in the given time period. Depending on the metric, conversions are displayed in one of two ways:

If the metric is a "binary" metric, the conversion rate is evaluated. It displays the percentage of sessions or users that first saw the given Experience and then reached the conversion goal.

If the metric tracks a "conversion value", the average value for that conversion is displayed. Sessions or users must first see the given Experience and then achieve the conversion goal. Every session or user that did not achieve the conversion goal, or that saw the given Experience only after converting, adds the value 0 to the overall average.
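The two display modes can be sketched in a few lines of Python. This is a minimal illustration, not Ninetailed's implementation: the `Visitor` record, its field names, and the choice of "everyone who saw the Experience" as the denominator are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Visitor:
    """One session or user (hypothetical record shape for this sketch)."""
    seen_at: Optional[float]             # when the Experience was first seen
    converted_at: Optional[float] = None  # when the goal was reached, if ever
    value: float = 0.0                    # conversion value, e.g. an order total


def counts_as_conversion(v: Visitor) -> bool:
    # A conversion only counts if the Experience was seen first.
    return (
        v.seen_at is not None
        and v.converted_at is not None
        and v.seen_at <= v.converted_at
    )


def conversion_rate(visitors: List[Visitor]) -> float:
    """Binary metric: share of exposed visitors who converted afterwards."""
    exposed = [v for v in visitors if v.seen_at is not None]
    if not exposed:
        return 0.0
    return sum(counts_as_conversion(v) for v in exposed) / len(exposed)


def average_conversion_value(visitors: List[Visitor]) -> float:
    """Value metric: non-qualifying visitors contribute 0 to the average."""
    exposed = [v for v in visitors if v.seen_at is not None]
    if not exposed:
        return 0.0
    total = sum(v.value if counts_as_conversion(v) else 0.0 for v in exposed)
    return total / len(exposed)
```

For example, with four exposed visitors of whom two convert after seeing the Experience (values 40 and 20), one never converts, and one converts before seeing it, the conversion rate is 50% and the average conversion value is 15.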


Uplift

The uplift measures the relative lift or drop of the conversion rate or conversion value compared to the control group. It is calculated using Bayesian inference, a widely used approach for analyzing statistical Experiments. Combined with the Probability to Be Best, the uplift is a key measure for deriving action items from the analysis.
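As a rough illustration, relative uplift can be written as the difference between a variant and the control, divided by the control. The sketch below uses plain point estimates and a hypothetical `relative_uplift` helper. The uplift shown in Experience Insights is derived with Bayesian inference, so the displayed figure may differ from this simple ratio.

```python
def relative_uplift(variant_value: float, control_value: float) -> float:
    """Relative lift (positive) or drop (negative) versus the control.

    Works for conversion rates and for average conversion values alike.
    """
    if control_value == 0:
        raise ValueError("uplift is undefined when the control value is 0")
    return (variant_value - control_value) / control_value


# A control converting at 4% and a variant converting at 5%:
print(f"{relative_uplift(0.05, 0.04):.1%}")  # prints 25.0%
```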


Reach

Reach measures the number of profiles any variant has reached in the given time period.

Probability to Be Best

The Probability to Be Best measures the likelihood that an individual variant is the best of all given variants. It is based on Bayesian inference and Monte Carlo simulations. The two contributing factors to this measure are the reach of each variant and the relative uplift. This makes the Probability to Be Best a robust basis for deciding on a winning variant.

Example: Imagine you run an experiment for a few days and it has only reached a few hundred users. Even if one of your variants shows a high uplift of 50%, the Probability to Be Best might still not be high enough to declare a winner. Similarly, if your experiment reaches a large number of users but no variant performs at a much higher rate, the Probability to Be Best will not single out a winner.
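The interplay of reach and uplift can be illustrated with a small Monte Carlo simulation. This sketch is not Ninetailed's implementation. It assumes a binary metric, a uniform Beta(1, 1) prior for each variant, and a hypothetical `probability_to_be_best` helper that takes `(conversions, reach)` pairs.

```python
import random
from typing import Dict, Tuple


def probability_to_be_best(
    variants: Dict[str, Tuple[int, int]],  # name -> (conversions, reach)
    draws: int = 20_000,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate how often each variant has the highest sampled conversion rate."""
    rng = random.Random(seed)
    wins = {name: 0 for name in variants}
    for _ in range(draws):
        # Draw one plausible conversion rate per variant from its
        # Beta posterior (assumed Beta(1, 1) prior plus observed data).
        sampled = {
            name: rng.betavariate(1 + conversions, 1 + reach - conversions)
            for name, (conversions, reach) in variants.items()
        }
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}


# Same 50% relative uplift (5.0% vs. 7.5%), very different reach:
small = probability_to_be_best({"control": (10, 200), "variant 1": (15, 200)})
large = probability_to_be_best({"control": (1000, 20000), "variant 1": (1500, 20000)})
print(small["variant 1"], large["variant 1"])
```

Under these assumptions, the small sample should leave "variant 1" short of a 95% Probability to Be Best despite the large uplift, while the large sample pushes it very close to 100%.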

Choosing a Winner

By default, Ninetailed highlights a "winning variant" at the 95% confidence level. This means that any variant with a Probability to Be Best equal to or higher than 95% will receive a "winner" tag.
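The tagging rule itself is a simple threshold check. The sketch below uses a hypothetical `winning_variant` helper and made-up probabilities purely to illustrate the 95% cutoff. It is not Ninetailed code.

```python
from typing import Dict, Optional

WINNER_THRESHOLD = 0.95  # the default level described above


def winning_variant(
    probabilities: Dict[str, float],
    threshold: float = WINNER_THRESHOLD,
) -> Optional[str]:
    """Return the variant that earns the "winner" tag, or None if none qualifies."""
    if not probabilities:
        return None
    best = max(probabilities, key=probabilities.get)
    # Equal to or higher than the threshold counts as a winner.
    return best if probabilities[best] >= threshold else None


print(winning_variant({"control": 0.03, "variant 1": 0.97}))  # prints variant 1
print(winning_variant({"control": 0.40, "variant 1": 0.60}))  # prints None
```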

The tag is only descriptive. Ninetailed does not take any further action besides highlighting a winner. Depending on whether you are analyzing the performance of an Experiment or a Personalization, you might derive different action items.

Example: If a winning variant has been chosen for an Experiment, you might want to:

  • Turn that variant into your new baseline

  • Use that new variant as a new Personalization

  • Create a new Experiment with yet another variant to keep testing what works best

Example: If a winning variant has been chosen for a Personalization, you might want to:

  • Decrease the holdout percentage

  • Add more components to your Personalization to increase the uplift even further

  • Create an Experiment for the same Audience to see how you can improve your Personalization

Data availability

Data for Experience Insights is processed in hourly intervals, so results are available in near real time.
