Finding The Formula For Success with PepsiCo!

2019-06-19T20:01:05+00:00

Project Description

The ProFormulate framework was used to reformulate several muffin products for PepsiCo Foods. Initially, PepsiCo wanted to investigate a specific attribute across 26 flavours of muffins and reformulate according to the results.

By including all of the muffin formulas together in one model, ProSensus not only determined which were the key ingredients but also achieved a common platform for designing and evaluating new formulas.

The next time PepsiCo Foods wanted to reformulate muffins (to different design criteria), the model helped them get there in just two iterations.

The Challenge

Our client wanted to reduce a specific quality attribute, referred to here as the attribute of interest (AOI), without introducing new ingredients and while maintaining the sensory properties of the existing product as much as possible. AOI was a poorly understood property for which no first-principles model was available.

The Results

Four different products were targeted, and dramatic AOI reductions ranging from 47% to 55% were achieved. Maintaining the sensory properties was challenging because they had not been quantified, but ProFormulate can accommodate less-than-perfect data sets.

Our Approach: The ProFormulate Framework

Assess Available Data

Ideally, recipe data and quantified outcomes are available for several variations on a product, or a family of products. Alternatively, the ProFormulate approach can generate highly efficient designed experiments even with no prior data. For the muffin batters, AOI had been quantified for all muffin formulas currently in production, and these were paired with the recipe data and label-stated baking parameters to build an initial PLS model.

Interpret the Data with Latent Variable Models

The initial PLS model distilled 60 variables down to 14 latent (“underlying”) variables, with an excellent quality of fit (R2=0.96) and quality of prediction (Q2=0.94). Even just two latent variables do a good job of representing the model space, with R2>0.85 and Q2>0.6.
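To make the two statistics concrete, the sketch below fits a single-response PLS model with the NIPALS algorithm and reports R2 (fit on the training data) and Q2 (prediction on held-out data). It is an illustration only: the data are synthetic, the function names are invented for this example, and leave-one-out cross-validation is an assumption, since the case study does not state which validation scheme was used.

```python
import random


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model with NIPALS on mean-centered data."""
    n, k = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                               # y residuals
    components = []
    for _ in range(n_components):
        # Weights: direction in X with the largest covariance with y.
        w = [sum(row[j] * fi for row, fi in zip(E, f)) for j in range(k)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in E]  # scores
        tt = sum(v * v for v in t)
        p = [sum(row[j] * ti for row, ti in zip(E, t)) / tt for j in range(k)]
        q = sum(fi * ti for fi, ti in zip(f, t)) / tt
        # Deflate: remove what this latent variable explains.
        E = [[e - ti * pj for e, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [fi - q * ti for fi, ti in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict(model, x):
    """Predict y for one observation by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


def r2_and_q2(X, y, n_components):
    """R2 from the full fit; Q2 from leave-one-out predictions."""
    y_mean = sum(y) / len(y)
    ss_total = sum((v - y_mean) ** 2 for v in y)
    model = fit_pls1(X, y, n_components)
    ss_fit = sum((v - predict(model, x)) ** 2 for x, v in zip(X, y))
    press = 0.0
    for i in range(len(X)):
        held_out = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        press += (y[i] - predict(held_out, X[i])) ** 2
    return 1 - ss_fit / ss_total, 1 - press / ss_total


# Synthetic data: six correlated x-variables driven by two hidden factors.
rng = random.Random(0)
mixing = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0), (0.5, 0.2), (-0.3, 0.8)]
X, y = [], []
for _ in range(30):
    t1, t2 = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([a * t1 + b * t2 + rng.gauss(0, 0.05) for a, b in mixing])
    y.append(2.0 * t1 - t2 + rng.gauss(0, 0.1))

r2, q2 = r2_and_q2(X, y, n_components=2)
print(f"R2 = {r2:.3f}, Q2 = {q2:.3f}")
```

A Q2 well below R2 is the usual warning sign of overfitting, which is why both numbers are reported together.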

The biplot for the first two latent variables (below) demonstrates the power of using PLS for this application, i.e. PLS captures the underlying correlation structure of the x- and y-variables in addition to providing the regression coefficients for the y-variables.

The blue points and confidence ellipses show the distribution of observations in the first two model dimensions. Observations located near each other are similar in formulation (x-values) and in AOI (y-value).

The nature of the similarities is illustrated by the location of the variables. For example, Leaven2 and Leaven3 have high weights in the first model dimension; they are located farthest from the origin. Observations to the right of the origin tend to have more Leaven3 and less Leaven2.

AOI is positively correlated with Spice7, Misc10, and Leaven2, located nearby, and negatively correlated with VegFru7 and Dairy1, located diagonally across the origin. The initial PLS model identified ingredient correlations with AOI, and these findings were followed up with a literature search to better understand the correlations. The model was also used as a basis for further experimentation.

Augment Models with Designed Experiments

Next, ProFormulate aims to augment the model with a few carefully selected runs. Design of experiments is a tool for making data balanced and representative [2], and this goal remains the same in latent variable spaces. Specific runs are chosen to fill in any gaps in the latent variable space, extend the model to cover new regions or focus in on areas of interest. In this case study, the model was augmented by designing a 2-level factorial in the first three latent variables.
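A 2-level factorial in three latent variables can be sketched as follows: enumerate the eight low/high combinations of the scores, then map each point back to ingredient amounts through the model loadings. Everything in this sketch is hypothetical (the ingredient names, the mean recipe, the loadings and the step sizes); it illustrates the idea rather than the design actually run in the case study.

```python
from itertools import product

# All numbers below are hypothetical, not values from the case study.
ingredients = ["Flour", "Sugar", "Leaven", "Dairy"]
x_mean = [50.0, 25.0, 5.0, 20.0]   # average recipe, % by weight
loadings = [                       # one row per latent variable
    [0.5, -0.5, 0.5, -0.5],
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, -0.5, 0.5],
]
score_step = [4.0, 2.0, 1.0]       # low/high distance from the centre, per latent variable


def recipe_from_scores(scores):
    """Map a point in latent-variable space back to ingredient amounts."""
    return [
        m + sum(t * row[j] for t, row in zip(scores, loadings))
        for j, m in enumerate(x_mean)
    ]


# 2-level full factorial in three latent variables: 2**3 = 8 runs.
design = []
for levels in product((-1, 1), repeat=3):
    scores = [lvl * step for lvl, step in zip(levels, score_step)]
    design.append((levels, recipe_from_scores(scores)))

for levels, recipe in design:
    amounts = ", ".join(f"{name} {x:.2f}" for name, x in zip(ingredients, recipe))
    print(levels, amounts)
```

In this toy model each loading row sums to zero, so every run still totals 100%. Loadings from a real PLS model need not have that property, and candidate runs would have to be checked against mixture and process constraints before baking.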

What’s Wrong with Traditional DOEs?

Traditional use of factorial and/or mixture design of experiments can be problematic in product development. The number of experiments can quickly become intractable, especially if we want to explore a wide range of possible raw materials.
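As a rough illustration of how quickly the run count grows, a full factorial with k factors at two levels needs 2^k runs. The snippet below tabulates that for a few values of k. Using 60 as the largest case borrows the number of variables in the PLS model described earlier; treating every one of them as an independent factor is an assumption made only for illustration, and fractional designs would need fewer runs.

```python
def full_factorial_runs(n_factors, n_levels=2):
    """Number of runs in a full factorial design."""
    return n_levels ** n_factors


for k in (3, 10, 20, 60):
    print(f"{k:>2} factors at 2 levels: {full_factorial_runs(k):,} runs")
```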

Formulating an optimal experiment in the latent variable space is an efficient alternative. Since the latent variables are an orthogonal representation of the dominant directions in the original variables, we can typically explore the important regions with far fewer experiments. To illustrate, this figure shows a traditional 3-factor, 2-level factorial design for one muffin formula projected into the latent variable space.


ProFormulate uses optimization techniques to invert the PLS model; this inversion selects raw materials and process conditions that best meet the project goals while meeting relevant constraints. In this case study, optimization was used to find recipes that minimized the attribute of interest (AOI), met nutritional constraints, and achieved the required cost targets.

Multivariate constraints helped to ensure that sensory attributes would be in line with the original muffin batters. Optimization produced several recipe alternatives for each of the four target products; these were baked and evaluated in the laboratory.
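The idea of inverting a latent variable model under constraints can be sketched with a simple grid search over the scores. This is not the optimizer used by ProFormulate, which the case study does not describe; a brute-force search is swapped in here because it is easy to follow. The model coefficients, costs and limits are all hypothetical, the sugar cap stands in for a nutritional constraint, and a Hotelling T2 limit stands in for the multivariate constraint that keeps new recipes close to existing products.

```python
# Hypothetical two-latent-variable model; none of these numbers come from the case study.
ingredients = ["Flour", "Sugar", "Leaven", "Dairy"]
x_mean = [50.0, 25.0, 5.0, 20.0]          # average recipe, % by weight
loadings = [[0.5, -0.5, 0.5, -0.5],       # latent variable 1
            [0.5, 0.5, -0.5, -0.5]]       # latent variable 2
score_var = [16.0, 4.0]                   # score variances in the training data
y_mean, y_loadings = 10.0, [-0.4, -0.25]  # AOI = y_mean + sum(score * y_loading)
unit_cost = [1.0, 2.0, 6.0, 3.0]          # cost per % of each ingredient
max_cost = 190.0                          # do not exceed the average recipe cost
max_sugar = 24.0                          # stand-in for a nutritional limit
t2_limit = 6.0                            # stay close to existing products


def recipe_from_scores(scores):
    """Map latent-variable scores back to ingredient amounts."""
    return [m + sum(t * row[j] for t, row in zip(scores, loadings))
            for j, m in enumerate(x_mean)]


def predicted_aoi(scores):
    """Predicted attribute of interest at a point in score space."""
    return y_mean + sum(t * q for t, q in zip(scores, y_loadings))


def is_feasible(scores):
    """Check the multivariate, cost, nutritional and non-negativity limits."""
    recipe = recipe_from_scores(scores)
    hotelling_t2 = sum(t * t / v for t, v in zip(scores, score_var))
    cost = sum(c * x for c, x in zip(unit_cost, recipe))
    return (hotelling_t2 <= t2_limit and cost <= max_cost
            and recipe[1] <= max_sugar and min(recipe) >= 0.0)


grid = [i * 0.5 for i in range(-20, 21)]  # candidate score values from -10 to 10
feasible = [(t1, t2) for t1 in grid for t2 in grid if is_feasible((t1, t2))]
best_scores = min(feasible, key=predicted_aoi)
best_recipe = recipe_from_scores(best_scores)
best_aoi = predicted_aoi(best_scores)

print("scores:", best_scores, "predicted AOI:", round(best_aoi, 3))
for name, amount in zip(ingredients, best_recipe):
    print(f"  {name}: {amount:.2f}%")
```

In this toy problem both the cost limit and the T2 limit restrict how far the search can move, so the selected recipe is a compromise rather than the unconstrained minimum. A practical implementation would likely use a continuous constrained optimizer and return several near-optimal candidates for laboratory evaluation.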

Completing the Cycle

The new data generated by the optimization step was used to update the model. This step completes the ProFormulate process and ensures that the model contains and represents all knowledge gained during the project. When the next reformulation is undertaken, this model is the starting point: depending on the project goals, it can yield a recommended formulation directly or point to a few strategic experimental runs.

ProFormulate Plus

The inclusion of raw material characterizations greatly extends the capabilities of the ProFormulate process. PLS models built with this data express the final product properties in terms of the raw material properties. The optimization can therefore select raw materials that have never been tried before, based on their property values [3].


  1. E. Nichols, Latent Variable Methods: Case Studies in the Food Industry, Hamilton: McMaster University, 2011.
  2. S. Wold, M. Josefson, J. Gottfries and A. Linusson, “The utility of multivariate design in PLS modeling,” Journal of Chemometrics, pp. 156-165, 2004.
  3. K. Muteki, J. F. MacGregor and T. Ueda, “Rapid Development of New Polymer Blends: The Optimal Selection of Materials and Blend Ratios,” Industrial & Engineering Chemistry Research, 45, 4653-4660, 2006.