In 2024, AI is top of mind for executives and leadership teams around the world and across every industry. Everyone is looking for new opportunities to unlock the power of AI within their organization, whether to gain a competitive advantage today or, at least, to avoid being left in the dust tomorrow. Manufacturers of formulated products are certainly among the many organizations feeling the urgency to act fast.
 
 
Until very recently, manufacturers simply accepted that formulating new products or reformulating existing ones took many years, a large budget, and a small number of highly specialized scientists. However, advances in data acquisition, AI, and high-performance computing have brought that narrative to an abrupt end. Product formulation is now undergoing a digital transformation. Scientists at the forefront are already leveraging explainable AI to gain a deeper understanding of ingredient and performance relationships, improve knowledge sharing, reduce the need for physical experimentation, and ultimately accelerate formulation timelines.

Subject-Matter Experts

Explainable AI is an excellent fit for product formulation and research departments. Machine learning tools are being used to build effective predictive models and assist with related data analysis. These models can be applied in the forward direction to simulate the outcome of new ingredient combinations, or in the backward direction to optimize ingredient combinations for given performance property targets. The value of implementation is clear; however, some who are looking to adopt this technology are less clear about how much up-front and ongoing effort is required. AI is not black magic that will turn formulation science into simply hitting the “easy button” or sending a prompt to GPT. Your workflows will not reset overnight. Your subject-matter experts (SMEs) will continue to be your secret ingredient.
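The two directions can be illustrated with a minimal Python sketch. The two-ingredient linear model, its coefficients, and the function names predict and optimize are all hypothetical and are not FormuSense's API; a real model would be fitted to formulation data.

```python
from itertools import product

# Hypothetical fitted model: property = b0 + b1*x1 + b2*x2.
# The coefficients are made up purely for illustration.
COEFFS = {"intercept": 2.0, "x1": 3.0, "x2": -1.5}


def predict(x1: float, x2: float) -> float:
    """Forward direction: simulate the property for an ingredient combination."""
    return COEFFS["intercept"] + COEFFS["x1"] * x1 + COEFFS["x2"] * x2


def optimize(target: float, steps: int = 21) -> tuple[float, float]:
    """Backward direction: grid-search ingredient levels in [0, 1] whose
    predicted property is closest to the target."""
    grid = [i / (steps - 1) for i in range(steps)]
    return min(product(grid, grid), key=lambda x: abs(predict(*x) - target))


# Forward: what property does this recipe give?
forward = predict(0.5, 0.2)

# Backward: which recipe gives a property of 3.5?
best_x1, best_x2 = optimize(target=3.5)
```

In this toy model, several ingredient combinations reach the same target, which is one reason an SME's judgment is still needed to choose among candidate formulations.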

Project selection, data curation, model building, model validation, and the interpretation of simulation and optimization scenarios all require supervision, if not detailed input, from subject-matter experts in order to identify opportunities that are likely to succeed and to extract actionable results. FormuSense has been developed around a proven framework that empowers SMEs to accomplish these tasks efficiently and confidently with explainable AI. In contrast to black-box methods, FormuSense uncovers correlations among and across ingredients and performance properties with thoughtfully designed tools and intuitive visualizations.

Model Structure

Among the various tasks requiring subject-matter expertise, model building and model validation are arguably the most important. As with any data-centric model, the rule is garbage in, garbage out. This applies not only to the quality of the data included, but also to the structure of the model itself. For example, it may be difficult to improve an initially poorly fit model without the historical context of the dataset and the relevant domain knowledge that an SME typically holds. However, when input from that expert is coupled with the tools and visualizations in FormuSense, non-linearities can be readily identified and meaningful data transformations can be generated that are likely to improve model fit.

Detecting Non-linearities

While domain knowledge is the first and foremost “tool” for identifying non-linearities, several indicators within FormuSense visualizations can be explored with an SME to help screen for them intelligently, including:

  • Defined curvature in observed vs predicted plots
  • Defined curvature in X vs Y plots
  • Poor model fit (R² plots), especially on the validation set
  • Unusual X-score vs Y-score (t vs u) plot results
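As a rough numerical counterpart to looking for curvature in an X vs Y plot, the sketch below fits a straight line and checks whether the residuals still track a squared term. This is an illustrative stand-in, not how FormuSense detects non-linearities; the function names and the synthetic data are assumptions.

```python
import math


def mean(values):
    return sum(values) / len(values)


def linear_fit(x, y):
    """Ordinary least-squares slope and intercept for one X variable."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    ma, mb = mean(a), mean(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


def curvature_score(x, y):
    """Correlation between straight-line residuals and the centered square of X.
    Values near +1 or -1 suggest curvature that a linear term misses."""
    slope, intercept = linear_fit(x, y)
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    mx = mean(x)
    squared = [(a - mx) ** 2 for a in x]
    return pearson(residuals, squared)


# Synthetic data: one curved relationship, one linear relationship with noise.
x = [float(i) for i in range(11)]
y_curved = [v ** 2 for v in x]
y_linear = [2 * v + 1 + (0.5 if i % 2 == 0 else -0.5) for i, v in enumerate(x)]

print(curvature_score(x, y_curved))  # close to 1: strong curvature
print(curvature_score(x, y_linear))  # much smaller in magnitude
```

A score like this only flags candidates for review; whether the curvature is physically meaningful remains a question for the SME.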

Building Data Transformations

FormuSense users can readily build data transformations to handle non-linearities. The custom equation builder allows users to configure a specific transformation to address input or findings from an SME. Because such a transformation is motivated by domain knowledge rather than by a search through the data, it is less prone to overfitting noise.
Below are some examples of custom transformations:

  • Transform a Y variable (log Y1)
  • Transform an X variable (X1²)
  • Variable interaction (X1*X2)
  • Variable range (a < X1 ≤ b)
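For readers who think in code, the four transformation types listed above can be sketched as plain Python functions. These are illustrative only and do not represent the FormuSense equation builder; the function names are hypothetical, and the use of the natural logarithm is an assumption, since the base is not specified here.

```python
import math


def log_y(y):
    """Transform a Y variable: natural log of each value (values must be > 0)."""
    return [math.log(v) for v in y]


def square(x):
    """Transform an X variable: square each value."""
    return [v ** 2 for v in x]


def interaction(x1, x2):
    """Variable interaction: element-wise product X1*X2."""
    return [a * b for a, b in zip(x1, x2)]


def in_range(x, a, b):
    """Variable range: 1.0 where a < X <= b, otherwise 0.0."""
    return [1.0 if a < v <= b else 0.0 for v in x]


# Made-up ingredient fractions for three formulations.
x1 = [0.10, 0.20, 0.30]
x2 = [0.50, 0.40, 0.30]

new_columns = {
    "X1^2": square(x1),
    "X1*X2": interaction(x1, x2),
    "0.10 < X1 <= 0.20": in_range(x1, 0.10, 0.20),
}
```

Each new column is then added to the model as an extra variable alongside the original ingredients.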

In the absence of SME input, a large number of transformations (one for every combination of transformation equation and variable) can be generated for further evaluation. Alternatively, FormuSense can screen all possible transformations for signal and suggest a smaller number for consideration.
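The idea of generating every combination and then screening for signal can be sketched as follows. The screening criterion used here (absolute Pearson correlation with Y) is an assumption chosen for simplicity, not a description of FormuSense's method, and the variable names and data are made up.

```python
import math


def pearson(a, b):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0


# Candidate transformation equations (hypothetical selection).
TRANSFORMS = {
    "identity": lambda v: v,
    "square": lambda v: v * v,
    "log": math.log,  # requires strictly positive values
}


def screen(X, y, top_k=3):
    """Apply every transformation to every variable and keep the top_k
    candidates ranked by absolute correlation with y."""
    candidates = []
    for var, values in X.items():
        for name, fn in TRANSFORMS.items():
            transformed = [fn(v) for v in values]
            candidates.append((abs(pearson(transformed, y)), f"{name}({var})"))
    return sorted(candidates, reverse=True)[:top_k]


# Synthetic example in which Y depends on the square of X1.
X = {
    "X1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "X2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
y = [v ** 2 for v in X["X1"]]

suggestions = screen(X, y)
```

Screening many candidates increases the chance of picking up spurious correlations, so any suggested transformation should still be checked against held-out data and reviewed by an SME.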

Baked Goods Example

Consider a baked goods dataset, where the goal was to reformulate an existing product to minimize a key quality property.

Although the initial PLS model developed in FormuSense achieved an acceptable fit on the final dataset (training + validation data), the R² on the validation set alone (unseen data) was below 60%.
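For reference, R² on a validation set is computed from observed values that the model did not see during training and the model's predictions for them. The sketch below uses the standard definition; the numbers are made up and are not from the baked goods dataset.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical validation-set values and the model's predictions for them.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

validation_r2 = r_squared(observed, predicted)
```

A model can score well on the data it was trained on and noticeably worse here, which is why the validation figure is the one to watch.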

To address this shortcoming, the SME explored data transformations on one particular ingredient class, based on her understanding of the food chemistry. The X vs Y plot below confirmed her expectation that keeping Leaven within a particular range of the overall recipe correlates with low values of the quality property that is to be minimized.

Accordingly, a transformation was generated in FormuSense to isolate that range of Leaven, which significantly improved the R² on the validation set for the quality property of interest (from 59% to 71%).

Try FormuSense Today

While data transformations are not always necessary, there is always value to be had in involving SMEs in model development, validation and results interpretation.

Explainable AI tools, such as FormuSense, are empowering SMEs to conduct highly efficient materials formulation research.
 
Activate a 30-day trial of FormuSense today to receive a sample formulation dataset and tutorial instructions that will motivate your own digital transformation.