Simulation

Ratios and mixture properties for simulation scenarios are calculated according to the same mathematical derivations provided in the preceding sections.

Simulation in FormuSense projects a new observation onto the latent variable space of the selected model, using methods described in the literature (see References). For brevity, only the main equations used to project the simulated observation onto the model plane are given below.

Note

The following equations hold true when there is no missing data. The methods used in FormuSense for handling missing data are proprietary and differ from the following.

Score Calculation:

\[T_{sim} = X_{sim} W_{model}^{star}\]

\(T_{sim}\) = all scores for the simulation data
\(X_{sim}\) = mean-centered and block-scaled X simulation data
\(W_{model}^{star}\) = transformed model X-weights
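As a minimal sketch of the score calculation for a single simulated observation with no missing data, the function below multiplies the preprocessed row by the transformed weights. The name `project_scores`, the layout of `w_star` as a list of K rows with A columns, and the numbers are illustrative assumptions, not FormuSense internals.

```python
def project_scores(x_sim, w_star):
    """Compute t_sim = x_sim @ w_star for one preprocessed observation.

    x_sim  : list of K mean-centered, block-scaled X values
    w_star : K x A transformed X-weights, given as a list of K rows
    """
    if len(x_sim) != len(w_star):
        raise ValueError("x_sim and w_star must agree on the number of X variables")
    n_components = len(w_star[0])
    return [
        sum(x_k * w_row[a] for x_k, w_row in zip(x_sim, w_star))
        for a in range(n_components)
    ]


# Hypothetical model with K = 3 variables and A = 2 components.
w_star = [[0.5, 0.0],
          [0.5, 1.0],
          [0.0, -1.0]]
x_sim = [2.0, 4.0, 1.0]
t_sim = project_scores(x_sim, w_star)
```

Each score is the dot product of the observation with one column of the weight matrix, so `t_sim` has one entry per model component.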

X Predicted Calculation:

\[\hat{X}_{sim} = T_{sim} P_{model}^T\]

\(\hat{X}_{sim}\) = predicted X for the simulation

\(P_{model}\) = model X-loadings

\(SPE_{X}\) Calculation:

\[SPE_{X} = \sum_{i = 1}^{K} (X_{sim, i} - \hat{X}_{sim, i})^2\]

\(X_{sim, i}\) & \(\hat{X}_{sim, i}\) = observed and predicted X values of the simulation for variable \(i\)

\(K\) = number of X variables
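The X prediction and the squared prediction error can be sketched together for one observation. The function names, the layout of `p_loadings` as K rows by A columns, and the example values are assumptions made for illustration only.

```python
def predict_x(t_sim, p_loadings):
    """x_hat = t_sim @ P^T, with p_loadings given as a list of K rows (K x A)."""
    return [
        sum(t_a * p_ka for t_a, p_ka in zip(t_sim, p_row))
        for p_row in p_loadings
    ]


def spe_x(x_sim, x_hat):
    """Sum of squared residuals over the K X variables."""
    return sum((x - xh) ** 2 for x, xh in zip(x_sim, x_hat))


# Hypothetical loadings for K = 3 variables and A = 2 components.
p_loadings = [[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]]
t_sim = [2.0, 1.0]
x_sim = [2.0, 1.5, 2.0]          # already mean-centered and scaled

x_hat = predict_x(t_sim, p_loadings)   # reconstruction from the model plane
spe = spe_x(x_sim, x_hat)              # squared distance off the plane
```

A value of zero would mean the simulated observation lies exactly on the model plane; larger values indicate that more of the observation is left unexplained by the model components.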

Hotelling’s \(T^2\) Calculation:

\[HotT^2 = \sum_{a = 1}^{A} \frac{t_{a, sim}^2}{s_{a}^2}\]

\(t_{a, sim}\) = score for component \(a\) for the simulation
\(s_{a}^2\) = variance of score \(t_{a}\) for component \(a\) in the training model
\(A\) = number of model components
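A short sketch of the Hotelling's \(T^2\) sum is given below. The name `hotelling_t2` and the argument `score_variances` (the training-score variance of each component, which is the denominator in the equation above) are hypothetical.

```python
def hotelling_t2(t_sim, score_variances):
    """Sum over components of squared score divided by training-score variance."""
    if len(t_sim) != len(score_variances):
        raise ValueError("one variance is required per component")
    return sum(t_a ** 2 / var_a for t_a, var_a in zip(t_sim, score_variances))


# Hypothetical two-component model.
t_sim = [3.0, -2.0]
score_variances = [9.0, 4.0]
t2 = hotelling_t2(t_sim, score_variances)   # 9/9 + 4/4
```

Dividing by the variance puts every component on the same footing, so a score one standard deviation from the origin contributes 1 regardless of how much variation that component captures.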

Y Predicted Calculation:

\[\hat{Y}_{sim} = T_{sim} Q_{model}^T\]

\(\hat{Y}_{sim}\) = predicted Y for the simulation

\(Q_{model}\) = model Y-weights
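Chaining the score and Y equations gives \(\hat{Y}_{sim} = X_{sim} W_{model}^{star} Q_{model}^T\), which the sketch below evaluates for one observation. The helper names and the layout of `q_weights` as one row per Y variable with A columns are assumptions for illustration.

```python
def vec_times_rows(vec, rows):
    """Return vec @ M, where M is given as a list of len(vec) rows."""
    n_cols = len(rows[0])
    return [sum(v * row[j] for v, row in zip(vec, rows)) for j in range(n_cols)]


def predict_y(x_sim, w_star, q_weights):
    """y_hat = (x_sim @ W*) @ Q^T for one preprocessed observation."""
    t_sim = vec_times_rows(x_sim, w_star)            # scores, length A
    return [
        sum(t_a * q_ma for t_a, q_ma in zip(t_sim, q_row))
        for q_row in q_weights                        # one row per Y variable
    ]


# Hypothetical model: K = 2 X variables, A = 2 components, one Y variable.
w_star = [[1.0, 0.0],
          [0.0, 1.0]]
q_weights = [[2.0, 0.5]]
y_hat = predict_y([3.0, 4.0], w_star, q_weights)
```

Here the scores are `[3.0, 4.0]`, so the single predicted Y value is 2.0 × 3.0 + 0.5 × 4.0.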

References

Wold, S.; Sjostrom, M.; Eriksson, L. PLS-regression: a basic tool of chemometrics. Chemometrics and Intelligent Laboratory Systems, 58, 2001, pp. 109-130.
MacGregor, J. F.; Yu, H.; Munoz, S. G.; Flores-Cerrillo, J. Data-based latent variable methods for process analysis, monitoring and control. Computers and Chemical Engineering, 29, 2005, pp. 1217-1223.
Geladi, P.; Kowalski, B. R. Partial Least-Squares Regression: A Tutorial. Analytica Chimica Acta, 185, 1986, pp. 1-17.
Kourti, T. Application of latent variable methods to process control and multivariate statistical process control in industry. Int. J. Adapt. Control Signal Processing, 19, 2005, pp. 213-246.

Find Closest Formulation

The Find Closest Formulation tool calculates the distance between a simulation scenario and each historical model formulation in the score space based on the Mahalanobis distance. The historical model formulation with the shortest distance in the score space is selected as the closest match.

\[d_{f} = \sum_{a = 1}^{A} \frac{(t_{a, sim} - t_{a, f})^2}{s_{a}^2},\qquad \forall \; f \in F\]

Where,
\(d_{f}\) = score space distance between simulated formulation \(sim\) and historical formulation \(f\)
\(t_{a, sim}\) & \(t_{a, f}\) = scores for \(sim\) and \(f\) for component \(a\)
\(s_{a}^2\) = variance of score \(t_{a}\) for component \(a\)
\(F\) = set of all historical formulations
\(A\) = number of model components

Note

If multiple historical formulations are tied for the shortest distance, FormuSense selects the one that comes last in alphabetical order as the closest match.
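The distance calculation and the tie rule above can be sketched as follows. The function names, the dictionary of historical scores, the exact-equality test for ties, and the case-insensitive ordering are all assumptions made for this example; FormuSense's own comparison rules are not documented here.

```python
def score_distance(t_sim, t_hist, score_variances):
    """Variance-weighted squared distance between two score vectors."""
    return sum(
        (t_s - t_h) ** 2 / var_a
        for t_s, t_h, var_a in zip(t_sim, t_hist, score_variances)
    )


def find_closest(t_sim, historical_scores, score_variances):
    """Return (name, distance) of the nearest historical formulation.

    historical_scores maps a formulation name to its list of A scores.
    Exact ties go to the name that sorts last (case-insensitive, an assumption).
    """
    distances = {
        name: score_distance(t_sim, t_f, score_variances)
        for name, t_f in historical_scores.items()
    }
    best = min(distances.values())
    tied = [name for name, d in distances.items() if d == best]
    return max(tied, key=str.casefold), best


# Hypothetical historical formulations in a two-component score space.
historical_scores = {
    "Alpha": [1.0, 0.0],
    "Bravo": [3.0, 0.0],
    "Charlie": [2.0, 4.0],
}
name, dist = find_closest([2.0, 0.0], historical_scores, [1.0, 4.0])
```

In this example "Alpha" and "Bravo" are both at distance 1.0 from the simulation, so the tie rule returns "Bravo".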

Reference

The derivation of the Mahalanobis distance in relation to the score space can be found in the paper: MacGregor, J. F.; Kourti, T. Statistical Process Control of Multivariate Processes. Control Engineering Practice, 1995, Vol. 3 Iss. 3, 404-414.