Pharma’s Big Hurdle with Big Data

By Monica Salib, March 26, 2019

The effective use of Big Data in manufacturing is an advancement consistent with the Industry 4.0 movement. Recently, the pharmaceutical industry has recognized the value of collecting data; Big Data can aid both manufacturers and patients. As such, the industry has started investing aggressively in Big Data solutions.

Astonishingly, pharmaceutical data is now doubling every 5 months. However, most Big Data initiatives have yet to deliver the value originally anticipated because, unfortunately, most of the complex, high-volume data that is collected is never analyzed [1].

ProSensus’ expertise is in analyzing large manufacturing datasets from a variety of industries using multivariate analysis. Unlike companies in some other industries, most pharma companies have already collected and organized their data. The challenge at hand is to optimize the use of the available data. ProSensus helps companies with this task by providing unbiased, complex analyses across the entire production process.

We develop powerful multi-block models that simultaneously consider all types of available data in a single, interpretable model. These models help our clients determine how variability in all stages of the manufacturing process impacts individual operating units and final product quality. ProSensus has experience analyzing many types of data including:

  • Raw material properties
  • Batch formulation / recipe
  • High-frequency process measurements in a single reactor or multi-stage process
  • Low-frequency lab samples
  • In-line NIR spectra measurements
  • Theoretical calculations
  • Batch quality measurements


Based on the client’s interest, ProSensus performs multivariate analysis on these comprehensive datasets with multiple objectives such as:

  • Troubleshooting process operating problems – Examples include troubleshooting reduced yield or contamination.
  • Quantifying differences between repeat equipment – Examples include identifying differences in parallel reactors, centrifuges, etc.
  • Predicting final batch quality – A multi-block PLS model can be developed to combine all data blocks, with the aim of identifying sources of variation correlated to product quality, and predicting the product quality at the end of the batch.
  • Optimizing final batch quality – Constrained optimization can be applied to a developed model to determine the best operating conditions to maximize quality, while respecting process and regulatory constraints.
  • Assessing NIR as a batch health indicator – If in-line NIR data is being collected, it is often not used to its full potential. The feasibility of using this data as a signature of process health throughout the duration of a batch can be assessed.
  • Assessing in-line implementation – Batch monitoring for early fault detection and final quality prediction may be of interest if successful models are developed. ProSensus will provide a feasibility assessment, model building, and recommendations for in-line
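
To make the "predicting final batch quality" objective above more concrete, here is a minimal sketch of one common way to build a multi-block PLS model, written in plain Python. It is an illustration only and not ProSensus' software or method. The assumptions are: each block is autoscaled and then weighted by 1/sqrt(number of variables) before the blocks are concatenated, the model is a single-response PLS fitted with NIPALS, and the two blocks (`raw` and `process`), the function names, and the data are all synthetic and hypothetical.

```python
import math
import random


def scale_block(block):
    """Autoscale each column, then weight the block by 1/sqrt(n_columns)."""
    n, k = len(block), len(block[0])
    means = [sum(row[j] for row in block) / n for j in range(k)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in block) / (n - 1)) or 1.0
        for j in range(k)
    ]
    weight = 1.0 / math.sqrt(k)  # assumed block weighting so no block dominates
    return [[(row[j] - means[j]) / stds[j] * weight for j in range(k)] for row in block]


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model with the NIPALS algorithm."""
    y_mean = sum(y) / len(y)
    resid = [v - y_mean for v in y]
    n, k = len(X), len(X[0])
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the y residual.
        w = [sum(X[i][j] * resid[i] for i in range(n)) for j in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in X]  # scores
        tt = sum(v * v for v in t)
        p = [sum(X[i][j] * t[i] for i in range(n)) / tt for j in range(k)]  # loadings
        q = sum(a * b for a, b in zip(resid, t)) / tt  # regression of y on scores
        # Deflate X and y before extracting the next component.
        X = [[X[i][j] - t[i] * p[j] for j in range(k)] for i in range(n)]
        resid = [resid[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return y_mean, components


def predict_pls1(model, x):
    """Predict y for one already-scaled row by replaying the deflation steps."""
    y_mean, components = model
    x = list(x)
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(x, w))
        y_hat += q * t
        x = [a - t * b for a, b in zip(x, p)]
    return y_hat


# Synthetic example: 30 batches, a raw-material block and a process block.
random.seed(0)
raw = [[random.gauss(0, 1) for _ in range(2)] for _ in range(30)]
process = [[random.gauss(0, 1) for _ in range(3)] for _ in range(30)]
quality = [2.0 * r[0] + 1.5 * p[1] + random.gauss(0, 0.1) for r, p in zip(raw, process)]

# Concatenate the scaled blocks into one matrix and fit the model.
X = [a + b for a, b in zip(scale_block(raw), scale_block(process))]
model = fit_pls1(X, quality, n_components=2)
preds = [predict_pls1(model, row) for row in X]

# Goodness of fit on the training batches.
q_mean = sum(quality) / len(quality)
r2 = 1 - sum((a - b) ** 2 for a, b in zip(quality, preds)) / sum(
    (a - q_mean) ** 2 for a in quality
)

# Share of the first component's squared weights carried by each block.
w1 = model[1][0][0]
block_share = {"raw": sum(v * v for v in w1[:2]), "process": sum(v * v for v in w1[2:])}
print(f"R2 = {r2:.3f}, block shares = {block_share}")
```

The block shares give a rough view of which stage of the process is most correlated with quality. A real project would also need cross-validation to choose the number of components, and a held-out set of batches to check predictive ability.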

A typical multi-block multivariate analysis (MVA) project with ProSensus includes: 1) assistance with data collection and organization, 2) analysis and modeling, 3) presentation and documentation of results, and 4) recommendations for future work.

Our extensive experience in multi-block multivariate analysis will allow your pharma company to gain valuable insights from your Big Data. Contact us today to get started.


  1. Oliver, A. (2018). Big Data is Failing Pharma. Retrieved from

About the Author: Monica Salib

Monica Salib, B.Eng.
Project Engineer
Monica holds a chemical engineering degree from McMaster University. She has been involved with client projects in rapid product development, troubleshooting analyses, in-house courses, and advanced modeling sessions. Monica has impressed clients with her attention to detail and ability to organize large datasets.