Big Data Analysis Interrogating Raw Material Variability
and the Impact on Process Performance
by
Maria Emilia Lopez Marino
B.S., Chemical Engineering, Universidad Nacional de Mar del Plata, 2013
Submitted to the MIT Department of Civil and Environmental Engineering and MIT Sloan
School of Management in partial fulfillment of the requirements for the degrees of
Master of Science in Civil and Environmental Engineering
and
Master of Business Administration
in conjunction with the Leaders for Global Operations Program at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2019
© 2019 Maria Emilia Lopez Marino. All rights reserved.
Signature of Author ...
MIT Department of Civil and Environmental Engineering and Sloan School of Management
May 8, 2019
Certified by ...
Roy Welsch, Thesis Supervisor
Professor of Statistics and Management Science, MIT Sloan School of Management
Certified by ...
Philip Gschwend, Thesis Supervisor
Ford Professor of Civil and Environmental Engineering
Accepted by ...
Heidi Nepf, Chair, Graduate Program Committee
Donald and Martha Harleman Professor of Civil and Environmental Engineering
Accepted by ...
Maura Herson, Assistant Dean, MBA Program
MIT Sloan School of Management
Public Information
Big Data Analysis Interrogating Raw Material Variability
and the Impact on Process Performance
by
Maria Emilia Lopez Marino
B.S., Chemical Engineering, Universidad Nacional de Mar del Plata, 2013
Submitted to the MIT Department of Civil and Environmental Engineering and MIT Sloan School of Management on May 8, 2019, in partial fulfillment of the requirements for the degrees
of Master of Science in Civil and Environmental Engineering and Master of Business Administration
Abstract
Within the biopharmaceutical industry, material sciences is a rapidly growing field that helps ensure the reliable production and delivery of medicines. Consequently, there is an ongoing need to evaluate and assess new materials, driven by novel process technologies and new modalities. Finding a solution to technically assess the impact of raw material attributes on the manufacturing process represents a significant opportunity to ensure supply.
This study seeks to develop a novel predictive framework to assess the impact of raw material variability on the performance of commercial biologic manufacturing processes. Through machine learning techniques, the impact of two strategic raw materials is evaluated by modeling and predicting the outcomes of critical process performance variables and product quality attributes. As part of this research, we aimed to equip Amgen Inc. with a novel learning tool delivering the potential to uncover a deeper level of material variability understanding which: (1) ensures reliable supply through consistent performance, (2) provides insights into material attributes, and (3) delivers the capability to solve material-related investigations more efficiently.
Models trained via machine learning showed 89% average accuracy on predictions for new data. In addition to the demonstrated predictive power, the models developed were highly interpretable and illustrated correlations with several material attributes. Hence, the framework developed is the starting point of a novel methodology for understanding input material variability.
The predictive framework was implemented as a web tool and is currently being piloted at Amgen Inc. The modular design of the predictive models and the web tool enables their application to other production processes and associated raw materials, and could be generalized across the industry.
Thesis Supervisor: Roy Welsch
Title: Eastman Kodak Leaders for Global Operations Professor of Management, MIT Sloan School of Management
Thesis Supervisor: Philip Gschwend
Title: Ford Professor of Civil and Environmental Engineering, Department of Civil and Environmental Engineering
Acknowledgements
I would like to start by thanking my MIT academic advisors, Philip Gschwend and Roy Welsch, for their keen advice, patient mentorship and for the technical insights provided during the internship and while writing the thesis. I am privileged to have learned from them.
Secondly, I want to thank Amgen Inc. for providing the opportunity of working on such an interesting topic and for the support of this thesis. I would like to thank specifically Patrick Gammell and Sally Kline, for their continuous guidance, encouragement and candid feedback to ensure a successful project. They helped me remove the obstacles along the way and pointed me in the right direction. Ting Wang, Tom Mistretta, Roger Hart, the Materials Science group and the Digital Integration and Predictive Technologies group provided invaluable technical feedback and expertise. It was a pleasure working with them and I am extremely thankful for their support and patience.
I am also grateful for the support structure that Amgen Inc. established for the MIT LGO interns. Dollie Grajczak was instrumental in making our experience unique from start to end. I would also like to thank Aine Hanly, our LGO executive sponsor at Amgen Inc., who was deeply involved with all of our projects and was a personal champion for me. The LGO alumni community at Amgen Inc. was fantastic in bridging my experience as an intern and an LGO student, and helping me transform my research project into an MIT LGO thesis. Chris Garvin, Kerry Weinberg and Leigh Hunnicutt deserve special recognition.
My fellow off-cycle Amgen Inc. interns, Martin Carcamo, Yucen Xie, Hillary Doucette and David Goldberg, provided continuous feedback and insights, and inspired me to become a better leader and professional. I cannot imagine doing this work without them. I would also like to extend my gratitude to the LGO staff, in particular to Ted Equi, Anna Voronova and Patty Eames, for keeping all of us on the right path. I am also extremely thankful to my classmates for their friendship and for making these two years remarkable. As I usually say, they are my favorite part of the LGO experience, which has been the best decision I have made so far. I look forward to our next chapter and future contributions as global leaders.
Finally, I would like to thank my family. I am humbled to be at MIT and they are the ones who made it all possible. They are my main source of motivation and I am grateful for their confidence in me, support and encouragement to pursue my goals, wherever they might take me.
"You can't connect the dots looking forward; you can only connect them looking backwards. So you have to trust that the dots will somehow connect in your future." - Steve Jobs
The author wishes to acknowledge the
Leaders for Global Operations Program for its support of this work.
Table of Contents
CHAPTER 1. INTRODUCTION
1.1 PROJECT DRIVERS AND MOTIVATION
1.2 PROBLEM STATEMENT
1.3 STATEMENT OF HYPOTHESIS AND RESEARCH METHODOLOGY
1.4 SCOPE AND LIMITATIONS
1.5 THESIS OVERVIEW
CHAPTER 2. PROBLEM BACKGROUND
2.1 BIOTECHNOLOGY INDUSTRY
2.2 ABOUT AMGEN INC.
2.3 BIOLOGICS MANUFACTURING
2.3.1 CHOICE OF BIOREACTOR
2.4 RAW MATERIAL MANAGEMENT
2.4.1 KEY SUPPLY CHAIN MANAGEMENT CHALLENGES
2.4.2 PROCESS ANALYTICAL TECHNOLOGY (PAT) FOR RAW MATERIALS
CHAPTER 3. PROBLEM FORMULATION
3.1 SELECTION OF RAW MATERIALS
3.2 SELECTION OF RESPONSE VARIABLES
3.3 OVERVIEW OF AVAILABLE DATA
3.4 MACHINE LEARNING APPROACH
3.5 CHAPTER SUMMARY
CHAPTER 4. LITERATURE REVIEW
4.1 FILTRATION PROCESSES
4.1.1 TANGENTIAL FLOW FILTRATION AND ALTERNATING FLOW FILTRATION
4.2 CHEMICALLY DEFINED MEDIA USED IN CELL CULTURE
4.4 MACHINE LEARNING OVERVIEW
4.4.1 PREDICTIVE MODELS
4.4.2 FEATURE SELECTION METHODS
CHAPTER 5. RESEARCH METHODOLOGY
5.1 SETTING THE FOUNDATION: MODEL ARCHITECTURE
5.2 DATA COLLECTION METHODS
5.3 DATA PREPROCESSING METHODS
5.4 FEATURE ENGINEERING
5.5 PREDICTIVE MODELS
5.6 PERFORMANCE METRICS FOR MODEL SELECTION
5.7 CHAPTER SUMMARY
CHAPTER 6. RESULTS AND DISCUSSION
6.1 MODEL PERFORMANCE
6.1.1 PREDICTIVE POWER ANALYSIS OF MODEL
6.1.2 SENSITIVITY ANALYSIS ON MODEL ARCHITECTURE
6.2 CASE STUDIES
6.2.1 RAW MATERIAL IMPACT ON PERFUSION FILTER CHANGE-OUT
6.2.2 CHEMICALLY DEFINED MEDIA ATTRIBUTES IMPACT ON TITER AND PRODUCT QUALITY
6.2.3 IMPACT OF CHEMICALLY DEFINED MEDIA STORAGE TIME ON TITER AND PRODUCT QUALITY
6.3 RAW MATERIAL SUPPLY CHAIN IMPLICATIONS
6.4 BUSINESS IMPACT SCENARIO
6.5 SUSTAINABILITY OF RESULTS: WEB TOOL DEVELOPMENT AND PILOT IMPLEMENTATION
6.6 CHAPTER SUMMARY
CHAPTER 7. CONCLUSION AND RECOMMENDATIONS
7.1 SUMMARY OF FINDINGS AND CONTRIBUTIONS
7.2 RECOMMENDATIONS FOR SCALABILITY
7.3 FUTURE RESEARCH AND RELATED APPLICATIONS
BIBLIOGRAPHY
APPENDICES
APPENDIX 1. PROCESS AND MATERIAL VARIABLES INCLUDED AS INDEPENDENT VARIABLES
APPENDIX 2. CHEMICALLY DEFINED MEDIA COMPONENT LIST
APPENDIX 3. MODEL PERFORMANCE (EXTENDED)
APPENDIX 4. PREDICTIVE ALGORITHM COEFFICIENTS FOR EACH MODEL
List of Figures
Figure 1-1. Schematic showing the steps of biomanufacturing. First is cell line development, second is upstream processing, third is downstream processing, and last is fill and finish. Note: cell line development is not performed each time a product is manufactured (Love, 2008).
Figure 1-2. Classic biologics manufacturing process (Amgen Inc., 2018c).
Figure 3-1. Elution profile of SE-HPLC for molecule 1. y-axis: absorbance at 280 nm (AU); x-axis: elution time (min).
Figure 3-2. Outline of the data available for modelling.
Figure 4-1. Distinction between conventional filtration and tangential (cross-flow) filtration. The washing action of the fluid passing tangentially across the surface of the membrane keeps the filter from becoming clogged (Brock, 1983).
Figure 4-2. Pressure and exhaust cycles of an alternating tangential flow filtration (ATF) module (Zydney, 2015).
Figure 5-1. Machine Learning pipeline implemented.
Figure 5-2. Data sources and relationships.
Figure 5-3. Example of feature selection using feature importance from ensemble of trees (Random Forest).
Figure 5-4. Sample linear interpretation of predictive model.
Figure 5-5. Partitioning of the space in a CART model (Hastie, Tibshirani and Friedman, 2009).
Figure 5-6. Sample visual interpretation of CART model.
Figure 5-7. Diagram of cross-validation scheme with k=5.
Figure 6-1. Types of predictive models constructed (left).
Figure 6-2. Models average performance for predicting titer and product quality attributes.
Figure 6-3. Performance (AUC) for two predictive algorithms (Classification Tree and Logistic Regression) using four different feature selection methods and no feature selection method (None).
Figure 6-4. Performance of Logistic Regression (measured by area under the receiver operating curve) when incorporating different raw material attributes into the model for predicting perfusion filter change-out.
Figure 6-5. Lasso Regression performance (measured by R²) when incorporating different raw material attributes into the model for predicting titer at the end of the cell culture.
Figure 6-6. Linear coefficients of Lasso Regression for molecule 1, process A, CDM 1, in the case where all material attributes were included in the regression.
Figure 6-7. Lasso Regression performance (measured by R²) when incorporating different raw material attributes into the model for predicting product quality at the end of the cell culture.
Figure 6-8. Linear coefficients of Lasso Regression for molecule 1, process A, CDM 1, in the case where all material attributes were included in the regression.
Figure 6-9. Titer variation with storage time for one lot of Chemically Defined Medium 6.
Figure 6-10. Cost calculation scheme (Klutz et al., 2016).
Figure 6-11. Dual purpose user interface to: (1) collect and preprocess data and then train/re-train models (Modelling) and (2) make predictions for new data and obtain model interpretation (Predictor).
List of Tables
Table 5-1. Predictive models trained for regression and classification models.
Table 5-2. Confusion matrix for filter Change-Out (CO) problem.
Table 6-1. Performance metric and standard deviation for all the models trained. Regression problems are scored using R² and classification problems using area under the Receiver Operating Characteristic Curve (AUC).
Table 6-2. Coefficients for storage time in models for titer and product quality prediction, for two chemically defined media used in the manufacturing process C of molecule 2.
Table A1-1. List of variables included in models.
Table A2-1. Media component list of amino acids of a selection of published media.
Table A2-2. Media component list of inorganic salts of a selection of published media (Landauer, Spier and Griffiths, 2012).
Table A2-3. Media component list of vitamins of a selection of published media (Landauer, Spier and Griffiths, 2012).
Table A2-4. Media component list of lipids and similar substances of a selection of published media.
Table A3-1. Best feature selection technique and predictive algorithm for each of the scenarios modeled.
Table A3-2. Model Adjusted Performance.
List of Equations
Equation 4-1
Equation 4-2
Equation 4-3
Equation 4-4
Equation 4-5
Equation 4-6
Equation 4-7
Equation 5-1
Equation 5-2
Equation 5-3
Equation 5-4
Equation 5-5
Equation 5-6
Acronyms
API Active Pharmaceutical Ingredient
AUC Area Under Receiver Operating Characteristic Curve
CART Classification and Regression Tree
CDM Chemically Defined Media
CHO Chinese Hamster Ovary
CO Change Over
CoA Certificate of Analysis
CoG Cost of Goods
FDA Food and Drug Administration
GMP Good Manufacturing Practice
HPLC High Performance Liquid Chromatography
k-NN k-Nearest Neighbors
MI Mutual Information
MS Mass Spectrometry
NC Non-conformance
NWP Normalized Water Permeability
OBP Office of Biotechnology Products
PAT Process Analytical Technology
PCA Principal Component Analysis
PLS Partial Least Squares
PQA Product Quality Attribute
QbD Quality by Design
ROC Receiver Operating Characteristic Curve
SEC Size Exclusion Chromatography
SGD Stochastic Gradient Descent
SU Single Use
SVM Support Vector Machine
TFF Tangential Flow Filtration
VCD Viable Cell Density
Chapter 1
Introduction
"We are on the cusp of a true revolution in biology, including how we research and develop new therapies, offering a real chance to transform how we treat patients and address the burden of disease." - Bob Bradway
The study seeks to develop a novel predictive framework to evaluate the impact of raw material variability on the performance of commercial biomanufacturing processes. This chapter focuses on presenting the problem statement and the motivation to address it. Additionally, we enumerate the research objectives, provide an overview of the research methodology and state the scope and limitations of the analysis.
1.1 Project Drivers and Motivation
In the words of Bob Bradway, Amgen Inc.'s CEO, "we are in a pivotal time: the Age of Biology". Our understanding of human biology has never been greater, and our ability to translate this knowledge into innovation enables us to develop cures for patients at a pace we have never experienced (PhRMA, 2018b). The progress we are seeing today has revolutionized the way we treat disease. There are about 7,000 medicines in development globally impacting U.S. patients. Across the drug development pipeline, 74 percent have the potential to be first-in-class treatments (PhRMA, 2018a).
The unprecedented growth of the biotechnology industry (a projected increase of 50.8% between 2016 and 2021) (MarketLine, 2017) has also spurred competition. More than 1,400 of the products in the development pipeline are follow-on biopharmaceuticals, mostly biosimilars, "me-too medicines" and "me-better medicines" in major markets (BioPlan Associates, 2018). In this landscape, Amgen Inc. has recognized manufacturing excellence and the ability to reliably ensure supply as a differentiating capability. Consistently and efficiently delivering medicines to patients rests on a robust and reliable manufacturing process. Therefore, understanding and controlling raw material variability remains a cornerstone within Amgen Inc.'s strategy of serving "every patient, every time". This work proposes leveraging advances in artificial intelligence to uncover a deeper understanding of how material variability affects cell culture process performance.
Amgen Inc. has a network of material suppliers, which provide the processing aids and Single-Use (SU) equipment, process reagents and excipients used in the biomanufacturing process. Within this supply network, with its several players and echelons, developing a novel framework to technically assess the impact of material attributes on process performance is driven by three main reasons:
* Variability in process performance: advancing a shared understanding of material attributes represents an opportunity to improve quality and reduce variability, enabling consistent performance. This shared understanding uncovers the potential to implement improvements to materials, suppliers' manufacturing processes and specifications, and Amgen Inc. processes as appropriate.
* Material changes: there is an ongoing need to evaluate and assess new materials, driven by changes in drug modalities, new process technologies or novel material developments. In assessing the need for new materials/suppliers, the machine learning framework will aid Amgen Inc. in considering the performance of current materials and suppliers to ensure that engaging with new suppliers or novel materials does not pose inappropriate risks.
* Time invested in solving deviations: during commercial manufacturing, processes are monitored for performance, and deviations from in-process controls or limits are recorded and tracked as non-conformances (NCs). A deeper understanding of material attributes' impact could aid and potentially accelerate the investigation of NCs.
1.2 Problem Statement
Material sciences is a rapidly growing field within the biotechnology industry due to the complexity of the materials in use. The complexity of the biopharma supply chain network, in terms of number of materials, suppliers, and echelons, represents a challenge to the ability to understand the reasons for material variability. Finding a solution to technically assess the impact of material variation on the manufacturing process represents an opportunity to apply suitable controls and ensure consistent manufacturing. Proactively assessing material attributes uncovers a deeper level of understanding which potentially ensures reliable supply through consistent performance and allows us to select quality materials by design.
The primary goal of the internship is to develop a novel framework to analyze the relationship between external material supplier data and internal material and process performance information. The purpose of the tool is to predict the impact of material variability on critical performance variables within commercial manufacturing processes of biologics.
In support of this objective, the thesis seeks to address the following questions:
* What is the impact of material variability on cell culture process performance?
* Which are the material attributes with the greatest impact on the process performance?
* Can we extract the scientific principles behind the relationship found between the material attribute and the process performance variable?
1.3 Statement of Hypothesis and Research Methodology
Our hypothesis is that material variability has a measurable impact on biomanufacturing process performance. Through machine learning techniques we aim to quantify the impact of individual material attribute variability, and qualitatively understand which are the attributes with the largest influence on process performance.
This thesis focuses on developing a machine learning framework that takes material attributes and process data as inputs to predict critical process outcomes, such as final product quality and final product yield. There are two basic requirements for the machine learning models:
1. Predictive power: the models developed should enable us to predict, with an acceptable accuracy, the performance of new materials in the process, based on assays performed on the new materials to measure relevant attributes.
2. Descriptive power or interpretability: the models should be self-explanatory and reveal the material attributes with the largest impact on process performance.
We measured the models' performance on out-of-sample data to quantify their predictive power. To enable interpretability, we incorporated feature engineering and selection steps before building the predictive model. We tested the sensitivity of model performance to the feature selection techniques chosen. The interpretability of the models is demonstrated through three case studies which analyze the impact of specific materials on the manufacturing process performance of selected products.
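The evaluation idea in the paragraph above (select features, fit a model, score it only on data it has not seen) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the data are synthetic, the three "material attributes" are hypothetical, feature selection is reduced to picking the attribute most correlated with the response, and the predictive model is a one-variable least-squares line. Five folds are used only because the thesis's list of figures mentions a cross-validation scheme with k=5.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def cross_validated_r2(X, y, k=5):
    """Return the out-of-sample R^2 and the feature chosen in each fold."""
    n = len(y)
    folds = [list(range(start, n, k)) for start in range(k)]
    predictions = [0.0] * n
    chosen = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        y_train = [y[i] for i in train_idx]
        # Feature selection uses the training rows only, so the held-out
        # rows never influence which attribute is picked.
        best = max(
            range(len(X[0])),
            key=lambda j: abs(pearson([X[i][j] for i in train_idx], y_train)),
        )
        slope, intercept = fit_line([X[i][best] for i in train_idx], y_train)
        for i in test_idx:
            predictions[i] = intercept + slope * X[i][best]
        chosen.append(best)
    y_bar = mean(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot, chosen


# Synthetic example: 50 hypothetical lots with 3 hypothetical attributes;
# only attribute 0 actually drives the response.
random.seed(0)
X = [[random.random() for _ in range(3)] for _ in range(50)]
y = [2.0 * row[0] + random.gauss(0, 0.1) for row in X]

r2, chosen = cross_validated_r2(X, y, k=5)
print(f"out-of-sample R^2: {r2:.3f}; attribute chosen per fold: {chosen}")
```

Repeating the selection step inside each fold is a deliberate choice in this sketch: selecting features on the full data set first would let held-out rows leak into the model and overstate predictive power.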
The predictive models were deployed in a web tool which facilitates: (1) re-training the machine learning models once new data becomes available, and (2) making predictions for incoming data and interpreting them. This tool removes the barrier to entry for users without a coding background.
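The two functions of the tool described above, re-training on new data and predicting with interpretation, can be pictured as a small interface. The sketch below is purely hypothetical: the class name, method names and the one-variable linear model are placeholders chosen for illustration and do not describe the tool piloted at Amgen Inc.

```python
from statistics import mean


class RetrainablePredictor:
    """Hypothetical stand-in for a model behind a retrain/predict interface."""

    def __init__(self):
        self.records = []      # (attribute value, observed outcome) pairs
        self.slope = None
        self.intercept = None

    def retrain(self, new_records):
        """Add newly available records and refit on everything seen so far."""
        self.records.extend(new_records)
        xs = [x for x, _ in self.records]
        ys = [y for _, y in self.records]
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        if sxx == 0:
            raise ValueError("need at least two distinct attribute values")
        self.slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
        self.intercept = my - self.slope * mx

    def predict(self, attribute_value):
        """Predict the outcome for an incoming attribute value."""
        if self.slope is None:
            raise RuntimeError("call retrain() before predict()")
        return self.intercept + self.slope * attribute_value

    def interpret(self):
        """Plain-language reading of the fitted coefficient."""
        direction = "increases" if self.slope > 0 else "decreases"
        return (f"the predicted outcome {direction} by {abs(self.slope):.2f} "
                f"per unit of the attribute")


tool = RetrainablePredictor()
tool.retrain([(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)])   # initial history
first_prediction = tool.predict(4.0)
tool.retrain([(4.0, 9.0)])                            # new data arrives
print(first_prediction, tool.predict(4.0), tool.interpret())
```

Keeping the fitted state and the accumulated records in one object means a user only ever calls `retrain` or `predict`, which is one way to hide the modelling code from users without a coding background.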
1.4 Scope and Limitations
The scope of the work is circumscribed in several dimensions.
We focused on predicting the performance of commercial cell culture manufacturing processes. We did not include clinical trials or commercial downstream bioprocessing in the analysis. Additionally, we focused on studying the impact of two materials on the production of four specific molecules. Two of these molecules are manufactured via different processes. All in all, we concentrated our analysis on four molecules and six manufacturing processes. The processes in scope pertain to the production of large molecules across several of Amgen Inc.'s manufacturing sites: Thousand Oaks, Rhode Island, Puerto Rico and Singapore.
As presented in the previous sections, it was within the scope of this internship to:
* Develop machine learning models to predict material attribute impact on process performance, for two materials as a proof of concept.
* Allow interpretability to propose scientific principles behind the relationships found.
* Suggest improvements to the current process for the establishment of functional requirements and specifications for materials in scope.
The generality and modularity of the framework developed enables extending the models to other materials, processes, products and scales (i.e., clinical) without extensive effort.
A limitation of the work is that the data on material attributes fed into the models was provided by external suppliers. We only included attributes which are currently being measured by the material supplier. Consequently, there might be attributes impacting process performance that are not included in the models. However, evaluating the out-of-sample performance of the models allows us to measure to what extent the selected features explain the variability observed in process performance.
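For regression problems, one standard way to express "to what extent the selected features explain the variability" is the coefficient of determination, which the thesis's list of tables names as the regression score. The form below is the textbook definition evaluated on held-out observations; the symbols are notation assumed here, not taken from the thesis: y_i is an observed outcome, ŷ_i its prediction from a model that did not see observation i, and ȳ the mean observed outcome.

```latex
R^2 \;=\; 1 \;-\; \frac{\sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{n} \left( y_i - \bar{y} \right)^2}
```

Under this definition, a value near 1 indicates that the measured attributes account for most of the observed variability, while a value near 0 suggests that relevant attributes may be missing from the supplier data.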
1.5 Thesis Overview
This thesis is organized as follows.
Chapter 2 presents an overview of the biotechnology industry, focused on the healthcare segment, followed by a review of one of its major players: Amgen Inc. We provide a general description of the manufacturing process of biologics and the materials management challenges within the industry.
Chapter 3 lays the problem foundation by describing the materials selected as proof of concept for the study, the scope of the data available, the explanatory variables or features included in the model, the predicted variables and the model architecture used to build the framework.
Chapter 4 familiarizes the reader with the attributes of the materials within scope and covers previous work done in the field to understand their variability. We also provide background on the machine learning methods and algorithms utilized.
Chapter 5 details the architecture of the framework developed. We present the modelling choices, the data collection and preprocessing methods, the feature engineering techniques, the predictive models implemented, and the quality metrics used to select the best model.
Chapter 6 presents an overview of the model performance on out-of-sample data accompanied by a sensitivity analysis on the model architecture. This is followed by a qualitative evaluation of the models, exposing some of the insights on material attributes' impact on process performance through three case studies. Additionally, we discuss the business impact of the framework designed and the implications for the material supply chain. Finally, we introduce the modelling and predictive web tool implemented and piloted at Amgen Inc.
Chapter 7 summarizes the general findings and recommendations for Amgen Inc. We also provide recommendations for scalability. Finally, we identify areas for future work, as well as other potential applications of the work done.
This thesis is intended for multiple audiences, and the relevance of the subsets of content outlined previously depends on the area of interest of the audience.
Individuals with an academic interest in the areas of machine learning and material variability modelling are encouraged to read the entire document, with special focus on Chapters 3, 5 and 6. These chapters present the problem formulation, explain the methodology used, and describe the results, respectively. However, the rest of the chapters in the thesis provide important context, analysis, or discussion of the concepts relevant to the models developed.
Individuals with an academic interest in materials science in biomanufacturing, and in the biopharma industry in general, are encouraged to focus on Chapters 2, 3, 4, and 6, which provide the description and analysis most relevant to those areas. These chapters also cover the potential impact that can be realized by implementing the machine learning predictive framework, without getting into the specifics of machine learning modelling.
Finally, individuals within Amgen are encouraged to read the entire document with a particular focus on Chapter 7, which provides the recommendations for future work and scalability.
Chapter 2
Problem Background
"Our world is built on biology and once we begin to understand it, it then becomes a technology." - Ryan Bethencourt
The scope of this thesis encompasses commercial biomanufacturing processes. This chapter provides an overview of the biotechnology industry, with a focus on the healthcare segment, followed by a review of one of its major players: Amgen Inc. Additionally, we present a brief description of the manufacturing of biologics and the materials management system within the industry.
2.1 Biotechnology Industry
Biotechnology was defined by The European Federation of Biotechnology as an "integral application of knowledge and techniques to derive benefits from microorganism, animal and plant cultures and offers the possibilities of producing substances essential to the sustenance of life and well-being of mankind" (Rao, 1999). In more simple terms, it could be interpreted as the merger of biology and technology. This was the vision that the Hungarian agricultural engineer Karl Ereky had in 1919 in his book "Biotechnologie der Fleisch-, Fett- und Milcherzeugung im landwirtschaftlichen Grossbetriebe", where he described a technology based on converting raw materials into a more useful product.
Ereky's vision has now been realized by thousands of companies and research institutions, turning biotechnology into one of the fastest growing applied sciences. In 2021, the global biotechnology industry is forecast to have a value of $533.9 billion, an increase of 50.8% since 2016 (MarketLine, 2017). Medical/healthcare is the largest segment of the global biotechnology industry, accounting for 57.2% of the industry's total value. Aside from healthcare, modern biotechnology includes the areas of: chemical industry, energy, food industry, agriculture, environment protection and abatement of pollution, and biometallurgy (Rao, 1999). The focus of this thesis remains on biotechnology medicines.
Biotechnology medicines are large molecules that are similar or identical to the proteins and other complex substances that the body relies on to stay healthy. Also designated as biologics, large molecules or protein therapeutics, biotech medicines differ in several aspects from small molecules, which typically have a low molecular weight (<1000 Da) and can be delivered in pill form. Among many differences, the most significant to the industry may be that large molecule drugs are typically derived from living systems, in contrast to small molecule therapeutics, which are chemically synthesized. The living system, whether bacteria, yeast, or Chinese Hamster Ovary (CHO) cells, expresses the DNA sequence of interest that yields the therapeutic protein. These cells grow and reproduce within a series of large bioreactors, where they express the protein of interest. This protein is then harvested, purified, and formulated to be used as treatment (Nealon, 2018). Additionally, biotech medicines are typically injected or infused into the body in order to protect their complex structure from being broken down by digestion, as would happen if they were taken by mouth.
Since the first approval of recombinant insulin by the US Food and Drug Administration (FDA) in 1982, more than 239 different proteins or peptides have been approved for clinical use by the FDA, and many more are in development (Usmani et al., 2017). This is because protein therapeutics have several advantages over traditional small molecules. Protein therapeutics offer a highly specific set of functions that cannot be mimicked by simple chemical compounds. Moreover, the elevated specificity results in less potential to interfere with normal biological processes and cause adverse effects.
Within the biotechnology industry, Amgen Inc. is a pioneering company that discovers, develops, manufactures and markets human therapeutics based on advances in cellular and molecular biology.
2.2 About Amgen Inc.
AMGen (Applied Molecular Genetics Inc.) was established in Thousand Oaks, California, in 1980, as the brainchild of venture capitalist William K. Bowes and associates. In 1983 the company raised $40 million in its IPO and officially changed its name to Amgen Inc. (Amgen Inc., 2018a). The company grew to become a world-class pioneering biotechnology company. With a presence in approximately 100 countries and regions worldwide, Amgen Inc.'s market capitalization exceeded $124 billion as of December 2018 (Yahoo Finance, 2018). The company markets its principal products mainly in the US, Europe and Canada and operates in one business segment: human therapeutics.
Following its mission statement of serving patients, Amgen Inc. focuses on six therapeutic areas: cardiovascular disease, oncology, bone health, neuroscience, nephrology and inflammation. The company's major products include Neulasta (pegfilgrastim), Neupogen (filgrastim), Enbrel (etanercept), Xgeva/Prolia (denosumab), Aranesp (darbepoetin alfa), Epogen (epoetin alfa), and Sensipar/Mimpara (cinacalcet) (MarketLine, 2017). Partnering with Novartis, Amgen Inc. launched Aimovig in May 2018, which became the first FDA-approved drug designed to prevent migraines and was recognized by TIME magazine as one of the best inventions of 2018 (TIME Magazine, 2018). This launch is one testament to Amgen Inc.'s commitment to address areas of high unmet medical need and to strive for solutions that improve health outcomes and people's lives.
Amgen Inc. is a leader in the manufacturing of biologics, which represent 76% of its product portfolio (Amgen Inc., 2018b). Biologics are produced in living cells and are inherently complex due to naturally occurring molecular variations. Highly specialized knowledge and extensive process and product characterization are required to transform laboratory-scale processes into reproducible commercial manufacturing processes.
Amgen Inc.'s manufacturing network has clinical and commercial production capabilities for bulk manufacturing, formulation, fill, finish and device assembly. These activities are performed at facilities in the United States and its territories (Puerto Rico, Rhode Island and California), as well as internationally in Ireland, the Netherlands and Singapore. In addition, Amgen Inc. utilizes third-party contract manufacturers to supplement its commercial and clinical manufacturing requirements (Amgen Inc., 2017).
2.3 Biologics Manufacturing
The manufacture of biologics is a highly complex process in comparison to small molecule manufacturing processes due to the sensitivity of biologics to environmental conditions. As stated previously, protein-based therapies have structures that are larger, more complex, and more variable than the structure of drugs based on chemical compounds. Moreover, large molecules are produced using intricate living systems that require extremely precise conditions in order to make consistent products. At a high level, the cell culture based process consists of the following four main steps:
1. Producing the master cell bank containing the gene that makes the desired protein; CHO cells are typically used.
2. Using defined culture media to grow large numbers of cells that produce the protein.
3. Isolating and purifying the product protein.
4. Formulating and filling the biologic for use by patients.
First, an optimal cell which produces high concentrations of the target protein is engineered. Cells are cryopreserved in vials or cell bags (~2-5 mL). Manufacturing is initiated with the revival of the cryopreserved cells. Typically, the cells are thawed into small T-flasks, shake flasks, or spinner flasks and expanded in increasingly larger culture vessels to achieve sufficient cells to inoculate a seed bioreactor. Throughout the expansion process, the cells are kept in optimal conditions (temperature, pH, nutrients) for continued growth. Following the inoculum expansion and seed bioreactor steps, the cells are inoculated into the production bioreactor. During the production bioreactor step, the therapeutic protein is expressed by the cells. After this, the desired protein is isolated from the cells and the growth media. Various purification technologies are used to isolate and purify the proteins based on their size, molecular weight, and electrical charge. The purified protein is typically formulated with a sterile solution that can stabilize the protein for storage prior to administration to patients. The final steps are to fill vials or syringes with individual doses of the finished drug and to label the vials or syringes, package them, and make them available to physicians and patients (Figure 2-1).
Figure 2-1. Schematic showing the steps of biomanufacturing. First is cell line development, second is upstream processing, third is downstream processing, and last is fill and finish. Note: cell line development is not performed each time a product is manufactured (Love, 2008).
The described procedure is essentially a batch process. Once a batch progresses through one stage, the equipment is cleaned (or discarded, in the case of Single-Use Systems) and prepared for the next batch.
At every step of this process, it is crucial to maintain the specific environment that cells need in order to thrive. Even subtle changes can affect the cells and alter the proteins they produce. For this reason, strict controls are needed to ensure the quality and consistency of the final product in accordance with Good Manufacturing Practice (Figure 2-2). Manufacturers of biologics therefore carefully monitor process variables such as temperature, pH, nutrient concentration, and oxygen levels. Offline tests are also carried out to ensure that there has been no ingress of agents such as bacteria or viruses.
Figure 2-2. Classic biologics manufacturing process (Amgen Inc., 2018c).
2.3.1 Choice of Bioreactor
Many types of production bioreactor formats have been developed and investigated over the years, particularly in academia. However, batch, fed-batch, and perfusion culture are currently the dominant modes of operation for commercial mammalian cell culture based processes.
The batch mode of operation is a closed culture system in which a fixed amount of nutrients is added at the beginning of the culture. No additional nutrients are added, or fed, during the production phase.
For fed-batch cell culture operation, growth-supporting nutrients are added during the cell culture process to improve cell growth and productivity, and the volume present in the bioreactor increases due to the addition of the feed medium. As nutrients are depleted, a feed solution is added to the cell culture. The feed solution is a concentrated solution of amino acids, vitamins, and in some cases glucose, with trace elements to support the cell culture while avoiding substantial dilution of the bioreactor contents. The addition rate of the feed can be used to modulate the growth rate of the culture and may help avoid or reduce unwanted metabolic byproducts, such as lactic acid. The culture is ideally harvested prior to significant decline in culture health.
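As a rough illustration of why a concentrated feed limits dilution of the bioreactor contents, the following minimal sketch applies a simple mass balance to a single bolus feed addition. It assumes ideal mixing and no nutrient consumption during the addition. The function name and all numerical values are hypothetical and are not taken from any process described in this thesis.

```python
def bolus_feed(volume_l, conc_g_per_l, feed_volume_l, feed_conc_g_per_l):
    """Return (new_volume, new_concentration) after one bolus feed.

    Assumes ideal mixing and no nutrient consumption during the addition.
    """
    new_volume = volume_l + feed_volume_l
    total_mass = volume_l * conc_g_per_l + feed_volume_l * feed_conc_g_per_l
    return new_volume, total_mass / new_volume


# Hypothetical numbers: a 1000 L culture at 1 g/L glucose,
# fed with 50 L of a 200 g/L concentrated feed.
volume, glucose = bolus_feed(1000.0, 1.0, 50.0, 200.0)

# Fraction of the original cell density that remains after dilution by the feed.
dilution_factor = 1000.0 / volume
```

In this hypothetical case the glucose concentration rises roughly tenfold while the culture is diluted by less than 5%, which is the motivation for feeding a concentrated solution.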
In perfusion cell culture, a cell growth period is followed by a potentially long steady-state operation. During these two phases, fresh growth medium is added to the bioreactor and spent medium, typically containing the product and potentially cells, is removed. (In some other cases, the product is retained in the bioreactor and harvested by the end of production.) A method for cell retention is needed to keep the cells in the bioreactor, and control systems have to be designed to maintain consistent flow rates, volumes, and cell densities as well as to control all typical growth conditions (such as temperature, pH, and dissolved oxygen).
Higher cell densities are possible with perfusion when compared to fed-batch or batch processes because several limitations of batch processes are removed. Nutrients can be continuously provided by the replacement of the growth medium, and product and wastes can be continuously removed. This enables cell growth to continue until a second level of process limitation is reached, such as cell retention device capacity or bioreactor oxygen transfer rate. At this point cells must be discarded, either with the harvest stream (this may increase demands on harvest clarification systems) or in a separate stream of concentrated cells (which may result in small losses of product). In general, higher cell densities and increased bioreactor up-time can mean that smaller bioreactor volumes are required, reducing capital investment.
This trend towards smaller bioreactors has enabled the industry to embrace the use of disposable bioreactors. As the industry looks into the future of low-volume products, rapidly changing production demands, and personalized medicine, the concept of continuous processing is becoming more attractive (Zhou and Kantardjieff, 2013).
This thesis focuses on the impact of material variability on fed-batch and perfusion cell culture processes.
2.4 Raw Material Management
Raw materials, as used in this thesis, is a collective term that includes all materials used in a given bill of materials for drug substance and drug product manufacturing processes, other than the starting cell line. This definition covers a very diverse range of materials that includes processing aids and Single-Use (SU) equipment, process reagents, and excipients. Processing aids and SU assemblies include materials used to assist with the manufacturing process. In other words, they have a technical function in the process but are not intended to be incorporated into the final product. SU assemblies function as equipment but are used once and replaced. The processing aids and SU group includes filters, chromatography resins, catalysts, flocculants, shear protectants, antifoam agents and single-use assemblies such as bioreactors, buffer bags, sample bags, intermediate containers, tubing assemblies and manifolds. Secondly, process reagents include materials which are used in the manufacturing process as starting materials, cell culture nutrients, hydrolysates, organic and inorganic solvents, acids, bases and buffers, oxidants, reductants, coupling reagents, organometallic reagents, cleaning agents and preservatives, but which are not intended to be part of the final product. Lastly, excipients are materials which are formulated along with an active pharmaceutical ingredient (API) to enhance or ensure the final properties of the drug product. These vary widely depending on whether the drug is a parenteral or intended for oral ingestion, and can include preservatives, solubilizing agents, stabilizers, diluents in liquid form and bulking agents, binders, lubricants, coatings, colorants and flavoring agents. However, oral ingestion is not a common administration method for biologics.
Raw Material Attribute, as used in this thesis, refers to any physical, chemical, biological or microbiological property or characteristic of a raw material that should be within an appropriate limit, range, or distribution to ensure the desired product quality and/or process efficiency.
Certain raw materials used in biotechnology processes are complex in nature, as they often have numerous subcomponents and are known to exhibit lot-to-lot variability with respect to their attributes and subsequent impact on the process (for example, cell culture nutrients and media).
2.4.1 Key Supply Chain Management Challenges
There are numerous challenges that the biotechnology industry is facing regarding raw material supply chain management. Three issues deserve special attention (Rathore and Low, 2010):
* Raw material complexity: A particularly challenging problem is that of cell culture media complexity. Various components are mixed to form the media for microbial fermentation or mammalian cell culture steps. Therefore, a large number of subcomponents are used, often a mix of chemically defined raw materials and complex raw materials, which come from various suppliers. The performance of the process is known to be sensitive to small changes in some of these components and even to changes in the procedure used to produce the media from them.
* Large network: A typical biopharmaceutical manufacturer may have anywhere from 20 to 50 vendors that are sourcing raw materials for a given process. Consequently, simply because of the large size of the network, managing and controlling the quality of the supply chain becomes a daunting task.
* Process complexity: Biotechnology processes are known to employ a large number of different raw materials (typically between 50 and 100). It therefore becomes complex to examine the effect of each raw material experimentally.
Variability in product quality caused by variability in the quality of raw materials has been highlighted as a concern by the regulatory authorities (Rathore and Low, 2010). A robust raw material management system must be in place to facilitate implementation of Quality by Design (QbD). QbD is an initiative that originated from the FDA's Office of Biotechnology Products (OBP) and attempts to provide guidelines to build quality into the product. This can be achieved with an understanding of the product and of the process by which the product is developed and manufactured, along with knowledge of the risks involved in manufacturing the product and how best to mitigate those risks.
The minimum requirements of a raw material management system include:
1. Certificate of Analysis (CoA) from the vendor and confirmation that the raw material lot meets the internal specifications. Material specifications are set based on materials meeting the performance requirements of the process in question.
2. Testing plan for critical quality attributes analysis of raw materials as necessary.
3. Robust system to review any changes to the vendor's manufacturing process with respect to change in raw material quality and its impact on process consistency and product quality.
4. Vendor site audits at appropriate frequency to ensure vendor qualification as needed during product lifecycle. These may include technical audits by subject matter experts to assess supplier capabilities and competencies.
5. Appropriate system to ensure traceability of raw materials.
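The first requirement above, confirming that a lot's CoA values meet internal specifications, can be sketched as a simple range check. This is a minimal illustration only: the attribute names, specification limits, and CoA values are hypothetical, and a real system would also handle units, test methods, and non-numeric attributes.

```python
def check_lot_against_specs(coa_values, specs):
    """Return attributes that are missing from the CoA or outside their (low, high) range."""
    failures = {}
    for attribute, (low, high) in specs.items():
        value = coa_values.get(attribute)
        if value is None:
            failures[attribute] = "missing from CoA"
        elif not (low <= value <= high):
            failures[attribute] = f"{value} outside [{low}, {high}]"
    return failures


# Hypothetical internal specifications and vendor-reported CoA values.
specs = {"pH": (6.8, 7.4), "osmolality_mOsm_kg": (280.0, 320.0)}
coa_lot_a = {"pH": 7.1, "osmolality_mOsm_kg": 305.0}
coa_lot_b = {"pH": 7.6}  # pH out of range, osmolality not reported

lot_a_failures = check_lot_against_specs(coa_lot_a, specs)
lot_b_failures = check_lot_against_specs(coa_lot_b, specs)
```

An empty result means the lot meets every listed specification; any entry names the attribute that needs follow-up with the vendor.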
Materials and supply chain management and oversight are becoming even more essential as the industry continues to adopt single-use bioprocessing systems. In this case, bioprocessing equipment is repeatedly purchased, used, and disposed of, rather than being permanently installed and operated by internal staff. Repeated SU material purchases (with essentially every piece of equipment in contact with the process stream) involve hundreds of new and fully sterile products for each bioprocessing batch. The need for closer oversight is a result of the inherent complexity of SU systems and procured materials in general, which represent a source of variability compared to fixed equipment, which remains unchanged.
In addition to the advancements in single-use systems, progress in technologies such as perfusion bioreactors is moving the biopharmaceutical industry towards continuous manufacturing. As this trend continues, the variability of the raw materials used for cell culture may have a larger impact on the final product quality, relative to the in-process variability, because the process is operated at steady state. Therefore, the accurate and reproducible characterization of raw material quality will be of paramount importance to ensure that final product quality is consistently satisfactory (Trunfio et al., 2017).
2.4.2 Process Analytical Technology (PAT) for Raw Materials
Process Analytical Technology (PAT) is a system for designing, analyzing and controlling pharmaceutical manufacturing processes through measurements of critical quality and performance attributes of raw and processed materials to ensure final product quality. The aim is to improve efficiency while reducing over-processing and minimizing waste (Chen, Lovett and Morris, 2011). In 2004, the FDA published its PAT guidance as a voluntary framework. The PAT initiative is designed to improve the efficiency of both the manufacturing and regulatory processes through the use of an integrated approach to quality analysis. The key components of this approach are data analysis, process analytical tools, process monitoring and continuous feedback. These components enable process understanding, which is of paramount importance to build quality into the product.
"A process is generally considered well understood when: (1) all critical sources of variability are identified and explained, (2) variability is managed by the process, and (3) product quality attributes can be accurately and reliably predicted over the design space established for materials used, process parameters, manufacturing, environmental, and other conditions. The ability to predict reflects a high degree of process understanding" (FDA, 2004). While significant progress has been made in developing analytical methods for chemical attributes (e.g., identity and purity), certain physical and mechanical attributes of pharmaceutical ingredients are not necessarily well understood. Consequently, the inherent, undetected variability of raw materials may be manifested in the final product. Establishing effective processes for managing physical attributes of raw and in-process materials requires a fundamental understanding of attributes that are critical to product quality. Such attributes (e.g., particle size within a sample) of raw and in-process materials may pose a significant challenge because of their complexities and difficulties related to collecting representative samples. For example, it is well known that powder sampling procedures can be erroneous (BioPlan Associates, 2014).
The success of PAT and QbD applications in pharmaceuticals will depend on better analytics, allowing biomanufacturers to make a strong business case for using these tools to maximize yields and minimize quality defects.
Chapter 3
Problem Formulation
"The formulation of the problem is often more essential than its solution." - Albert Einstein.
We have shown the need to develop a solution to technically assess and predict the impact of material attributes on the manufacturing process. This chapter dissects the problem statement into its separate components. These elements are: the raw materials selected, the response variables which illustrate the impact of the material attributes on the process, the data available to describe the process and material attributes, and the type of predictive models chosen. A separate section is dedicated to each of them.
3.1 Selection of Raw Materials
Two raw materials are selected to design and pilot the implementation of the framework to assess the impact of their attributes on process performance. The materials chosen are used in most biologics manufacturing processes: chemically defined media (CDM) and filters. These materials also exhibit two critical supplier relationships and represent interesting case studies because of data availability.
Chemically defined media refers to a type of medium used in cell cultures. The medium consists of a mixture of inorganic salts and other nutrients capable of sustaining cell growth and survival in vitro. Ever since the observation in the 1950s that natural media could be replaced in part by synthetic media, attempts have been made to culture cells without serum on chemically defined media (Freshney, 2011). Serum is a component of blood which is comprised of blood plasma with the fibrinogen removed. It contains neither blood cells nor clotting factors, but includes all other proteins, electrolytes, antibodies and hormones, and it may be added to the media as a component to grow the cells in culture. Fetal bovine serum (FBS) and fetal calf serum (FCS) are the most widely used forms (Smith, 2015). Serum-containing media naturally contain various serum-derived substances, which make the medium composition undefined and whose concentrations can fluctuate from batch to batch. This situation makes the culture results less reproducible and poses a risk of microbial contamination. Among the serum-free media, subgroups of protein-free media (which do not contain any protein at all) and chemically defined media (which do not contain any undefined ingredient) provide additional stability and reproducibility for culture systems, facilitating the identification of the cellular secretions and reducing the risk of microbial contamination (Yao and Asayama, 2017) (Section 4-2). This study focuses on chemically defined media. The classes of components of defined media comprise amino acids, vitamins, inorganic salts, glucose, organic supplements, hormones and growth factors. Even with the standardization inherent to chemically defined media production, the variability between supplier batches could potentially result in undesired cell culture process variability. Understanding the attributes (if any) which drive the observed irregularity in process performance is a key first step to ensure reliability.
The other type of raw material subject of this study is hollow fiber cartridges with polymeric membranes. Filtration is a pressure-driven process by which particles are removed from fluid, air, or gas samples by passing through a permeable material. In bioprocessing, filtration plays an important role in purifying, concentrating and separating solutions and products. The filters that pertain to this study are perfusion tangential flow filters (TFF) with polymeric hollow fiber cartridges. In TFF, the solution is passed tangentially along the surface of the filter, and the constituents that are smaller than the membrane pore size are driven through the filter by means of a pressure gradient (Rautenbach, 2017). The use of the TFF method results in one feed generating two product streams: retentate and permeate. Perfusion filters are used in bioreactors to separate the spent media (permeate) and retain the cells (retentate) in the culture vessel. In some cases, the pore size is such that it also retains the product. Similar filter membranes with larger pore size are used in the last stage of cell culture bioprocessing, to harvest the product (permeate) by separating it from the cells and debris (retentate). Further detail on filtration and the use of chemically defined media in cell cultures is given in Chapter 4.
3.2 Selection of Response Variables
The response or dependent variable is the variable being predicted or explained. The explanatory or independent variables, in this case, material and process attributes, drive change in the response variable.
A couple of dependent variables are selected to depict the impact of the materials on cell culture process performance.
In the models trained to assess the impact of CDM attributes, we selected titer and product quality as dependent variables. Titer is the term typically used to describe the concentration of a protein, such as an antibody. We use titer to refer to the concentration (in mass/volume units) of the final product in solution; it is an indicator of the efficiency of the process. Titer can be measured repeatedly over the length of the cell culture process. Our response variable is the titer sampled at the end of the cell culture process, computed as the average of three independent measurements.
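The titer response variable described above can be sketched as follows. The function name and the measurement values are hypothetical; the only element taken from the text is that the end-of-culture titer is the average of three independent measurements.

```python
from statistics import mean


def end_of_culture_titer(measurements_g_per_l):
    """Average the three replicate titer measurements taken at the end of the culture."""
    if len(measurements_g_per_l) != 3:
        raise ValueError("expected three independent measurements")
    return mean(measurements_g_per_l)


# Hypothetical replicate measurements in g/L.
titer = end_of_culture_titer([4.9, 5.1, 5.0])
```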
The other group of dependent variables selected to evaluate the impact of CDM is the product quality attributes. Specifically, we focused on two of these attributes: amount of aggregated species and glycosylation profile.
Protein glycosylation is the attachment of a saccharide moiety to a protein, and is a modification that occurs either co-translationally or post-translationally. There are two major types of glycosylation: N-linked glycans, which are bound to the nitrogen on the amino acids asparagine or arginine, and O-linked glycans, which are bound to the hydroxyl oxygen on the amino acids serine or threonine (Lakos, Dremina and Snyder, 2018). Glycosylation plays a role in protein folding, interaction, stability, and mobility, and may affect immunogenicity, pharmacokinetics and anti-inflammatory activity (Roth, Yehezkel and Khalaila, 2012). Therefore, it is imperative to obtain the full glycan profile of a glycoprotein during discovery, clinical, and manufacturing phases. Following the purification of the biologic, two approaches may be taken for glycan structure analysis: chromatography and mass spectrometry. In our work, we quantify and include as dependent variables particular N-linked glycan forms measured through high-performance liquid chromatography (HPLC). The specific glycan form varies depending on the product.
Aggregates of proteins may arise from several mechanisms and are typically considered to be undesirable beyond a certain level because of the concern that the aggregates may lead to an immunogenic reaction (small aggregates) or may cause adverse events on administration (particulates). In our study, the level of aggregation of the manufactured molecule is evaluated using size exclusion high-performance liquid chromatography (SE-HPLC). Size exclusion chromatography provides quantitative information on the molecular size distribution of sample proteins under non-denaturing conditions based on differences in their hydrodynamic volume. Molecules with larger hydrodynamic sizes elute earlier than molecules with smaller volumes. The Peak A+A' represents main species (monomer), and Peak B represents high molecular weight species (also referred to as aggregates) (Figure 3-1).
Figure 3-1. Elution profile of SE-HPLC for molecule 1, showing Peak A+A' and Peak B. y-Axis: absorbance at 280 nm (AU); x-Axis: elution time (min).
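One common way to reduce such a chromatogram to a single aggregation value is the relative peak area of the high molecular weight species. The sketch below assumes that this convention applies and that integrated peak areas are already available; the text does not state the exact calculation used, and the function name and area values are hypothetical.

```python
def percent_high_molecular_weight(main_peak_area, hmw_peak_area):
    """Return the high molecular weight (aggregate) peak area as a percentage of total area."""
    total = main_peak_area + hmw_peak_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * hmw_peak_area / total


# Hypothetical integrated areas: Peak A+A' (monomer) and Peak B (aggregates).
pct_hmw = percent_high_molecular_weight(main_peak_area=980.0, hmw_peak_area=20.0)
```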
For perfusion filters, the response variable to characterize the material impact on the filtration process is the binary condition of a filter change-out. If the filter membrane is not operating at normal conditions (i.e. given a constant flux, the trans-membrane pressure increases above a certain operating threshold) in the bioprocessing, the material is replaced or changed-out. Whether the filter was changed-out during the process is the response variable.
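The operating criterion behind a change-out can be illustrated with a short sketch. In this work the response variable is the recorded change-out event itself; the code below only shows how a threshold on trans-membrane pressure (TMP) at constant flux maps to a binary label. The batch names, pressure readings, units, and threshold are all hypothetical.

```python
def exceeds_tmp_threshold(tmp_readings_kpa, threshold_kpa):
    """Return True if any trans-membrane pressure reading exceeds the threshold."""
    return any(reading > threshold_kpa for reading in tmp_readings_kpa)


# Hypothetical TMP readings (kPa) per production batch, at constant flux.
batches = {
    "batch_1": [10.0, 11.5, 12.0],
    "batch_2": [10.5, 14.0, 21.0],
}
threshold = 20.0  # hypothetical operating threshold in kPa

# Binary label: 1 if the criterion for a change-out was met, 0 otherwise.
change_out_label = {
    batch: int(exceeds_tmp_threshold(readings, threshold))
    for batch, readings in batches.items()
}
```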
3.3 Overview of Available Data
The data available to build the machine learning models can be classified in two distinct types: material and process features (Figure 3-2). These are the explanatory variables which describe the changes in the response variables selected.
The material data was provided by external suppliers. As previously stated in Chapter 1, this represents a limitation of the models: only the material attributes for which we have data are incorporated into the model. Therefore, a significant contribution from material attributes for which we have no information might be missing. However, evaluating the out-of-sample performance of the models allows us to measure the explanatory value of the selected features and to infer the impact of unaccounted factors on process performance. The process data consists of the daily measurements performed in the bioreactor during the cell culture process and variables which are measured once per batch. The material and process features are connected by the material batch used in the production batch.
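The link between material and process features can be sketched as a join on the material lot used in each production batch. This is a minimal illustration with hypothetical field names and values, not the actual data schema. Batches whose material lot has no supplier data are dropped here, which mirrors the limitation noted above.

```python
# Hypothetical supplier-provided attributes, keyed by material lot.
material_attributes = {
    "lot_A": {"component_x_conc": 1.02},
    "lot_B": {"component_x_conc": 0.97},
}

# Hypothetical once-per-batch process records, each naming the material lot used.
process_batches = [
    {"batch_id": "P001", "material_lot": "lot_A", "final_titer": 5.0},
    {"batch_id": "P002", "material_lot": "lot_B", "final_titer": 4.6},
    {"batch_id": "P003", "material_lot": "lot_C", "final_titer": 4.8},
]


def join_features(process_batches, material_attributes):
    """Attach material attributes to each production batch via its material lot."""
    rows = []
    for batch in process_batches:
        attributes = material_attributes.get(batch["material_lot"])
        if attributes is None:
            continue  # no supplier data for this lot, so the batch is skipped
        rows.append({**batch, **attributes})
    return rows


model_rows = join_features(process_batches, material_attributes)
```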
Appendix 1 presents a comprehensive list of the general process and material variables included. The methods for collecting and pre-processing the data are reviewed in Chapter 5.
Figure 3-2. Outline of the data available for modelling. [Diagram: daily measured variables, once-per-batch measured variables, and material attributes feed feature generation and then the machine learning model.]