• Aucun résultat trouvé

Identification of complex microbiological dynamic systems by nonlinear filtering

J.P. Gauchi1, C. Bidot1, J.C. Augustin2, J.P. Vila3

1 Unité de Mathématiques et Informatique Appliquées (UR341), INRA, Domaine de Vilvert, 78352 Jouy-en-Josas, France (Jean-Pierre.Gauchi@jouy.inra.fr, Caroline.Bidot@jouy.inra.fr)

2 Laboratoire ENVA/Unité Microbiologie des Aliments – Sécurité et Qualité, 7, avenue du Général de Gaulle, 94704 Maisons-Alfort, France (jcaugustin@vet-alfort.fr)

3 UMR Analyse des Systèmes et Biométrie, INRA/SupAgro, 2, Place P. Viala, 34060 Montpellier, France (Vila@supagro.inra.fr)

Abstract

In this paper, we consider microbiological dynamic systems encountered in food processing and conservation (e.g., Listeria monocytogenes). In order to predict the evolution of bacteria populations to improve our microbiological knowledge or to meet other objectives such as risk analysis, the modeling and identification of such stochastic dynamic systems have to first be subjected to adapted statistical approaches. However, the specificities of such systems (nonlinear state space systems) in which the variables of interest are only indirectly observed through a complex process of sampling, dilution series and counting, make it difficult to use classical methods such as least squares or maximum likelihood. It is then necessary to resort to specialized approaches such as sequential estimation techniques. This paper describes the first results of such a promising approach that involves a new particular filter applied to simulated microbial dynamics.

Keywords

Nonlinear filters, convolution kernels, particle filters, predictive modeling, microbiology.

Introduction

The complexity of food-type microbiological systems is related to both structure (ecosystems involving several bacterial species in interaction) and to function. The latter is the result of the distribution of their dynamics according to at least three hierarchal levels that we can approximately represent here by: (a) an external level, only accessible to measurements (counts taken in culture medium after sampling and dilution series); (b) an intermediary level, corresponding to the dynamics of bacterial increases or decreases, strictly speaking, not directly measurable in the food substrate considered but normally capable of being modeled in the form of primary stochastic models; (c) a deeper level, characterized by kinetic variables that determine the preceding dynamics and that are themselves the result of the interactions of various biotic and abiotic factors (media conditions such as pH, temperature, water activity, etc.). These kinetic elements can sometimes be modeled in the form of secondary stochastic models.

These functional characteristics make all predictive modeling and identification of these bacterial dynamics particularly difficult if not impossible when using classical approaches (e.g., nonlinear least squares). Recent research (Rossi and Vila, 2003; 2005; 2006) made it possible to provide a relevant methodological response to the two problems of parametric identification of these systems and to the estimation of predicted probability densities of the evolution of bacterial concentrations in substrates of interest. This research is based on the implementation of a new nonlinear particular filtering technique using a convolution kernel approach.

The aim of this presentation is to show the results obtained by applying this new type of filtering system with two actual microbiological dynamic systems (the second one will be discussed at the conference), of great interest to the food microbiology community, using a new user-friendly procedure written in Matlab (Bidot et al., 2009; Vila et al., 2009). Rather

than go into the theoretical basis of the method, published in Rossi and Vila (2003; 2005;

2006), here, we will only provide some elements in the following section.

Method

Statistical samplings

The current data acquisition procedures are sequential: a sample is taken from a primary solution tube, diluted in cascade (Fisher dilution series), and then spread over Petri dishes.

After a period of time, the bacterial units that form colonies (CFU) are counted, assuming that each colony corresponds to a single deposited bacterium. The current practice in microbiology at this time is to deduce the number of bacteria present in an initial solution of the primary tube by simple extrapolation using successive dilution factors and ignoring errors due to sampling (at the time the sample is removed), pipetting, dilution and counts on the Petri dishes. This highly approximate determinist estimation leaves room for much improvement. Using hypotheses involving spatial distributions to be validated experimentally (Poisson distributions, aggregative distributions) and hypotheses involving handling errors, we carried out probabilistic characterizations of these counts (depending on the initial numbers), taking estimates of the variances of all successive errors into account.

Conditional density estimates of countings

The dynamic systems considered here are similar to nonlinear space state systems, written as follows:

xt+1 = ft(xt+1, θ , εt) yt+1 = L (.| xt+1, θ )

where xt+1 is the real vector of dimension d of state variables, ft is a known function, yt+1 is the real vector of dimension q of observation variables, θ is a vector of unknown and constant parameters, εt is a white noise vector and L is the probability distribution of yt+1. In the convolution filtering approach, which is ours, we introduce the additional state equation, θt+1t,, to link the estimation of conditional densities of parameters θ with those of variables xt. On the basis of a priori densities at t = 0 for the variable vector and for the parameter vector, these estimations at time t are calculated using the formulas given by Rossi and Vila (2005; 2006). This is a non-parametric approach where the particles are generated from densities based on Parzen-Rosenblatt kernels. We will refer to Rossi and Vila (2005; 2006) for the equations and more details about this filtering algorithm and its convergence conditions.

Results and discussion

The Baranyi-Roberts (BR) model

The very well-known equation of the BR model (Baranyi and Roberts, 1994; 1995) under a discretized autoregressive form that will be our state equation is:

Nt+1 = δ N0 exp(μmax At ) (1/ Bt)( μmax (dAt /dt) – (dBt /dt )( 1/ Bt)) + Nt + φt where:

At = t + (1/μmax) ln(exp(- μmax t) + exp(μmaxλ) – exp(- μmaxt - μmaxλ)) Bt = 1 + (exp(μmax At) - 1)/(Nmax / N0)

where:

- Nt is the state variable, i.e., the bacteria number in the medium at time t,

- N0 is the bacteria number in the medium at the initial time t0; it is a parameter to be estimated,

- Nmax is the maximum bacteria number; it is a parameter to be estimated, - λ is the lag time before growth; it is a parameter to be estimated,

- μmax is the maximum growth rate; it is a parameter to be estimated,

- φt is an error term, a centered Poisson random P(Ntc) - (Ntc), where c is a fixed real constant or a parameter to be estimated and Ntc the parameter of the Poisson distribution,

- δ is the discretization step.

In this case, the observation model yt+1 corresponds to the distribution based on the sampling-dilution process (typically Poisson distributions) and CFU counts on the Petri dishes, referred to as L(.|xt+1, θy), where θy is a sub-vector of θ. This distribution cannot be characterized analytically, but it can be simulated, a requirement for putting the filter into practice.

Protocol

The CFU counting data were obtained using the following protocol:

- ten sampling times = 0, 72, 120, 168, 240, 264, 288, 336, 408, 504 (hours), with three samplings at each sampling time,

- five different dilution factors: the dilution factor is equal to 1 on the 0-120 time range, equal to 0.1 at the time 168, equal to 10-3 on the 240-264 time range, equal to 10-5 on the 264-408 time range, and equal to 10-5 at the time 504,

- the postulated parameter intervals were: [0.01 ; 2] for μmax, [20 ; 60] for λ, [100 ; 400] for N0, and [107 ; 109] for Nmax,

- discretization step δ = 24 (hours), - particle number = 106.

With this protocol, we obtained (Fig. 1) the evolutions of the parameter estimations, as well as their approximated confidence intervals for the considered number of particles.

The final estimations for the four parameters, taking a computing time of about 5 minutes into account, are: N0 = 280, Nmax = 3.44584x108, λ=36, and μmax=0.063.

We have shown here the application of our method to a model that is very well known to the microbiology community, but since there are only four parameters in this model, this estimation problem is too easy to fully illustrate the power of the method that we propose.

We do not have adequate space here to show the results of a model with seven parameters, which would be the Baranyi model in which we would have replaced the parameter μmax by its expression given by the cardinal secondary model of Rosso et al. (1993). Calculations were, however, made and we will present the graphs at the conference.

The advantages of the approach that we propose are highly significant and multiple. Three of them stand out in particular:

- all of the variability sources are taken into account to estimate the parameters,

- we can directly obtain these estimations without using the estimation of the μmax of the primary model,

- it is not necessary to know the initial values for the parameters since the range of variations (even very large ones) are sufficient.

Moreover, it can be observed that the estimation in two steps by nonlinear regression for models with many parameters does not often work because of a poor choice of initial values.

1 2 3 4 5 6 7 8 9 10

Figure 1: Evolution of the four-parameter estimates for 106 particles. The dashed lines represent the lower and upper bounds of a 95% confidence interval, while the central line represents the parameter estimation.

Conclusions

We think that the method described here can be used to estimate the parameters for very complicated models with many parameters, models often in demand in the food microbiology community (Augustin et al., 2005). Moreover, this method is currently being generalized to resolve the following problems: comparison and selection of dynamic models by Bayes factors estimated by filtering (Vila and Saley, 2009), rupture detection in dynamic models or, on the long term, perhaps even the regulation and control (for example, by means of temperature) of microbiological systems.

References

Augustin J.C., Zuliani V., Cornu M., Guillier L. (2005) Growth rate and growth probability of Listeria monocytogenes in dairy, meat and seafood products in suboptimal conditions. Journal of Applied Microbiology, 99, 1019-1042.

Baranyi J. and Roberts T.A. (1994) A dynamic approach to predicting bacterial growth in food. International Journal of Food Microbiology, 23, 277-294.

Baranyi J., Roberts, T.A. (1995) Mathematics of predictive food microbiology. International Journal of Food Microbiology, 26, 199-218.

Bidot C., Gauchi J.P., Vila J.P. (2009) Programmation Matlab du filtrage non linéaire par convolution de particules pour l’identification et l’estimation d'un système dynamique microbiologique. Rapport technique INRA/Jouy-en-Josas/MIA/ n°2009-1.

Rossi V. and Vila J.P. (2003) Filtrage non linéaire en temps discret par convolution de particules. Actes des XXXVèmes Journées de Statistique, 823-826.

Rossi V. and Vila J.P. (2005) Approche non paramétrique du filtrage de système non linéaire à temps discret et à paramètres inconnus. C.R. Acad.

Sci. Paris. Ser I 340, 759-764.

Rossi V. and Vila J.P. (2006) Nonlinear filtering in discrete time : a particle convolution approach. Inst. Stat. Univ. Paris, 3, 71-102.

Rosso L., Lobry J.R., Flandrois J.P. (1993) An unexpected correlation between cardinal temperatures of microbial growth highlighted by a new model. Journal of Theoretical Biology, 162, 447-463.

Vila J.P., Saley I. (2009) Bayes Factor estimation for nonlinear dynamic state space models. C.R. Acad. Sci., Paris, Ser. I 347, 429-434.

Vila J.P., Gauchi J.P., Bidot C. (2009) Identification de systèmes dynamiques microbiologiques complexes par filtrage non linéaire. Actes des 41èmes Journées de Statistique, Bordeaux, May 25-29, 2009.

Modelling the influence of free fatty acids on heat resistance of