Comparison of forecast models of production of dairy cows combining animal and diet parameters


HAL Id: hal-02358044

https://hal.archives-ouvertes.fr/hal-02358044v3

Submitted on 5 Feb 2020


Thong Nguyen, Remy Fouchereau, Emmanuel Frenod, Christine Gerard, Vincent Sincholle

To cite this version:

Thong Nguyen, Remy Fouchereau, Emmanuel Frenod, Christine Gerard, Vincent Sincholle. Comparison of forecast models of production of dairy cows combining animal and diet parameters. Computers and Electronics in Agriculture, Elsevier, 2020, 170, pp.105258. ⟨10.1016/j.compag.2020.105258⟩. ⟨hal-02358044v3⟩


Quoc Thong Nguyen^a, Rémy Fouchereau^b, Emmanuel Frénod^{a,b}, Christine Gerard^c, and Vincent Sincholle^c

a Université de Bretagne Sud, Laboratoire de Mathématiques de Bretagne Atlantique, UMR CNRS 6205, Campus de Tohannic, Vannes, France

b See-d, Parc Innovation Bretagne Sud, Vannes, France

c NEOVIA, France

Abstract

We study the effect of nutritional diet characteristics on lactating Holstein-Friesian dairy cows in Brittany, France, using data from 36 individuals. The relations between fat and protein content and milk yield are analyzed for our dataset: fat and protein production increase at a slower rate as milk yield increases. The importance of chemical composition for milk production is studied using a linear model. The data analysis confirms the importance of starch, crude fiber, and protein, which have a positive effect on milk production. This analysis also confirms previous studies on the effect of parity on production. Milk production forecasting is then investigated using both linear models and machine learning approaches (support vector machine, random forest, neural network). We study the performance of multiple linear regression and machine learning-based models in both non-autoregressive and autoregressive cases at the individual level.

The autoregressive models, which take into account the previously observed milk yield, significantly outperform the non-autoregressive approaches. The computational cost of each approach is also presented. While the random forest algorithm gives the best performance in both the non-autoregressive and autoregressive settings, the support vector machine algorithm gives a very close performance with substantially less computing time. The support vector machine is shown to be the best compromise between accuracy and computational cost.

∗ Corresponding author

Email address: quoc-thong.nguyen@univ-ubs.fr (Quoc Thong Nguyen)

Keywords: Milk production forecasting, Dairy modeling, Autoregression, Smart farming

1. Introduction

Milk production forecasting for dairy cows is essential to dairy farmers, both for management and for health monitoring. In the literature, many parametric models have been developed to model the lactation curve at the herd and individual levels [1, 2, 3, 4, 5, 6], and extended lactation in dairy production has also been studied [7, 8]. Recently, a number of modeling techniques for milk production forecasting have been shown to give highly accurate predictions with adaptability at the herd level [9, 10, 8]. The nonlinear autoregressive model with exogenous input using artificial neural networks, introduced by Murphy et al. [9], was shown to be the most effective milk-production model.

On the other hand, understanding the effect of the nutritional diet on milk production and milk quality is helpful not only in financial planning but also in the production of other dairy products, such as yogurt, cheese, and butter [11]. The importance of feed intake and diet for dairy cows has been investigated in recent years. For example, feed intake increases slowly at the beginning of lactation [12], and the effects of dietary starch concentration on the yield of milk and milk components were investigated by Boerman et al. [13].

Despite this, few studies address the individual cow level or milk forecasting based on nutrition for small-scale farms. Milk yield forecasting for each individual cow can benefit many applications, such as monitoring health conditions and detecting diseases, e.g. mastitis [14, 15]. Recently, Zhang et al. [16] conducted a study on the effect of parity weighting with a dataset from the south of Ireland, and Van Bebber et al. [17] applied the Kalman filter to monitoring dairy milk yields.

The aim of this study is to improve livestock farming, particularly milk production, by monitoring the performance of nutrition supplies. The first objective is to analyze the importance of the chemical composition of nutrition for the production, and the milk production monitoring, of dairy cattle in Brittany, France. Secondly, we compare the performance of different types of multiple linear regression and machine learning-based models for predicting the production of the individual cow. The practicability and suitability for industrial applications are also discussed.

The paper is organized as follows. Section 2 describes in detail the content of our dataset and presents the composition analysis. Section 3 briefly recalls and analyzes the linear regression models and machine learning algorithms. Section 4 focuses on the performance of the regression algorithms in forecasting. Concluding remarks are given in Section 5.

2. Data description and composition analysis

2.1. Data description

The empirical data were collected from 36 lactating Holstein-Friesian dairy cows on a research farm in Brittany, France, equipped with a robotic milking system. Over a ten-month period (from December 2015 to September 2016), 7691 valid milking records were collected. Each milking record contains the Daily Milk Yield (DMY), Days In Milk (DIM), parity information (first, second, third onward lactation, see Table 1), number of milkings per day, and the collective (corn silage, grass silage, wheat straw, soybean meal) or individual (pelleted feed distributed through an automatic feeder) consumption of diet components. Each cow is milked one to four times per day by the robotic milking system; a cow can be milked each time it comes to the freestall for food. In this experiment, the amounts of the diets given are changed every week. In this study, we are interested in the effect of the diet on milk production forecasting. In particular, the chemical components studied in this paper are starch, crude fiber, Net Energy (NE), expressed in Unité Fourragère Lait (UFL¹), and protein (PDIE²). Therefore, the consumption of the different diets was converted to these four chemical components. Table 2 presents the composition of each diet. It should be noted that, in Table 2, the consumption of the first eight diets (Corn silage, Grass silage, ..., Nitrogen supplement) is the same for all 36 dairy cows in a given week. On the other hand, the last four components (Production feed, ..., Liquid feed) in Table 2 are distributed by robot, which means that the consumption of these four components varies according to the milk production level of each individual cow. Therefore, the consumption of each individual may differ in a given week. In order to have a regular effect of each nutrient on milk production, we used weekly data instead of daily data, so that each data point is the average of seven days' observations. The statistical characteristics of the variables of interest are presented in Table 3.

¹ UFL and PDIE are respectively the units used in dairy production to estimate the available energy and protein supply to dairy cows, based on 1 UFL = 1.7 Mcal, see [18].

² Protéines Digestibles dans l'Intestin limitantes par l'apport d'Énergie: true protein absorbable in the small intestine when rumen fermentable energy (organic matter) is limiting microbial protein synthesis in the rumen [19].

Parity                    Number of cows
First lactation           20
Second lactation          13
Third onward lactation    3

Table 1: Number of individuals in each parity (lactation number).

Diet                                 DM* content  Protein    Starch     Crude fiber  NE           PDIE
                                     (%)          (g/kg DM)  (g/kg DM)  (g/kg DM)    (UFL/kg DM)  (g/kg DM)
Corn silage                          34.1         75         360        174          0.95         69
Grass silage                         23.4         141        0          231          0.92         63
Fescue                               88           93         0          222          0.76         82
Alfalfa hay                          91.8         160        0          169          0.72         93
Fresh grass                          18.3         167        0          217          0.94         90
Wheat straw                          88           35         0          420          0.42         44
Ears corn                            64           51         580        72           1.06         95
Nitrogen supplement                  88           455        0          170          1.09         278
Production feed                      88           273        114        14           1.17         205
Soluble nitrogen supplement          88           489        0          13           1.08         256
Ruminoprotected nitrogen supplement  88           443        0          13           1.08         273
Liquid feed                          100          0          0          0            2.20         0

* DM: Dry Matter

Table 2: Chemical composition of the different diets.

2.2. Milk fat and protein composition analysis

In this section, we analyze the correlation between fat and protein content and milk yield in the collected data. The yield of cheese and butter mainly depends on milk fat and protein yield. One factor that affects milk fat and protein concentration is milk yield [20]. It is well known that, in dairy ruminants, the correlations between fat and protein content (g per kg of milk) and milk yield are negative [21]. In our experiment, the correlation coefficients between milk yield and fat and protein content are −0.04 and −0.21, respectively. In our observed data, the fat and protein content decrease as milk yield increases, but not significantly. As shown in Figures 1a and 1c, the fat and protein content visibly decrease as milk yield increases up to 20 kg/day. This can be explained by the fact that, at the beginning of lactation, milk production increases more rapidly than the cow's capacity for consumption. Moreover, when dairy cows produce more milk, they consume more, especially water [22], but nutrient absorption cannot change as strongly.

Variable            Mean    SD+     Min     Max
Starch (kg)         0.185   0.124   0.000   0.451
Crude fiber (kg)    0.426   0.190   0.080   0.966
PDIE (kg)           0.730   0.304   0.159   1.683
Net energy (UFL)    3.692   1.630   0.672   8.046
Parity              1.631   0.972   1       5
Milking per day     2.731   0.541   1       5

+ SD: Standard deviation

Table 3: Statistical characteristics of the variables of interest.

Some studies have found that, as milk yield increases, fat and protein synthesis generally increase at a slower rate [23, 20]. This phenomenon can be described by the allometric model:

$$y = a x^{b},$$

where $y$ is the fat or protein yield (g/day), $x$ is the milk yield (kg/day), and $a$ and $b$ are equation coefficients. Parameter $b$ is a scaling factor describing the effect of milk yield variation on its two main constituents. With $b = 1$, fat or protein yield is proportional to milk yield, and its content in milk is equal to $a$; if $b > 1$, fat or protein yield increases more than proportionally with milk yield; and finally, if $b < 1$, fat or protein yield increases at a slower rate than milk yield.
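As a minimal sketch of how the exponent b can be estimated: the allometric model becomes linear after taking logarithms, ln y = ln a + b ln x, so ordinary least squares on the log-transformed data recovers a and b. The helper name `fit_allometric` and the synthetic data below are illustrative assumptions; the paper does not state which fitting procedure was used, and a log-log fit can differ slightly from a direct nonlinear fit when the data are noisy.

```python
import math


def fit_allometric(x, y):
    """Fit y = a * x**b by least squares on ln(y) = ln(a) + b * ln(x).

    Hypothetical helper for illustration; requires x > 0 and y > 0.
    """
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx              # slope in log-log space = scaling exponent
    a = math.exp(my - b * mx)  # intercept, transformed back from ln(a)
    return a, b


# Synthetic, noise-free example (not the paper's data)
milk = [10.0, 20.0, 30.0, 40.0, 50.0]
fat = [0.045 * m ** 0.964 for m in milk]
a, b = fit_allometric(milk, fat)
```

With b below 1, as in this synthetic example, the ratio y/x (the content) falls as x grows, which is the pattern the text describes.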

In Figures 1b and 1d, the application of this model to the data shows that fat and protein synthesis vary almost proportionally with milk output, with exponents of 0.964 and 0.910 for milk fat and milk protein, respectively. Thus, the higher the milk yield, the more cheese can be produced, even though each additional unit of milk results in a smaller increase in fat and protein. Moreover, in this dataset, since the relationship between milk fat and milk yield has higher variability than that between milk protein and milk yield (see Figure 1), modification of milk composition by nutritional means should be easier to achieve for fat than for protein.

[Figure 1: four scatter plots against milk yield (kg/d). (a) Fat concentration (g/kg): y = −0.032x + 40.904, R² = 0.002. (b) Fat yield (kg/d): y = 0.045x^0.964, R² = 0.814. (c) Protein concentration (g/kg): y = −0.082x + 34.724, R² = 0.044. (d) Protein yield (kg/d): y = 0.044x^0.91, R² = 0.91.]

Figure 1: Relationships between milk yield and (a) milk fat concentration, (b) milk fat yield, (c) milk protein concentration and (d) milk protein yield.

3. Modelization

In this section, we present the linear models used to analyze the effect of the features on milk production. In particular, the fitting performance of three linear regression methods (ridge, LASSO, elastic net) is compared. In addition, machine learning algorithms are introduced to predict milk production. The multiple linear model is also used for forecasting. We compare the multiple linear model with the machine learning approaches for milk prediction in the next section.

3.1. Multiple Linear Model

A mixed linear model for the milk yield observations is used. The model can be written as

$$y_{it} = \mathrm{MPD} + \mathrm{PAR} + \mathrm{ST} + \mathrm{CF} + \mathrm{NE} + \mathrm{PDIE} + f(t) + e_{it}, \qquad (1)$$

where $y_{it}$ = weekly average milk yield of cow $i$ at week $t$; MPD = fixed effect of Milking Per Day; PAR = fixed effect of parity; ST, CF, NE, PDIE are the fixed effects of the consumption of Starch (kg), Crude Fiber (kg), Net Energy (UFL), and PDIE (kg), respectively; and $e_{it}$ = random residual error. The residual errors are assumed to be independent of each other. The term $f(t)$ is a fixed function of week $t$ based on the Ali and Schaeffer model [2], which is used to fit the average shape of the lactation curve. The Ali and Schaeffer model has been shown to be one of the most effective milk yield predictors [24, 16]. It is written as:

$$f(t) = a_0 + a_1 \gamma_t + a_2 \gamma_t^2 + a_3 \omega_t + a_4 \omega_t^2,$$

where $\gamma_t = 7t/305$, $\omega_t = \ln(305/(7t))$, and $a_0, a_1, a_2, a_3, a_4$ are regression coefficients. The coefficient $a_0$ is associated with the overall level of the yield, $a_1$ and $a_2$ are associated with the increasing slope of the curve, and $a_3$ and $a_4$ represent the decreasing slope of the curve. In matrix notation, the model can be given as

$$y = Xb + e,$$

where $y$ is an $N \times 1$ vector of observed milk yields, $b$ is a $p \times 1$ vector of regression coefficients, $X$ is an $N \times p$ incidence matrix, and $e$ is an $N \times 1$ vector of residual effects. Many regression methods have been developed to estimate the coefficients and improve prediction accuracy. In many problems, when the number of variables is too large, model selection is needed to remove the less informative variables and reduce the computational cost. In other cases, when the variables are highly correlated, an additional constraint is required to prevent some coefficients from being poorly determined. In this study, we consider three common regression methods.
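The lactation-curve term f(t) of model (1) can be sketched as follows. This is only an illustration of the covariates defined above: the function names, the column order of the design row, and the treatment of parity as a single numeric column are assumptions, not details given in the paper.

```python
import math


def ali_schaeffer_terms(week):
    """Return [1, g, g^2, w, w^2] for lactation week t >= 1,
    with g = 7t/305 and w = ln(305/(7t))."""
    g = 7.0 * week / 305.0
    w = math.log(305.0 / (7.0 * week))
    return [1.0, g, g * g, w, w * w]


def lactation_curve(week, coeffs):
    """Evaluate f(t) = a0 + a1*g + a2*g^2 + a3*w + a4*w^2."""
    return sum(a * z for a, z in zip(coeffs, ali_schaeffer_terms(week)))


def design_row(week, mpd, parity, starch, fiber, ne, pdie):
    # One row of X for model (1): animal and diet covariates followed by
    # the five lactation-curve terms (column order is an assumption).
    return [mpd, parity, starch, fiber, ne, pdie] + ali_schaeffer_terms(week)


# Placeholder values taken from the means in Table 3, week chosen arbitrarily
row = design_row(week=10, mpd=2.731, parity=1, starch=0.185,
                 fiber=0.426, ne=3.692, pdie=0.730)
```

Because f(t) is linear in the coefficients a_0, ..., a_4, it can be estimated jointly with the other fixed effects by any of the penalized regressions described next.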

Ridge regression

Ridge regression is well suited when the features (the columns of $X$) are highly correlated [25, 26]. In particular, it performs well with many features each having a small effect, and it prevents the coefficients of many correlated variables from being poorly determined and exhibiting high variance. Ridge regression shrinks the coefficients of correlated features equally by penalizing them. The ridge regression estimator solves the regression problem using $L_2$ norm penalized least squares:

$$\hat{b} = \arg\min_{b} \; \|y - Xb\|_2^2 + \lambda \|b\|_2^2,$$

where $\|y - Xb\|_2^2 = \sum_{i=1}^{N} (y_i - x_i^\top b)^2$ is the $L_2$ norm loss function, $x_i^\top$ is the $i$-th row of the matrix $X$, $\|b\|_2^2 = \sum_{i=1}^{p} b_i^2$ is the $L_2$ norm penalty on $b$, and $\lambda \geq 0$ is the tuning parameter, which controls the degree of linear shrinkage. We recover ordinary least squares when $\lambda = 0$. A larger value of $\lambda$ leads to a greater amount of shrinkage. However, ridge regression does not set any component of $\hat{b}$ exactly to zero, no matter how large $\lambda$ is. The value of $\lambda$ depends on the data and can be determined optimally using cross-validation.
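For illustration, the ridge problem above has the closed-form solution given by the linear system (XᵀX + λI) b = Xᵀy. The following is a minimal pure-Python sketch of that formula on toy data; it is not the glmnet implementation used in the paper, which optimizes a differently scaled objective, and the helper names are hypothetical.

```python
def solve(A, rhs):
    """Solve A z = rhs by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z


def ridge(X, y, lam):
    """Closed-form ridge estimate: solve (X'X + lam*I) b = X'y."""
    p = len(X[0])
    gram = [[sum(row[j] * row[k] for row in X) + (lam if j == k else 0.0)
             for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(gram, xty)


# Two strongly correlated toy features (illustrative data only)
X = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]]
y = [2.1, 3.9, 6.2, 7.9]
b_ols = ridge(X, y, 0.0)     # lam = 0 reduces to ordinary least squares
b_ridge = ridge(X, y, 10.0)  # a larger lam shrinks the coefficients
```

In this toy example both ridge coefficients are smaller than their least-squares counterparts but neither is exactly zero, in line with the remark above.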

LASSO regression

The LASSO (least absolute shrinkage and selection operator) regression method is widely used for variable selection and in domains with massive datasets [27, 26]. The LASSO performs less well when the features are highly correlated. The method tends to choose a subset of the features: it shrinks some coefficients and sets the coefficients of other features to zero. The optimization problem for LASSO regression estimation with the $L_1$ norm penalty is written as follows:

$$\hat{b} = \arg\min_{b} \; \|y - Xb\|_2^2 + \lambda \|b\|_1,$$

where $\|b\|_1 = \sum_{i=1}^{p} |b_i|$ is the $L_1$ norm and $\lambda$ is the tuning parameter. The $L_1$ norm makes the LASSO regularize the least squares fit and shrink some components to zero. A suitable value for $\lambda$, which depends on the data, is selected by cross-validation.
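To show how the L1 penalty produces exact zeros, the sketch below minimizes the objective above by cyclic coordinate descent, where each coordinate update is a soft-thresholding step. This is one standard way to solve the problem, written here under the assumption of the unscaled objective ‖y − Xb‖² + λ‖b‖₁; it is not claimed to be the exact procedure or scaling used by the software in the paper, and the toy data are illustrative.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise ||y - Xb||_2^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual: remove the contribution of all features but j
            r = [y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            norm_sq = sum(X[i][j] ** 2 for i in range(n))
            # Setting the subgradient to zero gives a threshold of lam / 2
            b[j] = soft_threshold(rho, lam / 2.0) / norm_sq
    return b


# Toy data with orthogonal columns: a strong and a weak feature
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2, 3.0, 0.2]
b_unpenalised = lasso_cd(X, y, lam=0.0)  # least-squares fit
b_sparse = lasso_cd(X, y, lam=2.0)       # weak feature is dropped
```

In this example the strong feature is shrunk from 3.0 to 2.5 while the weak one is set exactly to zero, which is the selection behaviour described in the text.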

Elastic net regression

The elastic net regression method is an extension of the LASSO that is robust to extreme correlations among the features [28, 29]. The elastic net simultaneously performs automatic variable selection and continuous shrinkage, and groups of correlated variables can also be selected. The elastic net uses both the $L_1$ (LASSO) and $L_2$ (ridge) penalties; the optimization problem is formulated as follows:

$$\hat{b} = \arg\min_{b} \; \|y - Xb\|_2^2 + \lambda_1 \|b\|_1 + \lambda_2 \|b\|_2^2.$$

Let $\alpha = \lambda_2 / (\lambda_1 + \lambda_2)$; then the problem is equivalent to solving

$$\hat{b} = \arg\min_{b} \; \|y - Xb\|_2^2, \quad \text{subject to } (1 - \alpha)\|b\|_1 + \alpha \|b\|_2^2 \leq t \text{ for some } t.$$

The elastic net penalty $(1 - \alpha)\|b\|_1 + \alpha \|b\|_2^2$ is a convex combination of the LASSO and ridge penalties. The elastic net reduces to a simple ridge regression when $\alpha = 1$ and to a LASSO regression when $\alpha = 0$. The tuning parameter $t$ is determined by cross-validation for a given $\alpha$. The $L_1$ part performs automatic variable selection, while the $L_2$ part encourages grouped selection [26].
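To make the role of α explicit, the penalty of the Lagrangian form can be factored as below. This is only a restatement of the definitions above; the shorthand λ = λ₁ + λ₂ is introduced here for convenience and does not appear in the original text.

```latex
\lambda_1 \|b\|_1 + \lambda_2 \|b\|_2^2
  = (\lambda_1 + \lambda_2)
    \left[ \frac{\lambda_1}{\lambda_1 + \lambda_2} \|b\|_1
         + \frac{\lambda_2}{\lambda_1 + \lambda_2} \|b\|_2^2 \right]
  = \lambda \left[ (1 - \alpha) \|b\|_1 + \alpha \|b\|_2^2 \right],
\qquad
\lambda = \lambda_1 + \lambda_2, \quad
\alpha = \frac{\lambda_2}{\lambda_1 + \lambda_2}.
```

Setting α = 0 forces λ₂ = 0 and leaves only the L₁ term (LASSO), while α = 1 forces λ₁ = 0 and leaves only the L₂ term (ridge). Note that the glmnet package uses the opposite convention for its mixing parameter, where a value of 1 corresponds to the LASSO and 0 to ridge; with the value 0.5 the two conventions weight both penalties equally, up to the factor 1/2 that glmnet places on the L₂ term.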

Model validation and performance

With our dataset, we compare the performance of each linear regression method in fitting milk production with model (1). In this experiment, we fit the linear model using the publicly available R package glmnet [29]. The values of the tuning parameter are optimized by 10-fold cross-validation, with α = 0.5 for the elastic net regression method. The coefficients of the features of interest fitted by these methods are illustrated in Figure 2. The coefficient of the starch variable (kg) is large in all three methods. This result is consistent with previous studies [30, 13], in which production responded positively to an increase in starch concentration. As expected, the ridge method keeps all the features, while the LASSO and the elastic net shrink the coefficients of the consumption of PDIE (kg) and crude fiber (kg) to zero. This is because the correlations between PDIE, crude fiber, net energy, and starch are high (greater than 0.89). Table 4 shows the statistical results of fitting the lactation production with the linear regression methods. The elastic net gives a slightly better result but, in general, the performance of these methods is quite similar. In the next part, we analyze the performance of the linear model in forecasting milk production. A comparison with other machine learning methods is carried out as well.

Statistics  Ridge   LASSO   Elastic net
RMSE        3.23    3.15    3.12
SSE         10753   10240   10054
R²          0.86    0.87    0.87

Table 4: Statistical values of the linear model fitted using ridge, LASSO and elastic net: Root Mean Square Error (RMSE), Sum of Squared Errors (SSE), and R².

[Figure 2: three bar-chart panels (Ridge, LASSO, Elastic net) showing the coefficient (horizontal axis, 0 to 20) of each feature: Crude_fiber_kg, Milking_per_day, Net_Energy_UFL, Parity, PDIE_kg, Starch_kg.]

Figure 2: The coefficient of each feature estimated by ridge, LASSO, and elastic net (α = 0.5) regression.

3.2. Machine learning algorithms

For forecasting milk production, we investigate three machine learning algorithms in this study: support vector machine regression (SVR), artificial neural network (ANN), and random forest (RF). These algorithms have been applied in previous studies in the domain of agriculture [31, 32, 33, 34]. The multiple linear model is also used to predict milk production and is compared with these three machine learning algorithms.

Support vector regression

The Support Vector Machine is a supervised learning algorithm frequently applied in classification and regression analysis. The Support Vector Machine for function estimation is usually called Support Vector Regression [35]. Suppose we have training data $\{(x_1, y_1), \ldots, (x_n, y_n)\} \subset \mathcal{X} \times \mathbb{R}$, where $\mathcal{X}$ denotes the space of the input features (e.g. $\mathcal{X} = \mathbb{R}^d$). In ε-SV regression, the objective is to find a function $f(x)$ that has at most ε deviation from the observed value $y_i$ for all the training data, and that is at the same time as flat as possible. In the case of a non-linear SVR, the input data are mapped to a higher-dimensional Hilbert space $\mathcal{H}$ where the regression function can be constructed linearly. For the sake of presentation, we consider the linear case, in which the regression line is found by solving the following optimization problem:

$$
\begin{aligned}
\underset{w,\,\xi}{\text{minimize}} \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^{*}) \\
\text{subject to} \quad &
\begin{cases}
y_i - \langle w, x_i \rangle - b \leq \varepsilon + \xi_i, \\
\langle w, x_i \rangle + b - y_i \leq \varepsilon + \xi_i^{*}, \\
\xi_i, \xi_i^{*} \geq 0,
\end{cases}
\qquad \text{with } b \in \mathbb{R},
\end{aligned}
$$

where $w$ is the slope of the hyperplane and $\langle \cdot, \cdot \rangle$ denotes the dot product in $\mathcal{X}$. The slack variables $\xi_i, \xi_i^{*}$ are introduced for the "soft margin" loss function. The constant $C > 0$ determines the trade-off between the flatness of the function $f$ and the extent to which deviations larger than ε are tolerated. Figure 3 gives a graphical interpretation of a linear SVR. In the non-linear problem, a kernel function $k$ is responsible for computing the dot product in the high-dimensional space. In this study, we used the Gaussian, or radial basis function (RBF), kernel:

$$k(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right), \quad \text{with } x_i, x_j \in \mathcal{X}.$$

Figure 3: The soft margin loss setting for a linear SVR.

The parameters are tuned by 10-fold cross-validation using the R package 'e1071' [36]. For this dataset, the optimal parameters, in terms of smallest mean squared error, are C = 100 and γ = 0.01.
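The building blocks of the formulation above can be sketched in a few lines. The code below is not a solver and not the e1071 implementation; it only evaluates the RBF kernel, the ε-insensitive loss, and the primal objective for a given linear function. The value of ε used in the example is an assumption, since the paper reports only C and γ.

```python
import math


def rbf_kernel(xi, xj, gamma):
    """Gaussian (RBF) kernel k(xi, xj) = exp(-gamma * ||xi - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the eps-tube, linear outside: max(0, |y - f(x)| - eps)."""
    return max(0.0, abs(y_true - y_pred) - eps)


def svr_primal_objective(w, b, X, y, C, eps):
    """0.5 * ||w||^2 + C * (sum of slacks) for the linear f(x) = <w, x> + b.

    For fixed (w, b), the smallest feasible xi_i + xi_i^* equals the
    eps-insensitive loss of observation i.
    """
    flatness = 0.5 * sum(wk * wk for wk in w)
    slack = sum(
        eps_insensitive_loss(yi, sum(wk * xk for wk, xk in zip(w, xi)) + b, eps)
        for xi, yi in zip(X, y)
    )
    return flatness + C * slack


# Kernel value with the reported gamma = 0.01 on two made-up feature vectors
k_value = rbf_kernel([1.0, 2.0], [2.0, 4.0], gamma=0.01)

# Objective of the constant function f(x) = 0 with C = 100 and an assumed eps
objective = svr_primal_objective([0.0], 0.0, [[1.0], [2.0]], [0.05, 1.0],
                                 C=100.0, eps=0.1)
```

A large C, such as the value 100 selected here, makes points outside the ε-tube expensive, so the fitted function follows the training data more closely at the cost of flatness.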

Random forest

Random Forest [37] is an algorithm that learns from multiple decision trees grown on slightly different subsets of the data. The random forest algorithm can be applied to both classification and regression. The procedure consists of three stages [38]. The first stage is to create $n_{tree}$ bootstrap samples from the data. Specifically, each sample (bag) contains $N$ observations selected uniformly (with replacement) from the $N$ original observations. Then, for each sample, we grow a CART (Classification and Regression Tree) decision tree [39]. Instead of using all predictors, at each node of each tree, $m_{try}$ of the predictors are randomly selected, and the best split is chosen among those variables. Finally, for new data, the prediction is obtained by aggregating the predictions of the $n_{tree}$ trees, i.e., by averaging the predictions of all trees in the case of regression. An advantage of the Random Forest is that it can easily be applied to nonlinear cases. The R package 'randomForest', ported by Liaw et al. [38], is used in this paper. For our dataset, three repetitions of 10-fold cross-validation led to the selection of the parameters $n_{tree} = 2000$ and $m_{try} = 4$.
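The three stages described above can be sketched as follows. To keep the example short, each "tree" is a depth-one regression stump rather than a full CART tree, so this is a deliberately simplified stand-in and not the randomForest package; with a single node per tree, drawing the m_try predictors once per tree coincides with drawing them at each node. All names and the toy data are illustrative assumptions.

```python
import random


def fit_stump(X, y, features):
    """Best single split (feature, threshold) among `features`, by squared error."""
    best = None
    for j in features:
        # Candidate thresholds: every observed value except the largest,
        # so both sides of the split are non-empty
        for thr in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= thr]
            right = [yi for row, yi in zip(X, y) if row[j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - ml) ** 2 for v in left)
                   + sum((v - mr) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    if best is None:  # no valid split: always predict the sample mean
        mean = sum(y) / len(y)
        return (0, float("inf"), mean, mean)
    return best[1:]


def fit_forest(X, y, n_tree, m_try, seed=0):
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    forest = []
    for _ in range(n_tree):
        idx = [rng.randrange(n) for _ in range(n)]  # stage 1: bootstrap sample
        feats = rng.sample(range(p), m_try)         # stage 2: random predictors
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_forest(forest, x):
    # Stage 3: average the predictions of all trees (regression case)
    preds = [ml if x[j] <= thr else mr for j, thr, ml, mr in forest]
    return sum(preds) / len(preds)


# Toy data: the target jumps from 0 to 10 when the first feature reaches 10
X = [[float(i), float((7 * i) % 5)] for i in range(20)]
y = [0.0 if i < 10 else 10.0 for i in range(20)]
forest = fit_forest(X, y, n_tree=50, m_try=2)
low = predict_forest(forest, [0.0, 0.0])
high = predict_forest(forest, [19.0, 0.0])
```

Replacing the stump with a recursively grown tree, and redrawing the m_try predictors at every node, gives the procedure described in the text.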

Artificial neural network

As the name suggests, this is a connectionist system inspired by biological neural networks. It is also commonly known as the multilayer perceptron (MLP). A standard neural network consists of many connected nodes called neurons, which make up the input, hidden, and output layers. Each neuron produces a sequence of real-valued activations. The input values are multiplied by the synaptic weights, which represent the strength of the connections. The sum of these products is fed to each neuron in the hidden layer through a typically non-linear real-valued activation function, such as tanh or logistic [40, 41]. In the case of a single hidden layer, the values are then fed to the output-layer neuron via the activation function, which predicts the output value for each instance. Figure 4 depicts the fully connected artificial neural network. During the training process, MLPs employ backpropagation techniques to minimize the sum of squared errors [42].

In this paper, we investigate the fully connected feed-forward neural network with one hidden layer; the inputs are parity, DIM, ..., NE, and the output is the milk yield. The R package 'neuralnet' [43] is used to implement the network in our study. To avoid overfitting the training data, we tested a few configurations³ and selected the best by cross-validation. The optimal network, consisting of 4 neurons in the hidden layer, is used [9]. Resilient back-propagation with weight backtracking is applied to train the network. The logistic function in (2) is used as the activation function:

$$f(x) = \sigma(x) = \frac{1}{1 + e^{-x}}. \qquad (2)$$

[Figure 4: network diagram with inputs I₁, I₂, I₃, ..., I₇ (input layer), hidden neurons H₁, H₂, H₃, H₄ (hidden layer), and a single output Y (output layer).]

Figure 4: Artificial neural network with one hidden layer.
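A forward pass through the architecture of Figure 4 (7 inputs, 4 hidden neurons, 1 output) can be sketched as follows. The weights are random placeholders rather than trained values, and the use of a linear output neuron is an assumption made for this regression sketch; the text does not specify the output activation.

```python
import math
import random


def logistic(x):
    """Activation function of equation (2)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, W_hidden, b_hidden, w_out, b_out):
    """Forward pass: inputs -> logistic hidden layer -> linear output."""
    hidden = [logistic(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


rng = random.Random(0)
n_in, n_hidden = 7, 4  # layer sizes shown in Figure 4
W_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b_out = 0.0

x = [0.5] * n_in  # placeholder inputs, not real data
y_hat = forward(x, W_hidden, b_hidden, w_out, b_out)
```

Training then consists of adjusting `W_hidden`, `b_hidden`, `w_out`, and `b_out` to minimize the sum of squared errors between `y_hat` and the observed milk yield.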

4. Prediction performance comparison and discussion

To evaluate the prediction performance of the multiple linear regression (MLR) with elastic net regression and of the machine learning algorithms on this dataset, each cow is held out in turn: the training set is the dataset excluding the data of that individual, and the trained model is then used to predict the production of the excluded dairy cow. Moreover, the autoregressive versions of these methods are also investigated in this paper. The evaluation criteria chosen in this study are the Root Mean Squared Error (RMSE), the Mean Absolute Error (MAE), and the Coefficient of Determination (R²). In addition, we compare the computational costs of the models.

³ Configurations that were tested: 4, 5, 6, 7 neurons with logistic and ReLU activation functions.
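The evaluation scheme described above, in which every cow is excluded once, can be sketched as a generator of train/test splits. The record layout (dictionaries with a `cow` key) and the function name are assumptions made for this example.

```python
def leave_one_cow_out(records):
    """Yield (cow_id, train, test) with all rows of one cow held out.

    `records` is a list of dicts with a 'cow' key (hypothetical layout).
    """
    cow_ids = sorted({r["cow"] for r in records})
    for cow in cow_ids:
        train = [r for r in records if r["cow"] != cow]
        test = [r for r in records if r["cow"] == cow]
        yield cow, train, test


# Tiny made-up dataset: weekly average milk yield for three cows
records = [
    {"cow": 1, "week": 1, "dmy": 25.0},
    {"cow": 1, "week": 2, "dmy": 27.5},
    {"cow": 2, "week": 1, "dmy": 30.0},
    {"cow": 2, "week": 2, "dmy": 31.0},
    {"cow": 3, "week": 1, "dmy": 22.0},
]
splits = list(leave_one_cow_out(records))
```

With 36 cows this produces 36 fits per model, and the errors of each fit are computed only on the weeks of the excluded cow.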

The computer used in this study was a MacBook Pro with a 2.5 GHz Intel Core i7 processor and 16 GB of 1600 MHz DDR3 memory. Table 5 and Figure 5 present the RMSE, MAE, and R² values of the elastic net regression, SVR, random forest, and neural network forecasts for the dataset of 36 individual cows in the non-autoregressive case. Some R² values are negative. This is due to overestimation in the prediction. For instance, as shown in Figure 6, the overpredictions of milk yield for cow #16 produce a greater error than predicting the mean value would. However, the predictions reproduce the shape of the observations well: the correlation is 0.82. The negative R² values were set to R² = 0 in the subsequent analysis. The maximum and minimum RMSE values are 5.16 and 1.56 for the MLR, 4.61 and 1.44 for the SVR, 5.77 and 1.46 for the random forest, and 4.75 and 1.46 for the neural network. Table 6 shows the average errors of each model over all 36 individual cows. In general, the machine learning algorithms mostly outperform the MLR. The random forest and SVR give the most favorable results, and the random forest model is more accurate in terms of RMSE and MAE. Moreover, the random forest can compute internal estimates of variable importance (in percentage), which are reported in Table 7. As in the results of the MLR model, starch is the most important variable according to the random forest algorithm.
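The three criteria, and the way a negative R² can arise, can be illustrated with a small example. The numbers below are made up: the predictions follow the shape of the observations perfectly but are shifted upward by 5 kg, so the squared error exceeds that of simply predicting the mean. The function names are illustrative.

```python
import math


def rmse(y, y_hat):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def mae(y, y_hat):
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def r_squared(y, y_hat):
    """R^2 = 1 - SSE / SST; negative when the model is worse than the mean."""
    mean = sum(y) / len(y)
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean) ** 2 for a in y)
    return 1.0 - sse / sst


observed = [20.0, 22.0, 24.0, 26.0]
predicted = [25.0, 27.0, 29.0, 31.0]  # same shape, constant overprediction

err_rmse = rmse(observed, predicted)
err_mae = mae(observed, predicted)
r2 = r_squared(observed, predicted)
r2_clipped = max(0.0, r2)  # negative values set to 0, as done in the text
```

Here SSE is 100 while the total sum of squares is only 20, so R² is negative even though the correlation between predictions and observations is perfect, which mirrors the situation described for cow #16.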

[Table 5 here]

[Figure 5 here]

[Tables 6 and 7 here]

In addition, during our data collection, two cows had medical issues. In Figure 7, we present the lactation curves of these two individuals: cow #8 was diagnosed as lame in the 24th week of lactation, and cow #9 was diagnosed with mastitis in June 2016 and August 2016. We can observe that production changed at these points, and the predictions become less accurate around them. Because of the health condition, food consumption may vary, which leads to variation in the prediction. This observation is of interest for future studies on detecting potential health issues in each individual.

[Figures 6 and 7 here]

As shown in Table 8, the MLR has the shortest training time (in seconds) owing to its simplicity, while the neural network model is the most computationally expensive. The SVR has a substantially better computational time than the random forest. It also gives better results than the MLR. Therefore, in terms of both accuracy and computational cost, the SVR gives the most satisfactory result.

[Table 8 here]

A nonlinear autoregressive exogenous (NARX) model was applied to milk production forecasting at the herd level in the study by Murphy et al. [9]. In that study, the training data consisted of daily herd milk yield, days in milk, and number of cows milked, and the NARX was shown to be the most effective milk-production model. In our study, the autoregressive versions of the aforementioned models are also considered. The autoregressive models applied in our experiment are of order one. In particular, the record of the previous week is added to the prediction variables:

$$y_t = F(y_{t-1}, u_1, u_2, \ldots, u_p) + \varepsilon_t,$$

where $y_t$ is the average milk production record in week $t$, $\{u_1, u_2, \ldots, u_p\}$ are the other prediction variables, and $\varepsilon_t$ is the error term. Table 9 and Figure 8 present the errors of the autoregressive versions of all four forecasting models for the dataset of 36 individual cows. In nearly all cases, the autoregressive approach significantly improves the accuracy of the prediction models. For example, for individual cow #7, the RMSEs of the four models without autoregression are 2.44, 2.22, 2.81 and 2.67, respectively; with autoregression, the errors decrease to 1.88, 1.89, 2.35 and 1.80, respectively. However, for cow #35, the autoregressive models give larger errors; this may be caused by the status of that individual (e.g. a health problem). Therefore, milk yield forecasting could be applied to monitoring health conditions [14]. On average, Table 10 shows a substantial improvement in accuracy compared with the models without autoregression, and the R² values of the regression are mostly high. Moreover, as shown in Table 11, the internal estimates of variable importance computed by the random forest show that past information is the most important (62.78%), while starch remains an important variable (14.81%) compared with the rest.
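The order-one autoregressive setup above amounts to adding the previous week's yield as an extra input. A minimal sketch of that data preparation for one cow is given below; the record layout (keys `dmy` for the weekly yield and `u` for the other predictors), the function name, and the choice to drop the first week, which has no previous observation, are assumptions for illustration.

```python
def add_lag_one(series):
    """Build (features, target) pairs with y_{t-1} as the first feature.

    `series` holds the weekly rows of one cow, sorted by consecutive week.
    The first week is dropped because it has no previous observation.
    """
    rows = []
    for prev, cur in zip(series, series[1:]):
        features = [prev["dmy"]] + cur["u"]  # y_{t-1}, u_1, ..., u_p
        rows.append((features, cur["dmy"]))
    return rows


# Made-up weekly records for one cow; "u" stands for the other predictors
series = [
    {"week": 1, "dmy": 25.0, "u": [0.18, 0.42]},
    {"week": 2, "dmy": 27.0, "u": [0.19, 0.43]},
    {"week": 3, "dmy": 28.5, "u": [0.20, 0.41]},
]
lagged = add_lag_one(series)
```

Any of the four models can then be trained on these augmented features in place of F, which is how the same algorithms yield both the non-autoregressive and the autoregressive variants.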

PLEASE PUT THE TABLE 9 HERE

PLEASE PUT THE FIGURE 8 HERE

PLEASE PUT THE TABLES 10, 11 HERE

Table 12 presents the average training time for the autoregressive models. The random forest and the neural network still take considerably longer to train than the MLR and the SVR, so the SVR remains the best compromise between accuracy and computational cost. In practice, with a portable application, dairy farmers can extend and update the database in real time and train the model on the local dataset. The approach is therefore potentially suitable for industrial applications.
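A mean and standard deviation of training time, as reported here, can be collected with a small timing harness. The sketch below is an assumption about how such a measurement could be done, not the authors' procedure: `time_training` and `dummy_train` are hypothetical names, and the placeholder workload stands in for fitting one model. In the study the 36 experiments correspond to the 36 cows, whereas this sketch simply repeats one workload.

```python
import statistics
import time
from typing import Callable, Tuple


def time_training(train: Callable[[], object], repeats: int = 36) -> Tuple[float, float]:
    """Return (mean, sample standard deviation) of wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        train()
        durations.append(time.perf_counter() - start)
    return statistics.mean(durations), statistics.stdev(durations)


def dummy_train() -> int:
    # Placeholder workload standing in for fitting one model on one cow.
    return sum(i * i for i in range(10_000))


mean_s, sd_s = time_training(dummy_train)
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock intended for measuring short durations.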

PLEASE PUT THE TABLE 12 HERE

5. Concluding remarks

This is a small-scale study (36 milking cows) in Brittany, France. The correlation between fat and protein content and milk yield in the collected data indicates that the fat and protein content decreases as milk yield increases up to 20 kg/day. On this dataset, the analysis of the chemical composition of the diet shows that nutrient supply through the diet carries significant weight in the milk production level of dairy cattle, more than milk per day and parity.

Moreover, we compare the performance of linear regression models and machine learning models in forecasting milk production at the individual level. For each model, we investigate both the autoregressive and the non-autoregressive versions. On this dataset, the autoregressive models, which consider the previous observation, are shown to be significantly better than the non-autoregressive approaches: the information from the previous observation considerably improves the prediction accuracy.

Among the different methods, the random forest gives the best performance on 15 individuals, and the support vector machine gives the smallest errors on 13 dairy cows. The linear and neural network models show the best results on 5 and 3 individuals, respectively. However, the computational time of the SVR is significantly lower than that of the random forest. Therefore, support vector regression is the most efficient of the compared methods for predicting milk production in terms of both prediction accuracy and computational cost. The result indicates the possibility of practical application on a small-scale farm with a small number of dairy cows. However, the autoregressive models require the previous observation, so the non-autoregressive approaches are more practical when past observations are not available or when a long-horizon prediction is considered. Further research on other breeds of dairy cows, with larger populations and over longer time periods, is required to investigate the potential of using these models for accurate health monitoring at the individual cow level.

Acknowledgments

This research has been financed by Conseil régional Bretagne and FEDER Bretagne within the project NUTGEN of the Université de Bretagne Sud.

References

[1] P. Wood, Algebraic model of the lactation curve in cattle, Nature 216 (1967) 164–165.

[2] T. Ali, L. Schaeffer, Accounting for covariances among test day milk yields in dairy cows, Canadian Journal of Animal Science 67 (1987) 637–644.

[3] J. Wilmink, Adjustment of lactation yield for age at calving in relation to level of production, Livestock Production Science 16 (1987) 321–334.

[4] L. Schaeffer, Application of random regression models in animal breeding, Livestock Production Science 86 (2004) 35–45.

[5] A. Silvestre, A. Martins, V. Santos, M. Ginja, J. Colaço, Lactation curves for milk, fat and protein in dairy cows: A full approach, Livestock Science 122 (2009) 308–313.

[6] S. Adediran, D. Ratkowsky, D. Donaghy, A. Malau-Aduli, Comparative evaluation of a new lactation curve model for pasture-based Holstein-Friesian dairy cows, Journal of Dairy Science 95 (2012) 5344–5356.

[7] M. Mellado, J. Flores, A. De Santiago, F. Veliz, U. Macías-Cruz, L. Avendaño-Reyes, J. García, Extended lactation in high-yielding Holstein cows: Characterization of milk yield and risk factors for lactations > 450 days, Livestock Science 189 (2016) 50–55.

[8] J. O. Lehmann, L. Mogensen, T. Kristensen, Extended lactations in dairy production: Economic, productivity and climatic impact at herd, farm and sector level, Livestock Science 220 (2019) 100–110.

[9] M. Murphy, M. O’Mahony, L. Shalloo, P. French, J. Upton, Comparison of modelling techniques for milk-production forecasting, Journal of Dairy Science 97 (2014) 3352–3363.

[10] F. Zhang, M. D. Murphy, L. Shalloo, E. Ruelle, J. Upton, An automatic model configuration and optimization system for milk production forecasting, Computers and Electronics in Agriculture 128 (2016) 100–111.

[11] S. Nickerson, Milk production: Factors affecting milk composition, in: Milk quality, Springer, 1995, pp. 3–24.

[12] I. Harder, E. Stamer, W. Junge, G. Thaller, Lactation curves and model evaluation for feed intake and energy balance in dairy cows, Journal of Dairy Science (2019).

[13] J. Boerman, S. Potts, M. VandeHaar, M. Allen, A. Lock, Milk production responses to a change in dietary starch concentration vary by production level in dairy cattle, Journal of Dairy Science 98 (2015) 4698–4706.

[14] F. Andersen, O. Østerås, O. Reksen, Y. T. Gröhn, Mastitis and the shape of the lactation curve in Norwegian dairy cows, Journal of Dairy Research 78 (2011) 23–31.

[15] D. B. Jensen, M. van der Voort, H. Hogeveen, Dynamic forecasting of individual cow milk yield in automatic milking systems, Journal of Dairy Science 101 (2018) 10428–10439.

[16] F. Zhang, J. Upton, L. Shalloo, M. Murphy, Effect of parity weighting on milk production forecast models, Computers and Electronics in Agriculture 157 (2019) 589–603.

[17] J. Van Bebber, N. Reinsch, W. Junge, E. Kalm, Monitoring daily milk yields with a recursive test day repeatability model (Kalman filter), Journal of Dairy Science 82 (1999) 2421–2429.

[18] M. Vermorel, Energy: the feed unit systems, in: Ruminant nutrition: recommended allowances and feed tables, INRA Publications, Paris, 1989, p. 28.

[19] S. A. Kadi, F. Djellal, M. Berchiche, Caractérisation de la conduite alimentaire des vaches laitières dans la région de Tizi-Ouzou, Algérie, Livestock Research for Rural Development 19 (2007).

[20] G. Pulina, A. Nudda, G. Battacone, A. Cannas, Effects of nutrition on the contents of fat, protein, somatic cells, aromatic compounds, and undesirable substances in sheep milk, Animal Feed Science and Technology 131 (2006) 255–291.

[21] R. Emery, Milk fat depression and the influence of diet on milk composition, The Veterinary Clinics of North America. Food Animal Practice 4 (1988) 289–305.

[22] U. Meyer, M. Everinghoff, D. Gädeken, G. Flachowsky, Investigations on the water intake of lactating dairy cows, Livestock Production Science 90 (2004) 117–121.

[23] G. Pulina, N. Macciotta, A. Nudda, Milk composition and feeding in the Italian dairy sheep, Italian Journal of Animal Science 4 (2005) 5–14.

[24] V. Olori, S. Brotherstone, W. Hill, B. McGuirk, Fit of standard models of the lactation curve to weekly records of milk production of cows in a single herd, Livestock Production Science 58 (1999) 55–63.

[25] A. E. Hoerl, R. W. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12 (1970) 55–67.

[26] J. O. Ogutu, T. Schulz-Streeck, H.-P. Piepho, Genomic selection using regularized linear regression models: ridge regression, lasso, elastic net and their extensions, BMC Proceedings 6 (2012) S10.

[27] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological) 58 (1996) 267–288.

[28] H. Zou, T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2005) 301–320.

[29] J. Friedman, T. Hastie, R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, Journal of Statistical Software 33 (2010) 1.

[30] A. Cabrita, R. Bessa, S. Alves, R. Dewhurst, A. Fonseca, Effects of dietary protein and starch on intake, milk production, and milk fatty acid profiles of dairy cows fed corn silage-based diets, Journal of Dairy Science 90 (2007) 1429–1439.

[31] C. Kamphuis, H. Mollenhorst, A. Feelders, D. Pietersma, H. Hogeveen, Decision-tree induction to detect clinical mastitis with automatic milking, Computers and Electronics in Agriculture 70 (2010) 60–68.

[32] K. Saruta, Y. Hirai, K. Tanaka, E. Inoue, T. Okayasu, M. Mitsuoka, Predictive models for yield and protein content of brown rice using support vector machine, Computers and Electronics in Agriculture 99 (2013) 93–100.

[33] B. Barrett, I. Nitze, S. Green, F. Cawkwell, Assessment of multi-temporal, multi-sensor radar and ancillary spatial data for grasslands monitoring in Ireland using machine learning approaches, Remote Sensing of Environment 152 (2014) 109–124.

[34] P. Shine, M. D. Murphy, J. Upton, T. Scully, Machine-learning algorithms for predicting on-farm direct water and electricity consumption on pasture based dairy farms, Computers and Electronics in Agriculture 150 (2018) 74–87.

[35] A. J. Smola, B. Schölkopf, A tutorial on support vector regression, Statistics and Computing 14 (2004) 199–222.

[36] D. Meyer, E. Dimitriadou, K. Hornik, A. Weingessel, F. Leisch, C.-C. Chang, C.-C. Lin, M. D. Meyer, Package ‘e1071’, The R Journal (2019).

[37] L. Breiman, Random forests, Machine Learning 45 (2001) 5–32.

[38] A. Liaw, M. Wiener, et al., Classification and regression by randomForest, R News 2 (2002) 18–22.

[39] L. Breiman, Classification and regression trees, Routledge, 2017.

[40] P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger, Springer Series in Statistics, Springer, 2009.

[41] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Networks 61 (2015) 85–117.

[42] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation 1 (1989) 541–551.

[43] F. Günther, S. Fritsch, neuralnet: Training of neural networks, The R Journal 2 (2010) 30–38.

| Cow ID | MLR RMSE | MLR MAE | MLR R² | SVR RMSE | SVR MAE | SVR R² | Random forest RMSE | Random forest MAE | Random forest R² | Neural network RMSE | Neural network MAE | Neural network R² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 3.24 | 2.75 | 0.74 | 2.17 | 1.90 | 0.88 | 2.13 | 1.78 | 0.89 | 2.98 | 2.62 | 0.78 |
| 2 | 3.11 | 2.51 | 0.55 | 2.86 | 2.07 | 0.62 | 2.68 | 1.84 | 0.67 | 2.6 | 1.89 | 0.69 |
| 3 | 4.09 | 3.08 | 0.65 | 3.63 | 2.48 | 0.72 | 3.52 | 2.39 | 0.74 | 4.02 | 2.91 | 0.66 |
| 4 | 3.34 | 2.51 | 0.54 | 3.65 | 2.96 | 0.45 | 2.70 | 2.05 | 0.70 | 3.85 | 3.15 | 0.39 |
| 5 | 2.42 | 1.71 | 0.86 | 2.41 | 1.81 | 0.86 | 1.46 | 1.18 | 0.95 | 2.19 | 1.66 | 0.89 |
| 6 | 2.74 | 2.30 | 0.68 | 2.86 | 2.16 | 0.65 | 2.32 | 1.57 | 0.77 | 2.64 | 2.15 | 0.70 |
| 7 | 2.44 | 1.99 | 0.81 | 2.22 | 1.70 | 0.84 | 2.81 | 2.17 | 0.74 | 2.67 | 1.96 | 0.77 |
| 8 | 3.96 | 3.22 | 0.77 | 4.44 | 3.67 | 0.71 | 3.70 | 3.14 | 0.80 | 4.75 | 3.58 | 0.67 |
| 9 | 4.28 | 3.84 | 0.67 | 3.58 | 2.40 | 0.77 | 3.79 | 2.72 | 0.74 | 3.43 | 2.36 | 0.79 |
| 10 | 4.72 | 3.88 | 0.58 | 3.46 | 2.93 | 0.78 | 5.77 | 4.81 | 0.37 | 3.75 | 3.18 | 0.74 |
| 11 | 1.87 | 1.51 | 0.90 | 2.41 | 1.96 | 0.83 | 2.22 | 1.78 | 0.86 | 2.33 | 1.82 | 0.84 |
| 12 | 4.72 | 3.83 | 0 | 3.44 | 2.67 | 0.38 | 3.58 | 2.85 | 0.33 | 3.48 | 2.82 | 0.37 |
| 13 | 3.52 | 2.85 | 0.15 | 3.04 | 2.23 | 0.37 | 3.42 | 2.18 | 0.19 | 2.3 | 1.75 | 0.64 |
| 14 | 2.81 | 2.26 | 0.83 | 3.14 | 2.28 | 0.79 | 1.84 | 1.54 | 0.93 | 3.02 | 2.18 | 0.8 |
| 15 | 5.16 | 4.41 | 0.04 | 3.25 | 2.44 | 0.62 | 3.5 | 2.69 | 0.56 | 3.28 | 2.57 | 0.61 |
| 16 | 3.34 | 3.06 | 0 | 3.02 | 2.51 | 0 | 2.41 | 1.88 | 0 | 2.99 | 2.59 | 0 |
| 17 | 2.91 | 2.48 | 0.87 | 3.52 | 2.74 | 0.81 | 3.47 | 2.43 | 0.82 | 3.18 | 2.63 | 0.85 |
| 18 | 4.38 | 3.79 | 0.17 | 3.96 | 3.2 | 0.32 | 3.28 | 2.55 | 0.53 | 3.55 | 2.76 | 0.45 |
| 19 | 4.06 | 2.70 | 0 | 4.61 | 2.86 | 0 | 4.49 | 2.97 | 0 | 3.74 | 2.62 | 0.001 |
| 20 | 2.94 | 1.98 | 0.11 | 2.47 | 1.58 | 0.38 | 2.40 | 1.60 | 0.41 | 2.40 | 1.47 | 0.41 |
| 21 | 2.84 | 2.25 | 0.67 | 1.71 | 1.30 | 0.88 | 2.01 | 1.20 | 0.83 | 1.70 | 1.27 | 0.88 |
| 22 | 3.42 | 2.95 | 0.64 | 2.42 | 2.13 | 0.82 | 2.26 | 1.84 | 0.84 | 3.49 | 3.02 | 0.62 |
| 23 | 2.75 | 2.28 | 0.70 | 2.45 | 1.99 | 0.76 | 2.10 | 1.48 | 0.82 | 2.40 | 1.86 | 0.77 |
| 24 | 2.56 | 2.23 | 0.72 | 2.02 | 1.53 | 0.83 | 1.85 | 1.39 | 0.85 | 2.29 | 1.67 | 0.78 |
| 25 | 2.00 | 1.53 | 0.73 | 1.44 | 1.16 | 0.86 | 2.17 | 1.52 | 0.68 | 1.57 | 1.34 | 0.83 |
| 26 | 1.76 | 1.47 | 0.95 | 2.66 | 2.15 | 0.88 | 2.03 | 1.69 | 0.93 | 2.28 | 1.96 | 0.91 |
| 27 | 3.36 | 2.73 | 0.57 | 2.29 | 1.77 | 0.80 | 2.59 | 1.90 | 0.74 | 2.67 | 2.15 | 0.73 |
| 28 | 1.56 | 1.26 | 0.92 | 1.97 | 1.54 | 0.87 | 1.96 | 1.63 | 0.87 | 1.73 | 1.50 | 0.90 |
| 29 | 3.86 | 2.75 | 0.40 | 4.23 | 2.92 | 0.28 | 4.33 | 2.48 | 0.24 | 4.20 | 2.81 | 0.29 |
| 30 | 1.65 | 1.41 | 0.81 | 1.70 | 1.34 | 0.80 | 2.64 | 2.10 | 0.52 | 1.46 | 1.02 | 0.85 |
| 31 | 3.15 | 2.45 | 0.80 | 3.44 | 2.61 | 0.76 | 3.66 | 2.30 | 0.73 | 3.54 | 2.78 | 0.75 |
| 32 | 2.29 | 1.71 | 0.82 | 2.28 | 1.77 | 0.83 | 1.93 | 1.45 | 0.87 | 2.50 | 2.00 | 0.79 |
| 33 | 2.69 | 2.17 | 0.43 | 3.44 | 2.69 | 0.07 | 4.68 | 3.53 | 0 | 4.46 | 3.63 | 0 |
| 34 | 2.16 | 1.72 | 0.90 | 1.81 | 1.39 | 0.93 | 2.13 | 1.66 | 0.90 | 2.54 | 2.05 | 0.86 |
| 35 | 3.24 | 2.89 | 0.77 | 2.83 | 2.36 | 0.82 | 2.23 | 1.90 | 0.89 | 3.29 | 2.57 | 0.76 |
| 36 | 2.36 | 1.40 | 0.89 | 2.45 | 1.54 | 0.89 | 2.20 | 1.67 | 0.91 | 2.79 | 1.77 | 0.85 |

Table 5: The forecast error of four models for 36 individual cows.

[Figure: three panels plotting Root Mean Square Error, Mean Absolute Error and R² values against Cow ID for MLR, SVR, RF and ANN.]

| | Elastic regression | SVR | Random forest | Neural Network |
|---|---|---|---|---|
| RMSE | 3.103 | 2.868 | 2.842 | 2.947 |
| MAE | 2.496 | 2.187 | 2.107 | 2.279 |
| R² | 0.664 | 0.712 | 0.734 | 0.704 |

Table 6: Average error of each model for all 36 individual cows.

Figure 6: The observations and predictions of milk production of cow number 16 using MLR (milk yield against week, observed and predicted data); the R² value is -1.46.
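A negative R² such as the one reported for cow 16 is possible when R² is computed as 1 − SS_res/SS_tot: the value drops below zero whenever the predictions fit worse than the mean of the observations. The sketch below assumes that usual definition (the formula is not stated in this part of the text) and uses hypothetical data, not records from the study.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


def mae(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, 1 - SS_res / SS_tot (can be negative)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical weekly yields: the forecast is biased upward by several kg.
observed = [20.0, 22.0, 21.0, 23.0]
predicted = [25.0, 26.0, 27.0, 26.0]

r2 = r_squared(observed, predicted)  # negative: worse than predicting the mean
```

Under this definition a model that always predicts the observed mean scores exactly 0, which is why a strongly biased forecast can fall well below it.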

| Parity | DIM | MPD | Starch | Crude fiber | PDIE | NE |
|---|---|---|---|---|---|---|
| 11.21 | 8.96 | 11.35 | 34.87 | 15.44 | 6.58 | 15.16 |

Table 7: Average of variable importance estimated by random forest (in %).

| | Elastic regression | SVR | Random forest | Neural Network |
|---|---|---|---|---|
| mean | 0.077 | 0.157 | 6.771 | 7.357 |
| SD | 0.005 | 0.007 | 0.175 | 4.754 |

Table 8: Average training time (in seconds) and its standard deviation for 36 experiments.

Figure 7: Two individual cows that had medical issues during the experiment: one had lameness (left), while the other had mastitis (right).

| Cow ID | MLR RMSE | MLR MAE | MLR R² | SVR RMSE | SVR MAE | SVR R² | Random forest RMSE | Random forest MAE | Random forest R² | Neural network RMSE | Neural network MAE | Neural network R² |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.93 | 1.72 | 0.91 | 1.36 | 1.09 | 0.95 | 1.37 | 1.07 | 0.95 | 1.34 | 1.14 | 0.95 |
| 2 | 1.81 | 1.55 | 0.85 | 1.72 | 1.27 | 0.86 | 2.14 | 1.57 | 0.79 | 1.81 | 1.49 | 0.85 |
| 3 | 3.79 | 2.61 | 0.70 | 4.20 | 2.80 | 0.63 | 4.26 | 2.95 | 0.62 | 3.51 | 2.28 | 0.74 |
| 4 | 2.69 | 2.23 | 0.70 | 2.75 | 2.38 | 0.69 | 2.17 | 1.45 | 0.81 | 2.86 | 2.41 | 0.66 |
| 5 | 1.72 | 1.30 | 0.93 | 1.81 | 1.37 | 0.92 | 1.47 | 1.19 | 0.95 | 1.84 | 1.38 | 0.92 |
| 6 | 2.15 | 1.59 | 0.80 | 2.20 | 1.30 | 0.79 | 1.51 | 1.13 | 0.90 | 2.61 | 1.53 | 0.71 |
| 7 | 1.88 | 1.56 | 0.89 | 1.89 | 1.57 | 0.88 | 2.35 | 1.85 | 0.82 | 1.80 | 1.49 | 0.89 |
| 8 | 2.48 | 1.74 | 0.91 | 3.61 | 3.05 | 0.81 | 3.06 | 2.48 | 0.86 | 2.55 | 1.83 | 0.91 |
| 9 | 3.12 | 2.26 | 0.82 | 2.71 | 1.74 | 0.87 | 3.15 | 2.24 | 0.82 | 2.93 | 2.08 | 0.84 |
| 10 | 3.17 | 2.4 | 0.81 | 2.91 | 2.47 | 0.84 | 3.62 | 2.76 | 0.75 | 4.12 | 3.76 | 0.68 |
| 11 | 1.60 | 1.28 | 0.93 | 1.78 | 1.27 | 0.91 | 1.32 | 1.02 | 0.95 | 1.90 | 1.44 | 0.89 |
| 12 | 2.76 | 2.07 | 0.60 | 2.26 | 1.51 | 0.73 | 2.46 | 1.90 | 0.68 | 2.47 | 1.70 | 0.68 |
| 13 | 2.62 | 2.08 | 0.53 | 2.44 | 2.01 | 0.59 | 2.74 | 2.19 | 0.49 | 2.61 | 2.16 | 0.53 |
| 14 | 1.91 | 1.62 | 0.92 | 2.63 | 2.12 | 0.85 | 2.10 | 1.71 | 0.90 | 2.36 | 2.00 | 0.88 |
| 15 | 3.29 | 2.77 | 0.61 | 2.68 | 2.00 | 0.74 | 2.71 | 2.06 | 0.74 | 2.77 | 2.31 | 0.72 |
| 16 | 2.08 | 1.78 | 0.04 | 1.67 | 1.29 | 0.38 | 1.85 | 1.39 | 0.24 | 1.63 | 1.35 | 0.42 |
| 17 | 1.77 | 1.44 | 0.95 | 2.07 | 1.61 | 0.94 | 2.03 | 1.54 | 0.94 | 1.97 | 1.65 | 0.94 |
| 18 | 2.50 | 1.95 | 0.73 | 2.08 | 1.43 | 0.81 | 2.36 | 1.64 | 0.76 | 2.54 | 1.89 | 0.72 |
| 19 | 2.60 | 1.82 | 0.52 | 3.18 | 2.23 | 0.28 | 2.82 | 2.06 | 0.43 | 3.10 | 2.08 | 0.32 |
| 20 | 1.66 | 1.30 | 0.72 | 1.29 | 1.06 | 0.83 | 1.46 | 1.16 | 0.78 | 1.67 | 1.32 | 0.71 |
| 21 | 2.31 | 1.53 | 0.78 | 1.89 | 1.13 | 0.85 | 1.58 | 1.02 | 0.90 | 1.75 | 1.21 | 0.88 |
| 22 | 2.10 | 1.71 | 0.86 | 1.55 | 1.28 | 0.92 | 1.58 | 1.33 | 0.92 | 1.77 | 1.48 | 0.90 |
| 23 | 1.98 | 1.44 | 0.84 | 1.85 | 1.23 | 0.86 | 1.84 | 1.24 | 0.87 | 1.94 | 1.43 | 0.85 |
| 24 | 1.58 | 1.32 | 0.89 | 1.56 | 1.16 | 0.90 | 1.27 | 1.05 | 0.93 | 1.69 | 1.29 | 0.88 |
| 25 | 1.79 | 1.45 | 0.78 | 1.63 | 1.28 | 0.82 | 1.81 | 1.28 | 0.78 | 1.66 | 1.37 | 0.81 |
| 26 | 2.57 | 1.89 | 0.89 | 2.74 | 2.15 | 0.87 | 2.24 | 1.88 | 0.92 | 2.97 | 2.25 | 0.85 |
| 27 | 2.03 | 1.68 | 0.84 | 1.24 | 0.95 | 0.94 | 1.10 | 0.79 | 0.95 | 1.43 | 1.19 | 0.92 |
| 28 | 1.97 | 1.46 | 0.87 | 2.08 | 1.43 | 0.86 | 2.13 | 1.37 | 0.85 | 1.85 | 1.30 | 0.89 |
| 29 | 2.43 | 1.79 | 0.76 | 2.66 | 1.62 | 0.72 | 2.82 | 1.50 | 0.68 | 3.42 | 1.91 | 0.53 |
| 30 | 1.46 | 1.23 | 0.85 | 1.42 | 1.10 | 0.86 | 1.61 | 1.11 | 0.82 | 1.46 | 1.22 | 0.85 |
| 31 | 2.98 | 2.33 | 0.82 | 2.77 | 2.14 | 0.84 | 2.21 | 1.72 | 0.90 | 2.90 | 2.20 | 0.83 |
| 32 | 1.77 | 1.39 | 0.90 | 1.81 | 1.36 | 0.89 | 1.30 | 1.04 | 0.94 | 1.88 | 1.53 | 0.88 |
| 33 | 1.59 | 1.28 | 0.80 | 1.85 | 1.56 | 0.73 | 1.90 | 1.42 | 0.71 | 1.67 | 1.49 | 0.78 |
| 34 | 1.55 | 1.14 | 0.95 | 1.45 | 1.00 | 0.95 | 1.77 | 1.29 | 0.93 | 1.77 | 1.28 | 0.93 |
| 35 | 4.48 | 2.42 | 0.55 | 4.38 | 2.41 | 0.57 | 4.17 | 2.35 | 0.61 | 4.61 | 2.39 | 0.53 |
| 36 | 2.40 | 1.69 | 0.89 | 2.22 | 1.72 | 0.91 | 2.03 | 1.52 | 0.92 | 2.61 | 1.88 | 0.87 |

Table 9: The forecast error of four autoregressive models for 36 individual cows.
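Counting how often each model gives the smallest error across cows can be done with a simple tally over the rows of Table 9. The sketch below uses RMSE as the criterion and only three rows (cows 1, 7 and 35) copied from the table; which metric and which table underlie the per-model counts quoted in the concluding remarks is an assumption here, and `best_model` is a hypothetical helper.

```python
from collections import Counter
from typing import Dict

# RMSE values copied from Table 9 for three cows (autoregressive models).
rmse_by_cow: Dict[int, Dict[str, float]] = {
    1: {"MLR": 1.93, "SVR": 1.36, "RF": 1.37, "ANN": 1.34},
    7: {"MLR": 1.88, "SVR": 1.89, "RF": 2.35, "ANN": 1.80},
    35: {"MLR": 4.48, "SVR": 4.38, "RF": 4.17, "ANN": 4.61},
}


def best_model(errors: Dict[str, float]) -> str:
    """Return the model name with the smallest error (first one wins on ties)."""
    return min(errors, key=errors.get)


# Number of cows on which each model has the lowest RMSE.
wins = Counter(best_model(errors) for errors in rmse_by_cow.values())
```

Ties (for example cow 30, where two models share the same RMSE) are resolved by dictionary order in this sketch, so a full tally would need an explicit tie rule.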

[Figure: three panels plotting Root Mean Square Error, Mean Absolute Error and R² values against Cow ID for MLR, SVR, RF and ANN.]

| | Elastic regression | SVR | Random forest | Neural Network |
|---|---|---|---|---|
| RMSE | 2.292 | 2.231 | 2.175 | 2.327 |
| MAE | 1.745 | 1.641 | 1.590 | 1.742 |
| R² | 0.782 | 0.801 | 0.801 | 0.782 |

Table 10: Average error of each autoregressive model for all 36 individual cows.
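The size of the average improvement can be made concrete by comparing the mean RMSE values of Table 6 (without autoregression) and Table 10 (with autoregression). The short calculation below is a sketch added for illustration; the variable names are hypothetical and the numbers are copied from the two tables.

```python
# Average RMSE per model, copied from Table 6 (without autoregression)
# and Table 10 (with autoregression).
rmse_plain = {
    "Elastic regression": 3.103,
    "SVR": 2.868,
    "Random forest": 2.842,
    "Neural network": 2.947,
}
rmse_ar = {
    "Elastic regression": 2.292,
    "SVR": 2.231,
    "Random forest": 2.175,
    "Neural network": 2.327,
}


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which the error shrinks when going from before to after."""
    return (before - after) / before


# Percentage reduction in average RMSE for each model.
reduction_pct = {
    model: 100.0 * relative_reduction(rmse_plain[model], rmse_ar[model])
    for model in rmse_plain
}
```

With these table values the average RMSE falls by roughly 21% to 26% depending on the model, with the largest relative gain for the linear model.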

| Parity | DIM | MPD | Starch | Crude fiber | PDIE | NE | $y_{t-1}$ |
|---|---|---|---|---|---|---|---|
| 2.32 | 4.28 | 5.40 | 14.81 | 5.11 | 2.62 | 6.29 | 62.78 |

Table 11: Average of variable importance estimated by random forest (in %).

| | Elastic regression | SVR | Random forest | Neural Network |
|---|---|---|---|---|
| mean | 0.083 | 0.182 | 7.240 | 6.862 |
| SD | 0.009 | 0.007 | 0.152 | 2.919 |

Table 12: Average training time (in seconds) and its standard deviation for 36 experiments.


∗ Corresponding author

Email address: quoc-thong.nguyen@univ-ubs.fr (Quoc Thong Nguyen)

promise between accuracy and computational cost.

Keywords: Milk production forecasting, Dairy modeling, Autoregression, Smart farming

1. Introduction

Milk production forecasting for dairy cows is useful to dairy farmers in management as well as in health monitoring. In the literature, many parametric models have been developed to model the lactation curve at the herd and individual level [1, 2, 3, 4, 5, 6], and other studies address extended lactation in dairy production [7, 8]. Recently, a number of modeling techniques for milk production forecasting have been shown to obtain highly accurate predictions with adaptability at the herd level [9, 10, 8]. The nonlinear autoregressive model with exogenous input using artificial neural networks introduced by Murphy et al. [9] was shown to be the most effective milk-production model.

On the other hand, understanding the effect of the nutritional diet on milk production and the quality of milk is helpful not only in financial planning but also in the production of other dairy products, such as yogurt, cheese and butter [11]. The importance of feed intake and diet for dairy cows has been investigated in recent years. For example, feed intake increases slowly at the beginning of lactation [12], and the effects of dietary starch concentration on the yield of milk and milk components were investigated by Boerman et al. [13].

In spite of that, few studies address the individual cow level or milk forecasting based on nutrition for small-scale farms. Milk yield forecasting for each individual cow can benefit many applications, such as monitoring health conditions and detecting diseases, e.g. mastitis [14, 15]. Recently, Zhang et al. [16] conducted a study on the effect of parity weighting with a dataset from the south of Ireland, and Van Bebber et al. [17] applied the Kalman filter to monitoring daily milk yields.

The aim of this study is to improve livestock farming, particularly milk production, by monitoring the performance in nutrition supplies. The first objective is to analyze the importance of the chemical composition of nutrition on the production and milk production monitoring of dairy cattle


Fourragère Lait (UFL¹) and protein (PDIE²). Therefore, the consumption