
HAL Id: hal-02358044

https://hal.archives-ouvertes.fr/hal-02358044v3

Submitted on 5 Feb 2020


Comparison of forecast models of production of dairy cows combining animal and diet parameters

Thong Nguyen, Rémy Fouchereau, Emmanuel Frénod, Christine Gerard, Vincent Sincholle

To cite this version:

Thong Nguyen, Rémy Fouchereau, Emmanuel Frénod, Christine Gerard, Vincent Sincholle. Comparison of forecast models of production of dairy cows combining animal and diet parameters. Computers and Electronics in Agriculture, Elsevier, 2020, 170, pp.105258. 10.1016/j.compag.2020.105258. hal-02358044v3


Comparison of forecast models of production of dairy cows combining animal and diet parameters

Quoc Thong Nguyen a, Rémy Fouchereau b, Emmanuel Frénod a,b, Christine Gerard c, and Vincent Sincholle c

a Université de Bretagne Sud, Laboratoire de Mathématiques de Bretagne Atlantique, UMR CNRS 6205, Campus de Tohannic, Vannes, France

b See-d, Parc Innovation Bretagne Sud, Vannes, France

c NEOVIA, France

Abstract

We study the effect of nutritional diet characteristics on 36 lactating Holstein-Friesian dairy cows in Brittany, France. An analysis of the relations between fat/protein content and milk yield was carried out on our dataset: fat and protein production increase at a slower rate as milk yield increases. The importance of the chemical composition of the diet for milk production is studied using a linear model. The data analysis confirms the importance of starch, crude fiber, and protein, which have a positive effect on milk production, and also confirms previous findings on the effect of parity on production. Milk production forecasting is then investigated using both linear models and machine learning approaches (support vector machine, random forest, neural network). We study the performance of multiple linear regression and machine learning-based models in both non-autoregressive and autoregressive cases at the individual level.

The autoregressive models, which take into account the previously observed milk yield, are shown to significantly outperform the non-autoregressive approaches. Moreover, the computational cost of each approach is reported. While the random forest algorithm gives the best performance in both the non-autoregressive and autoregressive settings, the support vector machine algorithm achieves very close performance with substantially less computing time. The support vector machine is thus shown to be the best compromise between accuracy and computational cost.

∗ Corresponding author. Email address: quoc-thong.nguyen@univ-ubs.fr (Quoc Thong Nguyen)

Keywords: Milk production forecasting, Dairy modeling, Autoregression, Smart farming

1. Introduction

Milk production forecasting for dairy cows is essential for farmers, both for management and for health monitoring. In the literature, many parametric models have been developed to model the lactation curve at the herd and individual levels [1, 2, 3, 4, 5, 6], alongside studies on extended lactation in dairy production [7, 8]. Recently, a number of modeling techniques for milk production forecasting have been shown to obtain highly accurate predictions with adaptability at the herd level [9, 10, 8]. The nonlinear autoregressive model with exogenous input using artificial neural networks introduced by Murphy et al. [9] was shown to be the most effective milk-production model.

On the other hand, understanding the effect of the nutritional diet on milk production and milk quality is helpful not only for financial planning but also for the production of other dairy products such as yogurt, cheese, and butter [11]. The importance of feed intake and diet for dairy cows has been investigated in recent years: for example, feed intake increases slowly at the beginning of lactation [12], and the effects of dietary starch concentration on the yield of milk and milk components were investigated by Boerman et al. [13].

In spite of that, few studies work at the individual cow level or address milk forecasting based on nutrition for small-scale farms. Milk yield forecasting for each individual cow can benefit many applications, such as monitoring health conditions and detecting diseases, e.g. mastitis [14, 15]. Recently, Zhang et al. [16] studied the effect of parity weighting on a dataset from the south of Ireland, and Van Bebber et al. [17] applied a Kalman filter to the monitoring of daily milk yields.

The subject of this study is to improve livestock farming, particularly milk production, by monitoring the performance of nutrition supplies. The first objective is to analyze the importance of the chemical composition of nutrition for the production and milk production monitoring of dairy cattle in Brittany, France. Secondly, we compare the performance of different types of multiple linear regression and machine learning-based models for predicting the production of individual cows. The practicability and suitability for industrial applications are also discussed.

The paper is organized as follows. Section 2 describes the content of our dataset in detail and presents the composition analysis. Section 3 briefly recalls and analyzes the linear regression models and machine learning algorithms. Section 4 focuses on the performance of the regression algorithms in forecasting. Concluding remarks are given in Section 5.

2. Data description and composition analysis

2.1. Data description

The empirical data were collected from 36 lactating Holstein-Friesian dairy cows on a research farm in Brittany, France, equipped with a robotic milking system. Over a ten-month period (December 2015 to September 2016), 7691 valid milking records were collected. Each milking record contains the Daily Milk Yield (DMY), Day In Milk (DIM), parity information (first, second, or third-onward lactation; see Table 1), the number of milkings per day, and the collective (corn silage, grass silage, wheat straw, soybean meal) or individual (pelleted feed distributed through an automatic feeder) consumption of diet components. Each cow is milked one to four times per day by the robotic milking system; a cow may be milked each time it comes to the freestall for food. In this experiment, the amounts of the given diets were changed every week. In this study, we are interested in the effect of the diet on milk production forecasting. In particular, the chemical components studied in this paper are starch, crude fiber, Net Energy (NE), expressed in Unité Fourragère Lait (UFL 1), and protein (PDIE 2). The consumption of the different diets was therefore converted into these four chemical components. Table 2 presents the composition of each diet. It should be noted that, in Table 2, the consumption of the first eight diets (corn silage, grass silage, ..., nitrogen supplement) is the same for all 36 dairy cows in a given week. On the other hand, since the last four components (production feed, ..., liquid feed) in Table 2 are distributed by the robot, their consumption varies according to the milk production level of each individual cow; the consumption of each individual may therefore differ in a given week. In order to capture a regular effect of each nutrient on milk production, we used weekly data instead of daily data; each data point is thus the average of seven days' observations. The statistical characteristics of the variables of interest are presented in Table 3.

1 UFL and PDIE are, respectively, the units used in dairy production to estimate the available energy and the protein supply to dairy cows; the energy is estimated on the basis of 1 UFL = 1.7 Mcal, see [18].

2 Protéines Digestibles dans l'Intestin limitantes par l'apport d'Énergie: true protein absorbable in the small intestine when rumen-fermentable energy (organic matter) is limiting microbial protein synthesis in the rumen [19].

Parity                    Number of cows
First lactation           20
Second lactation          13
Third onward lactation    3

Table 1: Number of individuals in each parity class.

                                      DM* content   Protein    Starch     Crude fiber   NE           PDIE
                                      %             g/kg DM    g/kg DM    g/kg DM       UFL/kg DM    g/kg DM
Corn silage                           34.1          75         360        174           0.95         69
Grass silage                          23.4          141        0          231           0.92         63
Fescue                                88            93         0          222           0.76         82
Alfalfa hay                           91.8          160        0          169           0.72         93
Fresh grass                           18.3          167        0          217           0.94         90
Wheat straw                           88            35         0          420           0.42         44
Ears corn                             64            51         580        72            1.06         95
Nitrogen supplement                   88            455        0          170           1.09         278
Production feed                       88            273        114        14            1.17         205
Soluble nitrogen supplement           88            489        0          13            1.08         256
Ruminoprotected nitrogen supplement   88            443        0          13            1.08         273
Liquid feed                           100           0          0          0             2.20         0

* Dry Matter

Table 2: Chemical composition of the different diets.

Variable            Mean    SD+     Min     Max
Starch (kg)         0.185   0.124   0.000   0.451
Crude fiber (kg)    0.426   0.190   0.080   0.966
PDIE (kg)           0.730   0.304   0.159   1.683
Net energy (UFL)    3.692   1.630   0.672   8.046
Parity              1.631   0.972   1       5
Milking per day     2.731   0.541   1       5

+ Standard deviation

Table 3: Statistical characteristics of the variables of interest.

2.2. Milk fat and protein composition analysis

In this section, we analyze the correlation between fat and protein content and milk yield in the collected data. The yields of cheese and butter depend mainly on milk fat and protein yield, and a factor that impacts milk fat and protein concentration is milk yield [20]. It is well known that, in dairy ruminants, the correlations between fat and protein content (g per kilogram of milk) and milk yield are negative [21]. In our experiment, the correlation coefficients between milk yield and fat and protein content are −0.04 and −0.21, respectively. In our observed data, the fat and protein content decrease as the milk yield increases, though not significantly. As shown in Figures 1a and 1c, the fat and protein content visually decrease as milk yield increases up to 20 kg/day. This phenomenon can be explained by the fact that, at the beginning of lactation, milk production increases more rapidly than the cow's capacity for consumption. Moreover, when dairy cows produce more milk they consume more, especially water [22], but nutrient absorption cannot change as intensively.

Some studies have found that, as milk yield increases, fat and protein synthesis generally increase at a slower rate [23, 20]. This phenomenon can be described by the allometric model

$$y = a x^b,$$

where $y$ is the fat or protein yield (g/day), $x$ the milk yield (kg/day), and $a$ and $b$ are equation coefficients. The parameter $b$ is a scaling factor describing the effect of milk yield variation on its two main constituents. With $b = 1$, milk yield has a linear relationship with fat or protein yield, whose content in milk is equal to $a$; if $b > 1$, fat or protein yield tends to increase more than proportionally with milk yield; and if $b < 1$, fat or protein yield increases at a slower rate than the milk yield.

In Figures 1b and 1d, the application of this model to the data shows that fat and protein synthesis vary with milk output with exponents of 0.964 and 0.910 for milk fat and milk protein, respectively. Thus, the higher the milk yield, the more cheese can be produced, even though each additional unit of milk results in a smaller increase in fat and protein. Moreover, since in this dataset the relationship between milk fat and milk yield has higher variability than that between milk protein and milk yield (see Figure 1), modifying milk composition by nutritional means should be easier to achieve for fat than for protein.

[Figure 1: Relationships between milk yield (kg/d) and (a) milk fat concentration, g/kg (fit: $y = -0.032x + 40.904$, $R^2 = 0.002$); (b) milk fat yield, kg/d (fit: $y = 0.045x^{0.964}$, $R^2 = 0.814$); (c) milk protein concentration, g/kg (fit: $y = -0.082x + 34.724$, $R^2 = 0.044$); (d) milk protein yield, kg/d (fit: $y = 0.044x^{0.910}$, $R^2 = 0.910$).]
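For concreteness, the following is a minimal R sketch of how such an allometric fit can be obtained. The data frame `records` and its columns `milk_yield` and `fat_yield` are hypothetical placeholders, not names from the study.

```r
# Minimal sketch of fitting the allometric model y = a * x^b (hypothetical
# data frame 'records' with columns 'milk_yield' in kg/d and 'fat_yield' in kg/d).

# A log-log linearization, log(y) = log(a) + b*log(x), gives starting values.
lin <- lm(log(fat_yield) ~ log(milk_yield), data = records)
start <- list(a = unname(exp(coef(lin)[1])), b = unname(coef(lin)[2]))

# Nonlinear least squares on the original scale.
fit <- nls(fat_yield ~ a * milk_yield^b, data = records, start = start)
coef(fit)  # an estimate of b below 1 indicates slower-than-proportional growth
```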

3. Modelization

In this section, we present the linear models used to analyze the effect of the features on milk production. In particular, the fitting performance of three linear regression methods (ridge, LASSO, and elastic net) is compared. In addition, machine learning algorithms are introduced to predict milk production. The multiple linear model is also used for forecasting; we compare it with the machine learning approaches for milk prediction in the next section.

3.1. Multiple Linear Model

A mixed linear model for the milk yield observations is used. The model can be written as

$$y_{it} = \mathrm{MPD} + \mathrm{PAR} + \mathrm{ST} + \mathrm{CF} + \mathrm{NE} + \mathrm{PDIE} + f(t) + e_{it}, \qquad (1)$$

where $y_{it}$ is the average weekly milk yield of cow $i$ at week $t$; MPD is the fixed effect of milking per day; PAR is the fixed effect of parity; ST, CF, NE, and PDIE are the fixed effects of the consumption of starch (kg), crude fiber (kg), net energy (UFL), and PDIE (kg), respectively; and $e_{it}$ are the random residual errors, assumed independent of each other. The term $f(t)$ is a fixed function of week $t$ based on the Ali and Schaeffer model [2], which is used to fit the average shape of the lactation curve and has been shown to be one of the most effective milk yield predictors [24, 16]. It is written as

$$f(t) = a_0 + a_1\gamma_t + a_2\gamma_t^2 + a_3\omega_t + a_4\omega_t^2,$$

where $\gamma_t = 7t/305$, $\omega_t = \ln(305/(7t))$, and $a_0, a_1, a_2, a_3, a_4$ are regression coefficients. The coefficient $a_0$ is associated with the overall level of yield, $a_1$ and $a_2$ with the increasing slope of the curve, and $a_3$ and $a_4$ with the decreasing slope of the curve. In matrix notation, the model can be given as

$$y = Xb + e,$$

where $y$ is an $N \times 1$ vector of observed milk yields, $b$ is a $p \times 1$ vector of regression coefficients, $X$ is an $N \times p$ incidence matrix, and $e$ is an $N \times 1$ vector of residual effects. Many regression methods have been developed to estimate the coefficients and improve prediction accuracy. In many problems, when the number of variables is too large, a selection model is needed to remove the less informative variables and reduce the computational cost. In other cases, when the variables are highly correlated, an additional condition is required to prevent some coefficients from being poorly determined. In this study, we consider three common regression methods.
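As an illustration of how the covariates of model (1) can be assembled, here is a minimal R sketch; the data frame `weekly` and its column names are hypothetical placeholders.

```r
# Minimal sketch: Ali and Schaeffer lactation-curve terms for week t of a
# 305-day lactation observed weekly (gamma_t = 7t/305, omega_t = ln(305/(7t))).
ali_schaeffer <- function(t) {
  gamma <- 7 * t / 305
  omega <- log(305 / (7 * t))
  cbind(as1 = gamma, as2 = gamma^2, as3 = omega, as4 = omega^2)
}

# Hypothetical design matrix for model (1): animal effects, diet composition,
# and the fixed lactation-curve terms f(t).
X <- as.matrix(cbind(weekly[, c("milking_per_day", "parity", "starch_kg",
                                "crude_fiber_kg", "net_energy_ufl", "pdie_kg")],
                     ali_schaeffer(weekly$week)))
y <- weekly$milk_yield
```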

Ridge regression

Ridge regression is ideal when the features (the columns of $X$) are highly correlated [25, 26]. In particular, it performs well with many features each having a small effect, and it prevents the coefficients of many correlated variables from being poorly determined and exhibiting high variance. Ridge regression shrinks the coefficients of correlated features equally through penalization. The ridge estimator solves the regression problem using $L_2$-norm penalized least squares:

$$\hat{b} = \arg\min_{b}\ \|y - Xb\|_2^2 + \lambda\|b\|_2^2,$$

where $\|y - Xb\|_2^2 = \sum_{i=1}^{n}(y_i - x_i^\top b)^2$ is the $L_2$-norm loss function, $x_i^\top$ is the $i$-th row of the matrix $X$, $\|b\|_2^2 = \sum_{i=1}^{p} b_i^2$ is the $L_2$-norm penalty on $b$, and $\lambda > 0$ is the tuning parameter controlling the degree of linear shrinkage. Ordinary least squares is recovered when $\lambda = 0$, and larger values of $\lambda$ lead to greater shrinkage. However, the ridge estimates $\hat{b}$ cannot be exactly zero no matter how large $\lambda$ is set. The value of $\lambda$ depends on the data and can be determined optimally using cross-validation.

LASSO regression

LASSO (least absolute shrinkage and selection operator) regression is widely used for variable selection and in domains with massive datasets [27, 26]. The LASSO performs less well when the features are highly correlated: it tends to choose a subset of the features, shrinking some coefficients and setting the coefficients of the other features to zero. The optimization problem for the LASSO estimator, with an $L_1$-norm penalty, is written as follows:

$$\hat{b} = \arg\min_{b}\ \|y - Xb\|_2^2 + \lambda\|b\|_1,$$

where $\|b\|_1 = \sum_{i=1}^{p}|b_i|$ is the $L_1$ norm and $\lambda$ is the tuning parameter. The $L_1$ norm makes the LASSO regularize the least squares fit and shrink some components to zero. The suitable value of $\lambda$, which depends on the data, is selected optimally by cross-validation.

Elastic net regression

The elastic net regression method is an extension of the LASSO that is robust to extreme correlations among the features [28, 29]. The elastic net simultaneously performs automatic variable selection and continuous shrinkage, and groups of correlated variables can also be selected. It uses both the $L_1$ (LASSO) and $L_2$ (ridge) penalties; the optimization problem is formulated as follows:

$$\hat{b} = \arg\min_{b}\ \|y - Xb\|_2^2 + \lambda_1\|b\|_1 + \lambda_2\|b\|_2^2.$$

Let $\alpha = \lambda_2/(\lambda_1 + \lambda_2)$; the problem is then equivalent to solving

$$\hat{b} = \arg\min_{b}\ \|y - Xb\|_2^2, \quad \text{subject to } (1-\alpha)\|b\|_1 + \alpha\|b\|_2^2 \le t \text{ for some } t.$$

The elastic net penalty $(1-\alpha)\|b\|_1 + \alpha\|b\|_2^2 \le t$ is a convex combination of the LASSO and ridge penalties. The elastic net reduces to ridge regression when $\alpha = 1$ and to LASSO regression when $\alpha = 0$. The tuning parameter $t$ is determined by cross-validation for a given $\alpha$. The $L_1$ part performs automatic variable selection, while the $L_2$ part encourages grouped selection [26].

Model validation and performance

With our dataset, we compare the performance of each linear regression method in fitting the milk production with model (1). In this experiment, we fit the linear model using the publicly available R package glmnet [29]. The values of the tuning parameters are optimized by 10-fold cross-validation, with $\alpha = 0.5$ in the case of the elastic net. The coefficients of the features of interest fitted by these methods are illustrated in Figure 2. The coefficient linked to the starch (kg) variable is large for all three methods. This result is consistent with previous studies [30, 13], in which production responded positively to an increase in starch concentration. As expected, the ridge method keeps all the features, while LASSO and elastic net shrink the coefficients of PDIE (kg) and crude fiber (kg) consumption to zero. This is because the correlations between PDIE, crude fiber, net energy, and starch are high (greater than 0.89). Table 4 shows the statistics of fitting the lactation production with the linear regression methods. The elastic net gives a slightly better result; in general, the performance of the three methods is quite similar. In the next part, we analyze the performance of the linear model in forecasting milk production and compare it with other machine learning methods.

Statistics   Ridge   LASSO   Elastic net
RMSE         3.23    3.15    3.12
SSE          10753   10240   10054
R²           0.86    0.87    0.87

Table 4: Statistics of the linear model fit using ridge, LASSO, and elastic net: Root Mean Square Error (RMSE), Sum of Squared Errors (SSE), R².

[Figure 2: The coefficient of each feature (Crude_fiber_kg, Milking_per_day, Net_Energy_UFL, Parity, PDIE_kg, Starch_kg) estimated by ridge, LASSO, and elastic net (α = 0.5) regression.]
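A minimal R sketch of these three penalized fits is given below, reusing the hypothetical `X` and `y` from the earlier sketch. Note that glmnet's `alpha` weights the $L_1$ penalty, the reverse of the $\alpha$ convention used above; at 0.5 the two conventions coincide.

```r
library(glmnet)

# Minimal sketch: the three penalized fits compared in Table 4. lambda is
# chosen by cv.glmnet's default 10-fold cross-validation. glmnet's 'alpha'
# is the weight of the L1 penalty (reverse of the text's alpha convention).
set.seed(1)
cv_ridge   <- cv.glmnet(X, y, alpha = 0)    # ridge (L2 penalty only)
cv_lasso   <- cv.glmnet(X, y, alpha = 1)    # LASSO (L1 penalty only)
cv_elastic <- cv.glmnet(X, y, alpha = 0.5)  # elastic net with equal mixing

coef(cv_elastic, s = "lambda.min")  # fitted coefficients at the optimal lambda
```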

3.2. Machine learning algorithms

For forecasting milk production, we investigate three machine learning algorithms in this study: support vector machine regression (SVR), artificial neural network (ANN), and random forest (RF). These algorithms have been applied in previous studies in the agricultural domain [31, 32, 33, 34]. The multiple linear model is also used to predict milk production and is compared with these three machine learning algorithms.

Support vector regression

The Support Vector Machine is a supervised learning algorithm frequently applied in classification and regression analysis; when used for function estimation it is usually called Support Vector Regression [35]. Suppose we have training data $\{(x_1, y_1), \ldots, (x_n, y_n)\} \subset \mathcal{X} \times \mathbb{R}$, where $\mathcal{X}$ denotes the space of the input features (e.g. $\mathcal{X} = \mathbb{R}^d$). In $\varepsilon$-SV regression, the objective is to find a function $f(x)$ that deviates by at most $\varepsilon$ from the actually observed data points $y_i$ for all the training data, while being as flat as possible. In the nonlinear case, the input data are mapped to a higher-dimensional Hilbert space $\mathcal{H}$ where the regression line can be constructed linearly. For ease of presentation, a linear regression line is found by solving the following optimization problem:

$$\begin{aligned} \underset{w,\,\xi,\,\xi^*}{\text{minimize}}\quad & \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}(\xi_i + \xi_i^*)\\ \text{subject to}\quad & y_i - \langle w, x_i\rangle - b \le \varepsilon + \xi_i, \quad b \in \mathbb{R},\\ & \langle w, x_i\rangle + b - y_i \le \varepsilon + \xi_i^*,\\ & \xi_i,\ \xi_i^* \ge 0, \end{aligned}$$

where $w$ is the slope of the hyperplane and $\langle\cdot,\cdot\rangle$ denotes the dot product in $\mathcal{X}$. The slack variables $\xi_i, \xi_i^*$ are introduced for the "soft margin" loss function. The constant $C > 0$ determines the trade-off between the flatness of the function $f$ and the amount by which deviations larger than $\varepsilon$ are tolerated. Figure 3 gives a graphical interpretation of a linear SVR. In the nonlinear problem, a kernel function $k$ computes the dot product in the high-dimensional space. In this study, we used the Gaussian or radial basis function (RBF) kernel:

$$k(x_i, x_j) = \exp\left(-\gamma\|x_i - x_j\|^2\right), \quad x_i, x_j \in \mathcal{X}.$$

[Figure 3: The soft margin loss setting for a linear SVR.]

The parameters are tuned with 10-fold cross-validation using the R package 'e1071' [36]. For this dataset, the optimal parameters, in terms of smallest mean squared error, are $C = 100$, $\gamma = 0.01$.
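The following is a minimal R sketch of such an $\varepsilon$-SVR fit with the e1071 package; the data frames `train` and `test` and their column names are hypothetical.

```r
library(e1071)

# Minimal sketch: epsilon-SVR with an RBF kernel and the parameters reported
# above (C = 100, gamma = 0.01).
svr_fit <- svm(milk_yield ~ ., data = train, type = "eps-regression",
               kernel = "radial", cost = 100, gamma = 0.01)
pred <- predict(svr_fit, newdata = test)

# The grid search itself can be done with tune(), whose default resampling
# is 10-fold cross-validation.
svr_tuned <- tune(svm, milk_yield ~ ., data = train,
                  ranges = list(cost = 10^(0:3), gamma = 10^(-3:-1)))
```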

Random forest

Random Forest [37] is an algorithm that learns from multiple decision trees grown on slightly different subsets of the data; it can be applied to both classification and regression. The procedure consists of three stages [38]. The first stage creates $n_{tree}$ bootstrap samples from the data: each sample (bag) contains $N$ observations selected uniformly (with replacement) out of the $N$ original observations. Then, for each sample, a decision tree is grown using CART (Classification and Regression Trees) [39]. Instead of using all predictors, at each node of each tree, $m_{try}$ of the predictors are randomly selected, and the best split is chosen among those variables. Finally, for new data, the prediction is obtained by aggregating the predictions of the $n_{tree}$ trees, i.e., in the regression case, averaging the predictions of the individual trees. An advantage of the random forest is that it can easily be applied to nonlinear cases. The R package 'randomForest' ported by Liaw et al. [38] is used in this paper. For our dataset, using three repetitions of 10-fold cross-validation, the parameters $n_{tree} = 2000$ and $m_{try} = 4$ were selected.
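A minimal R sketch of such a regression forest follows; the data frames `train` and `test` are the same hypothetical placeholders as before.

```r
library(randomForest)

# Minimal sketch: regression forest with the parameters selected above
# (n_tree = 2000, m_try = 4).
set.seed(1)
rf_fit <- randomForest(milk_yield ~ ., data = train,
                       ntree = 2000, mtry = 4, importance = TRUE)
pred <- predict(rf_fit, newdata = test)

importance(rf_fit)  # internal variable-importance estimates (cf. Tables 7 and 11)
```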

Artificial neural network

As the name suggests, this is a connectionist system inspired by biological neural networks; it is also commonly known as the multilayer perceptron (MLP). A standard neural network consists of many connected nodes called neurons, organized into input, hidden, and output layers. Each neuron produces a sequence of real-valued activations. The input values are multiplied by the synaptic weights, which represent the strength of the connections. The sum of these products is fed to each neuron of the hidden layer through a typically nonlinear real-valued activation function such as tanh or the logistic function [40, 41]. In the case of a single hidden layer, the resulting values are then fed to the output-layer neuron via the activation function, which predicts the output value for each instance. Figure 4 depicts a fully connected artificial neural network. During training, MLPs employ backpropagation techniques to minimize the sum of squared errors [42].

In this paper, we investigate a fully connected feed-forward neural network with one hidden layer; the inputs are parity, DIM, ..., NE, and the output is the milk yield. The R package 'neuralnet' [43] is used in our study. To avoid overfitting the training data, we tested a few configurations 3 and selected the best by cross-validation. The optimal network, with 4 neurons in the hidden layer, is used [9]. Resilient backpropagation with weight backtracking is applied to train the network. The logistic function in (2) is used as the activation function:

$$f(x) = \sigma(x) = \frac{1}{1 + e^{-x}}. \qquad (2)$$

3 Configurations tested: 4, 5, 6, or 7 hidden neurons with logistic or ReLU activation functions.

[Figure 4: Artificial neural network with one hidden layer (inputs $I_1, \ldots, I_7$, hidden neurons $H_1, \ldots, H_4$, output $Y$).]
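A minimal R sketch of this network follows; the data frame `train_scaled` and its column names are hypothetical, and the rescaling of the inputs (commonly to [0, 1]) is an assumption that typically helps neuralnet converge.

```r
library(neuralnet)

# Minimal sketch: one hidden layer of 4 neurons, logistic activation, and
# resilient backpropagation with weight backtracking ("rprop+"), as described
# above.
nn_fit <- neuralnet(milk_yield ~ parity + dim + milking_per_day + starch_kg +
                      crude_fiber_kg + pdie_kg + net_energy_ufl,
                    data = train_scaled, hidden = 4, act.fct = "logistic",
                    algorithm = "rprop+", linear.output = TRUE)

pred <- compute(nn_fit, test_scaled)$net.result  # predictions on new data
```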

4. Prediction performance comparison and discussion

In order to evaluate the prediction performance of the multiple linear regression (MLR) with elastic net regression and of the machine learning algorithms on this dataset, for each cow the training set is the dataset excluding the data of that individual; the trained model is then used to predict the production of the excluded dairy cow. Moreover, the autoregressive versions of these methods are also investigated in this paper. The evaluation criteria chosen in this study are the Root Mean Squared Error (RMSE), the Mean Absolute Error (MAE), and the coefficient of determination (R²). In addition, we also compare the computational cost of the models.
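Before turning to the results, here is a minimal R sketch of this leave-one-cow-out protocol with the three criteria; the data frame `weekly`, its `cow_id` column, and the choice of the random forest as the fitted model are illustrative assumptions, not the authors' code.

```r
# Minimal sketch of the leave-one-cow-out protocol: train on 35 cows,
# predict the held-out cow, and score with RMSE, MAE, and R^2.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
mae  <- function(obs, pred) mean(abs(obs - pred))
r2   <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

scores <- t(sapply(unique(weekly$cow_id), function(id) {
  train <- subset(weekly, cow_id != id)
  test  <- subset(weekly, cow_id == id)
  fit   <- randomForest::randomForest(milk_yield ~ . - cow_id, data = train,
                                      ntree = 2000, mtry = 4)
  pred  <- predict(fit, newdata = test)
  c(RMSE = rmse(test$milk_yield, pred),
    MAE  = mae(test$milk_yield, pred),
    R2   = r2(test$milk_yield, pred))   # R2 can be negative, as noted below
}))
```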

The computer used in this study was a MacBook Pro with an Intel Core i7 at 2.5 GHz and 16 GB of 1600 MHz DDR3 memory. Table 5 and Figure 5 present the RMSE, MAE, and R² values of the elastic net regression, SVR, random forest, and neural network forecasts against the dataset of 36 individual cows in the case without autoregression. Some R² values are negative; this is due to overestimation in the prediction. For instance, as shown in Figure 6, the over-predictions of milk yield for cow #16 produce a larger error than the mean value does; however, the predictions reproduce the shape of the observations well, with a correlation of 0.82. The negative R² values were set to R² = 0 in the subsequent analysis. The maximum and minimum RMSE values are 5.16 and 1.56 for the MLR, 4.61 and 1.44 for the SVR, 5.77 and 1.46 for the random forest, and 4.75 and 1.46 for the neural network. Table 6 shows the average errors of each model over all 36 individual cows. In general, the machine learning algorithms mostly outperform the MLR. The random forest and SVR give the most favorable results, with the random forest model being more accurate in terms of RMSE and MAE. Moreover, as shown in Table 7, the random forest can compute internal estimates of variable importance (in percent). Consistent with the results of the MLR model, starch is the most important variable according to the random forest algorithm.

In addition, during our data collection, two cows had medical issues. In Figure 7 we present the lactation curves of these two individuals: cow #8 was diagnosed as lame at week 24 of lactation, and cow #9 was diagnosed with mastitis in June 2016 and August 2016. We can observe that production changed at these points, and the predictions become less accurate around them. Due to the health condition, the amount of feed consumed may vary, which leads to variation in the prediction. This observation is interesting for future studies on detecting potential health issues in each individual.

As shown in Table 8, the MLR has the shortest training time (in seconds) owing to its simplicity, while the neural network model is the most computationally expensive. The SVR has a substantially better computational time than the random forest, and it also gives better results than the MLR. Therefore, in terms of both accuracy and computational cost, the SVR gives the most satisfactory result.

A nonlinear autoregressive exogenous (NARX) model was applied to milk production forecasting at the herd level in the study by Murphy et al. [9]. In that study, the training data consisted of daily herd milk yield, days in milk, and number of cows milked, and the NARX was shown to be the most effective milk-production model. In our study, autoregressive versions of the aforementioned models are also considered. The autoregressive models applied in our experiment have order one; in particular, the record from the previous week is added to the prediction variables:

$$y_t = F(y_{t-1}, u_1, u_2, \ldots, u_p) + \varepsilon_t,$$

where $y_t$ is the average milk production record in week $t$, $\{u_1, u_2, \ldots, u_p\}$ are the other prediction variables, and $\varepsilon_t$ is the error term.
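As an illustration, a minimal R sketch of this order-one lagging step is given below; the data frame `weekly` and its column names are the same hypothetical placeholders as before.

```r
# Minimal sketch: building the order-one autoregressive predictor y_{t-1}
# per cow before refitting any of the four models.
weekly <- weekly[order(weekly$cow_id, weekly$week), ]
weekly$milk_yield_lag1 <- ave(weekly$milk_yield, weekly$cow_id,
                              FUN = function(v) c(NA, head(v, -1)))

# The first week of each cow has no previous record and is dropped.
weekly_ar <- subset(weekly, !is.na(milk_yield_lag1))
```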

Table 9 and Figure 8 present the errors of the autoregressive versions of all four forecasting models against the dataset of 36 individual cows. In all cases, the autoregressive approach significantly improves the accuracy of the prediction models. For example, for individual cow #7, the RMSEs of the four models without autoregression are 2.44, 2.22, 2.81, and 2.67, respectively; with autoregression, the errors decrease to 1.88, 1.89, 2.35, and 1.80. However, for cow #35 we obtain larger errors with the autoregressive models, which may be caused by the status of that individual (e.g., a health problem); milk yield forecasting could therefore be applied to monitoring health conditions [14]. On average, Table 10 shows a substantial improvement in accuracy compared with the models without autoregression, and the R² values of the regressions are mostly high. Moreover, as shown in Table 11, the internal estimates of variable importance computed by the random forest show that the information from the past is essentially important (62.78%), and starch remains an important variable (14.81%) compared with the rest.

Table 12 presents the average training time for the autoregressive models; the random forest and neural network still consume more computing power than the MLR and SVR. The SVR remains the best compromise between accuracy and computational cost. In practice, with a portable application, dairy farmers can update the database in real time and train the model on the local dataset; the approach is therefore potentially suitable for industrial applications.

5. Concluding remarks

This is a study on a small scale (36 milking cows) in Brittany, France. The analysis of the correlation between fat and protein content and milk yield on the collected data indicates a decrease in fat and protein content as milk yield increases up to 20 kg/day. On this dataset, the analysis of the chemical composition of nutrition shows the significant weight of the nutrient supply through the diet on the milk production level of dairy cattle, which is more important than milking per day and parity.

Moreover, we compared the performance of linear regression models and machine learning models in forecasting milk production at the individual level. For each model, we investigated both the autoregressive and non-autoregressive versions. On this dataset, the autoregressive models, which take the previous observation into account, were shown to be significantly better than the non-autoregressive approaches: the information from the previous observation considerably improves the prediction accuracy.

Among the different methods, the random forest gives the best performance on 15 individuals and the support vector machine gives the predictions with the smallest errors on 13 dairy cows; the linear and neural network models show the best results on 5 and 3 individuals, respectively. However, the computational times of the SVR are significantly smaller than those of the random forest. Therefore, the support vector regression is the most efficient of the compared methods for predicting milk production in terms of both prediction accuracy and computational cost. This result indicates the possibility of practical application on small-scale farms with a small number of dairy cows. The autoregressive models, however, require the previous observation, so the non-autoregressive approaches are more practical when past observations are unavailable or when predicting far ahead. Further research on other kinds of dairy cows, with larger populations over longer periods, is required to investigate the potential of these models for health monitoring at the individual cow level with high accuracy.

Acknowledgments

This research activity was financed by the Conseil régional de Bretagne and FEDER Bretagne within the project NUTGEN of the Université de Bretagne Sud.

References

[1] P. Wood, Algebraic model of the lactation curve in cattle, Nature 216 (1967) 164–165.

[2] T. Ali, L. Schaeffer, Accounting for covariances among test day milk yields in dairy cows, Canadian Journal of Animal Science 67 (1987) 637–644.

[3] J. Wilmink, Adjustment of lactation yield for age at calving in relation to level of production, Livestock Production Science 16 (1987) 321–334.

[4] L. Schaeffer, Application of random regression models in animal breeding, Livestock Production Science 86 (2004) 35–45.

[5] A. Silvestre, A. Martins, V. Santos, M. Ginja, J. Colaço, Lactation curves for milk, fat and protein in dairy cows: A full approach, Livestock Science 122 (2009) 308–313.

[6] S. Adediran, D. Ratkowsky, D. Donaghy, A. Malau-Aduli, Comparative evaluation of a new lactation curve model for pasture-based Holstein-Friesian dairy cows, Journal of Dairy Science 95 (2012) 5344–5356.

[7] M. Mellado, J. Flores, A. De Santiago, F. Veliz, U. Macías-Cruz, L. Avendaño-Reyes, J. García, Extended lactation in high-yielding Holstein cows: Characterization of milk yield and risk factors for lactations > 450 days, Livestock Science 189 (2016) 50–55.

[8] J. O. Lehmann, L. Mogensen, T. Kristensen, Extended lactations in dairy production: Economic, productivity and climatic impact at herd, farm and sector level, Livestock Science 220 (2019) 100–110.

[9] M. Murphy, M. O'Mahony, L. Shalloo, P. French, J. Upton, Comparison of modelling techniques for milk-production forecasting, Journal of Dairy Science 97 (2014) 3352–3363.

[10] F. Zhang, M. D. Murphy, L. Shalloo, E. Ruelle, J. Upton, An automatic model configuration and optimization system for milk production forecasting, Computers and Electronics in Agriculture 128 (2016) 100–111.

[11] S. Nickerson, Milk production: Factors affecting milk composition, in: Milk Quality, Springer, 1995, pp. 3–24.

[12] I. Harder, E. Stamer, W. Junge, G. Thaller, Lactation curves and model evaluation for feed intake and energy balance in dairy cows, Journal of Dairy Science (2019).

[13] J. Boerman, S. Potts, M. VandeHaar, M. Allen, A. Lock, Milk production responses to a change in dietary starch concentration vary by production level in dairy cattle, Journal of Dairy Science 98 (2015) 4698–4706.

[14] F. Andersen, O. Østerås, O. Reksen, Y. T. Gröhn, Mastitis and the shape of the lactation curve in Norwegian dairy cows, Journal of Dairy Research 78 (2011) 23–31.

[15] D. B. Jensen, M. van der Voort, H. Hogeveen, Dynamic forecasting of individual cow milk yield in automatic milking systems, Journal of Dairy Science 101 (2018) 10428–10439.

[16] F. Zhang, J. Upton, L. Shalloo, M. Murphy, Effect of parity weighting on milk production forecast models, Computers and Electronics in Agriculture 157 (2019) 589–603.

[17] J. Van Bebber, N. Reinsch, W. Junge, E. Kalm, Monitoring daily milk yields with a recursive test day repeatability model (Kalman filter), Journal of Dairy Science 82 (1999) 2421–2429.

[18] M. Vermorel, Energy: the feed unit systems, in: Ruminant Nutrition: Recommended Allowances and Feed Tables, INRA Publications, Paris, 1989, p. 28.

[19] S. A. Kadi, F. Djellal, M. Berchiche, Caractérisation de la conduite alimentaire des vaches laitières dans la région de Tizi-Ouzou, Algérie, Livestock Research for Rural Development 19 (2007).

[20] G. Pulina, A. Nudda, G. Battacone, A. Cannas, Effects of nutrition on the contents of fat, protein, somatic cells, aromatic compounds, and undesirable substances in sheep milk, Animal Feed Science and Technology 131 (2006) 255–291.

[21] R. Emery, Milk fat depression and the influence of diet on milk composition, The Veterinary Clinics of North America. Food Animal Practice 4 (1988) 289–305.

[22] U. Meyer, M. Everinghoff, D. Gädeken, G. Flachowsky, Investigations on the water intake of lactating dairy cows, Livestock Production Science 90 (2004) 117–121.

[23] G. Pulina, N. Macciotta, A. Nudda, Milk composition and feeding in the Italian dairy sheep, Italian Journal of Animal Science 4 (2005) 5–14.

[24] V. Olori, S. Brotherstone, W. Hill, B. McGuirk, Fit of standard models of the lactation curve to weekly records of milk production of cows in a single herd, Livestock Production Science 58 (1999) 55–63.

[25] A. E. Hoerl, R. W. Kennard, Ridge regression: Biased estimation for nonorthogonal problems, Technometrics 12 (1970) 55–67.

[26] J. O. Ogutu, T. Schulz-Streeck, H.-P. Piepho, Genomic selection using regularized linear regression models: ridge regression, lasso, elastic net and their extensions, BMC Proceedings 6 (2012) S10.

[27] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological) 58 (1996) 267–288.

[28] H. Zou, T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67 (2005) 301–320.

[29] J. Friedman, T. Hastie, R. Tibshirani, Regularization paths for generalized linear models via coordinate descent, Journal of Statistical Software 33 (2010) 1.

[30] A. Cabrita, R. Bessa, S. Alves, R. Dewhurst, A. Fonseca, Effects of dietary protein and starch on intake, milk production, and milk fatty acid profiles of dairy cows fed corn silage-based diets, Journal of Dairy Science 90 (2007) 1429–1439.

[31] C. Kamphuis, H. Mollenhorst, A. Feelders, D. Pietersma, H. Hogeveen, Decision-tree induction to detect clinical mastitis with automatic milking, Computers and Electronics in Agriculture 70 (2010) 60–68.

[32] K. Saruta, Y. Hirai, K. Tanaka, E. Inoue, T. Okayasu, M. Mitsuoka, Predictive models for yield and protein content of brown rice using support vector machine, Computers and Electronics in Agriculture 99 (2013) 93–100.

[33] B. Barrett, I. Nitze, S. Green, F. Cawkwell, Assessment of multi-temporal, multi-sensor radar and ancillary spatial data for grasslands monitoring in Ireland using machine learning approaches, Remote Sensing of Environment 152 (2014) 109–124.

[34] P. Shine, M. D. Murphy, J. Upton, T. Scully, Machine-learning algorithms for predicting on-farm direct water and electricity consumption on pasture based dairy farms, Computers and Electronics in Agriculture 150 (2018) 74–87.

[35] A. J. Smola, B. Schölkopf, A tutorial on support vector regression, Statistics and Computing 14 (2004) 199–222.

[36] D. Meyer, E. Dimitriadou, K. Hornik, A. Weingessel, F. Leisch, C.-C. Chang, C.-C. Lin, M. D. Meyer, Package 'e1071', The R Journal (2019).

[37] L. Breiman, Random forests, Machine Learning 45 (2001) 5–32.

[38] A. Liaw, M. Wiener, et al., Classification and regression by randomForest, R News 2 (2002) 18–22.

[39] L. Breiman, Classification and Regression Trees, Routledge, 2017.

[40] P. Bickel, P. Diggle, S. Fienberg, U. Gather, I. Olkin, S. Zeger, Springer Series in Statistics, Springer, 2009.

[41] J. Schmidhuber, Deep learning in neural networks: An overview, Neural Networks 61 (2015) 85–117.

[42] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Backpropagation applied to handwritten zip code recognition, Neural Computation 1 (1989) 541–551.

[43] F. Günther, S. Fritsch, neuralnet: Training of neural networks, The R Journal 2 (2010) 30–38.

         MLR                 SVR                 Random forest       Neural network
Cow ID   RMSE   MAE    R²    RMSE   MAE    R²    RMSE   MAE    R²    RMSE   MAE    R²
1        3.24   2.75   0.74  2.17   1.90   0.88  2.13   1.78   0.89  2.98   2.62   0.78
2        3.11   2.51   0.55  2.86   2.07   0.62  2.68   1.84   0.67  2.60   1.89   0.69
3        4.09   3.08   0.65  3.63   2.48   0.72  3.52   2.39   0.74  4.02   2.91   0.66
4        3.34   2.51   0.54  3.65   2.96   0.45  2.70   2.05   0.70  3.85   3.15   0.39
5        2.42   1.71   0.86  2.41   1.81   0.86  1.46   1.18   0.95  2.19   1.66   0.89
6        2.74   2.30   0.68  2.86   2.16   0.65  2.32   1.57   0.77  2.64   2.15   0.70
7        2.44   1.99   0.81  2.22   1.70   0.84  2.81   2.17   0.74  2.67   1.96   0.77
8        3.96   3.22   0.77  4.44   3.67   0.71  3.70   3.14   0.80  4.75   3.58   0.67
9        4.28   3.84   0.67  3.58   2.40   0.77  3.79   2.72   0.74  3.43   2.36   0.79
10       4.72   3.88   0.58  3.46   2.93   0.78  5.77   4.81   0.37  3.75   3.18   0.74
11       1.87   1.51   0.90  2.41   1.96   0.83  2.22   1.78   0.86  2.33   1.82   0.84
12       4.72   3.83   0     3.44   2.67   0.38  3.58   2.85   0.33  3.48   2.82   0.37
13       3.52   2.85   0.15  3.04   2.23   0.37  3.42   2.18   0.19  2.30   1.75   0.64
14       2.81   2.26   0.83  3.14   2.28   0.79  1.84   1.54   0.93  3.02   2.18   0.80
15       5.16   4.41   0.04  3.25   2.44   0.62  3.50   2.69   0.56  3.28   2.57   0.61
16       3.34   3.06   0     3.02   2.51   0     2.41   1.88   0     2.99   2.59   0
17       2.91   2.48   0.87  3.52   2.74   0.81  3.47   2.43   0.82  3.18   2.63   0.85
18       4.38   3.79   0.17  3.96   3.20   0.32  3.28   2.55   0.53  3.55   2.76   0.45
19       4.06   2.70   0     4.61   2.86   0     4.49   2.97   0     3.74   2.62   0.001
20       2.94   1.98   0.11  2.47   1.58   0.38  2.40   1.60   0.41  2.40   1.47   0.41
21       2.84   2.25   0.67  1.71   1.30   0.88  2.01   1.20   0.83  1.70   1.27   0.88
22       3.42   2.95   0.64  2.42   2.13   0.82  2.26   1.84   0.84  3.49   3.02   0.62
23       2.75   2.28   0.70  2.45   1.99   0.76  2.10   1.48   0.82  2.40   1.86   0.77
24       2.56   2.23   0.72  2.02   1.53   0.83  1.85   1.39   0.85  2.29   1.67   0.78
25       2.00   1.53   0.73  1.44   1.16   0.86  2.17   1.52   0.68  1.57   1.34   0.83
26       1.76   1.47   0.95  2.66   2.15   0.88  2.03   1.69   0.93  2.28   1.96   0.91
27       3.36   2.73   0.57  2.29   1.77   0.80  2.59   1.90   0.74  2.67   2.15   0.73
28       1.56   1.26   0.92  1.97   1.54   0.87  1.96   1.63   0.87  1.73   1.50   0.90
29       3.86   2.75   0.40  4.23   2.92   0.28  4.33   2.48   0.24  4.20   2.81   0.29
30       1.65   1.41   0.81  1.70   1.34   0.80  2.64   2.10   0.52  1.46   1.02   0.85
31       3.15   2.45   0.80  3.44   2.61   0.76  3.66   2.30   0.73  3.54   2.78   0.75
32       2.29   1.71   0.82  2.28   1.77   0.83  1.93   1.45   0.87  2.50   2.00   0.79
33       2.69   2.17   0.43  3.44   2.69   0.07  4.68   3.53   0     4.46   3.63   0
34       2.16   1.72   0.90  1.81   1.39   0.93  2.13   1.66   0.90  2.54   2.05   0.86
35       3.24   2.89   0.77  2.83   2.36   0.82  2.23   1.90   0.89  3.29   2.57   0.76
36       2.36   1.40   0.89  2.45   1.54   0.89  2.20   1.67   0.91  2.79   1.77   0.85

Table 5: The forecast error of the four models for 36 individual cows.

[Figure 5: RMSE, MAE, and R² values of the four models (MLR, SVR, RF, ANN) for each of the 36 cows, without autoregression.]

        Elastic regression   SVR     Random forest   Neural network
RMSE    3.103                2.868   2.842           2.947
MAE     2.496                2.187   2.107           2.279
R²      0.664                0.712   0.734           0.704

Table 6: Average error of each model over all 36 individual cows.

[Figure 6: The observations and predictions of the milk production of cow #16 using MLR (weekly milk yield over the lactation); the R² value is −1.46.]

Parity   DIM    MPD     Starch   Crude fiber   PDIE   NE
11.21    8.96   11.35   34.87    15.44         6.58   15.16

Table 7: Average variable importance estimated by random forest (in %).

        Elastic regression   SVR     Random forest   Neural network
mean    0.077                0.157   6.771           7.357
SD      0.005                0.007   0.175           4.754

Table 8: Average training time (in seconds) and its standard deviation over the 36 experiments.

[Figure 7: Two individual cows that had medical issues during the experiment: one had lameness (left), while the other had mastitis (right).]

         MLR                 SVR                 Random forest       Neural network
Cow ID   RMSE   MAE    R²    RMSE   MAE    R²    RMSE   MAE    R²    RMSE   MAE    R²
1        1.93   1.72   0.91  1.36   1.09   0.95  1.37   1.07   0.95  1.34   1.14   0.95
2        1.81   1.55   0.85  1.72   1.27   0.86  2.14   1.57   0.79  1.81   1.49   0.85
3        3.79   2.61   0.70  4.20   2.80   0.63  4.26   2.95   0.62  3.51   2.28   0.74
4        2.69   2.23   0.70  2.75   2.38   0.69  2.17   1.45   0.81  2.86   2.41   0.66
5        1.72   1.30   0.93  1.81   1.37   0.92  1.47   1.19   0.95  1.84   1.38   0.92
6        2.15   1.59   0.80  2.20   1.30   0.79  1.51   1.13   0.90  2.61   1.53   0.71
7        1.88   1.56   0.89  1.89   1.57   0.88  2.35   1.85   0.82  1.80   1.49   0.89
8        2.48   1.74   0.91  3.61   3.05   0.81  3.06   2.48   0.86  2.55   1.83   0.91
9        3.12   2.26   0.82  2.71   1.74   0.87  3.15   2.24   0.82  2.93   2.08   0.84
10       3.17   2.40   0.81  2.91   2.47   0.84  3.62   2.76   0.75  4.12   3.76   0.68
11       1.60   1.28   0.93  1.78   1.27   0.91  1.32   1.02   0.95  1.90   1.44   0.89
12       2.76   2.07   0.60  2.26   1.51   0.73  2.46   1.90   0.68  2.47   1.70   0.68
13       2.62   2.08   0.53  2.44   2.01   0.59  2.74   2.19   0.49  2.61   2.16   0.53
14       1.91   1.62   0.92  2.63   2.12   0.85  2.10   1.71   0.90  2.36   2.00   0.88
15       3.29   2.77   0.61  2.68   2.00   0.74  2.71   2.06   0.74  2.77   2.31   0.72
16       2.08   1.78   0.04  1.67   1.29   0.38  1.85   1.39   0.24  1.63   1.35   0.42
17       1.77   1.44   0.95  2.07   1.61   0.94  2.03   1.54   0.94  1.97   1.65   0.94
18       2.50   1.95   0.73  2.08   1.43   0.81  2.36   1.64   0.76  2.54   1.89   0.72
19       2.60   1.82   0.52  3.18   2.23   0.28  2.82   2.06   0.43  3.10   2.08   0.32
20       1.66   1.30   0.72  1.29   1.06   0.83  1.46   1.16   0.78  1.67   1.32   0.71
21       2.31   1.53   0.78  1.89   1.13   0.85  1.58   1.02   0.90  1.75   1.21   0.88
22       2.10   1.71   0.86  1.55   1.28   0.92  1.58   1.33   0.92  1.77   1.48   0.90
23       1.98   1.44   0.84  1.85   1.23   0.86  1.84   1.24   0.87  1.94   1.43   0.85
24       1.58   1.32   0.89  1.56   1.16   0.90  1.27   1.05   0.93  1.69   1.29   0.88
25       1.79   1.45   0.78  1.63   1.28   0.82  1.81   1.28   0.78  1.66   1.37   0.81
26       2.57   1.89   0.89  2.74   2.15   0.87  2.24   1.88   0.92  2.97   2.25   0.85
27       2.03   1.68   0.84  1.24   0.95   0.94  1.10   0.79   0.95  1.43   1.19   0.92
28       1.97   1.46   0.87  2.08   1.43   0.86  2.13   1.37   0.85  1.85   1.30   0.89
29       2.43   1.79   0.76  2.66   1.62   0.72  2.82   1.50   0.68  3.42   1.91   0.53
30       1.46   1.23   0.85  1.42   1.10   0.86  1.61   1.11   0.82  1.46   1.22   0.85
31       2.98   2.33   0.82  2.77   2.14   0.84  2.21   1.72   0.90  2.90   2.20   0.83
32       1.77   1.39   0.90  1.81   1.36   0.89  1.30   1.04   0.94  1.88   1.53   0.88
33       1.59   1.28   0.80  1.85   1.56   0.73  1.90   1.42   0.71  1.67   1.49   0.78
34       1.55   1.14   0.95  1.45   1.00   0.95  1.77   1.29   0.93  1.77   1.28   0.93
35       4.48   2.42   0.55  4.38   2.41   0.57  4.17   2.35   0.61  4.61   2.39   0.53
36       2.40   1.69   0.89  2.22   1.72   0.91  2.03   1.52   0.92  2.61   1.88   0.87

Table 9: The forecast error of the four autoregressive models for 36 individual cows.

[Figure 8: RMSE, MAE, and R² values of the four autoregressive models (MLR, SVR, RF, ANN) for each of the 36 cows.]

        Elastic regression   SVR     Random forest   Neural network
RMSE    2.292                2.231   2.175           2.327
MAE     1.745                1.641   1.590           1.742
R²      0.782                0.801   0.801           0.782

Table 10: Average error of each autoregressive model over all 36 individual cows.

Parity   DIM    MPD    Starch   Crude fiber   PDIE   NE     y(t−1)
2.32     4.28   5.40   14.81    5.11          2.62   6.29   62.78

Table 11: Average variable importance estimated by random forest (in %).

        Elastic regression   SVR     Random forest   Neural network
mean    0.083                0.182   7.240           6.862
SD      0.009                0.007   0.152           2.919

Table 12: Average training time (in seconds) and its standard deviation over the 36 experiments.
