
Multiple linear regression « Try to explain logamount »


Academic year: 2022


PROC IMPORT OUT=WORK.ESTIM
    DATAFILE="D:\SAS\estim20171220.csv"
    DBMS=TAB REPLACE;
    GETNAMES=YES;
    DATAROW=2;
RUN;

*************************
MULTIPLE REGRESSION
*************************;

* print the means of the analysis variables;
PROC MEANS data=estim;
    var gender2 matri3_4 age logamount fpublic;
run;

* centering: subtract the sample means reported by PROC MEANS;
data new2;
    set estim;
    logamount_c = logamount - 10.8243634;
    age_c = age - 24.5697499;
run;

* check coding: both centered means should be close to 0;
PROC MEANS data=new2;
    var logamount_c age_c;
run;

* multiple regression model with fpublic as the response;
PROC GLM data=new2;
    model fpublic = gender2 age_c logamount_c / solution;
run;

* regression model with matri3_4 as the response;
PROC GLM data=new2;
    model matri3_4 = gender2 age_c / solution;
run;

* same model, adding logamount;
PROC GLM data=new2;
    model matri3_4 = gender2 age_c logamount_c / solution;
run;

* logamount explained by the other variables;
PROC GLM data=new2;
    model logamount_c = gender2 age_c matri3_4 fpublic / solution;
run;

* same model, with confidence intervals for the parameter estimates;
PROC GLM data=new2;
    model logamount_c = gender2 age_c matri3_4 fpublic / solution clparm;
run;

We try in turn to explain public sector employment (fpublic), matrimony (matri3_4) and logamount by the other explanatory variables.
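As a reading aid, the last model in the script can be written out as an equation. The coefficient symbols β0 to β4 and the error term ε are notation introduced here, not names from the SAS output, and treating gender2, matri3_4 and fpublic as 0/1 indicators is an assumption based on how they are used in the script.

```latex
\begin{align*}
\mathrm{logamount\_c}_i &= \beta_0 + \beta_1\,\mathrm{gender2}_i + \beta_2\,\mathrm{age\_c}_i
   + \beta_3\,\mathrm{matri3\_4}_i + \beta_4\,\mathrm{fpublic}_i + \varepsilon_i \\
\mathrm{logamount\_c}_i &= \mathrm{logamount}_i - 10.8243634, \qquad
\mathrm{age\_c}_i = \mathrm{age}_i - 24.5697499
\end{align*}
```

Under that assumption, centering age means that β0 is the expected centered logamount for an observation of average age whose three indicators are all 0.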

The MEANS procedure

Variable        N        Mean     Std Dev     Minimum      Maximum
gender2     23233   0.3049111   0.4603797           0    1.0000000
matri3_4    23233   0.0412775   0.1989356           0    1.0000000
age         23233  24.5697499   8.7935487  12.0000000  111.0000000
logamount   23233  10.8243634   2.8145327           0   21.4875626
fpublic     23233   0.0201868   0.1406419           0    1.0000000

The MEANS procedure: after centering

Variable         N          Mean     Std Dev      Minimum     Maximum
logamount_c  23233  3.9205181E-8   2.8145327  -10.8243634  10.6631992
age_c        23233  2.4675572E-8   8.7935487  -12.5697499  86.4302501

The GLM procedure: fpublic

Parameter         Estimate   Standard Error   t Value   Pr > |t|
Intercept     0.0379242784       0.00156396     24.25     <.0001
gender2       0.0109973519       0.00285779      3.85     0.0001
age_c         0.0020154072       0.00014977     13.46     <.0001
logamount_c   -.0021708963       0.00046286     -4.69     <.0001

The GLM procedure: matrimony

Parameter         Estimate   Standard Error   t Value   Pr > |t|
Intercept     0.0377216083       0.00156407     24.12     <.0001
gender2       0.0116620375       0.00285557      4.08     <.0001
age_c         0.0019681690       0.00014950     13.16     <.0001

The GLM procedure: matrimony (adding logamount_c)

Parameter         Estimate   Standard Error   t Value   Pr > |t|
Intercept     0.0379242784       0.00156396     24.25     <.0001
gender2       0.0109973519       0.00285779      3.85     0.0001
age_c         0.0020154072       0.00014977     13.46     <.0001
logamount_c   -.0021708963       0.00046286     -4.69     <.0001

The GLM procedure: logamount

Parameter       Estimate   Standard Error   t Value   Pr > |t|    95% Confidence Limits
Intercept   0.1087099942       0.02256656      4.82     <.0001   0.0644780517   0.1529419368
gender2     -.3015951016       0.04047309     -7.45     <.0001   -.3809250291   -.2222651741
age_c       0.0221909202       0.00234160      9.48     <.0001   0.0176012266   0.0267806138
matri3_4    -.4365589733       0.09293786     -4.70     <.0001   -.6187233239   -.2543946226
fpublic     0.0629026810       0.14497076      0.43     0.6644   -.2212495896   0.3470549517

In conclusion, the GLM output shows that:

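women and older people are more often in the public sector than men and younger people

women, older people and poorer people are more often married than men, younger people and richer people

women, married people and younger people are poorer than the others; the fpublic coefficient is not significant (p = 0.6644, and its confidence interval contains 0), so the output does not show a difference between private and public employees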
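The figures in the logamount table can be checked by hand. The sketch below assumes the usual large-sample approximation, in which the 97.5% t quantile is close to 1.96 (reasonable with 23233 observations); the symbols for the estimate, its standard error and the sample means are notation introduced here.

```latex
\begin{align*}
t_{\text{Intercept}} &= \frac{\hat\beta_0}{\mathrm{SE}(\hat\beta_0)}
   = \frac{0.1087099942}{0.02256656} \approx 4.82 \\
\hat\beta_0 \pm 1.96\,\mathrm{SE}(\hat\beta_0)
   &= 0.1087099942 \pm 1.96 \times 0.02256656 \approx (0.0645,\ 0.1529) \\
\hat\beta_{\mathrm{gender2}} \pm 1.96\,\mathrm{SE}
   &= -0.3015951016 \pm 1.96 \times 0.04047309 \approx (-0.3809,\ -0.2223)
\end{align*}
```

Centering can be checked the same way. In a least-squares fit with an intercept, the mean of the response equals the fitted value at the means of the regressors, and the centered regressors have mean 0. For the matrimony model with logamount_c this gives:

```latex
\begin{align*}
\overline{\mathrm{matri3\_4}}
   &= \hat\beta_0 + \hat\beta_{\mathrm{gender2}}\,\overline{\mathrm{gender2}} \\
   &= 0.0379242784 + 0.0109973519 \times 0.3049111 \approx 0.0412775
\end{align*}
```

This agrees with the mean of matri3_4 reported by the MEANS procedure.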
