
1 Computation of probabilities and quantiles


Academic year: 2022


Université Joseph Fourier L2/STA230

Lab 2: Probabilities and quantiles and description of a continuous quantitative variable

Objectives: The first objective of this session is to compute some probabilities and quantiles of some usual distributions. The second objective is to compute the usual descriptive indicators and graphs for a continuous quantitative variable.

1 Computation of probabilities and quantiles

R provides exact computation of the density function, cumulative distribution function and quantile function for the standard probability families. For example,

• Gaussian distribution:

– its density function is dnorm,

– its cumulative distribution function is pnorm (input a value, output a probability),

– its quantile function is qnorm (input a probability, output a value).

• Binomial distribution:

– its density function (the probability of each value, since the distribution is discrete) is dbinom,

– its cumulative distribution function is pbinom,

– its quantile function is qbinom.
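
As a quick check of how these three kinds of functions relate, here is a minimal sketch with arbitrarily chosen values (they are illustrative assumptions, not taken from the exercises). It verifies that qnorm inverts pnorm, and that pbinom accumulates the probabilities given by dbinom.

```r
# Illustrative values only (assumptions, not taken from the exercises)
x <- 1.5
prob <- pnorm(x)        # P(Z <= 1.5) for a standard normal Z
x_back <- qnorm(prob)   # the quantile function inverts the CDF
all.equal(x_back, x)    # TRUE

# For a discrete distribution, the CDF is the cumulative sum of the probabilities
k <- 0:10
cdf_from_pmf <- cumsum(dbinom(k, size = 10, prob = 0.3))
all.equal(cdf_from_pmf, pbinom(k, size = 10, prob = 0.3))  # TRUE
```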

Exercise 1

From past experience, it is known that a certain surgery has a 90% chance of success. This surgery is going to be performed on 5 patients. Let X be the random variable equal to the number of successes out of the 5 attempts.

1. What model do you propose for X?

2. Simulate X.

n <- 5; p <- 0.9 # parameters
N <- 100 # sample size

X <- rbinom(N,n,p); X # simulated sample

3. What is the probability that the surgery will fail all 5 times?

n <- 5; p <- 0.9 # parameters
N <- 1e7 # sample size

X <- rbinom(N,n,p) # sample

F <- length(which(X==0))/N; F # frequency of event
dbinom(0,n,p) # R does it exactly

4. What is the probability for the surgery to fail exactly 2 times? Calculate exactly the probability with R.

5. Simulate the convergence, as the sample size increases, of the proportion of samples in which exactly 2 of the 5 surgeries fail.

N <- 2e4 # sample size
X <- rbinom(N,n,p) # sample

F <- cumsum(X==3)/(1:N) # frequencies (2 failures = 3 successes)
plot(1:N,F,pch=".") # plot frequencies

abline(h=dbinom(3,n,p),col="red") # theoretical value

6. What is the probability for the surgery to succeed at least 2 times? Simulate the convergence.
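
For "at least" events, one possible approach is to take the complement of the cumulative distribution function. The sketch below uses a hypothetical Binomial(10, 0.5) variable Y, deliberately not the surgery model, to illustrate that P(Y >= k) = 1 - P(Y <= k-1). The names m, prob and k are assumptions made for this illustration.

```r
# Hypothetical example: Y ~ Binomial(10, 0.5), event {Y >= 4}
m <- 10; prob <- 0.5; k <- 4
p_complement <- 1 - pbinom(k - 1, m, prob)                  # 1 - P(Y <= 3)
p_upper_tail <- pbinom(k - 1, m, prob, lower.tail = FALSE)  # P(Y > 3), same event
p_sum <- sum(dbinom(k:m, m, prob))                          # direct sum of probabilities
c(p_complement, p_upper_tail, p_sum)                        # three equal values
```

The same idea carries over to a simulated sample: the frequency of the event can be tracked with cumsum on a logical vector, as in question 5.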

Exercise 2

The height X of men in France is modeled by a normal distribution N(172, 196), that is, with mean 172 and variance 196 (unit: cm).

1. What proportion of French men are less than 160 cm tall?

mfm <- 172; sdfm <- sqrt(196) # parameters

p <- pnorm(160,mean=mfm,sd=sdfm); p # probability

2. What proportion of French men are more than two meters tall?

3. What proportion of French men are between 165 and 185 centimeters tall?

4. If ten thousand French men chosen at random were ranked by increasing height, how tall would the 9000th be?
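
The three kinds of questions above (upper tail, interval, quantile) map onto pnorm and qnorm. Here is a minimal sketch on the standard normal distribution N(0, 1), chosen as a hypothetical example rather than the height model; the variable names are assumptions.

```r
# Hypothetical example on the standard normal Z ~ N(0, 1)
p_above <- pnorm(2, lower.tail = FALSE)  # P(Z > 2), about 0.023
p_interval <- pnorm(1) - pnorm(-1)       # P(-1 < Z < 1), about 0.68
z90 <- qnorm(0.90)                       # value below which 90% of the mass lies
c(p_above, p_interval, z90)
```

For another normal distribution, the mean and sd arguments are passed as in question 1.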

2 Basic statistics of a continuous quantitative variable

Exercise 3

The HER (Health Exam Results) data come from the US Department of Health and Human Services, National Center for Health Statistics, and correspond to the Third National Health and Nutrition Examination Survey. The variables collected are:

iden: identification number of the individual
sex: 0 for men, 1 for women
age: in years
ht: height (cm)
wt: weight (kg)
waist: circumference (cm)
pulse: pulse rate (beats per minute)
sys: systolic blood pressure (mmHg)
dias: diastolic blood pressure (mmHg)
chol: cholesterol (mg)
BMI: body mass index (kg/m²)
leg: upper leg length (cm)
elbow: elbow breadth (cm)
wrist: wrist breadth (cm)
arm: arm circumference (cm)
treat: treatment group

1. Load the her.csv file and assign it to data.

data=read.table("her.csv", header=TRUE, sep = "")

2. Display its dimensions, its first 10 rows, the data of rows 2,4,5, and columns 5,6. Display the column names.

dim(data) # dimensions (rows, columns)

head(data) # first 6 rows

data[1:10,] # first 10 rows

data[c(2,4,5),]

data[,c(5,6)]

names(data)


3. Assign the fourth column of data to H (height). Display the summary of H.

4. Plot a histogram of H, add a red line indicating the mean, a blue line indicating the median, two green lines indicating the quartiles.

hist(H)

abline(v=mean(H), col="red")

Plot a histogram with 30 classes and comment.

hist(H, nclass=30)
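
The median and quartile lines asked for in question 4 can be added in the same way as the mean. The sketch below is one possible approach; it uses a simulated vector H_sim as a stand-in for H (an assumption, since the real H comes from her.csv).

```r
# Simulated stand-in for H (assumption: the real H is column 4 of her.csv)
set.seed(1)
H_sim <- rnorm(200, mean = 170, sd = 10)

hist(H_sim)
abline(v = mean(H_sim), col = "red")      # mean
abline(v = median(H_sim), col = "blue")   # median
quartiles <- quantile(H_sim, c(0.25, 0.75))
abline(v = quartiles, col = "green")      # first and third quartiles
```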

5. Illustrate the distribution with a boxplot.

boxplot(H)

Compare the height distributions of men and women using two box plots.

HW<-data[data$sex==1,4]

HM<-data[data$sex==0,4]

boxplot(HW, HM, names=c("Women", "Men"), main="height")

boxplot(data[,4] ~ data$sex, names=c("Men", "Women"), main="height") # same plot with a formula; groups are ordered 0 (men), 1 (women)

6. Plot the empirical cumulative distribution function.
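
A minimal sketch of one way to do this with the ecdf function, shown on a simulated vector H_sim standing in for H (an assumption, as her.csv is not reproduced here):

```r
# Simulated stand-in for H (assumption)
set.seed(1)
H_sim <- rnorm(200, mean = 170, sd = 10)

Fn <- ecdf(H_sim)                 # Fn is a function: Fn(x) = proportion of values <= x
plot(Fn, main = "Empirical CDF")  # step function rising from 0 to 1
Fn(170)                           # proportion of simulated heights at or below 170
```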

7. Define Hcr as H centered and reduced (that is, standardized: mean subtracted, then divided by the standard deviation).
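
Centering and reducing can be sketched as follows, again on a simulated vector (H_sim and Hcr_sim are hypothetical names used only for this illustration). The built-in scale function gives the same result.

```r
# Simulated stand-in for H (assumption)
set.seed(1)
H_sim <- rnorm(200, mean = 170, sd = 10)

Hcr_sim <- (H_sim - mean(H_sim)) / sd(H_sim)  # center, then divide by the standard deviation
c(mean(Hcr_sim), sd(Hcr_sim))                 # mean close to 0, standard deviation 1
all.equal(as.numeric(scale(H_sim)), Hcr_sim)  # scale() performs the same transformation
```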

8. Plot a histogram of the empirical frequencies of Hcr, superpose a blue histogram with 30 classes, then a red density plot.

hist(Hcr,probability=TRUE)

hist(Hcr,nclass=30,probability=TRUE,border="blue",add=TRUE)

lines(density(Hcr),col="red")
