
In the document The DART-Europe E-theses Portal (Pages 135-143)

Probabilistic Activity Recognition

5.6 Probabilistic Constraints Verification

5.6.1 Probabilistic Spatial Constraint Verification

5.6.1.1 Inside-zone

In this section, we study the relationship a mobile object can have with a contextual object of type zone, namely the spatial relation inside-zone, and we propose a probabilistic approach for the verification of this spatial relation. The approach can be generalized and extended to other spatial relationships between mobile objects and zones. The spatial relation inside-zone requires

Figure 5.4: The computation of the distance of a person P to a zone.

two physical objects: a mobile object (e.g. a person p) and a contextual object of type zone z.

This relation (p in z) verifies whether the mobile object (e.g. a person) is inside a zone z.

A main problem in the verification of this constraint is (1) the imprecision and uncertainty in the detection of the location of a mobile object relative to the zone, due to low-level detection errors (e.g. reflections, shadows or occlusions). As a result, the verification of the constraint can fail.

Another issue is (2) the value of the position attribute itself (e.g. a wrong value estimation, or a value that changes rapidly from frame to frame). To cope with the first problem (1), we propose a probabilistic distance-based analysis for the verification of the constraint. It is a two-step approach:

• The first step consists in computing the distance between the 3D position of the tracked mobile object (i.e. the person) and the zone of interest. The person is represented by the position of their feet, corresponding to the middle of the bottom segment of the bounding box. The zones of interest are represented as polygons. In the following, we denote by (xp, yp, zp) the 3D position of the mobile object and by dist its distance to the zone. We have defined a method, called the border method, to compute this distance between the 3D position of the mobile object and the border of the zone of interest.

The border method consists in computing the distance of the person P to each of the zone borders. As shown in figure 5.4, this step consists in checking whether the orthogonal projection (noted H) of P on the segment [AB] belongs to that segment. If H ∈ [AB], then the distance of the person to the zone is ||PH||; it is the minimal one. Otherwise, we compute the distance of P to the two extremities of the segment [AB], and the minimal distance of P to [AB] is min(d1, d2), with d1 = ||AP|| and d2 = ||PB||.
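The border method described above can be sketched as follows. This is a minimal 2D implementation under our own naming conventions (the thesis works with 3D positions projected onto the ground plane; function names and the polygon representation are our assumptions):

```python
import math

def point_segment_distance(p, a, b):
    """Border method for one border: distance from point P to segment [AB].

    If the orthogonal projection H of P onto the line (AB) falls inside
    the segment, the distance is ||PH||; otherwise it is min(||AP||, ||PB||).
    """
    ax, ay = a
    bx, by = b
    px, py = p
    abx, aby = bx - ax, by - ay
    apx, apy = px - ax, py - ay
    ab_len2 = abx * abx + aby * aby
    if ab_len2 == 0.0:  # degenerate segment: A == B
        return math.hypot(apx, apy)
    # Parameter t of the orthogonal projection H = A + t * AB
    t = (apx * abx + apy * aby) / ab_len2
    if 0.0 <= t <= 1.0:  # H lies on [AB]: distance is ||PH||
        hx, hy = ax + t * abx, ay + t * aby
        return math.hypot(px - hx, py - hy)
    # H outside [AB]: take the closest extremity, min(d1, d2)
    return min(math.hypot(px - ax, py - ay),
               math.hypot(px - bx, py - by))

def distance_to_zone(p, polygon):
    """Minimal distance of a person position p to the border of a polygonal zone."""
    n = len(polygon)
    return min(point_segment_distance(p, polygon[i], polygon[(i + 1) % n])
               for i in range(n))
```

The zone border is traversed edge by edge and the minimum over all edges is kept, which matches taking the closest border of the zone.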

• For the second step, we have to choose the exact shape of the distribution, which depends on the situation we want to model. The goal is to find a function that describes the likelihood that a person is inside a zone given the distance of the person to this zone. In probability theory terms, it is to find a probability distribution function (PDF) that maximizes the probability value when the distance dist is small (i.e. the person is inside or near the zone) and gives a minimal value when the distance dist is large (i.e. the person is far from the zone).

Figure 5.5: Gaussian probability distributions with mean 0 and different standard deviations σ. The parameter µ determines the location of the distribution, while σ determines the width of the bell curve.

The standard deviation σ shows how much variation or "dispersion" exists from the average. A low standard deviation indicates that the data points tend to be very close to the mean; a high standard deviation indicates that the data points are spread out over a large range of values.

There are different probability models proposed in the literature, and the appropriate model depends on the distribution of the outcome of interest. A very common distribution in probability theory is the Gaussian distribution (fig. 5.5). It needs only two parameters to represent the normal distribution, denoted N(µ, σ): the mean µ of the distribution and the standard deviation σ (Eq. 5.30).

The normal (Gaussian) probability model applies when the distribution of the outcome conforms reasonably well to a normal or Gaussian distribution, which resembles a bell-shaped curve. Note that the normal probability model can be used even if the distribution of the continuous outcome is not perfectly symmetrical; it just has to be reasonably close to a normal or Gaussian distribution [Sullivan and LaMorte, 2013].

N(µ, σ²) = (1 / (σ√(2π))) · exp(−(dist − µ)² / (2σ²))   (5.30)
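Equation 5.30 can be evaluated directly. A minimal sketch (the function name is our own choice, not the thesis notation):

```python
import math

def gaussian_pdf(dist, mu, sigma):
    """Gaussian density N(mu, sigma^2) evaluated at a distance (Eq. 5.30)."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((dist - mu) ** 2) / (2.0 * sigma ** 2))
```

The density is maximal at dist = µ and decays symmetrically as the distance moves away from the mean, which is the behaviour the constraint verification relies on.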

In this context, the outcomes represent the values of the distance of the person to the zone of interest. Figures 5.6, 5.7 and 5.8 are graphical representations giving a visual impression of

Figure 5.6: Histogram of the distance distribution: distance(person, reading zone).

Figure 5.7: Histogram of the distance distribution: distance(person, coffee corner zone).

the distribution of the data. These histograms show the underlying frequency distribution (shape) of the distance values, drawn as rectangles erected over discrete intervals, with an area equal to the frequency of the observations in the interval; the height of a rectangle is equal to the frequency density of the interval. They show the proportion of cases that fall into each of several categories, the categories being consecutive, non-overlapping intervals of a variable, in our context the distance variable. These figures show that the distribution of these distances fits a normal distribution.

In figure 5.6, we take a video sequence where the ground truth indicates that a person is inside a zone called 'reading zone'; at each frame, we compute and store the distance of the person to this zone. We then plot the histogram used for estimating the probability density function of the distance variable. In the same manner, we plot the histograms of the distance of a person to the zone called 'coffee corner zone' and to the zone called 'TV zone'.

Once we have decided on the density distribution function, we have to learn the parameters of

Figure 5.8: Histogram of the distance distribution: distance(person, TV zone).

this function, i.e. the Gaussian parameters (µ, σ). That is, given a sample (x1, ..., xn) from a normal N(µ, σ²) population, we would like to learn approximate values of these parameters.

In our context, to learn the Gaussian parameters, we divide the video database: two thirds for the learning database and one third for testing. For the learning dataset, which is selected to be representative of the whole video database, we compute and store the distances disti of a tracked person to each contextual zone: (dist1, ..., distn)z1, ..., (dist1, ..., distk)zi, etc. Having these samples, we compute the Gaussian parameters with the maximum-likelihood equations:

µ̂ = (1/n) Σ_{i=1..n} disti,   σ̂² = (1/n) Σ_{i=1..n} (disti − µ̂)²   (5.31)

where n is the sample size. More details on the Gaussian parameter learning step are presented in section 6.7.1.2, chapter 6.
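The parameter learning step above amounts to computing the sample mean and standard deviation of the stored distances. A minimal sketch, assuming the per-zone distances have already been collected into a list (the function name is ours):

```python
import math

def learn_gaussian_parameters(distances):
    """Maximum-likelihood estimates (mu, sigma) of a Gaussian
    from a sample of distances (Eq. 5.31)."""
    n = len(distances)
    # Sample mean: mu_hat = (1/n) * sum(dist_i)
    mu = sum(distances) / n
    # Biased (ML) variance: sigma_hat^2 = (1/n) * sum((dist_i - mu_hat)^2)
    var = sum((d - mu) ** 2 for d in distances) / n
    return mu, math.sqrt(var)
```

One pair (µ, σ) would be learned per contextual zone, from the distances observed on the learning two-thirds of the video database.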

5.6.1.2 Close to

In this section, we study the relationship a mobile object (i.e. a person) can have with a contextual object of type equipment (e.g. chair, TV), namely the spatial relation close-to. The probabilistic verification of the constraint close-to is defined in the same way as the spatial relation inside-zone. The proposed approach can be generalized and extended to other spatial relationships between mobile objects and equipment. As described above, to verify the spatial constraint, we compute the distance dist between the person and the object (i.e. the equipment), and we choose the

Figure 5.9: Computation of the distance of a person to the equipment.

Gaussian function N(µ, σ) that gives a maximal probability value (i.e. 1) when the distance is small and a minimal value (i.e. 0) when the distance is large.

• The first step consists in computing the Euclidean distance of the person to the contextual object (i.e. the equipment) (fig. 5.9). The tracked person is represented by the position of their feet, corresponding to the middle of the bottom segment of the bounding box. The equipment of interest is represented as a 2D polygon. The distance of the person to the equipment is computed in the same manner as for the spatial relation inside-zone (see section 5.6.1.1).

• For the second step, we have to choose the distribution function that describes the likelihood that a person is close to the equipment of interest given the distance of the person to this equipment. In probability theory terms, it is to find a probability distribution function (PDF) that maximizes the probability value when the distance dist is small (i.e. the person is near the equipment) and gives a minimal value when the distance dist is large (i.e. the person is far from the equipment). As described in section 5.6.1.1, we use the Gaussian distribution function. For each equipment, we compute and store the distances disti of a tracked person to each contextual equipment: (dist1, ..., distn)equipment1, ..., (dist1, ..., distk)equipmenti, etc. For the learning step, we divide the video database: two thirds for learning and one third for testing. The learning of the Gaussian parameters (µ, σ) is based on equation 5.31.

5.6.1.3 Discussion

We have discussed the approach that we propose for the probabilistic verification of spatial constraints based on the Gaussian probability. We studied the relationship a mobile object can have with the contextual object zone, the spatial relation inside-zone, and computed the Gaussian probability of this constraint, which allows us to decide whether the constraint is satisfied or not.

This approach can be generalized and extended to other spatial relationships between mobile objects and zones.

We also studied the relationship a mobile object (i.e. a person) can have with the contextual object equipment (e.g. chair, TV), the spatial relation close-to. The probabilistic verification of the constraint close-to is defined in the same way as the spatial relation inside-zone. The proposed approach can be generalized and extended to other spatial relationships between mobile objects and equipment. In the following, we discuss the posture and temporal relations and detail how we propose to deal with uncertainty.

5.6.2 Posture

We have selected a set of specific postures which are representative of typical applications in video understanding. These postures are classified in a hierarchical way. We have four general posture categories and eight detailed posture sub-categories:

- Standing postures: standing with one arm up, standing with arms along the body and T-shape posture,

- Sitting postures: sitting on a chair and sitting on the floor,

- Bending posture,

- Lying postures: lying with spread legs and lying with curled-up legs.

The human posture recognition algorithm [Boulay et al., 2007] determines the posture of the detected person using the detected silhouette and its 3D position. The 3D human model silhouettes are obtained by projecting the corresponding 3D human model onto the image plane, using the 3D position of the person and a virtual camera which has the same characteristics (position, orientation and field of view) as the real camera. A dedicated 3D engine (virtual 3D scene generator) can animate and display a 3D human model; it also extracts the generated model silhouette. Finally, the 3D human model silhouettes are compared with the detected silhouette to recognise the posture of the detected person.

The management of the uncertainty of this constraint comes from the posture recognition algorithm, based on a computed score of the most likely postures (see algorithm 5). The temporal coherence of the posture is exploited to compute the posture recognition score: the identifier of the recognized person is used to retrieve the previously detected postures in a temporal window.

Figure 5.10: Hierarchical representation of the postures of interest [Boulay et al., 2007].

Figure 5.11: The detected posture is compared with the previously detected postures to verify the temporal coherency.

These postures are then used to compute the filtered posture (i.e. the main posture) by searching for the most probable posture, which corresponds to the most frequent posture over a certain period of time (fig. 5.11).

This smoothing algorithm reduces misdetections and allows the activity recognition algorithm to benefit from reliable filtered postures; more details are available in [Boulay et al., 2007], section 5.4.

begin
    Data: postureList ← NULL /* the list which contains the occurrence count of each posture */
    for i = -windowSize to windowSize do
        postureList[detectedPosture[t + i]] += weightList[i]
    end
    return indexOfTheMaximum(postureList) /* the posture which occurs the most frequently: the filtered posture at time t */
end

Algorithm 5: Determination of the most probable posture in a temporal window of 2 ∗ windowSize + 1. The weight list weightList determines how much the i-th posture in the window of 2 ∗ windowSize + 1 contributes to the vote.
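Algorithm 5 can be sketched as a weighted vote over the temporal window. This is a minimal Python rendition under our own naming; the weight list is indexed from 0, so weightList[i] in the pseudocode corresponds to weights[i + window_size] here, and out-of-range frames at the sequence boundaries are simply skipped (an assumption, since the pseudocode does not specify boundary handling):

```python
def filtered_posture(detected_postures, t, weights, window_size):
    """Most probable posture in the window [t - window_size, t + window_size].

    detected_postures: per-frame posture labels
    weights: list of 2 * window_size + 1 vote weights
    """
    scores = {}
    for i in range(-window_size, window_size + 1):
        frame = t + i
        if 0 <= frame < len(detected_postures):  # skip frames outside the sequence
            posture = detected_postures[frame]
            scores[posture] = scores.get(posture, 0.0) + weights[i + window_size]
    # The posture with the highest accumulated weight is the filtered posture at time t
    return max(scores, key=scores.get)
```

With uniform weights this reduces to a majority vote, which matches the "most frequent posture over a certain period of time" described above; non-uniform weights let frames closer to t count more.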
