Probability Computation - Bayesian Probability Theory

PROBABILISTIC ACTIVITY RECOGNITION

5.3 Probabilistic Primitive state Recognition

5.3.1 Bayesian Probability Theory

5.3.2.1 Probability Computation

In this section, we specify how to compute each term of equation (5.10). We estimate the probability distributions from the available training data. The estimation of probability distributions is a very large and complex domain. We describe here the method most widely used in the framework of Bayesian networks: maximum likelihood estimation. When all variables are observed, the simplest and most widely used statistical estimate of the probability of an event is the frequency of occurrence of that event in the database. This approach is called maximum likelihood (ML) estimation [Leray, 2006]:

$$\hat{P}(X_i = x_k \mid Y_j = y_e) = \hat{\theta}^{ML}_{i,j,k,e} = \frac{N_{i,j,k,e}}{\sum_{k} N_{i,j,k,e}} \qquad (5.11)$$

where N_i,j,k,e is the number of events in the database for which the variable X_i is in state x_k and its parents Y_j are in the configuration y_e.
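As an illustration, the frequency counting behind (5.11) can be sketched in Python. This is a minimal sketch under simplifying assumptions (a single parent variable, fully observed records); the function name and the toy records are hypothetical and are not taken from the system described here.

```python
from collections import Counter


def ml_conditional_probabilities(samples):
    """Estimate P(X = x | Y = y) by relative frequency.

    `samples` is an iterable of fully observed (x, y) records.
    Returns a dict mapping (x, y) to N(x, y) / sum over x' of N(x', y).
    """
    samples = list(samples)
    joint_counts = Counter(samples)                 # N(x, y)
    parent_counts = Counter(y for _, y in samples)  # sum over x of N(x, y)
    return {(x, y): n / parent_counts[y] for (x, y), n in joint_counts.items()}


# Hypothetical toy database of (posture, zone) records.
records = [
    ("standing", "entrance"), ("standing", "entrance"), ("sitting", "entrance"),
    ("sitting", "tv"), ("sitting", "tv"), ("sitting", "tv"), ("standing", "tv"),
]
theta = ml_conditional_probabilities(records)
print(theta[("sitting", "tv")])  # 3 of the 4 "tv" records, i.e. 0.75
```

Configurations that never occur in the database simply do not appear in the returned dictionary; how such unseen configurations should be handled is outside the scope of this sketch.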

• The first probability to learn is P(e ∈ Ω):

P(e ∈ Ω) is the prior probability that a certain scenario model Ω is detected. We can assume that all scenarios in a given universe are equally probable, so as not to favor a scenario merely because it happens more often. By scenario models in the same universe, we mean a set of scenarios that are mutually exclusive (if any one of them is happening, the others cannot occur) and exhaustive (they cover all possible situations, so that in any observation one of them must be happening). For example, the universe of scenario models that describe a person's posture is: (PersonStanding, PersonSitting and PersonBending). The universe of scenario models that describe a person's position in a certain area is: (PersonInsideZoneTV, PersonInsideZoneEntrance, PersonInsideZoneUseReadingTable,

This prior probability has to be learned from a training set of video sequences, which has to be large and representative (e.g. recorded under the same illumination conditions).

• The second probability to be computed is P(ζ(Ω, O) | e ∈ Ω):

P(ζ(Ω, O) | e ∈ Ω) is the probability that the constraints of the event model are verified given that the event e is true (i.e. has been annotated as an instance of the event model Ω). This probability quantifies how likely it is that a constraint of event model Ω is verified when an instance of Ω is taking place. In this section we explain how we compute this probability: we present a first formulation (5.13), and then detail how, based on additional information, we modify it to obtain the final equation (5.14). The first formulation of this probability is as follows:

$$P(\zeta(\Omega, O) \mid e \in \Omega) = \prod_{i=1}^{n} \frac{\sharp(\zeta(\Omega, O)_i \wedge e \in \Omega)}{\sharp(e \in \Omega)} \qquad (5.13)$$

where n is the total number of constraints being considered and ♯(a) is the number of frames where a is verified. The term ♯(ζ(Ω, O)_i ∧ e ∈ Ω) implies that only frames where event e has been identified (i.e. annotated) as an instance of Ω are considered; for each constraint of event model Ω, the number of such frames where the constraint is satisfied is counted. ♯(e ∈ Ω) is the total number of frames where the event e is annotated.

Conditional independence among the constraints is assumed. Thus, the joint probability of all the constraints being considered is computed as the product of the probabilities of the individual constraints.
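The first formulation (5.13) amounts to counting frames and multiplying per-constraint frequencies. The following Python sketch illustrates this under assumptions of our own: each frame is represented as a dictionary with a hypothetical `annotated` flag (event e annotated as an instance of Ω) and a hypothetical `verified` list holding one boolean per constraint. It is not the implementation used in this work.

```python
from math import prod


def constraint_likelihood(frames, n_constraints):
    """Product over constraints i of #(constraint_i verified and e annotated) / #(e annotated)."""
    annotated = [f for f in frames if f["annotated"]]
    if not annotated:
        raise ValueError("no annotated frames to learn from")
    total = len(annotated)  # #(e in Omega)
    return prod(
        sum(f["verified"][i] for f in annotated) / total
        for i in range(n_constraints)
    )


# Hypothetical toy data: four annotated frames, two constraints.
frames = [
    {"annotated": True,  "verified": [True,  True]},
    {"annotated": True,  "verified": [True,  False]},
    {"annotated": True,  "verified": [True,  True]},
    {"annotated": True,  "verified": [False, True]},
    {"annotated": False, "verified": [True,  True]},  # ignored: e is not annotated here
]
p_constraints = constraint_likelihood(frames, n_constraints=2)
print(p_constraints)  # (3/4) * (3/4) = 0.5625
```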

It is important that the frames where the scenario is said to be identified be annotated (manually or automatically). Since we want to assess how much the verification of the constraint affects the event's detection, the cases where the event is present but the constraint is not verified must be identified. It is necessary to determine the event instances that are in the ground truth (GT) but are not detected, and to assess what is causing this failure in detection.

The physical objects that intervene in the constraint should be taken into account in the equation. Otherwise, a failure to detect the physical objects involved in the constraint might result in low probabilities for that constraint, when the problem is in fact due to a failure of physical object detection. In other words, the equation as it stands penalizes both of the following cases:

- The physical object is detected but the constraint is not verified,

- The physical object is not detected, so the constraint cannot be verified.

To take into account the intervention of the physical objects in the calculation of this probability, the final equation (5.14) to compute the probability P(ζ(Ω, O) | e ∈ Ω) becomes:

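$$P(\zeta(\Omega, O) \mid e \in \Omega) = \prod_{i=1}^{n} \frac{\sharp(\zeta(\Omega, O)_i \wedge e \in \Omega)}{\sharp(e \in \Omega \wedge V_\Omega^{\zeta_i} \in po^O_e)} \qquad (5.14)$$

where V_Ω^ζⁱ are the physical object variables that intervene in the constraint ζ_i.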

The term ♯(e ∈ Ω ∧ V_Ω^ζⁱ ∈ po^O_e) indicates that we only consider the frames of the ground truth where the event e is annotated and the physical objects are correctly tracked. Frames of the ground truth where the physical objects are not correctly tracked are not taken into account.
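The effect of restricting the denominator to correctly tracked frames, as in (5.14), can be illustrated with a short Python sketch. The frame representation is hypothetical: `tracked[i]` states whether the physical objects involved in constraint i are correctly tracked in that frame, and `verified[i]` whether constraint i holds. The sketch also assumes that a constraint is never marked as verified in a frame where its objects are not tracked.

```python
from math import prod


def constraint_likelihood_tracked(frames, n_constraints):
    """Per-constraint ratio: #(verified and annotated) / #(annotated and objects tracked)."""
    annotated = [f for f in frames if f["annotated"]]
    factors = []
    for i in range(n_constraints):
        # Denominator: annotated frames where the objects of constraint i are tracked.
        usable = sum(f["tracked"][i] for f in annotated)
        if usable == 0:
            raise ValueError(f"objects of constraint {i} are never tracked")
        # Numerator: annotated frames where constraint i is verified.
        verified = sum(f["verified"][i] for f in annotated)
        factors.append(verified / usable)
    return prod(factors)


# Hypothetical toy data: one constraint, four annotated frames.
frames = [
    {"annotated": True, "tracked": [True],  "verified": [True]},
    {"annotated": True, "tracked": [True],  "verified": [True]},
    {"annotated": True, "tracked": [False], "verified": [False]},  # tracking failure
    {"annotated": True, "tracked": [True],  "verified": [False]},  # genuine violation
]
p_tracked = constraint_likelihood_tracked(frames, n_constraints=1)
# 2 verified frames out of 3 tracked frames; dividing by all 4 annotated
# frames instead would also penalize the tracking failure.
```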

• The third probability to be computed is P(V_Ω = po^O_e | e ∈ Ω):

P(V_Ω = po^O_e | e ∈ Ω) is the probability that the physical object variables in the event model Ω have been detected given that e is an event instance of the event model Ω. As for the previous factor, conditional independence among the physical objects is assumed and the probabilities obtained for each physical object are multiplied. This probability quantifies the likelihood of detecting a physical object when an event that involves it is actually happening. It is computed online and provided by the tracking algorithm described in [Chau, 2012]. The algorithm in [Chau, 2012] computes the reliability of the trajectory of each mobile object po_t detected at instant t. For each object detected at instant t, the algorithm considers all its matched objects po_t−n (i.e. objects with temporarily established links) in previous frames that do not yet have official links (i.e. trajectories) to any objects detected in the following frames. For such an object pair

where w_k is the weight of descriptor k. The descriptors are the 2D and 3D positions, 2D shape ratio, 2D area, color histogram, histogram of oriented gradients (HOG), color covariance and dominant color. GS_k(po_t, po_t−n) is the global score for descriptor k between po_t and po_t−n, defined as a function of the link score and the long-term score for descriptor k as follows:

$$GS_k(po_t, po_{t-n}) = (1 - \beta)\, DS_k(po_t, po_{t-n}) + \beta\, LT(po_t, \chi_{t-n}) \qquad (5.16)$$

where β is the weight of the long-term score, DS_k(po_t, po_t−n) is the similarity score for descriptor k between the two objects po_t and po_t−n, and LT(po_t, χ_t−n) is their long-term score (for more detail see [Chau, 2012], sections 6.1.1.1, 6.1.2.2 and 6.1.2.3).
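Equation (5.16) is a weighted combination of two scores, which can be written directly in Python. The sketch below assumes that β lies in [0, 1], which the text above does not state explicitly; the function and parameter names are hypothetical, and the computation of DS_k and LT themselves is left to [Chau, 2012].

```python
def global_score(descriptor_score, long_term_score, beta):
    """Combine the similarity score DS_k and the long-term score LT as in (5.16)."""
    # Assumption: beta is a mixing weight between 0 and 1.
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is assumed to lie in [0, 1]")
    return (1.0 - beta) * descriptor_score + beta * long_term_score


# Hypothetical values: DS_k = 0.8, LT = 0.4, long-term weight beta = 0.25.
gs = global_score(0.8, 0.4, beta=0.25)  # 0.75 * 0.8 + 0.25 * 0.4
```

With β = 0 the global score reduces to the similarity score alone, and with β = 1 to the long-term score alone.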

However, in the case where the tracking algorithm does not provide this probability value, it can alternatively be computed with the following equation:

$$P(V_\Omega = po^O_e \mid e \in \Omega) = \prod_{k=1}^{m} \frac{\sharp(V_\Omega^k \in po^O_e \wedge e \in \Omega)}{\sharp(e \in \Omega)} \qquad (5.17)$$

where m is the number of physical objects that intervene in the constraints of Ω. ♯(V_Ω^k ∈ po^O_e ∧ e ∈ Ω) denotes the number of frames, among all frames where e is annotated, in which the k-th physical object is correctly tracked. The training set is composed of the frames where the event e is annotated. ♯(e ∈ Ω) is the total number of frames where the event e is annotated.
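The alternative computation (5.17) follows the same counting pattern, this time over physical objects. The Python sketch below uses a hypothetical frame representation in which `tracked[k]` states whether the k-th physical object is correctly tracked in that frame; it is an illustration only, not the implementation used in this work.

```python
from math import prod


def object_detection_likelihood(frames, n_objects):
    """Product over objects k of #(object k tracked and e annotated) / #(e annotated)."""
    annotated = [f for f in frames if f["annotated"]]
    if not annotated:
        raise ValueError("no annotated frames to learn from")
    total = len(annotated)  # #(e in Omega)
    return prod(
        sum(f["tracked"][k] for f in annotated) / total
        for k in range(n_objects)
    )


# Hypothetical toy data: four annotated frames, two physical objects.
frames = [
    {"annotated": True, "tracked": [True, True]},
    {"annotated": True, "tracked": [True, False]},
    {"annotated": True, "tracked": [True, True]},
    {"annotated": True, "tracked": [True, False]},
]
p_objects = object_detection_likelihood(frames, n_objects=2)
print(p_objects)  # (4/4) * (2/4) = 0.5
```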

• P(ζ(Ω, O), V_Ω = po^O_e) is the denominator of Bayes' rule. This probability is obtained by integration (summation) over all the hypotheses. In this case, the first hypothesis is that a certain event instance e of event model Ω is detected; the other hypothesis is that an event instance ¬e of event model Ω′ is detected. This probability is computed with the following equation (5.18):

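$$P(\zeta(\Omega, O), V_\Omega = po^O_e) = P(\zeta(\Omega, O), V_\Omega = po^O_e \mid e \in \Omega) \times P(e \in \Omega) + P(\zeta(\Omega, O), V_\Omega = po^O_e \mid e \in \neg\Omega) \times P(e \in \neg\Omega) \qquad (5.18)$$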

♯((ζ(Ω, O), V_Ω = po^O_e) ∧ e ∈ Ω) corresponds to the number of frames where the constraint is verified and the physical object is detected, considering only the frames where the event e is annotated. P(e ∈ Ω) is the prior probability that the event e of event model Ω is detected. This prior is computed using equation (5.12).

♯((ζ(Ω, O), V_Ω = po^O_e) ∧ e ∈ ¬Ω) corresponds to the number of frames where the constraint is verified and the physical object is detected, considering the frames where the other events of the same universe as e are annotated. The computation of this probability is detailed in the next section.
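The summation over the two hypotheses can be sketched in Python as follows. This is a minimal sketch assuming exactly two hypotheses (the event e of model Ω, and the other events of the same universe taken together) and that the two likelihoods and the two priors have already been estimated; all names and numeric values are hypothetical.

```python
def evidence(lik_event, prior_event, lik_other, prior_other):
    """Denominator of Bayes' rule: sum of likelihood * prior over both hypotheses."""
    return lik_event * prior_event + lik_other * prior_other


def posterior(lik_event, prior_event, lik_other, prior_other):
    """Posterior probability of the event hypothesis given the observation."""
    denominator = evidence(lik_event, prior_event, lik_other, prior_other)
    if denominator == 0.0:
        raise ValueError("the observation has zero probability under both hypotheses")
    return lik_event * prior_event / denominator


# Hypothetical values: equal priors; the observation is four times more
# likely when the event is taking place than when it is not.
p_obs = evidence(0.5, 0.5, 0.125, 0.5)
p_event = posterior(0.5, 0.5, 0.125, 0.5)
print(p_obs)    # 0.25 + 0.0625 = 0.3125
print(p_event)  # 0.25 / 0.3125 = 0.8
```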

In the document The DART-Europe E-theses Portal (pages 122-126)