
STATE OF THE ART

2.1 Activity Recognition Approaches

2.1.2 Description-based Approaches

2.1.2.1 Constraint-based approaches

Figure 2.9: Allen’s temporal relations.

Early work on constraint-based recognition introduced the notion of chronicles, undirected constraint graphs describing the temporal constraints between atomic sub-events [Ghallab, 1996]. Techniques based on constraint resolution are among the most sophisticated event recognition techniques to date. Rota and Thonnat [Rota and Thonnat, 2000] represented an activity (i.e. a scenario) by a set of positive (+) and/or negative (−) variables corresponding, at each instant t, to the detection of individuals, equipment and instantaneous recognised events. A positive variable (x_i : +) corresponds to an expected object/event, whereas a negative variable (x_i : −) corresponds to an object/event that is not allowed to occur during the recognition of a given activity. These variables are linked by a set of conditions (c_j) corresponding to temporal and/or non-temporal constraints. Each constraint is a boolean predicate involving these variables. A constraint is called a negative constraint if it involves at least one negative variable; otherwise, it is called a positive constraint. Figure 2.10 shows an example of the ‘a person is far from an equipment’ scenario represented by Rota and Thonnat [Rota and Thonnat, 2000].

The authors define the recognition problem as a boolean problem:

P₀(M, A, F) (2.21)

where F is a set of facts, A is an ordered subset {f₁, ...f_k} of F and M is an event model.

A fact is a structured object defined by seven sets of attributes: name, type, date, geometry, velocity, properties and reference (see Figures 2.10 and 2.11).

A = {f₁, ..., f_k} is a solution of P₀ if and only if:

∃x_{k+1} ∈ F, ..., ∃x_n ∈ F  c_j(f₁, ..., f_k) = true  ∀j ∈ {1, ..., p} (2.22)

A scenario is recognized if all its positive constraints are satisfied and its negative constraints are not satisfied.
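The existential test in (2.22) can be illustrated with a brute-force search. The following is only a minimal sketch, not the authors' algorithm: the reduced `Fact` class (four of the seven attribute sets), the `FAR` threshold, the example facts and the assumption that each constraint is evaluated on all n variables (the k assigned facts plus the n − k free ones) are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from itertools import product
from math import dist


@dataclass(frozen=True)
class Fact:
    # Reduced, hypothetical fact: only four of the seven attribute sets.
    name: str
    type: str
    date: int
    position: tuple


def is_solution(assigned, facts, n_vars, constraints):
    """Return True if the free variables can be bound to facts so that
    every constraint holds (brute-force reading of equation 2.22)."""
    n_free = n_vars - len(assigned)
    for free in product(facts, repeat=n_free):
        binding = (*assigned, *free)
        if all(c(*binding) for c in constraints):
            return True
    return False


FAR = 5.0  # hypothetical distance threshold, in metres

# Hypothetical constraints for a 'person is far from an equipment' model.
constraints = [
    lambda p, e: p.type == "person",
    lambda p, e: e.type == "equipment",
    lambda p, e: p.date == e.date,
    lambda p, e: dist(p.position, e.position) > FAR,
]

facts = [
    Fact("p1", "person", 0, (0.0, 0.0)),
    Fact("desk", "equipment", 0, (8.0, 0.0)),
    Fact("printer", "equipment", 0, (1.0, 1.0)),
]

far_with_desk = is_solution((facts[0],), facts, 2, constraints)
far_without_desk = is_solution((facts[0],), [facts[0], facts[2]], 2, constraints)
```

Here A contains the single fact `p1` and the second variable is left free; the search succeeds only when some equipment fact lies beyond the threshold. The sketch does not model negative variables.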

Figure 2.10: An example of the ‘a person is far from an equipment’ scenario represented by Rota and Thonnat [Rota and Thonnat, 2000].

Figure 2.11: Hierarchy of facts [Rota and Thonnat, 2000].

The authors in [Dechter et al., 1991] have presented activity recognition as a temporal constraint satisfaction problem. Their framework involves a set of variables X = {X₁, ..., X_n} and a set of constraints. Each variable represents a time point and each constraint is represented as a set of intervals:

{I₁, ..., I_n} = {[a₁, b₁], ..., [a_n, b_n]} (2.23)

A unary constraint on a variable X_i restricts the value of X_i to a set of intervals; namely, it represents the disjunction:

(a₁ ≤ X_i ≤ b₁) ∨ ... ∨ (a_n ≤ X_i ≤ b_n) (2.24)

A binary constraint on two variables X_i and X_j restricts the permissible values of the distance X_j − X_i; it represents the disjunction:

(a₁ ≤ X_j − X_i ≤ b₁) ∨ ... ∨ (a_n ≤ X_j − X_i ≤ b_n) (2.25)

A network of binary constraints consists of a set of variables X₁, ..., X_n and a set of unary and binary constraints. It is represented by a ‘temporal constraint graph’, where nodes represent variables and edges correspond to constraints. The edges are labelled by sets of intervals. A tuple X = (x₁, ..., x_n) is called a solution if the assignment {X₁ = x₁, ..., X_n = x_n} satisfies all the constraints. A value ν is a feasible value for variable X_i if there is a solution in which X_i = ν.
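The definitions of solution and feasible value can be made concrete with a small sketch. It assumes a hypothetical three-variable network with times in minutes, and it enumerates a coarse discrete grid by brute force, which is purely illustrative and is not the resolution method of [Dechter et al., 1991]. The function names and the dictionary encoding of constraints are assumptions.

```python
from itertools import product


def in_any(value, intervals):
    """True if value lies in at least one closed interval [a, b]."""
    return any(a <= value <= b for a, b in intervals)


def is_tcsp_solution(x, unary, binary):
    """Check a tuple x against unary constraints {i: intervals} on x[i]
    and binary constraints {(i, j): intervals} on x[j] - x[i]."""
    return (all(in_any(x[i], ivs) for i, ivs in unary.items())
            and all(in_any(x[j] - x[i], ivs) for (i, j), ivs in binary.items()))


def feasible_values(i, domain, n, unary, binary):
    """Values of variable i that appear in at least one solution,
    found by exhaustive search over a finite domain (illustration only)."""
    return sorted({x[i] for x in product(domain, repeat=n)
                   if is_tcsp_solution(x, unary, binary)})


# Hypothetical network: X0 is the origin, X1 occurs 10 to 20 minutes after X0,
# and X2 occurs either 30 to 40 or 60 to 70 minutes after X1.
unary = {0: [(0, 0)]}
binary = {(0, 1): [(10, 20)], (1, 2): [(30, 40), (60, 70)]}

ok = is_tcsp_solution((0, 15, 50), unary, binary)
bad = is_tcsp_solution((0, 15, 65), unary, binary)
feasible_x2 = feasible_values(2, range(0, 100, 10), 3, unary, binary)
```

The tuple (0, 15, 65) is rejected because X2 − X1 = 50 falls in the gap between the two intervals, which is exactly the effect of the disjunction in (2.25).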

Figure 2.12 illustrates a constraint satisfaction graph.

Figure 2.12: A constraint satisfaction problem model and the corresponding graph [Dechter et al., 1991]. This graph involves five variables: x₀, the starting time of the problem, whose chosen value is 7:00 am; x₁ and x₂, respectively the times when John left home and arrived at work; and x₃ and x₄, respectively the times when Fred left home and arrived at work. There are five constraints on these five variables, corresponding to the time that each person takes to go to work.

Chleq and Thonnat [Chleq and Thonnat, 1996] have represented a scenario as a set of independent positive/negative instantaneous events. The events composing a scenario are related by temporal constraints. The algorithm incrementally recognizes pre-defined scenarios representing human behaviors in the observed scene. For each scenario, a graph is built: the vertices represent the time point variables and the edges correspond to the temporal relations.

In [Vu et al., 2002], [Vu et al., 2003a], the authors first propose a language to describe scenario models and second a temporal constraint resolution approach to recognize scenario occurrences in real time. They represent a scenario model with the list of the actors involved in the scenario and a set of constraints on these actors. An actor can be a person detected as a mobile object by the recognition process or a static object of the observed environment. A person is represented by his/her characteristics: position in the observed environment, width, velocity, etc. A static object of the environment is defined as a priori knowledge and can be either a zone of interest (a 2D polygon such as the entrance zone) or a piece of equipment (a 3D object such as a desk) (Figures 2.13 and 2.14).

Figure 2.13: Five types of entities are classified into ‘scenario’ and ‘scene-object’ in [Vu et al., 2002].

Figure 2.14: An example of the model ‘close to’: a person is close to an equipment [Vu et al., 2002].

The event recognition process consists in mapping the set of constraints to a temporal constraint network and determining whether the video sequence satisfies these constraints (Figures 2.15 and 2.16). The first step of the event recognition process is to compute which events (i.e. scenarios) can be recognized at the current time. A scenario which can be recognized is called a ‘trigger’. Once a scenario has been recognized, all the more complex scenarios that end with it are added to the list of triggers. For example, once the scenario ‘close-to’ has been recognized, it becomes possible for the scenario ‘moves-close-to’ to be recognized. To recognize a given scenario, the algorithm selects an actor for each actor variable and checks whether the selected actors satisfy the constraints defined within the scenario model.
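The two mechanisms described above, actor selection and trigger propagation, can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of Vu et al.: the dictionary encoding of scenarios and actors, the `ends_with` table, the 2.0 distance threshold and the use of a speed attribute in place of real temporal constraints are all hypothetical.

```python
from itertools import permutations
from math import dist


def recognize(scenario, actors):
    """Try every assignment of distinct actors to the actor variables and
    return the first binding that satisfies all constraints, else None."""
    variables = scenario["variables"]
    for combo in permutations(actors, len(variables)):
        binding = dict(zip(variables, combo))
        if all(c(binding) for c in scenario["constraints"]):
            return binding
    return None


def run_triggers(initial, scenarios, ends_with, actors):
    """Process the trigger list; each recognized scenario adds the more
    complex scenarios that end with it to the list of triggers."""
    recognized, queue, seen = [], list(initial), set(initial)
    while queue:
        name = queue.pop(0)
        if recognize(scenarios[name], actors) is None:
            continue
        recognized.append(name)
        for nxt in ends_with.get(name, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return recognized


# Hypothetical constraints on a binding {"p": actor, "e": actor}.
def is_person(b): return b["p"]["type"] == "person"
def is_equipment(b): return b["e"]["type"] == "equipment"
def is_near(b): return dist(b["p"]["pos"], b["e"]["pos"]) < 2.0
def is_moving(b): return b["p"]["speed"] > 0.0


scenarios = {
    "close_to": {"variables": ["p", "e"],
                 "constraints": [is_person, is_equipment, is_near]},
    "moves_close_to": {"variables": ["p", "e"],
                       "constraints": [is_person, is_equipment, is_near, is_moving]},
}
ends_with = {"close_to": ["moves_close_to"]}

actors = [
    {"name": "person1", "type": "person", "pos": (1.0, 0.0), "speed": 0.5},
    {"name": "desk", "type": "equipment", "pos": (0.0, 0.0), "speed": 0.0},
]

found = run_triggers(["close_to"], scenarios, ends_with, actors)
```

Enumerating all actor assignments grows quickly with the number of actor variables, which gives a feel for why the recognition step is costly without further optimization.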

The algorithms proposed in the literature to solve the recognition problem are generally computationally intractable (NP-hard). To overcome this problem, Vu et al. [Vu et al., 2003a], [Vu et al., 2003b] have proposed a scenario model pre-compilation step and achieved a speed-up of the algorithm that allows it to be used in real-time surveillance applications. This step consists in decomposing the event model into a set of simple event models containing at most two sub-events. However, their recognition approach is deterministic.
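The idea of decomposing an event model into models with at most two sub-events can be illustrated with a short sketch. The left-nested chaining used here, the `name#i` naming of intermediate models and the sub-event labels are assumptions made for illustration; the sketch also ignores how the temporal constraints are redistributed among the generated models, which the actual pre-compilation has to handle.

```python
def precompile(name, sub_events):
    """Decompose a scenario with an ordered list of sub-events into a
    dictionary of models, each having at most two sub-events."""
    if len(sub_events) <= 2:
        return {name: tuple(sub_events)}
    models = {}
    left = sub_events[0]
    # Chain the sub-events pairwise: ((e1, e2), e3), ... up to the last one.
    for i, right in enumerate(sub_events[1:-1], start=1):
        intermediate = f"{name}#{i}"
        models[intermediate] = (left, right)
        left = intermediate
    models[name] = (left, sub_events[-1])
    return models


# Hypothetical sub-event labels for a four-step scenario.
steps = ["employee_at_position", "robber_enters",
         "robber_at_counter", "both_at_safe"]
compiled = precompile("bank_attack", steps)
```

With four sub-events the sketch produces three models, `bank_attack#1`, `bank_attack#2` and `bank_attack`, so that recognizing any one model only requires combining two already recognized components.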

Figure 2.15: Four steps of the ‘Bank attack’ scenario [Vu et al., 2004]: (1) at time t₁, the employee is at his position behind the counter; (2) at time t₂, the robber enters by the entrance and the employee is still at his position; (3) at time t₃, the robber moves to the front of the counter and the employee is still at his position; and (4) at time t₄, both of them arrive at the safe door.

Figure 2.16: Hierarchical model of the ‘Bank attack’ scenario [Vu et al., 2004]. This scenario is composed of five parts: a set of physical objects corresponding to the employee and the robber, a set of temporal variables (components), an empty set of forbidden scenarios, a set of constraints (‘before’, ‘during’ and ‘finish’ [Allen, 1983]) and an alert.

The advantage of constraint-based approaches is that knowledge can be formulated as an ontology for a particular activity domain and easily reused for different application domains [Thonnat and Rota, 1999]. They allow events to be described in a declarative language close to natural language, which human experts can easily define and modify. One drawback is that the event modeling step often requires considerable effort to describe all the events of interest. One solution to this problem is to reuse existing ontologies and to use learning techniques to learn event models automatically. Another limitation is that these approaches do not cope with the uncertainty of recognition.

Figure 2.17: An example of a Petri Net [Haas, 2002].
