
Event Perception

8.2 Perceptual Processes

Having presented the ontology which Abigail projects onto the world, it is now possible to describe the process by which she perceives support, contact, and attachment relations between objects in the movie.

Recall that Abigail has no prior knowledge about the types or delineation of objects in the world. She interprets any set of figures connected by joints as an object. To do so, she must know which figures are joined. Not being given that information as input, her first task is to form a model of the image that describes which figures are joined. Since the attachment status of figures may change from frame to frame as the movie unfolds, she must repeat the analysis which derives the joint model as part of the processing for each new frame. The ontology which Abigail projects onto an image includes a layer model in addition to a joint model. Since Abigail is given only two-dimensional information as input, she must infer information about the third dimension in the form of layer assertions in the layer model.

Again, since figures can move from layer to layer during the course of the movie, Abigail must update both the layer and joint models on a per-frame basis. Thus Abigail performs two stages of processing for each frame. In the first stage she updates the joint and layer models for the image. The derived joint model delineates the objects which appear in the image. In the second stage she uses the derived joint and layer models to recover support, contact, and attachment relations between the perceived objects.

The architecture used by Abigail to process each movie frame is depicted in figure 8.2. The architecture takes as input the positions, orientations, shapes, and sizes of the figures constituting the image, along with a joint and layer model for the image. The architecture updates this joint and layer model, groups the figures into objects, and recovers support, contact, and attachment relations between those objects.

Central to the event perception architecture is an imagination capacity which encodes naive physical knowledge such as the substantiality, continuity, gravity, and ground plane constraints.

8.2.1 Deriving the Joint and Layer Models

As Abigail watches the movie, she continually maintains both a joint model $J$ and a layer model $L$.

At the start of the movie, these models are empty, containing no joints and no layer assertions. After each frame of the movie, Abigail looks for evidence in the most recent frame that the joint and layer models should be changed. Most of the evidence requires that Abigail hypothesize potential changes and then imagine the effect of these changes on the world. Abigail assumes that the world is for the most part stable. Objects are typically supported. She considers an unstable world with unsupported objects to be less likely than a stable one. If the world is unstable when imagined without making the hypothesized changes, then these hypothesized changes are adopted as permanent changes to the joint and layer models. This facet of Abigail's perceptual mechanism is not justified by any experimental evidence from human perception but simply appears to work well in practice.

Abigail's preference for a stable world requires that, to the extent possible, all objects be supported.

There are two ways to prevent an object from falling. One is for it to be joined to some other supported figure. The other is for it to be supported by another figure. One figure can support another figure only if they are on the same layer, since support happens as a consequence of the need to avoid substantiality violations and substantiality holds only between two figures on the same layer.

Abigail's imagination capacity is embodied in a kinematic simulator. This simulator can predict how a set of figures will behave under the effect of gravity, given particular joint and layer models, such that naive physical constraints such as substantiality are upheld. This imagination capacity, denoted as $I(F, J, L)$, will be described in detail in chapter 9. The processes described here treat this capacity as modular. Any simulation mechanism that accurately models gravity and substantiality will do. The event perception processes simply call $I(F, J, L)$ with different values of $F$, $J$, and $L$, asking different questions of the predicted future, in the process of updating the joint and layer models and recovering support relations.⁶

[Figure 8.2 diagram labels: Imagination Capacity (substantiality, continuity, gravity, ground plane); figures, layer assertions, joints; objects, support, contact, attachment]

Figure 8.2: The event perception architecture incorporated into Abigail. The architecture takes as input the positions, orientations, shapes, and sizes of the figures constituting the image, along with a joint and layer model for the image. The architecture updates this joint and layer model, groups the figures into objects, and recovers support, contact, and attachment relations between those objects. Central to the event perception architecture is an imagination capacity which encodes naive physical knowledge such as the substantiality, continuity, gravity, and ground plane constraints.

Abigail can change the joint and layer models in six different ways to keep those models synchronized with the world. She can

add a layer assertion to $L$,

remove a layer assertion from $L$,

add a joint to $J$,

remove a joint from $J$,

promote a parameter of some joint $j \in J$ from flexible to rigid,

demote a parameter of some joint $j \in J$ from rigid to flexible,

or perform any simultaneous combination of the above changes. Each type of change is motivated by particular evidence in the most recent movie frame, potentially mediated by the imagination process.
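The six kinds of change listed above can be pictured as operations on two small data structures. The following Python sketch is only an illustration under stated assumptions: the class names, the parameter names `rotation`, `disp_f`, and `disp_g`, the encoding of a flexible parameter as `None`, and the newest-first ordering of the layer model are all hypothetical choices, not details given in the text.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class LayerAssertion:
    f: str
    g: str
    same_layer: bool  # True: f and g share a layer; False: they differ


@dataclass(frozen=True)
class Joint:
    f: str
    g: str
    # Hypothetical parameter names. None means flexible, a number means rigid.
    rotation: Optional[float] = None
    disp_f: Optional[float] = None
    disp_g: Optional[float] = None


def add_layer_assertion(L: List[LayerAssertion], a: LayerAssertion) -> List[LayerAssertion]:
    # Assumed convention: newer assertions go first in the ordered set.
    return [a] + [x for x in L if x != a]


def remove_layer_assertion(L: List[LayerAssertion], a: LayerAssertion) -> List[LayerAssertion]:
    return [x for x in L if x != a]


def add_joint(J: FrozenSet[Joint], j: Joint) -> FrozenSet[Joint]:
    return J | {j}


def remove_joint(J: FrozenSet[Joint], j: Joint) -> FrozenSet[Joint]:
    return J - {j}


def promote(J: FrozenSet[Joint], j: Joint, name: str, value: float) -> FrozenSet[Joint]:
    # Flexible -> rigid: fix the named parameter at the given value.
    return (J - {j}) | {replace(j, **{name: value})}


def demote(J: FrozenSet[Joint], j: Joint, name: str) -> FrozenSet[Joint]:
    # Rigid -> flexible: forget the fixed value of the named parameter.
    return (J - {j}) | {replace(j, **{name: None})}
```

Keeping the operations pure (each returns a new model) makes it easy to try a hypothesized change, imagine its consequences, and discard it if it is not needed.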

Abigail makes three types of changes to the layer model on the basis of evidence gained from watching each movie frame. The process can be stated informally as follows. She will add an assertion that two figures are on different layers whenever they overlap, since if they were not on different layers, substantiality would be violated. She will add an assertion that two figures are on the same layer whenever one of the figures must support the other in order to preserve the stability of the image.

Finally, whenever newer layer assertions contradict older layer assertions, the older ones are removed from the layer model, giving preference to newer evidence. For example, when presented with the image from figure 6.1, Abigail will infer that the ball and the table top are on the same layer since the ball would fall if it were not supported by the table top.

The process of updating the layer model can be stated more precisely as follows. A layer model consists of an ordered set $L$ of layer assertions. Initially, at the start of the movie, this set is empty. The closure of a layer model is the layer model augmented with all of the layer assertions entailed by the equality axioms. A layer model is consistent if its closure does not simultaneously imply that two figures are on the same, as well as different, layers. Abigail never replaces the layer model with its closure.

She always maintains the distinction between layer assertions that have been added to the model as a result of direct evidence, in contrast to those which have been derived by closure. A maximal consistent subset of a layer model $L$ is a consistent subset $L'$ of $L$ such that any other subset $L''$ of $L$ that is a superset of $L'$ is inconsistent. The lexicographic maximal consistent subset of a layer model $L$ is the particular maximal consistent subset of $L$ returned by the following procedure.

procedure Maximal Consistent Subset($L$)
    $L' \leftarrow \{\}$;
    for $a \in L$
    do if $L' \cup \{a\}$ is consistent
       then $L' \leftarrow L' \cup \{a\}$ od;
    return $L'$
end

⁶ As discussed in chapter 9, the imagination capacity $I(F, J, L, P)$ takes a predicate $P$ as its fourth parameter. In informal presentations, it is simpler to omit this parameter and use the English gloss '$P$ occurs during $I(F, J, L)$' in place of $I(F, J, L, P)$.

The above procedure may not find the largest possible maximal consistent subset. Finding the largest such subset has been shown to be NP-hard by Wolfram (1986). Using the above heuristic has proven adequate in practice.
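The greedy procedure can be sketched in Python as follows. This is a minimal illustration, not the original implementation: it assumes a layer assertion is encoded as a tuple `(f, g, same_layer)`, and it assumes that consistency can be checked by merging same-layer figures into equivalence classes (a union-find structure) and then testing each different-layer assertion against those classes.

```python
from typing import Dict, List, Tuple

# (f, g, same_layer): True asserts f and g share a layer, False that they differ.
Assertion = Tuple[str, str, bool]


def is_consistent(assertions: List[Assertion]) -> bool:
    """True if the closure never puts two figures on the same and on different layers."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Close the same-layer assertions under the equality axioms by merging classes.
    for f, g, same in assertions:
        if same:
            parent[find(f)] = find(g)
    # A different-layer assertion between members of one class is a contradiction.
    return all(find(f) != find(g) for f, g, same in assertions if not same)


def maximal_consistent_subset(L: List[Assertion]) -> List[Assertion]:
    """Scan L in order, keeping each assertion that is consistent with those kept so far."""
    kept: List[Assertion] = []
    for a in L:
        if is_consistent(kept + [a]):
            kept.append(a)
    return kept


# An ordering on which the greedy scan keeps two assertions, although the
# three same-layer assertions alone would form a larger consistent subset.
L = [("a", "b", False), ("a", "b", True), ("a", "c", True), ("c", "b", True)]
greedy = maximal_consistent_subset(L)
```

Because earlier assertions win every conflict, the order of `L` determines which maximal consistent subset is returned. The example shows why the result need not be the largest one.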

Given the above procedure we can now define the process used to update the layer model. We define $L_{\not\bowtie}$ to be the set of all different-layer assertions $f \not\bowtie g$, where $f$ and $g$ overlap in the most recent movie frame. These are layer assertions which must be added to the layer model in order not to violate substantiality. We define $L_{\bowtie}$ to be the set of all same-layer assertions $f \bowtie g$, where $f$ and $g$ touch in the most recent movie frame. These are hypothesized layer assertions which could potentially account for support relationships needed to preserve stability. $L_{\bowtie}$ contains assertions only between figures which touch since only such assertions could potentially contribute to support relationships. The layer model updating procedure makes permanent only those hypothesized same-layer assertions that actually do prevent figures from falling under imagination. The layer model updating procedure is as follows.⁷

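procedure Update Layer Model
    for $f \bowtie g \in L_{\bowtie}$
    do if neither $f$ nor $g$ move during
          $I(F, J, \text{Maximal Consistent Subset}(L_{\not\bowtie} \cup (L_{\bowtie} - \{f \bowtie g\}) \cup L))$
       then $L_{\bowtie} \leftarrow L_{\bowtie} - \{f \bowtie g\}$ od;
    $L \leftarrow \text{Maximal Consistent Subset}(L_{\not\bowtie} \cup L_{\bowtie} \cup L)$
end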
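The following Python sketch walks through the layer model update on a toy scene. Everything beyond the procedure itself is an assumption made for illustration: `imagine` stands in for the imagination capacity and is taken to return the set of figures that move, `toy_imagine` is a deliberately crude substitute for the kinematic simulator of chapter 9 (it ignores joints and treats a figure as stable only if it rests on a stable figure known to be on the same layer), and assertions are encoded as `(f, g, same_layer)` tuples.

```python
from typing import Callable, Dict, List, Set, Tuple

Assertion = Tuple[str, str, bool]  # (f, g, same_layer)


def make_find(assertions: List[Assertion]) -> Callable[[str], str]:
    """Merge same-layer figures into classes; return a class-representative lookup."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for f, g, same in assertions:
        if same:
            parent[find(f)] = find(g)
    return find


def is_consistent(assertions: List[Assertion]) -> bool:
    find = make_find(assertions)
    return all(find(f) != find(g) for f, g, same in assertions if not same)


def maximal_consistent_subset(assertions: List[Assertion]) -> List[Assertion]:
    kept: List[Assertion] = []
    for a in assertions:
        if is_consistent(kept + [a]):
            kept.append(a)
    return kept


def update_layer_model(figures, joints, L, overlapping, touching, imagine):
    diff = [(f, g, False) for f, g in overlapping]  # forced by substantiality
    same = [(f, g, True) for f, g in touching]      # hypothesized for support
    for a in list(same):
        without = [x for x in same if x != a]
        moved = imagine(figures, joints, maximal_consistent_subset(diff + without + L))
        if a[0] not in moved and a[1] not in moved:
            same = without  # hypothesis not needed for stability, so drop it
    # Newer evidence comes first, so it overrides contradictory older assertions.
    return maximal_consistent_subset(diff + same + L)


def toy_imagine(rests_on: Dict[str, str]):
    """Hypothetical stand-in for the simulator: returns the figures that fall."""
    def imagine(figures, joints, layers) -> Set[str]:
        find = make_find(layers)

        def stable(x: str) -> bool:
            if x == "ground":
                return True
            below = rests_on.get(x)
            return below is not None and find(x) == find(below) and stable(below)

        return {x for x in figures if not stable(x)}
    return imagine


figures = ["ground", "table", "box"]
rests_on = {"table": "ground", "box": "ground"}
touching = [("table", "ground"), ("box", "ground"), ("box", "table")]
old_L = [("box", "ground", False)]  # stale assertion from an earlier frame
new_L = update_layer_model(figures, [], old_L, [], touching, toy_imagine(rests_on))
```

In this scene one of the three touching hypotheses is redundant once the closure of the other two is taken into account, and the stale different-layer assertion is dropped because it contradicts newer evidence.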

The process of updating the joint model is conceptually very similar to updating the layer model. The algorithm is illustrated in figure 8.3. First, remove all joints $j$ from $J$ where $f(j)$ does not intersect $g(j)$ in the most recent frame (lines 2 and 3). Second, demote any rigid parameter of any joint $j \in J$ when the constraint implied by that parameter is violated (lines 4 through 9). Third, remove all joints $j$ from $J$ where both $f(j)$ and $g(j)$ are flexible (lines 10 and 11). This is to enforce the constraint from page 127 that every joint have at least one rigid displacement parameter. Fourth, find a minimal set of parameter promotions and new joints that preserve the stability of the image (lines 12 through 33). To do this we form the set $J'$ of all joints $j'$ where $f(j')$ intersects $g(j')$ in the most recent movie frame (lines 12 through 20). Those joints in $J'$ which appear in $J$ have their parameters initialized to the same values as their counterparts in $J$, while any new joints have their parameters initialized to be flexible. We then promote all of the flexible parameters in $J'$ to have the rigid values that they have in the most recent movie frame. One by one we temporarily demote each of the parameters just promoted and imagine the world (lines 21 through 33). If, when demoting a parameter of a joint $j'$, the constraint specified by the original rigid parameter is not violated during the imagined outcome of that demotion, then that demotion is preserved. Otherwise, the parameter is promoted back to the rigid value it has in the most recent movie frame. After trying to demote each of the newly promoted joint parameters, remove all joints $j'$ from $J'$ where both $f(j')$ and $g(j')$ are flexible (lines 34 and 35) and replace $J$ with $J'$ (line 36).⁸
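The four steps can be sketched in Python as shown below. This is a rough illustration and not the algorithm of figure 8.3 itself. The joint model is assumed to be a dictionary from figure pairs to parameter dictionaries, the parameter names `rotation`, `disp_f`, and `disp_g` are hypothetical, a flexible parameter is encoded as `None`, and `holds_when_imagined` is a hypothetical callback standing in for a call to the imagination capacity that reports whether a demoted parameter's original constraint survives the imagined outcome.

```python
from typing import Callable, Dict, List, Optional, Tuple

Pair = Tuple[str, str]
Params = Dict[str, Optional[float]]  # None = flexible, number = rigid value

PARAMS = ("rotation", "disp_f", "disp_g")  # hypothetical parameter names
DISPLACEMENTS = ("disp_f", "disp_g")


def prune(J: Dict[Pair, Params]) -> Dict[Pair, Params]:
    """Drop joints that have no rigid displacement parameter."""
    return {k: p for k, p in J.items()
            if any(p[d] is not None for d in DISPLACEMENTS)}


def update_joint_model(
    J: Dict[Pair, Params],
    pairs: List[Pair],                      # figure pairs intersecting in this frame
    observed: Dict[Pair, Dict[str, float]], # parameter values seen in this frame
    holds_when_imagined: Callable[[Dict[Pair, Params], Pair, str, float], bool],
    tol: float = 1e-6,
) -> Dict[Pair, Params]:
    # Step 1: remove joints whose figures no longer intersect.
    J = {k: dict(p) for k, p in J.items() if k in pairs}
    # Step 2: demote rigid parameters whose constraint is violated in this frame.
    for k, p in J.items():
        for name in PARAMS:
            if p[name] is not None and abs(p[name] - observed[k][name]) > tol:
                p[name] = None
    # Step 3: remove joints left without a rigid displacement parameter.
    J = prune(J)
    # Step 4: build the candidate model, copying known joints, adding new flexible ones.
    new = {k: dict(J.get(k, dict.fromkeys(PARAMS))) for k in pairs}
    promoted = []
    for k, p in new.items():
        for name in PARAMS:
            if p[name] is None:
                p[name] = observed[k][name]  # promote to the currently observed value
                promoted.append((k, name))
    # Try to undo each promotion; keep the demotion only if imagination allows it.
    for k, name in promoted:
        new[k][name] = None
        if not holds_when_imagined(new, k, name, observed[k][name]):
            new[k][name] = observed[k][name]
    return prune(new)


# Toy oracle (assumption): only the rotation parameter can be left flexible.
pairs = [("arm", "body")]
observed = {("arm", "body"): {"rotation": 0.5, "disp_f": 0.0, "disp_g": 1.0}}
J1 = update_joint_model({}, pairs, observed, lambda J, k, name, v: name == "rotation")
```

Promoting everything first and then demoting one parameter at a time yields a set of promotions that is minimal with respect to the order in which the demotions are tried, much as the greedy scan does for the layer model.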

Recall that an object can be supported in two ways, either by being joined to another object or by resting on top of another object on the same layer. Abigail gives preference to the latter explanation.

$J \leftarrow J - \{j\}$ od;
