HAL Id: hal-01977534

https://hal.laas.fr/hal-01977534

Submitted on 10 Jan 2019

Affordance Graph: A Framework to Encode Perspective Taking and Effort based Affordances for day-to-day Human-Robot Interaction

Amit Kumar Pandey, Rachid Alami

To cite this version:

Amit Kumar Pandey, Rachid Alami. Affordance Graph: A Framework to Encode Perspective Taking and Effort based Affordances for day-to-day Human-Robot Interaction. IEEE/RSJ International Conference on Intelligent Robots and Systems, Nov 2013, Tokyo, Japan. hal-01977534

Affordance Graph: A Framework to Encode Perspective Taking and Effort based Affordances for day-to-day Human-Robot Interaction

Amit Kumar Pandey and Rachid Alami

Abstract— Analyzing affordances has its roots in the socio-cognitive development of primates. Knowing what the environment, including other agents, can offer in terms of action capabilities is important for our day-to-day interaction and cooperation. In this paper, we merge two complementary aspects of affordances: from the agent-object perspective, what an agent can afford to do with an object, and from the agent-agent perspective, what an agent can afford to do for another agent, and present the unified notion of the Affordance Graph. The graph encodes affordances for a variety of tasks: take, give, pick, put on, put into, show, hide, make accessible, etc. Another novelty is to incorporate the aspects of effort and perspective taking in constructing such a graph. Hence, the Affordance Graph tells about the action capabilities of manipulating the objects among the agents and across the places, along with information about the required levels of effort and the potential places. We also demonstrate some interesting applications.

I. INTRODUCTION

The ability to analyze affordances is one of the essential aspects of developing complex and flexible socio-cognitive behaviors among primates, including humans. In fact, affordance - what something or someone can offer or afford to do - is an important aspect of our day-to-day interaction with the environment and with others.

The pioneering work of Gibson [1] in cognitive psychology refers to affordance as what an object offers, as all action possibilities, independent of the agent's ability to recognize them.

In the Human-Computer Interaction (HCI) domain, on the other hand, Norman [2] tightly couples affordances with past knowledge and experience, and sees affordances as the perceived and actual properties of things. Affordances can be learned [3], even from vision-based data [4] and through manipulation trials [5], and can in turn be used to learn action selection [6].

Moreover, research on action-specific perception proposes that people perceive the environment in terms of their abilities to act on it, which in fact helps the perceivers to plan future actions [7]. This equally holds from the perspective of Human-Robot Interaction (HRI): the mere existence of an affordance (in its typical notion of action possibilities) is not sufficient; more important is that the agent should be able to perceive and ground it with its abilities. In [8], the idea of integrating robots and smart environments along with Gibson's notion of affordances has been presented.

This work has been conducted within the EU SAPHARI project (http://www.saphari.eu/) funded by the E.C. Division FP7-IST under Contract ICT-287513.

Authors are with CNRS, LAAS, 7 avenue du colonel Roche, F-31400 Toulouse, France; Univ de Toulouse, LAAS, F-31400 Toulouse, France

{akpandey@laas.fr;rachid.alami@laas.fr}

Fig. 1: In a typical human-robot interaction scenario, various types of affordances exist, e.g. some objects offer to be picked, some offer to put something onto them, whereas some agents can afford to give or to show something to someone. External sensors such as the Kinect mounted on the ceiling are not shown.

The aim is to provide mechanisms by which such a system, which the authors term a Peis-Ecology (Ecology of Physically Embedded Intelligent Systems, such as the one shown in fig. 1), may gain awareness of the opportunities available in it, and use this awareness to self-configure and to interact with a human user. In this paper, we extend this idea from the perspective of day-to-day human-robot interactive tasks by incorporating the notions of perspective taking and effort, to make the robot agent- and effort-based affordance-aware. We are essentially interested in computing different types of affordances and integrating them in a unified framework.

In robotics, affordance has been viewed from different perspectives: agent, observer and environment [9], and has been studied with respect to objects (e.g. for tool use [10]) and locations (e.g. for traversability [11]). In [12], linguistic instructions from the human are grounded based on the affordances of the objects, i.e. the services they provide. In [13], the robot learns self-affordances, i.e. the action-effect relations of its movements on its own body. In [14], the proposed interpersonal maps relate the information about the robot's own body with that of another robot, and serve for imitating the other's sensory-motor affordances. In [15], we enriched the notion of affordances with Agent-Agent Affordance for HRI tasks, based on reasoning about what an agent affords to do for another agent (give, show, hide, make accessible, ...), with which effort level and where.

On the other hand, perspective taking - what others see, reach and do - is an important aspect of social interaction [16], [17], [18], [19], [20]. Hence, perceiving the self-centered affordances is not sufficient; the robot should also be able to perceive the affordances from the other agents' perspectives.

Further, in [21], we have argued that perspective taking from the current state of the agent is not sufficient. In fact, analyzing effort plays an important role in cooperation. Hence, in [21], we have developed the notion of perspective taking from a set of states attainable by putting in a set of efforts [15], e.g. turn, lean, etc. In this paper, it serves to incorporate the notion of effort in affordance analysis.

TABLE I: Effort Classes for Visuo-Spatial Abilities

Effort to Reach              | Effort to See
No_Effort                    | No_Effort
Arm_Effort                   | Head_Effort
Arm_Torso_Effort             | Head_Torso_Effort
Whole_Body_Effort            | Whole_Body_Effort
Displacement_Effort          | Displacement_Effort
No_Possible_Known_Effort     | No_Possible_Known_Effort

Effort Level: Minimum 0, Maximum 5

Novelty: The novelty of this paper is to geometrically ground and merge the two complementary aspects of affordance, Agent-Object and Agent-Agent, further incorporating the notions of perspective taking and effort analysis. We present a unified framework as the concept of the Affordance Graph. This elevates the robot's awareness not only of its own capabilities but also of the action potentialities of other agents for a set of basic human-robot interactive tasks: give, take, pick, hide, show, make accessible, put on, put into, etc. The graph encodes the 'what' and 'where' aspects for all the agents, which have been identified as important components for joint tasks [22]. Hence, the Affordance Graph facilitates affordance-aware human-robot interaction, with various potential applications in day-to-day life: achieving joint tasks, cooperation, grounding interaction and changes, playing human-interactive games, etc.

In the next section, for continuity, we briefly present our earlier work on the human-aware effort hierarchy and agent-agent affordances (Taskability Graph). In sec. III-A, we introduce the novel concept of the Manipulability Graph, which encodes agent-object affordances. The Taskability Graph together with the Manipulability Graph is then used to derive the novel notion of the Affordance Graph in sec. III-B. Sec. IV presents the experimental results and analysis on constructing and updating such graphs. In sec. V, we discuss a couple of implemented potential applications of the Affordance Graph.

II. BACKGROUND WORK

A. Human-Aware Effort Analysis: Qualifying Effort

For a robot to interact and cooperate with us in complete harmony, it should be able to reason on efforts at a human-understandable level of abstraction. In [21] we have argued that perspective-taking analysis only from the current state of the agents is not sufficient. Following this, in [15], we have conceptualized qualitative effort classes, as shown in table I.

We have also proposed comparative effort levels, which of course can differ depending upon the requirement and context. These are based on the body parts involved and are motivated by human movement and behavioral psychology studies on reach taxonomy [23], [24]; see fig. 2.
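To make the comparative levels concrete, the following is a minimal sketch (not the authors' code) of the effort classes of table I mapped to integer levels 0-5. The exact numeric assignment follows the listing order of the table and is our assumption, since the paper notes the levels can vary with requirement and context.

```python
from enum import IntEnum

# NOTE: the 0-5 assignment follows the listing order of Table I; the paper
# states that the comparative levels may differ with requirement and context.
class ReachEffort(IntEnum):
    NO_EFFORT = 0
    ARM_EFFORT = 1
    ARM_TORSO_EFFORT = 2
    WHOLE_BODY_EFFORT = 3
    DISPLACEMENT_EFFORT = 4
    NO_POSSIBLE_KNOWN_EFFORT = 5

class SeeEffort(IntEnum):
    NO_EFFORT = 0
    HEAD_EFFORT = 1
    HEAD_TORSO_EFFORT = 2
    WHOLE_BODY_EFFORT = 3
    DISPLACEMENT_EFFORT = 4
    NO_POSSIBLE_KNOWN_EFFORT = 5

# Comparing and combining efforts then reduces to integer comparison:
assert ReachEffort.ARM_EFFORT < ReachEffort.ARM_TORSO_EFFORT
assert max(ReachEffort.ARM_EFFORT, SeeEffort.HEAD_TORSO_EFFORT) == 2
```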

Fig. 2: Taxonomy of reach actions: (a) arm-shoulder reach, (b) arm-torso reach, (c) standing reach.

Fig. 3: Constructed Taskability Graphs (see [15]) for different tasks, for the actual scenario of fig. 1, given the criterion of mutual-effort balancing and restricting the desired maximum individual effort level to Arm_Torso_Effort. Point clouds show potential places and sphere size shows the required effort level.

B. Taskability Graph: Encoding Agent-Agent Affordances

As mentioned earlier, in [15] we have introduced the notion of agent-agent affordance and presented the concept of the Taskability Graph. The construction of the Taskability Graph is based on the Mightability Maps [21]. Mightability stands for "Might be Able to...". In short, Mightability Analysis is visuo-spatial perspective taking, not only from the current state but also from a set of different states achievable by the agent. This enables the robot to find effort-based potential places to perform a task between two agents. Using this, the Taskability Graph is constructed by finding agent-agent affordances among all the agents [15]. The Taskability Graph encodes what all agents in the environment might be able to do for all other agents, with which levels of mutual effort and at which places. The Taskability Graph corresponding to the actual scenario of fig. 1 is shown in fig. 3. Currently 4 basic tasks are encoded: Make Accessible, Show, Give and Hide, as shown by the 4 layers of edges, in bottom-up order. Please see [15] for details.

We represent a Taskability Graph for a task as a directed graph $TG_{task}$:

$$TG_{task} = (V(TG), E(TG)) \quad (1)$$

$V(TG)$ is the set of agent vertices in the environment:

$$V(TG) = \{\, v(TG) \mid v(TG) \in AG \,\} \quad (2)$$

where $AG$ is the set of agents in the environment. $E(TG)$ is the set of edges between ordered pairs of agents:

$$E(TG) = \{\, e(TG) \mid e(TG) = \langle v_i(TG), v_j(TG), e_{prop} \rangle \ \wedge\ v_i(TG) \neq v_j(TG) \,\} \quad (3)$$

$e_{prop}$ is the property of an edge:

$$e^{TG}_{prop} = \big(\, CS,\ EC^{TG} = \langle\, EC^{ag}_{ab} \mid \forall ag \in \{source(e), target(e)\} \ \wedge\ \forall ab \in RelAb_{ag} \,\rangle \,\big) \quad (4)$$

where $CS$ is the candidate space where the task could potentially be performed, and $EC^{TG}$ is a list of enabling conditions. In the current implementation, each enabling condition $EC^{ag}_{ab}$ corresponds to the required effort for an ability $ab \in RelAb$ for an agent $ag$. Currently, we focus on two basic abilities, to see and to reach, hence $RelAb = \{see, reach\}$.

Hence, an edge of a Taskability Graph currently encodes:

$$e^{TG}_{prop} = \big\langle\, places,\ \langle effort^{target\ agent}_{see},\ effort^{target\ agent}_{reach},\ effort^{performing\ agent}_{see},\ effort^{performing\ agent}_{reach} \rangle \,\big\rangle \quad (5)$$
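As an illustration only, an edge carrying the information of eqs. (3)-(5) could be represented as follows; the class and field names are our own and not part of the paper's implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Placement = Tuple[float, float, float]        # one candidate 3D point of CS

@dataclass
class EnablingCondition:                      # EC_ab^ag of eq. (4)
    agent: str                                # e.g. "PR2", "Human1"
    ability: str                              # "see" or "reach" (RelAb)
    effort: int                               # effort level 0-5 (Table I)

@dataclass
class TaskabilityEdge:                        # <v_i, v_j, e_prop> of eq. (3)
    task: str                                 # Give, Show, Hide, Make_Accessible
    performing_agent: str                     # v_i
    target_agent: str                         # v_j, with v_i != v_j
    candidate_space: List[Placement] = field(default_factory=list)            # CS
    enabling_conditions: List[EnablingCondition] = field(default_factory=list)

# Example edge in the spirit of eq. (5): PR2 could give an object to Human1
edge = TaskabilityEdge(
    task="Give", performing_agent="PR2", target_agent="Human1",
    candidate_space=[(0.8, 0.1, 0.9)],
    enabling_conditions=[
        EnablingCondition("Human1", "see", 0),
        EnablingCondition("Human1", "reach", 1),
        EnablingCondition("PR2", "see", 0),
        EnablingCondition("PR2", "reach", 1),
    ],
)
print(edge.task, len(edge.candidate_space), "candidate place(s)")
```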

III. METHODOLOGY

A. Manipulability Graph: Encoding Effort based Agent-Object Affordances

In this section we introduce the first contribution of this paper: the Manipulability Graph. It encodes what an agent might be able to do with an object, and with which effort level.

Complementary to the Taskability Graph, which encodes agent-agent affordances, the Manipulability Graph represents agent-object affordances. Currently, we have implemented four such affordances: Touch, Pick, PutOnto and PutInto.

We have chosen them because they are the most used basic action primitives within our domain of day-to-day pick-point-and-place type tasks. We regard Pick as the ability to See ∧ Reach ∧ Grasp, whereas Touch is the ability to just See ∧ Reach, PutOnto is the ability to See ∧ Reach the places on a horizontal surface, and PutInto is the ability to put something into some container object, i.e. to See ∧ Reach the places belonging to a horizontal open side of an object.

We are using a dedicated grasp planner, built in-house [25], to compute a set of grasps for 3D objects, using the robot's gripper and an anthropomorphic hand. The robot computes and stores the set of such grasps for each new object it encounters. Then, based on the environment, the grasps in collision are filtered out. An agent can show reaching behavior to touch, grasp, push, hit, point, take out, put into, etc.; hence, the precise definition of reachability of an object depends upon the purpose. So, at a first level of finding reachability, we chose to have a rough estimate based on the assumption that if at least one cell belonging to the object is reachable, then the object is said to be Reachable.

See [21] for our approach to finding the reachability of a cell in the workspace. The same is done for finding the visibility of an object. However, using [26], such reachability is further tested for execution if required, based on IK and the feasibility of a collision-free trajectory. Similarly, a pixel-based visibility score is computed, if required, by using [20].
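The following is a hedged sketch of these affordance tests, assuming the lower layers (the Mightability Analysis and the grasp planner) have already produced, per agent and effort level, the sets of visible cells, reachable cells and collision-free grasps; the names and data layout are ours, not the paper's.

```python
def can_touch(visible, reachable, obj_cells):
    # Touch = See AND Reach: at least one cell of the object is both visible
    # and reachable with the given effort level.
    return any(c in visible and c in reachable for c in obj_cells)

def can_pick(visible, reachable, obj_cells, grasps):
    # Pick = See AND Reach AND at least one collision-free grasp.
    return can_touch(visible, reachable, obj_cells) and len(grasps) > 0

def can_put_onto(visible, reachable, support_cells):
    # PutOnto = See AND Reach some cell of a horizontal support facet.
    return any(c in visible and c in reachable for c in support_cells)

def can_put_into(visible, reachable, open_side_cells):
    # PutInto = See AND Reach some cell of a horizontal open side (container).
    return any(c in visible and c in reachable for c in open_side_cells)

# Toy usage: cells are (x, y, z) grid indices for a given agent and effort.
visible = {(1, 2, 3), (1, 2, 4)}
reachable = {(1, 2, 3)}
print(can_touch(visible, reachable, obj_cells=[(1, 2, 3)]))   # True
print(can_pick(visible, reachable, [(1, 2, 3)], grasps=[]))   # False: no grasp
```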

Similar to the Taskability Graph, we represent a Manipulability Graph for a task as a directed graph $MG_{task}$:

$$MG_{task} = (V(MG), E(MG)) \quad (6)$$

$V(MG)$ is the set of vertices representing the entities $ET = AG \cup OBJ$ ($OBJ$ is the set of objects in the environment):

$$V(MG) = \{\, v(MG) \mid v(MG) \in AG \ \vee\ v(MG) \in OBJ \,\} \quad (7)$$

$E(MG)$ is the set of edges between an ordered pair of agent and object:

$$E(MG) = \{\, e(MG) \mid e(MG) = \langle v_i(MG), v_j(MG), e^{MG}_{prop} \rangle \ \wedge\ v_i(MG) \in AG \ \wedge\ v_j(MG) \in OBJ \,\} \quad (8)$$

$e^{MG}_{prop}$ is the property of an edge of the Manipulability Graph:

$$e^{MG}_{prop} = \big(\, CS,\ EC^{MG} = \langle\, EC^{ag}_{ab} \mid ag = source(e) \ \wedge\ \forall ab \in RelAb_{ag} \,\rangle \,\big) \quad (9)$$

Note that each edge in the Manipulability Graph contains the list of enabling conditions for a single agent (see eq. 10), belonging to the source vertex, because the target vertex belongs to an object, whereas each edge of the Taskability Graph has enabling conditions for two agents (see eq. 5). $CS$ is the candidate search space in which the feasible solution will fall. Depending upon the type of the task, it could be the set of collision-free grasp configurations for Pick, or the visible and reachable places belonging to a plane for PutOnto, to an open side of the container for PutInto, or to an object's surface for Touch. Hence,

$$e^{MG}_{prop} = \big\langle\, CS \in \{grasp\ configurations,\ places\},\ \langle effort^{performing\ agent}_{see},\ effort^{performing\ agent}_{reach} \rangle \,\big\rangle \quad (10)$$

In the current implementation, if there exists at least one collision-free grasp, and the object is reachable and visible with a particular effort level of the agent, then the agent is said to afford to pick the object with that effort level.

We have equipped the robot with the capability to autonomously find the horizontal supporting facet and horizontal open side, if they exist, of any object. For this, it extracts horizontal planes by finding the facets having a vertical normal vector, based on the convex hull of the corresponding 3D model of the object. Such planes are then uniformly sampled into cells and a small virtual cube (currently of dimension 5 cm x 5 cm x 5 cm) is placed at each cell. As the cell already belongs to a horizontal surface and is inside the convex hull of the object, if the placed cube collides with the object, the cell is assumed to be a cell of the support plane. Otherwise, the cell belongs to an open side of the object, from where something could be put inside. With this method the robot can find which objects offer to put something onto them, which offer to put something inside them, and which are the places to do so. This avoids explicitly indicating to the planner the supporting objects, such as a table or box top, and the container objects, such as a trashbin.
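A simplified, voxel-based sketch of this support-plane / open-side test is given below. The actual system performs a collision check between a 5 cm virtual cube and the object's 3D model; here we approximate that check by testing whether the object occupies the voxel just below a candidate cell, which is an assumption on our part.

```python
def classify_horizontal_cells(object_voxels, facet_cells):
    """object_voxels: set of (x, y, z) voxels occupied by the object.
    facet_cells: cells sampled on the horizontal facets of its convex hull."""
    support, open_side = set(), set()
    for (x, y, z) in facet_cells:
        if (x, y, z - 1) in object_voxels:   # a cube placed here would "collide",
            support.add((x, y, z))           # i.e. rest on material: support plane
        else:
            open_side.add((x, y, z))         # nothing underneath: an opening
    return support, open_side                # e.g. box top vs. trashbin mouth

# Toy example: a 3x3 ring of voxels at z=0 (a container rim) under a facet at z=1.
voxels = {(x, y, 0) for x in range(3) for y in range(3) if (x, y) != (1, 1)}
facet = {(x, y, 1) for x in range(3) for y in range(3)}
support, opening = classify_horizontal_cells(voxels, facet)
print(sorted(opening))    # [(1, 1, 1)] -> a place to put something *into*
```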

Fig. 4a shows the automatically extracted places where the human in the middle can put something onto with Arm_Effort. Note that the robot not only found the table as a support plane, but also the top of the box. Similarly, in fig. 4b, the robot autonomously identified the pink trashbin as a container object having a horizontal open facet, and it also found the places from where the human on the right can put something inside it.

Fig. 4: Agent-Object Affordances. (a) Affordance for PutOnto. (b) Affordance for PutInto.

A Manipulability Graph for an agent-object affordance type is constructed by finding the least feasible effort for each agent that satisfies the task criteria. Again, the Mightability Analysis serves to find the least feasible effort, as it performs visuo-spatial perspective taking from multiple states of the agents.
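A minimal sketch of this construction step follows, assuming a hypothetical Mightability-backed oracle affords(agent, object, affordance, effort): the edge weight is the least effort level (0-5) satisfying the task criteria, bounded by a per-agent maximum allowed effort as in fig. 5. The numeric caps used in the toy call are our own reading of the effort ordering, not values from the paper.

```python
def least_feasible_effort(affords, agent, obj, affordance, max_allowed=5):
    # scan effort levels from cheapest to most expensive, stop at the first hit
    for effort in range(max_allowed + 1):
        if affords(agent, obj, affordance, effort):
            return effort
    return None                               # no edge in the Manipulability Graph

def build_manipulability_graph(affords, agents, objects, affordance, max_allowed):
    edges = {}
    for ag in agents:
        for obj in objects:
            e = least_feasible_effort(affords, ag, obj, affordance,
                                      max_allowed.get(ag, 5))
            if e is not None:
                edges[(ag, obj)] = e          # agent -> object edge, weighted by effort
    return edges

# Toy "Mightability" oracle: Human1 needs Displacement_Effort (taken as level 4)
# to pick Obj2; PR2 cannot pick it at any effort level.
def toy_affords(agent, obj, affordance, effort):
    return agent == "Human1" and obj == "Obj2" and effort >= 4

print(build_manipulability_graph(toy_affords, ["Human1", "PR2"], ["Obj2"],
                                 "Pick", {"Human1": 4, "PR2": 2}))
# -> {('Human1', 'Obj2'): 4}
```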

Fig. 5a shows Manipulability Graphs for the Pick and PutInto affordances. For clarity, we do not superimpose the Touch and PutOnto affordances. Each edge of the Manipulability Graph shows the agent's least feasible effort to see and reach the objects, fig. 5b. Different maximum desired effort levels can be assigned to different affordances and agents for constructing the graph. To show this, we provided the maximum allowed effort for Human1 as Displacement_Effort, whereas for the PR2 robot and Human2 it was Torso_Effort. Hence, the resulting graph shows that Human1 and Human2 can both pick Obj2, with different effort levels. For Human1, the planner is able to compute collision-free placement positions and configurations around the object, fig. 5c, from where he can reach, see and grasp it. Further, the graph shows that Human1 can put something into the trashbin on his right, whereas Human2 cannot, because of his more restricted maximum effort level. As an interesting side note, the relatively smaller green sphere on the Human1-Obj2 edge encodes that Human1 can see Obj2 directly, hence more easily than Human2, who needs to turn his head, i.e. to put in Head_Effort.

Note that there is no edge from PR2 to Obj1. PR2 can actually reach and see, i.e. touch, Obj1; however, there exists no collision-free grasp configuration for its gripper to pick it, because of the object's placement and the gripper's size.

However, we will see in section IV that changing Obj1's position enables the Pick affordance of PR2 for this object.

B. Affordance Graph

By combining a set of Taskability Graphs TGr and a set of Manipulability Graphs MGr for a set of affordances, we have developed the concept of the Affordance Graph (AfG).

Hence, the Affordance Graph tells the action possibilities of manipulating the objects among the agents and across the places, along with information about the required levels of effort and the potential spaces.

The Affordance Graph (AfG) is obtained as:

$$AfG = \biguplus_{\forall tg \in TGr} TG_{tg} \ \uplus \biguplus_{\forall mg \in MGr} MG_{mg} \quad (11)$$

Fig. 5: Manipulability Graph, taking into account different maximum allowed effort levels for different agents. (a) Manipulability Graph for the Pick and PutInto affordances from the perspective of different agents. (b) Single-edge description. (c) Human1's affordance to pick Obj2 requires Displacement_Effort.

⊎ is the operator which, depending upon the affordance type, appropriately merges a Taskability or Manipulability Graph into the Affordance Graph. As explained below, it creates some virtual edges to maintain a desired property: between any pair of vertices of the Affordance Graph, there should be at most one edge. Further, it also assigns proper labels, weights and directions to the edges, as will become evident from the discussion below.

Figure 6 shows the Affordance Graph of the current scenario. The ⊎ operator uses the following set of rules for constructing the Affordance Graph:

(i) Create unique vertices for each agent and each object in the environment.

(ii) For each edge Et of a Taskability Graph from the performing agent PA to the target agent TA, introduce an intermediate virtual vertex Vt and split Et into two edges: E1, connecting PA and Vt; and E2, connecting Vt and TA.

(iii) The direction of E1 and E2 depends upon the task:

(a) If the task is Give or Make-Accessible, E1 is directed inward to Vt and E2 is directed outward from Vt towards TA.

(b) If the task is Hide or Show, E2 is also directed towards Vt from TA. This incorporates the intention behind such tasks, i.e. the object is not expected to be transported to TA, and E2 serves to ground Vt to the corresponding TA.

(iv) Assign meaningful symbolic labels to each of the new edges E1 and E2. For example, if Et belongs to the Give task, then label E1 as To_Give and label E2 as To_Take; if Et belongs to the Make-Accessible task, then label E1 as To_Place and E2 as To_Pick, and so on.

(v) For each edge Emt of a Manipulability Graph to Pick an object, an edge is introduced in the Affordance Graph directed from the object to the PA.

(vi) For each edge Emp of a Manipulability Graph for the PutInto and PutOnto affordances, an edge is introduced in the Affordance Graph from PA to the objects.

Fig. 6: Constructed Affordance Graph of the real-world scenario of fig. 1. It encodes effort-based agent-object and agent-agent affordances for a set of basic HRI tasks.

Rule (iii) encodes the potential flow of an object between two agents, whereas rules (v) and (vi) encode the possible flow of an object to an agent; hence we sometimes also refer to the Affordance Graph as the Object Flow Graph.

Each edge has a weight depending upon the efforts encoded in the corresponding edge of the parent Taskability or Manipulability Graph. There could be various criteria to assign such weights; below we discuss one such choice.

The weights shown by the blue spheres in the Affordance Graph of figure 6 have been selected based on the maximum effort of the relevant abilities to see and/or reach. For example, if the edge corresponds to the Give task, the highest effort between reach and see, encoded in the Taskability Graph, is assigned to both edges: performing agent, PA-Vt (Vt is the virtual vertex discussed above), and target agent, Vt-TA. But if the task is Show, then for the edge PA-Vt the highest of the reach and see efforts is still assigned, whereas for the edge Vt-TA the weight is assigned as the effort to see, because TA is not required to reach the object. In fact, the relevant abilities for a task are provided to the system a priori, although they could also be learnt for basic HRI tasks, as we have begun to show in [27].
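A sketch of rules (ii)-(vi) and of this weight assignment, under our own naming and using the networkx library as the graph container, is shown below. The E2 label used for Show and Hide is a placeholder (the paper does not name it), and the integer weights stand for the maximum of the see/reach effort levels discussed above.

```python
import itertools
import networkx as nx

E1_LABEL = {"Give": "To_Give", "Make_Accessible": "To_Place",
            "Show": "To_Show", "Hide": "To_Hide"}
E2_LABEL = {"Give": "To_Take", "Make_Accessible": "To_Pick"}
_vt = itertools.count()                                  # unique virtual-vertex ids

def merge_taskability_edge(afg, task, performer, target, w_perf, w_target):
    vt = f"Vt_{task}_{next(_vt)}"                        # rule (ii): virtual vertex
    afg.add_edge(performer, vt, label=E1_LABEL[task], weight=w_perf)       # E1
    if task in ("Give", "Make_Accessible"):              # rule (iii)(a): E2 outward
        afg.add_edge(vt, target, label=E2_LABEL[task], weight=w_target)
    else:                                                # rule (iii)(b): Show / Hide
        afg.add_edge(target, vt, label=f"Target_of_{task}", weight=w_target)

def merge_manipulability_edge(afg, task, agent, obj, weight):
    if task == "Pick":                                   # rule (v): object -> agent
        afg.add_edge(obj, agent, label="To_Pick", weight=weight)
    else:                                                # rule (vi): Put* -> object
        afg.add_edge(agent, obj, label=f"To_{task}", weight=weight)

afg = nx.DiGraph()                                       # rule (i): one vertex per entity
merge_manipulability_edge(afg, "Pick", "Human2", "Obj2", weight=1)
merge_taskability_edge(afg, "Give", "Human2", "Human1", w_perf=1, w_target=1)
merge_manipulability_edge(afg, "PutInto", "Human1", "Trashbin", weight=2)
```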

The novel aspects of the Affordance Graph are: (i) it incorporates perspective taking and effort with affordances in a unified concept; (ii) it provides a graph-based framework for querying about affordances by simply using existing graph search algorithms; (iii) it allows playing with edges, vertices and weights to guide the graph search, and hence facilitates incorporating a range of social constraints, desires, preferences and effort criteria in finding desirable/suitable affordance potentialities; (iv) it supports a range of HRI problems, such as grounding interaction and changes, generating shared cooperative plans, etc., as will be demonstrated in section V.

Fig. 7: Changes in the environment and their effect, successfully encoded in the updated Manipulability Graph. (a) Human's new affordance edge. (b) PR2's new affordance edge. (c) Needs whole-body effort. (d) Actual verification.

IV. EXPERIMENTAL RESULTS AND ANALYSIS

We have tested our system on a real PR2 robot. The robot uses Move3D [28], an integrated planning and visualization platform. Through various sensors, the robot maintains and updates the 3D world state in real time. For object identification and localization, a tag-based stereovision system is used. For localizing humans and tracking the whole body, data from a Kinect (Microsoft) sensor is used.

Fig. 7 shows the effect of environmental changes on the Manipulability Graph. We have displaced Obj1 of fig. 5a behind the box, as shown in fig. 7c. A new edge appears in the updated corresponding Manipulability Graph, as shown in fig. 7a. Earlier there was no such edge in the Manipulability Graph of fig. 5a, because of the non-existence of a collision-free placement of Human1 around Obj1, even with Displacement_Effort. In the changed situation, the robot finds a feasible Human1 - Obj1 Pick affordance. Thanks to the Mightability Analysis, it finds the least feasible effort to pick Obj1 as Whole_Body_Effort, fig. 7c, meaning the agent will be required to stand up and lean forward, which was also verified by the actual agent, fig. 7d.

Next, we placed Obj1 at the edge of the table, which facilitated a collision-free grasp by the PR2 gripper. The updated corresponding Manipulability Graph automatically contains a new edge for the PR2 - Obj1 Pick affordance, fig. 7b.

Table II shows the computation time for the different components needed to obtain the Affordance Graph of fig. 1. Note that this is for the first-time creation of the graphs, which is acceptable for a typical human-robot interaction scenario. As during the course of interaction or some action generally only a part of the environment changes, selectively updating these graphs will be even faster. However, we are aware of the exponential complexity of the system with a significantly increased number of affordances, agents and objects. Therefore, future work is to interface it with our supervisor system [29], which will decide which part, and how much, of these graphs should be updated depending upon the changes, situation and requirements.

TABLE II: Computation Time in s

3D grid size: 60 x 60 x 60 cells, each of dimension 5 cm x 5 cm x 5 cm

3D grid creation and initialization (one-time process)                                              | 1.6
Computation of Mightability Analysis (provided 3D grid initialization is done)                      | 0.446
Computation of Taskability Graph (provided Mightability Analysis is done)                           | 1.06
Computation of Manipulability Graph (provided Mightability Analysis is done)                        | 0.14
Computation of Affordance Graph (provided Taskability and Manipulability Graphs are calculated)     | 0.002
To obtain the shared plan to clean the table, fig. 8a (provided Affordance Graph has been created)  | 0.01

V. POTENTIAL APPLICATIONS

A. Generation of cooperative shared plans

As long as the robot reasons only on the current states of the agents, the complexity as well as the flexibility of cooperative task planning is bounded. If the agent cannot reach an object from its current state, the planner concludes that the agent cannot manipulate that object. But thanks to the Affordance Graph, our robot is equipped with the agents' rich, effort-based, multi-state reasoning about action potentialities. It facilitates incorporating effort in cooperative task planning and allows the planner to reason beyond the current state of the agent. Moreover, while querying the graph, by altering the nodes, weights and edges, different types of constraints can be incorporated, such as: agent busy, tired, bored, back pain, neck pain, cannot move, equally supporting, etc.

For example, consider the task of cleaning the table by putting all the manipulable objects of fig. 1 into the trashbin. To solve this, the Affordance Graph has been used to find the shortest path between each object's node and the node corresponding to the trashbin. We assume that the agents are tired and do not want to stand up or move. This can be directly incorporated by restricting the agents' maximum desired effort to Arm_Torso_Effort. To reflect this, the edges having weights higher than Arm_Torso_Effort have been assigned an infinite weight, hence avoiding any path through them. The resulting object-wise clean-the-table shared cooperative plan is shown in fig. 8a, restricting the individual efforts to Arm_Torso_Effort. The last row of table II shows that once we have the Affordance Graph, finding the solution for such tasks is very quick: 0.01 s to obtain the plan of fig. 8a.
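A self-contained sketch of this query on a toy graph (hypothetical vertices and integer effort weights, with Arm_Torso_Effort taken as level 2 per the ordering sketched after Table I) could look as follows: edges above the cap receive an infinite weight, and the labelled shortest path from the object to the trashbin is read off.

```python
import math
import networkx as nx

afg = nx.DiGraph()
afg.add_edge("Obj2", "Human2", label="To_Pick", weight=1)        # Human2 can pick Obj2
afg.add_edge("Human2", "Vt_Give", label="To_Give", weight=1)     # Give, performer side
afg.add_edge("Vt_Give", "Human1", label="To_Take", weight=1)     # Give, target side
afg.add_edge("Obj2", "Human1", label="To_Pick", weight=3)        # would exceed the cap
afg.add_edge("Human1", "Trashbin", label="To_PutInto", weight=2) # PutInto the trashbin

MAX_EFFORT = 2                                                   # Arm_Torso_Effort (assumed level)
for u, v, data in afg.edges(data=True):
    if data["weight"] > MAX_EFFORT:
        data["weight"] = math.inf                                # forbid this edge

path = nx.dijkstra_path(afg, "Obj2", "Trashbin", weight="weight")
cost = sum(afg.edges[u, v]["weight"] for u, v in zip(path, path[1:]))
if math.isfinite(cost):                                          # reject infinite-cost routes
    print([afg.edges[u, v]["label"] for u, v in zip(path, path[1:])])
    # -> ['To_Pick', 'To_Give', 'To_Take', 'To_PutInto']
```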

Fig. 8: Generated shared plan for clean the table. (a) With the maximum effort level of all the agents as Arm_Torso_Effort. (b) Modified plan for the small tape in case Human1 is allowed to put in up to Whole_Body_Effort.

Next, we have tried the framework by increasing Human1's willingness to put in Whole_Body_Effort. The sub-plan to trash the small tape then changed, as shown in fig. 8b.

Note that these examples demonstrate that various requirements can easily be incorporated in the presented graph-based framework. In principle, a higher-level robot supervisor system, such as SHARY [29], or a high-level task planner, such as HATP [30], will take such decisions on preferences based on the requirements and adjust the parameters.

B. Grounding Changes, Analyzing Effects and Guessing Potential Actions, Cooperation and Efforts

Based on the Affordance Graph, a set of hypotheses can be generated about the potential agents and actions that might be responsible for some changes in the environment during the robot's absence. Consider that at a particular instant the state of the environment observed by the robot was s0, as shown in fig. 9a. The robot went away for a moment; meanwhile, the humans made some changes in the environment. Now the robot is back and observes the new state of the environment as s1, fig. 9b. The problem of grounding the changes is: given ⟨s0, s1⟩, find ⟨C, E, A⟩, where C is the set of physical changes, E is the effect of C on abilities and affordances, and A is the potential sequence of actions behind C.

Fig. 9: (a) Initial environment state, s0. (b) The changed state, s1. During the course of the changes the PR2 robot was absent. Now, from PR2's perspective, Obj3 is lost, whereas the positions of Obj2 and Obj1 have changed. The robot will try to guess the lost object's position as well as to ground the changes in terms of agents and actions.

Fig. 10: Modified Affordance Graph to ground the changes of fig. 9. For the lost objects, the reasoner also guesses one feasible placement and orientation (in this scenario, vertex v4), which might be making the object invisible to the robot.

To analyze E on the agents' abilities, the reasoner compares the computed Manipulability Graphs corresponding to the states s0 and s1, and generates a set of comparative and qualifying facts, such as "effort increased to see an object by an agent" and so on. However, we will discuss the more interesting aspects: guessing lost objects' positions, and grounding changes to potential agents and actions. Lost objects are those which are no longer visible in state s1 from the robot's current perspective, such as Obj3 of figure 9a.

First of all, the reasoner adds the state-s0 positions of those objects which are now displaced in state s1 to a list of dummy object vertices, DV. Then, to guess the position of a lost object, it uses the Taskability Graph for the hide affordance and finds the agents who might potentially hide the object from the robot, as well as the corresponding potential places.

Currently, it is explicitly provided to the reasoner that the object could be lying on wooden furniture (table, shelf, in our current scenario). Hence, for the current example, it guessed a possible placement and orientation of the lost object Obj3 behind the white box, as indicated in figure 10. This guessed position is also inserted in the list of dummy object vertices DV. Then, DV is merged with the set of vertices belonging to agents (AG) and objects (OBJ), to get GV = {AG ∪ OBJ ∪ DV}. Finally, using GV, the Manipulability Graph, Taskability Graph and Affordance Graph (AfG1) are obtained. Hence, this graph merges both states, s0 and s1, as well as the guessed positions (if any), in a single hypothetical state.

As the robot was not responsible for the changes, the reasoner removes the outgoing edges from the robot vertex in AfG1. Then, for each displaced object Objd, a pair of vertices ⟨vs, vg⟩ is found from AfG1, where vs ∈ DV and vg correspond to the positions of Objd in s0 and s1, respectively. Now, by simply finding a shortest path in AfG1 for ⟨vs, vg⟩, the reasoner can guess the agent, the action and the effort behind the change of Objd's position.
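A rough sketch of how such an explanation could be read off a path of the merged graph is given below; the graph, labels and attribution rule are illustrative reconstructions of the output format shown further down, not the authors' implementation.

```python
import networkx as nx

def explain_change(afg1, v_start, v_goal, agents):
    # Walk a shortest path and attribute each edge either to the agent it flows
    # into, or to "a place" when it flows to an object or virtual vertex.
    path = nx.dijkstra_path(afg1, v_start, v_goal, weight="weight")
    steps = []
    for u, v in zip(path, path[1:]):
        label = afg1.edges[u, v]["label"]
        if v in agents:                      # object flows to an agent: he acted
            steps.append(f"{label} by {v}")
        else:                                # flows to a place / virtual vertex
            steps.append(f"{label} at a place")
    return ", ".join(steps)

# Toy graph for the GREY TAPE case: old position v1 -> Human1 -> new position
afg1 = nx.DiGraph()
afg1.add_edge("v1", "Human1", label="GRASP PICK", weight=1)
afg1.add_edge("Human1", "GREY TAPE new pos", label="PUT ONTO", weight=1)
print(explain_change(afg1, "v1", "GREY TAPE new pos",
                     agents={"Human1", "Human2", "PR2"}))
# -> "GRASP PICK by Human1, PUT ONTO at a place"
```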

In the current example, the set of dummy vertices found by the reasoner is DV = {v1, v2, v3, v4}, as encircled in red in figure 10. To get possible explanations behind the changes, the set of vertex pairs obtained is {⟨v1, Obj1⟩, ⟨v2, Obj2⟩, ⟨v3, v4⟩}. Note that ⟨v3, v4⟩ corresponds to the lost object, Obj3. Below we show the partial output of the explanations produced by the reasoner. (Name mapping: Obj1: GREY TAPE, Obj2: WALLE TAPE, Obj3: LOTR TAPE.)

== POSSIBLE EXPLANATIONS == [interpretation in brackets]

LOTR TAPE Moved:
LOTR TAPE GRASP PICK by HUMAN2, GIVE at a place, TAKE by HUMAN1, PUT ONTO at a place
[Human2 picked the object and gave it to Human1, then Human1 placed it at its new position, which in fact was guessed by the robot, as it is a lost object.]

WALLE TAPE Moved:
WALLE TAPE GRASP PICK by HUMAN2, GIVE at a place, TAKE by HUMAN1, PUT ONTO at a place
[Human2 picked the object and gave it to Human1, and then Human1 placed it at its new perceived position.]

GREY TAPE Moved:
GREY TAPE GRASP PICK by HUMAN1, PUT ONTO at a place
[Human1 picked it and placed it at its new position.]

The above result shows the capability of the system to infer the potential cause of changes with a possible explanation. This is based on various assumptions about the agents and their willingness to cooperate and put in effort, and hence does not necessarily reconstruct the actual course of actions. Depending upon various factors, such as the effort criteria used for computing the Taskability and Manipulability Graphs, the criteria for assigning weights to the edges of the Affordance Graph, and the criteria for finding the path in the Affordance Graph, the resultant path in the graph could be different and could imply different assumptions for guessing the actions. In the current example, as the criterion was effort balancing and the resultant shortest path minimized overall effort, the explanation in some sense assumes that, whenever possible and feasible, agents will cooperate to achieve the changes. However, that is also how we as humans guess, based on some assumptions about the agents, their states and behaviors.

A remark on planning complexity: It is important to discuss the complexity of such task planning. These graphs are built for a given environment state si, aiming to provide rich information about the various types of affordances in that situation. Therefore, while finding a cooperative shared plan, a transition from one vertex to another will make some changes in the environment state and result in a new state sn. Hence, the graphs computed in si might no longer represent the actual affordances in sn, and a partial or full re-computation might be needed before searching for the next segment of the plan. This might lead to exponential complexity. In our example scenarios, we assume that the agents' relative positions are not changing significantly, and that the objects are not large enough to significantly affect the Taskability Graphs from one state to another. Hence, we relaxed the need to update these graphs and relied on the plan entirely produced by searching in the single graph. Such graphs are fast to compute, and updating them will be even faster. It is an interesting research challenge to design and develop algorithms which intelligently update such graphs (completely or partially) when they are used for planning.

C. Supporting Higher level symbolic task planners

Even with the limiting assumption discussed above, such graphs can provide high-level task planners, such as [30], with the flexibility to choose from different feasible sub-actions and the associated costs at different stages of planning. Our research is moving in this direction [31].

D. Grounding interaction, enhancing human-robot interaction, proactivity and action recognition

The graph can be used for querying during various human-robot verbal communications, grounding referred objects, synthesizing proactive behaviors, etc. Moreover, knowing the affordance possibilities can help in predicting where and in what part of another's action to efficiently or proactively get involved in a joint task or cooperation [22]. We are working in these research directions as well.

VI. CONCLUSION

We have geometrically grounded agent-object affordances from the HRI perspective and introduced the notion of the Manipulability Graph. By merging this with the Taskability Graph, introduced in our earlier work to encode agent-agent affordances, we presented a new kind of graph, the Affordance Graph, which introduces the aspects of perspective taking and effort in affordance analysis. It encodes the action potentialities of the agents for a set of basic human-robot interactive object manipulation tasks. In fact, it elevates the robot's knowledge about effort-based affordances for agents, objects and tasks, and will serve as the basis for developing more complex socio-cognitive behaviors. Fast updating and quick querying of the graph make it feasible for almost real-time human-robot interaction. We have shown its use for generating shared cooperative plans and grounding changes, and discussed other applications and pointers for future work.

REFERENCES

[1] J. J. Gibson, The Theory of Affordances. Psychology Press, Sep. 1986, pp. 127–143.
[2] D. Norman, The Psychology of Everyday Things. Basic Books, 1988.
[3] E. J. Gibson, "Perceptual learning in development: Some basic concepts," Ecological Psychology, vol. 12, no. 4, pp. 295–302, 2000.
[4] H. Koppula, R. Gupta, and A. Saxena, "Learning human activities and object affordances from rgb-d videos," International Journal of Robotics Research, vol. 32, no. 8, pp. 951–970, 2013.
[5] B. Moldovan, P. Moreno, M. van Otterlo, J. Santos-Victor, and L. De Raedt, "Learning relational affordance models for robots in multi-object manipulation tasks," in IEEE ICRA, 2012, pp. 4373–4378.
[6] M. Lopes, F. S. Melo, and L. Montesano, "Affordance-based imitation learning in robots," in IEEE/RSJ IROS, 2007, pp. 1015–1021.
[7] J. K. Witt, "Action's effect on perception," Current Directions in Psychological Science, vol. 20, no. 3, pp. 201–206, 2011.
[8] A. Saffiotti and M. Broxvall, "Affordances in an ecology of physically embedded intelligent systems," in Towards Affordance-Based Robot Control (2006). Springer-Verlag, 2008, pp. 106–121.
[9] E. Şahin, M. Çakmak, M. R. Doğar, E. Uğur, and G. Üçoluk, "To afford or not to afford: A new formalization of affordances toward affordance-based robot control," Adaptive Behavior, vol. 15, no. 4, pp. 447–472, Dec. 2007.
[10] A. Stoytchev, "Behavior-grounded representation of tool affordances," in IEEE ICRA, April 2005, pp. 3060–3065.
[11] E. Ugur, M. R. Dogar, M. Cakmak, and E. Sahin, "Curiosity-driven learning of traversability affordance on a mobile robot," in IEEE ICDL, 2007, pp. 13–18.
[12] R. Moratz and T. Tenbrink, "Affordance-based human-robot interaction," in 2006 International Conference on Towards Affordance-Based Robot Control. Berlin, Heidelberg: Springer-Verlag, 2008, pp. 63–76.
[13] E. Erdemir, M. Wilkes, K. Kawamura, and A. Erdemir, "Learning structural affordances through self-exploration," in IEEE Ro-Man, 2012, pp. 865–870.
[14] V. Hafner and F. Kaplan, "Interpersonal maps: How to map affordances for interaction behaviour," in Towards Affordance-Based Robot Control, ser. LNCS, E. Rome, J. Hertzberg, and G. Dorffner, Eds. Springer Berlin / Heidelberg, 2008, vol. 4760, pp. 1–15.
[15] A. K. Pandey and R. Alami, "Taskability graph: Towards analyzing effort based agent-agent affordances," in IEEE Ro-Man, 2012, pp. 791–796.
[16] J. H. Flavell, B. A. Everett, K. Croft, and E. R. Flavell, "Young children's knowledge about visual perception: Further evidence for the level 1 - level 2 distinction," Developmental Psychology, vol. 17, no. 1, pp. 99–103, 1981.
[17] P. Rochat, "Perceived reachability for self and for others by 3 to 5-year old children and adults," Journal of Experimental Child Psychology, vol. 59, pp. 317–333, 1995.
[18] C. Breazeal, M. Berlin, A. G. Brooks, J. Gray, and A. L. Thomaz, "Using perspective taking to learn from ambiguous demonstrations," Robotics and Autonomous Systems, vol. 54, pp. 385–393, 2006.
[19] J. Trafton, N. Cassimatis, M. Bugajska, D. Brock, F. Mintz, and A. Schultz, "Enabling effective human-robot interaction using perspective-taking in robots," IEEE Trans. SMC, Part A: Systems and Humans, vol. 35, no. 4, pp. 460–470, 2005.
[20] L. Marin-Urias, E. Sisbot, A. Pandey, R. Tadakuma, and R. Alami, "Towards shared attention through geometric reasoning for human robot interaction," in Humanoids 2009, Dec. 2009, pp. 331–336.
[21] A. K. Pandey and R. Alami, "Mightability maps: A perceptual level decisional framework for co-operative and competitive human-robot interaction," in IROS, 2010, pp. 5842–5848.
[22] N. Sebanz and G. Knoblich, "Prediction in joint action: What, when, and where," Topics in Cognitive Science, vol. 1, no. 2, pp. 353–367, 2009.
[23] H. J. Choi and L. S. Mark, "Scaling affordances for human reach actions," Human Movement Science, vol. 23, no. 6, pp. 785–806, 2004.
[24] D. L. Gardner, L. S. Mark, J. A. Ward, and H. Edkins, "How do task characteristics affect the transitions between seated and standing reaches?" Ecological Psychology, vol. 13, no. 4, pp. 245–274, 2001.
[25] J.-P. Saut and D. Sidobre, "Efficient models for grasp planning with a multi-fingered hand," Robot. Auton. Syst., vol. 60, pp. 347–357, 2012.
[26] M. Gharbi, J. Cortes, and T. Simeon, "A sampling-based path planner for dual-arm manipulation," in IEEE/ASME Int. Conf. on Advanced Intelligent Mechatronics, 2008, pp. 383–388.
[27] A. K. Pandey and R. Alami, "Towards task understanding through multi-state visuo-spatial perspective taking for human-robot interaction," in IJCAI-ALIHT, July 2011.
[28] T. Simeon, J.-P. Laumond, and F. Lamiraux, "Move3d: a generic platform for path planning," in 4th Int. Symp. on Assembly and Task Planning, 2001, pp. 25–30.
[29] A. Clodic, H. Cao, S. Alili, V. Montreuil, R. Alami, and R. Chatila, "Shary: A supervision system adapted to human-robot interaction," vol. 54, pp. 229–238, 2009.
[30] S. Alili, R. Alami, and V. Montreuil, "A task planner for an autonomous social robot," in Distributed Autonomous Robotic Systems (DARS), 2008, pp. 335–344.
[31] L. de Silva, A. K. Pandey, and R. Alami, "An interface for interleaved symbolic-geometric planning and backtracking," in IEEE/RSJ IROS, 2013.
