Pruning power performance. Having defined the thresholds σ and δ, we can now verify the pruning power of the LB_Keogh lower bounding function, that is, its capability to filter out negative human motions. We perform a 1-nearest-neighbor search using the sequential scan technique. A random human motion was chosen from the data set to act as the query, and the remaining 269 human motions acted as the data. The search was carried out for 50 trials at each length of the human motion. We recall Formula 20 to measure the pruning power for a query human motion. The average over the 50 queries was reported as the pruning power for each length of the dataset. Figure 14 shows how the average pruning power of the proposed LB_Keogh lower bounding function varies with the length of the data in the human motion dataset: 92% of the human motions of length 1024 and 67% of the human motions of length 32 did not require computation of the actual time warping distance. This promising pruning power greatly reduces the querying time. We conducted experiments to measure the time required for query evaluation of human motions of different lengths. Beforehand, we recall the work of Keogh et al. [16] to set the time warping constraint ε. According to the experimental results of Keogh, the threshold ε is most efficient at a value of 20% of the length of the real time-series data. Thus we set ε = 0.2.
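The search procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one-dimensional series of equal length, a warping band of half-width r derived from ε (for example r = int(0.2 * n)), and it takes pruning power to be the fraction of database items for which the full time warping distance was never computed, since Formula 20 is not reproduced here. The function names `dtw`, `lb_keogh` and `nn_search` are hypothetical.

```python
import math


def dtw(q, c, r):
    """Time warping distance between equal-length series, band half-width r."""
    n, m = len(q), len(c)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(m, i + r) + 1):
            cost = (q[i - 1] - c[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return math.sqrt(D[n][m])


def lb_keogh(q, c, r):
    """Distance from c to the upper/lower envelope of q; never exceeds dtw(q, c, r)."""
    n = len(q)
    total = 0.0
    for i, ci in enumerate(c):
        window = q[max(0, i - r):min(n, i + r + 1)]
        upper, lower = max(window), min(window)
        if ci > upper:
            total += (ci - upper) ** 2
        elif ci < lower:
            total += (ci - lower) ** 2
    return math.sqrt(total)


def nn_search(query, data, r):
    """Sequential-scan 1-NN search; returns (index, distance, pruning power)."""
    best, best_idx, pruned = math.inf, None, 0
    for idx, c in enumerate(data):
        # If the cheap lower bound cannot beat the best match so far,
        # the expensive warping distance is skipped entirely.
        if lb_keogh(query, c, r) >= best:
            pruned += 1
            continue
        d = dtw(query, c, r)
        if d < best:
            best, best_idx = d, idx
    return best_idx, best, pruned / len(data)
```

Because the lower bound never exceeds the true distance, pruning cannot discard the actual nearest neighbor; it only avoids unnecessary distance computations.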
On singular values decomposition and patterns for human motion analysis and simulation
Adrien Datas, Pascale Chiron and Jean-Yves Fourquet
Abstract— We are interested in human motion characterization and automatic motion simulation. The apparent redundancy of the humanoid w.r.t. its explicit tasks leads to the problem of choosing a plausible movement in the framework of redundant kinematics. This work explores the intrinsic relationships between singular value decomposition at the kinematic level and optimization principles at the task level and joint level. Two task-based schemes devoted to the simulation of human motion are then proposed and analyzed. These results are illustrated by motion captures, analyses and task-based simulations. Patterns of singular values serve as a basis for a discussion concerning the similarity of simulated and real motions.
Some results have been obtained in the human motion literature for reach motions that involve the position of the hands. We discuss these results and an associated motion generation scheme. When orientation is also explicitly required, very few works are available and even the methods for analysis are not defined.
We discuss the choice of metrics adapted to orientation, and also the problems encountered in defining a proper metric in both position and orientation. Motion capture and simulations are provided in both cases. The main goals of this paper are:
tometer to magnetic disturbances was confirmed by additional dedicated experiments.
V. CONCLUSION
The IMU is an interesting technology for measuring human motion in order to programme, command, control and collaborate with a robot. The method presented here uses two sensor modules, one for each segment of the human arm, in order to estimate the wrist trajectory. Each module is used to calculate the rotation matrix representing the transformation from the local frame to the inertial frame. Two methods are used to calculate this matrix, one based on accelerometer and magnetometer data and the other based on gyroscope data. Both methods have advantages and drawbacks. The first is accurate only for very slow movements and is very sensitive to magnetic disturbances; the second gives a smooth trajectory but drifts over time. A complementary filter is used to take advantage of both methods. The estimated trajectory of the wrist is then used to move a robot arm. A significant discrepancy was observed between the human wrist trajectory and the movement of the robot. Experiments showed that the magnetometer, which is easily disturbed by the environment, is the main source of error in comparison to the accelerometer. Future work will focus on improving the measurement of the wrist trajectory, either through a better magnetometer signal or through another method not based on the magnetometer. The subsequent steps will be to integrate the position and the orientation of the wrist and then of the hand.
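The fusion idea can be illustrated with a one-dimensional sketch. It is a simplification under stated assumptions: the paper works with rotation matrices, whereas this example tracks a single angle, and the blending weight `alpha`, the function name and the signal names are all hypothetical. The gyroscope rate is integrated for short-term smoothness, and the drift-free but noisy reference angle (as would come from the accelerometer and magnetometer) slowly corrects it.

```python
def complementary_filter(gyro_rates, reference_angles, dt,
                         alpha=0.98, initial_angle=0.0):
    """Blend integrated gyroscope rates with an absolute reference angle.

    gyro_rates:        angular rates in rad/s (smooth, but biased -> drift)
    reference_angles:  absolute angles in rad (no drift, but disturbed)
    alpha:             assumed weight given to the gyroscope prediction
    """
    angle = initial_angle
    estimates = []
    for rate, reference in zip(gyro_rates, reference_angles):
        predicted = angle + rate * dt          # short-term: integrate the gyro
        angle = alpha * predicted + (1.0 - alpha) * reference  # long-term pull
        estimates.append(angle)
    return estimates
```

With a constant gyroscope bias, pure integration grows without bound, while the filtered estimate settles at a small constant offset from the reference. A disturbed reference, such as a magnetometer near metal, still leaks into the estimate through the (1 - alpha) term, which is consistent with the error source reported above.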
I. OBJECTIVE OF THE WORK
New generations of humanoid robots are becoming more faithful copies of the human body. In order to facilitate the integration of humanoid robots into the human environment, our objective is to generate humanoid motions inspired by human motion in everyday human activities. The human body has many Degrees of Freedom (DOFs), but the classical kinematic representation of each arm involves 7 DOFs. Thus, to achieve a defined task with the hand, even one including both position and orientation, redundancy exists. The redundancy increases when displacement of the trunk is possible. In robot control, the redundancy is generally resolved at the kinematic level by minimization of a criterion or by definition of several tasks with different priority levels [1, 2]. In this context, our objective is to define which criterion in the inverse kinematics (IK) algorithm leads to human-like motion.
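One common way to resolve redundancy at the kinematic level is to track the task with the Jacobian pseudoinverse and to descend a secondary criterion inside the Jacobian null space. The sketch below illustrates that idea only; it is not the criterion studied in this work. It assumes a planar three-joint arm with a two-dimensional hand position task (one redundant DOF), hypothetical link lengths, and a secondary criterion that pulls the joints toward a rest posture `q_rest`.

```python
import math

LINKS = (0.30, 0.25, 0.20)  # hypothetical link lengths in metres


def forward(q):
    """Hand position of the planar arm for joint angles q."""
    x = y = angle = 0.0
    for length, qi in zip(LINKS, q):
        angle += qi
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


def jacobian(q):
    """2x3 Jacobian of the hand position with respect to the joints."""
    cum = [q[0], q[0] + q[1], q[0] + q[1] + q[2]]
    J = [[0.0] * 3 for _ in range(2)]
    for j in range(3):
        for k in range(j, 3):
            J[0][j] -= LINKS[k] * math.sin(cum[k])
            J[1][j] += LINKS[k] * math.cos(cum[k])
    return J


def pseudo_inverse(J):
    """Right pseudoinverse J^T (J J^T)^-1 of a full-rank 2x3 matrix (3x2)."""
    a = sum(v * v for v in J[0])
    b = sum(u * v for u, v in zip(J[0], J[1]))
    d = sum(v * v for v in J[1])
    det = a * d - b * b
    inv = [[d / det, -b / det], [-b / det, a / det]]
    return [[J[0][i] * inv[0][c] + J[1][i] * inv[1][c] for c in range(2)]
            for i in range(3)]


def joint_velocity(q, target, q_rest, k_null):
    """Task term J+ e plus the null-space term (I - J+ J) z."""
    x, y = forward(q)
    err = (target[0] - x, target[1] - y)
    J = jacobian(q)
    Jp = pseudo_inverse(J)
    # Secondary criterion H(q) = 0.5 * ||q - q_rest||^2; z is its descent direction.
    z = [-k_null * (qi - ri) for qi, ri in zip(q, q_rest)]
    Jz = [sum(J[r][i] * z[i] for i in range(3)) for r in range(2)]
    dq = []
    for i in range(3):
        task = Jp[i][0] * err[0] + Jp[i][1] * err[1]
        null = z[i] - (Jp[i][0] * Jz[0] + Jp[i][1] * Jz[1])
        dq.append(task + null)
    return dq


def solve(q0, target, q_rest, k_null=0.5, dt=0.1, iters=1000):
    """Integrate the joint velocity with small Euler steps."""
    q = list(q0)
    for _ in range(iters):
        dq = joint_velocity(q, target, q_rest, k_null)
        q = [qi + dt * dqi for qi, dqi in zip(q, dq)]
    return q
```

Changing the secondary criterion changes the posture reached for the same hand position, which is why the choice of criterion matters for producing human-like motion.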
For the task of human motion analysis, obtaining 3D information about the human pose is also a challenge, which has attracted many researchers. Motion capture systems, like those from Vicon [90], are able to accurately capture the human pose and track it over time, resulting in high-resolution data that include markers representing the human pose. Motion capture data have been widely used in industry, for example in animation and video games. In addition, many datasets providing such data for different human actions in different contexts have been released, like the Carnegie Mellon University Motion Capture database [18]. However, these systems present some disadvantages. First, the cost of such technology may limit its use. Second, it requires the subject to wear physical markers so that the 3D pose can be estimated. As a result, this technology is not convenient for the general public.
The majority of works on temporal alignment of human motion sequences make use of a template showing a specific subject in a specific type of clothing. The main advantage of such methods is that they are particularly robust to severe acquisition artifacts. However, a template is required for tracking. For example, the method of Bradley et al. [5], which is designed for the temporal alignment of a moving cloth, constructs the template from a photograph of the garment. Aguiar et al. [8, 20] create the template by a full-body laser scan of the subject in their current clothes before performing capture. A physically-based cloth model is then used for temporal alignment. Budd et al. [7] introduce a way to align frames that are not necessarily temporally adjacent, but most similar according to a volume-based shape similarity measure, which makes the template-based alignment more robust to large motions between adjacent frames. Allain et al. [1] perform shape tracking assuming a locally rigid deformation model, in order to compute both the mean pose and correspondences over time. The template is manually selected as one of the meshes of the input sequence. While this method is surface-based, the follow-up method [2] proposes a volumetric parametrization of the tracking which shows better results than surface-based methods. The shape is represented by a centroidal Voronoi tessellation, which enables volume conservation. This method also assumes a template is provided. Our method is template-based for robustness purposes. However, no additional information or user interaction is necessary to build the template, since it is automatically selected among the meshes of the sequence using a method similar in spirit to that of Letouzey and Boyer [12].
considered first. As a consequence, the trajectory tracking and the balance management tasks dealt with already admissible trajectories. The imitation showed good results in terms of all four tasks. Fig. 2 shows the simultaneous hand and foot trajectories of the scaled human (blue) and humanoid (red) movements during on-line tracking. The distances are in [mm] for the left hand (top), the right hand (middle) and the left foot (bottom). The Cartesian values were synchronized in time, which means that 1) the robot motion was performed at the same velocity as the human motion and 2) the human movement coordination was respected. All the optimized tasks are in the kernel of the last Jacobian, which means they have equivalent priority in the proposed model. The only way to modify the order of priority is the tuning of the gains κ_ℓ and κ_h. Let us also point out that the
Future research directions will involve different issues. First of all, the probabilistic human motion models provide complementary tools to the appearance modeling usually considered for the detection and tracking of people. The Bayesian framework exploited in our work could be easily extended to combine both appearance and motion models. Additionally, we could enrich the characterization of human motion by learning more complex temporal models of human motion using time series analysis tools such as HMMs or linear and non-linear auto-regressive models. With a more varied training set, we could learn more general models of 2D image motion. Finally, the proposed probabilistic human motion models could also be used to characterize and analyze other categories of dynamic events, not necessarily human related, such as dynamic phenomena occurring in meteorological image sequences.
Another challenge for the understanding of human motion is related to the evaluation of joint forces and torques, as well as muscle efforts. This kind of information appears especially useful to improve the performance of a sport gesture, to detect muscle failure and to prevent injuries. Inverse dynamics methods have recently been proposed in order to evaluate muscle efforts by combining 3D measurement data with a full dynamic model of the patient. We intend to explore those techniques in the lab and to address many open questions regarding the patient-specific modeling of the human body, the influence of muscle activation dynamics and muscle forces, or the development of reliable numerical methods to solve inverse problems in biomechanics.
In Ancient Greece, studies on motion were also conducted. In particular, Aristotle (384 B.C. to 322 B.C.) published texts about gait in animals which included some observations about the motion patterns involved in humans. In the Renaissance period, Leonardo da Vinci (1452-1519) stated that it was indispensable for a painter to become familiar with anatomy in order to understand which muscles caused particular motions of the human body parts. Furthermore, a detailed description of how humans climb stairs was also given (see Figure 1.2). In those days, art was a discipline which devoted a lot of effort to studying human motion. Alfonso Borelli (1608-1679) was one of the pioneers in the measurement and analysis of human locomotion from a quantitative point of view. The foundation of modern dynamics was laid down by Isaac Newton during the “Enlightenment period”. The three laws of motion were a crucial contribution to understanding human motion. From an analytical point of view, they also achieved more accurate results than any previous methods.
Figure 7: Some preliminary results suggest that range laser scanners could also be used to recognize people, with machine learning techniques, based on their gait [7], which has potentia[r]
Institut Mines-Télécom / Télécom Lille, Villeneuve d’Ascq, France
Abstract
3D shape similarity from video is a challenging problem lying at the heart of many primary research areas in computer graphics and computer vision applications. In this paper, we address within a new framework the problem of 3D shape representation and shape similarity in human video sequences. Our shape representation is formulated using the Extremal Human Curve (EHC) descriptor extracted from the body surface. It allows us to benefit from Riemannian geometry in the open curve shape space and therefore to compute statistics on it. It also allows subject pose comparison regardless of geometrical transformations and elastic surface change. Shape similarity is computed by an efficient method which takes advantage of a compact EHC representation in the open curve shape space and an elastic distance measure. Thanks to these main assets, several important applications of human action analysis are addressed: shape similarity computation, video sequence comparison, video segmentation, video clustering, summarization and motion retrieval.
only depth features are used, our method is not fairly comparable to the others. Indeed, even if we only use depth features to describe MSs, our method still needs skeleton data to identify MSs. Nevertheless, we can see that our segmentation approach allows a good recognition of activities when each segment is only described by depth appearance features. Compared to skeleton-based methods, our approach significantly outperforms other solutions. This shows that our segmentation approach combined with shape analysis of human motion allows us to efficiently recognize activities involving manipulation of objects. Even without considering any information about objects held by the subject, we are able to recognize 71.8% of the activities. This result is higher than that scored by [33] and [35], which combine both skeleton and depth features. Finally, if we add depth features to the skeleton, the recognition accuracy is increased to 80.9%, which is almost 10% above the best state-of-the-art method [35].
are used to generate the human-like motion of the humanoid. The proposed conversion process improves existing techniques and is developed with the aim of enabling imitation of human motion with a humanoid robot, to perform a task with and/or without contact between hands and equipment. A comparative analysis shows that our algorithm, which takes into account the situation of marker frames and the geometry.
email: mehdi@benallegue.com
Jean-Paul Laumond is Research Director at LAAS-CNRS in Toulouse, France. He is a member of the French Academy of Technology. His research is devoted to robot motion. In the 1990s, he was the coordinator of two European Esprit projects, both dedicated to robot motion planning and control. In the early 2000s, he created and managed Kineo CAM, a spin-off company from LAAS-CNRS devoted to developing and marketing motion planning technology. Siemens acquired Kineo CAM in 2012. In 2006, he launched the research team Gepetto, dedicated to Human Motion studies. He teaches Robotics at École Normale Supérieure in Paris. He has published more than 150 papers in international journals and conferences in Robotics, Computer Science, Automatic Control and, recently, in Neurosciences. He was the 2011-2012 recipient of the Chaire Innovation Technologique Liliane Bettencourt at Collège de France in Paris. His current project Actanthrope is devoted to the computational foundations of anthropomorphic action and is supported by the European Research Council (ERC-ADG 340050). His research interests include humanoid robotics, human locomotion, digital actor animation and motion planning.
motion is decomposed by a sliding window or key features to build a codebook during a learning phase over the whole dataset, and then each motion can be represented as a bag or histogram of words [59]. For an effective representation of motion data, both the spatial and temporal dynamics of human motion must be modeled. The Hidden Markov Model (HMM) is a popular technique for modeling sequential data. The HMM represents the human motion as a succession of states. At each state, local statistics and state transition probabilities are determined by the training phase on the dataset. Following recent progress in deep learning techniques, many applications in the computer graphics field, including motion data recognition and prediction, have undergone a change of paradigm. In particular, Recurrent Neural Networks (RNNs) are capable of preserving state from one time step to the next, hence they are suitable for sequence-based problems. Long Short-Term Memory networks (LSTMs) are a special kind of RNN, with the main difference lying in the inclusion of memory states and gates, which can learn long-term dependencies in time series problems. They notably address the problem of vanishing gradients [60], which arises in very deep neural networks, including RNNs.
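To make the HMM description concrete, the following sketch scores a sequence of discrete motion words with the standard forward algorithm. It is an illustration under assumptions, not a method taken from the cited works: observations are codebook indices, and the start, transition and emission probabilities would normally come from the training phase, whereas the numbers in the usage note are made up.

```python
def forward_likelihood(observations, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM.

    observations: list of symbol indices (e.g. codebook words)
    start[s]:     probability of starting in state s
    trans[p][s]:  probability of moving from state p to state s
    emit[s][o]:   probability of emitting symbol o in state s
    """
    n_states = len(start)
    # Initialisation: joint probability of each state and the first symbol.
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    # Recursion: propagate through the transitions, then weight by emission.
    for o in observations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)
```

A recognizer built this way would train one model per motion class and assign a new sequence to the class whose model gives the highest likelihood. For long sequences the probabilities underflow, so practical implementations rescale `alpha` at each step or work in log space.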
Upon observing the way in which humans perform the task, it is reasonable to think that humans use combinations of different criterion functions instead of a single criterion, as presented in the article above. Park et al. 16 and Albrecht et al. 17 used sets of parameters (such as minimization of joint jerks, minimization of torque changes, and so on) to produce the combination of criterion functions for human motion analysis. Mombaur et al. 6,7 defined the imitation of human locomotion as an optimization problem with an objective function defined in the task space. The objective function is a weighted sum of basic criterion functions such as minimization of total time, the integrated squares of the three acceleration components, and the integrated squared difference between the body orientation angle and the direction toward the goal. The aim of their research was to produce a universal combination of the weighting coefficients for the optimization algorithm that satisfies the imitation of any type of human locomotion. Billard et al. 18 extended the pseudoinverse optimization method for solving the IK problem in order to determine the optimal imitation strategy which best satisfies the constraints of the given task. They defined the objective function as a weighted sum of basic criterion functions defined in the Cartesian and joint spaces. Their optimization algorithm minimizes the difference between the current and the desired position of the joints and the three-dimensional (3-D) Cartesian position of the hands. They compute the trajectory of robot joints that imitates human motions. The constraints of the robot’s body are taken into account. Likewise, using the joint space, Yang et al. 19 analyzed human motion by combining joint displacement minimization, changes in potential energy, and a discomfort basic function in a multiobjective optimization algorithm in order to predict a static posture for the human.
The virtual human Santos has been used to evaluate different performance measures and to test the applicability of their optimization algorithm to posture prediction. In each basic function, they proposed a weight coefficient for each joint, taking into account the importance of particular joints for carrying out the task. They applied the optimization algorithm for each basic function separately and
S. Berretti, P. Pala and A. Del Bimbo are with the Media Integration and Communication Center, University of Florence, Florence, Italy (e-mail: stefano.berretti@unifi.it, pietro.pala@unifi.it, delbimbo@dsi.unifi.it).
models to the data, thus supporting detection and tracking of skeleton models of human bodies in real time. However, solutions which aim to understand the observed human actions by interpreting the dynamics of these representations are still quite limited. What further complicates this task is that action recognition should be invariant to geometric transformations, such as translation, rotation and global scaling of the scene. Additional challenges come from noisy or missing data, and variability of poses within the same action and across different actions. In this paper, we address the problem of modeling and analyzing human motion from skeleton sequences captured by depth cameras. In particular, our work focuses on building a robust framework, which recasts the action recognition problem as a statistical analysis on the shape space manifold of open curves. In such a framework, not only the geometric appearance of the human body is encoded, but also the dynamic information of the human motion. Additionally, we evaluate the latency performance of our approach by determining the number of frames that are necessary to permit a reliable recognition of the action.