Learning objects model and context for recognition and localisation

Araujo, Universidade de Coimbra; Marten Bjorkman, Kungliga Tekniska Hogskolan. Examiners: Eric Marchand, IRISA-INRIA; Frederic Lerasle, LAAS-CNRS…

Mid-level features and spatio-temporal context for activity recognition

Context has also been considered as an important cue for action recognition [9, 28, 29]. The authors of [9] propose to learn neighborhood shapes of the space-time features that are discriminative for a given action category, and recursively map the descriptors of the variable-sized neighborhoods into higher-level vocabularies, resulting in a hierarchy of space-time configurations. More recently, Wang et al. [28] introduced a contextual model to capture contextual interactions between interest points. Multiple channels of contextual features for each interest point are computed in multi-scale contextual domains with different shapes, where an individual context is represented by the posterior density of the particular feature class at the pixel location. Multiple kernel learning is then used to select the best combination of channels in a multi-channel SVM classification. In [29], objects and human body parts are treated as mutual context, and their interactions are modeled using random fields. The authors cast the learning task as a structure learning problem, by which the structural connectivity between objects, overall human poses, and different body parts is estimated.
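The multi-channel combination described above can be sketched as a weighted sum of per-channel kernels. Everything here (the RBF kernel choice, the channel features, the fixed weights) is illustrative: in the cited work the channel weights are learned by multiple kernel learning rather than set by hand.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """RBF kernel between two feature vectors."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * d2)

def combined_kernel(channels_x, channels_y, weights, gamma=1.0):
    """Weighted sum of per-channel kernels, one kernel per contextual channel."""
    return sum(w * rbf_kernel(cx, cy, gamma)
               for w, cx, cy in zip(weights, channels_x, channels_y))

# Two hypothetical contextual channels for two interest points.
x = [[0.2, 0.1], [1.0, 0.0]]
y = [[0.2, 0.1], [0.0, 1.0]]
k = combined_kernel(x, y, weights=[0.7, 0.3])
```

An MKL solver would optimize `weights` jointly with the SVM; fixing them by hand only illustrates how the combined kernel is evaluated.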

Context Aware Group Activity Recognition

Current group activity recognition approaches [4], [16], [18], [23], [32], [40] typically tackle this problem by decomposing it into two parts: feature learning and relational reasoning. The first part focuses on learning person-specific visual features important for understanding individual actions. In the second part, pairwise relations are modeled to infer the group activities. Despite recent advances, these approaches still confuse visually similar group activities because they rely only on person-level appearance features in the feature learning part and ignore the contextual information present in videos. Consider the example shown in Fig. 1. It is challenging to differentiate the walking activity in the first case from the crossing activity in the second using appearance features alone, as in both cases people are moving from one point to another. However, given additional cues identifying that the group of people is moving on a sidewalk in Fig. 1a vs. a road in Fig. 1b, the model can learn to distinguish these group activities. We term these additional cues contextual information and propose to integrate them with appearance features for group activity understanding.

Interactive Robot Learning for Multimodal Emotion Recognition

Moreover, to the best of our knowledge, no studies have focused on fusing gait and thermal facial features to detect human emotion in a social robotics context. A very common scenario is one in which a human walks towards a social robot and stops in front of it to interact. In our work, we developed a multimodal emotion recognition method that uses thermal facial images captured during human-robot interaction and gait data captured while walking towards the robot, to recognize four emotional states (neutral, happy, angry, and sad). Offline emotion recognition has been widely studied; the online testing of an emotion recognition model, however, remains challenging in a real-time HRI context. In this paper, we developed a new method based on a Random Forest (RF) model and on the confusion matrices of two individual Random Forest models, using data from the thermal face images and the gait. In addition, the IRL method is used with verbal human feedback in the learning loops in order to improve the performance of real-time emotion recognition. The experimental results show the effectiveness of IRL in multimodal emotion recognition.
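One plausible way to fuse two per-modality Random Forests through their confusion matrices is to weight each model's vote by its historical recall on the class it predicts. The matrices, class indices, and the recall-based rule below are assumptions for illustration, not the authors' exact method.

```python
def class_reliability(confusion):
    """Per-class recall from a confusion matrix (rows = true, cols = predicted)."""
    rel = []
    for i, row in enumerate(confusion):
        total = sum(row)
        rel.append(row[i] / total if total else 0.0)
    return rel

def fuse(pred_a, pred_b, conf_a, conf_b):
    """Keep the prediction whose model is more reliable on its own predicted class."""
    ra = class_reliability(conf_a)[pred_a]
    rb = class_reliability(conf_b)[pred_b]
    return pred_a if ra >= rb else pred_b

# Hypothetical 4-class confusion matrices (neutral, happy, angry, sad).
conf_thermal = [[8, 1, 1, 0], [1, 7, 1, 1], [0, 2, 8, 0], [1, 1, 0, 8]]
conf_gait    = [[6, 2, 1, 1], [1, 8, 1, 0], [1, 1, 7, 1], [2, 0, 1, 7]]
label = fuse(pred_a=1, pred_b=2, conf_a=conf_thermal, conf_b=conf_gait)
```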

Distributed UI on Interactive tabletops: issues and context model

This application is intended to teach the recognition and learning of colors (only red, yellow, blue and green) to children (aged from 2 to 5, according to the level of difficulty). The scenario is based on the French teaching syllabus for nursery schools. We asked a teacher to imagine one or more scenarios using an interactive tabletop and a set of objects, without imposing any limits or constraints. The teacher proposed a simple application in which the children have to move a set of objects which have “lost their color” into the suitably colored frame (i.e. a “black and white” bee should be placed inside a yellow frame) [14].

Context models and out-of-context objects

Myung Jin Choi, Antonio Torralba, Alan S. Willsky. Abstract: The context of an image encapsulates rich information about how natural scenes and objects are related to each other. Such contextual information has the potential to enable a coherent understanding of natural scenes and images. However, context models have been evaluated mostly on the improvement of object recognition performance, even though that is only one of many ways to exploit contextual information. In this paper, we present a new scene understanding problem for evaluating and applying context models. We are interested in finding scenes and objects that are “out-of-context”. Detecting “out-of-context” objects and scenes is challenging because context violations can be detected only if the relationships between objects are carefully and precisely modeled. To address this problem, we evaluate different sources of context information, and present a graphical model that combines these sources. We show that physical support relationships between objects can provide useful contextual information for both object recognition and out-of-context detection.
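One simple source of context of the kind such models evaluate is pairwise object co-occurrence: an object can be scored by the mean log-probability of seeing it alongside the other detected objects, so an out-of-context object gets a markedly lower score. The probability table and default value below are invented for illustration.

```python
import math

# Hypothetical pairwise co-occurrence probabilities learned from data.
cooccur = {
    ("car", "road"): 0.9, ("car", "sky"): 0.4, ("car", "sofa"): 0.01,
    ("sofa", "tv"): 0.7, ("sofa", "road"): 0.02,
}

def pair_prob(a, b, default=0.1):
    """Look up a symmetric co-occurrence probability, with a fallback default."""
    return cooccur.get((a, b), cooccur.get((b, a), default))

def context_score(obj, others):
    """Mean log-probability of obj co-occurring with the other detected objects."""
    if not others:
        return 0.0
    return sum(math.log(pair_prob(obj, o)) for o in others) / len(others)

# A car in a living-room scene scores far lower than a sofa there.
s_car = context_score("car", ["sofa", "tv"])
s_sofa = context_score("sofa", ["tv"])
```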

Graphical event model learning and verification for security assessment

To explore the dynamics of a wide variety of system behaviors based on collected event streams, many advanced continuous-time modeling formalisms exist: for instance, continuous time Bayesian networks, Markov jump processes [6], Poisson networks, and graphical event models (GEMs) [2]. In this work we are interested in Recursive Timescale Graphical Models (RTGEMs) [2], a sub-family of GEMs that presents advantages over the other formalisms. Appropriate learning and verification techniques must be adapted to the chosen formalism. Standard model checking, for example, is used as a verification method [1]. It has been applied to many formalisms but, to the best of our knowledge, never adapted to RTGEMs. Another valid solution for verification is approximation methods, such as Statistical Model Checking (SMC) [4], an efficient technique based on simulations and statistical results. SMC has been successfully applied to probabilistic graphical models such as dynamic Bayesian networks (DBNs) in [3]. In the same way, SMC could easily be adapted to RTGEMs.
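The core of SMC is Monte Carlo estimation of the probability that a simulated trajectory satisfies a property. The sketch below checks a toy homogeneous event process (exponential inter-arrival times) rather than an RTGEM, and the rate, horizon, and threshold are arbitrary.

```python
import random

def simulate_event_count(rate, horizon, rng):
    """Simulate one trajectory of the event process; count events before the horizon."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(rate)
        if t > horizon:
            return count
        count += 1

def smc_estimate(rate, horizon, threshold, n_runs=10000, seed=0):
    """Statistical model checking: estimate P(at least `threshold` events by `horizon`)."""
    rng = random.Random(seed)
    hits = sum(simulate_event_count(rate, horizon, rng) >= threshold
               for _ in range(n_runs))
    return hits / n_runs

p = smc_estimate(rate=2.0, horizon=1.0, threshold=1)
```

For rate 2.0 and horizon 1.0 the true probability of at least one event is 1 − e⁻² ≈ 0.865, which the estimate approaches as `n_runs` grows.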

Context-based visual feedback recognition

A real-time visual feedback recognition library for interactive interfaces (called Watson) was developed to recognize head gaze, head gestures, and eye gaze using th…

Mineral grains recognition using computer vision and machine learning

3.3. Data post-processing

Due to random displacements between the SEM ground truth and the optical image, the instance (attributes and labels) dataset had to be cleaned. To explain this, a first description of the dataset is necessary. The dataset has a total of 786 655 instances spread over 287 classes. The majority of these classes cannot be used due to their small number of occurrences. Furthermore, the imbalance of instances among categories affects the training phase of machine learning algorithms and their performance during the classification test (Yen and Lee, 2006). For example, there are a total of 2 instances for the "Actinolite, Plagioclase" class and 16 566 occurrences for the "Plagioclase, None" category. In addition, the classification algorithm used to process the SEM data yields a category named "Unknown", to which particles with an ambiguous composition that could not be allocated a mineral name are assigned. The classification fails when the chemical composition of a mineral exceeds the specified tolerance in distance in the Euclidean hyperspace, due to impurities, mixed signals, or spectral deconvolution issues. Consequently, all instances with the word "Unknown" in their label were excluded from learning, both because sand grains normally belong to a known mineral and to avoid contaminating the other classes. Also, particles identified as "Quartz" are overwhelmingly dominant (47 570 instances) but plagued with various color issues. Quartz is typically colorless and transparent; however, it may be stained by iron oxide coating, tinted by internal structural damage, or loaded with submicroscopic inclusions that alter its apparent color. Being transparent and birefringent, light traversing the grains tends to disperse as in a prism into "rainbows". Furthermore, due to this transparency, quartz particles may reflect the colored light from neighboring grains. Consequently, instances labeled "Quartz" were eliminated from the dataset. Finally, to prove the computer vision and machine learning concept, classes that are not pure were excluded. For example, instances labeled "Plagioclase, None" were considered pure and were preserved, while instances labeled "Plagioclase, Magnetite" were not and were disregarded. Once post-processed, 546 444 instances were retained, labeled into 9 classes. Among these instances, the "Background" class accounts for 468 431 instances.
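The filtering rules above (drop "Unknown", drop "Quartz", keep only pure ", None" classes, drop rare classes) can be sketched as a single pass over labeled instances. The label format, the `min_count` threshold, and the helper name are assumptions for illustration.

```python
from collections import Counter

def clean_labels(instances, min_count=100):
    """Filter (features, label) pairs: drop 'Unknown' and 'Quartz' labels,
    keep only pure classes (second component 'None'), then drop rare classes."""
    kept = [(feat, label) for feat, label in instances
            if "Unknown" not in label and "Quartz" not in label
            and label.endswith(", None")]
    counts = Counter(label for _, label in kept)
    return [(f, l) for f, l in kept if counts[l] >= min_count]

# Toy dataset: empty dicts stand in for the per-instance feature vectors.
data = (
    [({}, "Plagioclase, None")] * 3
    + [({}, "Plagioclase, Magnetite")] * 3
    + [({}, "Unknown, None")] * 3
    + [({}, "Actinolite, None")]
)
cleaned = clean_labels(data, min_count=2)
```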

Context-based Visual Feedback Recognition

The contextual features are evaluated at the same rate as the vision-based gesture recognizer.

4.5 Experiments

We designed our experiments to demonstrate how contextual features can improve visual feedback recognition along the same two axes described in Chapter 2: embodiment and conversational capabilities. Our datasets include interactions with a robot, an avatar, and a non-embodied interface. We also experimented with different types of visual feedback: head nods, head shakes, and eye gaze aversion gestures. Finally, we tested our context-based recognition framework (described in Section 4.3) with two classification algorithms (SVM and FHCRF) to show the generality of our approach. The experiments were performed on three different datasets: MelHead, WidgetsHead, and AvatarEye. For the MelHead dataset, the goal is to recognize head nods and head shakes from human participants interacting with a robot. For the WidgetsHead dataset, the goal is to recognize head nods from human participants interacting with gesture-based widgets. For the AvatarEye dataset, the goal is to recognize eye gestures (gaze aversion) from human participants interacting with a virtual agent. For the MelHead and WidgetsHead datasets, multi-class SVMs were used to train and test the contextual predictor and multi-modal integrator, while FHCRF models were used for the AvatarEye dataset. Our goal in using both SVM and FHCRF is to show how our context-based framework generalizes to different learning algorithms.

A Spectral Database for the Recognition of Urban Objects in Kaunas City: Performance and Morphometric Issues

Sébastien Gadal, Gintautas Mozgeris, Donatas Jonikavicius, Jurate Kamicaityte, Walid Ouerghemmi
• The diversity of urban materials of Kaunas city, contemporary and historical urban structures, urban planning…

Machine Learning for a Context Mining Facility

To obtain better performance, several studies use ensemble methods, such as [18], which combines the predictions of three models resulting from Support Vector Regression, Ensemble Tree, and Artificial Neural Network (ANN) algorithms for predicting the consumption of air conditioning in residential buildings. To ensure that their data meet the input requirements of the models, they perform a linear interpolation for missing and incorrect data. Then, to select the right feature set, they use a statistical measure. In [37], the authors propose an approach for detecting a user's current transportation mode from smartphone sensor data. They divide the collected data into consecutive non-overlapping time sequences, extract four features for each sequence and each sensor, and then combine multiple learners to improve performance. In [38], the authors present an ensemble method that combines the predictions of three models resulting from DT, MLP, and Logistic Regression (LR) for human activity recognition. To determine the class of a new activity, they consider the predictions (i.e. classes) of the three models and choose the class with the highest number of votes. Their results show that ensemble learning can achieve significant improvements for activity recognition compared to what each learning algorithm can achieve individually. The same problem is also investigated in [39]; in this case, however, the authors combine the results of other classifiers such as MLP, SVM, and LogitBoost. In addition, they use a clustering method to select the 18 relevant features out of 24, obtaining a good accuracy of 91.15%. In [22], the authors introduce a multi-class classification approach based on ultra-wide band sensor measurements and RF to detect when elderly people fall. The pre-processing phase includes filtering, feature extraction, stream windowing, change detection, and buffering. The classifier obtains its lowest error rate with the number of trees set to 200.
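The majority-vote fusion used in [38] can be sketched in a few lines; the tie-breaking rule (fall back to the first-listed model) and the example predictions are assumptions.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine the class predictions of several models; ties go to the first-listed model."""
    counts = Counter(predictions)
    best = max(counts.values())
    for p in predictions:  # preserve model order on ties
        if counts[p] == best:
            return p

# Hypothetical per-model predictions (DT, MLP, LR) for three activity samples.
samples = [["walk", "walk", "run"], ["sit", "run", "run"], ["walk", "sit", "run"]]
fused = [majority_vote(s) for s in samples]
```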

Transfer Learning for Handwriting Recognition on Historical Documents

Keywords: Handwriting Recognition, Historical Document, Transfer Learning, Deep Neural Network, Unlabeled Data

Abstract: In this work, we investigate handwriting recognition on new historical handwritten documents using transfer learning. Establishing a manual ground truth for a new collection of handwritten documents is time-consuming but needed to train and test recognition systems. We want to implement a recognition system without performing this annotation step. Our research deals with transfer learning from heterogeneous datasets that have a ground truth and share common properties with a new dataset that has none. The main difficulties of transfer learning lie in changes in the writing style, the vocabulary, and the named entities over centuries and datasets. In our experiments, we show how a CNN-BLSTM-CTC neural network behaves, for the task of transcribing handwritten titles of plays of the Italian Comedy, when trained on combinations of various datasets such as RIMES, Georges Washington, and Los Esposalles. We show that the choice of the training datasets and the merging methods are decisive for the results of the transfer learning task.

Technological Learning and Organizational Context: Fit and Performance in SMEs

In addition to a strong technological knowledge intensity, other attributes of the organizational context (lower part of figure 1) must be present. In order for organizational learning to occur, individuals within an organization must be given the opportunity to make the required changes to correct errors once they have been detected (what Argyris and Schon (1978) identify as double-loop learning). This necessitates an organizational culture which favors participation and openness, what Kanter (1983: 396) labels "organic" in opposition to "mechanistic" culture. Managers in such organizations favor participatory decision-making through formal and informal meetings and the active diffusion of information (Birley and Westhead, 1990). Workers' commitment to learning is encouraged and is reflected in various human resource practices such as performance appraisal (Hornsby and Kuratko, 1990) or through the existence of training practices (Snell and Dean, 1992). By developing an organizational climate conducive to change and creativity and committing to organizational learning, organizations promote employee motivation and skills, without which learning cannot occur.

Semantic Context Model for Efficient Speech Recognition

Proposed methodology

An effective way to take semantic information into account is to re-evaluate (rescore) the best hypotheses of the ASR (the N-best list). For each word of a hypothesis sentence, the recognition system provides an acoustic score p_acc(w) and a linguistic score p_ml(w). The best sentence is the one that maximizes the probability of the word sequence:
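The truncated formula can be read as a log-linear combination of the per-word scores, maximized over the N-best list. In the sketch below, the interpolation weights `lam` and `mu` and the toy scores are invented; a semantic score from the proposed context model would enter as a third term.

```python
import math

def sentence_score(words, lam=0.8, mu=0.2):
    """Log-linear combination of acoustic and language-model word scores.
    Each word is a (p_acc, p_ml) pair; lam and mu are tuning weights."""
    return sum(lam * math.log(p_acc) + mu * math.log(p_ml) for p_acc, p_ml in words)

def rescore(nbest, lam=0.8, mu=0.2):
    """Pick the hypothesis maximizing the combined score over the N-best list."""
    return max(nbest, key=lambda h: sentence_score(h["words"], lam, mu))

# Hypothetical N-best list with per-word (p_acc, p_ml) scores.
nbest = [
    {"text": "recognize speech", "words": [(0.9, 0.6), (0.8, 0.7)]},
    {"text": "wreck a nice beach",
     "words": [(0.7, 0.2), (0.9, 0.1), (0.6, 0.3), (0.8, 0.2)]},
]
best = rescore(nbest)
```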

The Self-Adaptive Context Learning Pattern: Overview and Proposal

…agent system that learns from demonstrations to control robotic devices, and ESCHER, a multi-agent system for multi-criteria optimization.

4.1 ALEX

Service robotics deals with the design of robotic devices whose objective is to provide adequate services to their users. User needs are multiple, dynamic, and sometimes contradictory. Providing a natural way to automatically adapt the behaviour of robotic devices to user needs is a challenging task. The complexity comes from the lack of a way to evaluate user satisfaction without evaluating a particular objective. A good way to handle this challenge is to use Learning from Demonstrations, a paradigm for dynamically learning new behaviours from demonstrations performed by a human tutor. With this approach, each action performed by a user on a device is seen as feedback. Through the natural process of demonstration, the user not only shows that the current device behaviour is not satisfactory, but also provides the adequate action to perform. Adaptive Learner by EXperiments (ALEX) [17] is a multi-agent system designed to face this challenge.

MAC-RANSAC: a robust algorithm for the recognition of multiple objects

lionel.moisan@parisdescartes.fr

Abstract: This paper addresses the problem of recognizing multiple rigid objects that are common to two images. We propose a generic algorithm that simultaneously decides whether one or several objects are common to the two images and estimates the corresponding geometric transformations. The considered transformations include similarities, homographies, and epipolar geometry. We first propose a generalization of the a contrario formulation of the RANSAC algorithm proposed in [6]. We then introduce an algorithm for the detection of multiple transformations between images and show its efficiency in various experiments.
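The consensus-maximizing core that RANSAC variants share can be sketched on a deliberately simple model (a 2D line rather than a homography); the a contrario validation of [6], which replaces the fixed tolerance with a statistical test, is not reproduced here.

```python
import random

def ransac_line(points, n_iter=200, tol=0.1, seed=0):
    """Minimal RANSAC: fit y = a*x + b from two random points, keep the model
    with the largest consensus set (inliers within `tol`)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:  # degenerate sample, cannot fit a slope
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) < tol]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = (a, b), inliers
    return best_model, best_inliers

# Points on y = 2x + 1 plus two gross outliers.
pts = [(x, 2 * x + 1) for x in range(10)] + [(3.0, 9.5), (5.0, 0.0)]
model, inliers = ransac_line(pts)
```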

Approximate reflectional symmetries of fuzzy objects with an application in model-based object recognition

In this section we explore symmetry as a second edge attribute used together with v(a, b), as described below.

5.2 Symmetry attributes

As mentioned in the introduction, when no strict or exact symmetry is verified, it is meaningful to consider symmetry as a matter of degree, expressed by a symmetry measure. In our case, the regions are crisp sets but, as we have to deal with approximate symmetries, it is still of interest to use a symmetry measure instead of a Boolean value. All the results obtained in the previous sections are valid here, considering crisp sets as a particular case of fuzzy sets. Symmetry measures can be used to define a vertex attribute or an edge attribute. The first case applies if some objects of the scene are known to be approximately symmetrical. Then it is possible to define a symmetry attribute as the orientation of the symmetry plane of the region and compare these orientations in the model and the image to be recognized. Another option for such a scene is to compare the degree of symmetry of regions.
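A symmetry measure of the kind described, graded even for crisp regions, can be sketched as the overlap between a pixel set and its reflection; the Jaccard overlap and the vertical reflection axis are illustrative choices.

```python
def reflect(region, axis_x):
    """Reflect a set of (x, y) pixels about the vertical line x = axis_x."""
    return {(2 * axis_x - x, y) for x, y in region}

def symmetry_measure(region, axis_x):
    """Degree of reflectional symmetry in [0, 1]: Jaccard overlap between the
    region and its mirror image, so crisp sets still get a graded score."""
    mirrored = reflect(region, axis_x)
    inter = len(region & mirrored)
    union = len(region | mirrored)
    return inter / union if union else 1.0

# A T-shape that is perfectly symmetric about x = 2, plus a noisy pixel.
shape = {(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (2, 1), (2, 2)}
noisy = shape | {(4, 2)}
s_exact = symmetry_measure(shape, 2)
s_approx = symmetry_measure(noisy, 2)
```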

A procedure for localisation and electrophysiological characterisation of ion channels heterologously expressed in a plant context

Bei and Luan opened the way towards a "green" heterologous expression system when they found that tobacco mesophyll protoplasts are devoid of K+ inward currents and demonstrated that the KAT1 Shaker channel could be heterologously expressed and subsequently characterised therein [17]. The stable transformation protocol by Agrobacterium infiltration of leaf disks and subsequent regeneration of a plant [18], however, is time-consuming. Another drawback of this method is that ubiquitous expression of some transgenes may, in some instances, prevent the regeneration of the transformed plant that is required to obtain mesophyll protoplasts, or induce inopportune transcriptome modifications. On the other hand, the transient expression of protein-GFP fusions in tobacco cells has been used for about eight years to study the targeting of proteins [19,20], suggesting a possible use of these cells for the electrophysiological characterisation of electrogenic transport systems [21]. Based on this, we developed a new procedure relying on transient transformation of tobacco mesophyll protoplasts. Vectors (available upon request) were engineered that allow selection of the transformed protoplasts (GFP reporter), expression of GFP-tagged or untagged proteins (for subcellular localisation and electrophysiological analyses, respectively), and co-expression of two different proteins in order to investigate their functional interactions. A PEG-mediated transformation protocol was adapted and the potential usefulness of the method was assessed by functional expression of the AKT1 channel, a result that had not been obtained in…

Games based on active NFC objects: model and security requirements

V. CONCLUSION

Thanks to a thorough risk analysis, this paper identifies the top four realistic and cost-efficient security requirements for securing games based on active NFC objects against cheaters: securing communications between an object and the server as well as between two objects, signing the data stored on objects, and performing regular mandatory online checks of the objects. The objective of these requirements is to help game developers protect their games from players who boost the characteristics of their objects or play with counterfeit objects.
