Deep Learning for Visual and Multimodal Recognition (Séminaire TIM 2017: Traitement de l’Information Multimodale)
Related documents
This thesis studies empirical properties of deep convolutional neural networks, and in particular the Scattering Transform. Indeed, the theoretical analysis of the latter is
While the output feature vector is robust to local changes, its robustness to global deformations needs improvement, since it preserves too much spatial information. In [5, 7,
Companion technology, as a field of cross-disciplinary research, has played different roles with respect to the areas of application [Biundo et al. For a diverse number of
The proposed method for action recognition can be seen as composed of two parts: one is pose-based recognition, which uses a sequence of body joint coordinates to predict the
The design capitalizes on the prominent location of the Kneeland Street edge as a potential "gateway" into Chinatown, and proposes to create on this highly visible
The multidisciplinary open archive HAL is intended for the deposit and dissemination of research-level scientific documents, whether published or not, originating from
The robot captures a single image at the initial pose, the network is trained again and then our CNN-based direct visual servoing is performed. While the robot is servoing the
• a novel training process is introduced, based on a single image (acquired at a reference pose), which includes the fast creation of a dataset using a simulator allowing for