Facial landmark detection

Top PDF Facial landmark detection:

A survey of deep facial landmark detection

CNN could act as a good feature extractor in the conventional cascaded regression framework.

2.2 Dense facial landmark detection

We now consider dense facial landmarks, i.e. landmarks that are not necessarily semantic but can also be part of a contour (e.g. the popular 68-point or the 194-point Helen models). Zhang et al. [71] proposed a coarse-to-fine encoder-decoder network to detect 68 facial points simultaneously. It is a 4-stage cascaded encoder-decoder network whose input resolution increases from stage to stage; the landmark positions are updated at the end of each stage by the CNN output. The authors subsequently improved this method by adding an occlusion-recovering auto-encoder that reconstructs the occluded facial parts, in order to avoid errors due to occlusions [70]. The occlusion-recovering auto-encoder is designed to reconstruct the genuine face appearance from the occluded one by training on a synthetic occluded dataset. Sun et al. [54] used an MLP as a graph transformer network to replace the regressors in a cascaded regression framework to detect the facial landmarks, and showed that this combination can be fully trained by backpropagation. Wu et al. [61] used a 3-way factorized Restricted Boltzmann Machine (RBM) [24] to build a deep face shape model that predicts the dense 68-point facial landmarks. One disadvantage of using a single CNN to predict dense landmarks directly is that the network is trained to achieve its best result on the global shape, which can leave some local imprecisions. A straightforward idea is to refine different facial parts locally and independently as post-processing. Duffner et al. [17] and later [19, 26] proposed to predict dense facial landmarks with a CNN that estimates rough positions, followed by several small regional CNNs that refine different parts locally. This kind of structure is more time-consuming but can significantly improve the precision.
Another work by Lv et al. [38] proposed two-stage re-initialization with a deep network regressor in each stage. The framework consists of a global stage, where a coarse facial landmark shape is predicted, and a local stage, where the landmarks of each facial part are estimated separately. One innovation is that the global/local transformation parameter is estimated by a CNN to re-initialize the facial region to a canonical shape prior to the landmark prediction. This largely improves the performance on large poses. Shao et al. [52] apply adaptive weights to different landmarks during different phases of the training. They give a relatively larger coefficient to some important points, such as eye corners and mouth corners, at the beginning of the training process, and then reduce their weights once the result has converged. This enables the neural network to first learn a robust global shape and then learn locally refined predictions.
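
The adaptive weighting attributed to Shao et al. [52] above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the squared-error loss, the convergence threshold `tol`, and the rule of resetting a weight to `base` are all assumptions made for illustration.

```python
def weighted_landmark_loss(pred, target, weights):
    """Weighted mean of squared Euclidean errors over 2D landmarks."""
    total = 0.0
    for (px, py), (tx, ty), w in zip(pred, target, weights):
        total += w * ((px - tx) ** 2 + (py - ty) ** 2)
    return total / sum(weights)


def relax_weights(weights, errors, important, tol, base=1.0):
    """Hypothetical schedule: once an important landmark's error drops
    below `tol`, its weight falls back to `base`; others are unchanged."""
    return [
        base if (i in important and errors[i] < tol) else w
        for i, w in enumerate(weights)
    ]


# Landmark 0 (say, an eye corner) starts with a larger coefficient.
weights = [3.0, 1.0]
loss = weighted_landmark_loss([(0, 0), (1, 1)], [(0, 0), (2, 1)], weights)
weights = relax_weights(weights, errors=[0.01, 0.5], important={0}, tol=0.05)
```

Early in training the larger coefficients pull the optimizer toward a stable global shape; once those points have converged, equal weights let the remaining landmarks dominate the gradient.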

2D Wasserstein Loss for Robust Facial Landmark Detection

The recent performance of facial landmark detection has been significantly improved by using deep Convolutional Neural Networks (CNNs), especially the Heatmap Regression Models (HRMs). Although their performance on common benchmark datasets has reached a high level, the robustness of these models remains a challenging problem in practical use under the noisy conditions of realistic environments. Contrary to most existing work focusing on the design of new models, we argue that improving the robustness requires rethinking many other aspects, including the use of datasets, the format of landmark annotation, the evaluation metric as well as the training and detection algorithm itself. In this paper, we propose a novel method for robust facial landmark detection, using a loss function based on the 2D Wasserstein distance combined with a new landmark coordinate sampling relying on the barycenter of the individual probability distributions. Our method can be used plug-and-play with most state-of-the-art HRMs, with neither additional complexity nor structural modifications of the models. Further, with the large performance increase, we found that current evaluation metrics can no longer fully reflect the robustness of these
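
The barycenter-based coordinate sampling mentioned in this abstract can be sketched as the probability-weighted mean position of a heatmap. This is a minimal illustration, not the paper's code: the function name is hypothetical, and clipping negative values and normalizing by the sum are assumptions about how the heatmap is turned into a probability distribution.

```python
def heatmap_barycenter(heatmap):
    """Return (x, y), the mass-weighted mean pixel position of a 2D heatmap
    given as a list of rows."""
    # Assumption: negative responses carry no probability mass.
    mass = [[max(v, 0.0) for v in row] for row in heatmap]
    total = sum(sum(row) for row in mass)
    if total == 0.0:
        raise ValueError("heatmap has no positive mass")
    x = sum(c * v for row in mass for c, v in enumerate(row)) / total
    y = sum(r * v for r, row in enumerate(mass) for v in row) / total
    return x, y


# Mass split 1:3 between columns 1 and 2 gives a sub-pixel x coordinate.
x, y = heatmap_barycenter([[0.0, 0.25, 0.75]])
```

Unlike taking the argmax pixel, the barycenter yields sub-pixel coordinates and uses the whole distribution rather than a single peak value.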

Fine-grained facial landmark detection exploiting intermediate feature representations

presented a method with quantized densely connected U-Nets, which greatly improved the efficiency of HRMs.

Deep Robust Regression: Robust training is critical to enhance the landmark accuracy, especially for small errors. Previous work on robust loss functions for deep model regression is mainly inspired by the use of the M-estimator in robust statistics. The primary goal is to attenuate the impact of outliers on the overall loss. Belagiannis et al. (2015) proposed to use Tukey’s biweight loss function for human pose estimation. Their loss function saturates with large residuals. They showed that their loss function helps the deep regression model to converge both faster and better, compared to the traditional L2 loss function. Feng et al. (2018) proposed a novel wing loss for deep robust facial landmark detection, which behaves like the logarithmic function for small errors and like the L1 loss function for large errors. They emphasized the importance of small residuals during the calculation of the loss. Recently, Lathuilière et al. (2018) combined robust mixture modeling with deep CNN regression models, which adapts to an evolving
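
The two robust losses described above can be written down as a short scalar sketch. The parameter defaults (`w`, `eps`, `c`) are illustrative assumptions rather than values taken from this excerpt, and the code operates on a single residual instead of a tensor.

```python
import math


def wing_loss(x, w=10.0, eps=2.0):
    """Logarithmic for |x| < w, L1 (shifted for continuity) otherwise."""
    ax = abs(x)
    if ax < w:
        return w * math.log(1.0 + ax / eps)
    # Offset that makes the two pieces meet at |x| = w.
    c = w - w * math.log(1.0 + w / eps)
    return ax - c


def tukey_biweight(r, c=4.685):
    """Tukey's biweight: grows for small residuals, saturates at c^2 / 6."""
    if abs(r) <= c:
        return (c * c / 6.0) * (1.0 - (1.0 - (r / c) ** 2) ** 3)
    return c * c / 6.0
```

The wing loss keeps a steep gradient near zero, which is what gives small residuals their influence, while the biweight caps the contribution of outliers entirely.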

Etude du profil facial au Maroc : Etude photographique

(inferior view). This median part (Na-So), fronto-ethmoido-sphenoidal, serves an essentially ventilatory function; laterally, the base extends from the supraorbital rims to the Glaserian fissures, which separate the glenoid fossae from the anterior border of the tympanic bones. These lateral parts, fronto-spheno-temporal, serve visual and masticatory functions. The foramina of the skull base, the foramen ovale and the foramen spinosum, belong to this lateral part of the skull base. The pterygoid processes, which belong to the facial skeleton, are not described with the skull base. Seen in profile, the three parts of the skull base, one median and two lateral, overlap as far as the synostotic crest. Only the lateral parts extend posteriorly to the glenoid fossae (corresponding to the glenion point, or Gl) (Fig. 16).

Automatic landmark location with a combined active shape model

cult challenge for realistic face recognition applications, where the face is recorded under variable illumination conditions including indoor and outdoor recordings and also with some pose and scale variability. Moreover, the image distortion and complex background also bring some difficulty both for landmark location and face recognition. The proposed landmark detection method, called Combined Active Shape Models, is robust to illumination, translation, and rotation. It exploits the Scale Invariant Feature Transform (SIFT) [1] and the Active Shape Model (ASM) [2]. In order to have a better representation of face images, the landmarks on the face region and the face contour are modeled and processed separately. The performance of the proposed Combined-ASM algorithm is tested on the BioID and FRGCv2.0 face image databases.

Searching for Prototypical Facial Feedback Signals

Abstract. Embodied conversational agents should be able to provide feedback on what a human interlocutor is saying. We are compiling a list of facial feedback expressions that signal attention and interest, grounding and attitude. As expressions need to serve many functions at the same time and most of the component signals are ambiguous, it is important to get a better idea of the many-to-many mappings between displays and functions. We asked people to label several dynamic expressions as a probe into this semantic space. We compare simple signals and combined signals in order to find out whether a combination of signals can have a meaning on its own or not, i.e. the meaning of single signals is different from the meaning attached to the combination of these signals. Results show that in some cases a combination of signals alters the perceived meaning of the backchannel.

GEOMETRIC BASED 3D FACIAL GENDER CLASSIFICATION

Yuan et al. [3] proposed a novel fusion-based gender classification method that is able to compensate for facial expression. They performed an experimental investigation to evaluate the significance of different facial regions in the task of gender classification. Jing et al. [4] investigated gender classification based on 2.5D facial surface normals (facial needle-maps), which can be recovered from 2D intensity images using a non-Lambertian shape-from-shading (SFS) method. The authors described a weighted principal geodesic analysis (WPGA) method to extract features from facial surface normals to increase the gender-discriminating power in the leading eigenvectors. They adopted an a posteriori probability based method for gender classification. Xiaoguang et al. [5] exploited the range information of human faces for ethnicity identification using a support vector machine. An integration scheme is also proposed for ethnicity and gender identification by combining the registered range and intensity images. The 3D images are shown to provide discriminative power for ethnicity and gender identification that is competitive with the intensity modality. To the best of our knowledge, no approach has been proposed that exploits 3D facial curves for the shape analysis of 3D faces. But does the use of all the curves on the face lead to better gender classification performance? And among the facial curves, are some more relevant than others for discriminating gender from the facial surface?

Modeling emotionnal facial expressions and their dynamics for realistic interactive facial animation on virtual characters

Modeling Emotional Facial Expressions and their Dynamics for Realistic Interactive Facial Animation on Virtual Characters. Thesis defended in Rennes on 10 December 2010 before the jury com[r]

Hide & Share: Landmark-based Similarity for Private KNN Computation

We evaluate H&S in the context of a user-based collaborative-filtering recommender with publicly available traces from existing recommendation systems. We show that although landmark-based similarity does disturb similarity values (to ensure privacy), the quality of the recommendations is not as significantly hampered. We also show that the mere fact of disturbing similarity values turns out to be an asset because it prevents a malicious user from performing a profile reconstruction attack against other users, thus reinforcing users’ privacy. Finally, we provide a formal privacy guarantee by computing an upper bound on the amount of information revealed by H&S about a user’s profile.

Feature extraction on faces : from landmark localization to depth estimation

Landmark localization – finding the precise location of specific parts in an image – is a central step in many complex vision problems. Examples include hand tracking (Datcu and Lukosch, 2013; Hu et al., 2010), gesture recognition (Dardas et al., 2010), facial expression recognition (Kahou et al., 2013), face identity verification (Sun et al., 2014; Taigman et al., 2014), and eye gaze tracking (Mora and Odobez, 2012; Zhang et al., 2015). Reliable landmark estimation is often part of the pipeline for sophisticated, robust vision tools. Neural networks have yielded state-of-the-art results on numerous landmark estimation problems (Honari et al., 2016; Tompson et al., 2015; Wang et al., 2016; Xiao et al., 2016; Yu et al., 2016). However, neural networks generally need to be trained on a large set of labeled data to be robust to the variations in natural images. Landmark labeling is tedious manual work where precision is important; as a result, few landmark datasets are large enough to train reliable deep neural networks. On the other hand, it is much easier to label an image with a single class label than with the entire set of precise landmarks, and datasets with labels related to, but distinct from, landmark detection are far more abundant.

Facial colorings using Hall's Theorem

If $\alpha_1 + \alpha_2 + \dots + \alpha_k = 0$ then $d \le k \le 4$, a contradiction. Assume that $\alpha_1 + \alpha_2 + \dots + \alpha_k > 0$. By symmetry, we can assume $\alpha_1 > 0$ and there is a vertex $v$ strongly shared by $f$ and $f_{i_1}$. Contract an edge incident with $v$ and the face $f$ in $G$. Since $G$ is $\ell$-minimal, the obtained graph has an $\ell$-facial coloring with $\lfloor 7\ell/2 \rfloor + 6$ colors. The vertices of $G$ distinct from $v$ keep their colors and we aim to extend the coloring to the vertex $v$. The vertex $v$ cannot be assigned at most $2\ell$ colors of $\ell$-facially adjacent vertices on $f_{i_1}$, at most

Facial Feedback Signals for ECAs

Elisabetta Bevacqua, Dirk Heylen, Catherine Pelachaud, and Marion Tellier

Abstract. One of the most desirable characteristics of an intelligent interactive system is its capability of interacting with users in a natural way. An example of such a system is the embodied conversational agent (ECA) that has a humanoid aspect and the capability of communicating with users through multiple modalities such as voice, gesture, and facial expressions, which are typical of human-human communication. It is important to make an ECA able to fit well in each role in a conversation: the agent should behave in a realistic and human-like way both while speaking and listening. So far most of the work on ECAs has focused on the importance of the ECA’s behaviour in the role of the speaker, implementing models for the generation of verbal and non-verbal signals; but currently we are mainly interested in modelling the listening behaviour. In this paper we will describe our work in progress on this matter.

Refined carbohydrate consumption and facial attractiveness

were summed leading to an estimate of total energy intake for breakfast (EI1), afternoon snack (EI2) and between-meal intake (EI3) and corresponding macronutrient compositions.

Apparent age and attractiveness estimation

Volunteer adult raters were recruited in public places in Montpellier, France. For each rater, the sex, age and geographic origin (continent of birth for the rater, parents and grandparents) were recorded. A first set of raters estimated the age of the subjects from their facial photographs. A Delphi-based computer program was generated to present randomly drawn photographs to raters of the opposite sex. Each rater assessed 20 distinct photographs. If the rater knew one of the subjects, the trial was removed. Three photographs randomly chosen among those previously viewed were presented again at the end to estimate judgement reliability. A second set of raters was sampled to make decisions concerning the relative attractiveness of the facial photographs. A Delphi-based computer program was generated to present randomly drawn pairs of photographs to raters of the opposite sex (Figure 1). For each pair, the raters were instructed to click on the photograph depicting the face that they found the most attractive. The position of the photograph on the screen (left or right) was randomly ascribed. Each rater assessed 20 distinct pairs of photographs, corresponding to 40 different randomly chosen subjects. If the rater knew one of the subjects presented for judgement, the trial was removed. Additionally, the first pair of photographs viewed by each participant was not used in the analyses because the task could require a certain amount of habituation. Three pairs randomly chosen from among those previously viewed were presented again at the end to estimate judgement reliability.

3-facial colouring of plane graphs

1 Introduction

The concept of facial colorings, introduced by Král', Madaras, and Škrekovski [11, 12], extends the well-known concept of cyclic colorings. A facial segment of a plane graph $G$ is a sequence of vertices in the order obtained when traversing a part of the boundary of a face. The length of a facial segment is the number of its edges. Two vertices $u$ and $v$ of $G$ are $\ell$-facially adjacent if there exists a facial segment of length at most $\ell$ between them. An $\ell$-facial coloring of $G$ is a function which assigns a color to each vertex of $G$ such that any two distinct $\ell$-facially adjacent vertices are assigned distinct colors. Notice that a vertex of $G$ that is $\ell$-facially adjacent to itself does not prevent $G$ from being colored. A graph admitting an $\ell$-facial coloring with $k$ colors is called $\ell$-facially $k$-colorable.
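
The definition above translates directly into a small checker. The sketch below assumes each face is given as the cyclic list of vertices met when walking its boundary; that input format and the function names are assumptions made for illustration and do not come from the paper.

```python
def facially_adjacent_pairs(faces, ell):
    """Unordered pairs of distinct vertices joined by a facial segment
    with at most `ell` edges. Each face is a cyclic boundary walk."""
    pairs = set()
    for boundary in faces:
        n = len(boundary)
        for i in range(n):
            # A segment of `step` edges starting at position i.
            for step in range(1, min(ell, n - 1) + 1):
                u, v = boundary[i], boundary[(i + step) % n]
                if u != v:  # a vertex adjacent to itself is ignored
                    pairs.add(frozenset((u, v)))
    return pairs


def is_facial_coloring(faces, ell, color):
    """True if every pair of ell-facially adjacent vertices differs in color."""
    for pair in facially_adjacent_pairs(faces, ell):
        u, v = pair
        if color[u] == color[v]:
            return False
    return True


# A 4-cycle drawn in the plane has an inner and an outer face.
square = [[0, 1, 2, 3], [0, 3, 2, 1]]
two_colors = {0: 0, 1: 1, 2: 0, 3: 1}
ok_for_1 = is_facial_coloring(square, 1, two_colors)
ok_for_2 = is_facial_coloring(square, 2, two_colors)
```

For the 4-cycle, two colors suffice when $\ell = 1$, but opposite vertices become 2-facially adjacent, so the same coloring fails for $\ell = 2$.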

Affectiva-MIT Facial Expression Dataset (AM-FED): Naturalistic and Spontaneous Facial Expressions Collected In-the-Wild

O’Toole et al. [20] present a database including videos of facial expressions shot under controlled conditions.

3. Data Collection

Figure 1 shows the web-based framework that was used to crowdsource the facial videos and the user experience. Visitors to the website opt in to watch short videos while their facial expressions are being recorded and analyzed. Immediately following each video, visitors get to see where they smiled and with what intensity. They can compare their “smile track” to the aggregate smile track. On the client side, all that is needed is a browser with Flash support and a webcam. The video from the webcam is streamed in real time at 14 frames a second at a resolution of 320x240 to a server where automated facial expression analysis is performed, and the results are rendered back to the browser for display. There is no need to download or install anything on the client side, making it very simple for people to participate. Furthermore, it is straightforward to set up and customize “experiments” to enable new research questions to be posed. For this experiment, we chose three successful Super Bowl commercials: 1. Doritos (“House sitting”, 30 s), 2. Google (“Parisian Love”, 53 s) and 3. Volkswagen (“The Force”, 62 s). Viewers chose to view one or more of the videos.

Crowdsourced data collection of facial responses

1) grant access to their webcam for video recording and 2) to allow Affectiva and MIT to use the facial video for internal research. Further consent for the data to be shared with the research community at large is also sought, and only videos with consent to be shared publicly are shown in this paper. This data collection protocol was approved by the Massachusetts Institute of Technology Committee On the Use of Humans as Experimental Subjects (COUHES) prior to launching the site. A screenshot of the consent form is shown in Figure 3. If consent is granted, the commercial is played in the browser whilst simultaneously streaming the facial video to a server. In accordance with MIT COUHES, viewers could opt out if they chose to at any point while watching the videos, in which case their facial video is immediately deleted from the server. If a viewer watches a video to the end, then his/her facial video data is stored along with the time at which the session was started, their IP address, the ID of the video they watched and responses (if any) to the self-report questions. No other data is stored. Following each commercial, the webcam is automatically stopped and a message clearly states that the “webcam has now been turned off”. Viewers could then optionally answer three multiple choice questions: “Did you like the video?”, “Have you seen it before?” and “Would you watch this video again?”. A screenshot of the questions is shown in Figure 4. Finally, viewers were provided with a graphical representation of their smile intensity during the clip compared to other viewers who watched the same video; viewers were also given the option to tweet their result page or email it to a friend. All in all, it took under 5 seconds to turn around the facial analysis results once the video was completed so viewers perceived the results as instantaneous. Viewers were free to watch one, two or three videos and could watch a video as many times as they liked.
In this paper we focus on the general characteristics of the collected videos (e.g., pose and lighting) and leave the analysis of the facial and self-report responses to future work as there is not space to discuss them fully here.

The facial weak order on hyperplane arrangements

on the maximal cones of $\mathcal{F}_{\equiv^{\mathrm{PR}}}$, the quotient $\mathrm{FW}(\mathcal{A}, B)/{\equiv^{\mathrm{FW}}}$ defines a lattice structure on all cones of $\mathcal{F}_{\equiv^{\mathrm{PR}}}$. In particular, if the fan $\mathcal{F}_{\equiv^{\mathrm{PR}}}$ is polytopal (this remains an open question in general, see [17]), then $\mathrm{FW}(\mathcal{A}, B)/{\equiv^{\mathrm{FW}}}$ is an order on all faces of the corresponding polytope. For instance, quotients of the facial weak order for finite Coxeter arrangements provide lattice structures on all faces of the generalized associahedra of [11], which are polytopal realizations of the Cambrian fans [19].

Landmark-Based Registration of Curves via the Continuous Wavelet Transform

of deformations and warping can be found in a tutorial by Younes (2000) and extensive references on curve alignment for functional data analysis can be found in Ramsay and Silverman (2005). In what follows, the terms alignment, warping, registration, or matching will also be used to refer to the synchronization of a set of signals. Curve alignment is thus a preliminary task that is often necessary before the statistical analysis of a dataset. Matching two functions can be done by aligning individual locations of corresponding structural points (or landmarks) from one curve to another. Previous approaches to landmark-based registration in a statistical setting include Kneip and Gasser (1992), Gasser and Kneip (1995), Ramsay and Li (1998), Munoz Maldonado, Staniswallis, Irwin, and Byers (2002), and Bigot (2003). For landmark-based matching one needs to detect the landmarks of a set of signals from discrete (noisy) observations. The estimation of the landmarks is usually complicated by the presence of noise, whose fluctuations might give rise to spurious estimates which do not correspond to structural points of the unknown signals. Then, it is necessary to determine the landmarks that should be associated. This step is further complicated by the presence of outliers and by the fact that some landmarks of a given curve might have no counterpart in the other curves. Generally, these steps are performed manually (see, e.g., Munoz Maldonado et al. 2002), which can be tedious if the number of signals is large. This article uses the scale-space approach proposed by Bigot (2003, 2005) to estimate the landmarks of a noisy function. This method is based on the estimation of the significant zero-crossings of the continuous wavelet transform of a noisy signal, and on a new tool, the structural intensity, proposed by Bigot (2003, 2005) to represent the landmarks of a signal via a probability density function.
The main modes of the structural intensity correspond to the significant landmarks of the unknown signal. In a sense, the structural intensity can be viewed as a smoothing method that highlights the significant features of a signal observed with noise.
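
The zero-crossing idea can be illustrated with a deliberately simplified sketch. It computes the response of a derivative-of-Gaussian wavelet at a single scale and locates its sign changes, which mark local extrema of the smoothed signal. It does not implement the significance test or the structural intensity of Bigot (2003, 2005); the function names, the choice of wavelet, and the single fixed scale are assumptions made for illustration.

```python
import math


def dog_wavelet_response(signal, scale):
    """Correlate `signal` with a derivative-of-Gaussian wavelet.
    `scale` is the wavelet width measured in samples."""
    n = len(signal)
    response = []
    for i in range(n):
        acc = 0.0
        for j in range(n):
            u = (j - i) / scale
            acc += signal[j] * (-u) * math.exp(-0.5 * u * u)
        response.append(acc)
    return response


def zero_crossings(response, times):
    """Linearly interpolated times at which the response changes sign."""
    found = []
    for i in range(len(response) - 1):
        a, b = response[i], response[i + 1]
        if (a < 0) != (b < 0):
            found.append(times[i] + (times[i + 1] - times[i]) * a / (a - b))
    return found


# A single bump centred at t = 0.3325 should yield one landmark near that time.
times = [k * 0.005 for k in range(201)]
bump = [math.exp(-((t - 0.3325) ** 2) / (2 * 0.05 ** 2)) for t in times]
landmarks = zero_crossings(dog_wavelet_response(bump, 4.0), times)
```

With noisy data, a single small scale would also produce spurious crossings, which is the problem the significance test and the multi-scale structural intensity are meant to address.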

Abord crânio-facial des orbitopathies dysthyroidiennes

4. Mulhern M.G., Aduriz-Lorenzo P.M., Rawluk D. et al. Ocular complications of acoustic neuroma surgery. Br J Ophthalmol 83:1389, 1999.
5. Harrison D.H. Treatment of infants with facial palsy. Arch Dis Child 71:277, 1994.
6. Adams G.G., Kirkness C.M., Lee J.P. Botulinum toxin A induced protective ptosis. Eye 1:603, 1987.
7. Clark R.P., Berris C.E. Botulinum toxin: A treatment for facial asymmetry caused by facial nerve paralysis. Plast Reconstr Surg 84:353, 1989.
8. Bikhazi N.B., Maas C.S. Refinement in the rehabilitation of the paralysed face using botulinum toxin. Otolaryngol Head Neck Surg 117:303, 1997.
9. Krastinova D., Franchi G., Kelly M.B. et al. Rehabilitation of the paralysed or lax lower lid using a graft of conchal cartilage. Br J Plast Surg 55:12, 2002.
10. Krastinova-Lolov D. Mask-lift and aesthetic sculpturing. Plast Reconstr Surg 95:21, 1995.

The facial weak order on hyperplane arrangements

Aram Dermenjian, Christophe Hohlweg, Thomas McConville, and Vincent Pilaud

Abstract. We extend the facial weak order from finite Coxeter groups to central hyperplane arrangements. The facial weak order extends the poset of regions of a hyperplane arrangement to all its faces. We provide four non-trivially equivalent definitions of the facial weak order of a central arrangement: (1) by exploiting the fact that the faces are intervals in the poset of regions, (2) by describing its cover relations, (3) using covectors of the corresponding oriented matroid, and (4) using certain sets of normal vectors closely related to the geometry of the corresponding zonotope. Using these equivalent descriptions, we show that when the poset of regions is a lattice, the facial weak order is a lattice. In the case of simplicial arrangements, we further show that this lattice is semidistributive and give a description of its join-irreducible elements. Finally, we determine the homotopy type of all intervals in the facial weak order.
