CNN could act as a good feature extractor in the conventional cascaded regression framework.
2.2 Dense facial landmark detection
We now consider dense facial landmarks, i.e. landmarks that are not necessarily semantic but can also be part of a contour (e.g. the popular 68-point or 194-point Helen models). Zhang et al. [71] proposed a coarse-to-fine encoder-decoder network to simultaneously detect 68 facial points: a 4-stage cascaded encoder-decoder network with increasing input resolution in each stage. The landmark positions are updated at the end of each stage by the CNN output. The authors subsequently improved this method by adding an occlusion-recovering auto-encoder that reconstructs the occluded facial parts in order to avoid errors due to occlusions [70]. The occlusion-recovering auto-encoder is trained on a synthetically occluded dataset to reconstruct the genuine face appearance from the occluded one. Sun et al. [54] used an MLP as a graph transformer network to replace the regressors in a cascaded regression framework for facial landmark detection, and proved that this combination can be fully trained by backpropagation. Wu et al. [61] used a 3-way factorized Restricted Boltzmann Machine (RBM) [24] to build a deep face shape model predicting the dense 68-point facial landmarks. One disadvantage of using a single CNN to directly produce a dense prediction is that the network is trained to achieve its best result on the global shape, which can leave local imprecisions. A straightforward remedy is to refine the different facial parts locally and independently as a post-processing step. Duffner et al. [17], and later [19, 26], proposed to predict dense facial landmarks with a CNN that estimates rough positions, followed by several small regional CNNs that refine each part locally. This kind of structure is more time-consuming but significantly improves precision. Another work, by Lv et al. [38], proposed a two-stage re-initialization with a deep network regressor in each stage. The framework consists of a global stage, where a coarse facial landmark shape is predicted, and a local stage, where the landmarks of each facial part are estimated separately. One of its innovations is that the global/local transformation parameters are estimated by a CNN to re-initialize the facial region to a canonical shape prior to the landmark prediction, which largely improves performance on large poses. Shao et al. [52] apply adaptive weights to different landmarks during different phases of training: they give a relatively larger coefficient to important points such as eye and mouth corners at the beginning of training, then reduce these weights once the result has converged. This enables the network to first learn a robust global shape, and to refine its predictions locally afterwards.
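The adaptive weighting of Shao et al. [52] can be sketched as a per-landmark weight schedule. The linear annealing and the function names below are our illustrative assumptions, not the authors' exact scheme:

```python
import numpy as np

def weighted_landmark_loss(pred, target, weights):
    """Mean of per-landmark squared errors, scaled by per-landmark weights.

    pred, target: (N, 2) arrays of landmark coordinates; weights: (N,) array."""
    per_point = np.sum((pred - target) ** 2, axis=1)  # squared error per landmark
    return float(np.mean(weights * per_point))

def adaptive_weights(n_landmarks, key_points, epoch, boost=2.0, decay_epochs=30):
    """Start with a larger weight on key points (e.g. eye/mouth corners),
    linearly annealed back to uniform as training converges."""
    w = np.ones(n_landmarks)
    frac = max(0.0, 1.0 - epoch / decay_epochs)  # 1 at the start, 0 after decay
    w[list(key_points)] = 1.0 + (boost - 1.0) * frac
    return w / w.mean()  # keep the average weight at 1
```

Early in training the key points dominate the loss; once `epoch` reaches `decay_epochs` the weights are uniform, matching the global-then-local behaviour described above.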

The recent performance of facial landmark detection has been significantly improved by using deep Convolutional Neural Networks (CNNs), especially Heatmap Regression Models (HRMs). Although their performance on common benchmark datasets has reached a high level, the robustness of these models remains a challenging problem in practical use under the noisy conditions of realistic environments. Contrary to most existing work, which focuses on the design of new models, we argue that improving robustness requires rethinking many other aspects, including the use of datasets, the format of landmark annotation, the evaluation metric, and the training and detection algorithm itself. In this paper, we propose a novel method for robust facial landmark detection, using a loss function based on the 2D Wasserstein distance combined with a new landmark coordinate sampling relying on the barycenter of the individual probability distributions. Our method is plug-and-play on most state-of-the-art HRMs, with neither additional complexity nor structural modifications of the models. Further, with the large performance increase, we found that current evaluation metrics can no longer fully reflect the robustness of these
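The barycenter-based coordinate sampling mentioned in the abstract amounts to taking the expected coordinate under the heatmap viewed as a probability distribution (a soft-argmax). A minimal sketch, assuming a non-negative single-landmark heatmap; the function name is illustrative and the 2D Wasserstein loss itself is not reproduced:

```python
import numpy as np

def barycenter_coordinates(heatmap, eps=1e-8):
    """Landmark coordinate as the barycenter (expected value) of the
    heatmap, treated as a 2D probability distribution over pixels."""
    h = np.clip(heatmap, 0.0, None)        # heatmaps are assumed non-negative
    h = h / (h.sum() + eps)                # normalize to a distribution
    ys, xs = np.mgrid[0:h.shape[0], 0:h.shape[1]]
    return float((h * xs).sum()), float((h * ys).sum())
```

Unlike a hard argmax, the barycenter is differentiable in the heatmap values and yields sub-pixel coordinates.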

presented a method with quantized densely connected U-Nets, which greatly improved the efficiency of HRMs.
Deep Robust Regression: Robust training is critical to enhance landmark accuracy, especially for small errors. Previous work on robust loss functions for deep model regression is mainly inspired by the use of M-estimators in robust statistics. The primary goal is to attenuate the impact of outliers on the overall loss. Belagiannis et al. (2015) proposed to use Tukey’s biweight loss function for human pose estimation. Their loss function saturates for large residuals. They showed that it helps the deep regression model to converge both faster and better than the traditional L2 loss function. Feng et al. (2018) proposed a novel wing loss for deep robust facial landmark detection, which behaves like a logarithmic function for small errors and like the L1 loss function for large errors, emphasizing the importance of small residuals in the calculation of the loss. Recently, Lathuilière et al. (2018) combined robust mixture modeling with deep CNN regression models, which adapts to an evolving
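As a concrete instance of such a robust loss, the wing loss of Feng et al. (2018) fits in a few lines. The sketch below uses the commonly reported default parameters (w = 10, ε = 2) and is an illustration, not the authors' reference implementation:

```python
import numpy as np

def wing_loss(residual, w=10.0, eps=2.0):
    """Wing loss (Feng et al., 2018): logarithmic for small residuals,
    shifted L1 for large ones. `residual` is pred - target (any shape)."""
    x = np.abs(residual)
    c = w - w * np.log(1.0 + w / eps)  # makes the two pieces meet at |x| = w
    return np.where(x < w, w * np.log(1.0 + x / eps), x - c)
```

The logarithmic branch has a large gradient near zero, which is exactly the emphasis on small residuals described above, while the linear branch keeps large residuals (outliers) from dominating.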

(inferior view).
This median part (Na-So), fronto-ethmoido-sphenoidal, serves an essentially ventilatory function;
laterally, the skull base extends from the supraorbital rims to the Glaserian fissures, which separate the glenoid fossae from the anterior edge of the tympanic bones. These lateral, fronto-sphenoido-temporal parts serve visual and masticatory functions. The basicranial foramina, the foramen ovale and the foramen spinosum, belong to this lateral part of the skull base. The pterygoid processes, which belong to the facial mass, are not described with the skull base. Seen in profile, the three parts of the skull base, one median and two lateral, are superimposed up to the synostotic crest. Only the lateral parts extend backwards to the glenoid fossae (corresponding to the glenion point, Gl) (fig. 16).


cult challenge for realistic face recognition applications, where the face is recorded under variable illumination conditions, including indoor and outdoor recordings, and with some pose and scale variability. Moreover, image distortion and complex backgrounds also bring difficulty both for landmark localization and face recognition. The proposed landmark detection method, called Combined Active Shape Models, is robust to illumination, translation, and rotation. It exploits the Scale Invariant Feature Transform (SIFT) [1] and the Active Shape Model (ASM) [2]. In order to better represent face images, the landmarks on the face region and the face contour are modeled and processed separately. The performance of the proposed Combined-ASM algorithm is tested on the BioID and FRGCv2.0 face image databases.

Abstract. Embodied conversational agents should be able to provide feedback on what a human interlocutor is saying. We are compiling a list of facial feedback expressions that signal attention and interest, grounding, and attitude. As expressions need to serve many functions at the same time and most of the component signals are ambiguous, it is important to get a better idea of the many-to-many mappings between displays and functions. We asked people to label several dynamic expressions as a probe into this semantic space. We compare simple signals and combined signals in order to find out whether a combination of signals can have a meaning of its own, i.e. whether the meaning of single signals differs from the meaning attached to their combination. Results show that in some cases a combination of signals alters the perceived meaning of the backchannel.

Yuan et al. [3] proposed a novel fusion-based gender classification method that is able to compensate for facial expression. They performed an experimental investigation to evaluate the significance of different facial regions in the task of gender classification. Jing et al. [4] investigated gender classification based on 2.5D facial surface normals (facial needle-maps), which can be recovered from 2D intensity images using a non-Lambertian shape-from-shading (SFS) method. The authors described a weighted principal geodesic analysis (WPGA) method to extract features from facial surface normals to increase the gender-discriminating power of the leading eigenvectors. They adopted an a posteriori probability-based method for gender classification. Xiaoguang et al. [5] exploited the range information of human faces for ethnicity identification using a support vector machine. An integration scheme was also proposed for ethnicity and gender identification by combining registered range and intensity images, demonstrating that 3D images provide discriminative power competitive with the intensity modality. To the best of our knowledge, no approaches have been proposed to exploit the 3D facial curves returned by shape analysis of the 3D face. But does the use of all the curves on the face lead to better gender classification performance? And, among the facial curves, are some more relevant than others for discriminating gender from the facial surface?

Modeling Emotional Facial Expressions and their Dynamics for Realistic Interactive Facial Animation on Virtual Characters. Thesis defended in Rennes on 10 December 2010 before the jury com[r]


We evaluate H&S in the context of a user-based collaborative-filtering recommender with publicly available traces from existing recommendation systems. We show that although landmark-based similarity does disturb similarity values (to ensure privacy), the quality of the recommendations is not significantly hampered. We also show that the mere fact of disturbing similarity values turns out to be an asset, because it prevents a malicious user from performing a profile reconstruction attack against other users, thus reinforcing users’ privacy. Finally, we provide a formal privacy guarantee by computing an upper bound on the amount of information revealed by H&S about a user’s profile.


If α_1 + α_2 + … + α_k = 0, then d ≤ k ≤ 4, a contradiction. Assume that α_1 + α_2 + … + α_k > 0. By symmetry, we can assume α_1 > 0 and that there is a vertex v strongly shared by f and f_{i_1}. Contract an edge incident with v and the face f in G. Since G is ℓ-minimal, the obtained graph has an ℓ-facial coloring with ⌊7ℓ/2⌋ + 6 colors. The vertices of G distinct from v keep their colors, and we aim to extend the coloring to the vertex v. The vertex v cannot be assigned the colors of the at most 2ℓ ℓ-facially adjacent vertices on f_{i_1}, at most

were summed, leading to an estimate of total energy intake for breakfast (EI1), afternoon snack (EI2) and between-meal intake (EI3), and the corresponding macronutrient compositions.
Apparent age and attractiveness estimation
Volunteer adult raters were recruited in public places in Montpellier, France. For each rater, the sex, age and geographic origin (continent of birth of the rater, parents and grandparents) were recorded. A first set of raters estimated the age of the subjects from their facial photographs. A Delphi-based computer program was generated to present randomly drawn photographs to raters of the opposite sex. Each rater assessed 20 distinct photographs. If the rater knew one of the subjects, the trial was removed. Three photographs randomly chosen among those previously viewed were presented again at the end to estimate judgement reliability. A second set of raters was sampled to make decisions concerning the relative attractiveness of the facial photographs. A Delphi-based computer program was generated to present randomly drawn pairs of photographs to raters of the opposite sex (Figure 1). For each pair, the raters were instructed to click on the photograph depicting the face that they found the most attractive. The position of the photograph on the screen (left or right) was randomly ascribed. Each rater assessed 20 distinct pairs of photographs, corresponding to 40 different randomly chosen subjects. If the rater knew one of the subjects presented for judgement, the trial was removed. Additionally, the first pair of photographs viewed by each participant was not used in the analyses because the task could require a certain amount of habituation. Three pairs randomly chosen from among those previously viewed were presented again at the end to estimate judgement reliability.
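The pairwise choices collected this way can be turned into a simple attractiveness score per subject. A minimal sketch, using a plain win rate rather than a fitted model such as Bradley-Terry; the function name is illustrative:

```python
import numpy as np

def pairwise_scores(choices, n_subjects):
    """Win rate per subject from (winner, loser) pairs of a two-alternative
    forced-choice task: wins / number of times shown (NaN if never shown)."""
    wins = np.zeros(n_subjects)
    shown = np.zeros(n_subjects)
    for winner, loser in choices:
        wins[winner] += 1
        shown[winner] += 1
        shown[loser] += 1
    return np.where(shown > 0, wins / np.maximum(shown, 1), np.nan)
```

With random pairing and enough raters per subject, the win rate is a reasonable first-pass ranking; a paired-comparison model would additionally give uncertainty estimates.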

1 Introduction
The concept of facial colorings, introduced by Král’, Madaras, and Škrekovski [11, 12], extends the well-known concept of cyclic colorings. A facial segment of a plane graph G is a sequence of vertices in the order obtained when traversing a part of the boundary of a face. The length of a facial segment is the number of its edges. Two vertices u and v of G are ℓ-facially adjacent if there exists a facial segment of length at most ℓ between them. An ℓ-facial coloring of G is a function which assigns a color to each vertex of G such that any two distinct ℓ-facially adjacent vertices are assigned distinct colors. Notice that a vertex of G that is ℓ-facially adjacent to itself does not prevent G from being colored. A graph admitting an ℓ-facial coloring with k colors is called ℓ-facially k-colorable.
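The definitions above can be made concrete with a small checker. This is an illustrative sketch assuming each face is given as its boundary cycle of vertices; the function names are ours, not from [11, 12]:

```python
def l_facially_adjacent_pairs(faces, l):
    """Pairs of distinct vertices joined by a facial segment of length <= l.

    `faces` lists each face of the plane graph as its boundary cycle."""
    pairs = set()
    for cycle in faces:
        n = len(cycle)
        for i in range(n):
            for step in range(1, l + 1):       # segments of length 1..l
                u, v = cycle[i], cycle[(i + step) % n]
                if u != v:                     # self-adjacency does not count
                    pairs.add(frozenset((u, v)))
    return pairs

def is_l_facial_coloring(faces, l, coloring):
    """True iff every pair of l-facially adjacent vertices gets distinct colors."""
    return all(coloring[u] != coloring[v]
               for u, v in map(tuple, l_facially_adjacent_pairs(faces, l)))
```

For ℓ = 1 this reduces to proper coloring along face boundaries; for example, on the four triangular faces of a plane drawing of K4 every pair of vertices is 1-facially adjacent, so four colors are needed.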

O’Toole et al. [20] present a database including videos of facial expressions shot under controlled conditions.
3. Data Collection
Figure 1 shows the web-based framework that was used to crowdsource the facial videos and the user experience. Visitors to the website opt in to watch short videos while their facial expressions are being recorded and analyzed. Immediately following each video, visitors get to see where they smiled and with what intensity, and can compare their “smile track” to the aggregate smile track. On the client side, all that is needed is a browser with Flash support and a webcam. The video from the webcam is streamed in real time at 14 frames per second at a resolution of 320x240 to a server where automated facial expression analysis is performed, and the results are rendered back to the browser for display. There is no need to download or install anything on the client side, making it very simple for people to participate. Furthermore, it is straightforward to set up and customize “experiments” to enable new research questions to be posed. For this experiment, we chose three successful Super Bowl commercials: 1. Doritos (“House sitting”, 30 s), 2. Google (“Parisian Love”, 53 s) and 3. Volkswagen (“The Force”, 62 s). Viewers chose to view one or more of the videos.

1) grant access to their webcam for video recording and 2) allow Affectiva and MIT to use the facial video for internal research. Further consent for the data to be shared with the research community at large is also sought, and only videos with consent to be shared publicly are shown in this paper. This data collection protocol was approved by the Massachusetts Institute of Technology Committee On the Use of Humans as Experimental Subjects (COUHES) prior to launching the site. A screenshot of the consent form is shown in Figure 3. If consent is granted, the commercial is played in the browser while the facial video is simultaneously streamed to a server. In accordance with MIT COUHES, viewers could opt out at any point while watching the videos, in which case their facial video is immediately deleted from the server. If a viewer watches a video to the end, then his/her facial video data is stored along with the time at which the session was started, their IP address, the ID of the video they watched and responses (if any) to the self-report questions. No other data is stored. Following each commercial, the webcam is automatically stopped and a message clearly states that the “webcam has now been turned off”. Viewers could then optionally answer three multiple-choice questions: “Did you like the video?”, “Have you seen it before?” and “Would you watch this video again?”. A screenshot of the questions is shown in Figure 4. Finally, viewers were provided with a graphical representation of their smile intensity during the clip compared to other viewers who watched the same video; viewers were also given the option to tweet their result page or email it to a friend. All in all, it took under 5 seconds to turn around the facial analysis results once the video was completed, so viewers perceived the results as instantaneous.
Viewers were free to watch one, two or three videos and could watch a video as many times as they liked. In this paper we focus on the general characteristics of the collected videos (e.g., pose and lighting) and leave the analysis of the facial and self-report responses to future work, as there is not space to discuss them fully here.

on the maximal cones of F_{≡_PR}, the quotient FW(A, B)/≡_FW defines a lattice structure on all cones of F_{≡_PR}. In particular, if the fan F_{≡_PR} is polytopal (this remains an open question in general, see [17]), then FW(A, B)/≡_FW is an order on all faces of the corresponding polytope. For instance, quotients of the facial weak order for finite Coxeter arrangements provide lattice structures on all faces of the generalized associahedra of [11], which are polytopal realizations of the Cambrian fans [19].

of deformations and warping can be found in a tutorial by Younes (2000), and extensive references on curve alignment for functional data analysis can be found in Ramsay and Silverman (2005). In what follows, the terms alignment, warping, registration, and matching will be used interchangeably to refer to the synchronization of a set of signals. Curve alignment is thus a preliminary task that is often necessary before the statistical analysis of a dataset. Matching two functions can be done by aligning individual locations of corresponding structural points (or landmarks) from one curve to another. Previous approaches to landmark-based registration in a statistical setting include Kneip and Gasser (1992), Gasser and Kneip (1995), Ramsay and Li (1998), Munoz Maldonado, Staniswallis, Irwin, and Byers (2002), and Bigot (2003). For landmark-based matching, one needs to detect the landmarks of a set of signals from discrete (noisy) observations. The estimation of the landmarks is usually complicated by the presence of noise, whose fluctuations might give rise to spurious estimates that do not correspond to structural points of the unknown signals. Then, it is necessary to determine which landmarks should be associated. This step is further complicated by the presence of outliers and by the fact that some landmarks of a given curve might have no counterpart in the other curves. Generally, these steps are performed manually (see, e.g., Munoz Maldonado et al. 2002), which can be tedious if the number of signals is large. This article uses the scale-space approach proposed by Bigot (2003, 2005) to estimate the landmarks of a noisy function. This method is based on the estimation of the significant zero-crossings of the continuous wavelet transform of a noisy signal, and on a new tool, the structural intensity, proposed by Bigot (2003, 2005) to represent the landmarks of a signal via a probability density function.
The main modes of the structural intensity correspond to the significant landmarks of the unknown signal. In a sense, the structural intensity can be viewed as a smoothing method that highlights the significant features of a signal observed with noise.
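The wavelet-based structural intensity itself is not reproduced here, but two elementary ingredients of landmark-based registration, detecting landmarks as sign changes (for example of a smoothed derivative) and warping one curve's landmarks onto another's, can be sketched as follows (function names are illustrative):

```python
import numpy as np

def zero_crossings(y, t):
    """Locations where the sampled signal y changes sign, refined by
    linear interpolation between the two bracketing samples."""
    s = np.sign(y)
    idx = np.where(s[:-1] * s[1:] < 0)[0]
    return t[idx] - y[idx] * (t[idx + 1] - t[idx]) / (y[idx + 1] - y[idx])

def landmark_warp(t, src_landmarks, dst_landmarks):
    """Piecewise-linear time warp h with h(src_i) = dst_i: evaluating a
    curve at the warped times aligns its landmarks onto the reference's."""
    return np.interp(t, src_landmarks, dst_landmarks)
```

On noisy data the raw sign changes must first be screened for significance, which is precisely what the scale-space approach of Bigot (2003, 2005) addresses.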


ON HYPERPLANE ARRANGEMENTS
ARAM DERMENJIAN, CHRISTOPHE HOHLWEG, THOMAS MCCONVILLE, AND VINCENT PILAUD
Abstract. We extend the facial weak order from finite Coxeter groups to central hyperplane arrangements. The facial weak order extends the poset of regions of a hyperplane arrangement to all its faces. We provide four non-trivially equivalent definitions of the facial weak order of a central arrangement: (1) by exploiting the fact that the faces are intervals in the poset of regions, (2) by describing its cover relations, (3) using covectors of the corresponding oriented matroid, and (4) using certain sets of normal vectors closely related to the geometry of the corresponding zonotope. Using these equivalent descriptions, we show that when the poset of regions is a lattice, the facial weak order is a lattice. In the case of simplicial arrangements, we further show that this lattice is semidistributive and give a description of its join-irreducible elements. Finally, we determine the homotopy type of all intervals in the facial weak order.
