HAL Id: tel-03276023
https://tel.archives-ouvertes.fr/tel-03276023
Submitted on 1 Jul 2021
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Hate speech and offensive language detection using
transfer learning approaches
Marzieh Mozafari
To cite this version:
NNT : 2021IPPAS007
Hate Speech and Offensive Language
Detection using Transfer Learning
Approaches
Doctoral thesis of the Institut Polytechnique de Paris, prepared at Télécom SudParis
Doctoral school no. 626: École doctorale de l’Institut Polytechnique de Paris (ED IP Paris)
Doctoral specialty: Computer Science
Thesis presented and defended at Évry on 28/05/2021 by
Marzieh Mozafari
Composition of the jury:
Daqing Zhang, Director of Studies, IMT, Télécom SudParis - France (President)
Gabriella Pasi, Professor, University of Milano-Bicocca - Italy (Reviewer)
Ioan Marius Bilasco, Associate Professor, Lille 1 University - France (Reviewer)
Elena Cabrio, Assistant Professor, Université Côte d’Azur - France (Examiner)
Christophe Cerisara, Research Scientist, CNRS - France (Examiner)
Daqing Zhang, Director of Studies, IMT, Télécom SudParis - France (Examiner)
Noel Crespi, Professor, IMT, Télécom SudParis - France (Thesis supervisor)
Reza Farahbakhsh
Dedication
To two of my best friends, Dr. Ali Jalilvand and Dr. Ardavan Afshar, who passed away during their PhD and could not defend their thesis, unfortunately...
“No one is born hating another person because of the color of his skin, or his background, or his religion. People must learn to hate, and if they can learn to hate, they can be taught to love, for love comes more naturally to the human heart than its opposite.”
Chapter 2
Chapter 3
Social Media Content Analysis
$$D_{kl}(P, M) = \sum_{i} P(i) \log \frac{P(i)}{M(i)}$$
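As a minimal sketch of how the Kullback-Leibler divergence D_kl(P, M) above can be evaluated for two discrete distributions, the following Python function uses the standard definition, in which each term is P(i) log(P(i) / M(i)). The function name `kl_divergence`, the use of the natural logarithm, and the toy distributions are assumptions made for illustration; they are not taken from the thesis.

```python
import math


def kl_divergence(p, m):
    """Return D_kl(P, M) = sum_i P(i) * log(P(i) / M(i)), in nats."""
    if len(p) != len(m):
        raise ValueError("P and M must have the same number of outcomes")
    total = 0.0
    for p_i, m_i in zip(p, m):
        if p_i == 0.0:
            continue  # convention: a term with P(i) = 0 contributes 0
        if m_i == 0.0:
            return math.inf  # P puts mass where M has none
        total += p_i * math.log(p_i / m_i)
    return total


# Hypothetical toy distributions (not data from the thesis).
P = [0.5, 0.5, 0.0]
M = [0.25, 0.5, 0.25]
print(kl_divergence(P, M))
```

The divergence is zero when the two distributions are identical and is not symmetric in its arguments, so swapping P and M generally gives a different value.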
Chapter 4
Chapter 5
Multilingual Hate Speech Detection

stage) and the learning rate of the outer loop β are initially set to 3e-5 and 6e-5, respectively. We use the Adam optimizer to update the parameters. The number of update steps in the inner loop is set to 10. During the first 30 epochs, we calculate the first-order derivatives, and for the rest of the training process we calculate the second-order derivatives in MAML. We evaluate on the samples in the L_val set with 5 different seeds after each epoch and, to avoid overfitting, apply early stopping when the validation accuracy fails to improve for 5 epochs. In the few-shot setting, we choose k ∈ {4, 8, 16} to evaluate how models generalize to a new target language with only k labeled samples per class.

For the XLM-R baseline, the maximum sequence length of the input sentences is set to 256; shorter inputs are padded with zero values and longer inputs are truncated to the maximum length. The model is fine-tuned with a batch size of 16 for 5 epochs. An Adam optimizer with a learning rate of 2e-5 is used to minimize the cross-entropy loss function. For the non-episodic baseline, the model is trained for 5 epochs on L_train and is evaluated after each epoch on the L_val set.

Implementation
As the implementation and execution environment, we use the Lab-IA platform provided by the French National Centre for Scientific Research (CNRS), with an NVIDIA Tesla V100 GPU with 32 GiB of RAM (NVLink).

5.3.5.3 Results and discussions
In this section, we evaluate the trained models on hate speech and offensive language detection tasks in different languages.

Hate speech detection
In this task, we combine all datasets in each language as reported in Table 5.7. Because no held-out benchmark test set exists for each dataset, after combining all datasets in each language we select 20% of the samples in each language as the test set by stratified sampling. To have a variety of tasks during the meta-training step, we leverage different languages with different hateful content, where all languages except two are selected as the training set. For example, to evaluate meta-learning models on Arabic as a target language with k labeled samples per class, we consider one language for validation and the rest of the languages for training, where L_train = {English, French, German, Indonesian, Spanish, Portuguese}, L_val = {Italian}, and L_test =
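The data preparation described in this passage (a stratified 20% test split per language, then k labeled samples per class in the target language) can be sketched in plain Python as follows. This is only an illustrative sketch under stated assumptions: the helper names `stratified_split` and `sample_k_shot`, the label strings, and the synthetic `toy` data are hypothetical, and the thesis's actual pipeline is not shown in the text. The language lists reproduce the training and validation roles from the Arabic example.

```python
import random
from collections import defaultdict


def group_by_label(samples):
    """Group (text, label) pairs by their label."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[1]].append(sample)
    return groups


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split so that each class keeps roughly test_fraction in the test set."""
    rng = random.Random(seed)
    train, test = [], []
    groups = group_by_label(samples)
    for label in sorted(groups):
        items = groups[label][:]
        rng.shuffle(items)
        n_test = round(len(items) * test_fraction)
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def sample_k_shot(samples, k, seed=0):
    """Draw k labeled examples per class as a few-shot support set."""
    rng = random.Random(seed)
    support = []
    groups = group_by_label(samples)
    for label in sorted(groups):
        if len(groups[label]) < k:
            raise ValueError(f"class {label!r} has fewer than {k} samples")
        support.extend(rng.sample(groups[label], k))
    return support


# Language roles from the Arabic example in the text.
L_TRAIN = ["English", "French", "German", "Indonesian", "Spanish", "Portuguese"]
L_VAL = ["Italian"]

# Synthetic placeholder data (hypothetical): 100 samples, 30 labeled "hate".
toy = [(f"text {i}", "hate" if i % 10 < 3 else "normal") for i in range(100)]
train, test = stratified_split(toy, test_fraction=0.2)

for k in (4, 8, 16):  # the k values used in the few-shot setting
    support = sample_k_shot(train, k)
    print(k, len(support))
```

Sampling per class rather than over the whole pool guarantees that every class is represented in the support set even when the target-language data is heavily imbalanced.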
Chapter 6
Conclusion and Future Work
Contents
6.1 Conclusion
6.1.1 Summary and Insights of Contributions
6.2 Future Work and Challenges
Title: Hate Speech and Offensive Language Detection using Transfer Learning Approaches
Keywords: Hate speech, Offensive language, Transfer learning, BERT, XLM-RoBERTa, Deep learning, Cross-lingual text classification, Few-shot learning, Meta learning, Social networks, Twitter
Summary: One of the promises of social media platforms (such as Twitter and Facebook) is to provide a safe place where users can share their opinions and information. However, the rise of abusive behavior, such as online harassment or the presence of hate speech, is very real. In this thesis, we focus on hate speech, one of the most worrying phenomena on social media.
Given its rapid growth and its serious negative effects, institutions, social media platforms, and researchers have tried to react as quickly as possible. Recent advances in natural language processing (NLP) and machine learning (ML) algorithms can be adapted to develop automatic methods for detecting hate speech in this domain.
The aim of this thesis is to study the problem of hate speech and abusive language detection on social media. We propose different approaches in which we adapt advanced transfer learning (TL) models and NLP techniques to detect hate speech and abusive content automatically, in both monolingual and multilingual settings. The first contribution concerns the English language only. First, we analyze user-generated textual content on Facebook by introducing a new framework capable of categorizing content in terms of similarity based on different features, namely lexical, topical, and semantic features. In addition, using Google's Perspective API, we measure and analyze the toxicity of the content. Next, we propose a TL approach for identifying hate speech that combines the unsupervised pre-trained model BERT (Bidirectional Encoder Representations from Transformers) with new supervised fine-tuning strategies. Finally, we study the effect of unintended bias in our pre-trained BERT model and propose a new generalization mechanism in the training data, reweighting the samples and then changing the fine-tuning strategies in terms of the loss function to mitigate the racial bias propagated by the model. To evaluate the proposed models, we use three public datasets from Twitter.
Title: Hate Speech and Offensive Language Detection using Transfer Learning Approaches
Keywords: Hate Speech, Offensive Language, Transfer Learning, BERT, XLM-RoBERTa, Deep Learning, Cross-Lingual Text Classification, Few-shot Learning, Meta Learning, Social Media, Twitter
Abstract: The great promise of social media platforms (e.g., Twitter and Facebook) is to provide a safe place for users to communicate their opinions and share information. However, concerns are growing that they also enable abusive behaviors, e.g., threatening or harassing other users, cyberbullying, hate speech, and racial and sexual discrimination. In this thesis, we focus on hate speech as one of the most concerning phenomena in online social media. Given the rapid growth of online hate speech and its severe negative effects, institutions, social media platforms, and researchers have been trying to react as quickly as possible. Recent advances in Natural Language Processing (NLP) and Machine Learning (ML) algorithms can be adapted to develop automatic methods for hate speech detection in this area.
The aim of this thesis is to investigate the problem of hate speech and offensive language detection in social media, where we define hate speech as any communication criticizing a person or a group based on some characteristics, e.g., gender, sexual orientation, nationality, religion, or race. We propose different approaches in which we adapt advanced Transfer Learning (TL) models and NLP techniques to detect hate speech and offensive content automatically, in a monolingual and multilingual fashion.
In the first contribution, we focus only on the English language. Firstly, we analyze user-generated textual content on Facebook to gain a brief insight into the type of content, introducing a new framework that categorizes content in terms of topical similarity based on different features, namely lexical, topical, and semantic features. Furthermore, using the Perspective API from Google, we measure and analyze the toxicity of the content. Secondly, we propose a TL approach for the identification of hate speech that combines the unsupervised pre-trained model BERT (Bidirectional Encoder Representations from Transformers) with new supervised fine-tuning strategies. Finally, we investigate the effect of unintended bias in our pre-trained BERT-based model and propose a new generalization mechanism in the training data, reweighting samples and then changing the fine-tuning strategies in terms of the loss function to mitigate the racial bias propagated through the model. To evaluate the proposed models, we use three publicly available datasets from Twitter.
In the second contribution, we consider a multilingual setting where we focus on low-resource languages for which little or no labeled data is available. First, we present the first corpus of Persian offensive language, consisting of 6,000 microblogs from Twitter, to deal with offensive language detection in Persian as a low-resource language in this domain. After annotating the corpus, we perform extensive experiments to investigate the performance of transformer-based monolingual and multilingual pre-trained language models (e.g., ParsBERT, mBERT, XLM-RoBERTa) in the downstream task. Furthermore, we propose an ensemble model to boost the performance of our model. Then, we expand our study into a cross-lingual few-shot learning problem and adapt a meta learning-based approach to study few-shot hate speech and offensive language detection in low-resource languages, which allows hateful or offensive content to be predicted by observing only a few labeled data items in a specific target language. To evaluate the proposed model, we use diverse collections of different publicly available corpora, comprising 15 datasets across 8 languages for hate speech and 6 datasets across 6 languages for offensive language. To the best of the author’s knowledge, very few attempts have been made to use meta learning approaches on hate speech detection tasks.
Institut Polytechnique de Paris