
Hate speech and offensive language detection using transfer learning approaches


Academic year: 2021


HAL Id: tel-03276023

https://tel.archives-ouvertes.fr/tel-03276023

Submitted on 1 Jul 2021

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Hate speech and offensive language detection using transfer learning approaches

Marzieh Mozafari


NNT: 2021IPPAS007

Hate Speech and Offensive Language Detection using Transfer Learning Approaches

Doctoral thesis of the Institut Polytechnique de Paris, prepared at Télécom SudParis

Doctoral school no. 626: École doctorale de l'Institut Polytechnique de Paris (ED IP Paris)

Doctoral specialty: Computer Science

Thesis presented and defended in Évry on 28 May 2021 by

Marzieh Mozafari

Jury composition:

Daqing Zhang, Director of Studies, IMT, Télécom SudParis, France (President)
Gabriella Pasi, Professor, University of Milano-Bicocca, Italy (Reviewer)
Ioan Marius Bilasco, Associate Professor (Maître de Conférences), Lille 1 University, France (Reviewer)
Elena Cabrio, Assistant Professor, Université Côte d'Azur, France (Examiner)
Christophe Cerisara, Research Scientist (Chargé de recherche), CNRS, France (Examiner)
Daqing Zhang, Director of Studies, IMT, Télécom SudParis, France (Examiner)
Noel Crespi, Professor, IMT, Télécom SudParis, France (Thesis supervisor)
Reza Farahbakhsh


Dedication

To two of my best friends, Dr. Ali Jalilvand and Dr. Ardavan Afshar, who passed away during their PhD and could not defend their thesis, unfortunately...


"No one is born hating another person because of the color of his skin, or his background, or his religion. People must learn to hate, and if they can learn to hate, they can be taught to love, for love comes more naturally to the human heart than its opposite."


Table of Contents

A Appendix


Chapter 2


Chapter 3. Social Media Content Analysis


$$D_{KL}(P, M) = \sum_{i} P(i) \log \frac{P(i)}{M(i)}$$
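The Kullback-Leibler divergence between two discrete distributions P and M, D_KL(P, M) = sum over i of P(i) log(P(i) / M(i)), can be computed directly from its definition. The sketch below is only an illustration of that standard formula: the function name `kl_divergence`, the use of the natural logarithm, and the toy distributions are assumptions made here, not details taken from the thesis.

```python
import math


def kl_divergence(p, m):
    """Return D_KL(P || M) = sum_i P(i) * log(P(i) / M(i)), using the natural log.

    p and m are sequences of probabilities over the same outcomes.
    """
    if len(p) != len(m):
        raise ValueError("p and m must have the same length")
    total = 0.0
    for p_i, m_i in zip(p, m):
        if p_i == 0.0:
            continue  # a term with P(i) = 0 contributes 0 by convention
        if m_i == 0.0:
            return math.inf  # P puts mass where M has none
        total += p_i * math.log(p_i / m_i)
    return total


# Toy distributions (hypothetical values, for illustration only).
p = [0.5, 0.5]
m = [0.9, 0.1]
print(kl_divergence(p, m))
print(kl_divergence(p, p))
```

The divergence is zero when both distributions are identical and is not symmetric in its two arguments, which is why it is often combined with a mixture distribution when a symmetric measure is needed.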


Chapter 4


Chapter 5


... stage) and the learning rate of the outer loop β are initially set to 3e-5 and 6e-5, respectively. We use the Adam optimizer to update the parameters. The number of update steps in the inner loop is set to 10. During the first 30 epochs, we calculate the first-order derivatives, and for the rest of the training process we calculate the second-order derivatives in MAML. We perform evaluation on the samples in the L_val set with 5 different seeds after each epoch and, to avoid overfitting, we apply early stopping when the validation accuracy fails to improve for 5 epochs. In the few-shot setting, we choose k ∈ {4, 8, 16} to evaluate how models generalize to a new target language with limited labeled data (k samples per class).

For the XLM-R baseline, the maximum sequence length of the input sentences is set to 256; shorter inputs are padded with zero values and longer inputs are truncated to the maximum length. The model is fine-tuned with a batch size of 16 for 5 epochs. An Adam optimizer with a learning rate of 2e-5 is used to minimize the cross-entropy loss function. For the non-episodic baseline, the model is trained for 5 epochs on L_train and is evaluated after each epoch on the L_val set.

Implementation

As the implementation and execution environment, we use the Lab-IA platform provided by the French National Centre for Scientific Research (CNRS), with an NVIDIA Tesla V100 GPU with 32 GiB of RAM (NVLink).

5.3.5.3 Results and discussions

In this section, we evaluate the trained models on hate speech and offensive language detection tasks with different languages.

Hate speech detection

In this task, we combine all datasets in each language as reported in Table 5.7. Due to the lack of a held-out benchmark test set for each dataset, after combining all datasets in each language, we select 20% of the samples in each language as the test set by performing stratified sampling. To have a variety of tasks during the meta-training step, we leverage different languages with different hateful content, where all languages except two are selected as the training set. For example, to evaluate meta-learning models on Arabic as a target language with k labeled samples per class, we consider one language for validation and the rest of the languages for training, where L_train = {English, French, German, Indonesian, Spanish, Portuguese}, L_val = {Italian}, and L_test =
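The data preparation described in this passage (a stratified 20% test split within a language, followed by drawing k labeled samples per class for the few-shot setting) can be sketched as follows. This is a minimal illustration in plain Python, not the thesis's implementation: the function names `stratified_split` and `sample_k_shot`, the toy labels, the class proportions, and the seeds are all assumptions introduced here.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps about the same share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def sample_k_shot(labels, candidate_idx, k, seed=0):
    """Draw k labeled examples per class from candidate_idx (a few-shot support set)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in candidate_idx:
        by_class[labels[idx]].append(idx)

    support = []
    for label in sorted(by_class):
        if len(by_class[label]) < k:
            raise ValueError(f"class {label!r} has fewer than {k} examples")
        support.extend(rng.sample(by_class[label], k))
    return sorted(support)


# Toy target-language data: 70 "normal" and 30 "hate" examples (hypothetical labels).
labels = ["normal"] * 70 + ["hate"] * 30
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)
support_idx = sample_k_shot(labels, train_idx, k=8, seed=42)
print(len(train_idx), len(test_idx), len(support_idx))
```

Splitting per class before sampling keeps the minority class represented in the test set, and drawing the k-shot support set only from the non-test indices keeps the few-shot examples disjoint from the evaluation data.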


Chapter 6. Conclusion and Future Work

Contents

6.1 Conclusion
6.1.1 Summary and Insights of Contributions
6.2 Future Work and Challenges




Title: Hate Speech and Offensive Language Detection using Transfer Learning Approaches

Keywords: Hate Speech, Offensive Language, Transfer Learning, BERT, XLM-RoBERTa, Deep Learning, Cross-Lingual Text Classification, Few-shot Learning, Meta Learning, Social Media, Twitter

Abstract: The great promise of social media platforms (e.g., Twitter and Facebook) is to provide a safe place for users to communicate their opinions and share information. However, concerns are growing that they also enable abusive behaviors, e.g., threatening or harassing other users, cyberbullying, hate speech, and racial and sexual discrimination. In this thesis, we focus on hate speech as one of the most concerning phenomena in online social media. Given the rapid growth of online hate speech and its severe negative effects, institutions, social media platforms, and researchers have been trying to react as quickly as possible. Recent advances in Natural Language Processing (NLP) and Machine Learning (ML) algorithms can be adapted to develop automatic methods for hate speech detection in this area.

The aim of this thesis is to investigate the problem of hate speech and offensive language detection in social media, where we define hate speech as any communication criticizing a person or a group based on some characteristics, e.g., gender, sexual orientation, nationality, religion, or race. We propose different approaches in which we adapt advanced Transfer Learning (TL) models and NLP techniques to detect hate speech and offensive content automatically, in both monolingual and multilingual settings.

In the first contribution, we focus only on the English language. First, we analyze user-generated textual content on Facebook to gain a brief insight into the type of content by introducing a new framework that can categorize content in terms of topical similarity based on different features, namely lexical, topical, and semantic features. Furthermore, using the Perspective API from Google, we measure and analyze the toxicity of the content. Second, we propose a TL approach for identifying hate speech by employing a combination of the unsupervised pre-trained model BERT (Bidirectional Encoder Representations from Transformers) and new supervised fine-tuning strategies. Finally, we investigate the effect of unintended bias in our pre-trained BERT-based model and propose a new generalization mechanism in training data by reweighting samples and then changing the fine-tuning strategies in terms of the loss function to mitigate the racial bias propagated through the model. To evaluate the proposed models, we use three publicly available datasets from Twitter.

In the second contribution, we consider a multilingual setting where we focus on low-resource languages for which little or no labeled data is available. First, we present the first corpus of Persian offensive language, consisting of 6,000 microblogs from Twitter, to deal with offensive language detection in Persian as a low-resource language in this domain. After annotating the corpus, we perform extensive experiments to investigate the performance of transformer-based monolingual and multilingual pre-trained language models (e.g., ParsBERT, mBERT, XLM-RoBERTa) in the downstream task. Furthermore, we propose an ensemble model to boost the performance of our model. Then, we expand our study into a cross-lingual few-shot learning problem and adapt a meta-learning-based approach to few-shot hate speech and offensive language detection in low-resource languages, which allows hateful or offensive content to be predicted by observing only a few labeled data items in a specific target language. To evaluate the proposed model, we use diverse collections of publicly available corpora, comprising 15 datasets across 8 languages for hate speech and 6 datasets across 6 languages for offensive language. To the best of the author's knowledge, very few attempts have been made to use meta-learning approaches on hate speech detection tasks.

Institut Polytechnique de Paris
