LIDIC - UNSL's Participation at eRisk 2017: Pilot Task on Early Detection of Depression

Pilot Task on Early Detection of Depression

Notebook for eRisk at CLEF 2017

Ma. Paula Villegas¹, Dario G. Funez¹, Ma. José Garciarena Ucelay¹, Leticia C. Cagnina¹,², and Marcelo L. Errecalde¹

¹ LIDIC Research Group, Universidad Nacional de San Luis, Argentina
² Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
{villegasmariapaula74, funezdario, mjgarciarenaucelay}@gmail.com
{lcagnina, merrecalde}@gmail.com

Abstract. In this paper we describe the participation of the LIDIC Research Group of Universidad Nacional de San Luis (UNSL), Argentina, at the eRisk 2017 pilot task. The main goal of this task is to consider early risk detection scenarios, depression in this case, where the issue of getting timely predictions with a reasonable confidence level becomes critical. In the pilot task, systems must be able to sequentially process pieces of evidence and detect early traces of depression as soon as possible. The data set used is a collection of writings labeled as depressed or non-depressed that were released to the pilot task participants in two different stages: training and test. Our proposal for this task was based on a semantic representation of documents that explicitly considers the partial information that is made available to the early risk detection systems in the different chunks along the time. That temporal approach was complemented with other standard text categorization models in some specific situations that seemed not to be correctly addressed by our approach alone. In the test stage, the resulting system obtained the lowest ERDE_50 error on a total of 30 submissions from 8 different institutions. However, once the golden truth of the testing data set was released, we could verify that our temporal approach alone might have obtained very robust results and the lowest reported ERDE error for both thresholds used in the pilot task.

Keywords: Early Risk Detection, Unbalanced Data Sets, Text Representations, Semantic Analysis Techniques.

1 Introduction

The widespread use of the Internet, social networks and other computer technologies has marked the beginning of a new era in the communication among people. In this context, we now have available a lot of information that might be useful to detect, in a timely way, those situations that might be potentially dangerous or risky for the physical well-being and health of individuals, communities and societies [...] to depression, among others.

This type of situation has started to be studied in a new research field known as early risk detection (ERD), which has received increasing interest from scientific researchers at world level due to the important impact that it could have in many relevant and current problems of the real world. In this context, this year the first early risk prediction conference, eRisk 2017, was organized in the context of the CLEF 2017 Workshop. As part of this event, a pilot task was organized and the present article describes our participation in this task.

The eRisk 2017 pilot task was organized into two different stages: training and test. It also assumed an ERD scenario, that is, data are sequentially read as a stream and the challenge consists in detecting risk cases as soon as possible. In order to reproduce this scenario, the eRisk 2017 organizers released the test data set following a sequential, chunk by chunk criterion; that is, the first chunk contained the oldest 10% of the messages, the second chunk contained the second oldest 10%, and so forth, up to completing 10 chunks that represent the full writings of the analysed individuals.

To deal with this problem we implemented an original proposal that we named temporal variation of terms (TVT) [3]. However, at the training stage, TVT seemed to show some weakness at specific chunks, so we decided to complement it with other standard methods that help our approach in these specific situations. Thus, our implemented ERD system is in fact a combined system that strongly depends on TVT but also uses other methods as additional sources of opinions. The resulting system had a very acceptable performance on the data set released in the test stage, obtaining the lowest ERDE_50 error on a total of 30 submissions from 8 different institutions. However, once the golden truth of the test data set was released, we could verify that TVT alone might have obtained very robust results and the lowest reported ERDE error for both thresholds used in the pilot task. For this reason, we also include an additional section in this work with preliminary results obtained with TVT alone on the test set, in order to observe the potential of this method for general ERD problems.

The rest of the article is organized as follows: Section 2 describes some general aspects of the data set used in the pilot task and the methods used in our ERD system. Next, in Section 3 the activities carried out in the training stage are described and the justifications of the main design decisions made on our ERD system are presented. Section 4 shows the results obtained with our proposal on the eRisk 2017 data set released in the test stage. Then, some complementary results are presented in Section 5, where interesting aspects of the performance of TVT working alone on the test set are shown. Finally, Section 6 depicts potential future work and the obtained conclusions.

2 Data Set and Methods

2.1 Data Set

The data set used in the pilot task of the eRisk competition was initially described in [8]. It is a collection of writings (posts or comments) from a set of Social Media users. There are two categories of users, depressed and non-depressed, and, for each user, the collection contains a sequence of writings (in chronological order). For each user, his collection of writings has been divided into 10 chunks. The first chunk contains the oldest 10% of the messages, the second chunk contains the second oldest 10%, and so forth. This collection was split into a training and a test set that we will refer to as TR_DS and TE_DS, respectively. The TR_DS set contained 486 users (83 positive, 403 negative) and the (test) TE_DS set contained 401 users (52 positive, 349 negative). The users labeled as positive are those that have explicitly mentioned that they have been diagnosed with depression.

The pilot task was divided by its organizers into a training stage and a test stage. In the first one, the participating teams had access to the TR_DS data set with all chunks of all training users. They could therefore tune their systems with the training data. Then, in the test stage, the 10 chunks of the TE_DS data set were gradually released by the organizers, one by one, until completing all the chunks that correspond to the complete writings of the considered individuals. Each time a chunk ch_i was released, participants in the pilot task were asked to give their predictions on the users contained in TE_DS, based on the partial information read from chunks ch_1 to ch_i.
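The following minimal sketch (ours, not the pilot-task code; function names are illustrative) shows how a user's chronologically ordered writings can be split into 10 chunks and accumulated chunk by chunk, mirroring the release protocol just described.

```python
# Sketch of the chunk-by-chunk protocol: a user's chronologically ordered
# writings are split into 10 chunks, and at step i a system only sees chunks 1..i.
from typing import Iterator, List, Tuple

def split_into_chunks(writings: List[str], n_chunks: int = 10) -> List[List[str]]:
    """First chunk holds the oldest ~10% of the messages, and so forth."""
    size = max(1, -(-len(writings) // n_chunks))    # ceiling division
    return [writings[i * size:(i + 1) * size] for i in range(n_chunks)]

def incremental_release(writings: List[str], n_chunks: int = 10) -> Iterator[Tuple[int, str]]:
    """Yield (i, partial text read from chunks 1..i) for i = 1..n_chunks."""
    seen: List[str] = []
    for i, chunk in enumerate(split_into_chunks(writings, n_chunks), start=1):
        seen.extend(chunk)
        yield i, " ".join(seen)
```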

2.2 Methods

In order to deal with the problem posed in the pilot task we proposed a new document representation named temporal variation of terms (TVT) [3]. TVT is an elaborate approach that requires enough space to be described in an adequate way. For this reason, in the present work we will only focus on the decision aspects that were considered at the pilot task to combine TVT with other well-known document representation approaches like Bag of Words (BoW). Thus, we will give below a separate description of each method with its main characteristics, but the interested reader can obtain a detailed description of TVT in [3].

We used in our experiments different document representations and learning algorithms. Therefore, we start by giving some general ideas of those document representations and a little more detailed explanation of TVT. After that, some general considerations on the learning algorithms used are briefly introduced.

Document representations

Bag of Words (BoW). BoW is one of the language models most used in text categorization tasks. In BoW representations, features are words and documents are simply treated as collections of unordered words. Formally, a document d is represented by the vector of weights d_BoW = (w_1, w_2, ..., w_n), where n is the size of the vocabulary of the data set. Each weight w_i is a value assigned to each feature (word) according to whether the word appears in a document or how frequently it appears. This popular representation is simple to implement, fast to obtain and can be used under different weighting schemes (boolean, term frequency (tf) or term frequency - inverse document frequency (tf-idf)). In our study we used BoW with the boolean weighting scheme.
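As an illustration of the boolean weighting scheme just described, the sketch below (our own example; the paper's experiments used Weka implementations) builds a binary BoW matrix with scikit-learn.

```python
# Illustrative boolean BoW with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer

docs = ["I was diagnosed with depression last year",
        "great game last night, we won again"]

# binary=True gives the boolean scheme: w_i = 1 if word i occurs in d, else 0.
bow = CountVectorizer(binary=True, lowercase=True)
X = bow.fit_transform(docs)               # shape: (number of documents, vocabulary size n)
print(bow.get_feature_names_out())        # the words behind each weight w_i
print(X.toarray())
```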

Concise Semantic Analysis (or Second Order Attributes (SOA)). Concise Semantic Analysis is a semantic analysis technique that interprets words and text fragments in a space of concepts that are close (or equal) to the category labels. For instance, if documents in the data set are labeled with q different category labels (usually no more than 100 elements), words and documents will be represented in a q-dimensional space. That space size is usually much smaller than in standard BoW representations, which directly depend on the vocabulary size (more than 10000 or 20000 elements in general). CSA has been used in general text categorization tasks [6] and has been adapted to work in author profiling tasks under the name of Second Order Attributes (SOA) [7].
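The projection into the q-dimensional concept space can be sketched as follows. This is our own simplified rendering of the CSA idea; the exact term-weighting formulas of [6,7] are more elaborate than the plain normalized counts used here.

```python
# Simplified CSA/SOA sketch: each term gets a q-dimensional vector of
# association weights with the q category labels, and a document is mapped
# into that space by aggregating the vectors of its terms.
import numpy as np
from collections import Counter, defaultdict

def csa_term_profiles(docs, labels):
    cats = sorted(set(labels))                        # the q "concepts"
    counts = defaultdict(lambda: np.zeros(len(cats)))
    for doc, lab in zip(docs, labels):
        j = cats.index(lab)
        for word, c in Counter(doc.lower().split()).items():
            counts[word][j] += c
    # normalize each term vector so it describes how the term spreads over categories
    return {w: v / v.sum() for w, v in counts.items()}, cats

def csa_document(doc, profiles, q):
    vecs = [profiles[w] for w in doc.lower().split() if w in profiles]
    return np.mean(vecs, axis=0) if vecs else np.zeros(q)
```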

Character 3-grams. An n-gram is a sequence of n characters obtained from each text in the data set. In the n-gram model, the dictionary contains all n-grams that occur in any term of the vocabulary. Representations using character n-grams have been shown to be effective in many applications [5], where the n-grams are considered the terms used in BoW representations.
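A minimal sketch of character 3-gram extraction (ours, with scikit-learn; the parameter choices are illustrative):

```python
# Character 3-gram features; analyzer="char_wb" builds n-grams inside word
# boundaries, close to "all n-grams that occur in any term of the vocabulary".
from sklearn.feature_extraction.text import CountVectorizer

char3 = CountVectorizer(analyzer="char_wb", ngram_range=(3, 3), lowercase=True)
X = char3.fit_transform(["feeling sad", "feeling glad"])
print(char3.get_feature_names_out())   # e.g. ' fe', 'fee', 'eel', 'eli', ...
```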

LIWC-based features. Features derived from Linguistic Inquiry and Word Count (LIWC) [11,10] have been used in several studies related to psychological aspects of individuals. LIWC has been successfully used to identify depressed and non-depressed people by analyzing linguistic markers of depression such as the use of personal pronouns and positive-negative emotions [12]; the presence of words related to death (e.g., "dead", "kill", "suicide"), sex (e.g., "arouse", "make out", "orgasm") and ingestion (e.g., "chew", "drink", "hunger"), besides emotions, also results useful [1]. Studies on suicidal individuals have incorporated LIWC as a valuable tool to extract information related to suicide and suicidal ideation, analyzing the categories death, health, sad, they, I, sexual, filler, swear, anger, and negative emotions [2,4]. Due to the fact that we wanted to consider more meaningful features, we also considered in preliminary studies the most informative features belonging to linguistic dimensions (for example, personal pronouns and number of function words), summary language variables (for example, dictionary words and words with more than 6 letters), psychological processes (for example, negative emotions and affective processes) and grammar (verbs and [...]).
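The kind of category-based counting that LIWC performs can be sketched as follows. The real LIWC dictionaries are proprietary and far larger; the word lists below are only placeholders.

```python
# Toy illustration of LIWC-style features: relative frequency of words that
# belong to hand-crafted psychological categories (placeholder word lists).
CATEGORIES = {
    "death":     {"dead", "kill", "suicide"},
    "ingestion": {"chew", "drink", "hunger"},
    "pronoun_i": {"i", "me", "my", "mine"},
}

def liwc_style_features(text: str) -> dict:
    tokens = text.lower().split()
    total = max(1, len(tokens))
    return {cat: sum(t in words for t in tokens) / total
            for cat, words in CATEGORIES.items()}

print(liwc_style_features("I drink because I feel dead inside"))
```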

Temporal Variation of Terms (TVT). The temporal variation of terms (TVT) method [3] is an approach for early risk detection based on using the variation of vocabulary along the different time steps as concept space for document representation. This method is the key component of our ERD system for the eRisk pilot task.

TVT is based on the concise semantic analysis (CSA) technique proposed in [6] and later extended in [7] for author profiling tasks. In this context, the underlying idea is that variations of the terms used in different sequential stages of the documents may have relevant information for the classification task. With this idea in mind, this method enriches the documents of the minority class with the partial documents read in the first chunks. These first chunks of the minority class, along with their complete documents, are considered as a new concept space for a CSA method.

TVT naturally copes with the sequential characteristics of ERD problems and also gives a tool for dealing with unbalanced data sets. Preliminary results of this method in comparison to CSA and BoW representations [3] showed its potential to deal with ERD problems.
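As a very rough sketch of this idea (our simplification; see [3] for the actual formulation), the enlarged concept space can be built by adding, for each depressed user, the prefixes formed by his first chunks as additional concepts:

```python
# Rough sketch of the TVT concept space.  users: list of (chunks, label) pairs,
# where chunks is the chronologically ordered list of a user's 10 chunk texts
# and label is "pos" (depressed, the minority class) or "neg".
def tvt_concept_space(users, n_prefix_chunks=3):
    texts, concepts = [], []
    for chunks, label in users:
        texts.append(" ".join(chunks))              # complete document
        concepts.append(label)
        if label == "pos":                          # minority-class enrichment
            for i in range(1, n_prefix_chunks + 1):
                texts.append(" ".join(chunks[:i]))  # partial document read up to chunk i
                concepts.append(f"pos_chunk_{i}")   # temporal-variation concept
    return texts, concepts

# The (texts, concepts) pairs can then feed a CSA-style mapping (e.g. the
# csa_term_profiles sketch above), giving one dimension per concept.
```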

Learning algorithms. In preliminary studies we tested different learning algorithms and we obtained the best results with Random Forest and Naïve Bayes in several comparisons with other popular methods like LIBSVM⁵. We also used in our system a decision tree (Weka's J48) obtained by first selecting the 100 words with the highest information gain and then removing from that list those words that were considered dependent on specific domains (names of politicians like "Obama" and countries like "China"). The J48 algorithm trained on this subset of features produced a decision tree of only 39 nodes containing some interesting words like "meds", "depression", "therapist" and "cry", among others. As we will see later, this decision tree was used to assist the TVT method in the initial chunks.

Even though we carried out several comparative studies with LIWC features, character 3-grams, CSA representations and the LIBSVM algorithm, we decided to select only those approaches that seemed to be effective to assist TVT in some specific situations. For this reason, in our ERD system we only used the Random Forest and Naïve Bayes algorithms with BoW and TVT representations, and the J48 decision tree previously explained.

⁵ For all the tested algorithms we used the implementations provided by the Weka machine learning toolkit.
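The J48-based component can be approximated as follows. We used Weka's implementations in the actual system, so the scikit-learn pipeline below (with illustrative parameters) is only an approximation of that setup.

```python
# Rough approximation of the J48-style component: keep the 100 highest
# information-gain words, then fit a small decision tree on them.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

j48_like = Pipeline([
    ("bow", CountVectorizer(binary=True)),
    ("top100", SelectKBest(mutual_info_classif, k=100)),    # information-gain-like ranking
    ("tree", DecisionTreeClassifier(min_samples_leaf=5, random_state=0)),
])
# j48_like.fit(train_texts, train_labels)   # train_texts/train_labels are placeholders
# In our system, domain-dependent words (politician or country names) were
# removed from the top-100 list by hand before training the final tree.
```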

3 Training Stage

The pilot task was divided into a training stage and a test stage. So, in the first one, the participating teams had access to the TR_DS set with all chunks of all training users. Therefore, in this stage we could tune our system with the training data.

Although TVT had shown in several preliminary studies to give better results than BoW and CSA (see [3] for a detailed study), we lacked a robust criterion to solve a critical aspect in ERD problems that we will refer to as the classification time decision (CTD) issue. That is, an ERD system needs to consider not only which class should be assigned to a document; it also needs to decide when to make that assignment. In other words, ERD methods must have a criterion to decide when (in what situations) the classification generated by the system is considered the final/definitive decision on the evaluated instances. Although this aspect has been addressed with very simple heuristic rules⁶, we wanted to have a rule for the CTD issue that attempted to exploit the strengths of the TVT method and simultaneously alleviate its weaknesses. With this goal in mind, we began a thorough study to determine in which situations the TVT method behaved satisfactorily and when it needed to be improved.

⁶ For instance, exceeding a specific confidence threshold in the prediction of the classifier.

We started our study by dividing the training set TR_DS into a new training set that we will refer to as TR_DS-train and a test set named TR_DS-test. Those sets maintained the same proportions of posts per user and words per user as described in [8]. TR_DS-train and TR_DS-test were generated by randomly selecting around 70% of the writings for the first one and the remaining 30% for the second one. Thus, TR_DS-train resulted in 351 individuals (63 positive, 288 negative), while TR_DS-test contained 135 individuals (20 positive, 115 negative). The division into 10 chunks of the writings provided by the organizers was kept for both collections.

Then, we analyzed the performance of the TVT method by using the TR_DS-train set as training set and evaluating the obtained results on TR_DS-test.

We started by assuming that the classification is made on a static chunk by chunk basis. That is, for each chunk Ĉ_i provided to the ERD systems, we evaluated TVT's performance considering that the model is (simultaneously) applied to the writings received up to chunk Ĉ_i. With this type of study it was possible to observe to what extent the TVT method was robust to the partial information in the different stages, in which moment it started to obtain acceptable results, and other interesting statistics. In this context, we considered that the F_1 measure, which combines precision and recall, would be a valuable measure to gain some insight into the performance of the TVT method along the different chunks.

Figure 1 shows this type of information, where we can see that TVT representations with two different learning algorithms (Naïve Bayes (NB) and Random Forest (RF)) reached the highest F_1 values in chunks 3 and 4. In both cases, those values were obtained when an instance was classified as positive if the probability assigned by the classifier was greater than or equal to 0.6 (p ≥ 0.6). Besides, we could observe that in the first chunk TVT representations produced a high number of false positives. That weakness of TVT methods is shown in Figure 2, where we can see the lowest precision values in chunk 1. That problem of TVT methods looks reasonable if we consider that TVT is based on the vocabulary drawn from the first chunks of the minority class; in those initial chunks, where little information is available, TVT will need to be assisted by other complementary methods.

Fig. 1. F_1 values obtained with TVT models.

Fig. 2. Precision values obtained with TVT models.

Due to the fact that we wanted to focus our predictions on those chunks where the best results were obtained (chunks 3 and 4), and assuming that after those chunks the penalization component in the ERDE computation would affect TVT's results, we decided to set some chunk by chunk rules that accomplish certain basic properties in the different chunks (a schematic sketch of these rules is given right after the list):

1. Chunk 1: Here the ERD system should be extremely conservative and only classify an instance as positive (depressed) if there exists strong evidence of that. We used for this case the criterion of only classifying an instance as positive if both models obtained with TVT-NB and TVT-RF classified the instance as positive with probability p = 1 and the text included all the words in a white list. That list was obtained from the words with the highest information gain of the documents of the first chunk. It included the words "depression" and "diagnosed".

2. Chunk 2: Here the restriction of the white list could be relaxed and the TVT methods complemented with a more general approach. To this end, we used the predictions of the J48 decision tree explained above and classified an instance as positive if the three classifiers classified it in that way.

3. Chunk 3: In this chunk most of the classifiers obtained the best precision values. However, they obtained low recall values. In order to address this aspect, an instance was classified as positive if at least two classifiers classified it as positive with probability p ≥ 0.9.

4. Chunk 4: Here we observed that the BoW model obtained a good recall but a low precision. We also could observe that, when it was combined with the TVT method, the latter played the role of a filter, obtaining good results when both methods classified an instance as positive. On the other hand, many instances not detected by this combination were classified as positive by the J48 method. For this reason, any instance classified as positive by both the BoW and TVT methods with probability p ≥ 0.7, or by the J48 method, was classified as positive.

5. Chunks 5 to 10: From chunk 5 onward, we assumed that most of the relevant classifications had already been made in the previous chunks. However, we kept a monitoring system to identify those cases that needed much more additional information to determine whether an individual was depressive or not. For this purpose, we used the same rule of chunk 4 for chunks 5 and 6, but increasing the probability of BoW and TVT to p ≥ 0.8. From chunks 7 to 10 an instance was classified as positive if at least two classifiers classified it as positive with probability p ≥ 0.9.
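The code below is a schematic paraphrase of the above rules under one possible reading of their ambiguous points (which classifiers vote in each rule and with which thresholds); the probability arguments stand for the positive-class probabilities of the trained models and are placeholders.

```python
# Schematic paraphrase of the chunk-by-chunk rules (one possible reading).
# prob_* are positive-class probabilities; j48_positive is the J48 prediction.
WHITE_LIST = {"depression", "diagnosed"}

def classify_positive(chunk_idx, text, prob_tvt_nb, prob_tvt_rf, prob_bow, j48_positive):
    has_white_list = all(w in text.lower() for w in WHITE_LIST)
    tvt_unanimous = prob_tvt_nb == 1.0 and prob_tvt_rf == 1.0
    if chunk_idx == 1:                       # extremely conservative
        return tvt_unanimous and has_white_list
    if chunk_idx == 2:                       # white list relaxed, J48 vote added
        return tvt_unanimous and j48_positive
    if chunk_idx == 3:                       # 2-of-3 classifiers with p >= 0.9
        return sum(p >= 0.9 for p in (prob_tvt_nb, prob_tvt_rf, prob_bow)) >= 2
    if chunk_idx in (4, 5, 6):               # BoW + TVT agreement, or J48 alone
        p_min = 0.7 if chunk_idx == 4 else 0.8
        return (prob_bow >= p_min and max(prob_tvt_nb, prob_tvt_rf) >= p_min) or j48_positive
    # chunks 7 to 10: 2-of-3 classifiers with p >= 0.9
    return sum(p >= 0.9 for p in (prob_tvt_nb, prob_tvt_rf, prob_bow)) >= 2
```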

With those empirically derived rules as the general criterion for the CTD issue, we then analyzed if the TVT method combined with the other approaches in specific situations obtained a better performance than the individual methods working separately. The results obtained from this analysis are presented below.

3.1 Analysis of results

In order to study the performance of our combined approach against each method tested separately, we used the TR_DS-train and TR_DS-test sets in the experiments. Individual methods also need a criterion to deal with the CTD issue. In this case, we can directly use the probability (or some measure of confidence) assigned by the classifier to decide when to stop reading a document and give its classification. That approach, which is referred to as dynamic in [8], simply classifies the instance/individual as positive once this probability exceeds some particular threshold. In our studies, we also used this dynamic approach for the individual methods, by considering different probability values: p = 1, p ≥ 0.9, p ≥ 0.8, p ≥ 0.7 and p ≥ 0.6.

Tables 1, 2 and 3 report the values of precision (π), recall (ρ) and F_1-measure (F_1) of the target (depressed) class for each considered individual model and different probabilities. The statistics also include the early risk detection error (ERDE) measure proposed in [8], with the two values of the parameter o used in the pilot task: o = 5 (ERDE_5) and o = 50 (ERDE_50).
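For reference, the ERDE measure of [8], as we read it, can be sketched as follows: a false positive costs c_fp, a false negative costs c_fn, a true positive pays a latency cost that grows with the number k of writings read before the decision, and a true negative costs nothing. The default cost values below are only illustrative, not the official evaluation script.

```python
# Sketch of the ERDE_o measure.  decision/truth: True = depressed.
# k: number of writings read before emitting the decision.
import math

def erde(decision: bool, truth: bool, k: int, o: float,
         c_fp: float = 52 / 401, c_fn: float = 1.0, c_tp: float = 1.0) -> float:
    if decision and not truth:
        return c_fp                       # false positive
    if not decision and truth:
        return c_fn                       # false negative (missed depressed user)
    if decision and truth:
        # latency cost 1 - 1/(1 + e^(k - o)), written in a numerically stable form
        return c_tp / (1.0 + math.exp(-(k - o)))
    return 0.0                            # true negative

# The reported ERDE_o is the mean of this cost over all users in the test set;
# c_fp is often set to the proportion of positive users in the collection.
```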

As we can see from Table 1, the TVT-NB model obtained its best performance when the classifier assigned the instances to the positive class (depressed) with probability 1. For TVT-RF and BoW the statistics are different. In Table 2, the best ERDE_50 (and F_1 measure) was obtained when the Random Forest classifier used the lowest probability (p ≥ 0.6), but at the same time, with this probability the ERDE_5 error is the worst value. The BoW model (Table 3) obtained the best ERDE_50 value considering p ≥ 0.8, and with this configuration the best F_1, π and ρ are obtained. The J48 model assigns the instances to the positive class using probability 1, so its performance is directly shown in the comparison in Table 4.

Table 1. Performance of the TVT-NB model (TR_DS-test set).

          p = 1   p ≥ 0.9   p ≥ 0.8   p ≥ 0.7   p ≥ 0.6
ERDE_5    14.13   16.66     16.68     16.82     16.85
ERDE_50   11.25   15.34     15.21     15.24     15.07
F_1        0.40    0.27      0.28      0.27      0.28
π          0.47    0.50      0.48      0.43      0.44
ρ          0.35    0.18      0.19      0.19      0.20

Table 2. Performance of the TVT-RF model (TR_DS-test set).

          p = 1   p ≥ 0.9   p ≥ 0.8   p ≥ 0.7   p ≥ 0.6
ERDE_5    16.92   16.85     16.92     16.96     17.14
ERDE_50   16.43   16.05     16.12     16.02     15.79
F_1        0.11    0.15      0.16      0.21      0.22
π          0.50    0.54      0.50      0.50      0.43
ρ          0.06    0.08      0.10      0.13      0.14

Table 3. Performance of the BoW model (TR_DS-test set).

          p = 1   p ≥ 0.9   p ≥ 0.8   p ≥ 0.7   p ≥ 0.6
ERDE_5    20.79   20.83     21.05     21.16     21.27
ERDE_50   18.82   18.66     18.13     18.24     18.35
F_1        0.19    0.21      0.24      0.24      0.23
π          0.11    0.13      0.14      0.14      0.14
ρ          0.50    0.65      0.75      0.75      0.75

To choose a particular probability to compare the performance of the models, we selected the one for which the model obtained the best ERDE considering o = 50 (ERDE_50). We decided to use this metric because we knew in advance that the results in the pilot task would be evaluated with ERDE, but we did not know the specific o value. In this context, we considered that ERDE_50 would be more important/informative than the ERDE_5 error.

Table 4 shows the comparison of performance of all models. As we can observe, the combined methods obtained the best values in all the metrics except for the ERDE_5 error, for which the TVT-NB model obtained the best value. It is interesting to note that, beyond the fact that TVT-NB obtained a good ERDE_5 value, in the other metrics it was slightly worse than the combined methods. These results confirm the adequateness of the combined approach for the ERD task over the other individual methods.

Table 4. Performance comparison of all models (TR_DS-test set).

                          ERDE_5   ERDE_50   F_1    π      ρ
Combined methods          15.36     9.16     0.59   0.48   0.75
J48                       17.01    12.94     0.40   0.33   0.50
BoW (p ≥ 0.8)             21.05    18.13     0.24   0.14   0.75
TVT-NB (p = 1)            14.13    11.25     0.40   0.47   0.35
TVT-RF (p ≥ 0.6)          17.14    15.79     0.22   0.43   0.14

4 Test Stage

The previous results were obtained by training the classifiers with the TR_DS-train data set and testing them with the TR_DS-test data set. In this section, we show the results obtained by the combined methods trained with the full training set of the pilot task (TR_DS) and tested with TE_DS, which was incrementally released during the testing phase of the pilot task.

In Table 5 we show the three submissions that obtained the best ERDE_5, ERDE_50 and F_1 values in the eRisk pilot task, as reported in [9]. There, we can observe that our combined methods (UNSLA) obtained the best ERDE_50 value. On the other hand, FHDO-BCSGA obtained the best F-measure, with an ERDE_50 value slightly worse than our approach. FHDO-BCSGB obtained the best ERDE_5 and the worst F_1 of the three approaches. With these results, we can conclude that our proposal is a competitive approach for ERD tasks.

Table 5. Best entries in the ranking of the pilot task (TE_DS set).

                              ERDE_5   ERDE_50   F_1    π      ρ
Combined methods (UNSLA)      13.66     9.68     0.59   0.48   0.79
FHDO-BCSGA                    12.82     9.69     0.64   0.61   0.67
FHDO-BCSGB                    12.70    10.39     0.55   0.69   0.46

5 TVT Working Alone on the Test Set

Our ERD system tested on the pilot task was derived from our analysis of the weaknesses and strengths of the TVT method on the training data. However, once the golden-truth information of the TE_DS set was made available by the organizers, we could analyze what would have been the performance of the models, in particular the TVT method working alone, if different probabilities had been selected.

Table 6 shows this type of information by reporting the results obtained with the TVT representation, using Naïve Bayes and Random Forest as learning algorithms. Besides, different probability values were tested for the dynamic approaches to the CTD aspect. The obtained results are conclusive in this case. TVT shows a high robustness in the ERDE measures, independently of the algorithm used to learn the model and the probability threshold. Most of TVT's ERDE_5 values were low and, in 7 out of 10 settings, the ERDE_50 values were lower than the best one reported in the pilot task (the combined methods (UNSLA) with 9.68). In this context, TVT achieves the best ERDE_5 value reported up to now (12.30) with the setting TVT-RF (p ≥ 0.8), and the lowest ERDE_50 value (8.17) with the model TVT-NB (p ≥ 0.8). The performance of the TVT methods on the test set was surprising for us because on the training stage we could not obtain such good results. This makes us conclude that if the TVT method had participated alone in the pilot task, it would have obtained similar or better results than the ones obtained with the combined methods.

Table 6. TVT's performance on the TE_DS set.

                    ERDE_5   ERDE_50   F_1    π      ρ
TVT-NB (p ≥ 0.6)    13.59     8.40     0.50   0.37   0.75
TVT-NB (p ≥ 0.7)    13.43     8.24     0.51   0.39   0.75
TVT-NB (p ≥ 0.8)    13.13     8.17     0.54   0.42   0.73
TVT-NB (p ≥ 0.9)    13.07     8.35     0.52   0.42   0.69
TVT-NB (p = 1)      12.38     9.84     0.42   0.50   0.37
TVT-RF (p ≥ 0.6)    12.46     8.37     0.55   0.49   0.63
TVT-RF (p ≥ 0.7)    12.49     8.52     0.55   0.50   0.62
TVT-RF (p ≥ 0.8)    12.30     8.95     0.56   0.54   0.58
TVT-RF (p ≥ 0.9)    12.34    10.28     0.47   0.55   0.40
TVT-RF (p = 1)      12.82    11.82     0.20   0.67   0.12

6 Conclusions and future work

This article presents the participation of LIDIC - UNSL at the eRisk 2017 Pilot Task on Early Detection of Depression. We proposed an interesting representation, named temporal variation of terms (TVT), that explicitly considers the partial information available in the different chunks of the documents. Because the experiments on the training stage showed some weaknesses of TVT, we proposed a combined approach to assist TVT in some specific situations. The results obtained on the pilot task with our combined methods were good, obtaining the best ERDE_50 value over all participants. An interesting aspect was that the TVT representation alone obtained, on the test set, better results than the ones achieved by the combined approach.

As future work we plan to extend the analysis of the TVT representation to solve other ERD problems, such as the identification of sexual predators and people with suicidal tendencies.

References

1. M. R. Capecelatro, M. D. Sacchet, P. F. Hitchcock, S. M. Miller, and W. B. Britton. Major depression duration reduces appetitive word use: An elaborated verbal recall of emotional photographs. Journal of Psychiatric Research, 47(6):809-815, 2013.
2. M. De Choudhury, E. Kiciman, M. Dredze, G. Coppersmith, and M. Kumar. Discovering shifts to suicidal ideation from mental health content in social media. In Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, CHI '16, pages 2098-2110, New York, NY, USA, 2016. ACM.
3. M. L. Errecalde, M. P. Villegas, D. G. Funez, M. J. Garciarena Ucelay, and L. C. Cagnina. Temporal variation of terms as concept space for early risk prediction. In G. J. F. Jones, S. Lawless, J. Gonzalo, L. Kelly, L. Goeuriot, T. Mandl, L. Cappellato, and N. Ferro, editors, Experimental IR Meets Multilinguality, Multimodality, and Interaction, volume 10456. Springer, 2017.
4. A. Garcia-Caballero, J. Jiménez, M. Fernandez-Cabana, and I. García-Lado. Last words: an LIWC analysis of suicide notes from Spain. European Psychiatry, 27:1, 2012.
5. M. Koppel, J. Schler, and S. Argamon. Computational methods in authorship attribution. Journal of the American Society for Information Science and Technology, 60(1):9-26, 2009.
6. Z. Li, Z. Xiong, Y. Zhang, C. Liu, and K. Li. Fast text categorization using concise semantic analysis. Pattern Recognition Letters, 32(3):441-448, February 2011.
7. A. P. López-Monroy, M. Montes y Gómez, H. J. Escalante, L. Villaseñor-Pineda, and E. Stamatatos. Discriminative subprofile-specific representations for author profiling in social media. Knowledge-Based Systems, 89:134-147, 2015.
8. D. E. Losada and F. Crestani. A Test Collection for Research on Depression and Language Use, pages 28-39. Springer International Publishing, Cham, 2016.
9. D. E. Losada, F. Crestani, and J. Parapar. eRisk 2017: CLEF Lab on Early Risk Prediction on the Internet: Experimental foundations. In Proceedings Conference and Labs of the Evaluation Forum CLEF 2017, Dublin, Ireland, 2017.
10. J. W. Pennebaker, R. L. Boyd, K. Jordan, and K. Blackburn. The development and psychometric properties of LIWC2015. 2015.
11. J. W. Pennebaker, M. R. Mehl, and K. G. Niederhoffer. Psychological aspects of natural language use: Our words, our selves. Annual Review of Psychology, 54(1):547-577, 2003.
12. N. Ramirez-Esparza, C. K. Chung, E. Kacewicz, and J. W. Pennebaker. The psychology of word use in depression forums in English and in Spanish: Testing two text analytic approaches. In Proc. ICWSM 2008, 2008.
