Pilot task on Early Detetion of Depression
Notebook for eRisk at CLEF 2017
Ma.PaulaVillegas
1
, DarioG.Funez
1
,
Ma.JoséGariarenaUelay
1
,LetiiaC.Cagnina
1 , 2
,andMareloL.Errealde
1
1
LIDICResearhGroup,
UniversidadNaionaldeSanLuis,Argentina
2
ConsejoNaionaldeInvestigaionesCientíasyTénias(CONICET)
{villegasmariapaula74,funezda rio,m jgar iare naue lay} gmai l.o m
{lagnina,merrealde}gmail.o m
Abstrat. InthispaperwedesribethepartiipationoftheLIDICRe-
searhGroupofUniversidadNaionaldeSanLuis(UNSL)-Argentinaat
eRisk2017pilottask.Themaingoalofthistaskisonsideringearlyrisk
detetion senarios, depression inthis ase, where the issue of getting
timely preditions with a reasonable ondene level beomes ritial.
In the pilot task, systems mustbe able to sequentially proess piees
of evidene and detet early traes of depression as soon as possible.
Theused data set is a olletion of writings labeled as depressed or
non-depressed thatwerereleasedtothepilottaskpartiipantsintwo
dierent stages:training andtest.Ourproposalfor thistaskwas based
onasemantirepresentationofdoumentsthatexpliitly onsidersthe
partialinformationthatismadeavailableinthedierenthunkstothe
earlyriskdetetion systemsalongthe time.Thattemporal approah
was omplemented with other standard text ategorization models in
some spei situations that seemed not to be orretly addressed by
ourapproahalone.Intheteststage,theresultingsystemobtainedthe
lowest
ERDE 50 errorona total of30 submissionsfrom 8dierent in- stitutions.However,onethegolden-truth ofthetestingdatasetwas
released,weouldverifythatourtemporalapproahalonemighthave
obtained veryrobust results and the lowest reported
ERDE
error forboththresholdsusedinthepilottask.
Keywords:EarlyRiskDetetion, Unbalaned DataSets,TextRepre-
sentations,SemantiAnalysisTehniques.
1 Introdution
ThewidespreaduseofInternet,soialnetworksandotheromputertehnologies
hasmarkedthebeginningofaneweraintheommuniationamongpeople.In
thisontext,wehavenowavailablealot ofinformationthatmightbeusefulto
detet,inatimelyway,thosesituationsthatmightbepotentiallydangerousor
riskyforthephysial well-beingandhealthofindividuals,ommunitiesandso-
todepression,amongothers.
This type of situations have started to be studied in a new researh eld
known as early risk detetion (ERD) whih has reeived inreasingly interest
from sienti researhersat world level due to the important impat that it
ould have in many relevant and urrent problems of the real world. In this
ontext, thisyearwas organizedtherstearlyrisk preditiononfereneeRisk
2017 3
intheontextoftheCLEF2017Workshop.Aspartofthisevent,apilot
task was organized and the present artile desribes our partiipation in this
task.
TheeRisk 2017pilot task wasorganizedinto twodierentstages:training
andteststages.It alsoassumedanERDsenario,thatis,dataaresequentially
read as a stream and the hallenge onsists in deteting risk asesas soon as
possible.Inordertoreproduethissenario,theeRisk2017'sorganizersreleased
thetestdataset followingasequential,hunkbyhunk riterion;that is,the
rsthunkontainedtheoldest10%ofthemessages,theseondhunkontains
theseondoldest10%,andsoforthuptoomplete10hunksthatrepresentthe
fullwritingoftheanalysedindividuals.
To deal with this problem we implemented an original proposal that we
named temporal variation of terms (TVT) [3℄.However,at thetrainingstage,
TVT seemedto showsome weaknessat spei hunks sowedeidedto om-
plement itwith otherstandardmethods to help ourapproahin these spei
situations.Thus,ourimplementedERDsystemisinfataombinedsystemthat
strongly depends on TVTbut alsouses other methods asadditionalsoure of
opinions.Theresultingsystemhadaveryaeptableperformaneonthedata
setreleasedintheteststageobtainingthelowest
ERDE 50erroronatotalof30
submissionsfrom8dierentinstitutions.However,onethegolden-truthofthe
testdatasetwasreleased,weouldverifythat TVTalonemighthaveobtained
very robust results and the lowest reported
ERDE
error for both thresholdsused in thepilottask. Forthis reason,wealso inlude anadditionalsetionin
this work with preliminaryresults obtainedwith TVT aloneon thetest setin
orderto observethepotentialofthis methodforgeneralERD problems.
Therestoftheartileisorganizedasfollows:Setion2desribessomegeneral
aspetsofthedatasetusedinthepilottaskandthemethodsusedinourERD
system. Next, in Setion 3 the ativities arried out in the training stage are
desribedand thejustiationsof themain designdeisionsmadeonourERD
system, are presented. Setion 4 showsthe obtainedresults with our proposal
ontheeRisk2017datasetreleasedintheteststage.Then,someomplementary
results are presented in Setion 5 where interesting aspets are shown on the
performane of TVT working alone on the test set. Finally, Setion 6 depits
potentialfuture worksandtheobtainedonlusions.
3
2.1 Data Set
The dataset used in the pilottask of theeRisk ompetition 4
wasinitially de-
sribed in [8℄. It is a olletion of writings(posts or omments) from a set of
Soial Media users. There are two ategories of users, depressed and non-
depressed and,foreahuser,theolletionontainsasequeneofwritings(in
hronologial order). Foreah user, his olletionof writingshas beendivided
into 10 hunks. The rst hunk ontains the oldest 10% of the messages, the
seond hunkontainstheseondoldest 10%,and soforth.This olletionwas
split into atraining and atest set that we will refer as
T R DS and T E DS re-
spetively. The
T R DS set ontained 486users (83positive, 403negative) and
the(test)
T E DS set ontained401users(52 positive, 349negative).Theusers
labeledaspositivearethosethathaveexpliitlymentionedthattheyhavebeen
diagnosedwithdepression.
Thepilot task was divided by their organizers into a training stage and a
teststage.Intherstone,thepartiipatingteamshadaesstothe
T R DS data
setwithallhunksofalltrainingusers.Theyouldthereforetunetheirsystems
withthetrainingdata.Then,intheteststage,the10hunksofthe
T E DS data
setweregraduallyreleasedbytheorganizersonebyoneuntilompletingallthe
hunks that orrespond to theompletewritingsof theonsidered individuals.
Eah time that a hunk
ch i was released, partiipants in the pilot task were
askedtogivetheirpreditionsontheusersontainedintheT E DS,basedonthe
partialinformationread fromhunks
ch 1 toch i.
2.2 Methods
In order to dealwith theproblem posed in thepilot task we proposed a new
doumentrepresentationnamedtemporalvariation ofterms (TVT)[3℄.TVTis
anelaborateapproahthatrequiresenoughspaetobedesribedinanadequate
way.Forthisreason,inthepresentworkwewillonlyfousonthedeisionaspets
that wereonsidered at thepilot task toombine TVTwith otherwell-known
doument representation approaheslike Bag of Words (BoW). Thus, we will
give below a separate desription of the method with its main harateristis
but theinterestreaderanobtainadetaileddesriptionofTVTin[3℄.
Weusedinourexperimentsdierentdoumentrepresentationsandlearning
algorithms. Therefore, we start giving some general ideas of those doument
representationsandalittlemoredetailedexplanationofTVT.Afterthat,some
generalonsiderationsontheused learningalgorithmsarebrieyintrodued.
Doumentrepresentations
4
thelanguagemodelsmostusedin textategorizationtasks.InBoW repre-
sentations, features arewordsand doumentsare simply treatedasolle-
tionsofunorderedwords.Formally,adoument
d
isrepresentedbythevetor ofweightsd BoW = (w 1 , w 2 , ..., w n )
wheren
is thesizeof thevoabulary ofthedataset.Eahweight
w i isavaluethatisassignedtoeahfeature(word)
aordingtowhetherthewordappearsinadoumentorhowfrequentlythis
word appears. This popularrepresentation is simple to implement, fast to
obtainand an be used under dierent weighting shemes (boolean, term
frequeny(
tf
)orterm frequeny- inversedoumentfrequeny(tf − idf
)).InourstudyweusedBoWwiththebooleanweightingsheme.
Conise Semanti Analysis (or Seond Order Attributes (SOA)). Conise
SemantiAnalysisisasemantianalysistehniquethatinterpretswordsand
textfragmentsinaspaeofoneptsthatarelose(orequal)totheategory
labels.Forinstane,ifdoumentsinthedatasetarelabeledwith
q
dierentategorylabels(usuallynomorethan100elements),wordsanddouments
willberepresentedina
q
-dimensionalspae.Thatspaesizeisusuallymuh smallerthan standardBoW representations whih diretly depend on thevoabulary size (more than 10000 or20000elements in general). CSA has
been usedin generaltext ategorizationtasks[6℄ and hasbeen adapted to
work in author proling tasksunder the nameof SeondOrder Attributes
(SOA)[7℄.
Charater3-grams. A
n
-gram is asequene ofn
haraters obtainedfromeah text in the dataset.In the
n
-grammodel, theditionary ontainsalln
-gramsthatourinanyterminthevoabulary.Therepresentationsusingharater
n
-grams havedemonstrated to be eetive in many appliations [5℄wheren
-grams areonsideredthetermsusedin BoWrepresentations.
LIW C
-based features. Featuresderivedfrom Linguisti Inquiry and WordCount (LIWC)[11,10℄ havebeenused in severalstudies related to psyho-
logialaspets of individuals. LIWC has been suessfullyused to identify
depressedand non-depressedpeople analyzinglinguistimarkersofdepres-
sionsuhastheuseofthepersonalpronounsandpositive-negativeemotions
[12℄andthepreseneofwordsrelatedto thedeath(e.g.,dead,kill,sui-
ide),sex(e.g., arouse, makeout, orgasm)and ingestion (e.g., hew,
drink,hunger)besidesemotionsalso resultinguseful[1℄. Studiesonsui-
idalindividuals haveinorporatedLIWCasa valuabletool to extratin-
formationrelated to suiide and suiidal ideation analyzing the ategories
death,health,sad,they,I,sexual,ller,swear,anger,andnegativeemotions
[2,4℄.Due to the fat wewantedto onsidermoremeaningful features, we
alsoonsideredinpreliminarystudiesthemostinformativefeaturesbelong-
ing to linguisti dimensions (for example, personal pronouns and number
offuntionwords),summary oflanguagevariables(forexample,ditionary
words and words with more than 6letters), psyhologialproess (forex-
ample,negativeemotionsandaetiveproesses)and,grammar(verbsand
(TVT) method [3℄ is an approah for early risk detetion based on using the
variationofvoabularyalongthedierenttimestepsasoneptspaefordou-
mentrepresentation.Thismethodis thekeyomponentofourEDSsystemfor
theeRiskpilottask.
TVT is based on the onise semanti analysis (CSA) tehnique proposed
in [6℄ and later extended in [7℄ forauthor proling tasks. In this ontext, the
underlyingideaisthatvariationsofthetermsusedindierentsequentialstages
ofthedoumentsmayhaverelevantinformationforthelassiationtask.With
this idea in mind, this method enrihes the douments of the minority lass
with the partial douments read in the rst hunks.These rsthunks of the
minority lass, along with theiromplete douments, are onsidered asa new
oneptspaeforaCSAmethod.
TVTnaturallyopeswiththesequentialarateristisofERDproblemsand
alsogivesatoolfordealingwithunbalaneddatasets.Preliminaryresultsofthis
methodinomparisontoCSAandBoWrepresentations[3℄showeditspotential
todealwithERDproblems.
Learning algorithms Inpreliminarystudieswetesteddierentlearningalgo-
rithmsandweobtainedthebestresultswithRandomForestandNaïveBayesin
severalomparisonswithotherpopularmethodslikeLIBSVM 5
.Wealsousedin
oursystem,adeisiontree(Weka'sJ48)obtainedbyrstseletingthe100words
withthehighestinformationgainandthenremovingfromthat listthosewords
that were onsidered dependent of spei domains (names of politiians like
Obama andountrieslikeChina).TheJ48algorithmtrainedonthissubset
offeaturesobtainedadeisiontreeofonly39nodesontainingsomeinteresting
wordslikemeds,depression,therapist andry,amongothers.As wewill
seelater,this deisiontreewasusedto assisttotheTVTmethodin theinitial
hunks.
EventhoughwearriedoutseveralomparativestudieswithLIWCfeatures,
harater3-grams,CSArepresentationsandtheLIBSVMalgorithm,wedeided
toseletonlythoseapproahesthatseemedtobeeetivetoassistTVTinsome
speisituations.Forthisreason,inourERDsystemweonlyusedtheRandom
ForestandNaïveBayesalgorithmswithBoWandTVTrepresentationsandthe
J48deisiontreepreviouslyexplained.
3 Training Stage
The pilot task was divided into a training stage and a test stage. So, in the
rst one,the partiipatingteams had aess to the
T R DS set with allhunks
ofalltrainingusers.Therefore,weouldtunein thisstageoursystemwiththe
trainingdata.
5
For all the testedalgorithms we used the implementations provided by the Weka
severalpreliminarystudiestogivebetterresultsthanBoWandCSA(see[3℄for
adetailedstudy),welakedofarobustriteriontosolvearitialaspetinERD
problemsthatwewillreferasthelassiationtimedeision(CTD)issue.That
is, an ERD system needsto onsider not only whih lass should be assigned
to adoument; italsoneedstodeidewhen to makethat assignment.Inother
words,ERDmethodsmusthaveariteriontodeidewhen (inwhatsituations)
thelassiationgeneratedbythesystemisonsideredthenal/denitivedei-
sionon theevaluatedinstanes. Although thisaspethasbeenaddressedwith
verysimpleheuristi rules 6
, we wantedto have arule for the CTD issue that
attempted to exploit thestrengths oftheTVT method andsimultaneouslyal-
leviate its weaknesses. With this goalin mind, webegan athorough study to
determineinwhihsituationstheTVTmethodbehavedsatisfatorilyandwhen
itneededtobeimproved.
Westartedourstudybydividingthetrainingset
T R DS intoanewtraining
set that we will refer as
T R DS − train
and a test set namedT R DS − test
.Those sets maintained the same proportions of post per user and words per
user as desribed in [8℄.
T R DS − train
andT R DS − test
were generated byrandomlyseletingarounda70%of writingsfortherstone andtherest 30%
fortheseondone.Thus,
T R DS − train
resultedin351individuals(63positive, 288negative)meanwhileT R DS − test
ontains135individuals(20positive,115 negative).Thedivisioninto10hunksofthewritingsprovidedbytheorganizerswaskeptforbotholletions.
Then,weanalyzedtheperformaneoftheTVTmethodbyusingthe
T R DS − train
setastrainingsetandevaluatingtheobtainedresultswithT R DS − test
.Westartedassumingthatthelassiationismadeonastatihunkbyhunk
basis.Thatis,foreahhunk
C ˆ i providedtotheERDsystemsweevaluatedthe
TVT'sperformaneonsideringthatthemodelis(simultaneously)appliedtothe
writingsreeiveduptothehunk
C ˆ i.Withthistypeofstudyitwaspossibleto
observetowhatextenttheTVTmethodwasrobusttothepartialinformationin
thedierentstages,inwhihmomentitstartedtoobtainaeptableresults,and
other interestingstatistis.Inthis ontext, weonsideredthat the
F 1 measure
whihombines preisionandreallwouldbeavaluablemeasureto gainsome
insightintotheperformaneoftheTVTmethodalongthedierenthunks.
Figure1showsthistypeofinformationwhereweanseethatTVTrepresen-
tationswith twodierentlearningalgorithms(NaïveBayes(NB) andRandom
Forest(RF))reahedthehighest
F 1valuesinthehunks3and4.Inbothases,
those values were obtained when an instane was lassied as positive if the
probabilityassignedbythelassierwasgreaterorequalthan0.6(
p ≥ 0.6
).Be-sides, we ouldobservethat in the rsthunk,TVTrepresentations produed
a high numberof false positives.That weakness of TVT methods is shown in
Figure2whereweanseethelowestpreisionvaluesinthehunk1.Thatprob-
lem ofTVTmethodslooksreasonableifweonsiderthatTVTis basedonthe
6
Forinstane,exeedingaspeiondenethresholdinthepreditionofthelas-
hunks, where little information is available,TVT will need to be assisted by
otheromplementarymethods.
Fig.1.
F 1 valuesobtainedwithTVTmodels.
Fig.2.PreisionvaluesobtainedwithTVTmodels.
Due to the fat that we wanted to fous our preditions on those hunks
wherebestresultswereobtained(hunks3and4)andassumingthatafterthose
hunksthepenalizationomponentinthe
ERDE
omputationwouldaettheTVT's resultswedeided to set somehunk by hunk rulesthat aomplish
ertainbasipropertiesinthedierenthunks:
1. Chunk1: HeretheERD systemshould beextremelyonservativeandonly
lassifyinganinstaneaspositive(depressed)ifthereexistsstrongevidene
of that.Weused for this ase, theriterion of only lassifyingan instane
as positive if both models obtainedwith TVT-NB and TVT-RFlassied
the instane as positive with probability
p = 1
and the text inluded allthewordsinawhitelist. Thatlistwasobtainedfrom thewordswiththe
highestinformationgainofthedoumentsofthersthunk.Itinludedthe
wordsdepression anddiagnosed.
2. Chunk2: Heretherestritionofthewhitelist ouldberelaxedandom-
plementtheTVTmethodswith amoregeneralapproah. Forthisend, we
usedthepreditionsofthe J48deisiontreeexplainedaboveandlassied
aninstane aspositiveifthethreelassierslassieditinthat way.
3. Chunk 3: In this hunk mostof the lassiers obtainedthe best preision
values. However,they obtained low reall values. In order to address this
aspet,aninstanewaslassiedaspositiveifatleasttwolassierslassify
itas positivewithprobability
p ≥ 0.9
.preision.Wealso ouldobservethat whenit wasombinedwiththeTVT
method,thislastmethodplayedaroleoflterobtaininggoodresultswhen
bothmethodslassiedaninstaneaspositive.Ontheother hand,many
instanes not deteted for this ombination were lassied as positive by
theJ48method. Forthisreason,anyinstanelassiedaspositivebyboth,
BoWandTVTmethodswithprobability
p ≥ 0.7
or bytheJ48methodwaslassiedaspositive.
5. Chunks5to10:Fromhunk5forward,weassumedthatmostoftherelevant
lassiationshad alreadybeenmadein the previoushunks.However,we
kept a monitoring system to identify those ases that needed muh more
additionalinformationtodeterminewhetheranindividualwasdepressiveor
not.Forthis purpose,weused thesameruleof hunk4forhunks 5and6
butinreasingtheprobabilityofBoW andTVTto
p ≥ 0.8
.Fromhunks7to10aninstanewaslassiedaspositiveifatleasttwolassierslassied
itas positivewithprobability
p ≥ 0.9
.Withthose empiriallyderivedrulesasgeneralriterionfortheCTDissue,
wethen analyzed ifthe TVT method ombined with the other approahesin
spei situations obtained abetter performane than the individual methods
workingseparately.Theresultsobtainedfromthisanalysisarepresentedbelow.
3.1 Analysis of results
In order to study the performane of our ombined approah against eah
method testedseparately,weused
T R DS − train
andT R DS − test
sets intheexperiments. Individual methods also need a riterion to deal with the CTD
issue. In this ase, we an diretly use the probability (or some measure of
ondene)assignedbythelassiertodeidewhentostopreadingadoument
and givingits lassiation.That approah,that in [8℄is referredasdynami,
onlyonsidersthatthisprobabilityexeedssomepartiularthresholdtolassify
theinstane/individualaspositive.Inourstudies,wealsoused,fortheindividual
methods,thisdynamiapproahbyonsideringdierentprobabilityvalues:
p = 1
,p ≥ 0.9
,p ≥ 0.8
,p ≥ 0.7
andp ≥ 0.6
.Tables 1, 2 and 3 report the values of preision (
π
), reall (ρ
) andF 1-
measure (
F 1) of the target (depressed) lass for eah onsidered individual model anddierentprobabilities.Statistisalsoinludetheearly riskdetetion
error (ERDE)measureproposedin[8℄withtwovaluesoftheparameter
o
usedin thepilottask:
o = 5
(ERDE 5)and o = 50
(ERDE 50).
As we an see from Table 1 themodel TVT-NBobtained the best perfor-
manewhenthelassierassignedtothepositivelass(depressed),theinstanes
withprobability1.ForTVT-RFandBoWthestatistisaredierent.InTable2
thebest
ERDE 50(andF 1measure)wasobtainedwhenRandomForestlassier
usedthelowestprobability(
p ≥ 0.6
)butatthesametimewiththisprobability theERDE 5 erroris the worst value. The BoW model (Table3) obtainedthe
best
ERDE 50valueonsideringp ≥ 0.8
andwiththisongurationthebestF 1,
π
andρ
areobtained.TheJ48modelassignstheinstanesto thepositivelassusingprobability1,thenitsperformaneisdiretlyshownintheomparisonin
Table4.
Table1.PerformaneofTVT-NBmodel(
T R DS − test
set).p = 1 p ≥ 0 . 9 p ≥ 0 . 8 p ≥ 0 . 7 p ≥ 0 . 6 ERDE 5 14.13 16.66 16.68 16.82 16.85
ERDE 5011.25 15.34 15.21 15.24 15.07
F 1 0.40 0.27 0.28 0.27 0.28
π
0.47 0.50 0.48 0.43 0.44ρ
0.35 0.18 0.19 0.19 0.20Table 2.PerformaneofTVT-RFmodel(
T R DS − test
set).p = 1 p ≥ 0 . 9 p ≥ 0 . 8 p ≥ 0 . 7 p ≥ 0 . 6 ERDE 5 16.92 16.85 16.92 16.96 17.14
ERDE 5016.43 16.05 16.12 16.02 15.79
F 1 0.11 0.15 0.16 0.21 0.22
π
0.50 0.54 0.50 0.50 0.43ρ
0.06 0.08 0.10 0.13 0.14Table3. PerformaneofBoWmodel(
T R DS − test
set).p = 1 p ≥ 0 . 9 p ≥ 0 . 8 p ≥ 0 . 7 p ≥ 0 . 6 ERDE 5 20.79 20.83 21.05 21.16 21.27
ERDE 50 18.82 18.66 18.13 18.24 18.35
F 1 0.19 0.21 0.24 0.24 0.23
π
0.11 0.13 0.14 0.14 0.14
ρ
0.5 0.65 0.75 0.75 0.75
Tohoieapartiularprobabilitytoomparetheperformaneofthemodels,
weseleted the onefor whih themodel obtainedthe best ERDE onsidering
o = 50
(ERDE 50).Wedeided touse thismetribeauseweknewin advane
thattheresultsinthepilottaskwouldbeevaluatedwithERDE butwedidnot
knowthe spei
o
value.In this ontext, weonsidered thatERDE 50 would
bemoreimportant/informativethanthe
ERDE 5 error.
Table4showstheomparison of performane ofall models. Asweanob-
exeptforthe
ERDE 5errorforwhihTVT-NBmodelobtainedthebestvalue.
It is interesting to note that beyond that TVT-NB obtained a good
ERDE 5
value, in the other metris it was slightly worse than the ombined methods.
These resultsonrm theadequatenessof theombinedapproahfortheERD
taskovertheotherindividualmethods.
Table4.Performane omparisonofall models(
T R DS − test
set).ERDE 5 ERDE 50 F 1 π ρ
Combinedmethods 15.36 9.16 0.590.480.75
J48 17.01 12.94 0.4 0.33 0.5
BoW(
p ≥ 0 . 8
) 21.05 18.13 0.24 0.14 0.75 TVT-NB(p = 1
) 14.13 11.25 0.4 0.47 0.35 TVT-RF(p ≥ 0 . 6
) 17.14 15.79 0.22 0.43 0.144 Test Stage
Thepreviousresultswereobtainedbytrainingthelassierswiththe
T R DS − train
data set and testing them with theT R DS − test
data set. In this sub-setion, we show the results obtained by the ombined methods trained with
the fulltrainingset of the pilottask (
T R DS)and tested withT E DS that was
inrementallyreleasedduringthetestingphaseofthepilottask.
InTable5weshow thethree submissionsthat obtainedthe best
ERDE 5,
ERDE 50 and F 1 values in the eRisk pilot task as reported in[9℄. There, we
an observethatourombinedmethods (
U N SLA
)obtainedthebestERDE 50
value.On theotherhand,FHDO-BCSGAobtainedthebest F-measurewitha
ERDE 50 valueslightlyworsethanourapproah. FHDO-BCSGBobtainedthe
best
ERDE 5 and theworst F 1 overthethree approahes. With these results,
weanonludethatourproposalisaompetitiveapproahforERDtasks.
Table 5.Bestintherankingofthepilottask(
T E DSset).
ERDE 5 ERDE 50 F 1 π ρ
Combinedmethods(UNSLA) 13.66 9.68 0.59 0.480.79
FHDO-BCSGA 12.82 9.69 0.640.610.67
FHDO-BCSGB 12.70 10.39 0.55 0.690.46
OurERD systemtested onthepilot taskwasderivedfrom ouranalysis ofthe
weakness and strengths of the TVT method on the training data. However,
one the golden-truth information of the
T E DS set was made available by
theorganizers,weould analyzewhat would havebeentheperformaneofthe
models, in partiular the TVTmethod, working aloneif dierentprobabilities
hadbeenseleted.
Table 6 shows this type of information by reporting the results obtained
withTVTrepresentationandusingNaïveBayesandRandomForestaslearning
algorithms. Besides, dierent probability values were tested for the dynami
approahestotheCTDaspet.Theobtainedresultsareonlusiveinthisase.
TVT shows a high robustness in the
ERDE
measures independently of the algorithm used to learnthe model, and the probability threshold. Mostof theTVT's
ERDE 5valueswerelowandin7outof10settings,theERDE 50values
werelowestthanthebestonereportedinthepilottask(theombinedmethods
(
U N SLA
) with 9.68). In this ontext, TVT ahieves the bestERDE 5 value
reported uptonow(12.30)with thesettingTVT-RF (
p ≥ 0.8
)and thelowestERDE 50 value(8.17)with themodelTVT-NB(p ≥ 0.8
).The performane of
theTVTmethods onthe testsetwassurprising forusbeauseon thetraining
stage weouldnotobtainsuh asgood results.Thismakesus onludethatif
theTVTmethodhadpartiipatedaloneinthepilottask,ithadobtainedsimilar
orbetterresultsthantheonesobtainedwiththeombinedmethods.
Table6.TVT's performaneonthe
T E DS set.
ERDE 5 ERDE 50 F 1 π ρ
TVT-NB(
p ≥ 0 . 6
) 13.59 8.40 0.50 0.37 0.75 TVT-NB(p ≥ 0 . 7
) 13.43 8.24 0.51 0.39 0.75 TVT-NB(p ≥ 0 . 8
) 13.13 8.17 0.54 0.42 0.73 TVT-NB(p ≥ 0 . 9
) 13.07 8.35 0.52 0.42 0.69 TVT-NB(p = 1
) 12.38 9.84 0.42 0.50 0.37 TVT-RF(p ≥ 0 . 6
) 12.46 8.37 0.55 0.49 0.63 TVT-RF(p ≥ 0 . 7
) 12.49 8.52 0.55 0.50 0.62 TVT-RF(p ≥ 0 . 8
) 12.30 8.95 0.56 0.54 0.58 TVT-RF(p ≥ 0 . 9
) 12.34 10.28 0.47 0.55 0.40 TVT-RF(p = 1
) 12.82 11.82 0.20 0.67 0.126 Conlusions and future work
This artile presents the partiipation of LIDIC - UNSL at eRisk 2017 Pilot
task onEarlyDetetionof Depression.Weproposed aninterestingrepresenta-
of thedouments.Beause theexperimentsonthe trainingstageshowed some
weakness of TVT, we proposed a ombined approah to assist TVT in some
spei situations. The results obtained on the pilot task with our ombined
methodsweregood,obtainingthebest
ERDE 50valueoverallpartiipants.An interesting aspet wasthat the TVTrepresentation alone on the test set,ob-
tainedbetterresultsthantheones ahievedbytheombinedapproah.
As future workweplan to extendthe analysis ofTVT representationto solve
other ERD problems suh as theidentiation of sexual predators and people
withsuiidetendeny.
Referenes
1. M.R.Capeelatro,M.D.Sahet,P.F.Hithok,S.M.Miller,andW.B.Britton.
Majordepressiondurationreduesappetitiveworduse:Anelaboratedverbalreall
ofemotionalphotographs. JournalofPsyhiatriResearh,47(6):809815,2013.
2. M.DeChoudhury,E.Kiiman,M.Dredze,G.Coppersmith,andM.Kumar.Dis-
overingshiftstosuiidalideationfrommentalhealthontentinsoialmedia. In
Proeedingsofthe2016CHIConfereneonHumanFatorsinComputingSystems,
CHI'16,pages20982110, NewYork,NY,USA,2016.ACM.
3. M.L. Errealde,M.P.Villegas,D.G.Funez,M.J.GariarenaUelay,andL.C.
Cagnina.TemporalVariationofTermsasoneptspaeforearlyriskpredition.In
G.J.F.Jones,S.Lawless,J.Gonzalo,L.Kelly,L.Goeuriot,T.Mandl,L.Cappel-
lato,andN.Ferro,editors,ExperimentalIR MeetsMultilinguality,Multimodality,
andInteration.,volume10456.Springer,2017.
4. A.Garia-Caballero,J.Jiménez,M.Fernandez-Cabana,andI.Garía-Lado. Last
words:anliwanalysisofsuiidenotesfromspain. EuropeanPsyh.,27:1,2012.
5. M.Koppel,J.Shler,andS.Argamon.Computationalmethodsinauthorshipattri-
bution. JournaloftheAmerianSoietyforInformationSiene andTehnology,
60(1):926, 2009.
6. Z.Li,Z.Xiong,Y.Zhang,C.Liu,andK.Li. Fasttextategorizationusingonise
semantianalysis. PatternReogn. Lett.,32(3):441448,February2011.
7. A.P.López-Monroy,M. MontesyGómez,H. J.Esalante, L.Villaseñor-Pineda,
and Efstathios Stamatatos. Disriminative subprole-spei representations for
authorprolinginsoialmedia. Knowledge-Based Systems,89:134147,2015.
8. D.E.Losada andF.Crestani. A TestColletionforResearhonDepressionand
Language Use,pages2839. SpringerInternationalPublishing,Cham,2016.
9. D.E.Losada,F.Crestani,andJ.Parapar. eRISK2017:CLEFLabonEarlyRisk
Predition ontheInternet:Experimentalfoundations. InProeedingsConferene
andLabsoftheEvaluationForumCLEF2017,Dublin,Ireland,2017.
10. J. W.Pennebaker, R.L. Boyd, K.Jordan,and K.Blakburn. Thedevelopment
andpsyhometripropertiesofliw2015. 2015.
11. J.W.Pennebaker,M.R.Mehl,andK.G.Niederhoer.Psyhologialaspetsofnat-
urallanguageuse:Ourwords,ourselves. Annualreviewofpsyhology,54(1):547
577, 2003.
12. N. Ramirez-Esparza, C. K. Chung, E. Kaewiz, and J. W. Pennebaker. The
psyhologyofworduseindepressionforumsinenglishandinspanish:Testingtwo
textanalytiapproahes. InPro.ICWSM2008,2008.