FIDJI @ ResPubliQA 2010

(1)

XavierTannier,VéroniqueMorieau

LIMSI-CNRS

Univ.Paris-Sud,Orsay,Frane

xtannier, morieaulimsi.fr

Abstrat. Inthispaper,wepresenttheresultsobtainedbythesys-

tem FIDJIfor bothFrenh and English monolingual evaluations,

at ResPubliQA 2010 ampaign. In this ampaign, we foused on

arryingonourevaluationsonerningtheontributionofoursyn-

tatimoduleswiththisspei olletion.

1 Introdution

FIDJI(FindingInDoumentsJustiationsandInferenes)isanopen-domain

question-answering(QA) system for Frenh [1℄ and, more reently, English. It

ombines syntatiinformation withtraditional QAtehniques suh asnamed

entityreognitionandtermweightingin ordertovalidate answersthroughdif-

ferentdouments.

This paper fouses on the results obtained by FIDJI at ResPubliQA 2010

evaluation.Itpresentsrstabriefoverviewofthesystemandofitsadaptation

to English.Then, thespei hoiesmade fortheampaign are detailed,and

someresultsarenallygiven.

2 FIDJI

Figure 1 presents the arhiteture of FIDJI. The system relies on a syntati

analysis and named entity tagging of thequestion and of alimited number of

douments foreah question.This analysisis performedby theparser XIP[2℄

enrihed withsomeadditionalspei rules.

ThedoumentolletionisindexedbythesearhengineLuene 1

.Theindex

ontainsrawtextonly.First,thesystemanalysesthequestionandsubmitsthe

keywordsofthequestiontoLuene(moduleA):therst15doumentsarethen

proessed(module B).Wedeidedtoredue thenumberofdoumentsbeause

theyareratherlongandtheirparsingwouldtaketoomuhtime.Thereasonwe

perform this analysis online is that we aim at avoidingasmuh preproessing

aspossible(thesystemisdesignedtoexploreWebolletions[1℄).Amongthese

douments,FIDJIlooksforsentenesontainingthehighestnumberofsyntati

(2)

sentenes(module D1)and theanswertype, whenspeiedin the question,is

validated(moduleE).

ThemainobjetiveofFIDJIistoprodueanswerswhiharefullyvalidated

byasupportingtext(orpassage)withrespettoagivenquestion.Thediulty

isthat ananswer(orsomepieesofinformationomposingananswer)maybe

validatedbyseveraldouments.

Our approah onsists in heking if all the harateristis of a question

(namelythedependenyrelationsandtheanswertype)mayberetrievedinone

orseveraldouments.Inthisontext,FIDJIhastodetetsyntatiimpliations

betweenquestionsandpassagesontainingtheanswersandtovalidatethetype

ofthepotentialanswerin thispassageorin anotherdoument.

Sinethelastevaluationampaign in2009,FIDJIhasbeenadaptedto En-

glish.Speiruleshavebeendeveloppedforquestionanalysis(moduleA)and

doumentproessing(module B).The othermodules areommonto bothEn-

glishandFrenh.

The following examples illustrate how FIDJI extrats answers, and more

detailsonerningthesystemanbefoundin [1℄.

1

(3)

Question analysis provides lemmatisation, POS tagging and dependeny rela-

tions,aswellasthequestiontypeandtheexpetedanswertype.Forexample:

Question:Quelpremierministres'est suiidéen1993?

(Whih PrimeMinister ommittedsuiide in1993?)

Dependenies:DATE(1993)

PERSON(ANSWER)

SUBJ(se suiider, ANSWER)

attribut(ANSWER, ministre)

attribut(ministre, premier)

Questiontype:fatoid

Expetedanswertype:person(spei answertype:primeminister)

Thequestionisturned into adelarativesentene wherethe answerisrep-

resentedby the`ANSWER' lemma. The followingsentene isseleted beause

itontainsthehighestnumberofdependenyrelations:

Pierre Bérégovoys'estsuiidé en 1993.

(Pierre Bérégovoyommittedsuiide in1993.)

Dependenies:

DATE(1993)

PERSON(Pierre Bérégovoy)

SUBJ(se suiider, Pierre Bérégovoy)

PierreBérégovoyinstantiatestheANSWERslotofthequestiondependenies

andbeomesaandidateanswer.Thenamedentitytype(person)andtherst

threedependeniesofthequestionarevalidatedinthissentene.Inordertofully

validate the andidate answer, the system searhes the missing dependenies

(attribut(Pierre Bérégovoy, ministre) and attribut(ministre, premier) ) in

asinglesentene ofthewhole doumentolletion.Thesedependenieswill be

found in any sentene speaking about le premier ministre Pierre Bérégovoy

(PrimeMinister Pierre Bérégovoy)andtheanswerwill bevalidated.

2.2 Example2

Foromplexquestions,it isobviousthatanswersarenotalwaysshortphrases.

For this reason, FIDJI provides a full passage as an answer. On these kinds

of questions,the systembehavesasalassialpassageretrievalsystem,exept

that andidate passages are retrievedthrough syntatirelations and relevant

disoursemarkers(about100nouns,verbs,prepositionsandadjetives,manually

ompiled)insteadofkeywordsonly.Here isanexampleofaomplexquestion:

Question:Whyistheskyblue?

(4)

Expetedanswertype:reason 2

Thefollowingpassageis seletedbeauseit ontainsallthedependenyre-

lationsofthequestionandaausalmarker:

Andif the skyisblue, itisbeause of Rayleigh sattering...

attribut(sky, blue)

VMOD(be, sattering)

PREPOBJ(sattering, beause of)

...

3 ResPubliQA'10 experiments

In2009,ResPubliQAresultslearnedusalot aboutthebehaviorofoursystem.

Other evaluations (former CLEF and Quaero ampaigns) had shown that

using syntati analysis modules for retrieving douments and extrating the

answerssigniantlyimprovedtheresults[1℄.However,withResPubliQA eval-

uationset,passageextrationturnedouttobemuhbetterbyreplaingsyntax

bytraditional bag-of-wordstehniques [3℄.Thisis donebyturning omodules

C1and D1in Figure1.

Passageextrationisthenperformedbyalassialseletionofsenteneson-

tainingamaximumofquestionsigniantkeywords(moduleC2),andanswerex-

trationisahievedwithoutslotinstantiationwithindependenies(moduleD2).

ThenewguidelinesinResPubliQA2010oeredusthepossibilitytoarryon

ourexperimentsin thisway.Indeed,twodierenttaskswereallowedthisyear:

Paragraphseletion(PS),similarto2009task,whereonlythefullparagraph

ontainingtheexatanswerweretobereturned.Passagesarenotindenite

partsoftexts oflimitedlength,but predenedparagraphsidentiedin the

orpusbyXMLtags<p>.

Answer seletion(AS), loserto traditional QAtasks, where systemswere

requiredtodemaratealsotheexatanswer,supportedbyafullparagraph.

Inthis latter task, judged answersan be INEXACT (good support but

bad boundaries for short answer), MISSED (good support but wrong short

answer),RIGHT (goodsupport andgoodanswer)orWRONG.

Tworuns perlanguagewere allowed.Inorderto ontinuetestingourplug/

unplugstrategies,andtoexperimentthemforthersttimeinEnglish,wehose

thefollowingproedureforourtworuns:

2

Reason is not anamed entity, as person inthe rst example, but this answer

typepointsoutthatatextexpliitelyexplainingareasonshouldbeprefered(inour

(5)

passageretrieval,that hadthebest resultsofthesystemlast year.

2. AStask,syntatimodulesturnedon,inordertotestwhetheranswerex-

trationwaseetiveornotonthisolletion.Moreover,byaddinganswers

with INEXACT, MISSED and RIGHT status from our AS run, we

anobtainaPS runwithmodulesturnedon,whihallowsustoevaluate

modulesonthesametask.

4 Results

Wepresenttheresultsof5experimentsforbothFrenh andEnglish.Therst

threeome fromoialResPubliQA runs:

➀

^: ÂS ^task ^with ^syntati ^modules ^turned ôn ^(exat ânswers ^judged âs

RIGHT),

➁

^:^PS^task^with^syntati^modules^turnedôn^(exatânswersôf

➀

^judged^as

RIGHT,INEXACT,MISSED),

➂

^:^PS ^task^with^syntati^modules^turned^o.

Toompletetheevaluation,wealsoranunoialongurationandahieved

theassessmentbyourselves:

➃

^: ÂS ^task ^with ^passage^retrieval^turned ô ^but ânswerêxtration ^turned

on(modules C2andD1,withexatanswersjudgedasRIGHT),

➄

^:^PS^task^with^passage^retrieval^C1^turnedô^butânswerêxtration^turned

on(exat answersof

➃

^judged^as^RIGHT,^INEXACT,^MISSED).

In order to evaluate the performane of the question analysis module, we

manuallyidentiedthetypesofquestion.AsFIDJIannotproessopinionques-

tions,wedeidedtoonsiderthemasfatoid.AlthoughquestionsinFrenhand

Englisharetranslationsofeahotherandtheirrespetiveanswershouldbeex-

tratedfromthesameparagraph,wenotiedthat, foragivenquestion,itstype

isnotalwaysthesameinEnglishasinFrenh.Forexample,inEnglish,thetype

ofquestion169isreason/purpose whileinFrenh,itisfatoid:

(EN)Whyisthe tradeinammoniumnitratefertilizershamperedwithintheEu-

ropean Eonomi Community?

(FR)Qu'est-equiaentravéleommered'engraisàbasedenitrated'ammonium

dans la Communauté Éonomique Européenne? (What has hampered the trade

inammonium nitrate fertilizers...?)

Thisisnotonlyanissueofsyntatidierenesduetotranslationparaphras-

ing;thetargetofthequestionisdierent.Stritlyspeaking,theFrenhquestion

might aept a noun phraselike les réglementations régissant la ommerial-

isation des engrais à base de nitrate d'ammonium (the dierent regulations

(6)

thisissue 3

.

Tables 1 and 2 presents FIDJI's results for runs

➀

^,

➁

^and

➂

^, ^as ^well ^as

experiments

➃

^and

➄

^, ^by ^types ôf ^questions ^(manually îdentied). În ^F^renh,

86%ofquestiontypeswereorretlyidentied byFIDJI(wefound9questions

that were ill-formedor with misspellings and whih FIDJI ouldnot orretly

analyse)whereasin English,only69.5%wereorretlyidentied.

Conerningouroialruns,asweanseeinTables1and2,answerextra-

tion performane (

➀

⁾ îs ^very ^low ^(0.25 ^for ^both Ênglish ând ^Frenh). ^Results

arebetterforpassageseletion(

➁

^and

➂

⁾^forêvery^typeôf^questions ândêven

betterwhensyntatimodules areswithed o(

➂

^). ^Results^are^globally^better

forEnglishthanforFrenhsotheperformaneofthequestionanalysismodule

annotexplaintheseresults.

Inbothlanguages, orretanswersto denition questions dramatially de-

reasewithD1turnedo.Thisisbeausewedonothaveanynon-syntatiway

to extrat the answerfor many ofthese questions (denitions notexpetinga

namedentity,asWhatismaladministration?,anonlybeansweredbydenition

patterns in FIDJI).Turning o syntati modules neessarilyleads to a NOA

answerintheseases.

Weannotiethat forbothEnglishandFrenh,theresultsfollowthesame

trend and that results forpassageseletion are better for omplex questions

(reason/purposeandproedure), probablybeauseFIDJIseletspassages on-

tainingdisoursemarkersforthistypeofquestions.Also,forthesequestions,we

alwaysreturnedthefullparagraphasexatshortanswer,onsideringthattry-

ingtofousevenmoreinsidetheparagraphwasnotusefulforsuhquestions.As

theassessors didonsiderthat shorteranswersanbebetter,thesystemoften

getsanINEXACT statusfor.

Finally, our additional runs

➃

^and

➄

^show ^a ^small improvement, showing that bestresultsare obtainedwhen turning osyntatipassageretrieval,but

turning on syntatianswerextration (usingmodules C2 andD1). This is at

leastlear onerningnon-fatoidquestions.This ndingisimportantand will

helpusinthefuturetohooseoursearhstrategiesaordingtodierentorpora

andquestiontypes.

Last year, the pure information retrieval baseline [4℄ whih onsisted in

queryingtheindexedolletionwiththeexattextofthequestionandreturning

theparagraphretrievedintherstposition,hadthebestresultsforFrenhand

ranked 5out of 14 in English [5℄. Even if asubset of the Europarlorpushas

beenaddedtothedoumentolletionin2010,weanseethatour1measures

(seeTable3)arestilllowerthanthe2009baseline(0.53forEnglishand0.45for

Frenh).

In2009,wenotedthatourresultsweredueto ACQUIS orpusspeiities:

dierentregisteroflanguage,moreonstrainedvoabulary,textshavingaparti-

ularstruture,withanintrodutionfollowedbylongsentenesextendingonsev-

3

(7)

Numberofquestions 110 29 29 32 200

➀

^Corret^answers ¹⁰^(9.1%) ³^(10.3%) ¹^(3.5%) ³^(9.4%) ¹⁷^(8.5%)

➁

^Corret^passages ³³^(30%) ¹⁰^(34.5%) ¹⁰^(34.5%) ¹⁴^(43.8%) ⁶⁷^(33.5%)

➂

^Corret^passages ⁵¹^(46.3%) ³^(10.3%) ¹⁸^(62%) ¹⁷^(53.1%)⁸⁹^(44.5%)

Unoialruns

➃

^Corret^answers ¹³^(11.8%) ³^(10.3%) ²^(6.9%) ⁴^(12.5%) ²²^(11%)

➄

^Corret^passages ⁴⁷^(42.7%) ⁹^(31.0%) ¹⁹^(65.5%) ¹⁸^(56.3%) ⁹³^(46.5%)

Table 1.Resultsbyquestiontype(English).

Typeofquestions Fatoid DenitionReason/Purpose Proedure TOTAL

Numberofquestions 117 29 26 28 200

➀

^Corret^answers ¹¹^(9.4%) ²^(6.9%) ⁰^(0%) ¹^(3.6%) ¹⁴^(7%)

➁

^Corret^passages ³⁵^(29.9%)⁶^(20.7%) ⁸^(30.8%) ⁸^(28.6%) ⁵⁷^(28.5%)

➂

^Corret^passages ³⁰^(25.6%)⁶^(20.7%) ¹³^(50%) ¹³^(46.4%) ⁶²^(31%)

Unoialruns

➃

^Corret^answers ¹²^(10.3%)³^(10.3%) ⁰^(0%) ²^(6.3%) ¹⁷^(8.5%)

➄

^Corret^passages ³¹^(28.2%)⁷^(24.1%) ¹⁴^(53.8%) ¹⁵^(50.0%) ⁶⁷^(33.5%)

Table2.Resultsbyquestiontype(Frenh).

eralparagraphs,et.Table4showsthat FIDJIfoundorretanswers/passages

mainlyintheACQUISolletion.AsFIDJIhasdiultywithseletingpassages

in theACQUISolletion,FIDJI'slowresultsouldbeexplainedifamajority

oforretanswersarein theACQUIS olletion.

Themain dierene betweenFIDJI arhitetureused forResPubliQA and

theoneused forotherevaluationampaigns (CLEF,Quaero) isthenumberof

doumentsreturnedbyLuene:15doumentsforResPubliQAand100forother

ampaigns.Wehavetoevaluateifseletingmoredoumentswouldimprovethe

results.

Campaign FIDJI2010 FIDJI2009

Language EnglishFrenhEnglishFrenh

➀

^0.09 ^0.08 ^- ^-

➁

^0.35 ^0.30 ^- ^0.30

➂

^0.48 ^0.36 ^- ^0.42

➃

^0.11 ^0.08 ^- ^-

➄

^0.47 ^0.34 ^- ^-

Table 3.1measureforFrenhandEnglish.

(8)

Corpus EuroparlAquisEuroparlAquis

➀

³ ¹⁴ ⁶ ⁸

➁

²⁴ ⁴³ ²² ³⁶

➂

³³ ⁵⁶ ²¹ ⁴¹

Table 4.Numberoforretanswers/passagesperorpus.

5 Conlusion

WepresentedinthispaperourpartiipationtotheampaignResPubliQA2010

inFrenhandEnglish.Weevaluatedtwostrategies:pluggingorunpluggingthe

syntatimodulesfordoumentseletionandanswerextration.Asin2009,the

system got low resultsand evenlowerwhen syntati modules are turned o.

Dierentexperimentsontheolletiononrmedthattheuseofsyntatianal-

ysisdereasedresults,whereasitprovedtohelpwhenusedin otherampaigns.

Westillhavetoevaluateifahighernumberofdoumentsseletedbythesearh

engineanimprovetheresults.

6 Aknowledgements

ThisworkhasbeenpartiallynanedbyOSEOunder theQuaeroprogram.

Referenes

1. Morieau,V.,Tannier,X.:FIDJI:UsingSyntaxforValidatingAnswersinMultiple

Douments. InformationRetrieval,SpeialIssueonFousedInformation Retrieval

10791(2010)

2. Aït-Mokhtar, S., Chanod,J.P., Roux,C.: Robustness beyondshallowness: Inre-

mentaldeepparsing. NaturalLanguageEngineering8(2002)121144

3. Tannier, X., Morieau, V.: Studying Syntati Analysisin a QA System:FIDJI

ResPubliQA'09. In:ProeedingsofCLEF2010. NumberLNCS6241inLeture

NotesinComputerSiene,Springer-Verlag,NewYorkCity,NY,USA(2010)

4. Pérez, J., Garrido, G., Álvaro Rodrigo, Araujo, L., Peñas, A.: Information Re-

trievalBaselines fortheResPubliQATask. In:WorkingNotes fortheCLEF2009

Workshop,Corfu,Greee(2009)

5. Peñas,A.,Forner, P.,Sutlie,R.,Rodrigo,A., For su, C.,Alegria, I.,Giampi-

olo, D., Moreau, N., Osenova,P.: Overview of ResPubliQA 2009: QuestionAn-

swering Evaluation overEuropeanLegislation. In:Working Notes for the CLEF

2009 Workshop,Corfu,Greee(2009)