• Aucun résultat trouvé

FIDJI @ ResPubliQA 2010

N/A
N/A
Protected

Academic year: 2022

Partager "FIDJI @ ResPubliQA 2010"

Copied!
8
0
0

Texte intégral

(1)

XavierTannier,VéroniqueMorieau

LIMSI-CNRS

Univ.Paris-Sud,Orsay,Frane

xtannier, morieaulimsi.fr

Abstrat. Inthispaper,wepresenttheresultsobtainedbythesys-

tem FIDJIfor bothFrenh and English monolingual evaluations,

at ResPubliQA 2010 ampaign. In this ampaign, we foused on

arryingonourevaluationsonerningtheontributionofoursyn-

tatimoduleswiththisspei olletion.

1 Introdution

FIDJI(FindingInDoumentsJustiationsandInferenes)isanopen-domain

question-answering(QA) system for Frenh [1℄ and, more reently, English. It

ombines syntatiinformation withtraditional QAtehniques suh asnamed

entityreognitionandtermweightingin ordertovalidate answersthroughdif-

ferentdouments.

This paper fouses on the results obtained by FIDJI at ResPubliQA 2010

evaluation.Itpresentsrstabriefoverviewofthesystemandofitsadaptation

to English.Then, thespei hoiesmade fortheampaign are detailed,and

someresultsarenallygiven.

2 FIDJI

Figure 1 presents the arhiteture of FIDJI. The system relies on a syntati

analysis and named entity tagging of thequestion and of alimited number of

douments foreah question.This analysisis performedby theparser XIP[2℄

enrihed withsomeadditionalspei rules.

ThedoumentolletionisindexedbythesearhengineLuene 1

.Theindex

ontainsrawtextonly.First,thesystemanalysesthequestionandsubmitsthe

keywordsofthequestiontoLuene(moduleA):therst15doumentsarethen

proessed(module B).Wedeidedtoredue thenumberofdoumentsbeause

theyareratherlongandtheirparsingwouldtaketoomuhtime.Thereasonwe

perform this analysis online is that we aim at avoidingasmuh preproessing

aspossible(thesystemisdesignedtoexploreWebolletions[1℄).Amongthese

douments,FIDJIlooksforsentenesontainingthehighestnumberofsyntati

(2)

sentenes(module D1)and theanswertype, whenspeiedin the question,is

validated(moduleE).

ThemainobjetiveofFIDJIistoprodueanswerswhiharefullyvalidated

byasupportingtext(orpassage)withrespettoagivenquestion.Thediulty

isthat ananswer(orsomepieesofinformationomposingananswer)maybe

validatedbyseveraldouments.

Our approah onsists in heking if all the harateristis of a question

(namelythedependenyrelationsandtheanswertype)mayberetrievedinone

orseveraldouments.Inthisontext,FIDJIhastodetetsyntatiimpliations

betweenquestionsandpassagesontainingtheanswersandtovalidatethetype

ofthepotentialanswerin thispassageorin anotherdoument.

Sinethelastevaluationampaign in2009,FIDJIhasbeenadaptedto En-

glish.Speiruleshavebeendeveloppedforquestionanalysis(moduleA)and

doumentproessing(module B).The othermodules areommonto bothEn-

glishandFrenh.

The following examples illustrate how FIDJI extrats answers, and more

detailsonerningthesystemanbefoundin [1℄.

1

(3)

Question analysis provides lemmatisation, POS tagging and dependeny rela-

tions,aswellasthequestiontypeandtheexpetedanswertype.Forexample:

Question:Quelpremierministres'est suiidéen1993?

(Whih PrimeMinister ommittedsuiide in1993?)

Dependenies:DATE(1993)

PERSON(ANSWER)

SUBJ(se suiider, ANSWER)

attribut(ANSWER, ministre)

attribut(ministre, premier)

Questiontype:fatoid

Expetedanswertype:person(spei answertype:primeminister)

Thequestionisturned into adelarativesentene wherethe answerisrep-

resentedby the`ANSWER' lemma. The followingsentene isseleted beause

itontainsthehighestnumberofdependenyrelations:

Pierre Bérégovoys'estsuiidé en 1993.

(Pierre Bérégovoyommittedsuiide in1993.)

Dependenies:

DATE(1993)

PERSON(Pierre Bérégovoy)

SUBJ(se suiider, Pierre Bérégovoy)

PierreBérégovoyinstantiatestheANSWERslotofthequestiondependenies

andbeomesaandidateanswer.Thenamedentitytype(person)andtherst

threedependeniesofthequestionarevalidatedinthissentene.Inordertofully

validate the andidate answer, the system searhes the missing dependenies

(attribut(Pierre Bérégovoy, ministre) and attribut(ministre, premier) ) in

asinglesentene ofthewhole doumentolletion.Thesedependenieswill be

found in any sentene speaking about le premier ministre Pierre Bérégovoy

(PrimeMinister Pierre Bérégovoy)andtheanswerwill bevalidated.

2.2 Example2

Foromplexquestions,it isobviousthatanswersarenotalwaysshortphrases.

For this reason, FIDJI provides a full passage as an answer. On these kinds

of questions,the systembehavesasalassialpassageretrievalsystem,exept

that andidate passages are retrievedthrough syntatirelations and relevant

disoursemarkers(about100nouns,verbs,prepositionsandadjetives,manually

ompiled)insteadofkeywordsonly.Here isanexampleofaomplexquestion:

Question:Whyistheskyblue?

(4)

Expetedanswertype:reason 2

Thefollowingpassageis seletedbeauseit ontainsallthedependenyre-

lationsofthequestionandaausalmarker:

Andif the skyisblue, itisbeause of Rayleigh sattering...

attribut(sky, blue)

VMOD(be, sattering)

PREPOBJ(sattering, beause of)

...

3 ResPubliQA'10 experiments

In2009,ResPubliQAresultslearnedusalot aboutthebehaviorofoursystem.

Other evaluations (former CLEF and Quaero ampaigns) had shown that

using syntati analysis modules for retrieving douments and extrating the

answerssigniantlyimprovedtheresults[1℄.However,withResPubliQA eval-

uationset,passageextrationturnedouttobemuhbetterbyreplaingsyntax

bytraditional bag-of-wordstehniques [3℄.Thisis donebyturning omodules

C1and D1in Figure1.

Passageextrationisthenperformedbyalassialseletionofsenteneson-

tainingamaximumofquestionsigniantkeywords(moduleC2),andanswerex-

trationisahievedwithoutslotinstantiationwithindependenies(moduleD2).

ThenewguidelinesinResPubliQA2010oeredusthepossibilitytoarryon

ourexperimentsin thisway.Indeed,twodierenttaskswereallowedthisyear:

Paragraphseletion(PS),similarto2009task,whereonlythefullparagraph

ontainingtheexatanswerweretobereturned.Passagesarenotindenite

partsoftexts oflimitedlength,but predenedparagraphsidentiedin the

orpusbyXMLtags<p>.

Answer seletion(AS), loserto traditional QAtasks, where systemswere

requiredtodemaratealsotheexatanswer,supportedbyafullparagraph.

Inthis latter task, judged answersan be INEXACT (good support but

bad boundaries for short answer), MISSED (good support but wrong short

answer),RIGHT (goodsupport andgoodanswer)orWRONG.

Tworuns perlanguagewere allowed.Inorderto ontinuetestingourplug/

unplugstrategies,andtoexperimentthemforthersttimeinEnglish,wehose

thefollowingproedureforourtworuns:

2

Reason is not anamed entity, as person inthe rst example, but this answer

typepointsoutthatatextexpliitelyexplainingareasonshouldbeprefered(inour

(5)

passageretrieval,that hadthebest resultsofthesystemlast year.

2. AStask,syntatimodulesturnedon,inordertotestwhetheranswerex-

trationwaseetiveornotonthisolletion.Moreover,byaddinganswers

with INEXACT, MISSED and RIGHT status from our AS run, we

anobtainaPS runwithmodulesturnedon,whihallowsustoevaluate

modulesonthesametask.

4 Results

Wepresenttheresultsof5experimentsforbothFrenh andEnglish.Therst

threeome fromoialResPubliQA runs:

: AS task with syntati modules turned on (exat answers judged as

RIGHT),

:PStaskwithsyntatimodulesturnedon(exatanswersof

judgedas

RIGHT,INEXACT,MISSED),

:PS taskwithsyntatimodulesturnedo.

Toompletetheevaluation,wealsoranunoialongurationandahieved

theassessmentbyourselves:

: AS task with passageretrievalturned o but answerextration turned

on(modules C2andD1,withexatanswersjudgedasRIGHT),

:PStaskwithpassageretrievalC1turnedobutanswerextrationturned

on(exat answersof

judgedasRIGHT,INEXACT,MISSED).

In order to evaluate the performane of the question analysis module, we

manuallyidentiedthetypesofquestion.AsFIDJIannotproessopinionques-

tions,wedeidedtoonsiderthemasfatoid.AlthoughquestionsinFrenhand

Englisharetranslationsofeahotherandtheirrespetiveanswershouldbeex-

tratedfromthesameparagraph,wenotiedthat, foragivenquestion,itstype

isnotalwaysthesameinEnglishasinFrenh.Forexample,inEnglish,thetype

ofquestion169isreason/purpose whileinFrenh,itisfatoid:

(EN)Whyisthe tradeinammoniumnitratefertilizershamperedwithintheEu-

ropean Eonomi Community?

(FR)Qu'est-equiaentravéleommered'engraisàbasedenitrated'ammonium

dans la Communauté Éonomique Européenne? (What has hampered the trade

inammonium nitrate fertilizers...?)

Thisisnotonlyanissueofsyntatidierenesduetotranslationparaphras-

ing;thetargetofthequestionisdierent.Stritlyspeaking,theFrenhquestion

might aept a noun phraselike les réglementations régissant la ommerial-

isation des engrais à base de nitrate d'ammonium (the dierent regulations

(6)

thisissue 3

.

Tables 1 and 2 presents FIDJI's results for runs

,

and

, as well as

experiments

and

, by types of questions (manually identied). In Frenh,

86%ofquestiontypeswereorretlyidentied byFIDJI(wefound9questions

that were ill-formedor with misspellings and whih FIDJI ouldnot orretly

analyse)whereasin English,only69.5%wereorretlyidentied.

Conerningouroialruns,asweanseeinTables1and2,answerextra-

tion performane (

) is very low (0.25 for both English and Frenh). Results

arebetterforpassageseletion(

and

)foreverytypeofquestions andeven

betterwhensyntatimodules areswithed o(

). Resultsaregloballybetter

forEnglishthanforFrenhsotheperformaneofthequestionanalysismodule

annotexplaintheseresults.

Inbothlanguages, orretanswersto denition questions dramatially de-

reasewithD1turnedo.Thisisbeausewedonothaveanynon-syntatiway

to extrat the answerfor many ofthese questions (denitions notexpetinga

namedentity,asWhatismaladministration?,anonlybeansweredbydenition

patterns in FIDJI).Turning o syntati modules neessarilyleads to a NOA

answerintheseases.

Weannotiethat forbothEnglishandFrenh,theresultsfollowthesame

trend and that results forpassageseletion are better for omplex questions

(reason/purposeandproedure), probablybeauseFIDJIseletspassages on-

tainingdisoursemarkersforthistypeofquestions.Also,forthesequestions,we

alwaysreturnedthefullparagraphasexatshortanswer,onsideringthattry-

ingtofousevenmoreinsidetheparagraphwasnotusefulforsuhquestions.As

theassessors didonsiderthat shorteranswersanbebetter,thesystemoften

getsanINEXACT statusfor.

Finally, our additional runs

and

show a small improvement, showing that bestresultsare obtainedwhen turning osyntatipassageretrieval,but

turning on syntatianswerextration (usingmodules C2 andD1). This is at

leastlear onerningnon-fatoidquestions.This ndingisimportantand will

helpusinthefuturetohooseoursearhstrategiesaordingtodierentorpora

andquestiontypes.

Last year, the pure information retrieval baseline [4℄ whih onsisted in

queryingtheindexedolletionwiththeexattextofthequestionandreturning

theparagraphretrievedintherstposition,hadthebestresultsforFrenhand

ranked 5out of 14 in English [5℄. Even if asubset of the Europarlorpushas

beenaddedtothedoumentolletionin2010,weanseethatour1measures

(seeTable3)arestilllowerthanthe2009baseline(0.53forEnglishand0.45for

Frenh).

In2009,wenotedthatourresultsweredueto ACQUIS orpusspeiities:

dierentregisteroflanguage,moreonstrainedvoabulary,textshavingaparti-

ularstruture,withanintrodutionfollowedbylongsentenesextendingonsev-

3

(7)

Numberofquestions 110 29 29 32 200

Corretanswers 10(9.1%) 3(10.3%) 1(3.5%) 3(9.4%) 17(8.5%)

Corretpassages 33(30%) 10(34.5%) 10(34.5%) 14(43.8%) 67(33.5%)

Corretpassages 51(46.3%) 3(10.3%) 18(62%) 17(53.1%)89(44.5%)

Unoialruns

Corretanswers 13(11.8%) 3(10.3%) 2(6.9%) 4(12.5%) 22(11%)

Corretpassages 47(42.7%) 9(31.0%) 19(65.5%) 18(56.3%) 93(46.5%)

Table 1.Resultsbyquestiontype(English).

Typeofquestions Fatoid DenitionReason/Purpose Proedure TOTAL

Numberofquestions 117 29 26 28 200

Corretanswers 11(9.4%) 2(6.9%) 0(0%) 1(3.6%) 14(7%)

Corretpassages 35(29.9%)6(20.7%) 8(30.8%) 8(28.6%) 57(28.5%)

Corretpassages 30(25.6%)6(20.7%) 13(50%) 13(46.4%) 62(31%)

Unoialruns

Corretanswers 12(10.3%)3(10.3%) 0(0%) 2(6.3%) 17(8.5%)

Corretpassages 31(28.2%)7(24.1%) 14(53.8%) 15(50.0%) 67(33.5%)

Table2.Resultsbyquestiontype(Frenh).

eralparagraphs,et.Table4showsthat FIDJIfoundorretanswers/passages

mainlyintheACQUISolletion.AsFIDJIhasdiultywithseletingpassages

in theACQUISolletion,FIDJI'slowresultsouldbeexplainedifamajority

oforretanswersarein theACQUIS olletion.

Themain dierene betweenFIDJI arhitetureused forResPubliQA and

theoneused forotherevaluationampaigns (CLEF,Quaero) isthenumberof

doumentsreturnedbyLuene:15doumentsforResPubliQAand100forother

ampaigns.Wehavetoevaluateifseletingmoredoumentswouldimprovethe

results.

Campaign FIDJI2010 FIDJI2009

Language EnglishFrenhEnglishFrenh

0.09 0.08 - -

0.35 0.30 - 0.30

0.48 0.36 - 0.42

0.11 0.08 - -

0.47 0.34 - -

Table 3.1measureforFrenhandEnglish.

(8)

Corpus EuroparlAquisEuroparlAquis

3 14 6 8

24 43 22 36

33 56 21 41

Table 4.Numberoforretanswers/passagesperorpus.

5 Conlusion

WepresentedinthispaperourpartiipationtotheampaignResPubliQA2010

inFrenhandEnglish.Weevaluatedtwostrategies:pluggingorunpluggingthe

syntatimodulesfordoumentseletionandanswerextration.Asin2009,the

system got low resultsand evenlowerwhen syntati modules are turned o.

Dierentexperimentsontheolletiononrmedthattheuseofsyntatianal-

ysisdereasedresults,whereasitprovedtohelpwhenusedin otherampaigns.

Westillhavetoevaluateifahighernumberofdoumentsseletedbythesearh

engineanimprovetheresults.

6 Aknowledgements

ThisworkhasbeenpartiallynanedbyOSEOunder theQuaeroprogram.

Referenes

1. Morieau,V.,Tannier,X.:FIDJI:UsingSyntaxforValidatingAnswersinMultiple

Douments. InformationRetrieval,SpeialIssueonFousedInformation Retrieval

10791(2010)

2. Aït-Mokhtar, S., Chanod,J.P., Roux,C.: Robustness beyondshallowness: Inre-

mentaldeepparsing. NaturalLanguageEngineering8(2002)121144

3. Tannier, X., Morieau, V.: Studying Syntati Analysisin a QA System:FIDJI

ResPubliQA'09. In:ProeedingsofCLEF2010. NumberLNCS6241inLeture

NotesinComputerSiene,Springer-Verlag,NewYorkCity,NY,USA(2010)

4. Pérez, J., Garrido, G., Álvaro Rodrigo, Araujo, L., Peñas, A.: Information Re-

trievalBaselines fortheResPubliQATask. In:WorkingNotes fortheCLEF2009

Workshop,Corfu,Greee(2009)

5. Peñas,A.,Forner, P.,Sutlie,R.,Rodrigo,A., For su, C.,Alegria, I.,Giampi-

olo, D., Moreau, N., Osenova,P.: Overview of ResPubliQA 2009: QuestionAn-

swering Evaluation overEuropeanLegislation. In:Working Notes for the CLEF

2009 Workshop,Corfu,Greee(2009)

Références

Documents relatifs

Relevant paragraphs are selected from the retrieved documents based on the TF-IDF of the matching query words along with n-gram overlap of the paragraph with the original

We relied on our text search engine to find the paragraphs relevant to the question in the corpus.. The paragraph with the highest score was returned as

The selection of the best answer candidate is based on the shallow features ob- tained by matching the question and the sentence terms, and (for sentences with.. a full parse) also

AS Task Results, ANSWERED: number of questions answered, UNAN- SWERED: number of questions unanswered, ANSWERED R.C.: number of questions answered with right candidate answer,

In an additional experiment, we used the parallel collection to obtain a list of answers in different languages (Spanish, English, Italian and French), choosing as the best

answer type is validated: the seleted answer is tagged as a loation (state).

Given the task characteristics, a paragraph candidate to contain the answer to a question will be one where the maximum number of question terms appear (excluding stopwords); with

a named entity, the entity of the expeted type whih is losest to the question words is seleted. Otherwise, patterns of extration