José L. Vicedo, Rubén Izquierdo, Fernando Llopis, and Rafael Muñoz
Departamento de Lenguajes y Sistemas Informáticos
University of Alicante, Spain
{vicedo,rib1,llopis,rafael}@dlsi.ua.es
Abstract. This paper describes the architecture, operation and results obtained with the Question Answering prototype for Spanish developed in the Department of Language Processing and Information Systems at the University of Alicante for the CLEF-2003 Spanish monolingual QA evaluation task. Our system has been fully developed from scratch and combines shallow natural language processing tools with statistical data redundancy techniques. The system is able to perform QA tasks independently from static corpora or from Web documents. Moreover, the World Wide Web can be used as an external resource to obtain evidence for supporting and complementing the CLEF Spanish corpora.
1 Introduction
Open-domain QA systems are defined as tools capable of extracting the answer to user queries directly from unrestricted-domain documents. Research in question answering has traditionally focused on the English language, mainly fostered by TREC (Text REtrieval Conference) evaluations. However, the development of QA systems for languages other than English was considered by the QA Roadmap Committee as one of the main lines of future research in this field [2]. Moreover, it considered it essential to obtain systems that perform QA from sources of information written in different languages.
As a result of this interest, the Cross-Language Evaluation Forum (CLEF 2003) has organised a new task (Multiple Language Question Answering) aimed at the evaluation of QA systems in several languages. This evaluation proposes several subtasks: monolingual Spanish, Italian and Dutch QA, and bilingual QA. The bilingual subtask is designed to measure systems' ability to find answers in a collection of English texts when questions are posed in Spanish, Italian, Dutch, German or French.
The main characteristics of this first evaluation are similar to those proposed in past TREC conferences. For each subtask, the organisation provides 200 questions requiring short, factual answers; the answer is not guaranteed to occur in the document collection. Systems should return up to three responses per question, and answers should be ordered by confidence. Responses have to be given as an [answer-string, docid] pair or the string "NIL" when the system does not find a correct answer in the document collection. The "NIL" string is considered correct if no answer is known to exist in the document collection; otherwise it is judged as incorrect. Two different kinds of answers are accepted: the exact answer, or a 50-byte-long string that should contain the exact answer.
Our participation has been restricted to the Spanish monolingual task in the category of exact answers. Although we have experience from past TREC competitions [4-6], we decided to build a new system, mainly due to the big differences between the English and Spanish languages. Moreover, we designed a very simple approach (one person-month) that will facilitate later error analysis and will allow us to detect those basic language-dependent characteristics that make Spanish QA different from English QA.
This paper is organised as follows: Section 2 describes the structure and operation of our Spanish QA system. Afterwards, we present and analyse the results obtained at the CLEF QA Spanish monolingual task. Finally, we draw initial conclusions and discuss directions for future work.
2 System Description
Our QA system is structured into the three main modules of a general QA system architecture:

1. Question analysis.
2. Passage retrieval.
3. Answer extraction.
Question analysis is the first stage in the QA process. This module processes questions formulated to the system in order to detect and extract the useful information they contain. This information is represented in a form that can be easily processed by the remaining modules. The passage retrieval module accomplishes a first selection of relevant passages. This process is carried out in parallel, retrieving relevant passages from the Spanish EFE document collection and from the Spanish pages of the World Wide Web. Finally, the answer selection module processes relevant passages in order to locate and extract the final answer. Figure 1 shows the system architecture.
2.1 Question analysis
The question analysis module carries out two main processes: answer type classification and keyword selection. The former detects the type of information that the question expects as an answer (a date, a quantity, etc.) and the latter selects those question terms (keywords) that will allow locating the documents that are likely to contain the answer.
Fig. 1. System architecture (the question analysis output feeds two passage retrieval modules run in parallel, IR-n over the EFE collection and Google over the Web, whose relevant passages are processed by the answer extraction module to produce the answers)

These processes are performed using a simple, manually developed set of question patterns, each of which has an associated expected answer type. This way, once a pattern matches the question posed to the system, this process returns both the list of keywords associated with the question and the expected answer type associated with the matched pattern. As our system lacks a named-entity tagger, it currently copes with only three possible answer types: NUMBER, DATE and OTHER. Figure 2 shows examples of the patterns and the output generated at the question analysis stage for test questions 002, 006 and 103.
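As an illustration, the pattern-based classification and keyword selection described above can be sketched as follows. This is our own minimal re-implementation: the pattern and stopword lists shown are small hypothetical excerpts, not the system's actual resources.

```python
import re

# Hypothetical excerpt of the hand-written question patterns, each tied to
# an expected answer type (the system distinguishes NUMBER, DATE and OTHER).
PATTERNS = [
    (re.compile(r"(cuántos|cuántas)\s+", re.IGNORECASE), "NUMBER"),
    (re.compile(r"cuándo\s+", re.IGNORECASE), "DATE"),
    (re.compile(r"qué\s+[a-záéíóúñ]+", re.IGNORECASE), "OTHER"),
]

# Tiny illustrative Spanish stopword list; a real list would be far larger.
STOPWORDS = {"de", "el", "la", "los", "las", "en", "sobre", "son", "una",
             "qué", "cuándo", "cuántos", "cuántas"}

def analyse_question(question):
    """Return (answer_type, keywords): the expected answer type of the first
    matching pattern, and the question terms left after stopword removal."""
    answer_type = "OTHER"
    for pattern, atype in PATTERNS:
        if pattern.search(question):
            answer_type = atype
            break
    # Keyword selection: tokenize and drop stopwords.
    tokens = re.findall(r"[\wáéíóúñÁÉÍÓÚÑ]+", question)
    keywords = [t for t in tokens if t.lower() not in STOPWORDS]
    return answer_type, keywords

print(analyse_question("¿De cuántas muertes son responsables los Jemeres Rojos?"))
# → ('NUMBER', ['muertes', 'responsables', 'Jemeres', 'Rojos'])
```

On test question 103 this reproduces the answer type and keyword list shown in Figure 2.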
2.2 Passage retrieval
The passage retrieval stage is accomplished in parallel using two different search engines: IR-n [3] and Google.

IR-n is a passage retrieval system that uses groups of contiguous sentences as its unit of information. From a QA perspective, this passage extraction model allows us to benefit from the advantages of discourse-based passage retrieval models, since self-contained units of text, such as sentences, are used for building the passages. First, the IR-n system performs passage retrieval over the entire Spanish EFE document collection. In this case, the keywords detected at the question analysis stage are processed using the MACO Spanish lemmatiser [1] and their corresponding lemmas are used for retrieving the 50 most relevant passages from the EFE document database. These passages are made up of text snippets two sentences in length. Second, the same keyword list (without being lemmatised) is posed to the Google Internet search engine. Relevant documents are not downloaded; for efficiency considerations, the system only selects the 50 best short summaries returned in Google's main retrieval pages. Figure 3 shows examples of retrieved passages for question 103. In this example, question keywords found in relevant passages are underlined.

Question 002: ¿Qué país invadió Kuwait en 1990?
Pattern: (qué|Qué)\s+([a-z|áéíóúñ]+)    Answer type: OTHER
Keywords: país invadió Kuwait 1990    Lemmas: país invadir Kuwait 1990

Question 006: ¿Cuándo decidió Naciones Unidas imponer el embargo sobre Irak?
Pattern: (cuándo|Cuándo)\s+    Answer type: DATE
Keywords: decidió Naciones Unidas imponer embargo Irak    Lemmas: decidir Naciones Unidas imponer embargo Irak

Question 103: ¿De cuántas muertes son responsables los Jemeres Rojos?
Pattern: (Cuántos|cuántos|Cuántas|cuántas)\s+([a-z|áéíóúñ]+)    Answer type: NUMBER
Keywords: muertes responsables Jemeres Rojos    Lemmas: muerte responsable Jemeres Rojos

Fig. 2. Question analysis examples
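The sentence-based passage model can be illustrated with the following sketch (our own simplified code, not IR-n itself): a document is split into sentences, every window of two contiguous sentences forms a passage, and passages can be ranked by keyword overlap.

```python
import re

def make_passages(document, size=2):
    """Split a document into sentences and build passages out of every
    window of `size` contiguous sentences (a simplified IR-n-style model)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    if len(sentences) <= size:
        return [sentences]
    return [sentences[i:i + size] for i in range(len(sentences) - size + 1)]

def passage_score(passage, keywords):
    """Count how many distinct question keywords appear in the passage."""
    text = " ".join(passage).lower()
    return sum(1 for k in set(keywords) if k.lower() in text)

doc = ("Los Jemeres Rojos gobernaron Camboya entre 1975 y 1979. "
       "Fueron responsables de más de un millón de muertes. "
       "El régimen cayó en 1979.")
passages = make_passages(doc)
best = max(passages, key=lambda p: passage_score(p, ["muertes", "responsables", "Jemeres", "Rojos"]))
```

Here `best` is the two-sentence window containing all four keywords of question 103; the real system retrieves the 50 highest-scoring such passages.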
2.3 Answer extraction
This module processes both sets of passages selected at the passage retrieval stage (IR-n and Google) in order to detect and extract the three most probable answers to the query. The processes involved at this stage are the following:

1. Relevant sentence selection. Sentences in relevant passages are selected and scored.
(a) Passages are split into sentences.
(b) Each sentence is scored according to the number of question keywords it contains. Keywords appearing twice or more times are only added once. This value (sentence score) measures the similarity between each relevant sentence and the question.
(c) Sentences that do not contain any keyword are discarded (sentence score = 0).
2. Candidate answer selection. Candidate answers are selected from relevant sentences.
(b) Quantities, dates and proper noun sequences are detected and merged into unique expressions.
(c) Every term or merged expression in a relevant sentence is considered a candidate answer.
(d) Candidate answers are filtered. This process gets rid of those candidates that start or finish with a stopword or contain a question keyword.
(e) From the remaining candidate set, only those whose semantic type matches the expected answer type are selected. When the expected answer type is OTHER, only proper noun phrases are selected as final candidate answers. Figure 3 shows (in boldface) the selected answer candidates for question 103.

Question 103: ¿De cuántas muertes son responsables los Jemeres Rojos?
First retrieved passage from the EFE collection:
<DOCNO> EFE19940913-06889
... explotan los Jemeres Rojos, quienes no les preocupa que sus ideas no sean respetadas por la comunidad internacional, que los acusa de ser los responsables de la muerte de más de un millón de camboyanos durante el genocidio de 1975-1978.
First retrieved passage from the World Wide Web:
<DOCNO> 1 Google
Los Jemeres Rojos fueron responsables de más de un millón de muertes, mataron al menos a 20.000 presos políticos y torturaron a cientos de miles de personas.
Fig. 3. Passages retrieved for question 103
3. Candidate answer combination. Each answer candidate is assigned a score that measures its probability of being the correct answer (answer frequency). As the same candidate answer can probably be found in different relevant sentences, the candidate answer set may contain repeated elements. Our system exploits this fact by relating candidate redundancy with answer correctness, as follows:
(a) Repeated candidate answers are merged into a unique expression that is scored according to the number of times the candidate appears in the candidate answer set.
(b) Shorter expressions are preferred as answers over longer ones. Thus, terms in long candidates that appear themselves as answer candidates boost shorter candidates' scores: long-candidate scores are added to the frequency value obtained by the shorter ones.
4. Web evidence addition. All previous processes may be optionally performed in parallel to retrieve answers from Web documents. Therefore, at this moment the system has two lists of candidate answers: one obtained from the EFE collection and another obtained from the Web. The combination process consists of increasing the answer frequency of EFE-list candidates by adding the corresponding frequency values obtained in the Web list. This way, candidates appearing only in the Web list are discarded.
5. Final answer selection. Answer candidates from the previous steps are given a final score (answer score) that measures two circumstances: (1) their redundancy through the answer extraction process (answer frequency) and (2) the context they have been found in (sentence score). As the same candidate answer may be found in different contexts, an answer keeps the maximum score over all the contexts it appears in. The final answer score is computed as follows:

   answer_score = sentence_score × answer_frequency    (1)

Answers are then ranked according to their answer score and the first three answers are selected for presentation. Among the candidate answers for question 103 (example in Figure 3), the system selects "un millón" (one million) as the final answer.
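Steps 3-5 above can be sketched as follows. This is a minimal illustration under assumed data structures; the function and variable names are ours, not the system's.

```python
from collections import Counter

def combine_candidates(candidates):
    """Step 3: merge repeated candidates into answer frequencies, and let
    shorter candidates absorb the scores of longer candidates that contain
    them as a whole-word substring (shorter expressions are preferred)."""
    freq = Counter(candidates)
    score = dict(freq)
    for short in freq:
        for long_cand in freq:
            if short != long_cand and f" {short} " in f" {long_cand} ":
                score[short] += freq[long_cand]
    return score

def add_web_evidence(efe_freq, web_freq):
    """Step 4: boost EFE candidates with their Web-list frequencies;
    candidates that appear only in the Web list are discarded."""
    return {c: f + web_freq.get(c, 0) for c, f in efe_freq.items()}

def final_answers(scored_sentences, answer_freq, top_n=3):
    """Step 5: answer_score = sentence_score * answer_frequency (Eq. 1),
    keeping the maximum sentence_score over all contexts of a candidate."""
    best_context = {}
    for sentence_score, candidates in scored_sentences:
        for cand in candidates:
            best_context[cand] = max(best_context.get(cand, 0), sentence_score)
    scores = {c: best_context.get(c, 0) * f for c, f in answer_freq.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy run for question 103: "un millón" is repeated and also occurs inside
# the longer candidate "más de un millón", so it wins.
efe = combine_candidates(["un millón", "más de un millón", "un millón"])
combined = add_web_evidence(efe, combine_candidates(["un millón", "20.000"]))
print(final_answers([(2, ["un millón", "más de un millón"])], combined))
# → ['un millón', 'más de un millón']
```

Note how the Web-only candidate "20.000" is dropped in step 4, while the Web occurrence of "un millón" reinforces the EFE evidence.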
3 Results
We submitted two runs for the exact answer category. The first run (alicex031ms) was obtained by applying the whole system described above, while the second run performed the QA process without activating Web retrieval (alicex032ms). Table 1 shows the results obtained for each run.
Table 1. Spanish monolingual task results

Run         | Strict MRR | Strict % correct | Lenient MRR | Lenient % correct
alicex031ms | 0.3075     | 40.0             | 0.3208      | 43.5
alicex032ms | 0.2966     | 35.0             | 0.3175      | 38.5
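For reference, the MRR (Mean Reciprocal Rank) figures above reward correct answers found early in the three-answer ranking: each question contributes the reciprocal of the rank of its first correct answer, or 0 if none of the three responses is correct. A sketch of the computation (our own code):

```python
def mean_reciprocal_rank(first_correct_ranks):
    """Compute MRR from the 1-based rank of the first correct answer per
    question; None means no correct answer was returned for that question."""
    reciprocal = [1.0 / r if r is not None else 0.0 for r in first_correct_ranks]
    return sum(reciprocal) / len(reciprocal)

# Four questions: correct at rank 1, rank 2, not answered, rank 3.
print(round(mean_reciprocal_rank([1, 2, None, 3]), 4))  # → 0.4583
```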
The result analysis may not be as conclusive as we would desire, mainly due to the simplicity of our approach. Besides, the lack of the correct answers for the test questions at this moment does not allow us to perform a thorough error analysis. Anyway, the results obtained show that using the World Wide Web as an external resource increases the percentage of correct answers retrieved by five points. This fact confirms that QA system performance for languages other than English can be improved by using the Web as an external source of evidence.
4 Conclusions and future work

This work has to be seen as a first and simple attempt to perform QA in Spanish. Consequently, there are several areas of future work to be investigated. Among them, we can select the following ones:

– Question analysis. Since the same question can be formulated in very diverse forms (interrogative, affirmative, using different words and structures, ...), we need to study aspects such as recognising equivalent questions regardless of the speech act, or of the words, syntactic and semantic inter-relations or idiomatic forms employed.
– Answer taxonomy. An important part of the question interpretation process resides in the system's ability to relate questions with the characteristics of their respective answers. Consequently, we need to develop a broad answer taxonomy that enables multilingual answer type classification, probably using the EuroWordNet semantic net structure.
– Passage retrieval. An enhanced question analysis will improve passage retrieval performance by including question expansion techniques that enable retrieving passages that contain relevant information expressed with terms that are different from (but equivalent to) those used in the question formulation.
– Answer extraction. Integrating named-entity taggers. Using a broad answer taxonomy involves using tools capable of identifying the entity that a question expects as an answer. Therefore, we need to integrate named-entity tagging capabilities that allow narrowing down the number of candidates to be considered for answering a question.
Even though all these lines need to be investigated, it is important to remark that this investigation needs to be developed from a multilingual perspective. That is, future investigations need to address the detection and combination of language-dependent and language-independent modules, with the main long-term objective of developing a whole system capable of performing multilingual question answering.
References
1. Jordi Atserias, Josep Carmona, Irene Castellón, Sergi Cervell, Montse Civit, Lluís Màrquez, M.A. Martí, Lluís Padró, Roser Placer, Horacio Rodríguez, Mariona Taulé, and Jordi Turmo. Morphosyntactic Analysis and Parsing of Unrestricted Spanish Text. In Proceedings of the First International Conference on Language Resources and Evaluation (LREC'98), pages 1267-1272, Granada, Spain, 1998.
2. John Burger, Claire Cardie, Vinay Chaudhri, Robert Gaizauskas, Sanda Harabagiu, David Israel, Christian Jacquemin, Chin-Yew Lin, Steve Maiorano, George Miller, Dan Moldovan, Bill Ogden, John Prager, Ellen Riloff, Amit Singhal, Rohini Srihari, Tomek Strzalkowski, Ellen Voorhees, and Ralph Weischedel. Issues, Tasks and Program Structures to Roadmap Research in Question & Answering (Q&A). http://www-nlpir.nist.gov/projects/duc/papers/qa.Roadmap-paperv2.doc, 2000.
3. […] retrieval system at CLEF 2001. In Workshop of the Cross-Language Evaluation Forum (CLEF 2001), Lecture Notes in Computer Science, Darmstadt, Germany, 2001. Springer-Verlag.
4. José Luis Vicedo and Antonio Ferrández. A semantic approach to Question Answering systems. In Ninth Text REtrieval Conference, volume 500-249 of NIST Special Publication, pages 511-516, Gaithersburg, USA, November 2000. National Institute of Standards and Technology.
5. José Luis Vicedo, Antonio Ferrández, and Fernando Llopis. University of Alicante at TREC-10. In Tenth Text REtrieval Conference, volume 500-250 of NIST Special Publication, Gaithersburg, USA, November 2001. National Institute of Standards and Technology.
6. José Luis Vicedo, Fernando Llopis, and Antonio Ferrández. University of Alicante Experiments at TREC-2002. In Eleventh Text REtrieval Conference, volume 500-251 of NIST Special Publication, Gaithersburg, USA, November 2002. National Institute of Standards and Technology.