José L. Vicedo, Rubén Izquierdo, Fernando Llopis, and Rafael Muñoz
Departamento de Lenguajes y Sistemas Informáticos
University of Alicante, Spain
{vicedo,rib1,llopis,rafael}@dlsi.ua.es
Abstract. This paper describes the architecture, operation and results obtained with the Question Answering prototype for Spanish developed in the Department of Language Processing and Information Systems at the University of Alicante for the CLEF-2003 Spanish monolingual QA evaluation task. Our system has been fully developed from scratch and combines shallow natural language processing tools with statistical data redundancy techniques. The system is able to perform QA tasks independently from static corpora or from Web documents. Moreover, the World Wide Web can be used as an external resource to obtain evidence for supporting and complementing the CLEF Spanish corpora.
1 Introduction
Open-domain QA systems are defined as tools capable of extracting the answer to user queries directly from unrestricted-domain documents. Research in question answering has traditionally focused on the English language, mainly fostered by TREC (Text REtrieval Conference) evaluations. However, the development of QA systems for languages other than English was considered by the QA Roadmap Committee as one of the main lines of future research in this field [2]. Moreover, it considered it essential to obtain systems that perform QA from sources of information written in different languages.
As a result of this interest, the Cross-Language Evaluation Forum (CLEF 2003) has organised a new task (Multiple Language Question Answering) aimed at the evaluation of QA systems in several languages. This evaluation proposes several subtasks: monolingual Spanish, Italian and Dutch QA, and bilingual QA. The bilingual subtask is designed to measure systems' ability to find answers in a collection of English texts when questions are posed in Spanish, Italian, Dutch, German or French.
The main characteristics of this first evaluation are similar to those proposed in past TREC conferences. For each subtask, the organisation provides 200 questions requiring short, factual answers; the answer is not guaranteed to occur in the document collection. Systems should return up to three responses per question, and answers should be ordered by confidence. Responses have to be given as an [answer-string, docid] pair or the string "NIL" when the system does not find a correct answer in the document collection. The "NIL" string is considered correct if no answer is known to exist in the document collection; otherwise it is judged as incorrect. Two different kinds of answers are accepted: the exact answer, or a 50-byte-long string that should contain the exact answer.
Our participation has been restricted to the Spanish monolingual task in the category of exact answers. Although we have experience from past TREC competitions [4-6], we decided to build a new system, mainly due to the big differences between the English and Spanish languages. Moreover, we designed a very simple approach (one person-month) that will facilitate later error analysis and will allow us to detect those basic language-dependent characteristics that make Spanish QA different from English QA.
This paper is organised as follows: Section 2 describes the structure and operation of our Spanish QA system. Afterwards, we present and analyse the results obtained at the CLEF QA Spanish monolingual task. Finally, we draw initial conclusions and discuss directions for future work.
2 System Description
Our QA system is structured into the three main modules of a general QA system architecture:

1. Question analysis.
2. Passage retrieval.
3. Answer extraction.
Question analysis is the first stage in the QA process. This module processes questions formulated to the system in order to detect and extract the useful information they contain. This information is represented in a form that can be easily processed by the remaining modules. The passage retrieval module accomplishes a first selection of relevant passages. This process is carried out in parallel, retrieving relevant passages from the Spanish EFE document collection and from the Spanish pages of the World Wide Web. Finally, the answer selection module processes relevant passages in order to locate and extract the final answer. Figure 1 shows the system architecture.
2.1 Question analysis
The question analysis module carries out two main processes: answer type classification and keyword selection. The former detects the type of information that the question expects as an answer (a date, a quantity, etc.) and the latter selects those question terms (keywords) that will allow locating the documents that are likely to contain the answer.
Fig. 1. System architecture (the question analysis output feeds two passage retrieval modules run in parallel, IR-n over the EFE collection and Google over the Web, whose relevant passages are processed by the answer extraction module to produce the answers)

These processes are performed using a simple, manually developed set of question patterns, each of which has an associated expected answer type. This way, once a pattern matches the question posed to the system, this process returns both the list of keywords associated with the question and the expected answer type associated with the matched pattern. As our system lacks a named-entity tagger, it currently copes with only three possible answer types: NUMBER, DATE and OTHER. Figure 2 shows examples of the patterns and the output generated at the question analysis stage for test questions 002, 006 and 103.
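As an illustration, the pattern-based classification and keyword selection described above can be sketched as follows. This is our own minimal re-implementation: the pattern and stopword lists shown are small hypothetical excerpts, not the system's actual resources.

```python
import re

# Hypothetical excerpt of the hand-written question patterns, each tied to
# an expected answer type (the system distinguishes NUMBER, DATE and OTHER).
PATTERNS = [
    (re.compile(r"(cuántos|cuántas)\s+", re.IGNORECASE), "NUMBER"),
    (re.compile(r"cuándo\s+", re.IGNORECASE), "DATE"),
    (re.compile(r"qué\s+[a-záéíóúñ]+", re.IGNORECASE), "OTHER"),
]

# Tiny illustrative Spanish stopword list; a real list would be far larger.
STOPWORDS = {"de", "el", "la", "los", "las", "en", "sobre", "son", "una",
             "qué", "cuándo", "cuántos", "cuántas"}

def analyse_question(question):
    """Return (answer_type, keywords): the expected answer type of the first
    matching pattern, and the question terms left after stopword removal."""
    answer_type = "OTHER"
    for pattern, atype in PATTERNS:
        if pattern.search(question):
            answer_type = atype
            break
    # Keyword selection: tokenize and drop stopwords.
    tokens = re.findall(r"[\wáéíóúñÁÉÍÓÚÑ]+", question)
    keywords = [t for t in tokens if t.lower() not in STOPWORDS]
    return answer_type, keywords

print(analyse_question("¿De cuántas muertes son responsables los Jemeres Rojos?"))
# → ('NUMBER', ['muertes', 'responsables', 'Jemeres', 'Rojos'])
```

On test question 103 this reproduces the answer type and keyword list shown in Figure 2.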
2.2 Passage retrieval
The passage retrieval stage is accomplished in parallel using two different search engines: IR-n [3] and Google.

IR-n is a passage retrieval system that uses groups of contiguous sentences as its unit of information. From a QA perspective, this passage extraction model allows us to benefit from the advantages of discourse-based passage retrieval models, since self-contained units of text, such as sentences, are used for building the passages. First, the IR-n system performs passage retrieval over the entire Spanish EFE document collection. In this case, the keywords detected at the question analysis stage are processed using the MACO Spanish lemmatiser [1] and their corresponding lemmas are used for retrieving the 50 most relevant passages from the EFE document database. These passages are made up of text snippets two sentences in length. Second, the same keyword list (without being lemmatised) is posed to the Google Internet search engine. Relevant documents are not downloaded; for efficiency considerations, the system only selects the 50 best short summaries returned in Google's main retrieval pages. Figure 3 shows examples of retrieved passages for question 103. In this example, question keywords found in relevant passages are underlined.

Question 002: ¿Qué país invadió Kuwait en 1990?
Pattern: (qué|Qué)\s+([a-z|áéíóúñ]+)    Answer type: OTHER
Keywords: país invadió Kuwait 1990    Lemmas: país invadir Kuwait 1990

Question 006: ¿Cuándo decidió Naciones Unidas imponer el embargo sobre Irak?
Pattern: (cuándo|Cuándo)\s+    Answer type: DATE
Keywords: decidió Naciones Unidas imponer embargo Irak    Lemmas: decidir Naciones Unidas imponer embargo Irak

Question 103: ¿De cuántas muertes son responsables los Jemeres Rojos?
Pattern: (Cuántos|cuántos|Cuántas|cuántas)\s+([a-z|áéíóúñ]+)    Answer type: NUMBER
Keywords: muertes responsables Jemeres Rojos    Lemmas: muerte responsable Jemeres Rojos

Fig. 2. Question analysis examples
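The sentence-based passage model can be illustrated with the following sketch (our own simplified code, not IR-n itself): a document is split into sentences, every window of two contiguous sentences forms a passage, and passages can be ranked by keyword overlap.

```python
import re

def make_passages(document, size=2):
    """Split a document into sentences and build passages out of every
    window of `size` contiguous sentences (a simplified IR-n-style model)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    if len(sentences) <= size:
        return [sentences]
    return [sentences[i:i + size] for i in range(len(sentences) - size + 1)]

def passage_score(passage, keywords):
    """Count how many distinct question keywords appear in the passage."""
    text = " ".join(passage).lower()
    return sum(1 for k in set(keywords) if k.lower() in text)

doc = ("Los Jemeres Rojos gobernaron Camboya entre 1975 y 1979. "
       "Fueron responsables de más de un millón de muertes. "
       "El régimen cayó en 1979.")
passages = make_passages(doc)
best = max(passages, key=lambda p: passage_score(p, ["muertes", "responsables", "Jemeres", "Rojos"]))
```

Here `best` is the two-sentence window containing all four keywords of question 103; the real system retrieves the 50 highest-scoring such passages.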
2.3 Answer extraction
This module processes both sets of passages selected at the passage retrieval stage (IR-n and Google) in order to detect and extract the three most probable answers to the query. The processes involved at this stage are the following:

1. Relevant sentence selection. Sentences in relevant passages are selected and scored.
(a) Passages are split into sentences.
(b) Each sentence is scored according to the number of question keywords it contains. Keywords appearing twice or more times are only added once. This value (sentence score) measures the similarity between each relevant sentence and the question.
(c) Sentences that do not contain any keyword are discarded (sentence score = 0).
2. Candidate answer selection. Candidate answers are selected from relevant sentences.
(b) Quantities, dates and proper noun sequences are detected and merged into unique expressions.
(c) Every term or merged expression in a relevant sentence is considered a candidate answer.
(d) Candidate answers are filtered. This process gets rid of those candidates that start or finish with a stopword or contain a question keyword.
(e) From the remaining candidate set, only those whose semantic type matches the expected answer type are selected. When the expected answer type is OTHER, only proper noun phrases are selected as final candidate answers. Figure 3 shows (in boldface) the selected answer candidates for question 103.

Question 103: ¿De cuántas muertes son responsables los Jemeres Rojos?
First retrieved passage from the EFE collection:
<DOCNO> EFE19940913-06889
... explotan los Jemeres Rojos, quienes no les preocupa que sus ideas no sean respetadas por la comunidad internacional, que los acusa de ser los responsables de la muerte de más de un millón de camboyanos durante el genocidio de 1975-1978.
First retrieved passage from the World Wide Web:
<DOCNO> 1 Google
Los Jemeres Rojos fueron responsables de más de un millón de muertes, mataron al menos a 20.000 presos políticos y torturaron a cientos de miles de personas.
Fig. 3. Passages retrieved for question 103
3. Candidate answer combination. Each answer candidate is assigned a score that measures its probability of being the correct answer (answer frequency). As the same candidate answer can probably be found in different relevant sentences, the candidate answer set may contain repeated elements. Our system exploits this fact by relating candidate redundancy with answer correctness, as follows:
(a) Repeated candidate answers are merged into a unique expression that is scored according to the number of times the candidate appears in the candidate answer set.
(b) Shorter expressions are preferred as answers over longer ones. Thus, terms in long candidates that appear themselves as answer candidates boost shorter candidates' scores: long-candidate scores are added to the frequency value obtained by the shorter ones.
4. Web evidence addition. All previous processes may be optionally performed in parallel to retrieve answers from Web documents. Therefore, at this moment the system has two lists of candidate answers: one obtained from the EFE collection and another obtained from the Web. The combination process consists of increasing the answer frequency of EFE-list candidates by adding the corresponding frequency values obtained in the Web list. This way, candidates appearing only in the Web list are discarded.
5. Final answer selection. Answer candidates from the previous steps are given a final score (answer score) that measures two circumstances: (1) their redundancy through the answer extraction process (answer frequency) and (2) the context they have been found in (sentence score). As the same candidate answer may be found in different contexts, an answer keeps the maximum score over all the contexts it appears in. The final answer score is computed as follows:

   answer_score = sentence_score × answer_frequency    (1)

Answers are then ranked according to their answer score and the first three answers are selected for presentation. Among the candidate answers for question 103 (example in Figure 3), the system selects "un millón" (one million) as the final answer.
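Steps 3-5 above can be sketched as follows. This is a minimal illustration under assumed data structures; the function and variable names are ours, not the system's.

```python
from collections import Counter

def combine_candidates(candidates):
    """Step 3: merge repeated candidates into answer frequencies, and let
    shorter candidates absorb the scores of longer candidates that contain
    them as a whole-word substring (shorter expressions are preferred)."""
    freq = Counter(candidates)
    score = dict(freq)
    for short in freq:
        for long_cand in freq:
            if short != long_cand and f" {short} " in f" {long_cand} ":
                score[short] += freq[long_cand]
    return score

def add_web_evidence(efe_freq, web_freq):
    """Step 4: boost EFE candidates with their Web-list frequencies;
    candidates that appear only in the Web list are discarded."""
    return {c: f + web_freq.get(c, 0) for c, f in efe_freq.items()}

def final_answers(scored_sentences, answer_freq, top_n=3):
    """Step 5: answer_score = sentence_score * answer_frequency (Eq. 1),
    keeping the maximum sentence_score over all contexts of a candidate."""
    best_context = {}
    for sentence_score, candidates in scored_sentences:
        for cand in candidates:
            best_context[cand] = max(best_context.get(cand, 0), sentence_score)
    scores = {c: best_context.get(c, 0) * f for c, f in answer_freq.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Toy run for question 103: "un millón" is repeated and also occurs inside
# the longer candidate "más de un millón", so it wins.
efe = combine_candidates(["un millón", "más de un millón", "un millón"])
combined = add_web_evidence(efe, combine_candidates(["un millón", "20.000"]))
print(final_answers([(2, ["un millón", "más de un millón"])], combined))
# → ['un millón', 'más de un millón']
```

Note how the Web-only candidate "20.000" is dropped in step 4, while the Web occurrence of "un millón" reinforces the EFE evidence.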
3 Results
We submitted two runs for the exact answer category. The first run (alicex031ms) was obtained by applying the whole system described above, while the second run performed the QA process without activating Web retrieval (alicex032ms). Table 1 shows the results obtained for each run.
Table 1. Spanish monolingual task results

Run         | Strict MRR | Strict % correct | Lenient MRR | Lenient % correct
alicex031ms | 0.3075     | 40.0             | 0.3208      | 43.5
alicex032ms | 0.2966     | 35.0             | 0.3175      | 38.5
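For reference, the MRR (Mean Reciprocal Rank) figures above reward correct answers found early in the three-answer ranking: each question contributes the reciprocal of the rank of its first correct answer, or 0 if none of the three responses is correct. A sketch of the computation (our own code):

```python
def mean_reciprocal_rank(first_correct_ranks):
    """Compute MRR from the 1-based rank of the first correct answer per
    question; None means no correct answer was returned for that question."""
    reciprocal = [1.0 / r if r is not None else 0.0 for r in first_correct_ranks]
    return sum(reciprocal) / len(reciprocal)

# Four questions: correct at rank 1, rank 2, not answered, rank 3.
print(round(mean_reciprocal_rank([1, 2, None, 3]), 4))  # → 0.4583
```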
The result analysis may not be as conclusive as we would desire, mainly due to the simplicity of our approach. Besides, the lack of the correct answers for the test questions at this moment does not allow us to perform a thorough error analysis. Anyway, the results obtained show that using the World Wide Web as an external resource increases the percentage of correct answers retrieved by five points. This fact confirms that QA system performance for languages other than English can be improved by using the Web as an external source of evidence.
4 Conclusions and future work

This work has to be seen as a first and simple attempt to perform QA in Spanish. Consequently, there are several areas of future work to be investigated. Among them, we can select the following ones:

– Question analysis. Since the same question can be formulated in very diverse forms (interrogative, affirmative, using different words and structures, ...), we need to study aspects such as recognising equivalent questions regardless of the speech act, or of the words, syntactic and semantic inter-relations or idiomatic forms employed.
– Answer taxonomy. An important part of the question interpretation process resides in the system's ability to relate questions with the characteristics of their respective answers. Consequently, we need to develop a broad answer taxonomy that enables multilingual answer type classification, probably using the EuroWordNet semantic net structure.
– Passage retrieval. An enhanced question analysis will improve passage retrieval performance by including question expansion techniques that enable retrieving passages that contain relevant information expressed with terms that are different from (but equivalent to) those used in the question formulation.
– Answer extraction. Integrating named-entity taggers. Using a broad answer taxonomy involves using tools capable of identifying the entity that a question expects as an answer. Therefore, we need to integrate named-entity tagging capabilities that allow narrowing down the number of candidates to be considered for answering a question.
Even though all these lines need to be investigated, it is important to remark that this investigation needs to be developed from a multilingual perspective. That is, future investigations need to address the detection and combination of language-dependent and language-independent modules, with the main long-term objective of developing a whole system capable of performing multilingual question answering.
References
1. Jordi Atserias, Josep Carmona, Irene Castellón, Sergi Cervell, Montse Civit, Lluís Màrquez, M.A. Martí, Lluís Padró, Roser Placer, Horacio Rodríguez, Mariona Taulé, and Jordi Turmo. Morphosyntactic Analysis and Parsing of Unrestricted Spanish Text. In Proceedings of the First International Conference on Language Resources and Evaluation (LREC'98), pages 1267-1272, Granada, Spain, 1998.
2. John Burger, Claire Cardie, Vinay Chaudhri, Robert Gaizauskas, Sanda Harabagiu, David Israel, Christian Jacquemin, Chin-Yew Lin, Steve Maiorano, George Miller, Dan Moldovan, Bill Ogden, John Prager, Ellen Riloff, Amit Singhal, Rohini Srihari, Tomek Strzalkowski, Ellen Voorhees, and Ralph Weischedel. Issues, Tasks and Program Structures to Roadmap Research in Question & Answering (Q&A). http://www-nlpir.nist.gov/projects/duc/papers/qa.Roadmap-paperv2.doc, 2000.
3. […] retrieval system at CLEF 2001. In Workshop of the Cross-Language Evaluation Forum (CLEF 2001), Lecture Notes in Computer Science, Darmstadt, Germany, 2001. Springer-Verlag.
4. José Luis Vicedo and Antonio Ferrández. A semantic approach to Question Answering systems. In Ninth Text REtrieval Conference, volume 500-249 of NIST Special Publication, pages 511-516, Gaithersburg, USA, November 2000. National Institute of Standards and Technology.
5. José Luis Vicedo, Antonio Ferrández, and Fernando Llopis. University of Alicante at TREC-10. In Tenth Text REtrieval Conference, volume 500-250 of NIST Special Publication, Gaithersburg, USA, November 2001. National Institute of Standards and Technology.
6. José Luis Vicedo, Fernando Llopis, and Antonio Ferrández. University of Alicante Experiments at TREC-2002. In Eleventh Text REtrieval Conference, volume 500-251 of NIST Special Publication, Gaithersburg, USA, November 2002. National Institute of Standards and Technology.