Sophie Rosset, Olivier Galibert, Gilles Adda, Eric Bilinski
Spoken Language Processing Group, LIMSI-CNRS, B.P. 133, 91403 Orsay cedex, France
{firstname.lastname}@limsi.fr
Abstract
In this paper, we present two different question-answering systems on speech transcripts
which participated in the QAst 2007 evaluation. Both systems are based on a complete and
multi-level analysis of both queries and documents. The first system uses handcrafted rules
for small text fragment (snippet) selection and answer extraction. The second one replaces
the handcrafting with an automatically generated research descriptor. A score based on those
descriptors is used to select documents and snippets. The extraction and scoring of candidate
answers is based on proximity measurements within the research descriptor elements and a
number of secondary factors. The evaluation results range from 17% to 39% accuracy depending
on the task.
Categories and Subject Descriptors
H.3 [Information Storage and Retrieval]: H.3.1 Content Analysis and Indexing; H.3.3 Information
Search and Retrieval; H.3.4 Systems and Software
General Terms
Measurement, Performance, Experimentation
Keywords
Question answering, speech transcriptions of meetings and lectures
1 Introduction
In the QA and Information Retrieval domains, progress has been demonstrated via evaluation
campaigns for both open and limited domains [1, 2, 3]. In these evaluations, systems are
presented with independent questions and should provide one answer extracted from textual
data to each question. Recently, there has been growing interest in extracting information from
multimedia data such as meetings and lectures. Spoken data differs from textual data in various
ways. The grammatical structure of spontaneous speech is quite different from written discourse
and includes various types of disfluencies. The lecture and interactive meeting data provided in
the QAst evaluation are particularly difficult due to run-on sentences and interruptions. Most
QA systems use a complete and heavy syntactic and semantic analysis of both the question and
the documents or snippets returned by a search engine, in which the answer has to be found. Such
analysis cannot reliably be performed on the data we are interested in. Typical textual QA systems
are composed of question analysis, information retrieval and answer extraction components [1, 4].
The answer extraction component is quite complex and involves natural language analysis, pattern
matching and sometimes even logical inference [5]. Most of these natural language tools are not
designed to handle spoken phenomena.
evaluation. Our QA systems are part of an interactive and bilingual (English and French) QA
system called Ritel [6] which specifically addressed speed issues. The following sections present
the pre-processing of documents and queries and the non-contextual analysis, which are common to
both systems. Section 3 describes the older system (System 1). Section 4 presents the new system
(System 2). Finally, Section 5 presents the results for these two systems on both development and
test data.
2 Analysis of documents and queries
Usually, the syntactic/semantic analysis is different for the document and for the query; our
approach is to perform the same complete and multilevel analysis on both queries and documents.
There are several reasons for this. First of all, the system has to deal with both transcribed speech
(transcriptions of meetings and lectures, user utterances) and text documents, so there should be a
common analysis that takes into account the specificities of both data types. Moreover, incorrect
analyses due to the lack of context or limitations of hand-coded rules are likely to happen on
both data types, so using the same strategy for document and utterance analysis helps to reduce
their negative impact. In order to use the same analysis module for all kinds of data, we must
transform the query and the documents, which may come from different modalities (text, manual
transcripts, automatic transcripts), so that they share a common representation of sentences,
words, etc. This process is called normalization.
2.1 Normalization
Normalization, in our application, is the process by which raw texts are converted to a text form
where words and numbers are unambiguously delimited, punctuation is separated from words,
and the text is split into sentence-like segments (or as close to sentences as is reasonably
possible). Different normalization steps are applied, depending on the kind of input data; these
steps are:
1. Separating words and numbers from punctuation.
2. Reconstructing correct case for the words.
3. Adding punctuation.
4. Splitting into sentences at period marks.
In the QAst evaluation, four data types are of interest:
• CHIL lectures [7] with manual transcriptions, where manual punctuation is separated from
  words. Only the splitting step is needed.
• CHIL lectures with automatic transcriptions [8]. Requires adding punctuation and splitting.
• AMI meetings [9] with manual transcriptions. The transcriptions had been textified, with
  punctuation joined to the words, first words of sentences upper-cased, etc. Requires all the
  steps except adding punctuation.
• AMI meetings with automatic transcriptions [10]. Lacking case, they require the last 3 steps.
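The mapping between data types and normalization steps can be sketched as follows. This is a minimal illustration, not the system's implementation: the step names, the `STEPS` table keys and the function names are assumptions, and steps 2 and 3 are left as pass-through placeholders because they rely on a language model.

```python
import re

# Steps needed per data type, following the four cases listed above.
# The keys and step names are illustrative, not taken from the paper.
STEPS = {
    "chil_manual": ["split"],
    "chil_auto": ["punctuate", "split"],
    "ami_manual": ["separate", "recase", "split"],
    "ami_auto": ["recase", "punctuate", "split"],
}

def separate(text):
    """Step 1: detach punctuation glued to the end of a word ("3.5" is left alone)."""
    spaced = re.sub(r"([.,;:!?])(?=\s|$)", r" \1", text)
    return " ".join(spaced.split())

def passthrough(text):
    """Placeholder for steps 2 and 3 (case and punctuation restoration)."""
    return text

def split(text):
    """Step 4: close a segment after each sentence-final mark standing as its own token."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        if token in {".", "?", "!"}:
            sentences.append(" ".join(current))
            current = []
    if current:
        sentences.append(" ".join(current))
    return sentences

def normalize(text, data_type):
    """Apply the steps required by `data_type`; every step list ends with the split."""
    funcs = {"separate": separate, "recase": passthrough, "punctuate": passthrough}
    for step in STEPS[data_type]:
        if step == "split":
            return split(text)
        text = funcs[step](text)
    return [text]
```

For instance, `normalize("It works, mostly. Is it 3.5 Hz?", "ami_manual")` goes through separation and splitting, while a `chil_manual` input is only split.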
Reconstructing the case and adding punctuation is done in the same process, based on a fully-
cased, punctuated language model [11]. A word graph was built covering all the possible variants
(all possible punctuations added between words, all possible word cases), and a 4-gram language
model was used to select the most probable hypothesis. The language model was estimated on
House of Commons Daily Debates, the final edition of the European Parliament Proceedings and
various newspaper archives. The final result, with uppercase only on proper nouns and words
The analysis is considered non-contextual because each sentence is processed in isolation. The
general objective of this analysis is to find the bits of information that may be of use for search
and extraction, which we call pertinent information chunks. These can be of different categories:
named entities, linguistic entities (e.g. verbs, prepositions), or specific entities (e.g. scores). All
words that do not fall into such chunks are automatically grouped into chunks via a longest-
match strategy. Some examples of pertinent information chunks are given in Figure 1. In the
following sections, the types of entities handled by the system are described, along with how they
are recognized.
_prep in _org NIST _NN metadata evaluations _verb reported _NN speaker tracking
_score error rates _aux are _prep about _val_score 15%
Figure 1: Examples of pertinent information chunks from the CHIL data collection
2.2.1 Definition of Entities
Following commonly adopted definitions, the named entities are expressions that denote locations,
people, companies, times, and monetary amounts. These entities have commonly known and
accepted names. For example, while the country France is a named entity, "capital of France" is
not. However, our experience is that the information present in the named entities is not
sufficient to analyze the wide range of user utterances that can be found in lecture or meeting
transcripts. Therefore we defined a set of specific entities in order to collect all observed
information expressions contained in a corpus of questions and texts from a variety of sources
(proceedings, transcripts of lectures, dialogs, etc.). Figure 2 summarizes the different entity
types that are used.
Type of entities          Examples
classical named entities  pers: Romano Prodi; Winston Churchill
                          prod: Pulp Fiction; Titanic
                          time: third century; 1998; June 30th
                          org: European Commission; NATO
                          loc: Cambridge; England
extended named entities   method: HMM, Gaussian mixture model
                          event: the 9th conference on speech communication and technology
                          amount: 500; two hundred and fifty thousand
                          measure: year; mile; Hertz
                          color: red, spring green
question markers          Qpers: who wrote...; who directed Titanic
                          Qloc: where is IBM
                          Qmeasure: what is the weight of the blue spoon headset
linguistic chunk          compound: language processing; information technology
                          verb: Roberto Martinez now knows the full size of the task
                          adj_comp: the microphones would be similar to ...
                          adj_sup: the biggest producer of cocoa of the world
Figure 2: Examples of the main entity types
2.2.2 Automatic detection of typed entities
The types we need to detect correspond to two levels of analysis: named-entity recognition and
coverage of all defined types and subtypes induced the need of a large number of occurrences,
and therefore rely on the availability of large annotated corpora which are difficult to build. Rule-
based approaches to named-entity recognition (e.g. [15]) rely on morphosyntactic and/or syntactic
analysis of the documents. However, in the present work, performing this sort of analysis is not
feasible: the speech transcriptions are too noisy to allow for both accurate and robust linguistic
analysis based on typical rules, and the processing time of most existing linguistic analyzers is
not compatible with the high speed we require.
We decided to tackle the problem with rules based on regular expressions on words, as in other
works [16]: we allow the use of lists for initial detection, and the definition of local contexts and
simple categorizations. The tool used to implement the rule-based automatic annotation system is
called Wmatch. This engine matches (and substitutes) regular expressions using words as the base
unit instead of characters. This property allows for a more readable syntax than traditional regular
expressions and enables the use of classes (lists of words) and macros (sub-expressions in-line in
a larger expression). Wmatch also includes NLP-oriented features like strategies for prioritizing
rule application, recursive substitution modes, word tagging (for tags like noun, verb...), and word
categories (number, acronym, proper name...). It has multiple input and output formats, including
an XML-based one for interoperability and to allow chaining of instances of the tool with different
rule sets. Rules are pre-analyzed and optimized in several ways, and stored in a compact format in
order to speed up the process. Analysis is multi-pass, and subsequent rule applications operate
on the results of previous rule applications, which can be enriched or modified. The full analysis
comprises some 50 steps and takes roughly 4 ms on a typical user utterance (or document sentence).
The analysis provides 96 different types of entities. Figure 3 shows an example of the analysis on
a query (top) and on a transcription (bottom).
<_Qorg> which organization </_Qorg> <_action> provided </_action> <_det> a </_det>
<_NN> significant amount </_NN> <_prep> of </_prep> <_NN> training data </_NN> <_punct> ? </_punct>

<_pro> it </_pro> <_verb> 's </_verb> <_adv> just </_adv> <_prep_comp> sort of </_prep_comp>
<_det> a </_det> <_NN> very pale </_NN> <_color> blue </_color> <_conj> and </_conj>
<_det> a </_det> <_adj> light-up </_adj> <_color> yellow </_color> <_punct> . </_punct>

Figure 3: Example annotation of a query: "which organization provided a significant amount of
training data?" (top) and of a transcription: "it's just sort of a very pale blue" (bottom).
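To make the idea of regular expressions over words (rather than characters) more concrete, here is a toy matcher. It is only a sketch of the principle and does not reproduce Wmatch's rule syntax or features: the `&CLASS` notation, the class lists and the rules are all hypothetical.

```python
# Hypothetical word classes (lists of words), in the spirit of the classes described above.
CLASSES = {
    "COLOR": {"blue", "yellow", "red", "green"},
    "DET": {"a", "an", "the"},
}

# Hypothetical rules: (tag, pattern). A pattern item is a literal word or "&CLASS".
# Rules are tried in order, which gives a crude form of rule prioritization.
RULES = [
    ("_Qorg", ["which", "organization"]),
    ("_color", ["&COLOR"]),
    ("_det", ["&DET"]),
]

def item_matches(item, word):
    """A pattern item matches one whole word, never a substring."""
    if item.startswith("&"):
        return word.lower() in CLASSES[item[1:]]
    return word.lower() == item

def annotate(words):
    """Scan left to right; at each position apply the first rule that matches."""
    out, i = [], 0
    while i < len(words):
        for tag, pattern in RULES:
            span = words[i:i + len(pattern)]
            if len(span) == len(pattern) and all(
                item_matches(p, w) for p, w in zip(pattern, span)
            ):
                out.append((tag, " ".join(span)))
                i += len(pattern)
                break
        else:
            # No rule matched: keep the word untagged and move on.
            out.append((None, words[i]))
            i += 1
    return out
```

Calling `annotate("which organization provided a pale blue".split())` tags the question marker, the determiner and the color, and leaves the other words untagged. A multi-pass analysis would feed such output to further rule sets.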
3 Question-Answering System 1
The Question-Answering system handles search in documents of any type (news articles, web
documents, transcribed broadcast news, etc.). For speed reasons, the documents are all available
locally and preprocessed: they are first normalized, and then analyzed with the NCA
(non-contextual analysis) module. The (type, values) pairs are then managed by a specialized
indexer for quick search and retrieval. This somewhat bag-of-typed-words system [6] works in
three steps:
1. Document query list creation. Using the entities found in the question, we generate
   a document query and an ordered list of handcrafted back-off queries. These queries are
   obtained by relaxing some of the constraints on the presence of the entities, using a relative
   importance ordering (Named entity > NN > adj_comp > action > subs ...)
   and stop as soon as we get document snippets (sentences or small groups of consecutive
   sentences) back.
3. Answer extraction and selection: the expected answer type has been determined
   beforehand from the question, using Question Marker, Named, Non-specific and Extended
   Entities co-occurrences (_Qwho → _pers or _pers_def or _org). Therefore, we select the
   entities in the snippets with the expected type of the answer. Finally, a clustering of the
   candidate answers is done, based on frequencies. The most frequent answer wins, and the
   distribution of the counts gives an idea of the confidence of the system in the answer.
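The back-off idea of step 1 and the frequency-based selection of step 3 can be sketched as follows. This is a simplified illustration under stated assumptions: System 1's back-off lists were handcrafted, whereas this sketch derives them mechanically by dropping the least important entity first; the type labels in `IMPORTANCE` and the confidence measure (share of the counts) are also assumptions.

```python
from collections import Counter

# Relative importance ordering quoted in step 1 (most important first).
# The label "named_entity" is an illustrative stand-in for all named-entity types.
IMPORTANCE = ["named_entity", "NN", "adj_comp", "action", "subs"]

def backoff_queries(entities):
    """Yield queries from strictest to most relaxed.

    `entities` is a list of (type, value) pairs found in the question.
    Each successive query drops the least important remaining entity.
    """
    ranked = sorted(entities, key=lambda e: IMPORTANCE.index(e[0]))
    for size in range(len(ranked), 0, -1):
        yield ranked[:size]

def select_answer(candidates):
    """Most frequent candidate wins; its share of the counts is a rough confidence."""
    counts = Counter(candidates)
    answer, freq = counts.most_common(1)[0]
    return answer, freq / sum(counts.values())
```

A caller would submit the queries in order and stop at the first one that returns snippets, then pass the typed entities found in those snippets to `select_answer`.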
4 Question-Answering System 2
System 1 has three main problems:
• The back-off query lists require a large amount of maintenance work and will never cover
  all of the combinations of entities which may be found in the questions.
• The answer selection uses only frequencies of occurrence, often ending up with lists of
  first-rank candidate answers with the same score.
• The system answering speed directly depends on the number of snippets to retrieve, which
  may sometimes be very large. Limiting the number of snippets is not easy, as they are not
  ranked according to pertinence.
A new system, System 2, has been designed to solve these problems. We have kept the three steps
described in Section 3, with some major changes. In step 1, instead of instantiating document
queries from a large number of preexisting handcrafted rules (about 5000), we generate a research
descriptor using a very small set of rules (about 10); this descriptor contains all the needed
information about the entities and the answer types, together with weights. In step 2, a score is
calculated from the proximity between the research descriptor and the documents and snippets, in
order to choose the most relevant ones. In step 3, the answer is selected according to a score which
takes into account many different features and tuning parameters, which allow an automatic and
efficient adaptation.
4.1 Research Descriptor generation
The first step of System 2 is to build a research descriptor (data descriptor record, DDR) which
contains the important elements of the question, and the possible answer types with associated
weights. Some elements are marked as critical, which makes them mandatory in future steps, while
others are secondary. The element extraction and weighting is based on an empirical classification
of the element types in importance levels. Answer types are predicted through rules based on
combinations of elements of the question. Figure 4 shows an example of a DDR.
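One possible in-memory representation of a DDR is sketched below, populated by hand with the values of Figure 4. The class and field names are assumptions made for illustration; the paper does not specify how the descriptor is stored or which rules produce it.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Element:
    weight: float
    critical: bool  # True: mandatory in later steps; False: secondary
    type: str
    value: str

@dataclass
class AnswerType:
    weight: float
    type: str

@dataclass
class DDR:
    elements: List[Element] = field(default_factory=list)
    answer_types: List[AnswerType] = field(default_factory=list)

    def critical_elements(self) -> List[Element]:
        """Elements that must be present in a document or snippet."""
        return [e for e in self.elements if e.critical]

    def best_answer_types(self) -> List[str]:
        """Answer types carrying the highest weight."""
        top = max(a.weight for a in self.answer_types)
        return [a.type for a in self.answer_types if a.weight == top]

# DDR for "in which company Bart works as a project manager", copied from Figure 4.
ddr = DDR(
    elements=[
        Element(1.0, True, "pers", "Bart"),
        Element(1.0, True, "NN", "project manager"),
        Element(1.0, False, "action", "works"),
    ],
    answer_types=[
        AnswerType(1.0, "orgof"),
        AnswerType(1.0, "organisation"),
        AnswerType(0.3, "loc"),
        AnswerType(0.1, "acronym"),
        AnswerType(0.1, "np"),
    ],
)
```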
4.2 Documents and snippets selection and scoring
Each document is scored with the geometric mean of the number of occurrences of all the DDR
elements which appear in it. Using a geometric mean prevents rescaling problems due to
some elements being naturally more frequent. The documents are sorted by score and the n-best
ones are kept. The speed of the entire system can be controlled by choosing n, the whole system
being in practice io-bound rather than cpu-bound.
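The document scoring described above can be sketched as follows. This is a simplified reading, with assumptions: documents are plain token lists, DDR elements are single tokens, element weights are ignored, and only the elements that actually occur in the document enter the mean.

```python
import math

def geometric_mean_score(doc_words, ddr_values):
    """Geometric mean of the occurrence counts of the DDR elements present in the document."""
    counts = [doc_words.count(v) for v in ddr_values]
    present = [c for c in counts if c > 0]
    if not present:
        return 0.0
    # exp(mean(log(c))) is the geometric mean of the positive counts.
    return math.exp(sum(math.log(c) for c in present) / len(present))

def n_best(docs, ddr_values, n):
    """Sort documents by decreasing score and keep the n best ones."""
    ranked = sorted(docs, key=lambda d: geometric_mean_score(d, ddr_values), reverse=True)
    return ranked[:n]
```

With counts of 2 and 4 for two elements, the score is the square root of 8, so a single very frequent element cannot dominate the way it would in an arithmetic mean or a sum.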
The selected documents are then loaded and all the lines in a predefined window (2-10 lines
depending on question types) from the critical elements are kept, creating snippets. Each snippet
is scored using the geometrical mean of the number of occurrences of all the DDR elements which
question: in which company Bart works as a project manager ?
ddr:
{ w=1, critical, pers, Bart },
{ w=1, critical, NN, project manager },
{ w=1, secondary, action, works },
answer_type = {
  { w=1.0, type=orgof },
  { w=1.0, type=organisation },
  { w=0.3, type=loc },
  { w=0.1, type=acronym },
  { w=0.1, type=np },
}
Figure 4: Example of a DDR constructed from the question "in which company Bart works as a
project manager"; each element contains a weight w, its importance for future steps, and the pair
(type, value); each possible answer type contains a weight w and the type of the answer.
4.3 Answer extraction, scoring and clustering
In each snippet, all the elements whose type is one of the predicted possible answer types are
candidate answers. We associate to each candidate answer $A$ a score $S(A)$:

\[
S(A) = \frac{\left[ w(A) \sum_{E} \max_{e=E} \frac{w(E)}{(1+d(e,A))^{\alpha}} \right]^{1-\gamma} \times S_{snip}^{\gamma}}{C_d(A)^{\beta} \, C_s(A)^{\delta}} \qquad (1)
\]

In which:
• $d(e, A)$ is the distance to each element $e$ of the snippet, instantiating a search element $E$
  of the DDR
• $C_s(A)$ is the number of occurrences of $A$ in the extracted snippets, $C_d(A)$ in the whole
  document collection
• $S_{snip}$ is the extracted snippet score (see 4.2)
• $w(A)$ is the weight of the answer type and $w(E)$ the weight of the element $E$ in the DDR
• $\alpha$, $\beta$, $\gamma$ and $\delta$ are tuning parameters estimated by systematic trials on
  the development data. $\alpha, \beta, \gamma \in [0, 1]$ and $\delta \in [-1, 1]$
An intuitive explanation of the formula is that each element of the DDR adds to the score of the
candidate ($\sum_{E}$) proportionally to its weight ($w(E)$) and inversely proportionally to its
distance from the candidate ($d(e, A)$). If multiple instances of the element are found in the
snippet, only the best one is kept ($\max_{e=E}$). The score is then smoothed with the snippet
score ($S_{snip}$) and compensated in part with the candidate frequency in all the documents
($C_d$) and in the snippets ($C_s$).
The scores for identical (type, value) pairs are added together and give the final scoring for all
the possible candidate answers.
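A direct transcription of Equation (1) and of the final clustering step is sketched below. The function names and the way distances are passed in are assumptions made for illustration, and the parameter values used in any example are arbitrary, not the tuned ones.

```python
from collections import defaultdict

def candidate_score(w_a, elements, s_snip, c_d, c_s, alpha, beta, gamma, delta):
    """Score of one candidate answer A following Equation (1).

    `elements` maps each DDR element E found in the snippet to a pair
    (w_E, distances), where distances lists d(e, A) for every instance e of E.
    """
    # Each element contributes its best (closest) instance only.
    proximity = sum(
        max(w_e / (1 + d) ** alpha for d in distances)
        for w_e, distances in elements.values()
        if distances
    )
    numerator = (w_a * proximity) ** (1 - gamma) * s_snip ** gamma
    return numerator / (c_d ** beta * c_s ** delta)

def cluster(scored_candidates):
    """Add up the scores of identical (type, value) pairs."""
    totals = defaultdict(float)
    for pair, score in scored_candidates:
        totals[pair] += score
    return dict(totals)
```

With gamma set to 0 the snippet score has no influence, and with beta and delta set to 0 the frequency compensation is switched off, which makes the proximity term easy to check by hand: two elements of weight 1 at distances {1, 3} and {0} with alpha = 1 give 1/2 + 1 = 1.5.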
5 Evaluation
In this section, we present the results obtained in the four tasks. The T1 and T2 tasks were
composed of an identical set of 98 questions; the T3 task was composed of a different set of 96
questions and the T4 task of a subset of 93 questions. Table 1 shows the overall results with the
3 measures used in this evaluation. We submitted two runs, one for each system, for each of the
four tasks. As required by the evaluation procedure, a maximum of 5 answers per question was
provided.
Globally, we can see that System 2 gets better results than System 1. The improvement of the
Task  System  Acc.    MRR   Recall
T1    Sys1    32.6%   0.37  43.8%
      Sys2    39.7%   0.46  57.1%
T2    Sys1    20.4%   0.23  28.5%
      Sys2    21.4%   0.24  28.5%
T3    Sys1    26.0%   0.28  32.2%
      Sys2    26.0%   0.31  41.6%
T4    Sys1    18.3%   0.19  22.6%
      Sys2    17.2%   0.19  22.6%
Table 1: General results. Sys1: System 1; Sys2: System 2; Acc. is the accuracy, MRR is the Mean
Reciprocal Rank and Recall the total number of correct answers in the 5 returned answers
of document/snippet queries greatly improves the coverage as compared to handcrafted rules.
System 2 did not perform better than System 1 on the T2 task. Further analysis is needed to
understand why.
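The three measures reported in Table 1 can be computed along the following lines. This is a sketch under an assumption: Recall is read here as the share of questions with at least one correct answer among the (at most 5) returned answers. The function name and the input layout are illustrative and are not those of the official scoring tool.

```python
def evaluate(ranked_answers, gold):
    """Compute accuracy, MRR and recall over a set of questions.

    ranked_answers: per question, a list of up to 5 answers, best first.
    gold: per question, the set of answers accepted as correct.
    """
    n = len(gold)
    top1 = found = 0
    reciprocal_ranks = 0.0
    for answers, correct in zip(ranked_answers, gold):
        ranks = [i for i, a in enumerate(answers[:5], start=1) if a in correct]
        if ranks:
            found += 1                         # a correct answer is among those returned
            reciprocal_ranks += 1 / ranks[0]   # rank of the first correct answer
            top1 += ranks[0] == 1              # correct answer in first position
    return {"accuracy": top1 / n, "mrr": reciprocal_ranks / n, "recall": found / n}
```

Accuracy only rewards a correct first answer, MRR gives partial credit for correct answers at lower ranks, and Recall ignores the rank altogether, which is why the three values are always ordered Acc. ≤ MRR ≤ Recall in the tables.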
The different modules we can evaluate are the analysis module, the passage retrieval and the
answer extraction. The passage retrieval is easier to evaluate for System 2 because it is a
completely separate module, which is not the case in System 1. Table 2 gives the results on the
passage retrieval in two conditions: with the number of passages limited to 5, and without
limitation. The difference between the Recall on the snippets (how often the answer is present
in the selected snippets) and the QA Accuracy shows that the extraction and the scoring of the
answer have a reasonable margin for improvement. The difference between the snippet Recall and
its Accuracy (from 26 to 38% for the no-limit condition) illustrates that the snippet scoring can
be improved.
      Passage limit = 5         Passage without limit
Task  Acc.    MRR   Recall      Acc.    MRR   Recall
T1    44.9%   0.52  67.3%       44.9%   0.53  71.4%
T2    29.6%   0.36  46.9%       29.6%   0.37  57.0%
T3    30.2%   0.37  47.9%       30.2%   0.38  68.8%
T4    18.3%   0.22  31.2%       18.3%   0.24  51.6%
Table 2: Results for Passage Retrieval for System 2. Passage limit = 5: the maximum number of
passages is 5; Passage without limit: there is no limit on the number of passages; Acc. is the
accuracy, MRR is the Mean Reciprocal Rank and Recall the total number of correct answers in the
returned answers
One of the key uses of the analysis results is routing the question, that is, determining a rough
class for the type of the answer (language, location, ...). The results of the routing component are
given in Table 3, with details by answer category. Two questions of T1/T2 and three of T3/T4
were not routed.
We observed large differences with the results obtained on the development data, in particular
with the method, color and time categories. The analysis module has been built on corpus
observations and it seems to be too dependent on the development data. That can explain the
absence of major differences between System 1 and System 2 for the T1/T2 tasks. Most of the
wrongly routed questions have been routed to the generic answer type class. In System 1 this
class selects specific entities (method, models, system, language...) over the other entity types for
the possible answers. In System 2 no such adaptation to the task has been done and all possible
entity types have equal priority.
                    All   LAN   LOC   MEA   MET   ORG   PER
T1/T2  % Correct    72%   100%  89%   75%   17%   95%   89%
       # Questions  98    4     9     28    18    20    9
T3/T4  % Correct    80%   100%  93%   83%   -     85%   80%
       # Questions  96    2     14    12    -     13    15

                    TIM   SHA   COL   MAT
T1/T2  % Correct    80%   -     -     -
       # Questions  10    -     -     -
T3/T4  % Correct    71%   89%   73%   50%
       # Questions  14    9     11    6
Table 3: Routing evaluation. All: all questions; LAN: language; LOC: location; MEA: measure;
MET: method/system; ORG: organization; PER: person; TIM: time; SHA: shape; COL: colour.
6 Conclusion and future work
We presented the Question Answering systems used for our participation in the QAst evaluation.
Two different systems have been used for this participation. The two main changes between
System 1 and System 2 are the replacement of the large set of hand-made rules by the automatic
generation of a research descriptor, and the addition of an efficient scoring of the candidate
answers. The results show that System 2 outperforms System 1. The main reasons are:
1. Better genericity through the use of a kind of expert system to generate the research
   descriptors.
2. More pertinent answer scoring using proximities, which allows a smoothing of the results.
3. Presence of various tuning parameters, which enable the adaptation of the system to the
   various question and document types.
These systems have been evaluated on different data corresponding to different tasks. On
the manually transcribed lectures, the best result is 39% for Accuracy; on manually transcribed
meetings, 24% for Accuracy. There was no specific effort made on the automatically transcribed
lectures and meetings, so the performances only give an idea of what can be done without trying to
handle speech recognition errors. The best result is 18.3% on meetings and 21.3% on lectures. From
the analysis presented in the previous section, performance can be improved at every step. For
example, the analysis and routing component can be improved in order to better take into account
some types of questions, which should improve the answer typing and extraction. The scoring of the
snippets and the candidate answers can also be improved. In particular, some tuning parameters
(like the weight of the transformations generated in the DDR) have not been optimized yet.
7 Acknowledgments
This work was partially funded by the European Commission under the FP6 Integrated Project
IP 506909 Chil and the LIMSI AI/ASP Ritel grant.
References
[1] E. M. Voorhees, L. P. Buckland. The Fifteenth Text REtrieval Conference Proceedings (TREC
2006), In Voorhees and Buckland eds. 2006.
Sacaleanu, P. Rocha, R. Sutcliffe. Overview of the CLEF 2006 Multilingual Question Answering
Track. Working Notes for the CLEF 2006 Workshop. 2006.
[3] C. Ayache, B. Grau, A. Vilnat. Evaluation of question-answering systems: The French EQueR-
EVALDA Evaluation Campaign. Proceedings of LREC'06, Genoa, Italy.
[4] S. Harabagiu and D. Moldovan. Question-Answering. In The Oxford Handbook of Computa-
tional Linguistics. R. Mitkov (Eds). Oxford University Press. 2003.
[5] S. Harabagiu, A. Hickl. Methods for using textual entailment in Open-Domain question-
answering. Proceedings of COLING'06. Sydney, Australia. July 2006.
[6] B. van Schooten, S. Rosset, O. Galibert, A. Max, R. op den Akker, G. Illouz. Handling speech
input in the Ritel QA dialogue system. Proceedings of Interspeech'07. Antwerp, Belgium.
August 2007.
[7] CHIL Project. http://chil.server.de
[8] L. Lamel, G. Adda, E. Bilinski, and J.-L. Gauvain. Transcribing Lectures and Seminars. In
InterSpeech, Lisbon, September 2005.
[9] AMI project. http://www.amiproject.org
[10] T. Hain, L. Burget, J. Dines, G. Garau, M. Karafiat, M. Lincoln, J. Vepa, and V. Wan.
The AMI Meeting Transcription System: Progress and Performance. Rich Transcription 2006
Spring (RT06s) Meeting Recognition Evaluation. 3 May 2006, Bethesda, Maryland, USA.
[11] D. Déchelotte, H. Schwenk, G. Adda, J.-L. Gauvain. Improved Machine Translation of Speech-
to-Text outputs. Proceedings of Interspeech'07. Antwerp, Belgium. August 2007.
[12] D. M. Bikel, S. Miller, R. Schwartz, R. Weischedel. Nymble: a high-performance learning
name-finder. Proceedings of ANLP'97, Washington, USA, 1997.
[13] H. Isozaki, H. Kazawa. Efficient Support Vector Classifiers for Named Entity Recognition.
Proceedings of COLING, Taipei. 2002.
[14] M. Surdeanu, J. Turmo, E. Comelles. Named Entity Recognition from spontaneous Open-
Domain Speech. Proceedings of InterSpeech'05, Lisbon, Portugal. 2005.
[15] F. Wolinski, F. Vichot, B. Dillet. Automatic Processing of Proper Names in Texts. Proceedings
of EACL'95, Dublin, Ireland. 1995.
[16] S. Sekine. Definition, dictionaries and tagger of Extended Named Entity hierarchy. Proceed-
ings of LREC'04, Lisbon, Portugal. 2004.