
BabelDr: A Web Platform for Rapid Construction of Phrasebook-Style Medical Speech Translation Applications



BOUILLON, Pierrette, et al. BabelDr: A Web Platform for Rapid Construction of Phrasebook-Style Medical Speech Translation Applications. In: 19th annual conference of the European Association for Machine Translation (EAMT), Riga (Latvia), 2016.

Available at: http://archive-ouverte.unige.ch/unige:83843



The application

Examination tool for medical subdomains: semantic coverage consists of a prespecified set of questions, but users can use a wide variety of surface forms when speaking to the system.

Translation steps:

• Rule-based recognition of the source question (Nuance)

Est-ce que vous avez de la température ? (Do you have a temperature?)

• Rule-based translation of the source question to the source canonical form with a synchronized CFG

Avez-vous de la fièvre ? (Do you have a fever?)

• Human translation from the source canonical form to the target sentence

ረስኒ ኣሎካዶ?

• Data will be used to build a robust recognizer and a classifier
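The translation steps can be pictured as two successive lookups. The following is a minimal sketch only: the dictionaries stand in for the synchronized CFG and the translators' work, and the names `SURFACE_TO_CANONICAL`, `CANONICAL_TO_TARGET`, `normalise` and `translate` are hypothetical, not part of BabelDr.

```python
from typing import Optional

# Hypothetical stand-in for the synchronized CFG: each recognised surface
# form is mapped to the canonical question it expresses.
SURFACE_TO_CANONICAL = {
    "est-ce que vous avez de la température ?": "Avez-vous de la fièvre ?",
    "avez-vous de la fièvre ?": "Avez-vous de la fièvre ?",
}

# Human translations, keyed by (canonical question, target language).
CANONICAL_TO_TARGET = {
    ("Avez-vous de la fièvre ?", "tigrinya"): "ረስኒ ኣሎካዶ?",
}


def normalise(utterance: str) -> str:
    """Lower-case the utterance and collapse runs of whitespace."""
    return " ".join(utterance.lower().split())


def translate(recognised: str, target_language: str) -> Optional[str]:
    """Map a recognised utterance to its canonical form, then look up the
    human translation. Returns None when the utterance is out of coverage."""
    canonical = SURFACE_TO_CANONICAL.get(normalise(recognised))
    if canonical is None:
        return None
    return CANONICAL_TO_TARGET.get((canonical, target_language))


print(translate("Est-ce que vous avez de la température ?", "tigrinya"))
```

Because only the canonical form is ever translated, an out-of-coverage utterance yields no output rather than an unreliable one.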

Development platform and steps

Developed with Regulus Lite

• The set of canonical questions is defined by HUG

• The questions are transformed into synchronized CFG rules by experts (UMLS, etc.)

Utterance
Source ?$votre_douleur $irradie_elle $$vers_organe
Source ?(est-ce (que|qu')) $votre_douleur $irradie $$vers_organe
Source ?$vos_douleurs $irradient_elles $$vers_organe
Source ?(est-ce (que|qu')) $vos_douleurs $irradient $$vers_organe
Source $$vers_organe
Source ($avez_vous | $ça_fait | $ressentez_vous | $souffrez_vous de) $mal_de_ventre $$vers_organe
Target/french la douleur au ventre irradie-t-elle $$vers_organe ?
EndUtterance

• The canonical questions are translated by translators

Utterance
Target/french la douleur au ventre irradie-t-elle $$vers_organe:1 ?
Target/tigrinya እቲ ናይ ከብዲ ቃንዛ ፋሕዶ ይብል ናብ $$vers_organe:1 ?
EndUtterance

• The system is compiled on the web
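To make the Source notation concrete, the sketch below expands one Source pattern into the surface strings it covers. It rests on assumptions read off the examples rather than on the Regulus Lite specification: `?` is taken to mark the following element as optional, `( … | … )` to group alternatives, `$name` to refer to a phrase class, and `$$name` to be a slot that is left unexpanded. The contents of `PHRASES` are invented for illustration, and the sketch handles Source lines only (the literal question mark on Target lines is not supported).

```python
import re

# Hypothetical phrase classes; the real grammar defines its own variants.
PHRASES = {
    "$votre_douleur": ["votre douleur", "la douleur"],
    "$irradie": ["irradie", "se propage"],
}

TOKEN = re.compile(r"\?|\(|\)|\||[^\s()|?]+")


def join(left: str, right: str) -> str:
    """Concatenate two fragments, tolerating empty ones."""
    return f"{left} {right}".strip()


def parse_element(tokens, pos):
    """Expand one element; return (list of strings, next position)."""
    tok = tokens[pos]
    if tok == "?":  # optional: the next element may be absent
        inner, pos = parse_element(tokens, pos + 1)
        return [""] + inner, pos
    if tok == "(":  # group: expand up to the matching ')'
        inner, pos = parse_sequence(tokens, pos + 1)
        return inner, pos + 1
    if tok.startswith("$") and not tok.startswith("$$"):
        return list(PHRASES[tok]), pos + 1  # phrase class
    return [tok], pos + 1  # literal word or $$slot, kept as is


def parse_sequence(tokens, pos):
    """Expand '|'-separated alternatives until ')' or the end of input."""
    alternatives, current = [], [""]
    while pos < len(tokens) and tokens[pos] != ")":
        if tokens[pos] == "|":
            alternatives.extend(current)
            current = [""]
            pos += 1
        else:
            options, pos = parse_element(tokens, pos)
            current = [join(a, b) for a in current for b in options]
    return alternatives + current, pos


def expand(pattern: str):
    """Return every surface string covered by a Source pattern."""
    expansions, _ = parse_sequence(TOKEN.findall(pattern), 0)
    return expansions


for s in expand("?(est-ce (que|qu')) $votre_douleur $irradie $$vers_organe"):
    print(s)
```

With these invented phrase classes the pattern yields 12 strings (3 openings × 2 × 2), some of them ungrammatical, such as "est-ce qu' votre douleur …". Over-generation of this kind is harmless for recognition, since all variants map to the same canonical question.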

Content

 3000 questions collected for now

 6 domains

 4 languages in development

 Size of the 6 domains:

abdominal poitrine tête accueil suivi habitudes

words 1’642 1’572 1’734 684 881 684

expansions

for source 8’000’000 6’000’000 16’000’000 2’277 236’000 2’514 number of

words in canonical patterns

620 551 604 110 217 66

canonical

patterns 679 498 629 50 159 22
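The "expansions for source" figures are far larger than the vocabulary because optional elements and alternatives multiply. As an illustrative sketch (not the platform's own counting method), the number of expansions of a rule can be computed without enumerating them: a sequence multiplies the counts of its parts, a set of alternatives adds them, and an optional element adds one for the empty case. The tuple encoding, the `count_expansions` name and the example phrases are assumptions, and the count is exact only when no two alternatives produce the same string.

```python
from math import prod


def count_expansions(node) -> int:
    """Count the surface strings generated by a rule.

    A node is either a literal string or a tuple whose first item is
    "seq" (sequence), "alt" (alternatives) or "opt" (optional element).
    """
    if isinstance(node, str):
        return 1
    kind, *children = node
    if kind == "seq":
        return prod(count_expansions(c) for c in children)
    if kind == "alt":
        return sum(count_expansions(c) for c in children)
    if kind == "opt":
        return 1 + count_expansions(children[0])
    raise ValueError(f"unknown node kind: {kind!r}")


# Hypothetical fillers for the organ slot.
ORGANS = ("vers le dos", "vers l'épaule", "vers la poitrine")

RULE = (
    "seq",
    ("opt", ("seq", "est-ce", ("alt", "que", "qu'"))),  # 1 + 2 = 3
    ("alt", "votre douleur", "la douleur"),             # 2
    ("alt", "irradie", "se propage"),                   # 2
    ("alt", *ORGANS),                                   # 3
)

print(count_expansions(RULE))  # 3 * 2 * 2 * 3 = 36
```

A single short rule already covers 36 strings here; with larger phrase classes and several hundred canonical patterns, totals in the millions follow quickly.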

Pierrette Bouillon (FTI), Hervé Spechbach (HUG), Sophie Durieux-Paillard (HUG), Johanna Gerlach (FTI), Sonia Halimi (FTI), Patricia Hudelson (HUG), Manny Rayner (FTI), Irène Strasly (FTI), Nikos Tsourakis (FTI)

Context

Financed by: «Fondation privée des Hôpitaux universitaires de Genève» and Innogap/Mimosa grants

Solution: BabelDr

• Joint UNIGE/FTI-HUG project

• A web-based platform for building medium-size speech translators in medical domains

• Reliable, easy to add new languages, easy to deploy

52% of patients at HUG (Geneva University Hospitals) are foreign; 12% don't speak French

Google Translate is not a solution:

• not reliable and precise enough

• most important languages for HUG not covered (Tigrinya, Romani, etc.)

Figure: 2015 asylum applications in Switzerland by country of origin (total: 39’523)

Eritrea 25%, Afghanistan 20%, Syria 12%, Iraq 6%, Sri Lanka 5%, Somalia 3%, Nigeria 2%, Gambia 2%, Iran 2%, Ethiopia 2%, Others 21%

Source: State Secretariat for Migration (SEM), 2016, www.sem.admin.ch
