Semantic Context Model for Efficient Speech Recognition

ISAE-SUPAERO Conference paper

The 1st International Conference on Cognitive Aircraft Systems – ICCAS

March 18-19, 2020

https://events.isae-supaero.fr/event/2

Scientific Committee

Mickaël Causse, ISAE-SUPAERO

Caroline Chanel, ISAE-SUPAERO

Jean-Charles Chaudemar, ISAE-SUPAERO

Stéphane Durand, Dassault Aviation

Bruno Patin, Dassault Aviation

Nicolas Devaux, Dassault Aviation

Jean-Louis Gueneau, Dassault Aviation

Claudine Mélan, Université Toulouse Jean-Jaurès

Jean-Paul Imbert, ENAC

Permanent link :

https://doi.org/10.34849/cfsb-t270

Rights / License:

Semantic Context Model for Efficient Speech Recognition

Content

Introduction

An automatic speech recognition (ASR) system contains three main parts: an acoustic model, a lexicon and a language model. ASR in noisy environments remains challenging because the acoustic information is unreliable, which decreases recognition accuracy. A better language model yields only limited improvement, since it mainly models local syntactic information. In this paper, we propose a new semantic model that takes long-term semantic context information into account in order to remove the acoustic ambiguities of noisy ASR.

Recent developments in natural language processing have led to renewed interest in distributional semantics. Word embeddings (WE), such as those of T. Mikolov [Mikolov2013] or the BERT model [Devlin2018], take into account the semantic contexts of words and have been shown to be effective for several natural language processing tasks. The efficiency and semantic properties of these representations motivate us to explore WE for our task. Our ASR system is therefore supplemented by a semantic context analysis module that detects poorly recognized words and proposes new words of similar pronunciation that better fit the context. This semantic analysis rescores the N-best transcription hypotheses and can be seen as a form of dynamic adaptation to the specific context of noisy data.
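The idea of detecting a poorly recognized word from its semantic context can be illustrated with a minimal sketch. It is not the authors' module: the toy three-dimensional vectors, the mean cosine similarity measure and the function names (`context_score`, `least_coherent_word`) are assumptions made for illustration, and a real system would use pretrained embeddings.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only;
# a real system would load pretrained vectors (e.g. word2vec or BERT).
EMBEDDINGS = {
    "pilot":   [0.90, 0.10, 0.00],
    "landing": [0.80, 0.20, 0.10],
    "runway":  [0.85, 0.15, 0.05],
    "banana":  [0.00, 0.10, 0.90],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def context_score(word, context):
    """Mean cosine similarity between `word` and the other known context words."""
    sims = [cosine(EMBEDDINGS[word], EMBEDDINGS[c])
            for c in context if c != word and c in EMBEDDINGS]
    return sum(sims) / len(sims) if sims else 0.0

def least_coherent_word(sentence):
    """Return the known word that fits its context worst, or None if no word is known."""
    known = [w for w in sentence if w in EMBEDDINGS]
    if not known:
        return None
    return min(known, key=lambda w: context_score(w, known))
```

Under these assumptions, a word flagged by `least_coherent_word` would be the candidate for replacement by a similarly pronounced word that scores higher in the same context.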

Proposed methodology

An effective way to take semantic information into account is to rescore the N-best hypotheses of the ASR system. For each word w of a hypothesis sentence, the recognition system provides an acoustic score p_acc(w) and a linguistic score p_ml(w). The best sentence is the one that maximizes the probability of the word sequence:

$$\hat{W} = \arg\max_{W} \prod_{w \in W} p_{acc}(w)\, p_{ml}(w)$$
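The selection of the best sentence from the N-best list can be sketched as follows. This is a minimal illustration under stated assumptions: the sentence probability is taken as the product of the per-word acoustic and linguistic scores, computed as a sum of logarithms for numerical stability, and the optional `semantic_score` callback with its weight `gamma` is a hypothetical way of adding a semantic term, not the authors' model.

```python
import math

def hypothesis_log_score(words, semantic_score=None, gamma=1.0):
    """Log-score of one hypothesis.

    words: list of (word, p_acc, p_ml) tuples with probabilities in (0, 1].
    semantic_score: optional function mapping a list of words to a
        probability-like value in (0, 1] (hypothetical, for illustration).
    gamma: weight of the semantic term (an assumption of this sketch).
    """
    total = sum(math.log(p_acc) + math.log(p_ml) for _, p_acc, p_ml in words)
    if semantic_score is not None:
        total += gamma * math.log(semantic_score([w for w, _, _ in words]))
    return total

def best_hypothesis(n_best, semantic_score=None, gamma=1.0):
    """Return the hypothesis of the N-best list with the highest log-score."""
    return max(n_best, key=lambda h: hypothesis_log_score(h, semantic_score, gamma))
```

Without a semantic term, `best_hypothesis` simply returns the hypothesis with the highest combined acoustic and linguistic score; passing a `semantic_score` function lets a contextually more plausible hypothesis overtake it.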

Keywords: Transparent AI, Natural human-machine interaction, Innovative warning systems, Countermeasures, Mixed-initiative planning

Mr LEVEL, Stephane (Loria/Inria); Mrs ILLINA, Irina (Loria/Inria); Mr FOHR, Dominique (Loria/Inria)
