
Large Vocabulary Speech Recognition Based on Statistical Methods

5.4 Pronunciation Modeling

The pronunciation dictionary is the link between the acoustic-level representation and the lexical items output by the speech recognizer. The accuracy of the acoustic models is partly dependent upon the consistency of the pronunciation dictionary.

Associated with each lexical entry are one or more pronunciations, described using the chosen elementary units (usually phonemes or phones). This set of units is evidently language dependent. For example, some commonly used phone set sizes are […] for English, […] for German and Italian, […] for French and Mandarin (to which tones may be added), and […] for Spanish. In generating pronunciation baseforms, most lexicons include standard full-form pronunciations and do not explicitly represent phonetic variants. This representation is chosen as most variants can be predicted by rules, and their use is optional. More importantly, there often is a continuum between different phonetic realizations of a given phoneme, and the decision as to which occurred in any given utterance is subjective. By using a phone representation, no hard decision is imposed, and it is left to the acoustic models to represent the observed variants in the training data. While pronunciation lexicons are usually at

Phone  Example   Phone  Example
Vowels           Fricatives
i      beet      s      sue
[…]    bit       z      zoo
e      bait      […]    shoe
[…]    bet       […]    measure
[…]    bat       f      fan
[…]    but       v      van
[…]    bott      […]    thin
o      boat      Plosives
u      boot      b      bet
[…]    book      d      debt
[…]    bird      […]    get
Diphthongs       p      pet
[…]    bite      t      tat
[…]    boy       k      cat
[…]    bout      Affricates
Reduced Vowels   […]    cheap
x      about     […]    jeep
[…]    dated     Nasals
[…]    butter    m      met
Semivowels       n      net
l      led       […]    thing
r      red       Syllabics
w      wed       m      bottom
y      yet       n      button
h      hat       l      bottle

FIGURE 5.4

Set of 45 phone symbols for English with illustrative words, with the portion corresponding to the phone sound underlined.

least partially created manually, several approaches to automatically learn and generate word pronunciations have been investigated. Such approaches, while promising, have to date given only small performance improvements, even when trained on manual transcriptions […].
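As an illustration of the lexicon structure described above, the following is a minimal sketch of a pronunciation dictionary that maps each lexical entry to one or more phone sequences, together with a simple consistency check against the chosen unit set. The phone symbols are ARPAbet-like placeholders chosen for this example (an assumption, not the phone set of Figure 5.4), and the names `lexicon`, `phone_inventory`, and `check_lexicon` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A pronunciation is a sequence of phone symbols.
# NOTE: the symbols below are ARPAbet-like placeholders (assumption),
# not the phone set used in the chapter.
Pronunciation = Tuple[str, ...]
Lexicon = Dict[str, List[Pronunciation]]

lexicon: Lexicon = {
    # Homograph: noun and verb readings get separate pronunciations.
    "record": [
        ("r", "eh", "k", "er", "d"),
        ("r", "ih", "k", "ao", "r", "d"),
    ],
    "net": [("n", "eh", "t")],
}


def phone_inventory(lex: Lexicon) -> List[str]:
    """Collect every phone symbol used anywhere in the lexicon."""
    return sorted({p for prons in lex.values() for pron in prons for p in pron})


def check_lexicon(lex: Lexicon, allowed: Set[str]) -> List[Tuple[str, str]]:
    """Return (word, phone) pairs whose phone is outside the allowed unit set."""
    problems: List[Tuple[str, str]] = []
    for word, prons in lex.items():
        for pron in prons:
            for phone in pron:
                if phone not in allowed and (word, phone) not in problems:
                    problems.append((word, phone))
    return problems


inventory = phone_inventory(lexicon)
```

Running `check_lexicon(lexicon, set(inventory))` returns an empty list. Removing a symbol from the allowed set flags every entry that still uses it, which is one simple way to catch inconsistent entries before acoustic model training.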

Pronunciation variants can be observed for a variety of words. Alternative pronunciations are obviously needed for homographs (words spelled the same, but pronounced differently) which reflect different parts of speech (verb or noun), such as excuse, record, moderate. Some frequent affixes such as anti-, bi-, multi-, -ization can be pronounced with a diphthong ([…]) or a short vowel ([…] or […]). The upper part of Figure 5.5 gives some example words with multiple pronunciations and their associated probabilities. Using a set of allophone models (cf. Section […]), the pronunciation probabilities are estimated by first aligning the reference word transcription

[Table: pronunciation variants and their probabilities for the words COUPON, ORGANIZATION, HUNDRED, MODERATE, and TO (upper part) and for the compound words I DON'T KNOW, DON'T KNOW, DID YOU, and GOING TO (lower part).]

FIGURE 5.5

Some example lexical entries and their pronunciations along with estimated probabilities. For the compound words, the original concatenated pronunciation is given in the first line and the reduced forms are given in the second line.

with the audio signal using a lexicon containing equally likely alternative pronunciations, letting the Viterbi algorithm choose the best pronunciation for each word. The probabilities are then estimated from the relative frequencies of each variant. Words of foreign origin (particularly proper names) may have different pronunciations depending upon the speaker's familiarity with the original language. It is also common for multisyllabic words to be pronounced with different numbers of syllables. For example, about […] of the occurrences of interest and conference, and […] of company, are spoken with two syllables instead of three. If acoustic model training is carried out without allowing for appropriate pronunciation variants, there will necessarily be a misalignment of one or more phones, making the phone models less accurate. Experience has shown that careful lexical design improves speech recognition system performance […].
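The relative-frequency estimate of pronunciation probabilities described above can be sketched as follows. This sketch assumes the forced alignment has already been run and has produced one (word, chosen variant) pair per word token; the Viterbi alignment itself is not implemented here. The function name `estimate_pron_probs`, the variant labels, and the toy counts are all hypothetical and are not taken from the chapter.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def estimate_pron_probs(
    aligned: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Estimate P(variant | word) as the relative frequency of each variant.

    `aligned` holds one (word, variant) pair per word token, where `variant`
    identifies the pronunciation the alignment selected for that token.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for word, variant in aligned:
        counts[word][variant] += 1

    probs: Dict[str, Dict[str, float]] = {}
    for word, variant_counts in counts.items():
        total = sum(variant_counts.values())
        probs[word] = {v: n / total for v, n in variant_counts.items()}
    return probs


# Toy alignment output (invented for illustration only).
aligned_tokens = [
    ("interest", "two_syllables"),
    ("interest", "three_syllables"),
    ("interest", "two_syllables"),
    ("interest", "two_syllables"),
    ("to", "full"),
]
pron_probs = estimate_pron_probs(aligned_tokens)
```

With this toy input, three of the four tokens of "interest" chose the two-syllable variant, so its estimated probability is 0.75. In practice, variants that are never selected receive no count, so some form of smoothing or a probability floor may be needed; that choice is left open here.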

In speech from fast speakers or speakers with relaxed speaking styles, it is common to observe poorly articulated (or skipped) unstressed syllables, particularly in long words with sequences of unstressed syllables. Although such long words are typically well recognized, often a nearby function word is deleted. To reduce these kinds of errors, alternate pronunciations in the lexicon can allow schwa deletion or syllabic consonants in unstressed syllables. Compound words have also been used as a way to represent reduced forms for common word sequences such as don't know, did you, and going to. Some of the reduced forms are so frequent that they have a commonly accepted written form (gonna, dunno). Some example compound words are shown in the lower part of Figure 5.5 along with estimates of the pronunciation probabilities for the different variants. These examples illustrate the interest in using compound words in recognition lexicons. Fluent speech effects can alternatively be

modeled using phonological rules […, …]. The principle behind the phonological rules is to modify the allowable phone sequences to take into account expected variations. These rules are optionally applied during training and recognition. Using phonological rules during training results in better acoustic models, as they are less "polluted" by wrong transcriptions. Their use during recognition reduces the number of mismatches. The same mechanism has been used to handle liaisons, mute-e, and final consonant cluster reduction for French.
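To make the idea of an optional phonological rule concrete, here is a minimal sketch that expands one baseform into the set of allowable phone sequences under a single rule: each schwa may be kept or deleted. This is a deliberate simplification, since real rules are usually conditioned on phonetic context and stress, which this sketch ignores. The symbol `ax` for schwa, the example baseform, and the function name `expand_optional_deletion` are assumptions made for illustration.

```python
from itertools import product
from typing import List, Sequence, Tuple


def expand_optional_deletion(
    pron: Sequence[str], deletable: Tuple[str, ...] = ("ax",)
) -> List[Tuple[str, ...]]:
    """Return all phone sequences obtained by optionally deleting each
    occurrence of a deletable phone (context-independent simplification)."""
    # For each position: either (keep the phone) or, if deletable, (drop it).
    choices = [
        ((p,), ()) if p in deletable else ((p,),)
        for p in pron
    ]
    variants = {
        tuple(phone for part in combo for phone in part)
        for combo in product(*choices)
    }
    # Longest (full form) first, then alphabetical for a stable order.
    return sorted(variants, key=lambda v: (-len(v), v))


# Hypothetical baseform for "company" with a schwa in the unstressed syllable.
baseform = ("k", "ah", "m", "p", "ax", "n", "iy")
company_variants = expand_optional_deletion(baseform)
```

For this baseform the expansion yields two sequences: the full three-syllable form and a reduced form without the schwa. Allowing both during training lets the alignment pick whichever matches the audio, which is the sense in which the models are less polluted by wrong transcriptions.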

As speech recognition research has moved from read speech to found audio data, the phone set has been expanded to include nonspeech events. These can correspond to noises produced by the speaker (breath noise, coughing, sneezing, laughter, etc.) or can correspond to external sources (music, motor, tapping, etc.).