HIERARCHICAL ATTRIBUTED GRAPH REPRESENTATION AND RECOGNITION OF HANDWRITTEN CHINESE CHARACTERS

By

Ying Ren, B.Sc.

A thesis submitted to the School of Graduate Studies in partial fulfillment of the requirements for the degree of Master of Science

Department of Computer Science
Memorial University of Newfoundland

August, 1991

St. John's Newfoundland Canada


The author has granted an irrevocable non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of his/her thesis by any means and in any form or format, making this thesis available to interested persons.

The author retains ownership of the copyright in his/her thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without his/her permission.


ABSTRACT

This thesis presents a system which is capable of recognizing handwritten Chinese characters. The hierarchical attributed graph representation (HAGR), a [illegible] graph, is introduced to describe the structural and statistical information of handwritten Chinese characters. The first level describes radicals and relations between radicals within a character. The second level describes strokes and relations between strokes in a radical. With HAGR, the recognition process becomes a simple task of graph matching. A [illegible] mapping function for matching graphs is introduced. This approach can tolerate the variations of HAGR which reflect the instabilities and variabilities of handwritten Chinese characters resulting from different writing styles. Several rules have been applied to re-arrange the order of the vertices of the graphs in order to avoid the combinatorial explosion in [illegible].

Based on HAGR, the model database is organized as a heterogeneous multi-way tree structure. Local decisions at different levels of the tree [illegible] in the database. The matching process is very efficient [illegible]. The system can acquire the representations of characters by a learning process. Several HAGRs of samples of a character can be synthesized into a single HAGR, which can then be included in the model database. In addition, the learning process can update the models of characters with [illegible] of their samples. The system is implemented in C on a MIPS M/120 running RISC/os (Version [illegible]).


ACKNOWLEDGEMENTS

I wish to express my thanks to my supervisor Dr. Siwei Lu for his constant encouragement, insightful guidance, and constructive suggestions. Without his generous contribution, it would be impossible to give this thesis its current quality. I would like to thank the Systems Support staff for providing all the help and assistance during the [illegible] of my research. I am also very grateful to the Administrative staff who have helped in one way or another in the preparation of this thesis. In addition, I would like to acknowledge the financial support received from the Department of Computer Science and the School of Graduate Studies. Special thanks are due to my fellow graduate students and good friends, and in particular to Todd Wareham, Anthony Wing Key Szeto, Helmut Roth, and Sean [illegible] for their valuable comments and useful suggestions. I would also like to use the chance to thank Prof. Jane Foltz, Patricia Murphy, and [illegible] for their help and assistance.

This thesis is dedicated to my parents [illegible] of my education.

Table of Contents

Chapter 1 Introduction
1.1 Structure of the system
1.2 Organization of the thesis

Chapter 2 Survey of Techniques for Chinese Character Recognition
2.1 Introduction
2.2 On-line handwritten CCR
2.2.1 Feature analysis
2.2.2 Time sequence of zones, directions, or extremes
2.2.3 Curve matching
2.2.4 Stroke codes
2.2.5 Analysis-by-synthesis
2.2.6 Other methods
2.3 Handwritten and printed CCR
2.3.1 Statistical techniques
2.3.1.1 Template matching
2.3.1.2 Transformation
2.3.1.3 Stroke distribution
2.3.1.4 Background feature distribution
2.3.1.7 Combinations of [illegible]
2.3.2 Structural techniques
2.3.2.1 [illegible]

Chapter 3 Preprocessing
3.1 Introduction
3.2 Thinning
3.3 Tracing
3.5 Stroke grouping
3.5.1 Connection clustering
3.5.2 [illegible] clustering
3.5.3 Binary division clustering
3.5.4 Distance measurement clustering

Chapter 4 [title illegible]
4.1 Introduction
[number illegible] Construction of hierarchical attributed graph
[number illegible] Radical attributed graph construction

Chapter 5 [title illegible]
5.1 Introduction
5.2 Radical attributed graph matching

Chapter 6 Model Database Organized by a Heterogeneous Multi-way Tree
6.1 Introduction
6.2 Heterogeneous multi-way tree organization
6.2.1 Basic concepts for the heterogeneous multi-way tree
6.3 Construction of multi-way tree
6.4.1 Searching algorithm
6.5 Memory reduction
6.5.1 Memory reduction with radical linked list

Chapter 7 [title illegible]
7.1 Introduction
7.2 Attributed graph synthesis
7.2.1 Synthesis evaluation of two [illegible]
7.2.2 Synthesis evaluation of two [illegible]
7.2.3 Synthesis of two radical attributed graphs
7.3 Updating the model database with the samples
[remaining entries of Chapter 7 illegible]

Chapter 8 Conclusions
8.1 Summary of contributions
[remaining entries illegible]

References

List of Figures

Figure 1.1 Structure of proposed system
Figure 3.1 Eight neighbouring pixels of a pixel
Figure 3.2 Example of [illegible] erosion by Zhang-Suen algorithm
Figure 3.3 Comparison results of the thinning algorithms
Figure 3.4 Codes and directions for Freeman chain coding
Figure 3.5 Encoding example
Figure 3.6 Classification of the line segments
Figure 3.7 False-stroke eliminating
Figure 3.12 Radicals [illegible]
Figure 4.2 Structural combinations of radicals
Figure 4.3 Radical attributed graph
Figure 4.4 A [illegible] character and its HAGR
Figure 4.5 Bits of the entries in the adjacency matrix of radical [illegible]
Figure 4.6 Using entry of adjacency matrix to represent the [illegible]
Figure 4.7 Construction of HAGR of a handwritten Chinese character
Figure 5.3 Adjacency matrices a and b
Figure 5.4 [illegible] of a and b after exchanging the order of V1 and V3
Figure 5.5 Radical [illegible] and its model
Figure 5.6 Example for ordering strokes a and b using Rule 1
Figure 5.7 Example for ordering strokes a and b with Rule 2
Figure 5.8 Example for ordering strokes a and b with Rule 3
Figure 5.9 Example for ordering strokes a and b with Rule 4
Figure 5.10 Example for ordering strokes in a radical
Figure 6.2 Tree representing the partition of a [illegible]
Figure 6.3 Tree representing the partition of a section into groups
Figure 6.4 Tree representing the partition of a group
Figure 6.5 Representing a character with a path in the tree
Figure 6.8 Radical list of [illegible]
Figure 6.9 Compression vector
Figure 7.6 Table network organization of model [illegible]
Figure 7.8 Example of HAG integration
Figure 7.10 HAG integration structure of an input character
Figure 7.11 Table network organization of database
Figure 7.12 HAG integration and HAG integration in the updating
Figure 7.13 Updating results after step 8
Figure 7.14 Final updating results for the input character
Figure 8.1 Character samples being rejected or misclassified

[Entries for the remaining figures are illegible in the scan.]


Chapter 1

Introduction

Handwritten Chinese character recognition is well known to be a very difficult problem. Because of its pragmatic value and its particularity in [illegible] recognition, the topic has attracted many researchers and has become one of the most challenging research subjects in the field of pattern recognition. The major difficulties come from the following factors: (1) There are more than [illegible] thousand Chinese characters. Over a tenth of them are commonly used in daily life [citation illegible, 1988]. Such a large number of character categories makes the recognition very difficult and slow. (2) The structures of Chinese characters are very complicated. Many [...] shapes, styles, positions of radicals, and directions and lengths of the strokes.

In spite of these difficulties, many techniques have been developed for solving the problem. They basically can be classified into two major approaches, namely, the statistical method and the structural method. Typical statistical approaches are correlation matching [Yamashita, et al. 1982], background feature distribution [Natio, et al. 1981], background analysis [Akamatus and Komori 1981], stroke analysis [Morishita, et al. 1988], and orthogonal expansion [Arakawa 1983]. Correlation matching typically requires some form of normalization. However, normalization has only a limited capability to compensate for the huge variety of writing styles, such as radical positions, direction and length of strokes. The last four approaches depend on extracted features such as stroke length, stroke direction, stroke sequence, and relation between strokes. Usually these features are represented as a feature vector and the character recognition is performed by selecting the reference character with the minimum distance from the input character. However, these features tend to be unstable as handwriting differs from person to person. One method to overcome such difficulties is to shift the burden to the users by imposing constraints on their writing [Wakahara and Umeda 1983]. However, not all users will accept this and some may have difficulties observing the specified rules. Another way is to increase the number of models for each character category in the database to allow for the variation caused by the instability of the features. Unfortunately, this will increase the database of the Chinese characters enormously. For example, the handwritten Kanji database ETL8 contains 152,960 samples of only 881 different Kanji characters. Each character category [...] character set, the matching process will be very time consuming.

Although Chinese characters have a very complicated structure, they are structured according to some rules which are independent of writing styles. This structure can be divided into three levels: the whole character level, the radical level, and the stroke level. The geometric characteristics of the strokes vary to some extent with different writers, but their spatial relations and geometric configurations are usually well maintained. These properties can be regarded as invariant features, and make structural approaches more attractive for the recognition of handwritten Chinese characters. One line of research within structural approaches represents a character pattern as a string and a grammar (either context-free or restricted context-sensitive grammar such as an indexed grammar or a programmed grammar), and a parser for that particular grammar is used to recognize the pattern [first citation illegible; Tai and Liu 1990; Zhao 1990]. String grammars are limited in power and are unable to handle very complicated characters with several radicals. Thus higher dimensional grammars (tree, plex, or web, etc.) are utilized. However, higher dimensional grammars are much more complicated and lack practical value. Another approach is to use pattern matching instead of parsing [citations illegible]. A character pattern is represented as a relational graph in which the vertices represent the strokes while the arcs represent the relationships between strokes. The [...]

Despite these advantages, current graph approaches do not fully use the structural properties and do not provide effective organization of the huge model database for fast and accurate searching. For this reason, a new method which is not sensitive to writing styles is proposed and it offers users a high degree of flexibility for effective recognition of handwritten Chinese characters. This method is derived from the stable features of the characters. However, in the recognition process, some of the unstable features such as the orientations of the strokes are also necessary. Hierarchical attributed graphs are developed to describe both invariant and unstable features, with adjacency matrices used to represent these attributed graphs. The bits of the entries in a matrix describe the attribute set associated with a vertex or an edge. Based on the bit-wise representation, a cost function is introduced to map the graph of an input character to that of its model. This approach can provide some tolerance to the variations of characters written in different styles. For graph matching, several rules are applied to re-arrange the order of the vertices of the graph in order to avoid the combinatorial explosion. Furthermore, the model database is organized as a heterogeneous multi-way tree according to the spatial relations between radicals, the number of strokes per radical, and the geometric configuration of strokes in each radical. Using the hierarchical attributed graph representation and the multi-way tree organization of the database, the efficiency of the matching process can be improved considerably.
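As a rough illustration of the bit-wise idea in this paragraph, the following Python sketch stores each vertex or edge attribute set as an integer bit mask in an adjacency matrix and scores a candidate match by counting differing bits. The attribute names, the bit layout, and the Hamming-style count used as the cost are assumptions made for illustration only; the thesis defines its own attribute sets and cost function in later chapters.

```python
# Hypothetical bit layout: each bit flags one attribute of a stroke or a relation.
HORIZONTAL, VERTICAL, LEFT_SLANT, RIGHT_SLANT = 1, 2, 4, 8   # vertex (stroke) attributes
ABOVE, LEFT_OF, CROSSES, TOUCHES = 1, 2, 4, 8                # edge (relation) attributes


def mapping_cost(input_graph, model_graph):
    """Count attribute bits that differ between two equally sized adjacency matrices.

    Diagonal entries hold vertex attribute masks and off-diagonal entries hold
    edge attribute masks. Vertices are assumed to be in corresponding order.
    """
    if len(input_graph) != len(model_graph):
        raise ValueError("graphs must have the same number of vertices")
    cost = 0
    for row_in, row_model in zip(input_graph, model_graph):
        for a, b in zip(row_in, row_model):
            cost += bin(a ^ b).count("1")   # XOR keeps exactly the differing bits
    return cost


# Model of a two-stroke cross: a horizontal and a vertical stroke that cross.
model = [
    [HORIZONTAL, CROSSES],
    [CROSSES, VERTICAL],
]
# A handwritten sample: extra "touches" relation and a slanted vertical stroke.
sample = [
    [HORIZONTAL, CROSSES | TOUCHES],
    [CROSSES | TOUCHES, VERTICAL | LEFT_SLANT],
]
```

Calling `mapping_cost(sample, model)` gives a small non-zero cost, so a tolerance threshold on the cost could accept the sample despite the style variation. The sketch presupposes that the vertices have already been put into corresponding order, which is the role the text assigns to the re-arrangement rules.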

1.1 Structure of the system

[...] of two functional parts. The first part is from the [illegible] samples of a character, to the [illegible] network organization of the model database. The figure illustrates the procedure for building and updating the model database. The rest of the flowchart shows the second part, which performs the recognition task.

The recognition procedure may be divided into three levels: low, intermediate, and high. The low level essentially involves thinning, skeleton tracing, segment merging, and stroke grouping. At the intermediate level, the hierarchical attributed graph representation of the input handwritten character is generated. The recognition in the high level involves multi-way tree searching, graph matching, and mapping cost computation.

[Figure 1.1 Structure of proposed system. Flowchart steps recoverable from the scan: obtain the binary image of an input handwritten Chinese character; preprocessing (thinning, skeleton tracing, merging segments); matching; the character is recognized.]

1.2 Organization of the thesis

The thesis is organized into eight chapters. Chapter two introduces existing techniques for recognition of Chinese characters such as machine printed Chinese character recognition, off-line Chinese character recognition, and on-line Chinese character recognition. Chapter three describes how the local properties of an input character are obtained and organized for further image analysis and representation. Chapter four is devoted to the hierarchical attributed graph representation of handwritten Chinese characters and the construction of the hierarchical attributed graph representation. Chapter five gives the method for character recognition. A cost mapping function is introduced to match the attributed graph of an input character and that of its model. In Chapter six, the heterogeneous multi-way tree used to organize the model database for fast and efficient searching is discussed. Using the multi-way tree, the search process can be finished in a number of steps [illegible]. The learning procedure for building and updating the model database is described in Chapter seven. Chapter eight gives the conclusions and some directions for further research.

Chapter 2

Survey of Techniques for Chinese Character Recognition

2.1 Introduction

Character recognition is a subset of pattern recognition. It was character recognition that gave the incentives for making pattern recognition and image analysis mature fields of science. Chinese character recognition (CCR) offered a challenge that was in certain key aspects representative of the larger world of pattern recognition. CCR provides an important way to input Chinese characters in a massive manner. In recent years, a considerable amount of work has been done on CCR. The CCR techniques can be classified into three major categories, namely, 1) printed CCR, which is the recognition of specific Chinese fonts (Song, Black, Kai, etc.), 2) on-line handwritten CCR, which is the recognition of single handwritten Chinese characters where not only the character image but also the timing information of each stroke is provided, and 3) handwritten CCR, which is the recognition of single handwritten Chinese characters which are unconnected and not written in calligraphy. So far, printed CCR and on-line handwritten CCR systems are already available in the market place. Techniques for doing multi-font printed CCR are available in the laboratories. Handwritten CCR, however, is still far from being a practical technology.

A typical CCR system includes the following functional components. During preprocessing, [...] images. Certain types of noise can be eliminated in this phase. In the on-line handwritten CCR, the stroke's position, direction, and order are captured while a stroke is drawn. Therefore it is easier to obtain the stroke sequences of a Chinese character for the on-line CCR. Thinning is required to reduce the information needed for the next step. After preprocessing, the extracted features of the two-dimensional character matrix will be used to match against a set of pre-stored features of the reference Chinese characters. The best matching will be used to identify the character. To reduce the recognition time and to achieve a high recognition rate, a multi-stage recognition [...] subset of Chinese characters into small groups, then the second stage recognition identifies the characters from each group.

[...] The problems of printed Chinese character recognition, on-line Chinese character recognition, and handwritten Chinese character recognition will be discussed.

2.2 On-line handwritten CCR

On-line handwritten CCR is simpler than printed CCR and off-line handwritten CCR because the machine can catch the accurate sequence of strokes. In on-line CCR, the system only needs to recognize fewer than one hundred different kinds of strokes. As long as the machine can recognize the strokes and the relationship of the strokes, it can recognize up to several thousand Chinese characters. Many methods are available for on-line classification of handwritten Chinese characters. They are described below.

2.2.1 Feature analysis

A set of features can represent a handwritten Chinese character. The features might be based on the static properties of the character, the dynamic properties, or both. The features can be binary. With binary features, the name assigned to an unknown character is often determined by a decision tree [Hanaki, et al. 1976; Hanaki and Yamazaki 1980]. A disadvantage of this method is that it may not produce alternative character choices, which are usually desirable for post-processing. Recently, a binary decision tree has used simple features to reduce the set of candidate characters to a small set for subsequent analysis by complex features [Kerrick and Bovik 1988]. The features can also be nonbinary. A fixed number of nonbinary features is common in pattern recognition, and many classification methods are available for dividing such a feature space into decision regions. For example, a multi-stage classifier with general tree structure based on the dividing of Walsh coefficients has been developed [Gu, et al. 1983].

2.2.2 Time sequence of zones, directions, or extremes

These methods rely primarily on dynamic information. A sequence of coded zones can represent a character [Engdahl 1977; Hanaki and Yamazaki 1980]. The zones are specified by dividing up the rectangle that surrounds the written character; the character is then superimposed on the rectangle, and the sequence of zones traversed by the pen tip is determined. This sequence, or a corresponding sequence of features, assigns a name to the unknown character, often by exact match from a dictionary of zone sequences. A similar method uses the sequence of directions of pen-tip motion during the writing of a character [Chang and Lo, year illegible; Crane and Savoie 1977; Groner 1968]. Using four primitive directions (up, down, left, right), one system coded the first four directions of the sequence and then classified the character by table lookup where the table had 256 entries [Groner 1968]. As the number of directions and time intervals increases, table lookup becomes less practical, and the sequences are compared by curve matching.
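The table-lookup scheme with four primitive directions can be sketched as follows. With four directions and the first four moves there are 4^4 = 256 possible codes, which agrees with the table size quoted above. The quantisation rule, the padding of short sequences, and the one-entry example table are illustrative assumptions, not details taken from [Groner 1968].

```python
UP, DOWN, LEFT, RIGHT = 0, 1, 2, 3


def quantize(dx, dy):
    """Map a pen displacement to one of four directions (y grows upward here)."""
    if abs(dx) >= abs(dy):
        return RIGHT if dx >= 0 else LEFT
    return UP if dy > 0 else DOWN


def direction_sequence(points):
    """Directions of successive pen moves, with consecutive repeats collapsed."""
    seq = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d = quantize(x1 - x0, y1 - y0)
        if not seq or seq[-1] != d:
            seq.append(d)
    return seq


def table_index(seq):
    """Pack the first four directions into a base-4 number in range(256).

    Sequences shorter than four are padded by repeating the last direction
    (an assumption made for this sketch).
    """
    if not seq:
        raise ValueError("empty direction sequence")
    first4 = (seq + [seq[-1]] * 4)[:4]
    index = 0
    for d in first4:
        index = index * 4 + d
    return index


# Hypothetical lookup table: index -> label.
table = {table_index([RIGHT, DOWN, LEFT, UP]): "box"}
box_trace = [(0, 0), (2, 0), (2, -2), (0, -2), (0, 0)]
label = table.get(table_index(direction_sequence(box_trace)), "unknown")
```

The table grows as (number of directions) to the power (number of coded moves), which is one way to see why lookup becomes impractical as either quantity increases.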

2.2.3 Curve matching

Curve matching is a popular image processing method. Curves from an unknown character are matched against those of prototype characters. The name of the prototype that best matches the unknown is assigned to the unknown. The curves are usually functions of time such as preprocessed x and y values, or the direction angle of the tangent to the trajectory of the writing [Ishigaki and Morishita 1988; Ishii 1986; Odaka, et al. 1986]. Using Freeman code, a character has been divided into ten regions [citation illegible]. Since Chinese characters consist mostly of straight strokes, approximating their strokes by a small number of fixed points has been found successful [Odaka, et al. 1982]. An alternative to the matching of functions of time is the matching of Fourier coefficients obtained from the x(t) and y(t) curves [Arakawa, et al. 1978; Impedovo 1984]. This method is appropriate when the characters can be represented by a reasonably small number of Fourier coefficients. Since straight-line strokes require high-order Fourier coefficients, this method has been useful for characters consisting mostly of curved strokes, like numerals, or of concatenations of many straight strokes, like Chinese characters [Arakawa, et al. 1978]. Curve matching becomes equivalent to pattern matching in feature space when the number of points characterizing the curve is constant and there is a one-to-one correspondence [Odaka, et al. 1982]. This is a linear alignment of the points of the curve. However, due to nonlinearities, the best fit is usually not a linear matching or alignment. For many sequence comparison problems, elastic matching has been successful [Ikeda, et al. 1978; Seto and Adachi 1985; Wakahara and Umeda 1983; Yoshida and Sakoe 1982]. Because elastic matching is computationally intensive, the prototypes are required to be first pruned to reduce the computation. Application of a local affine transformation can enhance the shape discrimination of elastic matching. Using the point correspondence from elastic matching between input and reference patterns, a deformation vector field is generated and then approximated by means of iterative applications of local affine transformations. Finally, further elastic matching between the input pattern and the deformed reference pattern superimposed by low-order local affine transformation components enhances shape discrimination, halving the error rate [Wakahara 1988].
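Elastic matching is commonly realised with dynamic programming. The sketch below aligns two point sequences with the classic dynamic time warping recurrence and assigns the label of the nearest prototype. It is a generic textbook formulation with a Euclidean point distance chosen for illustration, and is not claimed to be the exact procedure of any of the cited systems; the prototype names and coordinates are made up.

```python
import math


def elastic_distance(a, b):
    """Dynamic-time-warping distance between two sequences of (x, y) points."""
    if not a or not b:
        raise ValueError("sequences must be non-empty")
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # b[j-1] is reused
                                 cost[i][j - 1],      # a[i-1] is reused
                                 cost[i - 1][j - 1])  # both sequences advance
    return cost[n][m]


def classify(unknown, prototypes):
    """Return the label of the prototype with the smallest elastic distance."""
    return min(prototypes,
               key=lambda label: elastic_distance(unknown, prototypes[label]))


prototypes = {
    "horizontal": [(0, 0), (1, 0), (2, 0), (3, 0)],
    "vertical": [(0, 0), (0, -1), (0, -2), (0, -3)],
}
# The same horizontal stroke sampled unevenly: a point-by-point (linear)
# alignment would pair up points that do not correspond.
unknown = [(0, 0), (0, 0), (1, 0), (3, 0)]
```

The double loop costs time proportional to the product of the two sequence lengths for every prototype, which is consistent with the remark that prototypes need to be pruned before elastic matching is applied.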

2.2.4 Stroke codes

The stroke code method classifies subparts of a character and then identifies the character from the sequence of classified subparts [Greanias and Yhap, year illegible; Jing-Hua 1988; Kuo, et al. 1988; Lin and Tsai 1988]. One system uses [number illegible] stroke codes of constituent shapes to specify and recognize more than three thousand Kanji characters [Yurugi, et al. 1985]. Stroke classification uses the sequence of direction angles. Then decision trees of stroke code sequences under relative positional constraints on strokes classify the radical or character. Another system uses a formalism based on an initial stroke-sequence decision tree and position matching [Chen, et al., year illegible]. This formalism has the advantage of using the features of strokes, stroke sequences, and geometric relations but avoids the disadvantages caused by the instability of all of the above features.

2.2.5 Analysis-by-synthesis

Yet another approach is analysis-by-synthesis, sometimes called recognition-by-generation. Several studies are concerned with the modeling of handwriting generation [Yasuhara 1975; Morishita, et al. 1988]. These methods usually use strokes and rules for connecting them to build characters. Characters generated from the inventory of strokes constitute idealized standard representations of the characters. An approximation to real handwritten characters can be attained by specifying these strokes with mathematical models that describe the motion of the pen tip as a function of time. Then a handwritten character can be divided into strokes. The strokes are classified using the model parameters, and the stroke sequences [Yoshida and Eden 1973].

A similar approach uses dynamic programming to match real and modeled strokes [Wakahara and Umeda 1984]. Due to its optimality property, dynamic programming can be used to obtain the minimum distance between an input and a reference pattern to handle the problem caused by the distortion existing in the input pattern.

2.2.6 Other methods

Perceptual studies have been instrumental in the development of pairwise distinction methods, which separate each pair of characters that might be confused. Studying the way humans distinguish between such pairs led to a theory of characters based on functional attributes [Cox, et al. 1982; Watanabe, et al. 1985]. Pair distinction by functional attributes has led to robust recognition methods, notably that in the commercial system by Pencept. Sometimes the same attribute differentiates more than a pair of characters [Sakai, et al. 1984]. Another method represents a character by the number, order, and relative position of strokes; some strokes are divided into more parts, particularly those of characters with few strokes [Kuo, et al., year illegible]. The statistical method of Markov models is particularly suitable for dynamic information. [Farag 1979] uses a first-order Markov model with eight states corresponding to eight pen-tip directions. One system unifies the statistical and syntactical approaches for on-line Chinese character recognition [Tai and Liu 1990]. A fuzzy attributed finite-state automaton is introduced for stroke recognition. According to the intrinsic structure of Chinese characters, a two-dimensional character is transformed into a one-dimensional attributed string on the basis of order arrangements of Chinese characters. Such strings can be easily recognized by template matching.
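A first-order Markov model over eight pen-tip directions can be sketched as below: each class gets an 8 x 8 transition table estimated from training direction sequences, and an unknown sequence is assigned to the class with the highest log-likelihood. The add-one smoothing, the uniform initial-state distribution, the direction numbering, and the toy training data are all assumptions for illustration and are not taken from [Farag 1979].

```python
import math

N_DIRECTIONS = 8  # direction codes 0..7, e.g. 0 = east, counted counter-clockwise


def train(sequences):
    """Estimate a smoothed 8x8 transition matrix from direction sequences."""
    counts = [[1.0] * N_DIRECTIONS for _ in range(N_DIRECTIONS)]  # add-one smoothing
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def log_likelihood(seq, transitions):
    """Log-probability of a sequence; the first state is assumed uniform."""
    total = -math.log(N_DIRECTIONS)
    for prev, nxt in zip(seq, seq[1:]):
        total += math.log(transitions[prev][nxt])
    return total


def classify(seq, models):
    """Pick the class whose transition model makes the sequence most likely."""
    return max(models, key=lambda label: log_likelihood(seq, models[label]))


# Toy training data (hypothetical): a "corner" goes east then south,
# a "zigzag" alternates between north-east and south-east.
models = {
    "corner": train([[0, 0, 6, 6], [0, 6, 6, 6], [0, 0, 0, 6]]),
    "zigzag": train([[1, 7, 1, 7], [7, 1, 7, 1], [1, 7, 1, 7, 1]]),
}
```

Because only transitions between successive directions are modelled, the classifier depends on the order in which the pen moved, which is why this family of methods fits on-line (dynamic) input.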

2.3 Handwritten and printed CCR

Handwritten (off-line) and printed CCR are more difficult than on-line CCR, because the former is performed after the writing or printing is completed and therefore has no temporal or dynamic information such as number of strokes, order of the strokes, direction of the writing for each stroke, or speed of the writing within each stroke. Moreover, handwritten CCR is the most difficult aspect of character recognition, because the noises [illegible] and the distortions in structures are dealt with simultaneously, especially the large scope of distortions produced by different writers. The techniques used in handwritten and printed CCR can be roughly divided into statistical and structural approaches. The statistical decision or decision-theoretic approach involves the use of transformation functions, distribution decision functions or their equivalent functions, such as Bayesian classifier, statistical equivalent block classifier, and so on. The syntactical or structural approach uses various two-dimensional grammars for character description, parsers for analyzing the structure of an unknown character, and attributed graphs for describing character components and component relationships.

2.3.1 Statistical techniques

Template matching

The earliest approach for CCR was reported twenty-five years ago [Casey and Nagy 1966]. The authors used one of the simplest pattern recognition techniques: template matching. Each character is assigned a template or mask, which is a matrix of black and white pixels. To classify a given character sample, its matrix is compared to all templates. Classification is achieved if one of the templates provides a sufficiently good match to the character sample. To speed up the matching, a two-stage matching process was introduced. That is, similar characters were grouped first, then masks were grouped, and finally individual masks were employed. In general, this method involved expensive pixel-by-pixel comparison of the matrix of the input character and the template. In addition, such a method is only applicable to printed characters, in which the size and position of the radicals are almost constant and stable. In the case of handwritten characters, normalization of a character does not necessarily mean normalization of a radical which constitutes a subpattern of the character.
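The pixel-by-pixel comparison described above can be sketched as follows. This is only an illustrative sketch, not the procedure of Casey and Nagy: the function names, the mismatch-count score, and the `max_mismatch` threshold are assumptions introduced here, and the two-stage grouping is omitted.

```python
def mismatch(sample, template):
    """Count the pixels at which two equally sized binary matrices differ."""
    return sum(
        s != t
        for sample_row, template_row in zip(sample, template)
        for s, t in zip(sample_row, template_row)
    )


def classify(sample, templates, max_mismatch):
    """Return the label of the closest template, or None if no match is good enough.

    `templates` maps a label to a binary matrix; `max_mismatch` is a
    hypothetical acceptance threshold chosen by the caller.
    """
    best_label, best_score = None, None
    for label, template in templates.items():
        score = mismatch(sample, template)
        if best_score is None or score < best_score:
            best_label, best_score = label, score
    if best_score is not None and best_score <= max_mismatch:
        return best_label
    return None


templates = {
    "bar_h": [[0, 0, 0], [1, 1, 1], [0, 0, 0]],
    "bar_v": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
}
sample = [[0, 0, 0], [1, 1, 0], [0, 0, 0]]
label = classify(sample, templates, max_mismatch=2)
```

Every template is visited and every pixel is compared, which is the cost the text refers to; the sketch also shows why a shift or size change of a radical immediately inflates the mismatch count.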

Transformation

Fourier, Hadamard, and KL (Karhunen-Loeve) transformations have been applied to printed Chinese character recognition [Nakata, et al. 1972; Gu, et al. 1983; Sakai, et al. 1976; Leung 1985], but only KL has been used for both handwritten and printed CCR. One of the most attractive properties of the two-dimensional Fourier transform is its ability to recognize position-shifted patterns, since it observes the magnitude spectrum and ignores the phase. It is well recognized that the precision of center location is a problem for the scanner, and it is anticipated that it will also be a problem for identifying printed Chinese characters. The Hadamard transform is more acceptable in high-speed processing since its arithmetic computation involves only addition and subtraction. The major drawback of applying this technique in pattern recognition is that its performance depends too heavily upon the position of the pattern.
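The shift tolerance of the Fourier magnitude spectrum can be illustrated with a small sketch. It is a one-dimensional simplification (the text concerns the two-dimensional transform), it assumes a circular shift, and the function names are introduced here for illustration only.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitude spectrum of a 1-D signal, computed with a direct DFT."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def circular_shift(signal, shift):
    """Rotate the signal to the right by `shift` positions."""
    shift %= len(signal)
    return signal[-shift:] + signal[:-shift]


row = [0, 1, 1, 0, 0, 0, 0, 0]        # one scan line of a binary pattern
shifted = circular_shift(row, 3)      # the same stroke, displaced
original_spectrum = dft_magnitudes(row)
shifted_spectrum = dft_magnitudes(shifted)
```

A displacement only changes the phase of each Fourier coefficient, so the two magnitude spectra agree up to rounding error.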

In particular, KL was very successful in printed CCR, in which three orthogonal axes were used in order to absorb the variations of displacement and width of lines. However, more than ten such axes are needed for handwritten CCR. Furthermore, it is not practical due to the heavy computation involved in diagonalizing the $N^2 \times N^2$ correlation matrix corresponding to the sample images digitized on an $N \times N$ grid [Leung 1985].

For Chinese character patterns represented by their outer contour, Fourier descriptors are very useful in recognition [Krzyzak and Iluaeehi 1989]. Among different techniques, Fourier descriptors are distinguished by their invariance relative to the standard shape transformations such as scaling, rotation, translation, and mirror reflections. The major drawbacks of Fourier descriptors are that (1) they are insensitive to spurs on the boundary and (2) disconnected patterns give a completely different spectrum, and style variations are reflected in the lower-order spectrum [Verschueren, et al. 198?].

Stroke distribution

A distribution of local stroke features can be taken in a two-dimensional plane or projected onto a one-dimensional axis. A popular example is to consider the line directions as the local stroke features. Such a scheme is called direction matching [Yasuda and Fujisawa 1979; Saito, et al. 1982]. The boundary direction is calculated at each boundary point by following the contour between two pre- and post-points along a binary pattern on a 64 x 64 grid plane. The direction is quantized into four directions and mapped to a 16 x 16 plane. This method is quite simple; however, the recognition rate is very low. For improvement, size normalization and shift similarity should be used. Another example of stroke feature distribution on the two-dimensional plane uses stroke length [Hagita and Masuda 1981], which is considered a special case of the distance representation introduced in the field effect method [Mori, et al. 197?].

At each black point, eight quantized directions are taken and the distance is measured along each direction from this center point to the boundary point. Then the distance values along two opposite directions are summed, and a four-dimensional distance vector can be obtained at each pixel on a 128 x 128 plane. To get a compact feature distribution, this plane is divided into 8 x 8 zones, each of which is 16 x 16, and the vectors are averaged over each zone. Matching is done by the so-called MDD rule (minimum distance decision), which is essentially the same as correlation. This scheme extracts more global features than a local directional feature, so that some complex features such as intersection points are reflected. However, it cannot distinguish characters which are very similar to each other.

An almost identical method was used to recognize printed Chinese characters on the license plates of moving vehicles [Dai, et al. 19??]. A standard X and Y profile was defined for each character, and for a given character sample, its profiles were constructed and compared to all standard profiles. The criterion for recognition was the pair of profiles yielding the closest correspondence. This approach yields some advantages: 1) Using two one-dimensional patterns per character as opposed to one two-dimensional pattern results in considerable information reduction. 2) The projection profiles are easily extracted from the original pattern. 3) Since the projection profiles are obtained by an integrative process, they tend to be less sensitive to noise. However, the projection profiles suffer from position errors between the input and standard patterns. In addition, different characters with the same projection profiles cannot be discriminated.
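The X and Y projection profiles, and the last weakness mentioned above, can be sketched as follows. The function names and the city-block profile distance are assumptions made for illustration; the cited work may have compared profiles differently.

```python
def profiles(image):
    """Return (x_profile, y_profile): stroke pixels per column and per row."""
    x_profile = [sum(column) for column in zip(*image)]
    y_profile = [sum(row) for row in image]
    return x_profile, y_profile


def profile_distance(a, b):
    """City-block distance between the concatenated profiles of two patterns."""
    (ax, ay), (bx, by) = profiles(a), profiles(b)
    return sum(abs(p - q) for p, q in zip(ax + ay, bx + by))


main_diagonal = [[1, 0],
                 [0, 1]]
anti_diagonal = [[0, 1],
                 [1, 0]]
distance = profile_distance(main_diagonal, anti_diagonal)
```

The two toy patterns are clearly different, yet their profiles coincide, so a profile-based classifier cannot separate them.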

Background feature distribution

The background feature distribution technique [Suen 1982; Naito, et al. 1981] is based on a slight modification of Glucksman's well-known method of background feature extraction [Glucksman 1967]. For every background picture element of a binary pattern, scanning lines are derived in four directions: top, bottom, left, and right. In each scan, the number of crossings between the scanning line and the strokes is counted. However, this does not give an exact stroke density in each direction, because the four quantized directions are not necessarily perpendicular to the strokes; sometimes they cross the strokes tangentially or even in parallel. For improvement, a crossing is counted only when the direction is nearly perpendicular to a stroke.
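The basic crossing count can be sketched as below. This is a minimal illustration of the unrefined count only (it does not test whether a scanning line is nearly perpendicular to a stroke), and the function names and the order of the returned tuple are assumptions.

```python
def crossings(values):
    """Number of background-to-stroke transitions along a sequence of pixels."""
    count, previous = 0, 0
    for value in values:
        if value == 1 and previous == 0:
            count += 1
        previous = value
    return count


def background_feature(image, row, col):
    """Crossing counts (top, bottom, left, right) seen from a background pixel."""
    column = [r[col] for r in image]
    return (
        crossings(reversed(column[:row])),      # scanning upwards
        crossings(column[row + 1:]),            # scanning downwards
        crossings(reversed(image[row][:col])),  # scanning to the left
        crossings(image[row][col + 1:]),        # scanning to the right
    )


image = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
feature = background_feature(image, 2, 2)
```

A stroke that is two pixels thick along the scanning line still counts as one crossing, because only the transition from background to stroke is counted.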

Background analysis

Instead of propagating black and white information as in Glucksman's method, more exact information on the strokes being propagated can be extracted. The idea is to propagate edges, namely the edge value and direction [R. Oka 1982; Yamamoto 1984]. In this method, a cell is defined on a cellular space with meshes of 7 x 7, and each cell has eight intracells for eight directions. Each intracell stores the strength of the edge (edge value) whose direction is just normal to the direction of this intracell. These intracell features represent geometric features of the input pattern around the cell. For each cell, the edge values of all its intracells are averaged. Only those cells whose averaged edge value exceeds a specified threshold value are selected and used in matching, which is carried out by distance measurement. However, as the image quality of the input character varies, it is difficult to select a unique threshold value even for two images of different quality of the same input character.

Stroke analysis

Stroke analysis is the most traditional approach for Chinese character recognition. Generally, stroke analysis is based on the skeleton obtained by thinning preprocessing, but it is well known that such results are not satisfactory because of noise and distortion [Kimura, et al. 1978]. For printed Chinese characters, the typical types of noise are distorted intersection points, whiskers, and branches caused by touching strokes, since the strokes of printed characters are usually very thick. There still remains a major problem of stroke segmentation after thinning. However, thinning preprocessing is still attractive because of the simplicity of the algorithm. Basic research continues on thinning algorithms and their application to stroke segmentation as well [Pavlidis 1982; Wakayama 1982; Liao and Huang 1990].

On the other hand, admitting that not all of the noise sources mentioned can be removed, a practical approach is to consider some non-local but still inexpensive removal method which would be effective against the major types of noise. In this sense, the idea of the so-called "good continuity rule of Gestalt psychology" is very useful for removing such noise as distorted intersections. In fact, this rule is used in a very simple way of removing noise and detecting strokes [Kesvend 1979]. All pairs of segments joining at an intersection point are needed, with segment continuations being measured according to some conditions. The pair of segments with maximum continuity, provided it is greater than some threshold value, is chosen as one stroke, and the rest are treated in the same manner.

For extraction of stable strokes, a technique using the Hough transform (HT) was proposed recently for handwritten CCR [Cheng, et al. 1989]. First the character pattern is thinned and transformed from the spatial domain to the parametric one by the HT. As most strokes of Chinese characters are almost linear, they can be easily detected as lines by the HT within a heavily noisy image. This is a new approach to the application of the HT and a new attempt at stroke extraction for handwritten Chinese characters. The method is still very time-consuming, as no preclassification exists to reduce the number of matching characters by using the features obtained by the HT.

Combination schemes

For the recognition methods based on feature extraction, the character category is decided by finding the reference vector with the least distance from the input pattern. Motivated by the requirement of seeking more effective and more reliable features, much research effort has been made and various character features have been proposed. Nevertheless, it is apparent that none of these features can yield sufficient accuracy when used alone. This problem is caused by some inherent drawbacks common to all of the features.

One of these drawbacks is the lack of discrimination information. Theoretically speaking, the purpose of feature extraction is to assure the reliability of recognition by removing redundant and irrelevant information and enhancing the separability among pattern classes. In practice, features are often extracted by means of some measurements of the character pattern. An adverse effect of feature extraction is that although redundant information is removed, for a small number of characters some important discriminatory information is lost. In the Chinese character set, there are many pairs of similar characters that differ from each other only slightly. If such a significant difference is ignored by the measurement, ambiguity will occur between these characters. Worse is the fact that some dissimilar characters may have very similar feature vectors. If the reference vectors of the characters are crowded closely in the feature space, recognition would be very difficult. For example, the feature vector of an input pattern may deviate from its reference by a small distance, as is often the case for a somewhat noisy character sample, and be closest to a reference vector of another character category, resulting in a misrecognition that is difficult to avoid.

Another essential weakness of the character features lies in their low stability against noise or disturbance. In the character samples read from actual printed documents, there may exist many kinds of disturbance, such as blurred strokes, broken strokes, positional shift, character rotation, stroke thickness variation, and noisy points. Every feature is particularly sensitive to disturbances which can considerably affect the results of the measurements on which the feature is based. Under such a disturbance, a great distance will exist between the feature vectors of an input pattern and its reference, thereby degrading the recognition performance.

It is natural to consider an appropriate combination of some methods in order to gain a better result. Because different features are obtained by different measurements of the character patterns, it is reasonable to suppose that the features may have different character distributions in their feature spaces. Those characters that are troublesome for certain features may be very distinguishable for some other features. Thus, the separability among the character categories will be greatly enhanced if several different features are utilized jointly in recognition. This strategy is commonly adopted for Chinese character recognition. Hagita and Masuda combined the stroke distribution method and the direction stroke length distribution method [Hagita and Masuda 1981]. Immediately after this work, research which combined three kinds of features (line direction, crossing count, and background features) was reported [Fujii, et al. 1981]. For preclassification, local line direction can be used with the peripheral area vector [Takahashi 1982]. Most feature combination methods were first applied to handprinted Chinese characters. The major problem with the combination schemes is that it is difficult to choose features which are mutually independent. It is the mutual independence that makes the selected features sensitive to different disturbances and improves the stability of recognition by means of feature combination. To cope with this problem, an approach combining four independent features, namely crossing count, stroke proportion, and two peripheral features, has been proposed [Zhang, et al. 1989]. To find these four features, a correlation analysis of the distance was first made for all possible feature pairs among the features.

2.3.2 Structural techniques

Chinese characters are patterns which are highly structured. The structure of Chinese characters can be divided into three levels: the whole character level, the subpattern (radical) level, and the stroke level. The character level is the highest level while the stroke level is the lowest. For two radicals in a character, one radical may be on the left side of the other radical, over the other radical, or surrounded by the other radical. Two strokes inside a radical may be unconnected, or one stroke may contain some connecting points which join that stroke to the other stroke. It is very difficult to use statistical approaches to describe such complex pattern structures and the relations between subpatterns. It is the structural properties of Chinese characters that make the structural approaches very promising in handwritten and printed CCR. Recently, more efforts have been made in this direction. The structural approaches for CCR can be divided into two main streams, namely grammar approaches and graph approaches.

Grammar methods

In the grammar methods, the pattern is represented as a string, and a grammar, either context-free or a restricted context-sensitive grammar such as an indexed grammar or a programmed grammar, is used to describe the characters [Zhang and Xia 1983; Tai 19??]. A parser for that particular grammar is built to recognize the pattern. Further development along this line includes stochastic languages [Fu 1982], error-correcting parsing, and stochastic error-correcting parsing [Lee and Fu 1977]. However, a string grammar is still not powerful enough to handle very complex objects like handwritten Chinese characters, and therefore higher-dimensional grammars (tree, plex, or web) have been developed. These traditional grammar approaches are still weak in handling noisy or distorted patterns and numerical semantic information [Tsai and Fu 1980]. This shortcoming can be overcome if the attributed grammar approach is used. Considering the characteristics of handwritten Chinese characters and the existing problems, a two-dimensional extended attributed grammar has been proposed [Zhao 1990]. This method carries both overall character shape information and the statistical features of samples, and conducts top-down matching and bottom-up reduction. As the Chinese character set is very large and the grammar describing the characters in such a set is quite complicated, it is very difficult to do the parsing. Hence, it is not practical to use grammar approaches for CCR.

Graph methods

Another line of development is to use pattern matching instead of parsing. In this approach, the pattern is usually represented as a relational graph and graph matching is used. In order to incorporate more information into the relational graph, attributed graphs are used to combine the structural and statistical approaches [Tsai and Fu 1979; Shi and Fu 1983]. The attributed graph gives a very flexible representation of structural patterns, especially for handwritten Chinese characters, in which case the pattern primitives or vertices of the attributed graph can represent the strokes while the arcs or edges of the attributed graph can represent the relationships between the strokes. Realizing that many natural properties and relations of handwritten Chinese characters are fuzzy, the next step is to include fuzzy attributes in the attributed graph and use this information for further processing. The fuzzy attributed graph for handwritten CCR was proposed in [Chan, et al. 19??]. The major drawback of this approach is that the structural properties of Chinese characters are not fully utilized, and it is very difficult to organize the model database for efficient searching.

Regarding the three structural levels of Chinese characters, the hierarchical attributed graph representation (HAGR) is proposed for handwritten CCR in this thesis. The hierarchical attributed graph represents a whole character, with its vertices describing the radicals in the character and its arcs describing the spatial relations between the radicals. In the hierarchical attributed graph, a radical attributed graph corresponding to each vertex is used to represent a radical, with its vertices describing the strokes and its edges describing the relations between the strokes. With the HAGR, the huge model database can be organized as a tree with several levels, which facilitates accurate and fast searching.
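The two-level organization described above can be sketched as a small data structure. This is a hedged illustration only: the class names, the attribute keys (`position`, `relation`, `orientation`), and their values are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class AttributedGraph:
    """A graph whose vertices and arcs each carry a dictionary of attributes."""
    vertices: dict = field(default_factory=dict)  # vertex id -> attributes
    arcs: dict = field(default_factory=dict)      # (source, target) -> attributes

    def add_vertex(self, vertex_id, **attributes):
        self.vertices[vertex_id] = attributes

    def add_arc(self, source, target, **attributes):
        self.arcs[(source, target)] = attributes


@dataclass
class HierarchicalAttributedGraph:
    """Character-level graph of radicals; each radical owns a graph of strokes."""
    character: AttributedGraph = field(default_factory=AttributedGraph)
    radicals: dict = field(default_factory=dict)  # radical id -> AttributedGraph

    def add_radical(self, radical_id, stroke_graph, **attributes):
        self.character.add_vertex(radical_id, **attributes)
        self.radicals[radical_id] = stroke_graph


# A toy character made of two radicals placed side by side.
left = AttributedGraph()
left.add_vertex("s1", orientation="V")
left.add_vertex("s2", orientation="H")
left.add_arc("s1", "s2", relation="connected")

right = AttributedGraph()
right.add_vertex("s3", orientation="H")

hag = HierarchicalAttributedGraph()
hag.add_radical("r1", left, position="left")
hag.add_radical("r2", right, position="right")
hag.character.add_arc("r1", "r2", relation="left-of")
```

Keeping the radical graphs separate from the character graph is what allows a model database to be indexed level by level, first by radicals and their spatial relations, then by strokes.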

Chapter 3

Preprocessing

3.1 Introduction

Images of input handwritten Chinese characters are obtained by a video camera and then normalized and transformed into binary images. A binary image is a two-dimensional array where each element (pixel) is either 0 or 1. The character pattern consists of those pixels of value 1. Each stroke segment of the character pattern is more than one pixel thick. Various types of information can be extracted from a binary image by using some basic low-level operations. The input character pattern in the binary image is skeletonized through a thinning algorithm. Afterwards, the skeleton of the input character is traced to obtain the stroke segments, which are then merged to form the strokes of the input character. Geometric information such as the direction and position of strokes can then be extracted. According to the positions of the strokes and the connection relations between strokes, the strokes can be grouped into the radicals of the input character. All these operations are considered as preprocessing, which provides the local properties of the input character. These properties are then organized for further image analysis and representation.

3.2 Thinning

Thinning is a process by which a binary pattern is transformed into another binary pattern consisting of its skeleton. The major objectives of thinning in pattern recognition and image processing are to reduce data storage and transmission requirements, to reduce the amount of data to be processed, and to facilitate the extraction of features from the pattern.

Many thinning algorithms have been reported [Wakayama 1982; Lu and Wang 1985; Pavlidis 1982; Zhang and Suen 1984; Naccache and Shinghal 1984; Zhang and Fu 1984; Plamondon and Suen 1989]. The thinning algorithm described by Zhang and Suen is simple and fast, and can be implemented in parallel; however, the algorithm cannot prevent excessive erosion, so lines or curves that represent the true features of the object tend to be excessively shortened. In our system, the Zhang-Suen algorithm is modified to overcome this erosion problem.

The Zhang-Suen algorithm extracts the skeleton of the character pattern by removing all the edge pixels of the pattern except pixels that belong to the skeleton. In order to preserve the connectivity of the original pattern in the skeleton, iterative transformations are applied to the binary image of the character pattern, and each iteration is divided into two subiterations.

A 3 x 3 window is used to extract the skeleton. Let $n_0$ represent the given pixel $(i, j)$; the eight neighbors inside its window are $n_1$ to $n_8$ (see Figure 3.1).

Figure 3.1 Eight neighboring pixels of a pixel $n_0$

In the first subiteration, according to the values of the eight neighboring pixels, the contour pixel $n_0$ is deleted from the pattern if it satisfies all the following conditions:

$$2 \le Z(n_0) \le 6 \tag{3.1}$$

$$N(n_0) = 1 \tag{3.2}$$

$$n_1 \cdot n_3 \cdot n_5 = 0 \tag{3.3}$$

$$n_3 \cdot n_5 \cdot n_7 = 0 \tag{3.4}$$

where $Z(n_0)$ is the number of nonzero neighbors of $n_0$ and $N(n_0)$ is the number of "01" patterns in the ordered set $n_1, \ldots, n_8$.

In the second subiteration, the conditions (3.3) and (3.4) are changed into the conditions (3.5) and (3.6), and the rest remain the same.

By the conditions (3.3) and (3.4) of the first subiteration, the south-east edge pixels and the north-west corner pixels which do not belong to the skeleton are removed. Similarly, a pixel removed by the conditions (3.5) and (3.6) in the second subiteration might be a north-west boundary pixel or a south-east corner pixel. The iteration continues until no more pixels can be removed.
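The two subiterations can be sketched as follows. This is a sketch of the original Zhang-Suen algorithm as it is commonly published, not of the modified algorithm proposed in this chapter. It assumes that $n_1$ is the north neighbor and that $n_1$ to $n_8$ run clockwise, that the "01" patterns are counted cyclically, and that the image is padded with a one-pixel background border; the function names are introduced here.

```python
def neighbours(image, r, c):
    """Return [n1..n8]: the north neighbour first, then clockwise (assumed order)."""
    return [image[r - 1][c], image[r - 1][c + 1], image[r][c + 1],
            image[r + 1][c + 1], image[r + 1][c], image[r + 1][c - 1],
            image[r][c - 1], image[r - 1][c - 1]]


def zhang_suen(image):
    """Thin a binary image (list of lists of 0/1) with a background border."""
    img = [row[:] for row in image]
    rows, cols = len(img), len(img[0])
    changed = True
    while changed:
        changed = False
        for step in (0, 1):                       # the two subiterations
            to_delete = []
            for r in range(1, rows - 1):
                for c in range(1, cols - 1):
                    if img[r][c] != 1:
                        continue
                    n = neighbours(img, r, c)
                    nonzero = sum(n)              # number of nonzero neighbours
                    patterns_01 = sum(            # cyclic count of "01" patterns
                        n[i] == 0 and n[(i + 1) % 8] == 1 for i in range(8))
                    n1, n3, n5, n7 = n[0], n[2], n[4], n[6]
                    if step == 0:
                        directional = n1 * n3 * n5 == 0 and n3 * n5 * n7 == 0
                    else:
                        directional = n1 * n3 * n7 == 0 and n1 * n5 * n7 == 0
                    if 2 <= nonzero <= 6 and patterns_01 == 1 and directional:
                        to_delete.append((r, c))
            for r, c in to_delete:                # delete in parallel after the scan
                img[r][c] = 0
            changed = changed or bool(to_delete)
    return img


thin_line = [[0] * 7 for _ in range(5)]
for col in range(1, 6):
    thin_line[2][col] = 1

thick_bar = [[0] * 7 for _ in range(5)]
for row in range(1, 4):
    for col in range(1, 6):
        thick_bar[row][col] = 1
skeleton = zhang_suen(thick_bar)
```

Pixels are collected during the scan and deleted only afterwards, which mirrors the parallel nature of the algorithm: every decision within a subiteration is based on the same image.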

It was concluded by Zhang and Suen that:

• By condition (3.1), the end points of a skeleton line are preserved.

• Also, condition (3.2) prevents the deletion of those pixels that lie between the end points of a skeleton line.

If the algorithm is applied to the pattern in Figure 3.2(a), the pattern is excessively shortened. This pattern consists of a horizontal section and a diagonal section, each of which is two pixels wide. After thinning, the diagonal section is deleted because of the over-erosion of the algorithm (see Figure 3.2(b)).

Figure 3.2 Example of over-erosion by the Zhang-Suen algorithm.

In the first iteration, the end pixel (6,1) is deleted because this pixel satisfies all the conditions of the first subiteration. After pixel (6,1) is deleted, in the second subiteration, the pixel (6,2) is deleted because it satisfies all the conditions for deletion. No other pixels of the diagonal segment can be removed during the first iteration, as none of them satisfies all the conditions of the first or second subiteration. In a similar way, pixels (5,2) and (5,3) of the diagonal segment are removed in the second iteration, pixels (4,3) and (4,4) of the diagonal segment are removed in the third iteration, and so on. After the fifth iteration, no pixel can be further removed from the diagonal segment. Hence, only one pixel (2,5) of the diagonal segment is preserved after thinning.

It is clear that the problem of over-erosion is very severe for the Zhang-Suen algorithm. In an extreme case, when a diagonal segment two pixels wide is very long, the segment would vanish entirely. This will increase the difficulties in further image analysis and in the representation of the true features for character recognition. In order to avoid the over-erosion problem, a modified algorithm is proposed. In the first subiteration, in addition to the set of original conditions (3.1) to (3.4), a set of alternative conditions is also introduced:

$$\ldots \tag{3.7}$$

$$N(n_0) = 2 \tag{3.8}$$

$$Z(n_0) = 4 \tag{3.9}$$

The contour pixel $n_0$ is deleted if either the set of original conditions is satisfied or the set of alternative conditions is satisfied. The set of alternative conditions is used to guarantee that if the pixel $n_0$ is on a diagonal segment which is two pixels wide and its neighboring pixel is not on the background, then $n_0$ is removed. Therefore, the segment there becomes one pixel wide and the neighboring pixel can be preserved in the following iteration.

Similarly, in the second subiteration, $n_0$ is removed if either the set of original conditions (3.1), (3.2), (3.5), and (3.6) or the following set of alternative conditions is satisfied:

$$n_{?} \cdot (n_{?} + n_{?} + n_{?}) = 2 \tag{3.10}$$

$$N(n_0) = 2 \tag{3.11}$$

$$Z(n_0) = 4 \tag{3.12}$$

With the modified algorithm, the over-erosion problem is overcome. The comparison is illustrated in Figure 3.3. Figure 3.3(a) is the original pattern. Figure 3.3(b) is the skeleton of the pattern obtained by the Zhang-Suen algorithm. It is obvious that the diagonal section of the pattern is over-shortened. However, the diagonal section is preserved by the modified algorithm (see Figure 3.3(c)).

Figure 3.3 Comparison of the thinning algorithms: (a) original pattern, (b) Zhang-Suen algorithm, (c) modified algorithm.

3.3 Tracing

After thinning, the skeleton of a character is traced in order to extract the line segments. The feature points of the skeleton (... points and end-points) are detected. A segment can be extracted by tracing from one feature point to the other. When a skeleton pixel is visited, it is marked and its coordinates are recorded. By scanning row by row from top to bottom, if an unmarked pixel (called a locating point) is found, the remaining skeleton connected by this pixel can be traced and marked, and the corresponding segments can then be extracted.

Definition 3.1 A separating point "s" is a pixel connecting two line segments which have different directions.

The marked segments are first encoded by Freeman's chain code. The codes "A" to "H" are used to indicate eight different directions, as shown in Figure 3.4.

Figure 3.4 Codes and directions for Freeman's chain coding.

Each segment is coded from top to bottom and from left to right. For a segment, coding starts from its feature point on the top or left and ends at the other feature point on the bottom or right, respectively. The coding of a segment in a loop starts from its locating point "*" and ends at the point "*" along an anti-clockwise direction. Several examples are given in Figure 3.5.
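Chain coding of a traced pixel path can be sketched as below. The mapping of letters to directions is an assumption inferred from the chain-code examples in this section (A = east, then clockwise in image coordinates with rows growing downwards: B = south-east, C = south, D = south-west, E = west, F = north-west, G = north, H = north-east); the function name is introduced here.

```python
# (row step, column step) -> code letter; rows grow downwards (assumed mapping)
CODES = {
    (0, 1): "A", (1, 1): "B", (1, 0): "C", (1, -1): "D",
    (0, -1): "E", (-1, -1): "F", (-1, 0): "G", (-1, 1): "H",
}


def chain_code(path):
    """Encode a list of 8-connected (row, column) pixels as a chain-code string."""
    return "".join(
        CODES[(r2 - r1, c2 - c1)]
        for (r1, c1), (r2, c2) in zip(path, path[1:])
    )


# A nearly horizontal segment traced from left to right.
path = [(0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (1, 6)]
code = chain_code(path)
```

Because each code depends only on the step between consecutive pixels, the code string has one symbol fewer than the number of pixels in the path.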

Figure 3.5 (a), (b)

For example, according to Freeman's chain code, the segment "1" in Figure 3.5(a) has the chain code "AAAAAB". It starts from one feature point and ends at the other. For the segment in the loop in Figure 3.5(b), the chain code starts at "*" and ends at "*". The chain code is "DCCCCBAAAAAAHGGGGFEEEEEE". The loop can be separated into line segments with different directions. The separation is performed according to these rules:

1. Each line segment has one or two kinds of code.

2. A code in a chain is defined as the primitive code if it is the majority code in the line segment.

The procedure is as follows:

1. Scan the chain code of a segment.

2. Count the number of different kinds of code (denoted as $N_d$) and the number of codes which are consecutive and identical (denoted as $N_c$).

3. If $N_d$ reaches three, then the scanned substring of the chain code is selected as a line segment.

4. If the selected substring is too short (say, less than four), the first code is neglected, $N_d$ is reset to less than three, and the procedure continues until $N_d$ reaches three again.

5. If the number of consecutive and identical codes ($N_c$) is greater than three, then the chain code is selected and the corresponding line segment is extracted.

6. If two line segments are connected to each other and have the same primitive code, then they are combined, and a new line segment is selected to replace them.

For example, the chain code of the loop segment in Figure 3.5(b) is scanned. The first five codes contain two kinds of code, $N_d = 2$. When the 6th code is scanned, the first five codes are selected and the corresponding line segment with the vertical direction is extracted. The procedure continues until the end of the chain code is reached. The loop is represented by four line segments: "DCCCC" (corresponding to the left vertical line segment), "BAAAAAA" (corresponding to the bottom horizontal line segment), "HGGGG" (corresponding to the right vertical line segment), and "FEEEEEE" (corresponding to the top horizontal line segment).
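The core of the separation, cutting the chain whenever a third kind of code appears, can be sketched as follows. This is a deliberate simplification: it covers rules 1 and 2 and step 3 only, and leaves out the handling of short substrings and the recombination of connected segments (steps 4 to 6). The function names are introduced here.

```python
from collections import Counter


def split_chain(code):
    """Cut a chain code so that every piece holds at most two kinds of code."""
    segments, current = [], ""
    for symbol in code:
        # A third distinct code closes the current piece and starts a new one.
        if symbol not in current and len(set(current)) == 2:
            segments.append(current)
            current = ""
        current += symbol
    if current:
        segments.append(current)
    return segments


def primitive_code(segment):
    """The majority code of a line segment."""
    return Counter(segment).most_common(1)[0][0]


loop = "DCCCCBAAAAAAHGGGGFEEEEEE"
pieces = split_chain(loop)
primitives = [primitive_code(piece) for piece in pieces]
```

On the loop example from the text the sketch yields the same four line segments, with primitive codes that correspond to the two vertical and two horizontal sides.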


3.4 Merging

Two line segments can be merged to form one stroke if they share the same key point (feature point, locating point, or separating point) and have the same orientation. The remaining line segments which cannot be merged remain as individual strokes. For each line segment, the orientation can be determined by:

where $(x_1, y_1)$ and $(x_2, y_2)$ are the two key points of the line segment. The orientation of a line segment is:

Figure 3.6 Classification of the line segments: right diagonal (RD), vertical (V), left diagonal (LD).

1) horizontal (H), if $0^\circ \le \alpha < 22.5^\circ$ or $157.5^\circ \le \alpha \le 180^\circ$,

2) right diagonal (RD), if $22.5^\circ \le \alpha < 67.5^\circ$,

3) vertical (V), if $67.5^\circ \le \alpha < 112.5^\circ$,
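The classification can be sketched as below. Several points are assumptions: the angle is taken as the arctangent of the slope between the two key points, folded into the range from 0 to 180 degrees, with the y axis pointing upwards; and the remaining range from 112.5 to 157.5 degrees is assigned to the left diagonal (LD) class. The function names are introduced here.

```python
import math


def orientation_angle(p1, p2):
    """Angle of the line through two key points, folded into [0, 180) degrees."""
    (x1, y1), (x2, y2) = p1, p2
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def classify_orientation(angle):
    """Map an angle in degrees to one of the four orientation classes."""
    if angle < 22.5 or angle >= 157.5:
        return "H"    # horizontal
    if angle < 67.5:
        return "RD"   # right diagonal
    if angle < 112.5:
        return "V"    # vertical
    return "LD"       # left diagonal (assumed remaining range)


labels = [
    classify_orientation(orientation_angle((0, 0), end))
    for end in [(5, 0), (3, 3), (0, 5), (-3, 3), (-5, 0)]
]
```

Folding the angle modulo 180 degrees makes the result independent of which key point is taken first, which matters when two segments are compared before merging.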
