A two-stage outlier rejection strategy for numerical field extraction in handwritten documents
Related documents
In the literature, some recent works [Zimmermann 06, Vinciarelli 04, Bertolami 08] have addressed the processing of lightly constrained handwritten documents such as free mails.
A priori knowledge, such as the number of digits of a sought field and the potential position of the separators, is merged into some text line models, for each kind of
We now describe the components of the segmentation-driven recognition: the numeral classifier, the touching digit recognition method using a segmentation method, and the reject
In this paper, an innovative information extraction system is introduced, based on a text line model able to handle relevant (words of a lexicon) and irrelevant (everything
For more realistic problems where some outliers can be sampled but not completely, different solutions are usable but using reliability functions with RBFN allows more operating points
In order to tune these (class-dependent) rejection thresholds, an algorithm based on dynamic programming is proposed which focuses on maximizing the recognition rate for a
We base our analysis on rules commonly used for French mail: sender details are on the top left of the page, recipient details, date and place are on the top right, subject is
Composite hypotheses, nuisance parameters and multiple hypotheses are discussed; minimax, invariant, and most stringent tests are introduced; and some asymptotic approaches