Tuesday, 30 October 2012

Handwritten Chinese Text Recognition by Integrating Multiple Contexts


Abstract:


This paper presents an effective approach for the offline recognition of unconstrained handwritten Chinese texts. Under the general integrated segmentation-and-recognition framework with character oversegmentation, we investigate three important issues: candidate path evaluation, path search, and parameter estimation. For path evaluation, we combine multiple contexts (character recognition scores, geometric and linguistic contexts) from the Bayesian decision view, and convert the classifier outputs to posterior probabilities via confidence transformation. In path search, we use a refined beam search algorithm to improve the search efficiency and, meanwhile, use a candidate character augmentation strategy to improve the recognition accuracy. The combining weights of the path evaluation function are optimized by supervised learning using a Maximum Character Accuracy criterion. We evaluated the recognition performance on the Chinese handwriting database CASIA-HWDB, which contains nearly four million character samples of 7,356 classes and 5,091 pages of unconstrained handwritten texts. The experimental results show that confidence transformation and combining multiple contexts improve the text line recognition performance significantly. On a test set of 1,015 handwritten pages, the proposed approach achieved a character-level accurate rate of 90.75 percent and a correct rate of 91.39 percent, which are superior by far to the best results reported in the literature.
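
The path evaluation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a sigmoid confidence transformation and a weighted sum of log-probabilities over the characters of a candidate path. The function names, the sigmoid parameters, the dictionary keys, and the weights are all hypothetical and are not taken from the paper.

```python
import math

def sigmoid_confidence(score, alpha=1.0, beta=0.0):
    """Map a raw classifier output to a pseudo-posterior in (0, 1).

    alpha and beta are hypothetical calibration parameters; in practice
    they would be fitted on held-out data.
    """
    return 1.0 / (1.0 + math.exp(-(alpha * score + beta)))

def path_score(chars, weights):
    """Score one candidate segmentation-recognition path.

    Each element of chars is a dict with a raw classifier output and two
    context probabilities (assumed to be precomputed and nonzero).
    The score is a weighted sum of log-probabilities.
    """
    total = 0.0
    for c in chars:
        # Character recognition score after confidence transformation.
        total += math.log(sigmoid_confidence(c["classifier_output"]))
        # Geometric and linguistic contexts, each scaled by its own weight.
        total += weights["geometric"] * math.log(c["p_geometric"])
        total += weights["linguistic"] * math.log(c["p_linguistic"])
    return total

# Two hypothetical candidate paths for the same text line.
weights = {"geometric": 0.5, "linguistic": 0.8}
path_a = [
    {"classifier_output": 2.0, "p_geometric": 0.9, "p_linguistic": 0.6},
    {"classifier_output": 1.5, "p_geometric": 0.8, "p_linguistic": 0.7},
]
path_b = [
    {"classifier_output": 0.2, "p_geometric": 0.4, "p_linguistic": 0.1},
]
best = max([path_a, path_b], key=lambda p: path_score(p, weights))
```

In this sketch the weights are fixed by hand. The abstract states that the paper instead learns the combining weights with a Maximum Character Accuracy criterion, which is not reproduced here.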


System diagram of handwritten Chinese text line recognition
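
To make the path search stage concrete, here is a small sketch of a plain beam search over a candidate lattice built from oversegmentation. It is not the refined beam search of the paper. It assumes that primitive segments are numbered from left to right and that each candidate character is an edge `(start, end, label, log_score)` covering segments `start` up to `end - 1`. The function name, the edge format, and the scores are hypothetical.

```python
from collections import defaultdict

def beam_search(num_segments, edges, beam_width=3):
    """Return (score, labels) of the best path found, or None if no path
    covers all segments.

    edges: list of (start, end, label, log_score) with
           0 <= start < end <= num_segments.
    """
    by_start = defaultdict(list)
    for start, end, label, log_score in edges:
        by_start[start].append((end, label, log_score))

    # beams[i] holds partial paths (score, labels) covering segments [0, i).
    beams = defaultdict(list)
    beams[0] = [(0.0, [])]

    for pos in range(num_segments):
        # Keep only the beam_width best partial paths ending at this position.
        survivors = sorted(beams[pos], key=lambda p: p[0], reverse=True)[:beam_width]
        for score, labels in survivors:
            # Extend each survivor with every candidate character starting here.
            for end, label, log_score in by_start[pos]:
                beams[end].append((score + log_score, labels + [label]))

    finished = beams[num_segments]
    if not finished:
        return None
    return max(finished, key=lambda p: p[0])

# Hypothetical lattice over three primitive segments.
example_edges = [
    (0, 1, "A", -1.0),
    (1, 2, "B", -1.0),
    (2, 3, "C", -1.0),
    (0, 2, "D", -0.5),  # one character spanning two primitive segments
    (1, 3, "E", -3.0),
]
result = beam_search(3, example_edges, beam_width=2)
```

Because every edge moves strictly to the right, all partial paths that end at a position are known before that position is expanded, so a single left-to-right pass is enough. A smaller `beam_width` trades accuracy for speed.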

A page of handwritten Chinese text
