Khmer Optical Character Recognition Initiative

The Khmer OCR project began in July 2008. After full training, two engines of PLC’s OCR system were created. They can recognize good-quality scanned documents in a plain layout (no tables, pictures, or columns, and no bold, italic, or underlined text) printed in either Limon S1 or Limon R1 at size 22. The engines output ASCII text, which can be easily converted into Khmer Unicode using the existing Conversion library, one of the libraries developed during Phase I of the PAN Localization project. The output can also be saved in two document formats: Microsoft Word 2003 (*.doc) and plain text (*.txt).

