We often get questions about the best font to use when designing a header sheet or form that will be read by OCR. OCR stands for Optical Character Recognition, the process of converting an image into searchable, editable computer text. It is required to extract data automatically from scanned images with a product such as MetaTool or MetaServer.

The biggest challenge for any OCR technology is distinguishing between characters that look similar. For example, the OCR engine needs to distinguish the letter O from the digit 0, and an l (lowercase L) from an I (uppercase i). To be fair, without any context even we have difficulty telling an l from an I in the font used on this web page.

We tried a range of standard Windows fonts and found that the standard Tahoma font works best. Tahoma is a modern-looking, sans serif (no curls), proportionally spaced font. Proportionally spaced means that, for example, the letter i takes less space than the letter w, which makes text more readable and also more compact. Tahoma is present on any Windows system, and the main reason it works so well is that its look-alike characters differ enough for the OCR engine to interpret each character correctly even without any context.

We use OCR software daily in an automated system, and after testing dozens of fonts (including some OCR-specific ones) we found that Calibri is consistently the best.

You’ll now have a good training image called. If you’re adding multiple fonts, or bold, italic or underline, repeat this process multiple times, creating one doc, PDF, and TIFF per font variation.
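
To see where font choice matters most on your own header sheets, it can help to flag values that contain the look-alike characters mentioned above (O and 0, l and I). The following is only a minimal sketch: the function name `find_confusables` and the character groups are our own assumptions for illustration (the digit 1 is added as a further assumed look-alike), and they are not part of any OCR product.

```python
# Hypothetical groups of look-alike characters, taken from the examples
# in the text (O vs 0, l vs I). The digit 1 is an added assumption.
CONFUSABLE_GROUPS = [
    set("O0"),
    set("lI1"),
]


def find_confusables(text):
    """Return (index, character) pairs for every look-alike character in text."""
    confusable = set().union(*CONFUSABLE_GROUPS)
    return [(i, ch) for i, ch in enumerate(text) if ch in confusable]


def needs_review(text):
    """Return True if text contains at least one look-alike character."""
    return bool(find_confusables(text))
```

A value such as an invoice code made up only of unambiguous characters would pass without flags, while a code mixing O and 0 would be flagged so you can check it against a test scan in your chosen font.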