Optical Character Recognition (OCR) is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text, so that a document can be captured at electronic speed simply by scanning it. OCR is a field of research in pattern recognition, artificial intelligence and computer vision. More recently, the term Intelligent Character Recognition (ICR) has been used to describe the process of interpreting image data, in particular alphanumeric text.

In practical terms, OCR converts printed materials into text or word processing files that can be easily edited and stored. Its primary advantage is that the information stays in a format that is both machine-readable and human-readable, whereas barcodes and 2D symbols are only machine-readable.

OCR is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish text on a website. Once text has been recognized, it can be edited, searched for a word or phrase, stored more compactly, displayed or printed free of scanning artifacts, and processed with techniques such as machine translation, text-to-speech and text mining. Common industrial applications include date and lot tracking on pharmaceutical or food packaging, mail sorting at post offices and other document handling, and reading serial numbers in automotive or electronics manufacturing.

Early OCR systems required calibration to read a specific font: they had to be programmed with images of each character and worked on one font at a time. “Intelligent” systems with a high degree of recognition accuracy for most fonts are now common. Some systems can reproduce formatted output that closely approximates the original scanned page, including images, columns and other non-textual components.
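The idea of programming a system with images of each character can be illustrated with a minimal template-matching sketch in Python. Everything here is an assumption made for illustration: the 3x5 glyph templates, the names TEMPLATES, mismatches and recognize, and the choice of a simple pixel-disagreement count as the comparison. Real systems of that era differed in detail, and this is not a description of any particular product.

```python
# Hypothetical 3x5 templates for a single "font"; '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def mismatches(glyph, template):
    """Count the pixels where the glyph and the template disagree."""
    return sum(
        g != t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph, templates=TEMPLATES):
    """Return the label of the template with the fewest mismatching pixels."""
    return min(templates, key=lambda label: mismatches(glyph, templates[label]))


# A scanned 'T' with one ink pixel lost to noise.
noisy_t = ["###", ".#.", ".#.", "...", ".#."]
print(recognize(noisy_t))
```

Because the stored images belong to one font, a glyph from a different typeface would match poorly against every template, which is one way to see why such systems worked on one font at a time.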

OCR technology requires both hardware and software. In addition, sophisticated OCR systems require an additional circuit board in the computer itself to complete the process. An optical scanner scans the text on a page and converts the image into a grid of dots called a bitmap. The software then analyzes the bitmap: it can distinguish where lines start and stop and read most common fonts, and it translates the bitmap into computer text.
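One simple way software can distinguish where lines start and stop in a bitmap is to look for runs of pixel rows that contain ink, separated by blank rows. The Python sketch below assumes the bitmap is a list of rows of 0 (background) and 1 (ink); the function name find_text_lines and the tiny sample page are hypothetical, and production systems use more robust methods that cope with skew and noise.

```python
def find_text_lines(bitmap):
    """Return (first_row, last_row) pairs for each run of rows containing ink."""
    lines = []
    start = None  # row index where the current text line began, if any
    for row_index, row in enumerate(bitmap):
        has_ink = any(row)
        if has_ink and start is None:
            start = row_index  # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, row_index - 1))  # the line ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(bitmap) - 1))  # the last line runs to the bottom edge
    return lines


# A tiny hypothetical page: two text lines separated by blank rows.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
]
print(find_text_lines(page))  # [(1, 2), (5, 6)]
```

Each returned row range can then be cut out of the bitmap and split into individual characters in the same way, by scanning columns instead of rows.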