OCR Shop XTR/API Glossary

ADF	Automatic document feeder
ASCII	An acronym for American Standard Code for Information Interchange. A code in which the numbers from 0 to 127 stand for text and control characters. ASCII code is used for representing text inside a computer and for transmitting text between computers or between a computer and a peripheral device.
auto segmentation	The process by which OCR Shop XTR determines where different elements lie on a page, such as pictures and columns of text, and divides the page into regions.
binary image	An image that is represented using only one bit per pixel. Such images are also called black and white, monochrome, bi-level, or 1-bit.
bitmapped image	A collection of bits (dots) in memory that represent the scanned image. The display on the screen is a visible bitmapped image.
chunky bitmapped image	A bitmap that has an internal structure where all of the data for a particular pixel is stored in a "chunk", i.e. a contiguous set of bits. For example, a 24-bit chunky image has sets of three contiguous bytes that represent each pixel.
Code Page	Code Page is a Microsoft® term. A code page is a particular mapping of a set of unsigned bytes to a set of visible characters (and space characters). Different code pages are used to represent in memory the characters in different languages. See http://www.microsoft.com/globaldev/reference/WinCP.asp for more details. Also see shape pack.
compound document	A compound document is a set of one or more pages that consists of a mixture of text and images, for example PDF or HTML.
conversion filter	A program that translates one file format into another. For example, the mpage conversion filter can translate an ASCII file into a PostScript file.
digital image	A digital image is the way a picture or visual image of some object is represented in computer memory. A digital image consists of a number of pixels and a description of how the pixels are arranged to form the image. In addition, information about how each pixel stores the color of the original image is included.
dithering	A method of representing an image using fewer colors than the image actually has, by arranging pixels of the available colors in patterns that approximate the missing shades.
document	A document is a set of related pages, usually because the sense of the text on one page flows into the next, as in a book. For OCR Shop XTR, it is best to arrange documents as sets of pages in which the same font or set of fonts continues from one page to the next. This takes best advantage of the font learning system built into the OCR Shop XTR recognition system.
dpi	An abbreviation for dots per inch. This is the number of dots per linear inch that a printer can print or a scanner can produce. See also resolution.
frame	A frame is a way to represent the maximum extent of some page element in the horizontal and vertical directions (X and Y coordinates, respectively). A frame can be thought of as a rectangle that is lined up with the X and Y axes. Frames are represented by four numbers, which can be either (top, left, bottom, right) or (top, left, height, width). Also see UOR.
grayscale	A way of encoding images that excludes all color information; the image is made up entirely of shades of gray.
machine id	The hardware ID of a machine, used to identify the host (the hostid on Solaris, or the Ethernet MAC address on Linux).
image depth	The image depth, or bit depth, of an image is the number of bits per pixel in the image. A binary, or monochrome, image has a depth of one (1).
language pack	A language pack is a data file supplied with the OCR Shop XTR that includes information about how the characters of a given language are put together to write words and sentences in the language. Language packs contain information about the common words used in a language, rules for punctuation and the conventions used when writing things such as numbers, money amounts and dates.
language set	A language set is the set of supported languages that can be recognized with a given shape pack loaded. Each of the supported languages in a language set may or may not have an available language pack associated with it. Languages without an available language pack can still be recognized, but accuracy for these languages will not be as high as for languages for which a pack exists.
lexical constraints	A lexical constraint is a set of restrictions on how the characters on a given page or region within a page can be recognized. Constraints can include the set of languages allowed and/or a character set to which recognition is restricted. A lexical constraint can be either a preference (weaker) or an absolute restriction (stronger). A custom word list can be used as an additional lexical constraint on the recognition.
lexicon	A lexicon is a list of words used in a given language and perhaps in a special setting. Language packs supplied with OCR Shop XTR contain built-in general-purpose lexicons. Users may specify a custom lexicon with the user_lexicon parameter.
monospaced font	Any font in which all characters have the same width. For example, in Courier New (a monospaced font), the letter "M" is the same width as the letter "l". Thus, "MMMMM" is the same width as "lllll".
OCR	OCR stands for Optical Character Recognition, and refers to the process of taking a scan of a printed document and converting it to text. OCR does not include recognition of handwriting.
orient	To orient a page is to rotate the page in memory so that it is better positioned for display to the user and/or recognition by the OCR engine. A page is oriented for recognition when the text flows left to right (from low X to high X coordinates) and from top to bottom (low Y to high Y coordinates).
Portable Document Format (PDF)	PDF is a standard file format used for distribution, viewing and printing of text and images. It combines fonts, text, images, and graphics in a consistent layout for use across many platforms. PDF files are designed to permit compact storage of image and text data for easy transmission and file storage. See the PDF FAQ.
pixel	Pixel is short for picture element: a point (dot) on the graphics screen and the smallest definable unit of a digital image. Each pixel represents a single point in the image. The number of pixels per unit distance (dots per inch, or dpi, for instance) within a digital image is referred to as the resolution of the image. A pixel can be binary, gray, or color, or can be an index into a palette. Binary pixels require only one binary digit (bit) of computer memory to store; gray, color, and indexed pixels use more bits, with 4, 8, and 24 being common values for the number of bits (bit depth) used.
planar image	A planar image is a bitmap with an internal structure that separates elements of pixel data into discrete "planes". For example, a 24-bit RGB planar image might contain all of the red pixel information for the entire bitmap contiguously, followed by all of the green pixel information, and then the blue.
point	A typographic unit of measurement equal to 1/72 inch, measured vertically. Points are used to describe font size.
proportional font	Any font in which characters differ in width. For example, in the proportional font used here, the letter "M" is wider than the letter "l". Thus, "MMMMM" is wider than "lllll".
recognize	In the context of the OCR Shop XTR™/API, an image is recognized when it is processed by the OCR engine that is part of OCR Shop XTR. During this process, the OCR engine determines which pixels of the digital image are parts of visible text characters. The identities of those characters are also determined and stored in memory using the code page representation of each character. The result of recognition is used to create output based on the user settings.
region	A region is an area of a page that usually contains either text or images, but not both. Regions can be determined by the OCR Shop XTR during auto-segmentation or specified by vvEngAPI::vvSetRegionProperties. Regions on a page can overlap. Regions can be simple rectangles in shape or they can be more complex (see UOR).
resolution	The fineness with which a scanner, printer, or other device produces information. It is expressed in dots per inch (dpi). A higher dpi produces a sharper image. The term can also refer to the intended dots per inch of a bitmap image.
RGB (red-green-blue)	A way of encoding images commonly used in computer monitors that breaks the image down into red, green, and blue components. Red, green, and blue are also referred to as "additive colors": combining 100% of each produces white.
shape pack	A shape pack is a data file supplied with the OCR Shop XTR that describes the shapes of the characters that can be recognized by the OCR engine when that shape pack is loaded. Each shape pack corresponds to a particular code page that will be used for output when that shape pack is loaded. For each shape pack there is an implied language set that represents the supported languages that can be recognized with that shape pack loaded.
skew	Skew is the amount of tilt in an input image. Skew is generally used to describe the tilt in images that include text. In such images the tilt is more apparent and affects recognition and layout analysis.
subimage	A subimage is a bitmap image that represents a region of the current document page.
swap file	An area of the hard disk that is used for temporary data storage when RAM is low or used up. This is also known as virtual memory. A swap file lets you run more programs than you could with actual memory, but it is slower than using regular memory.
text file	A file containing information in text form; its contents are interpreted as characters encoded using the ASCII (or comparable) format.
TIFF	An abbreviation for Tagged Image File Format. This is a standard graphic file format for grayscale and high-resolution bitmapped images.
TrueType™ fonts	One of the major types of scalable fonts. These can be printed or displayed on the screen at any size.
Unicode	Unicode is a standard for representing visible characters using a stream of bytes in computer memory or on some other digital storage medium. Unlike code pages, where each code page can only be used to describe a subset of the known written languages, Unicode is a single standard way to represent all of the world's common written languages. Whereas the code page representation uses a single byte to represent each character, Unicode uses a 16-bit word for each character. The OCR engine that is part of OCR Shop XTR does recognition internally based on a single selected code page. During output, however, the text data can be converted to Unicode for use with other applications that expect text data in Unicode format.
UOR (Union of Rectangles)	"UOR" stands for "union of rectangles", a group of rectangles which together define the area covered by a region. By allowing a region to be defined by a number of rectangles, region boundaries may be flexible and precise so that they include only the area needed. This is especially important for image documents where text and graphics are interwoven such that one region could not be defined by one rectangle without including bits and pieces of graphics that should not be recognized as text. Also see the frequently asked question "What is a UOR?"
XDOC	A ScanSoft text output format which provides detailed information about the text, images, and formatting in a recognized document. See the XDOC FAQ.
zone	See Region.
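
As an illustration of the frame and UOR entries above, the following Python sketch converts between the two four-number frame representations and tests whether a point lies inside a union of rectangles. The names Frame, from_tlhw, to_tlhw, contains, and uor_contains are hypothetical and are not part of the OCR Shop XTR/API. The sketch also assumes that Y increases downward and that the bottom and right edges are exclusive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Axis-aligned rectangle stored as (top, left, bottom, right)."""
    top: int
    left: int
    bottom: int
    right: int

    @classmethod
    def from_tlhw(cls, top, left, height, width):
        # Convert the (top, left, height, width) form to edge coordinates.
        return cls(top, left, top + height, left + width)

    def to_tlhw(self):
        # Convert edge coordinates back to (top, left, height, width).
        return (self.top, self.left,
                self.bottom - self.top, self.right - self.left)

    def contains(self, x, y):
        # Assumption: bottom and right edges are exclusive.
        return self.left <= x < self.right and self.top <= y < self.bottom


def uor_contains(rects, x, y):
    """A point lies in a union of rectangles if any one rectangle contains it."""
    return any(r.contains(x, y) for r in rects)


# An L-shaped region that a single rectangle could not describe
# without also covering the empty upper-right corner.
l_shape = [Frame(0, 0, 100, 40), Frame(60, 40, 100, 120)]
```

With these definitions, uor_contains(l_shape, 100, 10) is false because the point falls in the corner that neither rectangle covers, while a single bounding frame around the same region would have included it.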

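The difference between the chunky and planar layouts defined in the glossary can be sketched in Python as follows. The function names chunky_to_planar and planar_to_chunky are hypothetical, and the sketch assumes 24-bit RGB data with one byte per channel and no row padding; it does not describe how OCR Shop XTR stores images internally.

```python
def chunky_to_planar(data: bytes) -> bytes:
    """Rearrange RGBRGB... into RRR...GGG...BBB..."""
    if len(data) % 3:
        raise ValueError("expected 3 bytes per pixel")
    # Every third byte, starting at offset 0, 1, or 2, forms one plane.
    return data[0::3] + data[1::3] + data[2::3]


def planar_to_chunky(data: bytes) -> bytes:
    """Rearrange RRR...GGG...BBB... back into RGBRGB..."""
    if len(data) % 3:
        raise ValueError("expected three equal-sized planes")
    n = len(data) // 3
    red, green, blue = data[:n], data[n:2 * n], data[2 * n:]
    # zip pairs the i-th byte of each plane into one pixel "chunk".
    return bytes(b for pixel in zip(red, green, blue) for b in pixel)


# Two pixels with channel values (1, 2, 3) and (4, 5, 6).
chunky = bytes([1, 2, 3, 4, 5, 6])
planar = chunky_to_planar(chunky)
```

Here the chunky buffer keeps each pixel's three bytes together, while the planar buffer holds both red values first, then both green values, then both blue values.
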
Generated on Thu Dec 11 09:32:25 2003 for OCR Shop XTR/API User Documentation by Doxygen 1.3.2