| dm_min_point |
Minimum point size.
Setting dm_min_point guides the OCR engine in what font sizes to expect. For example, if you set dm_min_point to 14, you effectively tell the engine, "Expect only fonts 14 points and larger."
Setting dm_min_point does NOT tell the engine, "Recognize only fonts 14 points and larger." In other words, if dm_min_point is 14 and the input image includes fonts smaller than 14 points, then the engine still processes all of the page text, including the smaller fonts, but it will not be able to recognize the smaller fonts well because they fall below dm_min_point.
Please note that the resolution of the input image affects what a point size means to the engine. For example, a letter 20 pixels high in an input image with a resolution of 200dpi is a different point size than a letter 20 pixels high in an input image with a resolution of 300dpi.
dm_max_point works in conjunction with dm_min_point.
dm_min_point should be set before the OCR session is started (vvEngAPI::vvStartOCRSes).
- Type: int
- Range: 5-72
- Default: 5
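The relationship between pixel height, resolution, and point size can be sketched as follows. This is only an illustration, assuming that a letter's height in pixels approximates its font size and using the standard definition of 1 point = 1/72 inch; the helper names pixelsToPoints and inExpectedRange are hypothetical and are not part of the engine API.

```cpp
#include <cassert>

// Convert a letter height in pixels to points (1 point = 1/72 inch).
// Hypothetical helper, not an engine function.
double pixelsToPoints(int heightPixels, int dpi) {
    assert(dpi > 0);
    return heightPixels * 72.0 / dpi;
}

// True when a point size lies inside the range given by
// dm_min_point and dm_max_point (both bounds treated as inclusive,
// which is an assumption).
bool inExpectedRange(double points, int minPoint, int maxPoint) {
    return points >= minPoint && points <= maxPoint;
}
```

Under these assumptions, a letter 20 pixels high is about 7.2 points at 200dpi but only about 4.8 points at 300dpi, which is below the default dm_min_point of 5.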
|
| dm_max_point |
Maximum point size.
dm_max_point works in conjunction with dm_min_point to specify to the OCR engine what font sizes to expect. For example, if you set dm_max_point to 18, you effectively tell the engine, "Expect only fonts 18 points and smaller."
Setting dm_max_point does NOT tell the engine, "Recognize only fonts 18 points and smaller." In other words, if dm_max_point is 18 and the input image includes fonts larger than 18 points, then the engine still processes all of the page text, including the larger fonts, but it will not be able to recognize the larger fonts well because they fall outside the dm_min_point and dm_max_point range.
Please note that the resolution of the input image affects what a point size means to the engine. For example, a letter 20 pixels high in an input image with a resolution of 200dpi is a different point size than a letter 20 pixels high in an input image with a resolution of 300dpi.
dm_min_point works in a similar manner to dm_max_point.
dm_max_point should be set before the OCR session is started (vvEngAPI::vvStartOCRSes).
- Type: int
- Range: 5-72
- Default: 72
|
| dm_accept_thresh |
Acceptability threshold.
The threshold above which a character is always considered acceptable. (XDOC output only)
When dm_xdc_cconf is set to vvYes, XDOC output will include confidence value information for characters when their confidence values are between the questionability and acceptability thresholds.
This value should be set before calling vvEngAPI::vvRecognize.
More information on the XDOC output format is available in the core12xdc.pdf and kdoctxt.h documents, included with the distribution in /opt/Vividata/doc or available upon request.
See dm_quest_thresh, dm_xdc_wconf, dm_xdc_cconf.
- Type: int
- Range: 0-999
- Default: 999
|
| dm_quest_thresh |
Questionability threshold.
For XDOC output, when the confidence value for a character is below the questionability threshold, then the questionable character mark is printed in the output file.
When dm_xdc_cconf is set to vvYes, the output will include confidence value information for characters when their confidence values are between the questionability and acceptability thresholds.
This value should be set before calling vvEngAPI::vvRecognize.
More information on the XDOC output format is available in the core12xdc.pdf and kdoctxt.h documents, included with the distribution in /opt/Vividata/doc or available upon request.
See dm_questionable for information on changing the character used to denote a questionable character in certain output formats.
See dm_accept_thresh, dm_xdc_wconf, dm_xdc_cconf.
- Type: int
- Range: -1 to 999
- Default: 0
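The interaction of dm_quest_thresh and dm_accept_thresh can be pictured as a three-way classification of each character's confidence value. The sketch below is not the engine's code; the enum, the function name, and the treatment of values exactly equal to a threshold are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical labels for the three confidence bands described above.
enum class CharStatus {
    Questionable,  // below dm_quest_thresh: questionable mark is printed
    Reported,      // between the thresholds: confidence is output when
                   // dm_xdc_cconf is vvYes
    Accepted       // above dm_accept_thresh: always considered acceptable
};

// Classify one character confidence (0-999) against the two thresholds.
// Values equal to a threshold fall into the middle band (an assumption).
CharStatus classifyConfidence(int confidence, int questThresh, int acceptThresh) {
    assert(questThresh <= acceptThresh);
    if (confidence < questThresh) return CharStatus::Questionable;
    if (confidence > acceptThresh) return CharStatus::Accepted;
    return CharStatus::Reported;
}
```

With the defaults (dm_quest_thresh 0, dm_accept_thresh 999), every confidence value from 0 to 999 lands in the middle band under this reading.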
|
| dm_recmode |
Recognition mode.
|
| dm_recomp |
Page recomposition enabled.
Page recomposition (dm_recomp) alters the way some things are written to an XDOC output file. Page recomposition is ONLY applicable to the XDOC output format. When page recomposition is on, captions, headers, and footers are output before the body text of the document. In addition, the structures of the pages will be included in the XDOC output. If page recomposition is off, then headers and footers are output at the end.
|
| dm_no_sloppy_manual |
Turn on merging of manual text regions.
|
| dm_no_hdr_ftr |
Output headers and footers as text.
|
| dm_user_specified_order |
User has specified the read order.
This value is used when automatic page segmentation was not used and page layout analysis is run. In this case, if dm_user_specified_order is on, then page layout analysis will not re-order the regions. Please read the documentation on dm_pp_analyze_layout for more explanation.
Should be set before calling vvEngAPI::vvPreprocess.
- See also:
- dm_user_specified_regions
|
| dm_user_specified_regions |
User has specified the regions.
This value is used when automatic page segmentation was not used and page layout analysis is run. In this case, if dm_user_specified_regions is on, then page layout analysis will not re-form the regions. Please read the documentation on dm_pp_analyze_layout for more explanation.
Should be set before calling vvEngAPI::vvPreprocess.
- See also:
- dm_pp_analyze_layout
- dm_user_specified_order
|
| dm_find_headlines |
Find headlines and output them in XDOC.
|
| dm_text_out_newline |
Newline designation for output text.
|
| dm_xdc_wconf |
Enable word confidence output in XDOC format.
Word confidence values will range from 0 to 999.
Only has an effect when dm_out_text_format is vvTextFormatXdoc, vvTextFormatXdoclite, or vvTextFormatXdocplus.
|
| dm_xdc_cconf |
Enable character confidence output in XDOC output format.
Character confidence values will range from 0 to 999.
Only has an effect when dm_out_text_format is vvTextFormatXdoc, vvTextFormatXdoclite, or vvTextFormatXdocplus.
|
| dm_xdc_wbox |
Output word bounding boxes in XDOC output format.
Only has an effect when dm_out_text_format is vvTextFormatXdoc, vvTextFormatXdoclite, or vvTextFormatXdocplus.
|
| dm_xdc_cbox |
Output character bounding boxes in XDOC output format.
Only has an effect when dm_out_text_format is vvTextFormatXdoc, vvTextFormatXdoclite, or vvTextFormatXdocplus.
|
| dm_xdc_wbox_pixels |
Use pixel values for word bounding boxes in XDOC output format.
This should be used in conjunction with dm_xdc_wbox; turning on dm_xdc_wbox_pixels affects the word bounding box measurements but does not turn on the word bounding boxes.
Only has an effect when dm_out_text_format is vvTextFormatXdoc, vvTextFormatXdoclite, or vvTextFormatXdocplus.
|
| dm_metric |
Use metric measurements.
When set, metric measurements are used for the output document; otherwise, English measurements are used.
|
| dm_pdf_format |
Type of PDF output text document.
Only for use when dm_out_text_format is vvTextFormatPdf. Should be set before calling vvEngAPI::vvRecognize.
- See also:
- dm_output_img_source for information on how the graphics in the output PDF file will be compressed.
- dm_pdf_imgthresh
- dm_pdf_img_nodict
- dm_pdf_img_alphanum
|
| dm_pdf_imgthresh |
Word threshold for imagette output.
Only for use when dm_out_text_format is vvTextFormatPdf. Should be set before calling vvEngAPI::vvRecognize.
Not supported in the current release.
- Type: int
- Range: 0-100?
- Default: 0
|
| dm_pdf_img_nodict |
Output imagette if word is not in the dictionary.
Only for use when dm_out_text_format is vvTextFormatPdf. Should be set before calling vvEngAPI::vvRecognize.
Not supported in the current release.
|
| dm_pdf_img_alphanum |
Output imagette if word is alphanumeric.
Only for use when dm_out_text_format is vvTextFormatPdf. Should be set before calling vvEngAPI::vvRecognize.
Not supported in the current release.
|
| dm_ls_quote |
Left single quote.
Should be set before call to vvEngAPI::vvRecognize.
|
| dm_rs_quote |
Right single quote.
Should be set before call to vvEngAPI::vvRecognize.
|
| dm_ld_quote |
Left double quote.
Should be set before call to vvEngAPI::vvRecognize.
|
| dm_rd_quote |
Right double quote.
Should be set before call to vvEngAPI::vvRecognize.
|
| dm_document_name |
Document name (read-only).
|
| dm_questionable |
Character to be inserted before each uncertain character.
dm_questionable is applicable to these output formats: vvTextFormatIso, vvTextFormatUnicode, vvTextFormat8bit
In XDOC output, the questionable character cannot be changed, because it is specified in the XDOC specification as Q.
See dm_quest_thresh for information on when a character is uncertain.
- Type: string (only the first character is used)
- Default: (none)
|
| dm_unrecognized |
Character to be used instead of each unrecognized character.
dm_unrecognized is applicable to these output formats: vvTextFormatIso, vvTextFormatUnicode, vvTextFormat8bit
In XDOC output, the unrecognized character cannot be changed, because it is specified in the XDOC specification as E.
See dm_quest_thresh and dm_accept_thresh.
- Type: string (only the first character is used)
- Default: ~ (ASCII 126)
|
| dm_force_single_col |
Force a single column.
When turned on, during region segmentation, multiple columns are merged into one single column.
This value must be set before the call to vvEngAPI::vvPreprocess, since it affects the region segmentation.
The effect of turning dm_force_single_col on is most apparent in a document with multiple columns and text output. When dm_force_single_col is off, each column is listed sequentially in the output text file, because the engine determined the columns are logically separate; when dm_force_single_col is on, the columns appear from the left to the right, giving the text file a similar appearance to the input image.
|
| dm_one_line_table_cells |
Force one line table cells.
|
| dm_improved_single_col_detect |
Improved single column detection.
|
| dm_region_type |
Region type.
|
| dm_region_subtype |
Region sub-type.
|
| dm_region_stacking |
Region stacking order.
Modify this to change how regions overlap each other.
- Type: int
- Range: >= 0
- Default: automatically set
|
| dm_region_grammar_mode |
Grammar mode.
|
| dm_region_lexical_constraint_id |
Lexical constraint id.
- Type: int
- Range:
- Default:
|
| dm_region_lexmode |
Region lexical mode.
|
| dm_region_foreground |
Photometric interpretation of region.
|
| dm_region_out_order |
Region output order.
Modify this to change the order in which the regions are written to the output.
- Type: int
- Range: >= 0
- Default: set automatically
|
| dm_region_name |
Region name.
|
| dm_region_frame_left |
Left frame boundary of the current region, as specified by dm_current_region.
This value is calculated automatically after preprocessing and recognition. The area of a region may be composed of many rectangles, called a "union of rectangles" (UOR), with the result that the region is not rectangular in shape (see dm_region_uor_string). dm_region_frame_left is the coordinate of the left side of the leftmost rectangle in the region's UOR.
- Type: int
- Range: 0-2400
- Default: set automatically
|
| dm_region_frame_right |
Right frame boundary of the current region, as specified by dm_current_region.
This value is calculated automatically after preprocessing and recognition. dm_region_frame_right is the coordinate of the right side of the rightmost rectangle in the region's UOR.
- Type: int
- Range: > 0
- Default: set automatically
|
| dm_region_frame_top |
Top frame boundary of the current region, as specified by dm_current_region.
This value is calculated automatically after preprocessing and recognition. dm_region_frame_top is the coordinate of the top side of the uppermost rectangle in the region's UOR.
- Type: int
- Range: > 0
- Default: set automatically
|
| dm_region_frame_bot |
Bottom frame boundary of the current region, as specified by dm_current_region.
This value is calculated automatically after preprocessing and recognition. dm_region_frame_bot is the coordinate of the lower side of the bottommost rectangle in the region's UOR.
- Type: int
- Range: > 0
- Default: set automatically
|
| dm_region_uor_string |
Defines the area covered by a region through a list of rectangles (see the glossary entry on UOR).
The boundary of a region is set by setting dm_region_uor_string and dm_region_uor_count for that region, then by calling vvEngAPI::vvSetRegionProperties.
Please see the Frequently Asked Questions section of the main documentation page for an example of setting up a new region's boundaries.
One may also query the engine for the UOR string of the current region: Set dm_current_region, then call vvEngAPI::vvGetValue, passing dm_region_uor_string as the value.
- Type: string
- Format: x1,y1,x2,y2;x3,y3,x4,y4;x5,y5,x6,y6....
(Rectangle coordinates are separated by commas; rectangles are separated by semicolons.)
- Default: NULL string (not a valid setting for a region)
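As an illustration of the format above, the following sketch builds a UOR string from a list of rectangles. The Rect struct and buildUorString are hypothetical helpers, not engine functions, and the interpretation of each rectangle as two opposite corners (x1,y1) and (x2,y2) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One rectangle of a union of rectangles (assumed: two opposite corners).
struct Rect { int x1, y1, x2, y2; };

// Join rectangles into "x1,y1,x2,y2;x3,y3,x4,y4;..." form.
// rects.size() is the value that dm_region_uor_count would need to match.
std::string buildUorString(const std::vector<Rect>& rects) {
    assert(!rects.empty());  // an empty (NULL) string is not a valid setting
    std::string out;
    for (std::size_t i = 0; i < rects.size(); ++i) {
        if (i > 0) out += ';';  // rectangles are separated by semicolons
        const Rect& r = rects[i];
        out += std::to_string(r.x1) + ',' + std::to_string(r.y1) + ',' +
               std::to_string(r.x2) + ',' + std::to_string(r.y2);
    }
    return out;
}
```

For example, an L-shaped region made of two rectangles would produce a string such as 0,0,100,50;0,50,60,120 with a rectangle count of 2.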
|
| dm_region_uor_count |
The number of rectangles in the current region's UOR string.
When setting the dm_region_uor_string for a region, it is critical to correctly set the dm_region_uor_count for that region, otherwise, the dm_region_uor_string will not be interpreted correctly.
The engine may be queried for this value using vvEngAPI::vvGetValue to find out the number of rectangles in the UOR of the current region, specified by dm_current_region.
- Type: int
- Range: > 0
- Default: none
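Going the other way, a UOR string returned by the engine can be split back into rectangles to obtain the rectangle count and the outer frame of the region. The sketch below is a standalone illustration: UorRect, Frame, parseUorString, and frameOf are hypothetical names, and it assumes coordinates grow to the right and downward so that the smallest y value is the top.

```cpp
#include <algorithm>
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct UorRect { int x1, y1, x2, y2; };
struct Frame { int left, top, right, bottom; };

// Split "x1,y1,x2,y2;..." into rectangles; throws on malformed input.
std::vector<UorRect> parseUorString(const std::string& uor) {
    std::vector<UorRect> rects;
    std::istringstream rectStream(uor);
    std::string rectText;
    while (std::getline(rectStream, rectText, ';')) {
        std::istringstream coordStream(rectText);
        std::string field;
        int v[4] = {0, 0, 0, 0};
        int n = 0;
        while (std::getline(coordStream, field, ',')) {
            if (n == 4) throw std::invalid_argument("too many coordinates");
            v[n++] = std::stoi(field);
        }
        if (n != 4) throw std::invalid_argument("expected four coordinates");
        rects.push_back(UorRect{v[0], v[1], v[2], v[3]});
    }
    return rects;
}

// Outer frame of all rectangles, analogous to dm_region_frame_left,
// dm_region_frame_top, dm_region_frame_right and dm_region_frame_bot.
Frame frameOf(const std::vector<UorRect>& rects) {
    assert(!rects.empty());
    Frame f{std::min(rects[0].x1, rects[0].x2), std::min(rects[0].y1, rects[0].y2),
            std::max(rects[0].x1, rects[0].x2), std::max(rects[0].y1, rects[0].y2)};
    for (const UorRect& r : rects) {
        f.left = std::min({f.left, r.x1, r.x2});
        f.top = std::min({f.top, r.y1, r.y2});
        f.right = std::max({f.right, r.x1, r.x2});
        f.bottom = std::max({f.bottom, r.y1, r.y2});
    }
    return f;
}
```

The size of the vector returned by parseUorString corresponds to the value reported by dm_region_uor_count.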
|
| dm_black_threshold |
Threshold to binarize multi-bit input image data.
This is the threshold used to determine which pixels are black and which are white on a page when converting an image from multiple bits per pixel to the 1-bit per pixel image processed by the OCR Engine itself.
Additional dm_black_threshold options are provided for more sophisticated translations to a bi-tonal image. These include the value 101, which forces a random threshold to be used, and the value 102, which directs the OCR Shop XTR to use the Floyd-Steinberg algorithm to determine which pixels are white and which are black.
Adjusting the dm_black_threshold value can significantly affect the OCR Engine's recognition of image regions.
- Type: int
- Range: 0-102
- Default: 60
|
| dm_in_xres |
Input image x resolution.
Input image resolution. This overrides the resolution specified in the file itself.
The minimum resolution recognized by the OCR Engine is 70 dpi and the maximum is 900 dpi.
- Type: int
- Default: 300dpi, unless the input image itself specifies its resolution
|
| dm_in_yres |
Input image y resolution.
Input image resolution. This overrides the resolution specified in the file itself.
The minimum resolution recognized by the OCR Engine is 70 dpi and the maximum is 900 dpi.
- Type: int
- Default: 300dpi, unless the input image itself specifies its resolution
|
| dm_language |
Language pack(s) to load.
The parameter is a string with a comma-separated list of language pack names. (see below)
If multiple languages are specified in the dm_language parameter then they all must use the same set of shapes (same Code Page).
To recognize languages that use different sets of shapes on the same page, regions need to have their language specified separately using vvEngAPI::vvSetLexicalConstraints.
Note: Some languages produce output which is incompatible with some output formats. For example Russian cannot be represented by ASCII text.
Languages with dictionaries: czech, danish, dutch, english, finnish, french, german, greek, hungar (for Hungarian), italian, norsk, polish, port (for Portuguese), russian, spanish, swedish, turkish
Languages without dictionaries: romanian, estonian, afrikaans, albanian, aymara, basque, breton, bulgarian, byelorussian, croatian, faroese, flemish, friulian, gaelic, galician, greenlandic, hawaiian, icelandic, indonesian, kurdishlat, latin, latvian, lithuanian, sorbianl, macedonianc, malaysian, piginenglish, serbian, ukranian, catalan, sbcroatian, slovak, slovenian, swahili, tahitian, sorbianu, welsh, frisianw, zulu
Languages by Shape Pack/Code Page:
- Baltic (1257): Estonian, Latvian, Hawaiian, Lithuanian
- Central Europe (1250): Albanian, Polish, Croatian, Romanian, Slovenian, Czech, Serbo-Croatian, Sorbian - Lower, Hungarian, Slovak, Sorbian - Upper
- Cyrillic (1251): Bulgarian, Macedonian (Cyrillic), Serbian, Byelorussian, Russian, Ukrainian
- Greek (1253): Greek
- Latin I (1252): Afrikaans, French, Malaysian, Aymara, Frisian - West, Norwegian, Basque, Friulian, Pidgin English, Breton, Gaelic, Portuguese, Catalan, Galician, Spanish, Danish, German, Swahili, Dutch, Greenlandic, Swedish, English, Icelandic, Tahitian, Faroese, Indonesian, Welsh, Finnish, Italian, Zulu, Flemish, Latin
- Turkish (1254): Kurdish (Latin), Turkish
Also see the Technical Specifications documentation.
- Type: string
- Default: english
- See also:
- vvEngAPI::vvStartOCRSes
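Because all language packs named in dm_language must share one code page, it can be useful to check a candidate string before starting the OCR session. The sketch below is a standalone illustration with a deliberately partial table copied from the shape pack list above; codePageOf and shareCodePage are hypothetical helpers, and the check assumes the list contains no spaces around the commas.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Code page for a language pack name, or 0 if the name is not in this
// partial table. Extend the table from the shape pack list as needed.
int codePageOf(const std::string& pack) {
    static const std::map<std::string, int> table = {
        {"estonian", 1257}, {"latvian", 1257}, {"lithuanian", 1257},
        {"polish", 1250},   {"czech", 1250},   {"romanian", 1250},
        {"russian", 1251},  {"bulgarian", 1251},
        {"greek", 1253},    {"turkish", 1254},
        {"english", 1252},  {"french", 1252},  {"german", 1252},
        {"spanish", 1252},  {"italian", 1252}, {"dutch", 1252},
    };
    auto it = table.find(pack);
    return it == table.end() ? 0 : it->second;
}

// True when every name in a comma-separated dm_language string is known
// to this table and all names map to the same code page.
bool shareCodePage(const std::string& languages) {
    assert(!languages.empty());
    std::istringstream in(languages);
    std::string name;
    int common = 0;
    while (std::getline(in, name, ',')) {
        int cp = codePageOf(name);
        if (cp == 0) return false;        // unknown to this partial table
        if (common == 0) common = cp;     // first name fixes the code page
        else if (cp != common) return false;
    }
    return common != 0;
}
```

Under this table, "english,french" and "russian,bulgarian" pass the check, while "english,russian" does not, since Latin I (1252) and Cyrillic (1251) use different shape packs.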
|
| dm_english_chars |
Include the English character set.
For use with character sets other than Latin 1. This allows recognition of Latin characters mixed in with a document containing primarily words written in another language and character set.
|
| dm_char_set |
Lexical constraint character set.
If set, the OCR engine will only recognize the characters specified in the dm_char_set string. If the document being recognized only contains a few characters specified by this value, the resulting text file will contain characters from the dm_char_set along with the reject_char character for any characters not in the dm_char_set.
Setting to the null string will cause the OCR Engine to not be constrained.
dm_char_set is case-sensitive. For instance, to recognize both uppercase and lowercase d, both must be specified. Because "s" and "5" have similar character shapes, if "s" (or "S") is specified then the number 5 will also be recognized in the output, even if it is not specified.
dm_char_set must be set before the OCR Session is started (see vvEngAPI::vvStartOCRSes).
Also see the character set documentation.
- Type: string
- Default: not set
|
| dm_word_lexicon_id |
ID number for the word lexicon.
Should be set before a call to vvEngAPI::vvSetLexicalConstraints. Corresponds to the constraintId passed to vvEngAPI::vvAddWordToLexicon.
- Type: int
- Range: 0-?
- Default: 0
More detailed information:
The dm_word_lexicon_id is used in the vvEngAPI::vvSetLexicalConstraints call, and it affects the recognition portion of the processing.
The dm_word_lexicon_id is the id number for the word lexicon, which identifies a group of words as one lexicon. Having multiple lexicon ids permits you to create several lexicons (groups of words), and then set individual regions or pages to use different word lexicons.
The process of using dm_word_lexicon_id is:
- Create a word lexicon by calling the function vvEngAPI::vvAddWordToLexicon once for each word in the lexicon, passing: the word to add, the lexicon's id number.
- Set dm_word_lexicon_id to the id number for the lexicon you just created.
- Set the focus area (region or page) and for a region, set the current region id number.
- Call vvEngAPI::vvSetLexicalConstraints to associate the region or page with the word lexicon you just set up. Note that vvSetLexicalConstraints is used to set other lexical properties as well.
|
| dm_format_analysis |
Analyze document format to determine layout of output document.
(Note: Turn format analysis off for correct PDF output.)
|
| dm_double_dimension |
Double non-square dimensions.
This value is used for non-square images (e.g., faxes which were transmitted at 200x100 dots per inch). When set, the dimensions of each image will be examined and pixels will be doubled in the dimension with the lower resolution.
If the image is already square (i.e., x-dpi equals y-dpi), no doubling is performed.
Should be set prior to calling
Note: this option is an undocumented feature in the ScanSoft engine.
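The effect on image dimensions can be sketched as below. This is an illustration only: ImageSize and doubleLowerResolution are hypothetical, and treating the effective resolution of the doubled dimension as doubled is an assumption rather than documented engine behavior.

```cpp
#include <cassert>

// Pixel dimensions and per-axis resolution of an image (hypothetical type).
struct ImageSize { int width, height, xres, yres; };

// Double the pixel count along the axis with the lower resolution.
// Square images (xres == yres) are returned unchanged.
ImageSize doubleLowerResolution(ImageSize img) {
    assert(img.xres > 0 && img.yres > 0);
    if (img.xres < img.yres) {
        img.width *= 2;
        img.xres *= 2;   // assumed: effective resolution doubles as well
    } else if (img.yres < img.xres) {
        img.height *= 2;
        img.yres *= 2;
    }
    return img;
}
```

For a fax scanned at 200x100 dots per inch with 1728 x 1100 pixels, this gives 1728 x 2200 pixels at an effective 200x200.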
|
| dm_current_region |
Current region.
This is the region the engine is currently focused on. All region-specific settings and actions will affect the region specified by this value.
- Type: int
- Range: depends on the region ids in the current document
- Default: REGION_DEFAULTREGION
|
| dm_focus_area |
Area on which to focus.
All actions and settings will affect the area set here.
|
| dm_pp_remove_halftone |
Preprocessing option to remove image regions from output.
When set to vvYes, the OCR Engine removes all image regions from output including halftone and line art regions.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_auto_segment |
Preprocessing option to perform auto segmentation.
When set, the OCR Engine performs the analysis to divide the different image and text areas of the document. These areas are referred to as "regions".
Should be set before calling vvEngAPI::vvPreprocess.
- See also:
- dm_pp_analyze_layout
- dm_current_region
- vvEngAPI::vvSetRegionProperties
|
| dm_pp_rotate |
Preprocessing option to perform a specific rotation.
Explicitly rotate the input image during pre-processing. Non-orthogonal rotation by an arbitrary angle is not supported by this value.
If dm_pp_auto_orient is set to vvCorrect and dm_pp_rotate is set to a non-zero value, the image will be rotated so that it is upright before recognition, even if this differs from the dm_pp_rotate value specified. The only effect of specifying both values is that the OCR Engine will favor the rotation suggested by dm_pp_rotate in determining the orientation of the page.
Should be set before calling vvEngAPI::vvPreprocess.
- Type: int
- Range: <0|90|180|270>
- Default: 0
|
| dm_pp_fax_filter |
Preprocessing option to use the fax filter.
If the fax filter is set to vvAuto, the fax filter is only applied when needed.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_auto_orient |
Preprocessing option to orient the input document automatically.
When this is set to vvCorrect then the image is automatically rotated to correct its orientation. See the note for the dm_pp_rotate variable above.
Should be set before calling vvEngAPI::vvPreprocess.
- See also:
- dm_auto_flip
|
| dm_pp_invert |
Preprocessing option to invert the image data.
When set, the 1-bit per pixel image is inverted, so that white becomes black and black becomes white for use of the image in the recognition process.
This value affects the recognition process; it does not affect output of individual graphics elements.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_newspaper_filter |
Preprocessing option to apply newspaper filters.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_deskew |
Preprocessing option to automatically deskew image data.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_photometric_interp |
Preprocessing option to detect and/or correct the photometric interpretation.
When set, photometric interpretation is performed, i.e., the OCR Engine determines if the input document is overall reverse or normal video. If the document is mostly reverse video, then the OCR Engine will invert the entire image.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_dotmatrix_filter |
Preprocessing option to apply the dotmatrix filter.
Used to improve recognition for documents printed by a dot-matrix printer. When set to vvAuto, the dotmatrix filter is only applied if needed.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_autosetdegrade |
Preprocessing option to turn on degraded image processing.
Setting dm_pp_autosetdegrade to vvYes turns on a preprocessing mode intended for particularly degraded images. This changes how the image is processed and, as a result, the recognition results.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_recognition |
Preprocessing recognition flag.
This flag is not required to recognize an image.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_segment_lineart |
Preprocessing option to detect lineart regions and add them to the halftone mask, so that they are detected as images instead of as text.
This will only have an effect if dm_pp_auto_segment is set to vvYes or dm_pp_remove_halftone is set to vvYes.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_reverse_video |
Preprocessing option to detect reverse video regions.
When set to vvYes, this value detects which regions of the page are reverse video so that the OCR Engine will know to invert the image before recognition. This option will affect the output on pages where reverse-video text exists and is detected.
Text will not be recognized in reverse video regions unless this option is set.
Should be set before calling vvEngAPI::vvPreprocess.
|
| dm_pp_analyze_layout |
Preprocessing option to analyze the page layout.
Page layout analysis (dm_pp_analyze_layout) is an intentionally flexible option in order to allow a complicated matrix of different desired features in a product that supports both automatic segmentation and manual segmentation.
Page layout analysis can be thought of as part of automatic page segmentation. Most of it runs before recognition (vvEngAPI::vvRecognize) and in conjunction with other preprocessing functions.
Page layout analysis tries to understand the layout of text on the page, the reading order, the structure of the tables, column structure, captions, headers, footers, insets, etc. In addition, page layout analysis changes the regions that were found by automatic page segmentation. However, if automatic page segmentation was not used, page layout analysis can still run; the regions will not get re-ordered or re-formed if certain other settings are set (dm_user_specified_order and dm_user_specified_regions), but they will be analyzed.
If dm_pp_auto_segment is set, then page layout analysis is automatically run (it is considered part of the functionality of automatic page segmentation) unless dm_pp_analyze_layout is specifically turned off. However, if automatic page segmentation is not run, such as when manual page segmentation is used, page layout analysis must be run explicitly if the results from it are needed later. Page layout analysis can be run alone by setting dm_pp_analyze_layout to vvYes and turning off all of the other preprocessing options before calling vvEngAPI::vvPreprocess.
This value should be set before calling vvEngAPI::vvPreprocess.
- Type: <vvNo|vvYes>
- Default: depends on other options
|
| dm_auto_flip |
Option to automatically flip upside down images.
This option should be set before calling vvEngAPI::vvRecognize.
When set, it causes recognition to be run a second time if the first attempt fails because the image is upside down.
- See also:
- dm_pp_auto_orient
|
| dm_in_filename |
Input image filename.
This is the filename of an image to read from disk. Use along with vvEngAPI::vvOpenImageFile.
- Type: string
- Default: none
|
| dm_in_curr_page |
Current page of the input document.
Use only with input image files which have multiple pages. Set after an image has been opened with vvEngAPI::vvOpenImageFile, and before a page is opened with vvOpenPage.
This is not relevant for image data passed directly to the engine through the vvPutImage function.
- Type: int
- Range: 0-99999
- Default: 0
|
| dm_in_num_pages |
Number of pages in the input image document (read-only).
For PDF and PostScript input, the number of pages cannot be determined without cycling through each page individually. For these input filetypes, dm_in_num_pages will be set to a very large integer. If you need to cycle through all pages of a PDF or PostScript input file, you can keep trying to process the pages until the engine returns an error that the current page does not exist.
Only relevant for images read from a file.
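The "keep trying until the page does not exist" approach for PDF and PostScript input can be sketched as a simple loop. No engine calls appear here: processPage is a hypothetical callback that you would implement yourself (for example, by setting dm_in_curr_page, opening the page, and processing it) and that returns false when the engine reports that the page does not exist. Page numbering from 0 and the maxPages safety cap are assumptions of this sketch.

```cpp
#include <cassert>
#include <functional>

// Call processPage(0), processPage(1), ... until it returns false or
// maxPages pages have been handled. Returns the number of pages processed.
int processAllPages(const std::function<bool(int)>& processPage, int maxPages) {
    assert(maxPages >= 0);
    int page = 0;
    while (page < maxPages && processPage(page)) {
        ++page;
    }
    return page;
}
```

The cap guards against looping indefinitely if the callback never reports a missing page.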
|
| dm_in_format |
Format of input image document (read-only).
Only relevant for images read from a file.
|
| dm_region_ids |
List of all region ids in the image currently being processed (read-only).
|
| dm_region_ids_text |
List of all region ids of text regions in the image currently being processed (read-only).
|
| dm_region_ids_image |
List of all region ids of image regions in the image currently being processed (read-only).
|
| dm_out_text_format |
Output text format.
- See also:
- dm_output_img_source for information on how the graphics in the output PDF file will be compressed.
- dm_pdf_format
|
| dm_out_graphics_format |
Output graphics format.
|
| dm_doc_memory_size |
Memory size for an output document (read-only).
Provided so that the user knows how much space to allocate before calling vvEngAPI::vvAcquireDocMemory.
|
| dm_subimage_memory_size |
Memory size for an output subimage (read-only).
Provided so that the user knows how much space to allocate before calling vvEngAPI::vvAcquireSubimageMemory.
|
| dm_output_img_source |
Image source for image output.
This value is used to specify whether the output image data should be drawn from the original input image or from the processed image from the OCR engine. Both the original input image and the processed image from the engine will be corrected for orientation and skew. The processed image could in addition be filtered, depending on the preprocessing options and image properties. The processed image from the engine always has a bit depth of 1.
For PDF output, the output file will use Flate compression for 1-bit input images and JPEG compression for 8-bit and 24-bit images when the input image is used for the output graphics. If the processed image is used for the output graphics in the PDF file, then CCITT Group 4 fax compression is used; note that this compression is not supported by all PDF readers and is not recommended for PDF output.
Used during calls to:
vvEngAPI::vvSpoolDoc (For vvTextFormatPdf and all HTML output formats)
vvEngAPI::vvCaptureSubimage
|
| dm_recognize_timeout |
A timeout for vvEngAPI::vvRecognize.
When this timeout is exceeded, the daemon will assume something is wrong and shut itself down immediately.
This timeout is intended as a last-resort action, to prevent the system from freezing or running out of resources if the OCR engine encounters a fatal problem. This is not a "nice" timeout, because the engine state is lost and the client side program will only know that it can no longer communicate with the engine.
We recommend setting this timeout to be a high number, because you do not want it to be triggered while the engine is processing normally. Some degraded or complicated images do take a long time to process and should be permitted sufficient time.
- Type: int (seconds)
- Default: 0
|
| dm_preprocess_timeout |
A timeout for vvEngAPI::vvPreprocess.
When this timeout is exceeded, the daemon will assume something is wrong and shut itself down immediately.
This timeout is intended as a last-resort action, to prevent the system from freezing or running out of resources if the OCR engine encounters a fatal problem. This is not a "nice" timeout, because the engine state is lost and the client side program will only know that it can no longer communicate with the engine.
We recommend setting this timeout to be a high number, because you do not want it to be triggered while the engine is processing normally. Some degraded or complicated images do take a long time to process and should be permitted sufficient time.
- Type: int (seconds)
- Default: 0
|