Migraph OCR/DTP/Commercial
From: JJL101@PSUVM.BITNET
Date: 07/18/92-02:19:14 PM Z
Taken from: Atari Explorer Online (#9202)

MIGRAPH OCR
By John L. McLaughlin

Requirements: Any ST/STe/TT computer with 2 MB or more RAM and hard disk. Hand- or full-page scanner optional.
Summary: Sophisticated, trainable optical character-recognition (OCR) package, capable of making short work of data-input.
Manufacturer: MiGraph, Inc., 32799 Pacific Highway S., Federal Way, WA 98003 (206) 838-4677
Price: $299.00

Though paper provides a convenient and tangible medium for human communication, it's not great for talking to machines. Scanning has solved the problem of how to get images from paper into computer memory. But because computers store images and text in completely different ways, images of text, such as a scan of this magazine page, require further processing before the information they contain can be used by word processors, spreadsheets, and other "text-handling" applications.

MiGraph OCR (short for "Optical Character Recognition") provides the missing link -- converting scanned text to ASCII files that can be used directly by a wide variety of applications. The program can accept previously-scanned monochrome .IMG or TIFF files, or process input directly from a MiGraph or compatible hand-scanner.

The OCR Process

MiGraph OCR begins its job by methodically chopping up a scanned image: first into discrete lines of text, then into masses identified as words and subdivided into characters. This, alone, is a fairly complicated process, involving raster image-processing (to remove spurious background shading and stray pixels, improve contrast and separate characters, etc.) and geometric analysis (to correct for text misalignment).
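
To give a rough feel for the "chopping up" step, here is a minimal sketch of one common textbook approach, projection-profile segmentation. It is not MiGraph's actual method, which the review does not document. It assumes a clean, already-straightened monochrome bitmap stored as a list of rows of 0/1 values, and the function names (ink_runs, segment_lines, segment_glyphs) are hypothetical, chosen only for illustration.

```python
def ink_runs(flags):
    """Return (start, end) index pairs, end exclusive, for each run of True values."""
    runs, start = [], None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i                      # a run of inked rows/columns begins
        elif not flag and start is not None:
            runs.append((start, i))        # blank gap ends the current run
            start = None
    if start is not None:
        runs.append((start, len(flags)))   # run extends to the edge
    return runs


def segment_lines(bitmap):
    """Split a page into text lines: consecutive rows that contain any ink."""
    return ink_runs([any(row) for row in bitmap])


def segment_glyphs(bitmap, top, bottom):
    """Split one text line (rows top..bottom-1) into ink masses separated by blank columns."""
    width = len(bitmap[0]) if bitmap else 0
    columns = [any(bitmap[y][x] for y in range(top, bottom)) for x in range(width)]
    return ink_runs(columns)


# Hypothetical 5x5 "scan": two text lines, the first holding two ink masses.
page = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]

for top, bottom in segment_lines(page):
    print((top, bottom), segment_glyphs(page, top, bottom))
```

A real package would also have to cope with skewed text, touching characters, and background noise, which is where the raster clean-up and geometric analysis mentioned above come in.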
Next, using a font-recognition engine licensed from Omnifont (world leaders in OCR software design), MiGraph OCR turns the bitmapped image of each character into a vector expression describing its shape in terms unrelated to size or resolution. Characters are recognized by comparing their vector descriptions against a dictionary of character forms in different fonts and point sizes -- a process that yields a far higher percentage of "hits" than prior OCR techniques involving bitmap comparisons. Additional refinement is obtained by referencing against a user dictionary, created by "training" the program on text with particular characteristics.

As a last step, MiGraph OCR performs a complex lexical and syntactic analysis, using one of four supplemental dictionaries based on the Proximity/Merriam-Webster Linguibase. This further assists the program in making intelligent "guesses" about characters whose forms remain ambiguous.

Using OCR

Installing MiGraph OCR is simple. An INSTALL program is included on the main disk that lets you specify the folder into which you want program files stored. The utility also lets you identify which of the four supplemental dictionaries you wish installed: versions for English, German, French, and Dutch are included on two support disks. A minimum of 2 MB free space must exist on the target partition prior to installation.

OCR's main control screen is simple and well-designed, and a little random button-clicking quickly reveals how most of the program works. Nevertheless, to help get you started, the manual includes several step-by-step, hands-on tutorials.

The general control panel, accessed by clicking on the "hammer" icon, lets you specify the input source (scanner or file) and output format, and set refining parameters for the OCR process. Selecting "scanner" as the input device brings up a secondary scanner configuration dialog that lets you define the resolution, area, and direction of input scans.

Select "Get Image," and you're flying. If you've elected to scan, the hand scanner is activated and managed automatically -- all you have to do is move it down (or across) the page. OCR performs best when presented with a straight scan, so a scanning tray is recommended.

The only glitch I noticed was caused, as it turned out, by the fact that I was running MiGraph OCR on a Mega STe at 16 MHz, with blitter and caches enabled. Apparently, some combination of these features throws off the sample timing, so that illegible scans are produced. The fix, at least until MiGraph issues an upgrade, is to use the Control Panel to turn off all enhancements while scanning is in progress. They can (and should) be turned on again afterwards, since OCR processing benefits from the increased system throughput.

Once scanning is complete, the scanned image appears in OCR's work window. Your first job is to assess the quality of the scan, to determine if it is appropriate for OCR processing. Because low-quality scans take unnecessarily long to process, and produce a large number of errors, it's best to repeat doubtful scans at this point.

The next step is to select regions of the scanned image for input to OCR. This is done in very straightforward fashion, by dragging rectangles or drawing polyline boxes around desired portions of the image. Multiple regions can be sorted so that they are processed in any desired order. An added plus: to avoid having to make duplicate scans of the same material, MiGraph OCR also lets you define the graphic regions of any scan, saving them as .IMG or TIFF files.

When OCR is initiated, the program performs several unattended passes: rectifying the image, segmenting it, and generating a first interpretation of its content. Because the process can take a while, you are kept apprised of progress by a succession of dialog boxes. If automatic processing has been selected, output text is then saved transparently to the designated file.
Otherwise, the interactive learning phase begins.

During interactive learning, the system presents you with problem areas of your scan, in greatly enlarged form, and asks you to correct or approve its interpretations. The process is easily managed, though it can be time-consuming if many problems exist. (It can be aborted at any point, however, and the resulting text file saved to disk with markers inserted to indicate ambiguous characters.)

When correcting a problem, it's important to determine whether it results from poor scan quality or from an unfamiliar font or point size. When scan quality is at fault, you should correct the problem in text, without updating the current user dictionary. Entering a correction is usually a matter of typing a single letter, though occasionally the program will present you with groups of several adjacent letters for identification. Very rarely, the program will assume that two adjacent characters are one, and will not accept multiple characters for insertion.

Alternatively, when you've identified a legitimate "training" situation (i.e., the program has failed to recognize text because it contains some regular but unfamiliar feature, such as a font, point size, or special letterform), you can "train" OCR to recognize the character in the future. A vectorized image of the new letterform is added to the current user dictionary, which can be saved back to disk at the end of the session. Over time, dictionaries can be developed and refined for each type of text you regularly use as input, and these can add remarkably to the accuracy of OCR's interpretation.

When you tell OCR to "learn" a new character, you must take care to input the correction properly. OCR immediately applies any corrected interpretation to similar ambiguities throughout the text -- a process designed to prevent your having to correct the same mistake more than once.
Unfortunately, however, this also means that an erroneous correction can easily be propagated through your output, and -- if unrecognized at the end of the session -- perhaps even entered accidentally in the current dictionary when it is saved back to disk. Worse, there's no way to "edit" the updated dictionary after a training pass, nor to return to a problem area during the pass to re-enter a correction. So a fair amount of dictionary-refinement can be lost, if you're not careful.

While I've described using OCR to process only a single scanned unit of text, it's also very easy to append the results of several OCR sessions to the same output file, creating a single result document that can be imported to a word processor. Alternatively, I've had good luck employing utilities such as WizWorks!' Scan-Lite to conjoin several scans into one uniform image before importing into OCR.

Unfortunately, I have no means of testing how well MiGraph OCR would perform on input from a full-page flatbed scanner; but I suspect that for serious applications, this option should be thoroughly explored.

Performance

Once a sufficiently-refined user dictionary has been created for text from a particular source, MiGraph OCR is very accurate. It's also fairly quick, at least when processing in automatic mode: a page of Courier 10-pitch type, scanned at 300 dpi, can be output as ASCII in something like three minutes, which is marginally faster than an average-to-good touch typist could enter the same material.

Naturally, text output by OCR must be further processed before it can be considered correct. At least part of this process (i.e., spell-checking) can be automated, however.

Because performance accuracy is so dependent on user dictionaries, MiGraph OCR is most useful when input is derived from only a limited range of text-types. Even with this constraint, however, it's easy to imagine a broad range of applications.
Particularly intriguing is the idea of using MiGraph OCR to convert faxes, received via faxmodem, to ASCII files -- providing a wholly "paperless" solution to fax correspondence in the computer context.

Only one significant feature is lacking: the ability to queue multiple files for input and unattended processing. Hopefully, this feature will be added in a future upgrade, since it would make the program highly competitive with Kurzweil and other dedicated OCR systems, particularly in the small office environment.