| We take the cabinet out of filing! |
Reprinted from the LaserFiche site:
Frequently Asked Questions
Document Imaging in the New Millennium
Q. What is a document?
A. A document can be from one to several thousand pages, and can include
images and/or text, plus annotations, and one template (index card).
Q. Can I edit or alter images?
A. An imaging system should not provide any facility for editing or
altering images. This is important as many users consider that images should
be sacrosanct and that any changes would undermine the integrity of the system.
In addition, the system should provide an audit trail function to keep track
of which users have accessed which documents at what times.
Q. Do imaging systems support
audit trails?
A. An imaging system's audit trail feature should record a user
name, date, time, document name and action whenever a user accesses a database
or document. Various levels of audit-trail logging detail and activity tracking
should be available. The system should also support a viewer for sorting and
filtering these logs.
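As a rough illustration of the kind of record described above, the following Python sketch logs a user name, date and time, document name and action for each access, then filters and sorts the log. The names used here (`AuditEntry`, `AuditTrail`, `record`, `filter`) are hypothetical and are not taken from any particular imaging product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class AuditEntry:
    """One audit trail record: who did what to which document, and when."""
    user: str
    timestamp: datetime
    document: str
    action: str


class AuditTrail:
    """Hypothetical in-memory audit log with a simple filter-and-sort viewer."""

    def __init__(self) -> None:
        self._entries: List[AuditEntry] = []

    def record(self, user: str, document: str, action: str,
               timestamp: Optional[datetime] = None) -> AuditEntry:
        # Default to the current time when the caller does not supply one.
        entry = AuditEntry(user, timestamp or datetime.now(), document, action)
        self._entries.append(entry)
        return entry

    def filter(self, user: Optional[str] = None,
               document: Optional[str] = None,
               action: Optional[str] = None) -> List[AuditEntry]:
        # Keep entries matching every criterion that was given,
        # then return them in chronological order.
        matches = [
            e for e in self._entries
            if (user is None or e.user == user)
            and (document is None or e.document == document)
            and (action is None or e.action == action)
        ]
        return sorted(matches, key=lambda e: e.timestamp)
```

Calling `trail.filter(user="alice")` on such a log would list everything one user did, oldest first. A real product would also need to persist the log and protect it from tampering, which this sketch does not attempt.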
Q. What is the standard format
used to store images?
A. Black and white images are most commonly stored as standard TIFF
files using CCITT Group 4 (two-dimensional) compression. Grayscale and color
images are frequently stored as TIFF files with JPEG compression.
Q. Which types of desktop operating
systems are usually supported?
A. Most imaging systems have client applications that can run as Windows
applications on Windows 95, 98 and Windows NT. Internet/intranet systems may
be able to run on additional platforms, such as Macintosh and Unix, among
others.
Q. How much disk space does
an imaging system typically require?
A. With the rapid drop in prices for hard drives and optical media,
it costs much less to store documents on an imaging system than with paper.
A single page typically occupies around 50 KB of disk space if the image is
stored as a TIFF file with Group 4 compression. Each gigabyte (GB) of storage space (which costs
only a few dollars) will hold approximately 20,000 pages.
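The arithmetic behind these figures can be sketched in a few lines of Python. This assumes the 50 KB average page size quoted above and decimal units (1 GB = 1,000,000 KB), which is what yields the 20,000 figure; with binary units the result would be closer to 21,000. The function names are illustrative only.

```python
def pages_per_gigabyte(page_kb: float = 50, gb_in_kb: int = 1_000_000) -> int:
    """Number of whole pages that fit in one gigabyte."""
    return int(gb_in_kb // page_kb)


def storage_needed_gb(pages: int, page_kb: float = 50,
                      gb_in_kb: int = 1_000_000) -> float:
    """Approximate gigabytes required to hold the given number of pages."""
    return pages * page_kb / gb_in_kb
```

For example, `storage_needed_gb(1_000_000)` estimates the space for a million-page archive at the same average page size. Actual page sizes vary with resolution and page content, so treat the result as a planning estimate.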
Q. What if my database is too
big to fit in one data volume?
A. A high-end imaging system will allow data and images to be stored
across multiple volumes, with each volume residing in a different directory
or on a different drive, disk array, CD or MO disk.
Q. How much total RAM does imaging
software require?
A. Client software generally requires 16 to 20 MB of RAM to run, with
higher requirements for scanning and OCR. Most systems recommend having 64MB
or more.
Q. Are special display cards
or monitors required?
A. Most systems work with any Windows-compatible video card and VGA
(or better) monitor, and recommend that you use at least a 15" monitor
with a resolution of at least 800 x 600 pixels.
Scanning/Importing/Storing
Q. Which manufacturers make
document imaging scanners?
A. Some of the top scanner manufacturers include Ricoh, Fujitsu, Panasonic,
Bell & Howell, Canon, Hewlett Packard, Avision, Mitsubishi, Visionshape,
Kodak and BancTec. Document imaging scanners typically have document feeders
and fast scan rates to quickly bring in large amounts of documents.
Q. What are the most common
hardware and software scanner interfaces?
A. Kofax Image Controls (http://www.kofax.com) provide the most popular
document imaging scanner interfaces. Many scanners attach to an Adaptec SCSI
card or to a Kofax Image processing board. Most scanners use either TWAIN
or ISIS scanner drivers to communicate with the computer.
Q. How can I scan checks?
A. Several manufacturers make scanners specifically designed for checks
that read the magnetically encoded MICR numbers at the bottom of the check.
If you do not have one of these scanners, most checks can be scanned with
regular document imaging scanners and OCRed as usual, though the MICR
numbers will not be read.
Q. How can I scan large format
documents?
A. Several manufacturers, including Contex, Vidar, Océ and Calcomp,
make scanners specifically designed for large format documents up to E-size
(34" x 44") and A-0 size (33" x 46.8"). If you do not
have one of these, the document can be reduced in size using a photocopier
and then scanned with a normal scanner, or sent to a service bureau that has
large format scanners.
Q. What image resolution should
I use?
A. Most imaging systems support documents scanned at various resolutions,
from 50 dpi to 600 dpi (or more), depending on your scanner. The right setting
depends on the purpose and contents of the page, but most documents are scanned
in black and white at 300 dpi.
Q. What about color files or
photographs?
A. Imaging systems should support black and white, grayscale and color
images. Color files can be scanned with a color scanner or imported into an
imaging system. There is a wide range of color scanners on the market. Many
document imaging scanners support color and grayscale.
Q. How can I scan double-sided documents?
A. An imaging system should provide two
different ways to do this. It should support duplex scanners, which simultaneously
scan both sides of a page. Also, with a simplex scanner, the user should be
able to scan all the front sides, place the documents in upside down and scan
all the back sides, and then the system should automatically collate the pages
into the correct order.
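The collation step for a simplex scanner can be sketched as follows. This assumes that turning the whole stack over reverses the sheet order, so the back of the last sheet is scanned first; if your scanner feeds the flipped stack in the original order, the `reversed` call would be dropped. The function name `collate_simplex` is hypothetical.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def collate_simplex(fronts: Sequence[T], backs: Sequence[T]) -> List[T]:
    """Interleave a pass of front sides with a pass of back sides.

    fronts: images in the order scanned (sheet 1 front, sheet 2 front, ...).
    backs:  images from the second pass, assumed to be in reverse sheet order
            because the stack was turned over before scanning.
    """
    if len(fronts) != len(backs):
        raise ValueError("front and back passes must contain the same number of images")
    pages: List[T] = []
    # Pair each front with its matching back by walking the backs in reverse.
    for front, back in zip(fronts, reversed(backs)):
        pages.extend([front, back])
    return pages
```

With three double-sided sheets, the fronts arrive as pages 1, 3, 5 and the backs as pages 6, 4, 2; the function returns them in reading order.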
Q. Can I scan landscape and portrait pages together in one batch?
A. An imaging system should allow you to change the orientation of
pages as you scan or after scanning. A well-designed system will also include
an option to automatically check and correct the orientation of pages.
Q. How are skewed
images handled?
A. Skewed (crooked or tilted) images can adversely affect the accuracy
of the OCR process, so an imaging system should include software that recognizes
skewed images and compensates for them. This is particularly important when
scanning press cuttings on a flat bed scanner or when scanning documents through
a worn-out or poorly-designed ADF (automatic document feeder).
Q. What file formats can a versatile system import?
A. A versatile system should be able to import the files you would
encounter in your office. This includes word processing files, spreadsheets
and presentations as well as common image formats such as TIFF Group 4, TIFF Group 3,
TIFF Raw, TIFF LZW, PCX, BMP, CALS, JPEG, GIF, PICT, PNG and EPS Preview images.
An imaging system providing long term archival of documents should allow the
images of each page to be stored in a non-proprietary format. For example,
electronic document pages would be printed to the imaging system,
black and white graphical files would be converted to TIFF Group 4 format
and color/grayscale images would be converted to TIFF JPEG.
Q. What is the difference between
CD or DVD jukeboxes/changers and towers?
A. In a jukebox/changer, there are more slots and disks than there
are drives. Robotic mechanisms automatically place the correct disk into one
of the drives when the disk is needed. In a tower, many CD or DVD drives are
stacked together in a single unit, and every disk is always sitting in a drive.
Towers provide faster data access but typically cost more per disk and do
not hold as many disks. Jukeboxes/changers cost less per disk and can hold
up to 500 disks, but are slower because swapping disks in and out of the drives
is time-consuming.
Viewing/Printing/Exporting
Q. Can I view combinations of
images, text and index fields side by side?
A. To allow convenient access to document information, a well-designed
imaging system will allow the view screen to be configured to show the text,
images, template index fields or thumbnail images.
Q. Can I open and display more
than one document at a time?
A. Some imaging systems will allow you to display multiple documents,
with the number of documents you can have open simultaneously limited only
by the amount of memory available.
Q. How can I re-sequence pages?
A. If pages are out of order and need to be re-sequenced, a well-designed
imaging system will allow thumbnail views of pages to be simply
dragged to the required position. In the same way, individual pages can be
selected and deleted, subject to appropriate security access control and privileges.
Q. Will I need a specialized
imaging display?
A. No, most systems run perfectly well on standard VGA and better monitors.
A 15" display using a Super VGA controller should be considered the absolute
minimum practical display for an ad hoc user of the system. Frequent users
should have a 17" monitor, and users who scan or review imaged documents
full-time may want to consider a 19" or 21" monitor.
Q. What is the advantage of
a large monitor for power users?
A. For people who use an imaging system intensively, screen size can
be a critical factor. If users are to flip between pages with the ease of
real paper, they must be able to view the whole page at once in a way that
allows the text to be readable. If 8 1/2" x 11" pages are the dominant
paper size, then a 21" monitor capable of displaying 1600 x 1200 is optimal.
Using a standard 14" VGA monitor will require scrolling and panning if
the image is viewed at normal size.
Q. What is important besides
monitor size?
A. Screen resolution and the refresh rate of the monitor are also important.
Generally, the larger a monitor is and the higher resolution it has, the harder
it is to get the high refresh rate that is required for sustained viewing
without screen flicker. The optimum threshold for minimum flicker is generally
considered to be a vertical refresh rate of 72 Hz on a 21" monitor.
The maximum refresh rate is a function of the monitor and the graphics controller.
Q. Will I need a specialized
printer for images or OCRed text?
A. Generally no. Most imaging systems support most Windows compatible
printers, but recommend that you use a laser printer with at least 4 MB of
RAM. If you are using a networked system and printing high volumes of pages
to a network printer, you might consider installing a separate laser printer
either locally or on its own network segment to minimize network traffic.
Q. In which formats can I export
documents?
A. It depends on the imaging system. Common graphical formats you may
need include TIFF Group 3, TIFF Group 4, TIFF Raw, BMP, GIF, CALS and JPEG.
OCR: Optical Character Recognition
Q. What is OCR?
A. OCR stands for Optical Character Recognition, which is how a computer
converts words in an unsearchable scanned image to searchable text. OCR is
usually necessary in order to use full-text indexing and searches, and it
should be included in an imaging system. OCR engines can generally only recognize
typed or laser-printed text, not handwriting.
Q. What is the difference between
OCR and indexing?
A. OCR is the process of converting scanned images to text files. Full-text
indexing is the process of taking a text file and adding each word to an index
file that specifies the location of every word in every document. Well-designed
imaging software can make this a fast and easy procedure, providing rapid
access to any word in any document.
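To make the indexing step concrete, here is a minimal Python sketch of a full-text index that maps each word to every place it occurs. It assumes the OCR text is already available as one string per page; the function name `build_index` and the (document, page, position) layout are illustrative choices, not the format used by any specific product.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# (document name, page number, word position on that page)
Location = Tuple[str, int, int]


def build_index(documents: Dict[str, List[str]]) -> Dict[str, List[Location]]:
    """Build a word -> locations index from OCR text.

    documents maps a document name to a list of page texts.
    """
    index: Dict[str, List[Location]] = defaultdict(list)
    for name, pages in documents.items():
        for page_no, text in enumerate(pages, start=1):
            # Lowercase the text and split it into runs of letters and digits.
            words = re.findall(r"[a-z0-9]+", text.lower())
            for position, word in enumerate(words):
                index[word].append((name, page_no, position))
    return dict(index)
```

Looking up a word is then a single dictionary access, which is why an indexed search stays fast even when the collection holds many pages.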
Q. How accurate is OCR?
A. Accuracy on a freshly laser-printed page is typically better than 99.6%.
Accuracy on faxed, dirty or degraded documents will of course be lower, but
a few imaging systems have image clean-up technology that can improve OCR
accuracy.
Q. Do I have to go through and
correct OCR mistakes?
A. Not if the imaging system supports fuzzy logic searching, which
will find words even if the OCR engine made a few mistakes.
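The idea of tolerating OCR mistakes at search time can be sketched with approximate string matching. The example below uses `difflib.get_close_matches` from the Python standard library as a stand-in; it is not the fuzzy search algorithm of any particular imaging product, and the function name `fuzzy_find` and the 0.8 similarity cutoff are assumptions chosen for illustration.

```python
import difflib
from typing import Iterable, List


def fuzzy_find(query: str, indexed_words: Iterable[str],
               cutoff: float = 0.8) -> List[str]:
    """Return indexed words that closely resemble the query, best match first.

    cutoff is a similarity score between 0 and 1; higher values demand
    a closer match and return fewer candidates.
    """
    vocabulary = sorted({w.lower() for w in indexed_words})
    return difflib.get_close_matches(query.lower(), vocabulary, n=10, cutoff=cutoff)
```

A search for "invoice" would then also return a misread form such as "invo1ce", so the page can still be found without anyone correcting the text. Lowering the cutoff finds more misreadings but also more unrelated words.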
Q. How fast is the OCR process?
A. The performance of the OCR and indexing processes is entirely dependent
on factors such as the speed and configuration of the host system as well
as the contents of the image. A 133 MHz Pentium generally needs about 6 seconds
per page, while a 450 MHz Pentium II will take about 2-3 seconds per page.
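These per-page timings translate into batch throughput with simple arithmetic, sketched below. The function names are illustrative, and the calculation assumes OCR runs continuously at the quoted speed with no other overhead.

```python
def pages_per_hour(seconds_per_page: float) -> float:
    """Pages processed in one hour at a steady per-page OCR time."""
    if seconds_per_page <= 0:
        raise ValueError("seconds_per_page must be positive")
    return 3600 / seconds_per_page


def hours_for_batch(pages: int, seconds_per_page: float) -> float:
    """Hours needed to OCR a batch of the given size."""
    return pages * seconds_per_page / 3600
```

At 6 seconds per page a machine handles 600 pages an hour, while at 2 to 3 seconds per page it handles 1,200 to 1,800 pages an hour.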
Q. What is ICR (Intelligent Character Recognition)?
A. ICR is pattern-based character recognition and is also known as
Hand-Print Recognition. Handwritten text is more difficult for computers to
recognize and results in higher error rates than printed text. ICR engines
usually do best at recognizing constrained printing, which means block printed
letters with one letter in each box. Accurate recognition of unconstrained
handwriting, especially cursive handwriting, typically requires that the ICR
engine be trained to recognize each user's style of writing.
Q. What is OMR (Optical Mark Recognition)?
A. OMR, also called Mark-Sense Recognition,
is the recognition of marks commonly used on forms, such as check marks, circled
choices, and filled-in bubbles. OMR can be an important part of an imaging
system for organizations that process many standard forms. Scantron exam forms
and customer survey cards are perhaps the best-known examples of OMR in action.
Q. Can OCRed text be exported
and re-used in a word processor?
A. Yes, you can usually cut and paste text between the imaging system
and another Windows application, or you can export complete text files (all
text pages in a document) to a directory and open them with your favorite word
processor.
Q. Can I manually correct OCR
errors and typos?
A. Well-designed systems allow users to correct OCR errors from within
the system. However, when hundreds or thousands of pages are scanned every
day, it is usually not practical to have someone clean up the text. If fuzzy
logic search capabilities are available, it is not necessary to correct the
text as searches will typically still find misread words.
COLD: Computer Output to Laser Disc
Q. What is the difference between
COLD and imaging?
A. Imaging is for scanning, compressing, storing, indexing, OCRing,
searching and retrieving millions of pages of paper documents or electronic
documents archived as permanent images. COLD is for archiving, indexing, searching
and printing reports from huge text files generated by mainframes, mini-computers
and other computer applications. COLD stores huge report files and extracted
index fields on hard disk, optical cartridge or CD-ROM instead of printing
all the information out on paper or storing it to microfilm.
Q. How many index fields can
the COLD server extract from each report?
A. The number of index fields is usually unlimited. However, the more
fields extracted from each report, the slower the extraction process will
run and the larger the index files will be.
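A simplified picture of index field extraction is shown below in Python. The report layout (pages separated by form feeds, with `ACCOUNT:` and `DATE:` labels) and the names `FIELD_PATTERNS` and `extract_index_fields` are hypothetical; real COLD products define fields in their own ways.

```python
import re
from typing import Dict, List

# Hypothetical field definitions for a hypothetical report layout.
FIELD_PATTERNS = {
    "account": re.compile(r"ACCOUNT:\s*(\d+)"),
    "date": re.compile(r"DATE:\s*(\d{2}/\d{2}/\d{4})"),
}


def extract_index_fields(report: str, patterns=FIELD_PATTERNS,
                         page_break: str = "\f") -> List[Dict[str, str]]:
    """Return one row of index fields per report page."""
    rows: List[Dict[str, str]] = []
    for page_no, page in enumerate(report.split(page_break), start=1):
        row = {"page": str(page_no)}
        # Each configured field costs one search per page and, when found,
        # adds one more value to the stored index row.
        for name, pattern in patterns.items():
            match = pattern.search(page)
            if match:
                row[name] = match.group(1)
        rows.append(row)
    return rows
```

The inner loop shows the trade-off described above: every additional field means another search on every page and another value stored in the index.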
Check BOLT
Q & A for more info
For a more in-depth response, please contact sales@getbolt.com
or call 1-800-GET-BOLT.
Portions © Compulink Management Center, Inc.
| © BOLT Document Management 2008 |