articlecrossroad.com

OCR Software: Optical Character Recognition or Optical Crud Recognition?

 

Author: James Eglin

Optical Character Recognition (OCR) is the software technology, and the set of processes around it, that converts printed text into computer-searchable text.

Done correctly, OCR enables users to search for and retrieve individual words contained within a file or page. In addition, when a set of files is indexed, users can search for keywords across an entire document library and retrieve exactly the pages that contain them. Searches that once took several hours or days to complete can now be run in seconds.
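To make the indexing idea concrete, here is a minimal sketch in Python of a keyword index over OCR output. It is an illustration only, not how any particular OCR product works: the `pages` dictionary, the file names and the function names `build_index` and `search` are all made up for this example.

```python
import re
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of (document, page) pairs containing it.

    `pages` is a hypothetical dict of {(document_name, page_number): ocr_text}.
    """
    index = defaultdict(set)
    for location, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(location)
    return index

def search(index, keyword):
    """Return the sorted (document, page) locations where the keyword occurs."""
    return sorted(index.get(keyword.lower(), set()))

# Made-up OCR text for three pages in two documents.
pages = {
    ("deed.pdf", 1): "This deed transfers the property to the buyer.",
    ("deed.pdf", 2): "Signed before a notary.",
    ("lease.pdf", 1): "The property is leased for one year.",
}
index = build_index(pages)
print(search(index, "Property"))  # pages in deed.pdf and lease.pdf that mention "property"
```

Building the index takes one pass over the library; after that, each keyword lookup is a single dictionary access, which is why searches come back in seconds rather than hours.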

Historically, however, this technology did not work well on older or poor-quality documents that contained mixed fonts or combinations of text and graphics. Until now!

Thanks to several recent advances in technology, it is now possible to obtain six-sigma-level character accuracy (conventionally about 3.4 errors per million characters) from these types of document collections.
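As a rough worked example of what that accuracy level means in practice, the short Python sketch below uses the conventional Six Sigma figure of 3.4 defects per million opportunities. The page count and characters-per-page figures are assumptions chosen for illustration, not measurements from any real collection.

```python
SIX_SIGMA_DEFECTS_PER_MILLION = 3.4  # conventional Six Sigma defect rate

def accuracy_percent(defects_per_million=SIX_SIGMA_DEFECTS_PER_MILLION):
    """Character accuracy, as a percentage, for a given error rate."""
    return 100.0 * (1 - defects_per_million / 1_000_000)

def expected_errors(total_characters, defects_per_million=SIX_SIGMA_DEFECTS_PER_MILLION):
    """Expected number of misread characters in a collection."""
    return total_characters * defects_per_million / 1_000_000

# Assumed collection: 10,000 pages at roughly 2,000 characters per page.
total_characters = 10_000 * 2_000
print(f"accuracy: {accuracy_percent():.5f}%")
print(f"expected misread characters: {expected_errors(total_characters):.0f}")
```

Under these assumptions the collection holds 20 million characters, so even at six-sigma accuracy a few dozen misread characters should still be expected.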

Although the quality and condition of the paper documents are still key factors in a successful OCR conversion, dramatically improved results can be obtained by enhancing the quality of the scanned image prior to processing.

Removal of borders and speckles, along with skew correction, is now common on the more advanced document scanners.
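Speckle removal can be sketched in a few lines of Python. The version below is a simplified illustration, assuming the scan has already been reduced to a black-and-white grid where 1 means ink and 0 means paper. It clears any ink pixel that has no ink neighbors. Real scanners use more sophisticated filters, and the function name `despeckle` is invented for this example.

```python
def despeckle(image):
    """Return a copy of a binary image (1 = ink, 0 = paper) with isolated ink pixels cleared."""
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]  # leave the original scan untouched
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the (up to) eight surrounding pixels for any other ink.
            has_neighbor = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbor:
                cleaned[y][x] = 0  # a lone dot is treated as noise
    return cleaned

page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],  # part of a real pen stroke
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],  # isolated speckle at the right edge
]
for row in despeckle(page):
    print(row)
```

The stroke survives because its pixels touch one another, while the lone dot in the corner is erased. The trade-off is that a filter this simple could also erase a genuine isolated mark, such as a very small period.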

Furthermore, advanced color filter technologies may be used to reduce page background colors, and multi-light image capture technologies can remove shadows cast by page creases that could otherwise hurt image quality or recognition accuracy.
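The background color idea can be illustrated with a small Python sketch. It assumes, purely for illustration, that the most common color on the page is the paper itself, and it whitens every pixel close to that color. The tolerance value and the function name `drop_background` are assumptions for this example, not settings from any actual scanner.

```python
from collections import Counter

WHITE = (255, 255, 255)

def drop_background(pixels, tolerance=40):
    """Whiten pixels close to the page's dominant color.

    `pixels` is a list of rows of (red, green, blue) tuples. The dominant
    color is assumed to be the paper background.
    """
    counts = Counter(pixel for row in pixels for pixel in row)
    if not counts:
        return []
    background = counts.most_common(1)[0][0]

    def is_background(pixel):
        # Close to the background on every color channel.
        return all(abs(a - b) <= tolerance for a, b in zip(pixel, background))

    return [[WHITE if is_background(p) else p for p in row] for row in pixels]

PAPER = (240, 225, 180)   # yellowed paper
STAIN = (230, 215, 170)   # slightly darker patch of the same paper
INK = (20, 20, 20)

scan = [
    [PAPER, PAPER, INK],
    [PAPER, STAIN, INK],
]
cleaned = drop_background(scan)
print(cleaned)
```

After filtering, the yellowed paper and the faint stain both become plain white, while the dark ink is left alone, which gives the OCR engine a cleaner contrast to work with.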

Once document scanning and processing are complete, an OCR text layer can be added and hidden behind each image, so that the page still looks like the original but can be searched as text. An additional orientation filter can be used to ensure that the best image is presented to the OCR engines.

To achieve the highest conversion accuracy possible, the characters in the image can be processed using multi-engine OCR voting technologies that rank each candidate character to determine the best recognition fit. Once a word is generated, it is filtered through a proprietary lexicon to ensure the highest-quality results.
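The voting and lexicon steps can be sketched as follows. This is a simplified illustration rather than any vendor's actual method: it assumes each engine returns the same number of characters for a word, takes a simple majority vote at each position (ties go to the engine listed first), and uses Python's standard `difflib` module in place of a proprietary lexicon. The function names and the sample lexicon are invented for the example.

```python
import difflib
from collections import Counter

def vote_characters(engine_outputs):
    """Pick the most common character at each position across engines.

    Assumes the outputs are already aligned, i.e. all the same length.
    """
    if len({len(text) for text in engine_outputs}) != 1:
        raise ValueError("engine outputs must be the same length")
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*engine_outputs)
    )

def lexicon_filter(word, lexicon, cutoff=0.8):
    """Replace a word with its closest lexicon entry, if one is close enough."""
    matches = difflib.get_close_matches(word.lower(), lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else word

# Three hypothetical engines disagree on two characters.
readings = ["recogn1tion", "recognition", "rec0gnition"]
print(vote_characters(readings))

# A word all engines got slightly wrong is corrected by the lexicon.
lexicon = ["archives", "records", "documents"]
print(lexicon_filter("archlves", lexicon))
```

Voting catches mistakes that only one engine makes, while the lexicon catches mistakes that all the engines share. The cutoff matters: set it too low and rare but correct words, such as surnames, may be "corrected" into the wrong dictionary entry.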

Finally, this text can be processed using sophisticated layout retention technologies that reproduce the layout of the text in the image, providing the best possible text representation for precise search and retrieval. After all, isn't that why they call it Optical Character Recognition?
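For readers curious about what layout retention involves, one small piece of it can be sketched in Python: putting recognized words back into reading order using their positions on the page. The sketch assumes each word comes with the pixel coordinates of its top-left corner. The tolerance value and the function name `reconstruct_lines` are assumptions for this illustration, and real layout engines also handle columns, tables and fonts.

```python
def reconstruct_lines(words, line_tolerance=5):
    """Group words into text lines by vertical position, then order left to right.

    `words` is a list of (x, y, text) tuples, where x and y are the pixel
    coordinates of the word's top-left corner.
    """
    lines = []  # each entry is [reference_y, [(x, text), ...]]
    for x, y, text in sorted(words, key=lambda w: (w[1], w[0])):
        if lines and abs(y - lines[-1][0]) <= line_tolerance:
            lines[-1][1].append((x, text))  # same line as the previous word
        else:
            lines.append([y, [(x, text)]])  # start a new line
    return [" ".join(text for _, text in sorted(entries)) for _, entries in lines]

# Words as an engine might return them, out of reading order.
words = [
    (120, 10, "Character"),
    (10, 12, "Optical"),
    (10, 40, "Recognition"),
]
for line in reconstruct_lines(words):
    print(line)
```

Here "Optical" sits two pixels lower than "Character" on a slightly crooked scan, but the tolerance still places both on the first line, in left-to-right order.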

Author Bio:
James Eglin is a specialist in this area. James has written several articles in the past on this topic.
 
Copyright © 2008 www.articlecrossroad.com