suitable OCR
Thread starter: telefpro
telefpro
Local time: 04:17
Portuguese to English
Sep 18, 2008

I have a document that was typewritten in 1957 and has since been converted into PDF. The printouts cannot be read. Is there any way to make this document more readable using OCR? It is in French. Please advise.

 
Martin Skara, PhD.  Identity Verified
Slovakia
Local time: 00:47
French to Slovak
Only ABBYY FineReader Sep 18, 2008

is the best solution for PDF to DOC/RTF conversion.

https://abbyy.asknet.com/cgi-bin/dlreg/ml=EN?ID=FRP9DEMOM


Good luck
Martin


 
mediamatrix (X)
Local time: 18:47
Spanish to English
The world's top OCR system ... Sep 18, 2008

telefpro wrote:

I have a document that was typewritten in 1957 and has since been converted into PDF. The printouts cannot be read. Is there any way to make this document more readable using OCR? It is in French. Please advise.


... is the human eye.

If you can't read the document, then no OCR software will be able to either.

MediaMatrix


 
esperantisto  Identity Verified
Local time: 01:47
Member (since 2006)
English to Russian
SITE LOCALIZER
Precisely! Sep 18, 2008

mediamatrix wrote:
If you can't read the document, then no OCR software will be able to either.


Absolutely right!

However, theoretically, one might convert the PDF into a set of image files such as TIFF and try to play with gamma/color correction. But that's alchemy, not an exact science.
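For anyone curious what "playing with gamma" amounts to, here is a minimal sketch in plain Python. It assumes the page has already been exported from the PDF and loaded as rows of 8-bit grayscale values (0 = black, 255 = white); the function names `apply_gamma` and `stretch_contrast` are made up for this illustration and do not come from any particular OCR or graphics package. Which gamma value helps, if any, depends entirely on the scan, so expect trial and error.

```python
def gamma_lut(gamma):
    """Build a 256-entry lookup table: out = 255 * (in / 255) ** gamma."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    return [round(255 * (v / 255) ** gamma) for v in range(256)]


def apply_gamma(pixels, gamma):
    """Apply gamma correction to a grayscale image given as rows of 0..255 ints.

    With this convention, gamma > 1 darkens mid-tones (faint gray strokes get
    darker) and gamma < 1 brightens them.
    """
    lut = gamma_lut(gamma)
    return [[lut[v] for v in row] for row in pixels]


def stretch_contrast(pixels):
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    flat = [v for row in pixels for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely flat image has no contrast to stretch; return a copy.
        return [list(row) for row in pixels]
    return [[round(255 * (v - lo) / (hi - lo)) for v in row] for row in pixels]


if __name__ == "__main__":
    # Hypothetical strip of a faded scan: pale gray "ink" on a light background.
    faded = [[100, 125, 200]]
    stretched = stretch_contrast(faded)
    darkened = apply_gamma(stretched, 2.0)
    print(stretched)
    print(darkened)
```

Stretching first and then applying a gamma above 1 pushes pale ink toward black while leaving the paper white, which is roughly what one hopes for with a washed-out typescript.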


 
Jack Doughty  Identity Verified
United Kingdom
Local time: 23:47
Russian to English
In memoriam
Zoom it? Sep 18, 2008

With Adobe Acrobat, even if you only have Adobe Acrobat Reader, you can zoom in on the page to show the details a lot larger. If this doesn't help, I have no idea what else would.

 
Anna Villegas
Mexico
Local time: 16:47
English to Spanish
This should do the trick Sep 18, 2008

Though it's in Spanish, you'll certainly be able to understand if you have Office 2003. Click on the link below.

You have your own OCR


 
Viktoria Gimbe  Identity Verified
Canada
Local time: 18:47
English to French
I beg to differ Sep 18, 2008

mediamatrix wrote:

If you can't read the document, then no OCR software will be able to either.


I have successfully displayed on screen text that didn't even seem to be there. My OCR software is OmniPage, and it has a built-in function to enhance images before starting the recognition on it (if you know what you are doing, you can also do this with graphic programs like Photoshop).

There are documents that cannot be read by the human eye that can be made readable with the help of software. If I were in telefpro's situation, I would try it.
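As a rough idea of what an "enhance before recognition" step can look like, below is a small sketch of one common technique, Otsu thresholding, which turns a grayscale scan into pure black and white. This is not a description of what OmniPage or Photoshop actually does internally; it is only an illustration, and it assumes the page is available as rows of 8-bit grayscale values. The names `otsu_threshold` and `binarize` are hypothetical.

```python
def otsu_threshold(pixels):
    """Pick the gray level that best separates dark "ink" from light "paper".

    Tries every threshold t and keeps the one that maximizes the
    between-class variance of the two groups (values <= t and values > t).
    """
    hist = [0] * 256
    for row in pixels:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break               # no light pixels left
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, threshold):
    """Map every pixel to black (0) if it is <= threshold, else white (255)."""
    return [[0 if v <= threshold else 255 for v in row] for row in pixels]


if __name__ == "__main__":
    # Hypothetical patch: two grayish "ink" pixels on a light background.
    patch = [[200, 210, 220],
             [90, 100, 205]]
    t = otsu_threshold(patch)
    print(t, binarize(patch, t))
```

Whether the result is more legible than the original still has to be judged by eye, and on badly degraded scans a global threshold like this can just as easily erase faint strokes as reveal them.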

[Edited at 2008-09-19 15:22]


 
mediamatrix (X)
Local time: 18:47
Spanish to English
QA? Sep 18, 2008

Viktoria Gimbe wrote:

There are documents that cannot be read by the human eye that can be made readable with the help of software. If I was in telefpro's situation, I would try it.


And how do you propose that telefpro should go about validating the OCR output? Even with texts that are easily human-readable, no OCR software is ever 100% accurate. If telefpro can't read the source text, is it reasonable to assume that the output from image-enhanced OCR will, by some miracle, be 100% reliable on this particular occasion?

Telefpro is up against a fundamental law of entropy here.

MediaMatrix


 
Viktoria Gimbe  Identity Verified
Canada
Local time: 18:47
English to French
That's the part his/her intelligence is needed for Sep 18, 2008

mediamatrix wrote:

And how do you propose that telefpro should go about validating the OCR output?


Well, s/he can read, right? Once s/he gets the OCR output, s/he can read through it to decide whether it is reliable enough to be processed. Isn't that what we should be doing even with texts that are already in an editable format?

The point of telefpro's question is to make the text readable by the human eye, not to turn it into editable text (although s/he may ultimately be interested in that as well). What other method do you propose? I don't know of any other solution. It's a matter of using technology to enhance what is humanly feasible.

In some cases, like the present one, technology can go much farther than the human brain - although this is usually not the case.


 

