Scanning Documents

tommo | 16:42 Thu 18th Dec 2008 | Computers
9 Answers
When I scan a Word document on my computer, it automatically saves as a .tif file in my scans folder.

How do I get it to save automatically as a .doc type of file?

I tried renaming it, but that does not work. When I email it on, some people cannot open it at the other end.

Any help appreciated.

Answers

As far as your scanner is concerned, everything you put into it (including a page of typescript) is a picture. Your scanner doesn't 'see' individual letters; it simply records the position of different colours on the page. The software you're using with your scanner then saves that picture as an image file. (TIF, or TIFF, stands for Tagged Image File Format).
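
This is also why renaming the file from .tif to .doc doesn't work: a program decides what a file is from the bytes inside it, not from its name. As a rough illustration only (this is not something your scanner software does, and the function name sniff_file_type is made up for the example), here is a short Python sketch that checks the first few bytes of a file against the well-known signatures for TIFF, JPEG and the container used by old Word .doc files:

```python
# Well-known leading bytes ("signatures") for a few file formats.
SIGNATURES = [
    (b"II*\x00", "TIFF image (little-endian)"),
    (b"MM\x00*", "TIFF image (big-endian)"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE container (used by old .doc files)"),
]


def sniff_file_type(data: bytes) -> str:
    """Guess a file's type from its first bytes, ignoring its name."""
    for magic, description in SIGNATURES:
        if data.startswith(magic):
            return description
    return "unknown"


# Stand-in for the first bytes of a scan; a real file would be read with
# open(path, "rb"). Calling it "scan.doc" would not change these bytes.
fake_scan = b"II*\x00" + b"\x00" * 16
print(sniff_file_type(fake_scan))  # prints: TIFF image (little-endian)
```

Whatever the file is called, the bytes still say 'TIFF', so Word has nothing it can read as a document.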

To get text from a scan you must use special software, which examines the picture and tries to convert the patterns within the image into letters. This process is known as OCR (optical character recognition). Most scanners are now supplied with OCR software. This can either be a 'stand alone' program or it can be built into the main scanning application. (With a Canon scanner you select the 'OCR' button from the CanoScan Toolbox).
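
To give a feel for what 'converting patterns into letters' means, here is a toy Python sketch. It is only an assumption-laden illustration: real OCR programs are far more sophisticated, and the tiny 3x5 letter pictures and the function names are invented for the example. It compares a scanned shape, pixel by pixel, against a stored picture of each letter and picks the closest match:

```python
# Tiny 3x5 "pictures" of three letters; '#' is ink, '.' is blank paper.
TEMPLATES = {
    "T": ["###",
          ".#.",
          ".#.",
          ".#.",
          ".#."],
    "L": ["#..",
          "#..",
          "#..",
          "#..",
          "###"],
    "O": ["###",
          "#.#",
          "#.#",
          "#.#",
          "###"],
}


def matching_pixels(image, template):
    """Count the positions where the two pictures agree."""
    return sum(
        pixel == expected
        for image_row, template_row in zip(image, template)
        for pixel, expected in zip(image_row, template_row)
    )


def recognise(image):
    """Return the letter whose template agrees with the most pixels."""
    return max(TEMPLATES, key=lambda letter: matching_pixels(image, TEMPLATES[letter]))


# A slightly smudged 'T': one extra speck of ink in the bottom-left corner.
smudged_t = ["###",
             ".#.",
             ".#.",
             ".#.",
             "##."]

print(recognise(smudged_t))  # prints: T
```

The smudged shape still matches 'T' on 14 of its 15 pixels, so the guess is right. With an unusual font or a poor scan, the best match can easily be the wrong letter, which is where OCR mistakes come from.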

So check the software which you installed when you got the scanner, looking for the OCR application. That software creates a text file, which can be pasted into a new Word document. (NB: OCR programs can be less than perfect, particularly when the original document uses unusual fonts. You should always check that the text has been read correctly by the software).

Chris
PS: If you're happy to email a picture of your document to people (rather than sending them an actual document), your normal scanning software should have a 'save as' option, which lets you change the file type from the default TIF to the more commonly used JPEG (JPG) format. If you save your scans in this format, people who receive your emails will almost certainly be able to see the picture. (However, because it's a picture, rather than a document, they won't be able to edit the text).

Chris
Question Author
Chris

Thanks a lot for explaining how it all works. I didn't realise that the scanner doesn't recognise text.

My scanner is an all-in-one HP psc 2410 photosmart model. I will have to try to find the software that came with it and look for OCR.

I have had it for about 4 years. If I can't find it, can I download a program off the internet?

Thanks Tommo
Thanks for the reply.

The best-rated free OCR program on Download.com appears to be this one:
http://www.download.com/FreeOCR/3000-10743_4-10717191.html?tag=mncol

Chris
Question Author
Thanks Chris, I have just downloaded that program and scanned a Word document. Once it has scanned, the page opens up on the left-hand side of the screen in the OCR program. However, I can't get it to save anywhere. I tried using the 'Save text' button at the top of the screen and also the 'Word' button, but no joy.

Sorry to be a pain. Have you ever used this program?

Tommo
Sorry for the delay. I'm still on dial-up and it's taken ages to download that program.

Step 1: Click 'Scan' to get the scanner to take a picture of the document, which appears in the left-hand panel. (With the paperback book I've just experimented with, with very small text, the 300dpi setting produced a far better end result than the 200dpi setting).

Step 2: Click 'Start OCR'. This forces the software to try to recognise the characters on the page. The result appears in the right-hand panel.

Step 3 (optional): Edit the text (in the same way that you would in any word processor) to clean up all the things that the OCR program got wrong.

Step 4: Either click 'Save text' (which saves the text as a basic text file) or, probably more useful, click 'Word' which opens a new Word document with the text already pasted into it.
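
On the 200dpi versus 300dpi point in Step 1, a back-of-the-envelope sketch in Python shows why resolution matters for small print. It assumes type size is measured in points (72 to the inch); the 8-point figure is only a guess at typical paperback small print, not a measurement, and the function name is made up for the illustration:

```python
POINTS_PER_INCH = 72  # standard typographic point


def text_height_in_pixels(font_size_pt: float, dpi: int) -> float:
    """Approximate height, in scanned pixels, of one line of type."""
    return font_size_pt / POINTS_PER_INCH * dpi


for dpi in (200, 300):
    pixels = text_height_in_pixels(8, dpi)
    print(f"{dpi} dpi: 8pt text is about {pixels:.0f} pixels tall")
```

That works out at roughly 22 pixels per line of text at 200dpi against 33 at 300dpi, so the OCR program has about half as much detail again to work with for each letter.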

Chris
Question Author
Thanks Chris, I have followed your instructions but have some problems.

I wrote "testing testing 123" on an A4 piece of paper.

I then clicked 'Scan', then 'Start OCR'. However, the text on the right-hand side was just a load of weird characters, not the text on the paper.

I tried using the 'Word' button at the top of the screen, but when Word 2007 opened it would not display the page. I went into Task Manager and it said 'Not responding'.

I tried the 'Save text' button to save to the desktop, but the file just displayed all the weird text.

Did the program work OK for you?

Sorry to take up so much of your time Chris.

Tommo
I've no idea why Word isn't responding. (Simply restarting your computer might fix the problem).

OCR programs can generally only 'read' clear typescript. Technology is available which can read some forms of handwriting. (I believe that Post Office sorting centres have machines which can read some hand-written post codes). However, it requires a great deal of computing power and the necessary software usually costs mega-bucks. (When your brain interprets those three squiggles on a page as the numbers 1, 2 and 3 it's using a lot of computing power, together with a mental 'program' which has been gradually refined through many years of use).

OCR technology is still really in its infancy. The software available for home PCs is still only barely adequate for reading clear, type-written text in a sans-serif font and with a good contrast ratio between the text and the background. I rarely use it, simply because I find that I can re-type the original document in about the same time as it takes to scan it and then correct all of the errors introduced by the OCR program.

Try using the OCR program to 'read' a well-spaced, clearly-printed page from a book. That will illustrate the limits of the technology. (There might be some fairly expensive commercial programs which can do a better job than the free one, but none of them are perfect).
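
If you want to put a rough number on how well a test went, one possible approach (a sketch only, using Python's standard difflib module; the 'OCR output' below is a made-up example of typical slips, not real output from the program) is to compare what was on the page with what the software produced:

```python
import difflib


def similarity(expected: str, ocr_output: str) -> float:
    """Return a 0.0 to 1.0 score for how closely the OCR text matches."""
    return difflib.SequenceMatcher(None, expected, ocr_output).ratio()


original = "testing testing 123"
ocr_text = "test1ng testing l23"  # invented example: 'i' read as '1', '1' as 'l'

print(f"{similarity(original, ocr_text):.0%} similar")
```

A score of 100% means the text came through untouched; the lower the score, the more correcting you have to do by hand.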

For background information, see here:
http://en.wikipedia.org/wiki/Optical_character_recognition

Chris
Question Author
Chris, thanks for taking the time to help me out on this one; it is much appreciated. I tried scanning a typed document and it worked OK.

Thanks again

Tommo
