
How to Create an eBook - Abby FineReader Tutorial v0-1.3

Date post: 28-Oct-2014
Upload: hutilaci

Shiva's PDF ebook tutorial using ABBYY FineReader

This tutorial is not a replacement for the ABBYY FineReader Help File - most of what you need to know can be found there. But as there are many ways to create an OCRed PDF, I will show one way to do it fast and with good results.

Contents

1 Quick and dirty: main steps
  Start screen
  Scan
  Windows
  Save results as PDF
  Results in Acrobat
2 Options and Settings
  Scan, Read and Font Options
  Save and View
3 Image editing - levels
4 OCR in depth
  Areas and tools (image window)
  Text areas with tables
  Background images
  Background images II
  Proofreading and spell checking
5 Finding Fonts
6 Additional Software
  Using PitStop Part I
  Using PitStop Part II


1 Quick and dirty: main steps

This is what it looks like when you start FineReader.

In this chapter we use the main buttons: "Scan" and "Read".

If you want to work on already scanned images or on PDFs containing images, you can import them with "Open".


1.1 Scan

After we click "Scan", we get the preview window. In the settings we can choose between the "ABBYY FineReader Interface" and the "native interface". I always use the native interface because I have more options there.

Screenshots: "ABBYY FineReader Interface" and "native interface"

Scanner Settings:
Resolution: Even though 300 DPI or more is often recommended, I get good results at 150/200 DPI.
Scanning Mode: Greyscale is optimal for OCR. For colored pages I switch the scanning mode.
Brightness: Manual. No changes here. If needed, adjust it later on pages with images (using curves).
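To get a feeling for what the resolution setting means in practice, here is a small sketch that computes the pixel dimensions and the uncompressed size of one greyscale page at different DPI values. The 6 x 9 inch scan area and the function name scan_size are only assumptions for illustration; real file sizes will be much smaller after compression.

```python
def scan_size(width_in, height_in, dpi, bytes_per_pixel=1):
    """Return (width_px, height_px, raw_bytes) for a scan area.

    bytes_per_pixel=1 corresponds to 8-bit greyscale; use 3 for 24-bit color.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


# Hypothetical scan area of 6 x 9 inches (an assumption, not from the tutorial).
for dpi in (150, 200, 300):
    w, h, raw = scan_size(6.0, 9.0, dpi)
    print(f"{dpi} DPI: {w} x {h} px, about {raw / 1_000_000:.1f} MB uncompressed")
```

Doubling the resolution quadruples the number of pixels, which is why 150/200 DPI scans are so much lighter to handle than 300 DPI scans.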

Paper Settings: Draw a rectangle in the preview window - a bit smaller than the page, because this area will be used for all pages. Place the book in a corner of the scanner so that it is always in the same position.

Image Processing: Check all checkboxes. This can also be done in the options/settings - we come to that later.

Below is a screenshot of my "native interface" (it looks different depending on the scanner and scan software you use; I couldn't switch to an English menu here). There is just one thing that I use regularly: descreening on pages with images. The scanning process then takes a bit longer.

Bend the book open at several different pages before you start with page 1. Use one corner of the scanner.


1 Quick and dirty: main steps - Windows

Back from the preview window (click "Close" in the preview window after all pages are scanned), we see the scanned pages.

Left window: Icons of all scanned pages.
Center: The Image window displays an image of the current page. You can edit image areas, page images, and text properties in this window. More on this later - sometimes, as in this example, the automatic layout analysis gives good results.
Right window: We will see the recognized text here after the next step.

So, what we do next is press the "Read" button.

Optional: I normally save the project for the first time at this step. Depending on the stability of your computer/system, you could close the preview window every 100 pages (check whether your interface keeps the scan area, so that all pages end up the same size).


1 Quick and dirty: main steps - Save results as PDF

Now we see the results in the right window. In the Image window you see the recognized text areas (green rectangles).

Red underline: words not found in the dictionary.
Blue background: FineReader is not sure about these characters.

Both are very helpful for checking the spelling and making manual corrections (the last step before saving to PDF). If you don't correct errors here, they will show up in the PDF. If you don't want to correct them, it is better to save as JPG (or as an image-only PDF).

After everything is corrected, press the "Save to PDF" button.


1 Quick and dirty: main steps - Results in Acrobat

The result opened in Acrobat (two pages):


2 Options and Settings - Scan, Read and Font Options

Scan/Open
General: I work with "Do not read and analyze acquired page images automatically" selected. Sometimes it is more work to correct wrongly analyzed pages. And if you have to edit the contrast, you have to analyze the layout again anyway.

Image processing: All boxes checked (more on exceptions later).

Scanner: Here you find the choice between the interfaces that I mentioned earlier.

Read
Training: I once spent 6 hours training a user pattern on a difficult scan that I found - a waste of time. The built-in patterns are better, and correcting errors manually takes less time. So: "Use only built-in patterns".

If you click "Fonts" you can set the fonts used in the recognized text (screenshot to the right).

Font Matching
FineReader isn't really good at assigning the right font. I always use just one font. If there are different fonts in headlines etc., I edit that manually later (how-to in the next chapter).

How to find out which font is used, where to get it, and how to use it is explained in the font chapter.


2 Options and Settings - Save and View

Save:
Default paper size: Use original image size (I like the original look).

Save mode: Text and pictures only (no JPG text needed - we use a nice font and get a small PDF).

Image settings: I mostly work with 150 DPI. There are several places to set the final resolution: first when scanning, then here, or when optimizing in Acrobat.

Font settings: It is very important to embed fonts. You never know which fonts are installed on the readers' computers. A good layout can be destroyed if another font is used on the reader's side.

View:
Text window: Highlight uncertain characters and non-dictionary words (important for spell checking later).


3 Image editing - Working on levels

Working on levels on pages with images (copied from the FineReader help file):

Levels allows you to adjust the tonal values of the image by selecting the levels for shadows, highlights, and midtones on a histogram.
To increase image contrast, move the right and left sliders on the input levels histogram. The tone corresponding to the position of the left slider will be assumed to be the blackest part of the image, and the tone corresponding to the position of the right slider will be assumed to be the whitest part of the image. The remaining levels between the sliders will be distributed between level 0 and level 255. Moving the central slider to the right or to the left will make the image darker or brighter respectively.
To decrease image contrast, adjust the sliders for the output levels.

Grey areas in the background: Move the white slider to a point where about 90% of that curve is "whitened". Setting the black slider to the beginning of the curve looks best.
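The input levels described above can be sketched in a few lines of Python. This is only an illustration of the general technique as it is commonly implemented in image editors; I don't know FineReader's exact formula, and the function name apply_levels, the gamma parameter for the central slider, and the sample values are my own assumptions.

```python
def apply_levels(value, black=0, white=255, gamma=1.0):
    """Map one 8-bit grey value through an input-levels adjustment.

    black / white: positions of the left and right sliders (0..255).
    gamma: assumed midtone control; > 1 brightens, < 1 darkens.
    """
    if white <= black:
        raise ValueError("white point must be above black point")
    # Everything at or below the black slider becomes 0,
    # everything at or above the white slider becomes 255.
    clamped = min(max(value, black), white)
    t = (clamped - black) / (white - black)
    # Midtone correction on the normalized 0..1 value.
    t = t ** (1.0 / gamma)
    return round(t * 255)


# Hypothetical slider positions: black at 30, white at 200.
for grey in (20, 30, 100, 150, 200, 230):
    print(grey, "->", apply_levels(grey, black=30, white=200))
```

With the white slider at 200, a light grey page background of 200 or more comes out as pure white (255), which is the "whitening" effect wanted for grey background areas.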


4 OCR in depth - Areas and tools (image window)

Remember that you have to edit the page images before you analyze the layout and read them (sometimes no editing is needed).

Preparing the OCR process:
This is done in the Image window. You define the areas - mainly as text or image areas. I will explain the tools/buttons:

Text: This is the main tool to define text areas (green). Don't leave too much space to the left and right - errors in layout recognition may occur, or dirt on the pages may be recognized as characters. Different font styles or text elements (headline, page number etc.) can be marked with one rectangle.

Picture: With this tool you define picture areas (red). As you see in my example to the left, you can save time by defining one picture area where text and graphics are mixed. FineReader does a bad job of separating them automatically (sometimes I do that manually).

Table: I use the Table tool very often - not only for tables. Examples (contents/index) later.

Background Picture: I rarely use this. One example later.

Edit Image: Mostly used on greyscale images to optimize contrast. In a clean OCR PDF, white areas of an image should be white (see "Image editing - Working on levels"). Also often used for cropping pages - not needed if you scan yourself, but useful if you OCR a scan from someone else that has too much wasted space.

Analyze: This is the automatic layout analysis of the current page. With time you get a feeling for whether, on a particular page, it is more effective to analyze automatically and then correct the result, or to define the areas manually. FineReader sometimes has problems with mixed pages (text/images), tables, and text in columns.

Read: This function will OCR the analyzed areas - if there are no analyzed areas, FineReader will analyze the page first. If an area was missing - often the page number - you have to add it manually and read the whole page again.

Select: With the Select tool you can work on the analyzed areas - change their size, define rows and columns in tables (how-to on the next page) etc.


4 OCR in depth - text areas with tables

On the contents pages I often work with tables to get a clean layout. Once the table area is defined, you get more tools with the Select tool:

Add horizontal separator

Add vertical separator

Merge table cells

1) In this example I start by drawing the rectangle with the Table tool.

2) Define columns with vertical separators.

3) Define rows with horizontal separators.

4) Select the table cells to merge.

5) No more separation is needed for good results.
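What the separators do to a table area can be pictured with a small sketch: a rectangle plus a list of vertical and horizontal separator positions gives a grid of cells. This is not how FineReader works internally (I don't know that); the function name table_cells and the pixel coordinates are assumptions for illustration only.

```python
def table_cells(left, top, right, bottom, v_seps=(), h_seps=()):
    """Split a table rectangle into cells.

    v_seps: x positions of vertical separators (they define the columns).
    h_seps: y positions of horizontal separators (they define the rows).
    Returns a list of rows; each cell is (left, top, right, bottom).
    """
    xs = [left, *sorted(v_seps), right]
    ys = [top, *sorted(h_seps), bottom]
    return [
        [(xs[c], ys[r], xs[c + 1], ys[r + 1]) for c in range(len(xs) - 1)]
        for r in range(len(ys) - 1)
    ]


# Hypothetical contents page: one wide column for the chapter titles,
# one narrow column for the page numbers, three rows.
cells = table_cells(0, 0, 100, 60, v_seps=[80], h_seps=[20, 40])
for row in cells:
    print(row)
```

For a contents page, one vertical separator between the titles and the page numbers is often the most important one, because it keeps the page numbers aligned in their own column.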


4 OCR in depth - background images

Using the background image area:
Sometimes I like to have clear characters in schemes and diagrams. Usually you can place image and text areas side by side - sometimes you have to add and cut parts of areas. When they overlap you can still use background images: first draw the background image area, then overlay the text areas. Both can be seen in the screenshot to the right.

[Screenshot: final page (PDF). The example is a diagram captioned "Fig. 1. The Great Chain of Being", with labels running from "1. Nature" to "8. Ultimate".]


4 OCR in depth - background images II

This example shows how capable the Background Picture tool is. Sometimes there is a background image under the text. You can decide to discard it, but if you want to keep it, FineReader does a good job: the parts of the background image where the original characters are visible are replaced by a mix of the surrounding pixels, so the new text/font can be overlaid. See the close-up to the right.
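The idea of replacing the character pixels by "a mix of the surrounding pixels" can be illustrated with a very simple neighbor-averaging fill. This is only a sketch of the general idea and certainly not FineReader's actual algorithm; the function name fill_from_neighbors and the tiny greyscale grid are assumptions for illustration.

```python
def fill_from_neighbors(image, mask):
    """Replace masked pixels by the average of their known neighbors.

    image: list of rows of grey values (0..255).
    mask:  same shape, True where a character pixel should be removed.
    Returns a new image; the input is not modified.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    unknown = {(y, x) for y in range(h) for x in range(w) if mask[y][x]}
    while unknown:
        filled = {}
        for (y, x) in unknown:
            # Collect the up to 8 surrounding pixels that are already known.
            vals = [
                out[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x) and (ny, nx) not in unknown
            ]
            if vals:
                filled[(y, x)] = round(sum(vals) / len(vals))
        if not filled:
            break  # no known pixel left to copy from (everything is masked)
        for (y, x), value in filled.items():
            out[y][x] = value
        unknown.difference_update(filled)
    return out


# Light background (200) with one dark character pixel (0) in the middle.
image = [[200, 200, 200], [200, 0, 200], [200, 200, 200]]
mask = [[False, False, False], [False, True, False], [False, False, False]]
print(fill_from_neighbors(image, mask))
```

Larger character shapes are filled from the outside in, one ring of pixels per pass, so the result blends into the background instead of leaving a dark shadow under the new text.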

[Example page: a German-language book page, headed "Chakra drei: Feuer" and "Eingangsmeditation", whose body text is printed over a background image.]

In a close-up we see the comparison of the original scan (above) and the resulting PDF (below).


4 OCR in depth - Proofreading and spell checking

Proofreading and spell checking:
This is the most important part, and it takes 80% of the time of a project. As an example I took a very difficult scan that I found on the net. Sometimes scans have manually underlined words. To remove the underlining, select all and click Underline twice (Ctrl+U).

Setting the sizes of the windows:
On the left I have the icons of the pages to know "where I am" (not necessary). The Image window is also not needed (not seen in the screenshot). The Text window is as big as possible, to see as much text as possible at once, with a font big enough to identify the recognized characters. Below it there are about 3 lines of the original scan.

Going through the text:
I start at page 1, click into the Text window and jump forward with the PgDn key (back with PgUp). The current cursor position is shown in the window below by a yellow rectangle with a blue outline. If there is a word or character marked in blue, move the cursor there and compare the content of the two windows.

The whole process may take from 5 to 15 hours. It depends on the quality of the scan and the number of pages.


5 Finding the fonts used in the book

You can see this as a step for advanced users and simply use a font you like and already have. You can decide to take a similar (often serif) font, or a sans-serif font that is better for screen reading.

When there are different fonts on different pages (usually one for headlines and one for the main text), I right-click the icons of the pages in the left window (select more than one page by holding "Ctrl") and choose "Save selected images".

I open these pages in Photoshop (freeware alternative: Gimp), crop a sentence (minimum two words) and save it as JPG. I upload this JPG to http://www.myfonts.com/WhatTheFont

In most cases the correct font will be identified - sometimes you get suggestions for similar fonts. You can use the font names to continue the search here, where similar fonts are shown: http://www.identifont.com/find-font.html

In most cases you find the font with Google - there are a lot of torrents with font packages or single font downloads. There are collections of many GB of sorted fonts. Never install too many fonts - it will slow down your computer. You can use a font manager, but that is not needed - you can simply search the font archive folder for the font name.

FineReader has some problems with OTF fonts. You have to convert them to TTF first. You can do that online here: http://www.freefontconverter.com/ or here: http://onlinefontconverter.com/
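Searching a font archive folder for a font name, and seeing at a glance which hits are OTF files that still have to be converted, can be done with a short script. This is just a sketch: the function name find_fonts and the folder name in the usage note are assumptions, and the script only looks at file names, not at the font data itself.

```python
from pathlib import Path


def find_fonts(folder, name):
    """Search a font archive folder (including subfolders) by file name.

    Returns a list of (path, needs_conversion) pairs, where
    needs_conversion is True for .otf files.
    """
    results = []
    for path in sorted(Path(folder).rglob("*")):
        suffix = path.suffix.lower()
        if suffix not in (".ttf", ".otf"):
            continue  # skip everything that is not a font file
        if name.lower() in path.stem.lower():
            results.append((path, suffix == ".otf"))
    return results
```

A call like find_fonts("font_archive", "garamond") (with a hypothetical folder name) returns all matching TTF and OTF files; the ones flagged True are the OTF files to run through one of the converters first.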

results page at whatthefont:

extracting jpg from scans:


6 Additional software - Using Pitstop Part I

PitStop is a great plugin for Acrobat Pro. In this last step you can edit the final PDF. You can do everything you need: delete and resize objects, add lines, change colors, copy & paste objects between pages and between different PDFs etc.
Just one example, where I use it when OCRing the back cover:

Analyzed page in Image window / Recognized text

Result in Acrobat Pro:


6 Additional software - Using Pitstop Part II

With PitStop you get many more toolboxes in Acrobat Pro - one is "PitStop Edit":

If you want to delete, move or scale an object (text line, image or background), you have to select it with this tool. The object will be marked with blue corners or outlines (screenshot below).

You can move objects with this tool. Sometimes FineReader produces layout errors that can't be corrected there. Sometimes you work on scans where the text blocks are too close to one side - you can center them with this tool.
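If you want to center a text block exactly instead of by eye, the distance to move it is simple arithmetic. A small sketch, assuming you know the page width and the left and right edges of the block in the same unit (for example points); the function name centering_offset and the numbers are only examples.

```python
def centering_offset(page_width, block_left, block_right):
    """Return how far to move a block horizontally to center it.

    Positive result: move right. Negative result: move left.
    """
    block_width = block_right - block_left
    target_left = (page_width - block_width) / 2
    return target_left - block_left


# Hypothetical page of 600 pt with a text block from 40 pt to 500 pt:
# the block sits too far left and has to move 30 pt to the right.
print(centering_offset(600, 40, 500))
```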

Text editing is possible with a tool from another toolbox (the TouchUp tool in the steps below). Don't change text when you have embedded fonts - do that in FineReader. I use this tool only for the text color (example to the right).

[Example page: the back cover of "The Tibetan Yogas of Dream and Sleep" by Tenzin Wangyal Rinpoche (Snow Lion), with blurb, author biography, endorsements, and ISBN/price line.]

Select the text with the TouchUp tool.

Right-click - Properties - change the font color from white to black.

Select the black background rectangle(s) and delete them.

Final page in Acrobat.

