I’m Ben Denckla and this page advertises my ebook consulting services. My services center on writing custom software to create ebooks from sources, that is, from the files that were used to create the paper book.
If you are considering OCR because it is not clear how to turn your sources into an ebook, I may be able to help.
Below I review some of your options, including my services.
OCR is far from perfect
Typically, even when sources are available, an ebook is created by running OCR (optical character recognition) software on a scan of a paper book. OCR works amazingly well, but it is far from perfect. As a result, most ebooks created this way contain typos introduced by OCR.
If you are interested in the details, I am working on a page that classifies, discusses, and gives examples of many types of OCR errors.
Is OCR good enough?
Below are some questions to consider in deciding whether OCR is good enough for a particular book.
- What is the cost of doing better than OCR?
- What is the benefit of doing better than OCR?
  - How many typos would be eliminated? For example, in a book I worked on, by switching to sources, I was able to eliminate more than one OCR typo per page. But a simple book like a novel in an OCR-friendly font might only have a handful of OCR typos in the whole thing, so there’s not that much room for improvement.
  - How sensitive are your customers to typos? For example, readers of a reference book may be more sensitive to typos than readers of a novel.
Adding proofreading and spell-checking to OCR
Most books are proofread and spell-checked before they are printed on paper. But the economics of ebooks seem to be such that it is usually too expensive to do P&S (proofreading and spell-checking) a second time for an ebook created using OCR.
Replacing OCR with conversion from sources
To avoid OCR entirely, I write custom software to create ebooks from sources. By “sources” I mean whatever computer files were used to create the paper book.
For a quote, write to me (Ben Denckla) at EbooksFromSources@yahoo.com.
Can you use generic software?
In many cases, you don’t need custom software to create an ebook from sources. For example, recent versions of Adobe InDesign have features to create ebooks. So if your paper book was created using InDesign, it makes sense to try to use InDesign to create your ebook.
What I specialize in is “middle-aged” books: those produced within the digital era, but not recently enough to have been made with off-the-shelf software that can create ebooks.
Won’t it cost even more than OCR + P&S?
Custom software can be more expensive than OCR + P&S the first time you do it. But if you want to convert a number of books that share the same source file format, it can be cheaper, because the cost of the software is paid once and then reused. So whether custom conversion software beats OCR + P&S depends on the answers to the following questions.
- How much will it cost to write the conversion software?
- How many books will be converted, i.e. over how many books will the cost of the conversion software be spread?
In addition, even if you have only one book to convert, I may have already developed software to convert its format or something close to it. Or, even if I’ve never seen something like your format, I may choose to absorb the fixed cost for you if I think I can use the code in the future.
Finally, keep in mind that OCR + P&S is a lot closer to perfect than OCR alone, but it is still not perfect. To guarantee that certain types of errors will not appear, you must create the ebook from sources. So, directly comparing the cost of OCR + P&S to the cost of custom conversion from sources is not totally fair, since the benefits of custom conversion are greater.
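To make the cost comparison in this section concrete, here is a minimal sketch in Python. It assumes a simple model that is not spelled out above: custom conversion has a one-time software cost plus a small cost per book, and OCR + P&S has a flat cost per book. The function names, parameters, and dollar figures are hypothetical and are for illustration only; they are not quotes or real prices.

```python
import math


def custom_cost_per_book(fixed_cost, per_book_cost, num_books):
    """Average cost per book when the one-time software cost
    is spread over num_books (a simplifying assumption)."""
    if num_books < 1:
        raise ValueError("num_books must be at least 1")
    return fixed_cost / num_books + per_book_cost


def break_even_books(fixed_cost, per_book_cost, ocr_ps_cost_per_book):
    """Smallest number of books at which custom conversion costs
    no more per book than OCR + P&S. Returns None if the per-book
    cost of custom conversion is not lower, so it never catches up."""
    saving_per_book = ocr_ps_cost_per_book - per_book_cost
    if saving_per_book <= 0:
        return None
    return max(1, math.ceil(fixed_cost / saving_per_book))


# Hypothetical figures, chosen only to show the arithmetic.
books_needed = break_even_books(
    fixed_cost=3000, per_book_cost=200, ocr_ps_cost_per_book=800
)
print(books_needed)
print(custom_cost_per_book(3000, 200, books_needed))
```

With these made-up numbers, each converted book saves 600 relative to OCR + P&S, so the one-time cost of 3000 is recovered once five books have been converted. This comparison covers cost only; it does not account for the extra benefit of conversion from sources described above.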
Sources are only close to perfect
Even using sources guarantees only that certain types of errors will not appear. Some types of errors are latent within the sources, even though they do not appear in the paper book. These errors mainly have to do with line breaks. Only P&S can catch these errors, and, since it is a human endeavor, it is of course not guaranteed to catch them.
My areas of expertise
Though I’m game to consider any job, here are my particular areas of expertise.
- Conversion from 3B2 (now PTC Arbortext) sources
- Books mixing English and other languages
- Books mixing English with right-to-left languages
- Books mixing English with languages using the Hebrew alphabet (Hebrew, Aramaic, Yiddish, etc.)