Smashwords And EPUBs
Smashwords made a change recently and now allows publishers to upload hand-formatted EPUB files. Originally, the only EPUB files you could have at Smashwords were the ones automatically generated by their Meat Grinder program. While Meat Grinder does a pretty decent job, I always found the ebooks to be rather plain looking. And those EPUB files are what get syndicated to premium catalog stores like iTunes, Kobo, and Barnes & Noble. There are also some nice things I do in the Kindle versions of our books that I simply could not do with Smashwords.
My Conversion System, And The Problems With It
I’ve been experimenting with creating MOBI and EPUB files and have come up with my own little system, which is basically done in these steps:
- I lay out the ebook in LibreOffice Writer according to the Smashwords Style Guide.
- I export the file to Microsoft DOC format and upload to Smashwords.
- I copy the original LibreOffice file and rename it to Kindle Version. I make some tweaks and export it to HTML.
- I edit the HTML by hand to make it smaller and more efficient (usually removing 50-60% of the original file).
- I create the Table of Contents and index files for the Kindle version.
- I run that through KindleGen for uploading to Amazon.
- (new step) I use Calibre to convert the Kindle file to an EPUB.
- (new step) I use Sigil to tweak the EPUB layout and text styles, and to fix the conversion issues.
I did run into some odd problems along the way, mostly due to Calibre sticking an extra file in the EPUB (it stores your bookmarks inside your EPUB files). Luckily, that’s easy to solve (just open the EPUB in a compression program like 7-Zip and delete that errant file). But every time you test your ebook, you have to remember to remove that bookmark file. Turning off Calibre’s setting for keeping track of your reading location appears to stop it from putting that bookmark file in there. So far.
The next problem I ran into was with the Smashwords auto-testing program, which kept complaining about my cover. It said the cover had to be at least 1400 pixels wide. The cover in the new EPUB file for my test book (Pariah) is 1600 x 2400 pixels. I’m mildly embarrassed to say it took me almost an hour to figure out they weren’t talking about the cover inside the EPUB file. They were talking about the cover associated with the book in their dashboard (which was only 600 x 900 pixels). They could use a minor interface update to make that error clearer. Once I figured out which cover they had a problem with, it was easy to upload the higher resolution version.
When all the problems with the cover were eventually solved, I started getting error messages about other things. Apparently, Sigil uses a different EPUB validation library (FlightCrew) than Smashwords (Epubcheck). They are not the same. Passing validation inside Sigil does not mean the file will pass when you upload it to Smashwords.
The really odd thing is some of the errors thrown were rather stupid — like the Calibre bookmark text file above, and my favorite error, “length of first filename in archive must be 8, but was 22.” Don’t even get me started on the MS-DOS / Windows 3.1 flashbacks. I eventually tracked down the source of this error, which (ironically) was due to me opening the compressed file manually to delete that stupid Calibre bookmark file. Re-compressing the EPUB file fixed the issue (note: for those running into the same problem, add the mimetype file first).
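For anyone who would rather script that repair than redo it by hand in 7-Zip, here is a minimal Python sketch. It is only an illustration, not the script used for these books. It rebuilds the archive with mimetype as the first entry, stored uncompressed, and drops any unwanted files along the way. The bookmark file name in CALIBRE_BOOKMARKS is an assumption, so check what is actually inside your own EPUB. As a side note, the 8 in that error message is the length of the name "mimetype", and 22 happens to be the length of "META-INF/container.xml", which suggests that file had ended up first in the archive.

```python
import io
import zipfile

# Assumption: the name of the bookmark file Calibre adds. Verify it in your own EPUB.
CALIBRE_BOOKMARKS = "META-INF/calibre_bookmarks.txt"


def repack_epub(data, drop=(CALIBRE_BOOKMARKS,)):
    """Return new EPUB bytes with `mimetype` first and the `drop` entries removed."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(data)) as zin, zipfile.ZipFile(out, "w") as zout:
        # The EPUB container expects "mimetype" as the first archive entry,
        # stored without compression.
        zout.writestr(
            zipfile.ZipInfo("mimetype"),
            zin.read("mimetype"),
            compress_type=zipfile.ZIP_STORED,
        )
        # Copy everything else in its original order, compressed as usual.
        for item in zin.infolist():
            if item.filename == "mimetype" or item.filename in drop:
                continue
            zout.writestr(
                item,
                zin.read(item.filename),
                compress_type=zipfile.ZIP_DEFLATED,
            )
    return out.getvalue()
```

To use it on a file, read the EPUB as bytes, pass them to repack_epub, and write the result to a new file before running it through validation again.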
Unfortunately, dealing with a constant stream of these errors led me to install my own copy of Epubcheck on my computer. It was easier than I expected (which in Linux terms means “I didn’t have to compile it myself”), and I was hopeful that now I could actually finish converting the other books in my catalog.
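Epubcheck is distributed as a Java jar, so a local check amounts to running it with java -jar against the EPUB. A rough Python wrapper might look like the sketch below. The jar location and the function names are hypothetical, and treating a zero exit status as "passed" is an assumption worth confirming against the version you install.

```python
import subprocess


def epubcheck_command(epub_path, jar_path="epubcheck.jar"):
    """Build the command line. jar_path is wherever the Epubcheck jar was unpacked."""
    return ["java", "-jar", str(jar_path), str(epub_path)]


def validate_epub(epub_path, jar_path="epubcheck.jar"):
    """Run Epubcheck and return (passed, report_text)."""
    result = subprocess.run(
        epubcheck_command(epub_path, jar_path),
        capture_output=True,
        text=True,
    )
    # Assumption: a zero exit status means no errors were reported.
    return result.returncode == 0, result.stdout + result.stderr
```

Running the same validator that Smashwords uses before every upload catches most of these errors at home instead of in the dashboard.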
At this point, I figured I was finally done and had a perfect EPUB book. Wrong again. After a few days, Smashwords again complained about the cover, telling me:
Please make sure the cover image inside your EPUB is sized correctly. Currently, if you take a look at it in Adobe Digital Editions, you will see that much of the title and author name get cut off. Many eReaders base their software upon Adobe, so your cover will appear poorly on some of them.
In their own FAQ, on the same page, it clearly states:
A good dimension is 1,600 pixels wide, so if you aim for 50% greater on the height, multiply the width by 1.5 and you get a height of 2,400 pixels.
So I followed their directions exactly and made a cover that was 1600 by 2400 pixels, and now suddenly the size is wrong?
I decided to install Adobe Digital Editions to check it out for myself. What a terrible program! The fonts are so blurry I can’t even look at my laptop’s screen for more than ten seconds at a time. I’m having trouble typing this post (which is quickly deteriorating into a rant) because of the nausea inflicted by that program. I know what’s wrong with the program because I’ve seen it many times before. They are doing something screwy with the font rendering. Microsoft calls it “ClearType” and I am one of the few people who get physically ill when that stuff is put on screen. The kicker is that I’ve configured my laptop (trusty old Windows XP) to completely disable that font “smoothing” tech. And yet, somehow, Adobe has managed to enable it, just for their program. I could never use this program to read books.
After much fruitless searching for a cure to this problem, I gave up and dealt with it, going through the process of previewing my book… in ten to twelve second increments. I’m starting to feel dizzy, but I have to know what is going on. I load my EPUB file and it has quite a few problems. The cover is all messed up and this program also appears to be incapable of rendering centered text. Everything looks fine in Sigil, Calibre, FBReader, and Cool Reader (on my tablet). So what is Adobe’s problem (aside from the crappy font rendering)? I’m really starting to wonder if anybody supports the EPUB standard. This situation reminds me of another so-called standard that was turned into an absolute mess (I’m looking at you, HTML).
The Nuclear Option
After countless hours wasted trying to figure out what was wrong with (a) my cover and (b) Adobe Digital Editions, I decided to take what I call the Nuclear Option. Since I couldn’t seem to get Sigil to do what I needed, and really did not like the prospect of having to format my ebooks three or four times, I decided the best solution was to modify my workflow into this:
- Lay out the ebook in LibreOffice Writer.
- Export to Microsoft DOC format and upload to Smashwords.
- Export the ebook as an HTML file.
- Edit the HTML by hand to make it smaller and more efficient. Split the giant ebook file into one HTML file per chapter. Manually create the entire EPUB folder structure.
- Write a Linux bash shell script to automatically generate the table of contents and index files.
- Write another Linux shell script to automatically generate the rest of the files I need, build the EPUB, run it through Epubcheck, then build a Kindle version (which at this point has only a few very small differences), and run that through KindleGen.
- Drink a beer.
The advantages of this new system are many (and I’m not even counting step 7).
First, instead of having to hand-format a Kindle edition in LibreOffice, export as HTML, then clean up the HTML code, run it through KindleGen, run it through Calibre, and then still have to clean it up in Sigil… I now export the Smashwords document without doing any additional work. I still have to do the same HTML cleanup, and it takes just a little longer due to the requirements of the EPUB container. But I’ve eliminated two entire programs (Calibre and Sigil) as well as the conversion steps they both impose.
The second big advantage is that I am now able to very easily set up another folder with common files — sort of like having a template. My new Catalog pages, all of my About The Author pages, book cover images, author bio images, and more… all shared across all my books without having to do any additional work. If I update my catalog, all future books will automatically use it when I run my shell script.
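One way to picture that shared folder is as a set of symbolic links from each book's build folder back to the common files. The Python sketch below is an illustration under assumptions, not the actual setup: the folder layout, the function name, and the file names are all hypothetical.

```python
import os
from pathlib import Path


def link_common_files(common_dir, book_dir):
    """Symlink every file in common_dir into book_dir; return the names newly linked."""
    common, book = Path(common_dir), Path(book_dir)
    book.mkdir(parents=True, exist_ok=True)
    linked = []
    for src in sorted(common.iterdir()):
        dst = book / src.name
        # Skip subfolders and anything the book folder already provides,
        # so a book can override a shared file with its own copy.
        if src.is_file() and not os.path.lexists(dst):
            os.symlink(src.resolve(), dst)
            linked.append(dst.name)
    return linked
```

Because the book folder only holds links, updating a file in the common folder changes what every later build picks up, with no copying required.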
A minor benefit that few people would probably even notice is the image quality. Putting an image inside LibreOffice, then exporting to HTML reduces the quality of the image somewhat. Then I was running it through KindleGen, which reduced it more. Then Calibre and Sigil both mangled it further. That’s the problem with JPEG images. It is a lossy format, so every time a program re-encodes the file it gets a little worse, and every program in my previous workflow was altering the image. So my new system lets me easily plug in all of the original image files, and they look as good as a first edition ebook should.
The last big benefit is time. Yes, it was a big initial investment. Yes, I still have to hand-edit the HTML and split it into one file per chapter. But now I put a few other files into a folder, set up some symbolic links, edit one book metadata file, and then run my shell scripts. The bulk of the work gets done automatically and I get a good looking, well-formatted, and fully compliant EPUB file in about two seconds. About one second later, I get a Kindle MOBI file of similar quality.
I think the best benefit, by far, is that my new system finally fixed that screwed up cover issue in Adobe Digital Editions. What was wrong with all my earlier attempts? I have no idea.
And I only had to write about 340 lines of bash script to accomplish this. Sadly, I still think that was the easy part.
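To give a flavor of what the table-of-contents step involves, here is a minimal Python sketch that writes an EPUB 2 style toc.ncx from a list of labels and file names. It is not the bash script described above. The function name, the file names, and the identifier are hypothetical, and a real build also needs the matching OPF manifest.

```python
from xml.sax.saxutils import escape, quoteattr

NCX_NS = "http://www.daisy.org/z3986/2005/ncx/"


def build_ncx(title, uid, entries):
    """Return toc.ncx text for (label, href) pairs given in reading order."""
    nav_points = []
    for order, (label, href) in enumerate(entries, start=1):
        nav_points.append(
            f'    <navPoint id="navpoint-{order}" playOrder="{order}">\n'
            f"      <navLabel><text>{escape(label)}</text></navLabel>\n"
            f"      <content src={quoteattr(href)}/>\n"
            f"    </navPoint>\n"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<ncx xmlns="{NCX_NS}" version="2005-1">\n'
        "  <head>\n"
        f'    <meta name="dtb:uid" content={quoteattr(uid)}/>\n'
        '    <meta name="dtb:depth" content="1"/>\n'
        '    <meta name="dtb:totalPageCount" content="0"/>\n'
        '    <meta name="dtb:maxPageNumber" content="0"/>\n'
        "  </head>\n"
        f"  <docTitle><text>{escape(title)}</text></docTitle>\n"
        "  <navMap>\n"
        + "".join(nav_points)
        + "  </navMap>\n"
        "</ncx>\n"
    )


# Hypothetical entries; the uid must match the identifier used in the OPF file.
toc = build_ncx(
    "Pariah",
    "urn:uuid:replace-with-the-real-book-id",
    [("Cover", "cover.html"), ("Chapter 1", "ch01.html"), ("About the Author", "about.html")],
)
```

Keeping the entries in one list is what makes the rest easy to automate, since the same list can drive the OPF manifest and the Kindle table of contents.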
Things I Changed in the New EPUB Files
- New Covers – I replaced cover images with new, higher resolution versions. The original ebooks had 600 x 900 pixel covers. The new ones will all have 1600 x 2400 pixel covers.
- Updated Style – Better layout of paragraphs, or at least something I think looks nicer.
- Very Minimal Style – My style sheet is very, very small. I didn’t touch fonts and the only thing I did with font sizes was to make chapter header styles larger. So your EPUB reader should give you full control over the font family and text size.
- Table of Contents update – The auto-generated EPUB files from Meat Grinder only included chapters in the Table of Contents. Now all the appropriate entries are there (Intro, Cover, About the Author, etc.).
- Bio Pictures – The author bio pictures were absent in the back of the auto-generated EPUB files.
- About Terran Shift – For titles set in the Terran Shift universe, I added the updated text in back that includes our “Explore the Terran Shift Universe” section.
- Lost Luggage Studios Catalog – I created a new section in the back of these EPUB books that includes a visual catalog of our titles, complete with promotional text and miniature cover art. This is something I’m testing out, and will probably be updated and included in all of our future Kindle and Smashwords titles.
So, What’s It Look Like?
The only title I have finished converting (so far) is my first novel, Pariah. Here is a quick comparison of the layout of the first chapter. On the left is the original auto-generated Smashwords EPUB. On the right is the new EPUB I made:
And here is a look at the catalog section in the back. The original layout was just a bunch of links. Very plain, rather ugly actually. For the new catalog section, I went with a more visual look that mimics the catalog section we include in the backs of our paperbacks:
And that concludes my overview of these changes. It’s something I’ve been meaning to do since Smashwords first announced Direct EPUB uploads. I’ve been plugging away at this for several days now, and swearing up a storm on more than one occasion. It shouldn’t be much of a surprise that once I decided to take the Nuclear Option and write a shell script, everything got smoother. I’m one of those old-school programmers who likes reinventing the wheel. It may not be a perfect wheel, but it’s efficient and I know exactly how it works!
Pariah is converted, and I don’t anticipate any more conversion or testing issues with Smashwords. Over the next few days I will be updating the other titles in our catalog. So keep an eye on titles you’ve purchased in your favorite stores over the next few weeks.