It’s been a while since I’ve written about anything to do with my work, but I thought the way I’ve gone about creating ebooks might make an interesting post. I’ve pondered this problem for a long time, and while it is relatively easy to create an ebook using Calibre, it is difficult to automate the process and get the formatting right. My criteria for this project are:
- Create a reasonably well-formatted ebook from an Author-it output.
- It has to be done automatically from a publishing profile in Author-it with no manual intervention.
- I have to be able to publish from any book within Author-it, i.e. once set up, a book I publish to Word, PDF, or help is also publishable as an eBook.
- I should be able to set the actual eBook format (epub, mobi, etc.) within Author-it.
- There has to be no cost involved (barring my time in setting it up).
Calibre is a free tool that allows you to create eBooks in a variety of formats. Author-it is a component content management system that allows you to store and publish content in various formats but not eBook.
After a bit of experimentation I decided that the best output from Author-it would be a Word document, converted on the fly by a macro to filtered *.htm format. The main reason for this extra step was that when I converted a Word doc directly, all the numbering was screwed up. I could have published straight to HTML, but I found that there were just too many things wrong with the initial output, and while using CSS to correct it may have worked, going via the Word route made sense (at least it did to me).
The macro code I used to convert the Word doc to htm is shown below:
' Build the output path: same folder and base name as the active document, with an .htm extension
Dim FileName As String
FileName = ActiveDocument.Name
FileName = Mid(FileName, 1, InStrRev(FileName, ".") - 1)
FileName = ActiveDocument.Path & "\" & FileName & ".htm"
' Save as filtered HTML, which leaves out most of the Word-specific markup
ActiveDocument.SaveAs2 FileName, FileFormat:= _
wdFormatFilteredHTML, LockComments:=False, Password:="", AddToRecentFiles _
:=True, WritePassword:="", ReadOnlyRecommended:=False, EmbedTrueTypeFonts _
:=False, SaveNativePictureFormat:=False, SaveFormsData:=False
' Switch to web layout view to match the saved format
ActiveWindow.View.Type = wdWebView
However, I had another problem. When I publish normally to Word I run an adjust-table macro that aligns and sets the width of all the tables. This caused a problem in the htm format, as the widths were now set as absolute values, and when converted to eBook the tables would run off the edge of the page. This was easily fixed by disabling the adjust-table macro in the after-publish macro in the Word template. But as I still wanted to run it for normal outputs, I also had to change all my other Publishing Profiles that run the after-publish macro so that they run the adjust-table macro as a separate step. Once I'd done this the output looked much better.
There were still problems. I don't think an eBook needs a mini TOC at the beginning of each chapter, so I removed these with another two macros that strip out the text in the mini TOC Word styles. The code for one is shown here:
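' Find every run of text in the MiniTOCItem style and replace it with nothing
Selection.Find.ClearFormatting
Selection.Find.Style = ActiveDocument.Styles("MiniTOCItem")
Selection.Find.Replacement.ClearFormatting
With Selection.Find
    .Text = ""
    .Replacement.Text = ""
    .Forward = True
    .Wrap = wdFindContinue
    .Format = True
    .MatchCase = False
    .MatchWholeWord = False
    .MatchWildcards = False
    .MatchSoundsLike = False
    .MatchAllWordForms = False
End With
' Assumed final step (not shown in the original excerpt): run the replacement across the whole document
Selection.Find.Execute Replace:=wdReplaceAll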
The next problem I had was running all this automatically. Luckily, you can run Calibre from the command line, and from Author-it you can run batch files from a publishing profile and also pass in arguments. There are many options I could have used with Calibre and I have used only a few here; I suspect that with more time there is a lot more I could do.
It took me a while to work out the syntax for Calibre, but it turned out it was quite simple and I'm just stupid. So the batch file that runs looks something like this:
ebook-convert "C:\Users\davidj\Documents\AITPublishing\ebook\numero interactive Classifier Trainer Guide\numero interactive Classifier Trainer Guide.htm" "C:\Users\davidj\Documents\AITPublishing\ebook\numero interactive Classifier Trainer Guide\numero interactive Classifier Trainer Guide.epub" --enable-heuristics --disable-markup-chapter-headings --page-breaks-before /
But with the arguments that Author-it can pass in, it becomes much simpler:
ebook-convert %1 %2 --enable-heuristics --disable-markup-chapter-headings --page-breaks-before /
- %1 = "<SYS_PUBLISH_FOLDER>\<ni> <Guides>.htm"
- %2 = "<SYS_PUBLISH_FOLDER>\<ni> <Guides>.<epubextension>"
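A slightly more defensive version of that batch file might look like the sketch below. It assumes, as above, that %1 is the source .htm file and %2 the target ebook file. The argument checks and messages are illustrative additions, not something Author-it or Calibre requires, and I have not tested it against a real publishing profile.

```bat
@echo off
rem Sketch only: %1 = source .htm, %2 = target ebook file, both passed in by Author-it.
rem %~1 and %~2 strip any surrounding quotes so the paths can be re-quoted consistently.

rem Stop early if the second argument is missing.
if "%~2"=="" (
    echo Usage: %~nx0 "source.htm" "target.epub"
    exit /b 1
)

rem Stop early if the Word macro did not produce the .htm file.
if not exist "%~1" (
    echo Source file not found: "%~1"
    exit /b 1
)

rem Same Calibre options as the one-line version.
ebook-convert "%~1" "%~2" --enable-heuristics --disable-markup-chapter-headings --page-breaks-before /

rem Pass a failure back to whatever launched the batch file.
if errorlevel 1 (
    echo ebook-convert failed for "%~1"
    exit /b 1
)
```

The conversion line is unchanged; the extra checks only make a failed publish easier to spot.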
If you know Author-it you will know how flexible it can be. With the system variable and my own defined variables I have complete control of what I publish from Author-it, down to the type of ebook I produce. Currently I am only interested in the epub and mobi file formats, but if at some future date I want to create other formats I can just add them to my <epubextension> variable.
The output I'm getting now is OK, or it looks OK in the Calibre ebook reader. A screen shot is shown below.
I need to check it on a few devices to get a better idea of what it looks like. I'm not happy with the main TOC as it has page numbers, as does the index. I can exclude the index easily enough through the publishing profile, but I think the TOC is still needed, so I need to devise a way of either regenerating it through Calibre (which I think you can do) or modifying it in some other way. There are a few minor indentation issues, but I should be able to sort those out relatively easily.
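On regenerating the TOC: ebook-convert has --level1-toc and --level2-toc options that take XPath expressions and build the ebook's navigation TOC from the matching tags. The line below is an untested sketch that assumes chapter titles come through as h1 and section titles as h2 in the filtered HTML, which may not match what Word actually produces. It would not remove the page-numbered TOC from the body of the book, so that would still need handling separately.

```bat
rem Sketch only: build the navigation TOC from heading tags.
rem Assumption: chapters are <h1> and sections are <h2> in the filtered HTML.
ebook-convert %1 %2 --enable-heuristics --disable-markup-chapter-headings --page-breaks-before / --level1-toc "//h:h1" --level2-toc "//h:h2"
```

If the headings turn out to use different tags or classes, only the two XPath expressions should need changing.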
In general I'm satisfied with what I am getting. Now, once I have finished writing or editing a book, I have a new publishing option, available as a single click with no further manual intervention.