Author-it, Calibre, Macros and Batch files

It’s been a while since I’ve written about anything to do with my work, but I thought the way I’ve worked towards creating ebooks might make an interesting post. I’ve pondered this problem for a long time: while it is relatively easy to create an ebook using Calibre, it is difficult to automate the process and get the formatting right. My criteria for this project are:

  • Create a reasonably well-formatted ebook from an Author-it output.
  • It has to be done automatically from a publishing profile in Author-it, with no manual intervention.
  • I have to be able to publish from any book within Author-it, i.e. once it is set up, any book I can publish to Word, PDF, or help can also be published as an eBook.
  • I should be able to set the actual eBook format (epub, mobi, etc.) within Author-it.
  • There has to be no cost involved (barring my time in setting it up).

Calibre is a free tool that allows you to create eBooks in a variety of formats. Author-it is a component content management system that allows you to store content and publish it in various formats, but not as an eBook.

After a bit of experimentation I decided that the best output from Author-it would be a Word document, converted on the fly by a macro to filtered *.htm format. The main reason for this extra step was that when I converted a Word doc directly, all the numbering was screwed up. I could have published straight to HTML, but I found there were just too many things wrong with the initial output, and while using CSS to correct it may have worked, going via the Word route made sense (at least it did to me).

The macro code I used to convert the Word doc to htm is shown below:

Sub SaveAsHTM()
    ' Saves the active document as filtered HTML (.htm) in the same folder.
    Dim FileName As String
    Dim DotPos As Long

    ' Strip the extension from the document name, if it has one.
    FileName = ActiveDocument.Name
    DotPos = InStrRev(FileName, ".")
    If DotPos > 1 Then FileName = Left(FileName, DotPos - 1)
    FileName = ActiveDocument.Path & "\" & FileName & ".htm"

    ' Filtered HTML leaves out most of the Word-specific markup.
    ActiveDocument.SaveAs2 FileName, FileFormat:=wdFormatFilteredHTML, _
        LockComments:=False, Password:="", AddToRecentFiles:=True, _
        WritePassword:="", ReadOnlyRecommended:=False, _
        EmbedTrueTypeFonts:=False, SaveNativePictureFormat:=False, _
        SaveFormsData:=False, SaveAsAOCELetter:=False, CompatibilityMode:=0
    ActiveWindow.View.Type = wdWebView
End Sub

However, I had another problem. When I publish normally to Word I run an adjust-table macro that aligns and sets the width of all the tables. This caused a problem in the htm format, as the widths were now set as absolute values, and when converted to eBook the tables would run off the edge of the page. This was easily fixed by removing the adjust-table macro from the after-publish macro in the Word template. However, as I still wanted to run it for normal outputs, I also had to change all my other publishing profiles that run the after-publish macro so that they run the adjust-table macro on its own. Once I'd done this the output looked much better.
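An alternative to disabling the macro would be to make the table widths relative rather than absolute. The following is only a sketch of that idea, not something tested against Author-it output: the macro name is made up, it only touches top-level tables, and it is an assumption that Word carries the percentage width through to the filtered HTML. Individual column widths may still be written as fixed values.

```vba
' Hypothetical macro: switch every top-level table to a relative width.
Sub SetTablesToPercentWidth()
    Dim tbl As Table

    For Each tbl In ActiveDocument.Tables
        ' Express the preferred width as a percentage of the available
        ' width instead of a fixed measurement in points.
        tbl.PreferredWidthType = wdPreferredWidthPercent
        tbl.PreferredWidth = 100
    Next tbl
End Sub
```

If this worked, it would be run only from the ebook publishing profile, leaving the normal Word output untouched.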

There were still problems. I don't think an eBook needs a mini TOC at the beginning of each chapter, so I removed these with another two macros, each of which deletes the text formatted with one of the Word styles used for the mini TOC. The code for one is shown here:

Sub DeleteMiniTOCItem()
    ' Deletes all text formatted with the MiniTOCItem style by
    ' replacing it with nothing.
    Selection.Find.ClearFormatting
    Selection.Find.Style = ActiveDocument.Styles("MiniTOCItem")
    Selection.Find.Replacement.ClearFormatting

    With Selection.Find
        .Text = ""                 ' match on the style only, not on any text
        .Replacement.Text = ""     ' replace with nothing, i.e. delete
        .Forward = True
        .Wrap = wdFindContinue     ' search the whole document
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = False
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With

    Selection.Find.Execute Replace:=wdReplaceAll
End Sub
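Since the two macros differ only in the style name, they could probably be folded into one routine. The version below is a sketch under a few assumptions: it searches a Range rather than the Selection, it skips a style that does not exist instead of raising an error, and the second style name, "MiniTOCTitle", is a made-up placeholder for whatever the other mini TOC style is actually called.

```vba
' Deletes every run of text formatted with the named style.
' Returns False if the style does not exist in the active document.
Function DeleteTextInStyle(ByVal styleName As String) As Boolean
    Dim target As Style
    Dim rng As Range

    ' Look the style up first so a missing style does not stop the macro.
    On Error Resume Next
    Set target = ActiveDocument.Styles(styleName)
    On Error GoTo 0
    If target Is Nothing Then Exit Function   ' result stays False

    Set rng = ActiveDocument.Content
    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Style = styleName
        .Text = ""                 ' match on the style only
        .Replacement.Text = ""     ' replace with nothing, i.e. delete
        .Forward = True
        .Wrap = wdFindStop
        .Format = True
        .Execute Replace:=wdReplaceAll
    End With
    DeleteTextInStyle = True
End Function

Sub DeleteMiniTOCs()
    Dim styleNames As Variant
    Dim i As Long

    ' "MiniTOCTitle" is a hypothetical name; use the real second style.
    styleNames = Array("MiniTOCItem", "MiniTOCTitle")
    For i = LBound(styleNames) To UBound(styleNames)
        If Not DeleteTextInStyle(CStr(styleNames(i))) Then
            Debug.Print "Style not found: " & styleNames(i)
        End If
    Next i
End Sub
```

Working on a Range leaves the cursor position alone, which matters little in an unattended publish but makes the macro easier to test by hand.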

The next problem I had was running all this automatically. Luckily you can run Calibre from the command line, and from Author-it you can run batch files from a publishing profile and also pass in arguments. There are many options I could have used with Calibre and I have used only a few here; I suspect that with more time there is a lot more I could do.

It took me a while to work out the syntax for Calibre, but it turned out it was quite simple and I'm just stupid. The batch file that runs looks something like this:

ebook-convert "C:\Users\davidj\Documents\AITPublishing\ebook\numero interactive Classifier Trainer Guide\numero interactive Classifier Trainer Guide.htm" "C:\Users\davidj\Documents\AITPublishing\ebook\numero interactive Classifier Trainer Guide\numero interactive Classifier Trainer Guide.epub" --enable-heuristics --disable-markup-chapter-headings --page-breaks-before /

But with the arguments that Author-it can pass in, it is much simpler:

ebook-convert %1 %2 --enable-heuristics --disable-markup-chapter-headings --page-breaks-before /

  • %1 = "<SYS_PUBLISH_FOLDER>\<ni> <Guides>.htm"
  • %2 = "<SYS_PUBLISH_FOLDER>\<ni> <Guides>.<epubextension>"

If you know Author-it you will know how flexible it can be. With the system variables and my own defined variables I have complete control of what I publish from Author-it, down to the type of ebook I produce. Currently I am only interested in the epub and mobi file formats, but if at some future date I want to create other formats I can just add them to my <epubextension> variable.
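To show how the two arguments and the extension variable slot into the command, here is the same substitution written out as a small VBA function. It is purely illustrative: the batch file already does this job, and the function names, parameter names and the example path are all made up.

```vba
' Wraps a path in double quotes so that spaces survive the command line.
Function QuotePath(ByVal path As String) As String
    QuotePath = Chr(34) & path & Chr(34)
End Function

' Builds the ebook-convert command for one source file and one target
' format. All names here are illustrative only.
Function BuildConvertCommand(ByVal htmPath As String, _
                             ByVal ebookExtension As String) As String
    Dim dotPos As Long
    Dim outPath As String

    ' Swap the source extension for the requested eBook extension,
    ' ignoring any dots that belong to a folder name.
    dotPos = InStrRev(htmPath, ".")
    If dotPos > InStrRev(htmPath, "\") Then
        outPath = Left(htmPath, dotPos - 1) & "." & ebookExtension
    Else
        outPath = htmPath & "." & ebookExtension
    End If

    BuildConvertCommand = "ebook-convert " & QuotePath(htmPath) & " " & _
        QuotePath(outPath) & _
        " --enable-heuristics --disable-markup-chapter-headings" & _
        " --page-breaks-before /"
End Function
```

For example, Debug.Print BuildConvertCommand("C:\ebook\Guide.htm", "mobi") prints the command with "C:\ebook\Guide.htm" as the input and "C:\ebook\Guide.mobi" as the output, so only the extension changes from one format to the next.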

The output I'm getting now is OK, or at least it looks OK in the Calibre ebook reader. A screenshot is shown below.

Sample ebook

I need to check it on a few devices to get a better idea of what it looks like. I'm not happy with the main TOC, as it has page numbers, as does the index. I can exclude the index easily enough through the publishing profile, but I think the TOC is still needed, so I need to devise a way of either regenerating it through Calibre (which I think you can do) or modifying it in some other way. There are a few minor indentation issues, but I should be able to sort those out relatively easily.

In general I'm satisfied with what I am getting. Now, once I have finished writing or editing a book, I have a new publishing option, available as a single click with no further manual intervention.

This entry was posted in Author-it, Technical Communication.

3 Responses to Author-it, Calibre, Macros and Batch files

  1. Thanks for sharing this, David. Could you pass the publishing profile type in to the macros and turn them on and off accordingly? If you could do that, I think Word can produce a TOC without page numbers.

    • dacj40 says:

      Actually, what I have found works best now is to leave the TOC out entirely. The publishing profile swaps out the TOC template for one that doesn’t publish to Word, and I then create an ebook TOC using Calibre. All I have had to do is add a few more commands to the batch file. I shall share this in another post when I have some time.

  2. Pingback: An Update on ebooks | D A C Jones Blog
