Reference: GutenMark

GutenMark Home Page
Attractively formatting Project Gutenberg texts

What is GutenMark?

GutenMark is a command-line tool for automatically creating high-quality HTML or LaTeX markup from Project Gutenberg etexts. As of April 2008, there is also a graphical front-end called GUItenMark that greatly simplifies usage for casual users. Both Windows and Linux x86 are supported. Mac OS X is also supported, though in some respects it lags the others. Limited iPhone support is also possible.

In combination with other freely available conversion tools, GutenMark aims to convert Project Gutenberg etexts into publication-quality PostScript or PDF for print-on-demand applications. The goal is for this conversion to be completely automatic, without manual markup or editing, but for the foreseeable future some manual intervention will almost always be needed, at least if your standards are as high as mine.

I took the Project Gutenberg plain text file of The Adventures of Sherlock Holmes and ran it through GutenMark.

Amazingly, this:

To Sherlock Holmes she is always THE woman.

was transformed to this:

To Sherlock Holmes she is always the woman.

As it should be!
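
For anyone curious how that kind of cleanup might work, here is a naive sketch in Python. It is not GutenMark's actual algorithm, which the home page excerpt above does not describe. The function name caps_to_italics, the regular expression, and the choice of <i> tags for the output are all my own assumptions, made only for illustration.

```python
import re

# Hypothetical helper, NOT part of GutenMark. It treats a run of
# all-caps words (two or more letters each) as emphasis, lowercases
# the run, and wraps it in <i> tags (an assumed output format).
CAPS_RUN = re.compile(r"\b[A-Z]{2,}(?:\s+[A-Z]{2,})*\b")


def caps_to_italics(line: str) -> str:
    """Replace all-caps emphasis in one line of text with HTML italics."""
    return CAPS_RUN.sub(
        lambda match: "<i>" + match.group(0).lower() + "</i>", line
    )


if __name__ == "__main__":
    print(caps_to_italics("To Sherlock Holmes she is always THE woman."))
```

A rule this simple would also mangle acronyms and all-caps chapter headings, which is one reason a dedicated tool is more useful than a one-line regular expression.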

I was impressed with the available options and did some light testing. It could be a very useful tool for Project Gutenberg etexts that have only a plain text version available.

On the other hand, I also downloaded the Project Gutenberg HTML edition of the same Holmes book, and it was superior to the GutenMark output.

But this tool remains a very painless way of changing those text files into a format that can then go on to further processing to create an eBook.
