Gutenberg's Sub



Johannes Gutenberg and the Gutenberg Project: from expensive, hand-printed books to widely-available, free, electronic texts.

Johannes Gutenberg

Around 1440 Johannes Gutenberg invented his printing system. Before then books were copied by hand and they were rare and expensive. There were no newspapers or magazines. Most people could not read or write. Only an elite had access to important information. It's trite but true: the invention of the printing press really did change the world.

Gutenberg-type press

We think of Gutenberg in terms of his printing press but his invention was really a combination of three things. First, he adapted the screw press for pressing inked letters against paper. He modified the press to allow even pressure on the paper and added a movable table to make changing sheets faster and easier. Second, he developed oil-based inks which worked much better than water-based inks.

Third, he developed a method of efficiently producing movable type (individual pieces could be moved). A mold was used to cast metal-alloy type (letters). The type would then be arranged in forms which were used to print a page. After printing the desired number of copies the form could be broken up and the type reused. The economy and efficiency of using movable type is apparent if we contrast it to carving part or all of a page from a block of wood which, after it wore out, would have to be re-carved.

Eventually the type would show signs of wear; new type could be cast by reusing the molds which allowed for the rapid creation of large quantities of type. Movable type had been used earlier in Korea and China but the thousands of ceramic characters were expensive to produce.

Gutenberg's system is called letterpress: locking movable type in a form, inking it and pressing paper against it to form an impression. This system enabled the production of relatively inexpensive books (compared to hand copied texts) in large quantities.


The new printing technology was widely and quickly adopted. Advances in compounding inks, paper making and book binding followed. The results are well-known and do not require much discussion. The wider diffusion of information stimulated scientific and technological development. Cheaply printed broadsides led to the development of the newspaper. Access to (sometimes revolutionary) ideas contributed to the Reformation and political movements. Cheaper texts, especially the Bible, created a demand for more educational opportunities and increased literacy.

Except for the development of new and better inks and paper, printing technology changed little until the 19th century when several advances resulted in faster printing and lower costs. Hand-power was replaced by steam power and then electricity. The flatbed of type was replaced by a rotary cylinder. Rolls of paper were used instead of single sheets and the ability to print on both sides of the paper at the same time was developed. The Linotype machine (1884) sped up the composition process by producing an entire line of metal type at once (line o' type). Today offset lithography and computer typesetting are the standard technologies for large-volume commercial printing.

Although the mass production of books still requires large printing presses, electronic technology has made single-copy, on-demand printing economically competitive. When a book is ordered, it is printed; pre-printed copies are not stored in warehouses. There are even kiosks which print your book as you wait.

Today, there is so much newsprint around that it is shredded to make bedding for dairy cows; soon it may be as rare as it was some hundreds of years ago. Increasingly newspapers and magazines are turning to on-line publishing to reduce costs. Some have ceased printing on paper entirely. Newspapers used to rush to print extra editions when there was important breaking news but now they can simply update their Web sites as often as necessary.

While large-scale printing was becoming ever faster and cheaper, new technologies improved low-volume and non-commercial printing and copying: the typewriter (1860s), carbon paper (1870s), Mimeograph (1880s), Photostat (1907), Ditto machine (1923), xerographic copier (1949) and Thermofax (1950). The development of dot-matrix, inkjet and laser printers enabled printing from the home and office. Anyone with a computer and printer can do things that early printers probably didn't even dream of: with a few keystrokes we can change typefaces, font sizes and color, insert tables, charts and illustrations, create indexes, check spelling and much more.

We take for granted the availability of relatively inexpensive printed material -- from books printed on large printing presses to newsletters created on personal computers. Although books continue to be printed on paper, publishers are offering more and more titles as ebooks. An individual can now write, edit, "manufacture" and distribute an ebook without using a single sheet of paper. The development of electronic texts1 may be as transformative as the development of hard-copy printing begun by Johannes Gutenberg.

One of the most interesting aspects of electronic texts is the free ebook. Free books!

Gutenberg Project

In 1971, in the days before personal computers, the Internet and the World Wide Web, Michael S. Hart was given access to a mainframe computer; he created a copy of the United States Declaration of Independence and posted it on the network for anyone to download and read.

The development of the ebook has a long history; important contributions were made by many people, including Doug Engelbart, Alan Kay and Andries van Dam in the 1960s. Hart's contribution, however, required no special equipment or software. His document was stored as plain-vanilla, ASCII text which could be read using almost every computer in the world.

Whether he invented the ebook (as is sometimes claimed) is irrelevant. What's important are his ideas of storing "books" in a format that is readable on most computers, placing them on a network (and eventually the Internet) so that they are widely available, and allowing them to be downloaded at no cost.

Hart created Project Gutenberg, a non-profit organization that aims "to make information, books and other materials available to the general public [at no cost] in forms a vast majority of the computers, programs and people can easily read, use, quote, and search."

It does this by creating electronic versions of documents, primarily books in the public domain (works that were never copyrighted, whose copyright has expired or that have been placed in the public domain by the copyright holder).

students with slates that look like e-book readers


Since the founding of Project Gutenberg several open and proprietary ebook formats and ebook readers have been developed.2 Project Gutenberg ebooks are now available in HTML, EPUB, Kindle, Plucker and other formats as well as plain-vanilla ASCII. It also distributes free audio books in multiple formats.

Gutenberg ebooks are readers' versions; they are not authoritative scholarly texts nor are they necessarily exact reproductions of particular editions. Documents are scanned and software is used to convert the images to text files. A distributed group of volunteer proofreaders, editors and readers then makes corrections to these files. This results in relatively error-free reading texts; Project Gutenberg claims they are 99.5% accurate. The resulting edited files are then converted to several different formats (see below) and made available for downloading. They can also be read on-line. Most are in English but increasingly ebooks in other languages are being created.

In addition to individual books, disk images of collections of 600 to 29,000 books, suitable for burning to CD or DVD, can be downloaded. The collections are also available on disk at no cost.


Johannes Gutenberg to the Gutenberg Project: from hand-printed, expensive books with limited distribution to relatively-inexpensive, mass-produced books available primarily through book stores and libraries to free ebooks available to anyone with an Internet connection.

Repositories and Catalogs

For a time Project Gutenberg was the only source of free, edited ebooks. Today, it has the largest single collection that can be downloaded at no cost. There are also many other sources of free ebooks, edited and unedited. Some specialize, for example, in classical literature and others in detective fiction. Two large collections are Internet Archive and Google Library Project.

Internet Archive

Internet Archive was founded by Brewster Kahle in 1996. At first it archived only World Wide Web pages (now available via the "Wayback Machine"). Currently, its archives include other digitized materials such as audio, photographs, moving images, text and ebook items. It also hosts several other projects, including the library catalog Open Library, specialized services for the "print-disabled" and a large book digitization program.

Internet Archive claims to provide access to "over 5 million books and items from over 1,500-curated collections ... [which] come from 900 content providers."3 The digital texts can be read on-line or downloaded in several formats. Anyone can upload and download material free of charge. Some ebooks are edited but many are not.

Google Library Project

In 2002 Google began a “secret books project,” originally called Google Book Search and now called Google Library Project, which scans the books and periodicals of major libraries. Software is used to convert the scanned works to text which is stored in a database that can be searched on-line and to provide downloadable versions of the works in several formats. The scanned images are available in PDF format. About 30 million documents have been scanned and most of the public-domain works are available for free downloading. They are unedited.

Google's scanning project benefits many people. However, unlike Project Gutenberg and Internet Archive, Google is a for-profit corporation and it digitizes copyrighted as well as public-domain works.

In 2005 groups representing authors and publishers sued Google arguing that making copies of entire books, which can be stored forever and easily reproduced at any time, violates the fair-use provisions of the copyright law. Google claims that it protects "copyright holders by making sure that when users find a book under copyright, they see only a card catalog-style entry providing basic information about the book and no more than two or three sentences of text surrounding the search term to help them determine whether they've found what they're looking for." Google also says that copyright holders can easily exclude their works from scanning, but should holders of copyrights have to take any action to prevent their works from being copied in whole?

Especially troublesome are orphan works -- works still in copyright but whose authors cannot be contacted; thus, the authors are not represented and cannot complain about anything Google does. Further, there is the problem of compensation when Google sells a digital edition of a copyrighted work.

After some years a settlement was reached, but in 2011 a Federal judge rejected the agreement based on copyright and antitrust concerns. The United States Department of Justice, the European Commission, the Open Book Alliance and others have all voiced objections. The legal fight continues.



Two major on-line projects, WorldCat and the Open Library, help to locate free ebooks although their catalogs are not limited to ebooks.

WorldCat

The WorldCat Web site is run by the Online Computer Library Center, a worldwide library cooperative. One of its several aims is to improve access via the World Wide Web to the resources held in its member libraries. It claims to be the world's largest library catalog. The Web site enables you to search library collections from around the world. Depending upon the type of media and its location you may be able to read or view it on-line, download it or even check it out of a library.

Open Library

The Open Library is a project of the Internet Archive. Its goal is to provide a web page for every book ever published (including all the editions). It claims to hold over 20 million catalog records of books. For each book there is publishing information (similar to a library catalog entry) and links for reading the book on-line, downloading the book (in various formats), borrowing the book from a library and purchasing the book. Which links are available depends upon the book.

Problems

The free ebook is an amazing development. At no cost you can download (in various formats) works as diverse as Daniel Defoe's Journal of the Plague Year (1722), Samuel Johnson's A Dictionary of the English Language (1755), Charlotte Brontë's Shirley (1849), Lewis Carroll's Alice's Adventures in Wonderland with the John Tenniel illustrations (1865), Isabella Beeton's Household Management (1868), Anthony Trollope's The Way We Live Now (1875), Olive Schreiner's The Story of an African Farm (1883), Mrs. Humphry Ward's Robert Elsmere (1888), The Yellow Book (1894 to 1897), H.G. Wells' Kipps (1905), Abraham Cahan's The Rise of David Levinsky (1917), Agatha Christie's The Secret Adversary (1922), E. E. Cummings' The Enormous Room (1922), and Zora Neale Hurston's De Turkey and De Law (1930).

In addition to ebook versions of older books, many free ebook versions of new works, both fiction and non-fiction (including textbooks), are available.

However, there are three significant problems related to free ebooks: formats, copyrights and cost.

Formats


An ebook may be free but unreadable.

Ebooks are available in a number of formats, both open and proprietary, such as MOBI, AZW, PRC, AAX, EPUB, PDB, ASCII, PDF and HTML. Ebook readers may be able to handle from one to several formats, but not all. There are a number of applications for converting from one format to another.

The Portable Document Format (PDF) was developed in the early 1990s by Adobe Systems as a means of representing documents so that they can be used on different kinds of computers. Each PDF file contains a description of a document, including text, fonts, illustrations and other information. A PDF file may contain only scanned images of a document's pages or only the text of the document or both.

man speaking angrily to woman


Many free ebooks in PDF format contain only page images (except for some brief introductory text). When reading these documents, you are looking at photographs of the document's pages. This is especially useful if the document contains illustrations, photographs, maps, marginalia and so on. Also, it enables you to read an old book with its original typefaces and layout.

There are two major problems with PDF files that contain only scanned images. First, you cannot search the document which may or may not be important to you. Second, ebook readers do not display these documents well; you have to do a lot of zooming and shifting of each page in order to read it.


Frequently, you can download the same ebook in formats other than PDF. But this is also problematic.

The process by which scanned pages are converted to text by software can introduce some or many errors, depending upon a number of factors -- the quality of the document's pages, the scanner and operator, the resulting image and the software used for conversion. A PDF ebook may contain exact copies of the pages of a document, while the other formats may be difficult or even impossible to read.

This is why edited documents from Project Gutenberg and other sources are so valuable. After being converted from scanned images the documents are edited and proofread by humans. The number of errors in the final documents is very small. Of course, it takes much more time to prepare these ebooks and the costs are greater unless done by volunteers; therefore, there are far fewer human-edited ebooks than PDF versions.4
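One rough way to see how far a software-converted text has drifted from a proofread one is to compare the two character by character. The sketch below uses Python's standard difflib module. The function name and the sample strings are illustrative assumptions, and this is not a description of how Project Gutenberg measures accuracy.

```python
import difflib


def text_similarity(converted_text: str, proofread_text: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two texts.

    1.0 means the texts are identical; lower values mean more differences.
    autojunk=False keeps difflib from ignoring very common characters
    in long texts.
    """
    matcher = difflib.SequenceMatcher(
        None, converted_text, proofread_text, autojunk=False
    )
    return matcher.ratio()


# Hypothetical sample: a conversion error turns "the" into "tbe".
converted = "tbe cat"
proofread = "the cat"
print(round(text_similarity(converted, proofread), 3))
```

A ratio close to 1.0 suggests the converted text needs little correction; a much lower ratio suggests a text that would be tiresome to read and unreliable to search.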

Of course, PDFs are also created by humans. A person creates a text file of a document (for example, "types in" William Thackeray's Vanity Fair) and uses software to convert the file to PDF and other formats. Because they are edited before conversion, they also contain few errors.

example of PDF text



At left is a section of The "Breakfast-Table" Series by Oliver Wendell Holmes (London: George Routledge and Sons, 1882) in PDF format.

The page images enable us to see the book with its original layout and typeface. The size of the pages could be increased for easier reading.

This version is not searchable.


example of PDF converted to MOBI format



At left is the same section of the PDF file converted by software to MOBI format.

The number of errors can clearly be seen.

Not all of the book is this bad, but it's representative of the bad sections, which are numerous.

This version is searchable but the results would not be reliable.


example of human-edited Project Gutenberg MOBI file



At left is the Project Gutenberg version in MOBI format after editing by humans.

This version is searchable and the results would be reliable.



The PDF version of the book above can be read using a PDF-reading program on a tablet, laptop or desktop computer. It would be a chore (if even possible) to read it using an ebook reader. The MOBI versions can also be read on tablets, laptops and desktop computers and on most if not all ebook readers. The human-edited version is the obvious choice.

However, there are many books that are available only in page-image, error-free PDF or converted, error-ridden formats. It's useful to have both an ebook reader and a ten-inch or larger tablet computer. Depending upon the document you want to read you choose the appropriate ebook format and device for reading it.

Formatting is not a problem with new free ebooks because they are not created by converting page images.


Copyrights


Although millions of ebooks are available at no cost, they are limited to older works whose copyrights have expired (which varies from country to country) and those that the author has placed in the public domain or otherwise made available. Copyright laws are complex. For example, in the United States works published before 1923 and most government documents are in the public domain; those published between 1923 and 1978 are copyrighted for ninety-five years; and those published after 1978 are copyrighted for the life of the author plus seventy years. There are a number of exceptions and special cases. Unless copyright laws are changed (the historical trend has been to extend not shorten the length of copyrights), only books published many decades ago will be available as ebooks at no cost. Although many public libraries loan ebook versions of recent works, the selection is quite limited.
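The simplified United States rules described above can be written out as a small calculation. This is only a sketch of the rules as this paragraph states them: it ignores the exceptions and special cases, the function name is made up for illustration, and it assumes that a work published in 1978 itself falls under the life-plus-seventy rule.

```python
from typing import Optional


def us_copyright_expiry_year(
    publication_year: int, author_death_year: Optional[int] = None
) -> Optional[int]:
    """Estimate the year a U.S. copyright runs out under the simplified rules.

    Returns None when the work is already in the public domain
    (published before 1923 under the rules stated in the text).
    """
    if publication_year < 1923:
        return None  # already in the public domain
    if publication_year < 1978:
        return publication_year + 95  # ninety-five years from publication
    # Assumption: 1978 and later use the author's life plus seventy years.
    if author_death_year is None:
        raise ValueError("author_death_year is needed for works from 1978 on")
    return author_death_year + 70


print(us_copyright_expiry_year(1922))        # None: public domain
print(us_copyright_expiry_year(1925))        # 2020
print(us_copyright_expiry_year(1980, 2000))  # 2070
```

Even this rough calculation shows why only books published many decades ago are available as free ebooks.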

In general copyrights are not a problem with new free ebooks. Some are placed in the public domain. With others the authors retain the copyright but release them under a license that allows for their use at no cost, with or without restrictions.

Cost


Potentially, the depositories offer direct access to millions of books by anyone at any time from anywhere. However, although the ebooks may be free, an Internet connection, a device to connect to the Internet and a device on which to read them are not.

Books have always been expensive. Even mass-produced, cheap editions are too expensive for many people. Since the development of the free library even the poor have had access to books, periodicals and other materials. But a library is not always convenient either because of its location or hours. Sometimes you can purchase a used book on the Internet and have it delivered to your home for less than it costs to travel twice to a library to check out and return it.

The online depositories cannot and do not aim to replace libraries, but they are always open and accessible from within your home. Further, they provide free and convenient access to documents that would be unavailable to most people: for example, old books and periodicals that can be inspected only by traveling to a specific (maybe distant) library; out-of-print books that can be obtained (maybe) only through inter-library loans; multiple editions of the same work; different translations of the same work as well as the original-language version.

If you already own a computer, smart phone or other device and pay for a connection to the Internet to perform other tasks then free ebooks are actually free. But what if you don't? New works such as CK-12 Foundation's geometry textbook and old works such as Charles Dickens' Great Expectations are of little use if you have no way to download and read them.

If you have a portable device you might go to a place such as an airport or cafe that offers a free Internet connection, but travel costs may exceed the price of a used book. Many public libraries and schools provide both computers and access to the Internet, but there may be restrictions on downloading (if you are not going to read the work while online).

Even if someone obtains the ebook for you, you still need a device for reading it. Although ebook readers are not expensive (equivalent to the price of three or four new hardbound books), they are beyond the means of many people. Some public libraries loan ebook readers.

If you are poor -- a term that describes much of the population of the earth -- free ebooks may not be free. This is not to diminish the significance of the development of the free ebook, but to put it in perspective.

The Future

Free ebooks are only one aspect of the growth of Internet access to all sorts of information: Internet Movie Database, Wikipedia, WikiLeaks, YouTube. It's unlikely that the changes that result from the Gutenberg-Project system of creating and distributing etexts will be as significant as those that followed the development of the Gutenberg system of printing.

Still, the development of the ebook, especially the free ebook, will certainly affect the publishing industry and our reading practices.

Will schools continue to purchase printed versions of classics when free electronic versions are available? Will publishers have to add additional material in order to make them attractive? Will authoritative editions even make sense when readers can obtain PDF versions of original editions? How will this affect publishers' back lists? Will ebook readers eventually cost so little to manufacture that they will sometimes be given away as are once-expensive electronic calculators and clocks? Or will they disappear and be replaced by multi-function devices like smart phones? Given the small costs of storage and delivery will copyrighted ebooks become cheaper and cheaper?

Surely there will be some interesting economic, social, even political, consequences of (eventually) enabling (almost) everyone (almost) everywhere to access at no cost (almost) all non-copyrighted texts?

Will the ebook replace the book? Will the book even survive? These are silly questions. The ebook is not a substitute for the printed book nor vice versa. They are simply different things; they are similar but not equivalent to each other.

An ebook reader is more expensive than a printed book but it can hold many ebooks (many of which can be obtained at no cost). Searching for particular words and phrases is much easier as is looking up definitions with a built-in dictionary. It's a simple task to change the font for easier reading. But an ebook is useless without an ebook reader -- an electronic device which can range from a relatively large, multi-purpose, desktop computer to a much smaller, single-function reader. And an ebook reading device is useless without a power source.

On the other hand a book doesn't have to be charged. It's easier to annotate and to find some things such as headings, chapter beginnings, illustrations and even particular (unbookmarked) pages. Although both are subject to water damage, ebook readers are more fragile: a book won't cease to function due to some electro-mechanical or electronic problem.

For some readers the choice between the two simply comes down to a preference for the appearance, the texture, even the odor of one or the other. The technologies will coexist. Ebooks are here to stay but printed books are not going away.

Gutenberg's Sub

While playing Gutenberg's Sub, after the first move you substitute one new tile for a tile already on the board when forming a new word.

You must substitute one and only one tile and you can add additional tiles.


game board

You can create a new word by substituting just one new tile for a tile already on the board.


game board

Or you can substitute one tile and add new tiles.


For each correctly-spelled word you receive the sum of the letter-points of all the letters of the new word (and cross-words) and 1 point for each letter of the new word (and cross-words). Because you get one point for each letter of the new word (and cross-words) in addition to the letter-points there is an incentive to play longer words.

When you use all seven tiles in your tray to form a word (sometimes called a bingo), the total points for the play are doubled.
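The scoring rules above can be worked through as a short calculation. This is a minimal sketch, not the game's actual code: the function name is hypothetical, and the letter-point values in the example are invented because this page does not list them.

```python
def score_play(words, letter_points, tiles_used_from_tray):
    """Score one play under the rules described above.

    words: the new word plus any cross-words formed by the play.
    letter_points: mapping of letter to letter-points (assumed values).
    tiles_used_from_tray: how many tiles from the tray were played.
    """
    total = 0
    for word in words:
        for letter in word.upper():
            # Letter-points plus 1 point for each letter of the word.
            total += letter_points[letter] + 1
    if tiles_used_from_tray == 7:
        total *= 2  # using all seven tiles doubles the play
    return total


# Hypothetical letter values, for illustration only.
points = {"C": 3, "A": 1, "T": 1}
print(score_play(["CAT"], points, 1))        # (3+1) + (1+1) + (1+1) = 8
print(score_play(["CAT", "AT"], points, 1))  # 8 + 4 = 12
```

Because every letter earns its extra point in each word it belongs to, a five-letter word collects five bonus points where a two-letter word collects only two, which is the incentive to play longer words.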



See also

Quick Intro to Gutenberg's Sub

Stacks and In Praise of Libraries

Notes

1Although electronic texts are sometimes referred to as "virtual," they do have a physical presence as magnetic encoding on a storage device. However, unlike physical books you can't pick them up or even see them.

2For example, Rocket eBook around 1998, Sony Librie in 2004, Sony Reader in 2006, Amazon Kindle in 2007.

3Between late 2006 and May 2008 Microsoft ran a project similar to Google's Library Project that scanned 750,000 books. The project was less controversial than Google's because only works in the public domain or copyrighted works for which permission had been granted were scanned. After the project was abandoned many of the books were made freely available on the Internet Archive.

4Converting texts into etexts is inexpensive if done by computer software or human volunteers. The costs for storing, presenting and replicating them are small compared to the costs of storing and printing physical books. Small but not zero. Some repositories depend upon advertising to generate income. Others, including Internet Archive and Project Gutenberg, eschew commercial advertising and are funded by endowments, grants, and donations. Free ebooks and ad-free Web pages -- a concept worth supporting. For more about our view of advertising see Commercial Advertising.