A new settlement has been reached in the 7-year-old Google Books (now known as Google Library Project) controversy, albeit a vague and probably minor one. This battle and its related lawsuits could go on for a generation and still not be resolved.
First - a quick recap. Google began a process of scanning millions of old books back in 2004. The idea was to create an enormous digital library, eventually including most books ever published. Anyone, in effect, would be able to access most of the older books held in virtually every library on earth from his/her own home. A research tool such as this would have been unimaginable a decade or two earlier.
However, books often come with copyrights. Only books published before 1923 are clearly out of copyright. Later ones may or may not be; more recent ones definitely are not. Google did not scan more recent books, but did scan many in that in-between, post-1923 area. Authors and publishers cried foul. It is a violation of copyright law to republish others' books online without their permission. They sued. After much wrangling, Google reached a settlement with author and publisher groups, one that allowed it to continue copying and to make a small “snippet” of text available for works still under copyright without the author's or publisher's approval. The settlement provided that if consumers wanted to see more than a “snippet,” they would have to pay. The copyright holders would receive 63% of the proceeds, Google the rest.
There was one problem with this settlement. It settled the dispute between Google and the largest author and publisher groups, but what Google was doing, copying books without copyright holders' permission in advance, was still considered to be a violation of copyright law. Authors not aligned with the Authors Guild, and various others, including the U.S. Department of Justice, sued to block the settlement.
Now, you may wonder why Google didn't just change its procedure to get the copyright holders' permission in advance. Well, just try to do that for the author of an obscure book published 80 years ago. That person undoubtedly died long ago. Who and where are the heirs whose permission must be sought? That is often impossible to determine. Seeking permission in advance effectively means that many old books can never be made available online, even if the author or heirs couldn't care less. They were never going to see another dime from these old books anyway. There are no winners from being a stickler here, just losers – the public.
To make a distressingly long story slightly shorter, the federal court struck down Google's settlement with the authors and publishers, whereupon the latter two immediately sued Google all over again. What a mess. Now we find that Google has again settled with the Association of American Publishers. How are they avoiding the problem with the last settlement? They haven't told us all of what's in this settlement. However, what we do know is that it applies to only five specific members of the AAP, who have granted Google permission in advance. That takes care of the permission problem. Another of the objections to the previous 63% agreement was the fear that it might lock Google's competitors out of the deal, since it granted only Google the right to offer “orphan books” (those whose copyright holders cannot be found). This settlement does not speak for the “orphans,” or anyone other than the named publishers. Indeed, it says, “Apart from the settlement, US publishers can continue to make individual agreements with Google for use of their other digitally-scanned works.” Of course, this assumes the publisher holds the copyright, not the author, as no authors or author groups are part of this settlement.
One nice thing about this agreement is that it allows Google to publish 20% of the text, rather than just a “snippet.” It can be hard to tell from a single sentence whether a book has enough relevant information to justify a purchase. As to what the financial arrangements are, that is an unknown. The AAP statement concludes with, “Further terms of the agreement are confidential.”
There are a few alternative attempts to make digitized old books available to the public besides Google Library. The Internet Archive and HathiTrust are notable examples, though each specializes in clearly out-of-copyright books and has far fewer titles scanned. The Digital Public Library of America, formed at Harvard in 2010, seeks to be a larger force, but it has not yet begun to make books available. Its organizers hope to combine others' resources with their own to form a large database. They intend to start with the easy ones – out-of-copyright books. The next group would include the so-called “orphan books.” Still, they don't have a clear path around the legal hurdles, other than hoping that Congress will adjust the copyright law on behalf of a nonprofit organization. Perhaps, but I doubt that Congress will respond more favorably to a nonprofit. Congress seems more prone to respond to bribes... let me rephrase that... Congress is more prone to respond to campaign contributions, and for-profit companies seem more likely to be able to grease those wheels.
Editor's Note: Andi Sporkin of the AAP has provided the following corrections to this article:
As the person who handled the media relations on the Google Books settlement, I wanted to advise you that several facts Mr. Stillman makes in his piece on the subject are incorrect. Among them: the settlement covers all 300+ AAP publisher members as well as publishers who belong to a number of designated US trade associations. Also, it is a private settlement between parties rather than a settlement of the lawsuit, so its terms would not be released publicly.