Wednesday, June 22, 2011

The Order of Battle: First Skirmish Line

BNF - Paris

This takes us to the actual battle line on scanning library books, which runs like this: Forget about it; it can't be done; it is illegal because of copyright. It is possible to have some fun with this argument.

By definition, legal arguments are self-serving interpretations of the law to further the ends of plaintiff and defendant, as the case may be. The only thing that makes this workable is that one side will win and the other will crawl under the porch and lick its wounds. Motivation to play the game with virtuosity is high. We accept that, although the wrong ones often win.

However, as a researcher who grew up in libraries and has worked in them (and in computer centers) for several decades, I find it possible to achieve a perspective on electronic library books based on the common good as represented by electronic libraries. It is really an unashamed faith more than anything else, a faith in technologies, in the importance of libraries and in the electronic representation of sources, and even a faith that the law will eventually find its way down this path. Or will be pushed, pulled and kicked down this path.

It is possible to know with great certainty what is good for libraries and how technologies have contributed in the past and how they will continue to contribute in the future - Robert Darnton and the "we only work with vellum" crowd notwithstanding. [Note: there are influential voices who argue that "real" research can only be done with the actual physical artifacts. While this may be true for a small elite of scholarly superstars, the future lies in an opening of sources to a much wider group through electronic representation. The work is well underway. See: Batke, "Google Book Search and its Critics" 2010.]

From that perspective, the legal arguments seem thin, the judgement of the court seems arbitrary and the justification of the judge for the ruling seems contrived and manufactured in haste despite the time it took to produce. I am afraid from this perspective the work of Congress and the courts in the area of copyright appears ramshackle indeed. One can have some sympathy with the legal profession - with the law so completely tied up in illusions, half-truths, denial of fact and capitulation to cartoon figures ironed onto T-shirts, how can coherent arguments about our recent textual heritage be made with integrity and how can judgements worthy of respect be rendered? How can you tell I am not a lawyer?

There is a thorough, professional and respectful discussion of the March 2011 ruling and all the steps leading up to the ruling on Prof. Grimmelmann's (NYU Law) blog "Laboratorium." I have gained insight from his analyses and from the many comments and answers; I have also been humbled by the recognition of my fundamental ignorance of the workings of the law. I will be careful to document passages from his work and from others that have contributed to my evolving understanding.

I have also been surprised by a strong and rising undercurrent of anger welling up when I read the matter-of-factness with which the quite ordinary pronouncements of the court are taken as "that's the way it is." It reminds me of being a kid again, long long ago, and dealing with the absolute rulings of the parental units. It is not really anger, now in my early dotage, but rather a loud and resounding: "You gotta be kidding!" I am not used to the finality of a judgement, even the temporary finality that may be reversed.

I will look past the fact that Prof. G. used to be a programmer for Microsoft. I will also look past the fact that Microsoft funds part of his research. It means that he has first-hand experience with the workings of monopoly. I do not see an obvious conflict playing out in his work. Yet, I am a waif in dangerous waters. Prof. G. speaks with enthusiasm of the electronic library as an achievement we - the technological collective - can be proud of. I am tempted to take a full swing at his piñata, but I fear the cluster bombs inside. Part of my fear comes from what I don't understand, the consequences of legal terms and maneuvers hidden from the layman. The book of legal openings is huge; work outside the book, you lose. That much I do understand - the big yellow signs: CAUTION MINEFIELD.

[Aside: written communication leads one into an imagined world. Our shared definitions of words provide some communality, but the author's view, represented in words, allows only a narrow bandwidth filled with noise. Ideological commonplaces fill in the gaps. These commonplaces can lead to acceptance or rejection. In any case, the author presents images from the imagination that reflect his or her view of the way things are. To understand what is written requires some little willingness. However, I claim no objectivity - you are getting the benefit of the ideas of a researcher who started scanning and indexing texts in the early 1980's and has been fascinated by the intellectual vistas this kind of work provides. The current incursion into the "law" is untypical for me, but it represents a growing conviction that it is necessary to start shouting to be heard above the media commonplaces.]

The Irrelevance of Copyright.

Before the recent work of Google with libraries and before establishment of the fact that it is possible to create a dataset of 12+ million library books, copyright for books was truly irrelevant. Historical treatments of the early history of copyright during the founding of the nation, as well as of the later developments up to the time after WWII, show that copyright was irrelevant for the majority of books from the start and remained so well into modernity.

In the early days, pre-1800, copyright was thought too expensive (published newspaper notices for four weeks etc.) and requests for protection against copying covered only a single-digit percentage of published materials. It almost seems as though the founding fathers or the pioneers for this kind of legislation in England had a profound empathy with text pirates and wrote legislation with a bias against granting easy protection. There was a substantial industry of verbatim copying of freshly published material. Jobs were on the line.

Just as the music industry seems to have written the current law, so it seems the text pirates wrote the original statutes in the early days of the Republic.

Before the knowledge explosion in our time, during our agrarian past on the outskirts of the civilized world, books were a coveted, precious resource for the mind, and I can imagine not everybody in Philadelphia or in Mount Vernon was particular about where their set of Voltaire was printed. In any case, the lawmakers were certain that at some point, in a timely fashion, verbatim copying would be allowed.

Over time, the initial period of protection was doubled from 14 to 28 years (1831), and in 1909 the renewal term, available as before upon application, was lengthened to the same 28 years. Yet the trend of ignoring copyright continued into the 1960's when renewal rates to gain a second 28 years reached around 15%. If you consider the large portion of works that were not copyrighted at all, the 15% renewal may only cover 1% of published material. More on this with citations below.

Today, when everything far and wide is copyrighted, we have become ensnared in a tangled web. Copyright, as a meaningful and widely enforced regulation, has migrated to music, films, graphics, and software. Copyright infringers are infringing in China and Russia and all over the world, and they make no distinction between copyrights and patents or trademarks. Intellectual property grows by the side of the road, as does everything else.

Copyright and families of patents have become the weapon of choice in arcane software wars as computer code is cloned for competing products and many nonsensical things are copyrighted to delay the competition.

[Aside: I have read only superficially in the patent wars and would like to restrict the discussion to library books. I feel that library books have been caught up in the war of litigation about things electronic that has raged for the last three decades. I will plead for their status as innocent and unintended victims.]

Curiously, nobody is pirating library books of the 70's or 80's. If you don't read Rowling or Grisham in Chinese, your textual pirate career will be dismal indeed. Granted, there was a high-profile case where it was decided one could not write sequels to best-sellers like "Gone with the Wind" - that briefly brought some air to the embers. A silly case, really. Generally, those wanting to piggy-back on publishing hits were forced to desist, which proved bad for extreme Harry Potter fans, but for the majority of authors and researchers and for the users of books in libraries, copyright is a non-issue.

Still, while we slept, in the 1990's Congress redid copyright with a vengeance, following the European model; it set the time limit to forever and a day and removed all reporting and renewal requirements. Anything that was fixed in a "medium" would henceforward be copyrighted, as would this draft should I send it to my printer and post it on my refrigerator. Clearly the idea of administering copyrights for millions of snatches of music or graphics seemed daunting in the 90's when this legislation was developed, given the hash the Library of Congress had made of the data up to that time. Records in the music industry were even more chaotic. It seems less daunting today when everything will soon be on-line anyway.

Nonsensical and extra-constitutional as these European (Berne Convention) rules might be, they were essentially irrelevant for out-of-print books, for three reasons: researchers worked in libraries (20 years ago) with physical artifacts under fair use, exempt from copyright; high-volume money makers worth the trouble to protect are academically of marginal interest anyway; and verbatim copies were not really feasible in the 1990's.

Clearly, CTEA (Copyright Term Extension Act, 1998) had nothing to do with books. Books were simply collateral damage. Regrettable, but actually, no harm done. Congress was simply too disorganized and methodologically inept to come up with an appropriate scheme to regulate movies, music, software and T-shirt production when copyright would do just fine.

If one needed a book - two decades ago - one would look it up in Books in Print, later on Amazon or Abe and buy it; failing that, one would look it up on WorldCat or whatever it was called at the time and get it from a library, via Interlibrary Loan, in two weeks. For the average researcher there were no viable opportunities for infringement, as there are none today, unless one wanted to publish a personal letter one had received from Papa Hemingway or from Albert Schweitzer, or if one wanted to use a picture of Mike Tyson's face, since the rights to the black swirls belong to the tattooist, freely derived and copyrighted from Maori designs too old to be protected by copyright. I think his parents probably own the other side of Mike Tyson's face, or perhaps his dermatologist.

That does not mean that there are not authors' associations who use their dues to flex legal muscle. Generally, compared to Napster and DVD ripping, undergraduates Xeroxing books is paltry fare, even when the books are cut and fed to automatic copying and collating devices at Staples. At 10 cents the b/w page plus binding, it is hard to compete with the book store. The Xeroxing phenomenon, which has declined significantly in recent years with the widespread licensing of electronic journals, is not making a serious dent in the publishing industry. Xeroxing library books or textbooks is a small cottage industry on the level of shoplifting. It represents the cost of doing business in a mass market. For out-of-print books it represents defensible fair use practice. Much of Xeroxing also represents bad study or research habits and is a waste of time and money, which may dawn on the Xeroxers when they realize that Xeroxing does not equal reading. Most of what is Xeroxed is eventually discarded and does not find its way into the market. Yet Xeroxing has also been a valuable aid to research, although it should gradually fade from the scene as PDFs become available on tablets and laptops and personal printing becomes optional as well as much less expensive.

It was the work of Google, starting in 2004, that brought the copyright lawyers to the ramparts about books and started what now has become 6 years of litigation, negotiation and attempted settlement. Alas, the Judge blinked; he nixed the fix that would have revived the living dead, and the problem of text trapped in printed ink on paper will continue for a while. Heaven help us if it were to turn out that the messes created by Congress could be fixed simply by highly motivated and concerted action by the geeks in California; the pillars of democratic government would surely sway and crack.

So much for the first of several ventings of outrage at what has been done to the concept of intellectual property, and to a sensible balancing of the rights of authors with the rights of the public, by the Jubilation T. Cornpones on Capitol Hill.

In the next installment I plan to delve into some actual legal arguments (Grimmelmann and Sprigman) and start documenting the sources for the discussion. I like to think of it as heading into the jungle.