The records in a university library catalogue typically have many different origins: created by the library, obtained from a national library or a book supplier etc. So, who ‘owns’ them? And what are the legal implications of making them available to others when this involves copying, transferring them into different formats, etc.?

The JISC has just commissioned a study to explore some of these issues as they apply to UK university libraries and to provide practical guidance to library managers who may be interested in making their catalogue records available in new ways. Outcomes are expected by the end of 2009.

The proceedings jumped the line to farce when Fritz Attaway and a colleague from the MPAA pulled out a cinematic demonstration of just how to camcord a movie from your television screen. (You start with a $900 HD video camera, a tripod, a flat-screen television, and a room that can be completely darkened.) Tim Vollmer captured the whole scene on a video of his own. Mind you, this is the same industry that has lobbied to make a crime of camcording in movie theaters, telling us how to frame shots properly from the television. (As Fred Benenson notes, they’re also demonstrating DRM’s impossibility of closing the “analog hole.”)

In the short run, the Google Book Search settlement will unquestionably bring about greater access to books collected by major research libraries over the years. But it is very worrisome that this agreement, which was negotiated in secret by Google and a few lawyers working for the Authors Guild and AAP (who will, by the way, get up to $45.5 million in fees for their work on the settlement—more than all of the authors combined!), will create two complementary monopolies with exclusive rights over a research corpus of this magnitude. Monopolies are prone to engage in many abuses.

The Book Search agreement is not really a settlement of a dispute over whether scanning books to index them is fair use. It is a major restructuring of the book industry’s future without meaningful government oversight. The market for digitized orphan books could be competitive, but will not be if this settlement is approved as is.

Join the Internet Blackout – Protest Against Guilt Upon Accusation Laws in NZ — Creative Freedom Foundation

Update: The NZ government have suspended the introduction of Section 92a (via MiramarMike)

An interesting campaign to ‘black out’ your online presence, in protest at one of the clauses in NZ’s copyright law, started today: Protest Against Guilt Upon Accusation Laws in NZ — Creative Freedom Foundation. I spotted this a few days ago thanks to Mike Brown, who tweeted about it.

What’s interesting about the law is that it changes the presumption of innocence quite significantly. Currently, in most Copyright jurisdictions, if someone is infringing your copyright then the first thing you’d do (after asking them politely to stop) is take out an injunction against them. This involves persuading the court that you have enough of a case that the (alleged) infringer should be told to stop until the case is heard. The bar for getting an injunction is, then, quite high.

What Section 92A of the Copyright Amendment Act does is compel ISPs (and there is a broad definition of that term in the law) to take down sites or revoke internet access when an accusation of infringement is made. The clause looks like this:

Internet service provider liability

92A Internet service provider must have policy for terminating accounts of repeat infringers
  • (1) An Internet service provider must adopt and reasonably implement a policy that provides for termination, in appropriate circumstances, of the account with that Internet service provider of a repeat infringer.
  • (2) In subsection (1), repeat infringer means a person who repeatedly infringes the copyright in a work by using 1 or more of the Internet services of the Internet service provider to do a restricted act without the consent of the copyright owner.

The potential downsides of a law like this are many, but one of the biggest is the impact it is likely to have on fair use. Fair use is not explicitly defined and is clarified by case law, though some common examples are often postulated: parody, criticism and illustration are often quoted. There are also tests around the commercial impact of the use.

What this means is that Copyright is not absolute; it’s a negotiation between creators and the state to strike a balance that is most effective for the country’s cultural and economic prosperity. This clause changes that for internet-based uses by preventing that negotiation, and also by making people more fearful by increasing the immediate penalty for an accusation of infringement from very little to the loss of internet service. That could be enough to close many small businesses.

The reasoning behind the bill is one of practicality. Those with large catalogs of Copyright works, such as the music labels, are having a really tough time preventing copying on the internet (because the internet is one big copying machine). The reason is that the current laws make pursuing people difficult and expensive, as the RIAA have found out in the States. The solution in Section 92A, though, may be a little heavy-handed. ISPs are likely to comply with the law, and the cheapest thing for them to do is simply take down anything they’re asked to. ISPs are a commodity; they don’t have big profit margins to use up helping you keep your content online.

Labour MP Judith Tizard is quoted as saying:

It is easier for ISPs, Internet Service Providers, to cut off anyone who might be breaking the law.

Now, this seems to be a more and more common perception: that it would be too much trouble to ask a copyright holder to file suit, and that ISPs look perfectly placed to handle issues. What that misses, though, is that ISPs are not at all equipped to perform any kind of arbitration, so with an individual customer on one side and a large, wealthy corporation’s lawyer on the other the ISP will always play it safe.

If this were happening in the US then I wouldn’t even have blogged about it, but it seems odd to me that this is happening at almost exactly the same time, and in the same city, as Webstock, one of the best web conferences in the world.

Wendy Grossman, in The Guardian, covers the difficulties of libraries publishing their catalogue data online.

Despite the internet’s origins as an academic network, when it comes to finding a book, e-commerce rules. Put any book title into your favourite search engine, and the hits will be dominated by commercial sites run by retailers, publishers, even authors. But even with your postcode, you won’t find the nearest library where you can borrow that book. (The exception is Google Books, and even that is limited.)

via Why you can’t find a library book in your search engine | Technology | The Guardian.

I get a namecheck and a quote at the end:

Rob Styles, a programme manager for Talis’s data services, says: “The main reason I think libraries need freedom to innovate is because we don’t know what they’re going to look like”.

Warning, Patent Rant follows.
Sure, everyone knows about high profile patents like Amazon One-Click, but what about the effect of less prominent cases?

A patent is a monopoly over an invention – in order to encourage innovation, patents are granted to inventors so they are assured of an income from the invention. Inventions usually make money either through the sale or licensing of the patent or through production of a product that makes use of the invention. The patent prevents others from simply copying the idea. Unlike Copyright, however, patents cover the idea even if the second person came up with the idea completely independently.

But why am I writing about this now? Well, I’ve come to the end of the line with a child’s clock. Yep, you read that right.

Back in 1991 Julian Renton designed a clock for children. The intention was simple: provide children who were too young to tell the time with a visual indication of whether or not they should be awake. This is a great idea and one that, as a father of three, I whole-heartedly support.

He patented the clock.

All would be fine and dandy if he, or a licensed manufacturer, had gone on to produce a product based on the patent that both worked in practice and was sound value-for-money. Unfortunately that’s not what happened. Without the patent, the idea would have been open for several manufacturers to pick up and produce competing versions. This would have had the usual market benefits of encouraging the development of better products as well as driving cost down. The patent prevents this.

So, we’re left with just Sleep Time Bunny.

Now, as far as Julian’s concerned, the patent system has worked very well. He has a nice little business selling Sleep Time Bunny directly over the internet and through some shops. The usual street price for Sleep Time Bunny is a little under £20. Bear in mind that, other than the patent-protected bunny face, this is the same complexity as a standard alarm clock. It’s also been manufactured with cost very much in mind – you can tell. I would suspect the manufacturing cost doesn’t exceed £2 per clock – and it is possible to buy unbranded alarm clocks that appear to be the same quality for around £4. Julian should be making a healthy profit on the sale of each clock.

But as a consumer, the patent protection has delivered me poor value-for-money and resulted in no consumer choice. The problem for me, as a consumer, is that the idea has been allowed to run as a monopoly, thus requiring no innovation or development to make the product better or cheaper. Take one of the most obvious problems for a children’s alarm clock:

A 2½ year old, the stated lower end of the age range for the clock, will often go to bed around 7pm and be expected to stay in bed until 7am the following morning. This is obviously something that people ask a lot, as it makes it onto Bunny Clock’s FAQ:

Bunny Clock Q. I tried to set Bunny Clock to sleep at 7.00pm to wake at 7.00am but the Bunny won’t stay asleep. Why is this?

Bunny Clock A. The waking time selection is set with the normal alarm set hand. However, the alarm mechanism, as with all alarm clocks, works on a 12 hour cycle and so you are effectively trying to set Bunny Clock to sleep when it wants to wake. If you want Bunny to sleep for 12 or more hours you will need to adjust the wake setting at a later time – possibly just before you go to bed.

Right, so for a very common case the clock simply doesn’t work. Notice the aside there, “as with all alarm clocks”. Not all alarm clocks are marketed for young children, not all alarm clocks are patented. How much product development has gone into solving that problem in the past 17 years? Zero. Because there is no need to solve it. The product has no immediate competitors.
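
To make the FAQ answer concrete, here is a rough Python sketch of why a 12-hour alarm dial can’t hold a 12-hour sleep. It assumes the bunny wakes the first time the hour hand reaches the alarm hand, which is my reading of the FAQ rather than anything documented about the mechanism. The function name hours_asleep is made up for illustration.

```python
def hours_asleep(sleep_hour: int, wake_hour: int) -> tuple[int, int]:
    """Return (intended, actual) hours of sleep, given 24-hour clock inputs.

    Assumption: the alarm trips the first time the hour hand reaches the
    alarm hand, so the dial can only tell times apart modulo 12.
    """
    intended = (wake_hour - sleep_hour) % 24  # what the parent wants
    actual = (wake_hour - sleep_hour) % 12    # what a 12-hour dial can deliver
    return intended, actual


if __name__ == "__main__":
    # 7pm to 7am, 8pm to 7am, 7pm to 8am
    for sleep, wake in [(19, 7), (20, 7), (19, 8)]:
        intended, actual = hours_asleep(sleep, wake)
        print(f"sleep {sleep}:00, wake {wake:02d}:00: wanted {intended}h, got {actual}h")
```

Under that assumption, 7pm to 7am comes out as zero hours of sleep, while 8pm to 7am gives the intended eleven. A second alarm hand, or a 24-hour movement, would be one way round it, though that is speculation on my part.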

What the patent doesn’t prevent is someone designing a product that solves the exact same problem and competes in the exact same space, as long as it doesn’t infringe on any of the claims made for Sleep Time Bunny. And that’s exactly what someone should do. Please someone.

Another paper from LDOW2008 that I worked on with Tom Heath and Paul Miller. The Open Data Commons licensing is about providing clear licensing for data shared on the web. It’s not like Creative Commons because it is for data that doesn’t qualify for Copyright protection, whereas Creative Commons relies on an underlying Copyright ownership.

Open Data Commons, A License for Open Data is predominantly a position paper explaining what’s been happening with Open Data Commons and its predecessor the Talis Community License.

The first steps of the Semantic Web are now a short distance behind us and some organisations are starting to pick up the pace. With more and more data coming online, marked up for linking and sharing in a web of data, perhaps it’s time to look again at the trade-off of different intellectual property rights.

Back in November of 2004 James Boyle published A Natural Experiment in the Financial Times. This piece sees him debating the merits of intellectual property rights over data with Thomas Hazlett and Richard Epstein. His primary thrust is that we should be making policy decisions in this area based on empirical data about the economic benefits one way or another. Something all three protagonists agree on.

Much has changed between 2004 and now, not least our understanding of how the web can affect the way we collaborate, share, communicate; it fundamentally affects the way we live. We chat, we blog, we Twitter, we Flickr and we Joost. Content flows from person to person in unprecedented ways and at unprecedented speeds. This changes the nature of the experiment that Boyle talks about.

If the database right were working, we would expect positive answers to three crucial questions. First, has the European database industry’s rate of growth increased since 1996, while the US database industry has languished? […] Second, are the principal beneficiaries of the database right in Europe producing databases they would not have produced otherwise? […] Third, […] is the right promoting innovation and competition rather than stifling it?

Boyle’s first two questions centre around the creation of databases and his third, by his own admission, is difficult to measure. If one of our primary goals for the growth of the Internet is to have a web of data that can be linked and accessed across the globe we may be better served by assessing how companies might make data open.

Boyle asks for, and discusses, the empirical evidence of databases being created in the EU and US. The differences in numbers should provide insight into the economic ups and downs as the EU adopted a robust database right in 1996 while the US ruled against such protection in 1991. I am interested in how we expect the growth of data on the Semantic Web to differ in the two jurisdictions.

Boyle explains that the US Chamber of Commerce oppose the creation of a database right in the US:

[The US Chamber of Commerce] believe that database providers can adequately protect themselves with contracts, technical means such as passwords, can rely on providing tied services and so on.

And therein lies the rub. Without appropriate protection of intellectual property we have only two extreme positions available: locked down with passwords and other technical means; or wide open and in the public-domain. Polarising the possibilities for data into these two extremes makes opening up an all or nothing decision for the creator of a database.

With only technical and contractual mechanisms for protecting data, creators of databases can only publish them in situations where the technical barriers can be maintained and contractual obligations can be enforced.

We don’t tolerate this with creative works, our photographs, our blog posts and so on. Why would we expect it to make sense for databases? Whether or not it makes sense comes down to whether or not it is beneficial to society. We allow Copyright so that the creator of a work can collect adequate remuneration. We allow patents so that the development costs of an invention can be recovered. Which is database right more like?

Patent is a very broad monopoly. If I had a patent on the clock, a mechanical means of measuring the passing of time, nobody else would be able to make clocks. Copyright, on the other hand, is much narrower, only allowing me to protect the specific design of my clocks. This is where it can get confusing with databases. Database right in the EU is like Copyright. It is a monopoly, but only on that particular aggregation of the data. The underlying facts are still not protected and there is nothing to stop a second entrant from collecting them independently.

Richard Epstein points to this in his contribution:

The question is why do databases fall outside [the general principle of copyright], when the costs of compilation are in many cases substantial for the initial party and trivial for anyone who receives judicial blessing to copy the base? In answering this question, it will not do to say, as the Supreme Court said in the well known decision in Feist Publications v. Rural Telephone Service, (1991) that these compilations are not “original” in the sense that it requires no thought to check the spelling of the entries and to put them all in alphabetical order. But that obvious point should be met with an equally obvious rejoinder. If it requires no thought or intelligence to put the information together, then why not ask the second entrant into the market to go through the same drudge work as the first.

This is exactly what we see happening with OpenStreetMap. Ordnance Survey in the UK have rights over the map data they have collected. The protection covers the collection of geospatial data that they have created; they are not granted a monopoly in geospatial data.

This leaves a special case of databases, those which are created at low cost as a by-product of normal business. Examples used in Boyle’s article are telephone numbers, television schedules and concert times. Boyle gives us the answer directly:

the [European] court ruled that the mere running of a business which generates data does not count as “substantial investment” enough to trigger the database right.

This reminds me strongly of The Smell of Food and the Sound of Coins, a folk tale in which a wise judge decides that a restaurateur may charge for the smell of food wafting from his restaurant; however, the appropriate price is the sound of coins chinking together.

That a database right may not and should not apply in all cases, and that there is a requirement to restrict anti-competitive practices, does not necessarily extend to the conclusion that a right is not required.

It seems to me that much of the debate around intellectual property rights has focussed on how they are used to keep things closed. Having suggested earlier that we have only the abilities to keep databases locked away or in contrast open them completely, I’d like to consider what it might mean to have a database right for keeping things open.

In response to Thomas Hazlett’s contribution Boyle asks:

How many databases are now created and maintained entirely “free” and thus escape commercial directories altogether? There are obviously many, both in the scientific and the consumer realm. One can no more omit these from consideration, than one can omit free software from the software market.

This strikes me as a great comparison to consider. Taking one of the most prevalent free software licenses, the GNU General Public License, what might that look like for data?

One of the primary functions of the GPL is that it enforces Copyleft – the requirement to license derivative, and even complementary, works under the same license. That is, any commercial software that makes use of GPL code must, under the terms of the license, also be released under the GPL. The viral nature of this license is possible only because of the backing of Copyright.

Without a database right communities have no mechanism to publish openly and still insist upon this kind of Share-Alike agreement.

Consider the impact of this for situations where you might use the idea of promiscuous copying to maintain the availability of data. Promiscuous copying relies on two things: lots of copies being made and lots of copies being available. Without the necessary licensing in place there is no mechanism with which to compel those who have copies to make them available. Public Domain means, by definition, no restriction – that means I can lock it away again.

Copyleft is just one position along a spectrum where ‘locked away’ and ‘free as a bird’ sit at each end. What the web shows us is that other business models form crucial parts of the eco-system. Epstein picks up on the controlling aspect of Boyle’s argument:

They can control their list of subscribers; give them each passwords; charge them based on the amount of the information that is used, or some other agreed-upon formula; and require them not to sell or otherwise transfer the information to third parties without the consent of the data base owner.

Imagine if this were true of Copyright material on the web? It has been, and still is on the occasional site. But mostly copyright owners are starting to see the value of publishing content online and they are underpinning the delivery of that content to consumers with other business models. Without Copyright the types of business that could participate would be reduced.

Epstein goes on to say:

The contractual solution is surely preferable, because general publication will allow for use by others that may not offend the copyright law, but which will block the possibility of payment for the costly information that is supplied.

And again, the very heart of the matter. If we are to encourage those who have large databases to make them open, to post them on the Semantic Web, we must provide them with models and solutions that are preferable to technical barriers and restrictive contracts. Allowing them to pick their own position on the spectrum seems to me to be a necessity in that. You can see any form of protection in two lights. When Boyle says

They make inventors disclose their inventions when they might otherwise have kept them secret.

I say

They allow inventors to disclose their inventions when they might otherwise have had to keep them secret.

That’s why we’ve invested in a license to do this, properly, clearly and in a way that stays Open.

Rob Styles is Programme Manager for Data Services at Talis, a UK company building Semantic Web technologies. Rob Styles is not a lawyer.

