http://1984, de-referencing George Orwell

doublethink

Winston sat at his usual table in the Chestnut Tree Cafe. It was unusually busy on this hot, sunny day. The waiter passed his table, then returned with the gin bottle, filling his glass with the clove-infused gin that the cafe was famous for.

The telescreen announced its imminent news of victory with a trumpet fanfare. It talked about further arrests of dissidents, those who had committed crimes against Big Brother. Winston glanced across the room to Big Brother’s kindly smiling face looking down at him from the wall, a poster from floor to ceiling filling the cafe with his benevolent presence.

A commotion outside caused Winston to look out. The Thought Police marching past. They weren’t coming into the cafe today, nobody here of interest he guessed. They marched on.

The Brotherhood, he now knew, was real, but how could he really know how many others like him there were? O’Brien had told him he would only ever meet one or two others. He would do exactly as he was told. Follow orders. Was there really any hope of overthrowing The Party? He couldn’t see how, but that didn’t matter. Any act of rebellion, however small, felt great.

Maps of the war were now scrolling across the telescreen, showing Oceania’s progress against Eurasia. Oceania was at war with Eurasia. Oceania had always been at war with Eurasia.

Winston suddenly became aware of another person beside him. Had his face shown any sign of what he was thinking? Julia sat down to join him. A swift gesture across the face of the telescreen made it go blank, and the narrative in his earpiece stopped. "You’re playing that game again, love?" asked Julia. "Yes, you know how good it is," he replied. Julia looked unimpressed. "It sends me to sleep, stops my brain working," she responded dismissively. "But it’s so clever," Winston defended. "It’s a view of what this world could have become!" His eyes gleaming, Winston pulled a small book out of his bag. The inscription on the title-page ran:

THE THEORY AND PRACTICE OF
OLIGARCHICAL COLLECTIVISM

by Emmanuel Goldstein

Winston began reading:

Chapter I

Ignorance is Strength

Throughout recorded time, and probably since the end of the Neolithic Age, there have been three kinds of people in the world, the High, the Middle, and the Low. They have been subdivided in many ways, they have borne countless different names, and their relative numbers, as well as their attitude towards one another, have varied from age to age: but the essential structure of society has never altered. Even after enormous upheavals and seemingly irrevocable changes, the same pattern has always reasserted itself, just as a gyroscope will always return to equilibrium, however far it is pushed one way or the other.

The aims of these groups are entirely irreconcilable…

Julia’s eyes had glazed over, they always did when he talked to her about the tenets behind Big Brother, the massively multi-player game sweeping the world. He stopped reading but it was several seconds before Julia noticed he had stopped. "It just doesn’t grab me, sorry" she apologised.

The waiter arrived, smiled and exchanged pleasantries. "Sloe today, Julia?" he asked. "Yes, please," she replied. The cafe was famous for its clove gin, but the sloe gin was also excellent. In total they had more than 70 flavoured gins and many other drinks besides. When not drinking the gin Winston would often try one of the ever-changing supply of world beers that flowed through the cafe, carefully savouring each one and keeping notes in his online review diary.

"Oh, but look at this", Julia’s face lit up as she remembered why she had come to find him. It was a heavy lump of glass, curved on one side, flat on the other, making almost a hemisphere. There was a peculiar softness, as of rainwater, in both the colour and the texture of the glass. At the heart of it, magnified by the curved surface, there was a strange, pink, convoluted object that recalled a rose or a sea anemone. "It’s from Mr Charrington’s," Julia went on, "he said I could bring it round to show you. I want to find out all about it."

She handed the glass to Winston, who waved it across the small camera on the telescreen in front of him. The screen lit up as it took the code from the bottom of the glass and matched it with several photos of the object and a description. It was a paperweight, manufactured in Italy sometime in the early 1930s. The strange pink shape was coral, not rare, but beautiful none the less.

The telescreen was changing: new pictures were arriving and more information was appearing alongside the initial description as Goldstein (Winston had named his search agent after one of the Game’s main characters) trawled the semweb for more references. Within minutes he had a near complete history of the paperweight’s manufacture, who had owned it over the years and how it related to other more or less rare collections. It wasn’t really worth anything; semi-antiques like these were plentiful and Mr Charrington had a shop full of them, but it was just the kind of pretty thing that Julia loved to decorate their studio with. Winston smiled at her. "I love you. Go and buy it," he said.

As Julia left the cafe Winston heard the usual weekend commotion of the Thought Police returning. They were out-of-step now, not marching and there was a great deal of whooping and laughter. He sat back, glass in hand, to hear what had happened. In this mood they were bound to invade the cafe. He was right, around a dozen of them bounced up to the bar. Not one of them could have been more than 25 he thought to himself. Winston vaguely remembered a time in his childhood when people weren’t that interested in politics, but not now. Groups of Thought Police, young activists holding politicians and large corporations to account were a common sight. The group laughed and joked, congratulating each other on the day’s work.

Winston pieced together the lively fragments of chatter to conclude they had managed to secure yet another resignation of a corrupt politician. He didn’t catch the name or which of the many parties the poor chap had been part of. What he couldn’t understand was how any politician thought they could do anything but serve their constituents when the Thought Police, like everyone else, could query every vote, business partnership, gift and expense claim.

A broad smile broke across his face – maybe they thought it was all just a game.

Data Portability

Data Portability is a great campaign, starting to gain some momentum, about ensuring the data you put into sites like Facebook and LinkedIn is available for you to move between sites as you choose to move. Some major sites, including Facebook, have agreed to work with the group to develop standards for portable data, but there is still a long way to go.

So, dull bit done – there’s a host of videos going around promoting Data Portability

Here are the best two (IMHO) so far…

Connect, Control, Share, Remix by Michael Pick

and Get Your Data Out! by (friend and colleague) Danny Ayers

Nothing is Miscellaneous

A few months back I read David Weinberger’s Everything is Miscellaneous. After his contribution to The Cluetrain Manifesto back in 1999 my expectations were high. Mine higher than most, perhaps, as the book is dedicated “to the librarians” and I work for a company with several decades of heritage in library systems.

I already knew I liked David’s style of writing from Cluetrain, so I was glad to hear that same voice coming through loud and clear.

By chapter three we’re into library history; Dewey, Ranganathan, Carlyle and Panizzi all get a mention, but Dewey takes the brunt of the attack on geographies of knowledge. It would be easy to take Weinberger’s text as an attack on libraries specifically and the organization of knowledge generally, and many have, but it appears to me these are only used as examples of the limitations we face when organizing anything made of atoms.

And that really forms the nub of this book, that digital media can be organized differently for different purposes at different times – lifting a major limitation of the physical world. I wrote in a similar vein earlier this year when I tried to explain why virtual worlds suck. In that I tried to show the difference between approaches to search in Second Life and approaches to search on any web search engine.

This is touched on briefly in a Second Life book club review of EiM where it was said:

[18:23] Teofila Matova: chaos leads to order

[18:23] Stolvano Barbosa: yes

[18:23] Teofila Matova: lets think of the librarian with all the books on the floor

Who in their right mind would suggest all the books in the library sitting in one “miscellaneous” pile on the floor? Surely Weinberger is mad. Or evil? Maybe he’s trying to destroy knowledge!

But hold on there – a pile on the floor… isn’t that essentially what automated archives are? Large robot-managed warehouses like the National Library of Norway sort the books by size, paper type, frequency of access; anything that makes the storage facility more efficient rather than a map of human knowledge. Of course, where each document lives is noted and cross-referenced with title, author, subject and so on. ‘Where it lives’, like an address or a Resource Locator… If they were all the same they’d be Uniform Resource Locators or URLs.

That’s what Weinberger’s getting at, that things can have an address independent of any taxonomy. Not only that, but by giving something an address or identity that is not simply its position in a taxonomy, you can cross-reference the same items in several different taxonomies at the same time and add more as and when you need them.
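To make the idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration: the `item:…` identifiers, the shelf addresses, the `Taxonomy` class and the `find` helper are not taken from any real library system. It only shows that once an item has an identity of its own, any number of taxonomies can point at it without moving it.

```python
from collections import defaultdict

# Hypothetical storage addresses: where each item physically lives.
# The address carries no meaning about subject, author or date.
locations = {
    "item:0001": "aisle 12, crate 7",
    "item:0002": "aisle 3, crate 41",
    "item:0003": "aisle 12, crate 8",
}


class Taxonomy:
    """One way of organising items. It stores identifiers, never the items."""

    def __init__(self, name):
        self.name = name
        self._members = defaultdict(set)

    def classify(self, item_id, category):
        """File an item under a category; an item may sit in many categories."""
        self._members[category].add(item_id)

    def items_in(self, category):
        """Return the identifiers filed under a category, in sorted order."""
        return sorted(self._members.get(category, ()))


def find(taxonomy, category):
    """Resolve a category to (identifier, physical address) pairs."""
    return [(item_id, locations[item_id]) for item_id in taxonomy.items_in(category)]


# Two independent taxonomies over the same three items (illustrative data).
by_subject = Taxonomy("subject")
by_subject.classify("item:0001", "religion")
by_subject.classify("item:0002", "cookery")
by_subject.classify("item:0003", "religion")

by_decade = Taxonomy("decade")
by_decade.classify("item:0001", "1930s")
by_decade.classify("item:0002", "1930s")
by_decade.classify("item:0003", "1990s")

print(find(by_subject, "religion"))
print(find(by_decade, "1930s"))
```

Adding a third taxonomy later is just another `Taxonomy` instance; nothing in `locations` changes, which is the point about addresses being independent of classification.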

Which brings me to where I take issue with everything being miscellaneous; to explain, let me take a little detour. I’ve spent many years trying to learn what makes code (programming that is, Java, C++, you know the stuff) better. One of the things that I’ve concluded about programming in OO languages is that there are a few terms that smell bad – ‘utils’ is one of my favorites. ‘Utils’ almost always means ‘the things that don’t fit in the hierarchy of my code’ and that usually means the hierarchy is wrong. The same goes for any taxonomy where things don’t really fit and end up tagged in that ‘miscellaneous’ section on the end.

The things that end up in the ‘misc’ section are the things that weren’t thought about or weren’t really cared about or weren’t really understood in the design of the structure of the knowledge; Dewey 297, Islam, Bahai and Babism – things that aren’t Christian… Their importance and differences from each other hidden by the understanding set like concrete in the classification.

But somebody always cares about those things that end up in ‘misc’, probably deeply and in a way that they could classify in detail. Take a friend of mine doing a PhD analyzing the differences in scholarly texts that have been cited a lot and those that have been only cited once. You should try finding a way to search for those scholarly texts that have been relegated to the bottom of the heap.

That is what the third order of order, as Weinberger coins it, allows us to do, for ourselves… Leading me to believe that the book should be called Nothing is Miscellaneous.


Open Data Licensing

Back at the end of September we finally got to the point of releasing the first draft of the Open Data Commons License. This is work I’ve been involved in since Ian’s first draft of the TCL about a year and a half ago.

It’s great to see this license come to fruition, having argued about the need for this more than once.

It’s interesting to see the conversation happening around LibraryThing’s Common Knowledge and the Open Library project. Both of these are collections of factual data. I’ve been speaking to people involved in both, and both have a clear desire to protect the data and ensure that it’s available for the community into the future.

Licensing is critical to that – as I said in Banff (listen) at the start of the year.

Back then we were concerned with navigating the difference in protection afforded to databases in the EU and the US. In essence, databases have protection in the EU, but have no protection in the US. The reason we were looking at that was because the natural thinking goes something like this:

Creative Commons extends Copyright to allow you to easily position yourself on the spectrum of ‘All Rights Reserved’ to ‘Public Domain’.

Therefore Open Data Commons must need to extend a Database Right to allow you to position your data on the same spectrum.

Well, the Open Data Commons license gets around that by being couched in contract law. This seems like a great way to license data for open use and prevent it being locked away in future.

With all that’s been going on then, it’s no surprise that I missed the Model Train Software case that could have a big impact on how Open-Source software licenses are drafted. A San Francisco judge ruled that the Artistic License was a contract – meaning that breach of the license did not necessarily mean infringing the copyright. That changes the legal redress and potential penalties available for breaching a license.

Interesting.


This post originally appeared on Talis’ Nodalities blog.

The first steps of the Semantic Web are now a short distance behind us and some organisations are starting to pick up the pace. With more and more data coming online, marked up for linking and sharing in a web of data, perhaps it’s time to look again at the trade-off of different intellectual property rights.

Back in November of 2004 James Boyle published A Natural Experiment in the Financial Times. This piece sees him debating the merits of intellectual property rights over data with Thomas Hazlett and Richard Epstein. His primary thrust is that we should be making policy decisions in this area based on empirical data about the economic benefits one way or another. Something all three protagonists agree on.

Much has changed between 2004 and now, not least our understanding of how the web can affect the way we collaborate, share, communicate; it fundamentally affects the way we live. We chat, we blog, we Twitter, we Flickr and we Joost. Content flows from person to person in unprecedented ways and at unprecedented speeds. This changes the nature of the experiment that Boyle talks about.

If the database right were working, we would expect positive answers to three crucial questions. First, has the European database industry’s rate of growth increased since 1996, while the US database industry has languished? [...] Second, are the principal beneficiaries of the database right in Europe producing databases they would not have produced otherwise? [...] Third, [...] is the right promoting innovation and competition rather than stifling it?

Boyle’s first two questions centre around the creation of databases and his third, by his own admission, is difficult to measure. If one of our primary goals for the growth of the Internet is to have a web of data that can be linked and accessed across the globe we may be better served by assessing how companies might make data open.

Boyle asks for, and discusses, the empirical evidence of databases being created in the EU and US. The differences in numbers should provide insight into the economic ups and downs as the EU adopted a robust database right in 1996 while the US ruled against such protection in 1991. I am interested in how we expect the growth of data on the Semantic Web to differ in the two jurisdictions.

Boyle explains that the US Chamber of Commerce opposes the creation of a database right in the US:

[The US Chamber of Commerce] believe that database providers can adequately protect themselves with contracts, technical means such as passwords, can rely on providing tied services and so on.

And therein lies the rub. Without appropriate protection of intellectual property we have only two extreme positions available: locked down with passwords and other technical means; or wide open and in the public-domain. Polarising the possibilities for data into these two extremes makes opening up an all or nothing decision for the creator of a database.

With only technical and contractual mechanisms for protecting data, creators of databases can only publish them in situations where the technical barriers can be maintained and contractual obligations can be enforced.

We don’t tolerate this with creative works, our photographs, our blog posts and so on. Why would we expect it to make sense for databases? Whether or not it makes sense comes down to whether or not it is beneficial to society. We allow Copyright in order to provide adequate remuneration to be collected by the creator of a work. We allow patents to allow the recovery of development costs for an invention. Which is database right more like?

Patent is a very broad monopoly. If I had a patent on the clock, a mechanical means of measuring the passing of time, nobody else would be able to make clocks. Copyright, on the other hand is much narrower only allowing me to protect the specific design of my clocks. This is where it can get confusing with databases. Database right in the EU is like Copyright. It is a monopoly, but only on that particular aggregation of the data. The underlying facts are still not protected and there is nothing to stop a second entrant from collecting them independently.

Richard Epstein points to this in his contribution:

The question is why do databases fall outside [the general principle of copyright], when the costs of compilation are in many cases substantial for the initial party and trivial for anyone who receives judicial blessing to copy the base? In answering this question, it will not do to say, as the Supreme Court said in the well known decision in Feist Publications v. Rural Telephone Service, (1991) that these compilations are not “original” in the sense that it requires no thought to check the spelling of the entries and to put them all in alphabetical order. But that obvious point should be met with an equally obvious rejoinder. If it requires no thought or intelligence to put the information together, then why not ask the second entrant into the market to go through the same drudge work as the first.

This is exactly what we see happening with Open Street Map. Ordnance Survey in the UK have rights over the map data they have collected. The protection covers the collection of geospatial data that they have created, they are not granted a monopoly in geospatial data.

This leaves a special case of databases, those which are created at low cost as a by-product of normal business. Examples used in Boyle’s article are telephone numbers, television schedules and concert times. Boyle gives us the answer directly:

the [European] court ruled that the mere running of a business which generates data does not count as “substantial investment” enough to trigger the database right.

This reminds me strongly of The Smell of Food and the Sound of Coins, a folk tale in which a wise judge decides that a restaurateur may charge for the smell of food wafting from his restaurant, but that the appropriate price is the sound of coins chinking together.

That a database right may not and should not apply in all cases, and that there is a requirement to restrict anti-competitive practices, does not necessarily extend to the conclusion that a right is not required.

It seems to me that much of the debate around intellectual property rights has focussed on how they are used to keep things closed. Having suggested earlier that we have only the abilities to keep databases locked away or in contrast open them completely, I’d like to consider what it might mean to have a database right for keeping things open.

In response to Thomas Hazlett’s contribution Boyle asks:

How many databases are now created and maintained entirely “free” and thus escape commercial directories altogether? There are obviously many, both in the scientific and the consumer realm. One can no more omit these from consideration, than one can omit free software from the software market.

This strikes me as a great comparison to consider. Taking one of the most prevalent free software licenses, the GNU General Public License, what might that look like for data?

One of the primary functions of the GPL is that it enforces Copyleft – the requirement to license derivative, and even complementary, works under the same license. That is, any commercial software that makes use of GPL code must, under the terms of the license, also be released under the GPL. The viral nature of this license is possible only because of the backing of Copyright.

Without a database right communities have no mechanism to publish openly and still insist upon this kind of Share-Alike agreement.

Consider the impact of this for situations where you might use the idea of promiscuous copying to maintain the availability of data. Promiscuous copying relies on two things: lots of copies being made and lots of copies being available. Without the necessary licensing in place there is no mechanism with which to compel those who have copies to make them available. Public Domain means, by definition, no restriction – that means I can lock it away again.

Copyleft is just one position along a spectrum where ‘locked away’ and ‘free as a bird’ sit at each end. What the web shows us is that other business models form crucial parts of the eco-system. Epstein picks up on the controlling aspect of Boyle’s argument:

They can control their list of subscribers; give them each passwords; charge them based on the amount of the information that is used, or some other agreed-upon formula; and require them not to sell or otherwise transfer the information to third parties without the consent of the data base owner.

Imagine if this were true of Copyright material on the web? It has been, and still is on the occasional site. But mostly copyright owners are starting to see the value of publishing content online and they are underpinning the delivery of that content to consumers with other business models. Without Copyright the types of business that could participate would be reduced.

Epstein goes on to say:

The contractual solution is surely preferable, because general publication will allow for use by others that may not offend the copyright law, but which will block the possibility of payment for the costly information that is supplied.

And again, the very heart of the matter. If we are to encourage those who have large databases to make them open, to post them on the Semantic Web, we must provide them with models and solutions that are preferable to technical barriers and restrictive contracts. Allowing them to pick their own position on the spectrum seems to me to be a necessity in that. You can see any form of protection in two lights. When Boyle says

They make inventors disclose their inventions when they might otherwise have kept them secret.

I say

They allow inventors to disclose their inventions when they might otherwise have had to keep them secret.

That’s why we’ve invested in a license to do this, properly, clearly and in a way that stays Open.

Rob Styles is Programme Manager for Data Services at Talis, a UK company building Semantic Web technologies. Rob Styles is not a lawyer.
