Webstock – New Zealand's web conference

While wandering around searching for interesting semwebby bits and pieces I stumbled across Webstock – New Zealand’s web conference. The programme looks great, with Ze Frank, Matt Biddulph, Tom Coates, Toby Segaran and Heather Champ amongst others. This looks like an awesome line-up.

Shame it’s on the opposite side of the planet in just over 2 weeks, otherwise I’d be trying to wangle a trip. I wonder if Matt Biddulph has space in his luggage for little old me?

Nounification of Verbs

For a long time I’ve felt uncomfortable every time I’ve written a class with a name like ‘FooManager’, ‘BarWatcher’ or ‘BazHelper’. This has always smelt bad and opening any codebase that is structured this way has always made me feel ever so slightly uneasy. My thoughts as to why are still slightly fuzzy, but here’s what I have so far…

Firstly, some background: my perspective on object-oriented programming is deliberately naive. I don’t like to create interfaces for everything and I don’t use lots of factories. This comes, I guess, from my earliest education in C++, through one of the best books ever written on the subject: Simple C++ by Jeffrey Cogswell. While you stop laughing at the idea that you can learn something as complex as object-oriented programming from a thin paperback featuring a robot dog and the term POOP (Profound Object Orientated Programming), think about the very essence of what it is we’re trying to do.

OOP is about modelling objects. Objects are things that are, and to name things that are we use nouns. Then we give the objects responsibilities: things that they can do, their behaviour. For those we use what my primary school English so beautifully called ‘doing words’, or verbs if you prefer.

Now, not long ago I wrote a ByteArrayHelper class in Java. I’m not ashamed of it. The code is good, efficient, readable code that does many of the common things I needed to do with a byte[]. However, help is a verb. My class’s responsibility is to help byte[] by doing things that byte[] doesn’t do. I’ve made the class name into a noun by nounifying the verb.

By de-nounifying it I can see where the responsibilities should really sit – with byte[]. My ByteArrayHelper does nothing for itself. All of its methods do something with a byte[]. The methods are things like subArray(offset, length) and insertAt(offset, bytes). These are methods that I wanted on byte[].

Now, what I really wanted was to be able to add these methods to byte[], making them available wherever a byte[] was being handled, but as Java is statically typed I couldn’t do that (even if byte[] were a class, which it isn’t). In Smalltalk, JavaScript or Ruby I likely could have just added the methods I wanted. The next best thing would have been to declare a sub-class of byte[] and put the methods on that; then the initial construction of my byte[] instance could create my own, more capable, object, but still pass it around everywhere as a byte[]. But byte[] isn’t a class in Java. byte isn’t even a class; it’s a primitive – sort of an object, but much less powerful.

Following the search for a noun-based approach I could have created my own ByteArray that may or may not have delegated to a byte[] internally. This could not have been passed around as a byte[] though, so would have required substantial refactoring of the classes already there. So, I wrote a ByteArrayHelper instead. Having written the ByteArrayHelper, though, it was obvious that none of the methods required any instance variables; they all took and returned byte arrays – so I made them all static. So, my nounified verb had actually led me to write nothing more than a function library.
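To make the alternatives above concrete, here is a minimal sketch. It is written in Python rather than Java, purely because Python’s bytes is a class that can be sub-classed, which is exactly what Java wouldn’t let me do. The names sub_array, insert_at and ByteArray are hypothetical stand-ins for the methods mentioned above, not the original code.

```python
def sub_array(data: bytes, offset: int, length: int) -> bytes:
    """Function-library style: takes and returns plain bytes."""
    return data[offset:offset + length]


def insert_at(data: bytes, offset: int, extra: bytes) -> bytes:
    """Function-library style: return a copy of data with extra spliced in at offset."""
    return data[:offset] + extra + data[offset:]


class ByteArray(bytes):
    """Noun-based style: a sub-class that can still be passed anywhere bytes is accepted."""

    def sub_array(self, offset: int, length: int) -> "ByteArray":
        # Slicing a bytes sub-class yields plain bytes, so re-wrap the result.
        return ByteArray(self[offset:offset + length])

    def insert_at(self, offset: int, extra: bytes) -> "ByteArray":
        return ByteArray(self[:offset] + extra + self[offset:])
```

The two free functions are what my static helper amounted to; the sub-class puts the same behaviour on the thing itself while remaining usable wherever plain bytes are expected.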

Whether or not I made the right decision is left as an exercise for the reader.

Take another example, this time from a friend’s code. Looking through it we noticed that one of the classes was a FileLoaderManager – a class whose responsibility is to manage FileLoaders. A nounified verb looking after another nounified verb. I hasten to add that this is not bad code – the code in question does some awesome processing of relationships looking for similarities, like Amazon’s ‘people who bought this also bought’ but more generic.

When we looked into the FileLoaderManager and took away some of the responsibilities that fitted better with other classes, we were left with just the need to list all the files in a given path that matched a particular pattern. Knowing what files are at a given path sounds like the responsibility of a Directory to me. Now, this being very lean C++, we didn’t bother looking for one of the readily available Directory classes; the code we already had could be re-factored quickly. Once the Directory class was written it became obvious that it would be useful elsewhere, whereas the FileLoaderManager could only be used for the one specific case it originally fulfilled. The nounified verb had led to the code being far more specific than it needed to be.
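As a rough illustration of how small that remaining responsibility is, here is a sketch of such a Directory. It is in Python rather than the C++ of the original, and the class and method names are my own assumptions, not the code we actually wrote.

```python
from fnmatch import fnmatch
from pathlib import Path


class Directory:
    """A directory that knows which files it contains."""

    def __init__(self, path):
        self.path = Path(path)

    def files_matching(self, pattern):
        """Return the sorted names of regular files here that match a glob pattern."""
        return sorted(
            entry.name
            for entry in self.path.iterdir()
            if entry.is_file() and fnmatch(entry.name, pattern)
        )
```

Nothing in it mentions loaders or managers, which is why it can be reused wherever a list of files is needed.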

Two classes I came across in a PHP codebase recently were called FilePutter and FileGetter. These two classes wrap the file_put_contents and file_get_contents functions in PHP. Wrapping these functions as classes allows them to be mocked, and therefore users of them can be unit tested. Wouldn’t a single class called, simply, File be easier to follow? The nounified verb approach had led to a peculiar structure in the code that made it less obvious for a reader to follow.
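A sketch of what I mean, in Python rather than PHP: a single File noun that both gets and puts its contents, plus an in-memory stand-in so callers can still be unit tested. File, FakeFile and count_lines are hypothetical names for illustration, not the classes from that codebase.

```python
from pathlib import Path


class File:
    """One noun covering both reading and writing a file's contents."""

    def __init__(self, path):
        self.path = Path(path)

    def get_contents(self) -> str:
        return self.path.read_text(encoding="utf-8")

    def put_contents(self, text: str) -> int:
        """Write text and return the number of characters written."""
        return self.path.write_text(text, encoding="utf-8")


class FakeFile:
    """In-memory stand-in with the same methods, for unit tests."""

    def __init__(self, text=""):
        self.text = text

    def get_contents(self) -> str:
        return self.text

    def put_contents(self, text: str) -> int:
        self.text = text
        return len(text)


def count_lines(file) -> int:
    """Example caller: works with a File or a FakeFile alike."""
    return len(file.get_contents().splitlines())
```

The testability that motivated FilePutter and FileGetter is still there; it just hangs off one noun instead of two nounified verbs.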

So far then, my conclusion is that nounified verbs are likely to be a sign that I’m not using OO techniques for specialisation of behaviour; that my code is more specialised than it could be or that I’m writing in a way that is less easy to read than it could be.

BlueBlog: How and Why Glue is Using Amazon SimpleDB instead of a Relational Database

Alex blogs over at Adaptive Blue about their use of Amazon’s SimpleDB to power their browser add-on Glue.

The post is interesting, and the comments useful. What I noticed, though, is that they’re using natural keys…

The solution that Glue uses relies on data duplication. Each Person and each Thing in our system has a unique key. In the case of a Person, the key is the username. In the case of a Thing, the key is a combination of the type, its name and an attribute, like author for a book or director for a movie, which provides a way to disambiguate among the objects that have the same type and the same name.

via BlueBlog: How and Why Glue is Using Amazon SimpleDB instead of a Relational Database.

Why you can't find a library book in your search engine | Technology | The Guardian

Wendy Grossman, in The Guardian, covers the difficulties of libraries publishing their catalogue data online.

Despite the internet’s origins as an academic network, when it comes to finding a book, e-commerce rules. Put any book title into your favourite search engine, and the hits will be dominated by commercial sites run by retailers, publishers, even authors. But even with your postcode, you won’t find the nearest library where you can borrow that book. (The exception is Google Books, and even that is limited.)

via Why you can’t find a library book in your search engine | Technology | The Guardian.

I get a namecheck and a quote at the end:

Rob Styles, a programme manager for Talis’s data services, says: “The main reason I think libraries need freedom to innovate is because we don’t know what they’re going to look like”.

When Patents Go Wrong…

Warning, Patent Rant follows.
Sure, everyone knows about high profile patents like Amazon One-Click, but what about the effect of less prominent cases?

A patent is a monopoly over an invention. In order to encourage innovation, patents are granted to inventors so they are assured of an income from the invention. Inventions usually make money either through the sale or licensing of the patent or through production of a product that makes use of the invention. The patent prevents others from simply copying the idea. Unlike copyright, however, patents cover the idea even if the second person came up with the idea completely independently.

But why am I writing about this now? Well, I’ve come to the end of the line with a child’s clock. Yep, you read that right.

Back in 1991 Julian Renton designed a clock for children. The intention was simple: give children who were too young to tell the time a visual indication of whether or not they should be awake. This is a great idea and one that, as a father of three, I whole-heartedly support.

He patented the clock.

All would be fine and dandy if he, or a licensed manufacturer, had gone on to produce a product based on the patent that both worked in practice and was sound value-for-money. Unfortunately that’s not what happened. Without the patent, the idea would have been open for several manufacturers to pick up and produce competing versions. This would have had the usual market benefits of encouraging the development of better products as well as driving cost down. The patent prevents this.

So, we’re left with just Sleep Time Bunny.

Now, as far as Julian’s concerned, the patent system has worked very well. He has a nice little business selling Sleep Time Bunny directly over the internet and through some shops. The usual street price for Sleep Time Bunny is a little under £20. Bear in mind that, other than the patent-protected bunny face, this is the same complexity as a standard alarm clock. It’s also been manufactured with cost very much in mind – you can tell. I would suspect the manufacturing cost doesn’t exceed £2 per clock – and it is possible to buy unbranded alarm clocks that appear to be the same quality for around £4. Julian should be making a healthy profit on the sale of each clock.

But as a consumer, the patent protection has delivered me poor value-for-money and resulted in no consumer choice. The problem for me, as a consumer, is that the idea has been allowed to run as a monopoly, thus requiring no innovation or development to make the product better or cheaper. Take one of the most obvious problems for a children’s alarm clock:

A 2½ year old, the stated lower end of the age range for the clock, will often go to bed around 7pm and be expected to stay in bed until 7am the following morning. This is obviously something that people ask a lot, as it makes it onto Bunny Clock’s FAQ:

Bunny Clock Q. I tried to set Bunny Clock to sleep at 7.00pm to wake at 7.00am but the Bunny won’t stay asleep. Why is this?

Bunny Clock A. The waking time selection is set with the normal alarm set hand. However, the alarm mechanism, as with all alarm clocks, works on a 12 hour cycle and so you are effectively trying to set Bunny Clock to sleep when it wants to wake. If you want Bunny to sleep for 12 or more hours you will need to adjust the wake setting at a later time – possibly just before you go to bed.

Right, so for a very common case the clock simply doesn’t work. Notice the aside there, “as with all alarm clocks”. Not all alarm clocks are marketed for young children, not all alarm clocks are patented. How much product development has gone into solving that problem in the past 17 years? Zero. Because there is no need to solve it. The product has no immediate competitors.

What the patent doesn’t prevent is someone designing a product that solves the exact same problem and competes in the exact same space, as long as it doesn’t infringe on any of the claims made for Sleep Time Bunny. And that’s exactly what someone should do. Please someone.

Resource Lists, Semantic Web, RDFa and Editing Stuff

Some of the work I’ve been doing over the past few months has been on a resource lists product that helps lecturers and students make best use of the educational material for their courses.

One of the problems we hoped to address really well was the editing of lists. Historically products that do this have been deemed cumbersome and difficult by academic staff who will often produce lists as simple documents in Word or the like.

We wanted to make an editing interface that really worked for the academic community so they could keep the lists as accurate and current as they wanted.

Chris Clarke, our Programme Manager, and Fiona Grieg, one of our pilot customers, describe the work in a W3C case study. Ivan Herman then picks up on one of the ways we decided to implement editing: using RDFa within the HTML DOM. In the case study Chris describes it like this:

The interface to build or edit lists uses a WYSIWYG metaphor implemented in Javascript operating over RDFa markup, allowing the user to drag and drop resources and edit data quickly, without the need to round trip back to the server on completion of each operation. The user’s actions of moving, adding, grouping or editing resources directly manipulate the RDFa model within the page. When the user has finished editing, they hit a save button which serialises the RDFa model in the page into an RDF/XML model which is submitted back to the server. The server then performs a delta on the incoming model with that in the persistent store. Any changes identified are applied to the store, and the next view of the list will reflect the user’s updates.

This approach has several advantages. First, as Andrew says:

One thing I hadn’t used until recently was RDFa. We’ve used it on one of the main admin pages in our new product and it’s made what was initially quite a complex problem much simpler to implement.

The problem that’s made simpler is this – WYSIWYG editing of the page was best done using DOM manipulation techniques, and most easily using existing libraries such as Prototype. But what was being edited isn’t really the visual document; it is the underlying RDF model. Trying to keep a version of the model in a JS array or something in sync with the changes happening in the DOM seemed to be a difficult (and potentially bug-ridden) option.

By using RDFa we can distribute the model through the DOM and have the model updated by virtue of having updated the DOM itself. Andrew describes this process nicely:

Currently using Jeni Tennison’s RDFQuery library to parse an RDF model out of an XHTML+RDFa page we can mix this with our own code and end up with something that allows complex WYSIWYG editing on a reading list. We use RDFQuery to parse an initial model out of the page with JavaScript and then the user can start modifying the page in a WYSIWYG style. They can drag new sections onto the list, drag items from their library of bookmarked resources onto the list and re-order sections and items on the list. All this is done in the browser with just a few AJAX calls behind the scenes to pull in data for newly added items where required. At the end of the process, when the Save button is pressed, we can submit the ‘before’ and ‘after’ models to our back-end logic which builds a Changeset from before and after models and persists this to a data store on the Talis Platform.

Building a Changeset from the two RDF models makes quite a complex problem relatively straightforward. The complexity now just being in the WYSIWYG interface and the dynamic updating of the RDFa in the page as new items are added or re-arranged.

As Andrew describes, the editing starts by extracting a copy of the model. This allows the browser to maintain before and after models. This is useful because, when the before and after get posted to the server, the before can be used to spot whether there have been editing conflicts with someone else doing a concurrent edit – this is an improvement on how Chris described it in the case study.
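The before/after idea can be sketched in a few lines. This is only an illustration in Python, with triples held as plain tuples in sets; it is not the Talis Platform Changeset format or our back-end code, and build_changeset and has_conflict are hypothetical names. The conflict check shown is the simplest one I could assume: the store no longer matches the model the editor started from.

```python
def build_changeset(before, after):
    """Return (removals, additions) as sets of (subject, predicate, object) triples."""
    before, after = set(before), set(after)
    return before - after, after - before


def has_conflict(before, stored):
    """True if the stored model differs from the one the editor started with."""
    return set(before) != set(stored)
```

If has_conflict is false, applying the removals and then the additions to the store brings it in line with the after model.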

There are some gotchas in this approach though. Firstly, some of the nodes have two-way links:

<http://example.com/lists/foo> <http://purl.org/vocab/resourcelist/schema#contains> <http://example.com/items/bar> .
<http://example.com/items/bar> <http://purl.org/vocab/resourcelist/schema#list> <http://example.com/lists/foo> .

So that the relationship from the list to the item gets removed when the item is deleted from the DOM we use the @rev attribute. This allows us to put the relationship from the list to the item with the item, rather than with the list.

The second issue is that we use rdf:Seq to maintain the ordering of the lists, so when the order changes in the DOM we have to do a quick traversal of the DOM changing the sequence predicates (_1, _2 etc) to match the new visual order.
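In the product this renumbering happens in JavaScript over the DOM; the logic itself is small enough to sketch in Python over a set of triples. The function names and the tuple representation are assumptions for illustration; only the rdf:_1, rdf:_2 membership predicates come from RDF itself.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def is_membership_predicate(predicate):
    """True for rdf:_1, rdf:_2, ... (the rdf:Seq position predicates)."""
    prefix = RDF + "_"
    return predicate.startswith(prefix) and predicate[len(prefix):].isdigit()


def resequence(seq_uri, items_in_visual_order, triples):
    """Replace the rdf:_n triples of seq_uri so they follow the given order."""
    # Keep everything except the old position triples for this sequence.
    kept = {
        (s, p, o)
        for (s, p, o) in triples
        if not (s == seq_uri and is_membership_predicate(p))
    }
    # Number the items from 1 in their new visual order.
    renumbered = {
        (seq_uri, f"{RDF}_{position}", item)
        for position, item in enumerate(items_in_visual_order, start=1)
    }
    return kept | renumbered
```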

Neither of these was a difficult problem to solve 🙂

My thanks go out to Jeni Tennison, who helped me get the initial prototype of this approach working while we were at Swig back in November.

dev8D – Developer Happiness Days

dev8D – Developer Happiness Days. 9-13 February 2009, London

JISC is running a Developer Happiness Days meet, sort of like a four-day hackfest cum code4lib type thing.

Over four intensive days we’re bringing together the cream of the crop of educational software developers along with coders from other sectors, users, and technological tinkerers in an exciting new forum.

Share your skills and knowledge with the coding community in a stimulating and fun environment and come away with new skills, fresh contacts – and you might even win a prize.

Sounds like it will be a great few days.