Multi-Tenant Configuration Schema

Are you writing multi-tenant software? Are you using RDF at all? Do you want to keep track of your tenants?

You might want to comment on the first draft of the new Multi-Tenant Configuration Schema.

This schema attempts to describe a simple set of concepts and relationships about tenants within a multi-tenant software system. It avoids anything that would constitute application configuration, but will happily co-exist with classes and properties to do that. The documentation is sparse currently, awaiting questions and comment so that I can expand on areas that require further explanation. Comment here, or email me.

Domain Specific Editing Interface using RDFa and jQuery

I wrote back in January about Resource Lists, Semantic Web, RDFa and Editing Stuff. This was based on work we’d done in Talis Aspire.

Several people suggested this should be written up as a fuller paper, so Nad, Jeni and I wrote it up as a paper for the SFSW 2009 workshop. It’s been accepted and will be published there, but unfortunately due to work priorities that have come up we won’t be able to attend.

A draft of the paper is here: A Pattern for Domain Specific Editing Interfaces Using Embedded RDFa and HTML Manipulation Tools.

The camera ready copy will be published in the conference proceedings. Feedback welcomed.

Ruby Mock Web Server

I spent the afternoon today working with Sarndeep, our very smart automated test guy. He’s been working on extending what we can do with rspec to cover testing of some more interesting things.

Last week he and Elliot put together a great set of tests using MailTrap to confirm that we’re sending the right mails to the right addresses under the right conditions. Nice tests to have for a web app that generates email in a few cases.

This afternoon we were working on a mock web server. We use a lot of RESTful services in what we’re doing, and being able to test our app’s handling of error conditions is important. We’ve had a static web server set up for a while, with particular requests and responses configured in it, but we’ve never really liked it because the responses live separately from the tests and the server is another Apache vhost that has to be set up when you first check out the app.

So, we’d decided a while ago that we wanted to put in a little Ruby based web server that we could control from within the rspec tests and that’s what we built a first cut of this afternoon.

require File.expand_path(File.dirname(__FILE__) + "/../Helper")
require 'rubygems'
require 'rack'
require 'thin'

# A tiny Rack app that replays registered responses for matching requests.
class MockServer
  def initialize()
    @expectations = []
  end

  # env is a hash of Rack environment entries to match; response is a Rack response triple.
  def register(env, response)
    @expectations << [env, response]
  end

  def clear()
    @expectations = []
  end

  def call(env)
    #puts "starting call\n"
    @expectations.each_with_index do |expectation, index|
      expectationEnv = expectation[0]
      response = expectation[1]
      matched = false
      #puts "index #{index} is #{expectationEnv} contains #{response}\n\n"
      expectationEnv.each do |envKey, value|
        #puts "trying to match #{envKey}, #{value}\n"
        matched = true
        if value != env[envKey]
          matched = false
          break
        end
      end
      if matched
        @expectations.delete_at(index)
        return response
      end
    end
    #puts "ending call\n"
    # Nothing matched, so return a 404 rather than nil, which Rack can't handle.
    [ 404, { 'Content-Type' => 'text/plain', 'Content-Length' => '9' }, [ 'Not Found' ]]
  end
end

mockServer = MockServer.new()
mockServer.register( { 'REQUEST_METHOD' => 'GET' }, [ 200, { 'Content-Type' => 'text/plain', 'Content-Length' => '11' }, [ 'Hello World' ]])
mockServer.register( { 'REQUEST_METHOD' => 'GET' }, [ 200, { 'Content-Type' => 'text/plain', 'Content-Length' => '11' }, [ 'Hello Again' ]])
Rack::Handler::Thin.run(mockServer, :Port => 4000)

The MockServer implements the Rack interface so it can run inside the Thin web server from within the rspec tests. Expectations are registered with the MockServer; the first parameter is simply a hashtable in the same format as the Rack environment. You only specify the entries that you care about, and any that you don’t specify are not compared with the request. Expectations don’t have to occur in order (except where the environments you give are ambiguous, in which case they match first-in, first-matched).
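To give a feel for how this might be wired into a spec, here’s a rough sketch; the example spec, path and responses are made up for illustration and assume the app under test calls a service at http://localhost:4000/.

# Hypothetical spec showing one way the MockServer could be driven from rspec.
describe "remote service error handling" do
  before(:all) do
    @mock_server = MockServer.new
    # Run Thin in a background thread so the examples can make real HTTP requests.
    @server_thread = Thread.new do
      Rack::Handler::Thin.run(@mock_server, :Port => 4000)
    end
    sleep 1 # crude wait for the server to come up
  end

  before(:each) do
    @mock_server.clear
  end

  it "copes with a 500 from the service" do
    @mock_server.register(
      { 'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/things/1' },
      [ 500, { 'Content-Type' => 'text/plain', 'Content-Length' => '5' }, [ 'Oops!' ]])
    # ... exercise the code that calls the service and assert it degrades gracefully ...
  end
end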

As a first venture into writing more in Ruby than an rspec test I have to say I found it pretty sweet – there was only one issue with getting at array indices that tripped me up, but Ross helped me out with that and it was pretty quickly sorted.

Plans for this include putting in a verify() and making it thread safe so that multiple requests can come in parallel. Any other suggestions (including improvements on my non-idiomatic code) very gratefully received.
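For what it’s worth, a first stab at verify() might simply check that nothing registered went unmatched, with a Mutex guarding the expectations array for the parallel case. This is purely a sketch of the direction, not what we’ve built:

require 'thread'

class MockServer
  def initialize()
    @expectations = []
    @lock = Mutex.new
  end

  def register(env, response)
    @lock.synchronize { @expectations << [env, response] }
  end

  def clear()
    @lock.synchronize { @expectations = [] }
  end

  # call() stays as above, with its scan of @expectations wrapped in @lock.synchronize.

  # Raise if any registered expectation was never matched by a request.
  def verify()
    leftover = @lock.synchronize { @expectations.dup }
    raise "Unmet expectations: #{leftover.inspect}" unless leftover.empty?
  end
end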

Resource Lists, Semantic Web, RDFa and Editing Stuff

Some of the work I’ve been doing over the past few months has been on a resource lists product that helps lecturers and students make best use of the educational material for their courses.

One of the problems we hoped to address really well was the editing of lists. Historically products that do this have been deemed cumbersome and difficult by academic staff who will often produce lists as simple documents in Word or the like.

We wanted to make an editing interface that really worked for the academic community so they could keep the lists as accurate and current as they wanted.

Chris Clarke, our Programme Manager, and Fiona Grieg, one of our pilot customers, describe the work in a W3C case study. Ivan Herman then picks up on one of the ways we decided to implement editing, using RDFa within the HTML DOM. In the case study Chris describes it like this:

The interface to build or edit lists uses a WYSIWYG metaphor implemented in Javascript operating over RDFa markup, allowing the user to drag and drop resources and edit data quickly, without the need to round trip back to the server on completion of each operation. The user’s actions of moving, adding, grouping or editing resources directly manipulate the RDFa model within the page. When the user has finished editing, they hit a save button which serialises the RDFa model in the page into an RDF/XML model which is submitted back to the server. The server then performs a delta on the incoming model with that in the persistent store. Any changes identified are applied to the store, and the next view of the list will reflect the user’s updates.

This approach has several advantages. First, as Andrew says

One thing I hadn’t used until recently was RDFa. We’ve used it on one of the main admin pages in our new product and it’s made what was initially quite a complex problem much simpler to implement.

The problem that’s made simpler is this – WYSIWYG editing of the page was best done using DOM manipulation techniques, and most easily using existing libraries such as Prototype. But what was being edited isn’t really the visual document, it is the underlying RDF model. Trying to keep a version of the model in a JS array or something in sync with the changes happening in the DOM seemed to be a difficult (and potentially bug-ridden) option.

By using RDFa we can distribute the model through the DOM and have the model updated by virtue of having updated the DOM itself. Andrew describes this process nicely:

Currently using Jeni Tennison’s RDFQuery library to parse an RDF model out of an XHTML+RDFa page we can mix this with our own code and end up with something that allows complex WYSIWYG editing on a reading list. We use RDFQuery to parse an initial model out of the page with JavaScript and then the user can start modifying the page in a WYSIWYG style. They can drag new sections onto the list, drag items from their library of bookmarked resources onto the list and re-order sections and items on the list. All this is done in the browser with just a few AJAX calls behind the scenes to pull in data for newly added items where required. At the end of the process, when the Save button is pressed, we can submit the ‘before’ and ‘after’ models to our back-end logic which builds a Changeset from before and after models and persists this to a data store on the Talis Platform.

Building a Changeset from the two RDF models makes quite a complex problem relatively straightforward. The complexity now just being in the WYSIWYG interface and the dynamic updating of the RDFa in the page as new items are added or re-arranged.

As Andrew describes, the editing starts by extracting a copy of the model. This allows the browser to maintain before and after models, which is useful because, when the before and after get posted to the server, the before can be used to spot whether there have been editing conflicts with someone else doing a concurrent edit – an improvement on how Chris described it in the case study.
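The delta itself is conceptually simple. Treating each model as a set of triples, the idea (just a sketch of the principle, not the actual Changeset format the Platform uses) amounts to two set differences:

# Conceptual sketch: the difference between two RDF models, with each triple
# represented as a simple [subject, predicate, object] array.
def delta(before, after)
  {
    :removals  => before - after, # triples present before the edit but not after
    :additions => after - before  # triples introduced by the edit
  }
end

before = [
  ['http://example.com/lists/foo', 'http://purl.org/vocab/resourcelist/schema#contains', 'http://example.com/items/bar']
]
after = [
  ['http://example.com/lists/foo', 'http://purl.org/vocab/resourcelist/schema#contains', 'http://example.com/items/baz']
]

delta(before, after)
# => one removal (the link to items/bar) and one addition (the link to items/baz)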

There are some gotchas in this approach though. Firstly, some of the nodes have two-way links:

<http://example.com/lists/foo> <http://purl.org/vocab/resourcelist/schema#contains> <http://example.com/items/bar> .
<http://example.com/items/bar> <http://purl.org/vocab/resourcelist/schema#list> <http://example.com/lists/foo> .

To make sure the relationship from the list to the item gets removed when the item is deleted from the DOM, we use the @rev attribute. This allows us to put the list-to-item relationship in the markup for the item, rather than in the markup for the list.

The second issue is that we use rdf:Seq to maintain the ordering of the lists, so when the order changes in the DOM we have to do a quick traversal of the DOM changing the sequence predicates (_1, _2 etc) to match the new visual order.

Neither of these were difficult problems to solve 🙂

My thanks go out to Jeni Tennison who helped me get the initial prototype of this approach working while we were at Swig back in November.

Exploring OpenLibrary Part Two

This post also appears on the n2 blog.

More than two weeks on from my last look at the OpenLibrary authors data and I’m finally finding some time to look a bit deeper. Last time I finished off thinking about the complete list of distinct dates within the authors file and how to model those.

Where I’ve got to today is tagged as day 2 of OpenLibrary in the n2 subversion.

First off, a correction – foaf:Name should have been foaf:name. Thanks to Leigh for pointing that out. I haven’t fixed it in this tag, which was tagged before I realised I’d forgotten it, but next time, honestly.

It’s clear that there is some stuff in the data that simply shouldn’t be there, things that cannot possibly be a birth date such as [from old catalog] and *. and simply ,. When I came across —oOo— I was somewhat dismayed. MARC data, where most of this data has come from, has a long and illustrious history, but one of the mistakes made early on was to put display data into the records in the form of ISBD punctuation. This, combined with the real inflexibility of most ILSs and web-based catalogs, has forced libraries to hack their records with junk like —oOo— to fix display errors. This one comes from Antonio Ignacio Margariti.

In total there are only 6,156 unique birth date values and 4,936 unique death date values. Of course there is some overlap, so there are only 9,566 distinct values to worry about overall.

So what I plan to do is to set up the recognisable patterns in code and discard anything I don’t recognise as a date or date range. Doing that may mean I lose some date information, but I can add that back in later as more patterns get spotted. So far I’ve found several patterns (shown here using regex notation)…

“^[0-9]{1,4}$” – A straightforward number of 4 digits or fewer, with no letters, punctuation or whitespace. These are simple years; last week I popped them in using bio:date. That’s not strictly within the rules of the bio schema, as that really requires a date formatted in accordance with ISO 8601. Ian had already implied his displeasure with my use of bio:date and suggested I use the more relaxed Dublin Core elements date. However, on further chatting, what we actually have is a range within which the event occurred, so we need to show that the event happened somewhere within that range. This can be solved using the W3C Time Ontology, which allows for better description.

I spent some time getting hung up on exactly what is being said by these date assertions on a bio:Birth event. That is, are we saying that the birth took place somewhere within that period, or that the event happened over that period? This may seem a daft question to ask, but as others start modelling events in people’s bios the two could easily become indistinguishable. Say I want to model my grandfather’s experience of the second world war. I’d very likely model that as an event occurring over a four-year period. So I feel the need to distinguish between an event happening over a period and an event happening at an unknown time within a period. I thought I was getting too pedantic about this, but Ian assured me I’m not and that the distinction matters.

The model we end up with is like this


@prefix bio: <http://vocab.org/bio/0.1/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix mine: <http://example.com/mine/schema#> .
@prefix time: <http://www.w3.org/TR/owl-time/> .

<http://example.com/a/OL149323A>
	foaf:Name "Schaller, Heinrich";
	foaf:primaryTopicOf <http://openlibrary.org/a/OL149323A>;
	bio:event <http://example.com/a/OL149323A#birth>;
	a foaf:Person .

<http://example.com/a/OL149323A#birth>
	dc:date <http://example.com/a/OL149323A#birthDate>;
	a bio:Birth .

<http://example.com/names/schallerheinrich>
	mine:name_of <http://example.com/a/OL149323A>;
	a mine:Name .

<http://example.com/dates/gregorian/ad/years/1900>
	time:unitType time:unitYear;
	time:year "1900";
	a time:DateTimeDescription .

<http://example.com/a/OL149323A#birthDate>
	time:inDateTime <http://example.com/dates/gregorian/ad/years/1900>;
	a time:Instant .

The simple year accounts for 731,304 of the 748,291 birth dates and for 13,151 of the 181,696 death dates, about 80% of the dates overall. Following the 80/20 rule almost perfectly, the remaining 20% is going to be painful. It has been suggested I should stop here, but it seems a shame to not have access to the rest if we can dig in, and I can, so…

First of the remaining correct entries are the approximate years, recorded as ca. 1753 or (ca.) 1753 and other variants of that. Some of these suffer from leading and trailing junk, but I’ll catch the clean ones with “^[(]?ca\.[)]? ([0-9]{1,4})$”. The difficulty is that you can’t really convert these into a single year or even a date range, as what people consider to be within the “circa” will vary widely in different contexts. So the interval can be described in the same way as a simple year, but the relationship with the author’s birth is not simply time:inDateTime. I haven’t found a sensible circa predicate, so for now I’ll drop into mine.


@prefix bio: <http://vocab.org/bio/0.1/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix mine: <http://example.com/mine/schema#> .
@prefix time: <http://www.w3.org/TR/owl-time/> .

<http://example.com/a/OL151554A>
	foaf:Name "Altdorfer, Albrecht";
	foaf:primaryTopicOf <http://openlibrary.org/a/OL151554A>;
	bio:event <http://example.com/a/OL151554A#birth>;
	bio:event <http://example.com/a/OL151554A#death>;
	a foaf:Person .

<http://example.com/a/OL151554A#birth>
	dc:date <http://example.com/a/OL151554A#birthDate>;
	a bio:Birth .

<http://example.com/a/OL151554A#death>
	dc:date <http://example.com/a/OL151554A#deathDate>;
	a bio:Death .

<http://example.com/names/altdorferalbrecht>
	mine:name_of <http://example.com/a/OL151554A>;
	a mine:Name .

<http://example.com/dates/gregorian/ad/years/1480>
	time:unitType time:unitYear;
	time:year "1480";
	a time:DateTimeDescription .

<http://example.com/a/OL151554A#birthDate>
	mine:circaDateTime <http://example.com/dates/gregorian/ad/years/1480>;
	a time:Instant .
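To make the pattern matching concrete, here’s a minimal sketch of the classification being described, shown in Ruby for brevity (the actual scripts are PHP) and covering only the two patterns discussed so far:

# Sketch of the date classification described above; the real scripts are PHP.
# Returns [:year, "1480"], [:circa, "1480"] or [:unrecognised, value].
def classify_date(raw)
  value = raw.to_s.strip
  if value =~ /^[0-9]{1,4}$/
    [:year, value]
  elsif value =~ /^[(]?ca\.[)]? ([0-9]{1,4})$/
    [:circa, $1]
  else
    [:unrecognised, value]
  end
end

classify_date("1954")      # => [:year, "1954"]
classify_date("ca. 1480")  # => [:circa, "1480"]
classify_date("*.")        # => [:unrecognised, "*."]

Anything classified as unrecognised just gets skipped for now and revisited as new patterns are spotted.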

Ok, it’s time to stop there until next time. I have several remaining forms to look at and some issues of data cleanup.

Next time I’ll be looking at parsing out date ranges of a few years, shown in the data as 1103 or 4. These will go in as longer date time descriptions, so no new modelling is needed.

Then we have centuries, 7th cent., where again I hope just a broader date time description is required. There are some entries from before the birth of Christ, 127 B.C. for example, and I’ll have to take a look at how those get described. Then we have entries starting with an l, like l854. I had thought these might indicate a different calendaring system, but it appears not. Perhaps it’s bad OCRing, as there are also entries like l8l4. Not sure what to do with those just yet.

In terms of data cleanup, there are dates in the birth_date field of the form d. 1823 which means that it’s actually a death date. There are also dates prefixed with fl. which means they are flourishing dates. These are used when a birth date is unknown but the period in which the creator was active is known. These need to be pulled out and handled separately.

Of course, I haven’t dealt with the leading and trailing punctuation yet or those that have names mixed in with the dates, so still much work to do in transforming this into a rich graph.

Exploring OpenLibrary Part One

This post also appears on the n2 blog.

I thought it was about time I got around to taking a better look at what might be possible with the OpenLibrary data.

My plan is to try and convert it into meaningful RDF and see what we can find out about things along the way. The project is an own-time project mostly, so progress isn’t likely to be very rapid. Let’s see how it goes. I’ll diary here as stuff gets done.

To save me typing loads of stuff out here, today’s source code is tagged and in the n2 subversion as day 1 of OpenLibrary.

Day one, 3rd October 2008: I downloaded the authors data from OpenLibrary and unzipped it. I’m also downloading the editions data from OpenLibrary, but that’s bigger (1.8GB), so I’m playing with the author data while that comes down the tubes.

The data has been exported by OpenLibrary as JSON, so is pretty easy to work with. I’m going to write some PHP scripts on the command line to mess with it and it looks great for doing that.

Each line of the JSON in the authors file represents a single author, although some authors will have more than one entry. Taking a look at Iain Banks (aka Iain M Banks) we have the following entries:


{"name": "Banks, Iain", "personal_name": "Banks, Iain", "key": "\/a\/OL32312A", "birth_date": "1954", "type": {"key": "\/type\/type"}, "id": 81616}
{"name": "Banks, Iain.", "type": {"key": "\/type\/type"}, "id": 3011389, "key": "\/a\/OL954586A", "personal_name": "Banks, Iain."}
{"type": {"key": "\/type\/type"}, "id": 9897124, "key": "\/a\/OL2623466A", "name": "Iain Banks"}
{"type": {"key": "\/type\/type"}, "id": 9975649, "key": "\/a\/OL2645303A", "name": "Iain Banks         "}
{"type": {"key": "\/type\/type"}, "id": 10565263, "key": "\/a\/OL2774908A", "name": "IAIN M. BANKS"}
{"type": {"key": "\/type\/type"}, "id": 10626661, "key": "\/a\/OL2787336A", "name": "Iain M. Banks"}
{"type": {"key": "\/type\/type"}, "id": 12035518, "key": "\/a\/OL3127859A", "name": "Iain M Banks"}
{"type": {"key": "\/type\/type"}, "id": 12078804, "key": "\/a\/OL3137983A", "name": "Iain M Banks         "}
{"type": {"key": "\/type\/type"}, "id": 12177832, "key": "\/a\/OL3160648A", "name": "IAIN M.BANKS"}

In total the file contains 4,174,245 entries. The first job is to get a more manageable set of data to work with, so I wrote a short script to extract one line in every ten from a file. The resulting sample author data file contains 417,424 entries, which is more manageable for quick testing of what I’m doing.
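The sampling script itself is trivial; sketched here in Ruby for brevity (the real scripts are PHP, and the filenames below are made up), it just keeps one line in every ten from STDIN:

# Keep one line in every ten, reading from STDIN and writing to STDOUT,
# e.g. ruby sample.rb < authors.json > authors_sample.json
line_number = 0
STDIN.each_line do |line|
  print line if line_number % 10 == 0
  line_number += 1
end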

So now we can start writing some code to produce some RDF. Given the size of these files, I need to stream the data in and out again in chunks. The easiest format I find for that is turtle, which has the added benefit of being human readable. YMMV. Previously I’ve streamed stuff out using n-triples. That has some great benefits too, like being able to generate different parts of the graph, for the same subject, in different parts of the file and then bring them together using a simple command-line sort. It’s also a great format for chunking the resulting data into reasonably sized files, as breaking on whole lines doesn’t break the graph, whereas with rdf/xml and turtle it does.

So, I may end up dropping back to n-triples, but for now I’m going to use turtle.

I also like working on the command line and love the unix pipes model, so I’ll be writing the cli (command line) tools to read from STDIN and write to STDOUT so I can mess with the data using grep, sed, awk, sort, uniq and so on.

First things first, let’s find out what’s really in the authors data. Reading the JSON line by line and converting each line into an associative array is simple in PHP, so let’s do that, keep track of all the keys we find in the arrays, recurse into the nested arrays to look at them too, and then dump the result out (a rough sketch of this key survey follows the list). The arrays contain this set of keys:

alternate_names
alternate_names\1
alternate_names\2
alternate_names\3
bio
birth_date
comment
date
death_date
entity_type
fuller_name
id
key
location
name
numeration
personal_name
photograph
title
type
type\key
website
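The survey itself is a simple recursive walk over each record, collecting key names and qualifying nested keys with their parent. Sketched here in Ruby, though the real script is PHP:

# Collect the set of keys used across all author records, recursing into
# nested hashes and arrays and recording their keys as parent\child.
require 'rubygems'
require 'json'
require 'set'

def collect_keys(value, keys, prefix = '')
  if value.is_a?(Hash)
    value.each do |key, child|
      keys << "#{prefix}#{key}"
      collect_keys(child, keys, "#{prefix}#{key}\\")
    end
  elsif value.is_a?(Array)
    value.each_with_index do |child, index|
      keys << "#{prefix}#{index + 1}" # nested array entries show up as parent\1, parent\2, ...
      collect_keys(child, keys, "#{prefix}#{index + 1}\\")
    end
  end
end

keys = Set.new
STDIN.each_line do |line|
  collect_keys(JSON.parse(line), keys)
end
puts keys.sort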

So, they have names, birth dates, death dates, alternate names and a few other bits and pieces. And they have a ‘key’, which turns out to be the resource part of the OpenLibrary url. That means we can link back into OpenLibrary nice and easy. Going back to our previous Iain Banks examples, we want to create something like this for each one:


@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix bio: <http://vocab.org/bio/0.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://example.com/a/OL32312A>
	foaf:Name "Banks, Iain";
	foaf:primaryTopicOf <http://openlibrary.org/a/OL32312A>;
	bio:event <http://example.com/a/OL32312A#birth>;
	a foaf:Person .

<http://example.com/a/OL32312A#birth>
	bio:date "1954";
	a bio:Birth .

This gives us a foaf:Person for the author and tracks his birth date using a bio:Birth event. While tracking the birth as a separate entity may seem odd it gives the opportunity to say things about the birth itself. We’ll model death dates the same way, for the same reason. I’ve written some basic code to generate foaf from the OpenLibrary authors.
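That code is PHP and lives in the n2 subversion; purely to show the shape of it, a stripped-down equivalent of the authors-to-Turtle conversion might look something like this (sketched in Ruby, with the example.com URIs used above):

# Rough sketch of the authors conversion; the real script is PHP, in the n2 svn.
# Reads OpenLibrary author JSON from STDIN and writes Turtle to STDOUT.
# No escaping of quotes in names here, and foaf:Name is as in the example above.
require 'rubygems'
require 'json'

puts '@prefix bio: <http://vocab.org/bio/0.1/> .'
puts '@prefix foaf: <http://xmlns.com/foaf/0.1/> .'
puts

STDIN.each_line do |line|
  author = JSON.parse(line)
  key = author['key'].sub('/a/', '') # "/a/OL32312A" becomes "OL32312A"

  puts "<http://example.com/a/#{key}>"
  puts "\tfoaf:Name \"#{author['name'].strip}\";"
  puts "\tfoaf:primaryTopicOf <http://openlibrary.org/a/#{key}>;"
  puts "\tbio:event <http://example.com/a/#{key}#birth>;" if author['birth_date']
  puts "\ta foaf:Person ."
  puts

  next unless author['birth_date']
  puts "<http://example.com/a/#{key}#birth>"
  puts "\tbio:date \"#{author['birth_date'].strip}\";"
  puts "\ta bio:Birth ."
  puts
end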

Linking back to the OpenLibrary url has been done here using foaf:primaryTopicOf. I didn’t use owl:sameAs because the url at OpenLibrary is that of a web page, whereas the uri here (http://example.com/a/OL32312A) represents a person. Clearly a person is not the same as a web page that contains information about them.

The only thing worrying me is that the uris we’re using are constructed from OpenLibrary’s keys. This makes matching them up with other data sources hard. Matching with other data sources requires a natural key, but there’s not enough data in these author entries to create one. The best I can do is to create a natural key that will enable people to discover the group of authors that share a name.


@prefix mine: <http://example.com/mine/schema#> .
<http://example.com/names/banksiain>
	mine:name_of <http://example.com/a/OL32312A>;
	a mine:Name .

These uris will enable me to find authors that share the same name easily, either because they do share the same name or because they’re duplicates. The natural key is simply the author’s name with any casing, whitespace or punctuation stripped out. That might need to evolve as I start looking at the names in more detail later.
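The normalisation is the simplest part of all this; roughly (sketched in Ruby here, rather than the PHP the scripts actually use):

# Build the natural key by stripping case, whitespace and punctuation from the name.
# Accented characters get stripped too, which is one reason this might need to evolve.
def natural_key(name)
  name.downcase.gsub(/[^a-z0-9]/, '')
end

natural_key("Banks, Iain")   # => "banksiain"
natural_key("IAIN M. BANKS") # => "iainmbanks"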

Next step is to look in more detail at the dates in here, we have some simple cases of trailing whitespace or trailing punctuation, but also some more interesting cases of approximate dates or possible ranges – these occur for historical authors mostly. The complete list of distinct dates within the authors file is in svn. If you know anything about dates, feel free to throw me some free advice on what to do with them…

Pages, Screens, MVC and not getting it…

[Diagram: MVC Web]

About two years ago my colleague Ian Davis and I were talking about different approaches to building web applications. I was advocating that we use ASP.Net; the framework it provides for nesting controls within controls (server controls and user controls) is very powerful. I was describing it as a component-centric approach where we could build pages rapidly by plugging controls together.

Ian was describing a page-centric approach, and advocating XSLT (within PHP) as one of several possible solutions. He was suggesting that his approach was both simpler and that we could be more productive using it. Having spent two years working with ASP.Net I was not at all convinced.

Two years on and I think I finally get what he was saying. What can I say, I’m a slow learner. The difference in our opinions was based on two different underlying mental models.

The ASP.Net mental model is that of application software. It tries to bring the way we build windows software to the web. ASP.Net has much of the same feature set that we have if building a Windows Forms app; it’s no coincidence that the two models are now called Windows Forms and Web Forms. In this model we think about the forms, or screens, that we have to build and consider the data on which they act as secondary – a parameter to the screen to tell it which user, or expense claim or whatever to load for editing.

In this mental model we end up focussing on the verbs of the problem. We end up with pages called ‘edit.aspx’, ‘createFoo.aspx’ and ‘view.aspx’; where view is in the verb form, not the noun. ASP.Net is not unique in this; the same model exists in JSP and many people use PHP this way – it’s not specific to any technology, it’s a style of thinking and writing.

Ian’s mental model is different. Ian’s mental model is that of the web. The term URL means Uniform _Resource_ Locator. It doesn’t say Uniform _Function_ Locator. A URL is meant to refer to a noun, not a verb. This may seem like an esoteric or pedantic distinction to be making, but it affects the way we think about the structure of our applications and changing the way we think about solving a problem is always interesting.

If we think about URLs as being only nouns, no verbs, then we end up with a URL for every important thing in our site. Those URLs can then be bookmarked and linked easily. We can change code behind the scenes without changing the URLs as the URLs refer to objects that don’t change rather than functions that do.

So if URLs refer to nouns, how do we build any kind of functionality? That’s tied up in something else that Ian was saying a long time ago, when he asked me “What’s the difference between a website and a web API?”. My mental model, building web applications the way we build windows apps, was leading me to consider the UI and the API as different things. Ian was seeing them as one and the same. When I was using URIs to refer to verbs I found this hard to conceptualise, but thinking about URIs as nouns it becomes clearer – that’s what REST is all about. URIs are nouns and the HTTP verbs give you your functionality.

That realisation and others from working on Linked Open Data means I now think they’re one and the same too.

At Talis we’ve done a few projects this way. Most notably our platform, but also Project Cenote some time ago and a few internal research projects more recently. The clearest of these so far is the product I’m working on right now to support reading lists (read course reserves in the US) in Higher Education. We’re currently in pilot with University of Plymouth, here’s one of their lists on Financial Accounting and Reporting. The app is built from the ground up as Linked Data and does all the usual content negotiation goodness. We still have work to do on putting in RDFa or micro-formats and cross references between the html and rdf views – so it’s not totally linked data yet.

What I’ve found is that this approach to building web apps beats anything else I’ve worked with (in roughly chronological order – Lotus Domino, Netscape Application Server, PHP3, Vignette StoryServer, ASP, PHP4, ASP.Net, JSP, PHP5).

The model is inherently object-oriented, with every object (at least those of meaning to the outside world) having a URI and every object responding to the standard HTTP verbs, GET, PUT, POST, DELETE. This is object-orientation at the level of the web, not at the level of a server-side language. That’s a very different thing to what JSP does, where internally the server-side code may be object-oriented, but the URIs refer to verbs, so look more procedural or perhaps functional.

It’s also inherently MVC, with GET requests asking for a view (GET should never cause a change on the server) and PUT, POST and DELETE being handled by controllers. With MVC though, we typically think of that as happening in lots of classes in a single container, like ASP.Net or Tomcat or something like that. This comes from two factors in my experience. Firstly the friction between RDBMS models and object models and secondly the relatively poor performance of most databases. These two things combine to drive people to draw the model into objects alongside the views and controllers.

The result of this is usually that it’s not clear how update behaviour should be divided between the model and the controllers and how display behaviour should be divided between the model and the views. As a result the whole thing becomes complex and confused. That doesn’t even start to take into account the need for some kind of persistence layer that handles the necessary translation between object model and storage.

We’ve not done that. We’ve left the model in a store, in this case a Talis Platform store, but it could be any triple store. That’s what the diagram at the top shows: the model staying separate from views and controllers… and having no behaviour.

A simple example may help: how about tagging something within an application? We have the thing we’re tagging, which we’ll call http://example.com/resources/foo, and the collection of tags attached to foo, which we’ll call http://example.com/resources/foo/tags. An HTTP GET asking for /resources/foo would be routed to some view code which reads the model and renders a page showing information about foo, and would show the tags too of course. It would also render a form for adding a tag, which simply POSTs the new tag to /resources/foo/tags.

The POST gets routed to some controller logic which is responsible for updating the model.

The UI response to the POST is to show /resources/foo again, which will now have the additional tag. Most web development approaches would simply return the HTML in response to the POST, but we can keep the controller code completely separate from the view code by responding to a successful POST with a 303 See Other and a Location of /resources/foo, which will then re-display with the new tag added.

“The response to the request can be found under a different URI and SHOULD be retrieved using a GET method on that resource. This method exists primarily to allow the output of a POST-activated script to redirect the user agent to a selected resource.” rfc2616

This model is working extremely well for us in keeping the code short and very clear.

The way we route requests to code is through the use of a dispatcher: .htaccess in Apache sends all requests (except those for which a file exists) to a dispatcher, which uses a set of patterns to match the URI to a view or a controller, depending on whether the request is a GET or a POST.
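The dispatcher itself is PHP behind Apache; to show the shape of the idea, here’s an illustrative sketch in Ruby/Rack, with made-up view and controller methods, routing GETs to views and POSTs to controllers and answering a successful POST with the 303 described above:

# Illustrative sketch only; the real dispatcher is PHP behind an Apache .htaccess rewrite.
require 'rubygems'
require 'rack'
require 'thin'

class Dispatcher
  ROUTES = [
    # URI pattern,                   GET handler,      POST handler
    [ %r{^/resources/([^/]+)/tags$}, :render_tags,     :add_tag ],
    [ %r{^/resources/([^/]+)$},      :render_resource, nil ]
  ]

  def call(env)
    request = Rack::Request.new(env)
    ROUTES.each do |pattern, get_handler, post_handler|
      next unless request.path_info =~ pattern
      resource = $1
      if request.get? && get_handler
        # GET: view code reads the model and renders it.
        return [ 200, { 'Content-Type' => 'text/html' }, [ send(get_handler, resource) ] ]
      elsif request.post? && post_handler
        # POST: controller code updates the model...
        send(post_handler, resource, request.params)
        # ...then a 303 See Other sends the user agent back to the resource's URI.
        return [ 303, { 'Location' => "/resources/#{resource}" }, [] ]
      end
    end
    [ 404, { 'Content-Type' => 'text/plain' }, [ 'Not Found' ] ]
  end

  # Made-up view and controller methods; in reality these read from and write to
  # the model held in the triple store.
  def render_resource(id)
    "<html>...#{id} and its tags...</html>"
  end

  def render_tags(id)
    "<html>...tags for #{id}...</html>"
  end

  def add_tag(id, params)
    # write the new tag for id into the store
  end
end

Rack::Handler::Thin.run(Dispatcher.new, :Port => 3000)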

Ian has started formalising this approach into a framework he’s called Paget.