Bringing FRBR Down to Earth…

I’ve been looking at FRBR for some time. I’ve written about it and spoken about it. Overall I’ve found it difficult to work with and not really useful in solving the problems of resource discovery.

One of the recurring themes I see when looking at library data in 2009 is that it is centred far too often on the record, usually a MARC21 record. This record-centric view of the world limits much of what is possible, and often it even restricts our very thinking about what might be possible. We are constrained.

I’ve also seen many conversations about FRBR go along a similar route, discussing what exactly qualifies as a work or an expression. Is the movie of the book a new work or just a different expression? The answer is never the same. According to Karen Coyle (who has taught me so much about library data), the abstract concept of Work has reached the point of being a fluid and malleable set of all the things that claim to be part of the work. Reading that, I got really confused. Then, a few weeks ago, reading through several mailing lists and some more old blog posts, it hit me. The answer was right there in the discussion.

Nobody talks about works, expressions and manifestations, so why describe our data that way?

We talk about books and the stories they tell, we talk about how West Side Story is a re-telling of Romeo and Juliet. We talk about DVDs, Blu-Ray Discs and VHS Videos (OK, not so much anymore) and the movies they contain and we talk about the stories the movies tell.

Let’s look at an example and try to reconcile what we see with FRBR.

In FRBR speak (which is probably a squeaky, slightly digital noise) we would say that Wuthering Heights is a Work produced by Emily Bronte. We might have a copy of it in our hands, maybe the Penguin Classics edition (ISBN 978-0141439556). We’d call the thing in our hands an Item. Then, in between Work and Item, we have two levels of abstraction: the Expression, which would be the story as written down in English (nobody’s quite sure where translations fit), and the Manifestation, which would be that particular paperback version from Penguin.

If we add in the terms for the relationships it gets rather prosaic.

Wuthering Heights is a work by Emily Bronte, realized in a written expression of the same name. The written expression is embodied in several different manifestations each of which is exemplified by many items, one of which I hold in my hand.
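
To make that chain concrete, here is a minimal sketch of it as four linked Python classes. The class and field names are my own shorthand for the FRBR terms rather than any official encoding, and the labels on the manifestation and item are purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class Work:
    title: str
    creator: str


@dataclass
class Expression:
    realizes: Work              # the work is "realized" in this expression
    form: str
    language: str


@dataclass
class Manifestation:
    embodies: Expression        # the expression is "embodied" in this manifestation
    label: str


@dataclass
class Item:
    exemplifies: Manifestation  # the manifestation is "exemplified" by this item
    note: str


work = Work("Wuthering Heights", "Emily Bronte")
expression = Expression(work, form="written", language="English")
manifestation = Manifestation(expression, label="Penguin Classics paperback")
item = Item(manifestation, note="the copy in my hand")

# Getting from the thing in my hand back to its author takes three hops
# through classes whose names say nothing about what the thing actually is.
author = item.exemplifies.embodies.realizes.creator
print(author)
```

Nothing in those class names tells you that a book is involved at all, which is the problem.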

I’m being deliberately extreme, I know. Comment below if you think I’m being too harsh or if you understand the FRBR/WEMI model differently.

Here it is in diagrammatic form:


The difficulty I, and I suspect many others, have is that I don’t ever use any of those words. They’re too abstract to be useful. FRBR generalises its model and in that generalisation loses a great deal. Let’s talk about it using more natural language.

Wuthering Heights is a story by Emily Bronte. It was originally published as a novel in 1847 and has subsequently been made into a movie (several times) and re-published in many languages beyond its original English. It has been republished in many editions and as a part of many collections. It features several fictitious people including Catherine Earnshaw and Heathcliff. The author, Emily Bronte, had sisters who authored several other novels, though she authored only this one. Emily Bronte is also the subject of several biographies. I have the paperback in my hand right now.

No works, expressions and manifestations. No items. No abstraction. We can model this more clearly now, at least in my opinion.

[Diagram: Real 01]
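
The natural-language description could be sketched in code along the following lines. This is only one possible reading: the class names (Person, Story, Novel, Paperback) and their fields are my assumptions, picked to match the words used in the paragraph above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    name: str


@dataclass
class Story:
    title: str
    author: Person
    characters: List[str] = field(default_factory=list)


@dataclass
class Novel:
    tells: Story
    first_published: int
    language: str


@dataclass
class Paperback:
    edition_of: Novel
    publisher: str


emily = Person("Emily Bronte")
story = Story("Wuthering Heights", emily, ["Catherine Earnshaw", "Heathcliff"])
novel = Novel(story, first_published=1847, language="English")
in_my_hand = Paperback(novel, publisher="Penguin")

# The model reads back in much the same words we would use in conversation.
description = (
    f"{story.title} is a story by {story.author.name}, "
    f"first published as a novel in {novel.first_published}."
)
print(description)
```

The shape is close to the FRBR chain, but each class now says what the thing is.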

The structure of the model remains broadly the same, but the language allows us to see how it works and classify things more obviously. This has strong similarities to the way Bibliontology is modelled and Bibliontology is very easy to use for its intended purpose – citations.

The more specific nature of the language goes on to pay dividends when we start to add in more data. Wuthering Heights has been made into a movie (several times), and one of the problems often discussed in FRBR circles is whether a movie based on a book is a new work or a new expression. Of course, the choice is a false one: a movie that faithfully reproduces a novel is both an expression of the story told in the novel and a creative work in its own right. While the movie could not exist without the novel it is based on, the art of film-making is a creative act as well. This is a hard thing to model with the four abstract levels defined in FRBR.

Here is the FRBR model showing the movie as an expression of the original work:


This now seems to imply that the movie is somehow a lesser creative work than the original novel and I’m uncomfortable about that, but we do have the relationship between the book and the movie modelled.

The alternative is to recognise the movie as a creative work in its own right in which case the model looks like this:


Now we’ve recognised the movie as a creative work in its own right, but lost the detail that it shares something with the novel. That makes the model less useful.

Using less abstract terms, and more of them, we can model in a way that describes the real-life situation – and hopefully avoid some of the argument, though I’m sure other issues will arise. Adding in the movie using the less abstract terms gives us this:

[Diagram: Real 02]

Now we have the movie recognised as what it is and we have the relationship with the original novel.
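
As a rough sketch of why this works, the movie can carry two separate links: one to the story it tells and one to the novel it was adapted from. The names `tells`, `based_on` and `shares_story` are hypothetical, chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Story:
    title: str


@dataclass
class Novel:
    title: str
    tells: Story


@dataclass
class Movie:
    title: str
    tells: Story                      # the story it shares with the novel
    based_on: Optional[Novel] = None  # the adaptation link, kept separately


def shares_story(a, b) -> bool:
    """True when two creative things tell the very same story."""
    return a.tells is b.tells


story = Story("Wuthering Heights")
novel = Novel("Wuthering Heights", tells=story)
movie = Movie("Wuthering Heights", tells=story, based_on=novel)

print(shares_story(novel, movie))  # the link to the novel's story is kept
print(type(movie).__name__)        # and the movie is still a thing of its own kind
```

Neither link demotes the movie to a variant of the book, and neither is lost.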

I’ve applied the same logic to the physical items. It doesn’t help me to know that something is simply an item; I want to know what it is. So classes of Hardback, Paperback, CD-ROM, Blu-Ray Disc and Vinyl LP would be useful, where currently RDA provides a complex combination of Encoding Format and Carrier Type. RDA’s level of detail is more than likely required for archive and preservation purposes, but for mainstream use of the data a top-level type would be very useful.
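
A minimal sketch of what a top-level type might look like in code follows. The subclasses are taken from the list above; the base class name `PhysicalThing` and the `kind` property are my own inventions for the example.

```python
class PhysicalThing:
    """Anything you can hold in your hand (hypothetical base class)."""

    def __init__(self, title: str):
        self.title = title

    @property
    def kind(self) -> str:
        # The class itself answers the question "what is it?"
        return type(self).__name__


class Hardback(PhysicalThing):
    pass


class Paperback(PhysicalThing):
    pass


class BluRayDisc(PhysicalThing):
    pass


class VinylLP(PhysicalThing):
    pass


shelf = [Paperback("Wuthering Heights"), BluRayDisc("Wuthering Heights")]
kinds = [thing.kind for thing in shelf]
print(kinds)
```

More detailed carrier and encoding data could still hang off each object for those who need it.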

We can add more than movies, though. We can add recordings. Showing my strange taste in music, I’ll start with Wuthering Heights by Kate Bush (and the title nicely gives away where this is going). I shan’t try to model this using FRBR for comparison because I can’t see how to. If you feel you can, then please sketch it out and add it in the comments or email it to me.

I don’t see a practical way in which making Wuthering Heights (the song) an expression of Wuthering Heights (the story) is useful, yet there still exists a relationship between them. The song tells the same story (albeit abridged to 4:29).

[Diagram: Real 03]

Modelling with real-world terminology also allows us to separate the song from the recording and the recording from the album it features on. Perhaps not something we can get to from the data we have today, but a useful feature to have in the model.
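
Here is one way that separation might look, as a sketch only. The class and field names are assumptions of mine, and the album title is supplied from my own knowledge of the record rather than from anything in our catalogue data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Story:
    title: str


@dataclass
class Song:
    title: str
    writer: str
    tells: Story         # the song re-tells the story, in abridged form


@dataclass
class Recording:
    of: Song             # one particular performance of the song
    performer: str
    duration_seconds: int


@dataclass
class Album:
    title: str
    tracks: List[Recording] = field(default_factory=list)


story = Story("Wuthering Heights")
song = Song("Wuthering Heights", writer="Kate Bush", tells=story)
recording = Recording(song, performer="Kate Bush", duration_seconds=4 * 60 + 29)
album = Album("The Kick Inside", tracks=[recording])  # album title is my addition

# Starting from an album, we can ask which stories its tracks tell.
stories_on_album = {track.of.tells.title for track in album.tracks}
print(stories_on_album)
```

A cover version would simply be another Recording pointing at the same Song.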

The richness and utility of modelling come from giving more detail, not less, and from using more specific terms, not more general ones.

The introduction of more specific terms also leads us to write more specific data conversion routines; looking to identify novels, albums, tracks, stories and more. Much of the data will not be mined from our MARC records, but by looking at the specifics we get past much of the variation that is difficult when we try to treat all works, expressions and manifestations the same across all mediums and forms of artistic endeavour.

One of the potential downsides of this approach is an ontology that may explode to contain many classes. While this seems like it is adding detail it is actually just moving detail. RDA documents this as ‘Form of Work’ – ‘A class or genre to which a work belongs.’

If the work belongs to that class, why not model it as that class?
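
To illustrate the difference, here is a small sketch that moves the detail from a value on a generic Work into the class itself. The lookup table and the `promote` function are hypothetical, and the string values stand in for whatever a real 'Form of Work' vocabulary would contain.

```python
from dataclasses import dataclass


@dataclass
class Work:
    title: str
    form_of_work: str  # the detail held as a value on a generic class


@dataclass
class Novel:
    title: str


@dataclass
class Song:
    title: str


# Hypothetical lookup from a form label to a class of its own.
CLASS_FOR_FORM = {"Novel": Novel, "Song": Song}


def promote(work: Work):
    """Return an instance of the specific class if we know one, else the Work."""
    cls = CLASS_FOR_FORM.get(work.form_of_work)
    return cls(work.title) if cls else work


generic = Work("Wuthering Heights", form_of_work="Novel")
specific = promote(generic)
print(type(specific).__name__)
```

The amount of detail is the same either way; it has only moved from a field into the type.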

I know several folks out there have been having a hard time applying FRBR to serials and other things. If you fancy having a go at modelling them with real-world language instead, I’d love to talk to you. Comment below.
