You're not the one and only…

The chorus of Chesney Hawkes‘ song goes “I am the one and only”, a huge pop hit with teenage girls in the 1990s, but what does that have to do with SemTech 2010?

I was in the exhibit space yesterday evening and there was so much really interesting stuff. I had some really great conversations. Talking about storage implementations with Franz and revelytix (and drinking their excellent margaritas), looking at vertical search with Semantifi and having a great discussion about scaling with the guys from Oracle.

A really useful exhibition of some great technology companies in the semweb space.

So why the Chesney reference? Well, several of the exhibitors started out with

we’re the only end-user semantic web application available today

and

we have the first foo bar baz server that does blah blah blah

and

we are the first and only semantic search widget linker

and all I could hear in my head every time it was said was Chesney… “You are the one and only” only they’re not.

For all of the exhibitors that said they were first or only I had serious doubts, having seen other things very similar. Maybe their ‘first’ was very specific — I was the first blogger at SemTech to write a summary of the first two days that included a reference to Colibri…

The problem with these statements is that they are damaging, how much depends on the listener. If the listener is new to the semweb and believe the claim then it makes our market look niche, immature and specialist. If the listener is informed and does not believe the claim it makes your business look like marketeers who will lie to impress people. Either way it’s not a positive outcome. Please stop.

The 18 Mistakes That Kill Startups

when I think about what killed most of the startups in the e-commerce business back in the 90s, it was bad programmers. A lot of those companies were started by business guys who thought the way startups worked was that you had some clever idea and then hired programmers to implement it. That’s actually much harder than it sounds—almost impossibly hard in fact—because business guys can’t tell which are the good programmers. They don’t even get a shot at the best ones, because no one really good wants a job implementing the vision of a business guy.

from The 18 Mistakes That Kill Startups by Paul Graham.

Agile Coach Code of Conduct – What To Fix

A while back I started on an Agile Coach Code of Conduct. I noticed that after coaching for a while I started to forget basic principles that should be part of every coaching engagement.

So I put this list together to help me (and others) remember what coaching is all about.

Like everything in agile, it’s an ideal, not something you can ever perfectly do.

  1. I will always remember that my teams are full of intelligent professionals acting in the best way that they can for project completion
  2. I will not direct or follow, but lead the team by example, which means walking a fine line between participation and observation
  3. I will always remember that I am a guest on each team, and will strive to behave respectfully to my hosts as much as I can

read the full list of Daniel Markham’s 12 principles at Agile Coach Code of Conduct – What To Fix.

Coghead closes for business

With the announcement that Coghead, a really very smart app development platform, is closing its doors it’s worth thinking about how you can protect yourself from the inevitable disappearance of a service.

Of course, there are all the obvious business type due diligence activities like ensuring that the company has sufficient funds, understanding how
your subscription covers the cost (or doesn’t) of what you’re using and so on, but all these can do is make you feel more comfortable – they can’t provide real protection. To be protected you need 4 key things – if you have these 4 things you can, if necessary, move to hosting it yourself.

  1. URLs within your own domain.
  2. Both you and your customers will bookmark parts of the app, email links, embed links in documents, build excel spreadsheets that download the data and so on and so on. You need to control the DNS for the host that is running your tenancy in the SaaS service. Without this you have no way to redirect your customers if you need to run the software somewhere else.

    This is, really, the most important thing. You can re-create the data and the content, you can even re-write the application if you have to, but if you lose all the links then you will simple disappear.

  3. Regular exports of your data.
  4. You may not get much notice of changes in a SaaS service. When you find they are having outages, going bust or simply disappear is not the time to work out how to get your data back out. Automate a regular export of your data so you know you can’t lose too much. Coghead allowed for that and are giving people time to get their data out.

  5. Regular exports of your application.
  6. Having invested a lot in working out the write processes, rules and flows to make best use of your app you want to be able to export that too. This needs to be exportable in a form that can be re-imported somewhere else. Coghead hasn’t allowed for this, meaning that Coghead customers will have to re-write their apps based on a human reading of the Coghead definitions. Which brings me on to my next point…

  7. The code.
  8. You want to be able to take the exact same code that was running SaaS and install it on your own servers, install the exported code and data and update your DNS. Without the code you simply can’t do that. Making the code open-source may be a problem as others could establish equivalent services very quickly, but the software industry has had ways to deal with this problem through escrow and licensing for several decades. The code in escrow would be my absolute minimum.

SaaS and PaaS (Platform as a Service) providers promote a business model based on economies of scale, lower cost of ownership, improved availability, support and community. These things are all true even if they meet the four needs above – but the priorities for these needs are with the customer, not with the provider. That’s because meeting these four needs makes the development of a SaaS product harder and it also makes it harder for any individual customer to get setup. We certainly don’t meet all four with our SaaS and PaaS offerings at work yet, but I am confident that we’ll get there – and we’re not closing our doors any time soon ;-)

Cause and Effect

For those reading outside of England…

We’ve just had the first run on a bank in 150 years – Northern Rock has been forced to borrow billions of pounds from the Bank of England. It’s been really interesting to watch.

But what’s really interesting is the discussion of blame. Here are the reasons why the Northern Rock went down according to their senior management:

  • The Bank of England refused an earlier rescue loan
  • Another major bank backed out of buying/bailing them out
  • The wholesale money markets closed
  • The BBC announced that they needed an emergency loan before they had an official comment ready

What’s notable about these are that they are not the reason Northern Rock went down. That’s plain and simple that they had an imbalance between how much money they had in deposits and how much they had tied up in non-liquid assets. This happened because there was an imbalance in priorities throughout Northern Rock culture that led to them being great at lending money and not as focussed on bringing it in.

The reason I find this interesting is that it ties in with other thoughts I’ve been having about root causes.

The things that the Northern Rock managers are bringing up in their defense are all things that didn’t help, and maybe if some or all of them had been different there wouldn’t have been a run on the bank – maybe.

We’ve been talking at work about why nobody seems to really effectively achieve code re-use and several reasons come up again and again. Deadline; too hard; component not good enough to re-use and so on.

But, as with Northern Rock, I think the main reason nobody really effectively achieves code re-use is because there’s an imbalance in priorities throughout our industry that leads us to be great at producing new functionality and not as focussed on sharing capabilities.

Technorati Tags: ,

Open Data Licensing

Back at the end of September we finally got to the point of releasing the first draft of the Open Data Commons License. This is work I’ve been involved in since Ian’s first draft of the TCL about a year and a half ago.

It’s great to see this license come to fruition, having argued about the need for this more than once.

It’s interesting to see the conversation happening around LibraryThing’s Common Knowledge and the Open Library project. Both of these are collections of factual data, I’ve been speaking to people involved in both and both have a clear desire to protect the data and ensure that it’s available for the community into the future.

Licensing is critical to that – as I said in Banff (listen) at the start of the year.

Back then we were concerned with navigating the difference in protection afforded to database in the EU and the US. In essence, databases have protection in the EU, but have no protection on the US. The reason we were looking at that was because the natural thinking goes something like this:

Creative Commons extends Copyright to allow you to easily position yourself on the spectrum of ‘All Rghts reserved’ to ‘Public Domain’.

Therefore Open Data Commons must need to extend a Database Right to allow to position your data on the same spectrum.

Well, the Open Data Commons license gets around that by being couched in contract law. This seems like a great way to license data for open use and prevent it being locked away in future.

With all that’s been going on then, it’s no surprise that I missed the Model Train Software case that could have a big impact on how Open-Source software licenses are drafted. A San Francisco judge ruled that the Artistic License was a contract – meaning that breach of the license did not necessarily mean infringing the copyright. That changes the legal redress and potential penalties available for breaching a license.

Interesting.

Technorati Tags: , , , ,

This post originally appeared on Talis’ Nodalities blog.

The first steps of the Semantic Web are now a short distance behind us and some organisations are starting to pick up the pace. With more and more data coming online, marked up for linking and sharing in a web of data, perhaps it’s time to look again at the trade-off of different intellectual property rights.

Back in November of 2004 James Boyle published A Natural Experiment in the Financial Times. This piece sees him debating the merits of intellectual property rights over data with Thomas Hazlett and Richard Epstein. His primary thrust is that we should be making policy decisions in this area based on empirical data about the economic benefits one way or another. Something all three protagonists agree on.

Much has changed between 2004 and now, not least our understanding of how the web can affect the way we collaborate, share, communicate; it fundamentally affects the way we live. We chat, we blog, we Twitter, we Flickr and we Joost. Content flows from person to person in unprecedented ways and at unprecedented speeds. This changes the nature of the experiment that Boyle talks about.

If the database right were working, we would expect positive answers to three crucial questions. First, has the European database industry’s rate of growth increased since 1996, while the US database industry has languished? […] Second, are the principal beneficiaries of the database right in Europe producing databases they would not have produced otherwise? […] Third, […] is the right promoting innovation and competition rather than stifling it?

Boyle’s first two questions centre around the creation of databases and his third, by his own admission, is difficult to measure. If one of our primary goals for the growth of the Internet is to have a web of data that can be linked and accessed across the globe we may be better served by assessing how companies might make data open.

Boyle asks for, and discusses, the empirical evidence of databases being created in the EU and US. The differences in numbers should provide insight into the economic ups and downs as the EU adopted a robust database right in 1996 while the US ruled against such protection in 1991. I am interested in how we expect the growth of data on the Semantic Web to differ in the two jurisdictions.

Boyle explains that the US Chamber of Commerce oppose the creation of a database right in the US

[The US Chamber of Commerce] believe that database providers can adequately protect themselves with contracts, technical means such as passwords, can rely on providing tied services and so on.

And therein lies the rub. Without appropriate protection of intellectual property we have only two extreme positions available: locked down with passwords and other technical means; or wide open and in the public-domain. Polarising the possibilities for data into these two extremes makes opening up an all or nothing decision for the creator of a database.

With only technical and contractual mechanisms for protecting data, creators of databases can only publish them in situations where the technical barriers can be maintained and contractual obligations can be enforced.

We don’t tolerate this with creative works, our photographs, our blog posts and so on. Why would we expect it to make sense for databases? Whether or not it makes sense comes down to whether or not it is beneficial to society. We allow Copyright in order to provide adequate remuneration to be collected by the creator of a work. We allow patents to allow the recovery of development costs for an invention. Which is database right more like?

Patent is a very broad monopoly. If I had a patent on the clock, a mechanical means of measuring the passing of time, nobody else would be able to make clocks. Copyright, on the other hand is much narrower only allowing me to protect the specific design of my clocks. This is where it can get confusing with databases. Database right in the EU is like Copyright. It is a monopoly, but only on that particular aggregation of the data. The underlying facts are still not protected and there is nothing to stop a second entrant from collecting them independently.

Richard Epstein points to this in his contribution

The question is why do databases fall outside [the general principle of copyright], when the costs of compilation are in many cases substantial for the initial party and trivial for anyone who receives judicial blessing to copy the base? In answering this question, it will not do to say, as the Supreme Court said in the well known decision in Feist Publications v. Rural Telephone Service, (1991) that these compilations are not “original” in the sense that it requires no thought to check the spelling of the entries and to put them all in alphabetical order. But that obvious point should be met with an equally obvious rejoinder. If it requires no thought or intelligence to put the information together, then why not ask the second entrant into the market to go through the same drudge work as the first.

This is exactly what we see happening with Open Street Map. Ordnance Survey in the UK have rights over the map data they have collected. The protection covers the collection of geospatial data that they have created, they are not granted a monopoly in geospatial data.

This leaves a special case of databases, those which are created at low cost as a by-product of normal business. Examples used in Boyle’s article are telephone numbers, television schedules and concert times. Boyle gives us the answer directly

the [European] court ruled that the mere running of a business which generates data does not count as “substantial investment” enough to trigger the database right.

This reminds me strongly of The Smell of Food and the Sound of Coins a folk tale in which a wise judge decides that a restaurateur may charge for the smell of food wafting from his restaurant, however the appropriate price is the sound of coins chinking together.

That a database right may not and should not apply in all cases, and that there is a requirement to restrict anti-competitive practices, does not necessarily extend to the conclusion that a right is not required.

It seems to me that much of the debate around intellectual property rights has focussed on how they are used to keep things closed. Having suggested earlier that we have only the abilities to keep databases locked away or in contrast open them completely, I’d like to consider what it might mean to have a database right for keeping things open.

In response to Thomas Hazlett’s contribution Boyle asks

How many databases are now created and maintained entirely “free” and thus escape commercial directories altogether? There are obviously many, both in the scientific and the consumer realm. One can no more omit these from consideration, than one can omit free software from the software market.

This strikes me as a great comparison to consider. Taking one of the most prevalent free software licenses, the Gnu Public License, what might that look like for data?

One of the primary functions of the GPL is that it enforces Copyleft – the requirement to license derivative, and even complimentary, works under an the same license. That is, any commercial software that makes use of GPL code must, under the terms of the license, also be released under the GPL. The viral nature of this license is possible only because of the backing of Copyright.

Without a database right communities have no mechanism to publish openly and still insist upon this kind of Share-Alike agreement.

Consider the impact of this for situations where you you might use the idea of promiscuous copying to maintain the availability of data. Promiscuous copying relies on two things, lots of copies being made and lots of copies being available. Without the necessary licensing in place there is no mechanism with which to compel those who have copies to make those available. Public Domain means, by definition, no restriction – that means I can lock it away again.

Copyleft is just one position along a spectrum where ‘locked away’ and ‘free as a bird’ sit at each end. What the web shows us is that other business models form crucial parts of the eco-system. Epstein picks up on the controlling aspect of Boyle’s argument:

They can control their list of subscribers; give them each passwords; charge them based on the amount of the information that is used, or some other agreed-upon formula; and require them not to sell or otherwise transfer the information to third parties without the consent of the data base owner.

Imagine if this were true of Copyright material on the web? It has been, and still is on the occasional site. But mostly copyright owners are starting to see the value of publishing content online and they are underpinning the delivery of that content to consumers with other business models. Without Copyright the types of business that could participate would be reduced.

Epstein goes on to say:

The contractual solution is surely preferable, because general publication will allow for use by others that may not offend the copyright law, but which will block the possibility of payment for the costly information that is supplied.

And again, the very heart of the matter. If we are to encourage those who have large databases to make them open, to post them on the Semantic Web, we must provide them with models and solutions that are preferable to technical barriers and restrictive contracts. Allowing them to pick their own position on the spectrum seems to me to be a necessity in that. You can see any form of protection in two lights. When Boyle says

They make inventors disclose their inventions when they might otherwise have kept them secret.

I say

They allow inventors to disclose their inventions when they might otherwise have had to keep them secret.

That’s why we’ve invested in a license to do this, properly, clearly and in a way that stays Open.

Rob Styles is Programme Manager for Data Services at Talis, a UK company building Semantic Web technologies. Rob Styles is not a lawyer.

Technorati Tags: , ,

Spolsky on VBA for Mac Office

Joel’s talking about Microsoft’s withdrawal of VBA in the Office Suite for Mac users.

Joel starts by explaining why VBA was strategic for MS:

However, it was seen as extremely “strategic.” Here’s what that meant. Microsoft thought that if people wrote lots and lots of VBA code, they would be locked in to Microsoft Office.

As Joel was on the Excel team at the time you’d expect this to be an accurate reflection of what was happening. It makes sense, Office is the big cash cow and always has been.

Towards the end Joel says:

they’re effectively making it very hard for many Mac Office 2004 users to upgrade to Office 2008, forcing a lot of their customers to reevaluate which desktop applications to use.

I’m guessing the outcome MS are hoping for is different; Apple have made great in-roads into compatibility with Windows, a lot of hard work of their own but also a move to internet and intranet approaches have helped shrink the gap. To the point where Macs are showing up more and more at conferences, in businesses and in homes. The removal of support for VBA on mac is unlikely to push mac users in business to run a different office suite, it’s far more likely to push their corporate overlords to move them back to Windows, or run Windows through Parallels or Bootcamp – either way MS get a mac customer back onto windows.

Sad.

Offshoring…

Over at Virtual Chaos, Nad’s been rambling about offshoring and outsourcing.

The problem though, is that writing code isn’t something you can translate into an assembly line. What I think the people pushing this type of outsourcing failed to comprehend, and seemingly still dont understand is that farming out development overseas doesn’t lead to innovation. … Someone famously once said every line of code is a design decision, I’m struggling to remember who it was [insert clever guys name here]. But that single statement embodies for me what the real problem is with outsourcing projects abroad.

I don’t really see this as about innovation directly. I think it’s all about design. There are plenty of software projects that aren’t about innovation; they’re about cost-reduction or about refreshing technologies (which is generally about cost-reduction) or about well, cost-reduction. Most of what IT departments in enterprise are asked to do is cost reduction – hence most of what enterprise will outsource is also about cost-reduction. Innovation isn’t really a factor.

The indirect quote that Nad references about code being design could well have been from Code as Design: Three Essays by Jack W. Reeves. Jack’s discussions parallel what Nad is saying; that writing code is the act of designing something, not the act of manufacturing something. “Tooling up” is done by the compiler and manufacture is when you press lots of CDs, or on-demand when folks download your RPM or MSI.

The answer you arrive at about outsourcing (and offshoring is just outsourcing with extra big communication barriers and some cultural differences thrown in) will depend on what part of the software universe you feel you’re in. For us, we write commercial software that we sell and maintain for a number of years, the quality of code is important.

But that’s not always the case for everyone. Say you have a legacy application that was not well-written to start with, it’s built on top of old, unsupported languages (say, Java 1.2?) and you need to keep running it with minor changes for a few more years. There is no innovation to be done. The design decisions in-the-small aren’t that important to you as they’ll be better than the ones made last time! Your team in the office are de-moralised by the very mention of the application’s name and you’re over-stretched on new projects that are innovative… Surely that’s a contender for off-shoring?