Twitter Updates for 2008-09-15

  • back to work after the weekend after 3 days away on school trip with J8. 8 year olds are cool, but not too cool to talk to you :-> #
  • ugh, firefox just lost keyboard input. haven’t seen that bug for a while. #
  • right, email caught up, time for another cuppa #
  • Running 3 OSs simultaneously on MacBook Pro with 4GB, runs fine – Windows XP, Ubuntu and OSX (of course) – nice separation of dev and office #

Throttling Flickr Uploads

I’m part way through uploading a substantial backlog of photos (11,000+) to Flickr.

I had started out trying to go through them and decide which were worth uploading and which weren’t, but that approach was taking far too much time. I needed some help. So I decided that I’d upload all of them and open them up to my family to help with the sorting – asking them to tag the photos with the people in them and also use tags like "TODELETE" to mark those that are just noise and should be thrown out.

I could have opened the net wider, but there are photos of my kids and other people’s kids in there so just family it is.

I’m using phpFlickr to batch upload photos to Flickr and my script uploads as fast as the bandwidth will allow, though admittedly single-threaded. This meant that I couldn’t use it at work, at least not with a clear conscience, so have been uploading from home only. Those 8+ hours a day in the office have been bugging me though, so I was looking for a way to have photos uploading without having a detrimental impact on our connection.

A little googling found an article on bandwidth throttling in OSX that showed the basics of using ipfw to limit transfer rates. A bit of tweaking and I ended up with:

sudo ipfw pipe 1 config bw 128KByte/s
sudo ipfw add 1 pipe 1 dst-ip 68.142.214.24

This limits the upload traffic to api.flickr.com to 128KB/s and means I’m not going to cause anyone a problem.
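
The rule above hard-codes a single IP address. As a hedged sketch only, here is a little Python that builds the same two kinds of ipfw command for however many addresses the host resolves to. The function names and the idea that api.flickr.com might resolve to more than one address are my assumptions; the ipfw syntax is copied straight from the commands above.

```python
import socket


def resolve_ipv4(host):
    """Return the sorted, de-duplicated IPv4 addresses a hostname resolves to."""
    infos = socket.getaddrinfo(host, 443, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def ipfw_commands(addresses, rate="128KByte/s", pipe=1, first_rule=1):
    """Build the command lines: configure one pipe, then add one rule per address."""
    commands = [f"sudo ipfw pipe {pipe} config bw {rate}"]
    for offset, address in enumerate(addresses):
        rule = first_rule + offset
        commands.append(f"sudo ipfw add {rule} pipe {pipe} dst-ip {address}")
    return commands


# Using the address from the post; resolve_ipv4("api.flickr.com") would
# look the addresses up instead (needs a network connection).
for line in ipfw_commands(["68.142.214.24"]):
    print(line)
```

It only prints the commands rather than running them, so nothing changes on the machine until you paste them into a terminal yourself.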

Sweet.

Instructional Code and Modelling Code

Styles of coding interest me enormously and it’s many years since I accepted the notion that

Any fool can write code that a computer can understand. Good programmers write code that humans can understand. ~Martin Fowler

Writing code is most definitely about writing for other people to read, whether other members of your team or people somewhere down the line, maybe long after you’ve moved on. But the number of languages that we have available for coding is just one sign that there isn’t a ‘best’ way of doing everything, just many different trade-offs that fit better or worse for any particular task. What is clear, though, is that languages have evolved since punch cards.

We have procedural languages (C, Pascal), functional languages (Erlang, Haskell, XSLT), object-oriented languages (C++, Objective-C, Java, C#) and logic languages (Prolog) to name several paradigms, but by no means all. People often talk about C still being the best tool for writing performant code as it remains the closest to the underlying hardware – C also remains in common use for work in embedded applications.

So why all the different types of language? The main reason as I see it is that they represent different ways of thinking about problems; they allow you to describe solutions in ways that match your model of thinking. In OO languages this approach has been taken to some heights through the use of design patterns – documented ways of combining objects to solve a problem in a particular way of thinking. In other types of languages there are also idiomatic practices and norms that match the way that language community thinks. Ever wondered why OO Perl never really got popular? It’s because most people writing Perl don’t think about solving problems in terms of objects.

Different languages can take this to extremes in different ways. Brainfuck, for example, presents a model of a sequence of bytes and a data pointer. Solving problems with only a byte array and increment/decrement functions is, in this case, deliberately obtuse; and the syntax of the language has been chosen to make it even more so.

In my mind I’ve always separated out ‘higher level’ languages from, say, assembler. But, of course, they’re really just different ways of thinking, ways of modelling a solution. We often talk about the evolution of languages and speak of more modern languages as being better than older languages as if in some Darwinian competition. Taken outside of the context of time, though, they simply form a series of different models. Sure, some of them build upon the concepts of others, but not always in ways that improve upon them. Assembler is no less a modelling language than Java or Smalltalk; it just so happens that the modelling you do in assembler uses the same conceptual model as the underlying hardware of the machine.

Recognising how best to describe a solution (in code) so that future readers can understand clearly what is happening seems to be at the core of the skills programmers need to hone, and different models will be applicable at different times. For example, Repenning describes the inappropriate use of a “naive” object model in his paper Collaborative Diffusion: Programming Antiobjects (pdf, 2.7MB). Introducing the term Antiobjects, he talks about how responsibilities can be distributed differently to initially obvious approaches, gaining a much more efficient running system as well as a simpler implementation.

The modelling aspects of languages provide ways to group or separate different concerns, but underlying it all is the need to do something. There’s a reason that BASIC stands for Beginner’s All-purpose Symbolic Instruction Code. I recently came across some code that had forgotten this. The author had applied the Observer pattern perfectly: objects that did the work subscribed to another object that monitored the file system for changes. The main() method simply constructed the objects, wired them together and said “Go!”. The net effect of this, however, was that anyone coming to the code had to form a complete mental model of how all the objects were going to interact before they could predict the sequence of things that would get done. The pattern, while elegant and properly implemented, made the code harder to understand. What I needed, as a reader, was a simple sequence of instructions – the flow of the application.
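
To make the contrast concrete, here is a small sketch in Python. It is not the code I came across: FileWatcher, the "parse" and "index" steps and both function names are invented for illustration. Both versions do the same work; the difference is where a reader has to look to find the order of events.

```python
class FileWatcher:
    """Toy stand-in for an object that monitors the file system (hypothetical)."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def file_changed(self, path):
        # The order of work is decided here, by subscription order,
        # not anywhere the reader of main() can see it.
        for callback in self._subscribers:
            callback(path)


def run_with_observers(changed_paths):
    """Modelling style: construct the objects, wire them together, say Go."""
    log = []
    watcher = FileWatcher()
    watcher.subscribe(lambda path: log.append(f"parse {path}"))
    watcher.subscribe(lambda path: log.append(f"index {path}"))
    for path in changed_paths:
        watcher.file_changed(path)
    return log


def run_as_sequence(changed_paths):
    """Instructional style: the flow of the application, top to bottom."""
    log = []
    for path in changed_paths:
        log.append(f"parse {path}")
        log.append(f"index {path}")
    return log


print(run_with_observers(["notes.txt"]))
print(run_as_sequence(["notes.txt"]))
```

With two subscribers the first version is still easy enough to follow; the cost shows up as the number of objects and subscriptions grows.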

So, as my eight year old starts to ask if he can learn how to make the computer do stuff I’m wondering if he should start with modelling code first or instructional code first – or if the distinction is even valid. Maybe I should start him off with Flash…

Cryptography Challenge…

Cory Doctorow asked Bruce Schneier to give him a hand designing wedding rings. Not an obvious combination until you realise these are crypto rings…

There are two great discussions going on over at both blogs. Cory has asked his crowd to help design a cipher for his crypto wedding rings, while Bruce simply said Contest: Cory Doctorow’s Cipher Wheel Rings.

The discussion on both posts is worth reading. A mixture of things is popping up about the similarity between the three rings and the Enigma machine, as well as comments about Jefferson’s Wheel Cipher.

Like most things Cory does (or says) there’s an element of the slightly bizarre. The prize is a not-to-be-sniffed-at signed copy of Little Brother.

The full set of photos is on Cory’s Flickr account, tagged weddingring.

Comparisons with the Enigma machine, I suspect, are bogus. While there is a visual similarity with the Enigma’s wheels, the Enigma’s cipher was implemented in the electrical wiring within the machine; the letters on the rotors simply enabled the correct starting positions to be selected. The Enigma machines perform a substitution cipher, but with the additional complexity that the substitution pattern changes for each letter through the message. I don’t see a way to do that with these rings. There may be rotor ciphers that could be implemented – I don’t know.

Jefferson’s cipher is a much closer match: a fully manual system consisting of 36 wheels with the alphabet scrambled differently on each one. Similar to the Enigma machine, sender and receiver had to have the order of the wheels synchronised and each letter would use a different substitution scheme, though Jefferson’s is not as thorough as the Enigma.
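
For anyone who hasn’t met a wheel cipher, here is a rough Python sketch of the idea. It is a simplification built on assumptions of mine: a seeded shuffle stands in for the agreed scrambling of each wheel, the wheel order is just the list order, and both sides are assumed to know which row to read (the offset). All the names are made up for the example.

```python
import random
import string

ALPHABET = string.ascii_uppercase


def make_wheels(count, seed):
    """Build `count` wheels, each a differently scrambled alphabet.

    random.Random(seed) is only a stand-in for a shared secret scrambling;
    it is not a cryptographically secure source of randomness.
    """
    rng = random.Random(seed)
    wheels = []
    for _ in range(count):
        letters = list(ALPHABET)
        rng.shuffle(letters)
        wheels.append("".join(letters))
    return wheels


def read_row(wheels, text, offset):
    """Line `text` up along the wheels, then read the row `offset` places round."""
    if len(text) > len(wheels):
        raise ValueError("message is longer than the number of wheels")
    return "".join(
        wheel[(wheel.index(ch) + offset) % len(wheel)]
        for wheel, ch in zip(wheels, text)
    )


def encrypt(wheels, plaintext, offset):
    return read_row(wheels, plaintext, offset)


def decrypt(wheels, ciphertext, offset):
    return read_row(wheels, ciphertext, -offset)


wheels = make_wheels(count=10, seed=2008)
secret = encrypt(wheels, "WEDDINGS", offset=5)
print(secret, decrypt(wheels, secret, offset=5))
```

Because every wheel is scrambled differently, the two Ds in WEDDINGS will usually come out as different letters, which is exactly what wheels with the alphabet in order cannot give you.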

As the rings cannot be altered and the alphabet is in order on all three wheels, any attempt that results in one character of cipher text for each character of plain text will be a simple substitution cipher. While it may take several complex steps to arrive at the cipher character it will only take an attacker one step to go back.
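
The simplest case is turning one ring against another, which is just a shift of the alphabet. A quick Python sketch (the function names and the message are mine, and a plain rotation is only one of the schemes people have suggested) shows how little work the attacker has to do:

```python
import string

ALPHABET = string.ascii_uppercase


def shift(text, places):
    """Move each letter `places` positions round the alphabet; leave other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + places) % 26] if ch in ALPHABET else ch
        for ch in text
    )


def all_rotations(ciphertext):
    """Everything an attacker needs to try: one candidate per ring position."""
    return [shift(ciphertext, -places) for places in range(26)]


secret = shift("MARRY ME", 7)
for candidate in all_rotations(secret):
    print(candidate)
```

One of the 26 lines printed is the original message, and it is the only one that reads as English.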

So, if you’re thinking about this problem seriously there are some things you have to decide on first…

  1. Is the ring considered secret or not?

    This isn’t an unreasonable assumption (putting aside that the details have been published online). It’s not that long ago that messages were transferred in plain text relying only on the emperor’s seal – made in wax with a ring only he carried.

  2. Can you include another secret?

    There are suggestions on the blogs of using most recent blog posts, first pages of known books and other items as keys to drive the cipher. This then involves taking the character from the key and the character from the plaintext and some form of mathematical computation (shifting rings up or down, finding the next dot above or below, that kind of thing) to arrive at the cipher text character.

  3. Is the algorithm secret?

    Knowing Bruce’s views on secrecy and security, even suggesting it is pure heresy. Considering the ring to be secret may be part of this, or may not. Some of the ideas I’ve had fall outside being encryption and really fall into the realm of a ‘secret encoding’. But hey, something has to be secret and if it can’t be the ring, or the key, then maybe it has to be the algorithm.
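
The second question, combining a key character with a plaintext character, can be sketched in a few lines of Python. This is one possible reading of those suggestions, not anyone’s actual entry: I’ve assumed the "computation" is simply adding the key letter’s position to the plaintext letter’s position (a Vigenère-style shift), and the key text here is a placeholder for a book page or blog post.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase


def _combine(text, key, sign):
    """Shift each letter of `text` by the position of the next key letter."""
    key_letters = [k for k in key.upper() if k in ALPHABET]
    if not key_letters:
        raise ValueError("key must contain at least one letter")
    stream = cycle(key_letters)  # repeat the key if the message is longer
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            k = next(stream)
            out.append(ALPHABET[(ALPHABET.index(ch) + sign * ALPHABET.index(k)) % 26])
        else:
            out.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(out)


def encrypt(plaintext, key):
    return _combine(plaintext, key, +1)


def decrypt(ciphertext, key):
    return _combine(ciphertext, key, -1)


key = "LITTLE BROTHER"  # placeholder key text
secret = encrypt("MEET AT NOON", key)
print(secret, "->", decrypt(secret, key))
```

The security here comes entirely from the key: the longer and less guessable it is, the less the in-order alphabet on the rings matters.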

Then, of course, you have to decide what to do with the rings. Any cryptographic algorithm fulfils one of four basic purposes:

  1. Symmetric Encryption

    These algorithms use the same key to encrypt and decrypt the text. They may use a single algorithm, like ROT13, or they may use a matched pair of algorithms, like many other substitution ciphers.

  2. Asymmetric Encryption

    These algorithms use one key to encrypt and another to decrypt. The keys in this case are paired and are usually termed public and private keys. Typically you would use the recipient’s public key to encrypt and they would use their own private key to decrypt.

  3. Non-Decryptable Hashes

    Used mostly for storing passwords (I can’t think of another use), these algorithms enable you to reliably convert plain text into a hash with little possibility of reversing the process. For passwords this means you store the hash of the password, then compare the hashed version of any sign-in attempt with the stored hash.

  4. Signing

    Signing means adding some kind of addendum to the message that confirms you wrote it. Again this is done using public/private key pairs. You use your private key to sign a hash of the message, producing a signature which others can then verify using your public key.
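
Of the four, the password case (item 3) is the easiest to show with nothing but Python’s standard library. Treat this as a sketch under assumptions: the per-user salt, the choice of PBKDF2 with SHA-256 and the iteration count are all mine, picked for illustration rather than as recommendations.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor, not a recommendation


def hash_password(password, salt=None):
    """Return (salt, digest) for storage; the password itself is never stored."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per password (my addition)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def check_password(attempt, salt, stored_digest):
    """Hash the sign-in attempt the same way and compare it with the stored hash."""
    _, digest = hash_password(attempt, salt)
    return hmac.compare_digest(digest, stored_digest)


salt, stored = hash_password("correct horse")
print(check_password("correct horse", salt, stored))
print(check_password("wrong guess", salt, stored))
```

Nothing in check_password ever recovers the original password; it only repeats the one-way step and compares the results.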

As well as thinking about all of that good stuff it might be worth looking for clues in the design of the rings. Bruce must have had something in mind when designing the rings.

Here are the obvious things to notice:

  1. All three rings feature the alphabet in order.
  2. The dot patterns are not random.
  3. The dot pattern follows a 1, 2, 3 pattern.
  4. The dot pattern is not unique (it repeats) when looking across the three rings.

Less obvious:

  1. The S across three rings, looking at the dots above, makes dot, dot, dot while the O across the dots on top is three blanks (dash, dash, dash?). This made me go and look at Morse code again.

Yep, that’s all I spotted :-(

I’ll be chatting with a couple of colleagues to see if we can put our heads together and also watching to see what the winner comes up with.

If I owed you a thousand dollars…

would you still be my friend?

The opening line of Someone Like That by Deblois (pronounced by saying all the letters, it rhymes with "choice" or "rolls royce"). A rhythmic, mellow natural sound that just seems to melt through the air.

Deblois says of herself:

She plays acoustic roots music; soulful original tunes with an eye for the universal and is backed by the wonderfully funky Big House Band.

This description really doesn’t do her justice. Last time I felt this good about discovering new music was a few years ago when I had the good fortune to stumble across Amos Lee playing live. I bought the album he was selling at the show and have listened to it over and over.

I get the same feeling I got from that album listening to Leviathan.

So, where did I find Deblois? Last.fm? A close friend? Nope. Luis Villa. Deblois is Luis Villa’s sister.

 

You may think this is boring but…

About a year ago I stole a book. Stole is definitely the right word as I took without asking. In fact the owner probably still doesn’t know I’ve got it. I’ll return it today – with an apology.

I shan’t name the owner as the book in question, and others like it, carry a certain stigma. A stigma that the reader might be someone not terribly interesting; the kind of person who might, in a different context, show his collection of stamps or dead butterflies.

The book in question is Simon Stokes’ Digital Copyright Law and Practice.

Not everyone wants or needs to know why A & M Records Inc v Napster Inc was an important case, or how Feist Publications v Rural Telephone Service Co has a day-to-day impact on what they can do on the web. For those publishing on the web and/or consuming things others have published, some knowledge is a useful thing.

This book, however, is not a useful summary for those wanting a quick insight into the rights and wrongs of what they, or their users, may be doing. This book is a quite in-depth discussion of the state of protection offered to data and creative works in different jurisdictions along with explanations of the relevant precedent setting cases.

While Lessig et al. talk about how Copyright reform is so desperately needed, Stokes simply summarises the protections currently available in an objective and non-judgemental fashion, covering Copyright, Compilation Rights and Database Rights. Yes, he does talk about why Copyright is broken in an internet era, but he does not present his own preferred alternative, choosing instead to describe what he sees as the crossroads Copyright currently stands at.

The licenses that we use every day – GPL, LGPL, Apache Licenses, Creative Commons and Open Data Commons – all rely on the law to underpin them. A sound understanding of those laws can really help in understanding which license really meets your needs. That sound understanding can be acquired by reading Stokes’ book. If you can keep your eyes open long enough…