Other Technical
Throttling Flickr Uploads
I’m part way through uploading a substantial backlog of photos (11,000+) to Flickr.
I had started out trying to go through them and decide which were worth uploading and which weren’t, but that approach was taking far too much time. I needed some help. So I decided that I’d upload all of them and open them up to my family to help with the sorting - asking them to tag the photos with the people in them and also use tags like "TODELETE" to mark those that are just noise and should be thrown out.
I could have opened the net wider, but there are photos of my kids and other people’s kids in there so just family it is.
I’m using phpFlickr to batch upload photos to Flickr, and my script uploads as fast as the bandwidth will allow, though admittedly single-threaded. This meant that I couldn’t use it at work, at least not with a clear conscience, so I have been uploading from home only. Those 8+ hours a day in the office have been bugging me though, so I was looking for a way to have photos uploading without having a detrimental impact on our connection.
A little googling found an article on bandwidth throttling in OS X that showed the basics of using ipfw to limit transfer rates. A bit of tweaking and I ended up with
sudo ipfw pipe 1 config bw 128KByte/s
sudo ipfw add 1 pipe 1 dst-ip 68.142.214.24
This limits the upload traffic to api.flickr.com to 128KB/s and means I’m not going to cause anyone a problem.
Sweet.
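If ipfw isn’t an option, the same idea can be approximated inside the uploading script itself. What follows is only a minimal Python sketch of application-level throttling, not what I actually ran: `throttled_send` and its parameters are made-up names, and `send` stands in for whatever writes bytes to the network.

```python
import time

def throttled_send(data, send, rate_bytes_per_s, chunk_size=16 * 1024,
                   clock=time.monotonic, sleep=time.sleep):
    """Pass data to send() in chunks, pausing so the average rate stays under the limit."""
    start = clock()
    sent = 0
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        send(chunk)
        sent += len(chunk)
        # Earliest moment at which `sent` bytes are allowed to have gone out.
        earliest = start + sent / rate_bytes_per_s
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)
    return sent
```

Called with `rate_bytes_per_s=128 * 1024` this aims for the same 128KB/s ceiling as the pipe above, but only for traffic that goes through this one function, whereas ipfw catches everything heading to that address.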
Moving Stuff Finished :-)
Yesterday evening I followed the excellent instructions from the WordPress Codex on Migrating from Movable Type to WordPress. This proved to be extremely easy with all the posts, categories and comments coming through perfectly.
I had some files (photos, ppt etc) manually uploaded, so logged in to move those over. As well as migrating I moved the blog from /blog/ up to the root of the site. That all went swimmingly and I picked a nice shiny new theme called demar.
The only thing left was to sort out redirects for all the old links - so I don’t lose all the link love I’ve managed to build up over the years.
After reading through the various options I had a few issues. I’d tinkered a bit with the MT URLs in the past and had a lot of legacy stuff hanging around. I decided I’d just do it with Apache’s .htaccess. Not a choice for everyone, but my regex skills aren’t too shabby, so I figured I’d start there.
The URLs fell into a few different patterns:
/blog/atom.xml, index.xml, etc - the feeds. For now these can all redirect to /feed/ so we start with
RewriteEngine On
RewriteRule ^blog/atom\.xml$ /feed/ [L,R=301]
RewriteRule ^blog/index\.rdf$ /feed/ [L,R=301]
RewriteRule ^blog/index\.xml$ /feed/ [L,R=301]
Feeds dealt with I moved on to the root of the blog, adding
RewriteRule ^blog/$ / [L,R=301]
RewriteRule ^blog$ / [L,R=301]
and then on to the archives, where things start to get trickier. First we have categories, which take the form /blog/archives/cat_somewhere_I_put_stuff.html. WordPress uses a different pattern by default: /category/somewhere-i-put-stuff. Not too hard: first we pull out the words, then glue them back together again.
RewriteRule ^blog/archives/cat_([^_]*)_([^_]*)_([^_]*)_(.*)\.html$ /category/$1-$2-$3-$4 [L,R=301]
RewriteRule ^blog/archives/cat_([^_]*)_([^_]*)_(.*)\.html$ /category/$1-$2-$3 [L,R=301]
RewriteRule ^blog/archives/cat_([^_]*)_(.*)\.html$ /category/$1-$2 [L,R=301]
RewriteRule ^blog/archives/cat_(.*)\.html$ /category/$1 [L,R=301]
These regexes handle categories of 4 words, 3 words, 2 words and 1 word respectively. If you have categories with more words in them then you’ll need to add longer versions, ordering them longest first.
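If you do need longer versions, writing them by hand gets tedious. Here is a small Python sketch that prints the same family of rules for any maximum word count; `category_rules` is just a name I’ve made up for illustration, and it assumes the same /blog/archives/cat_ and /category/ paths as the rules above.

```python
def category_rules(max_words):
    """Build MT-to-WordPress category RewriteRules, longest pattern first."""
    rules = []
    for words in range(max_words, 0, -1):
        # Every word but the last must not contain an underscore; the last takes the rest.
        pattern = r"([^_]*)_" * (words - 1) + r"(.*)"
        target = "-".join(f"${i}" for i in range(1, words + 1))
        rules.append(
            rf"RewriteRule ^blog/archives/cat_{pattern}\.html$ /category/{target} [L,R=301]"
        )
    return rules

if __name__ == "__main__":
    for rule in category_rules(6):
        print(rule)
```

The loop counts down so that the longest pattern comes out first, which is the ordering the rules need.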
Next the monthly archives, in MT /blog/archives/2004_04.html and in WP /2004/04/
RewriteRule ^blog/archives/([0-9]{4})_([0-9]{2})\.html$ /$1/$2/ [L,R=301]
easy enough.
Then we have the individual posts. These fall into two groups, name based files and numbered files. The name based files are all /blog/archives/categoryname/postname.html. I had thought these were going to be a pig, but I discovered completely by accident that if you simply use the postname part of that then WordPress figures out which post you meant and redirects to its nice new WP URL. Sweet.
RewriteRule ^blog/archives/[^/]*/(.*)\.html$ /$1 [L,R=301]
The exception to this turns out to be posts that have a hyphen in the name. MT strips the hyphen, leaving WP with a name that doesn’t match. I put a rule in specifically for the one post I have that was affected; it has to sit above the general rule in the .htaccess, because the first rule that matches wins:
RewriteRule ^blog/archives/personal/beijing_sightse\.html$ /2008/05/04/beijing-sight-seeing/ [L,R=301]
Which just leaves the numbered posts: /blog/archives/000217.html. These proved to be tricky. While you can just append the number to the site root (/1234, say) and WordPress will find a post for you, the posts weren’t matching up. As many of these had been indexed by Google and linked to by others, I wanted to hook them up to the right posts.
Fortunately I had Movable Type still rendering my site as static HTML files, so with a quick bit of bash magic I pulled out the numbered posts and made rules to map them to the Movable Type post name based permalinks (which we already did rewrite rules for above):
find . -type f \
| grep "^\./[0-9]*\.html$" \
| xargs grep permalink \
| awk '{print $1 " " $17}' \
| sed -e 's%^\./%RewriteRule ^blog/archives/%' \
  -e 's%\.html:%.html$%' \
  -e 's%href="http://www.dynamicorange.com%%' \
  -e 's%">Permalink</a><br%%' \
  -e 's%$% [L,R=301]%' \
> rules.txt
Each line of rules.txt ends up looking like this
RewriteRule ^blog/archives/000285.html$ /blog/archives/semantic-web/vocabs.html [L,R=301]
which results in a second redirect to just /vocabs and then a third as WP works out where to take you finally. Not great to be bouncing around so much, but much better than losing the link.
Good luck if you decide to make the same move.
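Before trusting a pile of rules like this, it helps to run a few old URLs through them offline. This is a rough Python sketch of that check, not a real mod_rewrite emulator: it assumes rules are tried top to bottom and the first match wins, it only understands $1-style back-references, and `RULES` and `redirect_for` are names invented for the example.

```python
import re

# Patterns copied from the rules above, in first-match-wins order.
# The one-off Beijing rule is listed before the general post rule on purpose.
RULES = [
    (r"^blog/atom\.xml$", "/feed/"),
    (r"^blog/index\.rdf$", "/feed/"),
    (r"^blog/index\.xml$", "/feed/"),
    (r"^blog/$", "/"),
    (r"^blog$", "/"),
    (r"^blog/archives/cat_([^_]*)_([^_]*)_([^_]*)_(.*)\.html$", "/category/$1-$2-$3-$4"),
    (r"^blog/archives/cat_([^_]*)_([^_]*)_(.*)\.html$", "/category/$1-$2-$3"),
    (r"^blog/archives/cat_([^_]*)_(.*)\.html$", "/category/$1-$2"),
    (r"^blog/archives/cat_(.*)\.html$", "/category/$1"),
    (r"^blog/archives/([0-9]{4})_([0-9]{2})\.html$", "/$1/$2/"),
    (r"^blog/archives/personal/beijing_sightse\.html$", "/2008/05/04/beijing-sight-seeing/"),
    (r"^blog/archives/[^/]*/(.*)\.html$", "/$1"),
]

def redirect_for(path):
    """Return the redirect target for a request path, or None if no rule matches."""
    path = path.lstrip("/")  # .htaccess rules see the path without its leading slash
    for pattern, target in RULES:
        match = re.search(pattern, path)
        if match:
            # Turn mod_rewrite's $1 into Python's \g<1> before expanding.
            return match.expand(re.sub(r"\$(\d)", r"\\g<\1>", target))
    return None

if __name__ == "__main__":
    for old in ("/blog/archives/2004_04.html", "/blog/archives/semantic-web/vocabs.html"):
        print(old, "->", redirect_for(old))
```

Moving the Beijing entry below the general post rule in `RULES` is a quick way to see why the ordering matters: the general rule swallows it and sends you to /beijing_sightse instead.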
Flickr Bashing
I spent some time a little while ago writing a bash interface to the Flickr API, using curl to handle the HTTP interactions.
I was doing it because I have a backlog of around 11,000 photos that need sorting and uploading. To get them backed up I decided to simply upload them all marked as private and tagged as ‘toProcess’ so that I could then start to go through them online.
I’d written most of the raw bash scripts to do the job, but hadn’t put enough error handling in, or enough testing, to trust them with my live Flickr account.
That was back at Christmas, so coming back to it fresh a thought occurred to me. At work we’ve been working with PHP a lot, and PHP works just dandy as a command line scripting language as well as a web language. Not only that, but PHP also has a great Flickr library already, one that’s tested and in production use - phpFlickr by Dan Coulter.
So, with all my photos already organised in folders…
#!/usr/bin/php
<?php
// Add this script's directory and the bundled phpFlickr release to the include path.
$myPath = dirname(__FILE__);
ini_set('include_path', ini_get('include_path')
    . PATH_SEPARATOR . $myPath
    . PATH_SEPARATOR . $myPath . DIRECTORY_SEPARATOR . 'phpFlickr-2.2.0');
require_once 'phpFlickr.php';

$f = new phpFlickr('your api key here', 'your secret here');
$f->setToken('your token here');
$f->auth('write');

$searchPath = $myPath . DIRECTORY_SEPARATOR . 'later';
$uploadDir = opendir($searchPath);

// Everything goes up private: not public, not friends, not family.
$is_public = 0;
$is_friend = 0;
$is_family = 0;

// Non-blocking stdin lets any input on the terminal stop the run between uploads.
stream_set_blocking(STDIN, false);

while (false !== ($listing = readdir($uploadDir)))
{
    $isHidden = preg_match('/^\./', $listing) > 0;
    if (is_dir($searchPath . DIRECTORY_SEPARATOR . $listing) && !$isHidden)
    {
        echo $listing . "\n";
        // The year tag is the first four characters of the folder name, when they are digits.
        $year = preg_match('/^[0-9]{4}/', $listing) ? substr($listing, 0, 4) : null;
        $setName = $listing;
        $setDir = opendir($searchPath . DIRECTORY_SEPARATOR . $listing);
        $tags = "$year \"$setName\" toprocess";
        while (false !== ($setListing = readdir($setDir)))
        {
            $isJpeg = preg_match('/\.jpg$/i', $setListing) > 0;
            if ($isJpeg)
            {
                $photoFilename = $searchPath . DIRECTORY_SEPARATOR . $listing . DIRECTORY_SEPARATOR . $setListing;
                $response = $f->async_upload($photoFilename, $setListing, '', $tags, $is_public, $is_friend, $is_family);
                echo $response . "\t" . $photoFilename . "\n";
                // Rename the file so a re-run skips it.
                rename($photoFilename, $photoFilename . '.done');
                //sleep(1);
                $stdin = fread(STDIN, 1);
                if ($stdin != '')
                {
                    echo "Exiting from stdin keypress\n";
                    exit(0);
                }
            }
        }
        closedir($setDir);
    }
}
closedir($uploadDir);
?>
This script wanders through the folders uploading any jpegs it finds, tagging them with the name of the folder they came from as well as a year taken from the first four digits of the folder name. All so simple thanks to Dan Coulter’s work.
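The folder-walking and tagging part is easy to try out without touching Flickr at all. Below is a minimal Python sketch of just that logic, under the same assumptions as the script (one level of folders, .jpg files, year in the first four characters of the folder name); `tags_for_folder` and `pending_jpegs` are hypothetical names and nothing here calls phpFlickr.

```python
import os
import re

def tags_for_folder(folder_name):
    """Build the tag string for a folder, e.g. '2004 "2004 Holiday" toprocess'."""
    match = re.match(r"[0-9]{4}", folder_name)
    year = match.group(0) if match else ""
    # The folder name is quoted so that it is sent as one multi-word tag.
    return f'{year} "{folder_name}" toprocess'.strip()

def pending_jpegs(search_path):
    """Yield (folder, filename) for every .jpg in each non-hidden subfolder."""
    for folder in sorted(os.listdir(search_path)):
        folder_path = os.path.join(search_path, folder)
        if folder.startswith(".") or not os.path.isdir(folder_path):
            continue
        for filename in sorted(os.listdir(folder_path)):
            # Files already renamed to .jpg.done no longer end in .jpg, so they are skipped.
            if filename.lower().endswith(".jpg"):
                yield folder, filename

if __name__ == "__main__":
    for folder, filename in pending_jpegs("later"):
        print(tags_for_folder(folder), filename)
```

Running it against the later folder gives a dry-run listing of what would be uploaded and with which tags.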
moving stuff around
Just spent the evening moving this blog over to WordPress (from Movable Type).
All seems to have gone well, though old permalinks will be broken right now, ’til I get to fix it.
Fixing a plasma TV
My father-in-law very, very kindly donated a plasma TV to me recently, a 32" Philips from a few years ago. It was refusing to switch on, the power LED on the front of the screen indicating what the manual calls "protect" mode. This means that the TV has a fault and the LED shows it by blinking red.
Finding information about stuff like this online is always annoyingly difficult due to the variance in search terms. Is that LED blinking, flashing, cycling, going on and off or any of a number of other descriptions? I found several references to this problem on a mix of forums, and eventually found a thread on avforums describing how to fix a Philips plasma TV with a flashing red LED.
From there I managed to find that the type of TV I have is covered by the FM23 AC Service Manual, available here in the annoying form of a 16 part rar file which unrars to a single PDF.
Most of the successes in the thread seemed to have come from following Barbusa’s instructions in the thread linked to above, detailing three capacitors that wear out on the main power board. Diagnosing this was based rather loosely on the fact that if I kept switching it off and back on every time it went into protect it would eventually power up and run just fine. I’m guessing that this comes from the caps managing to build up enough charge over several power cycles.
Capacitors like these are really cheap to replace; I bought some from a local Maplin for the grand total of £1.14. I also bought a new soldering iron, a fine tipped butane one as Barbusa recommended for the job, bringing the bill to a lofty twenty quid - far less than I’d have to pay just to get someone to look at it for repair.
Armed with these new caps and the stupidity necessary to play at soldering inside a high voltage appliance, I started stripping it down. I laid the screen flat on its front (on something soft) and removed the stand and the screws from the back panel, which gives a great view of the insides.
In the middle here I’ve outlined the main power board, it’s the one with big capacitors, transformers and the two really big metal heat sinks (one black, one silver) running up the middle of the board.
To remove it, we first have to disconnect all the connectors. I took a few photos so I could put them back, but it seems the cabling routes and different sizes of connectors mean that they will only fit one way. There are several screws all around the edge and one in the middle of the board. They’re a small Torx fitting, the same as the case screws; I’m not sure what size these are, but they’re the smallest Torx bit I’ve got in my toolbox.
Having removed the board we need to find capacitors 2662, 2663 and 2664. In my 32" screen 2662 is a 1000μF 25v 85°C with 2663 and 2664 both 100μF 25v 85°C. I took Barbusa’s advice and bought 105°C rated caps to deal better with the heat. For 2663 and 2664 I couldn’t get the 25v caps, so bought the higher rated 50v ones that are fitted in the 42" plasmas. I’m no expert, but thanks to some friendly advice in #electronics on freenode.net I was confident they would be safe.
Finding them is easy enough, the board is numbered, so with fingers on the capacitor and my new soldering iron on the joints I slowly pulled the caps out and replaced each one in turn.
The numbering is on both sides of the board, here I have replaced 2662 and I’m just about to replace the other two.
Carefully putting everything back together - deep breath - it all works, powering up first time and running fine. Many thanks to the help of strangers :->
get back your mac
Based on a script from here: http://blogs.ittoolbox.com/security/investigator/archives/stolen-machines-phone-home-10506
This now lives in the file /usr/bin/ipkeyb
#!/usr/bin/perl
# Report to a webserver (for tracking in the log as a 404) where our Macintosh is.

# Keep trying forever
while (1) {
    # Wait 2 minutes for networks etc to attach
    sleep 120;

    # Do we have a network?
    $network = `ifconfig -a inet 2>/dev/null | sed -n -e '/127.0.0.1/d' -e '/0.0.0.0/d' -e '/tunnel/d' -e '/inet/p' | wc -l`;
    #print("network: $network\n");

    # Carve out serial number information from system profiler
    $serial_number1 = `system_profiler 2> /dev/null | grep \"Serial Number\"`;

    # We want the second instance of serial number in our URL string
    @serial_number2 = split (' ', $serial_number1);
    #print("serial: $serial_number2[2]\n");
    $url = "http://www.CHANGE_ME.com/ipkeyb/$serial_number2[2].html";
    #print("url: $url\n");

    # Let's identify
    $useragent = "Where Am I (Mac OS X)";

    # Okay, if we have network - make the request to the webserver
    if ( $network > 0 ) {
        #print("sending: $url");
        $status = `curl -A \"$useragent\" $url`;
    }

    # Wait 3 hours before we try again
    sleep 10800;
}
and has an entry in /etc/rc.local
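The other half of the trick is reading the webserver log to see where the machine last called in from. Here is a minimal Python sketch of that, assuming the server writes Apache "combined"-style log lines; `sightings` and `LOG_LINE` are names made up for the example, and the serial number in the usage note is invented.

```python
import re

# Matches the start of a combined-style log line for a request to /ipkeyb/<serial>.html
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"GET /ipkeyb/(?P<serial>[^/\s]+)\.html[^"]*" (?P<status>\d{3}) '
)

def sightings(log_lines, serial):
    """Return (timestamp, ip) pairs for each check-in made by the given serial number."""
    found = []
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match and match.group("serial") == serial:
            found.append((match.group("when"), match.group("ip")))
    return found
```

Something like `sightings(open("access.log"), "W8812ABCXYZ")` would then list every time and address the laptop has phoned home from, with the most recent entry last if the log is in time order.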
Technorati Tags: OSX, “stolen laptop recovery”
Back on OS X
I’ve been head down for a while on work things, doing a whole load of data munging as well as the usual dev work. But my Mac went pop a couple of weeks ago and Apple decided the best thing was to replace it rather than fix it; fine by me. It seemed like a good opportunity to look at what I have installed and list what’s on my machine and why:
Makes work life so much easier than with Office. Keynote and Pages are a joy to work with on the odd occasion where I have to write something other than code.
I know lots of mac users insist on using Safari and I agree with them that Safari’s a great browser, but the extensions for Firefox are too useful, and we have one or two internally that help a lot. Firefox has to be the default. Extensions that go on straight away are: Web Developer; Firebug; Duplicate Tab; Download Statusbar; Greasemonkey; del.icio.us Bookmarks; and Resizeable Textarea.
Very simple torrent client that seems to behave itself nicely.
Got to have the real deal installed and running. The standard one shipping with OS X seems fine too.
Much of what I do is a mix of Java and PHP right now. A departure from a few years ago. Eclipse PDT works really nicely. I’d rather be using Coda for the markup, but can’t justify it right now.
The slickest source repository software I’ve ever worked with. Simple, fast and elegant.
We use IRC a lot to keep in touch and ask quick questions. This is a great client, with customizable alerts and the ability to put in a sequence of auto-commands for when you connect to a server.
The best multi-network IM client I’ve ever used.
Of course. Phone home.
I said a while ago I wasn’t going to twitter any more. I was too hasty. When I moved over to the Mac someone mailed me Twitterrific and it makes Twitter useful.
Skitch is great - grab bits of screenshots, annotate and drop into emails, doc or post to their online service. Simple idea executed really, really well.
I’ve been using Password Safe for years, but moving to Linux and Mac I needed something else. Password Gorilla is compatible with Password Safe, so I can just move my password files from machine to machine easily and securely.
Rips DVD images onto your disc, allowing them to be played by DVD Player while the disc stays at home. The other advantage is that the hard-drive uses loads less power, so you can watch at least a whole movie while on a flight - on one battery.
This great little tool takes a whole load of WMA files and converts them to MP3 and registers them with iTunes. A painless way to migrate from WMP.
I run XP very occasionally and Ubuntu quite often for testing under different OSs. Very handy. I sometimes develop under Ubuntu too, as Fusion can take snapshots I can play easily without wrecking my machine.
Connect to work via a Cisco VPN, nice and easy, fast and reliable from pretty much anywhere. Shimo sits in the menu bar allowing quick connections without having to open the cisco client up.
Open-source project to make Linux open-source projects available to OS X. Equivalent to the apt-get or yum package managers. The folks behind this do a great job of keeping the builds up-to-date and providing repositories. There’s Fink as well, and I’ve tried both. I found MacPorts better, but if I’m wrong please tell me!
When I moved over to Mac I very nearly bought NetNewsWire for blog reading. Then I found Vienna, an open-source blog reader that is really good. One of the key things is the way it opens articles into tabs, keeping the feed handy when you’ve finished.
Not free, but worth the 11GBP it cost me. This is a great little offline blog editor. Hopefully might help me get a little more written here.
Some people may have policy issues with this tool - it’s a wireless network discovery tool that also allows you to crack WEP and WPA keys. I’ve used it to secure my own network, but I also use it to find open hotspots when I’m out and about. It’s been moving about a bit, so if the link’s broken then let me know. It was hosted from a site run by its creator Michael Rossberg, but since a change to German law outlaws this tool he has handed it on.
I don’t understand why OS X doesn’t have multiple desktops built-in, but as of 10.4.10 it doesn’t. This is the nicest of the desktop managers I found. Also works with Smackbook if you’re so inclined.
This used to be distributed as part of OS X apparently. But Smith Micro insist on you getting it from them now. Part of the process is giving them your email address; which they then spam until you tell them to stop. Useful piece of software, as stuff still comes in .sit form, but annoying model employed by Smith. They should read cluetrain.
Creatures and Creatures 2
Finally - a little fun. The Creatures icons from Fast Icon are lovely and adorn my most useful folders.
Technorati Tags: OS X, recommended, tools
great short piece
Guest Article: Our Dirty Little Secret - Worse Than Failure
the dirty little secret of our profession: we all write bad code.
ISBN 10/13 Converter in Excel
I had a batch of ISBNs to convert from ISBN13 back to ISBN10. I’ve got some C# to do it, and some Java, but figured this Excel spreadsheet converter would be useful for folks.
Free to do with what you will - provided without warranty.
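For anyone who’d rather script it than use the spreadsheet, the conversion itself is short. This is a minimal Python sketch of the standard ISBN-13 to ISBN-10 calculation, not a port of my spreadsheet; `isbn13_to_isbn10` is a name chosen for the example, and it only works for 978-prefixed numbers since those are the only ones with an ISBN-10 equivalent.

```python
def isbn13_to_isbn10(isbn13):
    """Convert a 978-prefixed ISBN-13 to its ISBN-10 form."""
    digits = isbn13.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        raise ValueError("expected 13 digits")
    if not digits.startswith("978"):
        raise ValueError("only 978-prefixed ISBN-13s have an ISBN-10 equivalent")
    core = digits[3:12]  # drop the 978 prefix and the ISBN-13 check digit
    # ISBN-10 check digit: weight the nine digits 10 down to 2, then pick the
    # value that makes the weighted total divisible by 11 (10 is written as X).
    total = sum(int(d) * w for d, w in zip(core, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))

if __name__ == "__main__":
    print(isbn13_to_isbn10("978-0-306-40615-7"))
```

Note that the sketch does not verify the incoming ISBN-13 check digit; it simply recomputes the ISBN-10 one.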
ah, now I can read
I’ve been using FeedReader for a while (I prefer email-style news to river of news) and it works really well. The interface is calm and it fits the way I work. But a few months back I upgraded my laptop to a higher res one, 1920×1200 in just 15.4″. This looks great, but I do end up making text larger in a few of my apps. I wanted to change that text size in FeedReader and I couldn’t find the setting anywhere. Then I remembered that the preview pane in FeedReader is just an embedded browser - I wonder… Sure enough, C:\Program Files\FeedReader30\stylesheet contains atom.xsl, emailstyle.xsl and custom(delicious.xml); a quick tweak from
body {
font-family: verdana, tahoma;
font-size: 0.7em;
line-height: 1.3em;
padding: 0;
margin: 0;
}
to
body {
font-family: verdana, tahoma;
font-size: 1.0em;
line-height: 1.3em;
padding: 0;
margin: 0;
}
in both atom and emailstyle and all looks lovely :->