Moving Stuff Finished :-)
Yesterday evening I followed the excellent instructions from the WordPress Codex on Migrating from Movable Type to WordPress. This proved to be extremely easy with all the posts, categories and comments coming through perfectly.
I had some files (photos, ppt etc) manually uploaded, so I logged in to move those over. As well as migrating, I moved the blog from /blog/ up to the root of the site. That all went swimmingly and I picked a nice shiny new theme called demar.
The only thing left was to sort out redirects for all the old links – so I don’t lose all the link love I’ve managed to build up over the years.
After reading through the various options I had a few issues. I’d tinkered a bit with the MT URLs in the past and had a lot of legacy stuff hanging around. I decided I’d just do it with Apache’s .htaccess. Not a choice for everyone, but my regex skills aren’t too shabby, so I figured I’d start there.
The URLs fell into a few different patterns:
/blog/atom.xml, index.xml, etc – the feeds. For now these can all redirect to /feed/ so we start with
RewriteEngine On
RewriteRule ^blog/atom\.xml$ /feed/ [L,R=301]
RewriteRule ^blog/index\.rdf$ /feed/ [L,R=301]
RewriteRule ^blog/index\.xml$ /feed/ [L,R=301]
With the feeds dealt with, I moved on to the root of the blog, adding
RewriteRule ^blog/$ / [L,R=301]
RewriteRule ^blog$ / [L,R=301]
and then onto the archives, where things start to get trickier. First we have categories, which take the form /blog/archives/cat_somewhere_I_put_stuff.html. WordPress creates a different pattern by default – /category/somewhere-i-put-stuff. Not too hard: first we pull out the words, then glue them back together again with hyphens.
RewriteRule ^blog/archives/cat_([^_]*)_([^_]*)_([^_]*)_(.*)\.html$ /category/$1-$2-$3-$4 [L,R=301]
RewriteRule ^blog/archives/cat_([^_]*)_([^_]*)_(.*)\.html$ /category/$1-$2-$3 [L,R=301]
RewriteRule ^blog/archives/cat_([^_]*)_(.*)\.html$ /category/$1-$2 [L,R=301]
RewriteRule ^blog/archives/cat_(.*)\.html$ /category/$1 [L,R=301]
These regexes match category names that are four, three, two and one word long respectively. If you have categories with more words in them then you’ll need to add longer versions of these, ordering them longest first.
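To see what the category rules should produce before touching .htaccess, a few lines of shell can do the same word-gluing. This is only a sketch: mt_category_to_wp is a name I've made up for illustration, it takes the full old path with its leading slash, and unlike the four rules above it swaps every underscore, so it copes with any number of words.

```shell
#!/bin/sh
# Hypothetical helper (not part of the .htaccess): print the WordPress
# category path that an old Movable Type category URL should map to.
mt_category_to_wp() {
  name=${1#/blog/archives/cat_}   # drop the MT prefix
  name=${name%.html}              # drop the extension
  # glue the words back together with hyphens instead of underscores
  printf '/category/%s\n' "$(printf '%s' "$name" | tr '_' '-')"
}

mt_category_to_wp /blog/archives/cat_somewhere_I_put_stuff.html
# prints /category/somewhere-I-put-stuff
```

Note that the case of the words is carried over untouched, just as it is by the rewrite rules.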
Next the monthly archives, in MT /blog/archives/2004_04.html and in WP /2004/04/
RewriteRule ^blog/archives/([0-9]{4})_([0-9]{2})\.html$ /$1/$2/ [L,R=301]
easy enough.
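If you want to try the monthly pattern out away from the live site, the same regex works in sed. A rough sketch, assuming a sed that understands -E (GNU and BSD both do); mt_month_to_wp is a made-up name and it takes the full old path including the leading slash, which the .htaccess pattern doesn't see.

```shell
#!/bin/sh
# Hypothetical helper: apply the monthly-archive pattern to one old URL.
# Paths that don't match are printed unchanged.
mt_month_to_wp() {
  printf '%s\n' "$1" |
    sed -E 's#^/blog/archives/([0-9]{4})_([0-9]{2})\.html$#/\1/\2/#'
}

mt_month_to_wp /blog/archives/2004_04.html
# prints /2004/04/
```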
Then we have the individual posts. These fall into two groups, name based files and numbered files. The name based files are all /blog/archives/categoryname/postname.html. I had thought these were going to be a pig, but I discovered completely by accident that if you simply use the postname part of that then WordPress figures out which post you meant and redirects to its nice new WP URL. Sweet.
RewriteRule ^blog/archives/[^/]*/(.*)\.html$ /$1 [L,R=301]
The exception to this turns out to be posts that have a hyphen in the name. MT strips the hyphen, leaving WP with a name that doesn’t match. I put a rule in specifically for the one post I have that was affected; it has to sit above the general rule so that it matches first:
RewriteRule ^blog/archives/personal/beijing_sightse\.html$ /2008/05/04/beijing-sight-seeing/ [L,R=301]
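The general name-based rule can be dry-run in the same way, which is handy for checking what WordPress will be handed before it does its own lookup. Again just a sketch under assumptions: mt_post_to_wp is a hypothetical name, sed needs -E support, and the input is the full old path with its leading slash.

```shell
#!/bin/sh
# Hypothetical helper: strip the category directory and the .html
# extension from an old name-based post URL, leaving /postname.
# Prints nothing when the path is not a name-based post.
mt_post_to_wp() {
  printf '%s\n' "$1" |
    sed -nE 's#^/blog/archives/[^/]*/(.*)\.html$#/\1#p'
}

mt_post_to_wp /blog/archives/semantic-web/vocabs.html
# prints /vocabs
```

Monthly and numbered archive URLs have no category directory, so they fall through with no output, which matches how the rewrite rule ignores them.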
Which just leaves the numbered posts: /blog/archives/000217.html. These proved to be tricky. While you can just append the number to the URL, like /1234, and WordPress will find a post for you, the posts weren’t matching up. As many of these had been indexed by Google and linked to by others, I wanted to hook them up to the right posts.
Fortunately I had Movable Type still rendering my site as static HTML files, so with a quick bit of bash magic I pulled out the numbered posts and made rules to map them to the Movable Type post name based permalinks (which we already did rewrite rules for above):
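# Run from the old archives directory. $17 is the field that holds the
# permalink href in my templates; adjust it to suit yours.
find . -type f \
| grep '^\./[0-9]*\.html$' \
| xargs grep permalink \
| awk '{print $1 " " $17}' \
| sed -e 's%^\./%RewriteRule ^blog/archives/%' \
-e 's%\.html:%.html$%' \
-e 's%href="http://www.dynamicorange.com%%' \
-e 's%">Permalink</a><br% [L,R=301]%' \
> rules.txt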
Each line of rules.txt ends up looking like this
RewriteRule ^blog/archives/000285.html$ /blog/archives/semantic-web/vocabs.html [L,R=301]
which results in a second redirect to just /vocabs and then a third as WP works out where to take you finally. Not great to be bouncing around so much, but much better than losing the link.
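Before pasting a generated file like rules.txt into .htaccess it's worth a quick sanity check, since one odd line in a template can throw the awk field count out. Something along these lines would do; check_rules is a name I've invented here, and it assumes every line should have the same four-part shape as the example above.

```shell
#!/bin/sh
# Hypothetical helper: report any line of a generated rules file that
# does not look like
#   RewriteRule ^blog/archives/NNNNNN.html$ /some/path [L,R=301]
# Exits non-zero if at least one line looks wrong.
check_rules() {
  awk '
    $1 != "RewriteRule" || NF != 4 ||
    $2 !~ /^\^blog\/archives\/[0-9]+\.html\$$/ ||
    $3 !~ /^\// || $4 != "[L,R=301]" {
      printf "line %d looks wrong: %s\n", NR, $0
      bad = 1
    }
    END { exit bad }
  ' "$1"
}

# Usage: check_rules rules.txt && echo "rules look sane"
```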
Good luck if you decide to make the same move.
June 3, 2009