Migrating WordPress to Static

Having merrily sunk about a year now into my experiment with prolonging adolescence (er, making a game), I’ve wanted to get back into blogging some of what I’ve learned. But I’ve thoroughly fallen out of love with WordPress - updates, permissions, themes, etc. So I decided to nuke it and switch to an SSG (static site generator), since those are apparently all the rage right now. Here, for your reference and mine, is what I did and how it went.

Step 1: Pick an SSG

Difficulty rating: painless - thanks to StaticGen, a site that lists loads of SSGs and their details. In my case I filtered by JavaScript and tried out the top 5-6 options, and eventually settled on Hexo.

Step 2: Get all your data out of WordPress

Difficulty rating: painless.

  • Posts, pages, comments: Dashboard > Tools > Export
  • Attachments: grab the contents of wp-content/uploads

The built-in WordPress export gets you a big XML blob of blog data, and several of the blog-centric SSGs know how to import it directly. If yours doesn’t, install Hexo just long enough to use its migrate feature. This will get you a folder full of converted markdown files for all your pages/posts, suitable for further conversion.
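If you want to peek inside that XML blob before converting it, a few lines of Node will do. The following is only a sketch: it assumes the export uses the usual WXR element names (wp:post_id, wp:post_type, wp:post_name), the helper names tagText and listItems are made up for illustration, and matching XML with regexes is a shortcut rather than proper parsing.

```javascript
// Sketch: summarize the posts/pages found in a WordPress export (WXR) string.
// Regex-based on purpose to avoid dependencies; a real XML parser is more robust.

// Text of the first <tag>...</tag> inside a chunk, unwrapping CDATA if present.
function tagText(chunk, tag) {
  const match = chunk.match(new RegExp('<' + tag + '>([\\s\\S]*?)</' + tag + '>'))
  if (!match) return ''
  return match[1].trim().replace(/^<!\[CDATA\[([\s\S]*)\]\]>$/, '$1').trim()
}

// One summary object per <item> element in the export.
function listItems(xml) {
  const items = xml.match(/<item>[\s\S]*?<\/item>/g) || []
  return items.map(item => ({
    id: tagText(item, 'wp:post_id'),
    type: tagText(item, 'wp:post_type'),
    slug: tagText(item, 'wp:post_name'),
    title: tagText(item, 'title'),
  }))
}
```

Feeding it the contents of the export file (read with fs.readFileSync, say) gives a quick inventory of post IDs and slugs, which also comes in handy later when old links need redirecting.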

Step 3: Fix everything that’s broken

Difficulty rating: tedious and painful. The details will depend on your before/after setups, but inevitably there will be a bunch of stuff to fix - changing hard-coded URLs, image tags, code tags, and so forth.

My solution here was to write a simple Node.js script that goes through a file applying regexes to each line. Then I started fixing my posts one by one, adding a regex to the conversion script for each problem that occurred more than once. This took a lot of finagling initially, but after a few hours the conversions had become largely automatic. The script wound up looking like this:

function work(line) {
    // decode &gt; &lt; &amp; (ampersand last, so "&amp;lt;" isn't decoded twice)
    line = line.replace(/&gt;/g, '>')
    line = line.replace(/&lt;/g, '<')
    line = line.replace(/&amp;/g, '&')
    // <pre lang:foo> ..
    line = line.replace(/<pre[^>]*lang:(\w+)[^>]*>/g, '\n```$1\n')
    // <pre .. > ..
    line = line.replace(/<pre[^>]*>/g, '\n```\n')
    // </pre>
    line = line.replace(/<\/pre>/g, '\n```\n')
    // ... and so on
    return line
}
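
That snippet is only the per-line part. The plumbing around it might look roughly like the sketch below, which assumes the per-line function returns the fixed line. The names convertText, convertFile and exampleFix are hypothetical, and exampleFix is just a stand-in so the sketch runs on its own.

```javascript
// Sketch of a driver around a per-line fixer such as work().
const fs = require('fs')

// Stand-in for the real per-line function; only here to keep this self-contained.
function exampleFix(line) {
  return line.replace(/&lt;/g, '<')
}

// Apply a per-line fixer to every line of a document.
function convertText(text, fixLine) {
  return text.split('\n').map(fixLine).join('\n')
}

// Convert one file in place; returns true if anything changed.
function convertFile(path, fixLine) {
  const before = fs.readFileSync(path, 'utf8')
  const after = convertText(before, fixLine)
  if (after !== before) fs.writeFileSync(path, after)
  return after !== before
}
```

Only rewriting files that actually changed keeps a version-control diff readable, which helps when you re-run the script after adding each new regex.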

Step 4: Customize themes and whatnot

Difficulty rating: being a web developer. This part is just front end development. In my case it took quite a while, because it mostly meant (re)acquainting myself with stuff like EJS/Jade and Stylus, and putting Hexo in a headlock until it agreed to a pipeline (minification, etc) that I could live with. But if your SSG has themes and you’re willing to use one as-is, you might be done in five minutes.

Step 5: Migrate comments to Disqus

Difficulty rating: painless. I was worried about this step but it was over before I started.

  • Sign up for Disqus, create a message board
  • Disqus > Import > Upload your WordPress XML
  • Add Disqus’ universal include code to your theme (many blog-like themes have this built in)

Boom, worked perfectly, no pain whatsoever. If you have a massive blog with massive numbers of comments like Ray Camden, you might need to do things in batches.
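
For themes that don't have Disqus built in, the include code boils down to setting a config callback and injecting one script tag. The sketch below is a paraphrase of that snippet wrapped in a function, not a replacement for it: copy the real thing from Disqus' install page. It assumes your template contains a div with the id disqus_thread, and the function name loadDisqus and its parameters are made up for illustration.

```javascript
// Sketch of what Disqus' universal embed code does, wrapped in a function.
// Assumes the page has <div id="disqus_thread"></div> and that `shortname`
// is the forum shortname chosen at signup.
function loadDisqus(doc, win, shortname, pageUrl, pageIdentifier) {
  // Disqus calls this global to learn which thread belongs to the page.
  win.disqus_config = function () {
    this.page.url = pageUrl
    this.page.identifier = pageIdentifier
  }
  // Inject the loader script for this forum.
  const script = doc.createElement('script')
  script.src = 'https://' + shortname + '.disqus.com/embed.js'
  script.setAttribute('data-timestamp', String(Date.now()))
  const target = doc.head || doc.body
  target.appendChild(script)
  return script
}
```

In a theme this would be called from the post template as loadDisqus(document, window, ...), with the URL and identifier filled in by the SSG for each post.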

Step 6: Spin up a static server

Enter the devops. I initially wanted to serve from a container, and spent several days playing with running nginx via Docker. This works great locally, but when I tried to deploy it on Google Cloud, ultimately there were too many fiddly bits, so I rolled a tiny boring nginx server. Assuming one already has Google Cloud’s CLI set up, this amounts to:

# provision the server
gcloud compute instances create "nginx-server" --machine-type "f1-micro" --tags "http-server","https-server" --boot-disk-size "10" --no-boot-disk-auto-delete
# ssh in and set stuff up
gcloud compute ssh nginx-server
sudo apt-get update
sudo apt-get install -y nginx

Easy as that. An f1-micro server and 10GB of disk runs a bit less than $5/month. For a real site, beyond the commands above, you’ll want to promote the instance’s IP from ephemeral to static, and later of course point your domain at it.

Step 7: DEPLOY

Google’s docs will tell you to copy files to your server with gcloud compute copy-files, but don’t do that - it’s dog-slow and it overwrites every file, every time. Instead, use rsync. If you’re using Google Compute Engine, here’s my deploy script:

rsync -avz -e "ssh -i ~/.ssh/google_compute_engine -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o CheckHostIP=no -o StrictHostKeyChecking=no"   \
path_to/your_www_folder/ \
root@<your server's IP>:/var/www/html/

Many Bothans died to bring us that script, please use it wisely.

Depending on your initial WordPress setup, your site may have incoming links of the form example.com/?p=123 where 123 is the post ID. To avoid breaking such links, I configured nginx to catch them and serve up permanent redirects to fixed URLs.

nginx.conf
map $arg_p $wordpress_link_redirect {
    default "";
    4      /2012/12/hello-world;
    15     /about;
    # ... and a hundred more lines like that
}

server {
    # (..snip..)
    location = / {
        if ($wordpress_link_redirect != "") {
            rewrite ^ $wordpress_link_redirect? permanent;
        }
    }
}

The first bit maps any incoming p query parameter to a URL string. The second bit is invoked for root / requests, and serves up a 301 permanent redirect if the p parameter got mapped, allowing links like this to work normally.
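
Typing a hundred of those map lines by hand invites typos, so they could just as well be generated. Here is a sketch of how that might look in Node, assuming you already have a list pairing each old post ID with its new path. The function name buildRedirectMap and the shape of the entries are hypothetical.

```javascript
// Sketch: build the nginx "map" block from old post IDs and their new paths.
function buildRedirectMap(entries) {
  const lines = entries.map(({ id, path }) => {
    // Refuse anything that could produce a malformed config line.
    if (!/^\d+$/.test(String(id))) throw new Error('bad post ID: ' + id)
    if (!/^\/[^\s;]*$/.test(path)) throw new Error('bad path: ' + path)
    return '    ' + String(id).padEnd(6) + ' ' + path + ';'
  })
  return [
    'map $arg_p $wordpress_link_redirect {',
    '    default "";',
    ...lines,
    '}',
  ].join('\n')
}

// Example input, using the two entries shown in the config above.
const exampleEntries = [
  { id: 4, path: '/2012/12/hello-world' },
  { id: 15, path: '/about' },
]
console.log(buildRedirectMap(exampleEntries))
```

The output can be pasted into the config or written to a separate file and pulled in with an include directive, so the redirect table can be regenerated without touching the rest of the server block.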


And there you have it. It took me a few weeks, all told, but the large majority of that was regular front-end development (i.e. hammering on the templates and CSS in Hexo’s default theme until I was happy with the output). The actual “conversion” could easily be knocked out in a weekend, or even a few hours if you’re already very familiar with your SSG of choice.