Altering Your Site’s Permalink Structure

I made a rookie move while on Habari: I switched to using only post titles i.e. janetalkstech.com/%postname%. After I migrated from Habari to WordPress, I decided to keep that post structure i.e. janetalkstech.com/%postname%. Unfortunately for me, WordPress developers strongly recommend against setting up sites this way. After several days of seeing intermittent 500 errors and general site sluggishness (which, in hindsight, are likely attributable to WordPress having to do more work to figure out what post type the URL is referring to), I got the message and changed my site’s URL structure to janetalkstech.com/%year%/%postname%. Straight from the WordPress Codex:

For performance reasons, it is not a good idea to start your permalink structure with the category, tag, author, or postname fields. The reason is that these are text fields, and using them at the beginning of your permalink structure it takes more time for WordPress to distinguish your Post URLs from Page URLs (which always use the text “page slug” as the URL), and to compensate, WordPress stores a lot of extra information in its database (so much that sites with lots of Pages have experienced difficulties).

Changing my site’s URL structure was as simple as updating my permalinks options page, but with over 120 indexed posts, I needed to make sure that I:

  1. Didn’t annoy my visitors: With fewer than 100 visitors daily, I was not going to lose any money or get irate messages. However, I didn’t want to chase away the few visitors I have by throwing up a bunch of “404 – URL not found” messages. Instead, I needed to make sure I had a way to let them know that there would be a temporary break in “transmission” (so to speak). I also needed to perform the change at a time when I had the lowest traffic volume. Again, I don’t make much money from this site, but if I did, it would not be smart to change my site’s links during the times of heaviest traffic!
  2. Had a plan for a seamless redirection: This was the biggest issue I needed to take care of before pulling the switch on my site’s structural change. I needed to make sure that the old posts were properly redirected with a 301 redirect, which tells Google and other search engines that the old content has permanently moved to a new URL. This is what Google recommends when you change a site or page’s url.

This time, I did the grunt work myself instead of going the Amazon Mechanical Turk route I used during my switch from Habari to WordPress. To make sure I didn’t leave my visitors in the lurch, I did the following:

  1. Exported my site’s contents in the WordPress WXR format and backed up my site’s database for good measure. It’s just good practice to back up your data as a first step.
  2. Typed out an entire listing of my site’s posts by looking at my site’s Archives, grouped them by year and with a simple “search and replace” command in gedit, added the year (in which the post was written) to the URL so that janetalkstech.com/the-motorola-atrix became janetalkstech.com/2011/the-motorola-atrix.
  3. Manually created the redirections and a set of redirect rules with the Redirection Plugin. Instead of entering each redirect rule under the Redirection plugin’s “Redirects” tab, you can import a CSV, XML or RSS file into the plugin with the redirects you want and the plugin will do the rest of the work. Of course, the tedious part was entering the data into the CSV spreadsheet. The format for creating the CSV sheet is simple:
    • For each cell, enter the source/bad url, add a comma and then enter the target/good url.
    • Move on to the next cell below; Rinse and repeat.
    • You can verify the correct syntax for creating the redirect entries manually by exporting your current redirects from the Redirection plugin‘s “Modules” tab.
    • Double-check your work to make sure you have the correct redirections in your CSV sheet before importing!
    • I ended up with 122 cells containing redirects to the new URLs.

  4. Installed the WP Maintenance mode plugin for WordPress and turned it on. Make sure you change the “Settings” for the Maintenance Mode plugin to “True”.
  5. Then, I changed my site’s permalinks by going to “Settings” and “Permalinks” while in the WordPress Administrative backend and entering “/%year%/%postname%”. I didn’t want trailing slashes at the end of my posts so I left out the trailing slash at the end. In my naiveté, I missed that the current recommendation *is* to have a trailing slash at the end of URLs, although that article is over a year old and Matt Cutts says the trailing slash doesn’t matter as much as picking the desired url style and sticking with it.
  6. Checked my CSV file for typos before importing into the Redirection plugin’s “import” section. Creating a bulk CSV file to import your redirects is as simple as firing up Google Docs and creating cells in a Spreadsheet with the following information:
    1. source url which is the old/bad link e.g. http://janetalkstech.com/the-motorola-atrix
    2. literal comma
    3. target url which is the new link e.g. http://janetalkstech.com/2011/the-motorola-atrix
    4. Each cell of your spreadsheet should contain 1 redirect and be in this format:
      janetalkstech.com/the-motorola-atrix,http://janetalkstech.com/2011/the-motorola-atrix

  7. I verified that all my links were being properly redirected by clicking on old links and casually inspecting my site’s HTTP headers (using Wireshark). So, I was reasonably satisfied that my links and corresponding link juice were being passed on to the new URLs.

    GET /using-habari-from-a-users-perspective HTTP/1.1
    Host: janetalkstech.com
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.100 Safari/534.30
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

    HTTP/1.1 301 Moved Permanently
    Date: Wed, 15 Jun 2011 23:16:07 GMT
    Server: Apache
    X-Pingback: http://janetalkstech.com/xmlrpc.php
    Expires: Wed, 11 Jan 1984 05:00:00 GMT
    Cache-Control: no-cache, must-revalidate, max-age=0
    Pragma: no-cache
    Set-Cookie: PHPSESSID=x; path=/
    Vary: Accept-Encoding,User-Agent
    Last-Modified: Wed, 15 Jun 2011 23:16:13 GMT
    Location: http://janetalkstech.com/2009/using-habari-from-a-users-perspective
    Content-Encoding: gzip
    Content-Length: 20
    Keep-Alive: timeout=2, max=100
    Connection: Keep-Alive
    Content-Type: text/html; charset=UTF-8

    GET /2009/using-habari-from-a-users-perspective HTTP/1.1
    Host: janetalkstech.com
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.100 Safari/534.30
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
    Cookie: PHPSESSID=x

  8. Turned off the WP Maintenance mode plugin and closely monitored my logs for any unusual drops in traffic.
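The check in step 7 can also be run against a saved capture instead of reading it by eye. Below is a minimal Python sketch, assuming the response head has been pasted into a string. The helper names `parse_response_head` and `is_permanent_redirect_to` are made up for illustration, and only a few of the captured headers are reproduced.

```python
def parse_response_head(raw):
    """Split a raw HTTP response head into (status_code, headers)."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    # The status line looks like: HTTP/1.1 301 Moved Permanently
    status_code = int(lines[0].split()[1])
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values containing colons survive.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status_code, headers


def is_permanent_redirect_to(raw, expected_url):
    """True when the response is a 301 whose Location matches expected_url."""
    status_code, headers = parse_response_head(raw)
    return status_code == 301 and headers.get("location") == expected_url


# A shortened copy of the capture from step 7.
captured = """
HTTP/1.1 301 Moved Permanently
Date: Wed, 15 Jun 2011 23:16:07 GMT
Server: Apache
Location: http://janetalkstech.com/2009/using-habari-from-a-users-perspective
Content-Type: text/html; charset=UTF-8
"""

new_url = "http://janetalkstech.com/2009/using-habari-from-a-users-perspective"
print(is_permanent_redirect_to(captured, new_url))
```

A 302 or a Location pointing at the wrong year would make the function return False, which is the kind of slip that is easy to miss when clicking through 122 links.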

Overall, thanks to the excellent Redirection Plugin by Urban Giraffe, I’ve been able to reliably handle the worst 404 errors that Google Webmaster Tools alerted me to. If you use this plugin, don’t forget to donate! I’ve got a long way to go (see image below) but baby steps. No more switching CMSes for me, that’s for damn sure. 🙂
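In hindsight, the redirect CSV from steps 2, 3 and 6 could have been generated instead of typed. Here is a rough Python sketch, assuming the slugs have already been grouped by year. The `posts_by_year` sample and the function names are illustrative, and the exact columns the Redirection plugin expects should be confirmed by exporting an existing redirect first.

```python
import csv
import io

SITE = "http://janetalkstech.com"

# Illustrative sample only; the real list came from the site's Archives page.
posts_by_year = {
    2009: ["using-habari-from-a-users-perspective"],
    2011: ["the-motorola-atrix"],
}


def build_redirect_rows(posts, site=SITE):
    """Return (old_url, new_url) pairs, adding the year to each post URL."""
    rows = []
    for year in sorted(posts):
        for slug in posts[year]:
            rows.append((f"{site}/{slug}", f"{site}/{year}/{slug}"))
    return rows


def rows_to_csv(rows):
    """Serialize the pairs as 'source,target' lines, one redirect per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_redirect_rows(posts_by_year)
print(rows_to_csv(rows), end="")
```

Generating the file this way does not remove the need to double-check it before importing, but it does rule out typos in the year or slug.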

Moving Back to WordPress from Habari :(

Over the course of 7 days, I completed the migration of this site from Habari 0.7 to WordPress 3.1.2. I ultimately came to realize that I needed to spend more time writing my blog posts and less time fiddling with my site or theme. In the end, I needed a fully baked CMS as Habari was still a little too ‘rare’ but the Habari CMS will always be on my radar. I knew this sad day would come and I already talked not-so-seriously about it with my post about navigating with Habari. Here, I will talk about the little issues that made my decision to switch somewhat easier, what it took to move my site from Habari back to WordPress, automating the process, more caveats as well as some interesting things I miss about Habari!

Why?

My case with Habari is probably skewed by the fact that previously, Jane Talks Tech! ran on WordPress. Then, I switched to Habari. At the time, I had several comments on the site, I wasn’t using the Disqus commenting system and I don’t even think they had the feature to migrate your existing comments to the Disqus servers. I didn’t have the foresight to do as much testing as I did in this reverse switching scenario, so after I switched to Habari, I experienced several issues, some of which were later corrected or forgotten about and some of which were never fixed. In any case, here are some of the issues with Habari I’ve experienced or was experiencing:

  1. Trackbacks/Pingbacks: With Habari, trackbacks/pingbacks behaved very unexpectedly. I would write a post and link to a different post I’d written; more often than not, the trackback/pingback would be truncated i.e. would be missing the closing [/a] tag for links which would then cause the comments box & any associated text to be hyperlinked! I eventually went through *all* my trackbacks and manually edited them to either add the missing [/a] tag or remove the hyperlink altogether. This would come back to bite me as I discovered during my migration back to WordPress but I had no choice! I know there’s a difference between a trackback & pingback but for my purposes, all I cared about was that:
    • Jane publishes a post and links to 1 or 2 more posts written by her.
    • Jane may edit published post and through changing post title, may inadvertently change the slug for the post.
    • As Jane discovers during her migration back to WordPress, there are several broken links in pingbacks/trackbacks that could only arise as a result of post title/slug change.
  2. Post title/slug problems: Every webmaster or blog owner should have a Google Webmaster Tools account. You will discover important things about your site in Google’s eyes such as broken links, search queries, etc. I didn’t log in to my Webmaster Tools account as often as I should, but for my migration back to WordPress, I finally took a look at what was going on with my site. To my horror, I discovered dozens of broken links to several internally linked posts. The majority of these broken links arose because, at some point, I had changed the title of certain posts which (and I’m not clear about this point) presumably changed the slug, which would alter the URLs for those posts. In fairness, Habari doesn’t claim to be all that and the kitchen sink. Habari is blogging software (first & foremost) and any work to extend the functionality of Habari is via plugins. However, I can see now that I need assistance in the form of plugins and that since I cannot code a plugin, I need to go somewhere where plugins are already available. I cannot explain why I had so many bad/truncated URLs but I do know that the trackback/pingback issue with truncated URLs contributed. Whose fault (mine or Habari’s) that is doesn’t matter anymore. 🙂 I encourage anyone who experienced this same problem to leave a comment. The moral lesson of this bullet point is:

    Be careful in changing the titles and inadvertently changing the slug for already published posts because your pingbacks/trackbacks may not be updated.

  3. Threaded comments: When I switched from WordPress to Habari, I lost the threaded comments I had but I wasn’t too broken up about it. I kept the Habari commenting system and after a couple of months, I enabled Disqus commenting on my site. Unlike the Disqus plugin for WordPress, there was no feature in the Habari Disqus plugin to migrate existing comments to the Disqus system so comment management for my site was a little split between checking Disqus.com for moderated comments and checking on my Habari comment management page (because on posts containing existing WordPress/Habari comments, the Disqus commenting box did not appear which was intended behavior). I got over this but I sorely missed having the fully fledged Disqus plugin for WordPress. This would NOT be an issue for someone starting a brand new blog without comments. For someone with an established blog & comments, I would definitely recommend LOTS of testing before migration.
  4. Theming/Themes: With Habari 0.6, there were 2 gorgeous themes I’d found and absolutely loved. They were Dark Autumn by Ali Dmondark and Georgia by Thomas Silkjær. More importantly, they displayed fine without me needing to dive too deeply into CSS. With Habari 0.7’s release, it was trial & error to find out which themes worked OOB (out-of-box). There is a distribution hub for Habari themes but it is a work in progress. Habari is a labor of love and I can empathize with plugin or theme developers who do not have time to work on updating their themes/plugins with either bug or feature requests. Even Dark Autumn and Georgia weren’t without their quirks, but I was able to bend those themes to my will. 😛 With 0.7, those themes didn’t work for me (Dark Autumn was updated but I still had niggling issues I couldn’t fix and life got in the way) and with everything else going on, I figuratively threw up my hands and left the default K2 theme. I suspect the lack of themes is probably hurting the adoption of Habari. It probably doesn’t help that the only *definitive* post on creating themes with Habari is over 2 years old (it actually refers to porting WordPress themes) and that needs to change. There was actually a thread on Habari’s user mailing list where a Habari user was quite confused by reading the Habari Wiki page for creating custom Habari themes! After twittering about my switch back to WordPress, a Habari dev reached out to me and I told him one of the major reasons for leaving was the themes. [blackbirdpie url=”https://twitter.com/#!/janetalkstech/status/71723482823659520″]
  5. Habari Plugins: I like to compare my switch back to WordPress from Habari to my switch to Android from Symbian. Is Habari powerful with cleaner code, etc? Yes. However, the theme & plugin ecosystem for WordPress is huge which makes it the default release for several newfangled web services on the market and reminiscent of how apps are typically released on the iPhone first before Android. Nevertheless, there *are* plugins for the major sites for Habari and as a testament to the dedication of the Habari developers, a good number of these working plugins were written by them. For instance,
    1. Photosharing: There are plugins available for viewing/inserting photos from: Flickr (bundled by default with Habari), Smugmug written by Colin Seymour, Picasa silo written by web development firm, Second Variety, Photozou, etc.
    2. Social media: There are plugins available for viewing/inserting tweets, sharing blog posts, etc.
    3. Code sharing: There are plugins for embedding source code in blog posts, and many other plugins on the Habari distribution hub.

    However, WordPress wins out when it comes to the amount of customization that is possible with the help of plugins and I have truly come to appreciate the more mature plugins for services like Google Analytics, SEO, etc which keep me in my dashboard and help me continue to write.
    [blackbirdpie url=”https://twitter.com/#!/janetalkstech/status/72295045386076160″]
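To make the slug problem from item 2 concrete, here is a small Python sketch. It assumes a generic rule where the slug is derived from the title by lowercasing and hyphenating; that is an illustration only, not Habari’s or WordPress’s actual code, and the post titles are made up.

```python
import re


def slugify(title):
    """Generic slug rule (an assumption): lowercase, non-alphanumerics to hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


# Hypothetical titles; only the first slug appears elsewhere on this site.
original_slug = slugify("The Motorola Atrix")
edited_slug = slugify("The Motorola Atrix Review")

# A pingback recorded before the title edit still points at the old slug.
old_pingback_target = "/" + original_slug
link_is_broken = old_pingback_target != "/" + edited_slug

print(original_slug, edited_slug, link_is_broken)
```

If the CMS regenerates the slug whenever the title changes, every link stored before the edit goes stale unless something rewrites or redirects it.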

How?

Now, let me dive into the really cool details of how I managed my switch from Habari to WordPress without losing my sanity or (hopefully) ruining my site’s ranking in Google’s eyes! Over a year ago, Chris Meller (a Habari developer) whipped up a small script to aid anyone who wanted to move back from whence they came i.e. WordPress. He used a beta version of WordPress 2.9 but WordPress is now at version 3.1.2! Currently, Chris Meller’s WordPress migration script will move the following:

  1. Posts – published or draft
  2. Pages – published or draft
  3. Comments – approved only
  4. Tags (Note: The script didn’t import my tags for some reason. This is an important warning that, depending on your setup, could be time-consuming work. I talk about how I got around this so keep reading. :D)

So, here are the first steps I followed to get my posts, pages, & comments imported into the test WordPress 2.9 installation and getting to a base installation:

  1. Created a new directory on janetalkstech.com.
  2. Downloaded a copy of WordPress 2.9 from the WordPress release archives as I saw no reason to tempt the gods by attempting the migration with a version of WordPress *he* didn’t use.
  3. Installed WordPress; This version will use the default “admin” username so I made sure I created the same username I had with my Habari installation and made that username an “Administrator”.
  4. Updated the WP Migration script with the database details for the new WordPress installation per Chris’s instructions:

    Well, you dump it in your Habari root directory, edit the array at the top so it can connect to the MySQL database your WordPress instance is installed in, and you load it up in your browser. You should see a bunch of junk about things it finds – and hopefully no MySQL errors along the way.

  5. Uploaded the edited WP migration script to my Habari installation at janetalkstech.com and loaded it in my browser. *Hopefully* you will see no major errors in your case. In my case, a ‘major’ problem was the fact that my tags weren’t imported but I was just euphoric that I didn’t run into any “showstopping” bugs.
  6. Upgraded the test site to WordPress 3.1.2, blocked search engines from accessing the test site to prevent duplicate content problems, turned off commenting/pingbacks/trackbacks because this was a test site and installed tracking code (Google Analytics & StatCounter) to keep tabs on the visitors to the special test install site.
  7. Installed the WordPress DB Backup Plugin and User Role Editor plugin. These two plugins will save you heartache. Just do it now and thank me later.
  8. When I started the move, I had exactly 1,111 tags in my Habari database and over 150 posts. The WordPress migration script did not import my tags and accordingly could not associate the posts with the tags. This was depressing because I was looking at *hours* of copying & pasting and doing tedious data entry. So, what did I do?

Outsource the Tedium

  1. Elance: Before all of this began, I initially created a job on Elance asking for a developer to take care of the mess for me. I got bids ranging from $200 to $300 from non-US based workers. As I was able to get my posts/comments/pages imported, I then created a new job asking for data entry work to add tags & associate the posts with corresponding tags. I got bids for this job ranging from $50 to $250. I wasn’t happy with the workers that were bidding on my jobs so I eventually canceled my jobs and decided to give Amazon’s Mechanical Turk program a try.
  2. Amazon MTurk: The website to sign in as a “Requester” is https://requester.mturk.com. The website is rather confusing so I recommend you skip straight to “create a HIT individually”.
    1. You will be asked to pre-pay for your HITs so make sure you have your credit card info on hand or on your Amazon account.
    2. Err on the side of too much information. I created 2 jobs: one was asking for the creation of my 1,111 tags based on the live site at Jane Talks Tech! for $10. I created screenshots showing how to create a tag, properly create the slug and submit the new tag. The other *very* important thing I did was install the User Role Editor plugin for WordPress. I didn’t have any extremely sensitive information in my drafts or blog posts so I wasn’t too worried about granting someone access to my test site. However, the User Role Editor allowed me to make sure the new user was only able to create tags! Once the tags were created properly, the worker was paid. This worker was able to complete the job in less than 3 hours so I wasn’t left in limbo.
    3. The other job was for updating the posts with the created tags based on the archives at Jane Talks Tech! After Worker #1 was done, the next Worker to accept my HIT was to work on adding the tags to each post on the test site. Again, I created screenshots showing exactly how to do this and the worker was able to complete this task in less than 24 hours! In summary, for less than $25, I was able to get the 1,111 tags created and added to the correct posts via Amazon’s Mechanical Turk program.
    4. I also ended up removing ~ 150 tags that were incorrectly added but thanks to the excellent “search tags” option in WordPress, I was able to find a common phrase within the bad tags and bulk-delete them. 😀
  3. One important thing I did (& kept doing) after major steps like:
    1. running the WP migration script successfully
    2. upgrading to WordPress 3.1.2
    3. creating tags thanks to an Amazon Turk worker
    4. updating posts with correct tags thanks to an Amazon Turk worker
    5. removing invalid tags from posts (done myself) was:

    backing up my WordPress database files via phpMyAdmin where I downloaded the entire database and via the WordPress backup plugin three different ways: email, server and download. It is also a good idea to export your content using the built-in WordPress exporter which outputs a WordPress eXtended RSS file. The idea being: if something goes wrong, you can always fire up another WordPress instance and get going without having to start from a blank/fresh WordPress install.
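The tag cleanup in step 2 (finding the roughly 150 tags that were added incorrectly) boils down to comparing two lists. Here is a minimal Python sketch, assuming the tag names from the live site and the test site have been copied into two lists. The sample tags and the `compare_tags` helper are made up for illustration.

```python
def normalize(tag):
    """Lowercase and collapse whitespace so 'WordPress ' matches 'wordpress'."""
    return " ".join(tag.lower().split())


def compare_tags(live_tags, test_tags):
    """Return (missing_from_test, extra_on_test) as sorted lists."""
    live = {normalize(tag) for tag in live_tags}
    test = {normalize(tag) for tag in test_tags}
    return sorted(live - test), sorted(test - live)


# Hypothetical samples standing in for the 1,111 real tags.
live_tags = ["Habari", "WordPress", "Android"]
test_tags = ["habari", "wordpress ", "wordpress plugin typo"]

missing, extra = compare_tags(live_tags, test_tags)
print("Missing from test site:", missing)
print("Extra on test site:", extra)
```

The "extra" list is what you would then search for and bulk-delete in the WordPress tag screen, after taking yet another backup.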

Cleaning Up & Watching Out!

This part cannot be outsourced, in my opinion. So plan on having 2 – 4 hours to spend on fixing and getting things just right. My major concerns once I had my posts, pages, tags and categories set up properly were: making sure embedded/linked image paths were updated, making sure internal/external urls were updated & functional, updating my sitemap and migrating my comments to the Disqus commenting system.

  1. Updating Image Paths: WordPress’s standard URL for uploaded files is: http://sitename.com/wp-content/uploads/whatever while Habari’s standard URL for uploaded files is: http://sitename.com/user/whatever/. So, I:
    1. Copied over my pictures & screenshots from /user/files to one folder in /wp-content/uploads/ and installed the Search Regex plugin for WordPress. I then performed a simple search for the “/user/files” string and replaced it with my WordPress folder path. Again, please back up your database files several ways before running this plugin. It is very powerful and, done wrongly, can cause serious problems. So before running this, back up; after running this, back up. 🙂
    2. Installed the Broken Link Checker plugin for WordPress which alerted me to more broken image paths and I was able to use the Search Regex plugin to replace the links if there was a pattern. As I’ve become accustomed to, I backed up my database files before and after running this plugin.
  2. Update Links: The Broken Link Checker plugin was invaluable during this process particularly because it let me update the link without manually editing each post! This link checker works on all internal and external links so make sure you don’t constantly ask the program to crawl your links. There are more ways to check your site for dead/bad links but for immediate results or a snapshot of how bad things are, you cannot beat this simple plugin on small sites like mine. Another useful feature of the Broken Link Checker plugin is that it also detects redirected links. That was how I discovered over 200 internal links which had a slash at the end. This is probably a WordPress-related matter, but with the Broken Link Checker plugin, I was able to update the links with the canonical urls and avoid the extra slash.

    Thanks to this plugin, I was able to weed out several dead external links and discover several bad pingbacks/trackbacks that were holdovers from my Habari days of broken trackbacks/pingbacks.

  3. Updating Your Sitemap: With the help of Yoast’s WordPress SEO plugin, I was able to make sure that my site was verified with Google Webmaster Tools, Bing Webmaster Tools and Yahoo Site Explorer. I was also able to regenerate a sitemap of my site and resubmit that to the search engines. WordPress SEO is a very polished plugin that does more than generate a sitemap; it helps you optimize your site and each post or page for select phrases/keywords!
  4. Switching Comment Systems: This was probably the easiest part but nonetheless still nerve-wracking for me. As I alluded to, the official Disqus plugin for WordPress automagically imports your comments into your Disqus profile. I absolutely love Disqus because it allows you to get a fuller picture of your commenters and they’ve proven themselves honest by allowing you to export your Disqus comments back to your WordPress database or exporting all your comments to an XML file (which I am reasonably certain will be supported by whatever other CMS usurps WordPress’s throne).
    [Screenshots: WordPress comments queued for import into Disqus; Disqus successfully imports WordPress comments]
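The image path update in step 1 is, at heart, a plain string replacement over each post’s content. Here is a minimal Python sketch of the idea, run on a copy of the content rather than the live database. The target folder `/wp-content/uploads/habari/` and the sample image name are assumptions for illustration; the Search Regex plugin did the real work.

```python
OLD_PREFIX = "/user/files/"
NEW_PREFIX = "/wp-content/uploads/habari/"  # hypothetical target folder


def rewrite_upload_paths(html, old=OLD_PREFIX, new=NEW_PREFIX):
    """Return the updated HTML and the number of paths that were rewritten."""
    hits = html.count(old)
    return html.replace(old, new), hits


# Hypothetical post content with one Habari-style image path.
post = '<img src="http://janetalkstech.com/user/files/atrix.png" alt="Atrix">'

updated, hits = rewrite_upload_paths(post)
print(hits, updated)
```

Counting the matches before replacing gives a number to compare against what the Broken Link Checker reports afterwards.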

Missing Habari Already

  1. Media Management: Habari has a unique method of managing uploaded media (like photos, documents, etc). In the developers’ words:

    To assist in the management of media, such as images, video and audio, Habari provides a mechanism for defining virtual filesystems for media, called Media Silos. A silo provides a consistent interface for dealing with media.

    This made browsing through my Flickr/Smugmug albums an absolute pleasure and I still have not found a comparable Flickr/Media managing plugin for WordPress. 🙁

  2. Speed/Errors/Timeouts: The Admin backend for Habari loads drastically faster than WordPress’s. Oh, and don’t get me started on the errors I get even when trying to view my Admin page in WordPress!

    I’ve already had more 404 errors in the course of saving or doing anything in the WordPress backend this week than I have in the year of using Habari.

    Crazy, but again, I am aware that I’ve made some stability trade-offs in moving to WordPress because I want in to the WordPress plugin/theme ecosystem. 🙂 In fairness, WordPress is now a fullblown CMS whereas Habari can be called a minimalist blogging system.

The TL;DR version of this post is:

  1. Jane Talks Tech! now runs on WordPress. 😛

Overall, I’m very pleased that my migration from Habari to WordPress went more smoothly than I imagined. I didn’t have to visit IRC once or send flares into the digital universe. 🙂 If you have any questions/comments, drop a note & I’ll do my best to answer.

It’s a happy day today. Oh la la! :)

So, after days of hinting and speculation, I’ve pulled the trigger and with the help of the fantastic developers at Habari (the #habari channel on freenode’s IRC server), I’ve successfully made the switch from the WordPress blogging platform to Habari, the content management system. Stay tuned for more. I leave you with this piece of eye candy from my Flickr gallery of flowers from the UGA Trial Gardens (behind Snelling Dining Hall):
Flower