Altering Your Site’s Permalink Structure

I made a rookie move while on Habari: I switched to using only post titles in my URLs, i.e. janetalkstech.com/%postname%. After I migrated from Habari to WordPress, I decided to keep that structure. Unfortunately for me, the WordPress developers strongly recommend against setting up sites this way. After several days of intermittent 500 errors and general site sluggishness (which, in hindsight, were likely attributable to WordPress having to do more work to figure out what type of content a URL refers to), I got the message and changed my site’s URL structure to janetalkstech.com/%year%/%postname%. Straight from the WordPress Codex:

For performance reasons, it is not a good idea to start your permalink structure with the category, tag, author, or postname fields. The reason is that these are text fields, and using them at the beginning of your permalink structure it takes more time for WordPress to distinguish your Post URLs from Page URLs (which always use the text “page slug” as the URL), and to compensate, WordPress stores a lot of extra information in its database (so much that sites with lots of Pages have experienced difficulties).

Changing my site’s URL structure was as simple as updating my permalinks options page, but with over 120 indexed posts, I needed to make sure that I:

  1. Didn’t annoy my visitors: For sites with fewer than 100 visitors daily like mine, I was not going to lose any money or get irate messages. However, I didn’t want to chase away the few visitors I have by throwing up a bunch of “404 – URL not found” messages. Instead, I needed a way to let them know that there would be a temporary break in “transmission” (so to speak), and I needed to perform the change at a time when I had the lowest traffic volume. Again, I don’t make much money from this site, but if I did, it would not be smart to change my site’s links during the times of heaviest traffic!
  2. Had a plan for a seamless redirection: This was the biggest issue I needed to take care of before pulling the switch on my site’s structural change. I needed to make sure that the old posts were properly redirected, i.e. with a 301 redirect, which tells Google and other search engines that the old content now lives at a different URL. This is what Google recommends when you change a site’s or page’s URL.

This time, I did the grunt work myself instead of going the Amazon Mechanical Turk route I used during my switch from Habari to WordPress. To make sure I didn’t leave my visitors in the lurch, I did the following:

  1. Exported my site’s contents in the WordPress WXR format and backed up my site’s database for good measure. Backing up your data first is just good practice.
  2. Typed out a full list of my site’s posts by looking at my site’s Archives, grouped them by year and, with a simple “search and replace” command in gedit, added the year in which each post was written to its URL, so that janetalkstech.com/the-motorola-atrix became janetalkstech.com/2011/the-motorola-atrix.
  3. Manually created the redirections and a set of redirect rules with the Redirection Plugin. Instead of entering each redirect rule under the Redirection plugin’s “Redirects” tab, you can import a CSV, XML or RSS file into the plugin with the redirects you want and the plugin will do the rest of the work. Of course, the tedious part was entering the data into the CSV spreadsheet. The format for creating the CSV sheet is simple:
    • For each cell, enter the source/bad URL, add a comma and then enter the target/good URL.
    • Move on to the next cell below; rinse and repeat.
    • You can verify the correct syntax for creating the redirect entries manually by exporting your current redirects from the Redirection plugin’s “Modules” tab.
    • Double-check your work to make sure you have the correct redirections in your CSV sheet before importing!
    • I ended up with 122 cells containing redirects to the new URLs.

  4. Installed the WP Maintenance mode plugin for WordPress and turned it on. Make sure you change the “Settings” for the Maintenance Mode plugin to “True”.
  5. Then I changed my site’s permalinks by going to “Settings” and then “Permalinks” in the WordPress administrative backend and entering “/%year%/%postname%”. I didn’t want trailing slashes at the end of my post URLs, so I left the trailing slash out. That may have been naive: it seems the current recommendation *is* to have a trailing slash at the end of URLs, although that article is over a year old, and Matt Cutts says the trailing slash doesn’t matter as much as picking the desired URL style and sticking with it.
  6. Checked my CSV file for typos before importing it in the Redirection plugin’s “import” section. Creating a bulk CSV file to import your redirects is as simple as firing up Google Docs and creating cells in a spreadsheet with the following information:
    1. source URL, which is the old/bad link, e.g. http://janetalkstech.com/the-motorola-atrix
    2. literal comma
    3. target URL, which is the new link, e.g. http://janetalkstech.com/2011/the-motorola-atrix
    4. Each cell of your spreadsheet should contain 1 redirect and be in this format:
      janetalkstech.com/the-motorola-atrix,http://janetalkstech.com/2011/the-motorola-atrix

  7. I verified that all my links were being properly redirected by clicking on old links and casually inspecting my site’s HTTP headers (using Wireshark). So, I was reasonably satisfied that my links and corresponding link juice were being passed on to the new URLs.

    GET /using-habari-from-a-users-perspective HTTP/1.1
    Host: janetalkstech.com
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.100 Safari/534.30
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3

    HTTP/1.1 301 Moved Permanently
    Date: Wed, 15 Jun 2011 23:16:07 GMT
    Server: Apache
    X-Pingback: http://janetalkstech.com/xmlrpc.php
    Expires: Wed, 11 Jan 1984 05:00:00 GMT
    Cache-Control: no-cache, must-revalidate, max-age=0
    Pragma: no-cache
    Set-Cookie: PHPSESSID=x; path=/
    Vary: Accept-Encoding,User-Agent
    Last-Modified: Wed, 15 Jun 2011 23:16:13 GMT
    Location: http://janetalkstech.com/2009/using-habari-from-a-users-perspective
    Content-Encoding: gzip
    Content-Length: 20
    Keep-Alive: timeout=2, max=100
    Connection: Keep-Alive
    Content-Type: text/html; charset=UTF-8

    GET /2009/using-habari-from-a-users-perspective HTTP/1.1
    Host: janetalkstech.com
    Connection: keep-alive
    User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.30 (KHTML, like Gecko) Chrome/12.0.742.100 Safari/534.30
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Encoding: gzip,deflate,sdch
    Accept-Language: en-US,en;q=0.8
    Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
    Cookie: PHPSESSID=x

  8. Turned off the WP Maintenance mode plugin and closely monitored my logs for any unusual drops in traffic.
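
For anyone who would rather not type 122 rows by hand, steps 2, 3, 6 and 7 above could be roughly sketched in Python. This is only a sketch under assumptions, not what I actually did: the function names and the `posts` dictionary are made up for illustration, both CSV columns use full URLs as in the sub-items of step 6, and the last helper only reads a response you have already captured as text (e.g. copied out of Wireshark) rather than contacting the site.

```python
import csv
import io

BASE = "http://janetalkstech.com"  # the site used in this post


def build_redirect_rows(posts_by_year, base=BASE):
    """Return (old_url, new_url) pairs for a /%postname% -> /%year%/%postname% move."""
    rows = []
    for year in sorted(posts_by_year):
        for slug in posts_by_year[year]:
            rows.append((f"{base}/{slug}", f"{base}/{year}/{slug}"))
    return rows


def rows_to_csv(rows):
    """Serialise the pairs as 'source,target' lines, one redirect per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def is_permanent_redirect(raw_response, expected_target):
    """Check captured response headers (plain text) for a 301 pointing at expected_target."""
    lines = raw_response.strip().splitlines()
    status_ok = lines[0].split()[1] == "301"
    location = None
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "location":
            location = value.strip()
    return status_ok and location == expected_target


# Hypothetical input: post slugs grouped by the year they were written (step 2).
posts = {
    2011: ["the-motorola-atrix"],
    2009: ["using-habari-from-a-users-perspective"],
}
rows = build_redirect_rows(posts)
csv_text = rows_to_csv(rows)
print(csv_text)
```

The printed text can be saved to a file and checked by eye before importing it, which is the same double-checking described in step 6, just with fewer chances for typos.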

Overall, thanks to the excellent Redirection Plugin by Urban Giraffe, I’ve been able to reliably handle the worst 404 errors that Google Webmaster Tools alerted me to. If you use this plugin, don’t forget to donate! I’ve got a long way to go (see image below) but baby steps. No more switching CMSes for me, that’s for damn sure. :)

  • http://ottodestruct.com Otto

    Wow. You went to a lot more effort than necessary, but well done regardless. :)

    WordPress very likely would have detected the bad incoming links and redirected them to the proper place. It includes a couple of features to do this automatically.

    The first is called redirect_canonical and it redirects any and all “bad” links to the canonical versions with 301’s. This is to help maintain google juice.
    Part of redirect_canonical doesn’t have an actual name for it, but in the code it notices 404’s and tries to figure out what you meant based on partial matching and things like that. The upshot is that the old postname only links likely would have been recognized as such automatically and the users redirected to the new links. No fuss, no muss.

    Now, this isn’t perfect, it can’t guess everything, but I feel pretty sure that it would have picked up your use case. So realistically, all you had to do was to go change the permalink structure and be done with it.

    For the cases where the automatic functionality didn’t work, there’s a couple of “permalink migration” plugins for WordPress too. These simply let you put in the old permalink structure on a plugin page, and it auto-redirects all old links to the new ones. This one works pretty well, I think: http://wordpress.org/extend/plugins/permalinks-migration-plugin-for-wordpress/

    • http://janetalkstech.com Jane Ullah

      Thanks for visiting my lowly blog and leaving a comment, Otto! :) And for making my post old news already. :P

      Before uploading the manual redirects I created, I went to the old/bad URLs and I was getting immediate 404 errors. Unless there was a setting to toggle WP’s built-in 404-catch-and-redirect functionality, it wasn’t happening for me. :( Still, that’s amazing that WordPress apparently does this! In my searches, I simply didn’t even consider that WordPress would automatically handle this.

      With detritus from my previous Habari to WordPress migration, I didn’t want to leave anything to chance, hence the overkill. Also, I was having problems with links like janetalkstech.com/?p=XXX going to the wrong posts, presumably because of how WordPress’ redirecting rules determine the matches alphabetically (correct me if I’m making stuff up) :P Thanks again, and I’ll probably update this post to highlight the choice bits of your comment if that’s okay.

      Cheers,
      Jane

      • http://www.seobutler.co.uk Martin Oddy

        Hmm. Having used WordPress for a fair old while, I can’t say I’ve come across any issues with having a yourdomain.com/%postname% URL structure. It’s something I’ve employed myself on numerous websites, and I know others in the SEO community that do the same. It makes sense though, and seems to be a failing in the WordPress system if anything. Thanks for the heads-up :)

        • http://janetalkstech.com Jane Ullah

          Martin,

          My site is hosted on a Dreamhost shared hosting server so this site’s performance could do with some speed. I have seen sites using the yourdomain.com/%postname% structure but without knowing the details of their server setup, etc, I wouldn’t recommend that a newbie (like me) use that permalink structure. Thanks for dropping by and I’m digging your website. :)

          Jane