This post is part of ProBlogger.net’s 31 Days to Building a Better Blog series.
Let me start off by saying that if you haven’t heard of Google Sitemaps yet, then you probably haven’t been worrying a whole lot about SEO (search engine optimization) for your site. I say this because even a cursory glance through the numerous SEO-related blogs on the internet will reveal the huge buzz, and sometimes disappointment, that has surrounded the implementation of Google’s new Sitemap program. Wondering what a sitemap is? Here’s a little snippet from Google:
What is Google Sitemaps?
Google Sitemaps is an experiment in web crawling. Using Sitemaps to inform and direct our crawlers, we hope to expand our coverage of the web and speed up the discovery and addition of pages to our index. By placing a Sitemap-formatted file on your web server, you enable our crawlers to find out what pages are present and which have recently changed, and to crawl your site accordingly.
Basically, the steps to participating in Google Sitemaps are:
1. Creating a Sitemap in a supported format.
2. Submitting that Sitemap to Google.
3. Updating your Sitemap when your site changes.
So why would you want to use a Sitemap? More than likely you would use it in the hopes of getting your pages quickly indexed by Google. While I have read differing opinions on what to expect after submitting a sitemap, my personal experience has been nothing but positive. For instance, after struggling for a couple of days to find a sitemap solution for my TypePad blog, I finally managed to create a sitemap and get it through the submission process without any errors. Within about 36 hours I was starting to get fairly regular hits from Google: nothing spectacular, but far more, and to deeper pages, than I had ever gotten before.
Now, I’m an admitted newbie at blogging, and while there are numerous more professional and tech-heavy ways to create and submit sitemaps, this post is going to concentrate on helping my fellow newbies, particularly TypePad users. A sitemap in its most basic form is a simple text (.txt) file that contains the URLs for all of the pages of your blog. The XML version can also list when each post was last modified, and what you consider the relative importance of each post. There are numerous free programs available online that can create your sitemap for you, and once it’s made you can download it to your computer and then upload it to the root level of your server. I first tried out one of these online solutions as just a quick way to get Google to spider some of my earliest posts (and it worked fine), but I personally prefer a more automated system, and was really hoping to find a way to have my sitemap updated automatically whenever I publish a new post. After a lot of searching and fumbling I finally found a way that works well, but isn’t exactly perfect.
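For anyone curious what one of those generator programs is doing under the hood, here is a minimal Python sketch of building an XML sitemap by hand. It is only an illustration, not the tool I used: the function name build_sitemap and the example.com addresses are made up, and the namespace is the one that appears in the template later in this post.

```python
from xml.sax.saxutils import escape

# Namespace taken from the sitemap template shown later in this post.
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(entries):
    """Return sitemap XML for a list of (url, priority, changefreq) tuples.

    build_sitemap is a hypothetical helper written for this example.
    """
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="%s">' % SITEMAP_NS,
    ]
    for url, priority, changefreq in entries:
        lines.append("  <url>")
        # escape() turns &, < and > into XML entities so the file stays valid.
        lines.append("    <loc>%s</loc>" % escape(url))
        lines.append("    <priority>%.1f</priority>" % priority)
        lines.append("    <changefreq>%s</changefreq>" % changefreq)
        lines.append("  </url>")
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder URLs; swap in your own blog's addresses.
    pages = [
        ("http://example.com/", 1.0, "daily"),
        ("http://example.com/2005/08/first-post.html", 0.9, "daily"),
    ]
    with open("sitemap.xml", "w", encoding="utf-8") as handle:
        handle.write(build_sitemap(pages))
```

Running the script writes a sitemap.xml file into the current folder, which you would then upload to the root level of your server.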
Because Cancer NewsWatch is hosted on TypePad, I did a search for “TypePad Sitemap Template” on Google, and sure enough, I came across a template that a fellow blogger had put together for Movable Type, but was supposed to work for TypePad too (Google SiteMaps for Movable Type). You need to have a Pro account on TypePad in order to use this, as you will have to go into your blog’s design templates and create an extra template. Name it what you like, tell it what to name the file it will output (make sure to put a .xml at the end of the name, something like “sitemap.xml”) and then paste in whatever sitemap code you want to use. Next time you publish a post, this new template will be used to create a sitemap, and it will be placed at the root of your blog. You can then go to Google’s Sitemap Submission page, and submit your newly made sitemap file.
Well, while all of this sounds fine and dandy, I came across a single stumbling block: my sitemap wouldn’t validate at Google : ( After submitting my sitemap’s URL and waiting at least a few hours for Google to download it, I got an invalid time error. I searched my sitemap file, but couldn’t find anything wrong with the way it adhered to Google’s rather strict timing convention. So, having read that the time wasn’t absolutely necessary for my sitemap, and really, really, really wanting the thing to go through, I edited my template to remove the time-last-modified code, and submitted it again… guess what? Now I got an invalid date error. Yet again I searched through my file, but couldn’t find any sign of it deviating from what Google was asking for. Absolutely flustered, I removed the entirety of the last-modified code and submitted it again. This is what it looks like:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
<url>
<loc><$MTBlogURL encode_xml="1"$></loc>
<priority>1.0</priority>
<changefreq>daily</changefreq>
</url>
<MTEntries lastn="10">
<url>
<loc><$MTEntryPermalink encode_xml="1"$></loc>
<priority>0.9</priority>
<changefreq>daily</changefreq>
</url>
</MTEntries>
<MTEntries lastn="10" offset="10">
<url>
<loc><$MTEntryPermalink encode_xml="1"$></loc>
<priority>0.8</priority>
<changefreq>daily</changefreq>
</url>
</MTEntries>
<MTEntries lastn="1000" offset="20">
<url>
<loc><$MTEntryPermalink encode_xml="1"$></loc>
<priority>0.5</priority>
<changefreq>monthly</changefreq>
</url>
</MTEntries>
</urlset>
Thank God, it finally worked!
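For anyone who wants to try putting the last-modified info back in, the sitemap format expects those values in W3C Datetime style: either a plain date like 2005-08-15, or a full timestamp that ends with a timezone offset. I never pinned down what tripped up my own file, so take this as a rough Python sketch of the expected shape rather than a diagnosis. The helper names w3c_datetime and w3c_date are made up for this example.

```python
from datetime import datetime, timezone


def w3c_datetime(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss+hh:mm.

    Hypothetical helper. It refuses naive datetimes because a timestamp
    without a timezone offset is not a complete W3C Datetime value.
    """
    if moment.tzinfo is None:
        raise ValueError("need a timezone-aware datetime")
    # Drop microseconds so the output stays at whole-second precision.
    return moment.replace(microsecond=0).isoformat()


def w3c_date(moment):
    """Format just the date part as YYYY-MM-DD (also a valid value)."""
    return moment.date().isoformat()


if __name__ == "__main__":
    published = datetime(2005, 8, 15, 13, 45, 0, tzinfo=timezone.utc)
    print("<lastmod>%s</lastmod>" % w3c_datetime(published))
    print("<lastmod>%s</lastmod>" % w3c_date(published))
```

Comparing the output of something like this against what your own template produces is a quick way to spot a missing offset or a stray space.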
Now I realize this isn’t the optimal way to do things, and it’s possible I’ll take some kind of hit for not providing the last-modified info, but I did manage to create a fully automated system for creating and submitting my sitemaps, and it has obviously paid off already, because I’m getting hits on pages that haven’t seen the light of day since they were first posted. Want to let Google know when you’ve updated your sitemap? Just ping them! Because I use ecto for all of my editing, I just set ecto’s ping preferences to notify Google whenever I post, so whenever Google comes spidering my site they find a brand spankin’ new sitemap for their web-crawling pleasure. So far, the benefits of providing Google with a blueprint to my site have far outweighed the trouble and frustration of making it all work. Now I’m just gonna’ sit back and ride the wave to Google Stardom ; )
Yeah, right…
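If you don’t use ecto, a ping is nothing more than a request to a web address with your sitemap’s location tacked on the end. Here is a small Python sketch that only builds such an address and does not send anything. Big caveat: the ping address in the code is an assumption on my part (it is not something ecto showed me and it is not given anywhere in this post), so check Google’s current documentation before relying on it. The function name build_ping_url and the example.com address are placeholders too.

```python
from urllib.parse import quote

# ASSUMPTION: this ping address is not taken from the post; verify it
# against Google's documentation before using it for real.
PING_ENDPOINT = "http://www.google.com/webmasters/sitemaps/ping"


def build_ping_url(sitemap_url, endpoint=PING_ENDPOINT):
    """Return the ping address for a sitemap (hypothetical helper).

    The sitemap URL is percent-encoded so that its own ':' and '/'
    characters cannot be confused with the ping address itself.
    """
    return "%s?sitemap=%s" % (endpoint, quote(sitemap_url, safe=""))


if __name__ == "__main__":
    # Placeholder sitemap location; use your own blog's sitemap URL.
    print(build_ping_url("http://example.com/sitemap.xml"))
```

Pasting the printed address into a browser would be the do-it-yourself version of what a blog editor’s ping setting does for you on every post.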

