Jan 22, 2009

A rake task to generate a Google sitemap

I wanted to generate a dynamic sitemap on the off chance it would help with Google results. Static sitemaps are pretty straightforward, but if you've got a lot of pages and they change daily, hand-editing files just won't do. In that case you need a rake task, along with cron, to regenerate the sitemap. Here's one way you can do that.


In this scenario, I'm assuming you have a bunch of stores and that you want the sitemap to list them all. (You should replace the find statement with whatever query is appropriate for your sitemap.)


desc "Generate a sitemap.xml file"
task :generate_sitemap => :environment do |t|
  filename = "#{RAILS_ROOT}/public/sitemap.xml"
  stores = Store.find(:all)

  File.open(filename, "w") do |file|
    # Builder writes straight into the open file as each element is emitted
    xml = Builder::XmlMarkup.new(:target => file, :indent => 2)
    xml.instruct!
    xml.urlset "xmlns" => "http://www.sitemaps.org/schemas/sitemap/0.9" do
      stores.each do |store|
        xml.url do
          # Runs of non-alphanumeric characters in the name become a single hyphen
          xml.loc "http://www.buyindie.net/stores/#{store.id}-#{store.name.gsub(/[^a-z0-9]+/i, '-')}"
          xml.lastmod store.updated_at.xmlschema
          xml.changefreq "weekly"
          xml.priority 0.6
        end
      end
    end
  end
end
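One detail worth checking is the gsub that builds the last part of each URL: a name that ends in punctuation comes out with a trailing hyphen. Here is a small sketch of how that piece could be pulled out and tidied. The helper name store_slug is hypothetical (it is not part of the task above), and trimming the stray hyphens is my assumption about what you'd want, not something the task does.

```ruby
# Hypothetical helper, not part of the original task: builds the
# "id-name" segment that the task puts at the end of each <loc> URL.
def store_slug(id, name)
  # Same substitution as the task: each run of non-alphanumeric
  # characters becomes a single hyphen.
  slug = name.gsub(/[^a-z0-9]+/i, '-')
  # Extra step (an assumption, not in the task): drop leading and
  # trailing hyphens left behind by punctuation at the edges.
  slug = slug.gsub(/\A-+|-+\z/, '')
  "#{id}-#{slug}"
end

puts store_slug(42, "Joe's Records!")  # prints "42-Joe-s-Records"
```

If your URLs already rely on the untrimmed form, leave the second gsub out so the sitemap matches the routes you actually serve.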


Running that task (rake generate_sitemap) will generate a sitemap.xml file in your public directory, which you can submit to The Google for inclusion. There are other ways to do this, though. One criticism of scripts like this is that if there's an error in the sitemap, it may not be obvious to other programmers where the code that generates it lives.

To address that, one could generate the sitemap in the same way one would generate an RSS feed. Sprinkle in some page caching and expiration and you're on your way to having a sitemap generator that is friendly enough to share with your fellow coders.
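In Rails that would normally be a controller action with a Builder view, which I can't show in a form that runs on its own. The rough sketch below only illustrates the core idea: build the sitemap as a string on request, so a handler can return it and a cache can store it. Everything in it is an assumption for illustration. Store is a plain Struct standing in for the model, sitemap_xml and xml_escape are hypothetical names, example.com is a placeholder host, and the hand-written XML replaces Builder so the example needs only the standard library.

```ruby
require 'time' # provides Time#xmlschema on older Rubies

# Hypothetical stand-in for the ActiveRecord Store model.
Store = Struct.new(:id, :name, :updated_at)

# Escape the characters that are unsafe inside an XML text node.
def xml_escape(text)
  text.to_s.gsub(/[&<>]/, '&' => '&amp;', '<' => '&lt;', '>' => '&gt;')
end

# Returns the sitemap as a String instead of writing it to a file,
# which is the shape a controller action or cache layer would want.
def sitemap_xml(stores, base_url)
  lines = [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
  ]
  stores.each do |store|
    # Same URL scheme as the rake task: id, hyphen, hyphenated name.
    slug = store.name.gsub(/[^a-z0-9]+/i, '-')
    loc = "#{base_url}/stores/#{store.id}-#{slug}"
    lines << '  <url>'
    lines << "    <loc>#{xml_escape(loc)}</loc>"
    lines << "    <lastmod>#{store.updated_at.xmlschema}</lastmod>"
    lines << '    <changefreq>weekly</changefreq>'
    lines << '    <priority>0.6</priority>'
    lines << '  </url>'
  end
  lines << '</urlset>'
  lines.join("\n") + "\n"
end

stores = [Store.new(1, 'Indie & Co', Time.utc(2009, 1, 22))]
puts sitemap_xml(stores, 'http://example.com')
```

Caching and expiration are left out here. In a Rails app they would sit around the action that returns this string, with the cached copy expired whenever a store changes.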
