Google sitemaps revisited
So I put my money where my mouth is and decided to take the plunge with Google Sitemaps. When it comes to scripting, I’m neither geek nor luddite [a geedite?], and it took me some figuring out. With hindsight, though, the whole thing looks far more daunting than it actually is, so let me summarise how it works.
First, here is what you need to make it work:
1. Python 2.2 (or higher) has to be installed on your webserver. That’s usually the case, but do check with your webserver administrator.
2. You need Telnet or SSH access to your webserver. If not currently enabled, most webserver administrators will enable it for you, provided you tell them why you need it.
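Once you have shell access, you can check the Python version yourself instead of asking. Here is a minimal sketch, assuming the python command on your server starts the interpreter you intend to use; the function name python_is_new_enough is my own invention, not part of the Google script.

```python
import sys

def python_is_new_enough(version_info=None, minimum=(2, 2)):
    """Return True if the given (or running) interpreter is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor), e.g. (2, 4) >= (2, 2).
    return tuple(version_info[:2]) >= tuple(minimum)

if __name__ == "__main__":
    print("Python version: " + sys.version.split()[0])
    if python_is_new_enough():
        print("Meets the 2.2 minimum")
    else:
        print("Older than 2.2, ask your administrator about an upgrade")
```

Save it as, say, check_python.py, upload it, and run it with python check_python.py from the command line.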
Download the script and a few associated files here, and bookmark the page, as it contains the detailed instructions. Unzip the files.
Then, get to work!
Firstly decide how you want the script, sitemap_gen.py, to generate your sitemap, which will be called sitemap.xml.gz:
- from a list of URLs, stored in a *.txt file, supplied by you
- by letting the script map your domain automatically
- from access logs
Your choice will determine how to edit the example_config.xml file, so it contains information about your domain which the script needs to create the correct sitemap.
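If you go for the URL list option, the text file is easy enough to produce by hand, but a few lines of Python save typing on a bigger site. This is only a sketch under my own assumptions: that the list holds one full URL per line (check the detailed instructions for any extras the script accepts), and that www.example.com and the page names stand in for your own. The function build_url_list is hypothetical, not something shipped with the script.

```python
def build_url_list(base_url, paths):
    """Join a base URL and page paths into one-URL-per-line text."""
    base = base_url.rstrip("/")
    lines = []
    for path in paths:
        # Avoid doubled or missing slashes between domain and page.
        lines.append(base + "/" + path.lstrip("/"))
    return "\n".join(lines) + "\n"

if __name__ == "__main__":
    # Placeholder pages; replace with the pages of your own site.
    pages = ["", "about.html", "/blog/index.html"]
    print(build_url_list("http://www.example.com/", pages))
```

Paste the printed lines into your *.txt file, or redirect the output into it.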
Save the example_config.xml as a text file, e.g. config.txt, so you can edit it in Notepad.
In the areas where it says “modify or delete”, you need to do just that. If you don’t want the script to use an area, delete it rather than leaving it unmodified, because an unmodified area will cause errors. Next save the config file as config.xml.
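Deleting areas by hand makes it easy to leave a stray tag behind, so it can be worth checking that the file is still well-formed XML before uploading it. A minimal sketch, assuming a recent Python 3 on your own machine. The SAMPLE text is a simplified stand-in I made up for illustration, so substitute the contents of your own config.xml; summarise_config is likewise my own helper name.

```python
import xml.etree.ElementTree as ET

def summarise_config(xml_text):
    """Parse config text; return (root tag, list of child element tags).

    Raises xml.etree.ElementTree.ParseError if the XML is malformed.
    """
    root = ET.fromstring(xml_text)
    return root.tag, [child.tag for child in root]

# Made-up stand-in for a config file; use your real config.xml instead.
SAMPLE = """<site base_url="http://www.example.com/">
  <urllist path="urllist.txt" />
</site>"""

if __name__ == "__main__":
    try:
        print(summarise_config(SAMPLE))
    except ET.ParseError as error:
        print("Config is not well-formed XML: " + str(error))
```

To check the real file, read it with open("config.xml", "rb").read() and pass the result to summarise_config. This only catches broken XML; it can’t tell you whether the script will like the values you filled in.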
Now upload config.xml and sitemap_gen.py to your webspace. If you’re using the URL list option, you need to upload the *.txt file containing the URLs as well.
You’re almost done but I found the next bit the trickiest. You need to run the script from the command line of your server using Telnet/SSH.
Connect to your webserver using Telnet/SSH and at the command line prompt $ type in:
python /path/sitemap_gen.py --config=/path/config.xml --testing
Where /path/ is the exact location of the script and the config file (I’m assuming you’ve uploaded them to the same location). The most common error encountered here is where you get:
$ python: can’t open ‘/path/sitemap_gen.py’
What the snake is telling you is that it can’t find the file, probably because you’re specifying the wrong path. Make sure you have the right path, or try eliminating the starting ‘/’ (path/ instead of /path/).
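If you keep guessing at the path, you can let Python tell you where it is actually looking. This is a small sketch, assuming you run it from the directory you believe holds the files; describe_path is a name I made up.

```python
import os

def describe_path(path):
    """Return the absolute form of `path` and whether a file exists there."""
    full = os.path.abspath(path)
    if os.path.isfile(full):
        return full + " : found"
    return full + " : NOT found"

if __name__ == "__main__":
    # The two files the command line above refers to.
    for name in ("sitemap_gen.py", "config.xml"):
        print(describe_path(name))
```

The absolute path it prints for a file it finds is the one to use in place of /path/.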
If it works, run it again, this time without the ‘--testing’ bit. Now it will create the sitemap.xml.gz file and notify Google.
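Before submitting anything, I’d take a quick look inside the generated file to confirm it lists the pages you expect. A minimal sketch, assuming a recent Python 3 on your own machine and that the sitemap stores each address in a loc element, as the sitemap format does; list_sitemap_urls is my own helper, not part of the Google script.

```python
import gzip
import xml.etree.ElementTree as ET

def list_sitemap_urls(gz_bytes):
    """Decompress a gzipped sitemap and return the URLs in its <loc> elements."""
    root = ET.fromstring(gzip.decompress(gz_bytes))
    urls = []
    for element in root.iter():
        # Tags look like "{namespace}loc"; keep only the local name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "loc" and element.text:
            urls.append(element.text.strip())
    return urls

# Usage, once sitemap.xml.gz has been downloaded next to this script:
#   with open("sitemap.xml.gz", "rb") as handle:
#       for url in list_sitemap_urls(handle.read()):
#           print(url)
```

If the list comes back empty or is missing whole sections of your site, revisit the config file before you submit.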
Next you need to submit your sitemap to Google. You need a (free) Google account, for example one you use with Froogle, Groups or any other user group. Submit your sitemap, then sit back and wait for approval. Currently it only takes a few hours.