Monday, June 06, 2005

Google sitemaps revisited

So I put my money where my mouth is and decided to take the plunge with Google Sitemaps. When it comes to scripting, I’m neither geek nor luddite [a geedite?], and it took me some figuring out. With hindsight, though, the whole thing looks far more daunting than it actually is, so let me summarise how it works.

First, here’s what you need to make it work:

1. Python 2.2 (or higher) has to be installed on your webserver. That’s usually the case, but do check with your webserver administrator.

2. You need Telnet or SSH access to your webserver. If not currently enabled, most webserver administrators will enable it for you, provided you tell them why you need it.


Download the script and a few associated files here and unzip them. Bookmark the page too, as it contains the detailed instructions.

Then, get to work!

First decide how you want the script, sitemap_gen.py, to build your sitemap, which will be called sitemap.xml.gz:


  • from a list of URLs, stored in a *.txt file, supplied by you
  • by letting the script map your domain automatically
  • from access logs

Your choice will determine how to edit the example_config.xml file, so it contains information about your domain which the script needs to create the correct sitemap.

Save the example_config.xml as a text file, e.g. config.txt, so you can edit it in Notepad.

In the areas where it says “modify or delete”, you need to do just that: if you don’t want the script to use an area, delete it rather than leaving it unmodified, because an unmodified area will cause errors. Then save the config file as config.xml.
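
Since the config is edited by hand, it’s easy to leave a tag unclosed when deleting an area. Here is a minimal sketch of a check you could run on config.xml before uploading it. It only confirms that the file parses as XML, not that the script will accept its contents. The function name check_well_formed is my own invention, and the code assumes a recent Python rather than 2.2.

```python
import xml.dom.minidom
from xml.parsers.expat import ExpatError


def check_well_formed(path):
    """Return None if the file parses as XML, otherwise the parser's message."""
    try:
        xml.dom.minidom.parse(path)
    except ExpatError as err:
        # The message includes the line and column where parsing failed.
        return str(err)
    return None
```

Calling check_well_formed("config.xml") should return None for a file that parses, and otherwise a message pointing at the line and column to look at.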

Now upload config.xml and sitemap_gen.py to your webspace. If you’re using the URL list option, you need to upload the *.txt file containing the URLs as well.
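
If you go the URL list route, the *.txt file is, as far as I can tell from the instructions, just plain text with one full URL per line. A minimal sketch for producing such a file, assuming a recent Python; the helper name write_url_list, the file name urllist.txt and the example.com addresses are all placeholders of mine:

```python
def write_url_list(urls, path):
    """Write one URL per line, skipping blank entries and duplicates.

    Returns the number of URLs written.
    """
    seen = set()
    count = 0
    with open(path, "w") as handle:
        for url in urls:
            url = url.strip()
            if url and url not in seen:
                seen.add(url)
                handle.write(url + "\n")
                count += 1
    return count
```

For example, write_url_list(["http://www.example.com/", "http://www.example.com/about.html"], "urllist.txt") would give you a two-line file ready to upload.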

You’re almost done, but I found the next bit the trickiest: you need to run the script from the command line of your server using Telnet/SSH.

Connect to your webserver using Telnet/SSH and, at the command line prompt ($), type:

python /path/sitemap_gen.py --config=/path/config.xml --testing

Here /path/ is the exact location of the script and the config file (I’m assuming you’ve uploaded them to the same location). The most common error you’ll hit here is:

$ python: can’t open ‘/path/sitemap_gen.py’

What the snake is telling you is that it can’t find the file, probably because you’re specifying the wrong path. Make sure you have the right path, or try dropping the leading ‘/’ (path/ instead of /path/), which makes the path relative to the directory you’re currently in rather than to the root of the server.
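
If you’re not sure which location the server actually sees, a small sketch like this can help. It resolves each path you give it to its absolute form and reports whether a file really exists there. check_paths is a hypothetical helper of mine, not part of the Google script, and it assumes a recent Python.

```python
import os


def check_paths(*paths):
    """Map each given path to (absolute path, whether a file exists there)."""
    report = {}
    for path in paths:
        absolute = os.path.abspath(path)
        report[path] = (absolute, os.path.isfile(absolute))
    return report
```

Run it from the directory where you type the python command, for instance check_paths("/path/sitemap_gen.py", "path/sitemap_gen.py"), to compare how the absolute and relative spellings resolve.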

If it works, run it again, this time without the ‘--testing’ bit. Now it will create the sitemap.xml.gz file and notify Google.
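
To see what ended up in the sitemap, you can unpack sitemap.xml.gz and list the addresses in it. A rough sketch, assuming the file is gzipped XML with each page address in a loc element, which is how the sitemap format stores them. The function name list_sitemap_urls is mine, and the code needs a recent Python rather than 2.2.

```python
import gzip
import xml.etree.ElementTree as ET


def list_sitemap_urls(path):
    """Return the text of every <loc> element in a gzipped sitemap file."""
    with gzip.open(path, "rb") as handle:
        tree = ET.parse(handle)
    urls = []
    for element in tree.iter():
        # Parsed tags look like "{namespace}loc"; compare only the local name
        # so the sketch does not depend on a particular schema version.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            urls.append(element.text.strip())
    return urls
```

Calling list_sitemap_urls("sitemap.xml.gz") gives you a plain list you can eyeball against the pages you expected to be included.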

Next you need to submit your sitemap to Google. For that you need a (free) Google account, for example one you already use for Froogle, Groups or any other Google service. Submit your sitemap, then sit back and wait for approval. Currently it only takes a few hours.

Good luck!

6 Comments:

At 7:25 AM, Anonymous Anonymous said...

hi,
generator SITEMAP in PHP http://www.orvinfait.fr/scripts_web_performant.html

In French, possible to put a translation of this script on your site
Serge Cheminade

 
At 12:48 PM, Blogger Gert said...

Serge,

I'm not entirely sure what exactly you want, but here's a link to the Sitemap guide in French.

Salutations,

Gert

 
At 10:45 AM, Anonymous Anonymous said...

Hi Gert!

There is a program for PHP called phpSitemapNG - you can find more information about it at http://enarion.net/google/

This is much easier to install than the Python solution provided by Google.

Give it a try and give some feedback about your experience.

Best regards,
Tobias

 
At 2:03 PM, Blogger Gert said...

Hi Tobias,

Thanks for your comment. I'll look into your solution.

Personally I don't find the Google OEM script hard to install, certainly not after you've passed the first hurdle. But the PHP generator may be more luddite-friendly...

 
At 4:50 PM, Anonymous Anonymous said...

Hi,
You can use Perl to create the XML Google sitemap; there is a module on CPAN for it: http://cpan.uwinnipeg.ca/htdocs/WWW-Google-SiteMap/WWW/Google/SiteMap.html
Also, if you have a sitemap web page, then it is easy to write a Perl script to extract the links and create the XML Google sitemap file. If you are using a PHP-based tool for dynamically generated pages in your site, for example Mambo, then you can use that to create the sitemap web page.

 
At 9:13 PM, Anonymous Anonymous said...

Just wanted to say thanks for posting this. After 3 days and then stumbling upon your blog, I finally realized that the sitemap_gen.py needed a path as well. :)

 
