Odd query strings in Googlebot requests

Googlebot (edit: yes, it really is Google; the IP resolves) seems to be appending arbitrary query strings to requests for our home page.

xx.xxx.xx.xxx - - [30/Jun/2009:10:14:37 -0400] "GET /?key=61680 HTTP/1.1" 200 3334 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
xx.xxx.xx.xxx - - [30/Jun/2009:10:16:58 -0400] "GET /?term=byron HTTP/1.1" 200 3184 "-" "DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)"

Any idea what these are meant for?


Solution 1:

It looks like Googlebot may be lightly probing your site for possible duplicate-content issues, or checking whether your site correctly handles non-existent files (by returning a 404 status code) and bogus query strings.
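If duplicate content is the concern, one common defence is to reduce every requested URL to a single canonical form and redirect to it (or advertise it in a rel="canonical" link). Here is a minimal sketch of that idea; the KNOWN_PARAMS whitelist and the canonical_url helper are hypothetical names made up for illustration, not anything from your site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical whitelist: the query parameters this site actually understands.
KNOWN_PARAMS = {"page", "q"}

def canonical_url(url, known_params=KNOWN_PARAMS):
    """Return url with unknown query parameters dropped and the rest sorted."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in known_params
    ]
    kept.sort()  # stable ordering so equivalent URLs compare equal
    # The fragment is discarded; it is never sent to the server anyway.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

If canonical_url(requested) differs from the requested URL, the application could answer with a 301 redirect to the canonical form, so a request like /?key=61680 would collapse back to / instead of looking like a second copy of the home page.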

It may also be testing whether your site is some kind of link farm, by checking whether bogus query strings return different results.

It's also possible that someone out there has linked to your site using those query string parameters, and Googlebot is just following those links to see what they're all about. If that's the case, try to find out who's linking to you in such a way and see if you can get them to correct their links.

Solution 2:

Do these requests appear alongside other Googlebot entries? If not, Googlebot may be following links from another website to yours so that its algorithms can verify the connection. That would mean another website links to yours using those URLs. I don't know whether spam or link domains can do anything with those URLs.
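To answer that from the logs themselves, something like the following rough sketch could tally which query parameter names Googlebot is requesting. It assumes the combined log format shown in the question; LOG_RE and googlebot_query_params are names invented here, and the match on the user-agent string does not by itself prove the request came from Google.

```python
import re
from collections import Counter
from urllib.parse import urlsplit, parse_qsl

# Picks the request line, status, and user agent out of a combined-format log entry.
LOG_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_query_params(lines):
    """Count query parameter names seen in requests whose user agent mentions Googlebot."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # not a parsable entry, or not a Googlebot user agent
        query = urlsplit(match.group("target")).query
        for name, _value in parse_qsl(query, keep_blank_values=True):
            counts[name] += 1
    return counts
```

Feeding it the access log (for example, googlebot_query_params(open(path)) with your own log path) gives a quick picture of whether the odd parameters are isolated hits or part of a normal crawl.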

As I don't necessarily understand everything Googlebot does, I could be wrong, of course.

Solution 3:

For the past few days Googlebot has been doing the same thing to one of our sites. It appears to be inserting a query string parameter whose key matches one we use, but with a string value where we expect an integer (e.g. the parameter should be something like gb=22, but Googlebot is requesting gb=lkcvvzxxz).

What's worse, Googlebot is indexing these bad URLs into Google.
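One thing that might keep those URLs out of the index is to validate the parameter and answer with a 404 when the value is not an integer, rather than serving the page with a 200. A minimal sketch of that check follows; status_for_query is a hypothetical helper, and "gb" is just the placeholder name from the example above.

```python
from urllib.parse import parse_qs

def status_for_query(query_string, int_params=("gb",)):
    """Return 404 if an integer-only parameter carries a non-integer value, else 200."""
    params = parse_qs(query_string, keep_blank_values=True)
    for name in int_params:
        for value in params.get(name, []):
            # Accept only plain ASCII digits; blank or alphabetic values are rejected.
            if not (value.isascii() and value.isdigit()):
                return 404
    return 200
```

With this, a request for gb=22 would still be served normally, while gb=lkcvvzxxz would get a 404. How the status is actually sent depends on the framework in use, which is not covered here.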

I would love to see this question answered. I know this should have been a comment, but I don't have the points to do that on Server Fault yet...