Last spring, Matt Cutts pointed out that it's a violation of Google's quality guidelines to allow your internal search results to be indexed. (See: Search results in search results.) Murphy being an optimist, I've wound up with my own search results in Google's index. A lot of them, according to a site: query.
I've blocked these in robots.txt, and nofollowed some of the similar links. The search form uses GET, so the site itself has no links to the result pages ... meaning there have to be external links for Google to know about them. I don't know PHP well enough to hack my particular WordPress template to add a robots meta tag to the search piece of the code ... although I might wind up learning.
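For reference, a robots.txt block of this kind might look like the sketch below. It assumes the search URLs take the /index.php?s= form mentioned in this post, plus the shorter /?s= variant, which is an assumption; the rules are prefix matches, so a URL where s= is not the first parameter would slip through. It's also worth keeping in mind that robots.txt stops crawling rather than indexing, so a URL Google learns about from an external link can still show up in the index, and a blocked page's meta tag never gets read.

```
# Sketch only: assumes search URLs start with /index.php?s= or /?s=
User-agent: *
Disallow: /index.php?s=
Disallow: /?s=
```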
It would be a lot easier to just use .htaccess: if the user agent is Googlebot and the request is for a search, I can redirect to the home page, or to some arbitrary URL that can be indexed. That's the least effort to implement, although it sounds a lot like cloaking. In theory you aren't supposed to do anything solely for a search engine's benefit, but I don't personally care whether my /index.php?s= pages are being indexed.
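A minimal sketch of that .htaccess idea follows, with some assumptions: mod_rewrite is enabled, the file sits in the site root, searches arrive as /index.php?s=... or /?s=..., and the rules are placed above WordPress's own rewrite block. Matching on the string "Googlebot" is a simplification, and the cloaking concern above applies to this approach in full.

```apache
# Sketch only, not a tested production rule set.
<IfModule mod_rewrite.c>
RewriteEngine On
# Only act when the user agent contains "Googlebot" (case-insensitive)
RewriteCond %{HTTP_USER_AGENT} Googlebot [NC]
# Only act when the query string carries an s= search parameter
RewriteCond %{QUERY_STRING} (^|&)s= [NC]
# Redirect / or /index.php to the home page; the trailing ? drops the query string
RewriteRule ^(index\.php)?$ /? [R=301,L]
</IfModule>
```

The 301 tells the crawler the move is permanent; swapping in R=302 would make it temporary if this is only meant as a stopgap.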