My website is www.localautomation.com. Here are some interesting things that have happened over the past three months, which may be of help to people (I'm also looking for help myself).
I launched my site on February 15, 2006, and immediately submitted a sitemap to Google. Within a week, I already had over 100 pages indexed, which I thought was pretty good. All of my pages are written in PHP... all of them use $_SESSION variables... and all of them send the header "Cache-Control: private" via header().
Once I hit about 100 pages or so, I never got any more. During this time, I looked at a lot of my pages and thought, stupidly, "hmmm... maybe Google doesn't like indexing PHP pages". So... I set up my server to automatically parse .html pages as if they are .php pages and changed ALMOST all of my page extensions to .html.
So...
www.localautomation.com/motion-control.php became
www.localautomation.com/motion-control.html... etc.
Guess what happened... Google started removing my pages... slowly but surely... from the index. Within two weeks, I was down to 5 pages! Yahoo started doing the exact same thing. Within two weeks, I had dropped to 5 pages on Yahoo as well... and they were the EXACT same pages that were left on Google!
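One possible factor in a drop like this (an assumption, since the post does not say whether redirects were set up) is that the old .php URLs simply stopped existing, so the crawlers saw the old pages vanish and had to discover the .html ones from scratch. A minimal sketch of a permanent redirect from old to new URLs follows. The function names html_path_for() and redirect_old_php_url() are hypothetical, the code assumes PHP 7.1 or later, and it assumes the script is wired up to receive requests for the missing .php URLs (for example as the server's 404 handler).

```php
<?php
// Hypothetical helper: map an old ".php" path to its new ".html" path.
// Returns null when the path does not end in ".php".
function html_path_for(string $path): ?string
{
    if (substr($path, -4) !== '.php') {
        return null;
    }
    return substr($path, 0, -4) . '.html';
}

// Hypothetical wiring: send a 301 (permanent) redirect for an old URL.
// Returns true if a redirect header was sent, false otherwise.
function redirect_old_php_url(string $requestUri): bool
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    $target = is_string($path) ? html_path_for($path) : null;
    if ($target === null) {
        return false;
    }
    // Third argument sets the HTTP response code.
    header('Location: http://www.localautomation.com' . $target, true, 301);
    return true;
}

// Usage, before any output is sent:
// if (redirect_old_php_url($_SERVER['REQUEST_URI'])) {
//     exit;
// }
```

With this in place, a request for /motion-control.php would be answered with a 301 pointing at /motion-control.html, instead of an error page.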
The only pages left were the 5 that still had the .php extension and did not call header("Cache-Control: private"). Hence, my next great idea... remove header("Cache-Control: private") from all of my pages.
I did this yesterday... and Yahoo is now starting to add pages back again (I'm still waiting on Google). However, Yahoo is only adding back the .php pages... not the .html pages. Interesting.
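Worth knowing when removing that header: with PHP's default session settings, session_start() sends its own cache headers (the "nocache" limiter), so pages that use $_SESSION may still go out marked as uncacheable even without an explicit header() call. A minimal sketch of turning that off and sending an explicit header instead is below. The function names and the one-hour max-age are hypothetical choices, not something from this site, and "public" is only appropriate for pages that show the same content to every visitor.

```php
<?php
// Hypothetical helper: build the Cache-Control value to send.
function cache_control_value(int $maxAgeSeconds): string
{
    return 'public, max-age=' . $maxAgeSeconds;
}

// Hypothetical helper: start a session without PHP's automatic
// cache headers, then send an explicit Cache-Control header.
// Must be called before any output is sent.
function start_cacheable_session(int $maxAgeSeconds = 3600): void
{
    // An empty string turns off the automatic cache headers;
    // it has to be set before session_start().
    session_cache_limiter('');
    session_start();
    header('Cache-Control: ' . cache_control_value($maxAgeSeconds));
}

// Usage at the very top of a page:
// start_cacheable_session();
```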
So... here are my recent conclusions from this:
1. Google and Yahoo don't like header("Cache-Control: private").
2. Google and Yahoo don't like .html pages parsed as .php (maybe this is due to my $_SESSION variables - I'm not sure).
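On the $_SESSION guess in point 2: depending on the PHP version and configuration, PHP can fall back to putting the session ID into URLs when a visitor does not accept cookies, and crawlers generally do not. That would give the same page many different URLs. Whether this is happening on this site is an assumption, not something the post confirms. A minimal sketch of forcing cookie-only sessions follows; configure_cookie_only_sessions() is a hypothetical name, while the ini settings are standard PHP session options.

```php
<?php
// Hypothetical helper: keep session IDs out of URLs.
// Must be called before session_start() and before any output.
function configure_cookie_only_sessions(): void
{
    // Do not rewrite links to carry the session ID.
    ini_set('session.use_trans_sid', '0');
    // Carry the session ID in a cookie, and only in a cookie.
    ini_set('session.use_cookies', '1');
    ini_set('session.use_only_cookies', '1');
}

// Usage at the very top of a page:
// configure_cookie_only_sessions();
// session_start();
```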
If anyone has any suggestions here, I would greatly appreciate them. I would like to keep my .php pages with the .html extension (for purely cosmetic reasons)... but it looks like this is a major problem for the search engines.
Also... does anyone know anything about header("Cache-Control: private")?
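For reference, "Cache-Control: private" is a standard HTTP response directive meaning the response is intended for a single user and must not be stored by shared caches such as proxies. Whether it has any effect on indexing is not something the HTTP specification covers. To see what a page is really sending, the response headers can be inspected directly. A minimal sketch is below: find_header() is a hypothetical helper, and the usage part assumes network access and allow_url_fopen being enabled.

```php
<?php
// Hypothetical helper: return the value of the first header whose name
// matches $name (case-insensitive), or null if it is absent.
// $headerLines is a list of raw "Name: value" lines, which is the
// format get_headers() returns by default.
function find_header(array $headerLines, string $name): ?string
{
    foreach ($headerLines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2 && strcasecmp(trim($parts[0]), $name) === 0) {
            return trim($parts[1]);
        }
    }
    return null;
}

// Usage (needs network access):
// $lines = get_headers('http://www.localautomation.com/motion-control.html');
// if ($lines !== false) {
//     var_dump(find_header($lines, 'Cache-Control'));
// }
```

Comparing the result for one of the 5 surviving .php pages against one of the dropped .html pages would show whether their headers really differ.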
Thanks.