Quote:
Originally Posted by alemcherry
But google obviously wont index any crap.
|
I think that's what would be called a limit. The fact is, it costs Google money to keep data in their index, and they want ROI on it. Even if the cost of a single page is small, they cover billions of pages, and the cost adds up quickly. So Google uses an algorithm to decide which pages to index and which to leave out.
For example, duplicate content tends to fall out of the index. Put another way, the index is limited to unique content. We can't know the whole algorithm, but we can assume it works along these lines while taking a great deal more into account. We know that as (real, not toolbar) PageRank increases, Google will crawl deeper into your site, but many people live with a depth limit of about 3 clicks.
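To make those two ideas concrete, here is a toy sketch in Python. It is not Google's algorithm, which nobody outside Google knows. It only models the two rules of thumb above: pages whose content duplicates something already seen are skipped, and pages more than 3 clicks from the home page are never reached. The function name, the `links` and `content` dictionaries, and the `max_depth` parameter are all made up for illustration.

```python
from collections import deque
import hashlib


def pages_within_depth(links, content, start, max_depth=3):
    """Toy model only: walk the site breadth-first from `start`, keeping
    pages with unique content that sit within `max_depth` clicks."""
    seen_urls = {start}
    seen_hashes = set()
    kept = []
    queue = deque([(start, 0)])  # (url, clicks from the start page)

    while queue:
        url, depth = queue.popleft()

        # Hypothetical duplicate check: identical text gives an identical hash.
        digest = hashlib.sha256(content.get(url, "").encode("utf-8")).hexdigest()
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            kept.append(url)

        # Stop following links once the click-depth limit is reached.
        if depth == max_depth:
            continue
        for nxt in links.get(url, []):
            if nxt not in seen_urls:
                seen_urls.add(nxt)
                queue.append((nxt, depth + 1))

    return kept


# Made-up site: "/b" duplicates "/a", and "/e" is 4 clicks from home.
links = {"/": ["/a", "/b"], "/a": ["/c"], "/c": ["/d"], "/d": ["/e"]}
content = {"/": "home", "/a": "post a", "/b": "post a",
           "/c": "post c", "/d": "post d", "/e": "post e"}
print(pages_within_depth(links, content, "/"))
```

With this made-up site, "/b" drops out as a duplicate of "/a" and "/e" is never reached because it sits 4 clicks deep. Raising `max_depth` stands in for the deeper crawling that comes with more PageRank.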
I can write software to generate random words and fill billions of pages on my WordPress blog, until I use up the GB they give me. Google won't index it all. I don't know what "my" limit is, but I know there's some number of pages that is beyond it. I also know that as I write and gain links to each of my posts, that limit goes up.