Eons ago, Matt Cutts wrote a post called "Tell me about your backlinks."
Quote:
So: I’m sitting in a room with a bunch of webmasters who want to throw out urls for me to analyze. I have full access to all my spam detective and debugging tools that I know and love. I’ve got my pimped out Firefox ready to go, and no one can see my screen. It’s like some wonderful, wonderful daydream has come true. 
The main point I want to get across is that in 1-2 minutes, it was easy to tell whether a site was (over)doing reciprocal links or trying to buy links. One site said: “we used to be doing okay last year, but for some reason we’re just not doing as well this year.” And I was able to tell them why: they had no spam penalties, but Google is getting better at handling paid links, and the paid links that might have helped them last year just weren’t doing them any good now.
|
Paid link detection probably depends on a database they keep of link sellers. They might have a bot that crawls link brokerages, looking for URIs that point out to different hosts.
To find sites that are "(over)doing" reciprocal links, the only realistic way I can imagine is to take a URI for a page, crawl it, find all the internal links, crawl those, and so on until the entire site has been crawled, then do the same for every site its pages link out to. They probably have a database telling them that links to, say, the BBC don't need to be checked, so it's not really all external links. Still, because of link pages, and because most links aren't site-wide, it seems like you really would have to crawl all of the site in question and all of the sites it points to. I guess you could stop the second part early once you find a reciprocal link...
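To make that concrete, here's a rough Python sketch of the procedure I'm imagining. It is only my guess at the approach, not anything Google has described. Everything in it is made up for illustration: `FAKE_WEB` is a tiny in-memory stand-in for real pages (a real crawler would fetch and parse HTML), `TRUSTED_HOSTS` is the assumed "don't bother checking the BBC" list, and I'm assuming each linked site can be crawled starting from its root page.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical stand-in for the web: page URL -> links found on that page.
FAKE_WEB = {
    "http://a.example/": ["http://a.example/links", "http://www.bbc.co.uk/"],
    "http://a.example/links": ["http://b.example/", "http://c.example/"],
    "http://b.example/": ["http://b.example/partners"],
    "http://b.example/partners": ["http://a.example/"],
    "http://c.example/": ["http://c.example/about"],
    "http://c.example/about": [],
}

# Assumed list of hosts that never need a reciprocal check.
TRUSTED_HOSTS = frozenset({"www.bbc.co.uk"})


def fake_get_links(url):
    """Return the links on a page (here looked up, not fetched)."""
    return FAKE_WEB.get(url, [])


def host(url):
    return urlparse(url).netloc


def crawl_site(start_url, get_links, stop_at_host=None):
    """Breadth-first crawl of one host, following internal links only.

    Returns the set of external hosts the site links to. If stop_at_host
    is given, the crawl ends as soon as a link to that host is found.
    """
    site = host(start_url)
    seen = {start_url}
    queue = deque([start_url])
    external = set()
    while queue:
        page = queue.popleft()
        for link in get_links(page):
            h = host(link)
            if h == site:
                if link not in seen:
                    seen.add(link)
                    queue.append(link)
            else:
                external.add(h)
                if h == stop_at_host:
                    return external  # early exit: reciprocal link found
    return external


def reciprocal_partners(start_url, get_links, trusted=TRUSTED_HOSTS):
    """Hosts that start_url's site links to and that link back to it."""
    site = host(start_url)
    partners = set()
    for target in crawl_site(start_url, get_links) - trusted:
        # Assumption: the target site is reachable from its root page.
        back = crawl_site(f"http://{target}/", get_links, stop_at_host=site)
        if site in back:
            partners.add(target)
    return partners
```

Calling `reciprocal_partners("http://a.example/", fake_get_links)` on the toy data crawls `a.example`, skips the BBC, and then crawls `b.example` and `c.example` in turn. Note that `c.example` has to be crawled completely before you can say it doesn't link back, which is exactly why the cost worries me.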
One to two minutes doesn't sound like enough time to accomplish that. Maybe it is with enough existing data, especially if every page of each site in question has already been indexed. Or ... might there be a more efficient way to determine this?
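For what it's worth, if the "already indexed" guess holds, the check might not need any crawling at query time. Here's a minimal Python sketch under that assumption: `LINK_INDEX` is a hypothetical precomputed map from each host to the hosts it links out to, and `reciprocal_ratio` is a name I made up. I have no idea whether any search engine stores or queries its data this way.

```python
# Hypothetical pre-built index: host -> set of hosts it links out to.
LINK_INDEX = {
    "a.example": {"b.example", "c.example", "www.bbc.co.uk"},
    "b.example": {"a.example"},
    "c.example": {"d.example"},
    "d.example": {"a.example"},
}


def reciprocal_ratio(site, index):
    """Return (hosts that trade links with site, share of outbound hosts that do)."""
    outbound = index.get(site, set())
    # Hosts that link to `site`. A real index would likely store this
    # inverted map directly instead of scanning every entry.
    inbound = {src for src, targets in index.items() if site in targets}
    reciprocal = outbound & inbound
    ratio = len(reciprocal) / len(outbound) if outbound else 0.0
    return reciprocal, ratio
```

With the link data already on hand, the reciprocal check is just a set intersection, so a one to two minute answer looks a lot more plausible than it does with a fresh crawl.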