Tycoon Talk

Blogging Forum


Daily Archiving+Scrapers
Old 12-09-2007, 05:25 PM Daily Archiving+Scrapers
MrBrownThumb's Avatar
Extreme Talker

Posts: 194
Location: Chicago, IL
Trades: 0
On Blogger we have the option of archiving our blogs daily, weekly, or monthly. Recently my little corner of the blogosphere has been dealing with scrapers, and the scraping has caused many of us to shorten our feeds.

I've noticed that when you've been scraped, the scraper will sometimes come up ahead of the original post in search results. I'm wondering whether daily, weekly, or monthly archiving has any benefits or drawbacks here.

I'm working under the assumption that if you archive weekly or monthly, then by the time the spiders come to you the scraper site may already have been indexed, and you'll be seen as the duplicate. Would daily archiving improve the chances that you're seen as the original source?

If I'm wrong feel free to set me straight.
Old 12-09-2007, 05:59 PM Re: Daily Archiving+Scrapers
ForrestCroce's Avatar
Half Man, Half Amazing

Posts: 3,023
Name: Forrest Croce
Location: Seattle, WA
Trades: 0
I was getting a lot of scrapers for, well, the longest **** time. I'm not sure whether they ultimately decided to leave me alone, or just stopped coming up in my wp dashboard...?

Do you have an example of one? If you put spaces between the protocol (http://), the domain, and the TLD, the forum software won't turn the URLs into links, so you won't be helping your enemy even a little. I'm curious to get a look at them. Most of the scrapers that crawled me would post about 3,500 times a month, and the posts they grabbed from me were never indexed. Half of them seemed to go down within a week ... probably somebody complained to the ISP. A blog I occasionally write guest posts for has a much more persistent copier.

The first thing to do is put some links back to other articles on your site, with fully qualified urls. That would be http://www.amaryllisbulbs.org/2007/1...from-seed.html vs /2007/12/amaryllis-bulb-from-seed.html. Some sploggers will disable the links, others won't. If this one is coming up ahead of you with your own content, some links back aren't exactly a poison pill, but you'll get a little bit of credit, and, possibly, the 'bots will understand when one article links back to another identical article...?

You could also try banning the IP addresses outright. I tend to get nervous about that, but...
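For anyone on a self-hosted setup who wants to try the IP ban, here is a rough sketch of the check itself, assuming your server or blog software lets you run some Python on each request. The addresses below are made up (they come from ranges reserved for documentation), and `is_blocked` is just a name picked for this example, not part of any blogging platform.

```python
import ipaddress

# Hypothetical blocklist. These ranges are reserved for documentation,
# so they stand in for whatever scraper addresses show up in your logs.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),   # a whole range
    ipaddress.ip_network("198.51.100.7/32"),  # a single address
]

def is_blocked(remote_addr: str) -> bool:
    """Return True if the visitor's IP falls inside any blocked network."""
    addr = ipaddress.ip_address(remote_addr)
    return any(addr in network for network in BLOCKED_NETWORKS)

if __name__ == "__main__":
    for ip in ("203.0.113.45", "198.51.100.7", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

Banning a whole range can also lock out ordinary readers who happen to share it, so it is probably safer to keep the list short and stick to single addresses where you can.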
Old 12-09-2007, 07:57 PM Re: Daily Archiving+Scrapers
MrBrownThumb's Avatar
Extreme Talker

Posts: 194
Location: Chicago, IL
Trades: 0
Quote:
Originally Posted by ForrestCroce View Post

Do you have an example of one? If you put spaces between the protocol (http://), the domain, and the TLD, the forum software won't turn the URLs into links, so you won't be helping your enemy even a little. I'm curious to get a look at them.
I don't have one now. After I complained to the hosting company, wrote about them on my blog, and contacted their parent company, either the host took the copies down or the scraper site removed them itself.

One day I noticed my traffic had dropped by 50%, so I got curious, checked my URL on Copyscape, and saw that a site called ppnow.net was republishing my feed. I did some searches and their scrapes were coming up ahead of my entries. It took about a week of *****ing to them, e-mailing the PR department of the company that owns them, and then writing about the site before they finally removed my posts. Since then my traffic has been rising back toward what it was before they picked up my feed, but it is still lower than it was. This is for the "urban garden" blog in my sig.

The funny thing about ppnow.net is that it looks like a social bookmarking site but all the avatars look like they were taken with the same camera and none of the "members" there have info in their profiles.



Quote:
Originally Posted by ForrestCroce View Post
The first thing to do is put some links back to other articles on your site, with fully qualified urls. That would be http://www.amaryllisbulbs.org/2007/1...from-seed.html vs /2007/12/amaryllis-bulb-from-seed.html. Some sploggers will disable the links, others won't. If this one is coming up ahead of you with your own content, some links back aren't exactly a poison pill, but you'll get a little bit of credit, and, possibly, the 'bots will understand when one article links back to another identical article...?
That blog is new and I'm not having any problems with it yet. It's just Blogger with a custom domain. I'm rewriting some popular content from my main blog and giving it a home of its own there. But I'm not sure what you mean by "fully qualified urls". Do you mean posting them in the entry without anchor text? I usually link to related posts, but I use anchor text instead of the bare http----------------------html link.


Quote:
Originally Posted by ForrestCroce View Post
You could also try banning the IP addresses outright. I tend to get nervous about that, but...
These are all on blogger so I can't ban any IP. I wish they'd get that feature up on feedburner though because I've noticed a couple of the scrapers are on my "uncommon uses" in FB once again. To avoid the hassle of tracking them down I've just stopped publishing the full feed.
Old 12-10-2007, 12:09 AM Re: Daily Archiving+Scrapers
ForrestCroce's Avatar
Half Man, Half Amazing

Posts: 3,023
Name: Forrest Croce
Location: Seattle, WA
Trades: 0
I don't publish full feeds, either. A few people have 'complained' to me, saying they'd happily subscribe to my feed, but for the partial publishes. I'm really not sure what the answer is there, being pretty new to blogging, and not using a feed reader myself.

By fully qualified urls, I mean in the href attribute, going from http to .html, instead of using a relative path, like ../archives/whatever. I used to only use relative paths for simplicity and portability, but no more. I'm not sure if that's an issue with blogs, though, but if your links all have your domain in them, the worst case is you'll get a link out of it.
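To make the difference concrete, here is a small sketch of turning relative links into fully qualified ones before publishing, assuming you can run a script over your post HTML. The domain, the post URL, and the `absolutize_links` helper are all invented for this example, and the regex only handles simple double-quoted href attributes.

```python
import re
from urllib.parse import urljoin

# Hypothetical address of the post being published (assumption for this sketch).
POST_URL = "http://www.example.com/2007/12/some-post.html"

# Matches simple double-quoted href attributes only.
HREF_RE = re.compile(r'href="([^"]*)"')

def absolutize_links(html, base_url):
    """Rewrite each href so it carries the full domain, resolved against base_url."""
    def replace(match):
        return 'href="{}"'.format(urljoin(base_url, match.group(1)))
    return HREF_RE.sub(replace, html)

if __name__ == "__main__":
    relative = '<a href="../archives/whatever.html">related post</a>'
    print(absolutize_links(relative, POST_URL))
```

Links that are already absolute pass through unchanged, so running it twice does no harm. If a scraper republishes the post with its links intact, every one of them still points back at your domain.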

Complaining to the company, their ISP, and anyone else who could be involved by any stretch of the imagination is really the best thing you can do. If more people did that, sploggers would start talking about getting threatened with lawsuits, and a few of them might be deterred. So good on you for that.

It might pay to do some do-it-yourself seo for your blog.

Finally, if you're hosting or at least using your own domain name, would it make sense to use wordpress? There's a very nice import feature from what I've heard, and you can set it up to keep the same url structure so as not to hurt yourself in the serps...?
Old 12-10-2007, 01:32 AM Re: Daily Archiving+Scrapers
MrBrownThumb's Avatar
Extreme Talker

Posts: 194
Location: Chicago, IL
Trades: 0
I got my first complaint from a subscriber today, but I don't think I'm going back to a full or even partial feed anytime soon. At least not until FeedBurner/Google wake up and give us the option of blocking certain readers.


"Finally, if you're hosting or at least using your own domain name, would it make sense to use wordpress? There's a very nice import feature from what I've heard, and you can set it up to keep the same url structure so as not to hurt yourself in the serps...?"

I thought about that, and I signed up for a WordPress account, but compared to Blogger it just doesn't make sense to switch right now. Here are the reasons I decided to run that blog on Blogger with a custom domain.

1. The domain cost me 10 bucks through Google Apps.
2. Blogger gives me 1024 MB of storage.
3. No webhosting fees or worrying about uptime.
4. I can edit the CSS and monetize it for free.


On WordPress
1. I'd have to pay for the ability to edit the CSS
2. You get 50 MB of storage; more storage costs money
3. Can't monetize

Self hosted
1. Web hosting costs
2. Worrying about uptime
3. All the things you webmasters have to deal with


I also registered my urban gardening blog as a dot com but with 346 posts I'm barely using 2% of my storage on blogger. I figure by the time I get to the end of my storage ability I can then move over to the self hosted option and deal with all those headaches.