Old 07-31-2008, 02:23 AM Website Screenshots
Galaxian (Rich Powell)
I'm considering a feature that takes a screenshot of a website as it appears at a given address. What sort of PHP code would this require? Would it need something installed on the server, or can PHP do it on its own?

Or perhaps there's a site that serves website screenshots that my project could use?
Last edited by Galaxian; 07-31-2008 at 02:27 AM..
 
 
Old 07-31-2008, 02:51 AM Re: Website Screenshots
tripy (Thierry)
No, PHP cannot do it by itself.

But (shameless plug...) I have developed http://www.web-screenshots.com myself, and it could surely be of use to you.
The free version of the service can be accessed via http://www.web-screenshots.com/free/any_url, for example http://web-screenshots.com/free/www.webmaster-talk.com

A simple URL is all you need to have a screenshot generated and sent back to you.
Paying users have access to more functions, such as cropping, output in JPG or PNG, and a choice of screenshot size from 40x30 px to 1280x1024 px.

You are welcome to give it a try.
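
As a rough illustration of the URL pattern described above, here is a minimal Python sketch that builds a free-tier screenshot URL from a target address. The function name `free_screenshot_url` is made up for this example, and whether the service accepts paths or query strings after the host is an assumption; only the `/free/` prefix followed by the site address comes from the post.

```python
from urllib.parse import urlsplit

# Endpoint taken from the post; everything else here is illustrative.
FREE_ENDPOINT = "http://www.web-screenshots.com/free/"

def free_screenshot_url(target):
    """Build the free-tier screenshot URL for a target site (hypothetical helper)."""
    # Accept both "www.example.com" and "http://www.example.com/page".
    parts = urlsplit(target if "://" in target else "//" + target)
    # Assumption: the scheme is dropped and any query string is ignored.
    return FREE_ENDPOINT + parts.netloc + parts.path.rstrip("/")

print(free_screenshot_url("http://www.webmaster-talk.com"))
```

The resulting string is what you would request from your own server code, rather than pasting it straight into an image tag.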
 
Old 07-31-2008, 06:14 AM Re: Website Screenshots
Galaxian (Rich Powell)
Thanks. That looks quite handy. The only feasible option for me is your free service, as the paid one is limited to 500 screenshots.

Now, how does the server know when to update a screenshot? Does it do that at all?
 
Old 07-31-2008, 09:27 AM Re: Website Screenshots
tripy (Thierry)
The paid one has many advantages, and 500 screenshots is still a lot if you implement it at the server level.
I set that limit to discourage lazy developers who simply put in an <img src="xxx"> referring to my service. I had to give an incentive to optimise the process.
Besides, the more you use cached images, the more screenshots you are allowed to take, even beyond the 500 limit (up to 750): http://www.web-screenshots.com/prices/#bonus

The free service has a 2-month time to live in the cache, unless I kill the entry (only I have this ability for the moment) or I have to reboot the server (the last time was 8 months ago, when it was installed).

Paid screenshots can be tweaked to specify how long the cache stays valid, and can bypass the cache on a per-request basis.
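
To illustrate what "implementing it at the server level" could look like on the consumer side, here is a minimal Python sketch that keeps a local copy of each screenshot and only calls the remote service when the copy is missing or too old. The names `cached_screenshot` and `fetch`, the file layout, and the use of SHA-1 for file names are all assumptions for this example; the 2-month lifetime mirrors the free tier's cache lifetime mentioned above.

```python
import hashlib
import os
import time

TWO_MONTHS = 60 * 24 * 3600  # about two months, in seconds

def cached_screenshot(url, cache_dir, fetch, ttl=TWO_MONTHS, now=None):
    """Return screenshot bytes for url, calling fetch(url) only when needed."""
    now = time.time() if now is None else now
    os.makedirs(cache_dir, exist_ok=True)
    # One file per target URL; the hash only serves as a safe file name.
    path = os.path.join(cache_dir, hashlib.sha1(url.encode("utf-8")).hexdigest())
    if os.path.exists(path) and now - os.path.getmtime(path) < ttl:
        with open(path, "rb") as f:
            return f.read()          # fresh local copy: no remote request
    data = fetch(url)                # missing or stale: ask the service once
    with open(path, "wb") as f:
        f.write(data)
    return data
```

Here `fetch` is whatever function actually downloads the image, so each site costs one remote screenshot per cache lifetime instead of one per page view.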
 
Old 07-31-2008, 05:45 PM Re: Website Screenshots
nasaboy007
How exactly does the script work though? Is the screenshot manually taken? I'd be interested in the backend just for the sake of knowledge - I've never actually met anyone who'd developed something like this and I always had that question lol.
 
Old 07-31-2008, 06:36 PM Re: Website Screenshots
VirtuosiMedia
Quote (originally posted by nasaboy007):
How exactly does the script work though? Is the screenshot manually taken? I'd be interested in the backend just for the sake of knowledge - I've never actually met anyone who'd developed something like this and I always had that question lol.
I sincerely doubt that it's manually taken, but I wouldn't mind hearing how tripy made it either, if he's willing to share, though I understand if he's not. At a very high level, my guess is that it is something like the following process:

1) A URL is provided.
2) The script loads the URL through one of the browser engines, such as WebKit, Trident, or Gecko.
3) A screenshot is taken, resized to a thumbnail, and added to the database in association with the URL.
 
Old 07-31-2008, 07:23 PM Re: Website Screenshots
tripy (Thierry)
Quote:
1) A URL is provided.
2) The script loads the URL through one of the browser engines, such as WebKit, Trident, or Gecko.
3) A screenshot is taken, resized to a thumbnail, and added to the database in association with the URL.
I can give rough directions, as the implementation itself is way more complicated than just assembling ready-made programs.

1) The backend. As VirtuosiMedia guessed, I use a Python binding of Firefox's rendering engine, Gecko.
This allows me to render a web page from a Python script.
2) The rendering. This is done through a virtual display, which is randomly selected at call time from a pool of 30 of them.
3) The screenshotting. This is handled in 2 steps. First, a screenshot is taken using the glibc library (it's a Linux system library that handles graphical operations).
Once the screenshot is taken, I use the PIL Python imaging module to enhance its quality and sharpness, crop it (if asked), and resize it to the final size.
This screenshot is then sent to an in-memory cache, shared by the Python process and the web site.
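
As a small illustration of step 2, the following Python sketch picks one virtual display at random from a pool of 30 and prepares the environment a rendering process would run in. The display numbers (`:100` to `:129`), the helper names, and the idea that the displays are X servers addressed through the `DISPLAY` variable (for example started with Xvfb) are assumptions; the post only states that there are 30 virtual displays and that one is chosen at random per call.

```python
import os
import random

# Assumption: 30 virtual X displays, numbered :100 to :129, are already running.
DISPLAY_POOL = [":%d" % n for n in range(100, 130)]

def pick_display(pool=DISPLAY_POOL, rng=random):
    """Choose one virtual display at random for this rendering call."""
    return rng.choice(pool)

def render_environment(display):
    """Copy the current environment and point DISPLAY at the chosen display."""
    env = dict(os.environ)
    env["DISPLAY"] = display
    return env

display = pick_display()
env = render_environment(display)  # would be passed to the rendering process
```

Random selection is a simple way to spread concurrent requests across displays without any shared bookkeeping, at the cost of occasional collisions on the same display.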

A typical request would be:
° When a request hits the server, it's parsed, and the paid options are activated as requested, depending on the subscription level of the user.
° Then, a reverse DNS call is done to verify that the domain is valid.
° A cache check is done on 2 levels: first on the live cache and, if nothing is found there, on the DB cache.
The reason for the 2 levels of cache is that the live cache has a decay system in place. After a certain time with no access, an object is removed from the cache, to let other pictures fill it.
If we find a picture in the DB cache that is not present in the live cache, it's re-inserted into the live cache.
To determine whether a cached version of a screenshot exists, I compute a checksum over several properties of that screenshot, and use it as the key to identify it.
This is needed to allow different paying users to have different caching delays.
° If a picture is found in the cache, it's served right away (and a cache hit is recorded for the user). If not, the Python script is called, and when its semaphore is cleared, the PHP script hands back the screenshot from the cache.
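
The two-level lookup described above can be sketched roughly as follows. This is not the actual implementation: the class and function names are invented, plain dictionaries stand in for the in-memory cache and the database, MD5 is just one possible checksum, and the properties fed into the key (size, format, cache delay) are guesses at what "several properties" might include.

```python
import hashlib
import time

def cache_key(url, width, height, fmt, ttl):
    """Checksum over the properties that make two screenshots interchangeable."""
    raw = "|".join([url, str(width), str(height), fmt, str(ttl)])
    return hashlib.md5(raw.encode("utf-8")).hexdigest()

class TwoLevelCache:
    def __init__(self, idle_limit, db=None):
        self.idle_limit = idle_limit        # seconds without access before eviction
        self.live = {}                      # key -> (image bytes, last access time)
        self.db = {} if db is None else db  # stand-in for the DB cache

    def _evict_idle(self, now):
        # Decay: drop live entries nobody has asked for recently.
        idle = [k for k, (_, seen) in self.live.items() if now - seen > self.idle_limit]
        for key in idle:
            del self.live[key]

    def get(self, key, now=None):
        now = time.time() if now is None else now
        self._evict_idle(now)
        if key in self.live:
            image, _ = self.live[key]
        elif key in self.db:
            image = self.db[key]            # only in the DB: promote it again
        else:
            return None                     # miss: the renderer must be called
        self.live[key] = (image, now)       # refresh the last access time
        return image

    def put(self, key, image, now=None):
        now = time.time() if now is None else now
        self.live[key] = (image, now)
        self.db[key] = image
```

Including the cache delay in the key means two users asking for the same page with different delays get separate entries, which matches the stated reason for the checksum.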

There is a lot more going on in the details, because some Flash or Java content can crash the rendering engine, and in that case the Python process might get stuck running forever.
That's why there is a scheduler that checks that no screenshot process takes more than 1 minute to run. If one does, it's killed, period.
I don't want to tie up system resources on 1 request that is bound to fail.
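
The 1-minute rule can be approximated in a few lines of Python. Note that this sketch swaps in a different technique from the one described: instead of a separate scheduler watching running processes, it puts a timeout on each call using the standard `subprocess` module. The function name `run_with_deadline` and the stand-in command are made up for the example.

```python
import subprocess
import sys

def run_with_deadline(cmd, limit_seconds=60):
    """Run cmd; return its exit code, or None if it was killed for running too long."""
    try:
        return subprocess.run(cmd, timeout=limit_seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this exception.
        return None

# Stand-in for a renderer stuck on a bad page, with a short limit for the demo.
stuck = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_deadline(stuck, limit_seconds=0.5))
```

A `None` result tells the caller that the request failed and that nothing should be written to the cache.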

I first started this as a "I've heard you cannot do this in PHP, but I'd like to anyway. Can you do it?" challenge.
At first, it had around 80% successful hits, which was already a lot compared to what my customer had at the time (around 50%, with a Windows service that gave crappy, blocky pictures).

Today, I've raised the success rate to approx. 98%...
http://www.webalis.com/2008/04/what-...y-application/
Quote:
As the screenshots are fetched live, I was thinking that I could end up with a lot of "bad" screenshots as the requests piled up, but the number seems to stay relatively low.
My last test ran for 2502 requests (1002 not cached, and 1500 that were cached).
I had a 0.07% error rate on the cached MySpace page, and 1.8% on the live Google web page, which makes a global error rate of 0.76% for 2500 requests.

Considering that most of the requests should be for cached screenshots, I'm pretty satisfied with those results. The errors were all screenshots with a 0-byte size, which were discarded. As those are automatically rejected by the cache system, it looks fine to me.

I was not expecting so much out of the box!
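
The figures quoted above can be checked with a little arithmetic. The short Python sketch below recovers the error counts by rounding the quoted percentages (an assumption, since the raw counts are not given) and recomputes the overall rate from them.

```python
cached_requests, live_requests = 1500, 1002
cached_error_pct, live_error_pct = 0.07, 1.8

# Assumption: error counts are recovered by rounding the quoted percentages.
cached_errors = round(cached_requests * cached_error_pct / 100)
live_errors = round(live_requests * live_error_pct / 100)

total_requests = cached_requests + live_requests
global_error_pct = 100 * (cached_errors + live_errors) / total_requests

print(cached_errors, live_errors, round(global_error_pct, 2))  # 1 18 0.76
```

So the quoted 0.76% corresponds to roughly 19 failed screenshots in the whole run, almost all of them on the uncached requests.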
In the end, it's around 2 years of development, but not continuous.
I spend some time on it, then nothing for 2 months, then some weeks again...
I'm pretty happy with its results now, but I still haven't really spread the word about it.
Developing this is fun. Promoting it, on the other hand, doesn't appeal to me much...