Tycoon Talk

PHP Forum


How do I program a bot?
Old 04-28-2008, 06:17 PM How do I program a bot?
Webmaster Talker

Posts: 626
Trades: 0
I need to program a bot. I need it to log in to an account I have on a password-protected site. From there, I need to scrape the HTML that is generated on the screen (it is from an ASP page). I will then parse the scraped HTML and pull out the information I need, which I will then save in my database for later use.

Can someone please help me with this?

For the record, this is all legal, as this is an account that I'm allowed to be in; I just want to automate the process, as it will be very time-consuming to do manually.

Any help is appreciated.
jim.thornton is offline
 
 
Old 04-28-2008, 10:43 PM Re: How do I program a bot?
addonchat's Avatar
Super Talker

Posts: 115
Name: Chris Duerr
Trades: 0
file_get_contents() will allow you to easily fetch a web page.
http://us2.php.net/manual/en/functio...t-contents.php

As for logging in: if it is a form, the easiest approach is to check whether it accepts GET requests. If it does, just call file_get_contents() with the URL-encoded username/password parameters appended to the URL, e.g. "/login.php?un=bob&password=12345"
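A minimal sketch of the GET approach described above. The host, the path, and the parameter names un and password are illustrative assumptions borrowed from the example URL; the real names come from the login form's source. Fetching a URL with file_get_contents() also assumes allow_url_fopen is enabled.

```php
<?php
// Build a login URL with URL-encoded query parameters.
function build_login_url(string $loginUrl, array $params): string
{
    // http_build_query() URL-encodes each name and value.
    return $loginUrl . '?' . http_build_query($params, '', '&');
}

// Fetch a page over HTTP; returns the body as a string, or false on failure.
function fetch_page(string $url)
{
    return file_get_contents($url);
}

// Usage (hypothetical host and parameter names):
// $html = fetch_page(build_login_url(
//     'http://example.com/login.php',
//     ['un' => 'bob', 'password' => '12345']
// ));
```

Keeping the URL building separate from the fetch makes it easy to print the URL and try it in a browser first.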

I'm not sure if file_get_contents() supports the standard "http://USERNAME:PASSWORD...." protocol, but it's worth a try.

Enjoy your parsing
__________________
Chris Duerr
AddonChat Java Chat Software

 
Old 04-29-2008, 04:16 AM Re: How do I program a bot?
solomongaby's Avatar
Webmaster Talker

Posts: 522
Name: Gabe Solomon
Location: Romania
Trades: 1
You can try the Snoopy class; you can find it on SourceForge. It can perform a login even with POST variables.
__________________
If you like my posts ... TK is appreciated:)

 
Old 04-29-2008, 10:19 AM Re: How do I program a bot?
addonchat's Avatar
Super Talker

Posts: 115
Name: Chris Duerr
Trades: 0
For POST events, if you prefer to do it yourself, check out:

http://us2.php.net/manual/en/functio...ost-fields.php

and look over:

http://us2.php.net/http
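One of the links above is truncated, so it is unclear exactly which function it pointed to. As an alternative that needs no extension, here is a sketch of a form POST using PHP's built-in HTTP stream context with file_get_contents(). This is a swapped-in technique, not necessarily the one referenced above, and the field names username and password are hypothetical.

```php
<?php
// Build a stream context that makes file_get_contents() send a form POST.
function build_post_context(array $fields)
{
    return stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            // URL-encode the form fields into the request body.
            'content' => http_build_query($fields, '', '&'),
        ],
    ]);
}

// Submit the fields to $url; returns the response body, or false on failure.
function post_form(string $url, array $fields)
{
    return file_get_contents($url, false, build_post_context($fields));
}

// Usage (hypothetical URL and field names):
// $html = post_form('http://example.com/login.asp',
//     ['username' => 'bob', 'password' => '12345']);
```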
 
Old 04-30-2008, 10:55 AM Re: How do I program a bot?
Webmaster Talker

Posts: 626
Trades: 0
How do hackers and the like make bots that log in to sites, then? Is there no way to automate the process, or would I have to program something in C that is basically a browser, parses the page, and then inputs the info?

There has got to be a way to do this.

From what you guys are saying, it sounds like I can get into the site, but then how do I set it up so that it will click on the links I need to get to the page to parse?
jim.thornton is offline
 
Old 04-30-2008, 11:09 AM Re: How do I program a bot?
tripy's Avatar
Do not try this at home!

Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Trades: 0
You need to parse every response you get after each request.
Or, if you already know the link to follow, you can go there blindly and discard what the server sends back.
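The "parse every response" option can be sketched as follows: pull the link targets out of a fetched page so the bot can decide which URL to request next. This is only an illustration using PHP 5's DOMDocument; the function name extract_links and the sample page names are hypothetical.

```php
<?php
// Return the links found in an HTML page as [link text => href].
// If two links share the same text, the later one wins.
function extract_links(string $html): array
{
    $doc = new DOMDocument();
    // Collect libxml warnings internally instead of printing them,
    // since real-world pages are rarely perfectly formed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if ($href !== '') {
            $links[trim($anchor->textContent)] = $href;
        }
    }
    return $links;
}

// Usage: look up the link a person would click by its visible text.
// $links = extract_links($menuHtml);
// $ratesUrl = $links['Rates'] ?? null;
```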

As for the scraping, there are different approaches, some better than others, but I won't explain them.
This looks a bit too borderline for my taste.

What!?
You thought there was a "hacking construction kit" and you would just have to make 3 clicks?
Sorry, but no. It takes real programming to do so.
__________________
Only a biker knows why a dog sticks his head out the window.
 
Old 04-30-2008, 11:29 AM Re: How do I program a bot?
addonchat's Avatar
Super Talker

Posts: 115
Name: Chris Duerr
Trades: 0
zincoxide -- Programs don't click, people click. Short of parsing the HTML, I've already posted all of the information anyone would need. I suggest you start with the HTTP specification - http://www.w3.org/Protocols/rfc2616/rfc2616.html

Enjoy!
 
Old 04-30-2008, 12:48 PM Re: How do I program a bot?
Ultra Talker

Posts: 407
Trades: 1
Do you need this done once, regularly, or continuously? This seems pretty simple to do.
Lucas3677 is offline
 
Old 05-01-2008, 12:09 PM Re: How do I program a bot?
Webmaster Talker

Posts: 626
Trades: 0
I understand that people do the 'clicking'. I guess I'm using the wrong words.

Let me try explaining what I'm doing:

I am trying to log in to a site (I would like it done daily) that lists the companies that I deal with and the rates that they offer.

The problem is that the site uses sessions (I think) to maintain a logged in status, so if I copy the url for the rates page I get a logged out message and it asks me to login. So..

I need to find a way to log in using a bot. Once you log in it automatically goes to a main menu. From there, you can find a link in the nav menu that I click on to get the rates. Then a page comes up with the rates in a table. This is the page that I want to parse.

So... I need a way to save my session information so that I can subsequently request the appropriate page, and I don't know how to do that. Maybe it isn't a PHP thing; I might need a different language.
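If the site tracks the session with a cookie, one possible approach is sketched below: read the Set-Cookie lines from the login response and send them back on the next request. It assumes the raw header lines come from the $http_response_header array that PHP fills in after file_get_contents() on an HTTP URL; the function names and the cookie name in the usage note are hypothetical.

```php
<?php
// Collect cookies from raw response header lines as [name => value].
function extract_cookies(array $headerLines): array
{
    $cookies = [];
    foreach ($headerLines as $line) {
        // Match "Set-Cookie: name=value; attributes..." and keep name/value.
        if (preg_match('/^Set-Cookie:\s*([^=;\s]+)=([^;]*)/i', $line, $m)) {
            $cookies[$m[1]] = $m[2];
        }
    }
    return $cookies;
}

// Format the cookies as a request header line.
function cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return 'Cookie: ' . implode('; ', $pairs);
}

// Fetch a page while presenting the saved cookies; false on failure.
function fetch_with_cookies(string $url, array $cookies)
{
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'header' => cookie_header($cookies) . "\r\n"],
    ]);
    return file_get_contents($url, false, $context);
}
```

After the login request, something like extract_cookies($http_response_header) would give the session cookie, and fetch_with_cookies() could then request the rates page with it.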
jim.thornton is offline
 
Old 05-01-2008, 02:51 PM Re: How do I program a bot?
VirtuosiMedia's Avatar
Web Design Made Simple

Posts: 1,228
Trades: 0
This is probably a no, but does the site you're trying to get info from have an RSS feed that you could subscribe to instead of you having to login through a bot?
 
Old 05-01-2008, 08:02 PM Re: How do I program a bot?
addonchat's Avatar
Super Talker

Posts: 115
Name: Chris Duerr
Trades: 0
If the site is programmed correctly, there is no need for you to worry about cookies. The login sounds like it is a form: view the source for that form and find out which variable names are used for the login/password and the form's submit type (POST or GET), then use the references above to automate the login. After that, your program should be able to retrieve the link you're interested in directly, and you can begin parsing it.

The only item you'll likely have to contend with is reading redirect output of the login page to capture the session ID, which should be passed along without cookies being used.
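A sketch of capturing a session ID from the login redirect, assuming the site passes it as a query-string parameter in the Location header. The parameter name sid is hypothetical. The header lines would come from $http_response_header; setting 'follow_location' => 0 in the http stream context stops PHP from following the redirect itself, so the redirect response is the one you read.

```php
<?php
// Find the redirect target in raw response header lines, or null if none.
function redirect_location(array $headerLines): ?string
{
    foreach ($headerLines as $line) {
        if (preg_match('/^Location:\s*(\S+)/i', $line, $m)) {
            return $m[1];
        }
    }
    return null;
}

// Read one query-string parameter from a URL, or null if it is absent.
function query_param(string $url, string $name): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }
    parse_str($query, $params);
    return isset($params[$name]) && is_string($params[$name]) ? $params[$name] : null;
}

// Usage (hypothetical parameter name):
// $target = redirect_location($http_response_header);
// $sessionId = $target !== null ? query_param($target, 'sid') : null;
```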
 
Old 05-02-2008, 10:55 AM Re: How do I program a bot?
Webmaster Talker

Posts: 626
Trades: 0
Okay... I get it now. Thank you.

Unfortunately, they don't have an RSS feed. I've been trying for two years to get them to add one, but every time I ask they say "R-what?". It's very frustrating. This company claims to be ahead of the competition when it comes to technology, but they don't know what the heck they are doing.

Yesterday I actually found a page (well file), which I can wget without needing to login. It has all the information I need in it.

Now I just have to figure out how to parse HTML accurately.

Thanks for all your help!
jim.thornton is offline
 
Old 05-02-2008, 11:10 AM Re: How do I program a bot?
Plugin-Developer's Avatar
Weightlifting CS Student

Posts: 504
Name: Nick Ohrn
Trades: 0
For parsing HTML, you have a couple of options, zincoxide. The easiest one to use is plain XML document parsing. You can use PHP 5's DOMDocument class to parse a web page, as long as you know the page is going to be valid XHTML. If you have doubts about the validity of the code you're going to be retrieving, you'll have to go another route.
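A minimal sketch of the DOMDocument route for a rates table like the one described earlier in the thread. It uses loadHTML(), which tolerates markup that is not valid XHTML, together with DOMXPath. The function name table_rows and the sample column names are hypothetical.

```php
<?php
// Return every table row in the page as an array of trimmed cell texts.
function table_rows(string $html): array
{
    $doc = new DOMDocument();
    // Keep libxml warnings about imperfect markup out of the output.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $rows = [];
    foreach ($xpath->query('//table//tr') as $tr) {
        $cells = [];
        // Header and data cells that are direct children of this row.
        foreach ($xpath->query('./td|./th', $tr) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        if ($cells !== []) {
            $rows[] = $cells;
        }
    }
    return $rows;
}

// Usage: each row is then ready to be inserted into a database.
// foreach (table_rows($html) as $row) { /* save $row */ }
```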

For screen scraping, regular expressions are a really popular technique. This article is a great starting point for that. Good luck with your venture.
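A small sketch of the regular-expression technique, assuming the data sits in plain td cells. The pattern is deliberately simple and will break on nested tables or unusual markup, so treat it as a starting point only; the function name extract_cells is hypothetical.

```php
<?php
// Return the text of every <td> cell in the given HTML.
function extract_cells(string $html): array
{
    // Non-greedy match of everything between <td ...> and </td>;
    // "i" ignores case and "s" lets "." span line breaks.
    preg_match_all('/<td[^>]*>(.*?)<\/td>/is', $html, $matches);

    // Strip any inline tags left inside each cell and trim whitespace.
    return array_map(function ($cell) {
        return trim(strip_tags($cell));
    }, $matches[1]);
}

// Usage:
// $cells = extract_cells($html);
```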
 