help to cut up robots file and get contents..
11-03-2007, 08:15 PM
Posts: 6,521
Name: Dan
Location: Swindon
|
Okay, I've made a robots.txt maker to go in the admin of the CMS I'm building. At the moment it has two textareas: one to enter disallowed dirs and files (new line for each)
and one for banned bots.
I've basically got it working, and it produces something like this:
Code:
#DISALLOWED DIR
User-agent: *
Disallow: /admin/
Disallow: /includes/
Disallow: /modules/
Disallow: /docs/
Disallow: /dev/
Disallow: /zips/
Disallow: /themes/
#BANNED BOTS
user-agent: randombot
Disallow: /
user-agent: randombot2
Disallow: /
user-agent: randombot3
Disallow: /
However, I need to be able to undo it all so I can display the values back in the textareas.
So I need to be able to turn that into:
Bannedbots textarea:
randombot
randombot2
randombot3
Disallowed dir/files:
/admin/
/includes/
/modules/
/docs/
/dev/
/zips/
/themes/
How can I do this?
Thanks,
Dan
__________________
Discounted Web Hosting With XDnet! >> Get 25% of hosting~ Promo: Webmaster-talk <<
11-04-2007, 01:11 AM
Posts: 238
Location: United States
|
I'm really bad at explaining things, so I just wrote it.
PHP Code:
$directories = array(); // Array to contain disallowed directories
$bots = array(); // Array to contain disallowed bots

$robots = file_get_contents('robots.txt');
if ($robots === false) die('Error: file cannot be read.');
$lines = split("\n", $robots);

$dirFlag = false; // flag to look for directories
foreach ($lines as $line){ // Loop through each line
    if (stripos($line, 'User-agent: *') === 0){
        $dirFlag = true; // If we see User-agent: *, then we know we are looking for disallowed directories
    }elseif (stripos($line, 'User-agent:') === 0){
        $dirFlag = false; // If we see User-agent without the *, we want bots, not disallowed directories
        $bots[] = trim(substr($line, 11));
    }elseif (stripos($line, 'Disallow:') === 0 && $dirFlag){
        $directories[] = trim(substr($line, 9));
    }
    // Ignore all other lines
}
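A note on using the result: concatenating a PHP array straight into a string prints the word Array, so the collected entries need joining first. Here is a minimal sketch of that step, assuming a hypothetical helper named `linesForTextarea()` (not part of the code above) and sample data in place of a real robots.txt:

```php
<?php
// Hypothetical helper (not from the original post): joins parsed entries
// into one newline-separated string suitable for a <textarea>.
function linesForTextarea(array $entries)
{
    // htmlspecialchars() keeps characters such as < or & from breaking the form markup.
    return htmlspecialchars(implode("\n", $entries));
}

// Sample data standing in for what the parsing loop would collect.
$directories = array('/admin/', '/includes/');
$bots = array('randombot', 'randombot2');

$dirText = linesForTextarea($directories);
$botText = linesForTextarea($bots);

echo '<textarea name="disallowed_dir">' . $dirText . '</textarea>' . "\n";
echo '<textarea name="bannedbots">' . $botText . '</textarea>' . "\n";
```

The value is placed directly between the textarea tags with no padding, since any whitespace there would be submitted back as part of the field.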
__________________
The interlocking pieces of web development: usability, performance, accessibility, and standards.
11-04-2007, 09:15 AM
Posts: 6,521
Name: Dan
Location: Swindon
|
THANKS...
But what are the two variables I need to use then?! lol, I can't see which ones!
I'm probably being really stupid.
Could you just point them out and I'll top up your TP
11-04-2007, 10:08 AM
Posts: 6,521
Name: Dan
Location: Swindon
|
Okay, ignore that, I was being stupid.
But I'm getting an undefined function error for stripos because my stupid host STILL hasn't bloody got PHP5, so what's the PHP 4.x alternative?
Thanks,
Dan
11-04-2007, 10:20 AM
Posts: 6,521
Name: Dan
Location: Swindon
|
Oh wait, no, I just did a Google search and found someone wrote a little fix for it:
PHP Code:
if (!function_exists("stripos")) {
    function stripos($haystack, $needle, $offset=0) {
        return strpos(strtolower($haystack), strtolower($needle), $offset);
    }
}
From what I can tell, it basically creates the function if it doesn't already exist.
Okay, now I'm not getting the error.
But instead of showing what's in the file, it's just showing Array in both textareas AND it appears to be deleting the file contents!
It does seem to save it, but then deletes it on load.
Okay, I'll post what I have below. Please note the above function to fix the stripos() problem is in my functions file, and because the robots editor is included into another file, it gets included too... if that makes sense.
PHP Code:
<?php
//Open up the file
$fh = fopen(DOC_ROOT."/robots.txt", "w+");

if($_POST['submit']) {
    $disallowed_dir = explode("\n", $_POST['disallowed_dir']);
    $disallowed = "User-agent: * \n";
    foreach($disallowed_dir as $line) {
        $disallowed.= "Disallowed: $line \n";
    }
    $disallowed.= "\n\n";
    ######
    $bannedbots = explode("\n", $_POST['bannedbots']);
    foreach($bannedbots as $bot) {
        $bannedbots = "User-agent: $bot \n";
        $bannedbots.= "Disallowed: \ \n\n";
    }
    $robotstxt = " #Disallowed Dirs and Files \n\n $disallowed #Banned Bots \n $bannedbots";
    //Write to the file
    fwrite($fh, "$robotstxt");
} //End if
else {
    $directories = array(); // Array to contain disallowed directories
    $bots = array(); // Array to contain disallowed bots
    $robots = file_get_contents(DOC_ROOT.'/robots.txt');
    if ($robots === false) die('Error: file cannot be read.');
    $lines = split("\n", $robots);
    $dirFlag = false; // flag to look for directories
    foreach ($lines as $line){ // Loop through each line
        if (stripos($line, 'User-agent: *') === 0){
            $dirFlag = true; // If we see User-agent: *, then we know we are looking for disallowed directories
        }elseif (stripos($line, 'User-agent:') === 0){
            $dirFlag = false; // If we see User-agent without the *, we want bots, not disallowed directories
            $bots[] = trim(substr($line, 11));
        }elseif (stripos($line, 'Disallow:') === 0 && $dirFlag){
            $directories[] = trim(substr($line, 9));
        }
        // Ignore all other lines
    }
    echo '<form action="" method="post">
    Disallowed folders and files.
    <textarea name="disallowed_dir" cols="50" rows="10"> '.$directories.' </textarea>
    <br /><br />
    Banned Bots
    <textarea name="bannedbots" cols="50" rows="10"> '.$bots.' </textarea>
    <br /><br />
    <input type="submit" name="submit" value="Save" />
    </form>';
}
//Close the file up
fclose($fh);
?>
So what's WRONG?! ARGH
Last edited by dansgalaxy; 11-04-2007 at 10:26 AM..
11-04-2007, 10:47 AM
Posts: 6,521
Name: Dan
Location: Swindon
|
Another error: when it adds the banned bots, it seems to add only the last one on the list and ignores the rest. No clue why. ANYONE?
11-04-2007, 12:00 PM
Posts: 2,389
Name: <member type="brilliant" alt="foolish">James Lewitzke</member>
Location: / public_html / Universe / Virgo_Supercluster / Local_Group / Milky_Way / Orion_Arm / Solar_System / Earth / North_America / USA / Wisconsin
|
It’s tough to tell without looking at your server directory hierarchy. Are all the folders you listed at the root of the directory?
I’m not exactly sure what changes the coding made (I’m a coding noob), but remember that you can also allow robots into certain areas, for example:
Code:
user-agent: randombot1
disallow: /
allow: /textarea1
This would block the bot from all areas of the directory except textarea1
11-04-2007, 12:49 PM
Posts: 6,521
Name: Dan
Location: Swindon
|
I know, but there's no need. The bot blocker is basically so that if I get any bad bots they can be blocked from the server,
and the disallow is for folders which contain admin stuff etc.
Okay, here's the latest version of the script, and I think I know the part which is causing problems.
PHP Code:
<?php
if($_POST['submit']) {
    $disallowed_dir = explode("\n", $_POST['disallowed_dir']);
    $disallowed = "User-agent: * \n";
    foreach($disallowed_dir as $line) {
        $disallowed.= "Disallowed: $line";
    }
    $disallowed.= "\n\n";
    ######
    $bannedbots = explode("\n", $_POST['bannedbots']);
    $banned_bots = '';
    foreach($bannedbots as $bot) {
        $banned_bots.= "User-agent: $bot \n";
        $banned_bots.= "Disallowed: \ \n";
    }
    $robotstxt = " #Disallowed Dirs and Files \n\n $disallowed #Banned Bots \n $banned_bots";
    //Open up the file
    $fh = fopen(DOC_ROOT."/robots.txt", "w");
    //Write to the file
    fwrite($fh, "$robotstxt");
    //Close the file up
    fclose($fh);
    echo 'Saved';
} //End if
else {
    $directories = array(); // Array to contain disallowed directories
    $bots = array(); // Array to contain disallowed bots
    $robots = file_get_contents(DOC_ROOT.'/robots.txt');
    if ($robots === false) die('Error: file cannot be read.');
    $lines = explode("\n", $robots);
    $dirFlag = false; // flag to look for directories
    foreach ($lines as $line){ // Loop through each line
        if (stripos($line, 'User-agent: *') === 0){
            $dirFlag = true; // If we see User-agent: *, then we know we are looking for disallowed directories
        }elseif (stripos($line, 'User-agent:') === 0){
            $dirFlag = false; // If we see User-agent without the *, we want bots, not disallowed directories
            $bots[] = trim(substr($line, 11));
        }elseif (stripos($line, 'Disallow:') === 0 && $dirFlag){
            $directories[] = trim(substr($line, 9));
        }
        // Ignore all other lines
    }
    $bots = implode("\n",$bots);
    $directories = implode("\n",$directories);
    echo '<form action="" method="post">
    Disallowed folders and files.
    <textarea name="disallowed_dir" cols="50" rows="10"> '.$directories.' </textarea>
    <br /><br />
    Banned Bots
    <textarea name="bannedbots" cols="50" rows="10"> '.$bots.' </textarea>
    <br /><br />
    <input type="submit" name="submit" value="Save" />
    </form>';
}//End else
?>
I think this part is causing my problems. It is now saving without any problems, but it isn't displaying the disallowed dirs,
and I think this is the problem:
PHP Code:
$dirFlag = false; // flag to look for directories
foreach ($lines as $line){ // Loop through each line
    if (stripos($line, 'User-agent: *') === 0){
        $dirFlag = true; // If we see User-agent: *, then we know we are looking for disallowed directories
    }elseif (stripos($line, 'User-agent:') === 0){
        $dirFlag = false; // If we see User-agent without the *, we want bots, not disallowed directories
        $bots[] = trim(substr($line, 11));
    }elseif (stripos($line, 'Disallow:') === 0 && $dirFlag){
        $directories[] = trim(substr($line, 9));
    }
}
Because as far as I can tell it IS still picking up the user-agent: * check for both, because it doesn't understand the *, if you know what I mean. So I think I need to have it like
user-agent: [a-z][A-Z][0-9] and so on, so it knows it's about a banned bot and not *, because the bot name would start with something other than *. So it needs to check for that?
But I'm clueless as to how.
Thanks,
Dan
11-05-2007, 10:25 PM
Posts: 238
Location: United States
|
strpos() doesn't use regular expressions, so * should only match the literal character *. Is your list showing a bunch of disallowed bots as "*" or a bunch of disallowed directories as "/"? I've tested the section of code that reads the file on both PHP 5 and PHP 4 (after I replaced the stripos() calls) and it seems to work properly for me. Maybe it's something to do with the format of robots.txt that I missed. I have no idea though.
Actually, if it's not picking up the directories and it's only picking up the bots, then it is either NOT picking up User-agent: * or not picking up Disallow:.
And if you want to do the [a-z][A-Z] etc. thing, then you will have to use preg_match(). preg_match() can be used in this case; I just chose not to.
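For reference, a minimal sketch of the preg_match() route, assuming a hypothetical helper `classifyUserAgentLine()` (the name and the return format are made up for illustration). The pattern treats `*` as the catch-all agent and anything else as a bot name:

```php
<?php
// Hypothetical helper: returns array('type' => 'all'|'bot', 'name' => string),
// or null when the line is not a User-agent line.
function classifyUserAgentLine($line)
{
    // ^user-agent: then optional spaces, then the agent value; /i = case-insensitive.
    if (!preg_match('/^user-agent:\s*(.*?)\s*$/i', $line, $m)) {
        return null;
    }
    if ($m[1] === '*') {
        return array('type' => 'all', 'name' => '*');
    }
    return array('type' => 'bot', 'name' => $m[1]);
}

$all  = classifyUserAgentLine('User-agent: *');
$bot  = classifyUserAgentLine("user-agent: randombot2 \r");
$none = classifyUserAgentLine('Disallow: /admin/');
```

Unlike the fixed `substr($line, 11)` offset, the capture group tolerates any amount of spacing after the colon and strips a trailing carriage return.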
11-06-2007, 12:00 PM
Posts: 6,521
Name: Dan
Location: Swindon
|
11-06-2007, 02:44 PM
Posts: 238
Location: United States
|
Ah, I see the problem now. The correct way to disallow a directory is to use "Disallow: /blah". Your script is writing them as "Disallowed: /blah".
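A sketch of the writing side with the directive spelled correctly, assuming a hypothetical function `buildRobotsTxt()` that takes the two lists as plain arrays. It also writes `/` for the banned bots' rule, where the posted script has a backslash:

```php
<?php
// Hypothetical helper: builds robots.txt text from two plain arrays.
function buildRobotsTxt(array $dirs, array $bots)
{
    $out = "#DISALLOWED DIR\nUser-agent: *\n";
    foreach ($dirs as $dir) {
        $out .= "Disallow: $dir\n";   // "Disallow", not "Disallowed"
    }
    $out .= "#BANNED BOTS\n";
    foreach ($bots as $bot) {
        $out .= "User-agent: $bot\n"; // append with .= so every bot is kept
        $out .= "Disallow: /\n";
    }
    return $out;
}

$text = buildRobotsTxt(array('/admin/', '/docs/'), array('randombot'));
echo $text;
```

Each line ends in exactly one `\n` and nothing is padded with spaces, so the reading code can match `Disallow:` at position 0 of every line.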
11-06-2007, 03:23 PM
Posts: 6,521
Name: Dan
Location: Swindon
|
YAY.. but.
Okay, after tweaking I've got it almost right. Thanks frost for pointing that out, it gave me the corners and sides of the puzzle!
Now here's the but.
It's working and it's reading it fine.
But it seems to add a \n to both fields when it saves, so I'm getting an extra blank
user-agent:
Disallow:
in the bots bit and an extra blank Disallow: in the disallowed dirs bit.
;/ So how do I make it delete unneeded \n from the end of each field before saving?
Could trim() do this? I'm not sure how the trim function handles \n.
11-06-2007, 03:46 PM
Posts: 238
Location: United States
|
Yes, trim() removes all kinds of whitespace by default, but that might not be your problem. In PHP, the following strings are equivalent (well, nearly equivalent, depending on which OS you are using, but that is irrelevant).
PHP Code:
$stringOne = "1st line
2nd line
3rd line";
$stringTwo = "1st line\n2nd line\n3rd line";
If you write either of those strings to a file, you will get three lines. However, if you do this:
PHP Code:
$stringThree = "1st line\n
2nd line\n
3rd line";
then you will get a double-spaced result.
It's possible that you may be doing something like that judging by the code you posted a couple days ago. If not, then it could be just an unnecessary double \n\n or something.
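One way to rule out stray newlines before saving, sketched under the assumption that each textarea arrives as a single string (browsers typically submit line breaks as \r\n). The helper name `cleanLines()` is hypothetical:

```php
<?php
// Hypothetical helper: splits textarea input into trimmed, non-empty lines.
function cleanLines($text)
{
    $lines = array();
    // Split on \r\n, \r, or \n so Windows-style line endings do not leave a stray \r.
    foreach (preg_split('/\r\n|\r|\n/', $text) as $line) {
        $line = trim($line);      // trim() strips spaces, tabs, \r and \n
        if ($line !== '') {       // skip blank lines, e.g. a trailing newline
            $lines[] = $line;
        }
    }
    return $lines;
}

$clean = cleanLines(" /admin/\r\n/docs/ \r\n\r\n");
```

Running both fields through something like this before building the file means a trailing blank line in the form cannot turn into an empty User-agent: or Disallow: entry.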
11-06-2007, 05:14 PM
Posts: 6,521
Name: Dan
Location: Swindon
|
Yeah, it is, because in the robots file it's got a double newline.
11-06-2007, 05:18 PM
Posts: 6,521
Name: Dan
Location: Swindon
|
Okay, it's stopped most of the double new lines.
Here's what happens: I type the stuff in and save it.
It's fine.
If I then open it and resave (without even touching the fields),
it automatically adds a new line, so the second save adds another newline, which then means another blank entry for both.
Get me?