while statement stops before running code
03-10-2005, 11:07 PM
|
Parse a dat file and insert values into a MySQL db
|
Posts: 16
|
I need to figure out how in the heck to to take a dat file and:
1) check to see if this file has already been parsed and added to the db if so then quit
2) if new file (header contains unique identifier like HEAD03022005) then parse all values which appear in the file like this:
48020105224
48020514002
48022403889
48026115817
48026115817
48029903855
48032507284
48040311473
etc.
and the file ends with TRAIL000000014
I need to put all of the values in between the head and trail into a MySQL a db, these values then need to have an incremental value associated with them that represents how many times they have been sent to the db recently. This value will go down to 0 if the client validates their code and it matches one of the 11 digit values in the db. (If it validates then they will be given <i>x</i> number of new codes from another db, <i>x</i> being the incremental value.
Seems simple enough yet I can't even begin to figure out how to do this.
|
|
|
|
03-11-2005, 04:19 AM
|
|
Posts: 1,832
Location: Somewhere else entirely
|
Ok, I don't quite understand the meaning of the codes, but what you probably want to do is something like the following:
Have one database table, let's call it files, into which you put the unique id number from the head. Then you can check to see if its there already in which case we're done. If not, you add it in and continue parsing the file.
Then have another table, call it codes with each code and a count. For each line you check to see if it's already present and if so, make the counter one bigger. If it's not there we add a new entry. The same table can be used to knock the value to 0 when someone validates a file.
Do you already have a users database? Cos you'll need this to keep track of who has done what and who has recieved which codes etc.
Which part of this do you need help with - the database creation, or picking out the values from the file?
__________________
UPDATE 0beron SET talkupation = talkupation + lots WHERE post = 'helpful';
Please login or register to view this content. Registration is FREE (aka MSN handwriting for forums)
|
|
|
|
03-11-2005, 08:15 AM
|
|
Posts: 16
|
Picking out the codes from the file and putting them into the database w/ the checks (already been parsed and code already exist)
|
|
|
|
03-11-2005, 04:21 PM
|
|
Posts: 16
|
Ok I have it parsing the file and connecting to my db fine, I think, I just don't see the values showing up in my db after the job is run. Why?
$fh = fopen($datFile, "r") or die( "unable to open $datFile");
while(!feof($fh))
{
$line = fgets($fh,1024);
list($fcode) = explode("/n", $line);
echo "$fcode<br>";
$sql = "INSERT INTO codes (cust_code) VALUES ('$fcode')";
mysql_query($sql);
}
?>
|
|
|
|
03-11-2005, 04:47 PM
|
|
Posts: 1,832
Location: Somewhere else entirely
|
Picking things out first involves opening the file in php, then it's probably best to do things on a line by line basis. The following script will read a file (called test.txt) and echo each line in turn:
PHP Code:
<?php
$fp = fopen("test.txt","r");
while(!feof($fp)) {
$line=fgets($fp,1024);
echo $line ."<br />";
}
fclose($fp);
?>
The function fopen opens the file, the "r" tells it we want to read the file, not write to it, and the function fgets() reads one line from the file.
For your purpose, you will need to replace the echo with the database queries and checks.
Each time you read a line you then have to check what's in it. This depends a lot on how the files are structured. If the HEAD line always starts HEAD, you can check for this using the substr function. Once you have found the HEAD line, lets say you read it into a variable $line, and then check that $line starts with 'HEAD'. Then you can issue a databse query along the lines of:
PHP Code:
$result = mysql_query("SELECT * FROM files WHERE head = '$line'");
$num = mysql_num_rows($result);
mysql_query issues the query, in this case to select everything with the same HEADxxxxxxxxxx as your current file. The next function mysql_num_rows will tell you the number of rows returned. If this is 1 you can quit cos this files already been done. If it's zero, you will need to add the entry to the database:
PHP Code:
mysql_query("INSERT INTO files (head) VALUES ('$line')");
Then to do the codes, it is a very similar operation. You can either use substring again to check if it starts with 480, or if that won't always be the case, a function like is_numeric() might be enough. Then you will need to write a SQL the same as the first one above to check how many rows have that code already. If none do , we want to put the code in with count = 1, if not then we want to make the count one bigger:
PHP Code:
$result = mysql_query("SELECT * FROM codes WHERE code = $line");
$num = mysql_num_rows($result);
if($num==0) {
mysql_query("INSERT INTO codes (code, count) VALUES ( '$line', 1)");
}
else {
mysql_query("UPDATE codes SET count = count+1 WHERE code = '$line' ");
}
That help?
__________________
UPDATE 0beron SET talkupation = talkupation + lots WHERE post = 'helpful';
Please login or register to view this content. Registration is FREE (aka MSN handwriting for forums)
|
|
|
|
03-11-2005, 05:40 PM
|
|
Posts: 16
|
now all the records in the table (valid_cust_codes) are coming up the same
2147483647
the header and trail are entered as 0
I'm starting to get the hang of this. The first thing I'm going to want to check is to see if the number of records between the head and trail equals the number designated after trail. The test file I have has 14 records in between HEAD and TRAIL00000014. If it matches then we check to see if HEAD03022005 <--Date already exists in our table if it does then die and delete file. If it doesn't then we proceed with parsing the records and sending them to the codes table and check to see if they exist already etc etc.
Last edited by Hallmarc; 03-11-2005 at 05:55 PM..
|
|
|
|
03-11-2005, 06:42 PM
|
|
Posts: 1,832
Location: Somewhere else entirely
|
Ah - the codes are too big to be treated as numbers. 2147483647 is the maximum size integer value you can store (in signed format) on a 32bit machine. You will need to treat the numbers as strings instead of as numbers. This should not be hard - just make sure your SQL queries put single quotes round the value of '$line', and that your column datatype is varchar and not int.
__________________
UPDATE 0beron SET talkupation = talkupation + lots WHERE post = 'helpful';
Please login or register to view this content. Registration is FREE (aka MSN handwriting for forums)
|
|
|
|
03-11-2005, 06:49 PM
|
|
Posts: 16
|
Ah that did the trick for the codes. The dat files will always start with HEADDateSent and end with TRAILNumberofRecords. So now how do I read the last line in the dat file (the TRAIL one) so I can check to see if it matches the number of records?
|
|
|
|
03-11-2005, 07:53 PM
|
|
Posts: 16
|
OK I just decided to run another While loop and wait until it hits it the last record now I need to remove all leading zeros (because I never know if the number of records will be in the double, triple, quadruple rtc. digits) then I can compare the TRAIL count with the real record count.
|
|
|
|
03-11-2005, 08:43 PM
|
|
Posts: 3,189
|
Hey Oberon...Just for future reference this whole block of code...
PHP Code:
$fp = fopen("test.txt","r");
while(!feof($fp)) {
$line=fgets($fp,1024);
echo $line ."<br />";
}
fclose($fp);
Can be replaced with this....
PHP Code:
$file = file('test.txt');
foreach($file as $r) {
echo $r . '<br />';
}
|
|
|
|
03-11-2005, 11:15 PM
|
|
Posts: 1,832
Location: Somewhere else entirely
|
Nice - didn't know that. That example was a copy/paste from a script I had lying around that I think is quite old.
__________________
UPDATE 0beron SET talkupation = talkupation + lots WHERE post = 'helpful';
Please login or register to view this content. Registration is FREE (aka MSN handwriting for forums)
|
|
|
|
03-12-2005, 08:10 AM
|
|
Posts: 36
|
Just because something can be done another way, doesn't mean it's the best way. How's working with large files - 1-2 gig - gonna effect the latter example ?
|
|
|
|
03-12-2005, 09:36 AM
|
|
Posts: 3,189
|
It really depends on whether you are just echoing that array or modifying it. When modifying the array I think a for loop is actually faster, and it is much faster when you determine the size of the array before placing it in the for loop.
As far as fopen....I have read around that fopen is very server intensive and although I have not seen any benchmarks I have found that file is much quicker when working with bigger files.
|
|
|
|
03-12-2005, 10:45 AM
|
|
Posts: 36
|
Quote:
|
It really depends on whether you are just echoing that array or modifying it.
|
That is irrelevant. The point was with the latter way the whole file gets copied into memory. A 1-2 gig file and you only have 512Mb memory - you can do the math. And then you have PHP's memory limit to worry about. The microsecond speed deferences between loops won't save you here.
Quote:
|
As far as fopen....I have read around that fopen is very server intensive and although I have not seen any benchmarks I have found that file is much quicker when working with bigger files.
|
This is just a mute point, what do you think file() does ?
|
|
|
|
03-12-2005, 11:02 AM
|
While loop only runs once
|
Posts: 16
|
Ok what have I done wrong here??
$fp=fopen($datFile,"r") or die("Unable to open $datFile. Please check and make sure the file exists and is not corrupt.");
$lines = 0;
if ($fh = fopen('../sr_codes/DAILY.dat','r')) {
while (! feof($fh)) {
$line=fgets($fh,1024);
list($fcode) = explode("/n", $line);
if (substr($fcode, 0, 5)==="TRAIL") {
$stripZeros=substr($fcode, 5);
$checkForZero=substr($stripZeros, 0, 1);
while ($checkForZero="0"); {
$stripZeros=substr($stripZeros, 1);
$checkForZero=substr($stripZeros, 0, 1);
}
}
$lines++;
}
}
|
|
|
|
03-12-2005, 11:08 AM
|
|
Posts: 36
|
Two lines that immediately spring out.
Code:
list($fcode) = explode("/n", $line);
Code:
while ($checkForZero="0");
|
|
|
|
03-12-2005, 11:26 AM
|
|
Posts: 3,189
|
Quote:
|
Originally Posted by tress
That is irrelevant. The point was with the latter way the whole file gets copied into memory. A 1-2 gig file and you only have 512Mb memory - you can do the math. And then you have PHP's memory limit to worry about. The microsecond speed deferences between loops won't save you here.
This is just a mute point, what do you think file() does ?
|
As I understand it, file copies the file once into an array. Looping over fopen will open the file each itteration of the loop and read the whole thing looking for that one line then close it, then open it again, etc.
So either way if you have a 2gb file it's going to be using all the memory and start paging to the hard drive.
|
|
|
|
03-12-2005, 06:57 PM
|
|
Posts: 1,832
Location: Somewhere else entirely
|
I though fopen gave you a pointer into the file, you call fopen once outside the loop, and the fgets() will get you one line at a time and advance the pointer. It's probably slower, but it means you don't read the whole file at once. fopen shouldn't be placed within the loop itself.
__________________
UPDATE 0beron SET talkupation = talkupation + lots WHERE post = 'helpful';
Please login or register to view this content. Registration is FREE (aka MSN handwriting for forums)
|
|
|
|
03-12-2005, 07:18 PM
|
|
Posts: 3,189
|
Perhaps your right. Although I'm not sure how it gets the pointer, does it need to open and parse the file first?
|
|
|
|
03-12-2005, 07:30 PM
|
|
Posts: 3,110
Location: Toronto, Ontario
|
No, the pointer is just a resource that allows us to work with a file. The contents of the file aren't read until you choose to read them with one of the reading functions. With larger files, it is indeed more efficient to read chunks of the file instead of the entire file.
Kinda just like opening a mySQL connection returns a resource and not the entire database.
|
|
|
|
|
« Reply to while statement stops before running code
|
|
|
| Thread Tools |
Search this Thread |
|
|
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|