Tycoon Talk

PHP Forum


help with uploading file information into databases
Old 11-02-2005, 01:47 PM help with uploading file information into databases
qnc
Skilled Talker

Posts: 74
Trades: 0
I would like your help in combining the information from three uploaded files, using that information to query the database, and then adding the query results plus the information from the files to different tables in the database.

The case is as follows.

I have two files (the third is discussed at the end)

One contains people and representations of their genetic code at particular points on their genome. Alleles are the code and the points are called markers. The file is named *.ped (for pedigree) but it is a simple text file. The first part of each line looks like this:

ADRP134 0247 0227 0228 2 2

This is the family ID, the person's ID, the father's ID, the mother's ID, whether the person is male or female (1 or 2) and whether they have the disease or not (1 or 2). This is followed on the same line by their genetic code at each point, which in this case is:

1 2 2 3 1 10 4 5

There are two numbers for each point. So the first point is 1 2, the second 2 3, and so on.

The other file (*.dat for data file, but again it is a simple text file) contains the names of the points (markers).

The whole file looks similar to this:

5 0 0 5 << NO. OF LOCI, RISK LOCUS, SEXLINKED (IF 1) PROGRAM
0 0.000000 0.000000 0 << MUT LOCUS, MUT RATE, HAPLOTYPE FREQUENCIES (IF 1)
1 2 3 4 5
1 2
0.999990 0.000010 << GENE FREQUENCIES
1 << NO. OF LIABILITY CLASSES
0.000000 0.900000 0.900000
3 2 # D20S906
0.500000 0.500000 << GENE FREQUENCIES
3 7 # D22S280
0.142857 0.142857 0.142857 0.142857 0.142857 0.142857 0.142857 << GENE FREQUENCIES
3 10 # D22S423
0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 << GENE FREQUENCIES
3 8 # D22S274
0.125000 0.125000 0.125000 0.125000 0.125000 0.125000 0.125000 0.125000 << GENE FREQUENCIES
0 0 << SEX DIFFERENCE, INTERFERENCE (IF 1 OR 2)
0.166670 0.166670 0.166670 0.166670 0.000000 << RECOMB VALUES
1 0.1 0.45 << REC VARIED, INCREMENT, FINISHING VALUE

However, the marker names appear only on lines beginning with 3, so here are those lines (with their frequency lines) extracted:

3 2 # D20S906
0.500000 0.500000 << GENE FREQUENCIES
3 7 # D22S280
0.142857 0.142857 0.142857 0.142857 0.142857 0.142857 0.142857 << GENE FREQUENCIES
3 10 # D22S423
0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 0.100000 << GENE FREQUENCIES
3 8 # D22S274
0.125000 0.125000 0.125000 0.125000 0.125000 0.125000 0.125000 0.125000 << GENE FREQUENCIES



In these lines the names are D20S906, D22S280, D22S423 and D22S274 respectively, and they correspond to the 1 2 2 3 1 10 4 5 mentioned above. Interestingly, the number after 3 represents how many alleles have been found in that family for that marker. The fractions below are the frequency of each allele; however, we can calculate this more accurately once this system is instituted.



Now to the task. I was hoping to import both files.

From the *.dat file I want to extract which markers are being used.

Into a table called raw allele I want to store the person's unique ID, which corresponds to an entry in another table called family; the example above is entry number 36. (The *.ped file carries the fam_id and the person ID, which together correspond to a unique entry in the family data table. This is where a query has to be done.)

Therefore each line in the *.ped file has to be compared to the corresponding record in the family table, and then the alleles recorded as well.

So the raw allele data table will start to look like this:

Ui, fam_ui, d_no, allele1, allele2, ori files
1 , 36 , D20S906, 1, 2
2 , 36 , D22S280, 2, 3

Now, ori files comes from the third file, which is an *.SMP file (again a text file) that links the original genetic reader instrument file to the ped file, i.e. family = adrp134, person = 0247, the file containing the original data = qnc200505-a1-adrp134-0247-09.fas (not a text file, but its name will suffice for future searches back to the raw data).

All I would like is for qnc200505-a1-adrp134-0247-09.fas to be added to the appropriate line in the raw allele table.

so after the three files have been uploaded and manipulated

the final entry would look something like

Ui, fam_ui, d_no, allele1, allele2, ori files
1 , 36 , D20S906, 1, 2, qnc200505-a1-adrp134-0247-09.fas
2 , 36 , D22S280, 2, 3, qnc200505-a1-adrp134-0247-09.fas


To summarise: it is uploading three files simultaneously, if possible, using the information in them to perform a basic query, and then using some of the information in the files and the info from the query to populate another table.

Can anyone help with part or all of it or point me in the right direction to figure it out for myself?
 
 
Old 11-02-2005, 02:17 PM
0beron
Defies a Status

Posts: 1,832
Location: Somewhere else entirely
Trades: 0
Quote:
Originally Posted by me
Can't help you with the file uploads, since I've not done this myself, but once they are on the server it will be very easy to get things into the database as you want it.
EDIT: forget that, I tried a test script and it worked OK, I posted in your other thread and merged it into this thread at the bottom.


Here are some PHP functions that you will find useful. Post back with more questions if you want a hand using them:

file()
The file function loads an entire file into an array, so for your .ped file you can say
PHP Code:
$pedlines = file("yourfile.ped");
at which point $pedlines[0] = the first line of the file.

You can loop over all the lines in the file with a foreach loop.

explode()
The explode function splits up a string into pieces based on a separator - in this case you want to separate on spaces:
PHP Code:
$items = explode(' ', $pedlines[5]);
echo $items[0];
echo $items[1];
Here $items holds the familyID in $items[0], the personID in $items[1], the fatherID in $items[2] etc...

You can then use $items[0] and $items[1] to do your query into the family table.
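
To make the column layout concrete, here is a minimal sketch that splits one .ped line into its six ID columns and the allele pairs that follow. parsePedLine() and its array keys are names made up for this illustration, and it assumes every line has at least the six ID columns. Since the IDs fill positions 0 to 5, the alleles start at position 6.

```php
<?php
// parsePedLine() is a hypothetical helper, not part of PHP or of the thread's code.
// Assumes the line has at least six whitespace-separated ID columns.
function parsePedLine(string $line): array
{
    // Split on any run of whitespace so tabs or double spaces do not create empty items.
    $items = preg_split('/\s+/', trim($line));

    return [
        'family'   => $items[0],
        'person'   => $items[1],
        'father'   => $items[2],
        'mother'   => $items[3],
        'sex'      => $items[4],
        'affected' => $items[5],
        // Everything after the six ID columns is alleles, two per marker.
        'alleles'  => array_chunk(array_slice($items, 6), 2),
    ];
}

$person = parsePedLine("ADRP134 0247 0227 0228 2 2 1 2 2 3 1 10 4 5");
echo $person['family'], ' ', $person['person'], "\n";   // ADRP134 0247
echo implode('/', $person['alleles'][2]), "\n";         // 1/10
```

Splitting with preg_split() instead of explode(' ') is only a precaution in case the real files use tabs or repeated spaces.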

preg_match()
Regular expression matching - useful in this case for picking out the names of the markers starting with #
PHP Code:
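$datlines = file("yourfile.dat");
foreach ($datlines as $lnum => $datline) {
  //CAUTION!! Arrays start at 0, line numbers at 1!!
  if (preg_match("_(#\S*)_", $datline, $matches)) {
    echo $matches[0];
  }
}
^ This should echo out all the #D numbers, including the #. If you want it without the #, use "_#(\S*)_" instead; $matches[0] still holds the whole match, and $matches[1] will be just the number. Note that if your file has a space between the # and the name (as in the sample you posted), use "_#\s*(\S+)_" so that the name itself is captured in $matches[1].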
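
As a rough sketch of the marker extraction applied to lines like the ones in the posted .dat sample: extractMarkers() is a made-up helper name, and the pattern assumes the marker name is the first run of non-space characters after the #, with or without a space in between.

```php
<?php
// extractMarkers() is a hypothetical helper name used only for this example.
function extractMarkers(array $datLines): array
{
    $markers = [];
    foreach ($datLines as $line) {
        // "#", optional whitespace, then the marker name captured in group 1.
        if (preg_match('/#\s*(\S+)/', $line, $matches)) {
            $markers[] = $matches[1];
        }
    }
    return $markers;
}

$sample = [
    "1 << NO. OF LIABILITY CLASSES",
    "3 2 # D20S906",
    "0.500000 0.500000 << GENE FREQUENCIES",
    "3 7 # D22S280",
];
print_r(extractMarkers($sample));
```

The markers come back in file order, which matters because the allele pairs in the .ped file are in the same order.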


This is all a bit short on details but it looks like a mammoth task you are trying. Let us know if this was helpful and if you'd like further help with a part of it.
__________________
UPDATE 0beron SET talkupation = talkupation + lots WHERE post = 'helpful';


Last edited by 0beron; 11-03-2005 at 08:21 PM..
 
Old 11-03-2005, 07:02 AM
qnc
This is great, just what I needed. I will start and post my poor syntax efforts up here for debugging.

Check for my post in a few days.
 
Old 11-03-2005, 01:57 PM upload info into array
qnc
Would the following work?

PHP Code:
<!-- 

UPLOAD FORM
Aim of form: to browse for files on computer viewing page.
upload the file information to $pedlines, $datlines and $smplines

-->


<?php
// test to see if files have been uploaded before
// FIRST if no then display upload FORM

if (!isset == $pedlines+$datlines+$smplines)
{

echo
'
<form action="upload_check_insert_page.php" method="POST">
ATTACH<br />
PED file: <input type="file" name="your_ped_file.ped" /><br />
DAT file: <input type="file" name="your_dat_file.ped" /><br />
SMP file: <input type="file" name="your_smp_file.ped" /><br />
<input type="submit" value="UPLOAD" /><br />
</form>
'
;
}
else
{

// load files' information into an array line by line 
// $pedfiles[0] = the first line of the *.ped file

$pedlines=file("your_ped_file.ped");  
$datlines=file("$your_dat_file.dat"); 
$smplines=file("$your_smp_file.smp"); 
}
?>
 
Old 11-03-2005, 08:13 PM
0beron
There is more involved in uploading a file: the code that processes the form needs to look in the $_FILES[] array that comes with the form.
This is some code I put together and tested. It borrows heavily from this PHP manual page:
http://uk.php.net/features.file-upload

PHP Code:
<?php
// In PHP versions earlier than 4.1.0, $HTTP_POST_FILES should be used instead
// of $_FILES.

print_r($_POST);

if (!isset($_POST['submit'])) { ?>

<!-- The data encoding type, enctype, MUST be specified as below -->
<form enctype="multipart/form-data" action="<?php echo $_SERVER['PHP_SELF']; ?>" method="POST">
    <!-- MAX_FILE_SIZE must precede the file input field -->
    <input type="hidden" name="MAX_FILE_SIZE" value="30000" />
    <!-- Name of input element determines name in $_FILES array -->
    Ped file: <input name="pedfile" type="file" />
    Dat file: <input name="datfile" type="file" />
    Smp file: <input name="smpfile" type="file" />
    <input type="submit" name="submit" value="Upload Files" />
</form>


<?php }

else {

  $uploaddir = 'destinationfolder/';
  $pedupload = $uploaddir . basename($_FILES['pedfile']['name']);
  $datupload = $uploaddir . basename($_FILES['datfile']['name']);
  $smpupload = $uploaddir . basename($_FILES['smpfile']['name']);

  echo '<pre>';
  if (move_uploaded_file($_FILES['pedfile']['tmp_name'], $pedupload) &&
      substr($_FILES['pedfile']['name'], -4) == ".ped") {
    echo "Ped file is valid, and was successfully uploaded.\n";
  } else {
    echo "Possible file upload attack!\n";
  }
  if (move_uploaded_file($_FILES['datfile']['tmp_name'], $datupload) &&
      substr($_FILES['datfile']['name'], -4) == ".dat") {
    echo "Dat file is valid, and was successfully uploaded.\n";
  } else {
    echo "Possible file upload attack!\n";
  }
  if (move_uploaded_file($_FILES['smpfile']['tmp_name'], $smpupload) &&
      substr($_FILES['smpfile']['name'], -4) == ".smp") {
    echo "Smp file is valid, and was successfully uploaded.\n";
  } else {
    echo "Possible file upload attack!\n";
  }

  $pedlines = file($pedupload);
  $datlines = file($datupload);
  $smplines = file($smpupload);

}

?>
Make sure that destinationfolder has write permissions set.
By the end of the script you should have the file contents available in arrays, and should be able to use the other string processing functions above to sort everything out.
Be aware that uploading another ped/dat/smp file with the same name as one already uploaded will overwrite it.
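
One thing worth noting about the script above: it compares the last four characters of the name case-sensitively, and only after the file has already been moved. A small sketch of a check that could be run before move_uploaded_file() is below. hasExtension() is a made-up helper name, and treating "family.PED" as acceptable is an assumption about what you want.

```php
<?php
// hasExtension() is a hypothetical helper, not a built-in PHP function.
// It compares the file's extension with the expected one, ignoring case.
function hasExtension(string $filename, string $expected): bool
{
    // pathinfo() returns an empty string here when the name has no extension.
    $actual = pathinfo($filename, PATHINFO_EXTENSION);
    return strcasecmp($actual, $expected) === 0;
}

$name = 'family134.PED';   // example name only
if (hasExtension($name, 'ped')) {
    echo "Extension looks right, safe to move the upload.\n";
}
```

An extension check only guards against picking the wrong file by mistake; it says nothing about what is actually inside the file.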

Last edited by 0beron; 11-03-2005 at 08:22 PM..
 
Old 11-04-2005, 01:50 PM getting there!!
qnc
I got this far. Any obvious corrections/additions at this stage?

PHP Code:
<!-- 

after info is
in

 $pedlines = file($pedupload);
  $datlines = file($datupload);
  $smplines = file($smpupload);


NOW WE START THE MAIN DATA GATHERING AND EDIT FORM PREPARATION:
Aim is to:
gather marker name information
gather family and individual details
query fam_data table to get fam_ui for individual
query raw_allele data with fam_ui and marker data to see if record has been entered
if yes then ask if replacement or insert
if insert then add "duplicate and $entry_date" to form
all other cases leave comments blank
insert form into database
-->

<!-- get marker details -->
<?php
// preg_match out markers info
foreach ($datlines as $lnum => $datline
{
//CAUTION Arrays start at 0, line numbers at 1
  
if (preg_match("_(#\S*)_", $datline, $matches))
    {
// pass markers into an array $matches[0];
    }
$no_markers = (count($matches);
$no_alleles = $no_markers*2;
 }
?>

<!-- set up while loop for array to EXPLODE info from ped files -->
<?php
while ($row = array($pedlines))
{
// take a *.ped string and split based on separators, here spaces ' ', using EXPLODE
// could use $pedlines[5] to limit to fam data but probably better to get whole string
$fam_items = explode(' ', $pedlines);

// set up for each marker a person has
// pass alleles into an array
// $alleles = $array($fam_items[5]-[n])
// for each $allele

$fam_i = $fam_items[0];
$in_i = $fam_items[1];

// query database with $items [0] and [1]
$sql="SELECT fam_ui FROM fam_data WHERE fam_i = $fam_i and in_i = $in_i";
$fam_ui_result = mysql_query($sql) or die ('Failed to execute ' . $sql . ' due to ' . mysql_error());


//query database to see if person and marker have been entered before
$sql2="SELECT ui FROM raw_allele WHERE fam_ui++d_no = $fam_ui_result ++ $matches";
$duplicate_entry_result = mysql_query($sql2) or die ('Failed to execute ' . $sql2 . ' due to ' . mysql_error());

 if ($duplicate_entry_result = 0)

{

// EXPLODE out alleles
$marker_items = explode(' ', $pedlines[5+$no_alleles]);
// if ($marker_allele_result = $marker_item[5],)


// assemble fam_ui maker name and corresponding alleles
// $raw_allele_line[i]=$fam_ui_result.\t.$matches[0].\t.fam_items[ai].\t.fam-item[ai+1].\n;

// TILL $pedlines[5+((count($matches))*2)]

// then next pedline string

<!-- display results and allow editing-->
// make form with prefilled in $raw_allele_line
echo <form>$raw_allele_line</form>

}
else
{
// DO YOU WANT TO REPLACE RECORDS OR INSERT AS DUPLICATES
// if REPLACE THEN SHOW FORM AS ABOVE
// IF DUPLICATES THEN ADD 'duplicate and entry date into comments'
};
?>

<!-- inserts final file into database -->
<?php
// insert into database as REPLACE or INSERT.
?>

<!-- echo out success -->

<!-- finish upload_page.php page -->
</body>
</html>
 
Old 11-04-2005, 02:51 PM
0beron
The brackets don't match on $no_markers = (count($matches); , you don't need the first (.
If you intend to count the number of markers, this won't work as it is - preg_match will only match one marker, which is fine cos we are only looking at them a line at a time, so you need to keep count inside the loop:

PHP Code:
$markers = array(); //Empty array to be filled with markers.
foreach ($datlines as $lnum => $datline)
{
  //CAUTION Arrays start at 0, line numbers at 1
  if (preg_match("_(#\S*)_", $datline, $matches))
  {
    $markers[] = $matches[0];
  }
}
//Count them when we are finished looping:
$no_markers = count($markers);
$no_alleles = $no_markers*2;
The syntax you use for your while loop is usually used for going through database results. $pedlines is already an array, plus assigning this array to $row won't help. You can just use another foreach for this:
PHP Code:
foreach ($pedlines as $plnum => $pedline) {
  $fam_items = explode(' ', $pedline);
  //etc...
}

$fam_ui_result is a result set, not a number. To get the number you'll need to say
PHP Code:
$fam_ui_number = mysql_result($fam_ui_result,0,'fam_ui');
$fam_ui_result is a piece of database with one row (row 0) and one column (called fam_ui). Mysql result gets the contents of the result for you.
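
For anyone reading this on a newer server: the mysql_* functions used in this thread were removed in PHP 7, so the same lookup would need mysqli or PDO instead. Below is a rough PDO sketch of the fam_ui lookup. lookupFamUi() is a made-up helper name, the table and column names (fam_data, fam_i, in_i, fam_ui) are taken from the query posted above, and how you create the PDO connection is left out. It returns null when the person is not in the table yet.

```php
<?php
// lookupFamUi() is a hypothetical helper; fam_data, fam_i, in_i and fam_ui
// are the table and column names used in the thread's query.
function lookupFamUi(PDO $db, string $famId, string $personId): ?int
{
    // Placeholders let the driver quote values such as ADRP134 correctly.
    $stmt = $db->prepare('SELECT fam_ui FROM fam_data WHERE fam_i = ? AND in_i = ?');
    $stmt->execute([$famId, $personId]);

    // fetchColumn() gives the first column of the first row, or false if no row matched.
    $value = $stmt->fetchColumn();
    return $value === false ? null : (int) $value;
}
```

A null result is the case where the family or person would have to be inserted into fam_data first.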

The query to check for duplicates has odd syntax - what is this query meant to do?

You have the same problem with $duplicate_entry_result as with $fam_ui_result. Use mysql_result, or even better mysql_num_rows if you just want to know if there were duplicates or not.

PHP Code:
if ($duplicate_entry_result = 0)
You need to use == here instead of =. == compares things, = assigns things. Your code assigns 0 to $duplicate_entry_result, and then does an if test on the result of the assignment, which is always zero (false), so the block never runs. Use == to compare $duplicate_entry_result with 0.


Those are the main things I can see - I'm not exactly sure what the logic of the program should be in places since I don't know what the data means. It seems a fairly complex task - keep up the good work and don't give up!
Don't be afraid to try stuff out in a test environment - make some small test files and test each piece as you go along - don't try to write it all in one go.

Post again if you have any more questions.
 
Old 11-07-2005, 11:24 AM
qnc
Thanks for all your help!

Quote:
The query to check for duplicates has odd syntax - what is this query meant to do?
The query is meant to check if the info in an uploaded file is already in the database.

If it is, we may want to overwrite that information or we may want to add it as another separate record.

How do I do this?
Quote:
Use mysql_result, or even better mysql_num_rows if you just want to know if there were duplicates or not.
Also you use
PHP Code:
$pedlines as $plnum => $pedline 
Is this "=>" symbol for PHP 5?

What do $plnum and $lnum mean?

Will have some more code tomorrow.
 
Old 11-07-2005, 02:58 PM
0beron
I explain mysql_result above in the case of $fam_ui_result. Basically it picks out a single value from a mysql query. Imagine a query such as $result = mysql_query("SELECT * FROM tablename"), that would return the entire table. It would look like this:

Code:
     'field' 'anotherfield' 'somethingelse'
row0  3         5            14
   1  6         2            8
   2  7         9            12
Here you can see the difficulty - $result is not a number or a string, but an entire table. To get one cell from the table you can use mysql_result in the form $cellcontents = mysql_result($result, <row number>, fieldname); So in this case mysql_result($result,0,'anotherfield') would give back 5.
In the case where your query only gives one value back, it is still a whole table, it's just a table with one cell in it. You still have to use mysql_result to get at it, you can't use $result directly.

My comment about mysql_num_rows is for the duplicates query. If you make yourself a query that finds any duplicates, you can say mysql_num_rows($result) to find out whether there were any rows that matched the query. You don't need to know what they are, just that there were some duplicates.

About the query that does the duplicate check - how is it supposed to handle the different markers? Do you want to check for any occurrence of $fam_ui_number that appears with one of the markers, or with ANY of the markers?

The syntax with "=>" is another form of foreach - it works under PHP 4 and 5 just fine. It allows you access to the array keys as well as the values:

PHP Code:
foreach ($yourarray as $key => $value) {
  echo $key." - ";
  echo $value."<br />";
}
The above code will print out each key in the array and its corresponding value.
I used the names $plnum and $lnum since the array contains your files, so the keys will be the line numbers (although they will be one different cos arrays start from 0). You don't have to make use of $lnum and $plnum inside the loop, but it could come in useful.
 
Old 11-08-2005, 08:21 AM
qnc
Excellent, cheers.

With regard to
Quote:
About the query that does the duplicate check - how is it supposed to handle the different markers? Do you want to check for any occurrence of $fam_ui_number that appears with one of the markers, or with ANY of the markers?
I guess it should check whether a person ($fam_ui_number) and a marker IN the file that is being uploaded have been entered before.

Also I am having difficulty with the marker loop as each person will have several markers and each marker two alleles.

person marker Allele 1 allele 2
36 D22S420 3 5
36 D12S112 4 10

etc...

the dat file tells me the marker name
the ped file allows me to retrieve the person ui number from the database and also has the allele info.

I just remembered: if it is a new family it needs to be entered into the fam_data table, i.e. if upon searching it does not find the family number and the person number then it needs to enter that first, then cycle through the ped file getting the info!!

Any ideas on this?

The other major question that is troubling me is: do I do this line by line (i.e. for each ped line, strip out the markers from dat and query the database), or do I generate a batch result and query all at once?
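
One possible way to structure the marker loop, sketched under some assumptions: the marker names have already been pulled from the .dat file in order, the .ped line has been split on spaces, and the alleles start at position 6 (after the six ID columns). buildAlleleRows() and the array keys are made-up names that mirror the raw allele table columns described in the first post.

```php
<?php
// buildAlleleRows() is a hypothetical helper; the keys mirror the raw allele
// table columns from the first post (fam_ui, d_no, allele1, allele2, ori files).
function buildAlleleRows(int $famUi, array $markers, array $pedItems, string $oriFile): array
{
    $rows = [];
    foreach ($markers as $i => $marker) {
        // Marker number $i owns the pair at positions 6 + 2*$i and 7 + 2*$i.
        $first = 6 + 2 * $i;
        if (!isset($pedItems[$first + 1])) {
            break; // the ped line has fewer allele pairs than the dat file has markers
        }
        $rows[] = [
            'fam_ui'   => $famUi,
            'd_no'     => $marker,
            'allele1'  => $pedItems[$first],
            'allele2'  => $pedItems[$first + 1],
            'ori_file' => $oriFile,
        ];
    }
    return $rows;
}

$markers  = ['D20S906', 'D22S280', 'D22S423', 'D22S274'];
$pedItems = explode(' ', 'ADRP134 0247 0227 0228 2 2 1 2 2 3 1 10 4 5');
$rows = buildAlleleRows(36, $markers, $pedItems, 'qnc200505-a1-adrp134-0247-09.fas');
foreach ($rows as $row) {
    echo implode(', ', $row), "\n";
}
```

Working line by line like this keeps each person's rows together, so the database lookup for fam_ui happens once per ped line rather than once per marker.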

Last edited by qnc; 11-08-2005 at 08:39 AM..
 
Old 11-08-2005, 11:42 AM
qnc
OK, starting to do live testing.

Getting the following error messages:

Warning: Invalid argument supplied for foreach() in /home/qncqnc/public_html/testbed/firstphp/science/upload_edit_insert_page.php on line 157

Warning: Invalid argument supplied for foreach() in /home/qncqnc/public_html/testbed/firstphp/science/upload_edit_insert_page.php on line 176

Failed to execute SELECT fam_ui FROM fam_data WHERE fam_i = and in_i = due to You have an error in your SQL syntax. Check the manual that corresponds to your MySQL server version for the right syntax to use near 'and in_i =' at line 1
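
"Invalid argument supplied for foreach()" means the variable being looped over is not an array. One likely cause here (an assumption, since the script itself is not shown) is that $datlines or $pedlines was never filled, for example because file() could not read the path. The empty values in the failed SELECT would then follow from the same problem. A minimal sketch of a guarded read is below; readLines() is a made-up helper name and the path in the usage lines is only an example.

```php
<?php
// readLines() is a hypothetical helper: it always returns an array,
// so a foreach over the result cannot raise the "Invalid argument" warning.
function readLines(string $path): array
{
    if (!is_readable($path)) {
        return [];
    }
    // Drop the trailing newline on each line and skip blank lines.
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    return $lines === false ? [] : $lines;
}

$pedlines = readLines('destinationfolder/example.ped'); // example path only
if (count($pedlines) === 0) {
    echo "No ped lines to process, check the upload step.\n";
}
```

Stripping the newline also avoids a stray line break ending up attached to the last allele on each line.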