Tycoon Talk
PHP Forum
Converting HTML to EXCEL (XLS)
Old 06-23-2006, 11:29 PM Converting HTML to EXCEL (XLS)
Webmaster Talker

Posts: 626
Trades: 0
Can anyone help me convert my HTML page into xls format (excel)?

Excel opens my HTML page great, but I need to actually convert the file to XLS format.

Any help is appreciated... If no one knows how to do this, can you point me to scripts and/or tutorials?
jim.thornton is offline
Old 06-24-2006, 05:19 AM Re: Converting HTML to EXCEL (XLS)
chrishirst's Avatar
Missing! presumed drunk.

Posts: 42,385
Name: Chris Hirst
Location: Blackpool. UK
Trades: 0
File -> Save As -> select XLS as File Type ??


apart from that, Why??
__________________
Chris.

A foolish consistency is the hobgoblin of little minds
Thought for today:- Is SEO the only industry where all the cowboys are Indians?
chrishirst is offline
Old 06-24-2006, 11:07 PM Re: Converting HTML to EXCEL (XLS)
Webmaster Talker

Posts: 626
Trades: 0
I have tried that. However, the file is actually an HTML file that Excel simply opens as-is. I need to extract the information from the file so that I can save it in my database.

The problem is that when I go to use my script to actually extract the info, it can't read the file because it isn't a true XLS file. So I have to save the file, open it in excel, click save as XLS, then run my script to extract the information.

I need to automate this process... Here is what I want to do:

1. Download the .asp file and save on server (which contains the table for the XLS file) - I can do this no problem
2. run the script to convert to proper XLS format.
3. run the script to extract raw data from XLS file
4. save data to database

In order to do this, I need to be able to convert the HTML data from the .asp file to XLS format.

Do you know if it is possible?
jim.thornton is offline
Old 06-25-2006, 04:46 AM Re: Converting HTML to EXCEL (XLS)
chrishirst's Avatar
Missing! presumed drunk.

Posts: 42,385
Name: Chris Hirst
Location: Blackpool. UK
Trades: 0
because this is the PHP forum and the script is in ASP, should we assume that the asp page is remote ?

if so, this
Quote:
I need to be able to convert the HTML data from the .asp file to XLS format
isn't likely to be possible server-side.

I can't see an easy way to accomplish this. Although I'm working blind with this idea because I've no idea what any of this data/scripts look like or do.
But at a guess
You will need to scrape the .asp page to get the raw source, strip out the HTML code and any data you do not need, format the bits you want as a CSV file/string then open or import that into Excel.

looking at your list of steps though, you need to rethink what exactly you are doing

Why do you need steps 2 & 3 ? Is there a specific reason you use Excel ?

An order of
1. Download the .asp file and save on server (which contains the table for the XLS file) - I can do this no problem
2. run a PHP script to extract raw data from file
3. save data to database

would be easier
chrishirst is offline
Old 06-25-2006, 02:08 PM Re: Converting HTML to EXCEL (XLS)
Webmaster Talker

Posts: 626
Trades: 0
I would absolutely agree that it would be easier, however, I asked in another thread how to parse the HTML information without any luck. Basically, I don't know how to extract the HTML coding, and get the information out. So I figured there might be a script somewhere (or tutorial) which would be able to convert an html table into an excel file.

I'm sorry to make you work blind but as the table isn't mine, it is from my head office, I am allowed to view it and see it but I don't know if I am able to release it to the public.

It is a big table/spreadsheet of data, however I only need data from about a 10 cell block (5x2). It contains interest rates, and basically I just want to *scrape* out the interest rates and save them in my database. If I can get the file into CSV format even, it would be much easier.

Oh... Yes, it is remote, so the .asp page is really an HTML page.
jim.thornton is offline
Old 06-25-2006, 03:46 PM Re: Converting HTML to EXCEL (XLS)
chrishirst's Avatar
Missing! presumed drunk.

Posts: 42,385
Name: Chris Hirst
Location: Blackpool. UK
Trades: 0
without seeing the structure of the page output it simply isn't possible to say how it could be done exactly.

some pointers to conversion methods. You simply have to consider the HTML as a delimited file and write an algorithm to extract the information from the page.

so;
strip out all html tags except table row and cell related ones
strip out all \n characters
replace all </tr> with \n
replace all <tr ...> with nothing
replace all </td> with "," (comma)
replace all <td ...> with nothing

Get the idea? You simply reduce the HTML source code to a series of lines, each with the columns separated by commas.
You can then parse this into an array and extract the elements that you need.
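
The steps above can be sketched in PHP roughly as follows. This is only an illustration under assumptions: the sample markup, the function name `html_table_to_rows`, and the extra first step that isolates the table are invented here, since the real page was never posted. It also assumes no cell value contains a comma.

```php
<?php
// Hypothetical sample; the real page from the thread was never posted.
$html = "<html><body><h1>Rates</h1>\n<table border=\"1\">\n"
      . "<tr class=\"odd\"><td>1 year</td><td align=\"right\">4.25</td></tr>\n"
      . "<tr><td>5 year</td><td>5.10</td></tr>\n"
      . "</table></body></html>";

// Reduce an HTML table to an array of rows, following the listed steps.
function html_table_to_rows($html)
{
    // 0. (Added assumption) keep only the first table, so stray page
    //    text such as headings is not mixed into the first row.
    if (preg_match('#<table\b.*?</table>#is', $html, $m)) {
        $html = $m[0];
    }
    // 1. Strip every tag except the row and cell tags.
    $text = strip_tags($html, '<tr><td>');
    // 2. Strip existing line breaks.
    $text = str_replace(array("\r", "\n"), '', $text);
    // 3. Each closing row tag becomes the line break.
    $text = preg_replace('#</tr\s*>#i', "\n", $text);
    // 4. Each closing cell tag becomes the column separator.
    $text = preg_replace('#</td\s*>#i', ',', $text);
    // 5. Opening row and cell tags (with any attributes) disappear.
    $text = preg_replace('#<t[rd]\b[^>]*>#i', '', $text);

    $rows = array();
    foreach (explode("\n", $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue;
        }
        // Drop the single comma left behind by the last cell of the row.
        $line = preg_replace('/,$/', '', $line);
        $rows[] = array_map('trim', explode(',', $line));
    }
    return $rows;
}

$rows = html_table_to_rows($html);
echo $rows[1][1], "\n"; // second row, second column
```

Once the rows are in an array, picking out a small block of cells is just a matter of indexing, for example `$rows[1][1]` for the second row's second column.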
chrishirst is offline
Old 06-25-2006, 10:51 PM Re: Converting HTML to EXCEL (XLS)
ChancesAre's Avatar
Skilled Talker

Posts: 84
Trades: 0
If you mean parsing the content of the HTML document, then you would need to write a parser script for these HTML files. You can save the extracted data as tab-delimited text, which opens easily in Excel.
ChancesAre is offline
Old 06-26-2006, 12:34 AM Re: Converting HTML to EXCEL (XLS)
Webmaster Talker

Posts: 626
Trades: 0
Thanks guys... Is there any way you can give me a sample HTML extraction algorithm? Just imagine 2 columns and 6 rows with numbers in every row. I don't really know how to search out all <table> and <tr> and <td> tags.
jim.thornton is offline
Old 06-26-2006, 12:46 AM Re: Converting HTML to EXCEL (XLS)
ChancesAre's Avatar
Skilled Talker

Posts: 84
Trades: 0
What you should do is look for unique patterns, because it is hard to extract anything if you only look for <table and <td. Inside these tags there will be unique patterns. Check every line for these patterns; if you find one, parse the line and extract the data. When a record is complete, write it to a tab-delimited file, then continue extracting the remaining records.
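
A minimal sketch of that line-by-line idea, under assumptions: the `class="term"` and `class="rate"` attributes are invented markers standing in for whatever unique pattern the real page contains, and the sample lines are made up.

```php
<?php
// Hypothetical page lines; the class names are invented markers.
$lines = array(
    '<tr><td>Welcome to the Virtual Office</td></tr>',
    '<tr><td class="term">1 year</td><td class="rate">4.25</td></tr>',
    '<tr><td class="term">5 year</td><td class="rate">5.10</td></tr>',
);

$pattern = '#<td class="term">([^<]*)</td><td class="rate">([^<]*)</td>#i';
$records = array();
foreach ($lines as $line) {
    // Only lines carrying the unique pattern yield a record.
    if (preg_match($pattern, $line, $m)) {
        $records[] = trim($m[1]) . "\t" . trim($m[2]);
    }
}

// One tab-delimited record per line, ready for Excel or a later import.
$tsv = implode("\n", $records) . "\n";
echo $tsv;
```

Writing `$tsv` to disk with `file_put_contents()` would give a file that Excel opens as two columns.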
ChancesAre is offline
Old 06-26-2006, 03:17 AM Re: Converting HTML to EXCEL (XLS)
chrishirst's Avatar
Missing! presumed drunk.

Posts: 42,385
Name: Chris Hirst
Location: Blackpool. UK
Trades: 0
Can't be more specific without seeing the data.

but I would assume that the source is consistent each time the page is pulled and only the data changes. So you could use simple str_replace lines to remove much of the code (meta tags, open and close <body>,<html> etc).
If you want to replace everything in the <head> at once, the regular expression pattern would be
PHP Code:
<head>.*?</head>
(think that's right)

If there are script blocks
PHP Code:
<script.*?>.*?</script> 
Then you should be down to the steps I outlined earlier in the thread. I was guessing there that each item of data would be in its own cell and each grouping on the "Y" axis in rows.
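
As a rough sketch, the two patterns above could be applied with PHP's PCRE functions like this. The `#` delimiters, the `i` and `s` flags, and the sample page are additions made here for illustration; they are not part of the patterns as posted.

```php
<?php
// Shortened stand-in for a fetched page (hypothetical content).
$html = "<html>\n<head>\n<title>Virtual Office</title>\n"
      . "<script src=\"a.js\"></script>\n</head>\n"
      . "<body>\n<script>\nalert('x');\n</script>\n"
      . "<table><tr><td>4.25</td></tr></table>\n</body>\n</html>";

// "s" lets "." match newlines and "i" ignores case, so multi-line
// blocks and upper-case tags are covered.
$clean = preg_replace('#<head\b.*?</head>#is', '', $html);
$clean = preg_replace('#<script\b.*?</script>#is', '', $clean);

// Fixed strings that never change can go with plain str_replace.
$clean = str_replace(
    array('<html>', '</html>', '<body>', '</body>'),
    '',
    $clean
);

echo trim($clean), "\n";
```

Without the `s` flag the dot stops at line breaks, so a `<head>` or `<script>` block that spans several lines would be left in place.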
chrishirst is offline
Old 06-26-2006, 11:30 PM Re: Converting HTML to EXCEL (XLS)
Webmaster Talker

Posts: 626
Trades: 0
Ok...

Here is a bit of the html page I need to scrape:
Code:
<html>
<head>
	<title>Virtual Office</title>
		
		<META HTTP-EQUIV="Refresh" CONTENT="3600; URL=LogoutExpired.php"> 
			
	<link rel="STYLESHEET" type="text/css" href="includes/style/style.css">
<script src="includes/javascript.js" type="text/javascript"></script>
<script language=JavaScript>
function printWindow() {
alert("Before printing, ensure that page settings are for legal size and landscape orientation.");

   bV = parseInt(navigator.appVersion);
   if (bV >= 4) {
   		window.print();
   } else {
   		window.print();
   }
}

</script>
</head>

<body>
Here is the code including the regular expression I am testing it with:
Code:
<?php
	$line = "";
	$fr1 = fopen("c:\file.htm", "r") or die("Couldn't open file");

	while(!feof($fr1)) {
		$line .= fgets($fr1, 2048);
	}

	fclose($fr1);
	
	$line = eregi_replace('<script.*>.*<\/script>?', '<s></s>', $line);
	
	echo "<textarea>$line</textarea>";
?>
For some reason it isn't working... Basically, I am attempting to re-write <script></script> tags (and all contents) to <s></s>. There really isn't a reason for this other than I am trying to understand regular expressions. I tried putting the ? after .* but I was getting an error message.
jim.thornton is offline
Old 06-27-2006, 04:40 AM Re: Converting HTML to EXCEL (XLS)
chrishirst's Avatar
Missing! presumed drunk.

Posts: 42,385
Name: Chris Hirst
Location: Blackpool. UK
Trades: 0
the code above works ok for me.

are you testing this on a windows machine ?

if so your path to the file needs to be either using a forward slash ( c:/file.htm ) or you should "escape" the backslash ( c:\\file.htm )

otherwise what errors are you getting?
does PHP have access rights to the root of C: drive ?
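
A small illustration of the path point, using the same file name as the script above. One detail assumed here about newer PHP versions: from PHP 5.2.5 onward, `\f` inside double quotes is read as a form feed character, so the unescaped form no longer names the intended file.

```php
<?php
$single  = 'c:\file.htm';   // single quotes: the backslash is kept literally
$escaped = "c:\\file.htm";  // double quotes with the backslash escaped
$forward = 'c:/file.htm';   // forward slash, also accepted on Windows
$risky   = "c:\file.htm";   // "\f" becomes a form feed on PHP 5.2.5+

// The first two spellings produce the same string.
var_dump($single === $escaped);

// The unescaped double-quoted form is one character shorter and
// contains a form feed where the backslash and "f" should be.
var_dump(strlen($single), strlen($risky));
var_dump(strpos($risky, "\x0C") !== false);
```

Single quotes or forward slashes sidestep the issue entirely.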
chrishirst is offline
Old 06-27-2006, 01:02 PM Re: Converting HTML to EXCEL (XLS)
Webmaster Talker

Posts: 626
Trades: 0
Sorry, I guess I should clarify what is happening... First, yes, I am currently testing it on a windows machine running apache 2.0.54.

I should say that the script is running without any errors and it is stripping out the html tags and all the content within those tags. However, the regular expression isn't working as I am trying.

You will notice that within <head></head> there are 2 <script></script> tags with content. I am expecting the regular expression to EXTRACT the <script> tags and their content and replace them with <s></s> tags.

The problem is that when I run the script it strips out BOTH sets of <script> tags and their content but only replaces it with ONE set of <s></s> tags when it should replace it with 2 sets of <s></s> tags.

Therefore, I think there is a problem with my reg exp.

Any suggestions?
jim.thornton is offline
Old 06-27-2006, 01:47 PM Re: Converting HTML to EXCEL (XLS)
mgraphic's Avatar
Truth Seeker

Posts: 2,918
Name: Keith Marshall
Location: Connecticut
Trades: 0
One thing about regular expressions - They become greedy if you let them.

http://www.regular-expressions.info/
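
To make the greediness concrete, here is a hedged sketch using a trimmed stand-in for the posted `<head>`. It uses `preg_replace` rather than `eregi_replace`: the POSIX `ereg` functions have no lazy quantifier, which is the likely reason `.*?` raised an error, and they were removed in PHP 7. The `[^>]*` in the lazy pattern is a choice made here, not something from the thread.

```php
<?php
// Trimmed stand-in for the <head> posted earlier in the thread.
$html = "<head>\n"
      . "<script src=\"includes/javascript.js\" type=\"text/javascript\"></script>\n"
      . "<script language=JavaScript>\n"
      . "function printWindow() { window.print(); }\n"
      . "</script>\n"
      . "</head>";

// Greedy: ".*" runs on to the LAST </script>, so both blocks are
// swallowed by a single match.
$greedy = preg_replace(
    '#<script.*>.*</script>#is', '<s></s>', $html, -1, $greedyCount
);

// Lazy: ".*?" stops at the FIRST </script>, so each block is matched
// and replaced on its own.
$lazy = preg_replace(
    '#<script\b[^>]*>.*?</script>#is', '<s></s>', $html, -1, $lazyCount
);

echo "greedy replacements: $greedyCount\n"; // 1
echo "lazy replacements:   $lazyCount\n";   // 2
echo $lazy, "\n";
```

The greedy version reproduces the symptom described above: both script blocks vanish, but only one `<s></s>` pair appears.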
__________________

<mgraphic /> - I don't have a solution but I admire the problem.
mgraphic is offline
Old 06-30-2006, 03:04 PM Re: Converting HTML to EXCEL (XLS)
Average Talker

Posts: 22
Trades: 0
You can try this.

Use the mouse to select the content from the page you want to copy. I assume that it is in some sort of table. Copy using Ctrl+c.

Go to Excel and select Ctrl+v to paste what you've got on the HTML page into an excel spreadsheet. It will look like crap.

Don't click anything or go anywhere - just select Ctrl+c right after you did the Ctrl+v above, then move to a different tab in the Excel workbook.

Select Edit | Paste Special, and choose to paste values.

You might have to adjust the first row, but then should be good with just the content of what was on the HTML page.

Last edited by steve49589; 06-30-2006 at 03:09 PM..
steve49589 is offline
Old 06-30-2006, 04:46 PM Re: Converting HTML to EXCEL (XLS)
Junior Talker

Posts: 1
Name: Franklin
Location: NJ
Trades: 0
Sorry guys, not my area, but if I talk to my buddy he might know, so hold on.







Quote:
Originally Posted by zincoxide
Can anyone help me convert my HTML page into xls format (excel)?

Excel opens my HTML page great, but I need to actually convert the file to XLS format.

Any help is appreciated... If no one knows how to do this, can you point me to scripts and/or tutorials?
flybravo34 is offline
Old 06-30-2006, 05:14 PM Re: Converting HTML to EXCEL (XLS)
Webmaster Talker

Posts: 626
Trades: 0
Thanks for all your help guys... I took the suggestion of using a regex and stripped out all the HTML and just made it into a CSV file.

I have now created the script which automatically updates my rates in my database.

Thanks for all the help... It was great!
jim.thornton is offline
Old 12-09-2010, 07:09 AM Re: Converting HTML to EXCEL (XLS)
Junior Talker

Posts: 1
Name: Md. Tariqul Islam Drubo
Trades: 0
Just add the following magical lines at the very beginning of the PHP file that you want to export to Excel:


header("Content-type: application/octet-stream");
header("Content-Disposition: attachment; filename=" . $_GET['f'] . ".xls");
header("Pragma: no-cache");
header("Expires: 0");
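
Two caveats are worth keeping in mind with this approach. The headers only label the response as a download with an .xls name; the body is still HTML that Excel happens to open, not a true XLS file. Also, `$_GET['f']` goes into a header unchecked. A slightly more defensive sketch follows, where `safe_export_name` and `export_headers` are hypothetical helper names introduced here, and `application/vnd.ms-excel` is swapped in as the content type:

```php
<?php
// Hypothetical helper: reduce a user-supplied name to a safe file name.
function safe_export_name($name, $fallback = 'export')
{
    // Keep letters, digits, dash and underscore only.
    $clean = preg_replace('/[^A-Za-z0-9_-]/', '', (string) $name);
    return ($clean === '' ? $fallback : $clean) . '.xls';
}

// Build the header lines without sending them, so they can be inspected.
function export_headers($name)
{
    return array(
        'Content-Type: application/vnd.ms-excel',
        'Content-Disposition: attachment; filename="'
            . safe_export_name($name) . '"',
        'Pragma: no-cache',
        'Expires: 0',
    );
}

// In a real page, each line would be passed to header() before any
// output is sent:
//   foreach (export_headers($requested) as $line) { header($line); }
$requested = isset($_GET['f']) ? $_GET['f'] : 'rates';
print_r(export_headers($requested));
```

Stripping everything except a small set of characters keeps path separators, quotes, and line breaks out of the file name.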
tareqdhk is offline
Old 12-09-2010, 07:23 AM Re: Converting HTML to EXCEL (XLS)
Extreme Talker

Posts: 156
Trades: 0
Look for sites like mediaconverter; they will be able to do just what you need.
dagaul101 is offline