How to generate HTML from an RSS feed
02-06-2008, 07:01 AM
Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Howdy peoples,
DansGalaxy recently asked me how to transform an RSS feed into simple HTML.
I've written a how-to explaining it, but it's a bit too complicated and long to post here.
So, if anyone needs something similar in PHP, take a look at my blog (a link should be below my avatar).
I'll expand it later to do the same in .NET (C#), but I need to do some research first.
Tripy.
__________________
Only a biker knows why a dog sticks his head out the window.
02-06-2008, 07:12 AM
Posts: 1,226
Name: Mike
Location: Mataro, Spain
What complexity are you talking about? Fetch the file, convert it into an object data structure, and echo the items in a loop.
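A minimal sketch of that fetch, parse and loop approach, assuming PHP 7+ with the SimpleXML extension. The sample feed, its URLs and the rssToList() name are made up for illustration; a real feed would be fetched first, for instance with file_get_contents().

```php
<?php
// Hypothetical sample feed, inlined so the sketch runs without network access.
$rss = <<<'XML'
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Demo feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
      <description>Hello &amp; welcome</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
      <description>More news</description>
    </item>
  </channel>
</rss>
XML;

/**
 * Hypothetical helper: turn RSS 2.0 markup into an HTML list.
 */
function rssToList(string $rssXml): string
{
    // Convert the XML into an object tree.
    $feed = simplexml_load_string($rssXml);
    if ($feed === false) {
        return '';
    }
    $html = "<ul>\n";
    // Loop over the items and escape each value for HTML output.
    foreach ($feed->channel->item as $item) {
        $html .= sprintf(
            "  <li><a href=\"%s\">%s</a>: %s</li>\n",
            htmlspecialchars((string) $item->link),
            htmlspecialchars((string) $item->title),
            htmlspecialchars((string) $item->description)
        );
    }
    return $html . "</ul>\n";
}

echo rssToList($rss);
```

The values are escaped again on output because the XML parser hands back decoded text (the "&amp;" in the feed arrives as "&").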
02-06-2008, 07:56 AM
Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
No: create an XSL style sheet, apply it to the RSS feed, and use PHP's XSLT engine to get the HTML back in a PHP variable, ready to integrate into a web page.
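As a rough sketch of that idea, assuming PHP 7+ with the DOM and XSL extensions enabled. The style sheet, the sample feed and the rssToHtml() name are invented for illustration and are not the ones from the blog how-to; both documents are inlined as strings so nothing has to be fetched.

```php
<?php
// Hypothetical style sheet: renders every <item> as a linked list entry.
$xslSource = <<<'XSL'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="no"/>
  <xsl:template match="/">
    <ul>
      <xsl:for-each select="rss/channel/item">
        <li><a href="{link}"><xsl:value-of select="title"/></a></li>
      </xsl:for-each>
    </ul>
  </xsl:template>
</xsl:stylesheet>
XSL;

// Hypothetical sample feed.
$rssSource = <<<'XML'
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>
XML;

/**
 * Hypothetical helper: apply an XSL style sheet to a feed, return the HTML.
 */
function rssToHtml(string $rssXml, string $xslXml): string
{
    // Load the style sheet and the feed into separate DOM documents.
    $style = new DOMDocument();
    $style->loadXML($xslXml);
    $feed = new DOMDocument();
    $feed->loadXML($rssXml);

    // Hand the style sheet to the XSLT engine and run the transformation.
    $xslt = new XSLTProcessor();
    $xslt->importStylesheet($style);
    $html = $xslt->transformToXml($feed);

    // transformToXml() does not return a string when the transformation fails.
    return is_string($html) ? $html : '';
}

$html = rssToHtml($rssSource, $xslSource);
echo $html;
```

The result ends up in $html, so it can be dropped into any page template instead of being echoed directly.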
02-06-2008, 02:18 PM
Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Quote:
One thing, this seems to convert the whole thread to HTML. How can I use it to keep it as XML but parse some HTML within it?
The question is pertinent...
The problem is that HTML conflicts with the RSS feed's XML structure.
There are two roads you can follow.
The first is to put your RSS item description into a CDATA section.
The parser does not look for markup inside a CDATA section: everything in it is read as plain character data, so HTML tags placed there cannot break the feed's structure.
The syntax would be:
Code:
<description><![CDATA[
It's that time again, CALM's Annual Golf day is soon approaching.
<ul>
<li>lll</li>
<li>aaa</li>
</ul>
More details of how to enter your teams coming soon!
]]></description>
I personally don't recommend going the CDATA road: once you start adding some, you tend to clutter your code by adding more.
The other solution, which I recommend, is to escape the HTML correctly inside your XML. Since you use PHP to generate the XML feed, you can do that with the htmlentities() function [ http://www.php.net/manual/en/function.htmlentities.php ]
It will output your HTML with &lt; rather than < and &gt; rather than >, so it will no longer collide with the XML syntax.
Code:
<description>
It's that time again, CALM's Annual Golf day is soon approaching.
&lt;ul&gt;
&lt;li&gt;lll&lt;/li&gt;
&lt;li&gt;aaa&lt;/li&gt;
&lt;/ul&gt;
More details of how to enter your teams coming soon!
</description>
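On the feed-generating side, the escaping step could look like the sketch below, assuming PHP 7+ with the DOM extension. Note that it swaps in htmlspecialchars() with the ENT_XML1 flag instead of htmlentities(): htmlentities() can emit named entities such as &eacute; that XML does not predefine, which would break the feed again. The $body text and the descriptionElement() name are made up for illustration.

```php
<?php
// Hypothetical item body containing HTML, as it might come out of a database.
$body = "Golf day is soon approaching.\n<ul>\n<li>lll</li>\n<li>aaa</li>\n</ul>";

/**
 * Hypothetical helper: build a <description> element with escaped content.
 */
function descriptionElement(string $html): string
{
    // ENT_XML1 keeps the output to entities that XML itself defines.
    return '<description>'
        . htmlspecialchars($html, ENT_QUOTES | ENT_XML1, 'UTF-8')
        . '</description>';
}

$element = descriptionElement($body);
echo $element, "\n";

// Round trip: an XML parser accepts the element and hands the HTML back as text.
$doc = new DOMDocument();
$doc->loadXML($element);
var_dump($doc->documentElement->textContent === $body);
```

The round trip at the end is only there as a sanity check: the parser sees no child elements in the description, just text that happens to look like HTML.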
After that, you only need 2 more lines in the function transforming the XML into HTML (sorry to the regexp enthusiasts, but I'm not a member of that clan, I prefer to use str_replace) to replace &lt; and &gt; with their "final" representation:
PHP Code:
<?php
/**
 * Transform a specified RSS feed into HTML.
 * The XSL style sheet is specified as the second parameter. If none is specified,
 * the function will try to find a "rss.xsl" file in the current directory.
 * @param string $rssFeed The URL of the RSS feed to parse
 * @param string $xslStyleSheet The location of the XSL style sheet, either on the file system or a URL
 * @return string The HTML generated by the transformation of the XML
 */
function htmlize($rssFeed = null, $xslStyleSheet = "rss.xsl") {
    $html = "";
    if ($rssFeed === null) {
        return null;
    }
    if (!file_exists($xslStyleSheet)) {
        $msg = "The XSL style sheet $xslStyleSheet was not found in " . dirname(__FILE__);
        error_log($msg);
        // We don't show the error message by default, for security
        die();
    }
    // We get the stream content into a variable.
    $rssContent = file_get_contents($rssFeed);
    // We initialize the XSLT engine
    $xsl = new XSLTProcessor();
    // And create a DOM document element.
    $doc = new DOMDocument();
    // Load the XSL style sheet into the DOM document
    $doc->load($xslStyleSheet);
    // And tell the XSLT engine to use this DOM representation of the file.
    $xsl->importStylesheet($doc);
    // We import the XML into the DOM parser
    $doc->loadXML($rssContent);
    // And we transform it via the XSL engine
    $html = $xsl->transformToXml($doc);
    // Turn the escaped tags back into real HTML tags
    $html = str_replace("&gt;", '>', $html);
    $html = str_replace("&lt;", '<', $html);
    return $html;
}

echo htmlize("http://193.58.255.251/trf/rss.xml");
?>
And so, it will replace the HTML entities with their normal representation.
Example there: http://193.58.255.251/trf/demo.php
Last edited by tripy; 02-06-2008 at 02:20 PM..
02-06-2008, 03:09 PM
Posts: 6,521
Name: Dan
Location: Swindon
Isn't that demo still converting the whole feed to HTML? I'm confused :s
I don't really get what the above code does.
All I can understand is using htmlentities and then changing them back... what's the point? Sorry, bit confuddled here.
Also, what's the problem with the CDATA option? Couldn't I just have something like
PHP Code:
<description><![CDATA[ <?php echo $row['fullpage']; ?> ]]></description>
and that will be it? Or am I missing something? :s
Sorry, still quite dodgy with XML syntax etc.
Thanks again though, definitely TPing you!
__________________
Discounted Web Hosting With XDnet! >> Get 25% of hosting~ Promo: Webmaster-talk <<
02-06-2008, 03:56 PM
Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Quote:
All I can understand is using htmlentities and then changing them back... what's the point? Sorry, bit confuddled here.
The point is that you must have a well-formed XML file, and if you have an unclosed "<br>" inside its description field, then it won't be well-formed anymore.
Hence the "entitisation" (do you get me?) of the <description> element, to avoid that.
After that, the "de-entitisation" is there only to send correct HTML code to the browser, rather than escaped HTML tags.
Quote:
Also, what's the problem with the CDATA option?
It is to be tested...
As the content of the CDATA section is not parsed as markup, it may output OK, but maybe not.
Maybe you will get the "<![CDATA[" in the HTML too.
As I stated, I consider CDATA sections as nasty as using a $_GET value in a SQL query without sanitizing it first.
As much as I can, I recommend you stay away from them, and trust me: if you think twice about what you are doing, you won't need them.
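One way to run that test, as a minimal sketch assuming PHP 7+ with the DOM extension. It only checks what the XML parser hands over, not what a given XSL style sheet then does with it. The cdataWrap() helper and the $body text are hypothetical; the helper also covers a classic CDATA pitfall, content that itself contains "]]>".

```php
<?php
/**
 * Hypothetical helper: wrap text in a CDATA section, splitting any "]]>"
 * so the content cannot close the section early.
 */
function cdataWrap(string $text): string
{
    return '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $text) . ']]>';
}

// Hypothetical description body with HTML and a tricky "]]>" sequence.
$body = 'Golf day <ul><li>lll</li></ul> and a tricky ]]> sequence';
$xml  = '<description>' . cdataWrap($body) . '</description>';

$doc = new DOMDocument();
$doc->loadXML($xml);
$text = $doc->documentElement->textContent;

// Does the parser give back exactly the original text?
var_dump($text === $body);
// Do the CDATA markers leak into the parsed content?
var_dump(strpos($text, '<![CDATA[') === false);
```

If the description comes straight from a database, as in the snippet quoted above, some wrapping step of this kind is what keeps one stray "]]>" from breaking the whole feed.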
02-06-2008, 05:00 PM
Posts: 6,521
Name: Dan
Location: Swindon
Okay, but all my code is valid (to the best of my knowledge) to an XHTML 1.0 minimum standard, so I don't have unclosed tags...
But even when it's valid XHTML without unclosed tags, it still throws up the malformed XML error. Now that I've tried the CDATA it seems to be working; it's all displaying as expected (except for a YouTube video, but that doesn't matter...)
I'm still a bit confused about the whole DOM/JavaScript/XML-ish stuff...
02-06-2008, 05:08 PM
Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Quote:
but all my code is valid (to the best of my knowledge) to an XHTML 1.0 minimum standard
That's not what I was talking about.
Keep in mind that the RSS feed is not supposed to contain HTML, but raw text.
Having a literal (X)HTML element in it would theoretically break it.
Code:
<rss>
<channel>
...
<item>
<description>I <strong>love</strong> to sleep</description>
</item>
</channel>
</rss>
Because it means that your <description> element has a <strong> XML element as a child, which is not defined in the specification.
That's the reason you need to either use a CDATA section, which tells the parser to treat everything in it as plain text, or use HTML entities to avoid conflicts with the XML elements, like this:
Code:
<rss>
<channel>
...
<item>
<description>I &lt;strong&gt;love&lt;/strong&gt; to sleep</description>
</item>
</channel>
</rss>
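To make the difference between the two forms visible, here is a small sketch, assuming PHP 7+ with the DOM extension; the describe() helper is a made-up name. It parses both variants of the <description> element and reports how many child elements the parser finds and what text it reads.

```php
<?php
// The two variants discussed above: literal markup versus escaped markup.
$literal = '<description>I <strong>love</strong> to sleep</description>';
$escaped = '<description>I &lt;strong&gt;love&lt;/strong&gt; to sleep</description>';

/**
 * Hypothetical helper: report what an XML parser sees inside an element.
 */
function describe(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $root = $doc->documentElement;
    return [
        // Number of elements nested inside <description>.
        'child_elements' => $root->getElementsByTagName('*')->length,
        // The text content as the parser reads it, entities decoded.
        'text'           => $root->textContent,
    ];
}

print_r(describe($literal));
print_r(describe($escaped));
```

In the literal variant the parser finds a nested <strong> element; in the escaped variant it finds none and reads the tags as part of the text, which is what the feed format expects.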
Last edited by tripy; 02-06-2008 at 05:09 PM..
02-06-2008, 06:35 PM
Posts: 6,521
Name: Dan
Location: Swindon
But what I don't get is, if you're then changing them back, what's the point? Why would the reader parse them as HTML but know not to parse them as XML, without using CDATA?
02-06-2008, 06:55 PM
Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
Because the XSL parser is way stricter than the browser's HTML engine.
If there is one little inconsistency in either the XML or the XSL, it will die, asking you to fix it, rather than just throwing a warning.
The validity and conformity of the data to a schema is everything in XML.
If you cannot ensure this, then you cannot know what to do with the data.
It's the motto of the XML way.
So, I make them valid at a moment T, and after running them through the XSL engine, I convert them back to HTML at a moment T'.
It's all a matter of context. What is valid at one moment is not valid at the previous moment.
02-07-2008, 09:48 AM
Posts: 6,521
Name: Dan
Location: Swindon
Oh, so you kind of tell it to parse the XML with the htmlentities in place, and then once it has parsed the XML you convert it back and tell it to parse the XHTML?
02-07-2008, 10:23 AM
Posts: 3,621
Name: Thierry
Location: I'm the uber Spaminator !
That's exactly it.