Forum

Thread tagged as: Question, Problem, Suggestions

Parsing XML from RSS Page

Hi all,

I've just set up an RSS page on a website I'm working on and I'd like to know how to parse the XML in this feed, so I can output the same content on a different page.

I'll run through what I'm trying to do. I've set up an RSS page which feeds its content from programme.php, a page with a big list of events, times, dates etc.

The RSS page seems to be successfully grabbing the content from programme.php in a sort of XML format. I say "sort of" because it isn't the kind of pure XML document I'm accustomed to working with, e.g.:

<?xml version="1.0" encoding="UTF-8"?>
<item>
  <example>Test</example>
</item>

Instead I'm getting a page which includes its own html and body tags, see picture below:

[Image: RSS Page Structure]

I'm not really sure how to work around this. My next plan of action was to use SimpleXML to retrieve the data and output the parts of the XML that I need; specifically, I need the contents sitting inside the <item> nodes. However, it seems that because of the page's structure, or because it isn't a true XML document, SimpleXML just isn't working on this page.

Not sure how well I've explained myself, but would appreciate any assistance here.

Thanks! :)

Natalia Robba

Natalia Robba 0 points

  • 3 years ago
Drew McLellan 2638 points
Perch Support

Did you create the XML document?

Drew McLellan said:

Did you create the XML document?

Hi Drew, thanks for the reply. No, I didn't create the XML document myself as such. I'm trying to retrieve the contents of an existing PHP page (content added via Perch) and use it to populate an RSS feed, which I can then parse with PHP. The reason is that I'd like to create an animated slider on the homepage which just outputs specific content from that original PHP page.

I followed the steps here - https://docs.grabaperch.com/perch/content/functions/how-do-i-add-an-rss-feed/

The resulting RSS feed has been created, and it does contain all of the content I need; however, it also contains body and html tags (shown in the image in my post above). I thought it would create an XML page, but I can't parse it that way =( This is even though the xml tag is in there (taken from the tutorial page linked above):

<?php echo '<'.'?xml version="1.0"?'.'>'; ?>
    <rss version="2.0">

...it doesn't seem to be creating an XML document but rather a web page with tags inside.

Any suggestions? Have I done something wrong?

Thanks for your time

Drew McLellan 2638 points
Perch Support

Ok, so where are the body tags coming from?

Drew McLellan said:

Ok, so where are the body tags coming from?

That's what I don't understand :/

Here is the rss page in question :

<?php include('perch/runtime.php'); ?>
    <?php echo '<'.'?xml version="1.0"?'.'>'; ?>
    <channel>
        <?php
            $opts = array(
                    'page'=>'/programme.php',
                    'template'=>'_programme_rss.html'
                );
            perch_content_custom('DailyProgrammes', $opts);
        ?>
    </channel>

The programme.php is a normal php page which has perch content areas. Here is the excerpt for the template used above _programme_rss.html with the info I want to grab:


<item>
    <programmeDay><perch:content id="programmeday" type="text" label="Programme Day" required="true" title="true" /></programmeDay>
    <programmeTheme><perch:content id="programmetheme" type="text" label="Programme Theme" required="true" /></programmeTheme>
    <dayEvents>
        <perch:repeater id="newEntry" label="New Programme Entry" max="100">
            <programmeTime><perch:content id="programmetime" type="text" label="Programme Time" required="true" title="true" /></programmeTime>
            <programmeTopic><perch:content id="topic" type="text" label="Programme Topic" required="true" title="true" /></programmeTopic>
            <programmeRoom><perch:content id="room" type="text" label="Programme Room" required="true" title="true" /></programmeRoom>
        </perch:repeater>
    </dayEvents>
    <description> </description>
</item>

And the resulting rss page includes a body tag :/

Drew McLellan 2638 points
Perch Support

Have you tried declaring the page as XML?

header('Content-Type: application/rss+xml; charset=utf-8');

So your page might look like

<?php 
include('perch/runtime.php');
header('Content-Type: application/rss+xml; charset=utf-8');
echo '<'.'?xml version="1.0"?'.'>'; 

echo '<channel>';

$opts = array( 
    'page'=>'/programme.php', 
    'template'=>'_programme_rss.html' 
); 

perch_content_custom('DailyProgrammes', $opts); 

echo '</channel>';
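
A likely reason the declaration matters (not confirmed in this thread): when a response is treated as HTML, an HTML parser wraps unknown markup in implied html and body elements, which is what an inspector then shows. The sketch below uses PHP's DOMDocument as a stand-in for the browser, and the sample markup is illustrative rather than the real feed output.

```php
<?php
// Illustrative feed fragment; not the real output of the page in this thread.
$markup = '<channel><item><example>Test</example></item></channel>';

// Collect parser warnings (e.g. about unknown tags) instead of printing them.
libxml_use_internal_errors(true);

// Parsed as HTML: the parser supplies implied html and body wrappers.
$asHtml = new DOMDocument();
$asHtml->loadHTML($markup);
$htmlBodyCount = $asHtml->getElementsByTagName('body')->length;

// Parsed as XML: the document root is the channel element, with nothing added.
$asXml = new DOMDocument();
$asXml->loadXML($markup);
$xmlBodyCount = $asXml->getElementsByTagName('body')->length;
$xmlRoot = $asXml->documentElement->nodeName;

libxml_clear_errors();

echo "body elements when parsed as HTML: $htmlBodyCount\n";
echo "body elements when parsed as XML: $xmlBodyCount\n";
echo "XML root element: $xmlRoot\n";
```

Sending an XML Content-Type header before any output is what tells the client to take the second path.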

Drew McLellan said:

Have you tried declaring the page as XML?


I'll give this a go and get back to you thanks!

It's looking better! But it still doesn't seem to be an XML doc. At least it's now showing some structure, but the pesky tags are still there (here's an inspect element screenshot):

[Image: XML Screenshot]

Got it!!

Thanks for your suggestion Drew. I tried this instead:


<?php
include('perch/runtime.php');
header("content-type: text/xml");
echo '<'.'?xml version="1.0"?'.'>';

echo '<channel>';

$opts = array(
    'page'=>'/programme.php',
    'template'=>'_programme_rss.html'
);

perch_content_custom('DailyProgrammes', $opts);

echo '</channel>';

and that sorted it out, now displaying properly as XML.

Thanks for the tip!
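
For the original goal of pulling out the contents of the <item> nodes with SimpleXML, here is a minimal sketch. It assumes the feed has the shape produced by _programme_rss.html above, where the repeater emits programmeTime, programmeTopic and programmeRoom as flat siblings inside dayEvents. The function name parse_programme_feed and all sample values are hypothetical.

```php
<?php
// Hypothetical sample shaped like the output of _programme_rss.html.
// All values are made up for illustration.
$sample = <<<'XML'
<channel>
  <item>
    <programmeDay>Monday</programmeDay>
    <programmeTheme>Opening</programmeTheme>
    <dayEvents>
      <programmeTime>09:00</programmeTime>
      <programmeTopic>Welcome</programmeTopic>
      <programmeRoom>Room A</programmeRoom>
      <programmeTime>10:00</programmeTime>
      <programmeTopic>Keynote</programmeTopic>
      <programmeRoom>Main Hall</programmeRoom>
    </dayEvents>
    <description> </description>
  </item>
</channel>
XML;

/**
 * Parse the feed XML into a plain array of days, each with its events.
 * Returns an empty array if the XML cannot be parsed.
 */
function parse_programme_feed(string $xml): array
{
    // Keep libxml parse errors out of the page output.
    libxml_use_internal_errors(true);
    $channel = simplexml_load_string($xml);
    if ($channel === false) {
        libxml_clear_errors();
        return [];
    }

    $days = [];
    foreach ($channel->item as $item) {
        // The repeater fields are flat siblings, so collect each kind
        // separately and pair them up by position.
        $times  = $item->xpath('dayEvents/programmeTime') ?: [];
        $topics = $item->xpath('dayEvents/programmeTopic') ?: [];
        $rooms  = $item->xpath('dayEvents/programmeRoom') ?: [];

        $events = array_map(
            function ($time, $topic, $room) {
                return [
                    'time'  => trim((string) $time),
                    'topic' => trim((string) $topic),
                    'room'  => trim((string) $room),
                ];
            },
            $times,
            $topics,
            $rooms
        );

        $days[] = [
            'day'    => trim((string) $item->programmeDay),
            'theme'  => trim((string) $item->programmeTheme),
            'events' => $events,
        ];
    }

    return $days;
}

$days = parse_programme_feed($sample);
foreach ($days as $day) {
    echo $day['day'], ': ', $day['theme'], "\n";
    foreach ($day['events'] as $event) {
        echo '  ', $event['time'], ' ', $event['topic'], ' (', $event['room'], ")\n";
    }
}
```

Against the live feed, the string could instead come from simplexml_load_file() pointed at the feed's URL, provided the page is served as XML as described above.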