
Reading site update feeds (RSS 2.0), PHP edition: Sina and Yahoo News

2024-05-04 22:58:58
Source: reposted
Contributed by: site user

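[Preface]
When building a personal website, you often need to pull large amounts of dynamic information from other sites.
This article describes how to use a PHP program to read an XML file in the RSS format and dynamically display a list of items from someone else's site.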

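[Demo]

Yahoo News: Perl, PHP (Perl/PHP XML::RSS examples that read Yahoo News, in English)
My CSDN blog: Perl, PHP (Perl/PHP XML::RSS examples that read a personal CSDN blog)
jlinux: Perl, PHP (Perl/PHP XML::RSS examples that read jlinux)
Sina News: general (Perl, PHP), sports (Perl, PHP), entertainment (Perl, PHP)
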

[Prerequisites]
For PHP enthusiasts the preparation is fairly simple: an environment running PHP 4 or later is enough to build this feature.
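
A quick way to check a host before trying the code in this article is a small script like the one below. It is only a sketch: the variable names are made up for illustration, and `filter_var()` assumes PHP 5.2 or later, which is newer than the PHP 4 minimum mentioned above. It reports whether the XML parser functions exist and whether `fopen()` may open remote URLs.

```php
<?php
// True when the XML parser function used in this article is available.
$has_xml = function_exists("xml_parse_into_struct");
// True when fopen() is allowed to open http:// URLs (the allow_url_fopen setting).
$can_open_urls = filter_var(ini_get("allow_url_fopen"), FILTER_VALIDATE_BOOLEAN);

echo "PHP version: ", PHP_VERSION, "\n";
echo "XML parser available: ", $has_xml ? "yes" : "no", "\n";
echo "allow_url_fopen enabled: ", $can_open_urls ? "yes" : "no", "\n";
```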


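[Format of the corresponding XML/RSS file]
Most sites that offer an RSS feed publish a file in the format below, which is well-formed XML as defined by the W3C.
A brief analysis of the basic tree structure:
The rss root contains one channel node.
  Under the channel node, the title, link, and description child elements are the commonly used ones.
  Next come the item nodes; each item is one of the most recently updated articles.
    Under each item node, the title, link, pubDate, and description child elements are the commonly used ones.
The simple format is as follows: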


<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/commentapi/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
<title>Title of this site's channel</title>
<link>Link address</link>
<description>Description of the site channel</description>
<item>
<title>Article 1</title>
<link>Link address of article 1</link>
<description>Summary of article 1</description>
</item>
<item>
<title>Article 2</title>
<link>Link address of article 2</link>
<description>Summary of article 2</description>
</item>
</channel>
</rss>
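
To make the tree structure concrete, the sketch below feeds a tiny made-up feed (the `example.com` URLs and titles are placeholders, not real data) to `xml_parse_into_struct()` and prints each element indented by its nesting level. Case folding is switched off here only so the tag names print as written.

```php
<?php
// A minimal feed with the same shape as the skeleton above (illustrative data only).
$xml = '<rss version="2.0"><channel>'
     . '<title>Sample channel</title>'
     . '<link>http://example.com/</link>'
     . '<description>Sample description</description>'
     . '<item><title>Article 1</title><link>http://example.com/1</link>'
     . '<description>Summary 1</description></item>'
     . '</channel></rss>';

$parser = xml_parser_create();
// Keep tag names as written instead of folding them to upper case.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $xml, $values, $index);

// Build an indented outline: one line per element, skipping close-tag and text entries.
$outline = array();
foreach ($values as $node) {
    if ($node["type"] == "close" || $node["type"] == "cdata") {
        continue;
    }
    $outline[] = str_repeat("  ", $node["level"] - 1) . $node["tag"];
}
echo implode("\n", $outline), "\n";
```

The outline shows rss at the top, channel one level down, and item as a sibling of the channel's own title, link, and description.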

 

Example:

<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/commentapi/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
  <title>邢晓宁专栏</title>
  <link>http://blog.csdn.net/thefirstwind/</link>
  <description>代码一生</description>
  <dc:language>af</dc:language>
  <generator>.text version 1.0.1.1</generator>
  <image>http://counter.csdn.net/pv.aspx?id=72</image>
  <item>
  <dc:creator>♂猜猜♂(邢晓宁)</dc:creator>
  <title>在 ms windows 下建立 docbook 的解譯環境</title>
  <link>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx</link>
  <pubDate>Thu, 21 Dec 2006 13:50:00 GMT</pubDate>
  <guid>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx</guid>
  <wfw:comment>http://blog.csdn.net/thefirstwind/comments/1451714.aspx</wfw:comment>
  <comments>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx#feedback</comments>
  <slash:comments>0</slash:comments>
  <wfw:commentrss>http://blog.csdn.net/thefirstwind/comments/commentrss/1451714.aspx</wfw:commentrss>
  <trackback:ping>http://tb.blog.csdn.net/trackback.aspx?postid=1451714</trackback:ping>
  <description>在 ms windows 下建立 docbook 的解譯環境<img src ="http://blog.csdn.net/thefirstwind/aggbug/1451714.aspx" width = "1" height = "1" /></description>
  </item>
  <item>
  <dc:creator>邢晓宁</dc:creator>
  <title>程序员学习的革命-如何使用大脑</title>
  <link>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx</link>
  <pubDate>Wed, 13 Dec 2006 09:41:00 GMT</pubDate>
  <guid>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx</guid>
  <wfw:comment>http://blog.csdn.net/thefirstwind/comments/1440965.aspx</wfw:comment>
  <comments>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx#feedback</comments>
  <slash:comments>27</slash:comments>
  <wfw:commentrss>http://blog.csdn.net/thefirstwind/comments/commentrss/1440965.aspx</wfw:commentrss>
  <trackback:ping>http://tb.blog.csdn.net/trackback.aspx?postid=1440965</trackback:ping>
  <description>很多人搞技术,还有很多转行搞技术,搞了一段时间终于发现,自己不适合作技术,要我说其实就是用脑方式的问题。真的学会适当的用脑方式,编程编起来得心应手。<img src ="http://blog.csdn.net/thefirstwind/aggbug/1440965.aspx" width = "1" height = "1" /></description>
  </item>
  </channel>
  </rss>
 

[Core program]


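<?php

$rssurl = "http://blog.csdn.net/thefirstwind/rss.aspx";

// Download the feed. Opening a URL with fopen() requires allow_url_fopen to be enabled.
$buff = "";
$fp = fopen($rssurl, "r");
if (!$fp) {
    die("Could not open $rssurl");
}
while (!feof($fp)) {
    $buff .= fgets($fp, 4096);
}
fclose($fp);

// Flatten the XML into $values: one entry per open tag, close tag, or complete element.
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $buff, $values, $idx);
xml_parser_free($parser);

$in_item = 0;
foreach ($values as $node) {
    // Tag names are folded to upper case by default, so normalise them.
    $tag  = strtolower($node["tag"]);
    $type = $node["type"];
    // Open and close entries may carry no "value" key, so default to an empty string.
    $value = isset($node["value"]) ? $node["value"] : "";

    if ($tag == "item" && $type == "open") {
        $in_item = 1;
        // Reset the fields so a missing element does not reuse the previous item's value.
        $title = $link = $description = "";
    } else if ($tag == "item" && $type == "close") {
        // The item is complete: print the fields collected for it.
        echo <<<EOM
$title
$link
$description

EOM;
        $in_item = 0;
    }
    if ($in_item) {
        switch ($tag) {
            case "title":
                $title = $value;
                break;
            case "link":
                $link = $value;
                break;
            case "description":
                $description = $value;
                break;
        }
    }
}

?>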
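
The same loop can also collect the items into an array instead of printing them straight away, which makes the logic easier to try without a network connection. The sketch below is one possible arrangement, not the original author's code: `rss_items()` is a hypothetical helper name, and the sample feed uses placeholder `example.com` data.

```php
<?php
// Hypothetical helper (not part of the original article): returns one array per <item>.
function rss_items($xml)
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, $xml, $values, $index);

    $items = array();
    $current = null; // null while we are outside an <item>
    foreach ($values as $node) {
        $tag = strtolower($node["tag"]);
        if ($tag == "item" && $node["type"] == "open") {
            $current = array("title" => "", "link" => "", "description" => "");
        } elseif ($tag == "item" && $node["type"] == "close") {
            $items[] = $current;
            $current = null;
        } elseif ($current !== null && isset($current[$tag]) && $node["type"] == "complete") {
            // Only the three fields initialised above are collected.
            $current[$tag] = isset($node["value"]) ? $node["value"] : "";
        }
    }
    return $items;
}

// Inline sample feed so the example runs offline.
$sample = '<rss version="2.0"><channel><title>Sample channel</title>'
        . '<item><title>Article 1</title><link>http://example.com/1</link>'
        . '<description>Summary 1</description></item>'
        . '<item><title>Article 2</title><link>http://example.com/2</link>'
        . '<description>Summary 2</description></item>'
        . '</channel></rss>';

$items = rss_items($sample);
foreach ($items as $item) {
    echo $item["title"], " <", $item["link"], ">\n";
}
```

Because the channel's own title sits outside any item, it is ignored here; only elements seen between an item's open and close entries are stored.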


 

[Complete source code, following the explanation above]
The version below adds CSS styling.


<?php

#$rssurl = "http://www3.asahi.com/rss/index.rdf";
#$rssurl = "http://rss.news.yahoo.com/rss/topstories";
$rssurl = "http://blog.csdn.net/thefirstwind/rss.aspx";
#$rssurl = "http://jlinux.ddo.jp/bbs/rss.php?auth=0";
#$rssurl = "http://rss.sina.com.cn/news/marquee/ddt.xml";
#$rssurl = "http://rss.sina.com.cn/news/allnews/sports.xml";
#$rssurl = "http://rss.sina.com.cn/news/allnews/ent.xml";

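$buff = "";
// Opening a URL with fopen() requires allow_url_fopen to be enabled.
$fp = fopen($rssurl,"r");
if (!$fp) {
    die("Could not open $rssurl");
}
while ( !feof($fp) ) {
    $buff .= fgets($fp,4096);
}
fclose($fp);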

$parser = xml_parser_create();
xml_parser_set_option($parser,XML_OPTION_SKIP_WHITE,1);
xml_parse_into_struct($parser,$buff,$values,$idx);
xml_parser_free($parser);
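// Pick up the channel-level title and lastBuildDate, which appear before the first item.
$channel_title = "";
$channel_lastbuilddate = "";
foreach ($values as $node) {
    $name = strtolower($node["tag"]);
    if ($name == "item") {
        break;
    }
    $text = isset($node["value"]) ? $node["value"] : "";
    if ($name == "title" && $channel_title == "") {
        $channel_title = $text;
    } else if ($name == "lastbuilddate") {
        $channel_lastbuilddate = $text;
    }
}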
echo <<<__html__
<html>
<head>
<meta http-equiv='content-type' content='text/html; charset=utf-8'>
<title>$channel_title</title>
<link rel='stylesheet' type='text/css' id='css' href='/bbs/forumdata/cache/style_1.css'>
<script type='text/javascript' src='/bbs/include/common.js'></script>
<script type='text/javascript' src='/bbs/include/menu.js'></script>
</head>
<body>

<table border='1'>
<tr><td>
<img src='http://www.pushad.com/xrssfile/2007-1/30/2007130142039121.gif'>&nbsp;&nbsp;
<!--
<img src='http://www.pushad.com/xrssfile/2007-1/30/2007130142039669.gif'>&nbsp;&nbsp;
<img src='http://jlinux.ddo.jp/bbs/images/default/logo.gif'>&nbsp;&nbsp;
<img src='http://www.pushad.com/xrssfile/2007-1/30/2007130142039970.gif'>&nbsp;&nbsp;
//-->
</td>
<td>
$channel_title
$channel_lastbuilddate<br>
</td>
</tr>
__html__;

$in_item = 0;
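foreach ($values as $node) {
    $tag  = $node["tag"];
    $type = $node["type"];
    // Open and close entries may carry no "value" key, so default to an empty string.
    $value = isset($node["value"]) ? $node["value"] : "";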

    $tag = strtolower($tag);
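    if ($tag == "item" && $type == "open") {
        $in_item = 1;
        // Reset the fields so a missing element does not reuse the previous item's value.
        $title = $link = $pubdate = $description = "";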
    } else if ($tag == "item" && $type == "close") {
        echo <<<eom
<tr>
  <td colspan='2' class='header' width='400'>
    <a href="$link">$title</a>
  </td>
</tr>
<tr>
  <td colspan='2' width='400' align='right'>
    $pubdate
  </td>
</tr>
<tr>
  <td colspan='2' width='400'>
    $description
  </td>
</tr>
<tr>
  <td>
    &nbsp;
  </td>
</tr>
eom;
        $in_item = 0;
    }
    if ($in_item) {
        switch ($tag) {
            case "title":
                $title = $value;
                break;
            case "link":
                $link = $value;
                break;
            case "pubdate":
                $pubdate = $value;
                break;
            case "description":
                $description = $value;
                break;
        }
    }
}

echo <<< __htmlend__
</table>
</body>
</html>
__htmlend__;

?>
                       
