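[Preface]
When building a personal website, you often need to pull large amounts of frequently updated information from other sites.
This article describes how to use a PHP program to read an XML file in the RSS format and dynamically display a list of items from someone else's site.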
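[Demo]
Yahoo News : perl php : example of reading Yahoo News (English) with perl/php XML::RSS
My CSDN blog : perl php : example of reading a personal CSDN blog with perl/php XML::RSS
jlinux : perl php : example of reading jlinux with perl/php XML::RSS
Sina News : general perl php, sports perl php, entertainment perl php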
[Prerequisites]
For PHP programmers the preparation is relatively simple: an environment with PHP 4 or later is enough to build this feature.
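Beyond the PHP version, the program in this article relies on two features: the XML parser functions and the ability of fopen() to open http:// URLs. The following is a minimal sketch for checking both. The function name check_rss_prerequisites is hypothetical, and filter_var() assumes PHP 5.2 or later.

```php
<?php
// Report whether the two features used in this article are available.
// check_rss_prerequisites is a hypothetical helper name, not part of PHP.
function check_rss_prerequisites()
{
    return array(
        // xml_parser_create() is provided by PHP's XML parser extension.
        "xml_parser" => function_exists("xml_parser_create"),
        // fopen() can open http:// URLs only when allow_url_fopen is enabled.
        "url_fopen"  => filter_var(ini_get("allow_url_fopen"), FILTER_VALIDATE_BOOLEAN),
    );
}

foreach (check_rss_prerequisites() as $name => $ok) {
    echo $name . ": " . ($ok ? "ok" : "missing") . "\n";
}
```

If either line reports "missing", the listings in this article will most likely fail at xml_parser_create() or at fopen().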
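[Format of the XML/RSS file]
Most sites that publish a file for RSS readers use the format below, which is well-formed XML as defined by the W3C.
A brief analysis of the basic tree structure:
Under the single rss root there is one channel node.
The title, link, and description child elements of the channel node are the commonly used ones.
Then come the item nodes; each item node is one of the most recently updated articles.
The title, link, pubDate, and description child elements of each item node are the commonly used ones.
A simplified format looks like this: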
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/commentapi/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
  <title>Title of this site's channel</title>
  <link>Link address</link>
  <description>Description of the site's channel</description>
  <item>
    <title>Article 1</title>
    <link>Link address of article 1</link>
    <description>Summary of article 1</description>
  </item>
  <item>
    <title>Article 2</title>
    <link>Link address of article 2</link>
    <description>Summary of article 2</description>
  </item>
</channel>
</rss>
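To see how this tree looks to the parser used later in this article, the sketch below feeds a small feed of the same shape to xml_parse_into_struct() and prints one line per record. The sample content, the example.com URLs, and the function name describe_feed_structure are illustrative assumptions.

```php
<?php
// A tiny feed in the simplified shape shown above (placeholder content).
$sample = <<<XML
<rss version="2.0">
<channel>
<title>Channel title</title>
<link>http://example.com/</link>
<description>Channel description</description>
<item>
<title>Article 1</title>
<link>http://example.com/1</link>
<description>Summary 1</description>
</item>
</channel>
</rss>
XML;

// Return one "LEVEL TAG TYPE [VALUE]" line per parser record.
// describe_feed_structure is a hypothetical helper name.
function describe_feed_structure($xml)
{
    $parser = xml_parser_create();
    // Skip records that would contain only whitespace between tags.
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    xml_parse_into_struct($parser, $xml, $values, $index);

    $lines = array();
    foreach ($values as $node) {
        $line = $node["level"] . " " . $node["tag"] . " " . $node["type"];
        // Only records that carry text have a "value" key.
        if (isset($node["value"])) {
            $line .= " " . $node["value"];
        }
        $lines[] = $line;
    }
    return $lines;
}

echo implode("\n", describe_feed_structure($sample)), "\n";
```

Tag names come back in upper case because the parser folds case by default, which is why the program below lower-cases them before comparing.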
Example:
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/" xmlns:wfw="http://wellformedweb.org/commentapi/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/">
<channel>
<title>邢晓宁专栏</title>
<link>http://blog.csdn.net/thefirstwind/</link>
<description>代码一生</description>
<dc:language>af</dc:language>
<generator>.text version 1.0.1.1</generator>
<image>http://counter.csdn.net/pv.aspx?id=72</image>
<item>
<dc:creator>♂猜猜♂(邢晓宁)</dc:creator>
<title>在 ms windows 下建立 docbook 的解譯環境</title>
<link>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx</link>
<pubDate>Thu, 21 Dec 2006 13:50:00 GMT</pubDate>
<guid>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx</guid>
<wfw:comment>http://blog.csdn.net/thefirstwind/comments/1451714.aspx</wfw:comment>
<comments>http://blog.csdn.net/thefirstwind/archive/2006/12/21/1451714.aspx#feedback</comments>
<slash:comments>0</slash:comments>
<wfw:commentrss>http://blog.csdn.net/thefirstwind/comments/commentrss/1451714.aspx</wfw:commentrss>
<trackback:ping>http://tb.blog.csdn.net/trackback.aspx?postid=1451714</trackback:ping>
<description>在 ms windows 下建立 docbook 的解譯環境<img src ="http://blog.csdn.net/thefirstwind/aggbug/1451714.aspx" width = "1" height = "1" /></description>
</item>
<item>
<dc:creator>邢晓宁</dc:creator>
<title>程序员学习的革命-如何使用大脑</title>
<link>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx</link>
<pubDate>Wed, 13 Dec 2006 09:41:00 GMT</pubDate>
<guid>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx</guid>
<wfw:comment>http://blog.csdn.net/thefirstwind/comments/1440965.aspx</wfw:comment>
<comments>http://blog.csdn.net/thefirstwind/archive/2006/12/13/1440965.aspx#feedback</comments>
<slash:comments>27</slash:comments>
<wfw:commentrss>http://blog.csdn.net/thefirstwind/comments/commentrss/1440965.aspx</wfw:commentrss>
<trackback:ping>http://tb.blog.csdn.net/trackback.aspx?postid=1440965</trackback:ping>
<description>很多人搞技术,还有很多转行搞技术,搞了一段时间终于发现,自己不适合作技术,要我说其实就是用脑方式的问题。真的学会适当的用脑方式,编程编起来得心应手。<img src ="http://blog.csdn.net/thefirstwind/aggbug/1440965.aspx" width = "1" height = "1" /></description>
</item>
</channel>
</rss>
[Core program]
<?php
$rssurl = "http://blog.csdn.net/thefirstwind/rss.aspx";

// Read the whole feed into memory.
$buff = "";
$fp = fopen($rssurl, "r");
if (!$fp) {
    die("Cannot open $rssurl");
}
while (!feof($fp)) {
    $buff .= fgets($fp, 4096);
}
fclose($fp);

// Parse the XML into a flat list of tag records.
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $buff, $values, $idx);
xml_parser_free($parser);

$in_item = 0;
$title = $link = $description = "";
foreach ($values as $node) {
    // The parser upper-cases tag names by default, so normalize them.
    $tag = strtolower($node["tag"]);
    $type = $node["type"];
    // "open" and "close" records may carry no value.
    $value = isset($node["value"]) ? $node["value"] : "";
    if ($tag == "item" && $type == "open") {
        $in_item = 1;
    } else if ($tag == "item" && $type == "close") {
        // End of an item: print the fields collected for it.
        echo <<<eom
$title
$link
$description
eom;
        $in_item = 0;
    }
    if ($in_item) {
        switch ($tag) {
            case "title":
                $title = $value;
                break;
            case "link":
                $link = $value;
                break;
            case "description":
                $description = $value;
                break;
        }
    }
}
?>
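The item-collecting loop can also be tried without a network connection by moving it into a function that takes the XML as a string. The following is a sketch along those lines, not the article's original code: the function name collect_rss_items and the sample feed are assumptions made for illustration.

```php
<?php
// Collect title, link, and description of every <item> in an RSS string.
// collect_rss_items is a hypothetical helper name.
function collect_rss_items($xml)
{
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
    // xml_parse_into_struct() returns 0 when the input cannot be parsed.
    if (!xml_parse_into_struct($parser, $xml, $values, $index)) {
        return array();
    }

    $items = array();
    $current = null; // fields of the item being read, or null outside an item
    foreach ($values as $node) {
        $tag = strtolower($node["tag"]);
        if ($tag == "item" && $node["type"] == "open") {
            $current = array("title" => "", "link" => "", "description" => "");
        } elseif ($tag == "item" && $node["type"] == "close") {
            $items[] = $current;
            $current = null;
        } elseif ($current !== null && isset($current[$tag]) && isset($node["value"])) {
            // Keep only the three fields of interest.
            $current[$tag] = $node["value"];
        }
    }
    return $items;
}

// Placeholder feed used only to exercise the function.
$sample = '<rss version="2.0"><channel><title>Channel title</title>'
        . '<item><title>Article 1</title><link>http://example.com/1</link>'
        . '<description>Summary 1</description></item>'
        . '<item><title>Article 2</title><link>http://example.com/2</link>'
        . '<description>Summary 2</description></item>'
        . '</channel></rss>';

foreach (collect_rss_items($sample) as $item) {
    echo $item["title"], " -> ", $item["link"], "\n";
}
```

Returning an array instead of echoing inside the loop keeps parsing separate from display, so the same function could feed either the plain output above or an HTML table.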
[Combined with the explanation above, the complete source code is as follows]
The version below adds CSS styling.
<?php
// Uncomment one of the alternative feeds below to try it.
#$rssurl = "http://www3.asahi.com/rss/index.rdf";
#$rssurl = "http://rss.news.yahoo.com/rss/topstories";
$rssurl = "http://blog.csdn.net/thefirstwind/rss.aspx";
#$rssurl = "http://jlinux.ddo.jp/bbs/rss.php?auth=0";
#$rssurl = "http://rss.sina.com.cn/news/marquee/ddt.xml";
#$rssurl = "http://rss.sina.com.cn/news/allnews/sports.xml";
#$rssurl = "http://rss.sina.com.cn/news/allnews/ent.xml";

// Read the whole feed into memory.
$buff = "";
$fp = fopen($rssurl, "r");
if (!$fp) {
    die("Cannot open $rssurl");
}
while (!feof($fp)) {
    $buff .= fgets($fp, 4096);
}
fclose($fp);

// Parse the XML into a flat list of tag records.
$parser = xml_parser_create();
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $buff, $values, $idx);
xml_parser_free($parser);

// With whitespace skipped, record 0 is <rss>, record 1 is <channel>,
// and record 2 is normally the channel <title>.
$channel_title = isset($values[2]["value"]) ? $values[2]["value"] : "";
// lastBuildDate is optional, so look it up by name rather than by position.
$channel_lastbuilddate = "";
foreach ($values as $node) {
    if (strtolower($node["tag"]) == "lastbuilddate" && isset($node["value"])) {
        $channel_lastbuilddate = $node["value"];
        break;
    }
}

echo <<<__html__
<html>
<head>
<meta http-equiv='content-type' content='text/html; charset=utf-8'>
<title>$channel_title</title>
<link rel='stylesheet' type='text/css' id='css' href='/bbs/forumdata/cache/style_1.css'>
<script type='text/javascript' src='/bbs/include/common.js'></script>
<script type='text/javascript' src='/bbs/include/menu.js'></script>
</head>
<body>
<table border='1'>
<tr><td>
<img src='http://www.pushad.com/xrssfile/2007-1/30/2007130142039121.gif'>
<!--
<img src='http://www.pushad.com/xrssfile/2007-1/30/2007130142039669.gif'>
<img src='http://jlinux.ddo.jp/bbs/images/default/logo.gif'>
<img src='http://www.pushad.com/xrssfile/2007-1/30/2007130142039970.gif'>
//-->
</td>
<td>
$channel_title
$channel_lastbuilddate<br>
</td>
</tr>
__html__;

$in_item = 0;
$title = $link = $pubdate = $description = "";
foreach ($values as $node) {
    // The parser upper-cases tag names by default, so normalize them.
    $tag = strtolower($node["tag"]);
    $type = $node["type"];
    // "open" and "close" records may carry no value.
    $value = isset($node["value"]) ? $node["value"] : "";
    if ($tag == "item" && $type == "open") {
        $in_item = 1;
        // Start each item with empty fields so nothing leaks from the previous one.
        $title = $link = $pubdate = $description = "";
    } else if ($tag == "item" && $type == "close") {
        // End of an item: emit its table rows.
        echo <<<eom
<tr>
<td colspan='2' class='header' width='400'>
<a href="$link">$title</a>
</td>
</tr>
<tr>
<td colspan='2' width='400' align='right'>
$pubdate
</td>
</tr>
<tr>
<td colspan='2' width='400'>
$description
</td>
</tr>
<tr>
<td>
</td>
</tr>
eom;
        $in_item = 0;
    }
    if ($in_item) {
        switch ($tag) {
            case "title":
                $title = $value;
                break;
            case "link":
                $link = $value;
                break;
            case "pubdate":
                $pubdate = $value;
                break;
            case "description":
                $description = $value;
                break;
        }
    }
}

echo <<<__htmlend__
</table>
</body>
</html>
__htmlend__;
?>
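The listing above writes the feed's title and link straight into the page. Because the XML parser decodes entities, a title such as "A & B" arrives with a raw "&", which is not valid inside HTML. One possible hardening, shown here only as a sketch, is to pass those two fields through htmlspecialchars() before printing them. The function name render_item_header is hypothetical, and the description is deliberately left out because feeds often put HTML markup in it on purpose.

```php
<?php
// Build the header row for one item, escaping feed text before it is
// placed into HTML. render_item_header is a hypothetical helper name.
function render_item_header($title, $link)
{
    $safe_title = htmlspecialchars($title, ENT_QUOTES, "UTF-8");
    $safe_link  = htmlspecialchars($link, ENT_QUOTES, "UTF-8");
    return "<tr><td colspan='2' class='header' width='400'>"
         . "<a href=\"$safe_link\">$safe_title</a>"
         . "</td></tr>\n";
}

// Placeholder values chosen to contain characters that need escaping.
echo render_item_header('PHP & <RSS>', 'http://example.com/?a=1&b=2');
```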