【Posted】:2014-08-12 02:23:39
【Question】:
I am using the Simple HTML DOM library (http://simplehtmldom.sourceforge.net/) for this task.
I want to parse the contents of a website's pre tag, and I am using this code:
<?php include '/libraries/simple_html_dom.php' ?>
<?php
// Create DOM from URL or file
$html = file_get_html('testing.html');
// Find the Text
foreach ($html->find('pre') as $element) {
    echo '<p>' . $element . '</p>';
}
?>
This is the content of the file "testing.html":
<html>
<head>
</head>
<body bgcolor="#FFFFFF">
<pre>
am.o V 1 1 PRES ACTIVE IND 1 S
amo, amare, amavi, amatus V [XXXAO]
love, like; fall in love with; be fond of; have a tendency to;
am.as N 1 1 ACC P F
ama, amae N F [XXXDO] lesser
bucket; water bucket; (esp. fireman's bucket);
am.as V 1 1 PRES ACTIVE IND 2 S
amo, amare, amavi, amatus V [XXXAO]
love, like; fall in love with; be fond of; have a tendency to;
</pre>
</body>
</html>
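As you can see, the text inside the pre tag contains line breaks, which I want to preserve in the output. This is what the parser currently outputs: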
am.o V 1 1 PRES ACTIVE IND 1 S amo, amare, amavi, amatus V [XXXAO] love, like; fall in love with; be fond of; have a tendency to; am.as N 1 1 ACC P F ama, amae N F [XXXDO] lesser bucket; water bucket; (esp. fireman's bucket); am.as V 1 1 PRES ACTIVE IND 2 S amo, amare, amavi, amatus V [XXXAO] love, like; fall in love with; be fond of; have a tendency to;
How can I do this?
【Comments】:
- Try nl2br().
- You can't expect Simple HTML DOM to preserve whitespace. If you need that, use the preg functions.
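The two suggestions above can be combined into a rough sketch: pull the raw text of each pre block out with preg_match_all(), then pass it through nl2br() so the line breaks survive inside a p tag. This uses only built-in PHP functions and bypasses Simple HTML DOM entirely. The helper name extract_pre_blocks and the inline sample string are made up for illustration, and a regular expression like this assumes well-formed, non-nested pre tags.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): returns the raw inner
// text of every <pre> element in $html, with line breaks left untouched.
function extract_pre_blocks(string $html): array
{
    // 's' lets '.' match newlines, 'i' ignores tag case, and '.*?' is
    // non-greedy so each <pre>...</pre> pair is captured separately.
    preg_match_all('~<pre\b[^>]*>(.*?)</pre>~si', $html, $matches);
    return $matches[1];
}

// Inline sample standing in for testing.html (assumption for this sketch).
$html = "<html><body><pre>\n"
      . "am.o V 1 1 PRES ACTIVE IND 1 S\n"
      . "amo, amare, amavi, amatus V [XXXAO]\n"
      . "</pre></body></html>";

foreach (extract_pre_blocks($html) as $block) {
    // The captured text is already HTML source, so it is not escaped again.
    // nl2br() inserts a <br /> before every newline in the block.
    echo '<p>' . nl2br(trim($block)) . '</p>' . "\n";
}
```

To run it against the real file, replace the sample string with file_get_contents('testing.html'). Wrapping each block in a pre tag instead of a p tag would also keep the line breaks, without needing nl2br() at all.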
Tags: php html-parsing simple-html-dom