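【Posted】: 2012-09-28 01:51:34
【Question】:
Given a large chunk of HTML that displays data nicely in <div>s and <table>s, how can I strip all HTML/CSS markup while keeping the text originally found in the individual cells and divs, now separated only by line breaks?
The current attempt shown here outputs one long continuous paragraph instead of keeping the separation the text had in div or table form.
Original HTML: http://pastebin.com/63N3Kg16
Output: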
John Smith | SomeName Realty | (xxx) 939-4835 Allston St, Cambridge, MA Very spacious under renovation with SST/Granite, porch, minutes to MIT, redline, Nov/1 4BR/1BA Apartment $3,400/month Bedrooms 4 Bathrooms 1 full, 0 partial Sq Footage Unspecified Parking None Pet Policy No pets Deposit $0 DESCRIPTION Triple decker building secondfloor apt aprox 2000 sqf with large bedrooms, kitchen, pantry, porch, d/w, all woodfloor and ZTilded in the kitchen, new bath. utilities extra,Nov/1 see additional photos below Contact info: Payman Ahmadifar Bayside Realty (xxx) 939-4835 Posted: Sep 24, 2012, 6:55am PDT
PHP
nl2br(trim(strip_tags($html)));
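Expected output
Plain text with <br> or newlines, without <div> or <table> HTML tags. Basically, the goal is to make the text more readable, keeping the original spacing/separation structure, but with no CSS styling or HTML markup other than <br>.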
John Smith | SomeName Realty | (xxx) 939-4835
Allston St, Cambridge, MA
Very spacious under renovation with SST/Granite, porch, minutes to MIT, redline, Nov/1
4BR/1BA Apartment $3,400/month
Bedrooms 4
Bathrooms 1 full, 0 partial
Sq Footage Unspecified
Parking None
Pet Policy No pets
Deposit $0
DESCRIPTION
Triple decker building secondfloor apt aprox 2000 sqf with large bedrooms, kitchen, pantry, porch, d/w, all woodfloor and ZTilded in the kitchen, new bath. utilities extra,Nov/1 see additional photos below
Contact info: Payman Ahmadifar Bayside Realty (xxx) 939-4835
Posted: Sep 24, 2012, 6:55am PDT
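One possible way to get output like the above is to turn block-level boundaries into newlines before calling strip_tags. This is only a sketch: the helper name html_to_lines and the list of tags treated as line boundaries are assumptions, not something from the original HTML, and a regex approach like this will not cope with every markup variation.

```php
<?php
// Hypothetical helper (not from the original post): converts block-level
// boundaries to newlines, then strips the remaining tags.
function html_to_lines(string $html): string
{
    // Collapse whitespace already in the source so it cannot create stray lines.
    $html = preg_replace('/\s+/', ' ', $html);
    // Treat closing block-level tags and <br> as line boundaries (assumed tag list).
    $html = preg_replace('#</(div|p|tr|table|h[1-6]|li)\s*>|<br\s*/?>#i', "\n", $html);
    // Put a space between cells of the same row so their text does not fuse.
    $html = preg_replace('#</(td|th)\s*>#i', ' ', $html);
    // Remove every remaining tag and decode entities such as &amp;.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    // Tidy each line and drop the empty ones.
    $lines = array_map(function ($line) {
        return trim(preg_replace('/[ \t]+/', ' ', $line));
    }, explode("\n", $text));
    $lines = array_filter($lines, function ($line) {
        return $line !== '';
    });
    return implode("\n", $lines);
}

// Made-up sample input that mimics the div/table layout described above.
$html = '<div>Allston St, Cambridge, MA</div>'
      . '<table><tr><td>Bedrooms</td><td>4</td></tr>'
      . '<tr><td>Parking</td><td>None</td></tr></table>';

$plain = html_to_lines($html);
echo nl2br($plain);
```

For the sample input, $plain holds three lines ("Allston St, Cambridge, MA", "Bedrooms 4", "Parking None"), and nl2br then adds a <br /> before each newline for display in a browser.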
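【Discussion】:
- Have you tried strip_tags($html, '<br>')?
- Can you add the expected output?
- Also, nl2br will not necessarily give the desired result, since the HTML may not contain any newlines.
- Thanks for the feedback. I don't want styling such as <table>, <p>, or <div> in the final output. If possible, only newlines, <br>, and <strong>.
- Where does John Smith come from? And where does the | come from?
- Do you want to view it in a browser, or save it to a file?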
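As a small illustration of the strip_tags suggestion in the comments: the optional second argument lists tags to keep, so <br> and <strong> can survive while <div>, <table>, and the rest are removed. The sample markup below is made up for the example. Note that strip_tags leaves the attributes of kept tags untouched, so any inline style on a kept tag would remain.

```php
<?php
// Made-up sample; the real page is at the pastebin link in the question.
$html = '<div class="row"><strong>Bedrooms</strong> 4<br>Parking None</div>';

// Keep only <br> and <strong>; every other tag is stripped.
$result = strip_tags($html, '<br><strong>');

echo $result; // <strong>Bedrooms</strong> 4<br>Parking None
```

This alone does not add line breaks where a <div> or table row ended, so it addresses only the "which tags to keep" part of the question.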
Tags: php simple-html-dom