What does the beginning of this HTML optimization code do?

【Question title】: What does the beginning of this HTML optimization code do?
【Posted】: 2010-12-18 05:02:04
【Question】:

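The hard part is figuring out what the stripwhitespace() function does. stripBuffer() is fairly straightforward, but I have been staring at this small piece of code for a while now trying to decipher it, to no avail. The cryptic variable names and the lack of comments do not help either. Because of this site's spam prevention measures, I also had to remove some hyperlinks from the credits.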

<?php 
/* ---------------------------------
26 January, 2008 - 2:55pm:

The example below is adapted from a post by londrum 8:29 pm on June 7, 2007: 
"crunch up your HTML into a single line
a handy little script..."

This PHP code goes at the very TOP of the PHP-enabled HTML webpage
above EVERYTHING else. Recommendation: use a PHP include file for this
to have only one file to maintain. 
--------------------------------- */
function stripwhitespace($bff){
    $pzcr=0;
    $pzed=strlen($bff)-1;
    $rst="";
    while($pzcr<$pzed){
        $t_poz_start=stripos($bff,"<textarea",$pzcr);
        if($t_poz_start===false){
            $bffstp=substr($bff,$pzcr);
            $temp=stripBuffer($bffstp);
            $rst.=$temp;
            $pzcr=$pzed;
        }
        else{
            $bffstp=substr($bff,$pzcr,$t_poz_start-$pzcr);
            $temp=stripBuffer($bffstp);
            $rst.=$temp;
            $t_poz_end=stripos($bff,"</textarea>",$t_poz_start);
            $temp=substr($bff,$t_poz_start,$t_poz_end-$t_poz_start);
            $rst.=$temp;
            $pzcr=$t_poz_end;
        }
    }
    return $rst;
}

function stripBuffer($bff){
    /* carriage returns, new lines */
    $bff=str_replace(array("\r\r\r","\r\r","\r\n","\n\r","\n\n\n","\n\n"),"\n",$bff);
    /* tabs */
    $bff=str_replace(array("\t\t\t","\t\t","\t\n","\n\t"),"\t",$bff);
    /* opening HTML tags */
    $bff=str_replace(array(">\r<a",">\r <a",">\r\r <a","> \r<a",">\n<a","> \n<a","> \n<a",">\n\n <a"),"><a",$bff);
    $bff=str_replace(array(">\r<b",">\n<b"),"><b",$bff);
    $bff=str_replace(array(">\r<d",">\n<d","> \n<d",">\n <d",">\r <d",">\n\n<d"),"><d",$bff);
    $bff=str_replace(array(">\r<f",">\n<f",">\n <f"),"><f",$bff);
    $bff=str_replace(array(">\r<h",">\n<h",">\t<h","> \n\n<h"),"><h",$bff);
    $bff=str_replace(array(">\r<i",">\n<i",">\n <i"),"><i",$bff);
    $bff=str_replace(array(">\r<i",">\n<i"),"><i",$bff);
    $bff=str_replace(array(">\r<l","> \r<l",">\n<l","> \n<l",">  \n<l","/>\n<l","/>\r<l"),"><l",$bff);
    $bff=str_replace(array(">\t<l",">\t\t<l"),"><l",$bff);
    $bff=str_replace(array(">\r<m",">\n<m"),"><m",$bff);
    $bff=str_replace(array(">\r<n",">\n<n"),"><n",$bff);
    $bff=str_replace(array(">\r<p",">\n<p",">\n\n<p","> \n<p","> \n <p"),"><p",$bff);
    $bff=str_replace(array(">\r<s",">\n<s"),"><s",$bff);
    $bff=str_replace(array(">\r<t",">\n<t"),"><t",$bff);
    /* closing HTML tags */
    $bff=str_replace(array(">\r</a",">\n</a"),"></a",$bff);
    $bff=str_replace(array(">\r</b",">\n</b"),"></b",$bff);
    $bff=str_replace(array(">\r</u",">\n</u"),"></u",$bff);
    $bff=str_replace(array(">\r</d",">\n</d",">\n </d"),"></d",$bff);
    $bff=str_replace(array(">\r</f",">\n</f"),"></f",$bff);
    $bff=str_replace(array(">\r</l",">\n</l"),"></l",$bff);
    $bff=str_replace(array(">\r</n",">\n</n"),"></n",$bff);
    $bff=str_replace(array(">\r</p",">\n</p"),"></p",$bff);
    $bff=str_replace(array(">\r</s",">\n</s"),"></s",$bff);
    /* other */
    $bff=str_replace(array(">\r<!",">\n<!"),"><!",$bff);
    $bff=str_replace(array("\n<div")," <div",$bff);
    $bff=str_replace(array(">\r\r \r<"),"><",$bff);
    $bff=str_replace(array("> \n \n <"),"><",$bff);
    $bff=str_replace(array(">\r</h",">\n</h"),"></h",$bff);
    $bff=str_replace(array("\r<u","\n<u"),"<u",$bff);
    $bff=str_replace(array("/>\r","/>\n","/>\t"),"/>",$bff);
    /* ereg_replace() was removed in PHP 7; preg_replace() is the equivalent */
    $bff=preg_replace("/ {2,}/",' ',$bff);
    $bff=preg_replace("/  {3,}/",'  ',$bff);
    $bff=str_replace("> <","><",$bff);
    $bff=str_replace("  <","<",$bff);
    /* non-breaking spaces */
    $bff=str_replace(" &nbsp;","&nbsp;",$bff);
    $bff=str_replace("&nbsp; ","&nbsp;",$bff);
    /* Example of EXCEPTIONS where I want the space to remain
    between two form buttons at */ 
    /* <!-- http://websitetips.com/articles/copy/loremgenerator/ --> */
    /* name="select" /> <input */
    $bff=str_replace(array("name=\"select\" /><input"),"name=\"select\" /> <input",$bff);

    return $bff;
}
ob_start("stripwhitespace");
?>

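【Comments】:

At least format your code. Otherwise people will be less willing to help you deobfuscate it.
@Robert, I have formatted it for him, and the code is not obfuscated. It is just poorly written (bad variable names, no comments).
@simshaun, thanks. I tried to format it myself and thought I had managed it, but it turned into a mess.

Tags: php html optimization buffer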

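【Answer 1】:

It looks to me as if it processes everything before and after a textarea, but leaves the contents of the textarea alone.

While this code may be somewhat interesting, PHP is notoriously bad at fast string manipulation, and all those str_replace calls are a bad idea.

I expect you would get better performance by having the web server compress the script output with gzip/deflate before sending it.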
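
To get a rough feel for why compression tends to matter more than whitespace stripping, the sketch below compares byte counts for a sample page. It is only an illustration: the sample markup and the helper name compareSizes are made up here, a single preg_replace stands in for the question's stripBuffer, and gzencode needs PHP's zlib extension. On a real server the compression would normally be switched on in the web server configuration rather than in PHP.

```php
<?php
// Hypothetical helper (not from the question): reports how many bytes a page
// takes raw, with whitespace runs collapsed, and gzip-compressed.
function compareSizes(string $html): array
{
    // Crude stand-in for stripBuffer(): collapse each whitespace run to one space.
    $collapsed = preg_replace('/\s+/', ' ', $html);

    return [
        'raw'            => strlen($html),
        'collapsed'      => strlen($collapsed),
        'gzip'           => strlen(gzencode($html)),
        'collapsed+gzip' => strlen(gzencode($collapsed)),
    ];
}

// Made-up sample page: 200 indented list items.
$sample = "<ul>\n"
    . str_repeat("    <li>\n        <a href=\"#\">item</a>\n    </li>\n", 200)
    . "</ul>\n";
$sizes = compareSizes($sample);

foreach ($sizes as $label => $bytes) {
    echo str_pad($label, 16), $bytes, " bytes\n";
}
```

For repetitive markup like this sample, the gzip figure should come out far below the collapsed one, which is the point the answer is making. Real pages will show different ratios.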

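【Comments】:

Whoa, thanks for the info. I figured PHP would save enough space that the shorter download time would justify the extra processing time, but I did not think PHP would be that slow.
Agreed. Compress the output with gzip instead of using that code snippet.
According to php.net, gzip compression causes quite a few bugs. Are you talking about ob_gzhandler?
@Ajson: Doesn't your web server have a way to compress output? And no, I am not talking about ob_gzhandler.
It is a homemade web server. I am a hobbyist, not a professional.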

【Answer 2】:

It is definitely a mess, but it appears to strip unnecessary whitespace from the string, except for whitespace inside textareas.

【Comments】:

【Answer 3】:

What stripBuffer does is fairly obvious: it tries to strip all whitespace from the input.

stripwhitespace works as follows:

function stripwhitespace($input){
    $currentPosition=0; // start from the first char
    $endPosition=strlen($input)-1; // where to stop
    $returnValue="";

    // while there is more input to process
    while($currentPosition<$endPosition){
        // find start of next <textarea> tag
        $startOfNextTextarea=stripos($input,"<textarea",$currentPosition);
        if($startOfNextTextarea===false){
            // no textarea tag remaining:
            // strip ws from remaining input, append to $returnValue and done!
            $bufferToStrip=substr($input,$currentPosition);
            $temp=stripBuffer($bufferToStrip);
            $returnValue.=$temp;
            $currentPosition=$endPosition; // to cause the function to return
        }
        else{
            // <textarea> found
            // strip ws from input in the range [current_position, start_of_textarea)
            $bufferToStrip=substr($input,$currentPosition,$startOfNextTextarea-$currentPosition);
            // append to return value
            $temp=stripBuffer($bufferToStrip);
            $returnValue.=$temp;
            $endOfNextTextarea=stripos($input,"</textarea>",$startOfNextTextarea);
            // get contents of <textarea>, append to return value without stripping ws
            $temp=substr($input,$startOfNextTextarea,$endOfNextTextarea-$startOfNextTextarea);
            $returnValue.=$temp;
            // continue looking for textareas after the end of this one
            $currentPosition=$endOfNextTextarea;
        }
    }
    return $returnValue;
}

I admit this would be quite hard to follow unless you could tell "intuitively" what it is trying to do: the contents of a <textarea> tag get special treatment in HTML.
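
For comparison, here is a minimal sketch of the same loop with two edge cases handled that the code above leaves open: a <textarea> that is never closed (stripos then returns false for the closing tag) and the strlen($input)-1 end position, which skips one-character inputs. The function name stripOutsideTextareas and the $strip callback parameter are inventions for this illustration; pass in stripBuffer or any other string filter.

```php
<?php
// Sketch only: stripOutsideTextareas() and $strip are hypothetical names.
// $strip is applied to everything outside <textarea>...</textarea>.
function stripOutsideTextareas(string $input, callable $strip): string
{
    $result = '';
    $pos = 0;
    $length = strlen($input);

    while ($pos < $length) {
        $open = stripos($input, '<textarea', $pos);
        if ($open === false) {
            // No further textarea: filter the rest and stop.
            $result .= $strip(substr($input, $pos));
            break;
        }

        // Filter the markup in front of the textarea.
        $result .= $strip(substr($input, $pos, $open - $pos));

        $close = stripos($input, '</textarea>', $open);
        if ($close === false) {
            // Unclosed textarea: copy the remainder untouched and stop.
            $result .= substr($input, $open);
            break;
        }

        // Copy the textarea, closing tag included, without filtering it.
        $end = $close + strlen('</textarea>');
        $result .= substr($input, $open, $end - $open);
        $pos = $end;
    }

    return $result;
}

// Example filter: collapse every run of whitespace into a single space.
$collapse = function (string $s): string {
    return preg_replace('/\s+/', ' ', $s);
};

echo stripOutsideTextareas("<p>a</p>\n\n<textarea>x\n  y</textarea>\n<p>b</p>", $collapse);
```

Unlike the original, this variant copies the closing </textarea> tag together with the textarea contents instead of leaving it for the next pass of the loop; the output is the same either way as long as the filter does not alter that tag.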

【Comments】:

【Answer 4】:

In pseudocode (ish):

bff is the initial buffer
pzcr is the current start
pzed is the current end
rst will have the filtered text appended to it.
while the current start is before the end
  t_poz_start is the first position of a textarea tag (at or after the current start)
  if there is no text area found
    bffstp becomes the substring of the buffer starting at pzcr
    temp is buffer stripped.
    append temp to rst
    set the current start to the current end.
  else
    set bffstp to the substr between the start and the start of the textarea tag
    temp is buffer stripped.
    append temp to rst
    skip the textarea
    temp will be the substr from the start of the text area to the closing text area tag.
    append temp (unfiltered) to rst.
    set the next start to the end of the textarea (at the start of its closing tag).
  end the if
end the while
return the appended buffer (rst)

Hmm. For an HTML compressor, this code is itself rather bloated and hard to read. A well-used regular expression should be able to do a lot better.
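
As one possible take on the regular-expression approach, the sketch below uses preg_split to cut the page into textarea and non-textarea pieces and collapses whitespace only in the latter. The function name minifyHtml is hypothetical, and collapsing whitespace to a single space is a simplification: it does not reproduce every rule in stripBuffer, and it ignores <pre>, <script>, and similar elements.

```php
<?php
// Sketch only: minifyHtml() is a hypothetical name, not code from the question.
function minifyHtml(string $html): string
{
    // With PREG_SPLIT_DELIM_CAPTURE the captured textareas are kept in the
    // result, so odd indexes hold textareas and even indexes the markup between.
    $parts = preg_split(
        '#(<textarea\b.*?</textarea>)#is',
        $html,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    foreach ($parts as $i => $part) {
        if ($i % 2 === 0) {
            // Collapse whitespace runs, then drop whitespace between tags.
            $part = preg_replace('/\s+/', ' ', $part);
            $parts[$i] = str_replace('> <', '><', $part);
        }
    }

    return implode('', $parts);
}

echo minifyHtml("<div>\n  <p>Hi   there</p>\n</div>\n<textarea>\n keep  this\n</textarea>\n");
```

To apply it to a whole page in the same way as the question's code, it could be registered as the output callback with ob_start("minifyHtml").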

【Comments】: