How can I read the nth line of a web page's HTML using PHP? Answers

【Question title】: How can I read the nth line of a web page's HTML, using PHP?
【Posted】: 2015-02-15 15:46:34
【Question】:

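I have seen this question asked about files, but for some reason those approaches never work for web pages.

I am trying to use file_get_contents to fetch the contents of a web page (speed is not a big concern, which is why I am not using cURL), and then I want to print a specific line.

Could you give me the simplest possible way to do this? I am building an API that gets specific lines from multiple web pages.

Alternatively, is there a way to search for and print a line that contains a certain string? For example, a line that begins with "Foo" (assuming only one line contains it).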

【Comments】:

Tags: php line file-get-contents

【Solution 1】:

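function readStrLine($str, $n) {
    // split on any line ending (\n, \r\n or \r); PHP_EOL only matches the server's own convention
    $lines = preg_split('/\R/', $str);
    // $n is 1-based; return null if the page has fewer than $n lines
    return isset($lines[$n-1]) ? $lines[$n-1] : null;
}

$file = file_get_contents('http://google.pl');

echo readStrLine($file, 10);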

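You can split the string on line breaks, which gives you an array of lines indexed from 0 (index 0 is the first line, so line n is at index n - 1).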

Edit: an alternative that tidies (pretty-prints) the HTML first

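function readHtmlLine($html, $n) {
    // collect parse warnings from real-world HTML instead of printing them
    libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = false;
    $dom->loadHTML($html);
    libxml_clear_errors();
    $dom->formatOutput = true;
    // split on any line ending rather than the server's PHP_EOL
    $lines = preg_split('/\R/', $dom->saveHTML());
    // $n is 1-based; return null if the output has fewer than $n lines
    return isset($lines[$n-1]) ? $lines[$n-1] : null;
}

$file = file_get_contents('http://google.pl');

echo readHtmlLine($file, 10);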

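【Comments】:

I did this, but it does not work unless the 10 is a 1, and then it just prints the whole page rather than a specific line. How would I print, say, line 60 or line 70?
It may not work if the page's response contains no line breaks at all. In that case you cannot know what the lines are unless you format the HTML yourself, as Chrome does, for example.
If you want to format the HTML, here is an example: stackoverflow.com/a/6516335/2962442
I don't know exactly what you are trying to do, but there are many ways to get data out of HTML content. XPath, for example, is a good way to select data.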
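The XPath suggestion in the last comment can be sketched roughly as follows. This is only an illustration under assumptions: the markup in $html is made up and stands in for whatever file_get_contents returns, the choice of <p> elements and the "Foo" prefix are borrowed from the question's example, and the DOM extension is assumed to be enabled.

```php
<?php
// Stand-in for the result of file_get_contents($url); this markup is invented for the example.
$html = '<html><body><p>Nancy is my name</p><p>Foo is my name</p></body></html>';

// Collect parse warnings internally instead of printing them for sloppy real-world HTML.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// Select every <p> element whose (whitespace-normalized) text starts with "Foo".
$nodes = $xpath->query('//p[starts-with(normalize-space(.), "Foo")]');

$matches = [];
foreach ($nodes as $node) {
    $matches[] = trim($node->textContent);
}

// Print one match per line.
echo implode(PHP_EOL, $matches), PHP_EOL;
```

Selecting by element rather than by line number does not depend on how the server happens to break lines, which is the failure described in the first comment.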

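【Solution 2】:

How do I read a specific line of a web page? [PHP]

Could you give me the simplest possible way to do this? I am building an API that gets specific lines from multiple web pages.

Alternatively, is there a way to search for and print a line that contains a certain string?

Sample HTML file: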

file.html

<html>
<head><title>File</title></head>
<body>
    <p>Nancy is my name</p>
    <p>James is my name</p>
    <p>Foo is my name</p>
    <p>Bob is my name</p>
</body>
</html>

A simple PHP function:

function checkFile( $file, $keyword ) {

    // open file for reading
    $handle = @fopen( $file, 'r' );

    // bail out if the file could not be opened
    if( !$handle ) {
        return null;
    }

    $found = null;

    // traverse file line by line
    while( ($line = fgets($handle)) !== false ) {

        // case-insensitive search, so 'foo' also matches 'Foo'
        if( stripos($line, $keyword) !== false ) {
            // keyword was found; remember the line and stop reading
            $found = $line;
            break;
        }
    }

    // always close the file, whether or not the keyword was found
    fclose($handle);

    // the matching line (with its indentation and trailing newline), or null
    return $found;
}

$result = checkFile( 'file.html', 'foo' );

echo $result;

Output: <p>Foo is my name</p>
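The question also asks about a line that begins with a given string, in content already fetched with file_get_contents rather than read from a local file. A minimal sketch of that variant is below. The helper name findLineStartingWith is hypothetical, the literal $html stands in for a fetched page, and the prefix is matched case-sensitively against each line after its indentation is trimmed.

```php
<?php
// Hypothetical helper: returns the first line of $text whose trimmed content
// starts with $prefix (case-sensitive), or null if no line matches.
function findLineStartingWith(string $text, string $prefix): ?string
{
    // \R matches \n, \r\n and \r, so the server's line-ending style does not matter.
    foreach (preg_split('/\R/', $text) as $line) {
        $trimmed = trim($line);
        // strncmp compares only the first strlen($prefix) bytes.
        if (strncmp($trimmed, $prefix, strlen($prefix)) === 0) {
            return $trimmed;
        }
    }
    return null;
}

// Stand-in for the result of file_get_contents($url), using \r\n line endings on purpose.
$html = "<html>\r\n<body>\r\n    <p>Nancy is my name</p>\r\n    <p>Foo is my name</p>\r\n</body>\r\n</html>";

echo findLineStartingWith($html, '<p>Foo'), PHP_EOL;
```

Because each source line in the sample starts with its tag, the prefix passed in is '<p>Foo' rather than 'Foo'.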

【Comments】:

【Solution 3】:

$url = 'https://www..';

$content = file_get_contents($url);

if ($content !== false && $content !== '') {

    // $start: offset of the marker where reading begins
    $start = strpos($content, '<span>');

    // $end: offset of the marker where reading stops (searched from $start onward)
    $end = ($start === false) ? false : strpos($content, '</span>', $start);

    if ($end !== false) {
        // $result: the text from the opening marker up to (not including) the closing one
        $result = substr($content, $start, $end - $start);
    } else {
        $error = 'markers not found';
    }

} else {
    $error = 'no data found';
}

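【Comments】:

You should wrap all of your code in "```" (three backticks) so that it is formatted as code. Also, by "nth line" do you mean in the HTML source or as displayed visually?
I mean in the source code of the web page (so the HTML). Sorry that my English was confusing, and thanks for the tip! :D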