Instantly redirect to a URL scraped from Google search results

【Question Title】: Instantly redirect to URL that has been scraped from Google search results
【Posted】: 2019-12-09 05:55:53
【Question】:

Recently I have been wondering whether a web scraper written in PHP can redirect immediately to the first URL it gets from a Google search.

<?php
include('simple_html_dom.php');

$html = file_get_html('https://www.google.com/search?q=raspberry&oq=raspberry&aqs&num=1');

$linkObjs = $html->find('div[class=jfp3ef] a');
foreach ($linkObjs as $linkObj) {
        $title = trim($linkObj->plaintext);
        $link = trim($linkObj->href);

        //if it is not a direct link but url reference found inside it, then extract
        if (!preg_match('/^https?/', $link) && preg_match('/q=(.+)&amp;sa=/U', $link, $matches) && preg_match('/^https?/', $matches[1])) {
            $link = $matches[1];
        } else if (!preg_match('/^https?/', $link)) { // skip if it is not a valid link
            continue;
        }

        echo $link . '</p>';
}
?>

This code takes the top result from a Google search for "raspberry" and prints that site's URL. I want it to redirect to that URL instead of printing it.

【Comments】:

Tags: php html dom web-scraping

【Solution 1】:

Use PHP's built-in header() function. You would use it like this:

header("Location: $link");
exit; // stop the script so nothing else runs after the redirect header is set

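Note that if the link in the $link variable has no http or https prefix, the redirect may not work correctly, so you may want to check for the prefix first. If it is missing, just prepend it and then redirect.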
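The prefix check described above could be sketched as follows. The function name ensureScheme and the choice of https as the default scheme are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): make sure a link
// starts with http:// or https:// before it is used in a Location header.
function ensureScheme(string $link, string $defaultScheme = 'https'): string
{
    $link = trim($link);

    // Already an absolute http(s) URL: return it unchanged.
    if (preg_match('#^https?://#i', $link)) {
        return $link;
    }

    // Protocol-relative URL such as //example.com/path: add only the scheme.
    if (strpos($link, '//') === 0) {
        return $defaultScheme . ':' . $link;
    }

    // Bare host or path: prepend the assumed default scheme.
    return $defaultScheme . '://' . ltrim($link, '/');
}
```

With this in place, the redirect would become header('Location: ' . ensureScheme($link)); followed by exit;.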

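Also, do not use echo or any other statement that sends output to the browser before calling header(). Once output has started, the headers have already been sent and the redirect will not work.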
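One way to guard against the output problem is to check headers_sent() before redirecting. This is a minimal sketch; the helper name redirectTo and the 302 status code are assumptions for illustration. In the question's loop, the echo line would be replaced by a call to this helper.

```php
<?php
// Hypothetical helper: send a redirect only if no output has gone out yet.
// Returns true when the Location header was set, false otherwise.
function redirectTo(string $link): bool
{
    // headers_sent() fills $file and $line with the place where output began.
    if (headers_sent($file, $line)) {
        error_log("Cannot redirect: output already started at $file:$line");
        return false;
    }

    // 302 is assumed here as a temporary redirect.
    header('Location: ' . $link, true, 302);
    return true;
}

// Usage inside the scraping loop (instead of echo):
//     if (redirectTo($link)) {
//         exit; // stop at the first valid result
//     }
```

Calling exit right after a successful redirect ends the script at the first valid result, so the remaining links are never processed.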

【Comments】: