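Unless you are building a 100% proxy, you are dumping whatever cURL pulled in straight into the browser. Everything in that result is now referenced from the page the cURL output was dumped into, not from the original cURL request.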
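Basically, if you visit http://localhost and the code above lives in index.php, that page requests the :8081/comingEpisodes content and dumps it into the context of the original http://localhost page. The browser now resolves everything it finds against http://localhost, not against the URL used in the cURL request.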
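You could replace every content link in the document, before it is output, with something like "proxy.php?retrieve=old_url" so that all of those requests go back through the same cURL context, but that is the basis of a web proxy.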
End-User                 Intermediary                 End-Website
(http://localhost)       (localhost/index.php)        (http://192.168.0.14:8081/comingEpisodes/)
------------------       ---------------------        ------------------------------------------
Initial visit--------->
                         cURL Request------------->
                                                      Page Content (html basically)
                         Echoed back to user<------
Content<---------------
Finds <img> etc.------>
                         /comingEpisodes/img1.jpg     // 404 error, it's actually on :8081
                                                      // that localhost has no idea about
                                                      // because it's being hidden using cURL
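To make the 404 in the diagram concrete, here is a minimal sketch of how a root-relative path gets resolved against whichever page the browser believes it loaded. The helper name `resolveRootRelative` is hypothetical (it is not part of the original code) and it only covers paths that start with "/"; a real browser follows the full URL resolution rules.

```php
<?php
// Hypothetical helper: mimics how a browser resolves a root-relative path
// (one starting with "/") against the URL of the page it is displaying.
function resolveRootRelative(string $pageUrl, string $path): string
{
    $parts  = parse_url($pageUrl);
    $origin = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $origin .= ':' . $parts['port'];
    }
    return $origin . $path;
}

// What the browser asks for: the page it loaded is http://localhost/index.php
echo resolveRootRelative('http://localhost/index.php', '/comingEpisodes/img1.jpg'), "\n";
// prints http://localhost/comingEpisodes/img1.jpg (does not exist, hence the 404)

// Where the image really lives: relative to the page cURL fetched
echo resolveRootRelative('http://192.168.0.14:8081/comingEpisodes/', '/comingEpisodes/img1.jpg'), "\n";
// prints http://192.168.0.14:8081/comingEpisodes/img1.jpg
```

The two results differ only in the origin, which is exactly the information the browser never sees when the content is echoed back from localhost.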
A very simple demo:
<?php
//
// Very Dummied-down proxy
//
// Either get the url of the content they need, or use the default "page root"
// when none is supplied. This is not robust at all, as this really only handles
// relative urls (e.g. src="images/foo.jpg", something like src="http://foo.com/"
// would become src="index.php?proxy=http://foo.com/" which makes the below turn
// into "http://www.google.com/http://foo.com/")
$_target = 'http://www.google.com/' . (isset($_GET['proxy']) ? $_GET['proxy'] : '');
// Build the cURL request to get the page contents
$cURL = curl_init($_target);
try
{
    // setup cURL to your liking
    curl_setopt($cURL, CURLOPT_RETURNTRANSFER, 1);
    // execute the request; curl_exec() returns false on failure rather than
    // throwing, so raise an exception ourselves to reach the catch block below
    $page = curl_exec($cURL);
    if ($page === false)
    {
        throw new RuntimeException(curl_error($cURL));
    }
    // Forward along the content type (so images, files, etc. are all understood correctly).
    // The server may not send one, in which case nothing is forwarded.
    $contentType = (string) curl_getinfo($cURL, CURLINFO_CONTENT_TYPE);
    if ($contentType !== '')
    {
        header('Content-Type: ' . $contentType);
    }
    // close curl, we're done.
    curl_close($cURL);
    // test against the content type. If it is HTML then we need to re-parse
    // the page to add our proxy intercept in the URL so the visitor keeps using
    // our cURL request above for EVERYTHING it needs from this site.
    if (stripos($contentType, 'text/html') !== false)
    {
        //
        // It's html, replace all the references to content using URLs
        //
        // First, load our DOM parser
        $html = new DOMDocument();
        $html->formatOutput = true;
        @$html->loadHTML($page); // was getting parse errors, added @ for demo purposes.
        // simple demo, look for image references and change them
        foreach ($html->getElementsByTagName('img') as $img)
        {
            // take a typical image:
            //   <img src="logo.jpg" />
            // and make it go through the proxy (so it uses cURL again):
            //   <img src="index.php?proxy=logo.jpg" />
            $img->setAttribute('src', sprintf('%s?proxy=%s', $_SERVER['PHP_SELF'], urlencode($img->getAttribute('src'))));
        }
        // finally dump it to client with the urls changed
        echo $html->saveHTML();
    }
    else
    {
        // Not HTML, just dump it.
        echo $page;
    }
}
// just in case; log the failure and tell the client the upstream fetch failed.
catch (Exception $ex)
{
    error_log($ex->getMessage());
    http_response_code(502);
    echo 'Could not retrieve the requested content.';
}
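The demo only rewrites `<img>` tags and, as its header comment admits, mangles absolute URLs. Below is a rough sketch of how both gaps might be narrowed. `resolveTarget` and `rewriteReferences` are hypothetical helpers, not part of the original answer, and the choice to accept absolute URLs only when they point at the same host as the base is an assumption made here so the script does not become an open proxy. Relative paths are still resolved against the site root, so pages in subdirectories would need more work.

```php
<?php
// Hypothetical helper: build the URL the proxy should fetch.
// Absolute http(s) URLs are accepted only when they point at the same host
// as $base; anything else falls back to $base. Other values are treated as
// paths relative to $base.
function resolveTarget(string $base, string $requested): string
{
    if (preg_match('#^https?://#i', $requested)) {
        $sameHost = parse_url($requested, PHP_URL_HOST) === parse_url($base, PHP_URL_HOST);
        return $sameHost ? $requested : $base;
    }
    return rtrim($base, '/') . '/' . ltrim($requested, '/');
}

// Hypothetical helper: route each listed tag/attribute pair through the proxy.
function rewriteReferences(DOMDocument $doc, string $proxyScript): void
{
    $targets = ['img' => 'src', 'script' => 'src', 'link' => 'href', 'a' => 'href'];
    foreach ($targets as $tag => $attr) {
        foreach ($doc->getElementsByTagName($tag) as $node) {
            if ($node->hasAttribute($attr)) {
                $node->setAttribute(
                    $attr,
                    $proxyScript . '?proxy=' . urlencode($node->getAttribute($attr))
                );
            }
        }
    }
}

// Small usage example with an inline document instead of a live cURL fetch.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><img src="logo.jpg"><a href="/about">About</a></body></html>');
rewriteReferences($doc, '/index.php');
echo $doc->saveHTML();

echo resolveTarget('http://www.google.com/', 'logo.jpg'), "\n";
// prints http://www.google.com/logo.jpg
```

In the demo, the `$_target` line could then become something like `resolveTarget('http://www.google.com/', isset($_GET['proxy']) ? $_GET['proxy'] : '')`, and the `<img>` loop could be replaced by a call to `rewriteReferences($html, $_SERVER['PHP_SELF'])`.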