【Posted】:2016-04-20 10:16:20
【Question】:
I am trying to fetch some details for multiple products. The following code handles a single URL and works fine:
<?php
$url = "http://www.flipkart.com/healthgenie-hd-221-digital-black-dotted-weighing-scale/p/itmeatqzeehkdsmg";
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
$output = curl_exec($curl);
curl_close($curl);
$dom = new DOMDocument;
libxml_use_internal_errors(true);
$dom->loadHTML($output);
libxml_clear_errors();
$xpath = new DOMXpath($dom);
// Product Name
$find_product_name = $xpath->query('//h1[@class="title"]');
if($find_product_name->length > 0)
{
$product_name= $find_product_name->item(0)->nodeValue;
}
else
{
$product_name="Product Name Not Found";
}
// Sold by
$find_seller = $xpath->query('//a[@class="seller-name"]');
if($find_seller->length > 0)
{
$seller= $find_seller->item(0)->nodeValue;
}
else
{
$seller="Seller Not Found";
}
// List Price
$find_list_price = $xpath->query('//span[@class="price"]');
if($find_list_price->length > 0)
{
$list_price= $find_list_price->item(0)->nodeValue;
}
else
{
$list_price="Not Available";
}
// Sale Price
$find_sale_price = $xpath->query('//span[@class="selling-price omniture-field"]');
if($find_sale_price->length > 0)
{
$sale_price= $find_sale_price->item(0)->nodeValue;
}
else
{
$sale_price="Not Available";
}
// Stock Status
$find_stock = $xpath->query('//div[@class="out-of-stock-status"]');
if($find_stock->length > 0)
{
$stock= $find_stock->item(0)->nodeValue;
}
else
{
$stock = "In Stock!";
}
//
?>
<table width="100%" align="center" border="1">
<tr>
<th>Product Name</th>
<th>Sold By</th>
<th>List Price</th>
<th>Sale Price</th>
<th>Stock Status</th>
</tr>
<tr>
<td><?php echo '<a href="'.htmlspecialchars($url).'" target="_blank">'.htmlspecialchars($product_name).'</a>'; ?></td>
<td><?php echo htmlspecialchars($seller); ?></td>
<td><?php echo htmlspecialchars($list_price); ?></td>
<td><?php echo htmlspecialchars($sale_price); ?></td>
<td><?php echo htmlspecialchars($stock); ?></td>
</tr>
</table>
Now I need a way to fetch data from multiple URLs at once. I want to run the same process for several URLs by taking them from an array. For example:
<?php
$url = array(
"http://www.flipkart.com/healthgenie-hd-221-digital-black-dotted-weighing-scale/p/itmeatqzeehkdsmg",
"http://www.flipkart.com/healthgenie-hd-221-digital-black-dotted-weighing-scale/p/itmeatqzeehkdsmg",
"http://www.flipkart.com/healthgenie-hd-221-digital-black-dotted-weighing-scale/p/itmeatqzeehkdsmg"
);
?>
I would appreciate it if someone could take a look and help me solve this. Thanks in advance.
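One possible way to approach this is to move the single-URL steps into functions and loop over the array. The following is only a minimal sketch under some assumptions: the function names (`fetch_html`, `first_text`, `extract_product_details`, `scrape_products`), the 30-second timeout, and the "Fetch Failed" placeholder are hypothetical and not part of the original code. The XPath queries and fallback strings are copied from the code above.

```php
<?php
// Hypothetical helper: download one page, returning its HTML or null on failure.
function fetch_html(string $url): ?string
{
    $curl = curl_init($url);
    curl_setopt($curl, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($curl, CURLOPT_TIMEOUT, 30); // assumed limit, not in the original code
    $output = curl_exec($curl);
    curl_close($curl);
    return (is_string($output) && $output !== '') ? $output : null;
}

// Hypothetical helper: text of the first node matching $query, or $default if none.
function first_text(DOMXPath $xpath, string $query, string $default): string
{
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return $default;
    }
    return trim((string) $nodes->item(0)->nodeValue);
}

// Runs the same five XPath lookups as the single-URL code on one HTML string.
function extract_product_details(string $html): array
{
    $dom = new DOMDocument;
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);

    return [
        'product_name' => first_text($xpath, '//h1[@class="title"]', 'Product Name Not Found'),
        'seller'       => first_text($xpath, '//a[@class="seller-name"]', 'Seller Not Found'),
        'list_price'   => first_text($xpath, '//span[@class="price"]', 'Not Available'),
        'sale_price'   => first_text($xpath, '//span[@class="selling-price omniture-field"]', 'Not Available'),
        'stock'        => first_text($xpath, '//div[@class="out-of-stock-status"]', 'In Stock!'),
    ];
}

// Fetches and parses every URL in turn. $fetch can be swapped out (e.g. in tests).
function scrape_products(array $urls, ?callable $fetch = null): array
{
    $fetch = $fetch ?? 'fetch_html';
    $fields = ['product_name', 'seller', 'list_price', 'sale_price', 'stock'];
    $rows = [];
    foreach ($urls as $url) {
        $html = $fetch($url);
        $details = ($html === null)
            ? array_fill_keys($fields, 'Fetch Failed') // placeholder row for a failed download
            : extract_product_details($html);
        $rows[] = ['url' => $url] + $details;
    }
    return $rows;
}
```

With this in place, `$rows = scrape_products($url);` would return one associative array per URL, and the table body could be produced with a `foreach ($rows as $row)` loop that prints one `<tr>` per row. The pages are downloaded one after another here, so the total time grows with the number of URLs.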
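【Comments】:
- Read up on multi_curl: php.net/manual/en/function.curl-multi-init.php
- Hi Mulder, thanks for your comment. I have checked it, but I have never used curl. I would appreciate it if someone could help me understand the code structure, for example how many functions I need to create and how.
Tags: php curl web-scraping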
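For reference, the `curl_multi` approach suggested in the comment could be structured as a single function that downloads all pages concurrently. This is only a rough sketch: the function name `fetch_all` and the default timeout are assumptions, and error handling is reduced to returning `null` for any URL that yields no content.

```php
<?php
// Hypothetical helper: download all URLs concurrently.
// Returns an array with the same keys as $urls; each value is the HTML or null.
function fetch_all(array $urls, int $timeout = 30): array
{
    if ($urls === []) {
        return [];
    }

    $multi = curl_multi_init();
    $handles = [];

    // Create one easy handle per URL and register it with the multi handle.
    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, $timeout); // assumed limit
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $active);
        if ($active && curl_multi_select($multi) === -1) {
            usleep(1000); // select can fail on some systems; avoid a busy loop
        }
    } while ($active && $status === CURLM_OK);

    // Collect the response bodies and release the handles.
    $results = [];
    foreach ($handles as $key => $ch) {
        $body = curl_multi_getcontent($ch);
        $results[$key] = (is_string($body) && $body !== '') ? $body : null;
        curl_multi_remove_handle($multi, $ch);
        curl_close($ch);
    }
    curl_multi_close($multi);

    return $results;
}
```

Each non-null entry of the result can then be passed through the same `DOMDocument` and `DOMXpath` steps used in the question. So in terms of structure, one function for downloading and one for parsing is probably enough, with a loop to print one table row per URL.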