【Posted】:2018-04-13 11:40:01
【Question】:
Here is working Python code that scrapes images from Instagram:
from __future__ import print_function
import json, re, requests

user = 'thefatfoxcamden'
profile = 'https://www.instagram.com/' + user

with requests.session() as s:
    s.headers['user-agent'] = 'Mozilla/5.0'
    end_cursor = ''
    for count in range(1, 3):
        print('PAGE: ', count)
        # Request the profile page, passing the previous page's cursor as max_id
        r = s.get(profile, params={'max_id': end_cursor})
        # Pull the JSON blob assigned to window._sharedData out of the HTML
        data = re.search(r'window._sharedData = (\{.+?});</script>', r.text).group(1)
        j = json.loads(data)
        for node in j['entry_data']['ProfilePage'][0]['user']['media']['nodes']:
            if node['is_video']:
                # Videos need a second request to the post page for the video URL
                page = 'https://www.instagram.com/p/' + node['code']
                video_r = s.get(page)
                url = re.search(r'"video_url": "([^"]+)"', video_r.text).group(1)
                print('VIDEO:', url)
            else:
                print('IMAGE:', node['display_src'])
        # Remember the cursor from the profile page for the next request
        end_cursor = re.search(r'"end_cursor": "([^"]+)"', r.text).group(1)
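The code works fine. Now I want to convert it to PHP.
So far I can only get the first iteration of the loop (the first page).
Here is the PHP code: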
<?php
// Fetch the profile page and decode the JSON assigned to window._sharedData
function scrape_insta($username) {
    $insta_source = file_get_contents('http://instagram.com/'.$username);
    $shards = explode('window._sharedData = ', $insta_source);
    $insta_json = explode(';</script>', $shards[1]);
    $insta_array = json_decode($insta_json[0], TRUE);
    return $insta_array;
}

// Supply a username
$my_account = 'thefatfoxcamden';
$results_array = scrape_insta($my_account);
$image_array = array();
$dataList = $results_array['entry_data']['ProfilePage'][0]['graphql']['user']['edge_owner_to_timeline_media']['edges'];
for ($i = 0; $i < count($dataList); $i++) {
    // The array key must be quoted; a bare __typename is read as an undefined constant
    if ($dataList[$i]['node']['__typename'] == 'GraphImage') {
        //echo '<img src="'.$dataList[$i]['node']['display_url'].'"><br>';
        echo '<img src="'.$dataList[$i]['node']['thumbnail_src'].'"><br>';
    }
}
The code above only fetches the page once.
I want it to run 3 times in a loop, like the Python code does.
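To make the target behavior concrete, here is a minimal sketch of the loop I am after, written in Python against the newer JSON layout that the PHP code reads. It is only a sketch under assumptions: `parse_page`, `scrape_pages`, and the `fetch` callable are hypothetical names introduced for illustration, and I am assuming the cursor sits under `page_info` / `end_cursor` next to `edges` and is still accepted as the `max_id` parameter.

```python
import json
import re

# Matches the JSON object assigned to window._sharedData in the page HTML.
SHARED_DATA_RE = re.compile(r'window\._sharedData = (\{.+?\});</script>', re.DOTALL)


def parse_page(html):
    """Return (image_urls, end_cursor) for one profile page's HTML."""
    match = SHARED_DATA_RE.search(html)
    if match is None:
        raise ValueError('window._sharedData not found in page')
    data = json.loads(match.group(1))
    media = (data['entry_data']['ProfilePage'][0]['graphql']['user']
             ['edge_owner_to_timeline_media'])
    images = [edge['node']['thumbnail_src']
              for edge in media['edges']
              if edge['node']['__typename'] == 'GraphImage']
    # Assumption: the next-page cursor lives at page_info.end_cursor.
    end_cursor = media.get('page_info', {}).get('end_cursor')
    return images, end_cursor


def scrape_pages(fetch, pages=3):
    """Collect image URLs from up to `pages` pages.

    `fetch` is a hypothetical callable taking the current cursor (sent as
    max_id) and returning the page HTML, e.g. a wrapper around requests.
    """
    end_cursor = ''
    all_images = []
    for _ in range(pages):
        images, end_cursor = parse_page(fetch(end_cursor))
        all_images.extend(images)
        if not end_cursor:
            # No further cursor: stop instead of re-requesting the first page.
            break
    return all_images
```

With the Python code above, `fetch` could be something like `lambda cursor: s.get(profile, params={'max_id': cursor}).text`. What I need is the PHP equivalent: carry the cursor from each response into the next request instead of calling `scrape_insta` only once.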
Thanks in advance.
【Comments】:
Tags: php python regex instagram