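【Posted】:2017-08-27 23:55:26
【Question】:
I am trying to debug my site's .htaccess and robots.txt. I want to use cURL or wget to request files that I have blocked with robots.txt, or pages that .htaccess should redirect to another location.
My robots.txt contains the following: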
User-agent: *
Disallow: /wp/wp-admin/
However, I can still fetch it:
wget
$ wget http://xxxx.com/wp/wp-admin/
SYSTEM_WGETRC = c:/progra~1/wget/etc/wgetrc
syswgetrc = C:\Program Files (x86)\GnuWin32/etc/wgetrc
--2017-08-28 07:37:05-- http://xxxx.com/wp/wp-admin/
Resolving xxxx.com... 118.127.47.249
Connecting to xxxx.com|118.127.47.249|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://xxxx.com/wp/wp-login.php?redirect_to=http%3A%2F%2Fxxxx.com%2Fwp%2Fwp-admin%2F&reauth=1 [following]
--2017-08-28 07:37:12-- http://xxxx.com/wp/wp-login.php?redirect_to=http%3A%2F%2Fxxxx.com%2Fwp%2Fwp-admin%2F&reauth=1
Connecting to xxxx.com|118.127.47.249|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2891 (2.8K) [text/html]
Saving to: `wp-login.php@redirect_to=http%3A%2F%2Fxxxx.com%2Fwp%2Fwp-admin%2F&reauth=1'
100%[==============================================================================>] 2,891 --.-K/s in 0.1s
2017-08-28 07:37:17 (22.2 KB/s) - `wp-login.php@redirect_to=http%3A%2F%2Fxxxx.com%2Fwp%2Fwp-admin%2F&reauth=1' saved [2891/2891]
curl
$ curl -L xxx.com/wp/wp-admin -o wp-admin.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  1147  100  1147    0     0    107      0  0:00:10  0:00:10 --:--:--   280
  0     0    0     0    0     0      0      0 --:--:--  0:01:37 --:--:--     0
100  2891  100  2891    0     0     17      0  0:02:50  0:02:42  0:00:08   234
Neither wget nor curl respects robots.txt. Is there a way to check how my .htaccess and robots.txt actually behave? Thanks!
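For the robots.txt half of the question, one possible check is to parse the rules offline and ask whether a given path is permitted. robots.txt is advisory and is not enforced by the server, so a plain single-URL request from wget or curl will still go through. The sketch below uses Python's standard `urllib.robotparser` with the two rules quoted above. The helper name `is_allowed` and the list of sample paths are assumptions made for illustration; it does not test .htaccess redirects.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the robots.txt quoted in the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp/wp-admin/
"""


def is_allowed(path, user_agent="*", robots_txt=ROBOTS_TXT):
    """Return True if the given robots.txt text permits user_agent to fetch path.

    Hypothetical helper: it only evaluates the rules, it sends no request.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Sample paths chosen for illustration.
    for path in ("/wp/wp-admin/", "/wp/wp-admin/index.php", "/wp/wp-login.php"):
        verdict = "allowed" if is_allowed(path) else "disallowed"
        print(path, verdict)
```

To check the live file instead of a pasted copy, `RobotFileParser` also offers `set_url()` and `read()`, which download the robots.txt before `can_fetch()` is called.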
【Comments】:
Tags: .htaccess curl wget robots.txt