【Question title】: Perl mechanize print HTML form names
【Posted】: 2013-05-18 01:36:12
【Question】:

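I am trying to log in to hotmail automatically. How can I find out what the right fields are? When I print the forms, I only get a bunch of hex information.

What is the correct method, and how is it used?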

use WWW::Mechanize;
use LWP::UserAgent;


my $mech = WWW::Mechanize->new();
my $url = "http://hotmail.com";
$mech->get($url);



print "Forms: $mech->forms";


if ($mech->success()){
    print "Successful Connection\n";
} else {
    print "Not a successful connection\n"; }

【Comments】:

  • What does that hex stuff look like?
  • If I open hotmail.com in Firefox with JavaScript turned off, I get redirected to login.live.com/jsDisabled.srf?mkt=EN-US&lc=1033, which says "Microsoft account requires JavaScript to sign in. This web browser either does not support JavaScript, or scripts are being blocked." Try a tool like Firebug to see what actually gets posted. I'm not sure you can easily mimic it.
  • @simbabque ARRAY(0x306b018)
  • $mech->forms returns an array reference (in scalar context). You're not very familiar with Perl, are you? Try this: use Data::Dumper; print Dumper $mech->forms;. It will show you the contents of the array ref in a more readable format. For more about references, see perldoc.perl.org/perlref.html and perldoc.perl.org/perlreftut.html
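
To illustrate the last comment: Perl does not interpolate method calls inside double-quoted strings, and a method can return different things in list and scalar context. The sketch below runs offline by using a hypothetical stand-in class, `FakeMech`, whose `forms` method only mimics the documented context behaviour of `WWW::Mechanize`'s `forms` (a list in list context, an array reference in scalar context). The class and the form names in it are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for WWW::Mechanize, so the example needs no network.
package FakeMech;

sub new {
    my $class = shift;
    return bless( { forms => [ 'login_form', 'search_form' ] }, $class );
}

# Mimics forms(): a list in list context, an array reference in scalar context.
sub forms {
    my $self = shift;
    return wantarray ? @{ $self->{forms} } : $self->{forms};
}

package main;

my $mech = FakeMech->new;

# Only "$mech" is interpolated here; "->forms" stays literal text.
my $wrong = "Forms: $mech->forms";

# Scalar context gives an array reference, which stringifies as ARRAY(0x...).
my $ref    = $mech->forms;
my $as_ref = "Forms: $ref";

# Calling the method outside the string, in list context, gives the contents.
my $right = "Forms: " . join( ', ', $mech->forms );

print "$wrong\n$as_ref\n$right\n";
```

With the real module the list elements are form objects rather than strings, so they still need to be inspected with `Data::Dumper` or their own methods.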

Tags: html forms perl mechanize


【Solution 1】:

This may help you:

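use strict;
use warnings;
use WWW::Mechanize;
use Data::Dumper;

my $mech = WWW::Mechanize->new();

my $url = "http://yoururl.com";

$mech->get($url);

# In list context, forms() returns one HTML::Form object per form on the page.
my @forms = $mech->forms;

foreach my $form (@forms) {

    # With no arguments, param() returns the names of the form's input fields.
    my @inputfields = $form->param;

    print Dumper \@inputfields;
}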
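
`$form->param` returns only the field names. A slightly fuller sketch, assuming the `HTML::Form` module (the class of the objects that `$mech->forms` returns), also prints each form's name and each field's type. To stay offline it parses a made-up login page; the HTML, the base URL and the `describe_forms` helper are illustrative and are not taken from hotmail.

```perl
use strict;
use warnings;
use HTML::Form;

# Made-up page standing in for $mech->content; not the real hotmail markup.
my $html = <<'HTML';
<html><body>
<form name="f1" method="post" action="/login">
  <input type="text"     name="login">
  <input type="password" name="passwd">
  <input type="hidden"   name="token" value="abc123">
  <input type="submit"   name="go"    value="Sign in">
</form>
</body></html>
HTML

# Hypothetical helper: one line per form, then one line per named field.
sub describe_forms {
    my ( $page, $base_url ) = @_;
    my @lines;
    for my $form ( HTML::Form->parse( $page, $base_url ) ) {
        my $name = $form->attr('name');
        push @lines, 'Form: ' . ( defined $name ? $name : '(unnamed)' );
        for my $input ( $form->inputs ) {
            next unless defined $input->name;    # skip unnamed controls
            push @lines, sprintf( '  %-8s %s', $input->type, $input->name );
        }
    }
    return @lines;
}

print "$_\n" for describe_forms( $html, 'http://example.com/' );
```

With a live `WWW::Mechanize` object, the same inner loop should work over `$mech->forms` directly.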

【Comments】:

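    【Solution 2】:

    Sometimes it is useful to look at what a website asks for before writing a reader or an interface. I wrote this bookmarklet: save it in your browser bookmarks, and when you click it while visiting any HTML page it shows all form actions and fields, even hidden values, in a popup window. Just copy the text below, paste it into the location field of a new bookmark, name it and save.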

    javascript:t=%22<TABLE%20BORDER='1'%20BGCOLOR='#B5D1E8'>%22;for(i=0;i<document.forms.length;i++){t+=%22<TR><TH%20colspan='4'%20align='left'%20BGCOLOR='#336699'>%22;t+=%22<FONT%20color='#FFFFFF'>%20Form%20Name:%20%22;t+=document.forms[i].name;t+=%22</FONT></TH></TR>%22;t+=%22<TR><TH%20colspan='4'%20align='left'%20BGCOLOR='#99BADD'>%22;t+=%22<FONT%20color='#FFFFFF'>%20Form%20Action:%20%22;t+=document.forms[i].action;t+=%22</FONT></TH></TR>%22;t+=%22<TR><TH%20colspan='4'%20align='left'%20BGCOLOR='#99BADD'>%22;t+=%22<FONT%20color='#FFFFFF'>%20Form%20onSubmit:%20%22;t+=document.forms[i].onSubmit;t+=%22</FONT></TH></TR>%22;t+=%22<TR><TH>ID:</TH><TH>Element%20Name:</TH><TH>Type:</TH><TH>Value:</TH></TR>%22;for(j=0;j<document.forms[i].elements.length;j++){t+=%22<TR%20BGCOLOR='#FFFFFF'><TD%20align='right'>%22;t+=document.forms[i].elements[j].id;t+=%22</TD><TD%20align='right'>%22;t+=document.forms[i].elements[j].name;t+=%22</TD><TD%20align='left'>%20%22;t+=document.forms[i].elements[j].type;t+=%22</TD><TD%20align='left'>%20%22;if((document.forms[i].elements[j].type==%22select-one%22)%20||%20(document.forms[i].elements[j].type==%22select-multiple%22)){t_b=%22%22;for(k=0;k<document.forms[i].elements[j].options.length;k++){if(document.forms[i].elements[j].options[k].selected){t_b+=document.forms[i].elements[j].options[k].value;t_b%20+=%20%22%20/%20%22;t_b+=document.forms[i].elements[j].options[k].text;t_b+=%22%20%22;}}t+=t_b;}else%20if%20(document.forms[i].elements[j].type==%22checkbox%22){if(document.forms[i].elements[j].checked==true){t+=%22True%22;}else{t+=%22False%22;}}else%20if(document.forms[i].elements[j].type%20==%20%22radio%22){if(document.forms[i].elements[j].checked%20==%20true){t+=document.forms[i].elements[j].value%20+%20%22%20-%20CHECKED%22;}else{t+=document.forms[i].elements[j].value;}}else{t+=document.forms[i].elements[j].value;}t+=%22</TD></TR>%22;}}t+=%22</TABLE>%22;mA='menubar=yes,scrollbars=yes,resizable=yes,height=800,width=600,alwaysRaised=yes';nW=window.open(%22/empty.html%22,%22Display_Vars%22,%20mA);nW.document.write(t);
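
A rough Perl-side counterpart to this kind of inspection, as a sketch: `HTML::Form` objects have a `dump` method that returns a text summary of the form's method, action and fields, including hidden values. The example below parses a made-up page so that it runs offline; the HTML, the base URL and the `dump_all_forms` helper are assumptions for illustration.

```perl
use strict;
use warnings;
use HTML::Form;

# Made-up page; with WWW::Mechanize you would pass $mech->content instead.
my $html = <<'HTML';
<form name="f1" method="post" action="/login">
  <input type="text"   name="login">
  <input type="hidden" name="token" value="abc123">
</form>
HTML

# Hypothetical helper: returns one text dump per form found on the page.
sub dump_all_forms {
    my ( $page, $base_url ) = @_;
    return map { scalar $_->dump } HTML::Form->parse( $page, $base_url );
}

print for dump_all_forms( $html, 'http://example.com/' );
```

`WWW::Mechanize` also offers `$mech->dump_forms`, which prints a similar summary for every form on the page it last fetched. Neither approach sees fields that the page creates with JavaScript, which the bookmarklet does because it runs inside the browser.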

    【Comments】:

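      【Solution 3】:

      I tried to simulate the POST request that sends your login information, but the site appears to add a bunch of ids (long generated strings and so on) to the URL dynamically, and I could not figure out how to mimic them. So I wrote the hacky workaround below.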

      #!/usr/bin/perl
      
      use strict;
      use warnings;
      use WWW::Curl::Easy;
      use Data::Dumper;
      my $curl = WWW::Curl::Easy->new;
      
      #this is the name and complete path to the new html file we will create
      my $new_html_file = 'XXXXXXXXX';
      my $password = 'XXXXXXXX';
      my $login = 'XXXXXXXXX';
      
      #escape the .
      $login =~ s/\./\\./g;
      
      my $html_to_insert = qq(<script src="//ajax.googleapis.com/ajax/libs/jquery/2.0.0/jquery.min.js"></script><script type="text/javascript">setTimeout('testme()', 3400);function testme(){document.getElementById('res_box').innerHTML = '<h3 class="auto_click_login_np">Logging in...</h3>';document.f1.passwd.value = '$password';document.f1.login.value = '$login';\$("#idSIButton9").trigger("click");}var counter = 5;setInterval('countdown()', 1000);function countdown(){document.getElementById('res_box').innerHTML = '<h3 class="auto_click_login_np">You should be logged in within ' + counter + ' seconds</h3>';counter--;}</script><h2 style="background-color:#004c00; color: #fff; padding: 4px;" id="res_box" onclick="testme()" class="auto_click_login">If you are not logged in after a few seconds, click here.</h2>);
      
      $curl->setopt(CURLOPT_HEADER,1);
      my $url = 'https://login.live.com';
      $curl->setopt(CURLOPT_URL, $url);
      
      # A filehandle, reference to a scalar or reference to a typeglob can be used here.
      my $response_body;
      
      $curl->setopt(CURLOPT_WRITEDATA, \$response_body);
      
      open( my $fresh_html_handle, '+>', 'fresh_html_from_login_page.html')
          or die "Cannot open fresh_html_from_login_page.html: $!\n";
      
      
      # Starts the actual request
      my $curl_return_code = $curl->perform;
      
      # Looking at the results...
      if ($curl_return_code == 0) {
              print("Transfer went ok\n");
              my $response_code = $curl->getinfo(CURLINFO_HTTP_CODE);
              # judge result and next action based on $response_code
      
              print $fresh_html_handle $response_body;
      
         } else {
              # Error code, type of error, error message
              print("An error happened: $curl_return_code ".$curl->strerror($curl_return_code)." ".$curl->errbuf."\n");
      }
        close($fresh_html_handle);   
      
      
      # Start the edited file from scratch; opening with '>' truncates any existing copy.
      open( my $new_html_handle, '>', $new_html_file )
          or die "Cannot open $new_html_file: $!\n";

      # Open the file with the login page html
      open( my $fresh_html_in, '<', 'fresh_html_from_login_page.html' )
          or die "Cannot open fresh_html_from_login_page.html: $!\n";

      # Becomes 1 at the DOCTYPE line, so the HTTP headers curl wrote before it are skipped
      my $tracker = 0;

      while ( my $line = <$fresh_html_in> ) {
          if ( $line =~ /DOCTYPE/ ) {
              $tracker = 1;
              print $new_html_handle $line;
          } elsif ( $line =~ /<\/body><\/html>/ ) {
              # Now add the javascript and html to automatically log the user in
              print $new_html_handle "$html_to_insert\n$line";
          } elsif ( $tracker == 1 ) {
              print $new_html_handle $line;
          }
      }
      close($fresh_html_in);
      close($new_html_handle);

      # Open the edited page in Firefox; $new_html_file must be an absolute path
      my $sys_call_res = system( 'firefox', "file://$new_html_file" );
      print "\n\nresult: $sys_call_res\n\n";
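
For comparison, here is a sketch of the direct approach this answer gave up on: letting `HTML::Form` build the POST request, so that hidden values present in the served HTML are carried along automatically. It only helps when the generated ids are in the HTML itself; values added by JavaScript after the page loads, as described above, will not appear. The field names `login` and `passwd` follow the script above, while the page, the `token` field, the URLs and the `build_login_request` helper are made up for illustration. Nothing is sent over the network.

```perl
use strict;
use warnings;
use HTML::Form;

# Made-up login page; the real one would come from the HTTP response body.
my $html = <<'HTML';
<form name="f1" method="post" action="/login">
  <input type="text"     name="login">
  <input type="password" name="passwd">
  <input type="hidden"   name="token" value="abc123">
  <input type="submit"   name="go"    value="Sign in">
</form>
HTML

# Hypothetical helper: fills in the first form and returns the request
# that clicking its submit button would produce.
sub build_login_request {
    my ( $page, $base_url, $login, $password ) = @_;
    my ($form) = HTML::Form->parse( $page, $base_url );
    $form->value( login  => $login );
    $form->value( passwd => $password );
    return $form->click;    # an HTTP::Request object; nothing is sent
}

my $request = build_login_request( $html, 'http://example.com/',
    'user@example.com', 'secret' );
print $request->method, ' ', $request->uri, "\n", $request->content, "\n";
```

The returned object could then be handed to `LWP::UserAgent`'s `request` method; whether a given site accepts it depends on how much of its login flow relies on JavaScript.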
      

      【Comments】:
