【Question title】:PHP not parsing RSS using cURL properly
【Posted】:2012-02-25 08:30:51
【Question】:

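I just want to get the name of the "channel" tag, i.e. channel. The script works fine when I use it to parse RSS from Google, but when I use it with some other providers it outputs '#text' instead of the intended 'channel'. My script is below; please help.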

$url = 'http://ibnlive.in.com/ibnrss/rss/sports/cricket.xml';
$get = perform_curl($url);
$xml = new DOMDocument();
$xml->loadXML($get['remote_content']);
$fetch = $xml->documentElement;
$gettitle = $fetch->firstChild->nodeName;
echo $gettitle;
  function perform_curl($rss_feed_provider_url){

       $url = $rss_feed_provider_url;
       $curl_handle = curl_init();

       // Do we have a cURL session?
       if ($curl_handle) {
          // Set the required CURL options that we need.
          // Set the URL option.
          curl_setopt($curl_handle, CURLOPT_URL, $url);
          // Set the HEADER option. We don't want the HTTP headers in the output.
          curl_setopt($curl_handle, CURLOPT_HEADER, false);
          // Set the FOLLOWLOCATION option. We will follow if location header is present.
          curl_setopt($curl_handle, CURLOPT_FOLLOWLOCATION, true);
          // Instead of using WRITEFUNCTION callbacks, we are going to receive the remote contents as a return value for the curl_exec function.
          curl_setopt($curl_handle, CURLOPT_RETURNTRANSFER, true);

          // Try to fetch the remote URL contents.
          // This function will block until the contents are received.
          $remote_contents = curl_exec($curl_handle);

          // Do the cleanup of CURL.
          curl_close($curl_handle);

          $remote_contents = utf8_encode($remote_contents);

          $handle = @simplexml_load_string($remote_contents);
          $return_result = array();
          if(is_object($handle)){
              $return_result['handle'] = true;
              $return_result['remote_content'] = $remote_contents;
              return $return_result;
          }
          else{
              $return_result['handle'] = false;
              $return_result['content_error'] = 'INVALID RSS SOURCE, PLEASE CHECK IF THE SOURCE IS A VALID XML DOCUMENT.';
              return $return_result;
          }

        } // End of if ($curl_handle)
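      else{
        // Return the error array instead of discarding it.
        $return_result = array();
        $return_result['handle'] = false;
        $return_result['curl_error'] = 'CURL INITIALIZATION FAILED.';
        return $return_result;
      }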
   } 

【Comments】:

    Tags: php xml rss


    【Solution 1】:

    "it gives an output '#text' instead of giving 'channel' which is the intended output"

    This happens because $fetch -> firstChild -> nodeType is 3, which is XML_TEXT_NODE, i.e. just a piece of text. You can select the channel element with:
    echo $fetch->getElementsByTagName('channel')->item(0)->nodeName;
    

    Meanwhile, dumping the value of that first child:

    $gettitle = $fetch -> firstChild -> nodeValue;
    var_dump($gettitle);

    gives you

    string(5) "
        "
    

    that is, whitespace and a newline character that happen to sit between the XML tags because of how the document is formatted.
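
The behaviour described above can be reproduced without the network. The following is a minimal sketch, assuming a small inline feed ($sample is hypothetical and stands in for $get['remote_content']) and a hypothetical helper first_element_child() that is not part of the original script. It shows the whitespace text node, one way to skip it, and the preserveWhiteSpace option of DOMDocument as an alternative.

```php
<?php
// Hypothetical sample feed; the indentation between tags is what produces text nodes.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
    <channel>
        <title>Example feed</title>
    </channel>
</rss>
XML;

// Hypothetical helper: return the first child that is an element,
// skipping text nodes, comments and other non-element nodes.
function first_element_child(DOMNode $parent): ?DOMElement
{
    foreach ($parent->childNodes as $child) {
        if ($child->nodeType === XML_ELEMENT_NODE) {
            return $child;
        }
    }
    return null;
}

$doc = new DOMDocument();
$doc->loadXML($sample);
$root = $doc->documentElement;

echo $root->firstChild->nodeName, "\n";           // #text
echo first_element_child($root)->nodeName, "\n";  // channel

// Alternative: drop whitespace-only text nodes while parsing.
$compact = new DOMDocument();
$compact->preserveWhiteSpace = false; // must be set before loadXML()
$compact->loadXML($sample);
echo $compact->documentElement->firstChild->nodeName, "\n"; // channel
```

Either approach avoids depending on how a particular provider happens to indent its feed, which is presumably why the same script behaved differently across feeds.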

    P.S. The RSS feed you linked fails validation at http://validator.w3.org/feed/

    【Comments】:

      【Solution 2】:

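      Take a look at the XML: it is pretty-printed with whitespace, so it is being parsed correctly. The first child of the root node is a text node. If you want an easier time, I suggest using SimpleXML, or running an XPath query on your DOMDocument to get the tags you are interested in.

      This is how you would use SimpleXML: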

      $xml = new SimpleXMLElement($get['remote_content']);
      print $xml->channel[0]->title;
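
The XPath route mentioned above could look roughly like the sketch below. It assumes a plain RSS 2.0 document with no XML namespace on the rss and channel elements, and it uses a hypothetical inline $sample in place of $get['remote_content'].

```php
<?php
// Hypothetical sample standing in for $get['remote_content'].
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
    <channel>
        <title>Example feed</title>
    </channel>
</rss>
XML;

$doc = new DOMDocument();
$doc->loadXML($sample);

$xpath = new DOMXPath($doc);

// "*" matches element nodes only, so whitespace text nodes are ignored.
$firstElement = $xpath->query('/rss/*[1]')->item(0);
echo $firstElement->nodeName, "\n"; // channel

// evaluate() with string() returns the text content directly.
$channelTitle = $xpath->evaluate('string(/rss/channel/title)');
echo $channelTitle, "\n"; // Example feed
```

For feeds that declare namespaces (Atom, for example), the prefixes would first need to be registered with DOMXPath::registerNamespace() before such paths match.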
      

      【Comments】:
