How do I get all the <span> items under a specific section id? (JSoup)

[Question title]: How do I get all the <span> items under a specific section id? (JSoup)
[Posted]: 2020-08-15 03:16:40
[Question]:

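I'm trying to get every item with the class "frame-title" that sits under the section whose id is "favourites". I'm very new to web crawling, so I'm sure it's something simple.

HTML: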

<section id="favourites" class="section">
 <h2 class="section-heading">Favorite films</h2>
    <div>
     <a href="/film/wings-of-desire/" class="frame has-menu" data-original-title="Wings of Desire (1987)"><span class="frame-title">Wings of Desire (1987)</span>

I want to retrieve the "frame-title" part. That is the name of the film.

Here is what I have tried:

for (int x = 0; x < users.size(); x++) {
    String url2 = "https://letterboxd.com" + users.get(x);
    Document doc = Jsoup.connect(url2).get();
    Elements films = doc.select("a:has(frame-title)");

    for (Element film : films) {
        String temp = films.attr("href").toString();
        films1.add(temp);
        System.out.println(temp);
    }
}

[Question comments]:

Tags: java web-crawler jsoup

[Solution 1]:

Try this.

String url = "https://letterboxd.com/saka1029/";
Document doc = Jsoup.connect(url).get();
Elements films = doc.select("section#favourites span.frame-title");
for (Element film : films) {
    System.out.println(film);
}

Output:

<span class="frame-title">Ghost in the Shell</span>

This is my favourite. :)
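
For anyone wondering why the selector in the question matched nothing: in jsoup's selector syntax a bare word such as frame-title is read as a tag name, while a class needs a leading dot. The loop in the question also calls attr on the whole films collection, which returns the value from the first matching element rather than from the current one. The sketch below checks both points offline. The class name SelectorCheck, its helper methods, and the HTML string (a trimmed copy of the question's fragment with closing tags added) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class SelectorCheck {
    // Trimmed copy of the question's fragment; closing tags are added here (assumption).
    static final String HTML =
          "<section id=\"favourites\" class=\"section\">"
        + "<h2 class=\"section-heading\">Favorite films</h2>"
        + "<div><a href=\"/film/wings-of-desire/\" class=\"frame has-menu\">"
        + "<span class=\"frame-title\">Wings of Desire (1987)</span></a></div>"
        + "</section>";

    // Number of elements in the sample document that match a CSS query.
    static int countMatches(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        return doc.select(cssQuery).size();
    }

    // href of every matching element, read from each element in turn.
    static List<String> hrefs(String cssQuery) {
        Document doc = Jsoup.parse(HTML);
        List<String> result = new ArrayList<>();
        for (Element link : doc.select(cssQuery)) {
            // Use the loop variable, not the whole Elements collection.
            result.add(link.attr("href"));
        }
        return result;
    }

    public static void main(String[] args) {
        // "frame-title" without a dot is treated as a tag name.
        System.out.println(countMatches("a:has(frame-title)"));
        // With the dot it is a class selector, scoped here to the favourites section.
        System.out.println(hrefs("section#favourites a:has(span.frame-title)"));
    }
}
```

This needs the jsoup jar on the classpath. If the film links are what you are after rather than the titles, the second selector is probably the one to start from.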

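[Comments]:

Oh, this works! Thanks! Just one thing: how can I isolate only the title, without any of the surrounding HTML? I want to add each film title to an ArrayList.
You can get the title with film.text().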
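
Putting the answer and the film.text() suggestion together, a minimal sketch for filling an ArrayList with the plain titles could look like this. The class name FavouriteTitles, the titles helper, and the inline HTML sample are assumptions for illustration; with a live page you would pass in the Document returned by Jsoup.connect(url).get() instead.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FavouriteTitles {
    // Collects the text of every span.frame-title inside section#favourites.
    static List<String> titles(Document doc) {
        List<String> result = new ArrayList<>();
        for (Element film : doc.select("section#favourites span.frame-title")) {
            // text() returns the element's text content without the surrounding tags.
            result.add(film.text());
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a downloaded page (assumption); the span outside the section is ignored.
        String html =
              "<section id=\"favourites\">"
            + "<span class=\"frame-title\">Wings of Desire (1987)</span>"
            + "<span class=\"frame-title\">Ghost in the Shell</span>"
            + "</section>"
            + "<span class=\"frame-title\">Not a favourite</span>";
        System.out.println(titles(Jsoup.parse(html)));
    }
}
```

Recent jsoup versions also offer Elements.eachText(), which returns the same kind of list in a single call, so the explicit loop is mainly useful when you want to filter or transform each title on the way in.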