Find element in Selenium for LinkedIn Job Page

【Question title】: Find element in Selenium for LinkedIn Job Page
【Posted】: 2021-09-08 18:43:00
【Question】:

I am currently trying to scrape job listings on LinkedIn, and I have written this code:

job = browser.find_elements_by_tag_name("a")
c = []

for i in job:
    c.append(i.text)
print(c)
print((len(c)))

This does not return the correct output. I want to retrieve the job title of each posting, but this is the output it gives me:

['LinkedIn', 'new feed updates notifications\nHome', 'My Network', 'Jobs', 'Messaging', '1\n1 new notification\nNotifications', '', '', 'Global Analytic Insights Consultant', 'Concentrix', '6 alumni work here', '', 'Data Intelligence Engineer', 'Arrow Electronics', '', 'Junior Data Modeler', 'Teradata', '1 alum works here', '', 'Research Analyst - BASES', 'NielsenIQ', '1 alum works here', '', 'Group Visualisation and Reporting Specialist', 'Coca-Cola Beverages South Africa (CCBSA)', '', 'Group External Commercial Data Specialist', 'Coca-Cola Beverages South Africa (CCBSA)', '', 'Junior/ Business Analyst', 'Arrow Electronics', '', 'Technical Consultant (O365)', 'Microsoft', '2 connections work here', '', 'Business Analyst', '', 'Support Engineer for Power BI', 'Microsoft', '2 connections work here', '', 'Insights and Analytics Specialist', 'Souq.com', '', 'Call Center Representative', 'Raya CX', '', 'Work from Home Opportunities | Flexible Hours', 'Appen', '', 'Early Careers Program', 'DXC Technology', '1 alum works here', '', 'Business Intelligence Executive', 'noon', '', 'Performance Management Analyst', 'talabat', '7 alumni work here', '', 'IT Business Analyst', 'ALEXBANK', '30 alumni work here', '', 'Study Abroad', 'Educatly', '', 'Work from Home Opportunities | Flexible Hours', 'Appen', '', 'Associate Managing Consultant, Advisors', 'Mastercard', '', 'Assistant', 'NielsenIQ', '', 'Program Support Associate', 'Souq.com', '', 'Junior/ Business Analyst', 'Arrow Electronics', '', 'Internet Analyst', 'Appen', '', 'Oracle Cloud SCM / ERP Junior Coordinator (12 months contract)', 'Oracle', '1 connection works here', 'Try Premium for free', 'About', 'Accessibility', 'Help Center', 'Ad Choices', 'Advertising', '', '', '', '', '']
101

This is the link to the page:

linkedin.com/jobs/search/?geoId=106155005&location=Egypt

【Question comments】:

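Look for a unique attribute or class that the elements you want to scrape belong to. If several elements share the same class, you can index into the resulting list with list[i] afterward.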
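The suggestion in this comment can be sketched roughly as follows. This is only an illustration, not the commenter's code: the helper name `elements_with_class` is hypothetical, and the class name `job-card-list__title` is an assumption about the page's markup. The helper works on any objects that offer Selenium's `get_attribute` method.

```python
def elements_with_class(elements, class_name):
    """Keep only the elements whose class attribute contains class_name.

    `elements` is any iterable of WebElement-like objects, for example the
    list returned by a tag-name lookup for "a".
    """
    matched = []
    for el in elements:
        # get_attribute returns None when the attribute is missing.
        classes = (el.get_attribute("class") or "").split()
        if class_name in classes:
            matched.append(el)
    return matched


# Hypothetical usage with an already opened `browser`:
#   anchors = browser.find_elements_by_tag_name("a")
#   titles = elements_with_class(anchors, "job-card-list__title")
#   print(titles[0].text)   # index into the list, as the comment suggests
```

Filtering after a broad lookup is slower than asking the browser for the class directly, so a class-based locator is usually the better choice once the class is known.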

Tags: selenium

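【Solution 1】:

This is because you are using the wrong locator.
There are many elements with the tag name a on that page, and most of them are not what you are looking for.
That is why you get this result.
UPD
This code should work: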

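from selenium.webdriver.common.by import By

# Match only the job-title links instead of every <a> on the page.
# find_elements(By.CSS_SELECTOR, ...) replaces find_elements_by_css_selector,
# which was removed in Selenium 4.3.
job_titles = browser.find_elements(By.CSS_SELECTOR, "a.job-card-list__title")
c = []

for title in job_titles:
    c.append(title.text)
print(c)
print(len(c))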
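If a locator like the one above matches elements but their `.text` comes back empty, one possible cause is that the elements are not currently displayed, since Selenium's `.text` returns only rendered text. That is an assumption here and has not been confirmed for this page. A minimal sketch of a fallback that reads the `textContent` attribute instead (the helper name `element_text` is hypothetical):

```python
def element_text(element):
    """Return the element's visible text, or its raw textContent as a fallback."""
    text = (element.text or "").strip()
    if text:
        return text
    # textContent includes text of nodes that are not currently displayed.
    raw = element.get_attribute("textContent") or ""
    return raw.strip()


# Hypothetical usage with the `job_titles` list from the answer:
#   c = [element_text(title) for title in job_titles]
```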

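【Comments】:

linkedin.com/jobs/search/?geoId=106155005&location=Egypt This is the link to the page; please tell me what to do.
Please see the updated answer. Let me know if there are still any issues or questions.
This is the output it gives: ['', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ''] 25
@Prophet You need to change the for loop: for title in job_titles: c.append(title.text)
Oh, that did it! Thank you so much! Could you explain what you did? I also need to scrape other things on the right side of the page, such as the title and the details of each job.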

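【Solution 2】:

This is the CSS selector to get the text from: .disabled.ember-view.job-card-container__link.job-card-list__title
It looks best to scroll down to the bottom of the page first so that all jobs load. There are 25 on the page, but you only see 7 when you open it, so it is better to scroll down before searching for the titles.

If you want XPath: //a[@class="disabled ember-view job-card-container__link job-card-list__title"]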
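The scrolling advice could look roughly like the sketch below, which scrolls each matched element into view before reading its text. It is only one way to do it, not necessarily what the answerer had in mind. The helper name `titles_with_scroll` and the `pause` value are hypothetical, and how long the page needs to load each card is an assumption.

```python
import time


def titles_with_scroll(browser, elements, pause=0.5):
    """Scroll each element into view, wait briefly, then collect its text."""
    titles = []
    for el in elements:
        # Run JavaScript in the page to bring this element into the viewport.
        browser.execute_script("arguments[0].scrollIntoView();", el)
        time.sleep(pause)  # give lazily loaded content a moment to render
        titles.append(el.text.strip())
    return titles


# Hypothetical usage with an already opened `browser`:
#   cards = browser.find_elements_by_css_selector("a.job-card-list__title")
#   print(titles_with_scroll(browser, cards))
```

A fixed pause is the simplest option; an explicit wait on the element's text would be more robust but is left out to keep the sketch short.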

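【Comments】:

Could you please write out the code so I can understand it better?
@SeifMahdi As Prophet wrote, it is fine; you only need to update the for loop.
Can you help me with this question? stackoverflow.com/questions/69109163/…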