【Question Title】:How to scrape data from my LinkedIn network connections?
【Posted】:2019-03-27 06:21:40
【Question】:

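How can I scrape data from my LinkedIn network connections? I have successfully scraped a person's name from a single profile URL, https://www.linkedin.com/in/$yourfriendname/.

Is there a way to scrape the same data from https://www.linkedin.com/mynetwork/invite-connect/connections/?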

from selenium import webdriver
from bs4 import BeautifulSoup
import getpass
import requests
from selenium.webdriver.common.keys import Keys
import pprint

chrome_path = '/usr/bin/chromedriver'
driver = webdriver.Chrome(chrome_path)
driver.get("https://www.linkedin.com")
userid = 'xxxx@gmail.com'
password = 'xxxxxxxx'
driver.implicitly_wait(6)

# Log in through the form on the landing page.
driver.find_element_by_xpath('//*[@id="login-email"]').send_keys(userid)
driver.find_element_by_xpath('//*[@id="login-password"]').send_keys(password)
driver.find_element_by_xpath('//*[@id="login-submit"]').click()

# Open a single profile and read the name from the top card.
url = 'www.linkedin.com/in/$yourfriendname/'
driver.get("https://" + url.rstrip())
connectionName = driver.find_element_by_class_name(
    'pv-top-card-section__name').get_attribute('innerHTML')
print(connectionName)

>>YOUR FRIEND NAME

url1='https://www.linkedin.com/mynetwork/invite-connect/connections/'
driver.get(url1.rstrip())

How do I scrape the connections from url1 above?

【Comments】:

    Tags: python selenium-webdriver beautifulsoup


    【Solution 1】:

    The code below extracts the href attribute of each connection's profile link:

    url1='https://www.linkedin.com/mynetwork/invite-connect/connections/'
    driver.get(url1.rstrip())
    # Find the profile link inside each connection card.
    elems = driver.find_elements_by_xpath("//div[@class='mn-connection-card__details']//a[@data-control-name='connection_profile'][@href]")

    for elem in elems:
        # Scroll to the bottom of the page on each iteration, then print the link.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        print(elem.get_attribute("href"))
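
The XPath lookup above only sees the connection cards that are in the DOM at the moment it runs, so a long list may be cut short. One possible variation, offered as a sketch rather than a tested LinkedIn scraper, is to keep scrolling until the page height stops growing and only then read the links. The helper names `scroll_until_stable` and `collect_profile_links`, and the `pause` and `max_rounds` parameters, are hypothetical and not part of the original answer. The XPath is copied from the answer and may no longer match LinkedIn's markup.

```python
import time

# XPath copied from the answer above; LinkedIn's markup may have changed since.
PROFILE_LINK_XPATH = (
    "//div[@class='mn-connection-card__details']"
    "//a[@data-control-name='connection_profile'][@href]"
)


def scroll_until_stable(driver, pause=1.0, max_rounds=50):
    """Scroll to the bottom repeatedly until the page height stops growing.

    Returns the number of scrolls performed.
    """
    last_height = driver.execute_script("return document.body.scrollHeight")
    rounds = 0
    while rounds < max_rounds:
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        rounds += 1
        time.sleep(pause)  # give newly requested cards time to render
        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new was appended, assume the list is complete
        last_height = new_height
    return rounds


def collect_profile_links(driver, pause=1.0, max_rounds=50):
    """Load the whole list first, then read each link's href once."""
    scroll_until_stable(driver, pause=pause, max_rounds=max_rounds)
    # "xpath" is the string value of selenium's By.XPATH locator strategy.
    elems = driver.find_elements("xpath", PROFILE_LINK_XPATH)
    links = []
    for elem in elems:
        href = elem.get_attribute("href")
        if href and href not in links:  # skip duplicates, keep page order
            links.append(href)
    return links
```

With a logged-in `driver`, this would be called as `links = collect_profile_links(driver)` right after `driver.get(url1)`. The `max_rounds` cap is there so that a page that never stops growing cannot loop forever.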
    

    【Comments】:
