【问题标题】:Email Python Script Output电子邮件 Python 脚本输出
【发布时间】:2019-02-26 06:05:36
【问题描述】:

我刚刚完成了我的第一个新闻网页抓取脚本,我对它非常满意,即使代码看起来一点也不好看。我想知道当我运行它时,我应该如何通过电子邮件(gmail 地址)将脚本的输出发送给自己。我试图运行 smtplib,但它对我不起作用。

这是我当前的代码:

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
from datetime import date
from dateutil import parser
import smtplib
from email.mime.text import MIMEText

my_url1 = "https://www.coindesk.com/category/business-news/legal"
my_url2 = "https://cointelegraph.com/tags/bitcoin-regulation"
my_url3 = "https://techcrunch.com/tag/blockchain/"

# Opening up the website, grabbing the page
uFeedOne = uReq(my_url1, timeout=5)
page_one = uFeedOne.read()
uFeedOne.close()

# html parser
page_soup1 = soup(page_one, "html.parser")

# grabs each publication block
containers_one = page_soup1.findAll("a", {"class": "stream-article"} )

for container_one in containers_one:
  ## get todays date.
  ## I have taken an offset as the site has older articles than today.
  today = date.today().strftime("%Y, %m, %d")

  ## The actual datetime string is in the datetime attribute of the time tag.
  date_time1 = container_one.time['datetime']

  ## The dateutil package parses the ISO-formatted date and returns a condensed version.
  date1 = parser.parse(date_time1)
  dt1 = date1.strftime("%Y, %m, %d")

  ## Simple comparison
  if dt1 == today:
      link1 = container_one.attrs['href']
      publication_date1 = "published on " + container_one.time.text
      title1 = container_one.h3.text
      description1 = "(CoinDesk)-- " +  container_one.p.text

      print("link: " + link1)
      print("publication_date: " + publication_date1)
      print("title: ", title1)
      print("description: " + description1 + " \n")


uFeedTwo = uReq(my_url2, timeout=5)
page_two = uFeedTwo.read()
uFeedTwo.close()

page_soup2 = soup(page_two, "html.parser")

containers_two = page_soup2.findAll("div",{"class": "post-preview-item-inline__content"})

for container_two in containers_two:
  today = date.today().strftime("%Y, %m, %d")

  date_time2 = container_two.time['datetime']
  date2 = parser.parse(date_time2)
  dt2 = date2.strftime("%Y, %m, %d")

  title_container2 = container_two.find("span",{"class": "post-preview-item-inline__title"})
  description_container2 = container_two.find("p",{"class": "post-preview-item-inline__text"}).text

  if dt2 == today:
    link2 = container_two.div.a.attrs['href']
    publication_date2 = "published on " + date2.strftime("%b %d, %Y")
    title2 = title_container2.text
    description2 = "(CoinTelegraph)-- " + description_container2

    print("link: " + link2)
    print("publication_date: " + publication_date2)
    print("title: ", title1)
    print("description: " + description2 + " \n")


uFeedThree = uReq(my_url3, timeout=5)
page_three = uFeedThree.read()
uFeedThree.close()

# html parser
page_soup3 = soup(page_three, "html.parser")

# grabs each publication block
containers_three = page_soup3.findAll("div",{"class": "post-block post-block--image post-block--unread"})

for container_three in containers_three:
  today = date.today().strftime("%Y, %m, %d")
  date_time3 = container_three.time['datetime']

  date3 = parser.parse(date_time3)
  dt3 = date3.strftime("%Y, %m, %d")

  keyword1 = "law"
  keyword2 = "legal"
  description_container3 = container_three.find("div", {"class": "post-block__content"}).text.strip()

  if dt3 == today and (keyword2 in description_container3) or (keyword1 in description_container3):
      link3 = container_three.header.h2.a.attrs['href']
      publication_date3 = "published on " +  date3.strftime("%b %d, %Y")
      title3 = container_three.header.h2.a.text.strip()
      description3 = "(TechCrunch)-- " + description_container3

      print("link: " + link3)
      print("publication_date: " + publication_date3)
      print("title: ", title3)
      print("description: " + description3 + " \n")

我知道我想对此做一个变体:

# Open a plain text file for reading.  For this example, assume that
# the text file contains only ASCII characters.
with open(textfile) as fp:
    # Create a text/plain message
    msg = MIMEText(fp.read())

# me == the sender's email address
# you == the recipient's email address
msg['Subject'] = 'The contents of %s' % textfile
msg['From'] = me
msg['To'] = you

# Send the message via our own SMTP server.
s = smtplib.SMTP('localhost')
s.send_message(msg)
s.quit()

【问题讨论】:

  • 你遇到什么问题
  • @Jeril 他在邮寄脚本输出时遇到问题......很可能是 SMTPlib 配置......
  • 如果您在发送电子邮件时遇到问题,您的代码中的网络抓取部分完全无关紧要,就像标签一样。另外,请明确说明您的问题!
  • 我想把这个寄给自己,但我不知道怎么做。

标签: python email web-scraping


【解决方案1】:

这是使用 SMTP 向任何人发送邮件的代码 sn-p。

下面的代码是为 gmail SMT P 配置的。如果你有其他的可以 进行配置。

import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEText import MIMEText

msg = MIMEMultipart()
msg['From'] = 'me@gmail.com'
msg['To'] = 'you@gmail.com'
msg['Subject'] = 'Enter subjecy of msg here'
message = 'here is the email'
msg.attach(MIMEText(message))

# GMAIL_SMTP_HOST = 'smtp.gmail.com'
# GMAIL_SMTP_PORT = '587'
mailserver = smtplib.SMTP('smtp.gmail.com',587)
# secure our email with tls encryption
mailserver.starttls()
mailserver.sendmail('me@gmail.com','you@gmail.com',msg.as_string())
mailserver.quit()

【讨论】:

    猜你喜欢
    • 1970-01-01
    • 1970-01-01
    • 2017-03-20
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2020-11-03
    • 2011-02-20
    • 2013-01-11
    相关资源
    最近更新 更多