【Question title】: UnicodeEncodeError: 'ascii' codec can't encode character '\u20b9' in position 248: ordinal not in range(128)
【Posted】: 2019-12-29 19:49:52
【Question】:

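I am trying to build a web scraper that tracks an Amazon price and emails me an alert when the price changes. I am new to this, and I ran into the error below.

Full error: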

    Traceback (most recent call last):
      File "/Users/vaibhav/Desktop/labai/scraper.py", line 53, in <module>
        check_price()
      File "/Users/vaibhav/Desktop/labai/scraper.py", line 20, in check_price
        send_mail()
      File "/Users/vaibhav/Desktop/labai/scraper.py", line 45, in send_mail
        msg
      File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/smtplib.py", line 855, in sendmail
        msg = _fix_eols(msg).encode('ascii')
    UnicodeEncodeError: 'ascii' codec can't encode character '\u20b9' in position 248: ordinal not in range(128)

The Python code I wrote:

    import requests
    from bs4 import BeautifulSoup
    import smtplib


    URL = 'https://www.amazon.in/Nokia-Designer-Protective-Printed-Doesnt/dp/B078MFZS9V/ref=bbp_bb_a77114_st_KIqx_w_1?psc=1&smid=A2V1Y4Y0T37MVF'
    headers = {'User-Agent': 'example user agent'}  # placeholder value


    def check_price():
        page = requests.get(URL, headers=headers)

        soup = BeautifulSoup(page.content, 'html.parser')

        title = soup.find(id="productTitle").get_text()
        price = soup.find(id="priceblock_ourprice").get_text()
        converted_price = float(price[2:5])

        if converted_price < 400:
            send_mail()

        print(title.strip())
        print(converted_price)

        if converted_price > 300:
            send_mail()


    def send_mail():
        server = smtplib.SMTP('smtp.gmail.com', 587)
        server.ehlo()
        server.starttls()
        server.ehlo()

        server.login('example@exampleemail', 'examplepass')

        subject = 'Price fell down'
        body = 'Check the amazon link  https://www.amazon.in/dp/B07XVKG5XV?aaxitk=Afmq.hE.Dq.i9ttZqy2U9g&pd_rd_i=B07XVKG5XV&pf_rd_p=2e3653de-1bdf-402d-9355-0b76590c54fe&hsa_cr_id=4398426540602&sb-ci-n=price&sb-ci-v=64%2C899.00&sb-ci-m=₹'

        msg = f"Subject = {subject}\n\n{body}"

        server.sendmail(
            'example@exampleemail',
            'example@exampleemail',
            msg
        )

        print('HEY MAIL HAS BEEN SENT')

        server.quit()


    check_price()

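【Comments】:

  • To keep this from being closed as spam, I suggest editing your post to remove the specific links.
  • Does this question help? The problem is that you are trying to send a message containing ₹, a non-ASCII character, using ASCII encoding.
  • I see no reason to remove specific links from a web scraping question. They are nice to have. Spam links (and links containing personal details) are a different matter, but that does not seem to be the case here.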
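
The failure described in the comments can be reproduced without sending any mail. This is a minimal sketch that assumes only what the traceback shows, namely that smtplib encodes a str message with the 'ascii' codec. The message text is made up for illustration.

```python
# Hypothetical message text; only the ₹ character matters here.
msg = "Subject: Price fell down\n\nNow only ₹399"

try:
    # Mirrors the .encode('ascii') call shown in the traceback.
    msg.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as exc:
    ascii_ok = False
    # The exception records which slice of the string failed to encode.
    bad_char = exc.object[exc.start:exc.end]

# UTF-8 can represent the same character, using three bytes.
utf8_bytes = "₹".encode("utf-8")

print(ascii_ok, repr(bad_char), utf8_bytes)
```

So the message needs to reach smtplib in a form that is already ASCII-safe, or as bytes in an encoding that can represent ₹.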

Tags: python python-3.x beautifulsoup python-requests


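【Solution 1】:

This happens because the rupee currency symbol ₹ cannot be encoded in ASCII. You probably want to have smtplib send the message as UTF-8 (or some other Unicode encoding). The easiest way is to use the email module (the link is to examples).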

    import smtplib
    from email.mime.text import MIMEText

    # Placeholder credentials; replace with your own.
    gmail_user = 'example@exampleemail'
    gmail_password = 'examplepass'

    text_type = 'plain'  # or 'html'
    text = 'Your message body'
    # Passing 'utf-8' lets the body contain non-ASCII characters such as ₹.
    msg = MIMEText(text, text_type, 'utf-8')
    msg['Subject'] = 'Test Subject'
    msg['From'] = gmail_user
    msg['To'] = 'user1@x.com,user2@y.com'
    server = smtplib.SMTP_SSL('smtp.gmail.com', 465)
    server.login(gmail_user, gmail_password)
    server.send_message(msg)
    # or server.sendmail(msg['From'], ['user1@x.com', 'user2@y.com'], msg.as_string())
    server.quit()

Code copied from this answer.

Note that we pass 'utf-8' to MIMEText. That is what lets us encode the INR currency symbol.
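
To see why this avoids the error, it may help to inspect the message before it is sent. The sketch below is only an illustration: `build_message` is a hypothetical helper, not part of the original answer, and the subject, body and addresses are placeholders. It builds a UTF-8 `MIMEText` message and checks that its serialized form contains only ASCII characters.

```python
from email.mime.text import MIMEText


def build_message(subject, body, sender, recipient):
    """Hypothetical helper: return a MIME message with a UTF-8 body."""
    msg = MIMEText(body, "plain", "utf-8")
    msg["Subject"] = subject
    msg["From"] = sender
    msg["To"] = recipient
    return msg


msg = build_message(
    "Price fell down",
    "Now only ₹399",            # made-up body containing the rupee sign
    "example@exampleemail",     # placeholder sender
    "example@exampleemail",     # placeholder recipient
)

# The serialized message is ASCII-only because the UTF-8 body is
# transfer-encoded, so encoding it as ASCII does not raise.
wire_text = msg.as_string()
wire_text.encode("ascii")

# Decoding the payload recovers the original text, ₹ included.
round_trip = msg.get_payload(decode=True).decode("utf-8")
print(round_trip)
```

In the question's `send_mail`, a message built this way could be passed to `server.send_message(msg)` in place of the hand-built f-string and `server.sendmail(...)`.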
