【Question Title】: What form is " where it represents the double quote character?
【Posted】: 2021-04-26 05:49:41
【Question Description】:

I have a dataset containing JSON records like the following:

{"reviewerID": "A3SSQRWUP2A04Q", "asin": "B0000224UE", "reviewerName": "Wilson", "helpful": [0, 0], "reviewText": "I love this tool. Yes it is heavy. Yes you can wear it on your belt and no one will notice. Really. And yes, you get a great amount of tools. I already always carry a Swiss Army Knife on me, so I went with this model to get the serrated blade, and not the scissors.It is perfectly aligned, buttery smooth, nicer than my Leatherman Rebar, and yes, bigger and heavier. The tools come out on the outside, and lock with a satisfying "snick." The release is easier to use than the Leatherman Rebar. Which itself is a nice tool, I carry that in my daily work messenger bag.The pliers are strong and super easy. One nice touch -- the ruler, inches/centimeters, is far easier to read than the Leatherman. The result I think of the brightly polished steel.", "overall": 5.0, "summary": "Top of the line heavy multi tool", "unixReviewTime": 1396224000, "reviewTime": "03 31, 2014"}

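When I search Google for the symbol ", it automatically converts it to the double quote (") character. What form is this, and how can I turn these into a readable format for a library such as nltk?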

【Question Discussion】:

    Tags: python json nltk symbols


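    【Solution 1】:

    Ah, these are HTML-encoded characters (HTML entities)! See the table here: https://www.toptal.com/designers/htmlarrows/symbols/

    In Python, you can use the built-in html module to convert these entities back into ordinary characters.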

    import html
    # string_with_html_entities: any str containing entities, e.g. a reviewText value
    normal_string = html.unescape(string_with_html_entities)
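
To make the snippet above concrete, here is a minimal sketch of applying it to one JSON record. The record is a shortened, hypothetical stand-in for the data in the question, and the numeric form `&#34;` is an assumption; the named form `&quot;` decodes to the same character.

```python
import html
import json

# Hypothetical, shortened record. The entity form (&#34;) is an assumption
# made for illustration; the real data may use a different form.
raw_line = (
    '{"asin": "B0000224UE", '
    '"reviewText": "lock with a satisfying &#34;snick.&#34; The release is easy."}'
)

# Parse the JSON first, then decode entities in the text field.
record = json.loads(raw_line)
record["reviewText"] = html.unescape(record["reviewText"])
print(record["reviewText"])

# Named and numeric entities decode to the same double quote character.
same_character = html.unescape("&quot;") == html.unescape("&#34;") == '"'
print(same_character)
```

The order matters: if the raw line were unescaped before `json.loads`, the decoded double quotes would end up unescaped inside the JSON string and parsing would fail. Once decoded, the text is an ordinary Python string that can be passed to a tokenizer.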
    

    【Discussion】:

    • Thanks. It's amazing how helpless I am at finding a solution when Google decides not to cooperate.