【Question title】: Finding and replacing parameters in an HTML txt folder
【Posted】: 2019-01-22 16:18:03
【Question description】:

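I have a database (I'm not sure you would call it a database, but it has grown to that point) containing multiple URLs from the old company's files, which are messy. I want to find and replace multiple URLs in a .txt file with the company's new banner/HTML information. The script will be combined with other dataframe programs written in Python (a CSV parser). Here is my code. Why isn't my .TXT file being replaced?

I tried inspecting the output of the read object, which I know is a string, and I looked into what the replace() function does.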

    import json
    import csv


    class HTML_Parser:
        def _init_(self, data):
            data.self = data

    F = open(r"C:\Users\Ultrarev\Desktop\Emeran-Parser\HTMLtoBeReplaced.txt", "r")

    str1 = F.read()

    str1.replace("http://www.ultrarev.com/processedimages/ebay_layout/banner750x150.jpg", "https://xcart.amcoautoparts.com/ebay_layout/ebay_tmp_top.jpg")
    str1.replace("http://www.ultrarev.com/processedimages/manufacturers/Ultrarev-Footer.jpg", "https://xcart.amcoautoparts.com/manufacturers/Ultrarev-Footer.jpg")
    str1.replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")
    str1.replace("ULTRAREV INC", 'AMCO Auto Parts, LLC')
    str1.replace("Should you have any question  please call 1(877) 858-7272.", "Should you have any question, please message us!")
    str1.replace("120 Central Ave. Farmingdale  NJ 07727", " ")
    str1.replace("CALL FOR CUSTOMER SUPPORT", " ")
    str1.replace("Please Call us toll free 1-877-858-7272!", "Should you have any question, please message us!")
    str1.replace('<a style="color: #000000; font-weight:bold; text-decoration:none" href="tel:732-938-3999">', " ")
    str1.replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:1-877-858-7272">1-877-858-7272</a>', ' ')
    str1.replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:732-938-3999">732-938-3999</a>', ' ')
    str1.replace(' OR ', ' ')
    str1.replace('OEM (Match Case) - Find them in both Title and Description', 'OE')
    str1.replace("http://www.ultrarev.com", "https://xcart.amcoautoparts.com")
    str1.replace('http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png', 'http://amcoautoparts.com/images/P/RalcoRZLogo.png')

    print(str1)

I expect it to return a string with the replaced values, but it returns the original string unchanged.

【Question discussion】:

    Tags: python html python-3.x replace export-to-csv


    【Solution 1】:

    str.replace() does not modify the original string. It returns a new string with the replacements made.
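A minimal sketch of that behavior, using a made-up sample string rather than the asker's file. Python strings are immutable, so the result of `replace()` is only kept if it is assigned to a name:

```python
# Hypothetical sample text, not taken from the asker's file.
original = "ULTRAREV INC. ships parts"

# The return value is discarded here, so `original` is left as it was.
original.replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")
unchanged = original

# Assigning the return value keeps the new string.
updated = original.replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")

print(unchanged)
print(updated)
```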

    You have to chain your calls, like this:

    new_str = original_str.replace(...).replace(...).replace(...)
    

    I'd also suggest storing the replacement pairs in a tuple, like this:

    replaces = (('from1', 'to1'), ('from2', 'to2'), ('from3', 'to3'))
    for src, dest in replaces:
        str1 = str1.replace(src, dest)
    print(str1)
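The loop can be wrapped in a small helper. The following is only a sketch: `apply_replacements`, the sample text, and the `/logo.png` path are illustrative names, not part of the original code. Because the pairs run in order, a longer string such as "ULTRAREV INC." should come before a shorter one it contains, and specific URLs should come before the bare domain:

```python
def apply_replacements(text, pairs):
    """Apply each (old, new) pair in order and return the resulting string."""
    for old, new in pairs:
        text = text.replace(old, new)
    return text


# Order matters: "ULTRAREV INC." is listed before the shorter "ULTRAREV INC".
pairs = (
    ("ULTRAREV INC.", "AMCO Auto Parts, LLC"),
    ("ULTRAREV INC", "AMCO Auto Parts, LLC"),
    ("http://www.ultrarev.com", "https://xcart.amcoautoparts.com"),
)

# Hypothetical sample text standing in for the contents of the .txt file.
sample = '<img src="http://www.ultrarev.com/logo.png"> ULTRAREV INC. and ULTRAREV INC'
result = apply_replacements(sample, pairs)
print(result)
```

If the two "ULTRAREV INC" pairs were swapped, the shorter one would match first and leave a stray period behind.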
    

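    【Discussion】:

    • Oh yes, I forgot: Python often returns a new instance of an object instead of modifying the original. My Pythonic way of thinking still needs fine-tuning. Thank you. What would you suggest other than a long run of replace calls? Maybe a for loop? This will also help me with other problems.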
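On the question of alternatives to a long run of replace calls: one option, not proposed by anyone in this thread, is a single pass with `re.sub` over an alternation of all the search strings. This is a sketch under assumptions: `replace_all`, `mapping`, and the `/about` path are hypothetical names chosen for illustration. Sorting the keys longest first means the bare domain cannot match before a full URL that starts with it:

```python
import re


def replace_all(text, mapping):
    """Replace every key of `mapping` found in `text` in a single pass."""
    # Longest keys first, so a short key cannot shadow a longer one it prefixes.
    keys = sorted(mapping, key=len, reverse=True)
    # re.escape makes each key match literally (dots, parentheses, etc.).
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: mapping[m.group(0)], text)


mapping = {
    "http://www.ultrarev.com": "https://xcart.amcoautoparts.com",
    "http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png":
        "http://amcoautoparts.com/images/P/RalcoRZLogo.png",
}

# Hypothetical sample text containing one specific URL and one generic URL.
sample = (
    "http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png "
    "http://www.ultrarev.com/about"
)
result = replace_all(sample, mapping)
print(result)
```

Since each part of the text is matched at most once, an already replaced string is never rewritten by a later pair, which is a risk with sequential `replace()` calls.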
    【Solution 2】:

    I just chained all the replacements together.

    import json
    import csv


    class HTML_Parser:
        def __init__(self, data):
            self.data = data


    # Read the whole file; the with block closes it afterwards.
    with open(r"C:\Users\Ultrarev\Desktop\Emeran-Parser\HTMLtoBeReplaced.txt") as F:
        str1 = F.read()

    # Each replace() returns a new string, so the calls are chained.
    # Specific URLs come before the bare domain; otherwise they would no longer match.
    str2 = (
        str1
        .replace("http://www.ultrarev.com/processedimages/ebay_layout/banner750x150.jpg", "https://xcart.amcoautoparts.com/ebay_layout/ebay_tmp_top.jpg")
        .replace("http://www.ultrarev.com/processedimages/manufacturers/Ultrarev-Footer.jpg", "https://xcart.amcoautoparts.com/manufacturers/Ultrarev-Footer.jpg")
        .replace("http://www.ultrarev.com/processedimages/manufacturers/ralco-rz-logo_texture.png", "http://amcoautoparts.com/images/P/RalcoRZLogo.png")
        .replace("http://www.ultrarev.com", "https://xcart.amcoautoparts.com")
        .replace("ULTRAREV INC.", "AMCO Auto Parts, LLC")
        .replace("ULTRAREV INC", "AMCO Auto Parts, LLC")
        .replace("Should you have any question  please call 1(877) 858-7272.", "Should you have any question, please message us!")
        .replace("120 Central Ave. Farmingdale  NJ 07727", " ")
        .replace("CALL FOR CUSTOMER SUPPORT", " ")
        .replace("Please Call us toll free 1-877-858-7272!", "Should you have any question, please message us!")
        .replace('<a style="color: #000000; font-weight:bold; text-decoration:none" href="tel:732-938-3999">', " ")
        .replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:1-877-858-7272">1-877-858-7272</a>', " ")
        .replace('<a style="color: #000000; font-weight: bold; text-decoration: none" href="tel:732-938-3999">732-938-3999</a>', " ")
        .replace(" OR ", " ")
        .replace("OEM (Match Case) - Find them in both Title and Description", "OE")
    )

    print(str2)
    

    This seems to be the best solution for now.
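Neither snippet in this thread writes the result back to disk; both only print it, so the .txt file itself stays unchanged. A minimal sketch of the missing step follows. It assumes the file is UTF-8 text, and `replace_in_file` and the temporary demo file are hypothetical stand-ins for the asker's HTMLtoBeReplaced.txt:

```python
import os
import tempfile


def replace_in_file(path, pairs, encoding="utf-8"):
    """Read `path`, apply each (old, new) pair in order, and write the result back."""
    with open(path, "r", encoding=encoding) as f:
        text = f.read()
    for old, new in pairs:
        text = text.replace(old, new)
    # Reopening in "w" mode overwrites the file with the replaced text.
    with open(path, "w", encoding=encoding) as f:
        f.write(text)
    return text


# Demo on a temporary file so the sketch does not touch any real data.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("CALL FOR CUSTOMER SUPPORT: ULTRAREV INC.")
    replace_in_file(demo_path, [("ULTRAREV INC.", "AMCO Auto Parts, LLC")])
    with open(demo_path, encoding="utf-8") as f:
        on_disk = f.read()

print(on_disk)
```

Writing to a separate output path instead of overwriting the input would keep the original available if a replacement turns out to be wrong.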

    【Discussion】:
