【Question title】: Removing Unwanted Characters From List
【Posted】: 2018-08-08 21:50:42
【Question】:

I have a list of items with a structure similar to this:

[{'Condition': '2013 Yamaha FJR 1300',
 'Date': '2018-02-28 11:30',
 'Description': ['\n        ',
  '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
  '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
  '\n',
  '\n    '],
 'Images': [],
 'Latitude': '35.599694',
 'Location': ' (Asheville)',
 'Longitude': '-82.628866',
 'Price': '$7500',
 'Title': '2013 Yamaha FJR 1300',
 'Url': 'https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html',
 '_id': {'$oid': '5a96dbee6f9ca5410cc9ed98'}},

{'Condition': '2014 Honda Accord Sedan',
 'Date': '2018-02-28 11:24',
 'Description': ['\n        ',
  '\n2014 Honda Accord  Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone  Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n    '],
 'Images': ['https://images.craigslist.org/00b0b_gNOi9VtqAy3_600x450.jpg',
  'https://images.craigslist.org/00a0a_gs2eKxUlQho_600x450.jpg',
  'https://images.craigslist.org/00l0l_lPmE8ML0zcb_600x450.jpg',
  'https://images.craigslist.org/00x0x_bS9gCuxM7ID_600x450.jpg',
  'https://images.craigslist.org/01010_dTS4DnHjVWW_600x450.jpg',
  'https://images.craigslist.org/00w0w_70D0xeDKa7d_600x450.jpg',
  'https://images.craigslist.org/00606_4SUFT4ZCbmO_600x450.jpg',
  'https://images.craigslist.org/00k0k_1AQ7kVbviPN_600x450.jpg',
  'https://images.craigslist.org/00d0d_3STBecGHaXD_600x450.jpg',
  'https://images.craigslist.org/01717_guG6n90XfQt_600x450.jpg',
  'https://images.craigslist.org/00h0h_8be8866trLr_600x450.jpg',
  'https://images.craigslist.org/00B0B_gaQQvQHlARl_600x450.jpg',
  'https://images.craigslist.org/00b0b_ih84Nskx5xj_600x450.jpg',
  'https://images.craigslist.org/01616_aveWbY1HQvr_600x450.jpg',
  'https://images.craigslist.org/00x0x_Fflsg0wwsK_600x450.jpg',
  'https://images.craigslist.org/00b0b_6FBg7KV8HYv_600x450.jpg',
  'https://images.craigslist.org/00J0J_3vd5Ip3mQ5S_600x450.jpg',
  'https://images.craigslist.org/00L0L_loNV2CrnnLn_600x450.jpg',
  'https://images.craigslist.org/00K0K_fh8oSEa9fKn_600x450.jpg',
  'https://images.craigslist.org/00r0r_8P0SjsOgNd5_600x450.jpg',
  'https://images.craigslist.org/00k0k_ZY0ywNmKkr_600x450.jpg',
  'https://images.craigslist.org/00y0y_7Gie7XD8uuH_600x450.jpg',
  'https://images.craigslist.org/00c0c_2nVDzLJhnYi_600x450.jpg',
  'https://images.craigslist.org/00202_7k10eK3bxMn_600x450.jpg'],
 'Latitude': '35.039000',
 'Location': ' (Cowpens)',
 'Longitude': '-81.822000',
 'Price': '$10995',
 'Title': '2014 Honda Accord  White  41k',
 'Url': 'https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html',
 '_id': {'$oid': '5a96dbf16f9ca5410cc9ed99'}}]

When I run the following code:

wanted_keys = ['Title', 'Location', 'Price', 'Description', 'Url', 'Latitude', 'Longitude'] 
for item in cl_used_items_raw[:2]:
    for k in wanted_keys:
        lines = str(item[k]).split()
        split_lines = [line.replace('\n', '').strip() for line in lines]
        print("{}".format(' '.join(split_lines) + '\t'))
    print('\n')

I get this output:

2013 Yamaha FJR 1300    
(Asheville) 
$7500   
['\n ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n ']    
https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html 
35.599694   
-82.628866  


2014 Honda Accord White 41k 
(Cowpens)   
$10995  
['\n ', '\n2014 Honda Accord Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....', '\n', '\n', '\n', '\n', '\n', '\n', '\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n '] 
https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html  
35.039000   
-81.822000

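I know I'm close, but I'm struggling to work out how to write my for loop so that it removes the extra whitespace characters from the Description value while keeping the output structure I already have.

【Question discussion】:

    Tags: python python-3.x list for-loop whitespace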


    【Solution 1】:

    This gave me the correct output:

    for item in cl_used_items_raw[:2]:
        for k in wanted_keys:
            if k == 'Description':
                # Description is a list of strings: join it first so that
                # split() sees real newlines rather than the list's repr
                lines = ''.join(item[k]).split()
                print(' '.join(lines))
            else:
                # split() with no arguments already discards all whitespace
                lines = str(item[k]).split()
                print(' '.join(lines) + '\t')
        print('\n')
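
Some context on why the Description key needs special handling: calling str() on a list returns its repr, in which every newline shows up as the two characters backslash and n, so replace('\n', '') has nothing to match. The sketch below illustrates this and folds both branches into one helper. Note that clean_value is a hypothetical name that does not appear in the original answer, and the sample data is abbreviated.

```python
def clean_value(value):
    """Collapse all runs of whitespace in a string or a list of strings."""
    # Join list values first so that str() is never applied to a list
    text = ' '.join(value) if isinstance(value, list) else str(value)
    # split() with no arguments splits on any whitespace, including '\n'
    return ' '.join(text.split())


# Abbreviated sample in the shape of the Description field
desc = ['\n        ', '\n2013 Yamaha FJR 1300 Sport Touring', '\n$ 7.500', '\n', '\n    ']

# The repr of the list holds a backslash followed by 'n', not a real newline
assert '\\n' in str(desc)
assert '\n' not in str(desc)

print(clean_value(desc))            # 2013 Yamaha FJR 1300 Sport Touring $ 7.500
print(clean_value(' (Asheville)'))  # (Asheville)
```

With a helper like this, the loop body could shrink to a single print(clean_value(item[k]) + '\t') call, at the cost of also appending the tab after the description.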
    

    【Discussion】:

      【Solution 2】:
      >>> desc = ['\n        ',
      ...   '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
      ...   '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
      ...   '\n',
      ...   '\n    ']
      

      Before:

      >>> desc
      ['\n        ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n    ']
      

      Apply replace() and strip():

      [x.replace('\n', '').strip() for x in desc ]
      

      After:

      ['', '2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '', '']
      

      If I understand correctly, you can replace the newline characters with empty strings and then strip the surrounding whitespace:

       [x.replace('\n', '').strip() for x in desc ]
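
The comprehension above still leaves empty strings in the result. If those are unwanted, one possible follow-up is to filter them out before joining. This is a sketch on abbreviated sample data, and the variable names stripped and non_empty are illustrative only:

```python
# Abbreviated sample in the shape of the Description field
desc = ['\n        ', '\n2013 Yamaha FJR 1300', '\n$ 7.500 (828) 250-0373', '\n', '\n    ']

# strip() removes leading and trailing whitespace, newlines included,
# so replace() is only needed when newlines occur inside an entry
stripped = [x.strip() for x in desc]

# Keep only the entries that still contain text
non_empty = [x for x in stripped if x]

print(non_empty)            # ['2013 Yamaha FJR 1300', '$ 7.500 (828) 250-0373']
print(' '.join(non_empty))  # 2013 Yamaha FJR 1300 $ 7.500 (828) 250-0373
```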
      

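      【Discussion】:

      • Thanks. When I run your code on that specific example I get what I need, but when I put that line back into my for loop, split_lines = [line.replace('\n', '').strip() for line in lines], it produces the same output as before?
      【Solution 3】:

      line.strip() does not modify line in place; it returns the modified value, so the way you are using it does not affect line at all.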

      You probably meant:

      split_lines = [line.strip() for line in lines]
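
The point about strip() returning a new value follows from Python strings being immutable. A minimal illustration, using a made-up sample string:

```python
line = '\n  2013 Yamaha FJR 1300  '

line.strip()  # the return value is discarded, so line is unchanged
assert line == '\n  2013 Yamaha FJR 1300  '

line = line.strip()  # rebind the name to the stripped copy
assert line == '2013 Yamaha FJR 1300'
```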
      

      【Discussion】:

      • Thanks. Do you know why my code still isn't working? for item in cl_used_items_raw[:2]: for k in wanted_keys: lines = str(item[k]).split() split_lines = [line.replace('\n', '').strip() for line in lines] print("{}".format(' '.join(split_lines) + '\t')) print('\n')
      • Please edit this into the original question; code in comments is unreadable.