Remove unwanted characters from a list

Asked: 2018-02-28 22:02:19

Tags: python python-3.x list for-loop whitespace

I have a list of items with a structure similar to this:

[{'Condition': '2013 Yamaha FJR 1300',
 'Date': '2018-02-28 11:30',
 'Description': ['\n        ',
  '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
  '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
  '\n',
  '\n    '],
 'Images': [],
 'Latitude': '35.599694',
 'Location': ' (Asheville)',
 'Longitude': '-82.628866',
 'Price': '$7500',
 'Title': '2013 Yamaha FJR 1300',
 'Url': 'https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html',
 '_id': {'$oid': '5a96dbee6f9ca5410cc9ed98'}},

{'Condition': '2014 Honda Accord Sedan',
 'Date': '2018-02-28 11:24',
 'Description': ['\n        ',
  '\n2014 Honda Accord  Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone  Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n',
  '\n    '],
 'Images': ['https://images.craigslist.org/00b0b_gNOi9VtqAy3_600x450.jpg',
  'https://images.craigslist.org/00a0a_gs2eKxUlQho_600x450.jpg',
  'https://images.craigslist.org/00l0l_lPmE8ML0zcb_600x450.jpg',
  'https://images.craigslist.org/00x0x_bS9gCuxM7ID_600x450.jpg',
  'https://images.craigslist.org/01010_dTS4DnHjVWW_600x450.jpg',
  'https://images.craigslist.org/00w0w_70D0xeDKa7d_600x450.jpg',
  'https://images.craigslist.org/00606_4SUFT4ZCbmO_600x450.jpg',
  'https://images.craigslist.org/00k0k_1AQ7kVbviPN_600x450.jpg',
  'https://images.craigslist.org/00d0d_3STBecGHaXD_600x450.jpg',
  'https://images.craigslist.org/01717_guG6n90XfQt_600x450.jpg',
  'https://images.craigslist.org/00h0h_8be8866trLr_600x450.jpg',
  'https://images.craigslist.org/00B0B_gaQQvQHlARl_600x450.jpg',
  'https://images.craigslist.org/00b0b_ih84Nskx5xj_600x450.jpg',
  'https://images.craigslist.org/01616_aveWbY1HQvr_600x450.jpg',
  'https://images.craigslist.org/00x0x_Fflsg0wwsK_600x450.jpg',
  'https://images.craigslist.org/00b0b_6FBg7KV8HYv_600x450.jpg',
  'https://images.craigslist.org/00J0J_3vd5Ip3mQ5S_600x450.jpg',
  'https://images.craigslist.org/00L0L_loNV2CrnnLn_600x450.jpg',
  'https://images.craigslist.org/00K0K_fh8oSEa9fKn_600x450.jpg',
  'https://images.craigslist.org/00r0r_8P0SjsOgNd5_600x450.jpg',
  'https://images.craigslist.org/00k0k_ZY0ywNmKkr_600x450.jpg',
  'https://images.craigslist.org/00y0y_7Gie7XD8uuH_600x450.jpg',
  'https://images.craigslist.org/00c0c_2nVDzLJhnYi_600x450.jpg',
  'https://images.craigslist.org/00202_7k10eK3bxMn_600x450.jpg'],
 'Latitude': '35.039000',
 'Location': ' (Cowpens)',
 'Longitude': '-81.822000',
 'Price': '$10995',
 'Title': '2014 Honda Accord  White  41k',
 'Url': 'https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html',
 '_id': {'$oid': '5a96dbf16f9ca5410cc9ed99'}}]

When I run the following code:

wanted_keys = ['Title', 'Location', 'Price', 'Description', 'Url', 'Latitude', 'Longitude'] 
for item in cl_used_items_raw[:2]:
    for k in wanted_keys:
        lines = str(item[k]).split()
        split_lines = [line.replace('\n', '').strip() for line in lines]
        print("{}".format(' '.join(split_lines) + '\t'))
    print('\n')

I get this output:

2013 Yamaha FJR 1300    
(Asheville) 
$7500   
['\n ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n ']    
https://asheville.craigslist.org/mcd/d/2013-yamaha-fjr-1300/6513320993.html 
35.599694   
-82.628866  


2014 Honda Accord White 41k 
(Cowpens)   
$10995  
['\n ', '\n2014 Honda Accord Automatic, White , On Tan, It has Only 41,980 Miles It Has Spoiler, Power Windows, and Mirrors, Tan Cloth Seats, Power Seats, 4 Cylinder, 4 Door, Radio, 6 CD Changer, FM,AM,CD, XM Radio, Bluetooth, Back up Camera, Side and Curtain Air Bag, 16 Inch Factory Wheels with Firestone Great Tires, Tinted Glass, And Much More, Clean On inside, Runs and Drives Like New, Call Me for more info, 864-266-6936 Willing to Negotiate if offer is fair.....', '\n', '\n', '\n', '\n', '\n', '\n', '\nhonda, bmw, crv, mercedes, ford, mazda, lx, rx, ls, is, gs, 470 honda, lexus, toyota, ford, accord, civic, coupe, Mercedes,Honda Pilot, Lexus gx470 & 460, Chevrolet Tahoe, suburban, Tahoe, land rover, Nissan armada, GMC Yukon, Terrian, CX7, BMW x5, GMC Terrian, B 2011, 2010, 2009, 2008, 2007, 2012, 2013, 2014, 2016, 2006, 2005, 2017, 2018, ', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n', '\n '] 
https://asheville.craigslist.org/ctd/d/2014-honda-accord-white-41k/6513312696.html  
35.039000   
-81.822000

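I know I'm close, but I'm struggling to work out how to write the for-loop so that it removes the extra whitespace characters from the Description values while still keeping the output structure I already have.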

3 Answers:

Answer 0: (score: 1)

line.strip() does not modify line in place; it returns the modified value, so calling it without using the result does not affect line in any way.

You probably meant:

split_lines = [line.strip() for line in lines]
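
To illustrate the point, here is a minimal sketch showing that str.strip() returns a new string and leaves the original untouched, because Python strings are immutable. The sample value of line and the names unchanged and stripped are made up for illustration:

```python
# Sample value, made up for illustration
line = '\n    2013 Yamaha FJR 1300  \n'

line.strip()             # the return value is discarded; line is not changed
unchanged = line

stripped = line.strip()  # keep the returned string instead

print(repr(unchanged))
print(repr(stripped))
```

Only the second form, where the result is assigned or collected in a list comprehension, has any effect.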

Answer 1: (score: 1)

>>> desc = ['\n        ',
...   '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.',
...   '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
...   '\n',
...   '\n    ']

Before:

>>> desc
['\n        ', '\n2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '\n', '\n    ']

Apply replace() and strip():

[x.replace('\n', '').strip() for x in desc ]

After:

['', '2013 Yamaha FJR 1300 Sport Touring, 4 cylinder, 12.120 miles, silver, cruise control, traction control, ABS brakes, heated hand grips, Two Brothers exhaust, handle bar risers, 6.5 gal. gas tank, adjustable windshield, saddlebags, excellent condition, very clean.', '$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM', '', '']

If I understand correctly, you can replace the newlines with empty strings and then strip the remaining whitespace:

 [x.replace('\n', '').strip() for x in desc ]
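
The result shown above still contains empty strings for the elements that held only whitespace. As a possible follow-up, assuming you want a single line of text, those could be filtered out before joining. The description is shortened here, and the names cleaned and non_empty are illustrative:

```python
# Shortened version of the Description list from the question
desc = ['\n        ',
        '\n2013 Yamaha FJR 1300 Sport Touring, very clean.',
        '\n$ 7.500 (828) 250-0373 WWW.GREENVALLEYCARS.COM',
        '\n',
        '\n    ']

# strip() with no arguments also removes leading/trailing newlines,
# so the replace('\n', '') call is optional for this data
cleaned = [x.strip() for x in desc]

# Drop the elements that became empty strings
non_empty = [x for x in cleaned if x]

print(' '.join(non_empty))
```

Note that strip() only touches the ends of each string; a newline in the middle of an element would survive unless replace() is kept.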

Answer 2: (score: 0)

This gave me the correct output:

for item in cl_used_items_raw[:2]:
    for k in wanted_keys:
        if k == 'Description':
            lines = str(''.join(item[k])).split()
            split_lines = [line.replace('\n', '').strip() for line in lines]
            split_lines = ' '.join(split_lines)
            print(split_lines)
        else:
            lines = str(item[k]).split()
            split_lines = [line.replace('\n', '').strip() for line in lines]
            print("{}".format(' '.join(split_lines) + '\t'))       
    print('\n')
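
The same result can probably be reached without the special case on the key name. The sketch below is one way to do it, assuming every wanted value is either a string or a list of strings. clean_value is a hypothetical helper, and items is a shortened stand-in for cl_used_items_raw:

```python
def clean_value(value):
    """Collapse every run of whitespace in a string or a list of strings."""
    if isinstance(value, list):
        # Join the list into one string instead of calling str() on it,
        # which would produce the list's repr with brackets and '\n' escapes
        value = ' '.join(value)
    # split() with no arguments splits on any whitespace (including newlines)
    # and discards empty pieces, so no replace() or strip() is needed
    return ' '.join(value.split())


wanted_keys = ['Title', 'Location', 'Price', 'Description']

# Shortened stand-in for cl_used_items_raw
items = [{'Title': '2014 Honda Accord  White  41k',
          'Location': ' (Cowpens)',
          'Price': '$10995',
          'Description': ['\n        ',
                          '\n2014 Honda Accord  Automatic, White',
                          '\n',
                          '\n    ']}]

for item in items:
    for k in wanted_keys:
        print(clean_value(item[k]))
    print()
```

Unlike the code above, this sketch does not append a trailing tab to each line; add one in the print call if the original output format depends on it.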