【Question Title】: String matching and storing within a dictionary
【Posted】: 2021-11-27 14:55:59
【Question】:

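I am using pattern matching to collect the street addresses that contain each postcode, and I want to store those addresses as values in a dictionary keyed by postcode. Here is what I tried: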

test = pd.DataFrame(['SR2', 'SA1', 'M16', 'KY6', 'SR6'], columns=(['postcode']))

street = pd.DataFrame(['UnnamedRoad,LlandeiloSA196UA,UK', '8NewRd,LlandeiloSA196DB,UK','1RomanRd,Banwen,NeathSA109LH,UK', 'UnnamedRoad,LlangadogSA199UN,UK', '48ColeAve,ChadwellStMary,GraysRM164JQ,UK', '37WellingtonRd,NorthWealdBassett,EppingCM166JY,UK'], columns=(['address']))

dictframe = {}
for i in test['postcode']:
    dictframe[i] = list()
    for k in range(0, len(test), 1):    
        dictframe[i].append(list(filter(lambda x: test['postcode'][k] in x, street['address'])))

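However, this stores the results for every postcode under every key. I want each key to hold only its own matching addresses, and an empty list when nothing matches. This is the output I get: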

{'SR2': [[],
  ['UnnamedRoad,LlandeiloSA196UA,UK',
   '8NewRd,LlandeiloSA196DB,UK',
   '1RomanRd,Banwen,NeathSA109LH,UK',
   'UnnamedRoad,LlangadogSA199UN,UK'],
  ['48ColeAve,ChadwellStMary,GraysRM164JQ,UK',
   '37WellingtonRd,NorthWealdBassett,EppingCM166JY,UK'],
  [],
  []],
..
..
..

Expected output:

{'SR2': [],
'SA1': ['UnnamedRoad,LlandeiloSA196UA,UK',
   '8NewRd,LlandeiloSA196DB,UK',
   '1RomanRd,Banwen,NeathSA109LH,UK',
   'UnnamedRoad,LlangadogSA199UN,UK']
...
...
}

【Comments】:

    Tags: python pandas string dataframe dictionary


    【Solution 1】:

    Corrected code: the inner for loop is not needed, and the postcode to match should be looked up by its index in test['postcode']. See Python's enumerate.

    import pandas as pd
    test = pd.DataFrame(['SR2', 'SA1', 'M16', 'KY6', 'SR6'], columns=(['postcode']))
    
    street = pd.DataFrame(['UnnamedRoad,LlandeiloSA196UA,UK', '8NewRd,LlandeiloSA196DB,UK','1RomanRd,Banwen,NeathSA109LH,UK', 'UnnamedRoad,LlangadogSA199UN,UK', '48ColeAve,ChadwellStMary,GraysRM164JQ,UK', '37WellingtonRd,NorthWealdBassett,EppingCM166JY,UK'], columns=(['address']))
    
    dictframe = {}
    for index, i in enumerate(test['postcode']):
        dictframe[i] = list()
        #for k in range(0, len(street), 1):    
        dictframe[i].append(list(filter(lambda x: test['postcode'][index] in x, street['address'])))
    

    Output:

    {'KY6': [[]],
     'M16': [['48ColeAve,ChadwellStMary,GraysRM164JQ,UK',
       '37WellingtonRd,NorthWealdBassett,EppingCM166JY,UK']],
     'SA1': [['UnnamedRoad,LlandeiloSA196UA,UK',
       '8NewRd,LlandeiloSA196DB,UK',
       '1RomanRd,Banwen,NeathSA109LH,UK',
       'UnnamedRoad,LlangadogSA199UN,UK']],
     'SR2': [[]],
     'SR6': [[]]}
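
Note that the output above still wraps each result in an extra outer list (`[[]]` rather than `[]`), because the filtered list is appended to an empty list. If the flat lists from the question's expected output are wanted, one possible variation is to assign the filtered list directly. The following is a minimal sketch under that assumption; to keep it self-contained, plain Python lists named `postcodes` and `addresses` (hypothetical names, not from the original post) stand in for `test['postcode']` and `street['address']`, which can be iterated the same way.

```python
# Plain lists standing in for the DataFrame columns test['postcode'] and
# street['address'] (assumption: the columns are simply iterated).
postcodes = ['SR2', 'SA1', 'M16', 'KY6', 'SR6']
addresses = [
    'UnnamedRoad,LlandeiloSA196UA,UK',
    '8NewRd,LlandeiloSA196DB,UK',
    '1RomanRd,Banwen,NeathSA109LH,UK',
    'UnnamedRoad,LlangadogSA199UN,UK',
    '48ColeAve,ChadwellStMary,GraysRM164JQ,UK',
    '37WellingtonRd,NorthWealdBassett,EppingCM166JY,UK',
]

# Assign the matches directly instead of appending them to an empty list,
# so each key maps to a flat list (empty when nothing matches).
dictframe = {
    code: [addr for addr in addresses if code in addr]
    for code in postcodes
}

print(dictframe['SR2'])       # postcode with no matching address
print(len(dictframe['SA1']))  # number of addresses containing 'SA1'
```

This keeps the same plain substring test (`in`) as the answer, so `'M16'` still matches addresses whose postcode merely contains those characters, such as `RM164JQ` and `CM166JY`. Whether that is desired depends on the data, and a stricter match would need a different test.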
    
