【Question title】:Removing substrings in list of list of strings, maintain order - Python
【Posted】:2019-12-11 13:37:57
【Question】:

I have a list of lists of strings.

words = [['gamma_ray_bursts','merger','death','throes','magnetic_flares','neutrino_antineutrino','objections','bursts','double_neutron_star','parker_instability','positrons'],
 ['dot','gravitational_lensing','splittings','limits','amplifications','time_delays','extracting_information','fix','distant_quasars'],
 ['recoil','gamma_ray_bursts','neutron_stars','jennings','possible_origins','birthplaces','disjoint','arrival_directions'],
 ['sn_sn','type_ii_supernovae','distances','dilution','extinction','extragalactic_distance_scale','expanding_photosphere','distance','photospheres','supernovae_sn','span_wide_range'],
 ['photon_pair','high_energy','gamma_ray_burst','optical_depth','absorbing_medium','implications','problem','annihilation_radiation','emergent_spectrum','limit','radiation_transfer','collimation','regions']]
  1. If any element in a list is a substring of another element in the same list, I want to remove it.
  2. I want to preserve the order.

I tried this loop:

for string_list in words:
    for item in string_list: 
        for item1 in string_list:
            if item in item1 and item!= item1:
                string_list.remove(item)

It seems to work for smaller lists of lists, but it raises an error when I increase the length of the list.

ValueError                                Traceback (most recent call last)
<ipython-input-91-7546f608171f> in <module>
      4         for item1 in string_list:
      5             if item in item1 and item!= item1:
----> 6                 string_list.remove(item)

ValueError: list.remove(x): x not in list

Expected output:

words = [['gamma_ray_bursts','merger','death','throes','magnetic_flares','neutrino_antineutrino','objections','double_neutron_star','parker_instability','positrons'], ['dot','gravitational_lensing','splittings','limits','amplifications','time_delays','extracting_information','fix','distant_quasars'],['recoil','gamma_ray_bursts','neutron_stars','jennings','possible_origins','birthplaces','disjoint','arrival_directions'], ['sn_sn','type_ii_supernovae','distances','dilution','extinction','extragalactic_distance_scale','expanding_photosphere','photospheres','supernovae_sn','span_wide_range'],['photon_pair','high_energy','gamma_ray_burst','optical_depth','absorbing_medium','implications','problem','annihilation_radiation','emergent_spectrum','limit','radiation_transfer','collimation','regions']]

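I searched the forum and found a very similar question, but its solution only works some of the time. At other times it raises an error, and the place where the error occurs is not consistent. The lengths of the lists vary. Python - Remove any element from a list of strings that is a substring of another element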

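【Comments】:

  • Please add the desired output. Your question can be interpreted in several ways. What do you mean by "a substring of another element"? You have a list of lists of strings. Are you working with substrings within the strings? I'm confused. And what is an element? Are you referring to the other lists?
  • Changing the contents of a list while you iterate over it is never good practice.
  • Sorry, I have added the expected output. Within each list, I want to remove any element/string that is a substring of another element/string. E.g. list_1 = ['gamma_ray_bursts', ... 'bursts'], remove 'bursts', output = ['gamma_ray_bursts', ...]. Each inner list should be checked for substrings independently. No, I mean the elements within each list, not the other lists.
  • @ChrisDoyle I was wondering about that. If that is the case, is creating a new list without the substrings the better/accepted practice?
  • So just to clarify your requirement: if any element in a list is a substring of another item in the same list, you want to remove it.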
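
To make the point about mutating a list during iteration concrete, here is a minimal sketch that reproduces the ValueError with a shortened version of the fourth list from the question. The function name `remove_in_place` and the three-element `sample` list are illustrative assumptions, not part of the original post. The error appears as soon as one string is a substring of two other elements: the inner loop keeps running after the first `remove`, finds the second match, and tries to remove an item that is already gone.

```python
def remove_in_place(string_list):
    """The loop from the question, wrapped in a function (hypothetical name)."""
    for item in string_list:
        for item1 in string_list:
            if item in item1 and item != item1:
                # The inner loop is not stopped here, so a second match
                # for the same item calls remove() again.
                string_list.remove(item)


# 'distance' is contained in both 'distances' and 'extragalactic_distance_scale'.
sample = ['distances', 'extragalactic_distance_scale', 'distance']

try:
    remove_in_place(sample)
    failed = False
except ValueError:
    failed = True

print(failed)  # True
print(sample)  # ['distances', 'extragalactic_distance_scale']
```

Adding a `break` after the `remove` call would avoid this particular exception, but removing from a list while the outer loop iterates over it can still cause elements to be skipped, so building a new list is likely the more robust route.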

Tags: python-3.x


【Solution 1】:

Instead of removing elements from the list, why not create a new list that meets your requirements (since that is safer)?

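# Return True if elem is a substring of a different string in lst
def substr_in_list(elem, lst):
    for s in lst:
        if elem != s and elem in s:
            return True
    return False

# Rebuild each inner list, keeping only strings that are not substrings of another
words = [[j for j in i if not substr_in_list(j, i)] for i in words]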

Output:

[['gamma_ray_bursts', 'merger', 'death', 'throes', 'magnetic_flares', 'neutrino_antineutrino', 'objections', 'double_neutron_star', 'parker_instability', 'positrons'], ['dot', 'gravitational_lensing', 'splittings', 'limits', 'amplifications', 'time_delays', 'extracting_information', 'fix', 'distant_quasars'], ['recoil', 'gamma_ray_bursts', 'neutron_stars', 'jennings', 'possible_origins', 'birthplaces', 'disjoint', 'arrival_directions'], ['sn_sn', 'type_ii_supernovae', 'distances', 'dilution', 'extinction', 'extragalactic_distance_scale', 'expanding_photosphere', 'photospheres', 'supernovae_sn', 'span_wide_range'], ['photon_pair', 'high_energy', 'gamma_ray_burst', 'optical_depth', 'absorbing_medium', 'implications', 'problem', 'annihilation_radiation', 'emergent_spectrum', 'limit', 'radiation_transfer', 'collimation', 'regions']]
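
As a possible variation on the approach above, the helper can be written with the built-in `any`, and slice assignment can be used when the existing inner list objects need to be kept (for example, because other code holds references to them). The name `drop_substrings` and the shortened `words` data are assumptions made for this sketch, not part of the original answer.

```python
def drop_substrings(strings):
    """Return a new list without the strings contained in another element.

    Order is preserved. Equal strings are not treated as substrings of each
    other, so duplicates are kept.
    """
    return [s for s in strings
            if not any(s != other and s in other for other in strings)]


# Shortened sample data based on the lists in the question.
words = [['gamma_ray_bursts', 'merger', 'bursts'],
         ['distances', 'extragalactic_distance_scale', 'distance']]

# Slice assignment replaces the contents of each inner list in place,
# after the filtered list has been fully built.
for string_list in words:
    string_list[:] = drop_substrings(string_list)

print(words)
# [['gamma_ray_bursts', 'merger'], ['distances', 'extragalactic_distance_scale']]
```

Both versions compare every string with every other string in the same list, so the cost grows quadratically with the list length. That should be fine for lists of the size shown in the question.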

