Remove words in a list that are present in other elements: answers

【Question title】: Remove words in list that are present in other elements
【Posted】: 2023-03-21 21:18:01
【Question】:

I am trying to remove the elements of a list that are present in other elements of the same list.

For example:

l = ['word1', 'word1 word2', 'word3 word2', 'word2', 'word4']

The output of the function should be:

['word3 word2', 'word1 word2', 'word4']

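because 'word1' appears inside the element at index 1 of the original list, and 'word2' appears inside the elements at indices 1 and 2.

This is the algorithm I came up with. It solves the problem in ~O(n^2):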

l = ['word1', 'word1 word2', 'word3 word2', 'word2', 'word4']

s = set(l)

for root_word in l:
    for item in (s-set([root_word])):
        if root_word in item:
            if root_word in s:
                s.remove(root_word)

print(s)

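Does anyone know a more optimized solution? Maybe one that uses Python built-in functions to speed things up.

【Question discussion】:

If the list also contained "word1 word2 word3", what would the output be?

Tags: python string algorithm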

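【Solution 1】:

Your current solution is not O(n^2) but O(m·n^2), where m is the maximum number of words in a list element. You could store a frozenset of words for each entry in the list and then test each entry against the others with the issubset method. This should improve performance when your elements contain many words, but the complexity does not change.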
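A minimal sketch of the frozenset idea, not code from the answer itself. The function name `remove_contained` is hypothetical, and the sketch assumes entries are whitespace-separated words and that "present in another element" means every word of the entry occurs in that other element (word order and adjacency are ignored, unlike the substring test in the question). When two entries have the same word set, only the first is kept.

```python
def remove_contained(items):
    """Drop every entry whose words all occur in some other entry."""
    # Split each entry into a frozenset of its words once, up front.
    word_sets = [frozenset(item.split()) for item in items]
    kept = []
    for i, words in enumerate(word_sets):
        # An entry is dropped if another entry contains all of its words.
        # For entries with identical word sets, only the first one survives.
        contained = any(
            i != j and words.issubset(other) and (words != other or j < i)
            for j, other in enumerate(word_sets)
        )
        if not contained:
            kept.append(items[i])
    return kept


l = ['word1', 'word1 word2', 'word3 word2', 'word2', 'word4']
print(remove_contained(l))  # ['word1 word2', 'word3 word2', 'word4']
```

This still compares every pair of entries, so the number of comparisons stays quadratic in the list length; the gain is only that the words are split once and each subset test uses hashing.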

【Discussion】:

【Solution 2】:

l = ['word1', 'word1 word2', 'word3 word2', 'word2', 'word4']
s = set(l)
x = [s.remove(root_word) for root_word in l for item in (s-set([root_word])) if root_word in item if root_word in s ]   
print(s)

I used a list comprehension; it takes ~20 microseconds to run.
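For comparison, here is a hedged sketch of how the two versions could be timed with the standard `timeit` module. The wrapper names `with_loop` and `with_comprehension` are hypothetical, and absolute timings depend on the machine. Both functions do the same nested iteration, so the comprehension is not expected to change the complexity.

```python
import timeit

l = ['word1', 'word1 word2', 'word3 word2', 'word2', 'word4']


def with_loop(items):
    # Explicit nested loops, as in the question.
    s = set(items)
    for root_word in items:
        for item in s - {root_word}:
            if root_word in item and root_word in s:
                s.remove(root_word)
    return s


def with_comprehension(items):
    # Same logic written as a comprehension used only for its side effect.
    s = set(items)
    [s.remove(root_word)
     for root_word in items
     for item in (s - {root_word})
     if root_word in item if root_word in s]
    return s


runs = 10_000
for fn in (with_loop, with_comprehension):
    total = timeit.timeit(lambda: fn(l), number=runs)
    # Report the average time per call in microseconds.
    print(f"{fn.__name__}: {total / runs * 1e6:.2f} microseconds per call")
```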

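【Discussion】:

1. Low readability. 2. What is the complexity of this solution?
Hi, the time complexity is O(n*m). I don't know much about the time complexity of list comprehensions, but compared with the version in the question it seems to be a somewhat better solution. Thanks for the reply, and please correct me if I'm wrong.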