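If the search is always for a string prefix, you can use a prefix tree (trie), which is available as an existing Python module, pytrie.
A trie locates matches in O(M) time, where M is the maximum string length (reference); that is, lookup time depends on the maximum key length rather than on the number of keys.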
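To make the O(M) claim concrete, here is a minimal sketch of a character-level trie in plain Python. It only illustrates the idea and is not how pytrie is implemented; the class name `SimpleTrie` and the method `keys_with_prefix` are made up for this example.

```python
class SimpleTrie:
    """Minimal character-level trie (illustrative only, not pytrie's implementation)."""

    def __init__(self):
        self.children = {}    # maps a character to the child node
        self.is_key = False   # True if a stored key ends at this node

    def insert(self, key):
        node = self
        for ch in key:
            node = node.children.setdefault(ch, SimpleTrie())
        node.is_key = True

    def keys_with_prefix(self, prefix):
        # Walk down one node per prefix character: len(prefix) steps,
        # regardless of how many keys are stored.
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every key stored in the subtree below the prefix node.
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_key:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


trie = SimpleTrie()
for key in ['A B C', 'A B E', 'E F']:
    trie.insert(key)

print(trie.keys_with_prefix('A B'))  # ['A B C', 'A B E']
print(trie.keys_with_prefix('B'))    # []
```

Reaching the prefix node costs one step per character of the search string. Listing the matches then costs extra time proportional to the size of the subtree under that node.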
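Code
from pytrie import StringTrie

def create_prefix(dictionary):
    """Create a prefix tree from the keys of a dictionary."""
    trie = StringTrie()  # start with an empty trie
    for k in dictionary:
        trie[k] = k  # store each key as its own value so .values() returns the matching keys
    return trie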
Test 1
# Preprocess once to create the prefix tree
mydict = {'A B C': 0, 'A B E': 1, 'E F': 0}
prefix_tree = create_prefix(mydict)

# The same tree can now be reused many times to speed up individual searches
for search_string in ['A B', 'A B C', 'E', 'B']:
    results = prefix_tree.values(search_string)  # .values(prefix) returns the values whose keys start with the prefix
    if results:
        print(f'Search String {search_string} found in keys {results}')
    else:
        print(f'Search String {search_string} not found')
Output
Search String A B found in keys ['A B C', 'A B E']
Search String A B C found in keys ['A B C']
Search String E found in keys ['E F']
Search String B not found
Test 2 (added to answer the OP's question)
mydict = {'A B C': 0, 'A B C D': 0, 'A B C D E': 0}
prefix_tree = create_prefix(mydict)

# The same tree can now be reused many times to speed up individual searches
for search_string in ['A B', 'A B C', 'A B C D', 'A B C D E', 'B C']:
    results = prefix_tree.values(search_string)  # .values(prefix) returns the values whose keys start with the prefix
    if results:
        print(f'Search String {search_string} found in keys {results}')
    else:
        print(f'Search String {search_string} not found')
Output
Search String A B found in keys ['A B C', 'A B C D', 'A B C D E']
Search String A B C found in keys ['A B C', 'A B C D', 'A B C D E']
Search String A B C D found in keys ['A B C D', 'A B C D E']
Search String A B C D E found in keys ['A B C D E']
Search String B C not found
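For comparison, the same matches can be cross-checked with a naive scan that calls `str.startswith` on every key. This is the baseline the trie avoids: each query visits all N keys instead of depending only on key length. The helper name `naive_prefix_search` is made up for this sketch.

```python
def naive_prefix_search(keys, prefix):
    """Return every key that starts with prefix by scanning all keys (O(N) per query)."""
    return [k for k in keys if k.startswith(prefix)]


mydict = {'A B C': 0, 'A B C D': 0, 'A B C D E': 0}

for search_string in ['A B', 'A B C D', 'B C']:
    matches = naive_prefix_search(mydict, search_string)
    if matches:
        print(f'Search String {search_string} found in keys {matches}')
    else:
        print(f'Search String {search_string} not found')
```

For a handful of keys the scan is perfectly adequate. The prefix tree pays off when the same large key set is searched many times.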