不要将字典变成完整的图表,而是使用结构少一点的东西:
对于字典中的每个word,您可以通过删除len(word) 中每个i 的字符号i 得到一个shortened_word。将(shortened_word, i) 对映射到所有words 的列表。
这有助于查找所有带有一个替换字母的单词(因为对于某些 i,它们必须在同一个 (shortened_word, i) bin 中,而对于多一个字母的单词(因为它们必须在某些 (word, i) bin 中) i.
Python 代码:
from collections import defaultdict, deque
from itertools import chain
def shortened_words(word):
for i in range(len(word)):
yield word[:i] + word[i + 1:], i
def prepare_graph(d):
g = defaultdict(list)
for word in d:
for short in shortened_words(word):
g[short].append(word)
return g
def walk_graph(g, d, start, end):
todo = deque([start])
seen = {start: None}
while todo:
word = todo.popleft()
if word == end: # end is reachable
break
same_length = chain(*(g[short] for short in shortened_words(word)))
one_longer = chain(*(g[word, i] for i in range(len(word) + 1)))
one_shorter = (w for w, i in shortened_words(word) if w in d)
for next_word in chain(same_length, one_longer, one_shorter):
if next_word not in seen:
seen[next_word] = word
todo.append(next_word)
else: # no break, i.e. not reachable
return None # not reachable
path = [end]
while path[-1] != start:
path.append(seen[path[-1]])
return path[::-1]
及用法:
dictionary = ispell_dict # list of 47158 words
graph = prepare_graph(dictionary)
print(" -> ".join(walk_graph(graph, dictionary, "hands", "feet")))
print(" -> ".join(walk_graph(graph, dictionary, "brain", "game")))
输出:
hands -> bands -> bends -> bents -> beets -> beet -> feet
brain -> drain -> drawn -> dawn -> damn -> dame -> game
关于速度的一句话:构建“图形助手”很快(1 秒),但手 -> 脚需要 14 秒,大脑 --> 游戏需要 7 秒。
编辑:如果您需要更快的速度,可以尝试使用图表或网络库。或者您实际上构建了完整的图表(慢),然后更快地找到路径。这主要包括将边缘查找从步行功能移动到图形构建功能:
def prepare_graph(d):
g = defaultdict(list)
for word in d:
for short in shortened_words(word):
g[short].append(word)
next_words = {}
for word in d:
same_length = chain(*(g[short] for short in shortened_words(word)))
one_longer = chain(*(g[word, i] for i in range(len(word) + 1)))
one_shorter = (w for w, i in shortened_words(word) if w in d)
next_words[word] = set(chain(same_length, one_longer, one_shorter))
next_words[word].remove(word)
return next_words
def walk_graph(g, start, end):
todo = deque([start])
seen = {start: None}
while todo:
word = todo.popleft()
if word == end: # end is reachable
break
for next_word in g[word]:
if next_word not in seen:
seen[next_word] = word
todo.append(next_word)
else: # no break, i.e. not reachable
return None # not reachable
path = [end]
while path[-1] != start:
path.append(seen[path[-1]])
return path[::-1]
用法:首先构建图表(慢,某些 i5 笔记本电脑上的所有时序,YMMV)。
dictionary = ispell_dict # list of 47158 words
graph = prepare_graph(dictionary) # more than 6 minutes!
现在找到路径(比以前快得多,无需打印):
print(" -> ".join(walk_graph(graph, "hands", "feet"))) # 10 ms
print(" -> ".join(walk_graph(graph, "brain", "game"))) # 6 ms
print(" -> ".join(walk_graph(graph, "tampering", "crunchier"))) # 25 ms
输出:
hands -> lands -> lends -> lens -> lees -> fees -> feet
brain -> drain -> drawn -> dawn -> damn -> dame -> game
tampering -> tapering -> capering -> catering -> watering -> wavering -> havering -> hovering -> lovering -> levering -> leering -> peering -> peeping -> seeping -> seeing -> sewing -> swing -> swings -> sings -> sines -> pines -> panes -> paces -> peaces -> peaches -> beaches -> benches -> bunches -> brunches -> crunches -> cruncher -> crunchier