跳到主要内容

212. 单词搜索 II [hard]

212. 单词搜索 II [hard]

https://leetcode-cn.com/problems/word-search-ii/

给定一个二维网格 board 和一个字典中的单词列表 words,找出所有同时在二维网格和字典中出现的单词。

单词必须按照字母顺序,通过相邻的单元格内的字母构成,其中“相邻”单元格是那些水平相邻或垂直相邻的单元格。同一个单元格内的字母在一个单词中不允许被重复使用。

示例:

输入:

words = ["oath","pea","eat","rain"] and board =
[
['o','a','a','n'],
['e','t','a','e'],
['i','h','k','r'],
['i','f','l','v']
]

输出: ["eat","oath"]

说明:

  • 你可以假设所有输入都由小写字母 a-z 组成。

提示:

  • 你需要优化回溯算法以通过更大数据量的测试。你能否早点停止回溯?
  • 如果当前单词不存在于所有单词的前缀中,则可以立即停止回溯。什么样的数据结构可以有效地执行这样的操作?散列表是否可行?为什么? 前缀树如何?如果你想学习如何实现一个基本的前缀树,请先查看这个问题: 实现Trie(前缀树)。

通过次数17,237 | 提交次数41,468

Second Try

2020-07-22

参考题解写的简化版本,但是不习惯在操作过程中对trie树进行剪纸修改,就维持目前这个速度好了。优化过多,换个需求就悲剧了。

在判断是否mark_end存在的时候,又犯了个current level和next level的错误,这地方一迷糊真容易出bug,思路清晰就没问题。


class Solution:

def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
if not (len(board) and len(board[0]) and len(words)):
return []

trie = {}
mark_end = "#"
for word in words:
node = trie
for c in word:
node = node.setdefault(c, dict())
node[mark_end] = word # 保留完整的单词

# print("trie", trie)

m, n = len(board),len(board[0])
def backtrack(i, j, visited, seq, node):
if not (0<= i < m and 0 <= j <n):
return False
if (i, j) in visited: # 已经在此轮被访问过
return False
if board[i][j] not in node: # 不存在trie树中
return False
visited.add((i, j))
# 又搞出来一个bug,这时候应该用next level来判断是否mark end存在,而不是当前node
# if mark_end in node: # 发现了完整单词
# seq.append(node[mark_end])
# next_level = node[board[i][j]]
next_level = node[board[i][j]]
if mark_end in next_level: # 发现了完整单词
seq.append(next_level[mark_end])
for step_row, step_col in [(1, 0), (-1, 0), (0, 1), (0, -1)]:
backtrack(i + step_row, j + step_col, visited, seq, next_level)
visited.remove((i, j)) # backtrack,方便下一轮

seq = []
for i in range(m):
for j in range(n):
backtrack(i, j, set(), seq, trie)
return list(set(seq))
  • 执行用时:356 ms, 在所有 Python3 提交中击败了30.31%的用户
  • 内存消耗:27.8 MB, 在所有 Python3 提交中击败了50.00%的用户
  • python新用法: dict setdefault
In [45]: a = dict()

In [46]: a.setdefault?
Docstring: D.setdefault(k[,d]) -> D.get(k,d), also set D[k]=d if k not in D
Type: builtin_function_or_method

In [47]: a.setdefault("t", 2)
Out[47]: 2

In [48]: a
Out[48]: {'t': 2}

First Try

2020-07-22

其实应该说在做过208,211之后,这道题挺简单的,主要是trie树和backtrack递归,轻松就写出来,结果还是一堆bug调了好久,惨。

  • 问题主要出在使用last_node来判断是否为单词末尾,结果写出了拧巴的判断,把自己也绕晕了。其实设置key的时候,对应的value为node,node设置is_end。因此每次只要判断循环到最后的node是否is_end=True即可,思路也变得清晰了起来。
  • 其次在找到某个单词之后,还是不能终止,因为可能有更长的单词也匹配,因此需要继续一路走下去。只是找到某个单词之后,复制后放到结果列表里即可。
  • 沿着当前坐标走下去,需要添加到路径里,走完之后需要注意把路径返回,不能影响到其他坐标选择。
  • 某个单词在矩阵里可能有不同的实现方式,因此最后的结果需要注意去重。

vola, 其实熟悉了trie树和backrack回溯递归之后,也算是一道简单的题目,把bug都趟过一遍就行。

"""
对单词列表构建一个棵trie树,类似vocabulary
对矩阵里面每个元素进行4个方向遍历,并记录走过的位置

test case:

[["o","a","a","n"],["e","t","a","e"],["i","h","k","r"],["i","f","l","v"]]
["oath","pea","eat","rain"]

贡献错误提交的案例:
- 竟然会查找出多个重复单词,但是trie好像没法及时终止。。
[["a","a"]]
["a"]

- 更长的单词覆盖了短单词

[["a","b"],["c","d"]]
["ac","ca", "acb", "acd"]

- 错误提交

[["a","b"],["c","d"]]
["ab","cb","ad","bd","ac","ca","da","bc","db","adcb","dabc","abb","acb"]

- tmd又错误提交了。。。

[["a","b"],["a","a"]]
["aaab","aaa","aaaa","aaba"]
"""

class Node(dict):
def __init__(self):
super(Node).__init__()
self.is_end = False

class Trie(object):
def __init__(self):
self.node = Node()

def add(self, word):
node = self.node
for c in word:
if c not in node:
node[c] = Node()
node = node[c]
node.is_end = True

# 根本不用last node作为标记啊,只要存在value为node,判断其是否is end即可。
# def add(self, word):
# node = self.node
# for c in word:
# last_node = node
# if c not in node:
# node[c] = Node()
# node = node[c]
# last_node.is_end = True

def search(self, word):
node = self.node
for c in word:
last_node = node
if c not in node:
return False
node = node[c]
return last_node.is_end


class Solution:

def findWords(self, board: List[List[str]], words: List[str]) -> List[str]:
if not (len(board) and len(board[0]) and len(words)):
return []

trie = Trie()
for w in words:
trie.add(w)
print("trie", trie.node)

def search_point(i, j, visited, path, paths, node):
if not (0<= i < m and 0 <= j < n):
return False
if (i, j) in visited:
return False
if board[i][j] in node:
path.append([i, j])
visited.add((i, j))
next_level = node[board[i][j]]
if next_level.is_end:
paths.append(path[:])
# return True # 想不到又有一个坑,就算找到了一个有效单词,但也不能停止搜索,可能下一个还有更长的单词
for step_x, step_y in [(-1, 0), (1, 0), (0, 1), (0, -1)]:
# 不用判断成功或失败,只要能继续在trie树里查找下去,就一路找下去
# if search_point(i + step_y, j + step_x, visited, path, next_level):
# return True
search_point(i + step_y, j + step_x, visited, path, paths, next_level)
# 发现四个方向都不存在,这一步就素走成功了当前单词,但也不存在了,需要往回退,复原path和visited的现场
path.pop()
visited.remove((i, j))
return False

paths = []
m, n = len(board),len(board[0])
for i in range(m):
for j in range(n):
visited = set()
path = []
search_point(i, j, visited, path, paths, trie.node)
rv = []
for path in paths:
t = []
for i, j in path:
t.append(board[i][j])
rv.append(''.join(t))
# print('rv', rv)
return list(set(rv))

  • 执行用时:408 ms, 在所有 Python3 提交中击败了23.73%的用户
  • 内存消耗:35.5 MB, 在所有 Python3 提交中击败了50.00%的用户
  • failed try

起始的时候用keys而非node来迭代,而且判断是否为单词结尾的时候,使用了错误的判断逻辑,导致写起来拧巴,结果也错误了好几次,最后放弃。改成First try的新逻辑后,思路就清晰多了。


class Node:
def __init__(self):
self.keys = [None] * 26
self.mark = False

class Trie:

def __init__(self):
"""
Initialize your data structure here.
"""
self.keys = [None] * 26

def insert(self, word: str) -> None:
"""
Inserts a word into the trie.
"""
keys = self.keys
for i in range(len(word)):
idx = ord(word[i]) - ord("a")
if keys[idx] == None:
keys[idx] = Node()
if i == len(word) - 1:
keys[idx].mark = 2
keys = keys[idx].keys

def search(self, word: str) -> bool:
"""
Returns if the word is in the trie.
"""
keys = self.keys
for i in range(len(word)):
idx = ord(word[i]) - ord("a")
if keys[idx] == None:
return False
if i == len(word) - 1:
return keys[idx].mark == 2
keys = keys[idx].keys
return False

def startsWith(self, prefix: str) -> bool:
"""
Returns if there is any word in the trie that starts with the given prefix.
"""
def rec(node):
if node.mark == 2:
return True
for k in node.keys:
if k and rec(k):
return True
return False

keys = self.keys
for c in prefix:
idx = ord(c) - ord("a")
if keys[idx] is None:
return False
node = keys[idx]
keys = node.keys
return rec(node)