Word Break (LC 139)

Given a string and a list of words, determine if the string can be constructed from concatenating words from the list of words. A word can be used multiple times.

Input: s = "leetcode", wordDict = ["leet","code"]
Output: true
Explanation:** Return true because "leetcode" can be segmented as "leet code".

Solution

Backtracking Aggregation

We can try every word in the dictionary and see if the dictionary word is a prefix of the target word. For example, assuming the dictionary contains ["a", "b", "algo"], "a" and “algo" are both prefixes of "algomonster”, but "b" is not. If the dictionary word is a prefix then we can consider the prefix matched and continue to try to match the remaining part of the target word. We can draw the state-space tree by using every dictionary word as an edge at every level.

is_leaf: This occurs when start_index == len(words); then we have matched every letter and word break is a success
get_edges: We can use startIndex to record the next letter in the target we have matched so far. We try this for each word in the dictionary
- target[:startIndex] is matched
- target[startIndex:] is to be matched.
initial_value: We start with False because we haven’t broken the target word yet.
aggregate: We use logical OR to update ans to True if return value from the child recursive call is True.
additional_states: There are no additional states.

def word_break(target, words):
    def dfs(start_index):
        if start_index == len(target):
            return True
 
        ans = False
        for word in words:
            if target[start_index:].startswith(word):
                ans = ans or dfs(start_index + len(word))
 
        return ans
 
    return dfs(0)

The solution above works but has bad time complexity. Let n represent the length of the target string, and let m represent the number of words in the words list.

In the worst case scenario, you would require all m words to make the target string. In addition, in the worst possible case, you would need to use n words to make the target string, which happens when all the words we use have a length of 1. Since we have m choices for each word, we get an upper bound time complexity of O(m^n). In practice however, it will much faster than O(m^n), but still really inefficient.

Memoization

The above solution works but will time out if there are a lot of branches. We can use Memoization to cache the branches we have already seen.

def word_break(target, words) -> bool:
    memo = {}
    def dfs(start_index):
        if start_index == len(target):
            return True
 
        if start_index in memo:
            return memo[start_index]
 
        ans = False
        for word in words:
            if target[start_index:].startswith(word):
                if dfs(start_index + len(word)):
                    ans = True
                    break
 
        memo[start_index] = ans
        return ans
 
    return dfs(0)

Time complexity:

The size of the memo array is O(n) where n is the length of the target word.
For each state in the memo array, we have to try every dictionary word to see if it’s a prefix of the target word at the current position.
Assuming the size of the dictionary is m, string matching takes O(n), so the overall time complexity is O(n^2 * m);

/notes/

Recent

Characteristic Polynomial CL Stability

Choosing Poles for IOP and SPA

Control Design with IOP and SPA

Word Break (LC 139)

Solution

Backtracking Aggregation

Memoization

Graph View

Table of Contents

Backlinks