InsertingSpaces

The Problem

You are given a dictionary of words, followed by a list of sentences which have been stripped of all punctuation. You are to determine if spaces can be re-inserted into the sentences such that each word created appears in the given dictionary.

Things to consider

Words may overlap
- i.e. “cat” and “catch”
- Where you choose to insert the space is important in this case
- This prevents heuristic greedy algorithms
A word may appear more than once in any sentence
Test cases are large enough that computing all possible dictionary combinations will eat my CPU

Things to consider

The key to this problem is finding all valid word prefixes for a given sentence
Since words may overlap, we must explore each one

Example

Dictionary:
- cat
- catch
- cha
Sentence:
- catcha
You must consider the case of [catch][a] and [cat][cha]

Data Representation

Since we will be looking for valid word prefixes, this problem lends itself well to the Trie data structure (also known as a prefix tree)
Trie’s are ordered tree data structures which have the property that all descendants of a given node share a common prefix
Used commonly for auto-complete and text prediction

Implementation

Like graphs, most languages do not include a trie structure in the standard library
Again though, they are easily created using existing data structures
Each node in the trie contains a map of a Character => TrieNode denoting an out going edge labeled with the character incident on the given TrieNode
For the purpose of this problem we only need a minimal trie implementation
insert, and prefixMatches are all that are necessary
Note that tries can also be used to implement pattern matching i.e.
- find all strings that start with a, followed by any 3 characters, and end with b
- These features are used for things like text prediction and auto complete

Trie Node Implementation

Insert Function

Finding matching prefixes

Putting it all together

Now that we have a way of finding all valid prefixes, we just have to design an algorithm to solve the problem
Basically, for each valid prefix we find, we want to remove it from the sentence and then repeat the process with the remaining string
Once we have exhausted the entire input string, we know that it was a valid sentence
Seems like a recursive problem

InsertingSpaces

The Problem

Things to consider

Things to consider

Example

Data Representation

Implementation

Trie Node Implementation

Insert Function

Finding matching prefixes

Putting it all together

The algorithm

Full solution