Schutze’s Vector Space: Detail
Build a co-occurrence matrix
- Restrict Vocabulary to 4 letter sequences
- Exclude Very Frequent - Articles, Affixes
- Entries in 5000 x 5000 Matrix
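
The matrix-building steps above can be sketched in Python as follows. This is a minimal illustration, not Schutze's actual implementation: the function names, the `n_stop` cutoff used to drop the most frequent 4grams, and the `window` size for counting co-occurrences are all assumptions made for the example.

```python
from collections import Counter

def fourgrams(text):
    """Return (position, 4gram) pairs for every overlapping 4-character sequence."""
    return [(i, text[i:i + 4]) for i in range(len(text) - 3)]

def build_cooccurrence(text, vocab_size=5000, n_stop=0, window=200):
    """Count how often vocabulary 4grams start within `window` characters of each other.

    n_stop and window are hypothetical parameters chosen for this sketch.
    """
    grams = fourgrams(text)
    ranked = [g for g, _ in Counter(g for _, g in grams).most_common()]
    # Skip the n_stop most frequent 4grams (stand-in for articles, affixes),
    # then keep the next vocab_size as the restricted vocabulary.
    vocab = ranked[n_stop:n_stop + vocab_size]
    index = {g: k for k, g in enumerate(vocab)}
    size = len(vocab)
    matrix = [[0] * size for _ in range(size)]
    # Keep only in-vocabulary 4grams, as (start position, matrix index).
    kept = [(pos, index[g]) for pos, g in grams if g in index]
    for a in range(len(kept)):
        pos_a, row = kept[a]
        b = a + 1
        # Pair this 4gram with every later one that starts inside the window.
        while b < len(kept) and kept[b][0] - pos_a <= window:
            col = kept[b][1]
            matrix[row][col] += 1
            matrix[col][row] += 1  # keep the matrix symmetric
            b += 1
    return vocab, matrix
```

With the default `vocab_size=5000` the result is at most a 5000 x 5000 matrix; shorter texts simply yield a smaller one. A real corpus would call for a sparse representation rather than nested lists.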
Word Context
- 4grams within 1001 Characters
- Sum & Normalize Vectors for each 4gram
- Similarity between Vectors by dot product (cosine, since vectors are normalized)
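
The context steps above can be sketched as follows. This is an illustration under stated assumptions: `vectors` is a hypothetical dict mapping each vocabulary 4gram to its row of the co-occurrence matrix, and the 1001-character window is assumed to be centered on the position of interest (500 characters on either side).

```python
import math

def context_vector(text, center, vectors, window=1001):
    """Sum the vectors of all known 4grams in a window around `center`, then normalize."""
    half = window // 2
    start = max(0, center - half)
    end = min(len(text), center + half + 1)
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    # Visit every 4gram that lies entirely inside the window.
    for i in range(start, end - 3):
        vec = vectors.get(text[i:i + 4])
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
    norm = math.sqrt(sum(t * t for t in total))
    # Scale to unit length; an all-zero context is returned unchanged.
    return [t / norm for t in total] if norm else total

def dot(u, v):
    """Dot product; for unit-length vectors this equals their cosine."""
    return sum(a * b for a, b in zip(u, v))
```

Because each context vector is scaled to unit length, the dot product of two contexts lies between -1 and 1, and a larger value means the contexts are closer.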