Note that in the prior examples, the current state of the chain was a function solely of the prior state, with each transition occurring at a defined probability. Because each state depends on one predecessor, a pair of states forms a bigram (a sequence of two words). If the current state were not a function of the prior state, state selection would simply be a random process. If the state depended on the prior two states, the model would be called a trigram. While increasing the dependence on prior states (or context) can increase the usability of the chain (as we'll see in the applications section), the memory requirements of such models can be prohibitive [Henke 1999]. For example, consider Table 10.1, which shows the number of elements necessary for a vocabulary of 100 words.
| n | Type | Number of Elements |
|---|---|---|
| 2 | bigram | 10,000 |
| 3 | trigram | 1,000,000 |
| 4 | 4-gram | 100,000,000 |
A corpus of 100 unique words is quite small; since each additional word of context multiplies the table size by the vocabulary size, building anything other than bigrams over a realistic corpus can be quite expensive.
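The growth shown in Table 10.1 can be verified with a short calculation. The following sketch (not from the original text; the function name `ngram_table_size` is a hypothetical illustration) assumes a dense transition table that stores one element for every possible combination of n words drawn from the vocabulary:

```python
def ngram_table_size(vocab_size: int, n: int) -> int:
    """Elements in a dense n-gram table: one per n-word combination,
    i.e. vocab_size ** n (the counts in Table 10.1)."""
    return vocab_size ** n

# Reproduce the counts from Table 10.1 for a 100-word vocabulary.
for n, name in [(2, "bigram"), (3, "trigram"), (4, "4-gram")]:
    print(f"{name}: {ngram_table_size(100, n):,} elements")
```

In practice, most of these elements are zero for real text, which is why larger n-gram models typically use sparse storage rather than a dense table.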