Title: Digital data structures and order statistics
Abstract: This paper studies in a probabilistic framework some topics concerning the way words (strings) can overlap, and relationship of it to the height of digital trees associated with this set of words. A word is defined as a random sequence of (possible infinite) symbols over a finite alphabet. A key notion of alignment matrix {C ij } n i,j=1 is introduced where C ij is the length of the longest string that is prefix of the i-th and the j-th word. It is proved that the height of an associated digital tree is simply related to the alignment matrix through some order statistics. In particular, using this observation and proving some inequalities for order statistics, we establish that the height of a digital trie under independent model (i.e., all words are statistically independent), is asymptotically equal to 2 logαn where n is the number of words stored in the trie and α is a parameter of the probabilistic model. Some extensions of our basic model to other digital trees such as b-tries, tries with random number of keys (Poisson model) and suffix trees (dependent keys !) are also shortly discussed.
Publication Year: 1989
Publication Date: 1989-01-01
Language: en
Type: book-chapter
Indexed In: ['crossref']
Access and Citation
Cited By Count: 10
AI Researcher Chatbot
Get quick answers to your questions about the article from our AI researcher chatbot