Title: STAMP: On Discovery of Statistically Important Pattern Repeats in Long Sequential Data
Abstract: In this paper, we focus on mining periodic patterns that allow some degree of imperfection, in the form of random replacement from a perfect periodic pattern. In InfoMiner+, we proposed a new metric, namely generalized information gain, to identify patterns with events of vastly different occurrence frequencies and to adjust for the deviation from a pattern. In particular, a penalty can be associated with gaps between pattern occurrences, which is particularly useful in locating repeats in DNA sequences. Here we present an effective mining algorithm, STAMP, to simultaneously mine significant patterns and the associated subsequences under the model of generalized information gain.
Authors: Jiong Yang, Wei Wang, and Philip S. Yu
Published in: Proceedings of the 2003 SIAM International Conference on Data Mining (SDM), pp. 224-235
Chapter DOI: https://doi.org/10.1137/1.9781611972733.21
Published: 2003
ISBN: 978-0-89871-545-3, eISBN: 978-1-61197-273-3, https://doi.org/10.1137/1.9781611972733
Book Series Name: Proceedings, Book Code: PR112, Book Pages: xiv + 347