•   When: Monday, December 14, 2015 from 02:30 PM to 03:30 PM
•   Speakers: Yuan Li
•   Location: Nguyen Engineering, Room 4201

Abstract

Time series have attracted researchers' attention for many decades. Studies of time series usually start from either the inter-data level or the intra-data level. At the inter-data level, researchers focus on traditional tasks such as clustering, classification, and anomaly detection, all of which rely on a measure of similarity between data. Most existing work on time series similarity search is shape-based and typically fails to produce satisfactory results when the sequences are long. To overcome this drawback, we introduce a histogram-based representation for time series data, which is structure-based.
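
To make the idea of a structure-based, histogram representation concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the representation presented in the talk: it assumes each sliding window is z-normalized, averaged into a few segments, and mapped to a three-letter word, and that two series are compared by the Euclidean distance between their word counts. The names `pattern_histogram` and `histogram_distance`, the breakpoints, and the default parameters are all hypothetical.

```python
from collections import Counter
from statistics import mean, pstdev

# Assumed breakpoints: roughly split a standard normal into 3 equal-probability bins.
BREAKPOINTS = (-0.43, 0.43)
ALPHABET = "abc"


def znorm(values):
    """Z-normalize a window; a constant window maps to all zeros."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def to_word(window, segments):
    """Average the normalized window per segment and map each average to a letter.

    Assumes len(window) is divisible by `segments`.
    """
    normed = znorm(window)
    size = len(normed) // segments
    letters = []
    for i in range(segments):
        avg = mean(normed[i * size:(i + 1) * size])
        index = sum(avg > b for b in BREAKPOINTS)  # number of breakpoints below avg
        letters.append(ALPHABET[index])
    return "".join(letters)


def pattern_histogram(series, window=8, segments=4):
    """Count the words produced by every sliding window of the series."""
    counts = Counter()
    for start in range(len(series) - window + 1):
        counts[to_word(series[start:start + window], segments)] += 1
    return counts


def histogram_distance(h1, h2):
    """Euclidean distance between two word-count histograms."""
    keys = set(h1) | set(h2)
    return sum((h1[k] - h2[k]) ** 2 for k in keys) ** 0.5
```

Because the histogram records which local structures occur and how often, rather than where they occur, two long series can be compared without aligning them point by point.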

So far, there has been relatively little work on studying time series data at the intra-data level. To uncover the hidden patterns and relationships in the data, we exploit grammar induction, which does not require users to know much detail about the data. By introducing grammar induction into pattern discovery for time series, we can effectively identify repeated patterns without much prior knowledge. This work also introduces and develops a pattern visualization system through which the patterns in the data, and the relationships among them, can be discovered and presented intuitively.
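
As a rough illustration of how grammar induction can surface repeated patterns in a symbolized sequence, the following Python sketch uses a simple greedy pair-replacement scheme in the spirit of Re-Pair. This is a swapped-in, deterministic technique chosen for brevity and is not the algorithm presented in the talk. The function names and the `R1`, `R2`, ... rule names are hypothetical, and the input is assumed to be a string of single-character terminals that do not collide with the rule names.

```python
from collections import Counter


def induce_grammar(sequence):
    """Greedily replace the most frequent adjacent pair with a new rule.

    Returns (start_symbols, rules), where rules maps a rule name to the
    pair of symbols it stands for.
    """
    symbols = list(sequence)
    rules = {}
    while True:
        pairs = Counter(zip(symbols, symbols[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:  # no adjacent pair repeats, so nothing left to compress
            break
        name = f"R{len(rules) + 1}"
        rules[name] = pair
        # Replace non-overlapping occurrences of the pair, left to right.
        replaced = []
        i = 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                replaced.append(name)
                i += 2
            else:
                replaced.append(symbols[i])
                i += 1
        symbols = replaced
    return symbols, rules


def expand(symbol, rules):
    """Expand a symbol back into the terminal string it represents."""
    if symbol not in rules:
        return symbol
    left, right = rules[symbol]
    return expand(left, rules) + expand(right, rules)


def grammar_size(start_symbols, rules):
    """Total number of symbols on the right-hand sides, including the start rule."""
    return len(start_symbols) + 2 * len(rules)
```

Each induced rule expands to a substring that repeats in the input, so listing `expand(name, rules)` for every rule gives candidate repeated patterns, and `grammar_size` gives one simple way to compare how compactly different induction methods describe the same sequence.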

Many grammar induction algorithms have been introduced because of their practical or theoretical impact on data compression, pattern discovery, and computation theory. Most existing work on learning a grammar for a language is based on a deterministic approach. We introduce a non-deterministic approach to grammar induction. Our grammar induction algorithm can effectively identify smaller grammars than well-known grammar induction algorithms. The results illustrate that our algorithms can feasibly address difficult problems such as pattern discovery in symbolized sequences.
