  •   When: Tuesday, May 11, 2021 from 01:00 PM to 03:00 PM
  •   Speakers: Yifeng Gao
  •   Location: Virtual

Abstract

With the widespread use of sensor networks, large-scale time series have become ubiquitous in both industrial processes and research applications. How to mine useful information from such time series data and make decisions based on it has become a popular topic in research fields including medicine, meteorology, biology, and astronomy. In recent years, the task of detecting repeated patterns in time series, known as motif discovery, has received a great deal of attention. The discovered motifs play an essential role in many time series data mining tasks, such as data visualization, classification, and clustering.

Despite the significant advances in motif discovery research over the past decade, detecting motifs in large-scale time series remains a challenging problem. Moreover, in some downstream tasks that use motifs, having motifs of different lengths is crucial, as variable-length patterns can naturally co-exist in a time series and represent different aspects of the data.

Motivated by these challenges, in this dissertation we introduce a series of time- and space-efficient approximate algorithms for detecting variable-length motifs. The proposed methods enable motif discovery in large-scale time series, which ultimately benefits a wide range of downstream research tasks.

Specifically, we introduce three algorithms to tackle the following challenging tasks in variable-length motif discovery for large-scale time series: 1) mining motifs in time series at a scale of over one hundred million data points; 2) mining motifs with significantly different length scales; and 3) mining co-evolving subdimensional motifs.

We demonstrate that all of the proposed algorithms can efficiently detect meaningful variable-length motifs in various large-scale, real-world time series. Ultimately, the proposed algorithms can benefit various downstream tasks such as data visualization, classification, clustering, anomaly detection, and rule discovery. 

Posted 3 years, 9 months ago