News & Announcements
For the week of 1/31 only, my office hours will be changed to Wednesday (2/2) 3-4pm and Thursday (2/3) 4-5pm.
Instructor: Dr. Jessica Lin
Office: Engineering Building 4419
Phone: 703-993-4693
Email: jessica [AT] cs [DOT] gmu [DOT] edu
Office Hours: Wednesday 2-4pm
Classes: Thursdays 7:20-10:00pm, Innovation Hall 132
Prerequisite: CS 750 or equivalent. Some programming skills are required for the final project.
Textbook (optional): Data Mining: Concepts and Techniques, 2nd Edition, Morgan Kaufmann Publishers, March 2006. ISBN 1-55860-901-6.
Course Description:
Time series, traditionally defined as measurements taken over time, are perhaps the most commonly encountered data type, arising in almost every human endeavor, including medicine, finance, aerospace, industry, and science. While time series data present special challenges to researchers due to their unique characteristics, the past decade has seen an explosion of research in time series data mining. This seminar provides an overview of state-of-the-art research on mining time series data. Topics covered include data representation, similarity search, indexing, clustering, classification, anomaly detection, rule discovery, motif discovery, and visualization. Sequential pattern discovery on discrete temporal data (web logs, customer transactions, etc.) and mining of streaming time series will also be discussed.
Course Format:
The course will include lectures by
the instructor, presentations from students, and class discussion. You
will be asked to read research papers published in major conferences
and/or journals (paper list TBA).
Grading
Grading will be based on participation, assignments, presentation(s), and a final project. You will be using Matlab in this class. Each week you are required to read two papers. Each student will present 1-2 papers during the semester.
Participation/Attendance/Quizzes: 15%
Assignments: 20%
Presentation: 25%
Project Proposal: 15%
Project: 25%
Schedule
Assigned papers should be read prior to the class meeting (e.g., read paper #1 for the 2/3 class).
Weeks | Dates | Topics | Papers | Presenter(s) |
1 | 1/27 | No Class | | |
2 | 2/3 | Time Series Similarity Search/Indexing I | 1 | |
3 | 2/10 | Time Series Similarity Search/Indexing II | 2, 3 | |
4 | 2/17 | Symbolic Representation | 4, 5 | |
5 | 2/24 | Classification | 6, 7 | Sheri (6) |
6 | 3/3 | Clustering | 8, 9 | Paul (8), Muzammil (9) |
7 | 3/10 | Subsequence Clustering / Rule Discovery | 10, 11 | Philip (10), Rohan (11) |
8 | 3/17 | Spring Break | | |
9 | 3/24 | Motif Discovery (Project Proposal due) | 12, 13 | Jin-Ming (12), Sheri (13) |
10 | 3/31 | Anomaly Detection | 14, 15 | Raghu (14), Paul (15) |
11 | 4/7 | Visualization | 16, 17 | Stefan (17) |
12 | 4/14 | Social Media Analysis | 18, 19, 20, 21*, 22* | Carl (18), Muzammil (19), Carl (20) |
13 | 4/21 | Trajectory | 23, 24 | Stefan (23), Philip (24) |
14 | 4/28 | Spatiotemporal | 25, 26 | Michael (25), Philip (26) |
15 | 5/5 | TBA | | |
16 | 5/10 | Project Presentations 1 | | |
16 | 5/12 | Project Presentations 2 | | |
Paper List (TBA)
1. Chotirat Ann Ratanamahatana, Jessica Lin, Dimitrios Gunopulos, Eamonn J. Keogh, Michail Vlachos, Gautam Das: Mining Time Series Data. Data Mining and Knowledge Discovery Handbook 2010: 1049-1077.
2. (The first paper on time series mining) Rakesh Agrawal, Christos Faloutsos, and Arun N. Swami. 1993. Efficient Similarity Search In Sequence Databases. In Proceedings of the 4th International Conference on Foundations of Data Organization and Algorithms (FODO '93), David B. Lomet (Ed.). Springer-Verlag, London, UK, 69-84.
3. Hui Ding, Goce Trajcevski, Peter Scheuermann, Xiaoyue Wang, and Eamonn Keogh. 2008. Querying and mining of time series data: experimental comparison of representations and distance measures. Proc. VLDB Endow. 1, 2 (August 2008), 1542-1552.
4. Jessica Lin, Eamonn J. Keogh, Li Wei, Stefano Lonardi: Experiencing SAX: a novel symbolic representation of time series. Data Min. Knowl. Discov. 15(2): 107-144 (2007).
5. Alessandro Camerra, Themis Palpanas, Jin Shieh, Eamonn Keogh: iSAX 2.0: Indexing and Mining One Billion Time Series. 2010 IEEE International Conference on Data Mining (ICDM), pp. 58-67, 2010.
6. Milos Radovanovic, Alexandros Nanopoulos, Mirjana Ivanovic: Time-Series Classification in Many Intrinsic Dimensions. SDM 2010: 677-688.
7. Li Wei and Eamonn Keogh. 2006. Semi-supervised time series classification. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '06). ACM, New York, NY, USA, 748-753.
8. T. Warren Liao. 2005. Clustering of time series data-a survey. Pattern Recogn. 38, 11 (November 2005), 1857-1874.
9. P.P. Rodrigues, J. Gama, and J.P. Pedroso, “ODAC: Hierarchical Clustering of Time Series Data Streams,” Proc. Sixth SIAM Int'l Conf. Data Mining, pp. 499-503, Apr. 2006.
10. Eamonn J. Keogh, Jessica Lin: Clustering of time-series subsequences is meaningless: implications for previous and future research. Knowl. Inf. Syst. 8(2): 154-177 (2005).
11. Dina Goldin, Ricardo Mardales, and George Nagy. 2006. In search of meaning for time series subsequence clustering: matching algorithms based on a new distance measure. In Proceedings of the 15th ACM international conference on Information and knowledge management (CIKM '06). ACM, New York, NY, USA, 347-356.
12. Nuno Castro, Paulo J. Azevedo: Multiresolution Motif Discovery in Time Series. SDM 2010: 665-676.
13. Abdullah Mueen and Eamonn Keogh. 2010. Online discovery and maintenance of time series motifs. In Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining (KDD '10).
14. Dan Preston, Pavlos Protopapas, Carla Brodley. Event Discovery in Time Series. SDM 2009.
15. Dragomir Yankov, Eamonn Keogh, and Umaa Rebbapragada. 2008. Disk aware discord discovery: finding unusual time series in terabyte sized datasets. Knowl. Inf. Syst. 17, 2 (November 2008), 241-262.
16. Jessica Lin, Eamonn J. Keogh, Stefano Lonardi, Jeffrey P. Lankford, Donna M. Nystrom: Visually mining and monitoring massive time series. KDD 2004: 460-469.
17. Kumar, N., Lolla, N., Keogh, E., Lonardi, S., Ratanamahatana, C. A. and Wei, L. (2005). Time-series Bitmaps: A Practical Visualization Tool for working with Large Time Series Databases. In proceedings of SIAM International Conference on Data Mining (SDM '05), Newport Beach, CA, April 21-23.
18. Daniel Gruhl, R. Guha, Ravi Kumar, Jasmine Novak, and Andrew Tomkins. 2005. The predictive power of online chatter. In Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining (KDD '05).
19. S. Asur and B. A. Huberman. 2010. Predicting the Future with Social Media. arXiv:1003.5699v1.
20. Nilesh Bansal and Nick Koudas. 2007. BlogScope: a system for online analysis of high volume text streams. In Proceedings of the 33rd international conference on Very large data bases (VLDB '07).
21 (optional). Johan Bollen, Huina Mao, and Xiao-Jun Zeng. Twitter mood predicts the stock market. Journal of Computational Science, 2010.
22 (optional). M. Platakis, D. Kotsakos, D. Gunopulos. 2008. Discovering Hot Topics in the Blogosphere. In Proc. of the 2nd Panhellenic Scientific Student Conference on Informatics, Related Technologies and Applications EUREKA 2008, pp. 122-132.
23. Jae-Gil Lee, Jiawei Han, and Xiaolei Li. 2008. Trajectory Outlier Detection: A Partition-and-Detect Framework. In Proceedings of the 2008 IEEE 24th International Conference on Data Engineering (ICDE '08). IEEE Computer Society, Washington, DC, USA, 140-149.
24. Anna Monreale, Fabio Pinelli, Roberto Trasarti, Fosca Giannotti: WhereNext: a location predictor on trajectory pattern mining. KDD 2009: 637-646.
25. McGovern, Amy; Rosendahl, Derek H; Brown, Rodger A; and Droegemeier, Kelvin K. (2011). Identifying Predictive Multi-Dimensional Time Series Motifs: An application to severe weather prediction. Data Mining and Knowledge Discovery. Volume 22, Issue 1, pages 232-258.
26. Shen-Shyang Ho, Wenqing Tang, W. Timothy Liu: Tropical cyclone event sequence similarity search via dimensionality reduction and metric learning. KDD 2010: 135-144.