Outlier Detection and Description

The goal of the workshop on Outlier Detection and Description (ODD) is to address outlier mining as the twofold task of outlier detection and outlier description, in other words, the quantitative and qualitative analysis of anomalies in data. These topics are rarely considered in unison, and the literature on these tasks is spread over different research communities. The main goal of ODD is to bridge this gap and provide a venue for knowledge exchange between these research areas, towards a corroborative union of quantitative and qualitative analyses in outlier mining.

Invited Speakers

We are proud to have Charu Aggarwal and Raymond Ng as keynote speakers.

Charu Aggarwal will give a presentation on 'Outlier Ensembles'.
Charu is a Research Scientist at IBM T.J. Watson, New York. His research interests include outlier analysis, graph mining, social networks, data stream mining, and mining high-dimensional data. He has published over 200 papers in refereed conferences and journals as well as 8 books, and has applied for or been granted over 80 patents. His h-index is 56. In January 2013 he published a monograph on Outlier Analysis.
Raymond Ng will give a presentation on 'Outlier Detection in Personalized Medicine'.
Raymond is a Professor of Computer Science at the University of British Columbia, Canada. His research areas include data mining, health informatics, and databases. In recent years, he has been focusing on the analysis of genomics data and text data. Amongst many contributions, he is one of the co-authors of the famous LOF outlier detection algorithm and of one of the first outlier description methods, on Finding Intensional Knowledge.

Each keynote will be 30 minutes long, including questions.

Workshop Program

ODD is a half-day workshop on August 11th, organized in conjunction with ACM SIGKDD 2013.

The program for ODD is:


9:00 Workshop Opening
Keynote Presentation (abstract) (slides)
'Outlier Detection in Personalized Medicine'
by Raymond Ng
9:30 'Enhancing One-class Support Vector Machines for Unsupervised Anomaly Detection' (slides)
by Mennatallah Amer, Markus Goldstein, Slim Abdennadher
9:45 'Systematic Construction of Anomaly Detection Benchmarks from Real Data' (slides)
by Andrew Emmott, Shubhomoy Das, Thomas Dietterich, Alan Fern, Weng-Keen Wong

10:00 Coffee Break

10:30 Keynote Presentation (abstract) (slides)
'Outlier Ensembles' [Related Paper]
by Charu Aggarwal
11:00 'Anomaly Detection on ITS Data via View Association' (slides)
by Junaidillah Fadlil, Hsing-Kuo Pao, Yuh-Jye Lee
11:15 'On-line relevant anomaly detection in the Twitter stream: An Efficient Bursty Keyword Detection Model' (slides)
by Jheser Guzman, Barbara Poblete
11:30 'Distinguishing the Unexplainable from the Merely Unusual: Adding Explanations to Outliers to Discover and Detect Significant Complex Rare Events'
by Ted Senator, Henry Goldberg, Alex Memory
11:45 'Latent Outlier Detection and the Low Precision Problem' (slides)
by Fei Wang, Sanjay Chawla, Didi Surian
12:00 Discussion & Closing


Lunch (on your own)


Important Dates

Submission Deadline: 4th of June 2013, 23:59 PST (extended)
Notification to Authors: 22nd of June 2013, 23:59 PST
Camera-ready Deadline: 3rd of July 2013, 23:59 PST
Workshop Day: 11th of August 2013

What's ODD?

Traditionally, outlier mining and anomaly detection have focused on the automatic detection of highly deviating objects. The problem has been studied for several decades in statistics, machine learning, data mining, and database systems, and this work has led to considerable insight as well as automated systems for the detection of outliers.

However, for today's applications to be successful, the mere identification of anomalies is not enough. With more and more applications using outlier analysis for data exploration and knowledge discovery, the demand for manual verification and understanding of outliers is steadily increasing. Examples include health surveillance, customer segmentation, fraud analysis, and sensor monitoring, where one is particularly interested in why an object seems outlying.

Example: Consider outlier analysis in the domain of health surveillance. An outlier might be a patient who shows high deviation in specific vital signs such as "heart beat rate" and "skin humidity". Merely detecting this patient with a traditional algorithm is not sufficient for health surveillance: health professionals have to be able to verify why this patient stands out in order to provide proper medical treatment. It is a major task for outlier analysis to assist in such manual verification. Hence, outlier mining algorithms should provide additional descriptive information. These outlier descriptions should be easy to understand and should highlight the specific deviation of an outlier in contrast to regular patients.
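As a rough illustration of the health surveillance example above, the following minimal sketch describes a patient by the attributes on which they deviate from regular patients. It uses a plain per-attribute z-score, which is only one simple choice and not a method proposed by the workshop; the function name `describe_outlier`, the threshold, the attribute names, and all patient values are hypothetical.

```python
from statistics import mean, stdev


def describe_outlier(patient, regular_patients, threshold=3.0):
    """Return the attributes on which `patient` deviates strongly.

    Assumption: deviation is measured as a z-score against the regular
    patients; `threshold` is an arbitrary illustrative cut-off.
    """
    description = {}
    for attribute, value in patient.items():
        reference = [p[attribute] for p in regular_patients]
        mu, sigma = mean(reference), stdev(reference)
        if sigma == 0:
            continue  # attribute is constant, so a z-score is undefined
        z = (value - mu) / sigma
        if abs(z) >= threshold:
            description[attribute] = round(z, 1)
    return description


# Hypothetical vital signs, invented purely for illustration.
regular = [
    {"heart_rate": 68, "skin_humidity": 40, "body_temp": 36.6},
    {"heart_rate": 72, "skin_humidity": 42, "body_temp": 36.8},
    {"heart_rate": 70, "skin_humidity": 38, "body_temp": 36.7},
    {"heart_rate": 74, "skin_humidity": 41, "body_temp": 36.9},
    {"heart_rate": 66, "skin_humidity": 39, "body_temp": 36.5},
]
suspect = {"heart_rate": 120, "skin_humidity": 75, "body_temp": 36.7}

result = describe_outlier(suspect, regular)
print(result)
```

In this sketch the description lists only heart rate and skin humidity, while the unremarkable body temperature is left out. That contrast against regular patients is the kind of information a health professional could use to verify the outlier.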

Even though outlier detection has been studied for several decades, the need for outlier descriptions has only recently attracted attention in the data mining community. Mining outlier descriptions is currently being studied in different forms in contrast mining, pattern mining, data compression, graph outlier mining, and subspace outlier mining, as well as in other fields including data visualization, image saliency detection, and astronomy. We strongly believe that there is a significant overlap in the techniques of these different fields and that developments in any one of them can have a significant impact on the others. Therefore, the goal of this workshop is to bring together researchers with a shared interest in outlier detection and outlier description methods, whether for use in traditional databases, graph databases, data streams, or in the processing of other large and complex data sources.

Our aim is hence to bring these and other communities together in one venue. With ODD, our objectives are to: 1) further increase the general interest in this important topic in the broader research community; 2) bring together experts from closely related areas (e.g., outlier detection and contrast mining) to shed light on how this emerging research direction can benefit from other well-established areas; 3) provide a venue for active researchers to exchange ideas and explore important research issues in this area. Overall, the idea behind ODD is that outlier detection and description together will provide novel techniques that assist humans in manual outlier verification through easy-to-understand descriptions, and so will help to advance the state of the art and applicability of outlier mining.

Call for Papers

Topics of interest for the workshop include, but are not limited to:
  • Interleaved detection and description of outliers
    • Description models for given outliers
    • Pattern and local information based outlier description
    • Subspace outliers, feature selection, and space transformations
    • Ensemble methods for anomaly detection and description
    • Descriptive local outlier ranking
    • Identification of outlier rules
    • Finding intensional knowledge
    • Contextual and community outliers
    • Human-in-the-loop modeling and learning
    • Visualization techniques for interactive exploration of outliers
    • Comparative studies on outlier description
  • Related research fields
    • Contrast mining
    • Change and novelty detection
    • Causality analysis
    • Frequent itemset mining
    • Compression theory
    • Subgroup mining
    • Subspace learning
  • Formal outlier mining models
    • Supervised, semi-supervised, and unsupervised models
    • Statistical models
    • Distance-based models
    • Density-based models
    • Spectral models
    • Constraint-based models
    • Ensemble models
  • Outlier mining for complex databases
    • Graph data (e.g. community outliers)
    • Spatio-temporal data
    • Time series and sequential data
    • Online processing of stream data
    • Scalability to high dimensional data
  • Applications of outlier detection and description
    • Fraud in financial data
    • Intrusions in communication networks
    • Sensor network analysis
    • Social network analysis
    • Health surveillance
    • Customer profiling
    • ... and many more ...

Submission Guidelines

Submission is closed.

Program Committee

  • Fabrizio Angiulli, University of Calabria
  • Ira Assent, Aarhus University
  • James Bailey, University of Melbourne
  • Arindam Banerjee, University of Minnesota
  • Albert Bifet, Yahoo! Labs Barcelona
  • Christian Böhm, LMU Munich
  • Rajmonda Caceres, MIT
  • Varun Chandola, Oak Ridge Nat. Lab.
  • Polo Chau, Georgia Tech
  • Sanjay Chawla, University of Sydney
  • Tijl De Bie, University of Bristol
  • Christos Faloutsos, Carnegie Mellon University
  • Jing Gao, University at Buffalo
  • Manish Gupta, Microsoft, India
  • Jaakko Hollmén, Aalto University
  • Eamonn Keogh, University of California – Riverside
  • Matthijs van Leeuwen, KU Leuven
  • Daniel B. Neill, Carnegie Mellon University
  • Naren Ramakrishnan, Virginia Tech
  • Spiros Papadimitriou, Rutgers University
  • Koen Smets, University of Antwerp
  • Hanghang Tong, CUNY
  • Ye Wang, The Ohio State University
  • Arthur Zimek, LMU Munich

Organizers

You can contact us at:
odd13kdd (at) gmail.com