Tech Talk: An Analysis of Analysis

title: An Analysis of Analysis

speaker: Charles Parker

time: Friday, 11 May 2012, 10:30am

A basic problem in computer science is binary classification, in which an algorithm applies a binary label to data based on the presence or absence of some phenomenon. Problems of this type abound in areas as diverse as computational biology, multimedia indexing, and anomaly detection. Evaluating the performance of a binary labeling algorithm is itself a complex task, and it often depends on a domain-dependent notion of the relative cost of "false positives" versus "false negatives". Because these costs are often unavailable to researchers and engineers, several methods are commonly used to provide a cost-independent analysis of performance. In this talk, I will examine several of these methods both theoretically and experimentally. The presented results suggest a set of best practices for evaluating binary classification algorithms, while calling into question whether a cost-independent analysis is even possible.

bio: Charles Parker received his Ph.D. in Computer Science in 2007 under Professor Prasad Tadepalli at Oregon State University. His thesis, "Structured Gradient Boosting", presented a gradient-based approach to structured prediction useful in information retrieval and planning domains. From 2007 to 2011, he worked for the Eastman Kodak Company on various problems in data mining, scanned document analysis, and consumer video indexing. He currently works for BigML, Inc., helping to develop a web-scale infrastructure and interface for machine learning. His work has appeared in ICML, AAAI, ICDM, and other notable venues.

