DS4140
Download as PDF
Data Mining
Description
Data mining is the study of efficiently finding structures and patterns in large data sets. We will focus on: (1) converting from a messy and noisy raw data set to a structured and abstract one, (2) applying scalable and probabilistic algorithms to these well-structured abstract data sets, and (3) formally modeling and understanding the error and other consequences of parts (1) and (2), including choice of data representation and trade-offs between accuracy and scalability. These steps are essential for training as a data scientist. Topics will include: similarity search, clustering, regression/dimensionality reduction, graph analysis, PageRank, and small space summaries. We will also cover several recent developments and applications.
Minimum Credits
3
Maximum Credits
3
Repeat for Credit
No
Required Requisite(s):
011153
Semesters Typically Offered
Spring
Fee Amount
$15