DS4140
Data Mining
Description
Data mining is the study of efficiently finding structures and patterns in large data sets. We will focus on (1) converting a messy and noisy raw data set into a structured and abstract one, (2) applying scalable and probabilistic algorithms to these well-structured abstract data sets, and (3) formally modeling and understanding the error and other consequences of steps (1) and (2), including the choice of data representation and trade-offs between accuracy and scalability. These steps are essential for training as a data scientist. Topics will include similarity search, clustering, regression/dimensionality reduction, graph analysis, PageRank, and small space summaries. We will also cover several recent developments and applications.
Minimum Credits
3
Maximum Credits
3
Repeat for Credit
No
Required Requisite(s):
Prerequisites: 'C-' or better in CS 3500 AND DS 3190 AND Foundational Courses (('C-' or better in (CS 1400 AND CS 1410) OR CS 1420) AND ('B-' or better in CS 2420) AND ('C' or better in MATH 1210)) AND Major or Minor in Kahlert School of Computing
Semesters Typically Offered
Spring
Fee Amount
$15
Course Fee Usage
Consumable Instruction Materials