500

DATA-50000 Mathematics for Data Scientists

Study of mathematical concepts used in data science applications. Topics include differentiation and integration of functions, optimization techniques, matrix operations, eigenvalues and eigenvectors, curve fitting, and discrete mathematics.

3

DATA-50100 Concepts of Statistics 1

Distribution of random variables, conditional probability and independence, distributions of functions of random variables, and limiting distributions are discussed.

3

Prerequisites

DATA 50000 or prior coursework in Calculus

DATA-51000 Introduction to Data Mining and Analytics

Overview of the field of data mining and analytics. Topics include large-scale file systems and MapReduce, measures of similarity, link analysis, frequent itemsets, clustering, recommendation systems, and e-advertising as an application.

3

Prerequisites

CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience, or an undergraduate degree in Computer Science

DATA-51100 Statistical Programming

Programming structures and algorithms for large-scale statistical data processing and visualization. Students will use commonly available data analysis software packages to apply concepts and skills to large data sets and will also develop their own code using an object-oriented programming language.

3

Prerequisites

CPSC 50100 or prior programming experience

DATA-51200 Concepts of Statistics 2

Multivariate data analysis, preliminary data analysis, exploratory factor analysis, multiple regression analysis, multiple discriminant analysis, logistic regression, MANOVA and GLM, and cluster analysis are discussed.

3

Prerequisites

DATA 50100

DATA-53000 Data Visualization

The theory and practice of visualizing large, complicated data sets to clarify areas of emphasis. Human-factors best practices will be presented. Programming with advanced visualization frameworks and practices will be demonstrated and used in group programming projects.

3

Prerequisites

CPSC 50100, DATA 51100, or prior programming experience

DATA-54000 Large-Scale Data Storage Systems

The design and operation of large-scale, cloud-based systems for storing data. Topics include operating system virtualization, distributed network storage, distributed computing, cloud models (IaaS, PaaS, and SaaS), and techniques for securing cloud and virtual systems.

3

Prerequisites

CPSC 50100, DATA 51100, or prior programming experience

DATA-55000 Machine Learning

Algorithms for enabling artificial systems to learn from experience. Topics include supervised and unsupervised learning, clustering, reinforcement learning, and control. Students will write programs that demonstrate machine learning techniques.

3

Prerequisites

CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience

DATA-59000 Data Science Project for Computer Scientists

The capstone experience for students pursuing the Computer Science concentration in Data Science. Students will develop a solution for a real-world problem in data mining and analytics, document their work in a scholarly report, and present their methodology and results to faculty and peers.

3

Prerequisites

A minimum of 24 hours earned in the MS Data Science program.