# DATA - Data Science

## DATA-50000 Mathematics for Data Scientists

Study of mathematical concepts used in data science applications. Topics include differentiation and integration of functions, optimization techniques, matrix operations, eigenvalues and eigenvectors, curve fitting, and discrete mathematics.

## DATA-50100 Probability and Statistics for Data Scientists

This course covers aspects of probability theory and statistical analysis used in data science. Students will study elementary probability theory, basic combinatorics, conditional probability and independence, Bayes’ rule, random variables, mathematical expectation, discrete and continuous distributions, estimation theory, and tests of hypotheses. This course requires the use of statistical computing with the R programming language for solving sample problems.

### Prerequisites

DATA 50000 or prior coursework in Calculus

## DATA-51000 Data Mining and Analytics

This course covers techniques for extracting knowledge from very large-scale data. Students will learn how to analyze real-world datasets using data mining techniques such as document similarity detection, association rule mining, clustering, link analysis, and predictive modeling. Topics also include applications for e-advertising and recommendation systems.

### Prerequisites

CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience, or an undergraduate degree in Computer Science

## DATA-51100 Statistical Programming

Programming structures and algorithms for large-scale statistical data processing and visualization. Students will use commonly available data analysis software packages to apply concepts and skills to large data sets and will also develop their own code using an object-oriented programming language.

### Prerequisites

CPSC 50100 or prior programming experience

## DATA-51200 Multivariate Data Analysis

### Prerequisites

DATA 50100

## DATA-53000 Data Visualization

The theory and practice of visualizing large, complicated data sets to clarify areas of emphasis. Human factors best practices will be presented. Programming with advanced visualization frameworks and practices will be demonstrated and used in group programming projects.

### Prerequisites

CPSC 50100, DATA 51100, or prior programming experience

## DATA-54000 Large-Scale Data Storage Systems

The design and operation of large-scale, cloud-based systems for storing data. Topics include operating system virtualization, distributed network storage, distributed computing, cloud models (IaaS, PaaS, and SaaS), and techniques for securing cloud and virtual systems.

### Prerequisites

CPSC 50100, DATA 51100, or prior programming experience

## DATA-55000 Supervised Machine Learning

### Prerequisites

CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience

## DATA-55100 Unsupervised Machine Learning

### Prerequisites

CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience

## DATA-55200 Semantic Web

Expressing relationships among items in a way that enables automated, distributed analysis in an application-independent way; text mining to derive meaning from semantic networks; algorithms for processing semantic networks; developing a web of things.

### Prerequisites

CPSC 50100, DATA 51100, or prior programming experience

## DATA-56600 Digital Image Processing

### Prerequisites

DATA 50000 and CPSC 50100

## DATA-59000 Data Science Master's Project

The capstone experience for students pursuing the Computer Science concentration in Data Science. Students will develop a solution for a real-world problem in data mining and analytics, document their work in a scholarly report, and present their methodology and results to faculty and peers.

### Prerequisites

A minimum of 24 hours earned in the MS Data Science program.

## DATA-59500 Data Science Master's Thesis Research

In this course, students will work with a faculty advisor on research in the field of Data Science or its applications. The student will research open problems in data science, select a topic for their thesis, and implement novel solutions, which will be documented in a formal thesis. The course will require students to form a thesis committee and defend their thesis before graduating from the program. This course is meant to be repeated three times to fulfill the concentration requirements.