500
Study of mathematical concepts used in data science applications. Topics include differentiation and integration of functions, optimization techniques, matrix operations, eigenvalues and eigenvectors, curve fitting, and discrete mathematics.
3
This course covers aspects of probability theory and statistical analysis used in data science. Students will study elementary probability theory, basic combinatorics, conditional probability and independence, Bayes’ rule, random variables, mathematical expectation, discrete and continuous distributions, estimation theory, and tests of hypotheses. This course requires the use of statistical computing with the R programming language for solving sample problems.
3
Prerequisites
DATA 50000 or prior coursework in Calculus
This course covers techniques for knowledge extraction from very large-scale data. Students will learn how to analyze real-world datasets using data mining techniques such as document similarity detection, association rule mining, clustering, link analysis, and predictive modeling. Topics also include applications for e-advertising and recommendation systems.
3
Prerequisites
CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience, or an undergraduate degree in Computer Science
Programming structures and algorithms for large-scale statistical data processing and visualization. Students will use commonly available data analysis software packages to apply concepts and skills to large data sets and will also develop their own code using an object-oriented programming language.
3
Prerequisites
CPSC 50100 or prior programming experience
This course explores statistical techniques for analysis of multivariate data. It covers exploratory factor analysis, multiple regression analysis, multiple discriminant analysis, logistic regression, multivariate analysis of variance and covariance, general linear models, and cluster analysis. Extensive use of statistical software is required.
3
Prerequisites
DATA 50100
The theory and practice of visualizing large, complicated data sets to clarify areas of emphasis. Human factors best practices will be presented. Programming with advanced visualization frameworks and practices will be demonstrated and used in group programming projects.
3
Prerequisites
CPSC 50100, DATA 51100, or prior programming experience
The design and operation of large-scale, cloud-based systems for storing data. Topics include operating system virtualization, distributed network storage, distributed computing, cloud models (IaaS, PaaS, and SaaS), and techniques for securing cloud and virtual systems.
3
Prerequisites
CPSC 50100, DATA 51100, or prior programming experience
This course covers methods and theory related to generating predictive models from labeled datasets. Students will be introduced to computational learning theory, study algorithms for generating predictive models, perform feature selection and hyperparameter tuning, and learn how to evaluate model performance. Examples of supervised machine learning techniques covered in the course include naïve Bayes learning, logistic regression, decision tree induction, support vector machines, and deep neural networks. Other recent developments and state-of-the-art methods related to supervised learning may also be covered. Students will be required to write programs that demonstrate machine learning techniques on real-world datasets.
3
Prerequisites
CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience
This course will survey leading algorithms for unsupervised learning and high-dimensional data analysis. The first part of the course will cover clustering algorithms and generative models of high-dimensional data, such as distance/similarity measures, k-means clustering, hierarchical clustering, Fuzzy C-Means (FCM), Possibilistic C-Means (PCM), Principal Components Analysis (PCA), and Linear Discriminant Analysis (LDA). The second part of the course will cover spectral methods for dimensionality reduction, including multidimensional scaling, spectral clustering, and manifold learning. The third part of the course will cover self-organizing maps (SOMs) as well as an introduction to semi-supervised learning. Other recent developments and state-of-the-art methods related to unsupervised learning may also be covered.
3
Prerequisites
CPSC 50200 or DATA 50000, and CPSC 50100, DATA 51100, or prior programming experience
Expressing relationships among items in a way that enables automated, distributed analysis in an application-independent way; text mining to derive meaning from semantic networks; algorithms for processing semantic networks; developing a web of things.
3
Prerequisites
CPSC 50100, DATA 51100, or prior programming experience
This course provides an introduction to the basic concepts, methodologies, and algorithms of digital image processing, focusing on two major problems: image enhancement and restoration for easier interpretation of images, and image analysis and object recognition. Some advanced image processing and computer vision techniques (e.g., object detection and tracking, or camera models and stereo vision) might also be studied in this course. The primary goal of this course is to lay a solid foundation for students to study advanced image analysis topics such as computer vision systems, biomedical image analysis, and multimedia processing and retrieval.
3
Prerequisites
DATA 50000 and CPSC 50100
The capstone experience for students pursuing the Computer Science concentration in Data Science. Students will develop a solution for a real-world problem in data mining and analytics, document their work in a scholarly report, and present their methodology and results to faculty and peers.
3
Prerequisites
A minimum of 24 hours earned in the MS Data Science program.
In this course, students will work with a faculty advisor on research in the field of Data Science or its applications. The student will research open problems in data science, select a topic for their thesis, and implement novel solutions, which will be documented in a formal thesis. The course will require students to form a thesis committee and defend their thesis before graduating from the program. This course is meant to be repeated three times to fulfill the concentration requirements.
3
Prerequisites
Permission from Data Science Program Director.