MIT Libraries home DSpace@MIT

This collection of MIT Theses in DSpace contains selected theses and dissertations from all MIT departments. Please note that this is NOT a complete collection of MIT theses. To search all MIT theses, use MIT Libraries' catalog.

MIT's DSpace contains more than 58,000 theses completed at MIT dating as far back as the mid-1800s. Theses in this collection have been scanned by the MIT Libraries or submitted in electronic format by thesis authors. Since 2004, all new master's and Ph.D. theses have been scanned and added to this collection after degrees are awarded.

MIT Theses are openly available to all readers. Please share how this access affects or benefits you. Your story matters.

If you have questions about MIT theses in DSpace, email [email protected]. See also Access & Availability Questions or About MIT Theses in DSpace.

If you are a recent MIT graduate, your thesis will be added to DSpace within 3-6 months after your graduation date. Please email [email protected] with any questions.

Permissions

MIT Theses may be protected by copyright. Please refer to the MIT Libraries Permissions Policy for permission information. Note that the copyright holder for most MIT theses is identified on the title page of the thesis.

Theses by Department

  • Comparative Media Studies
  • Computation for Design and Optimization
  • Computational and Systems Biology
  • Department of Aeronautics and Astronautics
  • Department of Architecture
  • Department of Biological Engineering
  • Department of Biology
  • Department of Brain and Cognitive Sciences
  • Department of Chemical Engineering
  • Department of Chemistry
  • Department of Civil and Environmental Engineering
  • Department of Earth, Atmospheric, and Planetary Sciences
  • Department of Economics
  • Department of Electrical Engineering and Computer Sciences
  • Department of Humanities
  • Department of Linguistics and Philosophy
  • Department of Materials Science and Engineering
  • Department of Mathematics
  • Department of Mechanical Engineering
  • Department of Nuclear Science and Engineering
  • Department of Ocean Engineering
  • Department of Physics
  • Department of Political Science
  • Department of Urban Studies and Planning
  • Engineering Systems Division
  • Harvard-MIT Program of Health Sciences and Technology
  • Institute for Data, Systems, and Society
  • Media Arts & Sciences
  • Operations Research Center
  • Program in Real Estate Development
  • Program in Writing and Humanistic Studies
  • Science, Technology & Society
  • Science Writing
  • Sloan School of Management
  • Supply Chain Management
  • System Design & Management
  • Technology and Policy Program

Collections in this community

Doctoral theses, graduate theses, undergraduate theses, recent submissions.

The pulse amplifier in theory and experiment 

Optical studies of the nature of metallic surfaces 

A controlled community for Waterbury, Connecticut 

BSc/MSc Thesis

Our research group offers various interesting topics for a BSc or MSc thesis, the latter both in Computer Science and Scientific Computing. These topics are typically closely related to ongoing research projects (see our Research Page and Publications). Below, we outline the basic procedure you should follow when planning to do a thesis in our group. Please read the following carefully! You also might want to take a quick look at past topics students covered in their theses. Please also note that we currently cannot accommodate all requests for advising a thesis, as we are already advising numerous MSc and BSc theses in the current semester as well as in the upcoming summer semester 2024.

Requirements

A key requirement is that you have taken some advanced courses offered by our group. This includes Data Science for Text Analytics or Complex Network Analysis (ICNA) and the more recent master-level class on Natural Language Processing with Transformers (INLPT). Students should also have some background in machine learning, ideally in combination with NLP. We also strongly recommend that prior to starting a thesis (especially a BSc thesis) in our group, you do an advanced software practical to become familiar with the data and tools we use in many of our projects. Most students typically do this in the semester before they officially start their thesis. Further requirements include

  • very good programming experience with Python (strongly preferred, including frameworks like pandas and NumPy)
  • solid background in statistics and linear algebra
  • (optionally) experience with machine learning frameworks such as PyTorch
  • (optionally) experience with NLP frameworks such as spaCy, gensim, LangChain
  • (optionally) experience with OpenSearch or Elasticsearch
  • knowledge of tools such as GitHub and Docker

It is also advantageous if you have taken some graduate courses in the areas of efficient algorithms (e.g., IEA1) and in particular machine learning (e.g., IML, IFML or IAI). Familiarity with frameworks like scikit-learn, Keras or PyTorch is also helpful.

If you have only taken the undergraduate course introduction to databases (IDB) and none of the other above courses, it is unlikely that we can accommodate your request.

Also make sure that you are familiar with the examination regulations ("Prüfungsordnung") that apply to your program of study.

Getting in Contact

Prior to getting in contact with us you should, of course, read this page in its entirety. If you think your interests and expertise are a good fit for our group and research activities, send an email to Prof. Michael Gertz with the subject "Anfrage BSc Arbeit" or "Anfrage MSc Arbeit" and include the following information:

  • your current transcript (as PDF). You can download this from the LSF.
  • information about your field of application ("Anwendungsfach"), in particular the courses you have taken
  • your programming experience and projects you worked on
  • areas of interest based on the research conducted in our group
  • any other information you think might strengthen your request

We will then review this information and get back to you with the scheduling of an appointment in person to discuss further details.

Thesis Expose

Once we agree on a topic for your thesis, before you officially register for a thesis, we would like to get an idea of how you approach scientific research and whether you are able to do scientific writing. For this, we require that you write an expose of your planned thesis research (see, e.g., here or here). This document is about 4-6 pages and has to include a description of

  • the context of your project and research
  • problem statement(s)
  • objectives and planned approaches
  • related work
  • milestones towards a timely completion of the thesis

Especially for the related work, it is important that you get a good overview early on in your thesis project; of course, your advisor will give you some starting points. Most of the time, such an expose becomes an integral part of the introductory chapter of your thesis, so no time or effort is wasted. The expose needs to be submitted to your advisor on schedule (which you arrange with your advisor), who will then discuss the expose with you and coordinate the next steps. Occasionally we also have students give a 10-15 minute presentation of their research plan in front of the members of our group in order to get further ideas, comments, suggestions, and pointers on their thesis.

Official Registration

In agreement with your advisor, after you have submitted an expose of good quality, you plan for an official start date of the thesis. For this, please fill out the form suitable for your program of study:

  • For officially registering your bachelor's thesis, see here.
  • For officially registering your master's thesis, see here.
  • Registration form for a MSc thesis in Scientific Computing (please see Mrs. Kiesel to obtain a form).

Hand in this form to Prof. Michael Gertz, who will then turn in the signed form.

Thesis Research and Advising

  • Here are some hints on grammar and style we maintain locally.
  • Some easy, purely syntactic hints on writing good research papers (from Prof. Felix Naumann)
  • Dos and don'ts, Universität Heidelberg, Prof. Dr. Anette Frank
  • Leitfaden zur Abfassung wissenschaftlicher Arbeiten, Ruhr-Universität Bochum, Katarina Klein
  • Leitfaden zur Abfassung wissenschaftlicher Arbeiten, TU Dresden, Maria Lieber

In addition, you can find a detailed description of how to write a seminar paper using our template for seminar papers. The hints in this template might also be crucial when you are writing a thesis: [ seminar template .zip ] [ report sample pdf ] [ slides english pdf ] [ slides german pdf ]

Also feel free to ask us for copies of BSc/MSc theses that students have completed in our group.

Thesis Template

  • Thesis template [.zip] ; see a sample PDF here .

Thesis Presentation

  • English LaTeX-Beamer template for the presentation: template [.zip] , sample PDF
  • German LaTeX-Beamer template for the presentation: template [.zip] , sample PDF

Technical University of Munich

  • Data Analytics and Machine Learning Group
  • TUM School of Computation, Information and Technology

Open Topics

We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is given below.

If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential topics.

Graph Neural Networks for Spatial Transcriptomics

Type: Master's Thesis

Prerequisites:

  • Strong machine learning knowledge
  • Proficiency with Python and deep learning frameworks (PyTorch, TensorFlow, JAX)
  • Knowledge of graph neural networks (e.g., GCN, MPNN)
  • Optional: Knowledge of bioinformatics and genomics

Description:

Spatial transcriptomics is a cutting-edge field at the intersection of genomics and spatial analysis, aiming to understand gene expression patterns within the context of tissue architecture. Our project focuses on leveraging graph neural networks (GNNs) to unlock the full potential of spatial transcriptomic data. Unlike traditional methods, GNNs can effectively capture the intricate spatial relationships between cells, enabling more accurate modeling and interpretation of gene expression dynamics across tissues. We seek motivated students to explore novel GNN architectures tailored for spatial transcriptomics, with a particular emphasis on addressing challenges such as spatial heterogeneity, cell-cell interactions, and spatially varying gene expression patterns.

Contact: Filippo Guerranti, Alessandro Palma

References:

  • Cell clustering for spatial transcriptomics data with graph neural network
  • Unsupervised spatially embedded deep representation of spatial transcriptomics
  • SpaGCN: Integrating gene expression, spatial location and histology to identify spatial domains and spatially variable genes by graph convolutional network
  • DeepST: identifying spatial domains in spatial transcriptomics by deep learning
  • Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder
  • GCNG: graph convolutional networks for inferring gene interaction from spatial transcriptomics data

Generative Models for Drug Discovery

Type: Master's Thesis / Guided Research

  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • Knowledge of graph neural networks (e.g. GCN, MPNN)
  • No formal education in chemistry, physics or biology needed!

Effectively designing molecular geometries is essential to advancing pharmaceutical innovations, a domain that has received great attention through the success of generative models. These models promise a more efficient exploration of the vast chemical space and generation of novel compounds with specific properties by leveraging their learned representations, potentially leading to the discovery of molecules with unique properties that would otherwise go undiscovered. Our topics lie at the intersection of generative models like diffusion/flow matching models and graph representation learning, e.g., graph neural networks. The focus of our projects can be model development with an emphasis on downstream tasks (e.g., diffusion guidance at inference time) and a better understanding of the limitations of existing models.

Contact: Johanna Sommer, Leon Hetzel

  • Equivariant Diffusion for Molecule Generation in 3D
  • Equivariant Flow Matching with Hybrid Probability Transport for 3D Molecule Generation
  • Structure-based Drug Design with Equivariant Diffusion Models

Efficient Machine Learning: Pruning, Quantization, Distillation, and More - DAML x Pruna AI

Type: Master's Thesis / Guided Research / Hiwi

  • Strong knowledge in machine learning
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch)

The efficiency of machine learning algorithms is commonly evaluated by looking at target performance, speed and memory footprint metrics. Reducing the costs associated with these metrics is of primary importance for real-world applications with limited resources (e.g. embedded systems, real-time predictions). In this project, you will work in collaboration with the DAML research group and the Pruna AI startup on investigating solutions to improve the efficiency of machine learning models by looking at multiple techniques like pruning, quantization, distillation, and more.
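To make two of the techniques named above concrete, here is a minimal sketch of magnitude pruning and symmetric uniform quantization on a flat list of weights. The function names `magnitude_prune` and `quantize_uniform` are hypothetical and chosen for illustration; they are not taken from the project or from any library.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)  # number of weights to remove
    if k == 0:
        return list(weights)
    # Indices sorted by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:k])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]


def quantize_uniform(weights, num_bits=8):
    """Symmetric uniform quantization: snap each float to a signed integer grid."""
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    levels = 2 ** (num_bits - 1) - 1  # e.g. 127 for 8 bits
    scale = max_abs / levels          # width of one quantization step
    # Return de-quantized values so the rounding error is easy to inspect.
    return [round(w / scale) * scale for w in weights]


weights = [0.5, -0.1, 0.05, -2.0]
print(magnitude_prune(weights, sparsity=0.5))
print(quantize_uniform(weights, num_bits=8))
```

Pruning trades accuracy for sparsity, while quantization trades it for a smaller number format. In practice both are usually followed by an evaluation of the target metric, which this sketch leaves out.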

Contact: Bertrand Charpentier

  • The Efficiency Misnomer
  • A Gradient Flow Framework for Analyzing Network Pruning
  • Distilling the Knowledge in a Neural Network
  • A Survey of Quantization Methods for Efficient Neural Network Inference

Deep Generative Models

Type: Master's Thesis / Guided Research

  • Strong machine learning and probability theory knowledge
  • Knowledge of generative models and their basics (e.g., Normalizing Flows, Diffusion Models, VAE)
  • Optional: Neural ODEs/SDEs, Optimal Transport, Measure Theory

With recent advances, such as Diffusion Models, Transformers, Normalizing Flows, Flow Matching, etc., the field of generative models has gained significant attention in the machine learning and artificial intelligence research community. However, many problems and questions remain open, and the application to complex data domains such as graphs, time series, point processes, and sets is often non-trivial. We are interested in supervising motivated students to explore and extend the capabilities of state-of-the-art generative models for various data domains.

Contact: Marcel Kollovieh, David Lüdke

  • Flow Matching for Generative Modeling
  • Auto-Encoding Variational Bayes
  • Denoising Diffusion Probabilistic Models 
  • Structured Denoising Diffusion Models in Discrete State-Spaces

Active Learning for Multi Agent 3D Object Detection 

Type: Master's Thesis. Industrial partner: BMW

Prerequisites: 

  • Strong knowledge in machine learning 
  • Knowledge in Object Detection 
  • Excellent programming skills 
  • Proficiency with Python and deep learning frameworks (TensorFlow or PyTorch) 

Description: 

In autonomous driving, state-of-the-art deep neural networks are used for perception tasks such as 3D object detection. To provide promising results, these networks often require a lot of complex annotation data for training. These annotations are often costly and redundant. Active learning is used to select the most informative samples for annotation and cover a dataset with as little annotated data as possible.

The objective is to explore active learning approaches for 3D object detection using combined uncertainty and diversity based methods.  
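One simple way to combine uncertainty and diversity is sketched below. It scores each unlabeled sample by the entropy of its predicted class probabilities plus its distance to the closest sample already chosen, and picks greedily. This is only an illustration under assumptions: `select_batch`, `diversity_weight` and the use of Euclidean distance on feature vectors are hypothetical choices, not the method used in the project or in the referenced papers.

```python
import math


def entropy(probs):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_batch(probs, features, budget, diversity_weight=1.0):
    """Greedily pick `budget` sample indices, trading off uncertainty and diversity."""
    selected = []
    candidates = set(range(len(probs)))
    while candidates and len(selected) < budget:

        def score(i):
            # Diversity term: distance to the closest already-selected sample.
            if selected:
                div = min(math.dist(features[i], features[j]) for j in selected)
            else:
                div = 0.0
            return entropy(probs[i]) + diversity_weight * div

        # Iterate in sorted order so ties are broken by the lowest index.
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


probs = [[0.5, 0.5], [0.5, 0.5], [0.9, 0.1], [0.99, 0.01]]
features = [[0.0, 0.0], [0.0, 0.0], [5.0, 0.0], [0.1, 0.0]]
print(select_batch(probs, features, budget=2))
```

With `diversity_weight=0.0` the selection reduces to pure uncertainty sampling, which would pick the two identical, maximally uncertain samples and so spend the annotation budget on redundant data.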

Contact: Sebastian Schmidt

References: 

  • Exploring Diversity-based Active Learning for 3D Object Detection in Autonomous Driving   
  • Efficient Uncertainty Estimation for Semantic Segmentation in Videos   
  • KECOR: Kernel Coding Rate Maximization for Active 3D Object Detection
  • Towards Open World Active Learning for 3D Object Detection   

Graph Neural Networks

Type: Master's thesis / Bachelor's thesis / guided research

  • Knowledge of graph/network theory

Graph neural networks (GNNs) have recently achieved great successes in a wide variety of applications, such as chemistry, reinforcement learning, knowledge graphs, traffic networks, or computer vision. These models leverage graph data by updating node representations based on messages passed between nodes connected by edges, or by transforming node representations using spectral graph properties. These approaches are very effective, but many theoretical aspects of these models remain unclear and there are many possible extensions to improve GNNs and go beyond the nodes' direct neighbors and simple message aggregation.
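The message-passing idea described above can be sketched in a few lines. The example below performs one round of mean aggregation over direct neighbors on an undirected graph. It is deliberately stripped down: there are no learned weights and no nonlinearity, and the function name `message_passing_step` is hypothetical.

```python
def message_passing_step(features, edges):
    """One round of mean-aggregation message passing on an undirected graph.

    features: dict mapping node -> list of floats (the node representation)
    edges: list of (u, v) pairs
    """
    neighbors = {node: [] for node in features}
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    updated = {}
    for node, h in features.items():
        # Include the node itself so isolated nodes keep their representation.
        group = [h] + [features[n] for n in neighbors[node]]
        # Average each feature dimension over the node and its neighbors.
        updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated


features = {0: [1.0], 1: [3.0], 2: [5.0]}
edges = [(0, 1), (1, 2)]
print(message_passing_step(features, edges))
```

Each call only mixes information from direct neighbors, so reaching nodes k hops away takes k rounds. This is the limitation the paragraph alludes to when it mentions going beyond direct neighbors and simple aggregation.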

Contact: Simon Geisler

  • Semi-supervised classification with graph convolutional networks
  • Relational inductive biases, deep learning, and graph networks
  • Diffusion Improves Graph Learning
  • Weisfeiler and Leman go neural: Higher-order graph neural networks
  • Reliable Graph Neural Networks via Robust Aggregation

Physics-aware Graph Neural Networks

Type: Master's thesis / guided research

  • Proficiency with Python and deep learning frameworks (JAX or PyTorch)
  • Knowledge of graph neural networks (e.g. GCN, MPNN, SchNet)
  • Optional: Knowledge of machine learning on molecules and quantum chemistry

Deep learning models, especially graph neural networks (GNNs), have recently achieved great successes in predicting quantum mechanical properties of molecules. There is a vast number of applications for these models, such as finding the best method of chemical synthesis or selecting candidates for drugs, construction materials, batteries, or solar cells. However, GNNs have only been proposed in recent years and there remain many open questions about how to best represent and leverage quantum mechanical properties and methods.

Contact: Nicholas Gao

  • Directional Message Passing for Molecular Graphs
  • Neural message passing for quantum chemistry
  • Learning to Simulate Complex Physics with Graph Networks
  • Ab initio solution of the many-electron Schrödinger equation with deep neural networks
  • Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
  • Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds

Robustness Verification for Deep Classifiers

Type: Master's thesis / Guided research

  • Strong machine learning knowledge (at least equivalent to IN2064 plus an advanced course on deep learning)
  • Strong background in mathematical optimization (preferably in a machine learning setting)
  • Proficiency with Python and deep learning frameworks (PyTorch or TensorFlow)
  • (Preferred) Knowledge of training techniques to obtain classifiers that are robust against small perturbations in data

Description: Recent work shows that deep classifiers suffer in the presence of adversarial examples: misclassified points that are very close to the training samples or even visually indistinguishable from them. This undesired behaviour constrains the possibilities of deployment in safety-critical scenarios for promising classification methods based on neural nets. Therefore, new training methods should be proposed that promote (or preferably ensure) robust behaviour of the classifier around training samples.
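As a small illustration of how an adversarial example can arise, the sketch below applies the fast gradient sign method to a logistic regression classifier instead of a deep network. The weights, input and perturbation budget `eps` are made-up values, and the helper names `predict` and `fgsm` are hypothetical.

```python
import math


def predict(w, b, x):
    """Probability of class 1 under a logistic regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(w, b, x, y, eps):
    """Fast gradient sign method for logistic regression with label y in {0, 1}."""
    p = predict(w, b, x)
    # The gradient of the cross-entropy loss w.r.t. the input x is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    # Step of size eps in the direction that increases the loss, per coordinate.
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


w, b = [2.0, -1.0], 0.0
x, y = [0.2, 0.1], 1

x_adv = fgsm(w, b, x, y, eps=0.2)
print(predict(w, b, x))      # above 0.5: the clean input is classified as class 1
print(predict(w, b, x_adv))  # below 0.5: the perturbed input flips to class 0
```

No coordinate moves by more than `eps`, yet the predicted class changes. Certification methods such as those in the references aim to prove that no perturbation within such a budget can change the prediction.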

Contact: Aleksei Kuvshinov

References (Background):

  • Intriguing properties of neural networks
  • Explaining and harnessing adversarial examples
  • SoK: Certified Robustness for Deep Neural Networks
  • Certified Adversarial Robustness via Randomized Smoothing
  • Formal guarantees on the robustness of a classifier against adversarial manipulation
  • Towards deep learning models resistant to adversarial attacks
  • Provable defenses against adversarial examples via the convex outer adversarial polytope
  • Certified defenses against adversarial examples
  • Lipschitz-margin training: Scalable certification of perturbation invariance for deep neural networks

Uncertainty Estimation in Deep Learning

Type: Master's Thesis / Guided Research

  • Strong knowledge in probability theory

Safe prediction is a key feature in many intelligent systems. Classically, Machine Learning models compute output predictions regardless of the underlying uncertainty of the encountered situations. In contrast, aleatoric and epistemic uncertainty bring knowledge about undecidable and uncommon situations. The uncertainty view can be a substantial help to detect and explain unsafe predictions, and therefore make ML systems more robust. The goal of this project is to improve the uncertainty estimation in ML models in various types of tasks.
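The split between aleatoric and epistemic uncertainty can be sketched with an ensemble, which is one common estimator among several and an assumption here, not something the project prescribes. The total uncertainty is the entropy of the averaged prediction, the aleatoric part is the average entropy of the individual members, and the epistemic part is the difference between the two. The function name `decompose_uncertainty` is hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of a discrete distribution (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def decompose_uncertainty(member_probs):
    """Split predictive uncertainty for one input.

    member_probs: list of class-probability lists, one per ensemble member.
    Returns (total, aleatoric, epistemic).
    """
    n = len(member_probs)
    mean = [sum(col) / n for col in zip(*member_probs)]
    total = entropy(mean)                                     # entropy of the average
    aleatoric = sum(entropy(p) for p in member_probs) / n     # average entropy
    epistemic = total - aleatoric                             # member disagreement
    return total, aleatoric, epistemic


# Members agree that the input is ambiguous: the uncertainty is aleatoric.
print(decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]]))
# Members are confident but contradict each other: the uncertainty is epistemic.
print(decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]]))
```

Both inputs have the same total uncertainty, but only the second is a case where more training data could plausibly help, which is why the distinction matters for detecting unfamiliar inputs.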

Contact: Tom Wollschläger, Dominik Fuchsgruber, Bertrand Charpentier

  • Can You Trust Your Model’s Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift
  • Predictive Uncertainty Estimation via Prior Networks
  • Posterior Network: Uncertainty Estimation without OOD samples via Density-based Pseudo-Counts
  • Evidential Deep Learning to Quantify Classification Uncertainty
  • Weight Uncertainty in Neural Networks

Hierarchies in Deep Learning

Type: Master's Thesis / Guided Research

Multi-scale structures are ubiquitous in real-life datasets. As an example, phylogenetic nomenclature naturally reveals a hierarchical classification of species based on their evolutionary history. Learning multi-scale structures can help to reveal natural and meaningful organizations in the data and also to obtain compact data representations. The goal of this project is to leverage multi-scale structures to improve the speed, performance and understanding of Deep Learning models.

Contact: Marcel Kollovieh, Bertrand Charpentier

  • Tree Sampling Divergence: An Information-Theoretic Metric for Hierarchical Graph Clustering
  • Hierarchical Graph Representation Learning with Differentiable Pooling
  • Gradient-based Hierarchical Clustering
  • Gradient-based Hierarchical Clustering using Continuous Representations of Trees in Hyperbolic Space

Northeastern University

Academic Catalog 2023-2024

  • Data Science

The Bachelor of Science in Data Science studies the collection, manipulation, storage, retrieval, and computational analysis of data in its various forms, including numeric, textual, image, and video data from small to large volumes. The program combines computer science, information science, mathematics, statistics, and probability theory into an integrated curriculum that is designed to prepare students for careers or graduate studies in Big Data analysis, data science, and data analytics. The coursework covers exploratory data analysis, data manipulation in a variety of programming languages, large-scale data storage, predictive analytics, machine learning, data mining, and information visualization and presentation. Data science has emerged as a discipline due to the confluence of two major events:

  • The ability to collect, store, prune, process, and transmit large amounts of data in the cloud
  • The convergence of programming, statistics, artificial intelligence, and visualization as complementary tools for the analysis and understanding of data

Bachelor of Science (BS)

DS 1990. Elective. (1-4 Hours)

Offers elective credit for courses taken at other academic institutions. May be repeated without limit.

DS 2000. Programming with Data. (2 Hours)

Introduces programming for data and information science through case studies in business, sports, education, social science, economics, and the natural world. Presents key concepts in programming, data structures, and data analysis through Python and Excel. Integrates the use of data analytics libraries and tools. Surveys techniques for acquiring and programmatically integrating data from different sources. Explains the data analytics pipeline and how to apply programming at each stage. Discusses the programmatic retrieval of data from application programming interfaces (APIs) and from databases. Introduces predictive analytics for forecasting and classification. Demonstrates the limitations of statistical techniques.

Corequisite(s): DS 2001

Attribute(s): NUpath Analyzing/Using Data

DS 2001. Data Science Programming Practicum. (2 Hours)

Applies data science principles in interdisciplinary contexts, with each section focusing on applications to a different discipline. Involves new experiments and readings in multiple disciplines (both computer science and the discipline focus of the particular section). Requires multiple projects combining interdisciplinary subjects.

Corequisite(s): DS 2000

DS 2500. Intermediate Programming with Data. (4 Hours)

Offers intermediate to advanced Python programming for data science. Covers object-oriented design patterns using Python, including encapsulation, composition, and inheritance. Advanced programming skills cover software architecture, recursion, profiling, unit testing and debugging, lineage and data provenance, using advanced integrated development environments, and software control systems. Uses case studies to survey key concepts in data science with an emphasis on machine-learning (classification, clustering, deep learning); data visualization; and natural language processing. Additional assigned readings survey topics in ethics, model bias, and data privacy pertinent to today's big data world. Offers students an opportunity to prepare for more advanced courses in data science and to enable practical contributions to software development and data science projects in a commercial setting.

Prerequisite(s): DS 2000 with a minimum grade of D-

Corequisite(s): DS 2501

DS 2501. Lab for DS 2500. (1 Hour)

Practices the programming techniques discussed in DS 2500 through hands-on experimentation.

Corequisite(s): DS 2500

DS 2990. Elective. (1-4 Hours)

DS 2991. Research in Data Science. (1-4 Hours)

Offers an opportunity to conduct introductory-level research or creative endeavors under faculty supervision.

DS 3000. Foundations of Data Science. (4 Hours)

Introduces core modern data science technologies and methods that provide a foundation for subsequent Data Science classes. Covers: working with tensors and applied linear algebra in standard numerical computing libraries (e.g., NumPy); processing and integrating data from a variety of structured and unstructured sources; introductory concepts in probability, statistics, and machine learning; basic data visualization techniques; and now standard data science tools such as Jupyter notebooks.

Prerequisite(s): CS 2510 with a minimum grade of D- or DS 2500 with a minimum grade of D-

Attribute(s): NUpath Analyzing/Using Data, NUpath Natural/Designed World

DS 3500. Advanced Programming with Data. (4 Hours)

Offers a deep dive into the design and implementation of enterprise-grade software systems with an emphasis on software architectures for more complex data-driven applications. Covers extensible architectures that support testing, data provenance, reuse, maintainability, scalability, and robustness and building software APIs and libraries for wide-scale adoption and ease of use. Students design, implement, and test complex loosely coupled service-oriented architectures using distributed processing, stream-based data processing, and interprocess communication via message passing. Explores the features, capabilities, and underlying design of popular data analysis and visualization frameworks.

Prerequisite(s): DS 2500 with a minimum grade of D-

DS 3990. Elective. (1-4 Hours)

DS 4200. Information Presentation and Visualization. (4 Hours)

Introduces foundational principles, methods, and techniques of visualization to enable creation of effective information representations suitable for exploration and discovery. Covers the design and evaluation process of visualization creation, visual representations of data, relevant principles of human vision and perception, and basic interactivity principles. Studies data types and a wide range of visual data encodings and representations. Draws examples from physics, biology, health science, social science, geography, business, and economics. Emphasizes good programming practices for both static and interactive visualizations. Creates visualizations in Excel and Tableau as well as R, Python, and open web-based authoring libraries. Requires programming in Python, JavaScript, HTML, and CSS. Requires extensive writing including documentation, explanations, and discussions of the findings from the data analyses and the visualizations.

Attribute(s): NUpath Analyzing/Using Data, NUpath Writing Intensive

DS 4300. Large-Scale Information Storage and Retrieval. (4 Hours)

Introduces data and information storage approaches for structured and unstructured data. Covers how to build large-scale information storage structures using distributed storage facilities. Explores data quality assurance, storage reliability, and challenges of working with very large data volumes. Studies how to model multidimensional data. Implements distributed databases. Considers multitier storage design, storage area networks, and distributed data stores. Applies algorithms, including graph traversal, hashing, and sorting, to complex data storage systems. Considers complexity theory and hardness of large-scale data storage and retrieval. Requires use of nonrelational, document, key-column, key-value, and graph databases and programming in R, Python, and C++.

Prerequisite(s): CS 3200 with a minimum grade of D-; (DS 4100 with a minimum grade of D- or DS 3000 with a minimum grade of D-)

DS 4400. Machine Learning and Data Mining 1. (4 Hours)

Introduces supervised and unsupervised predictive modeling, data mining, and machine-learning concepts. Uses tools and libraries to analyze data sets, build predictive models, and evaluate the fit of the models. Covers common learning algorithms, including dimensionality reduction, classification, principal-component analysis, k-NN, k-means clustering, gradient descent, regression, logistic regression, regularization, multiclass data and algorithms, boosting, and decision trees. Studies computational aspects of probability, statistics, and linear algebra that support algorithms, including sampling theory and computational learning. Requires programming in R and Python. Applies concepts to common problem domains, including recommendation systems, fraud detection, or advertising.

Prerequisite(s): ((DS 4100 with a minimum grade of D- or DS 3000 with a minimum grade of D-); (CS 2810 with a minimum grade of D- or ECON 2350 with a minimum grade of D- or ENVR 2500 with a minimum grade of D- or MATH 3081 with a minimum grade of D- or MGSC 2301 with a minimum grade of D- or PHTH 2210 with a minimum grade of D- or PSYC 2320 with a minimum grade of D-)) or (CS 2810 with a minimum grade of D-; CS 3500 with a minimum grade of D-)

Attribute(s): NUpath Analyzing/Using Data, NUpath Capstone Experience, NUpath Writing Intensive

DS 4420. Machine Learning and Data Mining 2. (4 Hours)

Continues with supervised and unsupervised predictive modeling, data mining, and machine-learning concepts. Covers mathematical and computational aspects of learning algorithms, including kernels, time-series data, collaborative filtering, support vector machines, neural networks, Bayesian learning and Monte Carlo methods, multiple regression, and optimization. Uses mathematical proofs and empirical analysis to assess validity and performance of algorithms. Studies additional computational aspects of probability, statistics, and linear algebra that support algorithms. Requires programming in R and Python. Applies concepts to common problem domains, including spam filtering.

Prerequisite(s): DS 4400 with a minimum grade of D-

DS 4440. Practical Neural Networks. (4 Hours)

Offers a hands-on introduction to modern neural network ("deep learning") methods and tools. Covers fundamentals of neural networks and introduces standard and new architectures from simple feedforward networks to recurrent and “transformer” architectures. Also covers stochastic gradient descent and backpropagation, along with related parameter estimation techniques. Emphasizes using these technologies in practice, via modern toolkits. Reviews applications of these models to various types of data, including images and text.

Prerequisite(s): DS 4400 (may be taken concurrently) with a minimum grade of D-

DS 4970. Junior/Senior Honors Project 1. (4 Hours)

Focuses on an in-depth project in which a student conducts research or produces a product related to the student’s major field. Combined with Junior/Senior Project 2 or a college-defined equivalent for an 8-credit honors-in-the-discipline project.

DS 4971. Junior/Senior Honors Project 2. (4 Hours)

Focuses on the second semester of an in-depth project in which a student conducts research or produces a product related to the student’s major field.

Prerequisite(s): DS 4970 with a minimum grade of D-

DS 4973. Topics in Data Science. (4 Hours)

Offers a lecture course in data science on a topic not regularly taught in a formal course. Topics may vary from offering to offering. May be repeated up to four times.

Prerequisite(s): CS 3000 with a minimum grade of D- ; ( CS 3500 with a minimum grade of D- or DS 3500 with a minimum grade of D- )

DS 4990. Elective. (1-4 Hours)

DS 4991. Research. (4 Hours)

Offers an opportunity to conduct research under faculty supervision.

Attribute(s): NUpath Integration Experience

DS 4992. Directed Study. (1-4 Hours)

Offers independent work under the direction of members of the department on a chosen topic. May be repeated without limit.

DS 4996. Experiential Education Directed Study. (1-4 Hours)

Draws upon the student’s approved experiential activity and integrates it with study in the academic major. Restricted to those students who are using it to fulfill their experiential education requirement. May be repeated without limit.

DS 4997. Data Science Thesis. (4 Hours)

Offers students an opportunity to prepare an undergraduate thesis under faculty supervision.

DS 4998. Data Science Thesis Continuation. (4 Hours)

Offers students an opportunity to continue preparing an undergraduate thesis under faculty supervision.

DS 5010. Introduction to Programming for Data Science. (4 Hours)

Offers an introductory course on fundamentals of programming and data structures. Covers lists, arrays, trees, hash tables, etc.; program design, programming practices, testing, debugging, maintainability, data collection techniques, and data cleaning and preprocessing. Includes a class project, where students use the concepts covered to collect data from the web, clean and preprocess the data, and make it ready for analysis.

DS 5020. Introduction to Linear Algebra and Probability for Data Science. (4 Hours)

Offers an introductory course on the basics of statistics, probability, and linear algebra. Covers random variables, frequency distributions, measures of central tendency, measures of dispersion, moments of a distribution, discrete and continuous probability distributions, chain rule, Bayes’ rule, correlation theory, basic sampling, matrix operations, trace of a matrix, norms, linear independence and ranks, inverse of a matrix, orthogonal matrices, range and null-space of a matrix, the determinant of a matrix, positive semidefinite matrices, eigenvalues, and eigenvectors.

DS 5110. Introduction to Data Management and Processing. (4 Hours)

Introduces students to the core tasks in data science, including data collection, storage, tidying, transformation, processing, management, and modeling for the purpose of extracting knowledge from raw observations. Programming is a cross-cutting aspect of the course. Offers students an opportunity to gain experience with data science tasks and tools through short assignments. Includes a term project based on real-world data.

DS 5220. Supervised Machine Learning and Learning Theory. (4 Hours)

Introduces supervised machine learning, which is the study and design of algorithms that enable computers/machines to learn from experience or data, given examples of data with a known outcome of interest. Offers a broad view of models and algorithms for supervised decision making. Discusses the methodological foundations behind the models and the algorithms, as well as issues of practical implementation and use, and techniques for assessing the performance. Includes a term project involving programming and/or work with real-world data sets. Requires proficiency in a programming language such as Python, R, or MATLAB.

Attribute(s): NUpath Capstone Experience, NUpath Writing Intensive

DS 5230. Unsupervised Machine Learning and Data Mining. (4 Hours)

Introduces unsupervised machine learning and data mining, which is the process of discovering and summarizing patterns from large amounts of data, without examples of data with a known outcome of interest. Offers a broad view of models and algorithms for unsupervised data exploration. Discusses the methodological foundations behind the models and the algorithms, as well as issues of practical implementation and use, and techniques for assessing the performance. Includes a term project involving programming and/or work with real-life data sets. Requires proficiency in a programming language such as Python, R, or MATLAB.

DS 5500. Data Science Capstone. (4 Hours)

Offers students a capstone opportunity to practice data science skills learned in previous courses and to build a portfolio. Students practice visualization, data wrangling, and machine learning skills by applying them to semester-long term projects on real-world data. Students may either propose their own projects or choose from a selection of industry options. Emphasizes the overall data science process, including identification of the scientific problem, selection of appropriate machine learning methods, and visualization and communication of results. Lectures may include additional topics, including visualization, communication, and data science ethics.

Prerequisite(s): ( CS 5800 with a minimum grade of C- or EECE 7205 with a minimum grade of C- ); DS 5110 with a minimum grade of C- ; DS 5220 with a minimum grade of C- ; DS 5230 with a minimum grade of C-

Data Science Minor

Undergraduate Research

Many students who pursue the data science minor hope to put their data science skills into practice before graduation. Whether you are interested in researching a data science topic or simply want to apply your skills, undergraduate research can be a highly beneficial experience.

On This Page

  • Data Science as a Research Topic
  • Data Science as a Skill for Research
  • Pathways into Undergraduate Research
  • Scope of Research Experience
  • Envisioning a Research Experience
  • Helpful Terminology
  • Finding Research Opportunities
  • Reaching Out to Faculty
  • Data Science Research Skills

Data Science as a Research Topic  

Data science as a research topic involves investigating data science itself and contributes to the knowledge and understanding of data science methods and principles. In other words, researchers are exploring questions, problems, or topics related to data science. This can involve theoretical or empirical research on data science methodologies, algorithms, tools, best practices, or emerging trends. Research topics in data science might include:

  • Development and optimization of machine learning algorithms.
  • Evaluation of data mining techniques for specific applications.
  • Studies on data preprocessing and feature engineering.
  • Ethical considerations in data science and AI.
  • Investigations into the social, economic, or policy implications of data science and big data.

Researchers working on data science as a research topic aim to advance the field itself, contribute to the knowledge base, and often publish their findings in academic journals or conferences specific to data science and related disciplines.

Data Science as a Skill for Research

Data science as a skill for research involves using data science techniques and tools as a means to conduct research in other fields, applying data science methods to answer questions or address problems within a specific discipline. Researchers use data science techniques to collect, analyze, and derive insights from data that are relevant to their primary area of study. Data science skills are considered a means to an end in this context, rather than the main focus of research. Examples include:

  • A biologist using data science methods to analyze genetic data.
  • An economist using data science for economic modeling and forecasting.
  • A sociologist using data science for social network analysis.
  • A psychologist using data science for analyzing behavioral data.
  • A public health researcher using data science to analyze epidemiological data.

In these cases, data science skills are applied to enhance the quality and depth of research within a specific discipline. The primary goal is to answer research questions or address issues in the primary domain of study, and data science is a tool to achieve that goal.

There are two primary pathways for undergraduate students to get started in research - independently reaching out to a mentor or getting started through an organized program. Independent outreach is the most common way students get involved in research. This involves identifying a research mentor/faculty whose research aligns with your interests and reaching out to them through email. Students can also apply to an organized program to get involved in research. Organized programs have specific application processes, deadlines, and expectations for each program. To learn more, visit the Office of Undergraduate Research How to Get Started page or read on for more information specific to data science minors. 

Scope of Research Experience

Undergraduate students engage in research at all points of the research process. Some may be interested in researching their own question and engaging in a full research cycle (following a project from start to finish), while others may be interested in working on a project already in motion or at a particular point in the research process. Both independent outreach and organized programs can provide students with opportunities that fit their desired research experience and goals. Many structured/organized undergraduate research programs support students who are working extremely independently.

It is important to reflect on why you are interested in research and your goals for the experience. Before you start exploring opportunities, reflect on the following: 

  • Independence:  How much independence are you looking for - both in the research experience and the research topic/project? Some experiences may offer a higher degree of independence in choosing the research topic, designing experiments, and conducting research. Do you want to experience a full research cycle (working on a project from start to finish) or be part of a particular step in the research process? Do you want to be part of a larger research program with defined goals and objectives?
  • Flexibility: How much flexibility are you looking for, both in terms of time commitment and research topic? Students working 1:1 with a faculty member may have more flexibility in terms of the research project's focus, timelines, and methodologies. They may be able to explore areas of personal interest. Structured programs may provide students with specific program timelines (e.g., summer experiences) and hour requirements. How much time each week do you have to dedicate to research? What other commitments do you have?
  • Faculty Mentor: All students will work with a faculty mentor who guides and advises them throughout the research process. The mentor provides support, reviews progress, and offers expertise. What characteristics do you want in a research mentor? What questions could you ask a faculty member to better understand their mentorship philosophy?
  • Credit-Based:  Many students hope to earn academic credit for their work. Is this something that is important to you? Do you hope to use this credit to count for major or minor requirements? 
  • Duration: The duration of research projects can vary widely, from a single quarter to multiple years, depending on the student's goals and availability. Some structured programs may have a set duration, such as a summer research program, which typically lasts for a few months. Sometimes it is possible for a student to work with a faculty member for multiple years as an undergrad, and go on to work in a full time capacity in the lab. Reflect on your long term goals for the work and communicate those with potential mentors. 
  • Resources/Getting Paid: Different research opportunities will have access to different types of research facilities and resources. Do you want to get paid? Independent outreach may lead to more frequent unpaid opportunities compared to structured/funded programs, but there are still many ways to fund your work. Structured programs often provide more resources, including funding, equipment, and facilities.
  • Research Setting:  What type of setting are you interested in? A lab, field work, library, virtual, hospital, community organization, non-profit, clinic, etc? If you don’t know what you may like, how could you gain more insight?
  • Research Topic: Is there a particular academic topic or question you are hoping to investigate? Are you open to the topic and more interested in the research skill you might get to practice? 
  • Publication and Presentation: Are you interested in presenting and publishing your work? Talk with your faculty mentor to explore options in this area. All students are encouraged to present at the Undergraduate Research Symposium.  
  • Personal Goals: What other goals do you have for getting involved? What skills are you hoping to practice or learn? Reflecting on your personal goals will help your faculty mentor help you get the most out of your experience. 

Helpful Terminology 

PI: Principal Investigator (person leading the research)

Research Center/Institute

  • Founded and funded for doing research, group of labs/projects
  • Exists at universities, hospitals, non-profits, government, think tanks, etc.
  • Multidisciplinary, works across disciplines
  • Research centers tend to engage in long-term, ongoing research efforts, often spanning multiple projects

Research Lab/Group 

  • A research lab is a smaller, more specialized facility or unit within a research center, university department, or organization
  • They may be run by a principal investigator or lead researcher and consist of a team of researchers, technicians, and students
  • Research labs are more project-oriented and often have a defined timeline for their research efforts

Research Project

  • A research project is a specific, time-limited endeavor with a well-defined research question, objectives, and scope
  • It can be conducted within a research lab, research center, or as an independent effort
  • Research projects have a specific duration and are designed to answer a particular research question or solve a specific problem

Finding Research Opportunities 

Often the most time-consuming part of undergraduate research is learning what opportunities exist. It's time to put on your researcher hat and spend some time online looking into faculty whose work aligns with your interests and what structured programs exist. In addition to the resources below, the Office of Undergraduate Research has a great Getting Started page.

How to Find Research Opportunities 

  • Office of Undergraduate Research: Office of Undergraduate Research; Undergraduate Research Database; Undergraduate Research Symposium Schedule
  • UW Research Centers: UW Research Centers
  • Departmental Websites: Most departmental websites will have a research section that shares current and past projects. Search UW "department name", then look for a research tab at the top. Examples: Psychology, Econ
  • College of Arts and Sciences Research: Research in the College of Arts and Sciences
  • Structured Undergraduate Research Programs: Academic Year Programs; Mary Gates Research Scholarship; Levinson Emerging Scholars Award; Washington Research Foundation Fellowships; Beyond UW Summer Programs; URP Co-Hosted Summer Programs; Summer Research Programs
  • Data Science as a Research Topic: Allen School Data Science; Data Management and Visualization; UW Database Group; Informatics; HCDE; eScience Institute
  • Additional Data Science Programs (may be non-research): DSI Summer Lab - University of Chicago; Data Science Summer Course, American University of Armenia; Wharton Data Science Academy Summer Program; Microsoft Research Data Science Summer School; Hoya Summer High School Sessions, Georgetown University; The University of Chicago Summer Session; Data Science Summer Institute; Analytics, Data Science & Decision Making Virtual Summer School; Jump-Start Data Science Summer Program, William and Mary; Essex Analytics, Data Science and Decision Making Online Summer School

Reaching Out to a Faculty Mentor about Research Opportunities 

Once you have identified a faculty member whose work you are interested in, it's time to reach out. Reaching out can be intimidating. The Office of Undergraduate Research has a website to support you in this process.

Reach Out to a Mentor 

Data Science Research Skills 

Data science skills are highly valuable in research across various disciplines. These skills help researchers collect, analyze, and interpret data, allowing them to draw meaningful insights and make evidence-based conclusions. Prior to engaging in research, reflect on the skills you have experience with, the skills you hope to practice during the research experience, and the skills you hope to gain. This reflection will help your faculty mentor best support you. Intrapersonal, communication, and critical thinking skills are just as important as the technical skills, if not more so. The value of demonstrated curiosity in the research process cannot be overstated.

Statistical Analysis

  • Researchers need a strong foundation in statistics to perform hypothesis testing, regression analysis, and other statistical techniques to extract patterns and relationships from data.
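As a rough illustration of the regression analysis mentioned above, here is a minimal sketch of a simple least-squares line fit using only the Python standard library. The function name `fit_line` and the study-hours data are made up for this example and do not come from any course or program described on this page.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data: hours studied vs. exam score (invented for illustration).
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]
slope, intercept = fit_line(hours, scores)
print(f"score = {intercept:.1f} + {slope:.1f} * hours")
```

In practice a researcher would likely use a statistics package and also report uncertainty (standard errors, confidence intervals), which this sketch leaves out.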

Data Collection

  • The ability to gather, clean, and preprocess data is crucial. This includes skills in data extraction, data cleaning, and data transformation.
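A small sketch of what cleaning and transforming raw records can look like, assuming each record is a dictionary with hypothetical `name` and `age` fields. The field names, the sample rows, and the rule of dropping incomplete rows are illustrative choices, not a prescribed method.

```python
def clean_records(rows):
    """Trim whitespace, normalize name casing, and drop incomplete rows."""
    cleaned = []
    for row in rows:
        name = (row.get("name") or "").strip().title()
        raw_age = (row.get("age") or "").strip()
        # Skip rows with a missing name or a non-numeric age.
        if not name or not raw_age.isdigit():
            continue
        cleaned.append({"name": name, "age": int(raw_age)})
    return cleaned

# Hypothetical raw survey rows (invented for illustration).
raw = [
    {"name": "  alice smith ", "age": "34"},
    {"name": "Bob Jones", "age": ""},
    {"name": "", "age": "41"},
    {"name": "carol lee", "age": " 29 "},
]
records = clean_records(raw)
print(records)
```

Whether to drop, flag, or impute incomplete rows is a research decision; dropping is used here only because it is the simplest to show.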

Data Visualization

  • Data visualization skills help researchers communicate their findings effectively. Proficiency in tools like Matplotlib, ggplot2, or Tableau can make complex data more accessible.

Machine Learning

  • Machine learning algorithms are used to build predictive models and classify data. Researchers may need to apply techniques like decision trees, support vector machines, or neural networks.
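To make the idea of a decision tree concrete, the sketch below fits a one-level tree (often called a decision stump) on a single numeric feature. It is a deliberately simplified stand-in for a full decision tree: it only considers rules of the form "predict 1 when x is above a threshold". The function names and the data are hypothetical.

```python
def fit_stump(xs, labels):
    """Pick the threshold that misclassifies the fewest training points.

    The stump predicts 1 when x > threshold and 0 otherwise.
    """
    distinct = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best_threshold, best_errors = None, len(xs) + 1
    for t in candidates:
        errors = sum((x > t) != bool(y) for x, y in zip(xs, labels))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

def predict(threshold, x):
    """Classify a new value with the fitted stump."""
    return 1 if x > threshold else 0

# Hypothetical one-feature training set (invented for illustration).
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
threshold = fit_stump(xs, labels)
print(threshold, predict(threshold, 6.5), predict(threshold, 2.2))
```

A real decision tree repeats this kind of split recursively across many features, and a research project would evaluate it on held-out data rather than on the training set.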

Data Mining

  • Researchers can use data mining techniques to discover patterns, anomalies, and relationships within large datasets. 

Programming

  • Proficiency in programming languages like Python or R is essential for data manipulation, analysis, and scripting.

Database Management

  • Knowledge of database systems, including SQL, is necessary for working with structured data and querying databases for research purposes.
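As one possible illustration of querying structured data with SQL, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `measurements` table, its columns, and its values are hypothetical.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (site TEXT, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [("north", 1.0), ("north", 3.0), ("south", 10.0)],
)

# Average value per site, sorted by site name.
rows = conn.execute(
    "SELECT site, AVG(value) FROM measurements GROUP BY site ORDER BY site"
).fetchall()
conn.close()
print(rows)
```

The same `SELECT ... GROUP BY` pattern carries over to larger database systems, although connection setup and SQL dialect details differ.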

Big Data Technologies

  • Familiarity with big data tools is important for handling and processing large datasets.

Data Ethics

  • Ethical considerations are crucial in research, especially when handling sensitive or personal data. Researchers should be aware of privacy regulations and ethical data practices.

Data Storytelling

  • Researchers need to communicate their findings effectively, which includes the ability to create compelling narratives from data.

Domain/Industry Knowledge

  • Understanding the specific industry or field of research is essential for framing research questions, selecting relevant data sources, and interpreting results accurately.

Experiment Design

  • In experimental research, knowledge of experimental design principles is essential for planning, conducting, and analyzing experiments.

Time Series Analysis

  • Time series analysis helps researchers understand the underlying causes of trends or systemic patterns over time. 
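One basic time series technique is smoothing with a moving average, which makes a trend easier to see by averaging out short-term fluctuation. The sketch below is a minimal version under that assumption; the function name and the monthly counts are invented, and a moving average describes a trend rather than explaining its cause.

```python
def moving_average(values, window):
    """Return the mean of each consecutive run of `window` observations."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical monthly counts (invented for illustration).
monthly = [10, 12, 11, 15, 14, 18]
smoothed = moving_average(monthly, 3)
print([round(v, 2) for v in smoothed])
```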

Geospatial Analysis

  • Geospatial analysis is used to add timing and location information to traditional types of data and to build data visualizations.

Text Analysis

  • For research involving text data, natural language processing (NLP) and text mining skills are valuable for extracting insights from textual content.
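A very small sketch of a first step in text mining: splitting text into lowercase word tokens and counting them. The tokenization rule (runs of letters and apostrophes) is a simplifying assumption, and the sample sentence is made up; real NLP work usually relies on dedicated libraries.

```python
import re
from collections import Counter

def word_counts(text):
    """Count lowercase word tokens in a piece of text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Hypothetical sample sentence (invented for illustration).
counts = word_counts("Data science uses data. Data helps research!")
print(counts.most_common(2))
```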

Data Security

  • Researchers need to ensure the security and integrity of research data, especially in studies involving sensitive or confidential information.

Version Control

  • Using a version control system helps track and manage changes to software code, ensuring reproducibility and transparency in research.

Collaboration and Communication

  • Effective communication and collaboration skills are essential for sharing findings, collaborating with peers, and presenting research results.

Leverage familiarity with data science in fields outside of data science, and gain skills and fluency to work with data in your major domain of study.

© 2024 University of Washington | Seattle, WA

Duke University Libraries

Statistical Science

Submit Thesis to DukeSpace

If you are an undergraduate honors student interested in submitting your thesis to DukeSpace , Duke University's online repository for publications and other archival materials in digital format, please contact Joan Durso to get this process started.

DukeSpace Electronic Theses and Dissertations (ETD) Submission Tutorial

  • DukeSpace Electronic Theses and Dissertation Self-Submission Guide

Need help submitting your thesis? Contact  [email protected] .

  • Last Updated: May 14, 2024 3:36 PM
  • URL: https://guides.library.duke.edu/stats

Princeton University

Data Science

Research in data science at Princeton integrates three strengths: the fundamental mathematics of machine learning and artificial intelligence; the interdisciplinary application of those tools to solve a wide range of real-world problems; and deep examination and innovation regarding the societal implications of artificial intelligence, including issues such as bias, equity, job automation, and privacy.

Visit AI@Princeton

Related Programs

Center for Digital Humanities

Center for Information Technology Policy

Computer Science

Electrical and Computer Engineering

Operations Research and Financial Engineering

Princeton Language and Intelligence Initiative

Princeton Precision Health

Program in Statistics and Machine Learning

Diversifying AI through data collection and career development

Mengdi Wang makes a play at decoding disease

New initiatives bring Princeton to the fore of AI innovation

Statistics start to untangle AI networks

Is AI too dangerous to release openly?

The next AI frontier? Expanding hardware by making it more compact.

Ryan P. Adams

Amir Ali Ahmadi

Christine Allen-Blanchette

Mechanical and Aerospace Engineering

Center for Statistics and Machine Learning

Sanjeev Arora

Ryne Beeson

Mark Braverman

Matias D. Cattaneo

Daniel J. Cohen

Pablo Debenedetti

Chemical and Biological Engineering

Adji Bousso Dieng

Jianqing Fan

Adam Finkelstein

Jaime Fernández Fisac

Robert Fish

Michael Freedman

Aarti Gupta

Jürgen Hackl

Civil and Environmental Engineering

Kelsey Hatzell

Andlinger Center for Energy and the Environment

Andrew Houck

Jerelle Joseph

Bioengineering Initiative

Aleksandra Korolova

Princeton School of Public and International Affairs

Sanjeev Kulkarni

Christos Maravelias

Jonathan Mayer

Forrest Meggers

Architecture

Prateek Mittal

Karthik Narasimhan

Arvind Narayanan

Ravi Netravali

H. Vincent Poor

Yuri Pritykin

Lewis-Sigler Institute for Integrative Genomics

Peter Ramadge

Anu Ramaswami

High Meadows Environmental Institute

Princeton Institute for International and Regional Studies (PIIRS)

Ben Raphael

Szymon Rusinkiewicz

Sebastian Seung

Princeton Neuroscience Institute (PNI)

Bartolomeo Stellato

Olga Troyanskaya

Pramod Viswanath

Mengdi Wang

Huacheng Yu

Ellen Zhong

Grad Coach

Research Topics & Ideas: Data Science

50 Topic Ideas To Kickstart Your Research Project

Research topics and ideas about data science and big data analytics

If you’re just starting out exploring data science-related topics for your dissertation, thesis or research project, you’ve come to the right place. In this post, we’ll help kickstart your research by providing a hearty list of data science and analytics-related research ideas, including examples from recent studies.

PS – This is just the start…

We know it’s exciting to run through a list of research topics, but please keep in mind that this list is just a starting point. The topic ideas provided here are intentionally broad and generic, so you will need to develop them further. Nevertheless, they should inspire some ideas for your project.

To develop a suitable research topic, you’ll need to identify a clear and convincing research gap, and a viable plan to fill that gap. If this sounds foreign to you, check out our free research topic webinar that explores how to find and refine a high-quality research topic, from scratch. Alternatively, consider our 1-on-1 coaching service.

Data Science-Related Research Topics

  • Developing machine learning models for real-time fraud detection in online transactions.
  • The use of big data analytics in predicting and managing urban traffic flow.
  • Investigating the effectiveness of data mining techniques in identifying early signs of mental health issues from social media usage.
  • The application of predictive analytics in personalizing cancer treatment plans.
  • Analyzing consumer behavior through big data to enhance retail marketing strategies.
  • The role of data science in optimizing renewable energy generation from wind farms.
  • Developing natural language processing algorithms for real-time news aggregation and summarization.
  • The application of big data in monitoring and predicting epidemic outbreaks.
  • Investigating the use of machine learning in automating credit scoring for microfinance.
  • The role of data analytics in improving patient care in telemedicine.
  • Developing AI-driven models for predictive maintenance in the manufacturing industry.
  • The use of big data analytics in enhancing cybersecurity threat intelligence.
  • Investigating the impact of sentiment analysis on brand reputation management.
  • The application of data science in optimizing logistics and supply chain operations.
  • Developing deep learning techniques for image recognition in medical diagnostics.
  • The role of big data in analyzing climate change impacts on agricultural productivity.
  • Investigating the use of data analytics in optimizing energy consumption in smart buildings.
  • The application of machine learning in detecting plagiarism in academic works.
  • Analyzing social media data for trends in political opinion and electoral predictions.
  • The role of big data in enhancing sports performance analytics.
  • Developing data-driven strategies for effective water resource management.
  • The use of big data in improving customer experience in the banking sector.
  • Investigating the application of data science in fraud detection in insurance claims.
  • The role of predictive analytics in financial market risk assessment.
  • Developing AI models for early detection of network vulnerabilities.

Data Science Research Ideas (Continued)

  • The application of big data in public transportation systems for route optimization.
  • Investigating the impact of big data analytics on e-commerce recommendation systems.
  • The use of data mining techniques in understanding consumer preferences in the entertainment industry.
  • Developing predictive models for real estate pricing and market trends.
  • The role of big data in tracking and managing environmental pollution.
  • Investigating the use of data analytics in improving airline operational efficiency.
  • The application of machine learning in optimizing pharmaceutical drug discovery.
  • Analyzing online customer reviews to inform product development in the tech industry.
  • The role of data science in crime prediction and prevention strategies.
  • Developing models for analyzing financial time series data for investment strategies.
  • The use of big data in assessing the impact of educational policies on student performance.
  • Investigating the effectiveness of data visualization techniques in business reporting.
  • The application of data analytics in human resource management and talent acquisition.
  • Developing algorithms for anomaly detection in network traffic data.
  • The role of machine learning in enhancing personalized online learning experiences.
  • Investigating the use of big data in urban planning and smart city development.
  • The application of predictive analytics in weather forecasting and disaster management.
  • Analyzing consumer data to drive innovations in the automotive industry.
  • The role of data science in optimizing content delivery networks for streaming services.
  • Developing machine learning models for automated text classification in legal documents.
  • The use of big data in tracking global supply chain disruptions.
  • Investigating the application of data analytics in personalized nutrition and fitness.
  • The role of big data in enhancing the accuracy of geological surveying for natural resource exploration.
  • Developing predictive models for customer churn in the telecommunications industry.
  • The application of data science in optimizing advertisement placement and reach.

Recent Data Science-Related Studies

While the ideas we’ve presented above are a decent starting point for finding a research topic, they are fairly generic and non-specific. So, it helps to look at actual studies in the data science and analytics space to see how this all comes together in practice.

Below, we’ve included a selection of recent studies to help refine your thinking. These are actual studies, so they offer useful insight into what a well-scoped research topic looks like.

  • Data Science in Healthcare: COVID-19 and Beyond (Hulsen, 2022)
  • Auto-ML Web-application for Automated Machine Learning Algorithm Training and evaluation (Mukherjee & Rao, 2022)
  • Survey on Statistics and ML in Data Science and Effect in Businesses (Reddy et al., 2022)
  • Visualization in Data Science VDS @ KDD 2022 (Plant et al., 2022)
  • An Essay on How Data Science Can Strengthen Business (Santos, 2023)
  • A Deep study of Data science related problems, application and machine learning algorithms utilized in Data science (Ranjani et al., 2022)
  • You Teach WHAT in Your Data Science Course?!? (Posner & Kerby-Helm, 2022)
  • Statistical Analysis for the Traffic Police Activity: Nashville, Tennessee, USA (Tufail & Gul, 2022)
  • Data Management and Visual Information Processing in Financial Organization using Machine Learning (Balamurugan et al., 2022)
  • A Proposal of an Interactive Web Application Tool QuickViz: To Automate Exploratory Data Analysis (Pitroda, 2022)
  • Applications of Data Science in Respective Engineering Domains (Rasool & Chaudhary, 2022)
  • Jupyter Notebooks for Introducing Data Science to Novice Users (Fruchart et al., 2022)
  • Towards a Systematic Review of Data Science Programs: Themes, Courses, and Ethics (Nellore & Zimmer, 2022)
  • Application of data science and bioinformatics in healthcare technologies (Veeranki & Varshney, 2022)
  • TAPS Responsibility Matrix: A tool for responsible data science by design (Urovi et al., 2023)
  • Data Detectives: A Data Science Program for Middle Grade Learners (Thompson & Irgens, 2022)
  • MACHINE LEARNING FOR NON-MAJORS: A WHITE BOX APPROACH (Mike & Hazzan, 2022)
  • COMPONENTS OF DATA SCIENCE AND ITS APPLICATIONS (Paul et al., 2022)
  • Analysis on the Application of Data Science in Business Analytics (Wang, 2022)

As you can see, these research topics are a lot more focused than the generic topic ideas we presented earlier. So, to develop a high-quality research topic, you’ll need to focus on a specific context with specific variables of interest.

Get 1-On-1 Help

If you’re still unsure about how to find a quality research topic, check out our Research Topic Kickstarter service, which is the perfect starting point for developing a unique, well-justified research topic.


Department of Statistics


Data Science is a rapidly growing field providing students with exciting career paths and opportunities for advanced study. The Data Science major gives students a foundation in those aspects of computer science, statistics, and mathematics that are relevant for analyzing and manipulating voluminous and/or complex data. Students majoring in Data Science will learn computer programming, data analysis, and database systems, and will learn to think critically about the process of understanding data. Students will also take a capstone experience course that aims to synthesize the skills and knowledge learned in the various disciplines that encompass data science. The Data Science major is a rigorous program that covers the practical use of Data Science methods as well as the theoretical properties underpinning the performance of the methods and algorithms.

The Data Science major is open to students in the Colleges of LSA and Engineering. This document is intended for students pursuing the Data Science major in LSA. Students in the College of Engineering who are interested in Data Science should visit the Undergraduate Program in Data Science site.

The LSA Data Science program office is located in 323 West Hall. Appointments with Data Science program advisors can be scheduled through the Undergraduate Advising Page.

Program Requirements

The Data Science major in LSA consists of a total of 42 required credit hours, not including pre-requisites or pre-major courses. All courses must be completed with a minimum grade of C. Note that the EECS department limits students to two attempts for EECS 203, EECS 280, and EECS 281.

Data Science Program Guide (Effective for declarations Fall '23 or prior)

Data Science Program Guide (Effective for declarations WN '24 or later)

Students will need to follow the Statistics Department’s Declaration Process which includes using this Canvas link to enroll in the Statistics & Data Science Pre-declaration Tutorial. Completion of the tutorial will unlock the declaration request form.

Please refer to the Data Science Program Guide for the most recent prerequisite information.

Students must complete the linear algebra prerequisite requirement, but can do this after declaring.

Minors and Double Majors

** A Data Science and a Computer Science double major is permitted in LSA. Students are required to complete 14 credits in their Computer Science major that are not double counted with the Data Science major. Students planning this double major should meet with an advisor to plan their curriculum. (This applies only to students declared in DS-LSA and CS-LSA prior to W24.) **

Most combinations of a Data Science major in LSA with other majors or minors are permitted, consistent with the LSA rules, except for the following combinations:

  • Data Science major with a Statistics or Applied Statistics minor 
  • Data Science major with a Computer Science minor
  • Data Science major with a Data Science minor


Thesis/Capstone for Master's in Data Science | Northwestern SPS - Northwestern School of Professional Studies


Data Science

Capstone and thesis overview.

Capstone and thesis are similar in that they both represent a culminating, scholarly effort of high quality. Both should clearly state a problem or issue to be addressed. Both will allow students to complete a larger project and produce a product or publication that can be highlighted on their resumes. Students should consider the factors below when deciding whether a capstone or thesis may be more appropriate to pursue.

A capstone is a practical or real-world project that can emphasize preparation for professional practice. A capstone is more appropriate if:

  • you don't necessarily need or want the experience of the research process or writing a big publication
  • you want more input on your project, from fellow students and instructors
  • you want more structure to your project, including assignment deadlines and due dates
  • you want to complete the project or graduate in a timely manner

A student can enroll in MSDS 498 Capstone in any term. However, capstone specialization courses can provide a unique student experience and may be offered only twice a year. 

A thesis is an academic-focused research project with broader applicability. A thesis is more appropriate if:

  • you want to get a PhD or other advanced degree and want the experience of the research process and writing for publication
  • you want to work individually with a specific faculty member who serves as your thesis adviser
  • you are more self-directed, are good at managing your own projects with very little supervision, and have a clear direction for your work
  • you have a project that requires more time to pursue

Students can enroll in MSDS 590 Thesis as long as there is an approved thesis project proposal, identified thesis adviser, and all other required documentation at least two weeks before the start of any term.

From Faculty Director, Thomas W. Miller, PhD


“Capstone projects and thesis research give students a chance to study topics of special interest to them. Students can highlight analytical skills developed in the program. Work on capstone and thesis research projects often leads to publications that students can highlight on their resumes.”

A thesis is an individual research project that usually takes two to four terms to complete. Capstone course sections, on the other hand, represent a one-term commitment.

Students need to evaluate their options prior to choosing a capstone course section because capstones vary widely from one instructor to the next. There are both general and specialization-focused capstone sections. Some capstone sections offer individual research projects, others offer team research projects, and a few give students a choice of individual or team projects.

Students should refer to the SPS Graduate Student Handbook for more information regarding registration for either MSDS 590 Thesis or MSDS 498 Capstone.

Capstone Experience

If students wish to engage with an outside organization to work on a project for capstone, they can refer to this checklist and lessons learned for some helpful tips.

Capstone Checklist

  • Start early — set aside a minimum of one to two months prior to the capstone quarter to determine the industry and modeling interests.
  • Networking — pitch your idea to potential organizations for projects and focus on the business benefits you can provide.
  • Permission request — make sure your final project can be shared with others in the course and the information can be made public.
  • Engagement — engage with the capstone professor prior to and immediately after getting the dataset to ensure appropriate scope for the 10 weeks.
  • Teambuilding — recruit team members who have similar interests for the type of project during the first week of the course.

Capstone Lessons Learned

  • Access to company data can take longer than expected; not having this access before or at the start of the term can severely delay progress
  • The project timeline should align with the coursework timeline as closely as possible
  • Designate one point of contact (POC) for business-facing communication to ensure streamlined messaging and more effective time management with the organization
  • Manage expectations on both sides: for the business, the work is pro bono; for students, it does not guarantee internship or job opportunities
  • Data security or masking that is not completed in time can jeopardize the opportunity entirely

Publication of Work

Northwestern University Libraries offers an option for students to publish their master’s thesis or capstone in Arch, Northwestern’s open access research and data repository.

Benefits of publishing your thesis:

  • Your work will be indexed by search engines and discoverable by researchers around the world, extending your work’s impact beyond Northwestern
  • Your work will be assigned a Digital Object Identifier (DOI) to ensure perpetual online access and to facilitate scholarly citation
  • Your work will help accelerate discovery and increase knowledge in your subject domain by adding to the global corpus of public scholarly information

Get started:

  • Visit Arch online
  • Log in with your NetID
  • Describe your thesis: title, author, date, keywords, rights, license, subject, etc.
  • Upload your thesis or capstone PDF and any related supplemental files (data, code, images, presentations, documentation, etc.)
  • Select a visibility: Public, Northwestern-only, Embargo (i.e. delayed release)
  • Save your work to the repository

Your thesis manuscript or capstone report will then be published on the MSDS page. You can view other published work here.

For questions or support in publishing your thesis or capstone, please contact [email protected].


The Future of AI Research: 20 Thesis Ideas for Undergraduate Students in Machine Learning and Deep Learning for 2023!

A comprehensive guide for crafting an original and innovative thesis in the field of AI.

By Aarafat Islam on 2023-01-11

“The beauty of machine learning is that it can be applied to any problem you want to solve, as long as you can provide the computer with enough examples.” — Andrew Ng

This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas provided are related to different areas of machine learning and deep learning, such as computer vision, natural language processing, robotics, finance, drug discovery, and more. The article also includes explanations, examples, and conclusions for each thesis idea, which can help guide the research and provide a clear understanding of the potential contributions and outcomes of the proposed research. Finally, the article emphasizes the importance of originality and the need for proper citation in order to avoid plagiarism.

1. Investigating the use of Generative Adversarial Networks (GANs) in medical imaging:  A deep learning approach to improve the accuracy of medical diagnoses.

Introduction:  Medical imaging is an important tool in the diagnosis and treatment of various medical conditions. However, accurately interpreting medical images can be challenging, especially for less experienced doctors. This thesis aims to explore the use of GANs in medical imaging, in order to improve the accuracy of medical diagnoses.

2. Exploring the use of deep learning in natural language generation (NLG): An analysis of the current state-of-the-art and future potential.

Introduction:  Natural language generation is an important field in natural language processing (NLP) that deals with creating human-like text automatically. Deep learning has shown promising results in NLP tasks such as machine translation, sentiment analysis, and question-answering. This thesis aims to explore the use of deep learning in NLG and analyze the current state-of-the-art models, as well as potential future developments.

3. Development and evaluation of deep reinforcement learning (RL) for robotic navigation and control.

Introduction:  Robotic navigation and control are challenging tasks, which require a high degree of intelligence and adaptability. Deep RL has shown promising results in various robotics tasks, such as robotic arm control, autonomous navigation, and manipulation. This thesis aims to develop a deep RL-based approach for robotic navigation and control and evaluate its performance in various environments and tasks.

4. Investigating the use of deep learning for drug discovery and development.

Introduction:  Drug discovery and development is a time-consuming and expensive process, which often involves high failure rates. Deep learning has been used to improve various tasks in bioinformatics and biotechnology, such as protein structure prediction and gene expression analysis. This thesis aims to investigate the use of deep learning for drug discovery and development and examine its potential to improve the efficiency and accuracy of the drug development process.

5. Comparison of deep learning and traditional machine learning methods for anomaly detection in time series data.

Introduction:  Anomaly detection in time series data is a challenging task, which is important in various fields such as finance, healthcare, and manufacturing. Deep learning methods have been used to improve anomaly detection in time series data, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for anomaly detection in time series data and examine their respective strengths and weaknesses.
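To make the comparison concrete, a thesis of this kind usually needs a simple non-deep baseline to measure against. The sketch below is one possible baseline, not a method prescribed by this article: a trailing-window z-score detector. The function name `rolling_zscore_anomalies`, the `window` and `threshold` defaults, and the sample readings are all illustrative assumptions.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(series, window=5, threshold=3.0):
    """Return indices whose value deviates from the trailing-window mean
    by more than `threshold` sample standard deviations.

    Hypothetical baseline for illustration; window and threshold are
    assumptions, not values taken from the article.
    """
    anomalies = []
    for i in range(window, len(series)):
        history = series[i - window:i]  # the `window` points before index i
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # Flat history: any departure from the constant value is flagged.
            if series[i] != mu:
                anomalies.append(i)
            continue
        if abs(series[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Made-up sensor readings with one obvious spike at index 7.
readings = [10.0, 10.2, 9.9, 10.1, 10.0, 10.3, 9.8, 25.0, 10.1, 10.0]
print(rolling_zscore_anomalies(readings))  # prints [7]
```

A deep learning detector (for example, an autoencoder scored by reconstruction error) could then be evaluated on the same series and labels, so that differences in precision and recall are attributable to the model and not to the data split.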


6. Use of deep transfer learning in speech recognition and synthesis.

Introduction:  Speech recognition and synthesis are areas of natural language processing that focus on converting spoken language to text and vice versa. Transfer learning has been widely used in deep learning-based speech recognition and synthesis systems to improve their performance by reusing the features learned from other tasks. This thesis aims to investigate the use of transfer learning in speech recognition and synthesis and how it improves the performance of the system in comparison to traditional methods.

7. The use of deep learning for financial prediction.

Introduction:  Financial prediction is a challenging task that requires a high degree of intelligence and adaptability, especially in the field of stock market prediction. Deep learning has shown promising results in various financial prediction tasks, such as stock price prediction and credit risk analysis. This thesis aims to investigate the use of deep learning for financial prediction and examine its potential to improve the accuracy of financial forecasting.

8. Investigating the use of deep learning for computer vision in agriculture.

Introduction:  Computer vision has the potential to revolutionize the field of agriculture by improving crop monitoring, precision farming, and yield prediction. Deep learning has been used to improve various computer vision tasks, such as object detection, semantic segmentation, and image classification. This thesis aims to investigate the use of deep learning for computer vision in agriculture and examine its potential to improve the efficiency and accuracy of crop monitoring and precision farming.

9. Development and evaluation of deep learning models for generative design in engineering and architecture.

Introduction:  Generative design is a powerful tool in engineering and architecture that can help optimize designs and reduce human error. Deep learning has been used to improve various generative design tasks, such as design optimization and form generation. This thesis aims to develop and evaluate deep learning models for generative design in engineering and architecture and examine their potential to improve the efficiency and accuracy of the design process.

10. Investigating the use of deep learning for natural language understanding.

Introduction:  Natural language understanding is a complex task of natural language processing that involves extracting meaning from text. Deep learning has been used to improve various NLP tasks, such as machine translation, sentiment analysis, and question-answering. This thesis aims to investigate the use of deep learning for natural language understanding and examine its potential to improve the efficiency and accuracy of natural language understanding systems.


11. Comparing deep learning and traditional machine learning methods for image compression.

Introduction:  Image compression is an important task in image processing and computer vision. It enables faster data transmission and storage of image files. Deep learning methods have been used to improve image compression, while traditional machine learning methods have been widely used as well. This thesis aims to compare deep learning and traditional machine learning methods for image compression and examine their respective strengths and weaknesses.

12. Using deep learning for sentiment analysis in social media.

Introduction:  Sentiment analysis in social media is an important task that can help businesses and organizations understand their customers’ opinions and feedback. Deep learning has been used to improve sentiment analysis in social media, by training models on large datasets of social media text. This thesis aims to use deep learning for sentiment analysis in social media, and evaluate its performance against traditional machine learning methods.

13. Investigating the use of deep learning for image generation.

Introduction:  Image generation is a task in computer vision that involves creating new images from scratch or modifying existing images. Deep learning has been used to improve various image generation tasks, such as super-resolution, style transfer, and face generation. This thesis aims to investigate the use of deep learning for image generation and examine its potential to improve the quality and diversity of generated images.

14. Development and evaluation of deep learning models for anomaly detection in cybersecurity.

Introduction:  Anomaly detection in cybersecurity is an important task that can help detect and prevent cyber-attacks. Deep learning has been used to improve various anomaly detection tasks, such as intrusion detection and malware detection. This thesis aims to develop and evaluate deep learning models for anomaly detection in cybersecurity and examine their potential to improve the efficiency and accuracy of cybersecurity systems.

15. Investigating the use of deep learning for natural language summarization.

Introduction:  Natural language summarization is an important task in natural language processing that involves creating a condensed version of a text that preserves its main meaning. Deep learning has been used to improve various natural language summarization tasks, such as document summarization and headline generation. This thesis aims to investigate the use of deep learning for natural language summarization and examine its potential to improve the efficiency and accuracy of natural language summarization systems.


16. Development and evaluation of deep learning models for facial expression recognition.

Introduction:  Facial expression recognition is an important task in computer vision and has many practical applications, such as human-computer interaction, emotion recognition, and psychological studies. Deep learning has been used to improve facial expression recognition, by training models on large datasets of images. This thesis aims to develop and evaluate deep learning models for facial expression recognition and examine their performance against traditional machine learning methods.

17. Investigating the use of deep learning for generative models in music and audio.

Introduction:  Music and audio synthesis is an important task in audio processing, which has many practical applications, such as music generation and speech synthesis. Deep learning has been used to improve generative models for music and audio, by training models on large datasets of audio data. This thesis aims to investigate the use of deep learning for generative models in music and audio and examine its potential to improve the quality and diversity of generated audio.

18. Comparing deep learning models with traditional algorithms for anomaly detection in network traffic.

Introduction:  Anomaly detection in network traffic is an important task that can help detect and prevent cyber-attacks. Deep learning models have been used for this task, and traditional methods such as clustering and rule-based systems are widely used as well. This thesis aims to compare deep learning models with traditional algorithms for anomaly detection in network traffic and analyze the trade-offs between the models in terms of accuracy and scalability.

19. Investigating the use of deep learning for improving recommender systems.

Introduction:  Recommender systems are widely used in many applications such as online shopping, music streaming, and movie streaming. Deep learning has been used to improve the performance of recommender systems, by training models on large datasets of user-item interactions. This thesis aims to investigate the use of deep learning for improving recommender systems and compare its performance with traditional content-based and collaborative filtering approaches.
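As a reference point for the "traditional collaborative filtering" side of that comparison, the following is a minimal sketch of user-based collaborative filtering with cosine similarity. It is an illustrative baseline under stated assumptions, not the article's method: the function names `cosine` and `recommend`, the dictionary layout of `ratings`, and the toy data are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(ratings, user, top_n=2):
    """Rank items the user has not rated by similarity-weighted ratings
    from other users (user-based collaborative filtering)."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap in rated items
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_n]


# Toy user-item ratings (hypothetical data).
ratings = {
    "ana": {"film_a": 5, "film_b": 4},
    "ben": {"film_a": 5, "film_b": 4, "film_c": 5},
    "cy": {"film_d": 5},
}
print(recommend(ratings, "ana"))  # prints ['film_c']
```

Here `film_d` is never recommended to `ana` because `cy` shares no rated items with her, which illustrates the sparsity problem that learned embedding models are often proposed to address.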

20. Development and evaluation of deep learning models for multi-modal data analysis.

Introduction:  Multi-modal data analysis is the task of analyzing and understanding data from multiple sources such as text, images, and audio. Deep learning has been used to improve multi-modal data analysis, by training models on large datasets of multi-modal data. This thesis aims to develop and evaluate deep learning models for multi-modal data analysis and analyze their potential to improve performance in comparison to single-modal models.

I hope that this article has provided you with a useful guide for your thesis research in machine learning and deep learning. Remember to conduct a thorough literature review and to include proper citations in your work, as well as to be original in your research to avoid plagiarism. I wish you all the best of luck with your thesis and your research endeavors!


PROFESSIONAL MASTER'S PROGRAM

Master of Data Science

Rice University's Master of Data Science program is a professional, non-thesis degree designed to support the needs of interdisciplinary professionals. Taught by world-class faculty, the program offers students online or on-campus options.

Master of Data Science (MDS): Online & On-Campus Programs


Program Overview

The MDS degree will be offered with both an on-campus and an online option. Students must apply to either the online or on-campus program and will be explicitly admitted to one program or the other.

Rice’s Master of Data Science (MDS) program is designed to support the needs of interdisciplinary professionals who want to apply data science knowledge, theory, and techniques to solve real-world problems.

The program offers:

  • Multidisciplinary, interdepartmental and intercollegiate instruction
  • Customizable, specialized degrees consisting of 31 graduate credit hours
  • The same degree whether earned online or in person

Program Learning Outcomes

Upon completing the MDS degree, students will have proficiency in:

  • Understanding the computational and statistical foundations of Data Science
  • Knowing and understanding how to use the core methods of Data Science as applied to an area of specialization or across a breadth of areas
  • Applying Data Science knowledge, theory, and techniques to solve difficult, real-world problems, beginning with raw data and ending with actionable insights
  • Communicating effectively, both in writing and orally, about Data Science methods and results to a lay audience

Curriculum Overview

This non-thesis curriculum requires the completion of a minimum of 31 credits. It is a rigorous blend of courses that deliver the skills you need to collect, evaluate, interpret and communicate data for effective decision-making across a variety of industries.

  • Core Courses: Your curriculum includes core courses designed to help you gain an understanding of the computational and statistical foundations of data science.
  • Specialization: You’ll gain deeper knowledge in data science by choosing a specialization in business analytics, machine learning or image processing.* Currently, image processing coursework is only offered for the on-campus program.
  • Electives: You’ll further customize your program of study with an elective in ethics, cybersecurity, or security and privacy.
  • Capstone: Then, to give you experience applying your knowledge to a real-world problem, you’ll participate in a capstone project that will help you demonstrate your skill, collaborative ability and problem-solving acumen.

View the MDS Curriculum to learn more about core courses, specializations, electives and our data science capstone project.

Online or On-Campus, which is right for you?

The Online MDS is a part-time program that allows working professionals to get the same benefits and curriculum of a full-time, on-campus program in an online environment. Students have access to best-in-class materials and resources and can connect with peers and world-class educators.

On-Campus MDS

The On-Campus MDS is a full-time program at the Rice University campus in Houston, Texas. The program hosts a lively and invigorating community of scholars in the Department of Computer Science, the largest academic department at Rice.

Engineering Professional Master's Programs

The following professional master's programs also offer non-thesis, advanced degrees involving data science:

  • Master of Computational and Applied Mathematics: The Professional Master of Computational and Applied Mathematics (MCAM) is designed for students interested in a technical career path in industry or business.
  • Master in Computational Science and Engineering: The Professional Master in Computational Science and Engineering (MCSE) is offered jointly by the Departments of Computational and Applied Mathematics, Computer Science, and Statistics in the School of Engineering.
  • Master of Computer Science: The Professional Master of Computer Science (MCS) degree is a terminal degree for students intending to pursue a technical career in the computer industry.
  • Master of Electrical and Computer Engineering: The Department of Electrical and Computer Engineering offers a Professional Master of Electrical and Computer Engineering (MECE) program with a focus in Data Science.
  • Master of Industrial Engineering: The School of Engineering offers a Professional Master of Industrial Engineering for students seeking a deeper understanding of how sophisticated decision models can optimize complex systems in any industry as well as the nonprofit sector.
  • Master of Statistics: The Department of Statistics offers a Professional Master of Statistics (MStat) program that includes a solid foundation in statistical computing, statistical modeling, experimental design, and mathematical statistics, plus electives in statistical methods and/or theory.
  • Professional Science Master's Program: The Subsurface-Geoscience Professional Science Master’s program offers a program focus area in Energy Data Management.

Chapman University Digital Commons

Computational and Data Sciences (PhD) Dissertations

Below is a selection of dissertations from the Doctor of Philosophy in Computational and Data Sciences program in Schmid College that have been included in Chapman University Digital Commons. Additional dissertations from years prior to 2019 are available through the Leatherby Libraries' print collection or in ProQuest's Dissertations and Theses database.

Dissertations from 2024

A Novel Correction for the Multivariate Ljung-Box Test , Minhao Huang

Machine Learning and Geostatistical Approaches for Discovery of Weather and Climate Events Related to El Niño Phenomena , Sachi Perera

Global to Glocal: A Confluence of Data Science and Earth Observations in the Advancement of the SDGs , Rejoice Thomas

Dissertations from 2023

Computational Analysis of Antibody Binding Mechanisms to the Omicron RBD of SARS-CoV-2 Spike Protein: Identification of Epitopes and Hotspots for Developing Effective Therapeutic Strategies , Mohammed Alshahrani

Integration of Computer Algebra Systems and Machine Learning in the Authoring of the SANYMS Intelligent Tutoring System , Sam Ford

Voluntary Action and Conscious Intention , Jake Gavenas

Random Variable Spaces: Mathematical Properties and an Extension to Programming Computable Functions , Mohammed Kurd-Misto

Computational Modeling of Superconductivity from the Set of Time-Dependent Ginzburg-Landau Equations for Advancements in Theory and Applications , Iris Mowgood

Application of Machine Learning Algorithms for Elucidation of Biological Networks from Time Series Gene Expression Data , Krupa Nagori

Stochastic Processes and Multi-Resolution Analysis: A Trigonometric Moment Problem Approach and an Analysis of the Expenditure Trends for Diabetic Patients , Isaac Nwi-Mozu

Applications of Causal Inference Methods for the Estimation of Effects of Bone Marrow Transplant and Prescription Drugs on Survival of Aplastic Anemia Patients , Yesha M. Patel

Causal Inference and Machine Learning Methods in Parkinson's Disease Data Analysis , Albert Pierce

Causal Inference Methods for Estimation of Survival and General Health Status Measures of Alzheimer’s Disease Patients , Ehsan Yaghmaei

Dissertations from 2022

Computational Approaches to Facilitate Automated Interchange between Music and Art , Rao Hamza Ali

Causal Inference in Psychology and Neuroscience: From Association to Causation , Dehua Liang

Advances in NLP Algorithms on Unstructured Medical Notes Data and Approaches to Handling Class Imbalance Issues , Hanna Lu

Novel Techniques for Quantifying Secondhand Smoke Diffusion into Children's Bedroom , Sunil Ramchandani

Probing the Boundaries of Human Agency , Sook Mun Wong

Dissertations from 2021

Predicting Eye Movement and Fixation Patterns on Scenic Images Using Machine Learning for Children with Autism Spectrum Disorder , Raymond Anden

Forecasting the Prices of Cryptocurrencies using a Novel Parameter Optimization of VARIMA Models , Alexander Barrett

Applications of Machine Learning to Facilitate Software Engineering and Scientific Computing , Natalie Best

Exploring Behaviors of Software Developers and Their Code Through Computational and Statistical Methods , Elia Eiroa Lledo

Assessing the Re-Identification Risk in ECG Datasets and an Application of Privacy Preserving Techniques in ECG Analysis , Arin Ghazarian

Multi-Modal Data Fusion, Image Segmentation, and Object Identification using Unsupervised Machine Learning: Conception, Validation, Applications, and a Basis for Multi-Modal Object Detection and Tracking , Nicholas LaHaye

Machine-Learning-Based Approach to Decoding Physiological and Neural Signals , Elnaz Lashgari

Learning-Based Modeling of Weather and Climate Events Related To El Niño Phenomenon via Differentiable Programming and Empirical Decompositions , Justin Le

Quantum State Estimation and Tracking for Superconducting Processors Using Machine Learning , Shiva Lotfallahzadeh Barzili

Novel Applications of Statistical and Machine Learning Methods to Analyze Trial-Level Data from Cognitive Measures , Chelsea Parlett

Optimal Analytical Methods for High Accuracy Cardiac Disease Classification and Treatment Based on ECG Data , Jianwei Zheng

Dissertations from 2020

Development of Integrated Machine Learning and Data Science Approaches for the Prediction of Cancer Mutation and Autonomous Drug Discovery of Anti-Cancer Therapeutic Agents , Steven Agajanian

Allocation of Public Resources: Bringing Order to Chaos , Lance Clifner

A Novel Correction for the Adjusted Box-Pierce Test — New Risk Factors for Emergency Department Return Visits within 72 hours for Children with Respiratory Conditions — General Pediatric Model for Understanding and Predicting Prolonged Length of Stay , Sidy Danioko

A Computational and Experimental Examination of the FCC Incentive Auction , Logan Gantner

Exploring the Employment Landscape for Individuals with Autism Spectrum Disorders using Supervised and Unsupervised Machine Learning , Kayleigh Hyde

Integrated Machine Learning and Bioinformatics Approaches for Prediction of Cancer-Driving Gene Mutations , Oluyemi Odeyemi

On Quantum Effects of Vector Potentials and Generalizations of Functional Analysis , Ismael L. Paiva

Long Term Ground Based Precipitation Data Analysis: Spatial and Temporal Variability , Luciano Rodriguez

Gaining Computational Insight into Psychological Data: Applications of Machine Learning with Eating Disorders and Autism Spectrum Disorder , Natalia Rosenfield

Connecting the Dots for People with Autism: A Data-driven Approach to Designing and Evaluating a Global Filter , Viseth Sean

Novel Statistical and Machine Learning Methods for the Forecasting and Analysis of Major League Baseball Player Performance , Christopher Watkins

Dissertations from 2019

Contributions to Variable Selection in Complexly Sampled Case-control Models, Epidemiology of 72-hour Emergency Department Readmission, and Out-of-site Migration Rate Estimation Using Pseudo-tagged Longitudinal Data , Kyle Anderson

Bias Reduction in Machine Learning Classifiers for Spatiotemporal Analysis of Coral Reefs using Remote Sensing Images , Justin J. Gapper

Estimating Auction Equilibria using Individual Evolutionary Learning , Kevin James

Employing Earth Observations and Artificial Intelligence to Address Key Global Environmental Challenges in Service of the SDGs , Wenzhao Li

Image Restoration using Automatic Damaged Regions Detection and Machine Learning-Based Inpainting Technique , Chloe Martin-King

Theses from 2017

Optimized Forecasting of Dominant U.S. Stock Market Equities Using Univariate and Multivariate Time Series Analysis Methods , Michael Schwartz

ISSN 2572-1496

Indiana University Indianapolis

Luddy School of Informatics, Computing, and Engineering

Numbers are the name of the game

Statistics have always been part of sports. But the digital age has altered the playing field, as organizations seek out those with the skills to use statistics as a tool for success. Combine sports marketing skills with the analysis and management of data when you earn a master’s in Applied Data Science with a specialization in Sports Analytics at IU Indianapolis.

Sports Analytics Specialization

Succeed with a winning combination.

Analytics is a crucial part of decision-making in amateur and professional athletics. Teams rely on those with the knowledge to interpret data and relate it to the world of athletics. (Nikhil Morar earned his Applied Data Science master's degree with a specialization in Sports Analytics from our program, and became Manager of Business Analytics & Strategy for the Los Angeles Lakers basketball franchise.)

Indianapolis boasts 10 professional sports teams. The city is home to the National Collegiate Athletic Association (NCAA) and the National Federation of State High School Associations, and it is widely considered the Capital of Amateur Sports.

By teaming up, the Luddy School in Indianapolis and the Department of Tourism, Event, and Sport Management draw on a unique mix of resources to offer a B.S./M.S. in this exciting field.

Rishi Chandran, M.S. '23

Basketball Operations Seasonal Assistant with the Cleveland Cavaliers

“I was able to use machine learning and descriptive statistics to create actionable scouting reports focused on finding strategies that will give a team a better chance of winning.”

Careers in Sports Analytics

Our sports analytics alumni work for some of the greatest teams in the NBA. On game day, it's all about the numbers!

Nikhil Morar

Manager of Business Analytics & Strategy for the Los Angeles Lakers

“Sports organizations need analytics experts who can turn data about their customers and teams into revenue-generating strategies.”

Gabriel Wachowski

Research and Innovation Analyst for the Milwaukee Bucks

"Overall, the job has been absolutely incredible. I definitely feel like having my master’s was extremely important to being ready for the job that I have. My classes at IUPUI and my internship (with the Indiana Pacers) were both instrumental to where I am today."

A skill set tailored to sports

Students who earn a Master of Science in Applied Data Science with a specialization in Sports Analytics learn core skills in data analysis, data management and infrastructure, client–server application development, and ethical and professional management of data projects.

Earn additional competencies in sports sales, the management of massive, high-throughput data stores, cloud computing, and the data life cycle.

Degree requirements

The plan of study is 30 credit hours. It includes six core courses and four specialization/elective courses. Transfer students may be able to transfer in approved graduate courses from an accredited institution.

F-1 students can take only one online course per semester. They must take a minimum of 8 credit hours per semester, except in their final semester. These limitations apply to fall and spring semesters but not summer sessions.

Core Courses (18 credits)

  • INFO-H 501 Introduction to Data Science Programming
  • LIS-S 511 Database Design
  • INFO-H 510 Statistics for Data Science
  • INFO-H 515 Statistical Learning (Prerequisite: Graduate Statistics course)
  • INFO-H 516 Cloud Computing for Data Science (Prerequisite: Graduate Database course)
  • INFO-H 517 Visualization Design, Analysis, and Evaluation (Prerequisite: Programming experience)

Students may test out of LIS-S 511. Students do not receive credit toward their required 30 credit hours by testing out of a course. However, they may instead replace the course with a specialization course or approved elective.

Specialization + Elective Courses (12 credits)

Specialization Courses

  • TESM-T 562 Economics of Event Tourism (Fall)
  • TESM-T 582 Applied Sport Event Research (Spring)
  • TESM-T 598 Master’s Consulting Project (Summer)

Elective Courses

  • INFO-B 505 Informatics Project Management
  • INFO-H 518 Deep Learning Neural Networks
  • INFO-H 519 Natural Language Processing with Deep Learning  
  • INFO-H 695 Thesis/Project in Applied Data Science (MS Thesis students only)
  • INFO-I 575 Informatics Research Design
  • INFO-I 595 Professional Internship
  • INFO-I 698 Research in Informatics (Independent Study)
  • INFO-P 502 Modeling Crisis
  • NEWM-N 510 Web Database Development

Thesis or Project

The Thesis/Project is available to highly motivated students ready to carry out publishable research. Students must prepare a prospectus and gain a commitment from a primary faculty advisor with research interests in data science by the end of the first semester. By the end of the second semester, students must complete a course on research design and methods (e.g., INFO-I 575 or LIS-S 506).

The thesis or project must be completed in two semesters or in a semester and summer. Thesis students register for a total of 6 credits and project students register for a total of 3–6 credits of INFO-H 695 Thesis/Project in Data Science. Students are required to prepare and defend a research proposal with a timeline of deliverables in addition to the thesis or project.

Plan of study for fall admissions

Fall year 1, spring year 1.

  • INFO-H 515 Statistical Learning
  • INFO-H 516 Cloud Computing for Data Science
  • INFO-H 517 Visualization Design, Analysis, and Evaluation
  • Sports Analytics Specialization or Elective Course

Summer Year 1 (Optional)

Fall year 2.

  • Sports Analytics Specialization or Elective Course (If not taken in Summer)

Plan of study for spring admissions

Spring year 2.

School of Art + Design

2024 Senior Thesis Exhibition

  • Avery Berman
  • Maggie Champer
  • Alexis Craiglow
  • Brenna Cromwell
  • Madison Fantelli
  • Kara Ferguson
  • Aeden Grothaus
  • Lauren Hailey
  • Myah Husted
  • Taylor Laubacher
  • Keirstan Mesnick
  • Amedee Michel
  • Josie Stafford
  • Lea Strauss
  • Brynna Pope
  • Sophie Zawodny

COMMENTS

  1. PDF Undergraduate Fundamentals of Machine Learning

    students with an undergraduate-level background in linear algebra and statistics. The final product is a textbook for Harvard's introductory course in machine learning, CS 181. This work is motivated by a lack of resources for individuals with an undergraduate background in the areas necessary to succeed in an introductory course in machine learning.

  2. How to write a great data science thesis

    They will stress the importance of structure, substance and style. They will urge you to write down your methodology and results first, then progress to the literature review, introduction and conclusions and to write the summary or abstract last. To write clearly and directly with the reader's expectations always in mind.

  3. MIT Theses

    MIT's DSpace contains more than 58,000 theses completed at MIT dating as far back as the mid 1800's. Theses in this collection have been scanned by the MIT Libraries or submitted in electronic format by thesis authors. Since 2004 all new Masters and Ph.D. theses are scanned and added to this collection after degrees are awarded.

  4. Undergraduate Research

    An honors thesis provides an opportunity for eligible students to carry out faculty-supervised research in the senior year. The application process and requirements for the Statistics, Data Science, and Informatics honors programs are described on the department website. Students are encouraged to contribute their thesis to the archive of honors theses at the University of Michigan Library.

  5. BSc/MSc Thesis

    BSc/MSc Thesis. Our research group offers various interesting topics for a BSc or MSc thesis, the latter both in Computer Science and Scientific Computing. These topics are typically closely related to ongoing research projects (see our Research Page and Publications ). Below, we outline the basic procedure you should follow when planning to do ...

  6. Five Tips For Writing A Great Data Science Thesis

    Although educational programs, conventions and thesis requirements vary wildly, I hope to offer some common guidelines for any student currently working on a Data Science thesis. The article offers five guidance points, but may effectively be summarized in a single line: "Write for your reader, not for yourself."

  7. Open Theses

    Open Topics: We offer multiple Bachelor/Master theses, Guided Research projects and IDPs in the area of data mining/machine learning. A non-exhaustive list of open topics is listed below. If you are interested in a thesis or a guided research project, please send your CV and transcript of records to Prof. Stephan Günnemann via email and we will arrange a meeting to talk about the potential ...

  8. Data Science

    Data Science Thesis Continuation. (4 Hours) Focuses on student continuing to prepare an undergraduate thesis under faculty supervision. DS 5010. Introduction to Programming for Data Science. (4 Hours) Offers an introductory course on fundamentals of programming and data structures. Covers lists, arrays, trees, hash tables, etc.; program design ...

  9. Undergraduate Research: Data Science Minor: University of Washington

    Research topics in data science might include: Development and optimization of machine learning algorithms. Evaluation of data mining techniques for specific applications. Studies on data preprocessing and feature engineering. Ethical considerations in data science and AI. Investigations into the social, economic, or policy implications of data ...

  10. Undergraduate theses

    If you are an undergraduate honors student interested in submitting your thesis to DukeSpace, Duke University's online repository for publications and other archival materials in digital format, please contact Joan Durso to get this process started. DukeSpace Electronic Theses and Dissertations (ETD) Submission Tutorial.

  11. Data Science

    Data Science. Research in data science at Princeton integrates three strengths: the fundamental mathematics of machine learning and artificial intelligence; the interdisciplinary application of those tools to solve a wide range of real-world problems; and deep examination and innovation regarding the societal implications of artificial ...

  12. 10 Best Research and Thesis Topic Ideas for Data Science in 2022

    In this article, we have listed 10 such research and thesis topic ideas to take up as data science projects in 2022. Handling practical video analytics in a distributed cloud: With increased dependency on the internet, sharing videos has become a mode of data and information exchange. The role of the implementation of the Internet of Things ...

  13. Research Topics & Ideas: Data Science

    If you're just starting out exploring data science-related topics for your dissertation, thesis or research project, you've come to the right place. In this post, we'll help kickstart your research by providing a hearty list of data science and analytics-related research ideas, including examples from recent studies. PS: This is just the start…

  14. Ten Research Challenge Areas in Data Science

    Abstract. To drive progress in the field of data science, we propose 10 challenge areas for the research community to pursue. Since data science is broad, with methods drawing from computer science, statistics, and other disciplines, and with applications appearing in all sectors, these challenge areas speak to the breadth of issues spanning ...

  15. Major in Data Science

    The Data Science major in LSA consists of a total of 42 required credit hours, not including pre-requisites or pre-major courses. All courses must be completed with a minimum grade of C. Note that the EECS department limits students to two attempts for EECS 203, EECS 280, and EECS 281. Data Science Program Guide (Effective for declarations Fall 23' or prior)

  16. Thesis/Capstone for Master's in Data Science

    Thesis. A thesis is an academic-focused research project with broader applicability. A thesis is more appropriate if: you want to get a PhD or other advanced degree and want the experience of the research process and writing for publication; you want to work individually with a specific faculty member who serves as your thesis adviser

  17. The Future of AI Research: 20 Thesis Ideas for Undergraduate ...

    This article provides a list of 20 potential thesis ideas for an undergraduate program in machine learning and deep learning in 2023. Each thesis idea includes an introduction, which presents a brief overview of the topic and the research objectives. The ideas provided are related to different areas of machine learning and deep learning, such ...

  18. Data science thesis for undergraduate? : r/datascience

    Python Blaze. Technologies for constructing data processing pipelines. Luigi. Cloud analytics platforms that make machine learning "available to the masses". Amazon Machine Learning. Microsoft Azure Machine Learning. NB: The examples provided for a given topic above are not intended to be comprehensive.

  19. Master of Data Science

    Data science has revolutionized almost every industry, providing some of the most in-demand and highest-paying jobs for graduates. Rice's Master of Data Science (MDS) is a professional non-thesis degree designed to support the needs of interdisciplinary professionals. The program offers students online or on-campus options.
