Student Projects
CMDA Student Capstone Projects
In the Computational Modeling and Data Analytics (CMDA) Capstone Project course at Virginia Tech, teams of three or four students spend the semester tackling an open-ended, client-driven project. In addition to the technical aspects of the project, students are mentored by CMDA Faculty in teamwork, project management, and technical leadership. Through the lens of their particular projects, the teams also consider the ethical aspects of data science and mathematical modeling.
Expand the links below to read about the teams' investigations!
- Can we better understand the dominating preference for ultra-processed foods over minimally processed foods in America?
- Analytic methods: Linear mixed effects modeling
Team's Story
Working as a group was tough at times, but with help from one another and support from Dr. Ahrens and Ms. Lozano, we were able to work through it. The experience was unlike anything we had encountered in a normal classroom setting. The project gave us a taste of what large-scale projects are like in the real world. In the end, although there was uncertainty at times, we were proud of our final product, and the hard work made it that much sweeter.


Project Team Members:
- Jacob Parker, B.S. in Computational Modeling and Data Analytics
- Sabrina Hart, B.S. in Computational Modeling and Data Analytics
- Aaron Ni, B.S. in Computer Science and B.S. in Computational Modeling and Data Analytics
- Anish Monokonda, B.S. in Computational Modeling and Data Analytics
CBHDS Sponsors:

- Can we find demographic and food perception factors that predict overall food preference across respondents of a national survey? Additionally, do people prefer ultra-processed foods over minimally processed foods overall?
- Analytic methods: Linear mixed effects modeling
- This team was recognized by Mr. Brian Sanchez and Ms. Nancy Schuessler with an honorary sponsorship of their project.
Team's Story
As part of a new study, Big Byte Analytics had the opportunity to tackle the poor diet issue caused by ultra-processed foods (UPFs) in the United States. We were interested in finding what factors lead to increased UPF intake using survey data and nutrition information data. Over the course of the project, we faced significant delays in data collection and were unfortunately unable to analyze the true nutrition information of the foods featured in the survey. We hope that future teams can use this data along with our cleaned data and results to answer our original research question.
Despite our setbacks, we found some meaningful results that we believe will serve as a great first step towards finding a solution to the poor diet problem in the United States. We found that an individual’s age and perception of food (e.g., how healthy they think the food is or its perceived calorie count) have a significant influence on food preference. We also found that most people do not have a particular preference between UPFs and minimally processed foods, and concluded that high UPF intake could also be strongly influenced by factors such as low cost, accessibility, advertising, and lack of health/food knowledge.
Project Team Members:
- Laura Nury, B.S. in Computational Modeling and Data Analytics (CryptoCyber Option, Minor in Mathematics), May 2023
- Rithvik Guntor, B.S. in Computational Modeling and Data Analytics (Minor: Mathematics and Computer Science), December 2022
- Renny Adjei, B.S. in Computational Modeling and Data Analytics (Minor: Statistics and Mathematics), May 2023
CBHDS Sponsors:


- Can we build on the work done by team Diet Code and expand the predictive modeling of cleaned NHANES data to additional metabolic outcomes?
- Analytic methods: multiple imputation through chained equations, stepwise linear and logistic regression, random forests, neural networks.
- This team was recognized by Mr. & Mrs. Mark and Nancy Scheffel with an honorary sponsorship of their project.
Team's Story
Due to the scope of organizing and cleaning the NHANES data, team Diet Code from Spring 2021 was limited in their ability to analyze their data and model diabetes. Building on that work, team Diet Science used the cleaned NHANES data to predict the presence of hypertension and obesity, and to model the amount of LDL cholesterol, a known risk factor for a number of metabolic diseases.
The team first resolved the problem of missing data by assuming that the data were missing at random and then using multiple imputation by chained equations. After imputation, they modeled their outcomes using a combination of stepwise regression, random forests, and neural networks. Of note, the team was able to greatly improve the performance of their initial neural networks by iteratively expanding and refining those models, a process requiring much determination and skill.

Project Team Members:
- Colin Brant, B.S. in Computational Modeling and Data Analytics, May 2022
- Thomas Stapor, B.S. in Computational Modeling and Data Analytics, May 2022
- Visvas Kaja, B.S. in Computational Modeling and Data Analytics, May 2022
CBHDS Sponsors:
- Ian Crandell
- Alexandra Hanlon
- Can we harmonize and compile several years of NHANES data and use it to predict diabetes?
- Approaches: in-depth study of multiple years of messy survey data, manual harmonization and data manipulation
- Analytic methods: stepwise logistic regression
Team's Story
This project was somewhat different from traditional CMDA projects, as its main objective was to tackle a problem which, while common, is not often discussed in class. Typically, a student begins their analytic process with a clean data set to which they can immediately apply whatever summary or analytic method they choose. This is rare in practice, where analysis more commonly begins with a very disorganized collection of data that needs to be transformed into a usable form. For instance, some variable names correspond to different questions in different years, and the only way to resolve these discrepancies is with meticulous study.
After this impressive step, the data was still plagued with issues such as missing data, multicollinearity, and potential sampling bias. The team resolved these via complete case analysis and stepwise logistic regression, and were able to predict diabetes in the sample with a high level of accuracy. This project taught the team the value of careful study and exposed them to an often underappreciated dimension of data analysis.

Project Team Members:
- Lauren Bradley, B.S. in Computational Modeling and Data Analytics, May 2022
- Evan Briscoe, B.S. in Computational Modeling and Data Analytics, December 2021
- Jake Lavitt, B.S. in Computational Modeling and Data Analytics, December 2021
CBHDS Sponsors:
- Ian Crandell
- Xin Xing
- Alexandra Hanlon
- Our project objectives were to study the impact of COVID-19 on the Virginia real estate market and explore inter- and intra-county mobility trends. We were able to answer the following research question: How did COVID-19 impact home values and mobility trends throughout Virginia?
- Analytic methods: Linear Regression and Spatial Regression
- Visual Analytics: An interactive application with Choropleth Maps and Time Series Plots
Team's Story
This project started off very open-ended, and we were responsible for coming up with our own research question. With the help of our client and coach, we decided to focus on home values, COVID-19 cases, and mobility trends in Virginia at the county level. From there, our group worked together to produce a final product, while meeting with our client and coach on a weekly basis to brainstorm ideas and go over our progress.
Although we had experience working with data in the past, we ran into some novel concepts such as spatial regression and creating dashboards that took some time to get accustomed to. This project was one of our first experiences working on a team for an extended period of time, and it helped us better understand the importance of team dynamics and time management. Overall, working with CBHDS was a great experience, and we learned a lot throughout the process.

Project Team Members:
- Yohannes Afework, B.S. in Computational Modeling and Data Analytics, May 2021
- Devon Lee, B.S. in Computational Modeling and Data Analytics, May 2021
- Jaffar Shaik, B.S. in Computational Modeling and Data Analytics, May 2021
- Naod Teklie, B.S. in Computational Modeling and Data Analytics, May 2021
CBHDS Sponsors:
- Ian Crandell
- Alicia Lozano
- Alexandra Hanlon

- How has COVID-mandated social distancing affected United States mental health outcomes at the State level?
- Visual analytics: an interactive application to demonstrate the change in Mental Health Scores week by week during the pandemic
Team's Story
Over the course of the semester, Team New Horizons has been working with CBHDS to study the effects of social distancing on mental health for their Computational Modeling and Data Analytics (CMDA) capstone project. The team consists of Jeff Straw, Demory Williamson, Xumanning Luo, and Bella Marku, all of whom are seniors in CMDA. Their primary motivation for choosing the topic was the opportunity to learn more about how the pandemic has affected people emotionally, not just physically.
The project incorporated both mobility data from Google and mental health data from the CDC to determine how mental health symptoms changed in relation to the amount of time spent at home, as compared to a baseline from before the pandemic began. To display their results, the team developed an interactive web dashboard where users can click on states to obtain that state’s specific mental health results since the start of the pandemic. The biggest challenge the team faced was finding publicly available data, especially since the pandemic is still ongoing, but the datasets they incorporated allowed them to create a comprehensive dashboard that provides significant information for its users.

Project Team Members:
- Xumanning Luo, B.S. in Computational Modeling and Data Analytics, May 2021
- Bella Marku, B.S. in Computational Modeling and Data Analytics, May 2021
- Jeff Straw, B.S. in Computational Modeling and Data Analytics, May 2021
- Demory Williamson, B.S. in Computational Modeling and Data Analytics, May 2021
CBHDS Sponsors:
- Ian Crandell
- Kevin McKee
- Alexandra Hanlon

- How do remediation measures affect species spread?
- Analytic methods: generalized linear models and classification trees
- Visual analytics: an interactive application to demonstrate the effect of various parameters and conditions

Team's Story
Our CMDA capstone project began with the unexpected challenge of working around COVID-19, but ultimately the project provided us with an entirely unique and informative experience. Extensive collaboration on such a large project was difficult to carry out solely over the internet. However, with the guidance of Dr. Alexandra Hanlon and Jennifer West, we became familiar with working efficiently online. Dr. Hanlon and Jennifer were extremely helpful in advising us and providing us with the resources to succeed. We couldn’t have done this without them!
As unusual as this semester has been, it went by extremely fast. Our entire team feels like we just became acquainted with the CBHDS team. This seemingly short experience has provided us with valuable skills not only in data science but, even more so, in teamwork and communication. It was exciting to be able to apply what we have learned in the classroom to a real-world problem, and a privilege to do so under the guidance of our mentors.


Project Team Members:
- Evan Mitchell, B.S. in Computational Modeling and Data Analytics, May 2021
- Colton Mumley, B.S. in Computational Modeling and Data Analytics, December 2020
- Akshay Patel, B.S. in Computational Modeling and Data Analytics, May 2021
CBHDS Sponsors:
- Jennifer West
- Alexandra Hanlon
