Catherine Cai

April 19, 2019

Cancer Biology, Machine Learning, and Creative Problem Solving at Cornell’s Microbiome Hackathon

On Sunday, April 14, students participating in Cornell’s Microbiome Hackathon wrapped up 28 hours of work by showing off their creations, which included an app to predict one’s risk of neurological disease, a possible solution to disappearing honeybee populations and machine learning algorithms designed to predict the incidence of colorectal cancer.

This past weekend, the Microbiome Hackathon, hosted by the Brito Lab, took over Collegetown’s eHub location, where students crafted technological solutions using their knowledge of data science, computational biology and microbiology. Participants came from multiple colleges and ranged from undergraduates to postdocs.

Many of the projects focused on the microbiome. The microbiome refers to a community of both friendly and pathogenic bacteria that live in practically every environment and within our own digestive systems.

Scott Olesen, an event guest speaker and director of the non-profit OpenBiome, explained that the microbiome is a complex topic with a plethora of information. “Microbes are everywhere, which makes it a fun topic for a hackathon because you can slice it so many ways,” Olesen said.

Students formed seven teams addressing issues ranging from health diagnostics to fish farming, and devised strategies to acquire the necessary data.

Aman Agarwal grad said that his group related microbiome data back to a pressing environmental issue, the extinction of bees. His group proposed investigating whether a resilient bee population in Puerto Rico would be an ideal replacement for declining bee populations in Europe, all based on their gut microbiomes.

They compared the bacteria present in Puerto Rican bees and European bees, exploring whether Puerto Rican bees would have a better chance of thriving in Europe if the bacteria populations present were similar.

“I wanted to learn about microbiomes and this field,” Agarwal said, “[but] I had to learn on the fly when it came to the data.”

In fact, very few of the students were experts in both microbiome biology and data science — many came from one specialty and learned through their teammates and the event mentors about the others.

“The first goal is for the students to find what they are interested in and ask the right scientific questions. This is challenging as many of the students haven’t been exposed to either data science or biological topics such as the microbiome,” Prof. Ilana Brito, biomedical engineering, said.

The process was not without its setbacks and challenges: groups changed questions multiple times, were roadblocked by limited data sets and experienced the entire process of data analysis and interpretation within one weekend.

“[Students] should try lots of things, fail fast and optimize for feasibility,” speaker Claire Duvallet said.

Much of the student and mentor effort paid off as four groups left with $200 Best in Category prizes, and three groups left with $100 prizes for Honorable Mentions.

Best in Category Awards

Most Creative Question: Boops Boops

Boops Boops sought to confirm relationships between oceanic microbiomes and the number and type of fish caught in the ocean, and to determine whether those oceanic microbiomes could be recreated in fish farms to reduce fish stress.

This team compared the bacterial populations in wild and captive fish populations and found that there are bacteria exclusive to each population.

They hypothesized that introducing missing bacteria from the natural environment into fish farms would help reduce stress in captive fish because it would mimic the internal environment of wild fish populations. Judges were impressed at their ability to see applications of microbiome research outside of the healthcare sector.

Best Analysis: CarcinoBiome

CarcinoBiome sought to elucidate whether sialic acid catabolism differed in colorectal cancer (CRC) patients vs. healthy individuals, and whether a predictive relationship exists between sialic acid metabolism and the presence of cancer-associated strains.

This group used metagenomic datasets and correlation matrices in R to confirm that the bacterium F. nucleatum is associated with colorectal cancer. They suggested that disease could be detected based on the composition of a patient’s gut microbiome, impressing judges with the soundness of their analytical methods.

Most Creative Use of Methods: Forever Young

Forever Young asked whether the microbiome can be used to reduce oxidative stress, a factor of aging that is greatly reduced in the naked mole rat.

Using the rat as their study species, this group used datasets to find correlations between certain strains of bacteria present and the aging of the organism.

They approached microbiome research with a unique but ambitious goal in mind, focusing not on a particular disease but rather on aging as an overall process.

Best Presentation: Pro-bee-otics

Pro-bee-otics asked whether it was ecologically feasible to export Puerto Rican bees to Europe to try to reverse the decline of bee populations.

They compared the gut microbiomes of Puerto Rican bees with those of both thriving and non-thriving bee populations in Ireland, with the idea that similarities in composition to the thriving populations could point to a favorable outcome for exporting the species to Europe. With a simple but memorable presentation, they proposed a logical and feasible solution that impressed the judges.
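For readers curious what such a comparison can look like in practice, here is a minimal sketch in Python. It is not the team's actual method, which the article does not specify. It assumes Bray-Curtis dissimilarity, a common way to compare community composition, and every taxon name and count below is hypothetical.

```python
# Illustrative sketch only: the metric (Bray-Curtis) and all counts are
# assumptions, not the Pro-bee-otics team's actual analysis or data.

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two taxon-count dicts.

    0.0 means identical composition; 1.0 means no shared taxa.
    """
    taxa = set(a) | set(b)
    shared_gap = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    return shared_gap / total if total else 0.0

# Hypothetical gut microbiome counts per bee population.
candidate = {"taxon_1": 40, "taxon_2": 40, "taxon_3": 20}
thriving = {"taxon_1": 35, "taxon_2": 45, "taxon_3": 20}
declining = {"taxon_1": 10, "taxon_2": 20, "taxon_4": 70}

to_thriving = bray_curtis(candidate, thriving)
to_declining = bray_curtis(candidate, declining)
print(f"candidate vs thriving:  {to_thriving:.2f}")
print(f"candidate vs declining: {to_declining:.2f}")
```

In this toy example, a lower score against the thriving population than against the declining one would be read as the kind of similarity the team was looking for.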

Honorable Mentions

Presentation: McFly

McFly investigated whether there were changes in the microbiome associated with Parkinson’s disease and if that data could be harnessed to develop a better tool for diagnosing the illness in patients.

This group identified bacterial populations associated with Parkinson’s disease and designed an app that can measure risk for Parkinson’s based on a stool sample. Patients would simply download the app, send in their stool sample and receive a diagnosis straight to their phones.

Data Visualization: Kensho

Kensho asked whether machine learning was a competent tool in making lifestyle decisions.

“Kensho” means unlocking one’s true ability, which is what this team aimed to do using machine learning. This group used models of microbiome diversity from breastfeeding to suggest that data can help people make better decisions about their bodies, such as how long children should be breastfed for and what kind of diet the mother should maintain.

Analysis: Colonosco-PY

Colonosco-PY explored the microbial signatures of colorectal cancer.

This team designed their own machine learning algorithms based on the most prevalent bacteria in colorectal cancer patients’ microbiomes, using datasets from China, Germany and the United States. They then combined data from two of the sets to see how well the algorithm could predict the third based on the microbiomes.
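The train-on-two, test-on-one setup described above is often called leave-one-cohort-out validation. The sketch below shows the general idea in Python under loud assumptions: the nearest-centroid classifier is a stand-in chosen for brevity, not the team's algorithm, and the cohort names, taxa and abundances are all made up.

```python
# Illustrative sketch only: toy data and a nearest-centroid classifier,
# not the Colonosco-PY team's actual model or datasets.

def centroid(rows):
    """Mean abundance vector of a list of samples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(samples):
    """samples: list of (features, label). Returns one centroid per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(rows) for label, rows in by_label.items()}

def predict(model, features):
    """Assign the label whose centroid is closest to the sample."""
    return min(model, key=lambda label: sq_dist(model[label], features))

def leave_one_cohort_out(cohorts):
    """Train on all cohorts but one, then score on the held-out cohort."""
    scores = {}
    for held_out, test in cohorts.items():
        train = [s for name, data in cohorts.items()
                 if name != held_out for s in data]
        model = fit(train)
        correct = sum(predict(model, f) == y for f, y in test)
        scores[held_out] = correct / len(test)
    return scores

# Hypothetical relative abundances of two taxa per sample: (features, label).
cohorts = {
    "cohort_a": [([0.8, 0.1], "crc"), ([0.2, 0.7], "healthy")],
    "cohort_b": [([0.7, 0.2], "crc"), ([0.1, 0.8], "healthy")],
    "cohort_c": [([0.9, 0.2], "crc"), ([0.3, 0.6], "healthy")],
}
scores = leave_one_cohort_out(cohorts)
print(scores)
```

Holding out a whole cohort, rather than a random slice of samples, tests whether a microbial signature learned in some populations carries over to a population the model has never seen.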