The Future of Generative Artificial Intelligence in Scientific Research

Frontiers in Cell and Developmental Biology, a peer-reviewed journal, published an article titled “Cellular functions of spermatogonial stem cells in relation to JAK/STAT signaling pathway” on Feb. 13. Within a day of publication, readers pointed out that the figures were clearly generated with artificial intelligence — full of spelling errors, nonsensical diagrams and an anatomically incorrect, well-endowed rat. Within three days, Frontiers retracted the article.

The retracted article is just one example contributing to growing concern regarding AI’s applications in science. AI can be a powerful tool that spearheads innovation and development. However, it also presents a series of intensely debated ethical, legal and scientific challenges.

Generative AI has developed rapidly within the last decade, in part building on the Generative Adversarial Network proposed by Ian Goodfellow and his colleagues in 2014. GANs allow AI models to create new content, including text, figures and videos, differentiating GenAI from classification and decision-making AI tools.

A recent Cornell University report details the potential benefits of GenAI at every stage of academic research — planning studies, organizing research, collecting data, disseminating findings and securing funding. This includes using GenAI tools to summarize large amounts of data, improve clarity of writing and synthesize information from various sources.

While GenAI could increase research efficiency and accuracy, there are several key challenges to using GenAI in academic research, according to Cornell professors.

“Generative AI can output text or images that perpetuate errors (including misinformation and hallucinations) or invoke stereotypes that are not representative of the world,” wrote Prof. Allison Koenecke, information science, in an email to The Sun.

If the use and outputs of GenAI are not critically examined, there may be a reduction in research quality and accuracy, according to Prof. James Grimmelmann, Cornell Tech and law.

“[The article in Frontiers is an example of] people using generative AI to make things up, to automate the process of writing parts of a paper or illustrating it and to present garbage produced by the AI as though it were scientific research,” Grimmelmann said.

The use of GenAI to create figures in the Frontiers article was obvious to any reader, resulting in the social media storm that occurred shortly after its publication. However, there may be more subtle risks of using GenAI in scientific research.

According to Grimmelmann, using GenAI for routine sections of scientific papers, such as backgrounds, literature reviews and abstracts, opens up the possibility of GenAI getting things wrong. Unlike the errors in the AI-generated images, errors in these sections may appear reasonably correct at first glance.

“If [GenAI] is just a shortcut, it might be a shortcut that leads to lower quality,” Grimmelmann said.

Additionally, while data fabrication has always been a concern in scientific research, GenAI may allow people to make up or alter datasets at a large scale while maintaining plausibility. Tools are being developed to detect such fraud, but they cannot yet do so with high accuracy.

According to Prof. David Mimno, information science, another concern is that many researchers using GenAI tools may incorrectly believe they are producing the intended output or accurate results. Rather, GenAI can change lines of code or alter data — making its outputs seem more plausible — without the researcher’s intent.

As AI tools continue to develop, there has been an increase in actions and dialogues across governments, legal systems, research institutions and developers to maximize the benefits of GenAI while minimizing risks.

For example, in October 2023, President Joe Biden issued an Executive Order on the Safe, Secure and Trustworthy Development and Use of Artificial Intelligence. The EO set out safety and security standards for AI technology and sought to safeguard privacy rights, reduce algorithmic bias and promote innovation.

The intersection between AI and science does not stop at research publications. In the U.S., there are more than 20 active legal cases regarding the use of GenAI tools, intellectual property rights and data privacy.

According to Grimmelmann, AI-generated content is not currently considered eligible for copyright or patent protection in the U.S. because it is not considered to be produced by a human. Despite this separation of human and AI, the prompts used to generate outputs, the data used to train algorithms and the programming of AI tools may be considered to be produced by an individual.

“We’re going to have a lot of hard questions about how much human contribution is enough to let the human author take credit for what comes out of the generative AI,” Grimmelmann said.

In academia, scientific journals and editorial boards grapple with disclosure rules for scientific research. The Cornell report proposes a series of duties for researchers using GenAI in their studies — discretion on the use of data and GenAI tools, verification of the outputs, disclosure of GenAI use and responsibility for every researcher to adhere to the standards of GenAI use.

For Mimno, trust is a major component of this discussion.

“A lot of academia and research works on volunteer labor where we are supporting each other as a community to make work better and to find science and truth,” Mimno said. “Ultimately we have to trust each other that we’re not making things up, we’re not creating images that don’t actually mean anything.”

Taylor Rijos can be reached at [email protected].

The Cornell Daily Sun - Independent Since 1880

May 2, 2024

By Science Department | May 2, 2024