Let's delve into the realm of ChatGPT for dental research, unveiling their potential and groundbreaking impact

Does this title sound weird to you? You are not alone. Together with Dr. Ilze Maldupa, every time we have to edit or review a paper full of such strange words, we wonder whether we are reviewing a paper written by ChatGPT. After reading reports from other fields describing the same phenomenon, we checked what was happening in the dental literature.

We looked at how many dental publications appear to have used ChatGPT. We first identified a set of words highly correlated with ChatGPT output, then counted how often these "signaling" words appeared in PubMed dental research abstracts. Here is what we found.
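If you want to try something similar, here is a minimal sketch of the counting step, assuming Biopython's Entrez interface to PubMed; the word list and the dentistry filter are illustrative assumptions, not the exact queries from our paper.

```python
# A minimal sketch of the counting step, using Biopython's Entrez
# interface to PubMed. The word list and query filter are illustrative
# assumptions, not the exact queries from the paper.
from Bio import Entrez

Entrez.email = "you@example.com"  # NCBI asks for a contact address

SIGNALING_WORDS = ["delve", "underscore"]  # illustrative subset

def yearly_count(word: str, year: int) -> int:
    """Count PubMed records in `year` whose title/abstract contains
    `word`, restricted to dentistry via a MeSH filter (an assumption)."""
    query = f"{word}[Title/Abstract] AND dentistry[MeSH Terms]"
    handle = Entrez.esearch(db="pubmed", term=query,
                            mindate=str(year), maxdate=str(year),
                            datetype="pdat", retmax=0)
    record = Entrez.read(handle)
    handle.close()
    return int(record["Count"])

counts = {w: {y: yearly_count(w, y) for y in range(2019, 2025)}
          for w in SIGNALING_WORDS}
```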


To make the findings comparable across years with different numbers of dental publications, we normalized the counts to occurrences per 10,000 publications.
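The normalization itself is a one-liner; here is a hedged sketch with made-up numbers:

```python
# Sketch of the normalization: scale each yearly word count by the total
# number of dental publications that year, expressed per 10,000 papers.
def per_10k(word_count: int, total_pubs: int) -> float:
    return word_count / total_pubs * 10_000

# Illustrative numbers only: 42 matches among 21,000 dental papers
print(per_10k(42, 21_000))  # 20.0 occurrences per 10,000 publications
```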

Then we looked at the individual words and found an exponential increase after the release of ChatGPT in November 2022.

For instance, the now-classic "delve"...


or " underscore"

Because some words that were practically unused before 2022 increased tens or hundreds of times, we had to plot the results on a log10 scale. You can see some of the words here; the rest are in the article here. (The link works only for 50 days, so hurry up!)

Some of the signaling words. Note that the y-axis is on a log10 scale!
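For anyone who wants to reproduce this kind of figure, here is a minimal plotting sketch with matplotlib; the rates below are invented placeholders, not our actual data.

```python
# A sketch of the log-scale figure, assuming normalized rates stored as
# {word: {year: rate_per_10k}}. The numbers are invented placeholders.
import matplotlib.pyplot as plt

rates = {
    "delve":      {2019: 0.5, 2020: 0.6, 2021: 0.7, 2022: 1.0, 2023: 12.0},
    "underscore": {2019: 1.0, 2020: 1.1, 2021: 1.2, 2022: 1.5, 2023: 9.0},
}

fig, ax = plt.subplots()
for word, by_year in rates.items():
    years = sorted(by_year)
    ax.plot(years, [by_year[y] for y in years], marker="o", label=word)

ax.set_yscale("log")  # rates span orders of magnitude after 2022
ax.set_xlabel("Publication year")
ax.set_ylabel("Occurrences per 10,000 dental abstracts")
ax.legend()
plt.show()
```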

However, to be sure that we were not seeing an artifact, we used three additional words: "periodontal" and "caries" as negative controls, since we assumed a roughly constant output of dental research on those topics, and "covid" as a positive control, since we expected to see an increase. When we compared the signaling words with these control words, here is what we observed:

"Periodontal" and "caries" stayed the same, while "covid" increased and then decreased, so we were not seeing an artifact.

So, people use ChatGPT to write papers. There is nothing new here. Whenever new technology, such as word processors, Google Translate, or Grammarly, appears, we use it.

However, Generative AI, like ChatGPT, brings new challenges, especially the risk of presenting the model's "hallucinations" as facts.

Also, there seems to be a trade-off between quantity and quality of research output. For example, what can you learn after reading this abstract?

Real research... by humans?

So maybe the deeper question is what drives this desire to publish at any cost. We believe there is a fundamental misalignment between academic incentives (more grants, more "Q1" publications), what society needs (better diagnosis, treatment, and health outcomes), and what researchers need (recognition, money, promotion).

More importantly, the paper explains why specific terms, such as "inquiry," are used over others and how this relates to the "P" in ChatGPT: how the model is pre-trained. To understand this, we must delve into... sorry, examine Reinforcement Learning from Human Feedback (RLHF), the way the model learns from humans. Here, the model presents some text, and human annotators decide whether, for example, the text looks academic.
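To make the idea concrete, here is a toy sketch of the preference step behind RLHF, assuming a Bradley-Terry-style reward model (a common formulation, not necessarily ChatGPT's exact recipe): annotators pick which of two candidate texts looks more academic, and the reward model learns to score the preferred one higher.

```python
# A toy sketch of the preference step in RLHF, assuming a Bradley-Terry
# style reward model (a common formulation; not ChatGPT's actual code).
# Annotators pick which of two candidate texts "looks more academic",
# and the reward model learns to score the preferred text higher.
import math

def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """P(annotator prefers `chosen` over `rejected`) under Bradley-Terry."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))

def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the annotator's recorded choice; the
    reward model is trained to make this small across many comparisons."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))

# If annotators consistently mark "delve"-laden prose as more academic,
# minimizing this loss pushes the model toward that vocabulary.
print(pairwise_loss(2.0, 0.5))  # ~0.20: model agrees with the annotator
```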

But who is annotating the data used to train these models? Here is a hint...

Who is annotating the AI models?

So, next time you read an AI paper boasting 99% accuracy for whatever, ask:

  • What data was used? Is it available?
  • Who annotated the data? Are they similar to you?
  • Using what criteria? Are those criteria similar to yours?
  • 99% compared to what? To the same data, or to data similar to yours?


But good news! These chatbots are starting to incorporate watermarks, making it easier for editors and reviewers to detect text produced by generative AI tools. I have nothing against Gen AI; if you read the acknowledgments of our paper, you will see that we used it ourselves, and disclosed it, because science must be transparent!

And speaking of transparency, as always, the data supporting these results is openly available here!


Read the full paper and download your free copy (free for 50 days only) here!

We would love to hear your thoughts on this subject.

  • How do you feel about using AI, like ChatGPT, in scholarly writing? Do you consider it a progression, or do you have reservations?
  • Do you believe there's a trade-off between the quantity and quality of research output when using AI tools for writing?
  • Have you encountered a paper or publication in which you suspected AI was used, based on its vocabulary or style?
