AI CHATBOTS ARE GETTING WORSE OVER TIME — ACADEMIC PAPER

Last updated: June 20, 2025, 00:06 | Written by: Dan Larimer


Imagine a world where AI chatbots, once hailed as the pinnacle of technological advancement, are actually becoming less reliable. A recent academic paper has stirred the AI community by suggesting just that: AI chatbots are getting worse over time, despite the continuous release of newer, more robust models. This might seem counterintuitive, especially considering the massive investments and breakthroughs in large language models (LLMs) over the past few years. But the research, published in the journal Nature, points to a concerning trend of increasing inaccuracies and a reluctance to admit ignorance among these advanced systems. The implications are far-reaching, affecting everything from academic research and content creation to everyday interactions with AI assistants. To be clear, LLM performance has improved overall in some respects: early LLMs failed at simple additions such as 20 + 183, whereas current models successfully perform additions involving more than 50 digits. This article dives deep into this surprising phenomenon, exploring the findings of the academic paper, examining the potential causes behind the decline, and discussing the implications for the future of AI. We'll also explore what actions can be taken to mitigate these challenges and ensure that AI chatbots remain a valuable and reliable tool.

The Shocking Revelation: AI Chatbots Declining in Accuracy

The academic paper, titled "Larger and more instructable language models become less reliable," presents compelling evidence that today's sophisticated AI chatbots are making more mistakes than their earlier versions. This finding challenges the conventional wisdom that continuous development and increased model size automatically translate into improved performance. The researchers meticulously analyzed the responses of ten widely used chatbots, including prominent models such as ChatGPT-4o, ChatGPT-4.5, and DeepSeek, by tasking them with summarizing nearly 5,000 scientific studies. The results revealed a surprising and troubling trend.

While it's true that LLM performance has improved in certain areas, like complex arithmetic (early LLMs struggled with even simple additions, whereas current models handle additions involving over 50 digits), the study highlights a decline in other critical aspects, such as:

  • Accuracy: The chatbots were more prone to generating incorrect summaries of the scientific studies.
  • Honesty: The newer versions exhibited a greater tendency to provide wrong answers rather than admit they didn't know the answer.
  • Instruction Following: GPT-4's ability to follow instructions deteriorated over time, contributing to performance drops.

Delving Deeper: Understanding the Reasons Behind the Decline

So, why are AI chatbots seemingly getting worse? Several factors could be contributing to this unexpected phenomenon.

Model Drift and Overfitting

One potential explanation is model drift. As LLMs are continuously updated with new data, they can gradually deviate from their original training objectives. This can lead to a phenomenon known as overfitting, where the model becomes excessively specialized in the new data and loses its ability to generalize to previously learned information. The academic paper explicitly notes that GPT-4's behavior drift, along with a decline in its ability to follow instructions, partially explained its performance drops. To analyze these shifts, the researchers collected all curated prompts and the responses from GPT-4 and GPT-3.5 in both March and June.

Imagine a student who crams for an exam by memorizing specific examples rather than understanding the underlying concepts. While they might ace the exam, they may struggle to apply their knowledge to new, unfamiliar situations. Similarly, LLMs that are constantly retrained on new data without proper oversight might lose their ability to provide accurate and reliable responses to a wide range of queries.
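The March-versus-June comparison described above can be sketched as a simple drift check: run the same fixed probe set against two snapshots of a model and compare their accuracy. This is an illustrative sketch only; the probe questions, the exact-match scoring, and the function names are assumptions, not the paper's actual methodology.

```python
# Hypothetical sketch: detecting behavior drift between two snapshots of a
# model by scoring their answers on a fixed probe set. Probes and exact-match
# scoring are illustrative assumptions, not the study's real protocol.

PROBES = {
    "What is the capital of Australia?": "canberra",
    "What is 20 + 183?": "203",
    "Does water boil at 90 C at sea level?": "no",
}

def accuracy(answers: dict) -> float:
    """Fraction of probe questions answered correctly (case-insensitive exact match)."""
    correct = sum(
        1 for question, expected in PROBES.items()
        if answers.get(question, "").strip().lower() == expected
    )
    return correct / len(PROBES)

def drift_report(march_answers: dict, june_answers: dict) -> dict:
    """Compare two snapshots of the same model on the same probe set."""
    march_acc = accuracy(march_answers)
    june_acc = accuracy(june_answers)
    return {
        "march_accuracy": march_acc,
        "june_accuracy": june_acc,
        "drifted": june_acc < march_acc,
    }

# Example: the June snapshot regresses on one probe.
march = {
    "What is the capital of Australia?": "Canberra",
    "What is 20 + 183?": "203",
    "Does water boil at 90 C at sea level?": "no",
}
june = {**march, "What is 20 + 183?": "213"}

report = drift_report(march, june)
print(report["drifted"])  # True: accuracy dropped between snapshots
```

The point of keeping the probe set frozen is that any score movement is attributable to the model, not to the benchmark.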

The Influence of AI-Generated Content

Another concerning aspect is the potential for AI-generated content to contaminate academic research. Experts, including Matt Hodgkinson, a council member of the U.K.-based Committee on Publication Ethics, are worried about the increasing presence of AI-generated judgments and text in academic papers. If LLMs are trained on data that already contains AI-generated errors, it can create a feedback loop, further exacerbating the problem of inaccuracies and unreliable information. This raises significant ethical concerns about the integrity of scientific research and the need for greater scrutiny of AI-generated content.

The Quest for "Helpfulness" Over Accuracy

The drive to make AI chatbots more "helpful" might inadvertently be compromising their accuracy. As developers strive to create systems that can answer a wide range of questions and provide quick solutions, they may be prioritizing fluency and confidence over factual correctness. This can lead to chatbots that confidently provide wrong answers, creating a misleading sense of authority and trustworthiness. In some cases, the goal is to create chatbots that seem agreeable and avoid controversial responses, which further compromises the information they deliver.

Real-World Implications and Concerns

The declining accuracy of AI chatbots has significant implications across various domains.

  • Academic Research: If AI-generated summaries of scientific studies are becoming less reliable, researchers risk building their work on flawed information, potentially leading to erroneous conclusions.
  • Content Creation: Writers and journalists who rely on AI chatbots for research and content generation may inadvertently introduce inaccuracies into their work, undermining the credibility of their publications.
  • Customer Service: Businesses that use AI chatbots for customer support may provide inaccurate or misleading information to their customers, damaging their reputation and customer satisfaction.
  • Decision-Making: Individuals and organizations that rely on AI chatbots for making important decisions may be making those decisions based on faulty data, potentially leading to negative outcomes.

Consider a scenario where a medical professional uses an AI chatbot to research a new treatment option. If the chatbot provides an inaccurate summary of the relevant scientific studies, the doctor might make an incorrect diagnosis or prescribe an ineffective treatment, potentially harming the patient.

Addressing the Challenges: Strategies for Improvement

While the trend of declining accuracy in AI chatbots is concerning, it's not irreversible. Several strategies can be implemented to address these challenges and improve the reliability of these systems.

Improved Training Data and Techniques

One crucial step is to improve the quality and diversity of the training data used to develop LLMs. This includes:

  • Curating high-quality datasets: Ensuring that the training data is accurate, unbiased, and representative of the real-world scenarios in which the chatbot will be used.
  • Implementing data augmentation techniques: Expanding the training dataset with synthetic data to improve the model's robustness and generalization ability.
  • Using reinforcement learning with human feedback (RLHF): Training the model to align with human preferences and values through feedback from human experts.
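The curation step above can be illustrated with a minimal quality filter over raw training records. The specific heuristics here (minimum length, exact-duplicate removal, a banned-marker list) are assumptions chosen for illustration; production curation pipelines are far more involved.

```python
# Hypothetical sketch of a training-data quality filter. The thresholds and
# marker list are illustrative assumptions, not a real pipeline's settings.

def curate(records: list[str]) -> list[str]:
    """Keep records that pass basic quality checks, with duplicates removed."""
    BANNED_MARKERS = ("lorem ipsum", "click here to subscribe")
    seen = set()
    kept = []
    for text in records:
        normalized = " ".join(text.split()).lower()
        if len(normalized) < 20:  # too short to be informative
            continue
        if normalized in seen:  # exact duplicate of a kept record
            continue
        if any(marker in normalized for marker in BANNED_MARKERS):  # boilerplate
            continue
        seen.add(normalized)
        kept.append(text)
    return kept

raw = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Photosynthesis converts light energy into chemical energy in plants.",
    "ok",
    "Lorem ipsum dolor sit amet, click here to subscribe for more content.",
]
print(len(curate(raw)))  # 1: the duplicate, short, and boilerplate records are dropped
```

Filters like this matter for the feedback-loop problem discussed earlier: the same machinery can be extended with detectors for suspected AI-generated text before it re-enters the training mix.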

Enhanced Monitoring and Evaluation

Continuous monitoring and evaluation of LLM performance are essential to detect and address any signs of decline. This involves:

  • Developing robust evaluation metrics: Creating metrics that accurately measure the accuracy, reliability, and honesty of the chatbot's responses.
  • Implementing regular testing and benchmarking: Conducting regular tests to assess the chatbot's performance on a variety of tasks and compare it to previous versions.
  • Establishing feedback mechanisms: Providing users with a way to report errors and inaccuracies, allowing developers to quickly identify and address potential problems.
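The evaluation metrics above can be made concrete with a small sketch that scores both accuracy and the honesty dimension the paper highlights: how often the model abstains ("I don't know") rather than giving a confident wrong answer. The response format and the toy data are assumptions for illustration.

```python
# Hypothetical sketch of two evaluation metrics: accuracy, plus an honesty
# rate measuring how often incorrect responses were honest abstentions
# rather than confident wrong answers. Data and schema are illustrative.

def evaluate(responses: list[dict]) -> dict:
    """Each response is {'correct': bool, 'abstained': bool}."""
    n = len(responses)
    acc = sum(r["correct"] for r in responses) / n
    # Among responses that were not correct, count honest abstentions.
    not_correct = [r for r in responses if not r["correct"]]
    honesty = (
        sum(r["abstained"] for r in not_correct) / len(not_correct)
        if not_correct
        else 1.0
    )
    return {"accuracy": acc, "honesty": honesty}

# Older model: fewer correct answers, but every miss is an abstention.
old_model = [{"correct": True, "abstained": False}] * 6 + \
            [{"correct": False, "abstained": True}] * 4
# Newer model: slightly more correct, but every miss is a confident wrong answer.
new_model = [{"correct": True, "abstained": False}] * 7 + \
            [{"correct": False, "abstained": False}] * 3

print(evaluate(old_model))  # accuracy 0.6, honesty 1.0
print(evaluate(new_model))  # accuracy 0.7, honesty 0.0
```

Tracking both numbers across versions is what turns regular benchmarking into a regression test: a model whose accuracy ticks up while its honesty collapses would still be flagged.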

Promoting Transparency and Accountability

Transparency and accountability are crucial for building trust in AI chatbots. This includes:

  • Disclosing the limitations of the model: Clearly stating the areas where the chatbot is likely to make mistakes or provide inaccurate information.
  • Providing explanations for the chatbot's responses: Allowing users to understand the reasoning behind the chatbot's answers, increasing transparency and trust.
  • Establishing clear lines of responsibility: Identifying who is responsible for the accuracy and reliability of the chatbot's responses.

The Role of Antitrust Authorities

Beyond the technical aspects, regulatory bodies also play a crucial role. The G7 nations' antitrust authorities are actively scrutinizing anti-competitive threats in the AI space, and have signaled that they may take vigorous enforcement action to protect competition in the artificial intelligence sector and tackle risks before they become significant. Their goal is to prevent the misuse of AI technology and ensure a fair and competitive market, which can indirectly promote more responsible and ethical AI development. Preventing any one company from dominating the AI landscape could also encourage varied approaches to AI development that prioritize accuracy and reliability.

The Human Element: Why Critical Thinking Remains Essential

Despite the advances in AI technology, human oversight remains essential. Users should approach AI chatbot responses with a healthy dose of skepticism and critical thinking. Always double-check the information provided by chatbots, especially when dealing with important decisions or sensitive topics. Treat chatbots as tools that enhance human capabilities, not as replacements for human judgment.

Examples of AI Chatbot Failures and Misuse

The concerns surrounding AI chatbot accuracy aren't just theoretical. Several real-world examples illustrate the potential for these systems to be misused or to generate inaccurate information with significant consequences. For example:

  • The Freysa Incident: In an adversarial agent game, a participant convinced "Freysa," an autonomous AI bot tasked with guarding a prize pool, to transfer them over $47,000 worth of prize money. This demonstrates the vulnerability of even sophisticated AI systems to manipulation.
  • Inaccurate Medical Advice: Chatbots providing incorrect or incomplete medical advice, leading to inappropriate self-treatment.
  • Spread of Misinformation: AI-powered bots being used to spread false information and propaganda on social media, exacerbating social and political divisions.

What to Expect in the Future: Trends and Predictions

The future of AI chatbots is uncertain, but several trends and predictions can be made based on the current trajectory:

  1. Increased Specialization: We may see a shift toward more specialized AI chatbots that are trained for specific tasks or domains, rather than general-purpose assistants. This could lead to improved accuracy and reliability within those specific areas.
  2. Hybrid AI Systems: The integration of AI chatbots with human experts will likely become more common. This combines AI's efficiency with human critical thinking, leading to more reliable and trustworthy results.
  3. Focus on Explainability: Research efforts will focus on developing AI models that are more transparent and explainable, allowing users to understand how the chatbot arrived at its conclusions.
  4. Regulation and Oversight: Governments and regulatory bodies will likely play a larger role in overseeing the development and deployment of AI chatbots, ensuring that they are used ethically and responsibly.

Common Questions About AI Chatbot Accuracy

Are all AI chatbots getting worse?

While the academic paper highlights a concerning trend, it's important to note that not all AI chatbots are necessarily declining in accuracy. Some models may be improving in certain areas, while others may be experiencing a decline in different aspects. The overall trend, however, suggests a need for greater vigilance and attention to the potential for declining performance.

How can I tell if an AI chatbot is providing inaccurate information?

There's no foolproof way to guarantee the accuracy of an AI chatbot's response. However, you can take the following steps:

  • Cross-reference the information: Verify the information provided by the chatbot with other reliable sources, such as reputable websites, academic journals, or expert opinions.
  • Look for inconsistencies: Be wary of responses that contradict each other or that seem illogical.
  • Consider the source: Assess the credibility of the website or platform hosting the chatbot.
  • Trust your gut: If something doesn't seem right, trust your intuition and seek further clarification.

What can I do to help improve the accuracy of AI chatbots?

As a user, you can contribute to the improvement of AI chatbots by:

  • Reporting errors: Use the feedback mechanisms provided by the chatbot to report any inaccuracies or inconsistencies you encounter.
  • Providing constructive feedback: Share your thoughts and suggestions with the developers to help them improve the chatbot's performance.
  • Promoting responsible use: Encourage others to use AI chatbots responsibly and critically, rather than blindly trusting their responses.

Conclusion: Navigating the Future of AI Chatbots

The academic paper's findings serve as a crucial wake-up call, reminding us that progress in AI is not always linear and that continuous monitoring and evaluation are essential. While the initial promise of AI chatbots was to revolutionize various aspects of our lives, the reality is more nuanced. The dwindling consumer interest in chatbots, which reportedly caused a drop in AI-sector revenues during the second business quarter of 2025, is a clear indicator of growing public awareness of these limitations. The revelation that AI chatbots are getting worse over time underscores the need for a more balanced approach that prioritizes accuracy, transparency, and human oversight. Moving forward, developers, researchers, and users must collaborate to address these challenges and ensure that AI chatbots remain a valuable and reliable tool, not a source of misinformation and unintended consequences.

The key takeaways from this article are:

  • A recent academic paper suggests that AI chatbots are getting worse over time in certain areas, such as accuracy and honesty.
  • Several factors could be contributing to this decline, including model drift, overfitting, and the influence of AI-generated content.
  • The declining accuracy of AI chatbots has significant implications across various domains, including academic research, content creation, and customer service.
  • Strategies can be implemented to address these challenges and improve the reliability of AI chatbots, including improved training data, enhanced monitoring, and promoting transparency.
  • Human oversight and critical thinking remain essential when using AI chatbots.

By staying informed, being critical, and actively participating in the development and improvement of AI chatbots, we can ensure that these powerful tools are used responsibly and ethically, ultimately benefiting society as a whole.

Dan Larimer can be reached at [email protected].
