Oxford Researchers Develop Method to Detect AI Hallucinations, Enhancing Accuracy

Researchers at Oxford University have developed a new method to detect AI hallucinations, improving the reliability of generative AI tools like ChatGPT. The approach is a significant step toward more accurate AI systems.

A persistent problem with generative AI tools like ChatGPT is that they can confidently generate wrong information, a behavior known as “hallucination.” Recent examples include Air Canada being held to a discount that its customer-service chatbot wrongly promised and Google’s AI stating that ingesting rocks is safe.

Sebastian Farquhar, an author of the study, is a senior research fellow at Oxford University’s department of computer science and a research scientist on Google DeepMind’s safety team. “I hope that this opens up ways for large language models to be deployed where they can’t currently be deployed, where a little bit more reliability than is currently available is needed,” says Farquhar.

A recent advance may help with this problem. In a study published in the journal Nature, the researchers describe a new way of detecting AI hallucinations. The approach takes a question and the AI-generated answer and judges whether that answer is likely to be correct; it does so with approximately 79% accuracy, which is higher than current solutions. The method addresses only one of the causes of AI hallucinations and demands more computation, but it may lead to more accurate AI systems in the future.

The research team focused on one form of hallucination known as “confabulations,” in which an AI model gives inaccurate and inconsistent responses to well-defined questions. By detecting such confabulations, the researchers hope to improve the correctness of AI-generated responses.

Procedure

The method has a chatbot generate multiple responses to the same question, then uses a language model to group the responses by equivalent meaning. To measure how much the meanings vary across those groups, the researchers use a quantity called semantic entropy. A high semantic entropy score indicates that the model is likely confabulating, while a low score means the answers are consistent in meaning and are therefore less likely to be a hallucination.
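The idea can be illustrated with a minimal Python sketch. This is not the researchers' code: the function names are hypothetical, and the `toy_same_meaning` check (plain text normalization) is an assumed stand-in for the language model that would judge whether two responses mean the same thing. The score here is simply the Shannon entropy of how the sampled responses spread across meaning groups.

```python
import math
from typing import Callable, List


def cluster_by_meaning(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> List[List[str]]:
    """Greedily group answers; each answer joins the first group it matches."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            # No existing group matched, so start a new meaning group.
            clusters.append([answer])
    return clusters


def semantic_entropy(
    answers: List[str], same_meaning: Callable[[str, str], bool]
) -> float:
    """Shannon entropy (in nats) over the share of answers in each group."""
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    return sum(
        (len(c) / total) * math.log(total / len(c)) for c in clusters
    )


def toy_same_meaning(a: str, b: str) -> bool:
    """Assumed stand-in for a language model: ignore case and punctuation."""
    def normalize(text: str) -> str:
        return "".join(ch for ch in text.lower() if ch.isalnum())
    return normalize(a) == normalize(b)


consistent = ["Paris", "paris.", "Paris"]
scattered = ["Paris", "Lyon", "Marseille"]
print(semantic_entropy(consistent, toy_same_meaning))  # one group: 0.0
print(semantic_entropy(scattered, toy_same_meaning))   # three groups: higher
```

When every sampled response lands in a single meaning group the score is zero; the more evenly the responses scatter across different meanings, the higher the score, which is the signal the article describes as a likely confabulation.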

Despite these limitations, the development of a method to detect AI hallucinations is a significant step toward more accurate and reliable AI systems. As AI continues to permeate various aspects of our lives, the importance of ensuring that these tools provide accurate and trustworthy information cannot be overstated. This research brings us one step closer to a future where AI can be relied upon to deliver accurate answers consistently.

This post was last modified on June 22, 2024 12:45 am

Tech Chilli Desk