New York Tech Media

What Are LLM Hallucinations? Causes, Ethical Concern, & Prevention

by New York Tech Editorial Team
April 29, 2023
in AI & Robotics

Large language models (LLMs) are artificial intelligence systems capable of analyzing and generating human-like text. But they have a problem: LLMs hallucinate, i.e., they make things up. Hallucinations worry researchers because if the output of these models cannot be controlled, the models cannot be trusted as the basis for critical systems that serve humanity. More on this later.

Generally, LLMs use vast amounts of training data and complex learning algorithms to generate realistic outputs. In some cases, in-context learning is used to adapt these models to a new task using only a few examples supplied in the prompt, without retraining the model. LLMs are becoming increasingly popular across application areas such as machine translation, sentiment analysis, virtual AI assistance, image annotation, and other natural language processing tasks.

Despite the cutting-edge nature of LLMs, they are still prone to biases, errors, and hallucinations. Yann LeCun, current Chief AI Scientist at Meta, recently mentioned the central flaw in LLMs that causes hallucinations: “Large language models have no idea of the underlying reality that language describes. Those systems generate text that sounds fine, grammatically, and semantically, but they don’t really have some sort of objective other than just satisfying statistical consistency with the prompt”.

Hallucinations in LLMs

Image by Gerd Altmann from Pixabay

Hallucinations refer to the model generating outputs that are syntactically and semantically correct but are disconnected from reality, and based on false assumptions. Hallucination is one of the major ethical concerns of LLMs, and it can have harmful consequences as users without adequate domain knowledge start to over-rely on these increasingly convincing language models.

A certain degree of hallucination is inevitable across all autoregressive LLMs. For example, a model can attribute a fabricated quote to a celebrity who never said it. It may assert something factually incorrect about a particular topic or cite non-existent sources in research papers, thus spreading misinformation.

However, getting AI models to hallucinate does not always have adverse effects. For example, a new study suggests scientists are unearthing ‘novel proteins with an unlimited array of properties’ through hallucinating LLMs.

What Causes LLM Hallucinations?

LLMs can hallucinate due to various factors, ranging from overfitting and errors in encoding and decoding to training bias.

Overfitting

Image by janjf93 from Pixabay

Overfitting occurs when an AI model fits its training data too closely and, as a result, fails to generalize its predictive power to new, unseen data. Because such a model cannot represent the full range of inputs it may encounter, overfitting can lead it to produce hallucinated content.
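
To make the idea concrete, here is a minimal toy sketch in Python. It is not an LLM: it assumes a tiny made-up dataset (a straight line with alternating noise) and compares a hypothetical "memorizer" that simply returns the label of the nearest training point with an ordinary least-squares line. All function names and data are illustrative assumptions, not something taken from a real system.

```python
def mse(predict, xs, ys):
    """Mean squared error of a prediction function on (xs, ys)."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """'Overfit' model: returns the label of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


# Assumed toy data: the true relation is y = 2x, observed with +1/-1 noise.
train_x = [float(i) for i in range(10)]
train_y = [2 * x + (1 if i % 2 == 0 else -1) for i, x in enumerate(train_x)]

# Held-out points between the training points, labeled with the true relation.
test_x = [x + 0.25 for x in train_x]
test_y = [2 * x for x in test_x]

line = fit_line(train_x, train_y)
memorizer = fit_memorizer(train_x, train_y)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name,
          "train MSE:", round(mse(model, train_x, train_y), 3),
          "test MSE:", round(mse(model, test_x, test_y), 3))
```

The memorizer reproduces the training set perfectly, noise included, yet it does worse than the simple line on the held-out points. That gap between training and held-out error is what overfitting means here.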

Encoding and Decoding Errors

Image by geralt from Pixabay

If there are errors in the encoding and decoding of text and its subsequent representations, this can also cause the model to generate nonsensical and erroneous outputs.

Training Bias

Image by Quince Creative from Pixabay

Another factor is the presence of certain biases in the training data, which can cause the model to give results that reflect those biases rather than the actual nature of the data. A related problem is a lack of diversity in the training data, which limits the model’s ability to generalize to new data.

The complex structure of LLMs makes it quite challenging for AI researchers and practitioners to identify, interpret, and correct these underlying causes of hallucinations.

Ethical Concerns of LLM Hallucinations

LLMs can perpetuate and amplify harmful biases through hallucinations, which can in turn harm users and have detrimental social consequences. Some of the most important ethical concerns are listed below:

Discriminatory and Toxic Content

Image by ar130405 from Pixabay

LLM training data is often full of sociocultural stereotypes due to inherent biases and a lack of diversity. LLMs can thus produce and reinforce these harmful ideas against disadvantaged groups in society.

They can generate discriminatory and hateful content based on race, gender, religion, ethnicity, etc.

Privacy Issues

Image by JanBaby from Pixabay

LLMs are trained on a massive training corpus which often includes the personal information of individuals. There have been cases where such models have violated people’s privacy. They can leak specific information such as social security numbers, home addresses, cell phone numbers, and medical details.

Misinformation and Disinformation

Image by geralt from Pixabay

Language models can produce human-like content that seems accurate but is, in fact, false and not supported by empirical evidence. This can be accidental, leading to misinformation, or it can be done with malicious intent to knowingly spread disinformation. If this goes unchecked, it can have adverse social, cultural, economic, and political consequences.

Preventing LLM Hallucinations

Image by athree23 from Pixabay

Researchers and practitioners are taking various approaches to address the problem of hallucinations in LLMs. These include improving the diversity of training data, eliminating inherent biases, using better regularization techniques, and employing adversarial training and reinforcement learning, among others:

  • Developing better regularization techniques is at the core of tackling hallucinations. They help prevent overfitting and other problems that cause hallucinations.
  • Data augmentation can reduce the frequency of hallucinations, as evidenced by a research study. In that study, the training set was augmented by adding a random token at an arbitrary position in each sentence, which doubled the size of the training set and decreased the frequency of hallucinations.
  • OpenAI and Google’s DeepMind developed a technique called reinforcement learning from human feedback (RLHF), which OpenAI has applied to ChatGPT and which may help with its hallucination problem. It involves human evaluators who review the model’s responses and pick out the most appropriate ones for the user prompts. This feedback is then used to adjust the behavior of the model. Ilya Sutskever, OpenAI’s chief scientist, recently mentioned that this approach can potentially resolve hallucinations in ChatGPT: “I’m quite hopeful that by simply improving this subsequent reinforcement learning from the human feedback step, we can teach it to not hallucinate”.
  • Identifying hallucinated content to use as examples for future training is another way to tackle hallucinations. A novel technique in this regard detects hallucinations at the token level by predicting whether each token in the output is hallucinated. It also includes a method for unsupervised learning of hallucination detectors.
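
The data augmentation step described in the list above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure from the study itself: whitespace tokenization, the `vocab` list, and the function names `insert_random_token` and `augment` are all hypothetical choices made for this example.

```python
import random


def insert_random_token(sentence, vocab, rng):
    """Return a copy of `sentence` with one random token from `vocab`
    inserted at a random position (assumes whitespace tokenization)."""
    tokens = sentence.split()
    position = rng.randint(0, len(tokens))  # both ends inclusive
    tokens.insert(position, rng.choice(vocab))
    return " ".join(tokens)


def augment(sentences, vocab, seed=0):
    """Keep every original sentence and add one noisy copy of each,
    so the returned training set is twice as large."""
    rng = random.Random(seed)
    noisy = [insert_random_token(s, vocab, rng) for s in sentences]
    return list(sentences) + noisy


# Hypothetical example data.
training_set = ["the cat sat on the mat", "llms can hallucinate"]
noise_vocab = ["blue", "seven", "quickly"]

for sentence in augment(training_set, noise_vocab, seed=1):
    print(sentence)
```

Seeding the random generator keeps the augmented set reproducible, which makes it easier to compare training runs with and without the extra noisy copies.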

Token-level Hallucination Detection
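
As a rough illustration of what a per-token label looks like, the sketch below flags every output token that does not appear in the source text. This is a deliberately crude lexical-overlap heuristic swapped in for illustration; the technique mentioned above learns a detector rather than using a rule like this. The function names and the example sentences are assumptions made for this sketch.

```python
import string


def normalize(token):
    """Lowercase a token and strip surrounding punctuation."""
    return token.lower().strip(string.punctuation)


def label_tokens(source, output):
    """Return (token, is_flagged) pairs for each token in `output`.
    A token is flagged when its normalized form is absent from `source`."""
    source_vocab = {normalize(tok) for tok in source.split()}
    return [(tok, normalize(tok) not in source_vocab) for tok in output.split()]


# Hypothetical source text and model output.
source = "The meeting is on Tuesday in Paris."
output = "The meeting is on Friday in Paris."

labels = label_tokens(source, output)
flagged = [tok for tok, is_flagged in labels if is_flagged]
print(flagged)
```

A rule this simple will also flag harmless paraphrases and miss hallucinations that reuse source words, which is one reason learned detectors are of interest.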

Put simply, LLM hallucinations are a growing concern. Despite these efforts, much work still needs to be done to address the problem. The complexity of these models means it is generally challenging to correctly identify and rectify the inherent causes of hallucinations.

However, with continued research and development, mitigating hallucinations in LLMs and reducing their ethical consequences is possible.

If you want to learn more about LLMs and the preventive techniques being developed to rectify LLM hallucinations, check out unite.ai to expand your knowledge.

