New York Tech Media

NLP Models Struggle to Understand Recursive Noun Phrases

by New York Tech Editorial Team
December 16, 2021
in AI & Robotics

Researchers from the US and China have found that none of the leading Natural Language Processing (NLP) models seems capable, by default, of unraveling English sentences that feature recursive noun phrases (NPs), and that they ‘struggle’ to individuate the central meaning in closely related examples such as My favorite new movie and My favorite movie (each of which has a different meaning).

In a headline example from the paper, here is a minor puzzle that children frequently fail to unpick: the second ball is green, but the fifth ball is the ‘second green ball’. Source: https://arxiv.org/pdf/2112.08326.pdf
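The distinction in the ball puzzle can be made concrete with a short Python sketch. This is only an illustration of the two readings, not anything taken from the paper: the row of colors and the helper names `nth_ball` and `nth_ball_of_color` are hypothetical, chosen so that the second ball is green and the fifth ball is the second green ball.

```python
# Hypothetical row of balls, chosen to match the caption:
# the second ball is green, and the fifth ball is the second green ball.
balls = ["red", "green", "blue", "red", "green"]


def nth_ball(row, n):
    """Return the 1-based position of the n-th ball, ignoring color."""
    return n if 1 <= n <= len(row) else None


def nth_ball_of_color(row, color, n):
    """Return the 1-based position of the n-th ball that has the given color."""
    positions = [i for i, c in enumerate(row, start=1) if c == color]
    return positions[n - 1] if 1 <= n <= len(positions) else None


# Reading 1: "the second ball" (which happens to be green).
second_ball = nth_ball(balls, 2)
# Reading 2: "the second green ball" (count only the green ones).
second_green_ball = nth_ball_of_color(balls, "green", 2)

print(second_ball, balls[second_ball - 1])
print(second_green_ball, balls[second_green_ball - 1])
```

The two phrases pick out different positions because the ordinal applies to a different set in each case: all balls in the first reading, only the green balls in the second.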

The researchers set a Recursive Noun Phrase Challenge (RNPC) to several leading language models: OpenAI’s GPT-3*, Google’s BERT, and Facebook’s RoBERTa and BART, finding that these state-of-the-art models achieved only ‘chance’ performance. They conclude†:

‘Results show that state-of-the-art (SOTA) LMs fine-tuned on standard benchmarks of the same format all struggle on our dataset, suggesting that the target knowledge is not readily available.’

Minimal-pair examples in the RNPC challenge where the SOTA models made errors.

In the examples above, the models failed, for instance, to distinguish between a dead dangerous animal (i.e. a predator that poses no threat because it is dead) and a dangerous dead animal (such as a dead squirrel, which may carry a harmful virus and so remains an active threat).

(Additionally, though the paper does not touch on it, ‘dead’ is also frequently used as an adverb, a usage that fits neither of these readings.)

However, the researchers also found that additional or supplementary training that includes RNPC material can resolve the issue:

‘Pre-trained language models with SOTA performance on NLU benchmarks have poor mastery of this knowledge, but can still learn it when exposed to small amounts of data from RNPC.’

The researchers argue that a language model’s ability to navigate recursive structures of this type is essential for downstream tasks such as language analysis and translation, and they make a special case for its importance in harm detection routines:

‘[We] consider the scenario where a user interacts with a task-oriented agent like Siri or Alexa, and the agent needs to determine whether the involved activity in the user query is potentially harmful [i.e. to minors]. We choose this task because many false positives come from recursive NPs.

‘For example, how to make a homemade bomb is obviously harmful while how to make a homemade bath bomb is harmless.’
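To see why recursive NPs produce false positives of this kind, consider a deliberately naive sketch in Python. It is not the paper's harm detection method: the word lists `HARMFUL_HEADS` and `BENIGN_MODIFIER_PAIRS` and both functions are hypothetical, and real systems would rely on a trained model rather than hand-written rules.

```python
# Hypothetical word lists, for illustration only.
HARMFUL_HEADS = {"bomb"}
BENIGN_MODIFIER_PAIRS = {("bath", "bomb")}


def naive_flag(query):
    """Flag a query if any word is a harmful head noun, ignoring modifiers."""
    return any(word in HARMFUL_HEADS for word in query.lower().split())


def modifier_aware_flag(query):
    """Flag a harmful head noun unless the preceding modifier makes it benign."""
    words = query.lower().split()
    for i, word in enumerate(words):
        if word in HARMFUL_HEADS:
            previous = words[i - 1] if i > 0 else None
            if (previous, word) not in BENIGN_MODIFIER_PAIRS:
                return True
    return False


for q in ("how to make a homemade bomb", "how to make a homemade bath bomb"):
    print(q, naive_flag(q), modifier_aware_flag(q))
```

The naive check flags both queries, so the bath bomb query is a false positive. The second check only avoids it because the benign pair was listed by hand, which does not scale; the researchers' point is that a model needs to have learned how the modifier changes the meaning of the head noun.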

The paper is titled Is “my favorite new movie” my favorite movie? Probing the Understanding of Recursive Noun Phrases, and comes from five researchers at the University of Pennsylvania and one at Peking University.

Data and Method

Though prior work has studied syntactic structure of recursive NPs and the semantic categorization of modifiers, neither of these approaches is sufficient, according to the researchers, to address the challenge.

Therefore, based on the use of recursive noun phrases with two modifiers, the researchers have sought to establish whether the prerequisite knowledge exists in SOTA NLP systems (it doesn’t); whether it can be taught to them (it can); what NLP systems can learn from recursive NPs; and in what ways such knowledge can benefit downstream applications.

The dataset the researchers used was created in four stages. First was the construction of a modifier lexicon containing 689 examples drawn from prior literature and novel work.

Next, the researchers gathered recursive NPs from literature, existing corpora, and additions of their own invention. Textual resources included the Penn Treebank and the Annotated Gigaword corpus.

Then the team hired pre-screened college students to create examples for the three tasks that the language models would face, and validated the results afterwards, yielding 8,260 valid instances.

Finally, more pre-screened college students were hired, this time via Amazon Mechanical Turk, to annotate each instance as a Human Intelligence Task (HIT), deciding disputes on a majority basis. This whittled the instances down to 4,567 examples, which were further filtered down to 3,790 more balanced instances.
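Resolving annotator disagreement by majority can be sketched as follows. This is a minimal illustration under stated assumptions rather than the researchers' actual pipeline: the label names, the sample annotations, and the choice to discard ties are all hypothetical.

```python
from collections import Counter


def majority_label(labels):
    """Return the most common label, or None if the list is empty or tied."""
    if not labels:
        return None
    counts = Counter(labels).most_common()
    # A tie between the two most frequent labels gives no majority.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


# Hypothetical annotations: instance id -> labels from three annotators.
annotations = {
    "ex1": ["entailment", "entailment", "neutral"],
    "ex2": ["entailment", "neutral", "contradiction"],
    "ex3": ["neutral", "neutral", "neutral"],
}

# Keep only the instances on which a majority of annotators agree.
resolved = {}
for instance_id, labels in annotations.items():
    label = majority_label(labels)
    if label is not None:
        resolved[instance_id] = label

print(resolved)
```

Dropping instances without a clear majority is one way such a step can shrink a dataset, in the same spirit as the reduction described above.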

The researchers adapted various existing datasets to formulate the three sections of their testing hypotheses, including MNLI, SNLI, MPE and ADEPT, training all the SOTA models themselves, with the exception of the HuggingFace model, where a checkpoint was used.

Results

The researchers found that all models ‘struggle’ on RNPC tasks, versus a reliable 90%+ accuracy score for humans, with the SOTA models performing at ‘chance’ levels (i.e. without any evidence of innate ability versus random chance in response).

Results from the researchers’ tests. Here the language models are tested against their accuracy on an existing benchmark, with the central line representing equivalent human performance in the tasks.

Secondary lines of investigation indicate that these deficiencies can be compensated for at the training or fine-tuning phase of an NLP model’s pipeline by specifically including knowledge of recursive noun phrases. Once this supplementary training was undertaken, the models achieved ‘strong zero-shot performance on an extrinsic Harm Detection [task]’.

The researchers promise to release the code for this work at https://github.com/veronica320/Recursive-NPs.

 

* GPT-3 Ada, which is the fastest but not the best of the series. However, the larger ‘showcase’ Davinci model is not available for the fine-tuning that comprises the later phase of the researchers’ experiments.

† My conversion of inline citations to hyperlinks.

© 2024 All Rights Reserved - New York Tech Media
