New York Tech Media

Machine Learning Model Understands Object Relationships

by New York Tech Editorial Team
December 16, 2021
in AI & Robotics

Researchers at Massachusetts Institute of Technology (MIT) have developed a new machine learning (ML) model that understands the underlying relationships between objects in a scene. The model represents individual relationships one at a time before combining the representations to describe the overall scene. 

Through this new approach, the ML model can generate more accurate images from text descriptions, even when the scene contains multiple objects arranged in different relationships with one another.

This new development is important given that many deep learning models are unable to understand the entangled relationships between individual objects.

The team’s model could be used in cases where industrial robots must perform multi-step manipulation tasks, such as stacking items or assembling appliances. It also moves machines a step closer to being able to learn from and interact with their environments, much as humans do.

Yilun Du, a PhD student in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), is co-lead author of the paper. Du co-led the research with Shuang Li, a CSAIL PhD student, and Nan Liu, a graduate student at the University of Illinois at Urbana-Champaign. The co-authors also include Joshua B. Tenenbaum, the Paul E. Newton Career Development Professor of Cognitive Science and Computation in the Department of Brain and Cognitive Sciences, and senior author Antonio Torralba, the Delta Electronics Professor of Electrical Engineering and Computer Science. Both Tenenbaum and Torralba are members of CSAIL.

The New Framework

“When I look at a table, I can’t say that there is an object at XYZ location. Our minds don’t work like that. In our minds, when we understand a scene, we really understand it based on the relationships between the objects. We think that by building a system that can understand the relationships between objects, we could use that system to more effectively manipulate and change our environments,” says Du.

The new framework can generate an image of a scene based on a text description of objects and their relationships. 

The system then breaks this description down into smaller pieces that describe each individual relationship. Each piece is modeled separately, and the pieces are combined through an optimization process that generates an image of the scene.

With the description broken into shorter pieces, the system can recombine them in different ways, enabling it to adapt to scene descriptions it has never encountered before.

“Other systems would take all the relations holistically and generate the image one-shot from the description. However, such approaches fail when we have out-of-distribution descriptions, such as descriptions with more relations, since these models can’t really adapt one shot to generate images containing more relationships. However, as we are composing these separate, smaller models together, we can model a larger number of relationships and adapt to novel combinations,” Du says.
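
As a rough illustration of this compositional idea, the sketch below splits a description into per-relation clauses, gives each clause its own scoring model, and then optimizes an image against the sum of those scores. This is a minimal sketch in Python/PyTorch, not the authors' code: RelationScorer, split_into_relations, and the optimization loop are assumptions standing in for the components the article summarizes.

    # Illustrative sketch only; the scorer, the clause splitter, and the
    # update rule are hypothetical stand-ins, not the MIT system.
    import torch

    def split_into_relations(description: str) -> list[str]:
        # Stub parser: returns one clause per relation.
        # A real system would parse the description properly.
        return [clause.strip() for clause in description.split(" and ")]

    class RelationScorer(torch.nn.Module):
        """Hypothetical stand-in: returns a scalar saying how badly an image
        violates one relation (lower = better match)."""
        def __init__(self):
            super().__init__()
            self.net = torch.nn.Sequential(torch.nn.Flatten(),
                                           torch.nn.LazyLinear(1))

        def forward(self, image, relation_text):
            return self.net(image).mean()  # placeholder score; ignores the text

    def generate_image(description, steps=200, lr=0.05, shape=(1, 3, 64, 64)):
        relations = split_into_relations(description)
        scorers = [RelationScorer() for _ in relations]   # one model per relation
        image = torch.randn(shape, requires_grad=True)    # start from noise
        optimizer = torch.optim.Adam([image], lr=lr)
        for _ in range(steps):
            optimizer.zero_grad()
            # Combine the separate relation models by summing their scores,
            # then nudge the image toward satisfying all relations at once.
            total = sum(scorer(image, rel)
                        for scorer, rel in zip(scorers, relations))
            total.backward()
            optimizer.step()
        return image.detach()

Because each relation contributes its own term, a description with four or five relations simply adds more terms to the sum, which is the adaptability Du describes above.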

The system can also carry out this process in reverse. If it is fed an image, it can find text descriptions that match the relationships between objects in the scene. 
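
The reverse direction can be sketched with the same hypothetical pieces: score candidate descriptions against the image and keep the one whose relations are best satisfied. Again, this is an assumption-laden illustration rather than the published method.

    def describe_image(image, candidate_descriptions, scorer=None):
        # Hypothetical inverse use: rank candidate captions by how well their
        # relations are satisfied in the image (lower total score = better fit).
        scorer = scorer or RelationScorer()
        def total_score(description):
            clauses = split_into_relations(description)
            return sum(float(scorer(image, c)) for c in clauses)
        return min(candidate_descriptions, key=total_score)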

Evaluating the Model

The researchers asked humans to evaluate whether the generated images matched the original scene description. When descriptions contained three relationships, which was the most complex type, 91 percent of participants said the new model performed better than other deep learning methods.

“One interesting thing we found is that for our model, we can increase our sentence from having one relation description to having two, or three, or even four descriptions, and our approach continues to be able to generate images that are correctly described by those descriptions, while other methods fail,” Du says.

The model also demonstrated an impressive ability to work with descriptions it hadn’t encountered previously.

“This is very promising because that is closer to how humans work. Humans may only see several examples, but we can extract useful information from just those few examples and combine them together to create infinite combinations. And our model has such a property that allows it to learn from fewer data but generalize to more complex scenes or image generations,” Li says.

The team now plans to test the model on more complex real-world images and to explore how to eventually incorporate it into robotics systems.

 

Credit: Source link
