New York Tech Media

Faking ‘Better’ Bodies With AI

by New York Tech Editorial Team
March 10, 2022
in AI & Robotics

New research from the Alibaba DAMO Academy offers an AI-driven workflow for automating the reshaping of bodies in images – a rare effort in a computer vision sector currently preoccupied with face-based manipulations such as deepfakes and GAN-based face editing.

Inset in ‘result’ columns, the generated attention maps which define the areas to be amended. Source: https://arxiv.org/pdf/2203.04670.pdf

The researchers’ architecture uses skeleton pose estimation to address the greater complexity that image synthesis and editing systems face in conceptualizing and parametrizing existing body images, at least at a level of granularity that allows meaningful and selective editing.

Estimated skeleton maps help to individuate and focus attention on areas of the body likely to be retouched, such as the upper arm area.

The system ultimately enables a user to set parameters that can change the appearance of weight, muscle mass, or weight distribution in full-length or mid-length photos of people, and is able to generate arbitrary transformations on clothed or unclothed body sections.

Left, the input image; middle, a heat-map of the derived attention areas; right, the transformed image.

The motivation for the work is the development of automated workflows that could replace the arduous digital manipulations undertaken by photographers and production graphics artists in various branches of the media, from fashion to magazine-style output and publicity material.

In general, the authors acknowledge, these transformations are usually applied with ‘warp’ techniques in Photoshop and other traditional bitmap editors, and are almost exclusively used on images of women. Consequently, the custom dataset developed to facilitate the new process consists mostly of pictures of female subjects:

‘As body retouching is mainly desired by females, the majority of our collection are female photos, considering the diversity of ages, races (African:Asian:Caucasian = 0.33:0.35:0.32), poses, and garments.’

The paper is titled Structure-Aware Flow Generation for Human Body Reshaping, and comes from five authors associated with Alibaba’s global DAMO Academy.

Dataset Development

As is usually the case with image synthesis and editing systems, the architecture for the project required a customized training dataset. The authors commissioned three photographers to produce standard Photoshop manipulations of apposite images from stock photography site Unsplash, resulting in a dataset – titled BR-5K* – of 5,000 high-quality images at 2K resolution.

The researchers emphasize that the objective of training on this dataset is not to produce ‘idealized’ and generalized features relating to an index of attractiveness or desirable appearance, but rather to extract the central feature mappings associated with professional manipulations of body images.

However, they concede that the manipulations ultimately reflect transformative processes that map a progression from ‘real’ to a preset notion of ‘ideal’:

‘We invite three professional artists to retouch bodies using Photoshop independently, with the goal of achieving slender figures that meet the popular aesthetics, and select the best one as ground-truth.’

Since the framework does not deal with faces at all, these were blurred out before being included in the dataset.

Architecture and Core Concepts

The system’s workflow involves feeding in a high-resolution portrait, downsampling it to a lower resolution that fits within the available computing resources, and extracting an estimated skeleton-map pose (second figure from left in image below), as well as Part Affinity Fields (PAFs), which were introduced in 2016 by The Robotics Institute at Carnegie Mellon University.

Part Affinity Fields help to define orientation of limbs and general association with the broader skeletal framework, providing the new project with an additional attention/localization tool.

From the 2016 Part Affinity Fields paper, predicted PAFs encode limb orientation as part of a 2D vector that also includes the general position of the limb. Source: https://arxiv.org/pdf/1611.08050.pdf
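
To make the idea of a Part Affinity Field concrete, the following is a minimal sketch of how a field for a single limb could be rasterized, along the lines of the ground-truth construction described in the 2016 PAF paper. It is not code from either paper: the function name `limb_paf`, the `limb_width` threshold and the toy joint coordinates are all assumptions made for illustration.

```python
import numpy as np

def limb_paf(height, width, joint_a, joint_b, limb_width=1.5):
    """Rasterize a toy Part Affinity Field for one limb.

    Returns an (H, W, 2) array that holds the limb's unit direction
    vector on pixels close to the segment joint_a -> joint_b, and
    zeros everywhere else. Joints are given as (x, y) coordinates.
    """
    a = np.asarray(joint_a, dtype=float)
    b = np.asarray(joint_b, dtype=float)
    limb = b - a
    length = np.linalg.norm(limb)
    paf = np.zeros((height, width, 2))
    if length == 0:
        return paf  # degenerate limb: no direction to encode

    unit = limb / length
    ys, xs = np.mgrid[0:height, 0:width]
    rel = np.stack([xs - a[0], ys - a[1]], axis=-1)  # pixel offsets from joint_a

    along = rel @ unit  # distance travelled along the limb
    across = np.abs(rel[..., 0] * unit[1] - rel[..., 1] * unit[0])  # distance from the limb axis

    on_limb = (along >= 0) & (along <= length) & (across <= limb_width)
    paf[on_limb] = unit
    return paf

# Hypothetical horizontal "upper arm" from (2, 5) to (7, 5) on a 10x10 grid.
paf = limb_paf(10, 10, joint_a=(2, 5), joint_b=(7, 5))
```

Pixels on or near the limb carry the vector (1, 0), pointing from the first joint to the second, while the rest of the field stays at zero, which is what lets such a field double as a localization cue.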

Despite their apparent irrelevance to the appearance of weight, skeleton maps are useful in directing the final transformative processes to parts of the body to be amended, such as upper arms, rear, and thighs.

After this, the results are fed to a Structure Affinity Self-Attention (SASA) module at the central bottleneck of the process (see image below).

The SASA regulates the consistency of the flow generator that fuels the process, the results of which are then passed to the warping module (second from right in the image above), which applies the transformations learned from training on the manual revisions included in the dataset.

The Structure Affinity Self-Attention (SASA) module allocates attention to pertinent body parts, helping to avoid extraneous or irrelevant transformations.
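
The warping step mentioned above can be illustrated with a generic flow-based warp. The sketch below assumes a dense flow field that stores, for every output pixel, an offset (dx, dy) to the input location it should sample from, with bilinear interpolation between neighbouring pixels. The function name `warp_with_flow` and this flow convention are assumptions for illustration, not details confirmed by the paper.

```python
import numpy as np

def warp_with_flow(image, flow):
    """Backward-warp an image with a dense flow field.

    image: (H, W) or (H, W, C) float array.
    flow:  (H, W, 2) array of (dx, dy) offsets; output pixel (y, x)
           samples the input at (x + dx, y + dy), clamped to the border.
    """
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(float)

    # Source coordinates, clamped so samples stay inside the image.
    src_x = np.clip(xs + flow[..., 0], 0, w - 1)
    src_y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each source coordinate.
    x0 = np.floor(src_x).astype(int)
    y0 = np.floor(src_y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional weights for bilinear interpolation.
    wx = src_x - x0
    wy = src_y - y0
    if image.ndim == 3:
        wx = wx[..., None]
        wy = wy[..., None]

    top = image[y0, x0] * (1 - wx) + image[y0, x1] * wx
    bottom = image[y1, x0] * (1 - wx) + image[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

# Toy example: pull every pixel from half a pixel to its right.
image = np.arange(16.0).reshape(4, 4)
flow = np.zeros((4, 4, 2))
flow[..., 0] = 0.5
warped = warp_with_flow(image, flow)
```

A zero flow leaves the image unchanged, so a network that outputs non-zero offsets only in the attended regions would alter only those regions, which is the appeal of flow-based editing over regenerating every pixel.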

The output image is subsequently upsampled back to the original 2K resolution, using processes not dissimilar to the standard, 2017-style deepfake architecture from which popular packages such as DeepFaceLab have since been derived; the upsampling process is also common in GAN editing frameworks.

The attention network for the schema is modeled after Compositional De-Attention Networks (CODA), a 2019 US/Singapore academic collaboration with Amazon AI and Microsoft.

Tests

The flow-based framework was tested against prior flow-based methods FAL and Animating Through Warping (ATW), as well as image translation architectures Pix2PixHD and GFLA, with SSIM, PSNR and LPIPS as evaluation metrics.

Results of initial tests (arrow direction in headers indicates whether lower or higher figures are best).

Based on these adopted metrics, the authors’ system outperforms the prior architectures.
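
Of the three metrics, PSNR is the simplest to state: it is a logarithmic function of the mean squared error between the output and the ground-truth image, and higher values are better. The sketch below uses the standard definition and assumes 8-bit images with a peak value of 255; it is not the authors' evaluation code, and the function name `psnr` is chosen here for illustration.

```python
import numpy as np

def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in decibels (higher is better)."""
    ref = np.asarray(reference, dtype=np.float64)
    cand = np.asarray(candidate, dtype=np.float64)
    mse = np.mean((ref - cand) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: a flat image versus a copy that is off by one grey level.
reference = np.zeros((2, 2))
candidate = np.ones((2, 2))
score = psnr(reference, candidate)
```

SSIM and LPIPS are more involved (the latter relies on a pretrained network), so they are usually computed with existing libraries rather than written by hand.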

Selected results. Please refer to the original PDF linked in this article for higher resolution comparisons.

In addition to the automated metrics, the researchers conducted a user study (final column of results table pictured earlier), wherein 40 participants were each shown 30 questions randomly selected from a 100-question pool relating to the images produced via the various methods. 70% of the respondents favored the new technique as more ‘visually appealing’.

Challenges

The new paper represents a rare excursion into AI-based body manipulation. The image synthesis sector is currently far more interested either in generating editable bodies via methods such as Neural Radiance Fields (NeRF), or else is fixated on exploring the latent space of GANs and the potential of autoencoders for facial manipulation.

The authors’ initiative is currently limited to producing changes in perceived weight, and they have not implemented any kind of inpainting technique that would restore the background that’s inevitably revealed when you slim down a picture of someone.

However, they propose that portrait matting and background blending through textural inference could trivially solve the problem of restoring the parts of the world that were formerly hidden in the image by human ‘imperfection’.

A proposed solution for restoring background that’s revealed by AI-driven fat reduction.

* Though the preprint refers to supplemental material giving more details about the dataset, as well as further examples from the project, the location of this material is not made available in the paper, and the corresponding author has not yet responded to our request for access.

First published 10th March 2022.
