
A New and Simpler Deepfake Method That Outperforms Prior Approaches

By New York Tech Editorial Team
November 10, 2021
in AI & Robotics

A collaboration between a Chinese AI research group and US-based researchers has developed what may be the first real innovation in deepfakes technology since the phenomenon emerged four years ago.

The new method can perform faceswaps that outperform all other existing frameworks on standard perceptual tests, without the need to exhaustively gather and curate large dedicated datasets, or to train a model for up to a week for a single identity. For the examples presented in the new paper, models were trained on the entirety of two popular celebrity datasets on a single NVIDIA Tesla P40 GPU for about three days.

Full video available at the end of this article. In this sample from a video in supplementary materials provided by one of the authors of the new paper, Scarlett Johansson’s face is transferred onto the source video. CihaNet removes the problem of edge-masking when performing a swap, by forming and enacting deeper relationships between the source and target identities, meaning an end to ‘obvious borders’ and other superimposition glitches that occur in traditional deepfake approaches. Source: https://mitchellx.github.io/#video

The new approach removes the need to ‘paste’ the transplanted identity crudely into the target video, which frequently leads to tell-tale artifacts that appear where the fake face ends and the real, underlying face begins. Rather, ‘hallucination maps’ are used to perform a deeper mingling of visual facets, because the system separates identity from context far more effectively than current methods, and therefore can blend the target identity at a more profound level.

From the paper. CihaNet transformations are facilitated through hallucination maps (bottom row). The system takes context information (e.g. face direction, hair, glasses and other occlusions) entirely from the image into which the new identity will be superimposed, and facial identity information entirely from the person who is to be inserted into the image. This ability to separate face from context is critical to the success of the system. Source: https://dl.acm.org/doi/pdf/10.1145/3474085.3475257

Effectively the new hallucination map provides a more complete context for the swap, as opposed to the hard masks that often require extensive curation (and in the case of DeepFaceLab, separate training) while providing limited flexibility in terms of real incorporation of the two identities.
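
The paper performs this blending in feature space rather than on finished pixels, and its code is not reproduced here, but the basic contrast between a hard-mask paste and a map-weighted blend can be illustrated in a few lines. In the rough sketch below (not the authors’ implementation), `hallucination_map` is simply a hypothetical per-pixel weight in [0, 1]:

```python
import numpy as np

def hard_mask_swap(target, swapped_face, mask):
    """Classic deepfake compositing: paste the generated face wherever the
    binary mask is 1. Any mismatch at the mask boundary shows up as a seam."""
    mask = (mask > 0.5).astype(target.dtype)[..., None]   # hard 0/1 decision
    return swapped_face * mask + target * (1.0 - mask)

def soft_map_swap(target, swapped_face, hallucination_map):
    """Soft, map-weighted blend in the spirit of a hallucination map: every
    pixel is a continuous mixture of the two images, so there is no single
    hard border where artifacts can accumulate."""
    w = np.clip(hallucination_map, 0.0, 1.0)[..., None]   # continuous weights
    return swapped_face * w + target * (1.0 - w)

# Toy usage with random "images" (H x W x 3) and a radial soft map.
h, w = 128, 128
target = np.random.rand(h, w, 3)
swapped_face = np.random.rand(h, w, 3)
yy, xx = np.mgrid[0:h, 0:w]
dist = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
soft_map = np.clip(1.0 - dist / (h / 2), 0.0, 1.0)        # 1 at centre, 0 at edges

hard = hard_mask_swap(target, swapped_face, soft_map)
soft = soft_map_swap(target, swapped_face, soft_map)
```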

From samples provided in the supplementary materials, using both the FFHQ and CelebA-HQ datasets, across VGGFace and FaceForensics++. The first two columns show the randomly-selected (real) images to be swapped. The following four columns show the results of the swap using the four most effective methods currently available, while the final column shows the result from CihaNet. The FaceSwap repository has been used rather than the more popular DeepFaceLab, since both projects are forks of the original 2017 Deepfakes code on GitHub. Though each project has since added models, techniques, diverse UIs and supplementary tools, the underlying code that makes deepfakes possible has never changed, and remains common to both. Source: https://dl.acm.org/action/downloadSupplement?doi=10.1145%2F3474085.3475257&file=mfp0519aux.zip

The paper, titled One-stage Context and Identity Hallucination Network, is authored by researchers affiliated with JD AI Research and the University of Massachusetts Amherst, and was supported by the National Key R&D Program of China under Grant No. 2020AAA0103800. It was presented at the 29th ACM International Conference on Multimedia, held October 20th-24th in Chengdu, China.

No Need for ‘Face-On’ Parity

Both the most popular current deepfake package, DeepFaceLab, and the competing fork FaceSwap rely on tortuous, frequently hand-curated workflows to identify which way a face is inclined and which occlusions must be accounted for (again, manually), and must cope with many other irritating impediments (including lighting) that make their use far from the ‘point-and-click’ experience inaccurately portrayed in the media since the advent of deepfakes.

By contrast, CihaNet does not require that the two images face the camera directly, or even face the same way, in order to extract and exploit useful identity information from a single image.

In these examples, a suite of deepfake software contenders are challenged with the task of swapping faces that are not only dissimilar in identity, but which are not facing the same way. Software derived from the original deepfakes repository (such as the hugely popular DeepFaceLab and FaceSwap, pictured above) cannot handle the disparity in angles between the two images to be swapped (see third column). Meanwhile, CihaNet can abstract the identity correctly, since the ‘pose’ of the face is not intrinsically part of the identity information.

Architecture

The CihaNet project, according to the authors, was inspired by FaceShifter, a 2019 collaboration between Microsoft Research and Peking University, though it makes some notable and critical changes to the older method’s core architecture.

FaceShifter uses two Adaptive Instance Normalization (AdaIN) networks to handle identity information, which is then transposed into the target image via a mask, much as current popular deepfake software does (and with all the related limitations). It also relies on an additional HEAR-Net, a separately trained sub-network that handles occlusion obstacles and adds a further layer of complexity.

Instead, the new architecture uses this ‘contextual’ information directly in the transformative process itself, via a single two-step Cascading Adaptive Instance Normalization (C-AdaIN) operation, which preserves the consistency of context (i.e. face skin and occlusions) in ID-relevant areas.
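
The exact C-AdaIN formulation is given in the paper; the underlying AdaIN step that it cascades is, however, a standard operation, and a minimal PyTorch sketch of it looks roughly like this (the layer and tensor names below are hypothetical, not taken from CihaNet):

```python
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Standard Adaptive Instance Normalization: normalize the content feature
    per channel, then re-scale and re-shift it with parameters predicted from
    an identity embedding. CihaNet's C-AdaIN cascades operations of this kind;
    the layer below is a generic sketch, not the authors' exact formulation."""
    def __init__(self, num_channels: int, id_dim: int):
        super().__init__()
        self.norm = nn.InstanceNorm2d(num_channels, affine=False)
        self.to_scale = nn.Linear(id_dim, num_channels)
        self.to_shift = nn.Linear(id_dim, num_channels)

    def forward(self, content: torch.Tensor, identity: torch.Tensor) -> torch.Tensor:
        # content: (B, C, H, W) feature map, identity: (B, id_dim) embedding
        normalized = self.norm(content)
        scale = self.to_scale(identity).unsqueeze(-1).unsqueeze(-1)  # (B, C, 1, 1)
        shift = self.to_shift(identity).unsqueeze(-1).unsqueeze(-1)
        return normalized * (1.0 + scale) + shift

# Toy usage: a 64-channel feature map modulated by a 512-d identity vector.
feat = torch.randn(2, 64, 32, 32)
ident = torch.randn(2, 512)
out = AdaIN(64, 512)(feat, ident)
```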

The second sub-net crucial to the system is the Swapping Block (SwapBlk), which generates an integrated feature from the context of the reference image and the embedded ‘identity’ information of the source image, bypassing the multiple stages that conventional methods need to accomplish this.

To help distinguish between context and identity, a hallucination map is generated for each level, standing in for a soft-segmentation mask, and acting on a wider range of features for this critical part of the deepfake process.

As the value of the hallucination map (pictured below right) grows, a clearer path between identities emerges.

In this way, the entire swapping process is accomplished in a single stage and without post-processing.
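
The paper defines SwapBlk precisely; the sketch below only illustrates the general idea described above, i.e. predicting a per-level soft map from the context feature and using it to mix an identity-conditioned feature with the context feature. The module and its internals are hypothetical stand-ins, not the authors’ architecture:

```python
import torch
import torch.nn as nn

class SwapBlockSketch(nn.Module):
    """Illustrative stand-in for a swapping block: project the identity
    embedding into feature space, predict a single-channel soft map from the
    context feature, and blend identity-conditioned and context features with
    that map. Not the authors' exact SwapBlk design."""
    def __init__(self, channels: int, id_dim: int):
        super().__init__()
        self.id_proj = nn.Linear(id_dim, channels)    # identity -> per-channel bias
        self.to_map = nn.Sequential(                  # per-pixel soft map in [0, 1]
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, context: torch.Tensor, identity: torch.Tensor) -> torch.Tensor:
        # context: (B, C, H, W) feature map, identity: (B, id_dim) embedding
        id_feat = context + self.id_proj(identity).unsqueeze(-1).unsqueeze(-1)
        soft_map = self.to_map(context)               # (B, 1, H, W), acts like a soft mask
        return soft_map * id_feat + (1.0 - soft_map) * context

# Toy usage at one feature-pyramid level.
ctx = torch.randn(2, 64, 32, 32)
ident = torch.randn(2, 512)
blended = SwapBlockSketch(64, 512)(ctx, ident)
```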

Data and Testing

To try out the system, the researchers trained four models on two highly popular and variegated open image datasets: CelebA-HQ and NVIDIA’s Flickr-Faces-HQ Dataset (FFHQ), containing 30,000 and 70,000 images respectively.

No pruning or filtering was performed on these base datasets. In each case, the researchers trained on the entirety of the dataset, using the single Tesla P40 GPU over three days, with a learning rate of 0.0002 under Adam optimization.
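
The reported training details amount to the optimizer, the learning rate and the hardware budget; a hedged PyTorch sketch of that configuration, with a placeholder model, loss and dataloader standing in for the real ones, might look like this:

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset

# Placeholder stand-ins; the real generator, losses and datasets
# (CelebA-HQ, FFHQ) are described in the paper but not reproduced here.
model = nn.Sequential(nn.Conv2d(3, 3, kernel_size=3, padding=1))
dataset = TensorDataset(torch.randn(64, 3, 256, 256))
loader = DataLoader(dataset, batch_size=8, shuffle=True)

# Reported settings: Adam optimization with a learning rate of 0.0002.
optimizer = optim.Adam(model.parameters(), lr=2e-4)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

for epoch in range(1):                      # the paper reports roughly 3 days on one Tesla P40
    for (batch,) in loader:
        batch = batch.to(device)
        output = model(batch)
        loss = nn.functional.l1_loss(output, batch)   # placeholder reconstruction loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```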

They then rendered out a series of random swaps among the thousands of personalities featured in the datasets, without regard for whether or not the faces were similar or even gender-matched, and compared CihaNet’s results to the output from four leading deepfake frameworks: FaceSwap (which stands in for the more popular DeepFaceLab, since it shares a root codebase in the original 2017 repository that brought deepfakes to the world); the aforementioned FaceShifter; FSGAN; and SimSwap.

In comparing the results via VGG-Face, FFHQ, CelebA-HQ and FaceForensics++, the authors found that their new model outperformed all prior models, as indicated in the table below.

The three metrics used to evaluate the results were Structural Similarity (SSIM), pose estimation error, and ID retrieval accuracy, the last computed as the percentage of successfully retrieved pairs.
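
Pose estimation error requires an external pose estimator and is omitted here, but the other two metrics are straightforward to illustrate. The sketch below computes SSIM with scikit-image and an ID retrieval accuracy as the fraction of swapped faces whose nearest source embedding (by cosine similarity) belongs to the identity that was actually inserted; the random embeddings and the exact retrieval protocol are assumptions for illustration, not the paper’s evaluation code:

```python
import numpy as np
from skimage.metrics import structural_similarity

def ssim_score(img_a: np.ndarray, img_b: np.ndarray) -> float:
    """Structural similarity between two grayscale images with values in [0, 1]."""
    return structural_similarity(img_a, img_b, data_range=1.0)

def id_retrieval_accuracy(swap_embs: np.ndarray, source_embs: np.ndarray) -> float:
    """Fraction of swapped faces whose nearest source embedding (by cosine
    similarity) is the embedding of the identity that was inserted.
    Row i of swap_embs is assumed to correspond to row i of source_embs."""
    a = swap_embs / np.linalg.norm(swap_embs, axis=1, keepdims=True)
    b = source_embs / np.linalg.norm(source_embs, axis=1, keepdims=True)
    sims = a @ b.T                               # cosine similarity matrix
    retrieved = sims.argmax(axis=1)              # nearest source identity per swap
    return float((retrieved == np.arange(len(a))).mean())

# Toy usage with random data standing in for real images and face embeddings.
rng = np.random.default_rng(0)
print(ssim_score(rng.random((64, 64)), rng.random((64, 64))))
print(id_retrieval_accuracy(rng.normal(size=(10, 512)), rng.normal(size=(10, 512))))
```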

The researchers contend that CihaNet represents a superior approach in terms of qualitative results, and a notable advance on the current state of the art in deepfake technologies, by removing the burden of extensive and labor-intensive masking architectures and methodologies, and achieving a more useful and actionable separation of identity from context.

Take a look below to see further video examples of the new technique. You can find the full-length video here.

From supplementary materials for the new paper, CihaNet performs faceswapping on various identities. Source: https://mitchellx.github.io/#video

 

