New York Tech Media

Amy Steier, Principal Machine Learning Scientist at Gretel.ai – Interview Series

by New York Tech Editorial Team
February 8, 2022
in AI & Robotics

Amy Steier is the Principal Machine Learning Scientist at Gretel.ai, the world’s most advanced privacy engineering platform. Gretel makes it easy to embed privacy by design into the fabric of data-driven technology. Its AI-based, open-sourced libraries are designed for transforming, anonymizing, and synthesizing sensitive information.

Amy is a highly accomplished machine learning and data scientist with more than 20 years of experience. Her passion is big data and surfacing the hidden intelligence within it using techniques from machine learning, data mining, artificial intelligence and statistics. She is highly skilled in predictive modeling, classification, clustering, anomaly detection, data visualization, ensemble methods, information retrieval, cybersecurity analytics, NLP, recommendation models, and user behavioral analytics.

What initially attracted you to pursue a career in computer science and machine learning?

My sheer, unabashed, enduring love of data. The power, mystery, intrigue and potential of data have always fascinated me. Computer science and machine learning are tools for harnessing that potential. It’s also terribly fun to work in a field where the state of the art moves so quickly. I love the intersection of research and product. It’s very satisfying to take bleeding-edge ideas, push them a little further, and then morph them to fit existing, tangible product needs.

For readers who are unfamiliar, could you explain what synthetic data is?

Synthetic data is data that looks and acts like the original data but is also different enough that it satisfies some use case. The most common use case is the need to protect the privacy of the information in the original data. Another use case is the need to create additional data to increase the size of the original dataset. Yet another use case is to help address a class imbalance or perhaps demographic bias in the original dataset.

Synthetic data allows us to continue developing new and innovative products and solutions when the data necessary to do so otherwise wouldn’t be present or available.

How does the Gretel platform work to create synthetic data via APIs?

Gretel privacy engineering APIs allow you to ingest data into Gretel and explore the data we are able to extract. These are the same APIs used by our Console. By exposing the APIs through an intuitive interface, we hope to empower developers and data scientists to build their own workflows around Gretel.

While the Console makes creating synthetic data very easy, the APIs enable you to integrate the creation of synthetic data into your workflow. I love using the APIs because they enable me to customize the creation of synthetic data to a very particular use case.

Could you discuss some of the tools that are offered by Gretel to help assess the quality of the synthetic data?

After the creation of synthetic data, Gretel will generate a Synthetic Report. In this report you can see the Synthetic Data Quality Score (SQS), as well as a Privacy Protection Level (PPL) grade.

The SQS is an estimate of how well the generated synthetic data maintains the same statistical properties as the original dataset. In this sense, the SQS can be viewed as a utility score or a confidence score as to whether scientific conclusions drawn from the synthetic dataset would match those drawn from the original dataset.

The Synthetic Data Quality Score is computed by combining the individual quality metrics: Field Distribution Stability, Field Correlation Stability and Deep Structure Stability.

Field Distribution Stability is a measure of how well the synthetic data maintains the same field distributions as in the original data. Field Correlation Stability is a measure of how well correlations between fields were maintained in the synthetic data. Finally, Deep Structure Stability measures the statistical integrity of deeper, multi-field distributions and correlations. To estimate this, Gretel compares a Principal Component Analysis (PCA) computed first on the original data, then again on the synthetic data.
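As an illustration of the three metrics described above, here is a minimal Python sketch. It is not Gretel's implementation: the function names, the use of Jensen-Shannon divergence for field distributions, the comparison of PCA explained-variance ratios for deep structure, and the plain average used to combine the parts are all assumptions made for this example. It also assumes purely numeric fields, no constant columns, and more rows than columns in both datasets.

```python
import numpy as np


def _kl(p, q):
    """Kullback-Leibler divergence in bits, skipping empty bins of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))


def field_distribution_stability(orig, synth, bins=10):
    """1 minus the mean Jensen-Shannon divergence of per-field histograms."""
    scores = []
    for j in range(orig.shape[1]):
        lo = min(orig[:, j].min(), synth[:, j].min())
        hi = max(orig[:, j].max(), synth[:, j].max())
        p, _ = np.histogram(orig[:, j], bins=bins, range=(lo, hi))
        q, _ = np.histogram(synth[:, j], bins=bins, range=(lo, hi))
        p = p / p.sum()
        q = q / q.sum()
        m = 0.5 * (p + q)
        js = 0.5 * _kl(p, m) + 0.5 * _kl(q, m)  # in [0, 1] with log base 2
        scores.append(1.0 - js)
    return float(np.mean(scores))


def field_correlation_stability(orig, synth):
    """1 minus the scaled mean absolute gap between correlation matrices."""
    c_orig = np.corrcoef(orig, rowvar=False)
    c_synth = np.corrcoef(synth, rowvar=False)
    # Correlations lie in [-1, 1], so each absolute gap is at most 2.
    return float(1.0 - np.mean(np.abs(c_orig - c_synth)) / 2.0)


def explained_variance_ratio(x):
    """Share of variance carried by each principal component (PCA via SVD)."""
    centered = x - x.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


def deep_structure_stability(orig, synth):
    """1 minus the total variation distance between the two PCA spectra."""
    gap = np.abs(explained_variance_ratio(orig) - explained_variance_ratio(synth))
    return float(1.0 - 0.5 * gap.sum())


def quality_score(orig, synth):
    """Hypothetical combined score: an unweighted mean of the three parts."""
    return float(np.mean([
        field_distribution_stability(orig, synth),
        field_correlation_stability(orig, synth),
        deep_structure_stability(orig, synth),
    ]))


rng = np.random.default_rng(0)
x = rng.normal(size=500)
original = np.column_stack([x, x + 0.1 * rng.normal(size=500)])
# A deliberately poor "synthetic" set: each column is shuffled on its own,
# which keeps per-field histograms but destroys the correlation between fields.
shuffled = np.column_stack([
    rng.permutation(original[:, 0]),
    rng.permutation(original[:, 1]),
])
print(quality_score(original, original))
print(quality_score(original, shuffled))
```

The shuffled dataset shows why a single metric is not enough: its field distributions are untouched, so only the correlation and deep-structure parts register the damage.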

How do the Gretel privacy filters work?

The Gretel Privacy Filters were the culmination of much research on the nature of adversarial attacks on synthetic data. The Privacy Filters prevent the creation of synthetic data with weaknesses commonly exploited by adversaries. We have two Privacy Filters: the Similarity Filter and the Outlier Filter. The Similarity Filter prevents the creation of synthetic records that are overly similar to a training record. Such records are prime targets for adversaries seeking to gain insights into the original data. The Outlier Filter prevents the creation of synthetic records that would be deemed outliers in the space defined by the training data. Outliers revealed in a synthetic dataset can be exploited by Membership Inference Attacks, Attribute Inference, and a wide variety of other adversarial attacks. They are a serious privacy risk.
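The idea behind the two filters can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Gretel implements its Privacy Filters: the function names, the Euclidean nearest-neighbor test for similarity, the per-field z-score test for outliers, and both thresholds are hypothetical choices for numeric data.

```python
import numpy as np


def similarity_mask(synth, train, min_dist):
    """Flag synthetic records that sit within min_dist of any training record."""
    diffs = synth[:, None, :] - train[None, :, :]
    nearest = np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
    return nearest < min_dist


def outlier_mask(synth, train, z_max=3.0):
    """Flag synthetic records with any field more than z_max training
    standard deviations away from the training mean."""
    mu = train.mean(axis=0)
    sigma = train.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # avoid division by zero
    z = np.abs((synth - mu) / sigma)
    return (z > z_max).any(axis=1)


def apply_privacy_filters(synth, train, min_dist=0.05, z_max=3.0):
    """Drop records flagged by either filter and return the masks as well."""
    similar = similarity_mask(synth, train, min_dist)
    outlier = outlier_mask(synth, train, z_max)
    return synth[~(similar | outlier)], similar, outlier


train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
synth = np.array([
    [1.0, 1.01],   # nearly a copy of the training record [1, 1]
    [1.5, 1.4],    # plausible and not too close to any training record
    [50.0, 50.0],  # far outside the range of the training data
])
kept, similar, outlier = apply_privacy_filters(synth, train)
print(kept)
```

In this toy run the first record is rejected as too similar to a training record, the third is rejected as an outlier, and only the middle record is kept.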

How can synthetic data assist with reducing AI bias?

The most common technique is to address the representational bias of the data feeding into an AI system.  For example, if there is a strong class imbalance in your data, or perhaps there exists demographic bias in your data, Gretel offers tools to help first measure the imbalance and then resolve it in the synthetic data. By removing the bias in the data, you often then remove the bias in the AI system built on the data.
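The measure-then-resolve workflow can be sketched as follows. This does not use Gretel's tooling: the function names are hypothetical, and the sketch only counts how many additional synthetic records each class would need in order to match the largest class. Generating those records is left to whatever synthetic data model is in use.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def synthetic_records_needed(labels):
    """Extra records per class required to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


labels = ["approved"] * 90 + ["denied"] * 10
print(imbalance_ratio(labels))
print(synthetic_records_needed(labels))
```

The same counting works for a demographic field: pass that column's values instead of the class labels.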

You clearly enjoy learning about new machine learning technologies. How do you personally keep up with all of the changes?

Read, read, and then read some more, lol! I enjoy starting my day with reading about new ML technologies. Medium knows me so well. I enjoy reading articles in Towards Data Science and Analytics Vidhya, and newsletters like The Sequence. Facebook AI, Google AI and OpenMined all have great blogs. There is also a plethora of good conferences to follow, such as NeurIPS, ICML, ICLR and AISTATS.

I also enjoy tools that track citation trails, help you find papers similar to ones you like and that get to know your specific interests and are always watching in the background for a paper that might interest you. Zeta Alpha is one such tool I use a lot.

Finally, you really can’t overestimate the benefit of having colleagues with similar interests. At Gretel, the ML team tracks research papers relevant to the fields we explore and frequently gets together to discuss interesting papers.

What’s your vision for the future of machine learning?

Easy access to data will instigate a great era of innovation in machine learning, which will then turbocharge innovation in a broad spectrum of fields such as healthcare, finance, manufacturing and the biosciences. Historically, many ground-breaking advancements in ML can be attributed to a large volume of rich data. Yet historically, much research has been hindered by the inability to access or share data because of privacy concerns. As tools such as Gretel remove this barrier, access to data will be democratized. The entire machine learning community will benefit from access to rich, large datasets, instead of just a few elite mega-companies.

Is there anything else that you would like to share about Gretel?

If you love data, you will love Gretel (so clearly I love Gretel!).  Easy access to data has been the thorn in the side of every data scientist I’ve ever known. At Gretel, we take great pride in having created a console and set of APIs that make the creation of private, shareable data as simple as possible. We profoundly believe that data is more valuable when it is shared.

Thank you for the great interview and for sharing your insights. Readers who wish to learn more should visit Gretel.ai.




© 2024 All Rights Reserved - New York Tech Media
