New York Tech Media

Dr. Ram Sriharsha, VP of Engineering at Pinecone – Interview Series

By New York Tech Editorial Team
February 4, 2023
in AI & Robotics

Dr. Ram Sriharsha is the VP of Engineering and R&D at Pinecone.

Before joining Pinecone, Ram held senior roles at Yahoo, Databricks, and Splunk. At Yahoo, he was first a principal software engineer and then a research scientist; at Databricks, he was the product and engineering lead for the unified analytics platform for genomics; and, in his three years at Splunk, he played multiple roles including Sr Principal Scientist, VP Engineering and Distinguished Engineer.

Pinecone is a fully managed vector database that makes it easy to add vector search to production applications. It combines vector search libraries, capabilities such as filtering, and distributed infrastructure to provide high performance and reliability at any scale.

What initially attracted you to machine learning?

High dimensional statistics, learning theory and topics like that were what attracted me to machine learning. They are mathematically well defined, can be reasoned about, and have some fundamental insights to offer on what learning means and how to design algorithms that can learn efficiently.

Previously you were Vice President of Engineering at Splunk, a data platform that helps turn data into action for Observability, IT, Security and more. What were some of your key takeaways from this experience?

I hadn’t realized until I got to Splunk how diverse the use cases in enterprise search are: people use Splunk for log analytics, observability and security analytics, among myriad other use cases. And what is common to a lot of these use cases is the idea of detecting similar events or highly dissimilar (or anomalous) events in unstructured data. This turns out to be a hard problem, and traditional means of searching through such data aren’t very scalable. During my time at Splunk I initiated research in these areas on how we could use machine learning (and deep learning) for log mining, security analytics, etc. Through that work, I came to realize that vector embeddings and vector search would end up being a fundamental primitive for new approaches to these domains.

Could you describe for us what vector search is?

In traditional search (otherwise known as keyword search), you are looking for keyword matches between a query and documents (these could be tweets, web documents, legal documents, what have you). To do this, you split your query into its tokens, retrieve the documents that contain each token, and then merge and rank them to determine the most relevant documents for the query.

The main problem, of course, is that to get relevant results, your query has to have keyword matches in the document. A classic example: if you search for “pop” you will match “pop music” but not “soda”, because there is no keyword overlap between “pop” and documents containing “soda”, even though in many areas of the US “pop” colloquially means the same as “soda”.
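
The keyword pipeline described above (tokenize, retrieve, merge and rank) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine works: the three documents, the whitespace tokenizer and the "count of matching tokens" ranking are all assumptions made for the example.

```python
from collections import defaultdict

def tokenize(text):
    """Lowercase and split on whitespace; real systems use richer analyzers."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def keyword_search(index, query):
    """Rank documents by how many distinct query tokens they contain."""
    scores = defaultdict(int)
    for token in set(tokenize(query)):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical toy corpus.
docs = {
    "d1": "pop music charts this week",
    "d2": "best soda brands ranked",
    "d3": "classic pop and rock music",
}
index = build_inverted_index(docs)
results = keyword_search(index, "pop")
print(results)  # the soda document is never retrieved
```

Because "d2" shares no token with the query, it cannot appear in the results, which is exactly the vocabulary mismatch the "pop" versus "soda" example describes.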

In vector search, you start by converting both queries and documents to a vector in some high dimensional space. This is usually done by passing the text through a deep learning model like OpenAI’s LLMs or other language models. What you get as a result is an array of floating point numbers that can be thought of as a vector in some high dimensional space.

The core idea is that nearby vectors in this high dimensional space are also semantically similar. Going back to our example of “soda” and “pop”, if the model is trained on the right corpus, it is likely to consider “pop” and “soda” semantically similar and thereby the corresponding embeddings will be close to each other in the embedding space. If that is the case, then retrieving nearby documents for a given query becomes the problem of searching for the nearest neighbors of the corresponding query vector in this high dimensional space.
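
As a rough sketch of that idea, the snippet below runs an exact nearest neighbor search using cosine similarity. The three-dimensional vectors are hand-written stand-ins for embeddings (an assumption for illustration; real embeddings come from a language model and have hundreds or thousands of dimensions), and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_neighbors(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in scored[:k]]

# Hand-written toy "embeddings"; a real model would produce these.
doc_embeddings = {
    "soda brands": [0.9, 0.1, 0.0],
    "pop music": [0.1, 0.9, 0.1],
    "tax law": [0.0, 0.1, 0.9],
}
# Pretend embedding of the query "pop" in the soft-drink sense.
query_embedding = [0.8, 0.3, 0.0]

top = nearest_neighbors(query_embedding, doc_embeddings, k=1)
print(top)
```

Here the query lands closest to the "soda brands" vector even though the two share no keyword, assuming the model has placed them near each other as the answer above describes.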

Could you describe what a vector database is and how it enables the building of high-performance vector search applications?

A vector database stores, indexes and manages these embeddings (or vectors). The main challenges a vector database solves are:

  • Building an efficient search index over vectors to answer nearest neighbor queries
  • Building efficient auxiliary indices and data structures to support query filtering. For example, if you want to search over only a subset of the corpus, you should be able to leverage the existing search index without having to rebuild it
  • Supporting efficient updates and keeping both the data and the search index fresh, consistent, durable, etc.
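
To make these three responsibilities concrete, here is a deliberately naive in-memory sketch. `ToyVectorStore` and its methods are hypothetical names invented for this example and are not Pinecone's API; it scans every record on each query, whereas a real vector database would use a search index and filter-aware data structures to avoid that scan.

```python
import math

class ToyVectorStore:
    """In-memory, brute-force store; illustrative only."""

    def __init__(self):
        self._records = {}  # item id -> (vector, metadata)

    def upsert(self, item_id, vector, metadata=None):
        """Insert a new record or overwrite an existing one in place."""
        self._records[item_id] = (list(vector), dict(metadata or {}))

    def delete(self, item_id):
        self._records.pop(item_id, None)

    def query(self, vector, top_k=3, metadata_filter=None):
        """Return ids of the top_k closest records that match the filter."""
        metadata_filter = metadata_filter or {}
        hits = []
        for item_id, (stored, metadata) in self._records.items():
            # Skip records whose metadata does not satisfy every filter key.
            if any(metadata.get(key) != value
                   for key, value in metadata_filter.items()):
                continue
            hits.append((math.dist(vector, stored), item_id))
        hits.sort()
        return [item_id for _, item_id in hits[:top_k]]

store = ToyVectorStore()
store.upsert("a", [0.0, 0.0], {"lang": "en"})
store.upsert("b", [1.0, 0.0], {"lang": "fr"})
store.upsert("c", [2.0, 0.0], {"lang": "en"})

unfiltered = store.query([0.9, 0.0], top_k=2)
english_only = store.query([0.9, 0.0], top_k=2, metadata_filter={"lang": "en"})

# An update is visible to the very next query: no rebuild step here.
store.upsert("c", [1.0, 0.1], {"lang": "en"})
after_update = store.query([0.9, 0.0], top_k=2, metadata_filter={"lang": "en"})
```

The brute-force scan makes filtering and freshness trivial but costs time linear in the corpus size; the hard part the list describes is keeping those properties once a real index sits between the data and the query.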

What are the different types of machine learning algorithms that are used at Pinecone?

We generally work on approximate nearest neighbor search algorithms and develop new algorithms for efficiently updating, querying and otherwise dealing with large amounts of data in as cost effective a manner as possible.
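
For readers unfamiliar with approximate nearest neighbor search, the sketch below shows one well-known family of techniques: partition the vectors around centroids and search only the partitions closest to the query. It is not a description of Pinecone's algorithms. The class name, the two hand-picked centroids (normally learned, for example with k-means) and the `n_probe` parameter name are assumptions for illustration.

```python
import math

def closest(vector, candidates):
    """Return candidate keys ordered by Euclidean distance to `vector`."""
    return sorted(candidates, key=lambda key: math.dist(vector, candidates[key]))

class ToyPartitionedIndex:
    """Partition-based approximate search over a fixed set of centroids."""

    def __init__(self, centroids):
        self.centroids = dict(centroids)  # partition id -> centroid vector
        self.partitions = {pid: {} for pid in self.centroids}

    def add(self, item_id, vector):
        # Each vector is stored in the partition of its nearest centroid.
        pid = closest(vector, self.centroids)[0]
        self.partitions[pid][item_id] = list(vector)

    def search(self, vector, top_k=1, n_probe=1):
        # Only the n_probe partitions nearest the query are scanned.
        candidates = {}
        for pid in closest(vector, self.centroids)[:n_probe]:
            candidates.update(self.partitions[pid])
        return closest(vector, candidates)[:top_k]

index = ToyPartitionedIndex({"left": [0.0, 0.0], "right": [10.0, 0.0]})
index.add("a", [1.0, 0.0])
index.add("b", [4.9, 0.0])
index.add("c", [9.0, 0.0])
index.add("d", [5.2, 0.0])

query = [4.95, 0.0]
approximate = index.search(query, top_k=2, n_probe=1)  # scans "left" only
exhaustive = index.search(query, top_k=2, n_probe=2)   # scans both partitions
```

With `n_probe=1` the search misses "d", which sits just across the partition boundary, so it returns "a" as the second result instead. Probing more partitions recovers the exact answer at the cost of scanning more vectors, which is the accuracy versus cost trade-off that this class of algorithms tunes.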

We also work on algorithms that combine dense and sparse retrieval for improved search relevance.
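
One simple way to combine the two signals, shown here purely as an illustration and not as Pinecone's method, is a weighted sum of a dense (embedding) score and a sparse (keyword weight) score. The `alpha` weight, the toy vectors and the function names are assumptions for this sketch.

```python
def dot_dense(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))

def dot_sparse(a, b):
    """Dot product of sparse vectors stored as {term: weight} dicts."""
    return sum(weight * b[term] for term, weight in a.items() if term in b)

def hybrid_score(query, doc, alpha):
    """alpha=1.0 uses only the dense score; alpha=0.0 only the sparse score."""
    dense = dot_dense(query["dense"], doc["dense"])
    sparse = dot_sparse(query["sparse"], doc["sparse"])
    return alpha * dense + (1 - alpha) * sparse

def rank(query, docs, alpha):
    """Return document ids ordered by descending hybrid score."""
    return sorted(docs, key=lambda doc_id: -hybrid_score(query, docs[doc_id], alpha))

# Toy data: d1 matches the query keyword, d2 matches it semantically.
docs = {
    "d1": {"dense": [1.0, 0.0], "sparse": {"pop": 1.0, "music": 1.0}},
    "d2": {"dense": [0.0, 1.0], "sparse": {"soda": 1.0}},
}
query = {"dense": [0.0, 1.0], "sparse": {"pop": 1.0}}

dense_only = rank(query, docs, alpha=1.0)
sparse_only = rank(query, docs, alpha=0.0)
mostly_dense = rank(query, docs, alpha=0.7)
```

Moving `alpha` between 0 and 1 shifts the ranking between exact keyword matches and semantic matches, so the weight becomes a relevance tuning knob.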

What are some of the challenges behind building scalable search?

While approximate nearest neighbor search has been researched for decades, we believe there is a lot left to be uncovered.

In particular, designing large-scale nearest neighbor search that is cost effective, performing efficient filtering at scale, and designing algorithms that support high-volume updates and generally fresh indexes are all challenging problems today.

What are some of the different types of use cases that this technology can be used for?

The spectrum of use cases for vector databases is growing by the day. Apart from their use in semantic search, we also see them being used in image search, image retrieval, generative AI, security analytics, etc.

What is your vision for the future of search?

I think the future of search will be AI driven, and I don’t think this is very far off. In that future, I expect vector databases to be a core primitive. We like to think of vector databases as the long term memory (or the external knowledge base) of AI.

Thank you for the great interview. Readers who wish to learn more should visit Pinecone.

© 2024 All Rights Reserved - New York Tech Media
