Why vector databases are having a moment as the AI hype cycle peaks

Vector databases are all the rage, judging by the number of startups entering the space and the investors paying for a piece of the pie. The proliferation of large language models (LLMs) and the generative AI (GenAI) movement have created fertile ground for vector database technologies to flourish.

While traditional relational databases such as Postgres or MySQL are well suited to structured data (predefined data types that can be filed neatly in rows and columns), they work less well for unstructured data such as images, videos, emails, social media posts, and any other data that does not adhere to a predefined data model.

Vector databases, on the other hand, store and process data in the form of vector embeddings, which convert text, documents, images, and other data into numerical representations that capture the meaning and relationships between different data points. This is perfect for machine learning, as the database stores data spatially based on the relevance of each element to the other, making it easy to retrieve semantically similar data.

This is particularly useful for LLMs, such as OpenAI's GPT-4, as it allows the AI chatbot to better understand the context of a conversation by analyzing previous similar conversations. Vector search is also useful for all kinds of real-time applications, such as content recommendations on social networks or e-commerce apps, as it can look at what a user has searched for and retrieve similar items in an instant.
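As a rough illustration of what retrieving similar items means in practice, the sketch below ranks a handful of stored vectors by cosine similarity to a query vector. The three-dimensional embeddings and the `nearest` helper are made up for this example; real embeddings come from a model and have hundreds or thousands of dimensions, and production vector databases typically rely on approximate nearest-neighbor indexes rather than the brute-force scan shown here.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written embeddings (assumption: not produced by a real model).
EMBEDDINGS = {
    "kitten": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.9, 0.4],
    "lorry": [0.1, 0.8, 0.5],
}


def nearest(query_vector, k=2):
    """Brute-force search: score every stored vector, return the k best names."""
    scored = [
        (cosine_similarity(query_vector, vector), name)
        for name, vector in EMBEDDINGS.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


if __name__ == "__main__":
    # A query vector that points in roughly the same direction as the two feline entries.
    print(nearest([0.85, 0.15, 0.05]))
```

Nothing in the lookup compares the stored words themselves: items count as similar only because their vectors point in similar directions.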

Vector search can also help reduce “hallucinations” in LLM applications by providing additional information that may not have been available in the original training data set.
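One common way to supply that additional information is to retrieve the most relevant stored text and place it in the prompt sent to the model. The sketch below shows only that retrieval step, under stated assumptions: the two-dimensional embeddings are hand-written, `build_prompt` is a hypothetical helper, and no real LLM or vector database API is called. The sample documents restate facts from this article.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical document store: (text, embedding) pairs with made-up embeddings.
DOCUMENTS = [
    ("Qdrant raised $28 million in funding in January.", [0.9, 0.1]),
    ("Marqo was formed in 2021.", [0.1, 0.9]),
]


def build_prompt(question, question_vector, k=1):
    """Pick the k most similar documents and place them ahead of the question."""
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc: cosine_similarity(question_vector, doc[1]),
        reverse=True,
    )
    context = "\n".join(text for text, _ in ranked[:k])
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # Assumption: the question has already been embedded as [0.8, 0.2].
    print(build_prompt("How much did Qdrant raise?", [0.8, 0.2]))
```

The model then answers from the retrieved context rather than from whatever its training data happened to contain.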

“Without using vector similarity search, AI/ML applications can still be developed, but more retraining and fine-tuning would be required,” Andre Zayarni, CEO and co-founder of vector search startup Qdrant, explained to TechCrunch. “Vector databases come into play when there is a large data set and a tool is needed to work with vector embeddings in an efficient and convenient way.”

In January, Qdrant raised $28 million in funding to capitalize on the growth that saw it become one of the top 10 fastest-growing commercial open source startups last year. And it's far from the only vector database startup to raise money lately: Vespa, Weaviate, Pinecone, and Chroma collectively raised $200 million last year for various vector offerings.

Qdrant founding team. Image credits: Qdrant

Since the beginning of the year, we have also seen Index Ventures lead a $9.5 million seed round in Superlinked, a platform that transforms complex data into vector embeddings. And a few weeks ago, Y Combinator (YC) unveiled its Winter '24 cohort, which included Lantern, a startup that sells a hosted vector search engine for Postgres.

Elsewhere, Marqo raised a $4.4 million seed round late last year, quickly followed by a $12.5 million Series A round in February. The Marqo platform provides a full range of out-of-the-box vector tools, covering vector generation, storage, and retrieval, allowing users to bypass third-party tools such as those from OpenAI or Hugging Face, and it delivers everything through a single API.

Marqo co-founders Tom Hamer and Jesse Clark previously worked in engineering roles at Amazon, where they realized the “huge unmet need” for flexible, semantic search across different modalities, such as text and images. That is when they jumped ship to form Marqo in 2021.

“Working with visual and robotic search at Amazon was when I really looked at vector search; I was thinking about new ways to discover products, and that converged very quickly on vector search,” Clark told TechCrunch. “In robotics, I was using multimodal search to search through a lot of our images and identify if there were errant things like hoses and packages. Otherwise, this would be very difficult to resolve.”

Marqo co-founders Jesse Clark and Tom Hamer. Image credits: Marqo

Enter the enterprise

While vector databases are having a moment amid the ChatGPT hoopla and the GenAI movement, they are not a panacea for all enterprise search scenarios.

“Dedicated databases tend to focus entirely on specific use cases and can therefore design their architecture for performance on the tasks needed, as well as for the user experience, compared to general-purpose databases, which must fit it into their current design,” Peter Zaitsev, founder of database services and support company Percona, explained to TechCrunch.

Specialized databases may excel at one thing to the exclusion of others, which is why we are starting to see database incumbents such as Elastic, Redis, OpenSearch, Cassandra, Oracle, and MongoDB adding vector search smarts to the mix, as are cloud service providers like Microsoft's Azure, Amazon's AWS, and Cloudflare.

Zaitsev compares this latest trend with what happened with JSON more than a decade ago, when web applications became more prevalent and developers needed a language-independent data format that was easy for humans to read and write. In that case, a new class of database emerged in the form of document databases like MongoDB, while existing relational databases also introduced JSON support.

“I think the same thing is likely to happen with vector databases,” Zaitsev told TechCrunch. “Users who are building very complicated, large-scale AI applications will use dedicated vector search databases, while people who need to create some AI functionality for their existing application are more likely to use vector search functionality in the databases they already use.”

But Zayarni and his colleagues at Qdrant are betting that native solutions built entirely around vectors will provide the “speed, memory safety and scale” needed as vector data explodes, compared to companies that bolt on vector search as an afterthought.

“Their argument is, 'we can also do vector searches, if necessary,'” Zayarni said. “Our argument is: 'we do advanced vector search in the best way possible.' It's all a matter of specialization. In fact, we recommend starting with any database you already have in your technology stack. At some point, users will face limitations if vector search is a critical component of their solution.”

Technical Terrence Team
