Superlinked Democratizes Real-Time Semantic Search

This Python framework combines structured and unstructured data to build high-performance search and recommendation applications.

May 20th, 2025 10:30am by Meredith Shubel

Vector databases, often lauded as a key component of new AI-driven architectures, are having a moment. They play an important role in enabling semantic search and real-time information retrieval — but actually driving vector search and interpreting results is a challenge.

Ben Gutkovich, co-founder and COO of Superlinked, explained: “Enterprises obviously run on structured data, but there’s a lot of unstructured data by volume. When you’re building a solution for real-time search, you want to take this data into account.”

Therein lies the challenge.

For e-commerce companies, for example, this means not just accounting for a product’s description and image but also its price, stock levels and availability at different locations. According to Gutkovich, “The current approaches are suboptimal at best. They’re either hard or don’t work at all.”

One common approach is what Gutkovich calls “stringify-and-embed,” where you serialize all structured and unstructured data into a single text string and embed that string with a large language model (LLM). But Gutkovich pointed out that this approach severely limits filtering capabilities: “[For example], if I want to prioritize items that I have significant stock of, there’s no way of doing this.”
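To make the limitation concrete, here is a minimal sketch of one alternative to stringifying: keep the text and the stock level as separate vector segments and weight them at query time. This is not Superlinked's API. The catalog, the toy bag-of-words `embed_text`, the quarter-circle `embed_stock` encoding and the weights are all assumptions made for illustration.

```python
import math

# Hypothetical catalog: free text plus one structured field.
PRODUCTS = [
    {"id": "a", "description": "red running shoes", "stock": 2},
    {"id": "b", "description": "red running shoes lightweight", "stock": 180},
    {"id": "c", "description": "blue winter jacket", "stock": 500},
]

# Fixed vocabulary so the toy text embedding is deterministic.
VOCAB = sorted({w for p in PRODUCTS for w in p["description"].split()})


def embed_text(text):
    """Toy stand-in for a real embedding model: a normalized bag of words."""
    words = text.split()
    counts = [float(words.count(w)) for w in VOCAB]
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]


def embed_stock(stock, max_stock=500):
    """Map a number onto a quarter circle so similar stock levels get similar vectors."""
    angle = (min(stock, max_stock) / max_stock) * math.pi / 2
    return [math.sin(angle), math.cos(angle)]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def search(query_text, text_weight, stock_weight):
    """Rank products by a weighted sum of per-segment similarities."""
    q_text = embed_text(query_text)
    q_stock = embed_stock(500)  # the query side asks for "as much stock as possible"
    scored = []
    for p in PRODUCTS:
        text_sim = dot(q_text, embed_text(p["description"]))
        stock_sim = dot(q_stock, embed_stock(p["stock"]))
        scored.append((text_weight * text_sim + stock_weight * stock_sim, p["id"]))
    return [pid for _, pid in sorted(scored, reverse=True)]


text_only = search("red running shoes", text_weight=1.0, stock_weight=0.0)
stock_aware = search("red running shoes", text_weight=1.0, stock_weight=0.5)
```

With the stock weight at zero, the nearly out-of-stock product "a" ranks first because its description matches best. Raising the stock weight moves the well-stocked "b" ahead of it, which is the kind of control a single stringified embedding does not expose.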

A New Way To Power Vector Search

To overcome this challenge and realize the full potential of vector search, Gutkovich pointed to Superlinked as the answer.

Superlinked is a Python framework and a cloud infrastructure designed to help AI engineers combine structured and unstructured data to build high-performance search and recommendation applications. Specifically, it is designed to let developers create, manage, update, delete and otherwise maintain vector embeddings, ultimately enabling them to retrieve contextually relevant results.

Or as Gutkovich simply described it: “We’re solving information retrieval.”

Doing so requires first understanding which data is needed to generate the best results, and then designing an algorithm to deliver those results. “Basically, in a nutshell, we allow you to build a data schema relevant for your data. Then you connect your data sources, compile to the cloud and get an API to connect to your application,” Gutkovich explained.

Recommendation Systems

Gutkovich says founding Superlinked in 2020 was a natural next step for him, as he already had years of experience improving recommendations with structured data.

As former head of business development at easyCar Club, Gutkovich said he had helped take the company’s simple, price- and location-focused search and transform it into a more sophisticated mechanism capable of considering more diverse attributes.

When he met Daniel Svonava, Superlinked co-founder and CEO, in 2020, Gutkovich said Svonava had a similar story from working on a marketplace for communities: “We decided to team up and build this recommendation system as a service solution … letting people actually choose which data they want to include, how they want to embed it, and also providing them [with] APIs to push the data in and pull the recommendation out of the system.”

E-commerce was the first obvious fit for their solution.

Suppose a user visits a website to buy shoes. “Even though in the past they may have bought a jacket, you don’t want to show them more jackets if they search for shoes,” said Gutkovich. “You want to show them in real-time the most relevant results.” This is where Superlinked’s online component comes into play, handling real-time queries and real-time updates.

But after the user has bought their shoes and moved on, the work isn’t done. “You want to normalize [the session],” he continues. “You don’t want them to see shoes … the next time they come to your website.” Here, the batch-processing component steps in. “After the session is over, it takes into account all the history and aligns the weights according to overall history, not just the current session,” he explains.
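The interplay between the two components can be sketched as a simple blend of interest profiles. This is an illustrative assumption, not Superlinked's actual algorithm: the `blend` helper, the category names and the `session_share` values are all hypothetical.

```python
def blend(history, session, session_share):
    """Weighted average of two interest profiles, category by category."""
    categories = set(history) | set(session)
    return {
        c: (1 - session_share) * history.get(c, 0.0)
        + session_share * session.get(c, 0.0)
        for c in categories
    }


# Hypothetical long-term profile: this user has mostly bought jackets.
history = {"jackets": 0.7, "shoes": 0.1, "hats": 0.2}
# Hypothetical current session: the user is only looking at shoes.
session = {"shoes": 1.0}

# Online step: while the session is live, lean heavily on current behavior.
live_profile = blend(history, session, session_share=0.9)

# Batch step: once the session ends, fold it into the history with a small
# weight so one shopping trip does not dominate future recommendations.
updated_history = blend(history, session, session_share=0.2)
```

During the session, shoes dominate `live_profile`. After the batch step, jackets are again the strongest interest in `updated_history`, with shoes nudged upward rather than taking over.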

In this way, Superlinked enables personalized search that can evolve with each user session. While effective, this was only a precursor to the Superlinked that exists today.

New Funding Supercharges Growth

After ChatGPT was released, things changed: “Suddenly, everyone wanted to work with vector embeddings,” Gutkovich recalls.

With ChatGPT came new interest in semantic search, contextual understanding and real-time information retrieval — all reliant on vector embeddings and efficient vector search. At that point, Superlinked had already built an internal system to combine structured and unstructured data and create real-time, relevant recommendations. For the many enterprises eager to use LLMs but struggling to integrate structured data to refine recommendations, Superlinked seemed to already have the solution.

It would prove a significant turning point for the startup.

In March 2024, Superlinked announced $9.5 million in seed funding, led by Index Ventures and Tomasz Tunguz. Before the seed round, Superlinked was a small team of eight people. Now, they number over two dozen. “We hired experts in the space — machine learning engineers, data scientists. … The idea was to build a big team,” explains Gutkovich.

That’s not all the funding did. The team also decided to unbundle Superlinked as it was and broaden it beyond the recommendation system use case. “Now you can actually use it in a range of different use cases,” he continues, “supporting any vector database [you] might want to use, any data source.” As part of this unbundling and broadening, Superlinked also built the whole batch platform to work at scale.

Looking Ahead: Democratizing ML

There’s a lot on the horizon for the framework.

Retrieval-augmented generation (RAG) is one popular new use case. For example, for internal enterprise search, Superlinked can help ensure only the most relevant data makes it into LLMs. Fraud detection is another space where the framework can be useful. With each financial transaction comes an abundance of structured and unstructured data; Superlinked can help operations teams filter all this data to flag risky transactions while minimizing false positives.

Using vector embeddings to power search and improve recommendations isn’t necessarily new, but until now, adoption has been mostly confined to the tech giants due to limited access to tooling. By simplifying vector search, Superlinked stands to unlock scalable, real-time semantic search for all enterprises. According to Gutkovich, it’s all part of the team’s mission to democratize machine learning: “With vector search now being so popular … and with the advances in LLMs and general AI models, we believe [Superlinked] has the potential to help developers build ML-powered applications without having to understand data science or MLOps.”