DataStax Exec Talks About Recent Acquisition That Gives Businesses Powerful AI Capabilities
DataStax's chief product officer also details how midmarket companies are using the platform
Earlier this month, DataStax acquired Langflow, which provides a framework for developing generative AI apps, CRN reported.
While DataStax is mostly known for its Astra DB vector database platform built on Apache Cassandra, the company has been focusing on its AI strategy for well over a year, even before the acquisition, Ed Anuff, chief product officer at DataStax, told MES Computing in an interview.
Anuff spoke about how the acquisition has allowed DataStax to offer scalable, customizable AI apps to its customers and how the offering can benefit midmarket organizations.
Give us an idea of how the solution works for building a GenAI app.
What emerged starting a little over a year ago was this idea called retrieval augmented generation. What [RAG] says is, 'I'm going to ask a question to the AI, to the large language model.' Before I ask that question, imagine I'm asking a shopping bot what I should buy because I'm preparing a meal, and imagine I have an e-commerce site for my supermarket.
Before I give [that question] to the LLM, I'm going to first look things up in [my] database; maybe I have a database of my inventory. I want to get all of the things in my inventory that are relevant to the question my customer just asked. That will give us a list of products.
I'll take the customer's question and that list of products behind the scenes, and I'll give that to the LLM. The LLM will then compose a reply and say back to the original customer, 'You're trying to create this meal, you should use this particular ingredient in aisle five. And you should use this and this … and, by the way, you have indicated in the past that you have these three food allergies, so don't use this, this or this—instead substitute these other things.'
In that situation, what retrieval augmented generation is doing is … it's taking the question the user is asking, augmenting it in advance with a bunch of information that it's retrieved, and giving that to the model. The model is then able to give a very personalized, targeted response based on the company's data.
When a company does that, a couple of good things happen. The first is that rather than having the model speculate or try to pull things out of the data it was trained on, which sometimes results in hallucinations, it limits its response. It's been instructed to limit its response to the data that's been supplied to it. The model has been told, 'When you generate a response to the customer, use as much as possible this data we're giving you along with that question.'
That's what we call grounding, and grounding is a way that we can eliminate the hallucinations or inaccuracies in the response.
The second thing, equally important, is that [the responses are] now highly personalized and leverage the company's unique data, which means they're going to have much more business impact. Most of the businesses we see using AI, that's what they're doing. They're building along this type of pattern.
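The pattern Anuff describes maps onto a small amount of code. The sketch below is a minimal illustration of retrieval augmented generation, not DataStax's implementation: the retrieval function, the product list and the model name are placeholders, and the OpenAI Python client is used only as one example of an LLM API.

```python
# Minimal RAG sketch: retrieve relevant items, then ground the model's answer in them.
# The product lookup is a stand-in for whatever database the application actually uses.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def retrieve_relevant_products(question: str) -> list[str]:
    """Hypothetical lookup against the store's inventory.
    In a real system this would be a vector or keyword search."""
    return [
        "Fresh basil - aisle 5",
        "Gluten-free pasta - aisle 7",
        "Olive oil - aisle 3",
    ]


def answer_with_rag(question: str) -> str:
    products = retrieve_relevant_products(question)
    # Grounding: instruct the model to answer only from the supplied data.
    system = (
        "You are a supermarket shopping assistant. "
        "Answer using ONLY the product list provided. "
        "If the list doesn't cover the question, say so."
    )
    context = "Available products:\n" + "\n".join(products)
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system},
            {"role": "user", "content": f"{context}\n\nCustomer question: {question}"},
        ],
    )
    return response.choices[0].message.content


print(answer_with_rag("What do I need for a simple pasta dinner?"))
```

The key point of the sketch is the order of operations: the database lookup happens before the model is called, and the retrieved data travels in the prompt, which is what produces the grounded, personalized answer Anuff describes.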
Which LLMs can customers use with your solution?
We actually let people select which LLM they want to use. It's one of the biggest choices that people make. The OpenAI model is the largest production model in the world and it's expensive.
There are a lot of smaller models, and you could go with Google, you could go with Anthropic. There are free models you can use—Facebook released Llama 2, or you may have heard about a French company, Mistral, [that] has created a model.
These are very good models; they're not as powerful [as OpenAI's], but they're a lot cheaper to use. So businesses [have to ask themselves], 'Do I want this more powerful model that might be too expensive if it's something I am putting, for example, on my website? Or if it's something that is internal use only, maybe something I'm giving to my internal support people because they want to use it as a knowledge base, then a less expensive [model] might be the way to go.'
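In code, that choice often reduces to a single configuration value. The snippet below is illustrative only: the model names are real, but the routing logic and the use-case labels are assumptions, not DataStax's configuration.

```python
# Illustrative model selection: route public-facing traffic to a more capable
# (and more expensive) model, and internal tools to a cheaper open model.
MODEL_CHOICES = {
    "customer_facing": "gpt-4o",            # quality matters most
    "internal_support": "llama-2-70b-chat",  # cost matters most
}


def pick_model(use_case: str) -> str:
    # Fall back to a small, inexpensive model for anything unclassified.
    return MODEL_CHOICES.get(use_case, "mistral-7b-instruct")


print(pick_model("internal_support"))  # -> llama-2-70b-chat
```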
The solution sounds like complex technology that mostly larger enterprises would adopt. Is DataStax's AI app creation solution a fit for the midmarket?
We're actually seeing a lot of smaller companies able to achieve quite a bit. It might sound complicated, [but] are they doing any of their own custom software development or applications? If so, building these types of systems [is] similar in complexity to building a website. If you have people that are using JavaScript, they can build these types of systems.
[That's] part of the reason we acquired Langflow. What Langflow allows you to do is design these types of chatbots entirely visually. It's still for a developer, but it doesn't have to be a super advanced developer.
But your developers can build these things quickly, and the first step for every business is to experiment and see, 'What can it do for my business?'
We've tried to make the experimentation process a lot easier before you spend a lot of money building something. You're able to … prototype [a] chatbot. Let me go ask it some questions, let me supply it with information from my company—Langflow will integrate with any database, not just ours.
If I can make it possible for me to ask questions of the data that I already have in my company, I can now go and have some positive business impact. I can make it easier for my customers to support themselves, for example.
So yes, if [the midmarket is] looking to do something custom and specific for their business, it's definitely feasible to do so.
Can you share any specific use cases a midmarket company has implemented with this solution?
A lot of it ends up being customer support use cases. The biggest challenge for any business, but especially for smaller businesses, is: how do I respond to these customer questions on a fast, timely basis?
You may have a very overworked customer support or even sales staff … and suddenly, by giving them AI tools, they are able to generate responses; it might be something they just copy and paste into an email to send to a customer. It ends up having a lot of impact. You now have the ability to step up your professionalism and respond in a timely manner with well-crafted responses. This cuts across all business sizes; we see this in 10-person companies, and we see this in major enterprises as well.
What are the security implications of adding AI apps to your organization? Are we opening new security vectors?
The LLM itself doesn't remember anything—it's a black box. You give it a set of information and it generates a response; it doesn't actually remember the stuff you give it.
That's one of the concerns people have: 'When I ask something of an LLM, does it take my data? Is it going to leak my data?'
The LLM itself won't, but the systems around it will. This is one of the reasons why, rather than using ChatGPT by itself [ChatGPT is not just an LLM, it's a database, though OpenAI takes a lot of security precautions], we see many companies going and building their own systems that keep their data safe.
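The statelessness Anuff points to is visible at the API level. The sketch below, which assumes the OpenAI chat completions API purely as an example, shows that each call carries no memory of previous calls; any "memory" has to be re-sent by the application, which is where the real data-handling responsibility lives.

```python
# Each API call is stateless: the model only sees what is in this request's messages.
from openai import OpenAI

client = OpenAI()

# First call: the model sees this message and nothing else.
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "My order number is 12345."}],
)

# Second call: unless the application resends the earlier message,
# the model has no knowledge of the order number.
reply = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is my order number?"}],
)
print(reply.choices[0].message.content)  # the model can't know; the app holds the state
```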