Kinetica, ChatGPT Open a New Era of ‘Conversational Queries’ with Databases

Database provider Kinetica is integrating with ChatGPT to let users perform natural language ‘conversational queries’ and eliminate the need to know SQL.  IDN spoke with Kinetica’s Phil Darringer about the new wave of user-friendly ways to work with enterprise data.  

Tags: AI, ChatGPT, databases, Generative AI, natural language, queries, SQL, vectorized

Phil Darringer
vp product management
Kinetica


"We think this whole idea of a ‘conversational query’ with ChatGPT will open the door beyond those who are proficient in SQL to any type of business user."


ChatGPT is set to unleash a wave of new user-friendly ways to work with enterprise databases.

 

One of the first out of the gate is the Kinetica vectorized analytics database. Kinetica’s integration with ChatGPT lets users ask natural language questions – and skip the need to know SQL (Structured Query Language). Kinetica calls this next-level, user-friendly interface “conversational query.”

 

“With ‘conversational query,’ ChatGPT understands the user's question and intent and translates the natural language input into formal SQL queries the database understands,” Phil Darringer, Kinetica vice president of product management, told IDN. In addition, ‘conversational query’ “allows users to ask questions using their own words and phrasing,” he added.

 

Perhaps even more powerful, Darringer said, ChatGPT lets users have a long-running, interactive conversation with their data. After their initial question or query, users can “continue to refine their queries” with follow-up questions to get more precise, fine-tuned results, he said.

 

“We think this whole idea of a ‘conversational query’ with ChatGPT will open the door beyond those who are proficient in SQL to any type of business user who can frame all kinds of questions as pure natural language prompts,” Darringer said.

Beyond ease-of-use for non-technical users, ‘conversational queries’ also offer other benefits, Darringer said, including:

  • Increased Productivity: By providing real-time access to information, users get immediate answers to their questions -- without waiting for long-running queries or data pipelines to be built. This saves time and improves overall efficiency.
  • Improved Data Insights: With the ability to ask a series of iterative questions, users can uncover new patterns or unexpected correlations and relationships that may not have been immediately apparent through traditional queries.

The Many Powers of Kinetica’s Integration with ChatGPT

Much of the magic of the ChatGPT/Kinetica solution comes from the well-designed integration of the two technologies.

 

ChatGPT has strong innate capabilities to convert natural language to well-formed SQL, Darringer told IDN, and went on to highlight some noteworthy ones.  

“ChatGPT has been trained on Internet data, it can write code very well and it knows SQL very well and knows SQL join operations, and that has delivered a lot of great results [with Kinetica] already,” he said. That said, ChatGPT also needs to be aware of some specifics about a user’s data, such as the names and definitions of columns, what type of data each column holds, and so forth, he added. But that is not a big barrier for ChatGPT. “Actually [ChatGPT] does quite well just by providing the DDL for the tables.”

 

In certain cases, though, Darringer noted, “ChatGPT needs some ‘hints’ on how to work well with Kinetica. So, we can [let users easily] provide optional metadata and context to ChatGPT, telling it how to express SQL in a way that works best for Kinetica.”
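
The flow Darringer describes, in which table DDL plus optional hints give the model enough context to write SQL, might be sketched as follows. This is a minimal illustration, not Kinetica's implementation: the function name `build_sql_prompt`, the prompt wording, the sample `trips` table, and the hint text are all hypothetical, and the call to a language model is deliberately left out.

```python
from typing import Sequence


def build_sql_prompt(ddl: str, question: str, hints: Sequence[str] = ()) -> str:
    """Assemble a text prompt asking a language model to turn a question into SQL.

    Hypothetical helper: the layout and wording are assumptions for illustration.
    """
    parts = [
        "Translate the user's question into a single SQL query.",
        "Use only the tables and columns defined below.",
        "",
        "Schema (DDL):",
        ddl.strip(),
    ]
    # Hints are optional, mirroring the "optional metadata and context" in the article.
    if hints:
        parts.append("")
        parts.append("Hints:")
        parts.extend(f"- {hint}" for hint in hints)
    parts.extend(["", f"Question: {question.strip()}", "SQL:"])
    return "\n".join(parts)


# Hypothetical table definition; not taken from any real Kinetica schema.
SAMPLE_DDL = """
CREATE TABLE trips (
    trip_id INTEGER,
    pickup_time TIMESTAMP,
    fare_amount DOUBLE
);
"""

prompt = build_sql_prompt(
    SAMPLE_DDL,
    "What was the average fare last week?",
    hints=["fare_amount is in US dollars"],
)
print(prompt)
```

A follow-up question in the same conversation could reuse the same schema and hints, so only the question text changes between turns.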

As for Kinetica, its high-performance vectorized database provides a speedy response to conversational queries, giving users “true ad-hoc data analysis at speed,” Darringer told IDN.

A key aspect of the Kinetica database’s rapid response comes from the company’s use of “native vectorization.”

 

In a vectorized query engine, data is stored in fixed-size blocks (called vectors). Query operations are performed on these vectors in parallel -- rather than on individual data elements. This approach allows the query engine to process multiple data elements simultaneously, which delivers faster query results with less compute. Kinetica’s vectorization is optimized for GPU and CPU advances, which allow the database to perform simultaneous calculations on multiple data elements and process them in parallel across multiple cores or threads.
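
To make the block-at-a-time idea concrete, here is a minimal sketch that contrasts a row-at-a-time filter-and-sum with one that splits the column into fixed-size blocks and hands each block to a worker. It illustrates only the structure of vectorized execution: the block size, function names, and sample data are assumptions, pure Python threads gain no SIMD or GPU speedup, and nothing here reflects Kinetica's actual engine.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Sequence

BLOCK_SIZE = 4  # Assumed block ("vector") size; real engines use far larger blocks.


def sum_row_at_a_time(values: Sequence[float], threshold: float) -> float:
    """Scalar execution: test and accumulate one element per step."""
    total = 0.0
    for value in values:
        if value > threshold:
            total += value
    return total


def split_into_blocks(values: Sequence[float], size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive fixed-size blocks (the last one may be shorter)."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def sum_block(block: Sequence[float], threshold: float) -> float:
    """Apply the filter and a partial sum to a whole block in one operator call."""
    return sum((value for value in block if value > threshold), 0.0)


def sum_vectorized(values: Sequence[float], threshold: float) -> float:
    """Block-at-a-time execution: workers process blocks, then partial sums are combined."""
    blocks: List[Sequence[float]] = list(split_into_blocks(values, BLOCK_SIZE))
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda block: sum_block(block, threshold), blocks)
        return sum(partials, 0.0)


fares = [5.0, 12.5, 8.0, 30.0, 2.5, 18.0, 9.5, 40.0, 7.0]
print(sum_row_at_a_time(fares, 10.0))
print(sum_vectorized(fares, 10.0))
```

Both functions return the same total; the difference is that the second applies each operator to a whole block, which is the unit a vectorized engine can spread across cores or hardware lanes.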

ChatGPT/Kinetica is Ready for the ‘Never-Before-Seen Question’

Another standout feature of the ChatGPT/Kinetica marriage is the solution’s ability to provide rapid responses to unexpected queries, or ‘the never-before-seen’ question, Darringer told IDN.

“If I were to estimate, probably 80-90% of SQL queries are known in advance,” Darringer told IDN. “They’ve been defined against a well-defined schema with a well-defined data pipeline, a well-defined data dictionary and all the data transformations are all locked down. So, most queries are all pre-planned.”

 

That’s because asking a ‘never-before-seen’ (or unplanned) question poses some real technical problems, Darringer added.

 

“Such ‘never-seen-before’ questions may take hours to run -- or not complete at all,” Darringer told IDN. That’s because data engineers have to be sure the database is ready for them. “You’ve got to know if the data is even there, and is it set up to perform well against that query. Even after all that, you have to know: Do I have to re-engineer my data pipeline? Will my indices support that type of new question that I haven’t seen before?”

 

Kinetica’s database is architected to be ready for such unanticipated questions. Without any pre-engineering, Kinetica can handle any question thrown at it and send responses in seconds, Darringer said.

 

“We work well without indices, and don’t require data engineering. This flexibility lets users ask a wide range of questions, even never-before-seen questions – without the need for data engineers to do a lot of set up work.”  

Kinetica’s ready-for-any-question architecture plays right into the benefits of the ChatGPT/Kinetica integration, Darringer added. “If ChatGPT can create great SQL, but you don't know if you’re set up for a type of query, don't worry. You’ll get a result back and in the timeframe you expect [from Kinetica] when you're having a conversation with ChatGPT.”


Other Benefits from the Kinetica/ChatGPT Integration

 

Darringer shared other benefits of the well-crafted ChatGPT/Kinetica integration, including:

  • Real-time Data Processing: Enables users to analyze and respond to data in real time. This provides immediate insights and actions, based on data flowing through the system.
  • Scalability and Performance: Kinetica’s distributed and parallel processing capabilities enable ChatGPT to handle large volumes of data and concurrent user interactions, with high efficiency and speed. It allows for seamless scalability, ensuring ChatGPT can keep pace with increasing workloads.
  • Advanced Analytics: Kinetica comes with built-in advanced analytics capabilities, including machine learning, geospatial processing and time series analysis. This enhances the accuracy of ChatGPT’s responses and provides a more personalized user experience.

One other dimension of query flexibility worth noting is how the ChatGPT/Kinetica integration will allow users to tap other data sources for their answers.

 

Kinetica’s vectorized database architecture is purpose-built to incorporate multiple databases and analysis techniques, Darringer said.

 

With a ChatGPT front-end, Kinetica users can extend a relational database query to add data from the following (in any combination):

  • graph analysis (to uncover relationships between data points),
  • temporal analysis (to analyze data over time),
  • spatial analysis (to understand geographic patterns) and
  • machine learning (to provide deeper insights from data)

Darringer shared an example of how this would work. “You could set that up as an external table defined in Kinetica, but then once you do that you could then write queries – or ChatGPT would help you write the queries – that would join the data resident in Kinetica,” he said.
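
The pattern Darringer describes, registering outside data and then joining it with data already in the database, can be illustrated with a small stand-in. The sketch below uses Python's built-in `sqlite3` module and SQLite's `ATTACH DATABASE` in place of a Kinetica external table; the table names, columns, and rows are invented for illustration, and Kinetica's own external-table syntax will differ.

```python
import sqlite3

# The main connection stands in for data resident in the database.
conn = sqlite3.connect(":memory:")

# A second, attached database stands in for the external source.
# It is attached first because SQLite refuses ATTACH inside an open transaction.
conn.execute("ATTACH DATABASE ':memory:' AS ext")

conn.execute("CREATE TABLE trips (trip_id INTEGER, pickup_date TEXT, fare REAL)")
conn.execute("CREATE TABLE ext.weather (obs_date TEXT, sky TEXT)")

conn.executemany(
    "INSERT INTO trips VALUES (?, ?, ?)",
    [(1, "2023-06-01", 12.5), (2, "2023-06-01", 30.0), (3, "2023-06-02", 8.0)],
)
conn.executemany(
    "INSERT INTO ext.weather VALUES (?, ?)",
    [("2023-06-01", "rain"), ("2023-06-02", "clear")],
)

# One query joins the resident table with the externally sourced one.
rows = conn.execute(
    """
    SELECT w.sky, COUNT(*) AS trip_count, AVG(t.fare) AS avg_fare
    FROM trips AS t
    JOIN ext.weather AS w ON w.obs_date = t.pickup_date
    GROUP BY w.sky
    ORDER BY w.sky
    """
).fetchall()

for sky, trip_count, avg_fare in rows:
    print(sky, trip_count, avg_fare)

conn.close()
```

In a conversational setting, the join itself is the part a model would be asked to write, given the definitions of both tables.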


“When users can enter conversations with their data using natural language, we think these types of questions will naturally require certain graph capabilities, as well as time and spatial data,” he added.

Free Trials and Getting Started with Kinetica/ChatGPT

For those who want to see how the Kinetica/ChatGPT solution works, Kinetica is offering a free trial SaaS-based environment using public data.

 

Not surprisingly, customers are already eager for a Kinetica/ChatGPT database solution that can run on premises or with their companies’ private data. “That will all come very shortly here,” Darringer told IDN.

 

 



