Chat with your database to get instant data visualization

Lotus Labs
3 min read · Jan 6, 2025


What is Vanna

Vanna is a Python package that uses retrieval-augmented generation (RAG) to help you generate accurate SQL queries for your database using LLMs.

Architecture of Vanna

How Vanna Revolutionizes Data Analysis

In the realm of data-driven decision-making, tools that bridge the gap between complex databases and intuitive user interaction are invaluable. Vanna is one such innovation, offering a streamlined way to convert natural language queries into actionable SQL commands, complete with visualized outputs. Here’s a closer look at how Vanna operates to deliver these remarkable results.

Understanding the User’s Question

The process begins with a user’s question, expressed as a plain string of natural language text. The question might range from “What are the monthly sales trends?” to “Show me the top-performing products last quarter.” At its core, Vanna focuses on interpreting the intent behind these questions.

To achieve this, Vanna leverages the training data associated with the specific database schema. The schema serves as the backbone, providing the structural and relational details of the database. Alongside this, additional documentation is used, such as KPI (Key Performance Indicator) definitions, business logic, and metadata. These supplementary materials ensure that the context and nuances of the user’s query are understood accurately.
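
The retrieval step described here can be illustrated with a minimal sketch. It is not Vanna's implementation: the `TRAINING_ITEMS` list, the `tokenize` and `retrieve_context` helpers, and the table definitions are all hypothetical, and plain word overlap stands in for the embedding-based similarity search a real retrieval layer would typically use.

```python
import re

# Hypothetical training items: schema (DDL) statements and business documentation.
TRAINING_ITEMS = [
    {"kind": "ddl", "text": "CREATE TABLE sales (id INTEGER, product TEXT, amount REAL, sold_on DATE)"},
    {"kind": "ddl", "text": "CREATE TABLE customers (id INTEGER, name TEXT, region TEXT)"},
    {"kind": "documentation", "text": "Monthly sales means the sum of amount grouped by calendar month of sold_on."},
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def retrieve_context(question, items, top_k=2):
    """Return up to top_k items that share the most words with the question."""
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(item["text"])), item) for item in items]
    scored = [pair for pair in scored if pair[0] > 0]  # drop items with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in scored[:top_k]]


context = retrieve_context("What are the monthly sales trends?", TRAINING_ITEMS)
for item in context:
    print(item["kind"], "->", item["text"])
```

In this toy setup the KPI definition and the `sales` table are retrieved, while the unrelated `customers` table is left out of the context.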

Using Examples to Enhance Accuracy

To further refine the understanding of queries, Vanna incorporates example SQL queries. These examples act as benchmarks or templates, guiding the system in forming precise and relevant SQL statements. By doing so, Vanna ensures that its interpretations align closely with the organization’s data structures and analytical needs.
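
As a rough illustration of how stored examples might be matched to a new question, the sketch below picks the stored question-SQL pair whose question is most similar to the incoming one. The `EXAMPLES` list and the helper names are hypothetical, and `difflib` string similarity from the standard library is an assumption made to keep the example dependency-free. It is not how Vanna ranks examples.

```python
from difflib import SequenceMatcher

# Hypothetical question-SQL pairs previously stored as training examples.
EXAMPLES = [
    (
        "What were total sales per month?",
        "SELECT strftime('%Y-%m', sold_on) AS month, SUM(amount) AS total "
        "FROM sales GROUP BY month ORDER BY month",
    ),
    (
        "Which customers are in each region?",
        "SELECT region, name FROM customers ORDER BY region, name",
    ),
]


def similarity(a, b):
    """Case-insensitive string similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def closest_example(question, examples):
    """Return the (question, sql) pair whose question best matches the input."""
    return max(examples, key=lambda pair: similarity(question, pair[0]))


best_question, best_sql = closest_example("What are the monthly sales trends?", EXAMPLES)
print(best_question)
print(best_sql)
```

The selected pair can then be shown to the LLM as a template, so the generated SQL follows the table names and date handling the organization already uses.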

Generating SQL Queries with an LLM

Once the user’s intent is clear, the retrieved context (schema, documentation, and example queries) is inserted into a preset prompt and sent to a Large Language Model (LLM) of your choice. The LLM is not retrained on the database; the prompt supplies the context it needs to generate SQL tailored to the database in question. Combining Vanna’s retrieved context with the LLM’s generative capabilities helps keep the generated SQL accurate.
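
To make the prompt step concrete, here is a minimal sketch of assembling context into a single prompt and handing it to a model. The prompt wording, the `build_prompt` and `generate_sql` helpers, and the `fake_llm` stand-in are all assumptions for illustration; Vanna's actual prompt template and LLM connectors are not reproduced here.

```python
def build_prompt(question, ddl_statements, documentation, examples):
    """Assemble one prompt string from the retrieved context and the question."""
    lines = ["You write SQL for the database described below.", "", "Schema:"]
    lines.extend(ddl_statements)
    lines.extend(["", "Documentation:"])
    lines.extend(documentation)
    lines.extend(["", "Example queries:"])
    for example_question, example_sql in examples:
        lines.append(f"-- {example_question}")
        lines.append(example_sql)
    lines.extend(["", f"Question: {question}", "Answer with a single SQL statement."])
    return "\n".join(lines)


def generate_sql(question, llm, ddl_statements, documentation, examples):
    """Send the prompt to `llm`, any callable that maps a prompt string to text."""
    prompt = build_prompt(question, ddl_statements, documentation, examples)
    return llm(prompt).strip().rstrip(";")


def fake_llm(prompt):
    """Stand-in for a real model call; it always returns a canned answer."""
    return "SELECT product, SUM(amount) AS total FROM sales GROUP BY product;\n"


# Hypothetical context for one question.
ddl = ["CREATE TABLE sales (id INTEGER, product TEXT, amount REAL, sold_on DATE)"]
docs = ["Top-performing products are ranked by total amount sold."]
examples = [("Total sales overall?", "SELECT SUM(amount) FROM sales")]

question = "Show me the top-performing products"
prompt = build_prompt(question, ddl, docs, examples)
sql = generate_sql(question, fake_llm, ddl, docs, examples)
print(sql)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider, which mirrors the "LLM of your choice" idea in the text.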

Delivering Comprehensive Outputs

Vanna doesn’t stop at query generation. The output encompasses three essential components:

  • Generated SQL Query: The precise SQL statement corresponding to the user’s natural language query.
  • Associated Query Output: The data retrieved from the database based on the generated SQL query.
  • Visualization: A visual representation of the data, such as a chart or graph, built with Plotly in Python, making insights more accessible and actionable.
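
The three outputs listed above can be sketched with the standard library alone. This is an illustrative assumption, not Vanna's code: the `answer` helper, the in-memory SQLite table, and the sample rows are hypothetical, and the chart is expressed as a plain dictionary in Plotly's figure layout (`data` and `layout` keys) so the example runs without Plotly installed.

```python
import sqlite3


def answer(conn, sql):
    """Run the SQL and return the query, its result rows, and a chart spec."""
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchall()
    # Plotly-style figure dictionary: first column on x, second column on y.
    figure = {
        "data": [{"type": "bar", "x": [r[0] for r in rows], "y": [r[1] for r in rows]}],
        "layout": {
            "xaxis": {"title": {"text": columns[0]}},
            "yaxis": {"title": {"text": columns[1]}},
        },
    }
    return {"sql": sql, "columns": columns, "rows": rows, "figure": figure}


# Hypothetical sample database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount REAL, sold_on TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("widget", 10.0, "2024-01-05"), ("gadget", 5.0, "2024-01-20"), ("widget", 7.5, "2024-02-02")],
)

result = answer(
    conn,
    "SELECT strftime('%Y-%m', sold_on) AS month, SUM(amount) AS total "
    "FROM sales GROUP BY month ORDER BY month",
)
print(result["rows"])
```

If Plotly is available, the `figure` dictionary should be loadable with `plotly.graph_objects.Figure(result["figure"])` for rendering.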

The Role of Documentation and Context

The inclusion of additional documentation and example queries is pivotal. These resources provide the LLM with the necessary context to interpret the user’s questions accurately. For instance, KPI definitions might clarify what “sales_per_cluster” means in a particular database, while example queries illustrate the preferred structure and syntax. This approach not only enhances the reliability of the generated queries but also ensures they align with organizational standards and practices.

By seamlessly integrating user intent, database context, and advanced language models, Vanna transforms natural language queries into actionable insights. Its ability to deliver accurate SQL queries, comprehensive data outputs, and intuitive visualizations makes it an indispensable tool for data teams and decision-makers alike. With Vanna, bridging the gap between complex data systems and user-friendly interaction has never been easier.

Demo of the Vanna AI user interface app


Written by Lotus Labs

Transform your business into an AI-driven enterprise. We specialize in Machine learning for Retail, Insurance, and Healthcare industries. www.lotuslabs.ai
