Feb 13, 2025
Introduction
Data is the backbone of software companies, and SQL is the backbone of data management. Over 80% of the world’s structured data is stored in relational databases, with SQL serving as the primary language for accessing and managing it. Every day, millions of data analysts, data scientists, and developers rely on SQL to query and manipulate data. In essence, SQL acts as a bridge between you and your data.
The Challenge and Solution
Translating natural language questions into SQL queries is a complex task that often delays insights. This process is particularly difficult for non-technical users, making it hard for them to gather insights and make data-driven decisions. As a result, crucial opportunities are sometimes missed, or decisions are delayed.
Large Language Models (LLMs) offer a promising solution by automating the generation of SQL queries. By doing so, they help bridge the gap between data and its stakeholders. Text-to-SQL technology has emerged as one of the top three applications of Generative AI since 2023, enabling companies to unlock insights faster and empower a wider range of users to interact with data effectively.

A Text-to-SQL example (above), and Evolutionary Process of Text-to-SQL Research (below). Source: arXiv:2406.08426 (Next-Generation Database Interfaces)
A lot has changed in the space of Generative AI (GenAI) since 2023. There have been significant improvements in both closed and open-weight models, as well as new methods for customizing these models using techniques like fine-tuning with QLoRA. In this blog, we will explore how to implement text-to-SQL in 2025. Genloop has been collaborating with enterprise clients on text-to-SQL solutions, and here, we provide a comprehensive guide on the approaches you can try. We will also objectively compare these methods to help you choose the best approach for your needs.
Understanding the Basics: Glossary
Table Identification: The process of accurately determining which table(s) should be used to retrieve data for a specific question or query.
Business Rules: A set of business knowledge or rules that are relevant to the specific use case. This includes domain knowledge, business formulas, and customer-specific rules.
Context: All the information and metadata about the database required to retrieve the necessary data for answering a query. This includes tables, schemas, columns, descriptions, etc.
SQL Writer: A module that takes the relevant context and generates an SQL query. This is typically powered by an LLM or an agent.
SQL Executor: A module that executes an SQL query and returns the corresponding data. It may also include mechanisms to validate or correct SQL queries.
SQL Data: The data retrieved by executing the SQL query. This is used to prepare a response to the user’s input or question.
Decorator: The final module that processes the SQL data and user input to prepare a natural language response. This module often includes an LLM and features like chart creation. In some cases, it may also validate the correctness of the response.
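To make the glossary concrete, here is a minimal sketch of how these modules can compose into a pipeline. All class and function names are illustrative, not from any particular library:

```python
# A minimal sketch of how the glossary modules fit together.
# Names are illustrative; real systems add retries, validation, logging, etc.
from dataclasses import dataclass


@dataclass
class Context:
    tables: list[str]      # tables picked by table identification
    schemas: str           # DDL / column descriptions for those tables
    business_rules: str    # domain formulas and customer-specific rules


def sql_writer(question: str, context: Context) -> str:
    """LLM call that turns a question plus context into an SQL query."""
    ...


def sql_executor(query: str) -> list[dict]:
    """Runs the query against the database and returns rows (SQL Data)."""
    ...


def decorator(question: str, rows: list[dict]) -> str:
    """LLM call that turns the SQL data into a natural-language answer."""
    ...


def answer(question: str, context: Context) -> str:
    query = sql_writer(question, context)   # Context -> SQL Writer
    rows = sql_executor(query)              # SQL Executor -> SQL Data
    return decorator(question, rows)        # Decorator -> final response
```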
Different Approaches to Text-to-SQL
1. Prompting the Smartest LLM
The simplest solution is to give all the schemas, columns, tables, and table descriptions (collectively called context) to the biggest and smartest model available, hoping it is smart enough to reason its way to the right SQL query. Executing this query gives us the SQL data, which the decorator uses to prepare the final response.

Architecture of Text to SQL: Prompting the Smartest LLM
While this approach requires the LLM to handle significant complexity, it might yield reasonable accuracy for smaller schemas and straightforward queries. However, as schemas grow larger and queries become more complex — as they typically do in real-world production systems — this method often falls short. The context window of the LLM becomes a limiting factor, and such models are expensive, slow, and require transferring sensitive business data to third-party servers.
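As a concrete sketch, the whole approach can be a single LLM call. This example uses the OpenAI Python SDK; the model name, schema snippet, and prompt wording are illustrative assumptions:

```python
# A minimal sketch of approach 1: one call carrying the full context.
# Assumes the OpenAI Python SDK; schema text and model name are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

FULL_CONTEXT = """
-- every table, column, description, and business rule, dumped verbatim --
CREATE TABLE orders (order_id INT, customer_id INT, amount DECIMAL, created_at DATE);
CREATE TABLE customers (customer_id INT, name TEXT, region TEXT);
"""

def text_to_sql(question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",  # "the biggest and smartest model"
        messages=[
            {"role": "system",
             "content": "You are an expert SQL writer. Use only the tables "
                        "and columns below. Return a single SQL query.\n"
                        + FULL_CONTEXT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(text_to_sql("What was the total order amount per region last month?"))
```

Note that everything about the database travels in every request, which is exactly why context-window limits, cost, and latency start to bite as schemas grow.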
Performance Metrics:

Attribute Scoring: Prompting the Smartest LLM
Bonus: A Sample Prompt Template
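A representative template might look like the following; the section layout and placeholder names are illustrative assumptions, filled in at runtime:

```python
# A representative prompt template for approach 1 (illustrative only).
PROMPT_TEMPLATE = """\
You are an expert SQL analyst. Given the database context below, write a
single syntactically correct {dialect} query that answers the question.

### Database schema
{schemas}

### Business rules
{business_rules}

### Question
{question}

Return only the SQL query, with no explanation.
"""
```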
2. RAG with Fast General-Purpose LLMs
In the first approach, we simply dumped the complete information on an LLM and hoped it would figure everything out. The second approach is to fetch only the relevant information through Retrieval-Augmented Generation (RAG), break the problem into simpler tasks, and orchestrate those tasks across relatively faster general-purpose LLMs like GPT-4o, Claude, Meta Llama 3.3 70B, DeepSeek-V3, etc.

Architecture of Text to SQL: RAG with Fast General-Purpose LLMs
The first step is to select the relevant information from the complete context. This can be done through vector-search-based retrieval, where the question is mapped to a related canonical question that comes with pre-selected tables, schemas, and examples; or by identifying the relevant tables with a Table Intent Classification LLM. Once the right tables are selected, the rules and schema governing those tables are collected and supplied as the context for downstream processing.
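Here is a minimal sketch of the vector-search route using sentence-transformers; the table descriptions and embedding model choice are illustrative assumptions:

```python
# A minimal sketch of vector-search-based table selection.
# Assumes sentence-transformers; table descriptions are illustrative.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

TABLE_DESCRIPTIONS = {
    "orders": "Customer orders with amounts, dates, and statuses.",
    "customers": "Customer profiles with names and regions.",
    "inventory": "Warehouse stock levels per product.",
}

def select_tables(question: str, top_k: int = 2) -> list[str]:
    names = list(TABLE_DESCRIPTIONS)
    table_emb = model.encode(list(TABLE_DESCRIPTIONS.values()),
                             convert_to_tensor=True)
    q_emb = model.encode(question, convert_to_tensor=True)
    scores = util.cos_sim(q_emb, table_emb)[0]
    ranked = sorted(zip(names, scores.tolist()), key=lambda x: -x[1])
    return [name for name, _ in ranked[:top_k]]

print(select_tables("Total order amount per region last month"))
# e.g. ['orders', 'customers'] -> fetch only their schemas and rules
```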
After the relevant context is found, a general-purpose LLM is used to generate an SQL query from the given context and query. Since the context is already filtered by RAG, general-purpose LLMs are able to do a better job.
Selecting the best model for this step can be iterative. We suggest starting with the smarter models, prioritizing accuracy (for example, GPT-4o). Once satisfied, try even smaller models (like GPT-4o-mini) and evaluate their accuracy for your use case on your test set.
Performance Metrics:

Attribute Scoring: RAG with Fast General-Purpose LLMs
Bonus: A Sample Prompt Template for Table Intent LLM
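A representative template might look like the following; again, the wording and placeholders are illustrative assumptions:

```python
# A representative prompt template for the Table Intent Classification LLM
# (illustrative; an actual production template would differ).
TABLE_INTENT_PROMPT = """\
You are a classifier that selects database tables relevant to a question.

### Available tables
{table_names_and_descriptions}

### Question
{question}

Return a JSON list of the table names needed to answer the question,
e.g. ["orders", "customers"]. Return [] if no table applies.
"""
```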
3. Agents with General-Purpose LLMs
AI agents are autonomous software systems that perceive their environment and take actions to achieve specific goals. They can operate independently, learn from experience, and make decisions based on their programming and environmental inputs.
A simple RAG pipeline with a general-purpose model cannot recover from errors. This is where agents come to the rescue. A simple agentic system could have two agents: a context agent that collects the most relevant context to share (RAG is a simplified implementation of this context agent), and an SQL agent that uses workers like the SQL Writer (LLM) and SQL Executor to produce the SQL data used for generating the response.

Architecture of Text to SQL: Agents with General-Purpose LLMs
A real example of this approach is Open Data QnA by Google.

Open Data QnA Architecture. Source: https://github.com/GoogleCloudPlatform/Open_Data_QnA
Such a system can successfully recover from many mistakes and common syntactical errors thanks to agentic planning and interaction. The query enhancement step also improves: because the system is multi-step rather than single-shot like the RAG approach, the agents are better able to follow the business rules.
However, this approach takes longer to produce a response. The increase in accuracy comes at the expense of cost and latency: the models themselves have not become smarter; the agentic setup simply gives the system more steps and time to reach the answer and to correct its own errors.
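At its core, the SQL agent's self-correction is a write-execute-repair loop. A minimal sketch follows, with sqlite3 standing in for the warehouse and `llm` as any chat-completion callable; the helper names and retry budget are illustrative assumptions:

```python
# A minimal sketch of an SQL agent's write-execute-repair loop.
# `llm` is any function mapping a prompt string to a completion string.
import sqlite3

MAX_ATTEMPTS = 3

def generate_sql(llm, question: str, context: str,
                 error: str | None = None) -> str:
    prompt = f"Context:\n{context}\n\nQuestion: {question}\n"
    if error:  # feed the failure back so the agent can self-correct
        prompt += f"\nYour previous query failed with:\n{error}\nFix it."
    return llm(prompt)

def run_sql_agent(llm, question: str, context: str, db_path: str):
    conn = sqlite3.connect(db_path)
    error = None
    for _ in range(MAX_ATTEMPTS):
        query = generate_sql(llm, question, context, error)
        try:
            return conn.execute(query).fetchall()  # success: return SQL data
        except sqlite3.Error as exc:
            error = str(exc)  # retry with the error in the prompt
    raise RuntimeError(f"Agent failed after {MAX_ATTEMPTS} attempts: {error}")
```

The extra attempts are where both the accuracy gain and the added latency come from: each recovery is another LLM call plus another execution.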
Performance Metrics:

Attribute Scoring: Agents with General-Purpose LLMs
Bonus:
Various tools help simplify agent building and execution. AutoGen, OpenAI Swarm, Pydantic AI, and CrewAI are some frameworks to get started with agentic workflows.
4. Customized Contextual LLMs
While the previous approaches can typically deliver 70–85% accuracy, customizing LLMs to build personalized contextual models can help you push beyond that threshold. If you’re experiencing challenges with any of the attributes mentioned above or need more tailored results, this approach is your best bet.

Architecture of Text to SQL: Customized Contextual LLMs
The approach is to fine-tune open-weight models such as Llama, Mistral, and Qwen, tailoring them to better understand the business and domain-specific needs.
Here’s how it works: when a user query comes in, the first customized LLM is responsible for establishing the correct context for downstream tasks. It not only understands the user query but also identifies relevant data tables and devises an execution plan. This plan is then passed to the SQL Writer LLM, which generates a query specific to your domain, accounting for business rules such as how growth is calculated or the formulas used for quarterly sales. Afterward, the SQL data is processed by the Decorator LLM and returned as a response. Since the entire execution is highly contextual, it rarely needs any error recovery.
By adopting this model, enterprises gain full ownership of their AI, with unmatched cost-effectiveness, control, and performance. With ownership, the model can be deployed behind the company’s firewall, ensuring complete data security. Because the model is fine-tuned for the business, it delivers unparalleled accuracy. And the smaller size of these models means much faster responses, up to 250x quicker than general-purpose LLMs.
This approach has proven highly effective in our enterprise engagements. We will dive deeper into the benefits and workflow details in an upcoming blog post. However, it’s important to note that while this approach offers exceptional value, it requires significant development effort to build, scale, and maintain these LLMs.
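For a flavor of what the customization step involves, here is a minimal QLoRA fine-tuning sketch using Hugging Face transformers, peft, and trl. The base model, dataset file, and hyperparameters are illustrative assumptions, and trl's API shifts between versions, so treat this as a starting point rather than a recipe:

```python
# A minimal QLoRA fine-tuning sketch (transformers + peft + trl).
# Model, dataset file, and hyperparameters are illustrative placeholders.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

BASE = "meta-llama/Llama-3.1-8B-Instruct"  # any open-weight base model

bnb = BitsAndBytesConfig(          # 4-bit quantization: the "Q" in QLoRA
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb, device_map="auto"
)

lora = LoraConfig(                 # low-rank adapters: the "LoRA"
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# A JSONL file of (question + context -> SQL) training examples, e.g.
# {"text": "Question: ...\nSchema: ...\nSQL: SELECT ..."}
dataset = load_dataset("json", data_files="text2sql_train.jsonl",
                       split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora,
    args=SFTConfig(output_dir="sql-llm-qlora",
                   per_device_train_batch_size=2,
                   num_train_epochs=2,
                   learning_rate=2e-4),
)
trainer.train()
```

The same pattern applies to each model in the pipeline (the context LLM, the SQL Writer, and the Decorator), each fine-tuned on its own task-specific examples.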
Performance Metrics:

Attribute Scoring: Customized Contextual LLMs
Conclusion

Final Comparison Table of All Approaches
If you have a straightforward use case, a RAG-based general-purpose model might suffice. However, for use cases with significant scale, where the cost of renting third-party AI skyrockets, or if you require the highest accuracy standards or have privacy concerns, Approach 4 (Customized Contextual LLMs) is essential. Make an informed decision using the Should I Fine-Tune tool.
About Genloop
Genloop delivers customized LLMs that provide unmatched cost, control, simplicity, and performance for production enterprise applications. Please visit genloop.ai or email founder@genloop.ai for more details. Schedule a free consultation call with our GenAI experts for personalized guidance and recommendations.