I have just built a sql chatbot for local data using langchain and streamlit. I found the whole process surprisingly smooth with these two great tools, I put here how I built this and some thoughts alongside.
In case you don’t know what langchain and streamlit are, here is a brief introduction. Langchain in my mind is a tool to connect to all kinds of language models(e.g. openai) and provide interfaces to easily using them. Streamlit is a tool that allows you to write UI included apps with only python. so langchain+streamlit means you can build UI included llm apps with only python.
The final app looks like this:
Here is how I built it:
1. Install langchain and streamlit
pip install langchain streamlit
(I simplified the installation, you should in your project set up a virtual environment and install them in it.)
2. Find your data connection string
under the hood langchain uses sqlalchemy to connect to database, so you can use any database that sqlalchemy supports. Here is what a connection string for mysql looks like
3. Set up a streamlit app
you just need to create a python file and import streamlit
4. Use langchain sql agent to talk to your database
langchain sql agentallows you to use an agent to explore your database, the agent is powered by an llm model, it could be openai or some open source models like llama2.
besides sql agent , you can also use sql chain which in my sense is a manual car vs sqlagent as an automatic car.
5. Hook up langchain and streamlit
StreamlitCallbackHandler is the trick to make langchain and streamlit work together, it’s like for each response from langchain agent, it will call a function to update the streamlit UI. The function is provided by streamlit and included in langchain.
finally here is the all the codes(removing sensitive info):
from langchain.sql_database import SQLDatabase from langchain.llms import OpenAI from langchain.agents import AgentType, initialize_agent, load_tools from langchain.callbacks import StreamlitCallbackHandler from langchain.agents import create_sql_agent from langchain.agents.agent_toolkits import SQLDatabaseToolkit import streamlit as st st.title("Talk to your data") api_key = st.text_input("api_key") db_string = st.text_input("db_string") if api_key: db = SQLDatabase.from_uri(db_string) toolkit = SQLDatabaseToolkit(db=db, llm=OpenAI(temperature=0, openai_api_key=api_key)) agent_executor = create_sql_agent( llm=OpenAI(temperature=0, streaming=True, openai_api_key=api_key), toolkit=toolkit, verbose=True, agent_type=AgentType.ZERO_SHOT_REACT_DESCRIPTION, ) else: st.write("Please input openai_api_key") if prompt := st.chat_input(): st.chat_message("user").write(prompt) with st.chat_message("assistant"): st_callback = StreamlitCallbackHandler(st.container()) response = agent_executor.run(prompt, callbacks=[st_callback]) st.write(response)
say the name is
app.py, then you can run it with
streamlit run app.py
you can deploy it to streamlit cloud, or you can deploy it to your own server, I deployed it to my own server with docker, here is the dockerfile:
FROM python:3.11 WORKDIR /app COPY requirements.txt . RUN pip install -r requirements.txt COPY ./app . CMD ["streamlit", "run", "index.py"]
Then you have a sql chatbot which applies to almost all database that you have access to. easy right?
While it’s easy to build this app up, it still have some limitations.
- Openai has a limitition of 5000 tokens per request, so if you have a large data schema to upload, it will fail.
- The llm for sql is not as good as it works on other daily tasks, according to my using experience the accurate rate is roughly 60-70%.
- You have to pay for openai or you need a decent GPU to run local llm models.
So ideally this tool can be used for exploration of your data, you can used it for generating some sql queries and then use the query in other places.Howere I understand both llm and its related tools are developing fast, so I’m looking forward to seeing more powerful usage in the future.
I have put this app together with other streamlit tools I built in this repo: streamlit-tools you can check it out and give it a try.