Introduction
A couple of weeks back, in our blog post on ChatGPT plugins, we talked about the potential for plugins to expand ChatGPT's functionality by letting it leverage third-party resources to act upon the conversations you have with it. The value of these plugins is greatest when they make up for a current shortcoming of ChatGPT. For example, ChatGPT is built on top of GPT-4, a large language model that doesn't understand mathematical and algebraic reasoning as well as it does written language, so using the WolframAlpha plugin as a "math mode" when you need to solve mathematical problems makes perfect sense!
Another shortcoming we spoke about is that ChatGPT cannot use context to answer your questions unless that context is conveyed explicitly in the body of the prompt. The solution to this shortcoming was the ChatGPT retrieval plugin, which connects ChatGPT to a vector database and provides a robust fix for the problem above. The vector database connected to ChatGPT can be used to store relevant information and refer back to it when answering prompts, acting as long-term memory for the LLM.
Plugins are a very powerful way that you and I can contribute to improving LLM use cases without having to retrain the underlying GPT model. Let's say you're using ChatGPT and realize that it doesn't carry a conversation well enough when you ask it about the weather, or that it doesn't have a specialized enough understanding of your health to suggest tasty yet healthy recipes based on your blood sugar, blood pressure, and health conditions. You can create a plugin to tackle these problems, and in doing so improve usability for everyone, since anyone can simply install your plugin and use it!
The only questions, then, are: how do you get access to the exclusive plugin alpha that OpenAI is running, and how do you go about creating a plugin for ChatGPT? Worry not, we come bearing good news on both notes 😀.
Weaviate is partnering with OpenAI and Cortical Ventures to host a full-day Generative AI Hackathon at ODSC East on May 11th in Boston, at the Hynes Convention Center. There you will get access to the OpenAI API and ChatGPT plugin tokens that OpenAI is providing, and you will be able to create your own plugins as well as AutoGPT-like apps to solve problems near and dear to your heart using tools like ChatGPT and Weaviate! You can register using the link above; slots are limited, so don't delay!
Now, on to how you can create your own plugin for ChatGPT. Here we will go through the step-by-step process we used to create the Weaviate Retrieval Plugin. The Weaviate Retrieval Plugin connects ChatGPT to an instance of Weaviate and allows it to query relevant documents from the vector database, upsert documents to "remember" information for later, and delete documents to "forget" them! The process we took to create this plugin is quite similar to the one you would take when creating a general plugin, so we believe it's quite instructive and hope it helps!
How to Create a ChatGPT Plugin
The entire code repository for the complete Weaviate Retrieval Plugin is located here. Let's go through the steps one by one, including code snippets, some challenges we encountered, and how we eventually solved them.
The tech stack we used to develop this plugin is as follows:
- Python: everything is written in Python
- FastAPI: the web framework used to serve the plugin
- pytest: used to write and run our tests
- Docker: we create containers to build, test, and deploy the plugin
Below are the steps we took to develop the plugin. Part 1 focuses on building a web application with our desired endpoints, Part 2 is specific to the development of a ChatGPT plugin, while Part 3 is all about remote deployment using Fly.io. We cover the steps in order, but feel free to skip steps depending on your level of comfort with the material.
Part 1: Building a Web App
Step 1: Setup the Development Environment
To set up our development environment we used Dev Containers. The devcontainer.json file was updated by adding Fly.io, Docker, and Poetry. You can find other dev container templates here.
Step 2: Test the Setup
After setting up the environment we tested that everything worked by taking the following steps:

- Create a dummy endpoint that will simply respond with a `{"Hello": "World"}` object when called.
```python
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    """
    Say hello to the world
    """
    return {"Hello": "World"}
```
- Set up tests using pytest that accomplish two goals: firstly, we check that our Weaviate instance is up and running (set up here), and secondly, that the FastAPI endpoint is responding. Both of these tests are defined here; a minimal sketch of what they verify follows this list.
- We also created a makefile to automate running tests and firing up the endpoint. In our makefile we also specify a `run` command, which spins up the server locally to ensure the network connectivity settings are all set up properly. You can also connect to port 8000, the default port that FastAPI listens on, to check connectivity.
- The last step to verify everything is running correctly is to go to `localhost:8000/docs`, which should give you the Swagger UI for your endpoint. The Swagger UI gives you the ability to play around with your server and interact with any endpoints you may have defined, and it all gets updated in real time. This is particularly convenient when we later want to call endpoints manually to interact with our Weaviate instance to query, upsert, and delete objects.
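The exact tests live in the repository, but a minimal sketch of what they verify might look like the following (the `server.main` module path and test names here are our assumptions, not necessarily what the repo uses):

```python
import weaviate
from fastapi.testclient import TestClient

from server.main import app  # hypothetical module path

client = TestClient(app)

def test_weaviate_is_live():
    # The local Weaviate instance should be up and reachable
    assert weaviate.Client("http://localhost:8080").is_ready()

def test_read_root():
    # The dummy FastAPI endpoint should respond with the expected object
    response = client.get("/")
    assert response.status_code == 200
    assert response.json() == {"Hello": "World"}
```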
Once you’ve done all of the above and everything looks to be in good order you can start implementing plugin specific functions.
Step 3: Implement a function to get vector embeddings
Since we are implementing a plugin that connects a vector database to ChatGPT, we need to define a way to generate vector embeddings. This function is used when we upsert documents to the database, to generate and store an embedding for each document; it is also used to vectorize queries when performing vector search over the database. The function is implemented here.
```python
import openai

def get_embedding(text):
    """
    Get the embedding for a given text
    """
    results = openai.Embedding.create(input=text, model="text-embedding-ada-002")
    return results["data"][0]["embedding"]
```
Here we simply chose to use the `text-embedding-ada-002` model, since OpenAI uses this particular model for their retrieval plugin template. However, since the querying is done in the vector database, we could have chosen any vectorizer.
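As a quick sanity check (assuming an `OPENAI_API_KEY` is set in your environment), embeddings for related sentences should be more similar than embeddings for unrelated ones. A hypothetical snippet, not part of the plugin itself:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

v1 = get_embedding("The lion is the king of the jungle")
v2 = get_embedding("Lions are large carnivorous cats")
v3 = get_embedding("The capital of France is Paris")

# The two lion sentences should be closer than the lion/Paris pair
print(cosine_similarity(v1, v2) > cosine_similarity(v1, v3))  # True
```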
Step 4: Implement functions to initialize the Weaviate client and vector database
Next we implement a couple of functions: one to initialize the Weaviate Python client, and one that uses the client to initialize the Weaviate instance by checking whether a schema exists and adding one if it doesn't.
```python
import weaviate
import os
import logging

INDEX_NAME = "Document"

SCHEMA = {
    "class": INDEX_NAME,
    "properties": [
        {"name": "text", "dataType": ["text"]},
        {"name": "document_id", "dataType": ["string"]},
    ],
}

def get_client():
    """
    Get a client to the Weaviate server
    """
    host = os.environ.get("WEAVIATE_HOST", "http://localhost:8080")
    return weaviate.Client(host)

def init_db():
    """
    Create the schema for the database if it doesn't exist yet
    """
    client = get_client()
    if not client.schema.contains(SCHEMA):
        logging.debug("Creating schema")
        client.schema.create_class(SCHEMA)
    else:
        class_name = SCHEMA["class"]
        logging.debug(f"Schema for {class_name} already exists")
        logging.debug("Skipping schema creation")
```
Step 5: Initialize the database when the server starts and add a dependency for the Weaviate client
We now need to integrate these functions so that on starting up the ChatGPT plugin server, we automatically initialize the Weaviate instance and the client connection. We do this using FastAPI's lifespan feature in the main server Python script, which runs every time the server starts. This simple function calls our database initialization function defined above; any logic that needs to run on server shutdown can be included after the `yield` statement below. Since we don't need to do anything specific for our plugin, we leave it empty. We also define a dependency function that yields the Weaviate client object so that endpoints can request it.
```python
from fastapi import FastAPI
from contextlib import asynccontextmanager

from .database import get_client, init_db

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Runs once on startup: make sure the Weaviate schema exists
    init_db()
    yield
    # Anything after the yield would run on shutdown

app = FastAPI(lifespan=lifespan)

def get_weaviate_client():
    """
    Get a client to the Weaviate server
    """
    yield get_client()
```
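The yield-based `get_weaviate_client` function follows FastAPI's standard dependency pattern: anything placed after the `yield` would run once a request finishes, so per-request cleanup could be added there later if needed. For now the client needs no teardown, so the function simply yields it.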
After this point the initial server setup and testing is complete. Now we get to the fun part of implementing our endpoints that will give ChatGPT different ways to interact with our plugin!
Part 2: Implementing OpenAI Specific Functionality
Step 1: Development of the Weaviate Retrieval Plugin specific endpoints
Our plugin has three specific endpoints: `/upsert`, `/query`, and `/delete`. These give ChatGPT the ability to add objects to the Weaviate instance, query and search through objects in it, and delete objects if needed. While the plugin is enabled, ChatGPT can be instructed via the prompt to use a particular endpoint, but it will also independently decide when to use the appropriate endpoint to complete its response! These endpoints are what extend the functionality of ChatGPT and enable it to interact with the vector database.
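The endpoint implementations below reference request/response models (`Document`, `Query`, `QueryResult`, and `DeleteRequest`) that are defined elsewhere in the repository. A minimal sketch of what they might look like, inferred from how the endpoints use them (the default `limit` is our assumption):

```python
from pydantic import BaseModel

class Document(BaseModel):
    text: str
    document_id: str

class Query(BaseModel):
    text: str
    limit: int = 3  # hypothetical default

class QueryResult(BaseModel):
    document: Document
    score: float

class DeleteRequest(BaseModel):
    document_id: str
```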
We developed these three endpoints through test-driven development; as such, for each endpoint we will first show the tests it must pass and then the implementation that satisfies them. To prepare the Weaviate instance for these tests, we added the following test documents through a fixture:
```python
@pytest.fixture
def documents(weaviate_client):
    docs = [
        {"text": "The lion is the king of the jungle", "document_id": "1"},
        {"text": "The lion is a carnivore", "document_id": "2"},
        {"text": "The lion is a large animal", "document_id": "3"},
        {"text": "The capital of France is Paris", "document_id": "4"},
        {"text": "The capital of Germany is Berlin", "document_id": "5"},
    ]
    for doc in docs:
        client.post("/upsert", json=doc)
```
Implementing the `/upsert` endpoint:

After using the `/upsert` endpoint we mainly want to test that we got the appropriate status code, in addition to checking that the content, id, and vector were all upserted correctly.

Here's the test that carries this out:
```python
def test_upsert(weaviate_client):
    response = client.post("/upsert", json={"text": "Hello World", "document_id": "1"})
    assert response.status_code == 200

    docs = weaviate_client.data_object.get(with_vector=True)["objects"]
    assert len(docs) == 1
    assert docs[0]["properties"]["text"] == "Hello World"
    assert docs[0]["properties"]["document_id"] == "1"
    assert docs[0]["vector"] is not None
```
The implementation below satisfies all of the requirements and tests above:
@app.post("/upsert")
def upsert(doc: Document, client=Depends(get_weaviate_client)):
"""
Insert a document into weaviate
"""
with client.batch as batch:
batch.add_data_object(
data_object=doc.dict(),
class_name=INDEX_NAME,
vector=get_embedding(doc.text),
)
return {"status": "ok"}
The `/query` and `/delete` endpoints were developed similarly; if you're interested, you can read on below!
Implement the `/query` endpoint:

For this endpoint we mainly want to check that it returns the right number of objects, and that the document we were expecting is among the returned objects.
```python
def test_query(documents):
    LIMIT = 3
    response = client.post("/query", json={"text": "lion", "limit": LIMIT})

    results = response.json()
    assert len(results) == LIMIT
    for result in results:
        assert "lion" in result["document"]["text"]
```
The implementation below will take in a query and return a list of retrieved documents and metadata.
@app.post("/query", response_model=List[QueryResult])
def query(query: Query, client=Depends(get_weaviate_client)) -> List[Document]:
"""
Query weaviate for documents
"""
query_vector = get_embedding(query.text)
results = (
client.query.get(INDEX_NAME, ["document_id", "text"])
.with_near_vector({"vector": query_vector})
.with_limit(query.limit)
.with_additional("certainty")
.do()
)
docs = results["data"]["Get"][INDEX_NAME]
return [
QueryResult(
document={"text": doc["text"], "document_id": doc["document_id"]},
score=doc["_additional"]["certainty"],
)
for doc in docs
]
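With the fixture documents upserted, a manual query against the local server (again, hypothetical) returns the matching documents along with their certainty scores:

```python
import requests

resp = requests.post(
    "http://localhost:8000/query",
    json={"text": "Which animal is the king of the jungle?", "limit": 1},
)
print(resp.json())
# e.g. [{"document": {"text": "The lion is the king of the jungle",
#                     "document_id": "1"}, "score": 0.93}]
```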
Implement the `/delete` endpoint:

Here we simply want to check that the response returned correctly and that, after removing one object, the total number of objects in the Weaviate instance goes down by one.
```python
def test_delete(documents, weaviate_client):
    num_docs_before_delete = weaviate_client.data_object.get()["totalResults"]

    response = client.post("/delete", json={"document_id": "3"})
    assert response.status_code == 200

    num_docs_after_delete = weaviate_client.data_object.get()["totalResults"]
    assert num_docs_after_delete == num_docs_before_delete - 1
```
And the implementation of the endpoint is as follows:
@app.post("/delete")
def delete(delete_request: DeleteRequest, client=Depends(get_weaviate_client)):
"""
Delete a document from weaviate
"""
result = client.batch.delete_objects(
class_name=INDEX_NAME,
where={
"operator": "Equal",
"path": ["document_id"],
"valueString": delete_request.document_id,
},
)
if result["results"]["successful"] == 1:
return {"status": "ok"}
else:
return {"status": "not found"}
Here we showed you how our endpoints work. This is where your plugin will be most unique, depending on what you want it to do.