Retrieval-augmented generation (RAG) applies generative AI to real-world problems, often with Wikipedia as a key source of information. However, integrating Wikipedia's content into an AI application is complex: you must download dumps, process the text, convert it to vectors, build an index, and set up APIs. This process is resource-intensive, demanding considerable time, computation, and storage. For small-scale projects or initial experiments, a ready-made online service greatly simplifies development.
This was the inspiration behind the creation of our online API. Through this API, users can bypass the complexities and directly retrieve similar Wikipedia articles with just a single API call. The subsequent sections will detail various use cases for this service.
To proceed, you need two access tokens: a HuggingFace API token (for the hosted embedding calls) and a RapidAPI key (for the search endpoint).
We try to keep things minimal: there is only one POST endpoint, /search. The input has the following format. v represents the text embedding vector (a 384-element array when produced by the all-MiniLM-L6-v2 model used below), while k specifies how many results to return (maximum 5).
{
  "v": [ 0.3, 0.5, 0.2 ... 0.7 ],
  "k": 3
}
The response is largely self-explanatory: d contains the list of returned results, while e contains error information if anything goes wrong.
{
  "d": [
    {
      "title": "List of past presumed highest mountains",
      "url": "https://en.wikipedia.org/?curid=10374104",
      "score": 0.75751114
    },
    {
      "title": "List of highest mountains on Earth",
      "url": "https://en.wikipedia.org/?curid=1821694",
      "score": 0.7503605
    },
    {
      "title": "Siguang Ri",
      "url": "https://en.wikipedia.org/?curid=42714303",
      "score": 0.70713806
    }
  ],
  "e": ""
}
The simplest approach requires only a Python runtime and the requests library, with no other local dependencies. Text embedding is handled by the HuggingFace Inference API, and the resulting vector is used to query our vector database. We use the all-MiniLM-L6-v2 model for embedding, a compact sentence-transformers model that embeds text efficiently.
#!/usr/bin/env python3
import json
import requests

def embed(text):
    # Embed the query text via the HuggingFace Inference API
    # using the all-MiniLM-L6-v2 sentence-transformers model.
    HFAPI = "https://api-inference.huggingface.co/pipeline/feature-extraction/sentence-transformers/all-MiniLM-L6-v2"
    headers = {"Authorization": "Bearer REPLACE_WITH_YOUR_HUGGINGFACE_TOKEN_HERE"}
    payload = {"inputs": [text]}
    response = requests.post(HFAPI, headers=headers, json=payload)
    # The API returns one vector per input string; we sent a single string.
    return response.json()[0]

def main():
    api_url = "https://english-wikipedia-vector-database.p.rapidapi.com/search"
    # Embed the question and ask for the top 5 matches.
    query = {"v": embed("What is the highest mountain on earth?"), "k": 5}
    headers = {
        "X-RapidAPI-Key": "REPLACE_WITH_YOUR_RAPIDAPI_TOKEN_HERE",
        "X-RapidAPI-Host": "english-wikipedia-vector-database.p.rapidapi.com"
    }
    response = requests.post(api_url, json=query, headers=headers)
    resp = response.json()
    print(json.dumps(resp, indent=2))

if __name__ == '__main__':
    main()
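The script above prints whatever comes back, including errors. Before consuming the results, check the e field. A minimal sketch of such a guard, using only the d, e, title, url, and score fields documented above:

def print_results(resp):
    # A non-empty "e" means something went wrong on the server side.
    if resp.get("e"):
        raise RuntimeError(f"search failed: {resp['e']}")
    # Each hit in "d" carries a title, a Wikipedia URL, and a similarity score.
    for hit in resp["d"]:
        print(f"{hit['score']:.4f}  {hit['title']}  {hit['url']}")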
The embedding can also be computed locally, but you need to install a Python library first:
python3 -m venv venv
source venv/bin/activate
pip install sentence-transformers
The code is almost the same as before, except the embed function now uses a local model to turn text into vectors.
#!/usr/bin/env python3
import json
import requests
from sentence_transformers import SentenceTransformer

def embed(text):
    # Load the model and compute the embedding locally.
    model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
    embeddings = model.encode([text])
    return embeddings[0].tolist()

def main():
    api_url = "https://english-wikipedia-vector-database.p.rapidapi.com/search"
    query = {"v": embed("How many stars are there in the sky?"), "k": 5}
    headers = {
        "X-RapidAPI-Key": "REPLACE_WITH_YOUR_RAPIDAPI_TOKEN_HERE",
        "X-RapidAPI-Host": "english-wikipedia-vector-database.p.rapidapi.com"
    }
    response = requests.post(api_url, json=query, headers=headers)
    resp = response.json()
    print(json.dumps(resp, indent=2))

if __name__ == '__main__':
    main()
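One caveat: the embed function above reloads the model on every call, which is slow. If you issue many queries, load the model once and encode texts in batches. A minimal sketch along the same lines:

from sentence_transformers import SentenceTransformer

# Load the model once at startup instead of once per query.
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

def embed_many(texts):
    # encode() accepts a list of strings and returns one vector per string.
    return [v.tolist() for v in model.encode(texts)]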
Happy hacking!