This is a step-by-step developer guide that uses MongoDB Atlas, Node.js, and OpenAI. It covers the following steps:
- Create a database, a collection, and a Vector Search index on MongoDB Atlas.
- Create an API key on OpenAI.
- Create the Node.js microservice.
- Bonus: Create a Trigger on MongoDB Atlas that automatically generates vector embeddings for newly inserted or updated documents.
- Go to www.mongodb.com and create an account (if you don't have one).
- When you create a new cluster, give it the username "demo" and the password "demo".
- Create a new database called "databaseDemo" and a collection called "collectionDemo".
- Once the cluster is created, click on it and go to the "Search" tab.
- Click on "Create Search Index":
- Select the "JSON Editor".
- Name the index "vectorIndex".
- Select the "databaseDemo" database and the "collectionDemo" collection.
- Insert the following in the JSON Editor:
{
  "mappings": {
    "dynamic": true,
    "fields": {
      "embedding": {
        "dimensions": 1536,
        "similarity": "cosine",
        "type": "knnVector"
      }
    }
  }
}
- Then click "Next" and create the Search Index.
- We have successfully created a Vector Search index!
ℹ️ FYI: OpenAI uses 1,536 dimensions for embeddings when using the "text-embedding-ada-002" model.
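To make the "dimensions" and "similarity" settings of the index more concrete, here is a minimal sketch of what cosine similarity computes and of a length check against the index definition. Both function names are hypothetical helpers written for illustration. Atlas performs the similarity computation itself, so this code is not needed for the microservice to work.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Illustration only: Atlas Search computes similarity on the server side.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical guard: does a vector have the length declared in the index?
// 1536 is the "dimensions" value used in the index definition above.
function hasIndexDimensions(vector, dimensions = 1536) {
  return Array.isArray(vector) && vector.length === dimensions;
}
```

A check like `hasIndexDimensions(embedding)` before storing or querying can surface a model mismatch early, since a vector of a different length would not match the index definition.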
- Go to https://platform.openai.com/account/api-keys
- Create an API token and save it somewhere safe.
npm install axios cors express mongodb openai-api
const express = require('express');
const { MongoClient } = require("mongodb");
const axios = require('axios');
const app = express();
ℹ️ In case your browser requires you to import CORS (optional):
/** In case you require CORS for Browser */
//const cors = require('cors');
//app.use(cors());
/** OpenAI Embedding Function */
async function openaiEmbedding(query) {
  // OpenAI embeddings endpoint
  const url = 'https://api.openai.com/v1/embeddings';
  const openai_key = "YOUR-API-TOKEN"; // Replace with your OpenAI key.
  // Call the OpenAI embeddings API
  let response = await axios.post(url, {
    input: query,
    model: "text-embedding-ada-002"
  }, {
    headers: {
      'Authorization': `Bearer ${openai_key}`,
      'Content-Type': 'application/json'
    }
  });
  if (response.status === 200) {
    console.log(response.data.data[0].embedding);
    return response.data.data[0].embedding;
  } else {
    throw new Error(`Failed to get embedding with code: ${response.status}`);
  }
}
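The function above hardcodes the OpenAI key for simplicity. As a hedged alternative, the sketch below reads the key from an environment variable and builds the same request headers. The variable name `OPENAI_API_KEY` and both helper names are assumptions made for this example; the guide itself does not define them.

```javascript
// Reads the OpenAI key from an environment-like object.
// OPENAI_API_KEY is an assumed variable name, not one defined by this guide.
function getOpenAiKey(env = process.env) {
  const key = env.OPENAI_API_KEY;
  if (typeof key !== "string" || key.trim() === "") {
    throw new Error("OPENAI_API_KEY is not set");
  }
  return key.trim();
}

// Builds the same headers that openaiEmbedding() sends to the embeddings API.
function buildOpenAiHeaders(apiKey) {
  return {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json"
  };
}
```

With these helpers, `openaiEmbedding` could pass `{ headers: buildOpenAiHeaders(getOpenAiKey()) }` to `axios.post`, which keeps the key out of the source file.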
- Replace the URI with the connection string from Atlas.
- Copy the GET route below into your index.js file:
app.get("/vectorSearch/:query", async (req, res) => {
  // Change these constants.
  // Replace the placeholders with the values from your Atlas connection string.
  const URI = "mongodb+srv://<username>:<password>@<cluster-host>";
  const databaseName = "databaseDemo";
  const collectionName = "collectionDemo";
  let client;
  try {
    // Transform query into embedding
    const embedding = await openaiEmbedding(req.params.query);
    client = new MongoClient(URI);
    await client.connect();
    const db = client.db(databaseName);
    const collection = db.collection(collectionName);
    // Query for similar documents.
    const documents = await collection.aggregate([
      {
        "$search": {
          "index": "vectorIndex", // Name of Vector Search Index
          "knnBeta": {
            "vector": embedding,
            "path": "embedding", // Name of the 'embedding' field
            "k": 5
          }
        }
      }
    ]).toArray();
    res.send(documents);
  } catch (err) {
    console.error(err);
    // Respond on failure so the request does not hang.
    res.status(500).send({ error: "Vector search failed" });
  } finally {
    // Close the connection opened for this request.
    if (client) {
      await client.close();
    }
  }
});
/** PORT */
const port = process.env.PORT || 8000;
/** PORT LISTENER **/
app.listen(port, () => {
  console.log(`Listening to port ${port}`);
});
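The aggregation in the GET route returns whole documents, including the 1,536-number embedding field. A possible refinement, sketched below, builds the same `$search` stage in a small function and adds a `$project` stage that returns only the `name` and `description` fields plus the relevance score. The function name is hypothetical, and the projected field names assume documents shaped like the sample products used later in this guide.

```javascript
// Builds the aggregation pipeline used by the GET route, plus a projection.
// Assumes documents have "name" and "description" fields (as in the sample data).
function buildVectorSearchPipeline(embedding, k = 5) {
  if (!Array.isArray(embedding) || embedding.length === 0) {
    throw new Error("embedding must be a non-empty array of numbers");
  }
  if (!Number.isInteger(k) || k < 1) {
    throw new Error("k must be a positive integer");
  }
  return [
    {
      $search: {
        index: "vectorIndex",  // Name of the Vector Search index
        knnBeta: {
          vector: embedding,
          path: "embedding",   // Name of the field holding the embedding
          k: k
        }
      }
    },
    {
      // Return readable fields and the score instead of the raw vector.
      $project: {
        name: 1,
        description: 1,
        score: { $meta: "searchScore" }
      }
    }
  ];
}
```

In the route, this would be used as `collection.aggregate(buildVectorSearchPipeline(embedding)).toArray()`.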
Done! The microservice is ready. We just need to add a Trigger that generates the embedding for each new document.
- In MongoDB Atlas, create a Trigger.
- Configuring the Trigger is fairly straightforward.
- Replace the placeholder in the function below with your own OpenAI key.
- The Trigger uses the description field of your document and transforms it into a vector embedding. If you wish, you can change the name of the field that is converted into an embedding.
exports = async function(changeEvent) {
  // Gets the full document that was changed
  const changedDocument = changeEvent.fullDocument;
  const url = 'https://api.openai.com/v1/embeddings';
  // OpenAI API key to change
  const openai_key = "<YOUR-OPENAI-KEY>";
  try {
    // HTTP call to OpenAI API
    let response = await context.http.post({
      url: url,
      headers: {
        'Authorization': [`Bearer ${openai_key}`],
        'Content-Type': ['application/json']
      },
      body: JSON.stringify({
        // You can change the 'description' field to another one that you wish to convert into a vector embedding
        input: changedDocument.description,
        model: "text-embedding-ada-002"
      })
    });
    // Parse the JSON response
    let responseData = EJSON.parse(response.body.text());
    if (response.statusCode === 200) {
      console.log("Successfully received embedding.");
      const responseEmbedding = responseData.data[0].embedding;
      // MongoDB Atlas Cluster / Database / Collection
      const collection = context.services.get("<CLUSTER_NAME>").db("databaseDemo").collection("collectionDemo");
      // Update the document in MongoDB.
      const result = await collection.updateOne(
        { _id: changedDocument._id },
        // Adds the embedding field
        { $set: { embedding: responseEmbedding }}
      );
      if (result.modifiedCount === 1) {
        console.log("Document successfully Updated.");
      } else {
        console.log("Failed to modify document.");
      }
    } else {
      console.log(`Failed embedding with code: ${response.statusCode}`);
    }
  } catch (err) {
    console.error(err);
  }
};
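The Trigger function writes the embedding back into the same collection it watches, so if the Trigger listens for update events, its own write may fire it again. The sketch below shows one possible guard: it only asks for a new embedding on inserts and replacements, or on updates that changed the source field. The function name is hypothetical, and it relies on the standard change event fields `operationType` and `updateDescription.updatedFields`.

```javascript
// Decides whether a change event needs a (new) embedding.
// fieldName is the document field that gets embedded ('description' in this guide).
function shouldGenerateEmbedding(changeEvent, fieldName = "description") {
  const doc = changeEvent.fullDocument;
  // Nothing to embed if the document or the text field is missing or empty.
  if (!doc || typeof doc[fieldName] !== "string" || doc[fieldName].trim() === "") {
    return false;
  }
  switch (changeEvent.operationType) {
    case "insert":
    case "replace":
      return true;
    case "update": {
      // Only re-embed when the source field itself changed. The Trigger's own
      // write only sets 'embedding', so it is skipped here.
      const updated =
        (changeEvent.updateDescription && changeEvent.updateDescription.updatedFields) || {};
      return Object.prototype.hasOwnProperty.call(updated, fieldName);
    }
    default:
      return false;
  }
}
```

Inside the Trigger, the check could sit at the top of the exported function, for example `if (!shouldGenerateEmbedding(changeEvent)) { return; }`.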
To test the setup, insert the following sample documents into the "collectionDemo" collection:
[
  {
    "name": "UltraFast Smartphone",
    "description": "Experience lightning-fast browsing and high-quality photography with the UltraFast Smartphone, equipped with the latest processor and a state-of-the-art camera system."
  },
  {
    "name": "EcoFriendly Electric Scooter",
    "description": "Travel green with the EcoFriendly Electric Scooter, offering efficient battery life and a compact design for easy portability."
  },
  {
    "name": "Intelli Clean Vacuum Cleaner",
    "description": "Maintain a spotless home with the IntelliClean Vacuum Cleaner, boasting intelligent navigation and powerful suction capabilities."
  },
  {
    "name": "Ultimate Comfort Mattress",
    "description": "Enjoy restful nights with the UltimateComfort Mattress, featuring adaptive foam technology and a breathable fabric cover."
  },
  {
    "name": "SoundBlast Wireless Headphones",
    "description": "Immerse yourself in rich sound quality with the SoundBlast Wireless Headphones, offering noise-cancellation and a comfortable fit."
  },
  {
    "name": "AquaPure Water Filter",
    "description": "Ensure safe and clean drinking water with the AquaPure Water Filter, incorporating advanced filtration technology for pure and fresh water."
  }
]
Then we make a GET request to our microservice, replacing <QUERY> at the end of the URL with our query.
The response should be a list of matching documents.
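As a rough illustration of that request, the sketch below builds the URL for the `/vectorSearch/:query` route and calls it. The base URL `http://localhost:8000` assumes the service runs locally on the default port from the listener code, both function names are hypothetical, and the global `fetch` assumes Node.js 18 or later.

```javascript
// Builds the request URL for the /vectorSearch/:query route.
// encodeURIComponent keeps spaces and symbols in the query from breaking the path.
function buildSearchUrl(baseUrl, query) {
  const trimmedBase = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${trimmedBase}/vectorSearch/${encodeURIComponent(query)}`;
}

// Calls the microservice and returns the parsed list of documents.
// Uses the global fetch available in Node.js 18+ (an assumption about your runtime).
async function vectorSearch(baseUrl, query) {
  const response = await fetch(buildSearchUrl(baseUrl, query));
  if (!response.ok) {
    throw new Error(`Search failed with status ${response.status}`);
  }
  return response.json();
}
```

For example, `vectorSearch("http://localhost:8000", "wireless headphones").then(console.log)` would print the documents returned by the service, provided it is running and the sample documents have embeddings.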
You can always go a step further and transform this GET into a POST request. You now know how to implement Vector Search, so it is up to you to start building awesome applications!