shreyas-girjapure/Salesforce-CLI-RAG-Bot

Salesforce CLI Search RAG Bot - Chat with your documents using AI

Live page 🚀 : https://shreyas-girjapure.github.io/Salesforce-CLI-RAG-Bot/

Overview

Document search implementations generally involve:

  1. Data Splitting ✂️
  2. Embedding 🧵
  3. Storing in a Vector DB 💾
  4. Retrieving, with an LLM layer for summarization 🕵️
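
The four steps above can be sketched in a few lines. This is a toy, dependency-free illustration and not this project's code: a bag-of-words count stands in for a real embedding model, a plain array stands in for a vector DB, and the LLM summary step is left out. All function and type names are made up for the example.

```typescript
// Toy sketch of split -> embed -> store -> retrieve. Every name is hypothetical.
type Vector = Record<string, number>;
type Entry = { text: string; vector: Vector };

// 1. Data splitting: naive fixed-size chunks, measured in words.
function splitIntoChunks(text: string, wordsPerChunk: number): string[] {
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += wordsPerChunk) {
    chunks.push(words.slice(i, i + wordsPerChunk).join(" "));
  }
  return chunks;
}

// 2. Embedding: word counts stand in for a real embedding model.
function embed(text: string): Vector {
  const vector: Vector = Object.create(null);
  const tokens = text.toLowerCase().match(/[a-z0-9]+/g) || [];
  for (const token of tokens) {
    vector[token] = (vector[token] || 0) + 1;
  }
  return vector;
}

function norm(v: Vector): number {
  let sumOfSquares = 0;
  for (const token of Object.keys(v)) sumOfSquares += v[token] * v[token];
  return Math.sqrt(sumOfSquares);
}

function cosineSimilarity(a: Vector, b: Vector): number {
  let dot = 0;
  for (const token of Object.keys(a)) dot += a[token] * (b[token] || 0);
  const denominator = norm(a) * norm(b);
  return denominator === 0 ? 0 : dot / denominator;
}

// 3. Storing: an in-memory array stands in for a vector DB.
function buildStore(chunks: string[]): Entry[] {
  return chunks.map((text) => ({ text, vector: embed(text) }));
}

// 4. Retrieving: return the k most similar chunks. A real bot would hand
//    these to an LLM for the summary; that call is omitted here.
function retrieve(store: Entry[], query: string, k: number): string[] {
  const queryVector = embed(query);
  return store
    .map((entry) => ({ text: entry.text, score: cosineSimilarity(queryVector, entry.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.text);
}
```

Calling `retrieve(buildStore(chunks), "deploy source", 1)` returns the single chunk whose words overlap most with the query.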

Search results can be significantly inaccurate depending on the RAG strategy and dataset used. Below are some areas that contribute to the inaccuracy.

  1. Data splitting
    1. Chunked data may lose context that is important for the query. Retrieving such chunks produces poor final results.
  2. LLM layer
    1. LLMs sometimes add hallucinations of their own on top of the retrieved context they are given.

So by avoiding the standard split-and-embed approach and finely controlling the dataset, better results can be achieved.

This project is an implementation of such a RAG strategy over a finely controlled dataset.
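
As a rough illustration of what "controlling the dataset" can mean, the sketch below keeps each command's documentation together as one retrievable unit instead of cutting the text at a fixed size. The input format (each command section starting with a `## ` line), the `CommandDoc` type and the `splitByCommand` function are assumptions made for this example; the README does not describe how the real dataset is structured.

```typescript
// Hypothetical sketch: one document per command instead of fixed-size chunks.
// Assumed input format: each command section starts with a line "## <command name>".
type CommandDoc = { command: string; body: string };

function splitByCommand(reference: string): CommandDoc[] {
  const docs: CommandDoc[] = [];
  let current: CommandDoc | null = null;
  for (const line of reference.split("\n")) {
    if (line.indexOf("## ") === 0) {
      // A new command starts: close the previous one, if any.
      if (current) docs.push(current);
      current = { command: line.slice(3).trim(), body: "" };
    } else if (current) {
      // Description and flag lines stay attached to their command.
      current.body += (current.body ? "\n" : "") + line;
    }
  }
  if (current) docs.push(current);
  return docs;
}
```

With this shape, a flag description can never end up in a different chunk from the command it belongs to, which is the kind of context loss described under data splitting.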

Tip: You can easily create and host simple AI-powered bots using low-code tools like:
    1. RelevanceAI
    2. Flowise

Problem Statement / Motivations

  1. Overcome the inaccuracy of a simple RAG strategy and provide accurate results.
  2. Implement concepts like RAG, vector DBs, and LLM monitoring using LangChain JS.
  3. Always wanted a personal AI-powered RAG bot 😁.

Existing ways to do this [find the CLI commands you need]

  1. Actually read through the Salesforce CLI Reference Documentation.
  2. Use the sfdx search command to search for commands by keyword.

How to Use

  1. Visit the 🚀 RAG Bot Page for Salesforce CLI documentation searches.
  2. Search for anything related to the Salesforce CLI.

Features

  1. Semantic search over a large document.
  2. High accuracy and complete context in results.
  3. Answers* questions unrelated to the context used.
  4. Supports loading local vector stores for faster retrieval.
  5. LLM Monitor support for token and LLM analytics.
  6. Around 0.5 seconds for searches related to the Salesforce CLI documentation.
  7. 5+ seconds for searches unrelated to the Salesforce CLI documentation.
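
The README does not say how the bot decides between the fast and the slow path. One plausible reading, and purely an assumption here, is that a strong vector-store match is answered directly while anything else falls through to a slower LLM call. A minimal sketch of that kind of routing, with a made-up threshold and made-up names (`route`, `Scored`, `MIN_SCORE`):

```typescript
// Hypothetical routing sketch; not taken from this project's code.
type Scored = { text: string; score: number };
type Route =
  | { source: "vector-store"; answer: string }
  | { source: "llm-fallback"; question: string };

// Made-up threshold; a real value would need tuning against the dataset.
const MIN_SCORE = 0.75;

function route(question: string, hits: Scored[], minScore: number = MIN_SCORE): Route {
  // Find the best-scoring retrieval hit, if there is one.
  let best: Scored | null = null;
  for (const hit of hits) {
    if (best === null || hit.score > best.score) best = hit;
  }
  // A strong match is served straight from the store (the fast path).
  if (best !== null && best.score >= minScore) {
    return { source: "vector-store", answer: best.text };
  }
  // Otherwise the question would go to the LLM (the slow path).
  return { source: "llm-fallback", question };
}
```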

AI Awareness section

  1. AI is not magic; a lot of guard rails and code changes are needed to build a useful AI bot that doesn't hallucinate much.
  2. Understanding your own dataset is really important before choosing an approach for a RAG bot.
  3. Adding a chatbot layer over large documents will only improve existing businesses and user experiences.
  4. This is a simple LangChain-based implementation of some AI concepts. Lots of improvements are needed. Feel free to contribute 🥳

Limitations

  1. The free server has cold-start issues, so searches may be delayed after periods of server inactivity.
  2. No rate guards are in place in the code, so the project is prone to credit loss or server crashes 😭. The applicable limits are:
    1. Allowed requests per minute (RPM): 3,500. The current implementation has no agent-style loops, so this limit should be very hard to reach at the current project scale.
    2. Allowed tokens per minute (TPM): 90,000. A single request costs around 1,000-3,000 tokens, so exhausting this per-minute limit would take roughly 30 to 90 requests within a minute.
  3. Since the documentation has two sections with semantically equivalent commands (sf and sfdx), deprecated commands may be retrieved.
    1. Example: when you search for "How to login", vector search might retrieve either the auth web login section or sfdx force auth web login.
    2. The same applies to deploy commands and other commands that appear in both the sf and sfdx sections.
  4. The UI is very poorly designed 😑.
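
For reference, a rate guard for the RPM and TPM limits listed above could look roughly like the sketch below. It is not part of this project; the `RateGuard` class, its method names and the sliding one-minute window are all assumptions made for illustration. Only the two limit values come from the list above.

```typescript
// Hypothetical sliding-window rate guard; only the two limits come from the README.
const REQUESTS_PER_MINUTE = 3500;
const TOKENS_PER_MINUTE = 90000;
const WINDOW_MS = 60000;

class RateGuard {
  private events: { at: number; tokens: number }[] = [];

  constructor(
    private maxRequests: number = REQUESTS_PER_MINUTE,
    private maxTokens: number = TOKENS_PER_MINUTE
  ) {}

  // Returns true and records the request if it fits in the current window.
  // `now` can be injected so the guard is testable without waiting.
  tryAcquire(estimatedTokens: number, now: number = Date.now()): boolean {
    // Forget requests that are older than one minute.
    this.events = this.events.filter((e) => now - e.at < WINDOW_MS);
    let usedTokens = 0;
    for (const e of this.events) usedTokens += e.tokens;
    if (this.events.length + 1 > this.maxRequests) return false;
    if (usedTokens + estimatedTokens > this.maxTokens) return false;
    this.events.push({ at: now, tokens: estimatedTokens });
    return true;
  }
}
```

With requests estimated at 3,000 tokens each, such a guard admits 30 requests in a minute (30 × 3,000 = 90,000) and rejects the next one; at 1,000 tokens each it admits 90. Either way the token limit is hit long before the 3,500 requests-per-minute limit.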

References

Project References

  1. Upcoming Features
  2. Backend Specifications

External References

  1. RAG
  2. Vector Database
  3. Faiss
  4. LangChain Framework
  5. Salesforce CLI Reference Documentation
  6. Older RAG version used
  7. Create a simple one using RelevanceAI
  8. CLI Plugin
  9. LLM Monitor