Retrieval and Retrieval-augmented LLMs
Code for BeLLM: Backward Dependency Enhanced Large Language Model for Sentence Embeddings (NAACL2024)
Project repository for the development of a Question-Answering (QA) information retrieval system fine-tuned on customer queries.
💡 All-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows
Local-GenAI-Search is a generative search engine built on Llama 3, LangChain, and Qdrant that answers questions based on your local files
Leveraging BERT and c-TF-IDF to create easily interpretable topics.
Experiment Code for Computational Linguistics Ⅱ (108.536A, 2024-1)
1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
The code for my master's thesis "Sentence Embeddings in Various Supervision Settings"
Lighter code to train Korean SimCSE
Adapted BERTopic pipeline for Topic Modeling the arXiv dataset
An implementation of the TaxRetrievalBenchmark task for the 🤗 Massive Text Embedding Benchmark (MTEB) framework.
Quality and speed comparison of Word2vec, Sentence-BERT, BM25, and the IVFFlat index
Model to classify and categorize user complaints into categories for specific departments using LLMs.
Train and Infer Powerful Sentence Embeddings with AnglE | 🔥 SOTA on STS and MTEB Leaderboard
Backend for the AI-copilot
The project's goal is to help job seekers understand the basic qualifications for specific jobs and evaluate the suitability of their skills for those positions. Additionally, the program aims to assist recruiters in enhancing their resume selection processes by analyzing and understanding job advertisements ....
🍊 PAUSE (Positive and Annealed Unlabeled Sentence Embedding), accepted by EMNLP'2021 🌴
This study investigates the effectiveness of three Transformers (BERT, RoBERTa, XLNet) in handling data sparsity and cold-start problems in recommender systems. We present a Transformer-based hybrid recommender system that predicts missing ratings and extracts semantic embeddings from user reviews to mitigate these issues.
Data and scripts for training the open source PDF questionnaire extraction component for Harmony Kaggle competition using natural language processing (NLP)