- LLMs are best known for generative use cases, but one use case is frequently overlooked by frameworks such as LangChain: text classification.
- 🚅 bullet was created to address this. It leverages ChatGPT and removes the boilerplate code otherwise needed to perform text classification with either zero-shot or few-shot learning.
- Install bullet:
pip install "git+https://github.com/rafaelpierre/bullet.git#egg=bullet&subdirectory=src"
- Configure your OPENAI_API_KEY
- You should be good to go
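Before calling the classifier, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch, assuming the key is supplied through the `OPENAI_API_KEY` environment variable; the helper name `require_api_key` is hypothetical and is not part of bullet.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the value of the environment variable, or raise if it is unset or empty."""
    # Assumption: the key is provided via an environment variable.
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; export it before using bullet.")
    return value
```

Calling `require_api_key()` at the top of a notebook fails fast with a readable message instead of surfacing an authentication error later.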
from bullet.core.sentiment import SentimentClassifier

# df_train is assumed to be a pandas DataFrame with "text" and "label" columns, loaded beforehand
df_train_sample = df_train.sample(n=50)
classifier = SentimentClassifier()
result = classifier.predict_pandas(df_train_sample)

# Define Few Shot examples
template = "Review: \"{review}\"\nLabel: \"{label}\""
examples = [
    template.format(
        review=row["text"],
        label="POS" if row["label"] == 1 else "NEG"
    )
    for _, row
    in df_train.sample(3).iterrows()
]

df_test_sample = dataset["test"].to_pandas().sample(100)
reviews = df_test_sample.text.values
results = classifier.predict_few_shot(
    reviews=reviews,
    examples=examples
)
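
To see what the few-shot examples look like once rendered, the template above can be tried on a couple of toy rows. This is a minimal sketch: the rows are invented for illustration, plain dicts stand in for DataFrame rows so it runs without pandas, and the helper name `format_examples` is hypothetical rather than part of bullet.

```python
TEMPLATE = 'Review: "{review}"\nLabel: "{label}"'


def format_examples(rows):
    """Render labeled rows as few-shot example strings (1 -> POS, otherwise NEG)."""
    return [
        TEMPLATE.format(
            review=row["text"],
            label="POS" if row["label"] == 1 else "NEG",
        )
        for row in rows
    ]


# Toy rows, invented for illustration only
rows = [
    {"text": "Loved every minute.", "label": 1},
    {"text": "A waste of time.", "label": 0},
]
examples = format_examples(rows)
print(examples[0])
```

Each entry is a two-line string pairing a review with its label, which is the shape passed as `examples` in the snippet above.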
A full working example is available as a Jupyter notebook in notebooks/sandbox.ipynb.
- From a terminal window, create a Python virtual environment and activate it:
python -m venv .venv
source .venv/bin/activate
- Install tox:
pip install tox
- Run the unit tests:
tox
- From the docs folder, run pip install -r requirements.txt
- Run make html
- Documentation will be available at docs/_build/html/index.html