{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "provenance": [], "gpuType": "T4", "authorship_tag": "ABX9TyM3vxACY/vOArAktjoeYZmS", "include_colab_link": true }, "kernelspec": { "name": "python3", "display_name": "Python 3" }, "language_info": { "name": "python" }, "accelerator": "GPU" }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "view-in-github", "colab_type": "text" }, "source": [] }, { "cell_type": "markdown", "source": [ "# Step 1: Install Caikit\n", "\n", "## Installation and Setup\n", "\n", "In this example notebook, we'll use several Python libraries and pre-trained models to evaluate natural language processing tasks. Before we proceed, we need to install the required dependencies and download some supporting resources.\n", "\n", "### 1. Installing Libraries\n", "\n", "To begin, we'll install the following Python packages using `pip`:\n", "\n", "- `evaluate`: A library for evaluating model performance on different NLP tasks.\n", "- `rouge_score`: A package for calculating ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics for text summarization.\n", "\n", "Both libraries have dependencies of their own; `pip` resolves and installs them automatically during installation.\n", "\n", "```python\n", "!pip install evaluate\n", "!pip install rouge_score\n", "```\n", "\n", "### 2. Installing `caikit` and `caikit-nlp`\n", "\n", "Next, we'll install `caikit` pinned to the `v0.11.3` tag and `caikit-nlp` from the default branch of its repository. The project is still in beta, so breaking changes can happen. Note that `caikit-nlp` declares its own `caikit` version requirement, so `pip` may replace the pinned `caikit` with a different release while resolving dependencies; check the install output to see which version ends up installed.\n", "\n", "```python\n", "!pip install git+https://github.com/caikit/caikit@v0.11.3\n", "!pip install git+https://github.com/caikit/caikit-nlp\n", "```\n", "\n", "### 3. 
Downloading Additional Resources\n", "\n", "In order to explore the capabilities of pre-trained models, we'll need to download the caikit-nlp repository.\n", "\n", "\n", "```python\n", "!git clone https://github.com/caikit/caikit-nlp\n", "```\n", "\n", "Now that we have all the necessary libraries and resources installed, we can move on to the next steps in our NLP analysis using these powerful tools!" ], "metadata": { "id": "nqAU3Yh-rha5" } }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "ZhZcVULDrTRz", "colab": { "base_uri": "https://localhost:8080/" }, "outputId": "aa1b6a72-1f39-4d37-b8a4-04d2176a6478" }, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "Collecting evaluate\n", " Downloading evaluate-0.4.0-py3-none-any.whl (81 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m81.4/81.4 kB\u001b[0m \u001b[31m1.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting datasets>=2.0.0 (from evaluate)\n", " Downloading datasets-2.14.1-py3-none-any.whl (492 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m492.4/492.4 kB\u001b[0m \u001b[31m12.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from evaluate) (1.22.4)\n", "Collecting dill (from evaluate)\n", " Downloading dill-0.3.7-py3-none-any.whl (115 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m115.3/115.3 kB\u001b[0m \u001b[31m5.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from evaluate) (1.5.3)\n", "Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.10/dist-packages (from evaluate) (2.27.1)\n", "Requirement already satisfied: tqdm>=4.62.1 in /usr/local/lib/python3.10/dist-packages (from evaluate) (4.65.0)\n", "Collecting 
xxhash (from evaluate)\n", " Downloading xxhash-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m212.5/212.5 kB\u001b[0m \u001b[31m10.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting multiprocess (from evaluate)\n", " Downloading multiprocess-0.70.15-py310-none-any.whl (134 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m134.8/134.8 kB\u001b[0m \u001b[31m7.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: fsspec[http]>=2021.05.0 in /usr/local/lib/python3.10/dist-packages (from evaluate) (2023.6.0)\n", "Collecting huggingface-hub>=0.7.0 (from evaluate)\n", " Downloading huggingface_hub-0.16.4-py3-none-any.whl (268 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m268.8/268.8 kB\u001b[0m \u001b[31m18.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from evaluate) (23.1)\n", "Collecting responses<0.19 (from evaluate)\n", " Downloading responses-0.18.0-py3-none-any.whl (38 kB)\n", "Requirement already satisfied: pyarrow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.0.0->evaluate) (9.0.0)\n", "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets>=2.0.0->evaluate) (3.8.5)\n", "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.0.0->evaluate) (6.0.1)\n", "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.7.0->evaluate) (3.12.2)\n", "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub>=0.7.0->evaluate) (4.7.1)\n", "Requirement already satisfied: urllib3<1.27,>=1.21.1 in 
/usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (1.26.16)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (2023.7.22)\n", "Requirement already satisfied: charset-normalizer~=2.0.0 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (2.0.12)\n", "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->evaluate) (3.4)\n", "Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas->evaluate) (2.8.2)\n", "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->evaluate) (2022.7.1)\n", "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (23.1.0)\n", "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (6.0.4)\n", "Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (4.0.2)\n", "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (1.9.2)\n", "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (1.4.0)\n", "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.0.0->evaluate) (1.3.1)\n", "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.8.1->pandas->evaluate) (1.16.0)\n", "Installing collected packages: xxhash, dill, responses, multiprocess, huggingface-hub, datasets, evaluate\n", "Successfully installed datasets-2.14.1 dill-0.3.7 evaluate-0.4.0 
huggingface-hub-0.16.4 multiprocess-0.70.15 responses-0.18.0 xxhash-3.2.0\n", "Collecting rouge_score\n", " Downloading rouge_score-0.1.2.tar.gz (17 kB)\n", " Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n", "Requirement already satisfied: absl-py in /usr/local/lib/python3.10/dist-packages (from rouge_score) (1.4.0)\n", "Requirement already satisfied: nltk in /usr/local/lib/python3.10/dist-packages (from rouge_score) (3.8.1)\n", "Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from rouge_score) (1.22.4)\n", "Requirement already satisfied: six>=1.14.0 in /usr/local/lib/python3.10/dist-packages (from rouge_score) (1.16.0)\n", "Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk->rouge_score) (8.1.6)\n", "Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk->rouge_score) (1.3.1)\n", "Requirement already satisfied: regex>=2021.8.3 in /usr/local/lib/python3.10/dist-packages (from nltk->rouge_score) (2022.10.31)\n", "Requirement already satisfied: tqdm in /usr/local/lib/python3.10/dist-packages (from nltk->rouge_score) (4.65.0)\n", "Building wheels for collected packages: rouge_score\n", " Building wheel for rouge_score (setup.py) ... 
\u001b[?25l\u001b[?25hdone\n", " Created wheel for rouge_score: filename=rouge_score-0.1.2-py3-none-any.whl size=24933 sha256=bd08d7c51859b7fd181aed977f2dc00ce7821eadef79b06cc1848f893b8b748e\n", " Stored in directory: /root/.cache/pip/wheels/5f/dd/89/461065a73be61a532ff8599a28e9beef17985c9e9c31e541b4\n", "Successfully built rouge_score\n", "Installing collected packages: rouge_score\n", "Successfully installed rouge_score-0.1.2\n", "Collecting git+https://github.com/caikit/caikit@v0.11.3\n", " Cloning https://github.com/caikit/caikit (to revision v0.11.3) to /tmp/pip-req-build-_d_a7zk2\n", " Running command git clone --filter=blob:none --quiet https://github.com/caikit/caikit /tmp/pip-req-build-_d_a7zk2\n", " Running command git checkout -q da1dc8fa7df4f9e9ba5a5b7d926cb38b9e2f1757\n", " Resolved https://github.com/caikit/caikit to commit da1dc8fa7df4f9e9ba5a5b7d926cb38b9e2f1757\n", " Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n", " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n", " Preparing metadata (pyproject.toml) ... 
\u001b[?25l\u001b[?25hdone\n", "Collecting alchemy-config<2.0.0,>=1.1.1 (from caikit==0.0.1)\n", " Downloading alchemy_config-1.1.2-py3-none-any.whl (7.2 kB)\n", "Collecting alchemy-logging<2.0.0,>=1.0.4 (from caikit==0.0.1)\n", " Downloading alchemy_logging-1.1.1-py3-none-any.whl (14 kB)\n", "Collecting anytree<3.0,>=2.7.0 (from caikit==0.0.1)\n", " Downloading anytree-2.9.0-py3-none-any.whl (38 kB)\n", "Collecting docstring-parser<0.16.0,>=0.14.1 (from caikit==0.0.1)\n", " Downloading docstring_parser-0.15-py3-none-any.whl (36 kB)\n", "Requirement already satisfied: grpcio!=1.55.0,<2.0,>=1.35.0 in /usr/local/lib/python3.10/dist-packages (from caikit==0.0.1) (1.56.2)\n", "Collecting ijson<3.3.0,>=3.1.4 (from caikit==0.0.1)\n", " Downloading ijson-3.2.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (111 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m111.8/111.8 kB\u001b[0m \u001b[31m5.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting munch<5.0,>=2.5.0 (from caikit==0.0.1)\n", " Downloading munch-4.0.0-py2.py3-none-any.whl (9.9 kB)\n", "Requirement already satisfied: numpy<2,>=1.20 in /usr/local/lib/python3.10/dist-packages (from caikit==0.0.1) (1.22.4)\n", "Requirement already satisfied: protobuf<5,>=3.19.0 in /usr/local/lib/python3.10/dist-packages (from caikit==0.0.1) (3.20.3)\n", "Collecting py-to-proto!=0.2.1,<0.5.0,>=0.4.0 (from caikit==0.0.1)\n", " Downloading py_to_proto-0.4.1-py310-none-any.whl (32 kB)\n", "Requirement already satisfied: PyYAML<7.0,>=6.0 in /usr/local/lib/python3.10/dist-packages (from caikit==0.0.1) (6.0.1)\n", "Collecting semver<4.0,>=2.13.0 (from caikit==0.0.1)\n", " Downloading semver-3.0.1-py3-none-any.whl (17 kB)\n", "Requirement already satisfied: six<2.0.0,>=1.16.0 in /usr/local/lib/python3.10/dist-packages (from caikit==0.0.1) (1.16.0)\n", "Requirement already satisfied: tqdm<5.0.0,>=4.59.0 in /usr/local/lib/python3.10/dist-packages (from caikit==0.0.1) 
(4.65.0)\n", "Building wheels for collected packages: caikit\n", " Building wheel for caikit (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for caikit: filename=caikit-0.0.1-py3-none-any.whl size=288707 sha256=0a74e3baec7e97b85bb46fdfada10a84ec5574b59fc92de7a11fee7807e7a0e6\n", " Stored in directory: /tmp/pip-ephem-wheel-cache-2mfzwvnw/wheels/83/70/e6/fbfc01278ea550744ce890a667227c6f09bb2e8de0a7414191\n", "Successfully built caikit\n", "Installing collected packages: ijson, alchemy-logging, semver, py-to-proto, munch, docstring-parser, anytree, alchemy-config, caikit\n", "Successfully installed alchemy-config-1.1.2 alchemy-logging-1.1.1 anytree-2.9.0 caikit-0.0.1 docstring-parser-0.15 ijson-3.2.3 munch-4.0.0 py-to-proto-0.4.1 semver-3.0.1\n", "Collecting git+https://github.com/caikit/caikit-nlp\n", " Cloning https://github.com/caikit/caikit-nlp to /tmp/pip-req-build-exod5ibx\n", " Running command git clone --filter=blob:none --quiet https://github.com/caikit/caikit-nlp /tmp/pip-req-build-exod5ibx\n", " Resolved https://github.com/caikit/caikit-nlp to commit 09f49530061c42c7b4dee7a67d3a176d9cbb9b3a\n", " Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n", " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n", " Preparing metadata (pyproject.toml) ... 
\u001b[?25l\u001b[?25hdone\n", "Collecting peft@ git+https://github.com/mayank31398/peft.git@mpt-os-test (from caikit-nlp==0.0.1)\n", " Cloning https://github.com/mayank31398/peft.git (to revision mpt-os-test) to /tmp/pip-install-1vcj8v7m/peft_67b6b7a95a7e4f9b9d932939c25bb119\n", " Running command git clone --filter=blob:none --quiet https://github.com/mayank31398/peft.git /tmp/pip-install-1vcj8v7m/peft_67b6b7a95a7e4f9b9d932939c25bb119\n", " Running command git checkout -b mpt-os-test --track origin/mpt-os-test\n", " Switched to a new branch 'mpt-os-test'\n", " Branch 'mpt-os-test' set up to track remote branch 'mpt-os-test' from 'origin'.\n", " Resolved https://github.com/mayank31398/peft.git to commit a2e3a2d615cf78801639190cc684fffa866dee3e\n", " Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n", " Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n", " Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", "Collecting caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0 (from caikit-nlp==0.0.1)\n", " Downloading caikit-0.14.2-py3-none-any.whl (292 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m292.3/292.3 kB\u001b[0m \u001b[31m5.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting caikit-tgis-backend<0.2.0,>=0.1.14 (from caikit-nlp==0.0.1)\n", " Downloading caikit_tgis_backend-0.1.14-py3-none-any.whl (23 kB)\n", "Collecting accelerate>=0.18.0 (from caikit-nlp==0.0.1)\n", " Downloading accelerate-0.21.0-py3-none-any.whl (244 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m244.2/244.2 kB\u001b[0m \u001b[31m7.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: datasets>=2.4.0 in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) (2.14.1)\n", "Requirement already satisfied: huggingface-hub in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) 
(0.16.4)\n", "Requirement already satisfied: numpy>=1.22.4 in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) (1.22.4)\n", "Requirement already satisfied: pandas>=1.5.0 in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) (1.5.3)\n", "Requirement already satisfied: scikit-learn>=1.1 in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) (1.2.2)\n", "Requirement already satisfied: scipy>=1.8.1 in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) (1.10.1)\n", "Collecting tokenizers>=0.13.3 (from caikit-nlp==0.0.1)\n", " Downloading tokenizers-0.13.3-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.8 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.8/7.8 MB\u001b[0m \u001b[31m22.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: torch>=1.13.1 in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) (2.0.1+cu118)\n", "Requirement already satisfied: tqdm>=4.65.0 in /usr/local/lib/python3.10/dist-packages (from caikit-nlp==0.0.1) (4.65.0)\n", "Collecting transformers>=4.31.0 (from caikit-nlp==0.0.1)\n", " Downloading transformers-4.31.0-py3-none-any.whl (7.4 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m7.4/7.4 MB\u001b[0m \u001b[31m51.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.18.0->caikit-nlp==0.0.1) (23.1)\n", "Requirement already satisfied: psutil in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.18.0->caikit-nlp==0.0.1) (5.9.5)\n", "Requirement already satisfied: pyyaml in /usr/local/lib/python3.10/dist-packages (from accelerate>=0.18.0->caikit-nlp==0.0.1) (6.0.1)\n", "Requirement already satisfied: grpcio<2.0,>=1.35.0 in /usr/local/lib/python3.10/dist-packages (from 
caikit-tgis-backend<0.2.0,>=0.1.14->caikit-nlp==0.0.1) (1.56.2)\n", "Collecting requests<3,>=2.28.2 (from caikit-tgis-backend<0.2.0,>=0.1.14->caikit-nlp==0.0.1)\n", " Downloading requests-2.31.0-py3-none-any.whl (62 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m62.6/62.6 kB\u001b[0m \u001b[31m8.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: alchemy-config<2.0.0,>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (1.1.2)\n", "Requirement already satisfied: alchemy-logging<2.0.0,>=1.0.4 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (1.1.1)\n", "Requirement already satisfied: anytree<3.0,>=2.7.0 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (2.9.0)\n", "Requirement already satisfied: docstring-parser<0.16.0,>=0.14.1 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (0.15)\n", "Requirement already satisfied: ijson<3.3.0,>=3.1.4 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (3.2.3)\n", "Requirement already satisfied: munch<5.0,>=2.5.0 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (4.0.0)\n", "Requirement already satisfied: protobuf<5,>=3.19.0 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (3.20.3)\n", "Requirement already satisfied: py-to-proto!=0.2.1,<0.5.0,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (0.4.1)\n", "Requirement already satisfied: semver<4.0,>=2.13.0 in 
/usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (3.0.1)\n", "Requirement already satisfied: six<2.0.0,>=1.16.0 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (1.16.0)\n", "Collecting grpcio-health-checking<2.0,>=1.35.0 (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading grpcio_health_checking-1.56.2-py3-none-any.whl (8.5 kB)\n", "Collecting grpcio-reflection<2.0,>=1.35.0 (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading grpcio_reflection-1.56.2-py3-none-any.whl (11 kB)\n", "Requirement already satisfied: prometheus_client<1.0,>=0.12.0 in /usr/local/lib/python3.10/dist-packages (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (0.17.1)\n", "Collecting py-grpc-prometheus<0.8,>=0.7.0 (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading py_grpc_prometheus-0.7.0-py3-none-any.whl (12 kB)\n", "Collecting fastapi[all]<1,>=0.95 (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading fastapi-0.100.1-py3-none-any.whl (65 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m65.8/65.8 kB\u001b[0m \u001b[31m8.8 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting sse-starlette<2,>=1.6.1 (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading sse_starlette-1.6.1-py3-none-any.whl (9.6 kB)\n", "Requirement already satisfied: pyarrow>=8.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.4.0->caikit-nlp==0.0.1) (9.0.0)\n", "Requirement already satisfied: dill<0.3.8,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.4.0->caikit-nlp==0.0.1) (0.3.7)\n", "Requirement already satisfied: xxhash in /usr/local/lib/python3.10/dist-packages (from 
datasets>=2.4.0->caikit-nlp==0.0.1) (3.2.0)\n", "Requirement already satisfied: multiprocess in /usr/local/lib/python3.10/dist-packages (from datasets>=2.4.0->caikit-nlp==0.0.1) (0.70.15)\n", "Requirement already satisfied: fsspec[http]>=2021.11.1 in /usr/local/lib/python3.10/dist-packages (from datasets>=2.4.0->caikit-nlp==0.0.1) (2023.6.0)\n", "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets>=2.4.0->caikit-nlp==0.0.1) (3.8.5)\n", "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->caikit-nlp==0.0.1) (3.12.2)\n", "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub->caikit-nlp==0.0.1) (4.7.1)\n", "Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.5.0->caikit-nlp==0.0.1) (2.8.2)\n", "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas>=1.5.0->caikit-nlp==0.0.1) (2022.7.1)\n", "Requirement already satisfied: joblib>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from scikit-learn>=1.1->caikit-nlp==0.0.1) (1.3.1)\n", "Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from scikit-learn>=1.1->caikit-nlp==0.0.1) (3.2.0)\n", "Requirement already satisfied: sympy in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.1->caikit-nlp==0.0.1) (1.11.1)\n", "Requirement already satisfied: networkx in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.1->caikit-nlp==0.0.1) (3.1)\n", "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.1->caikit-nlp==0.0.1) (3.1.2)\n", "Requirement already satisfied: triton==2.0.0 in /usr/local/lib/python3.10/dist-packages (from torch>=1.13.1->caikit-nlp==0.0.1) (2.0.0)\n", "Requirement already satisfied: cmake in /usr/local/lib/python3.10/dist-packages (from 
triton==2.0.0->torch>=1.13.1->caikit-nlp==0.0.1) (3.25.2)\n", "Requirement already satisfied: lit in /usr/local/lib/python3.10/dist-packages (from triton==2.0.0->torch>=1.13.1->caikit-nlp==0.0.1) (16.0.6)\n", "Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/dist-packages (from transformers>=4.31.0->caikit-nlp==0.0.1) (2022.10.31)\n", "Collecting safetensors>=0.3.1 (from transformers>=4.31.0->caikit-nlp==0.0.1)\n", " Downloading safetensors-0.3.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m61.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,<3.0.0,>=1.7.4 in /usr/local/lib/python3.10/dist-packages (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (1.10.12)\n", "Collecting starlette<0.28.0,>=0.27.0 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading starlette-0.27.0-py3-none-any.whl (66 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m67.0/67.0 kB\u001b[0m \u001b[31m9.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting email-validator>=2.0.0 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading email_validator-2.0.0.post2-py3-none-any.whl (31 kB)\n", "Collecting httpx>=0.23.0 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading httpx-0.24.1-py3-none-any.whl (75 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m75.4/75.4 kB\u001b[0m \u001b[31m10.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: itsdangerous>=1.1.0 in 
/usr/local/lib/python3.10/dist-packages (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (2.1.2)\n", "Collecting orjson>=3.2.1 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading orjson-3.9.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (138 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m138.7/138.7 kB\u001b[0m \u001b[31m18.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting pydantic-extra-types>=2.0.0 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading pydantic_extra_types-2.0.0-py3-none-any.whl (13 kB)\n", "Collecting pydantic-settings>=2.0.0 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading pydantic_settings-2.0.2-py3-none-any.whl (11 kB)\n", "Collecting python-multipart>=0.0.5 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading python_multipart-0.0.6-py3-none-any.whl (45 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m45.7/45.7 kB\u001b[0m \u001b[31m6.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting ujson!=4.0.2,!=4.1.0,!=4.2.0,!=4.3.0,!=5.0.0,!=5.1.0,>=4.0.1 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading ujson-5.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (53 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m53.9/53.9 kB\u001b[0m \u001b[31m7.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting uvicorn[standard]>=0.12.0 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading uvicorn-0.23.1-py3-none-any.whl (59 
kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m59.5/59.5 kB\u001b[0m \u001b[31m7.7 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.4.0->caikit-nlp==0.0.1) (23.1.0)\n", "Requirement already satisfied: charset-normalizer<4.0,>=2.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.4.0->caikit-nlp==0.0.1) (2.0.12)\n", "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.4.0->caikit-nlp==0.0.1) (6.0.4)\n", "Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.4.0->caikit-nlp==0.0.1) (4.0.2)\n", "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.4.0->caikit-nlp==0.0.1) (1.9.2)\n", "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.4.0->caikit-nlp==0.0.1) (1.4.0)\n", "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets>=2.4.0->caikit-nlp==0.0.1) (1.3.1)\n", "Collecting protobuf<5,>=3.19.0 (from caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading protobuf-4.23.4-cp37-abi3-manylinux2014_x86_64.whl (304 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m304.5/304.5 kB\u001b[0m \u001b[31m30.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->torch>=1.13.1->caikit-nlp==0.0.1) (2.1.3)\n", "Requirement already satisfied: setuptools>=39.0.1 in /usr/local/lib/python3.10/dist-packages (from 
py-grpc-prometheus<0.8,>=0.7.0->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (67.7.2)\n", "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.28.2->caikit-tgis-backend<0.2.0,>=0.1.14->caikit-nlp==0.0.1) (3.4)\n", "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.28.2->caikit-tgis-backend<0.2.0,>=0.1.14->caikit-nlp==0.0.1) (1.26.16)\n", "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2.28.2->caikit-tgis-backend<0.2.0,>=0.1.14->caikit-nlp==0.0.1) (2023.7.22)\n", "Requirement already satisfied: mpmath>=0.19 in /usr/local/lib/python3.10/dist-packages (from sympy->torch>=1.13.1->caikit-nlp==0.0.1) (1.3.0)\n", "Collecting dnspython>=2.0.0 (from email-validator>=2.0.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading dnspython-2.4.1-py3-none-any.whl (300 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m300.3/300.3 kB\u001b[0m \u001b[31m35.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting httpcore<0.18.0,>=0.15.0 (from httpx>=0.23.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading httpcore-0.17.3-py3-none-any.whl (74 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m74.5/74.5 kB\u001b[0m \u001b[31m10.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from httpx>=0.23.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (1.3.0)\n", "Collecting pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,<3.0.0,>=1.7.4 (from fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " 
Downloading pydantic-2.1.1-py3-none-any.whl (370 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m370.9/370.9 kB\u001b[0m \u001b[31m39.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting annotated-types>=0.4.0 (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,<3.0.0,>=1.7.4->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading annotated_types-0.5.0-py3-none-any.whl (11 kB)\n", "Collecting pydantic-core==2.4.0 (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,<3.0.0,>=1.7.4->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading pydantic_core-2.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.9 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.9/1.9 MB\u001b[0m \u001b[31m49.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting python-dotenv>=0.21.0 (from pydantic-settings>=2.0.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading python_dotenv-1.0.0-py3-none-any.whl (19 kB)\n", "Requirement already satisfied: anyio<5,>=3.4.0 in /usr/local/lib/python3.10/dist-packages (from starlette<0.28.0,>=0.27.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (3.7.1)\n", "Requirement already satisfied: click>=7.0 in /usr/local/lib/python3.10/dist-packages (from uvicorn[standard]>=0.12.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (8.1.6)\n", "Collecting h11>=0.8 (from uvicorn[standard]>=0.12.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading h11-0.14.0-py3-none-any.whl (58 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m58.3/58.3 kB\u001b[0m \u001b[31m7.9 MB/s\u001b[0m eta 
\u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting httptools>=0.5.0 (from uvicorn[standard]>=0.12.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading httptools-0.6.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (428 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m428.8/428.8 kB\u001b[0m \u001b[31m50.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting uvloop!=0.15.0,!=0.15.1,>=0.14.0 (from uvicorn[standard]>=0.12.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading uvloop-0.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m4.1/4.1 MB\u001b[0m \u001b[31m67.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting watchfiles>=0.13 (from uvicorn[standard]>=0.12.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading watchfiles-0.19.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m1.3/1.3 MB\u001b[0m \u001b[31m63.5 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hCollecting websockets>=10.4 (from uvicorn[standard]>=0.12.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1)\n", " Downloading websockets-11.0.3-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (129 kB)\n", "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m129.9/129.9 kB\u001b[0m \u001b[31m9.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", "\u001b[?25hRequirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from 
anyio<5,>=3.4.0->starlette<0.28.0,>=0.27.0->fastapi[all]<1,>=0.95->caikit[runtime-grpc,runtime-http]<0.15.0,>=0.13.0->caikit-nlp==0.0.1) (1.1.2)\n", "Building wheels for collected packages: caikit-nlp, peft\n", " Building wheel for caikit-nlp (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for caikit-nlp: filename=caikit_nlp-0.0.1-py3-none-any.whl size=66015 sha256=8de87bb3537b6983dd7872f7d98e47233a7a46ea4ee45f8b81dc76ceb6652216\n", " Stored in directory: /tmp/pip-ephem-wheel-cache-etzrio_l/wheels/29/1c/3c/060d91e84e7a56eab1cb92fe59f09d22e106a75668a0cb62db\n", " Building wheel for peft (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n", " Created wheel for peft: filename=peft-0.5.0.dev0-py3-none-any.whl size=75373 sha256=1c9fe73f3afcbfad1185d1baf77a467f2ec26356cc91c79037c21c8f10cd1c9c\n", " Stored in directory: /tmp/pip-ephem-wheel-cache-etzrio_l/wheels/45/98/99/87ec5598ca81396c2670c1eff2dd4d3c4315760861c751b559\n", "Successfully built caikit-nlp peft\n", "Installing collected packages: tokenizers, safetensors, websockets, uvloop, ujson, requests, python-multipart, python-dotenv, pydantic-core, py-grpc-prometheus, protobuf, orjson, httptools, h11, dnspython, annotated-types, watchfiles, uvicorn, starlette, pydantic, httpcore, grpcio-reflection, grpcio-health-checking, email-validator, transformers, sse-starlette, pydantic-settings, pydantic-extra-types, httpx, fastapi, caikit, caikit-tgis-backend, accelerate, peft, caikit-nlp\n", " Attempting uninstall: requests\n", " Found existing installation: requests 2.27.1\n", " Uninstalling requests-2.27.1:\n", " Successfully uninstalled requests-2.27.1\n", " Attempting uninstall: protobuf\n", " Found existing installation: protobuf 3.20.3\n", " Uninstalling protobuf-3.20.3:\n", " Successfully uninstalled protobuf-3.20.3\n", " Attempting uninstall: pydantic\n", " Found existing installation: pydantic 1.10.12\n", " Uninstalling pydantic-1.10.12:\n", " Successfully uninstalled pydantic-1.10.12\n", " 
Attempting uninstall: caikit\n", " Found existing installation: caikit 0.0.1\n", " Uninstalling caikit-0.0.1:\n", " Successfully uninstalled caikit-0.0.1\n", "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", "confection 0.1.0 requires pydantic!=1.8,!=1.8.1,<1.11.0,>=1.7.4, but you have pydantic 2.1.1 which is incompatible.\n", "google-colab 1.0.0 requires requests==2.27.1, but you have requests 2.31.0 which is incompatible.\n", "inflect 6.0.5 requires pydantic<2,>=1.9.1, but you have pydantic 2.1.1 which is incompatible.\n", "spacy 3.5.4 requires pydantic!=1.8,!=1.8.1,<1.11.0,>=1.7.4, but you have pydantic 2.1.1 which is incompatible.\n", "thinc 8.1.10 requires pydantic!=1.8,!=1.8.1,<1.11.0,>=1.7.4, but you have pydantic 2.1.1 which is incompatible.\u001b[0m\u001b[31m\n", "\u001b[0mSuccessfully installed accelerate-0.21.0 annotated-types-0.5.0 caikit-0.14.2 caikit-nlp-0.0.1 caikit-tgis-backend-0.1.14 dnspython-2.4.1 email-validator-2.0.0.post2 fastapi-0.100.1 grpcio-health-checking-1.56.2 grpcio-reflection-1.56.2 h11-0.14.0 httpcore-0.17.3 httptools-0.6.0 httpx-0.24.1 orjson-3.9.2 peft-0.5.0.dev0 protobuf-4.23.4 py-grpc-prometheus-0.7.0 pydantic-2.1.1 pydantic-core-2.4.0 pydantic-extra-types-2.0.0 pydantic-settings-2.0.2 python-dotenv-1.0.0 python-multipart-0.0.6 requests-2.31.0 safetensors-0.3.1 sse-starlette-1.6.1 starlette-0.27.0 tokenizers-0.13.3 transformers-4.31.0 ujson-5.8.0 uvicorn-0.23.1 uvloop-0.17.0 watchfiles-0.19.0 websockets-11.0.3\n", "Cloning into 'caikit-nlp'...\n", "remote: Enumerating objects: 1784, done.\u001b[K\n", "remote: Counting objects: 100% (815/815), done.\u001b[K\n", "remote: Compressing objects: 100% (271/271), done.\u001b[K\n", "remote: Total 1784 (delta 614), reused 654 (delta 543), pack-reused 969\u001b[K\n", "Receiving objects: 100% (1784/1784), 1.79 MiB | 11.17 MiB/s, done.\n", "Resolving 
deltas: 100% (1219/1219), done.\n" ] } ], "source": [ "!pip install evaluate\n", "!pip install rouge_score\n", "\n", "!pip install git+https://github.com/caikit/caikit@v0.11.3\n", "!pip install git+https://github.com/caikit/caikit-nlp\n", "\n", "!git clone https://github.com/caikit/caikit-nlp" ] }, { "cell_type": "markdown", "source": [ "# Step 2: Prompt Tuning\n", "\n", "```\n", "!python caikit-nlp/examples/run_peft_tuning.py MULTITASK_PROMPT_TUNING \\\n", " --dataset \"glue/rte\" \\\n", " --model_name t5-base \\\n", " --num_epochs 1 \\\n", " --verbose \\\n", " --num_virtual_tokens 100 \\\n", " --prompt_tuning_init RANDOM \\\n", " --output_dir tmp/foo \\\n", " --learning_rate 0.3 \\\n", " --batch_size=12 \\\n", " --accumulate_steps 16\n", "```\n", "\n", "This command runs the Python script `run_peft_tuning.py` with the Python interpreter. The script is part of the `caikit-nlp` repository and tunes a pre-trained model (here `t5-base`) on a specific dataset (here `glue/rte`) using one of several parameter-efficient tuning approaches, e.g., Multitask Prompt Tuning (MPT).\n", "\n", "Let's explain each argument in the command:\n", "\n", "1. `caikit-nlp/examples/run_peft_tuning.py`: This specifies the path to the Python script that will be executed. It is part of the caikit-nlp repository cloned in Step 1 and drives the PEFT (parameter-efficient fine-tuning) run.\n", "1. `MULTITASK_PROMPT_TUNING`: This is the option recognized by the `run_peft_tuning.py` script, indicating the type of tuning or learning strategy to be used. In this case, it's \"Multitask Prompt Tuning.\" (The alternative option is `PROMPT_TUNING`, which has slightly different options.)\n", "1. `--dataset \"glue/rte\"`: This specifies the dataset to be used for tuning. In this example, the dataset is `glue/rte`, which is a subset of the GLUE (General Language Understanding Evaluation) benchmark containing the Recognizing Textual Entailment (RTE) task.\n", "1.
`--model_name t5-base`: This indicates the base model that will be used for prompt tuning. In this case, it's `t5-base`, which refers to the T5-base model from Hugging Face.\n", "1. `--num_epochs 1`: This sets the number of epochs, i.e., full passes over the training dataset. Here, it's set to 1, meaning the model will go through the dataset once during prompt tuning.\n", "1. `--verbose`: This is a flag to enable verbose or detailed output during the training process. It will cause the script to print more information about the training progress.\n", "1. `--num_virtual_tokens 100`: This sets the number of virtual tokens used for prompt tuning. Prompt tuning involves freezing the weights of a base model and learning `soft prompts`, which can be concatenated to the inputs when running text generation on the model.\n", "1. `--prompt_tuning_init RANDOM`: This specifies the method used for initializing prompt tuning. In this case, it's set to \"RANDOM,\" meaning the soft prompt will start from random values.\n", "1. `--output_dir tmp/foo`: This sets the directory where the prompt-tuned model and related outputs will be stored. In this case, it's set to the `tmp/foo` directory.\n", "1. `--learning_rate 0.3`: This sets the learning rate for the optimization process during prompt tuning. The learning rate controls the step size in gradient descent.\n", "1. `--batch_size=12`: This sets the batch size used during training. The data will be divided into batches of 12 samples each.\n", "1. `--accumulate_steps 16`: This specifies the number of steps over which gradients are accumulated before the weights are updated. It is useful for simulating a larger batch size when GPU memory is limited; here the effective batch size is 12 × 16 = 192 samples per weight update.\n", "\n", "Overall, this command prompt-tunes the T5-base model using the MPT approach on the `glue/rte` dataset, with specific settings for learning rate, batch size, accumulation steps, and so on.
For a full list of available args, their descriptions, and default values, run `!python caikit-nlp/examples/run_peft_tuning.py MULTITASK_PROMPT_TUNING --help`\n", " \n", "The results of prompt-tuning will be stored in the `tmp/foo` directory." ], "metadata": { "id": "7RIXUii94aLy" } }, { "cell_type": "code", "source": [ "%env ALLOW_DOWNLOADS=true\n", "!python caikit-nlp/examples/run_peft_tuning.py MULTITASK_PROMPT_TUNING --dataset \"glue/rte\" --model_name t5-base --num_epochs 1 --verbose --num_virtual_tokens 100 --prompt_tuning_init RANDOM --output_dir tmp/foo --learning_rate 0.3 --batch_size=12 --accumulate_steps 16" ], "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "L4deByI5aRNQ", "outputId": "0fe9ce42-7e43-46b1-e338-5c1c57454b27" }, "execution_count": 8, "outputs": [ { "output_type": "stream", "name": "stdout", "text": [ "env: ALLOW_DOWNLOADS=true\n", "2023-07-28 19:26:45.633173: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT\n", " is still in the BETA phase and subject to change!\n", "Could not deduct output type from function train for module class FineTuning.\n", "\u001b[94mExperiment Configuration\n", "- Model Name: [t5-base]\n", " |- Inferred Model Resource Type: []\n", "- Tuning Type: [MULTITASK_PROMPT_TUNING]\n", "- Prompt Tuning Initialization Type [RANDOM]\n", "- Number of Virtual Tokens: [100]\n", "- Dataset: [glue/rte]\n", "- Verbalizer: [rte { 0 : entailment, 1 : not entailment } {{input}}]\n", "- Number of Epochs: [1]\n", "- Learning Rate: [0.3]\n", "- Batch Size: [12]\n", "- Output Directory: [tmp/foo]\n", "- Exporting prompt only: [False]\n", "- Number of shots: [None]\n", "- Maximum source sequence length: [256]\n", "- Maximum target sequence length: [128]\n", "- Gradient accumulation steps: [16]\u001b[0m\n", "\u001b[94m[Loading the dataset...]\u001b[0m\n", "2023-07-28T19:26:50.217918 [fsspe:DBUG] open file: 
/root/.cache/huggingface/datasets/glue/rte/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/dataset_info.json\n", "2023-07-28T19:26:50.244294 [fsspe:DBUG] open file: /root/.cache/huggingface/datasets/glue/rte/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad/dataset_info.json\n", "\u001b[94m[Loading the base model resource...]\u001b[0m\n", "\u001b[94m[Starting the training...]\u001b[0m\n", "2023-07-28T19:26:53.553550 [PEFT_:INFO] Using initialization method [MultitaskPromptTuningInit.RANDOM]\n", "2023-07-28T19:26:55.997687 [PEFT_:DBUG] Shuffling enabled? True\n", "2023-07-28T19:26:55.997860 [PEFT_:DBUG] Shuffling buffer size: 2490\n", "2023-07-28T19:26:55.998389 [PEFT_:INFO] [{'output_model_types', 'prompt_tuning_init_method', 'prompt_tuning_init_source_model'}] config params not supported by provided tuning type!\n", "2023-07-28T19:26:55.998595 [PEFT_:INFO] Parameters used: {'num_virtual_tokens': 100, 'prompt_tuning_init_text': None, 'prompt_tuning_init_state_dict_path': None, 'tokenizer_name_or_path': 't5-base', 'num_transformer_submodules': 1, 'prompt_tuning_init': 'RANDOM'}\n", "2023-07-28T19:26:55.998719 [PEFT_:DBUG] Peft config [MultitaskPromptTuningConfig(peft_type=, auto_mapping=None, base_model_name_or_path=None, revision=None, task_type=, inference_mode=False, num_virtual_tokens=100, token_dim=None, num_transformer_submodules=1, num_attention_heads=None, num_layers=None, prompt_tuning_init='RANDOM', prompt_tuning_init_text=None, tokenizer_name_or_path='t5-base', prompt_tuning_init_state_dict_path=None, prompt_tuning_init_task=0, num_ranks=1, num_tasks=1)]\n", "2023-07-28T19:26:56.008847 [PEFT_:DBUG] Using device: cuda\n", "Epoch: 0: 100% 208/208 [03:09<00:00, 1.10it/s]\n", "2023-07-28T19:30:08.098438 [PEFT_:INFO] epoch 0: tensor(0.2951, device='cuda:0', grad_fn=)\n", "2023-07-28T19:30:08.275813 [PEFT_:INFO] Calculating single target task prompt vector\n", "\u001b[94m[Training Complete]\u001b[0m\n" ] } ] } ] }