Insights: abetlen/llama-cpp-python
12 Releases published by 1 person
- v0.2.86-metal, published Aug 7, 2024
- v0.2.86-cu124, published Aug 7, 2024
- v0.2.86-cu121, published Aug 7, 2024
- v0.2.86, published Aug 7, 2024
- v0.2.86-cu122, published Aug 7, 2024
- v0.2.86-cu123, published Aug 7, 2024
- v0.2.87-metal, published Aug 7, 2024
- v0.2.87-cu122, published Aug 7, 2024
- v0.2.87-cu124, published Aug 7, 2024
- v0.2.87-cu121, published Aug 7, 2024
- v0.2.87, published Aug 7, 2024
- v0.2.87-cu123, published Aug 7, 2024
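
The -metal and -cuNNN suffixes above are prebuilt backend wheels (Metal, and CUDA 12.1 through 12.4). As a minimal sketch, assuming the project's documented extra-index install convention (the index URL and version below are assumptions, not taken from this page), you can check what a given wheel provides after installing it:

```python
# Minimal sketch: inspecting an installed prebuilt wheel. Assumes the README's
# extra-index convention for the variant wheels listed above, e.g.
#   pip install llama-cpp-python==0.2.87 \
#     --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cu124
# (URL and version here are assumptions, not taken from this page.)
import llama_cpp

print(llama_cpp.__version__)  # e.g. "0.2.87"
# Reports whether the underlying llama.cpp build can offload layers to a GPU,
# i.e. whether you got a CUDA/Metal wheel rather than a CPU-only one.
print(llama_cpp.llama_supports_gpu_offload())
```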
5 Pull requests merged by 5 people
- Enable recursive search of HFFS.ls (#1656), merged Aug 7, 2024
- chore(deps): bump pypa/cibuildwheel from 2.19.2 to 2.20.0 (#1657), merged Aug 7, 2024
- Add more detailed log for prefix-match (#1659), merged Aug 7, 2024
- Ported back new grammar changes from C++ to Python implementation (#1637), merged Aug 7, 2024
- Fix crash when using grammar (#1649), merged Aug 4, 2024 (see the grammar sketch after this list)
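
Two of these merges (#1637, #1649) concern grammar-constrained generation. A minimal usage sketch, assuming the documented LlamaGrammar API; the grammar, model path, and prompt are illustrative, not taken from these PRs:

```python
# Minimal sketch of grammar-constrained generation (the area touched by
# #1637 and #1649).
from llama_cpp import Llama, LlamaGrammar

# GBNF grammar that restricts output to exactly "yes" or "no".
grammar = LlamaGrammar.from_string(r'root ::= "yes" | "no"')

llm = Llama(model_path="./model.gguf")  # hypothetical local model file
out = llm("Is water wet? Answer yes or no: ", grammar=grammar, max_tokens=4)
print(out["choices"][0]["text"])
```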
3 Pull requests opened by 3 people
- 🚀 Add Ruff Linter (#1651), opened Aug 2, 2024
- Refactor all_text to remaining_text (#1658), opened Aug 6, 2024
- Remove spurious bracket added in PR 1656 (#1667), opened Aug 7, 2024
10 Issues closed by 3 people
- Could not build the wheels for 0.2.86 on Mac Silicon (#1663), closed Aug 7, 2024
- Bug: GitHub Pages missing wheels for v0.2.83 - v0.2.86 (#1627), closed Aug 7, 2024
- segmentation fault 0.2.84 when using function calling (#1636), closed Aug 7, 2024
- Cannot load Phi3 with latest (0.2.84) release (#1638), closed Aug 7, 2024
- Chat completions crashes when asked for JSON response (#1655), closed Aug 7, 2024 (see the JSON-mode sketch after this list)
- Grammars bracket repetition symbol not working (#1547), closed Aug 7, 2024
- The latest version kills python kernel with LlamaGrammar (#1623), closed Aug 4, 2024
- nvcc fatal : Host compiler targets unsupported OS. (#1647), closed Aug 1, 2024
- nvcc fatal : Host compiler targets unsupported OS. (#1642), closed Aug 1, 2024
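
#1655 involved the OpenAI-style JSON response mode. A minimal sketch of that code path, assuming the documented response_format argument; the model path and chat format are illustrative:

```python
# Minimal sketch of the JSON response mode that crashed in #1655.
from llama_cpp import Llama

llm = Llama(model_path="./model.gguf", chat_format="chatml")  # hypothetical path
resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Reply with a JSON object."}],
    # Constrains sampling to valid JSON.
    response_format={"type": "json_object"},
)
print(resp["choices"][0]["message"]["content"])
```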
10 Issues opened by 10 people
- `LlamaGrammar` prints grammar on each iteration (#1666), opened Aug 7, 2024
- error: no matching function for call to 'ggml_group_norm' (#1665), opened Aug 7, 2024
- Wheel build showing error of cmake suddenly - building version 0.2.76 on windows (#1664), opened Aug 7, 2024
- Wheel build fails building version 0.2.86 (#1662), opened Aug 7, 2024
- High CPU Usage, very slow performance, with flash_attn=true on ROCM 6.1.2 (#1661), opened Aug 6, 2024 (see the flash_attn sketch after this list)
- Error When Loading Model with llama_cpp: [WinError -1073741795] Windows Error 0xc000001d (#1660), opened Aug 6, 2024
- pip install llama-cpp-python on anaconda (#1654), opened Aug 5, 2024
- CUDA 12.1 Llama-cpp-python version 0.2.84 pre-built request (#1652), opened Aug 2, 2024
- create_chat_completion is stuck in versions 0.2.84 and 0.2.85 for Mac Silicon (#1648), opened Aug 1, 2024
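
#1661 reports a performance regression with flash attention enabled on ROCm. A minimal sketch of the relevant toggle, assuming the documented Llama constructor flags; the model path is illustrative, and whether flash_attn helps or hurts depends on backend and hardware:

```python
# Minimal sketch of the flash-attention toggle reported slow on ROCm in #1661.
from llama_cpp import Llama

llm = Llama(
    model_path="./model.gguf",  # hypothetical local model
    n_gpu_layers=-1,            # offload all layers when a GPU backend is present
    flash_attn=True,            # the setting #1661 reports regressing on ROCm 6.1.2
)
```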
80 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Integrate Functionary v2.5 + Refactor Functionary Code (#1509), commented on Aug 4, 2024 • 3 new comments
- Update chatml-function-calling handler to allow 'required' tool choice (#1613), commented on Aug 4, 2024 • 0 new comments
- Add user-assistant chat format (#1281), commented on Aug 4, 2024 • 0 new comments
- Add a progress bar for prompt evaluation (#1248), commented on Aug 4, 2024 • 0 new comments
- Exposes json_schema_to_gbnf method for importing from module (#1212), commented on Aug 4, 2024 • 0 new comments
- WIP: Parallel generation implementation (#1209), commented on Aug 4, 2024 • 0 new comments
- feat: implement required attributes in json_schema_to_gbnf (#1170), commented on Aug 4, 2024 • 0 new comments
- Add custom routers register to `create_app` (#1160), commented on Aug 4, 2024 • 0 new comments
- Remove subsequences of cached tokens to match a longer prefix (#1106), commented on Aug 4, 2024 • 0 new comments
- explicit cast messages to string for RAG purposes (#1080), commented on Aug 4, 2024 • 0 new comments
- Multistage CUDA Dockerfile to reduce image size and allow local repository build (#993), commented on Aug 4, 2024 • 0 new comments
- QA Document High api examples (#962), commented on Aug 4, 2024 • 0 new comments
- Update shared library import & license compliance (#955), commented on Aug 4, 2024 • 0 new comments
- Add batch inference support (WIP) (#951), commented on Aug 4, 2024 • 0 new comments
- Replace all absolute imports of llama_cpp in llama_cpp (#922), commented on Aug 4, 2024 • 0 new comments
- fix: get system message from messages for all prompt formats (#913), commented on Aug 4, 2024 • 0 new comments
- limit_concurrency Uvicorn (#868), commented on Aug 4, 2024 • 0 new comments
- feat(llm-vscode): add `generate` endpoint to support llm-vscode (#843), commented on Aug 4, 2024 • 0 new comments
- Get rid of star imports (#835), commented on Aug 4, 2024 • 0 new comments
- Fix #777, #464 (#778), commented on Aug 4, 2024 • 0 new comments
- Add cancel() method to interrupt a stream (#733), commented on Aug 4, 2024 • 0 new comments
- Updated README.md, llama_cpp/llama.py and pyproject.toml to add support for cross-encoders (#1605), commented on Aug 4, 2024 • 0 new comments
- Enable detokenizing special tokens (#1596), commented on Aug 7, 2024 • 0 new comments
- Support images from local storage for Llava models (#1583), commented on Aug 4, 2024 • 0 new comments
- Fix import error for multiple packages (#1576), commented on Aug 4, 2024 • 0 new comments
- Add stream_options support according to OpenAI API (#1552), commented on Aug 5, 2024 • 0 new comments
- Change server approach to handle parallel requests (#1550), commented on Aug 4, 2024 • 0 new comments
- Workflow speed up - POST PART 1 (#1526), commented on Aug 4, 2024 • 0 new comments
- Support all types of GGUF metadata (#1525), commented on Aug 4, 2024 • 0 new comments
- Workflow update - PART 2 (#1515), commented on Aug 4, 2024 • 0 new comments
- Support parallel function calls with tool_choice (#1503), commented on Aug 1, 2024 • 0 new comments
- Render chat template tojson filter as unicode (#1486), commented on Aug 4, 2024 • 0 new comments
- Loading sharded (GGUF) model files from HF with LLama.from_pretrained() 'additional_files' argument (#1457), commented on Aug 4, 2024 • 0 new comments (see the from_pretrained sketch at the end of this list)
- Support multiple chat templates - step 2 (#1440), commented on Aug 4, 2024 • 0 new comments
- LLaMA cpp python server: IPV6 support (#1427), commented on Aug 4, 2024 • 0 new comments
- fix: add binding for name in ChatCompletionRequestToolMessage (#1407), commented on Aug 4, 2024 • 0 new comments
- Add the Phi 3 mini chat format (#1383), commented on Aug 4, 2024 • 0 new comments
- Add the Command R chat format (#1382), commented on Aug 4, 2024 • 0 new comments
- Improve function calling (auto selection, parallel functions) (#1351), commented on Aug 4, 2024 • 0 new comments
- Feature: Lightweight llama_cpp.server Docker Image Build Workflow (#1331), commented on Aug 4, 2024 • 0 new comments
- Pull from Ollama repo functionality (#1607), commented on Aug 7, 2024 • 0 new comments
- Add support for cross-encoders (#1611), commented on Aug 7, 2024 • 0 new comments
- ERROR: Failed building wheel for llama-cpp-python for SYCL installation on Windows (#1614), commented on Aug 7, 2024 • 0 new comments
- Build fail for version from 0.2.80 to 0.2.83 (#1616), commented on Aug 7, 2024 • 0 new comments
- ERROR: Could not build wheels for llama-cpp-python (#1617), commented on Aug 7, 2024 • 0 new comments
- llama-cpp-python CMake, Failed building wheel for llama-cpp-python error on windows 11 pro (#1619), commented on Aug 7, 2024 • 0 new comments
- can not install llama-cpp-python (#1621), commented on Aug 7, 2024 • 0 new comments
- Not able to Install with cuda support in Bento (#1622), commented on Aug 7, 2024 • 0 new comments
- CUDA error: unspecified launch failure on inference on Nvidia V100 GPUs (#1624), commented on Aug 7, 2024 • 0 new comments
- Pre-built cpu wheel does not work on Ubuntu due to libc.musl dependency (#1628), commented on Aug 7, 2024 • 0 new comments
- [Bug:Server] Lack of usage information on streaming response (#1640), commented on Aug 7, 2024 • 0 new comments
- All requests end with 'finish_reason': 'length' when the max_tokens=-1 parameter is set (#1645), commented on Aug 7, 2024 • 0 new comments
- OSError: [WinError -1073741795] Windows Error 0xc000001d (#728), commented on Aug 7, 2024 • 0 new comments
- Support for arm64 wheels and CPU Features (#1342), commented on Aug 6, 2024 • 0 new comments
- Trying to load llm model using llama cpp python with GPU support fails with an OSError: exception: access violation reading 0x0000000000000000 (#1581), commented on Aug 5, 2024 • 0 new comments
- build error for version 0.2.81 and 0.2.80 (#1573), commented on Aug 5, 2024 • 0 new comments
- BUG: import error or execute error (NULL pointer access) for the latest prebuilt version `v0.2.81` (#1571), commented on Aug 4, 2024 • 0 new comments
- How to use GPU? (#576), commented on Aug 1, 2024 • 0 new comments
- tool_call function "name" has incorrect format so function calling does not work for functionary (#1560), commented on Aug 1, 2024 • 0 new comments
- destructor llama error: TypeError: 'NoneType' object is not callable (#1610), commented on Aug 1, 2024 • 0 new comments
- gguf reader for layer and size estimates (#716), commented on Aug 4, 2024 • 0 new comments
- pyinstaller hook script (#709), commented on Aug 4, 2024 • 0 new comments
- Add Helm-Chart for easy Kubernetes deployment (#678), commented on Aug 4, 2024 • 0 new comments
- Add beam search (#631), commented on Aug 4, 2024 • 0 new comments
- Create Dockerfile-CN (#624), commented on Aug 4, 2024 • 0 new comments
- Add parameter to skip saving to cache when caching is enabled (#594), commented on Aug 4, 2024 • 0 new comments
- Improve GitHub actions (#577), commented on Aug 4, 2024 • 0 new comments
- Create simple_local_chat.py (#575), commented on Aug 4, 2024 • 0 new comments
- Implement a flake.nix that uses the upstream llama.cpp flake by reference (#517), commented on Aug 4, 2024 • 0 new comments
- WIP. Refer https://github.com/NixOS/nixpkgs/issues/242792 (#505), commented on Aug 4, 2024 • 0 new comments
- Create server_streaming.py (#414), commented on Aug 4, 2024 • 0 new comments
- Added Mirostat Mode and related Params to Llama initialization (#329), commented on Aug 4, 2024 • 0 new comments
- Allow relative paths at model initialization (#198), commented on Aug 4, 2024 • 0 new comments
- WIP: Mechanism to retrieve all logprobs on completion (#176), commented on Aug 4, 2024 • 0 new comments
- Add truncate to high level api (#172), commented on Aug 4, 2024 • 0 new comments
- added huggingface space implementation (#146), commented on Aug 4, 2024 • 0 new comments
- (WIP) Openapi client gen (#144), commented on Aug 4, 2024 • 0 new comments
- Feat: Add support for Llama 3.1 function calling (#1618), commented on Aug 7, 2024 • 0 new comments
- ERROR: Failed building wheel for llama-cpp-python (#1125), commented on Aug 7, 2024 • 0 new comments
- Windows - OpenBLAS (CPU) - Could NOT find BLAS (missing: BLAS_LIBRARIES) (#1595), commented on Aug 7, 2024 • 0 new comments
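
Several of these threads (#1457, #1583, #1607) involve fetching models rather than pointing at a local file. A minimal sketch, assuming the documented Llama.from_pretrained helper (which requires the huggingface-hub package); the repo id and filename glob are illustrative:

```python
# Minimal sketch of loading a GGUF model from the Hugging Face Hub with
# Llama.from_pretrained; repo and filename below are illustrative.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen2-0.5B-Instruct-GGUF",  # hypothetical HF repo
    filename="*q8_0.gguf",                    # glob matched against repo files
)
```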