
ggml : improve CI + add more tests #295

Open
1 task
ggerganov opened this issue Jun 25, 2023 · 9 comments
Labels
good first issue (Good for newcomers) · help wanted (Extra attention is needed) · testing (Everything test related)

Comments

@ggerganov (Owner) commented Jun 25, 2023

The current state of the testing framework is pretty bad - we have a few simple test tools in tests, but they are not properly maintained and are quite rudimentary. Additionally, GitHub Actions does not allow running heavy workloads, so it is difficult to run integration tests even on small models such as GPT-2. Not to mention that there is no GPU support.

Ideally, it would be awesome to have a CI that can build the code on as many different hardware configurations as possible and run performance and accuracy tests for various models. This would allow quicker iteration on changes to the core library.

I posted a discussion on this topic in llama.cpp - hopefully we can gather some insight on how to build such a CI in the cloud:

ggerganov/llama.cpp#1985

Extra related issues:

TODOs:

@ggerganov added labels: help wanted, good first issue, testing - Jun 25, 2023
@goerch (Contributor) commented Jul 11, 2023

I'd be interested in helping with the 'add more tests' part of this because of some unanswered questions. But I believe it would be reasonable to have some direction here first. An obvious question: do we have any means to measure test coverage yet?

@ggerganov (Owner, Author) commented Jul 11, 2023

I guess we can focus on CPU-only testing for now. The most straightforward approach is to have a unit test for each function in the ggml.h API. Some functions, like ggml_rope() and ggml_alibi(), should be cross-validated against reference Python implementations somehow, since it is otherwise difficult to judge whether they compute things correctly. Such tests are lightweight and can be part of the existing GitHub Actions.
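For the cross-validation idea, here is a minimal Python sketch of the kind of reference implementation ggml_rope() could be checked against. The function name `rope_ref`, the consecutive-pair layout, and the theta schedule are my assumptions based on the standard RoPE formulation, not ggml's exact convention:

```python
import math

def rope_ref(x, pos, base=10000.0):
    """Reference RoPE (sketch): rotate consecutive pairs of the
    even-length vector `x` at sequence position `pos`.
    NOTE: pair layout and theta schedule are assumptions, not
    necessarily ggml's exact convention."""
    d = len(x)
    out = [0.0] * d
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)      # rotation angle for this pair
        c, s = math.cos(theta), math.sin(theta)
        out[i]     = x[i] * c - x[i + 1] * s
        out[i + 1] = x[i] * s + x[i + 1] * c
    return out
```

Even without ggml in the loop, two cheap sanity properties fall out of this: `pos=0` must be the identity, and rotations must preserve the vector norm.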

Regarding GPU tests - when the cloud CI framework is ready, we will simply run "integration" tests in the cloud. For example, the CI can obtain certain model data and run text generation and perplexity calculations using different GPUs. Whatever is available for rent. We can figure out the details for this later.

Test coverage would be nice - I've used lcov in the past. Maybe we can integrate it in the Github Actions CI.

@goerch (Contributor) commented Jul 12, 2023

@ggerganov: My attempts to get lcov working on Windows failed miserably, but I got clang/llvm coverage analysis working. Here is a first summary:

```
Filename                      Regions    Missed Regions     Cover   Functions  Missed Functions  Executed       Lines      Missed Lines     Cover    Branches   Missed Branches     Cover
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
src\ggml.c                      10461              5746    45.07%         512               216    57.81%        9985              4703    52.90%        4848              2775    42.76%
tests\test-grad0.c                464                68    85.34%          11                 1    90.91%         818                68    91.69%         322                62    80.75%

Files which contain no functions:
include\ggml\ggml.h                 0                 0         -           0                 0         -           0                 0         -           0                 0         -
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
TOTAL                           10925              5814    46.78%         523               217    58.51%       10803              4771    55.84%        5170              2837    45.13%
```

The report is based on the merged profile data of all currently active tests, run via ctest; llvm-cov seems to require naming a specific executable, however. If this is an acceptable way forward, I'll try to clean up the CMake changes and propose a PR.
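For context, a clang/llvm source-based coverage run of the kind described above generally follows this shape. This is a sketch, assuming a clang toolchain with the llvm tools on PATH; the flags, paths, and executable name are illustrative, not taken from the actual PR:

```shell
# Build with source-based coverage instrumentation (assumes clang)
cmake -B build -DCMAKE_C_COMPILER=clang \
      -DCMAKE_C_FLAGS="-fprofile-instr-generate -fcoverage-mapping"
cmake --build build

# Run the test suite; each test process writes its own .profraw file
LLVM_PROFILE_FILE="cov-%p.profraw" ctest --test-dir build

# Merge raw profiles, then report against a specific test executable
llvm-profdata merge -sparse cov-*.profraw -o cov.profdata
llvm-cov report ./build/bin/test-grad0 -instr-profile=cov.profdata
```

The per-executable requirement mentioned above shows up in the last step: `llvm-cov report` takes the instrumented binary as an argument, so a multi-binary test suite needs either one report per executable or all binaries passed to a single invocation.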

@ggerganov (Owner, Author) commented

Yes, this looks even better. Let's give it a try.

@alonfaraj commented

Hi @ggerganov,
I published a PR a few days ago to support multiple platforms and operating systems. Let me know if you find it relevant.

@ggerganov (Owner, Author) commented

@alonfaraj

Thank you very much! I'm currently looking at the PR - sorry for the delay.

In the meantime, I've made progress on the Azure Cloud CI idea and hacked a simple framework using Bash + Git:

https://github.com/ggml-org/ci

Currently, I am able to very easily attach new nodes from the cloud and have them run various tests. The tests are implemented in the ci/run.sh script. At the moment I've rented just 3 CPU instances:

[image: screenshot of the rented CI nodes]

The ggml-2 instance is a high-performance one and can run heavier workloads, such as MPT-7B inference.
The results are summarized neatly in GitHub README.md files for each commit.

If this strategy turns out to be effective, I will probably scale it up and add GPU and bare-metal nodes.

@alonfaraj commented

Looks good!
I will take a deeper look as well.

@ggerganov (Owner, Author) commented

Add Metal CI to llama.cpp using the new macos-13 runners: #514

@ianscrivener commented Apr 17, 2024

@ggerganov,
How are things going... and how are you progressing on the CI?

I've recently finished an Azure Architecture/DevOps contract... got familiar with CI/CD on Azure, Azure Infrastructure-as-Code (IaC), different Azure services, etc.

Re-reading this roadmap item... it seems the solution may be:

A GitHub Action starts a CI process on Azure:
- create the Azure "webworker" infrastructure - multiple approaches, from shared to dedicated, CPU or GPU
- [optionally] run unit tests
- run performance tests
- return reports
- destroy the Azure "webworker" infrastructure

A YAML settings file + GitHub Secrets to manage the config.

The CI could be run on the forked repo, using the GitHub Secrets (and hence the Azure credentials) of the fork's GitHub account.
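The steps above could be sketched as a workflow along these lines. This is a hypothetical fragment: the workflow name, the secret name, and the `ci/azure-run.sh` script are all assumptions for illustration, not an existing workflow in the repo:

```yaml
# .github/workflows/azure-ci.yml (hypothetical sketch)
name: azure-perf-ci
on: workflow_dispatch        # started manually or from another Action

jobs:
  azure-ci:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Log in to Azure
        uses: azure/login@v2
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}   # the fork's own secret
      - name: Create worker, run tests, collect reports, tear down
        run: |
          # create infrastructure (IaC), run unit/perf tests,
          # fetch the reports, then destroy the worker - sketched only
          ./ci/azure-run.sh    # hypothetical script
```

Because the secret lives in the fork, each contributor's fork would bill its own Azure subscription, which matches the "run on the forked repo" idea above.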

Do you still have a large Azure allocation?

Projects
Status: In Progress
Development

No branches or pull requests

4 participants