
Support vector ANN search benchmarking #3094

Draft: HugoWenTD wants to merge 1 commit into bb-11.4-vec-vicentiu from bb-11.4-vec-ann-benchmark

Conversation

HugoWenTD
Contributor

Description

Introduce scripts and a Dockerfile for running the ann-benchmarks tool, dedicated to vector search performance testing.

  • Allow developers to run the benchmark in their development environment, either against existing MariaDB builds or by building the source code and executing the benchmark within Docker.

  • Integrate these builds into GitLab CI for Ubuntu 22.04 and add an ANN benchmarking test job.

For detailed usage instructions, refer to the commit message and each script's help output.
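
For reference, a minimal sketch of invoking the two entry points from the root of a source checkout, matching the test runs shown below; run-local.sh assumes the server has already been built (e.g. under ./builddir), while run-docker.sh builds it inside the container:

# Benchmark an existing local build:
./support-files/ann-benchmark/run-local.sh

# Build the server and run the benchmark inside a Docker container:
./support-files/ann-benchmark/run-docker.sh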

How can this PR be tested?

The scripts were tested manually. They are also integrated into the GitLab CI pipeline.

Basing the PR against the correct MariaDB version

  • This is a new feature, and the PR is based on the latest MariaDB development branch.

Backward compatibility

The changes are fully backward compatible.

Copyright

All new code of the whole pull request, including one or several files that are either new files or modified ones, is contributed under the BSD-new license. I am contributing on behalf of my employer, Amazon Web Services, Inc.

@CLAassistant

CLAassistant commented Mar 1, 2024

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@HugoWenTD
Contributor Author

HugoWenTD commented Mar 1, 2024

Test results

Example for a local run with ./support-files/ann-benchmark/run-local.sh:

wenhug@ud83c070d9ea75a:~/workspace/server$ ./support-files/ann-benchmark/run-local.sh
Downloading ann-benchmark...

Cloning into '/home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/ann-benchmarks'...
remote: Enumerating objects: 237, done.
remote: Counting objects: 100% (237/237), done.
remote: Compressing objects: 100% (214/214), done.
remote: Total 237 (delta 23), reused 152 (delta 18), pack-reused 0
Receiving objects: 100% (237/237), 1.60 MiB | 9.34 MiB/s, done.
Resolving deltas: 100% (23/23), done.
Installing ann-benchmark dependencies...

Starting ann-benchmark...

downloading https://ann-benchmarks.com/random-xs-20-euclidean.hdf5 -> data/random-xs-20-euclidean.hdf5...
Cannot download https://ann-benchmarks.com/random-xs-20-euclidean.hdf5
Creating dataset locally
Splitting 10000*None into train/test
train size: 9000 * 20
test size:  1000 * 20
0/1000...
2024-03-18 11:17:30,522 - annb - INFO - running only mariadb
2024-03-18 11:17:30,526 - annb - INFO - Order: [Definition(algorithm='mariadb', constructor='MariaDB', module='ann_benchmarks.algorithms.mariadb', docker_tag='ann-benchmarks-mariadb', arguments=['euclidean', {'M': 24, 'efConstruction': 200}], query_argument_groups=[[10], [20], [40], [80], [120], [200], [400], [800]], disabled=False), Definition(algorithm='mariadb', constructor='MariaDB', module='ann_benchmarks.algorithms.mariadb', docker_tag='ann-benchmarks-mariadb', arguments=['euclidean', {'M': 16, 'efConstruction': 200}], query_argument_groups=[[10], [20], [40], [80], [120], [200], [400], [800]], disabled=False)]
Trying to instantiate ann_benchmarks.algorithms.mariadb.MariaDB(['euclidean', {'M': 24, 'efConstruction': 200}])

Setup paths:
MARIADB_ROOT_DIR: /home/ANT.AMAZON.COM/wenhug/workspace/server/builddir
DATA_DIR: /home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/data
LOG_FILE: /home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/mariadb.err
SOCKET_FILE: /tmp/mysql_4gl2e5ms.sock


Initialize MariaDB database...
/home/ANT.AMAZON.COM/wenhug/workspace/server/builddir/*/mariadb-install-db --no-defaults --verbose --skip-name-resolve --skip-test-db --datadir=/home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/data --srcdir=/home/ANT.AMAZON.COM/wenhug/workspace/server/support-files/ann-benchmark/../..
mysql.user table already exists!
Run mariadb-upgrade, not mariadb-install-db

Starting MariaDB server...
/home/ANT.AMAZON.COM/wenhug/workspace/server/builddir/*/mariadbd --no-defaults --datadir=/home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/data --log_error=/home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/mariadb.err --socket=/tmp/mysql_4gl2e5ms.sock --skip_networking --skip_grant_tables  &

MariaDB server started!
Got a train set of size (9000 * 20)
Got 1000 queries

Preparing database and table...

Inserting data...

Insert time for 180000 records: 0.4894428253173828

Creating index...

Index creation time: 9.5367431640625e-07
Built index in 0.5406086444854736
Index size:  128.0
Running query argument group 1 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 2 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 3 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 4 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 5 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 6 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 7 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 8 of 8...
Run 1/1...
Processed 1000/1000 queries...
Trying to instantiate ann_benchmarks.algorithms.mariadb.MariaDB(['euclidean', {'M': 16, 'efConstruction': 200}])

Setup paths:
MARIADB_ROOT_DIR: /home/ANT.AMAZON.COM/wenhug/workspace/server/builddir
DATA_DIR: /home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/data
LOG_FILE: /home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/mariadb.err
SOCKET_FILE: /tmp/mysql_q1gbgaf3.sock


Initialize MariaDB database...
/home/ANT.AMAZON.COM/wenhug/workspace/server/builddir/*/mariadb-install-db --no-defaults --verbose --skip-name-resolve --skip-test-db --datadir=/home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/data --srcdir=/home/ANT.AMAZON.COM/wenhug/workspace/server/support-files/ann-benchmark/../..
mysql.user table already exists!
Run mariadb-upgrade, not mariadb-install-db

Starting MariaDB server...
/home/ANT.AMAZON.COM/wenhug/workspace/server/builddir/*/mariadbd --no-defaults --datadir=/home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/data --log_error=/home/ANT.AMAZON.COM/wenhug/workspace/server/ann-workspace/mariadb-workspace/mariadb.err --socket=/tmp/mysql_q1gbgaf3.sock --skip_networking --skip_grant_tables  &

MariaDB server started!
Got a train set of size (9000 * 20)
Got 1000 queries

Preparing database and table...

Inserting data...

Insert time for 180000 records: 0.4275703430175781

Creating index...

Index creation time: 1.1920928955078125e-06
Built index in 0.48961424827575684
Index size:  0.0
Running query argument group 1 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 2 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 3 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 4 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 5 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 6 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 7 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 8 of 8...
Run 1/1...
Processed 1000/1000 queries...
2024-03-18 11:17:57,147 - annb - INFO - Terminating 1 workers

Ann-benchmark exporting data...

Looking at dataset deep-image-96-angular
Looking at dataset fashion-mnist-784-euclidean
Looking at dataset gist-960-euclidean
Looking at dataset glove-25-angular
Looking at dataset glove-50-angular
Looking at dataset glove-100-angular
Looking at dataset glove-200-angular
Looking at dataset mnist-784-euclidean
Looking at dataset random-xs-20-euclidean
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Looking at dataset random-s-100-euclidean
Looking at dataset random-xs-20-angular
Looking at dataset random-s-100-angular
Looking at dataset random-xs-16-hamming
Looking at dataset random-s-128-hamming
Looking at dataset random-l-256-hamming
Looking at dataset random-s-jaccard
Looking at dataset random-l-jaccard
Looking at dataset sift-128-euclidean
Looking at dataset nytimes-256-angular
Looking at dataset nytimes-16-angular
Looking at dataset word2bits-800-hamming
Looking at dataset lastfm-64-dot
Looking at dataset sift-256-hamming
Looking at dataset kosarak-jaccard
Looking at dataset movielens1m-jaccard
Looking at dataset movielens10m-jaccard
Looking at dataset movielens20m-jaccard
Looking at dataset dbpedia-openai-100k-angular
Looking at dataset dbpedia-openai-200k-angular
Looking at dataset dbpedia-openai-300k-angular
Looking at dataset dbpedia-openai-400k-angular
Looking at dataset dbpedia-openai-500k-angular
Looking at dataset dbpedia-openai-600k-angular
Looking at dataset dbpedia-openai-700k-angular
Looking at dataset dbpedia-openai-800k-angular
Looking at dataset dbpedia-openai-900k-angular
Looking at dataset dbpedia-openai-1000k-angular

Ann-benchmark plotting...

writing output to results/random-xs-20-euclidean.png
Found cached result
  0:                                 MariaDB(m=16, ef_construction=200, ef_search=40)        1.000     1007.832
Found cached result
  1:                                MariaDB(m=24, ef_construction=200, ef_search=400)        1.000      941.649
Found cached result
  2:                                 MariaDB(m=24, ef_construction=200, ef_search=10)        1.000     1140.663
Found cached result
  3:                                 MariaDB(m=24, ef_construction=200, ef_search=20)        1.000      988.373
Found cached result
  4:                                MariaDB(m=24, ef_construction=200, ef_search=120)        1.000     1091.114
Found cached result
  5:                                MariaDB(m=16, ef_construction=200, ef_search=120)        1.000      998.908
Found cached result
  6:                                 MariaDB(m=16, ef_construction=200, ef_search=20)        1.000     1021.691
Found cached result
  7:                                 MariaDB(m=24, ef_construction=200, ef_search=40)        1.000      823.179
Found cached result
  8:                                MariaDB(m=16, ef_construction=200, ef_search=400)        1.000     1079.078
Found cached result
  9:                                MariaDB(m=24, ef_construction=200, ef_search=800)        1.000     1218.009
Found cached result
 10:                                MariaDB(m=24, ef_construction=200, ef_search=200)        1.000      870.886
Found cached result
 11:                                MariaDB(m=16, ef_construction=200, ef_search=800)        1.000     1058.689
Found cached result
 12:                                 MariaDB(m=24, ef_construction=200, ef_search=80)        1.000      851.237
Found cached result
 13:                                MariaDB(m=16, ef_construction=200, ef_search=200)        1.000      930.801
Found cached result
 14:                                 MariaDB(m=16, ef_construction=200, ef_search=80)        1.000     1208.318
Found cached result
 15:                                 MariaDB(m=16, ef_construction=200, ef_search=10)        1.000      913.258

Ann-benchmark plot done. The last two columns in the output above are 'recall rate' and 'QPS'. ^^^ 


[COMPLETED]


Example for a Docker run with ./support-files/ann-benchmark/run-docker.sh (when doing an incremental build):

wenhug@ud83c070d9ea75a:~/workspace/server$ ./support-files/ann-benchmark/run-docker.sh
Docker image found.
-- Running cmake version 3.22.1
-- MariaDB 11.4.0
-- Updating submodules
-- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE) 
== Configuring MariaDB Connector/C
-- SYSTEM_LIBS: /usr/lib/x86_64-linux-gnu/libz.so;dl;m;dl;m;/usr/lib/x86_64-linux-gnu/libssl.so;/usr/lib/x86_64-linux-gnu/libcrypto.so;/usr/lib/x86_64-linux-gnu/libz.so
-- Configuring OQGraph
-- Configuring done
-- Generating done
-- Build files have been written to: /build/ann-workspace/builddir
[13/13] Linking CXX executable extra/mariabackup/mariadb-backup
Downloading ann-benchmark...

[WARN] ann-benchmarks repository already exists. Skipping cloning. Remove /build/server/ann-workspace/ann-benchmarks if you want it to be re-initialized.

Installing ann-benchmark dependencies...

WARNING: The directory '/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Starting ann-benchmark...

2024-03-18 18:18:54,384 - annb - INFO - running only mariadb
2024-03-18 18:18:54,393 - annb - INFO - Order: [Definition(algorithm='mariadb', constructor='MariaDB', module='ann_benchmarks.algorithms.mariadb', docker_tag='ann-benchmarks-mariadb', arguments=['euclidean', {'M': 16, 'efConstruction': 200}], query_argument_groups=[[10], [20], [40], [80], [120], [200], [400], [800]], disabled=False), Definition(algorithm='mariadb', constructor='MariaDB', module='ann_benchmarks.algorithms.mariadb', docker_tag='ann-benchmarks-mariadb', arguments=['euclidean', {'M': 24, 'efConstruction': 200}], query_argument_groups=[[10], [20], [40], [80], [120], [200], [400], [800]], disabled=False)]
Trying to instantiate ann_benchmarks.algorithms.mariadb.MariaDB(['euclidean', {'M': 16, 'efConstruction': 200}])

Setup paths:
MARIADB_ROOT_DIR: /build/ann-workspace/builddir
DATA_DIR: /build/server/ann-workspace/mariadb-workspace/data
LOG_FILE: /build/server/ann-workspace/mariadb-workspace/mariadb.err
SOCKET_FILE: /tmp/mysql_4yk6c666.sock

Could not get current user, could be docker user mapping. Ignore.

Initialize MariaDB database...
/build/ann-workspace/builddir/*/mariadb-install-db --no-defaults --verbose --skip-name-resolve --skip-test-db --datadir=/build/server/ann-workspace/mariadb-workspace/data --srcdir=/build/server/support-files/ann-benchmark/../..
mysql.user table already exists!
Run mariadb-upgrade, not mariadb-install-db

Starting MariaDB server...
/build/ann-workspace/builddir/*/mariadbd --no-defaults --datadir=/build/server/ann-workspace/mariadb-workspace/data --log_error=/build/server/ann-workspace/mariadb-workspace/mariadb.err --socket=/tmp/mysql_4yk6c666.sock --skip_networking --skip_grant_tables  &

MariaDB server started!
Got a train set of size (9000 * 20)
Got 1000 queries

Preparing database and table...

Inserting data...

Insert time for 180000 records: 0.43891072273254395

Creating index...

Index creation time: 1.1920928955078125e-06
Built index in 0.4922800064086914
Index size:  128.0
Running query argument group 1 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 2 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 3 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 4 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 5 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 6 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 7 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 8 of 8...
Run 1/1...
Processed 1000/1000 queries...
Trying to instantiate ann_benchmarks.algorithms.mariadb.MariaDB(['euclidean', {'M': 24, 'efConstruction': 200}])

Setup paths:
MARIADB_ROOT_DIR: /build/ann-workspace/builddir
DATA_DIR: /build/server/ann-workspace/mariadb-workspace/data
LOG_FILE: /build/server/ann-workspace/mariadb-workspace/mariadb.err
SOCKET_FILE: /tmp/mysql_renlus59.sock

Could not get current user, could be docker user mapping. Ignore.

Initialize MariaDB database...
/build/ann-workspace/builddir/*/mariadb-install-db --no-defaults --verbose --skip-name-resolve --skip-test-db --datadir=/build/server/ann-workspace/mariadb-workspace/data --srcdir=/build/server/support-files/ann-benchmark/../..
mysql.user table already exists!
Run mariadb-upgrade, not mariadb-install-db

Starting MariaDB server...
/build/ann-workspace/builddir/*/mariadbd --no-defaults --datadir=/build/server/ann-workspace/mariadb-workspace/data --log_error=/build/server/ann-workspace/mariadb-workspace/mariadb.err --socket=/tmp/mysql_renlus59.sock --skip_networking --skip_grant_tables  &

MariaDB server started!
Got a train set of size (9000 * 20)
Got 1000 queries

Preparing database and table...

Inserting data...

Insert time for 180000 records: 0.3983802795410156

Creating index...

Index creation time: 1.1920928955078125e-06
Built index in 0.4507639408111572
Index size:  0.0
Running query argument group 1 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 2 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 3 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 4 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 5 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 6 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 7 of 8...
Run 1/1...
Processed 1000/1000 queries...
Running query argument group 8 of 8...
Run 1/1...
Processed 1000/1000 queries...
2024-03-18 18:19:22,024 - annb - INFO - Terminating 1 workers

Ann-benchmark exporting data...

Looking at dataset deep-image-96-angular
Looking at dataset fashion-mnist-784-euclidean
Looking at dataset gist-960-euclidean
Looking at dataset glove-25-angular
Looking at dataset glove-50-angular
Looking at dataset glove-100-angular
Looking at dataset glove-200-angular
Looking at dataset mnist-784-euclidean
Looking at dataset random-xs-20-euclidean
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Computing knn metrics
Computing epsilon metrics
Computing epsilon metrics
Computing rel metrics
Looking at dataset random-s-100-euclidean
Looking at dataset random-xs-20-angular
Looking at dataset random-s-100-angular
Looking at dataset random-xs-16-hamming
Looking at dataset random-s-128-hamming
Looking at dataset random-l-256-hamming
Looking at dataset random-s-jaccard
Looking at dataset random-l-jaccard
Looking at dataset sift-128-euclidean
Looking at dataset nytimes-256-angular
Looking at dataset nytimes-16-angular
Looking at dataset word2bits-800-hamming
Looking at dataset lastfm-64-dot
Looking at dataset sift-256-hamming
Looking at dataset kosarak-jaccard
Looking at dataset movielens1m-jaccard
Looking at dataset movielens10m-jaccard
Looking at dataset movielens20m-jaccard
Looking at dataset dbpedia-openai-100k-angular
Looking at dataset dbpedia-openai-200k-angular
Looking at dataset dbpedia-openai-300k-angular
Looking at dataset dbpedia-openai-400k-angular
Looking at dataset dbpedia-openai-500k-angular
Looking at dataset dbpedia-openai-600k-angular
Looking at dataset dbpedia-openai-700k-angular
Looking at dataset dbpedia-openai-800k-angular
Looking at dataset dbpedia-openai-900k-angular
Looking at dataset dbpedia-openai-1000k-angular

Ann-benchmark plotting...

Matplotlib created a temporary config/cache directory at /tmp/matplotlib-tuav14cy because the default path (/.config/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
writing output to results/random-xs-20-euclidean.png
Found cached result
  0:                                 MariaDB(m=16, ef_construction=200, ef_search=40)        1.000      841.485
Found cached result
  1:                                MariaDB(m=24, ef_construction=200, ef_search=400)        1.000      896.407
Found cached result
  2:                                 MariaDB(m=24, ef_construction=200, ef_search=10)        1.000      827.326
Found cached result
  3:                                 MariaDB(m=24, ef_construction=200, ef_search=20)        1.000      875.636
Found cached result
  4:                                MariaDB(m=24, ef_construction=200, ef_search=120)        1.000      877.246
Found cached result
  5:                                MariaDB(m=16, ef_construction=200, ef_search=120)        1.000      843.912
Found cached result
  6:                                 MariaDB(m=16, ef_construction=200, ef_search=20)        1.000      844.746
Found cached result
  7:                                 MariaDB(m=24, ef_construction=200, ef_search=40)        1.000     1006.725
Found cached result
  8:                                MariaDB(m=16, ef_construction=200, ef_search=400)        1.000     1143.344
Found cached result
  9:                                MariaDB(m=24, ef_construction=200, ef_search=800)        1.000      769.048
Found cached result
 10:                                MariaDB(m=24, ef_construction=200, ef_search=200)        1.000     1011.292
Found cached result
 11:                                MariaDB(m=16, ef_construction=200, ef_search=800)        1.000      938.419
Found cached result
 12:                                 MariaDB(m=24, ef_construction=200, ef_search=80)        1.000      972.378
Found cached result
 13:                                MariaDB(m=16, ef_construction=200, ef_search=200)        1.000      839.023
Found cached result
 14:                                 MariaDB(m=16, ef_construction=200, ef_search=80)        1.000      798.808
Found cached result
 15:                                 MariaDB(m=16, ef_construction=200, ef_search=10)        1.000      912.495

Ann-benchmark plot done. The last two columns in the output above are 'recall rate' and 'QPS'. ^^^ 


[COMPLETED]


The new GitLab CI job passed.

Ignore the other failed jobs, as the development branch does not build for some plugins:

[Screenshot: GitLab CI pipeline showing the new ann-benchmark job passing]

@HugoWenTD force-pushed the bb-11.4-vec-ann-benchmark branch 2 times, most recently from ac8d6d9 to 7a3507a on March 8, 2024 05:09
@HugoWenTD force-pushed the bb-11.4-vec-ann-benchmark branch 2 times, most recently from 38fba8a to 663c971 on March 19, 2024 20:50
@HugoWenTD changed the base branch from bb-11.4-vec to bb-11.4-vec-vicentiu on March 26, 2024 22:43
@HugoWenTD force-pushed the bb-11.4-vec-ann-benchmark branch 2 times, most recently from 746db91 to 205a394 on April 9, 2024 18:01
Introduce scripts and Dockerfile for executing the `ann-benchmarks` tool,
aimed at vector search performance testing. Support running the ANN
benchmarking both in GitLab CI and manually.

Developer Interface:

Both of the scripts provide flexibility for altering default behavior via
environment variables. Refer to the detailed description in the scripts'
documentation section.

- `run-local.sh`:
  This script facilitates the execution of the ANN (Approximate Nearest
  Neighbors) benchmarking test either against local builds or a specified
  folder where the MariaDB server is installed.

- `run-docker.sh`:
  This script automates the execution of the ANN benchmarking test within
  a Docker container.
  It builds the required Docker image if it doesn't exist or if forced,
  then builds the source code and runs the benchmark in the specified
  workspace.

GitLab CI Build:

- A new job `ann-benchmark` is included in the test stage. This job runs
  ann-benchmark against the MariaDB server built in Ubuntu 22.04.
  Initially, we are using the `random-xs-20-euclidean` dataset with 20
  dimensions and 10000 records.
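
As a usage note, here is a hypothetical sketch of overriding the scripts' defaults via environment variables. The variable names below mirror the "Setup paths" printed in the logs above (MARIADB_ROOT_DIR, DATA_DIR) and are assumptions; the authoritative list is in the scripts' documentation section and help output.

# Hypothetical override; variable names assumed from the log's "Setup paths":
MARIADB_ROOT_DIR=$HOME/workspace/server/builddir \
DATA_DIR=/tmp/ann-workspace/data \
./support-files/ann-benchmark/run-local.sh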