GitHub - nghenglim/database_benchmark: benchmark data + benchmark script to benchmark well known databases

Introduction

This benchmark script is intended as a starting point for a decentralized database benchmark tool. It is not yet feature-complete, but it can be improved incrementally.

The repository structure is still likely to change a lot. Feel free to criticize it openly or use it for your own benchmarks.

Advised benchmark procedure

Record the benchmark environment details in {folder}/readme.md
Run the benchmark in Docker containers, with only one container running at a time
Install Anaconda with the MySQLdb and psycopg2 packages to run the benchmark remotely

How to Benchmark

Use a similar table structure for each database: see {db}.sql
Run python word_list_generate.py to generate a new SQL insert script
Run python benchmark.py to start the benchmark
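
As an illustration of the generation step, here is a minimal sketch of turning a word list into a SQL insert script. It is not the actual contents of word_list_generate.py: the table name `word_list`, the column name `word`, and the helper names are assumptions made for this example.

```python
def sql_quote(value: str) -> str:
    """Quote a string as a SQL literal by doubling single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_insert_script(words, table="word_list", column="word", batch_size=1000):
    """Return multi-row INSERT statements, one statement per batch of words.

    The table and column names are hypothetical placeholders.
    """
    statements = []
    for start in range(0, len(words), batch_size):
        batch = words[start:start + batch_size]
        values = ",\n  ".join(f"({sql_quote(w)})" for w in batch)
        statements.append(f"INSERT INTO {table} ({column}) VALUES\n  {values};")
    return "\n".join(statements) + "\n"


if __name__ == "__main__":
    # Small demonstration with an inline word list instead of word_list.txt.
    sample = ["apple", "o'clock", "zebra"]
    print(build_insert_script(sample, batch_size=2))
```

Doubling single quotes is accepted by both MySQL/MariaDB and Postgres. Words containing backslashes would need extra handling on MySQL/MariaDB, where a backslash is an escape character in string literals by default.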

Currently, the hardcoded values have to be edited manually to run a benchmark.
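
One possible way to avoid editing hardcoded values is to read them from the command line. The sketch below uses the standard argparse module. The option names (`--db`, `--host`, `--port`, `--rounds`) are hypothetical and do not exist in benchmark.py today; the default ports are the standard ones for each server.

```python
import argparse

# Standard default ports for each server.
DEFAULT_PORTS = {"mysql": 3306, "mariadb": 3306, "postgres": 5432}


def parse_args(argv=None):
    """Parse benchmark options; all option names here are illustrative."""
    parser = argparse.ArgumentParser(description="Run the database benchmark.")
    parser.add_argument("--db", choices=sorted(DEFAULT_PORTS), default="mysql",
                        help="database to benchmark")
    parser.add_argument("--host", default="127.0.0.1",
                        help="host of the database container")
    parser.add_argument("--port", type=int, default=None,
                        help="port (defaults to the standard port for --db)")
    parser.add_argument("--rounds", type=int, default=3,
                        help="how many times to repeat each query")
    args = parser.parse_args(argv)
    if args.port is None:
        args.port = DEFAULT_PORTS[args.db]
    return args


if __name__ == "__main__":
    # Explicit argument list for demonstration; pass None to read sys.argv.
    print(parse_args(["--db", "postgres", "--rounds", "5"]))
```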

To Do

Remove the need to comment and uncomment benchmark.init(): detect whether the database already exists, then either skip creation or drop and recreate it
Optionally pass argv to choose which database to benchmark: mysql/mariadb/postgres
Benchmark more features: indexes, joins, disk usage
Exclude transfer time from the benchmark via a parameter; this also requires making sure each database's raw execution time is measured the same way internally
Add a script or other method to give all three databases the same or a similar configuration; running it should be optional
Benchmark more databases, preferably NoSQL databases
Decouple the read queries (q1 - q6) into a separate file; the write queries and database init queries should be configurable too
Auto-populate the benchmark summary with the queries or query version used and the script version or commit
Combine the summaries of all databases and generate an HTML result with charts (preferably with JS)
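
For the transfer-time item, a rough client-side starting point is to time `execute()` and `fetchall()` separately on a DB-API cursor. The sketch below is an assumption about how this could look, not code from benchmark.py. It uses the standard-library sqlite3 module purely as a stand-in so that it runs without a database server, and the table name `word_list` is hypothetical.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Return (execute_seconds, fetch_seconds, row_count) for one query.

    Works with any DB-API 2.0 connection. Note that placeholder syntax
    differs between drivers (sqlite3 uses ?, MySQLdb and psycopg2 use %s).
    """
    cur = conn.cursor()
    t0 = time.perf_counter()
    cur.execute(sql, params)
    t1 = time.perf_counter()
    rows = cur.fetchall()
    t2 = time.perf_counter()
    cur.close()
    return t1 - t0, t2 - t1, len(rows)


if __name__ == "__main__":
    # sqlite3 in-memory database used only as a stand-in for demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE word_list (word TEXT)")
    conn.executemany("INSERT INTO word_list VALUES (?)",
                     [("apple",), ("banana",), ("cherry",)])
    execute_s, fetch_s, count = time_query(conn, "SELECT word FROM word_list")
    print(f"execute={execute_s:.6f}s fetch={fetch_s:.6f}s rows={count}")
```

This split is only approximate. Drivers differ in when rows are actually transferred, and some buffer the full result set during `execute()`, so the two numbers are not directly comparable across databases. That is why the To Do item also calls for measuring raw execution time the same way inside each database.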

FAQ

Why use the default database configuration, even though it differs between databases?
- This benchmark uses the default configuration that comes with each Docker image.
- Standardizing the configuration across databases is time-consuming.
- The default configuration should give a basic idea of how each database performs.
Why are index benchmarks missing, and why is data transfer time included, etc.?
- These are on the To Do list; however, IMHO they are good-to-have benchmarks.
Why run in a VM, which makes this benchmark less reliable?
- I do not want to spend money on a physical server or cloud instances.
- To keep the benchmark decentralized: everyone with a laptop should be able to run their own benchmark instead of relying on others.
- A rough idea of how these databases perform is important knowledge for every developer.

License

This database_benchmark script is open-source software licensed under the MIT license.

Folders and files

Mariadb10.1.11-MySQL5.7.10-Postgres9.5.0-take2
Mariadb10.1.11-MySQL5.7.10-Postgres9.5.0
benchmark
sql
.editorconfig
README.md
benchmark.py
generate_sql.py
mysql.sql
postgres.sql
requirements.txt
word_list.txt
word_list_generate.py
