TrainDB

TrainDB is an ML-based approximate query processing engine that aims to answer time-consuming analytical queries in a few seconds. TrainDB will provide a SQL-like query interface and support various DBMS data sources.

We are currently implementing a proof-of-concept prototype.

Requirements

  • Java 11+
  • Maven 3.x
  • Git, Subversion
  • DBMS (e.g., MySQL, SQLite3)
  • Python 3.x
  • Python Package Manager (e.g. Anaconda)
  • SDGym
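
Before building, it can help to confirm that the command-line tools listed above are on your PATH. The following is a minimal sketch, not part of TrainDB. The executable names (java, mvn, git, svn, python3) are assumptions about a typical installation, and the script does not check versions, the DBMS, or SDGym.

```python
import shutil

# Executable names are assumptions about a typical installation;
# adjust them for your platform (e.g. "python" instead of "python3").
REQUIRED_TOOLS = ["java", "mvn", "git", "svn", "python3"]


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required command-line tools were found.")
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use the default `shutil.which` is sufficient.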

Install

Download

$ git clone https://github.com/traindb-project/traindb.git

Build

$ cd traindb
$ mvn package

After the build completes, traindb-x.y.z-SNAPSHOT.tar.gz is in the traindb-assembly/target directory. Extract it with:

$ tar xvfz traindb-assembly/target/traindb-x.y.z-SNAPSHOT.tar.gz
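
Because x.y.z stands for whatever version the build produced, a script cannot hard-code the archive name. The sketch below shows one way to locate and extract the archive from Python. The function names are hypothetical helpers, not part of TrainDB, and the glob pattern assumes the file naming shown above.

```python
import glob
import os
import tarfile


def find_snapshot_archive(target_dir="traindb-assembly/target"):
    """Return the path of the single traindb-*-SNAPSHOT.tar.gz in target_dir."""
    pattern = os.path.join(target_dir, "traindb-*-SNAPSHOT.tar.gz")
    matches = sorted(glob.glob(pattern))
    if len(matches) != 1:
        raise FileNotFoundError(
            f"expected exactly one archive matching {pattern}, found {len(matches)}"
        )
    return matches[0]


def extract_archive(archive_path, dest_dir="."):
    """Extract a gzip-compressed tar archive into dest_dir, like `tar xvfz`."""
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
```

Calling `extract_archive(find_snapshot_archive())` from the traindb directory is intended to match the tar command above. The helper raises an error if zero or several archives match, so a stale build is not picked up silently.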

To use ML models, you need to check out the models.
For Python environment setup, see the README in our traindb-model repository.

$ cd traindb-assembly/target/traindb-x.y.z-SNAPSHOT
$ svn co https://github.com/traindb-project/traindb-model/trunk/models

Run

Example

You can now execute SQL statements using the command-line interface.
First, place the JDBC driver for your DBMS in a directory that is included in CLASSPATH.

$ cd traindb-assembly/target/traindb-x.y.z-SNAPSHOT
$ bin/trsql
sqlline> !connect jdbc:traindb:<dbms>:https://<host>
Enter username for jdbc:traindb:<dbms>:https://localhost: <username> 
Enter password for jdbc:traindb:<dbms>:https://localhost: <password>
0: jdbc:traindb:<dbms>:https://<host>>

You can train ML models and run approximate queries as in the following example.

0: jdbc:traindb:<dbms>:https://<host>> CREATE MODELTYPE tablegan FOR SYNOPSIS AS LOCAL CLASS 'TableGAN' IN '$TRAINDB_PREFIX/models/TableGAN.py';
No rows affected (0.255 seconds)
0: jdbc:traindb:<dbms>:https://<host>> TRAIN MODEL tgan MODELTYPE tablegan ON <schema>.<table>(<column 1>, <column 2>, ...);
epoch 1 step 50 tensor(1.1035, grad_fn=<SubBackward0>) tensor(0.7770, grad_fn=<NegBackward>) None
epoch 1 step 100 tensor(0.8791, grad_fn=<SubBackward0>) tensor(0.9682, grad_fn=<NegBackward>) None
...
0: jdbc:traindb:<dbms>:https://<host>> CREATE SYNOPSIS <synopsis> FROM MODEL tgan LIMIT <# of rows to generate>;
...
0: jdbc:traindb:<dbms>:https://<host>> SELECT APPROXIMATE avg(<column>) FROM <schema>.<table>;
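
To give a feel for why this workflow can be fast, the toy sketch below mirrors the three steps (train a model, generate a synopsis, aggregate over the synopsis) in plain Python. It is only an illustration under strong assumptions: a single Gaussian stands in for a real model such as TableGAN, the data and all function names are hypothetical, and nothing here reflects how TrainDB is implemented internally.

```python
import random
import statistics


def train_toy_model(values):
    """Fit a toy 'model': the mean and standard deviation of one column."""
    return statistics.fmean(values), statistics.pstdev(values)


def create_synopsis(model, num_rows, rng):
    """Generate num_rows synthetic values from the fitted toy model."""
    mean, stdev = model
    return [rng.gauss(mean, stdev) for _ in range(num_rows)]


def approximate_avg(synopsis):
    """Answer avg() from the small synopsis instead of the full table."""
    return statistics.fmean(synopsis)


rng = random.Random(0)
# Hypothetical base table column with 100,000 rows.
column = [rng.gauss(50.0, 10.0) for _ in range(100_000)]

model = train_toy_model(column)                # loosely analogous to TRAIN MODEL
synopsis = create_synopsis(model, 1_000, rng)  # loosely analogous to CREATE SYNOPSIS ... LIMIT 1000
exact = statistics.fmean(column)               # exact avg over every row
approx = approximate_avg(synopsis)             # approximate avg over the synopsis only

print(f"exact={exact:.2f} approx={approx:.2f} "
      f"rows scanned: {len(synopsis)} vs {len(column)}")
```

The approximate answer is computed from 1,000 synthetic rows rather than 100,000 real ones. Scanning far fewer rows is the trade-off an approximate query makes, at the cost of a small error that depends on the model quality and the synopsis size.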
