Mongodb, improved security, consistent package installation, better reactjs syntax, reusable unit tests

@jeff1evesque jeff1evesque released this 05 Nov 00:49
· 1755 commits to master since this release
dd1b778

This release encompasses issues pertaining to milestone 0.6.

This release took a significant amount of time, largely because of several major changes. First, the single mariadb database has been split, so that ML related datasets are now stored in mongodb. This streamlines the code, improving performance and reducing complexity. Users can now supply datasets (i.e. a json file) without having them parsed into several dedicated sql database tables. Additionally, anonymous users are limited to a maximum of 50 mongodb collections, while authenticated users are granted 150. Furthermore, across all of a user's collections, anonymous users are allowed 10 documents, and authenticated users 30. These values can be configured through the provided application.yaml, after which the corresponding webserver(s) must be restarted.
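
The limits described above could be checked along the following lines. This is only a minimal sketch, not the application's actual code: the `LIMITS` dictionary stands in for values read from application.yaml, and its key names, as well as the `within_limits` helper, are hypothetical.

```python
# Hypothetical stand-in for limits loaded from application.yaml.
# The key names are assumptions; the numbers come from the release notes.
LIMITS = {
    'anonymous': {'max_collections': 50, 'max_documents': 10},
    'authenticated': {'max_collections': 150, 'max_documents': 30},
}


def within_limits(user_type, collections, documents, limits=LIMITS):
    """Return True if the given counts stay within the configured maximums."""
    limit = limits[user_type]
    return (
        collections <= limit['max_collections']
        and documents <= limit['max_documents']
    )
```

Keeping the numbers in one configurable mapping mirrors the idea that they can be changed without touching the code that enforces them.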

Also, our flask app is now wired up so that a mariadb connection and a mongodb connection are always open and ready for transactions. This is a better solution, since a client accessing the ML application no longer needs to open a new connection each time it performs an operation. This was more of a problem when the database was restricted to a single mariadb, since the corresponding sql transactions used to be very granular. If questions come up regarding the benefits of connection pools (versus a single open connection), we can briefly argue for spinning up a dedicated machine containing another flask instance. However, this application is not yet production grade.
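
The idea of one long-lived connection, opened once and reused by every operation, can be illustrated roughly as follows. This is a sketch under stated assumptions: the standard library's sqlite3 stands in for mariadb and mongodb so the example is self-contained, and the `Database` class is hypothetical rather than taken from the application.

```python
import sqlite3


class Database:
    """Hold a single connection that is opened once and then reused."""

    def __init__(self, path=':memory:'):
        # Opened once, when the application starts (sqlite3 is a stand-in).
        self.connection = sqlite3.connect(path)

    def query(self, sql, params=()):
        """Run a statement on the shared connection and return all rows."""
        return self.connection.execute(sql, params).fetchall()


# Created once and shared, instead of connecting per operation.
db = Database()
```

Every call to `db.query` goes through the same connection, so no operation pays the cost of opening a new one.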

Additionally, major changes have been made to improve many security aspects of the application. For example, the vagrant up build now serves the application over https://, and redis has been implemented in place of the default, traditional cookie implementation. Users can /login through the browser and have their user information stored in redis, while a randomized value, corresponding to their redis key, is returned to them as a cookie. This is better than sending an entire cookie containing all of the user information. Similarly, users can now authenticate through the programmatic-api. Upon a successful post login, flask returns a token, which can be used on successive rest calls to validate the session as belonging to a valid user.
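
The server-side session approach can be sketched as below. This is an illustration only, not the application's implementation: a plain dict stands in for redis, and the `create_session` and `load_session` helpers are hypothetical names.

```python
import json
import secrets

# A plain dict stands in for redis here (an assumption, to stay self-contained).
SESSION_STORE = {}


def create_session(user_info, store=SESSION_STORE):
    """Store the user information server-side and return only a random key."""
    key = secrets.token_urlsafe(32)
    store[key] = json.dumps(user_info)
    return key


def load_session(key, store=SESSION_STORE):
    """Return the user information stored under a key, or None if unknown."""
    raw = store.get(key)
    return json.loads(raw) if raw is not None else None
```

Only the random key would travel to the client as a cookie or token; the user information itself never leaves the server.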

Also, our build process, which enforces the installation of particular packages (across multiple package managers), has been streamlined to work dynamically from the definition of packages.yaml:

    ## iterate 'packages' hash
    $packages.each |String $provider, $providers| {
        if ($provider in ['apt', 'npm', 'pip']) {
            $providers['general'].each |String $package, String $version| {
                package { $package:
                    ensure   => $version,
                    provider => $provider,
                    require  => [
                        Class['apt'],
                        Class['python'],
                        Class['package::nodejs'],
                        Class['package::python_dev']
                    ],
                }
            }
        }
    }
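
To make the shape of the data easier to picture, the same iteration can be expressed in Python. This is a rough sketch, not part of the build: the `PACKAGES` dictionary is a hypothetical picture of what packages.yaml might parse to, with placeholder package names and versions, and `flatten_packages` is an illustrative helper.

```python
# Hypothetical shape of the parsed packages.yaml; names and versions
# are placeholders, not the project's real dependencies.
PACKAGES = {
    'apt': {'general': {'example-apt-package': '1.0.0'}},
    'npm': {'general': {'example-npm-package': '2.0.0'}},
    'pip': {'general': {'example-pip-package': '3.0.0'}},
    'gem': {'general': {'ignored-package': '0.1.0'}},
}

SUPPORTED_PROVIDERS = ('apt', 'npm', 'pip')


def flatten_packages(packages):
    """Return (provider, package, version) tuples for supported providers."""
    resources = []
    for provider, groups in packages.items():
        if provider not in SUPPORTED_PROVIDERS:
            continue  # mirrors the "$provider in [...]" guard
        for package, version in groups['general'].items():
            resources.append((provider, package, version))
    return resources
```

Each resulting tuple corresponds to one `package` resource declared by the loop above, pinned to its version and provider.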

We've also completed many enhancements to the frontend, which are difficult to list formally. In short, we've begun (though not entirely) to use redux heavily between various reactjs components. Also, two new minimal reactjs pages have been created: one allows users to save a generated prediction result through a minimal webform on /session/current-result, and another lists all previously saved results on /session/results. Lastly, we have to give thanks to @Vitao18 for converting every jsx file's createClass, and corresponding constructor, to the native javascript syntax.

Unit testing has dramatically improved in terms of functionality and reusability. We now have a single bash script, unit-tests, which contains all the necessary logic to build a sufficient testing environment before tests are run against it. This allows the script to be used by our travis ci, and also makes it possible to run the tests locally, even in our vagrant up build.

You may wonder what the heck the bgc and bgr datasets are doing in this milestone. To answer that, you'll have to wait until milestone-0.9 is finally merged into the master branch. Many thanks also go out to @protojas for helping expedite our future milestone-0.9 with the ensemble learning models.