databases.rst

File metadata and controls

180 lines (140 loc) · 7.64 KB

Databases

Why do we use "databases"?

We use databases to store data while the tests run. When we talk about databases in general, we are indirectly talking about the following (JSON) subsystems:

  • Autocontinue
  • InactiveDB
  • Mining
  • WhoisDB

Warning

There is a difference between what we are talking about here and the --database argument, which only enables or disables the InactiveDB subsystem.

How do we manage them?

They consist of simple JSON files which are read and updated on the fly.

Warnings around Database (self) management

Warning

If you plan to delete everything and still want to be able to use PyFunceble in the future, please use the --clean-all argument.

It deletes everything related to what we generated, except things like the whois database file/table, which stores (almost) static data that can be reused in the future.

Deleting, for example, the whois database file/table will just make your tests run much longer if you retest subjects that used to be indexed in the whois database file/table.

Databases types

Since PyFunceble 2.0.0 (equivalent of >=1.18.0.dev), we offer multiple database types which are (as per configuration) json (default), mariadb and mysql.

However, we only offer our support time for open-source software, hence Oracle's MySQL is not supported. If it works, great. It should nevertheless be reasonably well covered through SQLAlchemy <https://docs.sqlalchemy.org/>.

Why different database types?

With the introduction of the multiprocessing logic, it became natural to introduce other database formats, as it's a nightmare to update a JSON formatted file from several processes.

In order to write or use a JSON formatted database, we have to load it and overwrite it completely. That's fine while working with a single CPU/process, but as soon as we get out of that scope it becomes unmanageable.
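
The load-and-overwrite cycle described above can be sketched as follows. This is a minimal illustration, not PyFunceble's actual code: the function name `update_json_database`, the file name `inactive_db.json` and the record layout are all hypothetical.

```python
import json
import os
import tempfile


def update_json_database(path, subject, record):
    """Load the whole JSON file, change one entry, and write everything back."""
    # Read the complete file; a missing file is treated as an empty database.
    try:
        with open(path, "r", encoding="utf-8") as file_stream:
            database = json.load(file_stream)
    except FileNotFoundError:
        database = {}

    database[subject] = record

    # Overwrite the complete file, even though only one entry changed.
    with open(path, "w", encoding="utf-8") as file_stream:
        json.dump(database, file_stream, indent=4, sort_keys=True)

    return database


# Usage with a throwaway file (hypothetical file name and record layout).
with tempfile.TemporaryDirectory() as directory:
    db_path = os.path.join(directory, "inactive_db.json")
    update_json_database(db_path, "example.org", {"status": "INACTIVE"})
    result = update_json_database(db_path, "example.com", {"status": "ACTIVE"})
```

If two processes run this cycle at the same time, each one may write back a copy that lacks the other's change, which is the kind of problem a SQL backend avoids.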

How to use the mariadb or mysql format?

  1. Create a new user, password and database (optional) for PyFunceble to work with.

  2. Create a .pyfunceble-env file at the root of your configuration directory.

  3. Complete it with the following content (example)

    PYFUNCEBLE_DB_CHARSET=utf8mb4
    PYFUNCEBLE_DB_HOST=localhost
    PYFUNCEBLE_DB_NAME=PyFunceble
    PYFUNCEBLE_DB_PASSWORD=Hello,World!
    PYFUNCEBLE_DB_PORT=3306
    PYFUNCEBLE_DB_USERNAME=pyfunceble
    

    Note

    Since version 2.4.3.dev, it is possible to set the PYFUNCEBLE_DB_HOST environment variable to the path of a UNIX socket.

    The typical location for mysqld.sock is /var/run/mysqld/mysqld.sock.

    This was done because:

    1. It makes it easier to use the socket in conjunction with a supported CI environment/platform.

    2. It leaves more room on the IP stack for local DB installations.

    3. A UNIX socket is usually faster than an IP connection on local runs.

      PYFUNCEBLE_DB_CHARSET=utf8mb4
      PYFUNCEBLE_DB_HOST=/var/run/mysqld/mysqld.sock
      PYFUNCEBLE_DB_NAME=PyFunceble
      PYFUNCEBLE_DB_PASSWORD=Hello,World!
      PYFUNCEBLE_DB_PORT=3306
      PYFUNCEBLE_DB_USERNAME=pyfunceble
      
  4. Switch the db_type index of your configuration file to mariadb or mysql.

  5. Play with PyFunceble!

Note

If the environment variables are not found, you will be prompted for the missing information.
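
To illustrate how the variables above could map to a database connection, here is a minimal sketch that turns them into a SQLAlchemy-style URL. It is an assumption for illustration only: the helper `build_database_url`, the choice of the `mysql+pymysql` driver and the rule "a host starting with `/` is a UNIX socket" are not taken from PyFunceble itself.

```python
from urllib.parse import quote_plus


def build_database_url(env, driver="mysql+pymysql"):
    """Build a SQLAlchemy-style URL from PYFUNCEBLE_DB_* values (hypothetical helper)."""
    # Credentials are URL-quoted so characters like ',' or '!' stay valid.
    user = quote_plus(env["PYFUNCEBLE_DB_USERNAME"])
    password = quote_plus(env["PYFUNCEBLE_DB_PASSWORD"])
    name = env["PYFUNCEBLE_DB_NAME"]
    charset = env.get("PYFUNCEBLE_DB_CHARSET", "utf8mb4")
    host = env["PYFUNCEBLE_DB_HOST"]

    if host.startswith("/"):
        # Assumption: a leading slash means the host is a UNIX socket path,
        # which is handed to the driver through the unix_socket query argument.
        return (
            f"{driver}://{user}:{password}@localhost/{name}"
            f"?unix_socket={host}&charset={charset}"
        )

    port = env.get("PYFUNCEBLE_DB_PORT", "3306")
    return f"{driver}://{user}:{password}@{host}:{port}/{name}?charset={charset}"


# The two example configurations from the steps above.
base = {
    "PYFUNCEBLE_DB_CHARSET": "utf8mb4",
    "PYFUNCEBLE_DB_NAME": "PyFunceble",
    "PYFUNCEBLE_DB_PASSWORD": "Hello,World!",
    "PYFUNCEBLE_DB_PORT": "3306",
    "PYFUNCEBLE_DB_USERNAME": "pyfunceble",
}
tcp_url = build_database_url({**base, "PYFUNCEBLE_DB_HOST": "localhost"})
socket_url = build_database_url(
    {**base, "PYFUNCEBLE_DB_HOST": "/var/run/mysqld/mysqld.sock"}
)
```

In a real setup the mapping would typically be `os.environ` after the .pyfunceble-env file has been loaded. A plain dictionary is used here to keep the sketch self-contained.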

SQL Layout:

DRAFT:

The layout of, and the data within, the SQL database, and how they are used, should currently follow this pattern.

alembic_version
  • version_num The current version number (as tracked by Alembic, this is the identifier of the schema migration revision currently applied)
pyfunceble_file
  • id Primary key, auto_increment
  • created Creation date of the record
  • modified Date the record was last tested (altered)
  • path Source of the file tested: URI or file path
  • test_completed (bool) Used for picking up an interrupted (broken) test, or in CI for auto-continue (-c)
pyfunceble_mined
  • id Primary key, auto_increment
  • created Creation date of the record
  • modified Date the record was last tested or altered
  • subject_id Key reference to pyfunceble_status.id
  • file_id Key reference to pyfunceble_file.id
  • mined The full FQDN results of a --mining response
pyfunceble_status
  • id Primary key, auto_increment
  • created Creation date of the record
  • modified Date the record was last tested or altered
  • file_id (one-to-many relation) Reference to pyfunceble_file.id. This is used to find out where a record comes from.
  • tested The actual record tested, in full (domain/URI)
  • _status ACTIVE/INACTIVE status from the PyFunceble test (Twice??)
  • status ACTIVE/INACTIVE status from the PyFunceble test (Twice??)
  • _status_source The technique used to determine the status: WHOIS/DNSLOOKUP (Twice??)
  • status_source The technique used to determine the status: WHOIS/DNSLOOKUP (Twice??)
  • domain_syntax_validation (*INT???) Would expect a bool (true/false). Open question: does this mean that a --syntax test was performed, or is it the result of the syntax test (0 = failed, 1 = passed)?
  • expiration_date Domain expiration date from a successful WHOIS response (shouldn't it be served through the whois table???)
  • http_status_code The HTTP code from a lookup, for example 200 = success, 404 = file not found (suggested to be moved to a new table for reusable data, see <https://www.mypdns.org/T1250#19039>)
  • ipv4_range_syntax_validation (*INT???) Would expect a bool (true/false)
  • ipv4_syntax_validation (*INT???) Would expect a bool (true/false)
  • ipv6_range_syntax_validation (*INT???) Would expect a bool (true/false)
  • ipv6_syntax_validation (*INT???) Would expect a bool (true/false)
  • subdomain_syntax_validation ?? From the current data set I would again expect a (bool) and not (*INT), as it is 0 or 1
  • url_syntax_validation (*INT???) Would expect a bool (true/false). Open question: does this mean that a --syntax test was performed, or is it the result of the syntax test (0 = failed, 1 = passed)?
  • is_complement Is this record from a --complement test? (*INT???) Would expect a bool (true/false)
  • test_completed (*INT???) Would expect a bool (true/false). Has this record been tested since the last commit for the test?
  • tested_at The date of the last successful test.
pyfunceble_whois_record
  • id Primary key, auto_increment
  • created Creation date of the record
  • modified Date the record was last tested or altered
  • subject The domain for which this record is stored
  • expiration_date The domain expiration date according to the WHOIS record
  • epoch The domain expiration date according to the WHOIS record, in epoch format
  • state The domain state (future/past), based on expiration_date and/or epoch
  • record [NULL]?? Would expect a key reference to pyfunceble_status.id, as a replacement for pyfunceble_whois_record.subject
  • server The WHOIS server holding the WHOIS data (should be moved to a separate table/(DB) for reusable data, to benefit from db.cache and minimize I/O and DB size)
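
As a rough illustration of how pyfunceble_file and pyfunceble_status relate, the sketch below builds a reduced version of both tables and looks up the records of a file that still need testing. It rests on several assumptions: SQLite (in memory) stands in for MariaDB/MySQL, only a subset of the drafted columns is included, and the column types, the file name `hosts.list` and the query are guesses rather than the schema PyFunceble actually creates.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MariaDB/MySQL (assumption).
connection = sqlite3.connect(":memory:")
connection.execute("PRAGMA foreign_keys = ON")

# Reduced subset of the drafted layout; the column types are guesses.
connection.executescript(
    """
    CREATE TABLE pyfunceble_file (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        modified TEXT,
        path TEXT NOT NULL,
        test_completed INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE pyfunceble_status (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        created TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        modified TEXT,
        file_id INTEGER NOT NULL REFERENCES pyfunceble_file (id),
        tested TEXT NOT NULL,
        status TEXT,
        status_source TEXT,
        test_completed INTEGER NOT NULL DEFAULT 0
    );
    """
)

# One source file with two subjects: one already tested, one still pending.
file_id = connection.execute(
    "INSERT INTO pyfunceble_file (path) VALUES (?)", ("hosts.list",)
).lastrowid
connection.executemany(
    "INSERT INTO pyfunceble_status "
    "(file_id, tested, status, status_source, test_completed) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (file_id, "example.org", "ACTIVE", "DNSLOOKUP", 1),
        (file_id, "example.net", None, None, 0),
    ],
)

# Subjects of the given file that have not been completely tested yet.
pending = [
    row[0]
    for row in connection.execute(
        "SELECT s.tested FROM pyfunceble_status AS s "
        "JOIN pyfunceble_file AS f ON f.id = s.file_id "
        "WHERE f.path = ? AND s.test_completed = 0 ORDER BY s.id",
        ("hosts.list",),
    )
]
connection.close()
```

A lookup of this kind is one way an interrupted run could be resumed from the test_completed flags. Whether PyFunceble resumes exactly like this is not stated in the draft above.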