This work proposes a workflow for processing log files that helps identify errors by highlighting the differences between the logs of a correct and an incorrect execution.
The workflow consists of three steps:
- generation of the log files (using the Python reproduction scripts)
- parsing the log files (using `templater.py`)
- extracting the differences between the parsed log files (using `difference.py`)

Requirements for the reproduction scripts:
- For CASSANDRA-14989 and CASSANDRA-11803: Docker
- For the rest of the reproduction scripts: Cassandra Cluster Manager (CCM) and its dependencies
- DataStax Python Driver: `pip install cassandra-driver`
- Running the scripts on a Linux system is advised, since CCM has some known bugs on Windows

Run each reproduction script separately by executing its file. The program should terminate with exit code 0, and the log files are generated in `/home/<USER>/.ccm/<CLUSTER_NAME>/<NODE_NAME>/logs/`, where `NODE_NAME` is by default `node1`, `node2`, etc. For this work, the important file is `debug.log`.
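
The run-and-locate step can be sketched in Python as follows. This is only an illustration and not part of the repository: the helper names `run_reproduction` and `find_debug_logs` are hypothetical, and the sketch assumes the default CCM directory layout described above.

```python
import subprocess
import sys
from pathlib import Path


def run_reproduction(script: Path) -> int:
    """Run one reproduction script and return its exit code (0 is expected)."""
    return subprocess.run([sys.executable, str(script)]).returncode


def find_debug_logs(cluster_name: str, ccm_root: Path = Path.home() / ".ccm") -> list[Path]:
    """Return <ccm_root>/<cluster_name>/<node>/logs/debug.log for every node, sorted by path."""
    return sorted((ccm_root / cluster_name).glob("*/logs/debug.log"))
```

For example, `find_debug_logs("my_cluster")` would list the `debug.log` of each node of a cluster named `my_cluster` (a placeholder name), provided the reproduction script left the cluster directory in place.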
The `debug.log` file can then be copied to a desktop folder named `13346_failure` or `13346_normal` (or the corresponding number for the other bugs). The copy of `debug.log` is then passed to `templater.py`.
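
If the copying is to be automated, a minimal sketch could look like this. The function `collect_log` and its parameters are hypothetical and only mirror the `<BUG_NUMBER>_failure` / `<BUG_NUMBER>_normal` naming used above; the destination root is left to the caller.

```python
import shutil
from pathlib import Path


def collect_log(debug_log: Path, dest_root: Path, bug_id: str, outcome: str) -> Path:
    """Copy debug.log into <dest_root>/<bug_id>_<outcome>/ and return the path of the copy."""
    if outcome not in ("failure", "normal"):
        raise ValueError("outcome must be 'failure' or 'normal'")
    target_dir = dest_root / f"{bug_id}_{outcome}"
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 also preserves file metadata such as the modification time
    return Path(shutil.copy2(debug_log, target_dir / "debug.log"))
```

Calling `collect_log(log_path, Path.home() / "Desktop", "13346", "failure")` would place the copy in a `13346_failure` folder on the desktop, assuming the desktop folder is located there.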
The parsed log file is then passed to `difference.py`, which generates an output file containing the log entries that differ.
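
To illustrate the idea behind this step, the sketch below computes the entries that occur in one parsed log but not in the other. It is not the implementation of `difference.py`, whose actual algorithm and file format are not described here; it assumes one parsed entry per line, and the names `read_entries` and `only_in_first` are hypothetical.

```python
from pathlib import Path


def read_entries(path: Path) -> list[str]:
    """Read a parsed log file, assuming one entry per line; blank lines are skipped."""
    text = path.read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines() if line.strip()]


def only_in_first(first: list[str], second: list[str]) -> list[str]:
    """Return the distinct entries of `first` that never occur in `second`, in original order."""
    known = set(second)
    emitted = set()
    result = []
    for entry in first:
        if entry not in known and entry not in emitted:
            result.append(entry)
            emitted.add(entry)
    return result
```

Applied to the parsed failure log as `first` and the parsed normal log as `second`, the result would be the entries that appear only in the failing run.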