Data-Driven Hint Generation for Alloy using Historical Student Submissions
- Java 11
- Maven
- Python 3 and respective packages
- Neo4j Enterprise and the following plugins:
  - APOC
  - Graph Data Science Library
# Build Higena
$ cd higena
$ mvn clean package
# Rename the shaded jar
$ mv lib/higena/higena/1.0.0/higena-1.0.0-shaded.jar ../lib/higena/higena/1.0.0/higena-1.0.0.jar
To add a new challenge, drop the .als file in the data/datasets/challenges/ folder. The file name should be the same as the challenge ID. For example, if the challenge name is challenge1, the file name should be challenge1.als. This file does not contain secrets.
To add the dataset of student submissions for a challenge, drop the .json file in the data/datasets/submissions/ folder. The file name should be the same as the challenge ID. For example, if the challenge name is challenge1, the file name should be challenge1.json.
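Because challenge and submission files are paired by name, a quick check can catch mismatches before the data is prepared. The following is a minimal Python sketch and not part of HiGenA: the function name `unpaired_challenges` is hypothetical, and the folder paths are the ones mentioned above.

```python
from pathlib import Path


def unpaired_challenges(challenges_dir, submissions_dir):
    """Return (challenge IDs without a .json, submission IDs without an .als)."""
    # The challenge ID is taken to be the file name without its extension.
    challenge_ids = {p.stem for p in Path(challenges_dir).glob("*.als")}
    submission_ids = {p.stem for p in Path(submissions_dir).glob("*.json")}
    missing_submissions = sorted(challenge_ids - submission_ids)
    missing_challenges = sorted(submission_ids - challenge_ids)
    return missing_submissions, missing_challenges


if __name__ == "__main__":
    no_json, no_als = unpaired_challenges(
        "data/datasets/challenges", "data/datasets/submissions"
    )
    print("Challenges without submissions:", no_json)
    print("Submissions without a challenge:", no_als)
```

Run from the root of the project, both lists should be empty when every challenge has a matching dataset.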
The .json file should be an array of objects, where each object represents a submission. The object should have the following fields:
- _id: the submission ID
- time: the timestamp of its creation
- derivationOf: the parent entry
- original: the first ancestor with secrets
- code: the complete code of the model
- sat: the command's result, or -1 for errors [only for executions]
- cmd_i: the index of the executed command [only for executions]
- cmd_n: the name of the executed command [only for successful executions]
- cmd_c: whether the command was a check [only for successful executions]
- msg: the error or warning message [if any]
- theme: the visualisation theme [only for sharing entries]
If you prefer, you can use the dataset available at Zenodo.
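The submission format described above can be sanity-checked before running the preparation step. The following is a minimal Python sketch and not part of HiGenA: the function name `check_submissions` is hypothetical, the field names are the ones listed above, and treating `_id` as the only mandatory field is an assumption. The sample entry is illustrative, not taken from a real dataset.

```python
import json

# Field names as documented in the list above.
KNOWN_FIELDS = {
    "_id", "time", "derivationOf", "original", "code", "sat",
    "cmd_i", "cmd_n", "cmd_c", "msg", "theme",
}


def check_submissions(entries):
    """Return a list of human-readable problems found in a parsed dataset."""
    if not isinstance(entries, list):
        return ["top-level value is not an array"]
    problems = []
    for index, entry in enumerate(entries):
        if not isinstance(entry, dict):
            problems.append(f"entry {index}: not an object")
            continue
        # Assumption: every entry carries an _id; other fields are optional here.
        if "_id" not in entry:
            problems.append(f"entry {index}: missing _id")
        unknown = sorted(set(entry) - KNOWN_FIELDS)
        if unknown:
            problems.append(f"entry {index}: unknown fields {unknown}")
    return problems


if __name__ == "__main__":
    # Illustrative entry only; the values are made up.
    sample = json.loads(
        '[{"_id": "a1", "time": "2020-01-01 10:00:00", "code": "sig A {}",'
        ' "sat": 1, "cmd_i": 0}]'
    )
    print(check_submissions(sample))  # prints []
```

To check a real file, load it with `json.load` and pass the result to `check_submissions`.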
HiGenA requires the data to be in a specific format. To prepare the data, run the following command:
# Prepare data
$ cd data
$ python3 prepare_data.py
This script will create the following folders:
- all: contains all the submissions
- no_canon: contains the submissions without canonicalization
- only_anon: contains the submissions with only anonymization
- only_sort: contains the submissions with only sorting
- test: contains the submissions for testing
- train: contains the submissions for training
- logs: contains the log files of the data preparation step
Each of these folders contains one folder per challenge. Each challenge folder holds the submissions for that challenge as ".csv" files, one for each public predicate in the challenge.
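The prepared layout can be walked with a short script, for example to list every challenge and predicate pair before loading the data into Neo4j. The following is a minimal Python sketch and not part of HiGenA: it assumes one folder per challenge with one .csv file per predicate directly inside it, and both the function name `list_predicates` and the `data/prepared/all` path are assumptions.

```python
from pathlib import Path


def list_predicates(prepared_dir):
    """Map each challenge folder name to the sorted names of its predicate files."""
    result = {}
    for challenge_dir in sorted(Path(prepared_dir).iterdir()):
        if not challenge_dir.is_dir():
            continue  # skip stray files next to the challenge folders
        result[challenge_dir.name] = sorted(
            csv_file.stem for csv_file in challenge_dir.glob("*.csv")
        )
    return result


if __name__ == "__main__":
    prepared = Path("data/prepared/all")  # assumed location of the prepared data
    if prepared.is_dir():
        pairs = list_predicates(prepared)
        for challenge, predicates in pairs.items():
            print(challenge, predicates)
        print("Total predicates:", sum(len(p) for p in pairs.values()))
```

The total is the number of predicates in the prepared data, which helps when sizing the Neo4j database limit.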
- Move the prepared data to the neo4j import/prepared_data folder.
- Enable enterprise edition.
- Increase the maximum number of databases if necessary. The default is 100. The necessary number of databases is equal to the total number of predicates in the dataset.
- Install necessary plugins:
  - APOC
  - Graph Data Science Library
- Move the folder of submissions to the neo4j import folder.
  - Each neo4j instance contains an import folder. The default path is /var/lib/neo4j/import/prepared_data. If you want to create the graphs with all submissions, you can move the contents of the all folder to the import folder. It should look like:
import
├── prepared_data
│   ├── challenge1
│   │   ├── predicate1.csv
│   │   ├── predicate2.csv
│   │   ├── ...
│   ├── challenge2
│   │   ├── predicate1.csv
│   │   ├── predicate2.csv
│   │   ├── ...
│   ├── ...
- Start the neo4j instance.
- HiGenA requires the neo4j instance to be running and some environment variables to be set. You can create a .env file in the higena folder of the project with the following variables:
  - URI_NEO4J: the URI of the neo4j instance, e.g. bolt://localhost:7687
  - USENAME_NEO4J: the username of the neo4j instance, e.g. neo4j
  - PASSWORD_NEO4J: the password of the neo4j instance, e.g. 1234
- Before generating hints for a challenge, you have to create a database for that challenge. To do so, run the following command:
$ cd lib/higena/higena/1.0.0/
$ java -jar higena-1.0.0.jar $challenge $predicate
An alternative is to use the API.
import org.higena.graph.Graph;
// ...
Graph graph = new Graph(challenge, predicate);
graph.setup();
To request a hint for a submission, use the API.
Graph graph = new Graph(challenge, predicate);
graph.getHint(expression, HintGenType.TED);
You can also send the code of the model if you use auxiliary predicates in your expression.
Graph graph = new Graph(challenge, predicate);
graph.getHint(expression, code, HintGenType.TED);
- data: contains the data related files.
  - datasets: contains the datasets of challenges and submissions.
    - challenges: contains the challenges in .als format (without secrets).
  - prepared: contains the prepared data.
    - all: contains all the submissions
    - no_canon: contains the submissions without canonicalization
    - only_anon: contains the submissions with only anonymization
    - only_sort: contains the submissions with only sorting
    - test: contains the submissions for testing
    - train: contains the submissions for training
  - evaluation: contains data used for evaluation.
  - logs: contains the log files of the data preparation step.
  - data_analysis.ipynb: script to analyse the data for evaluation purposes.
  - data_preparation.ipynb: script to prepare the data for HiGenA.
- lib: contains the libraries used by HiGenA.
- higena: contains the source code of HiGenA (maven project).