
Releases: kreeedit/trace

TRACE: Text Reuse Analysis and Comparison Engine

26 Jul 08:23

Revoke global initialization of the SentenceTransformer model to speed up processing.

TRACE: Text Reuse Analysis and Comparison Engine

25 Jul 12:36
6a7dedb

TRACE is a simple Python script that measures the similarity between text files, even when they are multilingual or paraphrased, using two methods: MinHash and SentenceTransformer. You can specify the directory containing the text (.txt) files, the size of the text windows to compare, the step size by which each new window advances, and the similarity threshold. It also builds a network graph of the text similarities (text_similarity_network.gexf) that shows how the texts relate to one another. The results of the analysis are stored in a JSON file (result.json).
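The windowing and MinHash comparison described above can be illustrated with a minimal, standard-library-only sketch. It is not TRACE's actual code: the function names (`windows`, `shingles`, `minhash_signature`, `similar_windows`), the 3-word shingles, the 64 hash functions, and the default parameter values are all assumptions chosen for illustration, and the SentenceTransformer method is not shown.

```python
import hashlib


def windows(words, size, step):
    """Yield word windows of length `size`, advancing `step` words each time."""
    if len(words) <= size:
        yield words
        return
    for start in range(0, len(words) - size + 1, step):
        yield words[start:start + size]


def shingles(words, k=3):
    """Return the set of k-word shingles in a window (k is an assumed value)."""
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}


def minhash_signature(items, num_perm=64):
    """Keep the minimum hash of the set under each of `num_perm` salted hashes."""
    signature = []
    for seed in range(num_perm):
        salt = seed.to_bytes(8, "big")  # a different salt gives a different hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(s.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for s in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def similar_windows(text_a, text_b, size=20, step=10, threshold=0.5):
    """Return (index_a, index_b, score) for window pairs at or above `threshold`."""
    sigs_a = [minhash_signature(shingles(w)) for w in windows(text_a.split(), size, step)]
    sigs_b = [minhash_signature(shingles(w)) for w in windows(text_b.split(), size, step)]
    hits = []
    for i, sig_a in enumerate(sigs_a):
        for j, sig_b in enumerate(sigs_b):
            score = estimated_jaccard(sig_a, sig_b)
            if score >= threshold:
                hits.append((i, j, score))
    return hits
```

In this sketch, a larger window size makes each comparison less sensitive to short matches, a smaller step produces more overlapping windows (and more comparisons), and the threshold decides which window pairs are reported. A pair of identical windows always scores 1.0, because identical shingle sets yield identical signatures.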