Skip to content

radon-h2020/radon-repository-miner

Repository files navigation

Build Documentation Codecov lgtm pypi-version python-version

radon-repository-miner

RepositoryMiner is a Python framework that helps developers and researchers to mining software repositories. It currently supports the Infrastructure as Code technology Ansible and Tosca.

How to install

First, ensure your git version is greater than 2.38.0.

Then, from PyPI:

pip install repository-miner

Or, from source code:

git clone https://github.com/radon-h2020/radon-repository-miner.git
cd radon-repository-miner
pip install -r requirements.txt
pip install .

Important: to properly use the FixingCommitCategorized, install the spaCy statistical model en_core_web_sm:

python -m spacy download en_core_web_sm

Usage

from repominer.mining.<ansible|tosca> import <Ansible|Tosca>Miner
from repominer.metrics.<ansible|tosca> import <Ansible|Tosca>MetricsExtractor
    
miner = <Ansible|Tosca>Miner(url_to_repo='github.com/adriagalin/ansible.motd', clone_repo_to='/tmp') 
miner.get_fixing_commits()
miner.get_fixed_files()
failure_prone_files = miner.label()

metrics_extractor = <Ansible|Tosca>MetricsExtractor(path_to_repo='/tmp/ansible.motd')
metrics_extractor.extract(failure_prone_files, product=True, process=True, delta=True)
metrics_extractor.to_csv('metrics.csv')

print('FIXING COMMITS:', miner.fixing_commits)
print('FAILURE-PRONE FILES:', failure_prone_files)
print('METRICS:', metrics_extractor.dataset.head())

How to test

pip install pytest
unzip test_data.zip -d .
pytest