This project creates a dataset from the FDIC failed bank list. The dataset is published on Kaggle as failed-bank-list.
- Clone the Repository:
git clone https://github.com/RonMallory/failed-bank-dataset.git
cd failed-bank-dataset
- Setup with Poetry:
Ensure you have Poetry installed, then run:
poetry install
This command installs all the necessary dependencies specified in pyproject.toml.
This project uses pre-commit to maintain code quality and consistency. Install the configured hooks with:
pre-commit install
The initial dataset is published manually; subsequent updates to the dataset run in GitHub Actions.
- Create an API token from your Kaggle account settings.
- Run the following command to generate the dataset.csv and kaggle-metadata.json files:
poetry run python src/main.py
- Run the following command to publish the dataset to Kaggle:
kaggle datasets create -p ./data
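The metadata file produced in the generation step can be sketched as follows. This is a minimal illustration, not the project's src/main.py: the function names, the example owner, slug, and title, and the default license are all hypothetical. The field layout (title, id, licenses) and the dataset-metadata.json filename follow the Kaggle CLI's documented dataset metadata format; pass a different filename if the project writes kaggle-metadata.json as stated above.

```python
import json
from pathlib import Path


def build_metadata(owner: str, slug: str, title: str,
                   license_name: str = "CC0-1.0") -> dict:
    """Build a dataset metadata dict (all argument values are assumptions)."""
    return {
        "title": title,
        "id": f"{owner}/{slug}",  # Kaggle dataset ids take the form owner/slug
        "licenses": [{"name": license_name}],
    }


def write_metadata(metadata: dict, data_dir: Path,
                   filename: str = "dataset-metadata.json") -> Path:
    """Write the metadata as JSON into data_dir and return the file path."""
    data_dir.mkdir(parents=True, exist_ok=True)
    path = data_dir / filename
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path


if __name__ == "__main__":
    # Hypothetical values for illustration only
    meta = build_metadata("example-user", "failed-bank-list", "Failed Bank List")
    print(write_metadata(meta, Path("./data")))
```

Keeping the metadata in a plain dict until the final write makes it easy to inspect or adjust before running kaggle datasets create.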
- Using the values in the kaggle.json file created during the initial publish, create two GitHub secrets named KAGGLE_USERNAME and KAGGLE_KEY.
- Once a pull request has been approved and merged into the main branch, the GitHub Action runs and updates the dataset.
- The ci.yml workflow uses the commit message to annotate the new dataset version with the changes made.
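The update steps above can be sketched in Python. This is an assumption-laden illustration and not the contents of ci.yml: both helper names are hypothetical, as is the choice to use only the first line of the commit message and the fallback note. It relies on two things the Kaggle tooling does provide: kaggle.json stores "username" and "key" fields, and kaggle datasets version accepts -p for the folder and -m for the version message.

```python
import json
from pathlib import Path


def load_kaggle_credentials(path: Path) -> dict:
    """Map the fields of kaggle.json to the secret names used in this README."""
    raw = json.loads(path.read_text(encoding="utf-8"))
    return {"KAGGLE_USERNAME": raw["username"], "KAGGLE_KEY": raw["key"]}


def build_version_command(data_dir: str, commit_message: str) -> list[str]:
    """Build the argv for publishing a new dataset version.

    Assumption: only the first line of the commit message is used as the
    version note; the real workflow may pass the message differently.
    """
    lines = commit_message.strip().splitlines()
    note = lines[0] if lines else "Automated update"  # hypothetical fallback
    return ["kaggle", "datasets", "version", "-p", data_dir, "-m", note]


if __name__ == "__main__":
    print(build_version_command("./data", "Add newly failed banks\n\nDetails..."))
```

The returned list can be passed to subprocess.run with KAGGLE_USERNAME and KAGGLE_KEY set in the environment, which is how the Kaggle CLI reads credentials when no kaggle.json file is present.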
- Fork the project.
- Create a branch based on the DSLP strategy:
git checkout -b feature/new-feature
- Commit your changes:
git commit -am 'Add new feature'
- Push to the branch:
git push origin feature/new-feature
- Submit a pull request against the appropriate DSLP branch.
This project is licensed under the MIT License.