Merge pull request #44 from moskomule/dev
2020.12 update
moskomule committed Dec 29, 2020
2 parents 2b98d4e + 30d06f7 commit c366ca7
Showing 29 changed files with 672 additions and 760 deletions.
33 changes: 33 additions & 0 deletions .github/workflows/pypi.yml
@@ -0,0 +1,33 @@
name: publish to pypi

on: push

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: 3.8

      - name: Install dependencies
        run: pip install -U setuptools wheel

      - name: Build and tar
        run: python setup.py sdist bdist_wheel

      - name: Publish distribution to PyPI
        if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags')
        uses: pypa/gh-action-pypi-publish@master
        with:
          password: ${{ secrets.pypi_password }}

      - name: Publish distribution to Test PyPI
        uses: pypa/gh-action-pypi-publish@master
        with:
          password: ${{ secrets.test_pypi_password }}
          repository_url: https://test.pypi.org/legacy/
          skip_existing: true
26 changes: 0 additions & 26 deletions .github/workflows/release.yml

This file was deleted.

11 changes: 5 additions & 6 deletions .github/workflows/test.yml
@@ -1,6 +1,6 @@
name: pytest

on: [ push, pull_request ]
on: push

jobs:
build:
@@ -11,10 +11,9 @@ jobs:
strategy:
matrix:
python: [ '3.8' ]
torch: [ 'torch==1.6.0+cpu torchvision==0.7.0+cpu -f https://download.pytorch.org/whl/torch_stable.html' ]
#'--pre torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html' ]
# I don't know why, but with this version, homura.vision.datasets.VisionDataset cannot get datasets' args in
# __init__.
# python: [ '3.8' , '3.9' ]
torch: [ 'torch==1.7.1+cpu torchvision==0.8.2+cpu -f https://download.pytorch.org/whl/torch_stable.html',
'--pre torch torchvision -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html' ]

steps:
- uses: actions/checkout@v2
@@ -26,7 +25,7 @@ jobs:
run: |
python -m venv venv
. venv/bin/activate
pip install numpy
pip install -U numpy
pip install ${{ matrix.torch }}
pip install -U pytest
pip install -U .
115 changes: 51 additions & 64 deletions README.md
@@ -1,52 +1,56 @@
# homura ![](https://github.com/moskomule/homura/workflows/pytest/badge.svg) [![document](https://img.shields.io/static/v1?label=doc&message=homura&color=blue)](https://moskomule.github.io/homura)
# homura [![document](https://img.shields.io/static/v1?label=doc&message=homura&color=blue)](https://moskomule.github.io/homura)

**homura** is a library for fast prototyping DL research.
| master | dev |
| --- | --- |
| ![pytest](https://github.com/moskomule/homura/workflows/pytest/badge.svg) | ![pytest](https://github.com/moskomule/homura/workflows/pytest/badge.svg?branch=dev) |

**homura** is a fast prototyping library for DL research.

🔥🔥🔥🔥 *homura* (焰) is *flame* or *blaze* in Japanese. 🔥🔥🔥🔥

## Important Notes

* no longer supports `horovod` by default
* no longer installs `hydra-core` by default
* no longer steps schedulers by default. Do it manually.
* In order to avoid a name conflict on pypi, the library name is renamed to `homura-core`.
+ For *installation*, use `homura-core`.
+ For *importing*, use `homura`.
+ If you have already installed `homura<2020.12.0`, uninstall it before installing the latest one.

## Requirements

### Minimal requirements

```
Python>=3.8
PyTorch>=1.6.0
torchvision>=0.7.0
PyTorch>=1.7.0
torchvision>=0.8.0
```
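
To check an existing environment against these minimums before installing, something along the following lines may help. It is only a sketch: `parse_version` and `meets_minimum` are hypothetical helpers, not part of homura, and the parsing is deliberately simplistic (it keeps the leading numeric components and ignores suffixes such as `+cpu` or `.dev...`).

```python
import re
import sys
from importlib import metadata  # available from Python 3.8
from typing import Tuple


def parse_version(version: str, width: int = 3) -> Tuple[int, ...]:
    """Keep the leading dotted integers, e.g. '1.7.1+cpu' -> (1, 7, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    # Pad with zeros so that '1.7' compares equal to '1.7.0'.
    parts += [0] * (width - len(parts))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Hypothetical helper: True if `installed` is at least `minimum`."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    print("Python>=3.8:", sys.version_info >= (3, 8))
    # Minimums taken from the requirements listed above.
    for name, minimum in (("torch", "1.7.0"), ("torchvision", "0.8.0")):
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            print(f"{name}: not installed")
        else:
            print(f"{name}>={minimum}:", meets_minimum(installed, minimum))
```
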

### Optional
## Installation

```
faiss (for faster kNN)
cupy
accimage (for faster image pre-processing)
nlp (to run an example)
```console
pip uninstall homura
pip install -U homura-core
```

### test
or

```
pytest .
```console
pip uninstall homura
pip install -U git+https://github.com/moskomule/homura
```

## Installation
## Optional

```console
pip install git+https://github.com/moskomule/homura
```
faiss (for faster kNN)
accimage (for faster image pre-processing)
cupy
```

or
## test

```console
git clone https://github.com/moskomule/homura
cd homura
pip install -e .
```
pytest .
```

# APIs
@@ -67,22 +71,23 @@ model = MODEL_REGISTRY('model_name')(num_classes=num_classes)

# Model is registered in optimizer lazily. This is convenient for distributed training and other complicated scenes.
optimizer = optim.SGD(lr=0.1, momentum=0.9)
scheduler = lr_scheduler.MultiStepLR(milestones=[30,80], gamma=0.1)
scheduler = lr_scheduler.MultiStepLR(milestones=[30, 80], gamma=0.1)

with trainers.SupervisedTrainer(model,
optimizer,
F.cross_entropy,
with trainers.SupervisedTrainer(model,
optimizer,
F.cross_entropy,
reporters=[reporters.TensorboardReporter(...)],
scheduler=scheduler) as trainer:
# epoch-based training
for _ in trainer.epoch_iterator(num_epochs):
trainer.train(train_loader)
trainer.scheduler.step()
trainer.test(test_loader)
trainer.scheduler.step()

# otherwise, iteration-based training

trainer.run(train_loader, test_loader,
trainer.run(train_loader, test_loader,
total_iterations=1_000, val_intervals=10)

print(f"Max Accuracy={max(trainer.history['accuracy']['test'])}")
@@ -96,9 +101,10 @@ from homura.metrics import accuracy

trainer = SupervisedTrainer(...)


# from v2020.08, iteration is much simpler

def iteration(trainer: TrainerBase,
def iteration(trainer: TrainerBase,
data: Tuple[torch.Tensor, torch.Tensor]
) -> None:
input, target = data
@@ -111,6 +117,9 @@ def iteration(trainer: TrainerBase,
trainer.optimizer.zero_grad()
loss.backward()
trainer.optimizer.step()
# in case schedule is step-wise
trainer.scheduler.step()


SupervisedTrainer.iteration = iteration
# or
@@ -126,18 +135,21 @@ trainer = CustomTrainer({"generator": generator, "discriminator": discriminator}
**kwargs)
```

`reporter` internally tracks the values during each epoch and reduces after every epoch. Therefore, users can compute mIoU, for example, as
`reporter` internally tracks the values during each epoch and reduces after every epoch. Therefore, users can compute
mIoU, for example, as

```python
from homura.metrics import confusion_matrix


def cm_to_miou(cms: List[torch.Tensor]) -> torch.Tensor:
# cms: list of confusion matrices
cm = sum(cms).float()
miou = cm.diag() / (cm.sum(0) + cm.sum(1) - cm.diag())
return miou.mean().item()

def iteration(trainer: TrainerBase,

def iteration(trainer: TrainerBase,
data: Tuple[torch.Tensor, torch.Tensor]
) -> None:
input, target = data
@@ -148,13 +160,14 @@ def iteration(trainer: TrainerBase,

## Distributed training

Distributed training is complicated at glance. `homura` has simple APIs, to hide the messy codes for DDP, such as `homura.init_distributed` for the initialization and `homura.is_master` for checking if the process is master or not.
Distributed training is complicated at glance. `homura` has simple APIs, to hide the messy codes for DDP, such
as `homura.init_distributed` for the initialization and `homura.is_master` for checking if the process is master or not.

For details, see `examples/imagenet.py`.
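
For intuition, "master" in this setting usually means the process with global rank 0. The sketch below illustrates that idea using only the `RANK` environment variable that `torch.distributed.launch` exports to each worker. It is not homura's implementation, and `rank_from_env`, `is_master_process`, and `log_once` are hypothetical names.

```python
import os
from typing import Mapping


def rank_from_env(env: Mapping[str, str] = os.environ) -> int:
    """Global rank of this process; defaults to 0 when not launched distributed."""
    return int(env.get("RANK", "0"))


def is_master_process(env: Mapping[str, str] = os.environ) -> bool:
    """Assumption: the master is the process whose global rank is 0."""
    return rank_from_env(env) == 0


def log_once(message: str, env: Mapping[str, str] = os.environ) -> bool:
    """Print only on the master so multi-process runs do not duplicate output."""
    if is_master_process(env):
        print(message)
        return True
    return False
```

Guarding logging, checkpointing, and dataset downloads this way keeps side effects from being repeated once per GPU.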

## Reproducibility

This method makes randomness deterministic in its context.
These methods make randomness deterministic in its context.

```python
from homura.utils.reproducibility import set_deterministic, set_seed
@@ -171,16 +184,19 @@ with set_seed(seed):

## Registry System

Following major libraries, `homura` also has a simple register system.
Following major libraries, `homura` also has a simple registry system.

```python
from homura import Registry

MODEL_REGISTRY = Registry("language_models")


@MODEL_REGISTRY.register
class Transformer(nn.Module):
...


# or

MODEL_REGISTRY.register(bert_model, 'bert')
@@ -197,35 +213,6 @@ bert = MODEL_REGISTRY('bert', ...)

See [examples](examples).

* [cifar10.py](examples/cifar10.py): training ResNet-20 or WideResNet-28-10 with random crop on CIFAR10
* [imagenet.py](examples/imagenet.py): training a CNN on ImageNet on multi GPUs (single and multi process)

Note that homura expects datasets are downloaded in `~/.torch/data/DATASET_NAME`.

For [imagenet.py](examples/imagenet.py), if you want

* single node single gpu
* single node multi gpus

run `python imagenet.py`.

If you want

* single node multi threads multi gpus

run `python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS imagenet.py [...]`.

If you want

* multi nodes multi threads multi gpus,

run

* `python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=0 --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py` on the master node
* `python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=$RANK --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py` on the other nodes

Here, `0<$RANK<$NUM_NODES`.

# Citing

```bibtex
39 changes: 39 additions & 0 deletions examples/README.md
@@ -0,0 +1,39 @@
# Examples

## Requirements

* `homura`
* `chika` by `pip install -U chika`

## Contents

* [cifar10.py](examples/cifar10.py): training ResNet-20 or WideResNet-28-10 with random crop on CIFAR10
* [imagenet.py](examples/imagenet.py): training a CNN on ImageNet on multi GPUs (single and multi process)

Note that homura expects datasets are downloaded in `~/.torch/data/DATASET_NAME`.

For [imagenet.py](examples/imagenet.py), if you want

* single node single gpu
* single node multi gpus

run `python imagenet.py`.

If you want

* single node multi threads multi gpus

run `python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS imagenet.py [...]`.

If you want

* multi nodes multi threads multi gpus,

run

* `python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=0 --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py`
on the master node
* `python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=$RANK --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py`
on the other nodes

Here, `0<$RANK<$NUM_NODES`.
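
The per-node commands above differ only in `--node_rank`, so they can be generated from one template. A minimal sketch, assuming nothing beyond the flags shown above: `build_launch_command` is a hypothetical helper, and the address and port in the usage part are placeholder values.

```python
import shlex
from typing import List


def build_launch_command(node_rank: int,
                         num_nodes: int,
                         gpus_per_node: int,
                         master_addr: str,
                         master_port: int,
                         script: str = "imagenet.py") -> List[str]:
    """Build the torch.distributed.launch argv for one node (rank 0 is the master)."""
    if not 0 <= node_rank < num_nodes:
        raise ValueError(f"node_rank must be in [0, {num_nodes}), got {node_rank}")
    return ["python", "-m", "torch.distributed.launch",
            f"--nnodes={num_nodes}",
            f"--node_rank={node_rank}",
            f"--master_addr={master_addr}",
            f"--master_port={master_port}",
            f"--nproc_per_node={gpus_per_node}",
            script]


if __name__ == "__main__":
    # Placeholder cluster: 2 nodes with 4 GPUs each.
    for rank in range(2):
        print(shlex.join(build_launch_command(rank, 2, 4, "192.0.2.1", 29500)))
```

Each printed line is meant to be run on the node with the matching rank.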