Add top level hub benchmark files and CI scripts
- remove submodule update as no submodules used
- remove torchhub job
- update paths in scripts
wconstab committed Sep 1, 2020
1 parent 65e599d commit 4bbc2fa
Showing 17 changed files with 599 additions and 0 deletions.
40 changes: 40 additions & 0 deletions .circleci/config.yml
@@ -0,0 +1,40 @@
version: 2

jobs:
  run_benchmarks:
    machine:
      image: ubuntu-1604:201903-01
    resource_class: gpu.small
    steps:
      - checkout
      - run:
          name: Setup CI environment
          command: ./scripts/setup_ci.sh
      - run:
          name: Install Conda
          command: ./scripts/install_basics.sh
      - run:
          name: Install PyTorch nightly
          command: ./scripts/install_nightlies.sh
      - run:
          name: Validate training benchmark suite
          command: . ~/miniconda3/etc/profile.d/conda.sh; conda activate base; python test.py
      - run:
          name: Validate pytest-benchmark invocation of training suite
          command: ./scripts/run_bench_and_upload.sh

workflows:
  version: 2
  workflow-build:
    jobs:
      - run_benchmarks
  nightly:
    triggers:
      - schedule:
          # "0 0,12 * * *" fires twice daily, at 00:00 and 12:00 UTC
          cron: "0 0,12 * * *"
          filters:
            branches:
              only:
                - master
    jobs:
      - run_benchmarks
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
.benchmarks
.data
*/**/__pycache__
*/**/*.pyc
*.out*
29 changes: 29 additions & 0 deletions LICENSE
@@ -0,0 +1,29 @@
BSD 3-Clause License

Copyright (c) 2019, pytorch
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
31 changes: 31 additions & 0 deletions README.md
@@ -0,0 +1,31 @@
## Running Benchmarks
There are currently two top-level scripts for running the models in the hub.

`test.py` offers the simplest wrapper around the infrastructure for iterating through each model in the hub, installing its dependencies, and executing it.

`test_bench.py` is a pytest-benchmark script that leverages the same infrastructure but collects benchmark statistics and supports filtering à la pytest.

Each model repo assumes the user already has the torch family of packages installed (torch, torchtext, torchvision, ...); its install step then installs the remaining dependencies for that model.

### Using `test.py`
`python test.py` will execute the setup and run steps for each model in the hub.

Note: the setup steps require network connectivity; make sure to enable a proxy if needed.

### Using pytest-benchmark driver
Run `python test.py --setup_only` first to run the setup steps for each model.

`pytest test_bench.py` invokes the benchmark driver. See `--help` for a complete list of options.

Some useful options include (see the example below):
- `--benchmark-autosave` (or other save-related flags) to get `.json` output
- `-k <filter expression>` (standard pytest filtering)
- `--collect-only` to list which tests would run without running them; useful for seeing which models exist or for debugging a filter expression
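
For example, `pytest test_bench.py -k cuda-jit --benchmark-autosave` runs only the cuda-jit test variants and saves the results as a timestamped `.json` file under `.benchmarks/`; `compare.sh` below uses the same `-k cuda-jit` filter.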

## Nightly CI runs
Currently, the hub models run against nightly PyTorch builds and push data to Scuba.

See [Unidash](https://www.internalfb.com/intern/unidash/dashboard/pytorch_benchmarks/hub_detail/) (internal only)

## Adding new models
Instructions for adding new models are currently under development in a Quip. At a high level, each model lives in its own repository, usually forked from the original open-source repository and modified to add `install.py` and `hubconf.py` files, which let the hub scripts interact with the model through a known API.
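
While that write-up is in progress, here is a rough sketch of such a `hubconf.py`. The only thing `bench_utils.list_models()` below actually requires is a module-level `Model` attribute; everything else in this sketch (the constructor signature and the `get_module`/`train`/`eval` methods) is a hypothetical illustration, not the finalized API:

```python
# hubconf.py -- hypothetical sketch, not the finalized API.
# Only the module-level `Model` name is assumed by bench_utils.list_models();
# the constructor and methods below are illustrative stand-ins.
import torch


class Model:
    def __init__(self, device='cpu'):
        self.device = device
        self.module = torch.nn.Linear(8, 8).to(device)  # toy stand-in network
        self.example_inputs = (torch.randn(16, 8, device=device),)

    def get_module(self):
        # Hand the nn.Module and example inputs to the benchmark driver.
        return self.module, self.example_inputs

    def train(self, niter=1):
        optimizer = torch.optim.SGD(self.module.parameters(), lr=0.01)
        loss_fn = torch.nn.MSELoss()
        for _ in range(niter):
            optimizer.zero_grad()
            out = self.module(*self.example_inputs)
            loss_fn(out, torch.zeros_like(out)).backward()
            optimizer.step()

    def eval(self, niter=1):
        with torch.no_grad():
            for _ in range(niter):
                self.module(*self.example_inputs)
```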
80 changes: 80 additions & 0 deletions bench_utils.py
@@ -0,0 +1,80 @@
import os
from pathlib import Path
import subprocess
import sys
import torch
from urllib import request

git_submodule_suggestion = "Have you run\n" \
                           "`git submodule update --init --recursive`?"
proxy_suggestion = "Unable to verify https connectivity, " \
"required for setup.\n" \
"Do you need to use a proxy?"

this_dir = Path(__file__).parent.absolute()
model_dir = 'models/'
install_file = 'install.py'
hubconf_file = 'hubconf.py'


def _test_https(test_url='https://github.com', timeout=0.5):
    try:
        request.urlopen(test_url, timeout=timeout)
    except OSError:
        return False
    return True


def _install_deps(model_path):
    if os.path.exists(os.path.join(model_path, install_file)):
        subprocess.check_call([sys.executable, install_file], cwd=model_path)
    else:
        print('No install.py is found in {}.'.format(model_path))
        print(git_submodule_suggestion)
        sys.exit(-1)


class workdir():
    def __init__(self, path):
        self.path = path
        self.cwd = os.getcwd()

    def __enter__(self):
        sys.path.insert(0, self.path)
        os.chdir(self.path)

    def __exit__(self, exc_type, exc_value, traceback):
        try:
            os.chdir(self.cwd)
            sys.path.remove(self.path)
        except ValueError:
            pass


def list_model_paths():
    p = Path(__file__).parent.joinpath(model_dir)
    return [str(child.absolute()) for child in p.iterdir()]


def setup():
    if not _test_https():
        print(proxy_suggestion)
        sys.exit(-1)

    _install_deps(this_dir)
    for model_path in list_model_paths():
        _install_deps(model_path)


def list_models():
    models = []
    for model_path in list_model_paths():
        with workdir(model_path):
            try:
                hub_module = torch.hub.import_module(hubconf_file, hubconf_file)
                Model = getattr(hub_module, 'Model', None)
            except FileNotFoundError:
                raise RuntimeError(f"Unable to find {hubconf_file} in {model_path}.\n"
                                   f"{git_submodule_suggestion}")
            models.append(Model)
    return zip(models, list_model_paths())
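
For orientation, a minimal sketch of how a driver script might consume these helpers (`test.py` and `test_bench.py` are the real consumers; the zero-arg constructor call below is an assumption):

```python
# Hypothetical driver sketch built on bench_utils (not test.py itself).
from bench_utils import setup, list_models

if __name__ == '__main__':
    setup()  # verify connectivity, then run install.py for each model
    for Model, model_path in list_models():
        if Model is None:  # hubconf.py present but no Model attribute
            continue
        model = Model()  # assumed constructor signature, for illustration
        print(f"Instantiated Model from {model_path}")
```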
34 changes: 34 additions & 0 deletions compare.py
@@ -0,0 +1,34 @@
import argparse
import json
from collections import namedtuple

Result = namedtuple("Result", ["name", "base_time", "diff_time"])

def get_times(pytest_data):
    return {b["name"]: b["stats"]["mean"] for b in pytest_data["benchmarks"]}

parser = argparse.ArgumentParser("compare two pytest jsons")
parser.add_argument('base', help="base json file")
parser.add_argument('diff', help='diff json file')
args = parser.parse_args()

with open(args.base, "r") as base:
    base_times = get_times(json.load(base))
with open(args.diff, "r") as diff:
    diff_times = get_times(json.load(diff))

all_keys = set(base_times.keys()).union(diff_times.keys())
results = [
    Result(name, base_times.get(name, float("nan")), diff_times.get(name, float("nan")))
    for name in sorted(all_keys)
]

print("{:48s} {:>13s} {:>15s} {:>10s}".format(
    "name", "base time (s)", "diff time (s)", "% change"))
for r in results:
    print("{:48s} {:13.6f} {:15.6f} {:9.1f}%".format(
        r.name,
        r.base_time,
        r.diff_time,
        (r.diff_time / r.base_time - 1.0) * 100.0
    ))
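
For reference, `compare.py` reads only two fields per record from pytest-benchmark's JSON output. A trimmed sketch of that shape (test names and timings invented):

```python
# Trimmed sketch of the pytest-benchmark JSON that compare.py consumes.
def get_times(pytest_data):  # same helper as in compare.py above
    return {b["name"]: b["stats"]["mean"] for b in pytest_data["benchmarks"]}

example = {
    "benchmarks": [
        {"name": "test_train[foo-cuda-jit]", "stats": {"mean": 0.123}},
        {"name": "test_eval[foo-cuda-jit]", "stats": {"mean": 0.045}},
    ]
}
assert get_times(example) == {
    "test_train[foo-cuda-jit]": 0.123,
    "test_eval[foo-cuda-jit]": 0.045,
}
```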
5 changes: 5 additions & 0 deletions compare.sh
@@ -0,0 +1,5 @@
#!/bin/bash

pytest test_bench.py -k cuda-jit --fuser legacy --benchmark-json legacy.json
pytest test_bench.py -k cuda-jit --fuser te --benchmark-json te.json
python compare.py legacy.json te.json
23 changes: 23 additions & 0 deletions conftest.py
@@ -0,0 +1,23 @@
import pytest
import torch

def pytest_addoption(parser):
    parser.addoption("--fuser", help="fuser to use for benchmarks")

def set_fuser(fuser):
    if fuser == "legacy":
        # Profiling executor off, legacy GPU fuser on, tensorexpr fuser off.
        torch._C._jit_set_profiling_executor(False)
        torch._C._jit_set_profiling_mode(False)
        torch._C._jit_override_can_fuse_on_gpu(True)
        torch._C._jit_set_texpr_fuser_enabled(False)
    elif fuser == "te":
        # Profiling executor on, built-in fusers off, tensorexpr fuser on.
        torch._C._jit_set_profiling_executor(True)
        torch._C._jit_set_profiling_mode(True)
        torch._C._jit_set_bailout_depth(20)
        torch._C._jit_set_num_profiled_runs(2)
        torch._C._jit_override_can_fuse_on_cpu(False)
        torch._C._jit_override_can_fuse_on_gpu(False)
        torch._C._jit_set_texpr_fuser_enabled(True)

def pytest_configure(config):
    set_fuser(config.getoption("fuser"))
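
A minimal sketch of flipping the same switch outside of pytest (assumes `conftest.py` is importable from the repo root; the model run in between is a placeholder):

```python
# Mirror `pytest --fuser te` programmatically (compare.sh above drives the
# same switch through the pytest option).
from conftest import set_fuser

set_fuser("te")      # profiling executor + tensorexpr fuser
# ... run a torch.jit.script'ed model here ...
set_fuser("legacy")  # back to the legacy GPU fuser
```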
11 changes: 11 additions & 0 deletions install.py
@@ -0,0 +1,11 @@
import subprocess
import sys


def pip_install_requirements():
    subprocess.check_call([sys.executable, '-m',
                           'pip', 'install', '-r', 'requirements.txt'])


if __name__ == '__main__':
    pip_install_requirements()
2 changes: 2 additions & 0 deletions requirements.txt
@@ -0,0 +1,2 @@
pytest
pytest-benchmark
19 changes: 19 additions & 0 deletions scripts/install_basics.sh
@@ -0,0 +1,19 @@
#!/bin/bash
set -e

# Install basics
sudo apt-get install -y vim

# Install miniconda
CONDA=https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
filename=$(basename "$CONDA")
wget "$CONDA"
chmod +x "$filename"
./"$filename" -b -u

# Force Python 3.7
. ~/miniconda3/etc/profile.d/conda.sh
conda activate base
conda install -y python=3.7


8 changes: 8 additions & 0 deletions scripts/install_nightlies.sh
@@ -0,0 +1,8 @@
#!/bin/bash
set -e

. ~/miniconda3/etc/profile.d/conda.sh
conda activate base
conda install -y pytorch torchtext torchvision -c pytorch-nightly
pip install -q pytest pytest-benchmark requests

16 changes: 16 additions & 0 deletions scripts/run_bench_and_upload.sh
@@ -0,0 +1,16 @@
#!/bin/bash
set -e
. ~/miniconda3/etc/profile.d/conda.sh
conda activate base

BENCHMARK_DATA=".data"
mkdir -p ${BENCHMARK_DATA}
pytest test_bench.py --setup-show --benchmark-sort=Name --benchmark-json=${BENCHMARK_DATA}/hub.json

# Token is only present for certain jobs, only upload if present
if [ -z "$SCRIBE_GRAPHQL_ACCESS_TOKEN" ]
then
echo "Skipping benchmark upload, token is missing."
else
python scripts/upload_scribe.py --pytest_bench_json ${BENCHMARK_DATA}/hub.json
fi
32 changes: 32 additions & 0 deletions scripts/setup_ci.sh
@@ -0,0 +1,32 @@
#!/usr/bin/env bash
set -ex -o pipefail

# Set up NVIDIA docker repo
curl -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
echo "deb https://nvidia.github.io/libnvidia-container/ubuntu16.04/amd64 /" | sudo tee -a /etc/apt/sources.list.d/nvidia-docker.list
echo "deb https://nvidia.github.io/nvidia-container-runtime/ubuntu16.04/amd64 /" | sudo tee -a /etc/apt/sources.list.d/nvidia-docker.list
echo "deb https://nvidia.github.io/nvidia-docker/ubuntu16.04/amd64 /" | sudo tee -a /etc/apt/sources.list.d/nvidia-docker.list

# Remove unnecessary sources
sudo rm -f /etc/apt/sources.list.d/google-chrome.list
sudo rm -f /etc/apt/heroku.list
sudo rm -f /etc/apt/openjdk-r-ubuntu-ppa-xenial.list
sudo rm -f /etc/apt/partner.list

sudo apt-get -y update
sudo apt-get -y remove linux-image-generic linux-headers-generic linux-generic docker-ce
sudo apt-get -y install \
linux-headers-$(uname -r) \
linux-image-generic \
moreutils \
docker-ce=5:18.09.4~3-0~ubuntu-xenial \
nvidia-container-runtime=2.0.0+docker18.09.4-1 \
nvidia-docker2=2.0.3+docker18.09.4-1 \
expect-dev

sudo pkill -SIGHUP dockerd

DRIVER_FN="NVIDIA-Linux-x86_64-440.59.run"
wget "https://s3.amazonaws.com/ossci-linux/nvidia_driver/$DRIVER_FN"
sudo /bin/bash "$DRIVER_FN" -s --no-drm || (sudo cat /var/log/nvidia-installer.log && false)
nvidia-smi