
Moonwatcher

The evaluation and testing framework for computer vision models

Control performance risks, bias and security issues in AI models


Install Moonwatcher 🌝

pip install moonwatcher

Try the demos

Warning

The demos require wget to be installed on your system.
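If wget is missing, it can usually be installed with your system package manager (for example, apt-get install wget on Debian/Ubuntu or brew install wget on macOS).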

Each demo checks a model's performance on unusual values of brightness, contrast, and saturation in the underlying dataset. To see how to create your own specific test scenarios, check out the Quickstart.

Object detection (the demo will download the val2017 set of COCO and use a subset of it):

python -m moonwatcher.demo_detection

Classification (the demo will download STL-10 as a dataset):

python -m moonwatcher.demo_classification


🏃‍♀️ Quickstart

1. 🧑‍🏫 Slices, Checks and Checksuites

There are three core concepts in this framework (apart from models and datasets): Slices, Checks, and Checksuites.

Slices

A slice is a subset of a dataset. The framework provides several methods to create such subsets for more fine-grained evaluation and testing setups.

Checks

A check defines one specific evaluation and/or testing setup. It specifies the metric used, the dataset or slice to evaluate or test on, and optionally the test comparison. When a check is run on a specific model, it returns the calculated evaluation and, optionally, the testing result (True/False).

Checksuites

A checksuite combines multiple checks into one. It is a suite of checks, as the name suggests.

2. 🤖 Run automated checks

Look into the relevant demo (demo_classification.py or demo_detection.py) to see how to create the MoonwatcherModel and MoonwatcherDataset from your data.

from moonwatcher.check import automated_checking
from moonwatcher.model.model import MoonwatcherModel
from moonwatcher.dataset.dataset import MoonwatcherDataset

# Your model (your_model) and dataset (your_dataset) loading somewhere

# Look into the relevant demo (demo_classification.py or demo_detection.py)
# to see how to create the MoonwatcherModel and MoonwatcherDataset from your data.
mw_model = MoonwatcherModel(
  model=your_model,
  ...
)
mw_dataset = MoonwatcherDataset(
  dataset=your_dataset,
  ...
)

automated_checking(model=mw_model, dataset=mw_dataset)  

3. 👨‍💻 Write custom checks and checksuites

Writing a custom check works like this:

from moonwatcher.check import Check

accuracy_check = Check(
    name="AccuracyCheck",
    dataset_or_slice=mw_dataset,
    metric="Accuracy",
    operator=">",
    value=0.8,
)

# and run it on your model:
check_result = accuracy_check(mw_model)

Tip

You can also slice your dataset and use a slice for the check instead of the whole dataset.
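As a rough illustration, a check on a slice could look like the sketch below. The slice_dataset helper and the bright_image_indices values are hypothetical names used only for this example (the framework's actual slicing API is shown in the demos); only the Check usage mirrors the example above.

# Hypothetical sketch: slice_dataset and bright_image_indices are illustrative,
# not necessarily part of the actual API. A slice is simply a subset of mw_dataset.
bright_image_indices = [0, 5, 42]  # example indices of unusually bright images
bright_slice = slice_dataset(mw_dataset, indices=bright_image_indices)

# A slice can be passed wherever a check accepts a dataset.
bright_accuracy_check = Check(
    name="AccuracyOnBrightImages",
    dataset_or_slice=bright_slice,
    metric="Accuracy",
    operator=">",
    value=0.7,
)
bright_check_result = bright_accuracy_check(mw_model)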

Note

Class/category based checking is not yet supported, but will be part of the next iteration.

Now add another check and combine both into a checksuite:

from moonwatcher.check import Check, CheckSuite

precision_check = Check(
    name="PrecisionCheck",
    dataset_or_slice=mw_dataset,
    metric="Precision",
    operator=">",
    value=0.8,
)

# Combine them into a checksuite
first_checksuite = CheckSuite(
    name="AllChecks", checks=[accuracy_check, precision_check]
)

# and run it on your model:
checksuite_result = first_checksuite(mw_model)

🖥️ Web app

The package can be used on its own; it is open source and always will be. We additionally developed a web app you can use to visualize results in a nice way. To try it out, check out the Web app instructions.

⭐️ Don’t forget to star the project if you want to support open source testing of ML models.

That's it. Have fun! 🌚