VIPY: Python Tools for Visual Dataset Transformation
Documentation: https://visym.github.io/vipy
VIPY is a Python package for representation, transformation and visualization of annotated videos and images. Annotations are the ground truth provided by labelers (e.g. object bounding boxes, face identities, temporal activity clips), suitable for training computer vision systems. VIPY provides tools to edit videos and images so that the annotations are transformed along with the pixels. This gives you a clean interface for transforming complex datasets before they enter your computer vision training and testing pipeline.
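To make "annotations are transformed along with the pixels" concrete, here is a minimal, dependency-free sketch of the bookkeeping involved. It is not VIPY's implementation or API: the `Box` and `AnnotatedImage` classes and their `rescale` and `zeropad` methods are hypothetical names chosen for illustration, the pixel data is omitted, and padding is assumed to be symmetric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical helper)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


@dataclass(frozen=True)
class AnnotatedImage:
    """Image size plus one box; pixels are left out to keep the sketch small."""
    width: int
    height: int
    box: Box

    def rescale(self, scale: float) -> "AnnotatedImage":
        # Scaling the pixels by `scale` scales every box coordinate by the same factor.
        b = self.box
        return AnnotatedImage(
            width=round(self.width * scale),
            height=round(self.height * scale),
            box=Box(b.xmin * scale, b.ymin * scale, b.xmax * scale, b.ymax * scale),
        )

    def zeropad(self, padwidth: int, padheight: int) -> "AnnotatedImage":
        # Assumed symmetric padding: the canvas grows on both sides, and the box
        # shifts by the amount added on the left and top.
        b = self.box
        return AnnotatedImage(
            width=self.width + 2 * padwidth,
            height=self.height + 2 * padheight,
            box=Box(b.xmin + padwidth, b.ymin + padheight,
                    b.xmax + padwidth, b.ymax + padheight),
        )


im = AnnotatedImage(width=640, height=480, box=Box(100, 50, 300, 250))
out = im.rescale(0.5).zeropad(padwidth=10, padheight=0)
print(out)
```

Each method returns a new object whose box is already in the coordinates of the edited image, so the caller never has to recompute annotations by hand.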
VIPY provides:
- Representation of videos with labeled activities that can be resized, clipped, rotated, scaled, padded, cropped and resampled
- Representation of images with object bounding boxes that can be manipulated as easily as editing an image
- Clean visualization of annotated images and videos
- Lazy loading of images and videos suitable for distributed processing (e.g. dask, spark)
- Straightforward integration into machine learning toolchains (e.g. torch, numpy)
- Fluent interface for chaining operations on videos and images
- Dataset download, unpack and import (e.g. Charades, AVA, ActivityNet, Kinetics, Moments in Time)
- Minimal dependencies for easy installation in constrained environments (e.g. AWS Lambda, Flask)
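Two of the features above, lazy loading and the fluent interface, combine naturally. The sketch below shows one way such a design can work, under stated assumptions: `LazyImage`, its methods, and the `_decode` stand-in are hypothetical and are not VIPY's classes. The object stores only a source name and a queue of pending operations, so it can be serialized cheaply and sent to a worker, which decodes and transforms the pixels on demand.

```python
import json
from typing import List

Pixels = List[List[int]]


def _decode(source: str) -> Pixels:
    # Stand-in for reading a file or URL: a 4x6 grid holding the values 0..23.
    return [[r * 6 + c for c in range(6)] for r in range(4)]


def _crop(pixels: Pixels, x: int, y: int, w: int, h: int) -> Pixels:
    return [row[x:x + w] for row in pixels[y:y + h]]


def _fliplr(pixels: Pixels) -> Pixels:
    return [row[::-1] for row in pixels]


_OPS = {"crop": _crop, "fliplr": _fliplr}


class LazyImage:
    """Hypothetical lazy image: a source plus queued operations, no pixels yet."""

    def __init__(self, source: str, ops=None):
        self.source = source
        self.ops = list(ops or [])  # queued [name, args] pairs
        self.pixels = None          # filled in by load()

    def isloaded(self) -> bool:
        return self.pixels is not None

    def crop(self, x: int, y: int, w: int, h: int) -> "LazyImage":
        self.ops.append(["crop", [x, y, w, h]])
        return self  # returning self is what makes chaining possible

    def fliplr(self) -> "LazyImage":
        self.ops.append(["fliplr", []])
        return self

    def to_json(self) -> str:
        # Only the recipe is serialized, never the pixel data.
        return json.dumps({"source": self.source, "ops": self.ops})

    @classmethod
    def from_json(cls, text: str) -> "LazyImage":
        d = json.loads(text)
        return cls(d["source"], d["ops"])

    def load(self) -> "LazyImage":
        # Decode once, then replay the queued operations in order.
        pixels = _decode(self.source)
        for name, args in self.ops:
            pixels = _OPS[name](pixels, *args)
        self.pixels = pixels
        return self


im = LazyImage("owl.jpg").crop(1, 1, 3, 2).fliplr()  # nothing decoded yet
message = im.to_json()                               # small payload for a worker
remote = LazyImage.from_json(message).load()         # work happens here
print(remote.pixels)
```

Deferring the decode keeps the object small until a worker actually needs the pixels, which is the property that makes this style suitable for frameworks such as dask or spark.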
Requirements:
- Python 3.6+
- ffmpeg (required for videos)
- numpy, matplotlib, dill, pillow, ffmpeg-python
Install with pip:
pip install vipy
All optional dependencies can be installed together:
pip install pip --upgrade
pip install 'vipy[all]'
You will receive a friendly warning if attempting to use an optional dependency before installation.
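A common way to provide this kind of warning is to wrap the import in a small helper. The following is a sketch of that general pattern, not VIPY's actual code: the function name `optional_import` and its `pipname` parameter are assumptions made for illustration.

```python
import importlib
import warnings
from typing import Optional


def optional_import(package: str, pipname: Optional[str] = None):
    """Return the imported module, or None (with a warning) if it is missing.

    Hypothetical helper: `pipname` covers the case where the name used with
    pip differs from the name used in the import statement.
    """
    try:
        return importlib.import_module(package)
    except ImportError:
        name = pipname or package
        warnings.warn(
            f"Optional package '{package}' is not installed; "
            f"run 'pip install {name}' to enable this feature."
        )
        return None


# A standard-library module is always available, so this import succeeds.
json_module = optional_import("json")

# A made-up package name triggers the warning and yields None.
missing = optional_import("not_a_real_package_xyz", pipname="not-a-real-package-xyz")
```

Callers can then check for `None` and skip or disable the feature instead of failing at import time.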
Quickstart:
import vipy
vipy.image.owl().mindim(512).zeropad(padwidth=150, padheight=0).show()
The tutorials and demos provide useful examples to help you get started.