Multi-Fiber Networks for Video Recognition

This repository contains the code and trained models of:

Yunpeng Chen, Yannis Kalantidis, Jianshu Li, Shuicheng Yan, Jiashi Feng. "Multi-Fiber Networks for Video Recognition" (PDF).

Implementation

We use MXNet @92053bd for image classification and PyTorch 0.4.0a0@a83c240 for video classification.

Normalization

The inputs are substrated by mean RGB = [ 124, 117, 104 ], and then multiplied by 0.0167.

Usage

Train motion from scratch:

python train_kinetics.py

Fine-tune with pre-trained model:

python train_ucf101.py

or

python train_hmdb51.py

Evaluate the trained model:

cd test
# the default setting is to test trained model on ucf-101 (split1)
python evaluate_video.py

Results

Image Recognition (ImageNet-1k)

Single Model, Single Crop Validation Accuracy:

Model	Params	FLOPs	Top-1	Top-5	MXNet Model
ResNet-18 (reproduced)	11.7 M	1.8 G	71.4 %	90.2 %	GoogleDrive
ResNet-18 (MF embedded)	9.6 M	1.6 G	74.3 %	92.1 %	GoogleDrive
MF-Net (N=16)	5.8 M	861 M	74.6 %	92.0 %	GoogleDrive

Video Recognition (UCF-101, HMDB51, Kinetics)

Model	Params	Target Dataset	Top-1
MF-Net (3D)	8.0 M	Kinetics	72.8 %
MF-Net (3D)	8.0 M	UCF-101	96.0 %*
MF-Net (3D)	8.0 M	HMDB51	74.6 %*

* accuracy averaged on slip1, slip2, and slip3.

Trained Models

Model	Target Dataset	PyTorch Model
MF-Net (2D)	ImageNet-1k	GoogleDrive
MF-Net (3D)	Kinetics	GoogleDrive
MF-Net (3D)	UCF-101 (split1)	GoogleDrive
MF-Net (3D)	HMDB51 (split1)	GoogleDrive

Other Resources

ImageNet-1k Trainig/Validation List:

Download link: GoogleDrive

ImageNet-1k category name mapping table:

Download link: GoogleDrive

Kinetics Dataset:

Downloader: GitHub

UCF-101 Dataset:

Download link: Website

HMDB51 Dataset:

Download link: Website

FAQ

Do I need to convert the raw videos to specific format?

Our `dataiter' supports reading from raw videos and can tolerate corrupted videos.

How can I make the training faster?

Decoding frames from compressed videos consumes quite a lot CPU resources which is the bottleneck for the speed. You can try to convert the downloaded videos to other format or reduce the quality of the video. For example:

# convet to sort_edge_length = 360
ffmpeg -y -i ${SRC_VID} -c:v mpeg4 -filter:v "scale=min(iw\,(360*iw)/min(iw\,ih)):-1" -b:v 640k -an ${DST_VID}
# or, convet to sort_edge_length = 256
ffmpeg -y -i ${SRC_VID} -c:v mpeg4 -filter:v "scale=min(iw\,(256*iw)/min(iw\,ih)):-1" -b:v 512k -an ${DST_VID}
# or, convet to sort_edge_length = 160
ffmpeg -y -i ${SRC_VID} -c:v mpeg4 -filter:v "scale=min(iw\,(160*iw)/min(iw\,ih)):-1" -b:v 240k -an ${DST_VID}

Find another computer with better CPU.
The group convolution may not be well optimized.

Citation

If you use our code/model in your work or find it is helpful, please cite the paper:

@inproceedings{chen2018multifiber,
  title={Multi-Fiber networks for Video Recognition},
  author={Chen, Yunpeng and Kalantidis, Yannis and Li, Jianshu and Yan, Shuicheng and Feng, Jiashi},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2018}
}

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
data		data
dataset		dataset
exps		exps
network		network
test		test
train		train
.gitignore		.gitignore
Hosts		Hosts
LICENSE		LICENSE
README.md		README.md
dist.sh		dist.sh
kill_python.sh		kill_python.sh
run_dist.sh		run_dist.sh
run_local.sh		run_local.sh
train_hmdb51.py		train_hmdb51.py
train_kinetics.py		train_kinetics.py
train_model.py		train_model.py
train_ucf101.py		train_ucf101.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Multi-Fiber Networks for Video Recognition

Implementation

Normalization

Usage

Results

Image Recognition (ImageNet-1k)

Video Recognition (UCF-101, HMDB51, Kinetics)

Trained Models

Other Resources

FAQ

Do I need to convert the raw videos to specific format?

How can I make the training faster?

Citation

About

Releases

Packages

Languages

License

zomkey/PyTorch-MFNet

Folders and files

Latest commit

History

Repository files navigation

Multi-Fiber Networks for Video Recognition

Implementation

Normalization

Usage

Results

Image Recognition (ImageNet-1k)

Video Recognition (UCF-101, HMDB51, Kinetics)

Trained Models

Other Resources

FAQ

Do I need to convert the raw videos to specific format?

How can I make the training faster?

Citation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages