Baseline

Important update (19 May 2019): a problem with the annotations (export) was found, and the corrected labels have been uploaded. This affects:

  • validation/eval_dcase2018.csv
  • validation/validation.csv

The results table at the bottom has been updated accordingly.

Minor updates are described in the previous folder. To add new functionalities, do not hesitate to open a pull request.

If you use the baseline, please cite this paper.

System description

The baseline system is based on the idea of the best submission to DCASE 2018 task 4 [1]. The author provided his system code, and most of the hyper-parameters of this year's baseline are close to those defined by last year's winner. However, the network architecture itself remains similar to last year's baseline, so it is much simpler than the networks used by Lu JiaKai [1]. The parameters of the CRNN model can be found in config.py.
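
For readers who want a concrete picture of the architecture, below is a minimal, illustrative CRNN sketch in PyTorch. The layer sizes, pooling factors and attention pooling shown here are assumptions for illustration only; the actual hyper-parameters are the ones defined in config.py.

```python
import torch
import torch.nn as nn

class SimpleCRNN(nn.Module):
    """Illustrative CRNN: CNN feature extractor + bidirectional GRU + per-frame
    sigmoid classifier, with attention pooling for the clip-level (weak) output.
    Layer sizes are placeholders, not the baseline's actual configuration."""

    def __init__(self, n_mels=64, n_classes=10, rnn_hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.AvgPool2d((2, 2)),                 # pool time and frequency
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.AvgPool2d((2, 2)),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.BatchNorm2d(64), nn.ReLU(),
            nn.AvgPool2d((2, n_mels // 4)),       # 8x pooling in time (864 -> 108 frames)
        )
        self.rnn = nn.GRU(64, rnn_hidden, bidirectional=True, batch_first=True)
        self.frame_head = nn.Linear(2 * rnn_hidden, n_classes)  # strong (frame-level) output
        self.attn_head = nn.Linear(2 * rnn_hidden, n_classes)   # attention weights over time

    def forward(self, x):
        # x: (batch, 1, frames, n_mels)
        feats = self.cnn(x)                          # (batch, ch, frames / 8, 1)
        feats = feats.squeeze(-1).permute(0, 2, 1)   # (batch, frames / 8, ch)
        feats, _ = self.rnn(feats)
        strong = torch.sigmoid(self.frame_head(feats))       # frame-level probabilities
        attn = torch.softmax(self.attn_head(feats), dim=1)   # attention over time
        weak = (strong * attn).sum(dim=1)                     # clip-level probabilities
        return strong, weak
```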

The baseline uses a mean-teacher model composed of two networks, both instances of the same CRNN. The implementation of the mean-teacher model is based on Tarvainen & Valpola from Curious AI [2]. The model is trained as follows (a code sketch follows the list):

  • The student model is trained on synthetic and weakly labeled data. The classification cost is computed at frame level on synthetic data and at clip level on weakly labeled data.
  • The teacher model is not trained; its weights are a moving average of the student model's weights (updated at each epoch).
  • The inputs of the teacher model are the inputs of the student model plus some Gaussian noise.
  • A cost for consistency between teacher and student model is applied (for weak and strong predictions).

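Below is a minimal sketch of one training step following these rules, assuming a binary cross-entropy classification cost and a mean-squared-error consistency cost. The helper names, noise scale, loss weighting and the placement of the teacher update are placeholders for illustration, not the baseline's exact implementation (which also ramps up the consistency weight over training).

```python
import torch
import torch.nn.functional as F

def update_teacher(student, teacher, alpha=0.999):
    """Exponential moving average of the student weights; the teacher is never
    trained by gradient descent."""
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.data.mul_(alpha).add_(s_param.data, alpha=1 - alpha)

def train_step(student, teacher, optimizer, batch, consistency_weight=1.0):
    # strong_mask / weak_mask select the synthetic and weakly labeled parts of the batch.
    x, strong_labels, weak_labels, strong_mask, weak_mask = batch

    strong_s, weak_s = student(x)
    with torch.no_grad():                       # teacher gets a noisy copy of the input
        strong_t, weak_t = teacher(x + 0.1 * torch.randn_like(x))

    # Classification cost: frame level on synthetic data, clip level on weak data.
    class_loss = (
        F.binary_cross_entropy(strong_s[strong_mask], strong_labels[strong_mask])
        + F.binary_cross_entropy(weak_s[weak_mask], weak_labels[weak_mask])
    )
    # Consistency cost between student and teacher, on weak and strong predictions.
    consistency_loss = (
        F.mse_loss(strong_s, strong_t.detach()) + F.mse_loss(weak_s, weak_t.detach())
    )

    loss = class_loss + consistency_weight * consistency_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    update_teacher(student, teacher)
    return loss.item()
```
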
The baseline exploits unlabeled, weakly labeled and synthetic data for training and is trained for 100 epochs. Inputs are 864 frames long. The CRNN model pools in time to obtain 108 frames. Post-processing (median filtering over 5 frames) is used to obtain event onsets and offsets for each file. The baseline system includes evaluation of the results using the event-based F-score as a metric.
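
As an illustration of this post-processing step, here is a hedged sketch that binarizes frame-level probabilities with a fixed threshold, smooths them with a 5-frame median filter and reads event onsets/offsets from the resulting activity curves. The threshold and frame hop values are placeholders, not the baseline's exact settings.

```python
import numpy as np
from scipy.ndimage import median_filter

def probabilities_to_events(frame_probs, class_names, threshold=0.5,
                            median_window=5, frame_hop_s=0.09):
    """Turn frame-level probabilities (n_frames, n_classes) into a list of
    events with onset/offset times in seconds."""
    events = []
    binary = frame_probs > threshold
    for c, name in enumerate(class_names):
        smoothed = median_filter(binary[:, c].astype(int), size=median_window)
        # Find rising and falling edges of the smoothed activity curve.
        padded = np.concatenate(([0], smoothed, [0]))
        changes = np.diff(padded)
        onsets = np.where(changes == 1)[0]
        offsets = np.where(changes == -1)[0]
        for on, off in zip(onsets, offsets):
            events.append({
                "event_label": name,
                "onset": on * frame_hop_s,
                "offset": off * frame_hop_s,
            })
    return events
```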

Evaluation

System performance is reported in terms of event-based F-scores with a 200 ms collar on onsets and a 200 ms / 20% of the event length collar on offsets. Additionally, performance in terms of segment-based F-scores on 1 s segments is reported for information. Performance is reported on this year's validation set (Validation 2019) and on the evaluation set from DCASE 2018 task 4.

| F-score metrics (macro averaged) | Public evaluation 2019 (Youtube) | Validation 2019 | Evaluation 2018 |
|----------------------------------|----------------------------------|-----------------|-----------------|
| Event-based                      | 29.0 %                           | 23.7 %          | 20.6 %          |
| Segment-based                    | 58.54 %                          | 55.2 %          | 51.4 %          |
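
The event-based and segment-based scores above are the kind of metrics computed with the sed_eval toolkit. Below is a minimal sketch of such an evaluation using the collars described above; the input format (plain lists of dicts) and the per-file grouping are assumptions for illustration, and the baseline's own evaluation code may wrap the data differently.

```python
import sed_eval

def evaluate(reference_event_list, estimated_event_list, class_names):
    """reference_event_list / estimated_event_list: lists of dicts with keys
    'file', 'event_label', 'onset', 'offset' (times in seconds)."""
    event_based = sed_eval.sound_event.EventBasedMetrics(
        event_label_list=class_names,
        t_collar=0.200,                # 200 ms collar on onsets
        percentage_of_length=0.2,      # 200 ms / 20% of event length on offsets
    )
    segment_based = sed_eval.sound_event.SegmentBasedMetrics(
        event_label_list=class_names,
        time_resolution=1.0,           # 1 s segments
    )
    for f in sorted({e["file"] for e in reference_event_list}):
        ref = [e for e in reference_event_list if e["file"] == f]
        est = [e for e in estimated_event_list if e["file"] == f]
        event_based.evaluate(reference_event_list=ref, estimated_event_list=est)
        segment_based.evaluate(reference_event_list=ref, estimated_event_list=est)
    print(event_based)     # report includes the class-wise (macro) average F-score
    print(segment_based)
```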

Note: the performance might not be exactly reproducible on a GPU-based system. For that reason, you can download the weights of the networks used for the experiments and run TestModel.py --model_path="Path_of_model" to reproduce the results.

References

  • [1] Lu JiaKai. Mean teacher convolution system for DCASE 2018 task 4. DCASE 2018 Challenge technical report, September 2018.
  • [2] A. Tarvainen and H. Valpola. Mean teachers are better role models: weight-averaged consistency targets improve semi-supervised deep learning results. In Advances in Neural Information Processing Systems, pp. 1195-1204, 2017.