
Bee-finder

The bee-finder aims to simplify the evaluation of videos. One can input video files in "H.264" and "mp4" format, which the bee-finder filters for target species (according to the weights provided), and it reconstructs an "mp4" video file from the frames that contain detected target species. The target audience of the bee-finder is users who are not (yet) familiar with tools like the command line, programming etc. The provided weights detect horned mason bees (O. cornuta) in front of a nesting aid. This manual also provides a step-by-step guide on how to train one's own YOLO network.

Installation and set-up

Requirements

  • Windows Operating System or
  • Linux Operating System.

A decent NVIDIA GPU is necessary to run YOLOv5, the CNN on which the bee-finder is based.

Installation of necessary software for training a neural network and the bee-finder toolbox

Anaconda
Python distribution and environment manager, necessary to run YOLO. Download the Anaconda version appropriate for your system.

  • On Windows: Install Anaconda in directory "C:/anaconda3". A detailed guide with images can be found here.
  • On Linux: Type and confirm
    bash Anaconda-latest-Linux-x86_64.sh
    in your Terminal window. A detailed guide can be found here.

ffmpeg
Necessary for video processing, e.g. to convert videos from H.264 format into mp4 format.

  • On Windows:
    1. Download the ffmpeg version which is appropriate for your system.
    2. Open Control Panel > System > Advanced system settings and click on "Advanced".
    3. Click on "Environment variables" and add the path to the ffmpeg "bin" folder to "PATH", e.g. "C:\ffmpeg\bin".

  • On Linux:
    1. Open a Terminal window.
    2. Type and confirm:

    conda install -c conda-forge ffmpeg

    (The pip package named "ffmpeg" does not provide the ffmpeg binary itself, so install it via conda or your system's package manager, e.g. sudo apt install ffmpeg.)
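Once ffmpeg is installed, you can verify it from the same prompt. As an illustration of the kind of conversion the bee-finder performs, wrapping a raw H.264 recording into an mp4 container looks like this (the filenames are placeholders; depending on the recording you may also need to set -framerate):

    ffmpeg -i input.h264 -c copy output.mp4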

Windows Build Tools C++ 2022
Some Python libraries are written natively in C++, for which we need this software.

  • On Windows:
    1. Download Visual Studio Community 2022.
    2. Run "VisualStudioSetup.exe". After the installer has started, select "Desktop development with C++" and click on "Install" with the following parameters:

[Screenshot: Visual Studio installer with "Desktop development with C++" selected]

  • On Linux: No installation necessary; required on Windows only.

CUDA and cuDNN
Developed by NVIDIA, CUDA enables general-purpose computing on GPUs and speeds up various computations. It is required to run deep learning frameworks such as PyTorch (required for YOLOv5). Please note that an NVIDIA account is necessary to download the files.

  • On Windows:
    - Download and install CUDA version 11.7.1 from CUDA Toolkit 11.7 Update 1 (default settings).
    - Download cuDNN version 8.5.0 for CUDA 11.x from CUDA Deep Neural Network (cuDNN). To install cuDNN, copy the following files into the respective CUDA directory (found in the "NVIDIA GPU Computing Toolkit" folder):

    - bin/cudnn64_8.dll
    - include/cudnn.h
    - lib/cudnn.lib

    πŸ’‘ TIPP: If the version listed here cannot be found on the website: all archived versions of cuDNN are found here.

  • On Linux: No installation necessary; required on Windows only.


Visual Studio Code

Code editor that will be used to view, change and develop code. The software can be downloaded here.

  • On Windows: Double-click on the downloaded file and follow the provided installer instructions.
  • On Linux: Double-click on the downloaded file and click "Install".

🠊 Restart your computer after those installations have been completed successfully.

Setup of a virtual environment in Anaconda, installation of necessary packages and the bee-finder

Some libraries, and Python itself, require specific versions, which we therefore install in a virtual environment.

  1. Launch Anaconda Prompt (Anaconda3) from Start Menu.

  2. Create a virtual environment (here called "yolo_env") by typing and running:

    conda create --name yolo_env python=3.8.0
  3. Activate the created Anaconda environment with:

    conda activate yolo_env
  4. Install the necessary packages to handle download and installation via "git" and "pip" by typing and confirming:

    conda install git
    conda install pip
  5. Navigate into the folder you want to have the bee-finder in, e.g. cd 'C:/Users/Max Mustermann/Desktop/'

  6. Install the bee-finder in this folder by typing and confirming:

    git clone https://github.com/seewiese/bee-finder.git
  7. Now navigate into the cloned folder and install all necessary packages for the bee-finder:

    cd bee-finder
    pip install -r requirements_bee_finder.txt

🠊 The bee-finder has been set up successfully.

πŸ’‘ TIPP : The same steps are necessary to install an unmodified YOLOv5 version. It is only necessary to replace the code from step 6 and 7 with β€œgit clone https://github.com/ultralytics/yolov5” and β€œpip install -r requirements.txt”


Data organisation for YOLOv5 training

To ease the training process, the following folder structure is recommended for conducting the YOLOv5 training with the provided scripts. Copy all ".jpg" files of the annotated images into the folder images > Class_0 and all ".txt" files of the annotated images into the folder labels > Class_0.

πŸ’‘ TIPP : If you have more than one class, add a folder in the "images" and "labels" folder, following the same structure (e.g. Class_1).


β”œβ”€β”€ data
  └── images
      β”œβ”€β”€ Class_0
          β”œβ”€β”€ image01.jpg
          β”œβ”€β”€ image02.jpg
          └── ...
      β”œβ”€β”€ test
      β”œβ”€β”€ val
      └── train
   └── labels
      β”œβ”€β”€ Class_0
          β”œβ”€β”€ image01.txt
          β”œβ”€β”€ image02.txt
          └── ...
      β”œβ”€β”€ test
      β”œβ”€β”€ val
      └── train
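If you prefer to create this structure programmatically, a minimal Python sketch such as the following will do it (the root folder name "data" follows the structure above; adjust the path to your setup):

    from pathlib import Path

    root = Path("data")  # adjust to where your dataset lives
    for kind in ("images", "labels"):
        for sub in ("Class_0", "test", "val", "train"):
            # creates e.g. data/images/Class_0, data/labels/train, ...
            (root / kind / sub).mkdir(parents=True, exist_ok=True)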

[Optional] Image augmentation to increase training dataset size

In case you want to increase the training dataset size (i.e. create more images for training), you can conduct image augmentation. We recommend at least 1,500 annotated images per class. The provided code doubles the number of annotated images on each run; the additional images are randomly rotated, flipped, and changed in brightness or size relative to the original files. The annotations are adapted automatically. Please be aware that only images with annotations will be augmented; background-only images will be skipped.

Example: If Class_0 has only 730 annotated images, run the image augmentation once on Class_0 to generate an additional 730 images.
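To illustrate what one of these augmentation steps does to the annotations, here is a minimal sketch of a horizontal flip (not the repository's actual implementation, which builds on the Paperspace augmentation library; the function name is hypothetical). YOLO labels store normalized box centres, so mirroring the image maps the x-centre to 1 - x:

    from PIL import Image, ImageOps

    def flip_image_and_labels(image_path, label_path, out_stem):
        # Mirror the image left-to-right and save a copy.
        ImageOps.mirror(Image.open(image_path)).save(out_stem + ".jpg")
        # YOLO labels: "class x_center y_center width height", normalized to [0, 1].
        flipped = []
        with open(label_path) as f:
            for line in f:
                cls, x, y, w, h = line.split()
                flipped.append(f"{cls} {1.0 - float(x):.6f} {y} {w} {h}")
        with open(out_stem + ".txt", "w") as f:
            f.write("\n".join(flipped))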

  1. Launch Anaconda Prompt (Anaconda3) from the Start Menu and enter your virtual environment with

conda activate yolo_env

  2. Navigate to the directory that contains the file "image_augmentation.py" and the folder "data_aug" by copy-pasting the pathway and using the "cd" command, e.g. cd 'C:/Users/Max Mustermann/Desktop/bee-finder/toolbox/'

  3. The image augmentation script requires the following arguments:
    --class_number: The name of the folder which includes the images and the ".txt" files (see "Data organisation for YOLOv5 training").
    --your_pathway: The pathway to the folder "toolbox". Please copy and paste your pathway to the file.

Now you can conduct image augmentation by typing and confirming:

python image_augmentation.py --class_number="Class_0"  --your_pathway='[system_path_to_folder]/data/'

Your augmented images and annotations will be placed in a folder named after your class number with the suffix "_augmented" (i.e. "Class_0_augmented"). If you are satisfied with the results, move all ".jpg" and ".txt" files into the Class_0 folder and delete the folder Class_0_augmented.

πŸ’‘ TIPP: If you have more than one class, add e.g. "Class_1" folder in each the "images" and "labels" folder and adapt the prompt accordingly (e.g.--class_number="Class_1") to separately augment this class, or combine all classes to one single folder if the dataset is already well-balanced. Please note that images without annotation (i.e. background only images) will not be augmented.


🠊 Image augmentation has been completed successfully.


Sorting data randomly in the train / validation / test folders

  1. Launch Anaconda Prompt (Anaconda3) from the Start Menu and enter your virtual environment with

conda activate yolo_env

  2. Navigate to the directory that contains the file "split_dataset.py" by copy-pasting the pathway, e.g. cd 'C:/Users/Max Mustermann/Desktop/bee-finder/toolbox/'

  3. Now you need to copy all images from the class folder (in our example "Class_0") to the train (70%), val (20%) and test (10%) folders. The "split_dataset.py" script was created for this. The original files stay in the "Class_0" folder; only copies are created in the train, val and test folders (see the sketch after the command below for the underlying logic). The following arguments can be adapted depending on your setup:
    --class_number: The name of the folder which includes the images and the ".txt" files (see "Data organisation for YOLOv5 training"). If you have more than one class (e.g. "Class_1"), adapt the code by replacing "Class_0" with "Class_1".
    --your_pathway: The pathway to the folder containing the folders "images" and "labels" (according to the structure shown in "Data organisation for YOLOv5 training"). Please copy and paste this pathway.

You can apply the function by typing and confirming:
python split_dataset.py --class_number="Class_0"  --your_pathway='[system_path_to_folder]/'
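For reference, the underlying split logic is essentially the following (a minimal sketch under the folder layout from "Data organisation for YOLOv5 training", not the repository's actual code; function and parameter names are illustrative):

    import random
    import shutil
    from pathlib import Path

    def split_dataset(root, class_folder="Class_0", seed=0):
        # Copy images and labels into train (70%), val (20%) and test (10%);
        # the originals stay untouched in the class folder.
        images = sorted((Path(root) / "images" / class_folder).glob("*.jpg"))
        random.Random(seed).shuffle(images)
        n = len(images)
        splits = {"train": images[:int(0.7 * n)],
                  "val": images[int(0.7 * n):int(0.9 * n)],
                  "test": images[int(0.9 * n):]}
        for split, files in splits.items():
            for img in files:
                label = Path(root) / "labels" / class_folder / (img.stem + ".txt")
                shutil.copy(img, Path(root) / "images" / split / img.name)
                if label.exists():
                    shutil.copy(label, Path(root) / "labels" / split / label.name)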

🠊 Your training dataset for YOLOv5 has been successfully split into training / validation / test images.


Train YOLOv5 with custom data

This step is only needed if you want to train your own custom CNN. In case you only want to use the bee-finder, or you already have other "weights" (i.e. a ".pt" file) you want to use, you can jump directly to the "Using the bee-finder" section.


To train YOLOv5, you first need to adapt the file "training_config.yaml".

  1. Launch Anaconda Prompt (Anaconda3) from the Start Menu and enter your virtual environment with

conda activate yolo_env

  2. Navigate to the directory that contains the file "train.py" by copy-pasting the pathway, e.g. cd 'C:/Users/Max Mustermann/Desktop/bee-finder/'

  3. Open the file "training_config.yaml", which you can find in the directory bee-finder/data, with Visual Studio Code. In there, replace the class names with the classes predefined in the annotation process, e.g. "O.cornuta", "bee", "Class_0" etc. Save the file to finalise the configuration for the YOLOv5 bee-finder. An example of the file's structure is shown below.

    πŸ’‘ TIPP: A quick adaptation in the Anaconda prompt is also possible with a standard software called nano (already installed in base anaconda), so you can adapt the code by navigating to the folder which contains the "training_config.yaml" file via Anaconda prompt and type nano training_config.yaml. Change the pathway, press Ctrl + X and confirm the change with Y.

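For orientation, a typical YOLOv5 data configuration has the following shape (the pathways and the class name are placeholders; the file shipped with the bee-finder may differ slightly):

    train: [system_path_to_folder]/data/images/train
    val: [system_path_to_folder]/data/images/val
    test: [system_path_to_folder]/data/images/test

    nc: 1           # number of classes
    names: ['bee']  # class names, in the order of the label indices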


  4. Before you train your own YOLO network with

python train.py --img 1280 --batch 16 --epochs 3 --data "[system_path_to_folder]/training_config.yaml" --weights yolov5x6.pt

adapt the following arguments according to your setup:


--img: defines the size of the annotated images. Please make sure that the images correspond to the selected network (see "--weights").
--batch: specifies how many images are processed at once and depends on the available GPU's memory. Processing 16 images simultaneously is relatively high for a single GPU and may produce an "out of memory" error. If this is the case, lower the number. If you use many GPUs simultaneously, you can increase the number to e.g. 64 or even higher.
--epochs: specifies the total number of passes over all training data in one training cycle. Finding the best performance requires a bit of trial and error. The higher the number, the longer the training time. As such, try to increase the epochs e.g. in steps of 100 and find a well-performing model with the lowest number of epochs.
--weights: Choose a pretrained network from the ultralytics website (https://github.com/ultralytics/yolov5#pretrained-checkpoints) and specify it here. Those weights will be downloaded automatically by the bee-finder. Please note the specifications for image size; these should correspond with "--img" and the actual size of the annotated images.
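Putting the arguments together, an adapted call for a single GPU might look like this (the values shown are examples only):

    python train.py --img 1280 --batch 8 --epochs 300 --data "[system_path_to_folder]/training_config.yaml" --weights yolov5x6.pt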

For every epoch, the box loss, object loss and class loss will be shown. The mAP@0.5:0.95 will also be shown, which increases each epoch while YOLO improves its detection skills by adjusting the weights, until no further improvement is possible. If this plateau is not reached within the epochs you specified for the "train.py" function, repeat the training with an increased number of epochs. All results of the run will be saved in a csv file within bee-finder/runs/train/exp.
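If you want to inspect the learning curve yourself, a minimal sketch like the following reads that csv file with pandas (the column name is taken from YOLOv5 7.0's results.csv and may differ between versions; the headers are padded with spaces, hence the strip):

    import pandas as pd
    import matplotlib.pyplot as plt

    results = pd.read_csv("runs/train/exp/results.csv")
    results.columns = results.columns.str.strip()  # headers are padded with spaces

    plt.plot(results["epoch"], results["metrics/mAP_0.5:0.95"])
    plt.xlabel("epoch")
    plt.ylabel("mAP@0.5:0.95")
    plt.show()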

The best weights ("best.pt") will be saved in the weights folder within that directory.

🠊 YOLOv5 has been trained successfully.


Using the bee-finder

The bee-finder (namely the "yolov5_7.0_modified.py" file) has several parameters which can be adjusted for the specific project. These are:
--path_to_video: the path to the input video
--model_weights: the path to the trained YOLOv5 model weights
--cut: The bee-finder only saves the images in which the target organism is present and merges them into a video. If you want to keep the whole video with the target organism highlighted, set this parameter to "False" (default = "True").
--fps: frames per second used to convert the input video to frames (default value = 1)
--batch_size: number of images processed at a time (default value = 128). The higher this number, the more memory is required.
--cuda_device: When working with several GPUs, you can select the working GPU here. Otherwise, you do not need this argument.

To use the bee-finder, please follow these steps:

  1. Launch Anaconda Prompt (Anaconda3) from the Start Menu and enter your virtual environment with

conda activate yolo_env

  2. Navigate to the directory in which the file "yolov5_7.0_modified.py" is found by copy-pasting the pathway into the prompt, e.g. cd 'C:/Users/Max Mustermann/Desktop/bee-finder/toolbox/'

  3. Use the bee-finder by typing and confirming the following command, adding the parameters described above as needed:


python yolov5_7.0_modified.py --path_to_video="<Path_to_videos>/" --model_weights="[system_path_to_folder]/best.pt" --fps=30

πŸ’‘ TIPP: If you need to extract a timestamp in the generated csv file, you can utilize the image names (i.e. frame number) in the log-file, as it consists of consecutive numbers by calculating (frame number)/(fps of video)=seconds in the video you can replicate a time stamp for each detection.

πŸ’‘ TIPP: If you want to use the bee-finder on videos with 60 fps, an estimate of independent bee flights is calculated by calculating the detections at image margin. By generating the median of how many bees were present on the margin per second, an estimate of independent bee flights in and out of the nesting aid is calculated. Please note that this bee counter only works with 60 fps.

Troubleshooting

We have collected some errors one can run into.


1. General errors

Assertion error: File not found



In this case, the wrong separators were used in the path. Windows can operate with both "/" and "\", but sometimes (and always on Linux) only "/" works; otherwise you run into the depicted error. Please also note that Linux requires a "/" at the beginning of a path whereas Windows does not, e.g. Linux: '/home/Max Mustermann'; Windows: 'C:/Max Mustermann'.
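If you build paths in your own scripts, Python's pathlib sidesteps the separator problem entirely (a general tip, not specific to the bee-finder):

    from pathlib import Path

    # Forward slashes work on Windows and Linux alike; pathlib renders
    # the platform-native form when the path is handed to the system.
    video = Path("C:/Users/Max Mustermann/Desktop") / "video.mp4"
    print(video)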


Set-Location



This error shows that the pathway entered cannot be recognised. It is caused by a folder name which contains a space (" ") and can easily be fixed by wrapping the whole pathway in ' '. The fixed command looks like this: cd 'C:/Users/Max Mustermann/Documents/Random_folder'


2. While training YOLOv5:

torch.cuda.OutOfMemoryError: CUDA out of memory



This error shows when your GPU cannot process as many images simultaneously as you specified (i.e. with "--batch"). In this case, decrease the number (e.g. if --batch=10 produces this error, try --batch=5) until your GPU can handle it. Please be aware that the number of images a GPU can process is not always the same: e.g. a GPU without sufficient cooling can handle fewer images simultaneously than one whose temperature is kept lower.


IndexError: index 2 is out of bounds for axis 0 with size 1



This error shows when there is a problem with the annotated classes. In this case, there were annotations for a class "2" which was not specified in the "training_config.yaml" file. You can fix this either by correcting the "training_config.yaml" file if there is more than one class (see "Train YOLOv5 with custom data") or, as the bee-finder only has one class, by correcting the label files containing class "2" to class "0" (as we only have "bee" as a class), which solved the error here. Before running the training again, make sure to delete all cache files in the "labels" folder (e.g. train.cache), as otherwise YOLOv5 can show the same error again even though the problem has been solved. A helper for fixing label files in bulk is sketched below.
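If many label files are affected, a small sketch like this rewrites a wrong class index in all ".txt" files of a folder (the function name and class numbers are examples; adjust them to your case):

    from pathlib import Path

    def fix_class_index(label_dir, wrong="2", correct="0"):
        # Rewrite the leading class index of every YOLO annotation line.
        for txt in Path(label_dir).glob("*.txt"):
            fixed = []
            for line in txt.read_text().splitlines():
                parts = line.split()
                if parts and parts[0] == wrong:
                    parts[0] = correct
                fixed.append(" ".join(parts))
            txt.write_text("\n".join(fixed))

    # Remember to delete the cache files (e.g. train.cache) afterwards.
    fix_class_index("data/labels/Class_0")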

3. While operating the bee-finder

UnboundLocalError: local variable 'path_to_video_mp4' referenced before assignment


This error is produced when the bee-finder has already been run once and there are still files in the created directories. Please delete all previous output files of the bee-finder for the videos you want to process again (i.e. the directory with the same name as the video file) and run the bee-finder again.

References

Jocher, G. (2020). YOLOv5 by Ultralytics (Version 7.0) [Computer software]. https://doi.org/10.5281/zenodo.3908559


Paperspace (2020). Data Augmentation For Object Detection [Computer software]. https://github.com/Paperspace/DataAugmentationForObjectDetection?ref=blog.paperspace.com

License

The detector is released under the GNU General Public License v3.0 license.
