Method for face detection #47
Hey, I took a look at your code. I think it is fine, but we would need to work a bit to properly integrate it with Cozmo.

First, I think the image processing should be done in a separate thread. This means that the tracking would need its own thread class, with methods to add new frames to the loop as they come from Cozmo. We should also think about whether we want to process every frame or only the ones we have time for (depending on the device, processing a frame might take some time).

We should also have a few classes to define and store the properties of each visible object, and add functionality to store and retrieve positions and other information about visible objects (and maybe about those that went out of the field of view, for a more advanced version). This should of course be independent of the implementation of YOLO or other image recognition methods, so we can easily update the algorithms we use or combine them (have something like YOLO for faces and something else for the cubes).

You could start implementing this slowly and create PRs with each part. I am interested in working on this too (I was planning to work on cube recognition in the past), so I could start implementing some of the basic architecture if you'd like some help, and then we can see how it goes as we add the code.
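The threaded tracker described above could be sketched roughly like this. All names (`FrameTracker`, `add_frame`, etc.) are hypothetical placeholders, not existing pycozmo API; the single-slot queue is one way to implement the "process only the frames we have time for" idea by always keeping just the newest pending frame.

```python
# Hypothetical sketch of a tracker running in its own thread; not pycozmo API.
import queue
import threading


class FrameTracker(threading.Thread):
    """Runs detection in its own thread so camera frames never block the caller."""

    def __init__(self):
        super().__init__(daemon=True)
        # A queue of size 1 means we only ever keep the newest pending frame.
        self._frames = queue.Queue(maxsize=1)
        self._stop_event = threading.Event()
        self.results = []

    def add_frame(self, frame):
        """Called from the camera callback; drops the stale frame if we are busy."""
        try:
            self._frames.put_nowait(frame)
        except queue.Full:
            try:
                self._frames.get_nowait()   # discard the unprocessed frame
            except queue.Empty:
                pass
            self._frames.put_nowait(frame)

    def run(self):
        while not self._stop_event.is_set():
            try:
                frame = self._frames.get(timeout=0.1)
            except queue.Empty:
                continue
            self.results = self._detect(frame)

    def _detect(self, frame):
        # Placeholder for YOLO or any other detection algorithm.
        return []

    def stop(self):
        self._stop_event.set()
```

With this shape, swapping YOLO for another detector only means replacing `_detect`, which keeps the threading concerns separate from the recognition algorithm, as suggested above.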
I think even before talking about threads, we should first split the code into separate classes.

I like your idea of creating different classes to define and store the properties associated with each detected/tracked object. It makes them easier and more flexible to use from a script, so I am all for this. Might I suggest defining an interface, so that other classes can handle those objects in a more generic manner?

At a more abstract level, you seem to imply that YOLO would only be used to detect faces. However, my intention was rather for YOLO to be THE algorithm/mechanism used for all sorts of object detection. At the moment it has only been trained to detect faces and hands, but using the Open Images dataset it is possible to train the network to detect some 600 categories of objects. If I remember correctly, this kind of feature is on the project's roadmap as well.

Finally, I am not against some help implementing this whole thing (that is the reason for opening this issue in the first place), especially if it will serve to detect different categories of objects. But before that, we should define what the final architecture will look like, so that we are not just coding in the dark.
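The interface suggested above could look something like the following sketch. The class names (`TrackedObject`, `Face`, `Cube`) and fields are assumptions for illustration only; the point is that anything handling detections only needs to know the common base type.

```python
# Hypothetical common interface for detected/tracked objects; names are illustrative.
import time
from dataclasses import dataclass, field
from typing import Tuple


@dataclass
class TrackedObject:
    """Base type so client code can handle any detected object generically."""
    label: str
    confidence: float
    bounding_box: Tuple[int, int, int, int]   # (x, y, width, height) in pixels
    last_seen: float = field(default_factory=time.time)

    @property
    def center(self) -> Tuple[float, float]:
        x, y, w, h = self.bounding_box
        return (x + w / 2, y + h / 2)


@dataclass
class Face(TrackedObject):
    """A detected human face; detector-specific fields would be added here."""


@dataclass
class Cube(TrackedObject):
    """One of Cozmo's light cubes."""
    cube_id: int = -1
```

A script could then iterate over a mixed list of `Face` and `Cube` instances and read `label`, `confidence`, or `center` without caring which detector produced them.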
I agree, the code should be split into separate classes for different functionality. I am not sure about what should go where, but I think you have a good idea of how it should be done, so you can go ahead.
I brought up threads thinking about how this should be included in the pycozmo architecture. We currently have independent thread classes managing different tasks, and computer vision (which I believe will be the most computationally expensive task in the package) should definitely not block the main thread. In my opinion, any user of pycozmo should be able to import the package and start playing with the robot without needing to worry about the management of the tasks we are implementing.

If you'd like to implement everything without worrying much about this, that's perfectly fine; you can do it in an example at first, and we'll find a way of including it in the package afterwards.
I'm not sure what you mean by this, but yes, we should be able to access these objects in a generic manner. The way I see it, we would include a list/dict/manager class for these objects in the Client, so they can easily be accessed from there.
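A minimal sketch of such a manager class is below. Attaching it to the Client, the class name, and the expiry behaviour are all assumptions, not existing pycozmo API; the lock matters because the tracker thread writes while user scripts read.

```python
# Hypothetical manager for visible objects; not part of the pycozmo API.
import threading
import time


class VisibleObjectManager:
    """Thread-safe store of the objects currently (or recently) in view."""

    def __init__(self, timeout: float = 2.0):
        self._lock = threading.Lock()
        self._objects = {}        # label -> (object, timestamp)
        self._timeout = timeout   # seconds before an object counts as "gone"

    def update(self, label, obj):
        """Called by the tracker thread whenever a new detection comes in."""
        with self._lock:
            self._objects[label] = (obj, time.monotonic())

    def get(self, label):
        """Called from user scripts; returns None if the object has expired."""
        with self._lock:
            entry = self._objects.get(label)
        if entry is None:
            return None
        obj, ts = entry
        return obj if time.monotonic() - ts < self._timeout else None
```

A user script could then do something like `client.visible_objects.get("face")` without knowing anything about the detection thread, assuming the Client exposes an instance under some such attribute.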
I think we can use YOLO for any item we can train it on. My only concern is how easy it would be to train the model to detect the cubes; I think it will be a pain to take enough pictures in enough environments to train it. There are also other use cases where YOLO might not be the right solution (e.g. detecting lines or points to improve the precision of the robot's localization).
As I said, you can go ahead and implement what you'd like, then create a PR and we'll take it from there. I don't know what the best way of doing this is, and you have a head start, so I think you should go ahead and implement it.
Very well then, I will start implementing all that, and I will let you know where I might need some guidance to integrate my code into the pycozmo library.
Sounds good! You don't need to hurry with this; I think the best part of this library is how much we can learn from Cozmo and from developing robotic solutions. Hopefully you'll have some good fun with it. Let me know if you want some help or have any questions (you can just drop me an email), I'd be glad to help if I can. Also, good luck with the job hunt!
Hello again, I finally took some time off after getting my vaccine and worked on the face detection algorithm for pycozmo. Really sorry that it took a month to get there. Hope you'll enjoy it.
Hello @zayfod,
Instead of simply throwing a pull request your way and seeing if it sticks, I thought I'd open an issue to discuss how you plan on implementing the face detection feature.
I have already cloned your project and tried my hand at such an implementation using OpenCV and YOLOv4-tiny. For the moment, the neural network is trained on the Open Images dataset and can only detect human hands and heads.
The class handling the detection and tracking is implemented here: https://github.com/davinellulinvega/pycozmo/blob/face_detection/pycozmo/multi_tracking.py
And an example of how this might be used within the Pycozmo library can be found here:
https://github.com/davinellulinvega/pycozmo/blob/face_detection/examples/face_detection.py (note that you also need the .cfg, .names, and .weights files for building the network).
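For reference, the general shape of a YOLOv4-tiny pipeline built on OpenCV's DNN module is sketched below. The file names, input size, and thresholds are illustrative, not taken from the linked implementation; the loading part is shown only as comments since it needs the actual `.cfg`/`.weights` files and opencv-python at runtime.

```python
# Loading the network (requires opencv-python and the model files mentioned above):
#
#   import cv2
#   net = cv2.dnn.readNetFromDarknet("yolov4-tiny.cfg", "yolov4-tiny.weights")
#   layer_names = net.getUnconnectedOutLayersNames()
#   blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416),
#                                swapRB=True, crop=False)
#   net.setInput(blob)
#   outputs = net.forward(layer_names)

def parse_detections(outputs, frame_w, frame_h, conf_threshold=0.5):
    """Turn raw Darknet-style output rows into (class_id, confidence, bbox) tuples.

    Each row is [cx, cy, w, h, objectness, score_0, score_1, ...] with
    coordinates normalized to [0, 1]; bbox is (x, y, width, height) in pixels.
    """
    detections = []
    for layer in outputs:
        for row in layer:
            scores = row[5:]
            class_id = max(range(len(scores)), key=lambda i: scores[i])
            confidence = scores[class_id]
            if confidence < conf_threshold:
                continue
            cx, cy, w, h = row[0], row[1], row[2], row[3]
            x = int((cx - w / 2) * frame_w)
            y = int((cy - h / 2) * frame_h)
            detections.append((class_id, confidence,
                               (x, y, int(w * frame_w), int(h * frame_h))))
    return detections
```

In a real pipeline you would typically follow this with non-maximum suppression (e.g. `cv2.dnn.NMSBoxes`) to merge overlapping boxes for the same object.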
As mentioned above, for the moment YOLO has only been trained to detect hands and heads, but it can "easily" be retrained to handle heads, Cozmo's cubes, pets (cats, dogs, birds, ...), and Cozmo's other objects (platform, cube markers, and what else?).
Hope this might be of interest to you. If it is, do let me know and we can talk more about better integrating the multi-tracker (and its accompanying files) within the pycozmo library.