This is by no means the final state of the project, but rather a prototype showcasing what's possible. It is probably one of the most crude, hacked-together things you will ever see. Do not judge the code quality, but rather what the prototype implies.
Unity captures frames from the web camera and sends them to a local Python server running on Flask, which processes the image, predicts the emotion, and sends the prediction back to Unity; Unity then plays the particle effect corresponding to the received emotion. Python does all the "AI" work in this case, with Unity just supplying the webcam data. In the current setup, since both run on the local machine, Python could in fact handle the webcam feed itself as well, letting Unity just display graphics.
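The request/response contract between Unity and the server can be sketched as a minimal Flask app. The route name, JSON shape, and the placeholder "always Neutral" response are illustrative assumptions, not necessarily what `emotion_recognition_server.py` actually does:

```python
# Minimal sketch of the Unity -> Flask -> Unity round trip.
# Route name and JSON shape are assumptions for illustration.
from flask import Flask, request, jsonify

app = Flask(__name__)

# The seven classes emotion models such as atulapra's are trained on.
EMOTIONS = ["Angry", "Disgusted", "Fearful", "Happy", "Neutral", "Sad", "Surprised"]

@app.route("/predict", methods=["POST"])
def predict():
    image_bytes = request.get_data()  # raw JPEG/PNG bytes sent from Unity
    if not image_bytes:
        return jsonify({"error": "empty body"}), 400
    # Here the real server would decode the image with OpenCV, detect the
    # face, resize the crop to the model's input, and run the Keras model.
    prediction = EMOTIONS[4]  # placeholder: this sketch always says "Neutral"
    return jsonify({"emotion": prediction})

if __name__ == "__main__":
    app.run(host="127.0.0.1", port=5000)
```

Unity only needs to POST the encoded frame bytes and parse the small JSON reply, which keeps the C# side trivial.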
There are performance hiccups within Unity related to transforming the webcam image and sending it to the server. The whole round trip also takes some time, since it does not happen on a single thread within Unity but rather asynchronously.
- We would probably refine the Python layer: inter-process communication should be much faster than HTTP, and webcam capture could move to Python entirely.
- A different direction: we should also look into moving the image processing inside Unity (both face detection and emotion recognition), which would require building native C#/Unity libraries for OpenCV and Keras.
- Used Python v. 3.7.7 and pip3 v. 20.1.1.
- Tensorflow v. 2.1.1. Install via `pip3 install tensorflow`. Needed for running pre-trained models.
- Additional libraries needed to run the server (all can be installed with `pip3 install NAME_OF_LIBRARY`):
  - `tensorflow` (training models and running pre-trained models)
  - `opencv-python` (computer vision for detecting faces)
  - `numpy` (operations on arrays, essential for ML)
  - `Flask` (creating the server API)
The Python part was constructed by following certain parts of this tutorial and their corresponding repository. The most useful step-by-step code elaboration of their implementation, along with other useful things, is contained here.
The model the tutorial provided did not work well for me. I searched the web and found a better one, linked on this GitHub by atulapra. The direct link to download the model is here. I have included it in this repository along with the Python source code.
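The prediction step itself is a standard Keras call. The sketch below shows its shape; to stay self-contained it builds a tiny untrained stand-in model instead of loading the real weights, and the 48x48 grayscale input size is an assumption based on common emotion-recognition models such as atulapra's:

```python
# Sketch of the per-frame prediction the server performs. In the real
# server, tensorflow.keras.models.load_model(...) on the downloaded model
# file replaces the untrained stand-in built here.
import numpy as np
from tensorflow.keras import layers, models

EMOTIONS = ["Angry", "Disgusted", "Fearful", "Happy", "Neutral", "Sad", "Surprised"]

def predict_emotion(model, face_gray_48x48):
    # The model expects a batch of 48x48x1 grayscale images scaled to [0, 1].
    batch = face_gray_48x48.reshape(1, 48, 48, 1).astype("float32") / 255.0
    probabilities = model.predict(batch)[0]
    return EMOTIONS[int(np.argmax(probabilities))]

# Untrained stand-in with the assumed input/output shape; swap it for the
# real pre-trained model in practice.
stand_in = models.Sequential([
    layers.Flatten(input_shape=(48, 48, 1)),
    layers.Dense(len(EMOTIONS), activation="softmax"),
])

face = np.zeros((48, 48), dtype=np.uint8)  # placeholder for a detected face crop
print(predict_emotion(stand_in, face))     # prints one of the seven labels
```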
To run the Python server (the essential part of this prototype), navigate to the directory with the .py file and run `python3 emotion_recognition_server.py`. The server will start on http://127.0.0.1:5000/, and Unity will then be able to send requests.
- Used Unity 2019.3.14f1. It is imperative that you use the exact same version when running the Unity project from this repository.
In the MainScene you will find all the VFX used for triggering each specific emotion, as well as all the components performing the logic, attached to GameObjects. The most important component to explore is SendCameraFeed.cs. Webcam imagery is acquired using the WebCamTexture class. It is important to note that the image arrives upside-down (at least in my case); once the Python server receives it, it flips the image back to normal in order to process it correctly.
Run from the Editor (press Play). If the server is running too, the system will soon start reading emotions and playing the corresponding particle effects.
Let me know if something is missing from this explanation.