Lip Sync - Neural Network Rhubarb Replication

This is a Python neural network replication of Rhubarb Lip Sync, designed to enable complex lip movement for real-time chatbots.

This project utilizes a simple neural network trained on pairs of spoken audio and the corresponding Rhubarb Lip Sync output, approximating Rhubarb's lip movements with around 75% accuracy. This level of accuracy is sufficient to convey a sense of realism in most applications.
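The exact architecture isn't documented in this README; purely as a hypothetical sketch of what a simple two-layer classifier over per-frame audio features could look like (all names and sizes below are illustrative placeholders, not the actual model):

```python
import torch
import torch.nn as nn

# Hypothetical sketch only: a two-layer MLP that maps a vector of audio
# features for one frame to a score per mouth shape. The real model,
# feature extraction, and layer sizes live in this repository's code.
class MouthShapeNet(nn.Module):
    def __init__(self, n_features=40, hidden=128, n_shapes=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, hidden),  # layer 1
            nn.ReLU(),
            nn.Linear(hidden, n_shapes),    # layer 2: one logit per mouth shape
        )

    def forward(self, x):
        # x: (batch, n_features) -> (batch, n_shapes) logits
        return self.net(x)
```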

Please note that if your application does not require real-time performance, you may want to use Rhubarb Lip Sync directly.

How to Use

Inference

For now, only inference is supported, as the training code is being heavily refactored. If for some reason you need to train your own model and can't wait, let me know. For inference, use the following command:

python .\inference.py --wav_file_name .\001.wav --model_name model_full_dataset_2layers.pth  
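The output format isn't reproduced here; assuming it mirrors Rhubarb's own TSV export (one cue per line: a start time in seconds, a tab, and a mouth shape letter), the result for a short clip might look something like this (values purely illustrative):

```
0.00	G
0.21	B
0.38	E
0.52	F
```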

Training

If you wish to train your own model, you can do so as well. The program looks for 41 kHz WAV files in the "wavs" directory and matching outputs from Rhubarb (the command-line program) in the "texts" directory; WAVs and TXTs must share the same filename ("001.wav" and "001.txt"). I had better luck not using the extended mouth shapes (except for 'X'), so the training program is set to exclude them; if you wish to include them when training, set the OUTPUT_SIZE variable to 9. If you decide to use the extended mouth shape "X", please find-and-replace it with "G", or with "I" if you are using the extended mouth shapes.
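As a concrete sketch of the data layout described above (helper names here are illustrative, not the training script's actual API, and the parsing assumes Rhubarb's default TSV cue format of one `<seconds><TAB><shape>` line per cue):

```python
from pathlib import Path

# Non-extended shapes A-F, with 'X' remapped to 'G' per the note above;
# widen this list and set OUTPUT_SIZE = 9 to keep the extended shapes.
SHAPES = list("ABCDEFG")
OUTPUT_SIZE = len(SHAPES)

def find_pairs(wav_dir="wavs", text_dir="texts"):
    """Pair each 41 kHz WAV with the Rhubarb output sharing its filename."""
    for wav_path in sorted(Path(wav_dir).glob("*.wav")):
        txt_path = Path(text_dir) / f"{wav_path.stem}.txt"
        if txt_path.exists():
            yield wav_path, txt_path

def parse_rhubarb_txt(txt_path):
    """Read Rhubarb TSV cues into (time_in_seconds, shape_index) tuples."""
    cues = []
    for line in Path(txt_path).read_text().splitlines():
        time_str, shape = line.split()
        shape = shape.replace("X", "G")  # remap 'X' as suggested above
        cues.append((float(time_str), SHAPES.index(shape)))
    return cues
```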

To-Do List

  1. Convert from using .pth to using SafeTensors (see the sketch below).
  2. Add a video example to README.md.
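For the first item, the conversion might be as simple as the following sketch, assuming the .pth file holds a plain state dict (if it stores a full pickled model instead, call .state_dict() on the loaded object first):

```python
import torch
from safetensors.torch import save_file

# Load the existing PyTorch checkpoint; map_location="cpu" avoids
# needing a GPU just to convert the file.
state_dict = torch.load("model_full_dataset_2layers.pth", map_location="cpu")

# Write the same tensors back out in the safetensors format.
save_file(state_dict, "model_full_dataset_2layers.safetensors")
```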

Current Status

The code is currently undergoing refactoring and users may encounter errors, particularly when attempting to train their own models. However, the provided model (model_full_dataset_2layers.pth) should be satisfactory for most purposes. It's been trained on over 80 GB of WAV files from a variety of sources, providing a comprehensive and versatile foundation for lip-syncing tasks.

License and Use

This code is available under the MIT license and is free for anyone to use without obligation. However, I would be delighted if you'd drop me a line to let me know if and how you're using it!

Contributions and Feedback

Please feel free to contribute to this project or provide feedback by opening an issue or pull request on GitHub. Your insights are greatly appreciated!
