Word-2-Word Level Transcription and Forced Alignment Lip Sync Ultra Pro Max 2D Avatar

Let the code introduce itself...

final_output_mp4.mp4

Welcome to the repository for the "Word-2-Word Level Transcription and Forced Alignment Lip Sync Ultra Pro Max 2D Avatar" project! This Python script turns an audio file into a 2D avatar video whose images change with the loudness of the speech, so the avatar appears to speak in sync with the audio.

Overview

This script uses matplotlib, pydub, numpy, OpenCV (cv2), and moviepy, together with the standard-library modules glob, os, and pathlib, to perform the following steps:

Decibel Calculation: The script computes decibel levels from the input audio file using the root mean square (RMS) of the samples.
Processing Decibel Levels: It then cleans the decibel levels by handling infinite or NaN values, and reports the average and maximum levels.
Image Assignment: Each processed decibel level is mapped to an image according to the decibel range it falls in, so the avatar changes with the loudness of the audio.
Video Creation: The assigned images are written out as both AVI and MP4 video files.
Final Video: The generated video is combined with the original audio to produce the final synchronized video.
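
The first two steps above can be sketched as follows. This is a minimal illustration, not the code in main.py: the function names `rms_decibels` and `clean_decibels`, the split into fixed-size chunks, and the choice to replace non-finite values with the lowest finite level are all assumptions. It uses only the standard library so that it runs without pydub or numpy.

```python
import math

def rms_decibels(samples, chunk_size):
    """Return one decibel value per chunk of raw samples, as 20 * log10(RMS)."""
    levels = []
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        rms = math.sqrt(sum(s * s for s in chunk) / len(chunk))
        # A silent chunk has RMS 0, whose logarithm is minus infinity.
        levels.append(20 * math.log10(rms) if rms > 0 else float("-inf"))
    return levels

def clean_decibels(levels):
    """Replace infinite or NaN values with the lowest finite level (assumed policy)."""
    finite = [v for v in levels if math.isfinite(v)]
    floor = min(finite) if finite else 0.0
    return [v if math.isfinite(v) else floor for v in levels]

# Hypothetical samples: a silent chunk, a quiet chunk, and a loud chunk.
samples = [0, 0, 0, 0, 10, -10, 10, -10, 1000, -1000, 1000, -1000]
raw = rms_decibels(samples, chunk_size=4)
cleaned = clean_decibels(raw)
average_db = sum(cleaned) / len(cleaned)
max_db = max(cleaned)
```

The silent chunk is the reason the cleaning step exists: its level is minus infinity, which would otherwise make the average meaningless.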

How to Use

Clone the Repository:

```
git clone https://github.com/Eeman1113/word-2-word-level-transcription-and-forced-alignment-lip-sync-ultra-pro-max-2d-avatar.git
```

Install the Required Libraries:

```
pip install matplotlib pydub numpy opencv-python moviepy
```

Run the Script:
```
python main.py
```
Replace main.py with the actual name of your Python script.

Customize the Input Audio File Path:

Edit the call to main at the bottom of the script so that it points to your audio file:

```
if __name__ == "__main__":
    main("/path/to/your/audio/file.m4a")
    print("Finished!")
```
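
If editing the source for every new file is inconvenient, the path could instead be read from the command line. The following is a sketch of that alternative, not how the repository's script works: `main` here is a stand-in for the script's own `main` function, and `parse_args`, `run`, and the `audio_path` argument name are assumptions.

```python
import argparse

def main(audio_path):
    """Stand-in for the script's main(); the real one builds the video."""
    return f"Processing {audio_path}"

def parse_args(argv=None):
    """Parse the audio path from argv (defaults to the real command line)."""
    parser = argparse.ArgumentParser(description="Create a lip-synced avatar video.")
    parser.add_argument("audio_path", help="path to the input audio file")
    return parser.parse_args(argv)

def run(argv=None):
    """Parse arguments and hand the path to main()."""
    args = parse_args(argv)
    return main(args.audio_path)

# In the script, run() would be called from the
# `if __name__ == "__main__":` block in place of the hard-coded path.
```

With this in place, the script would be started as `python main.py /path/to/your/audio/file.m4a`.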

Results

The script generates both AVI and MP4 video files, along with a final synchronized video. Processed decibel levels and assigned images are printed for analysis.

Feel free to experiment with different audio files and customize the image assignment logic to suit your creative needs.
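
One way to make the image assignment logic easy to customize is to keep the decibel ranges in a single table. The sketch below is illustrative only: the threshold values, the image file names, and the `assign_image` helper are hypothetical and are not taken from main.py.

```python
import bisect

# Hypothetical range boundaries (in dB) and the image used for each range.
# There is one more image than boundary: below 20, 20 to 40, 40 to 60, 60 and up.
THRESHOLDS = [20.0, 40.0, 60.0]
IMAGES = ["closed.png", "slightly_open.png", "open.png", "wide_open.png"]

def assign_image(decibel):
    """Return the image for a decibel level; a value on a boundary goes to the higher range."""
    return IMAGES[bisect.bisect_right(THRESHOLDS, decibel)]

def assign_images(levels):
    """Map a list of processed decibel levels to a list of image names."""
    return [assign_image(v) for v in levels]

frames = assign_images([5.0, 20.0, 45.5, 72.0])
```

Changing the avatar's behavior then only requires editing `THRESHOLDS` and `IMAGES`, as long as the image list stays one entry longer than the threshold list.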

Embark on a journey into the realm of audio visualization with the "Word-2-Word Level Transcription and Forced Alignment Lip Sync Ultra Pro Max 2D Avatar" project!
