ISSAI
Popular repositories
- Kazakh_TTS Public
An expanded version of the previously released Kazakh text-to-speech (KazakhTTS) synthesis corpus. In KazakhTTS2, the overall size has increased from 93 hours to 271 hours, the number of speakers h…
- SpeakingFaces Public
A large-scale, publicly available visual-thermal-audio dataset designed to encourage research in the general areas of user authentication, facial recognition, speech recognition, and human-computer …
- ISSAI_SAIDA_Kazakh_ASR Public
The first industrial-scale open-source Kazakh speech corpus. The KSC2 corpus subsumes two previously introduced corpora, KSC and KazakhTTS2, and supplements them with additional data from other sources. KSC2 …
- thermal-facial-landmarks-detection Public
SF-TL54: Thermal Facial Landmark Dataset with Visual Pairs.
Repositories
- city-sustainability-indexes Public
This repo contains code and models for detecting city sustainability indexes
- HPE-depth-fisheye Public
This project used synthetic data generated with NVIDIA Omniverse to train a camera-view-invariant, multi-pose human pose estimation (HPE) model for depth and fisheye cameras.
- Enhancing-Ambient-Assisted-Living-with-Multi-Modal-Vision-and-Language-Models Public
This project aims to detect abnormal behaviour and emergency cases using a vision-language model (VLM), a large language model (LLM), a human detection model, and text-to-speech (TTS) and speech-to-text (STT) models. The framework can detect subtle signs of an emergency and actively interact with the user to make an accurate decision.