Python scripts for extracting textual features from given transcripts.
The extractor is a BERT-based model from the sentence-transformers repository; the pretrained weights are `distiluse-base-multilingual-cased-v2`, introduced in the paper *Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation*. Advantages of this model according to the paper (an aligned feature space): 1) vector spaces are aligned across languages, i.e., identical sentences in different languages are close; 2) vector space properties of the original source language from the teacher model M are adopted and transferred to other languages.
- Python: 3.6+
- PyTorch: 1.7+
- Download `training_data_transcripts` into the root directory:

  ```
  training_data_transcripts/
  ├── animals_transcripts1_train
  ├── ...
  ├── 025157
  │   ├── 025157_animals.srt
  ...
  ```
- Use `process_srt_files` in `preprocess.py` to generate the raw data `raw_data.npy`, which contains the list of chunks. Each chunk contains `text`, `duration`, `talk_type`, and `participant_id`. One can use/modify `ChunksDataset` to load the raw data.
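The chunk list in `raw_data.npy` can also be inspected with plain NumPy; a minimal sketch (the dict keys follow the fields listed above, the example values are made up):

```python
import numpy as np

# A chunk is a dict with the four fields described above (values are illustrative).
chunks = [
    {"text": "hello world", "duration": 2.5,
     "talk_type": "animals", "participant_id": "025157"},
    {"text": "another chunk", "duration": 1.8,
     "talk_type": "animals", "participant_id": "025158"},
]
np.save("raw_data.npy", np.array(chunks, dtype=object))

# allow_pickle=True is required because the array holds Python dicts.
raw = np.load("raw_data.npy", allow_pickle=True)
for chunk in raw:
    print(chunk["participant_id"], chunk["duration"])
```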
- Run `feature_extraction.py` to extract the features and obtain `embeddings.npz`. Each item contains a feature embedding and a participant label. One can use/modify `EmbsDataset` to load the extracted features.
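A sketch of reading the extracted features directly with NumPy (the array names `embeddings`/`labels` and the 512-dim feature size are assumptions; check the actual keys of your `embeddings.npz` via `data.files`):

```python
import numpy as np

# Fabricate a small archive in the assumed layout: one embedding row per item,
# plus a parallel array of participant labels.
embs = np.random.randn(4, 512).astype(np.float32)
labels = np.array(["025157", "025157", "025158", "025158"])
np.savez("embeddings.npz", embeddings=embs, labels=labels)

data = np.load("embeddings.npz")
print(data.files)               # array names stored in the archive
print(data["embeddings"].shape)
```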