mzmmoazam/kashmiri_dataset

Kashmiri Dataset

This repository contains the Kashmiri dataset and the tool used to collect it.

Data folders

  • downloaded_content/

    This folder contains word pronunciations, PDFs, docs and HTML files that contain Kashmiri text.

  • csv_files/

    This folder contains CSV files with data from Kashmiri dictionaries and text from websites.

Note: Find the zip files for these folders in compressed_data/

Some of the zip files may be split into multiple parts. You can rebuild the actual zip file by running the following command:

cat zip_filename.zip.part* > zip_filename.zip
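
If several archives are split, the same command can be wrapped in a small loop. The sketch below is only an illustration: the function name `reassemble_parts` is hypothetical, and it assumes every part is named `<name>.zip.part*` and that the shell's sorted glob order matches the original part order.

```shell
# Hypothetical helper: rebuild every split archive found in a directory.
# Assumption: parts are named <name>.zip.part* and sort in the correct order.
reassemble_parts() {
    dir="${1:-compressed_data}"
    for part in "$dir"/*.zip.part*; do
        [ -e "$part" ] || continue        # glob matched nothing
        target="${part%.part*}"           # strip the trailing .part suffix
        [ -e "$target" ] && continue      # this archive is already rebuilt
        cat "$target".part* > "$target"   # same command as above, per archive
    done
}
```

Running `reassemble_parts compressed_data` from the repository root would then rebuild each zip file next to its parts, skipping any archive that already exists.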


To learn how the tool works, click here.

Last but not least, I would like to thank the websites from which I collected this data.


About

Data and tool to fetch Kashmiri text
