Data filter routines using NumPy
Updated Feb 14, 2021 - Python
Analysis of Tweets Dataset using concepts like Data Curation and Data Processing.
TokenEditor is a web application for manual annotation (or manual review of automatic annotations) of text. Although primarily aimed at reviewing PoS tags and lemmas, it is fully customizable and can support any annotation level.
Web application for text-based data labeling 🏷️
Python package to make URL extraction, generalization, validation, and filtration easy.
Materials from a guest lecture entitled, "Beyond Data Standards," prepared for University of Washington's LIS 546 (Data Curation II) in Spring 2021.
Web scraping, text data collection, and curation for the Maithili language, plus language modeling on the collected data.
Rebalancing chemical reactions
Exercises from the "Diploma in Data Sciences, Machine Learning and its applications", in which I was a mentor.
Canonicalizing data and implementing strategies for ensuring equivalence
Data Curation, Winter 2021
Code I wrote for the paper "Global determinants of freshwater and marine fish genetic diversity" (Nature Communications, 2020)
This program discovers equivalence links (owl:sameAs) for a given set of URIs dynamically and online, using SPARQL queries.
Code and data for "Target-oriented Proactive Dialogue Systems with Personalization: Problem Formulation and Dataset Curation" (EMNLP 2023)
R script for renaming GenBank sequences, filling in missing molecular marker data, and concatenating sequences
Repository for the collection, management, and versioning of the GCIS data management conventions.
For this human-centered data science project, I analyzed data on the gender characteristics of superheroes and villains to determine the ratio of female to male characters appearing in comic books, using Matplotlib for the analysis.