Organizations

@Chainerator @SSU-ALOE

AI-data warehouse to enrich, transform and analyze data from cloud storages

Python 938 55 Updated Nov 1, 2024

Collection of training data management explorations for large language models

278 28 Updated Aug 2, 2024

The RedPajama-Data repository contains code for preparing large datasets for training large language models.

Python 4,559 350 Updated Oct 17, 2024

The Universe of Evaluation. All about the evaluation for LLMs.

Python 213 21 Updated Jul 9, 2024

The Universe of Data. All about data, data science, and data engineering

Python 516 52 Updated Jul 18, 2024

A natural language interface for computers

Python 54,701 4,782 Updated Oct 31, 2024

On the Hidden Mystery of OCR in Large Multimodal Models (OCRBench)

Python 463 29 Updated Oct 16, 2024

KoAlpaca: An open-source language model that understands Korean instructions

Jupyter Notebook 1,542 237 Updated Oct 25, 2024

Scene Text Recognition (STR) methods trained with fewer real labels (CVPR 2021)

Jupyter Notebook 174 27 Updated Dec 23, 2023

ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting

Python 78 5 Updated Feb 11, 2023

Augmentation pipeline for rendering synthetic paper printing, faxing, scanning and copy machine processes

Python 348 46 Updated Sep 17, 2024

Scene Text Recognition with Permuted Autoregressive Sequence Models (ECCV 2022)

Python 577 127 Updated May 29, 2024

This dataset contains re-annotations of 4 popular Latin/English scene text recognition datasets.

49 9 Updated Mar 24, 2020

Scene text recognition

Python 105 14 Updated Jul 7, 2022

Implementation of CRAFT Text Detection

Python 191 47 Updated Jul 6, 2023

Echocardiography/Electrocardiogram (ECG) AI Model Datathon

Python 1 Updated Dec 2, 2021

A repository intended to help machine learning beginners and people preparing study groups.

Jupyter Notebook 2,626 848 Updated Apr 5, 2024

👩‍💻👨‍💻 AI engineer technical interview study group (⭐️ 1k+)

1,852 450 Updated Oct 12, 2024

scikit-learn cross validators for iterative stratification of multilabel data

Python 850 75 Updated Oct 12, 2024

A Python implementation of the Rapid Automatic Keyword Extraction (RAKE) algorithm

Python 974 594 Updated Sep 4, 2020

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 134,291 26,850 Updated Nov 1, 2024

This dataset code generates mathematical question and answer pairs, from a range of question types at roughly school-level difficulty.

Python 1,795 249 Updated Jul 24, 2024

Kex is a python library for unsupervised keyword extraction from a document, providing an easy interface and benchmarks on 15 public datasets.

Python 53 6 Updated Feb 17, 2022

Our project deals with the trend analysis on the crawled Korea Herald dataset using SRL-BERT and Sentence-BERT.

Jupyter Notebook 2 1 Updated Jun 6, 2021

A library that automatically extracts words/keywords from Korean text using unsupervised learning.

Python 352 57 Updated Apr 13, 2022

Korean BERT pre-trained cased (KoBERT)

Jupyter Notebook 1,296 368 Updated Oct 3, 2024

Minimal keyword extraction with BERT

Python 3,524 348 Updated Jul 16, 2024

Deep Keyphrase Extraction using BERT

Jupyter Notebook 255 71 Updated Feb 21, 2022

An open-source NLP research library, built on PyTorch.

Python 11,754 2,251 Updated Nov 22, 2022

A repository that organizes the major research topics, must-read papers, and available models and data in the field of text summarization, together with recommended resources.

332 47 Updated Feb 21, 2022