
PascalSun's starred repositories

An open-source RAG-based tool for chatting with your documents.

Python 778 45 Updated Aug 26, 2024

Ingest, parse, and optimize any data format ➡️ from documents to multimedia ➡️ for enhanced compatibility with GenAI frameworks

Python 4,794 403 Updated Aug 3, 2024

A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.

Python 10,070 738 Updated Aug 26, 2024

Rapidly build AI apps in Python

Python 5,139 244 Updated Aug 27, 2024

React component for 2D, 3D, VR and AR force directed graphs

HTML 2,130 271 Updated May 19, 2024

Official Repository of ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning

Python 202 18 Updated Jun 3, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Models

Jupyter Notebook 85 3 Updated Aug 27, 2024

VIINA: Violent Incident Information from News Articles on the 2022 Russian Invasion of Ukraine

247 21 Updated Aug 26, 2024

Uses large AI models to automatically add commentary to videos and edit them with a single click.

Python 634 69 Updated Aug 27, 2024

Speech To Speech: an effort for an open-sourced and modular GPT4-o

Python 2,351 242 Updated Aug 27, 2024

The scripts for training Detectron2-based Layout Models on popular layout analysis datasets

Python 198 54 Updated Sep 26, 2023
Python 76 12 Updated Aug 5, 2024

MPB (Miner-PDF-Benchmark) is an end-to-end PDF document comprehension evaluation suite designed for large-scale model data scenarios.

Python 9 4 Updated Aug 2, 2024

Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization

JavaScript 2,200 181 Updated Aug 16, 2024

Fast, accurate, lightweight Python library for generating state-of-the-art embeddings

Python 1,319 96 Updated Aug 23, 2024

MULTITQ is a large-scale dataset featuring ample relevant facts and multiple temporal granularities.

Python 13 3 Updated Mar 11, 2024

More Accurate Question Answering on Freebase

Python 107 36 Updated Jul 6, 2023

Tools for state of the art Knowledge Base Completion.

Python 8 2 Updated Mar 14, 2021

https://kg-beyond-triple.github.io/

JavaScript 3 Updated Oct 22, 2023

[Paper List] Papers integrating knowledge graphs (KGs) and large language models (LLMs)

1,254 98 Updated Aug 27, 2024

Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.

141 2 Updated Aug 7, 2024

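Li Bai 👤 was an outstanding poet of the Tang dynasty whose poetry holds an important place in the history of Chinese literature. In recent years, with the rapid development of digital technology and artificial intelligence, the ways in which traditional culture is popularized and promoted have also been changing. Although research on Li Bai's poetry in China and abroad is already extensive, gaps remain in digital and AI-assisted popularization. This project therefore aims to build a Li Bai knowledge graph and combine it with large models to train a specialized AI agent that promotes Li Bai's cultural legacy through a generative dialogue application.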

Python 1,090 124 Updated Jul 12, 2024

Curated tutorials and resources for Large Language Models, Text2SQL, Text2DSL, Text2API, Text2Vis, and more.

1,545 118 Updated Aug 19, 2024

A Comprehensive Toolkit for High-Quality PDF Content Extraction

Python 4,412 286 Updated Aug 27, 2024

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python 6,089 308 Updated Aug 27, 2024

Integrated set of Django applications addressing authentication, registration, account management as well as 3rd party (social) account authentication.

Python 9,324 3,005 Updated Aug 23, 2024

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Wo…

Python 3,718 252 Updated Aug 27, 2024

The Abstraction and Reasoning Corpus

JavaScript 3,241 535 Updated Aug 4, 2024

CORD: A Consolidated Receipt Dataset for Post-OCR Parsing

384 38 Updated Jul 20, 2022

Multilingual Voice Understanding Model

Python 2,271 212 Updated Aug 2, 2024