Stars
A one-stop, open-source, high-quality data extraction tool that supports PDF, webpage, and multi-format e-book extraction.
aider is AI pair programming in your terminal
Jupyter Notebooks to help you get hands-on with Pinecone vector databases
PySpark test helper methods with beautiful error messages
Your AI second brain. Get answers to your questions, whether they are online or in your own notes. Use online AI models (e.g. gpt4) or private, local LLMs (e.g. llama3). Self-host locally or use our c…
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Keyword Extraction and Analysis Pipeline & Application with KeyBERT and Taipy
Building smart Big Data pipelines with Dask & Taipy (DEMO)
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
Full data pipeline and data engineering project using AWS services (MSK, Kafka, Spark Stream/SQL, Elastic Stack, Iceberg, Glue, Athena, Streamlit, etc.)
Data replication and lineage management mini-project using Azure Databricks
Stock predictor app for NASDAQ stocks served on Streamlit. Data engineering side uses publicly available APIs to curate and form data, data science side offers a myriad of models.
Using Spark Vectorized UDFs and AI tools on stock price data
Integrated end-to-end data project using lambda and data lakehouse architecture to compile financial data
Data lakehouse and lambda architecture mini-project deployed onto non-managed Kubernetes using Kafka and PySpark
Spark structured streaming mini-project for IoT devices using Databricks