Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.
Updated Nov 1, 2024 - HTML
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
Write beautifully typeset academic texts with distraction-free Markdown and Pandoc.
Ruby gem that converts an HTML page/document into a Microsoft Word `.doc` file
This repository contains the documentation of the Syncfusion file format .NET libraries, which are used to create, read, edit, and convert PDF, Excel, Word, and PPTX documents.
Mahayana Buddhist Sutras with Pinyin in HTML, Plain Text and PDF Format
Automatic translator from word to html / excel to html
📃 A GUI-based docx to html parser. Useful for ripping out inline styles of docx files.
ODT to DOCX is an online tool that converts ODT (OpenDocument Text) files into DOCX (Microsoft Word Open XML Format Document) files.
This demo illustrates how multiple users can edit a Word document sequentially using the Syncfusion Word processor component (Document Editor) in a web application.
More than a document converter
This repository contains the documentation of the Syncfusion file format libraries for Java, which are used to create, read, and edit Word documents.