- Carnegie Mellon University
- Pittsburgh, PA, USA
- https://www.cs.cmu.edu/~zhuyund/index.html
Stars
Shared repository for open-sourced projects from the Google AI Language team.
A template for making a Google Chrome Extension, using Twitter Bootstrap 3.
Dense Passage Retriever is a set of tools and models for open-domain Q&A tasks.
Source code for Fine-grained Fact Verification with Kernel Graph Attention Network.
Data and models for the SciFact verification task.
XTREME is a benchmark for the evaluation of the cross-lingual generalization ability of pre-trained multilingual models that covers 40 typologically diverse languages and includes nine tasks.
ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, p…
Summarizing and Exploring Tabular Data in Conversational Search (SIGIR '20)
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Code for A Hybrid Retrieval-Generation Neural Conversation Model (CIKM 2019)
Code for the ACL 19 paper "Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards"
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
This is the repo for the paper "Revealing the Importance of Semantic Retrieval for Machine Reading at Scale".
Hand-written implementations of all the algorithms in Li Hang's book "Statistical Learning Methods" (《统计学习方法》)
Ongoing research training transformer models at scale
NCRF++, a Neural Sequence Labeling Toolkit. Easy to use for any sequence labeling task (e.g. NER, POS, Segmentation). It includes character LSTM/CNN, word LSTM/CNN and softmax/CRF components.
💬 Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants
MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
The pkuseg toolkit for multi-domain Chinese word segmentation
An Efficient Lexical Analyzer for Chinese