- MIT
- Cambridge
- Time zone: UTC -04:00
- https://sustcsonglin.github.io/
- @SonglinYang4
flash-linear-attention Public
Efficient implementations of state-of-the-art linear attention models in Pytorch and Triton
TN-PCFG Public
Source code for the NAACL 2021 paper "PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols" and the ACL 2021 main conference paper "Neural Bilexicalized PCFG Induction"
transformers_ssm_copy Public
Forked from sjelassi/transformers_ssm_copy
zoology Public
Forked from HazyResearch/zoology
Understand and test language model architectures on synthetic tasks.
mamba.py Public
Forked from alxndrTL/mamba.py
An efficient Mamba implementation in PyTorch and MLX.
Academic-project-page-template Public template
Forked from eliahuhorwitz/Academic-project-page-template
A project page template for academic papers. Demo at https://eliahuhorwitz.github.io/Academic-project-page-template/
JavaScript · Updated Jan 22, 2024
hyena-dna Public
Forked from HazyResearch/hyena-dna
Official implementation for HyenaDNA, a long-range genomic foundation model built with Hyena
Assembly · Apache License 2.0 · Updated Jan 20, 2024
lit-gpt Public
Forked from Lightning-AI/litgpt
Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-l…
TinyLlama Public
Forked from jzhang38/TinyLlama
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
nanokitchen Public
Forked from proger/nanokitchen
Parallel Associative Scan for Language Models
cutlass-kernels Public
Forked from ColfaxResearch/cutlass-kernels
Cuda · MIT License · Updated Dec 20, 2023
FlagAttention Public
Forked from FlagOpen/FlagAttention
A collection of memory efficient attention operators implemented in the Triton language.
streaming-llm Public
Forked from mit-han-lab/streaming-llm
Efficient Streaming Language Models with Attention Sinks
stack-attention Public
Forked from bdusell/stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
Python · Updated Oct 4, 2023
safari Public
Forked from HazyResearch/safari
Convolutions for Sequence Modeling
disco-pointer Public
Official implementation of the ACL 2023 paper "Don't Parse, Choose Spans! Continuous and Discontinuous Constituency Parsing via Autoregressive Span Selection"
flash-linear-rnn Public
Implementations of various linear RNN layers using PyTorch and Triton
s5-pytorch Public
Forked from i404788/s5-pytorch
Pytorch implementation of Simplified Structured State-Spaces for Sequence Modeling (S5)
Python · Mozilla Public License 2.0 · Updated Jun 25, 2023
SGEMM_CUDA Public
Forked from siboehm/SGEMM_CUDA
Fast CUDA matrix multiplication from scratch
Cuda · Updated Jun 13, 2023
sustcsonglin_old.github.io Public
Forked from imfing/vuepress-homepage
📄 Elegant & friendly homepage (bio, tech portfolio, resume, doc...) template with Markdown and VuePress
BeamTreeRecursiveCells Public
Forked from JRC1995/BeamTreeRecursiveCells
Python · MIT License · Updated May 27, 2023
state-spaces Public
Forked from state-spaces/s4
Sequence Modeling with Structured State Spaces
Jupyter Notebook · Apache License 2.0 · Updated May 25, 2023