Commit

Update README.md
xianshang33 committed Apr 24, 2024
1 parent 2c2e2ec commit 6e057ba
Showing 11 changed files with 561 additions and 401 deletions.
38 changes: 21 additions & 17 deletions CATEGORIES.md

Large diffs are not rendered by default.

383 changes: 191 additions & 192 deletions README.md

Large diffs are not rendered by default.

383 changes: 191 additions & 192 deletions README_en.md

Large diffs are not rendered by default.

20 changes: 20 additions & 0 deletions summary/2024-04/2404.14464.md
@@ -0,0 +1,20 @@
#### Background
- **Background**
The paper notes that multi-hop question answering is a knowledge-intensive, complex problem. Large language models (LLMs) can reason through complex problems step by step via their chain-of-thought (CoT) capability, and retrieval augmentation effectively reduces factual errors caused by outdated or unknown knowledge. Recent work introduces retrieval augmentation into CoT reasoning to tackle multi-hop question answering. However, these chain-style methods have the following problems: 1) retrieved irrelevant paragraphs may mislead the reasoning process; 2) an error in the chain structure can trigger a cascade of further errors.

- **Existing Work**
Existing work offers one-time retrieval, but multi-hop questions require a more fine-grained approach to acquire comprehensive knowledge. Iterative retrieval involves multiple rounds of retrieval, each guided by newly generated sub-questions. Although these methods outperform one-time retrieval, the chain-style process is prone to error accumulation. This highlights the inherent fragility of iterative retrieval and remains a key challenge for reliable knowledge extraction.

#### Core Contributions
- **Proposed a dynamic retrieval framework called TREE OF REVIEWS (TOR)**
- **Challenge 1: Mitigating how irrelevant information misleads the reasoning process**
The paper introduces a tree structure to handle each retrieved paragraph separately, reducing the misleading effect of irrelevant paragraphs on the reasoning path; the diversity of expansions across different reasoning paths also limits the impact of any single reasoning error on the whole process (an illustrative sketch of this loop appears after this summary).

- **Challenge 2: Improving the efficiency and quality of iterative retrieval**
The paper proposes two tree-based retrieval optimization strategies, pruning and effective expansion, which markedly improve retrieval quality and efficiency and offer valuable insights for optimizing iterative retrieval methods.
- ...
#### Implementation and Deployment
The paper reports experiments on three multi-hop question answering datasets; TOR achieves state-of-the-art performance in both retrieval and response generation, demonstrating its effectiveness. The pruning strategy reduces how often unproductive searches are launched, and the effective expansion strategy optimizes query generation to improve the retrieved paragraphs; the paper claims that these strategies yield clear improvements in retrieval quality and efficiency.

#### Summary
The paper proposes a new iterative retrieval framework, TOR, which uses a tree structure to reduce error accumulation and introduces optimization strategies to improve retrieval efficiency and quality. In experiments, TOR achieves state-of-the-art performance on multiple datasets.
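
Below is a minimal, hypothetical Python sketch of the tree-style retrieval loop described above. The `retrieve`, `review`, and `answer` callables stand in for the retriever, the LLM reviewer, and the answer generator; the verdict labels, depth limit, and stopping rules are illustrative assumptions, not the paper's exact design.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    question: str                          # (sub-)question driving this branch
    passage: str = ""                      # paragraph retrieved for this branch
    depth: int = 0
    children: list = field(default_factory=list)

def tree_of_reviews(question, retrieve, review, answer, max_depth=3, top_k=3):
    """Grow a tree in which every retrieved paragraph becomes its own branch,
    prune branches judged irrelevant, and expand promising ones with refined
    sub-questions before answering from the accepted evidence."""
    root = Node(question)
    frontier = [root]
    accepted = []                          # paragraphs the reviewer keeps
    while frontier:
        node = frontier.pop(0)
        if node.depth >= max_depth:
            continue
        for passage in retrieve(node.question, top_k=top_k):
            verdict, next_query = review(question, passage)  # 'accept' | 'reject' | 'expand'
            if verdict == "reject":        # pruning: stop misleading branches early
                continue
            child = Node(next_query or node.question, passage, node.depth + 1)
            node.children.append(child)
            if verdict == "accept":
                accepted.append(passage)
            else:                          # effective expansion: follow the refined query
                frontier.append(child)
    return answer(question, accepted)
```
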
20 changes: 20 additions & 0 deletions summary/2024-04/2404.14469.md
@@ -0,0 +1,20 @@
#### Background
- **Background**
The paper notes that large language models (LLMs) have made remarkable progress in processing long texts, with the key-value (KV) cache playing an important role in that capability. However, as input text length grows, the growth of the KV cache poses challenges for memory and time efficiency.

- **Existing Work**
Although some work extends LLMs to handle longer contexts, addressing context maintenance and attention-mechanism scaling, LLMs still face significant challenges on long text inputs. Specifically, at inference time, per-step decoding speed grows linearly with input length because attention must be computed over all past KVs. In addition, current KV cache compression methods largely ignore compressing the KV cache of the input sequence, which is the main memory-efficiency bottleneck, especially in noisy-context scenarios.

#### Core Contributions
- **Proposed a method named SnapKV**
- **Challenge 1: Effectively shrinking the KV cache for long text inputs**
SnapKV is a novel, fine-tuning-free method that effectively reduces the KV cache size while delivering performance comparable to baseline models in real-world applications. By automatically compressing the KV cache and selecting the important KV positions for each attention head, SnapKV significantly reduces the growing computational burden and memory footprint when processing long input sequences.

- **Challenge 2: Compressing the input without sacrificing accurate generation**
Based on the authors' observation, the model attends to specific prompt features in a consistent pattern during generation. Exploiting this insight, the paper proposes the SnapKV algorithm, which effectively compresses the KV cache of long input sequences without affecting model accuracy (an illustrative sketch of the selection step appears after this summary). SnapKV shows the potential to process up to 380K context tokens on a single GPU with almost no loss in accuracy.

#### Implementation and Deployment
SnapKV markedly reduces the memory and compute overhead of long-text processing, achieving a 3.6x speedup in generation and an 8.2x improvement in memory efficiency over the baseline. Combined with the HuggingFace implementation, SnapKV can process up to 380K context tokens on a single A100-80GB GPU with only minor code changes, and shows only a slight accuracy drop in the Needle-in-a-Haystack test that does not significantly affect model performance. In addition, SnapKV is orthogonal to other acceleration strategies (such as parallel decoding) and can be combined with them to extend its capabilities further. Across inputs of varying lengths, formats, and contents, SnapKV maintains accuracy comparable to a conventional KV cache while improving efficiency, and it is comparable with prior work.

#### Summary
The paper introduces SnapKV, a new approach to the key-value cache problem in large language models. By intelligently compressing the cache and selecting important KV positions, SnapKV effectively improves decoding speed and memory efficiency for long-text processing while maintaining accuracy and significantly reducing computational cost.
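
Below is a minimal PyTorch-style sketch of the per-head selection idea described above, assuming that attention from a small observation window at the end of the prompt is used to keep only the most-attended prefix positions. The window size, pooling kernel, and `keep` budget are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def compress_kv(queries, keys, values, window=32, keep=1024, kernel=7):
    """Keep, per attention head, the prompt positions most attended to by the
    last `window` query tokens, plus the window's own KV entries.

    queries, keys, values: [batch, heads, seq_len, head_dim]
    """
    b, h, seq, d = keys.shape
    prefix = seq - window
    if prefix <= keep:                      # short prompt: nothing to compress
        return keys, values

    # Attention of the observation-window queries over the prefix keys.
    q_win = queries[:, :, prefix:, :]
    scores = torch.einsum("bhwd,bhpd->bhwp", q_win, keys[:, :, :prefix, :]) / d ** 0.5
    votes = scores.softmax(dim=-1).sum(dim=2)             # [b, h, prefix]

    # Pool so that clusters of adjacent important tokens are kept together.
    votes = F.avg_pool1d(votes, kernel_size=kernel, padding=kernel // 2, stride=1)
    idx = votes.topk(keep, dim=-1).indices.sort(dim=-1).values    # [b, h, keep]
    idx = idx.unsqueeze(-1).expand(-1, -1, -1, d)

    k_sel = torch.gather(keys[:, :, :prefix, :], 2, idx)
    v_sel = torch.gather(values[:, :, :prefix, :], 2, idx)
    return (torch.cat([k_sel, keys[:, :, prefix:, :]], dim=2),
            torch.cat([v_sel, values[:, :, prefix:, :]], dim=2))
```
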
20 changes: 20 additions & 0 deletions summary/2024-04/2404.14809.md
@@ -0,0 +1,20 @@
#### Background
- **Background**
The paper introduces graphs as a fundamental data model for representing diverse entities in society and nature and their complex relationships, such as social networks, transportation networks, financial networks, and biomedical systems. Recently, large language models (LLMs) have shown strong generalization in handling a wide range of natural language processing (NLP) and multimodal tasks, answering arbitrary user questions and generating domain-specific content.

- **Existing Work**
Compared with LLMs, graph learning models have limitations, such as the need to be retrained for different graph tasks and limited transferability between graph tasks. LLMs have an advantage in generalizing across graph tasks, including removing the need to train graph learning models and reducing the cost of manual annotation.

#### Core Contributions
- **Presented a comprehensive survey of existing LLM research applied to graph data**
- **Challenge 1: Generalization across graph tasks**
LLMs can address this challenge because they eliminate the need to train graph learning models and reduce the cost of manual annotation.

- **Challenge 2: How to use LLMs to solve graph tasks**
The survey investigates and summarizes the graph analytics tasks that have been addressed with advanced LLMs, and points out open challenges and future research directions.

#### Implementation and Deployment
On the evaluation side, the survey spans LLM applications in graph data analytics, including LLM-based graph query processing, LLM-based graph inference and learning, and graph-LLM-based applications. It contrasts the strengths of LLMs, framed as graph foundation models, with the limitations of graph learning models, and outlines future research directions.

#### Summary
This paper is a survey of research on applying LLMs to graph data; it discusses the advantages of LLMs in generalizing across graph tasks and proposes future research directions for the field.
20 changes: 20 additions & 0 deletions summary/2024-04/2404.15238.md
@@ -0,0 +1,20 @@
#### Background
- **Background**
The paper notes that large language models (LLMs) are becoming increasingly important in a variety of applications, such as recommender systems and customer service. These models often reflect Western-centric viewpoints because they are trained mainly on data embodying those values and behaviors. Such cultural bias can lead to undesirable consequences, for example reinforcing stereotypes, alienating non-Western users, and hindering global deployment. Developing language technologies that are aware of diverse cultures is therefore increasingly important.

- **Existing Work**
Existing research has built cultural knowledge bases to represent culture-related knowledge and norms, but with several limitations: they usually rely on formal knowledge sources such as Wikipedia and online articles, which miss the rich, evolving, long-tail cultural nuances experienced by local communities; they tend to present cultural knowledge in an assertive way, failing to capture how cultural practices and values vary among individuals within the same cultural group; and their evaluation typically relies on classification and question-answering tasks, which differ greatly from how LLMs are deployed in the real world and thus cannot reflect their cultural awareness in practice.

#### Core Contributions
- **Proposed a scalable pipeline for building a large cultural knowledge base from diverse online communities**
- **Challenge 1: Handling large-scale noisy data**
To address the assertiveness of existing cultural knowledge resources and the diversity of cultural descriptions, the work builds a general framework that gathers cultural knowledge from online communities (such as TikTok and Reddit) and constructs, from large-scale self-narratives, a cultural knowledge base, CultureBank, containing 12K cultural descriptors from TikTok and 11K from Reddit. The resulting knowledge base records diverse perspectives on similar cultural practices and computes agreement levels to support inclusive cultural understanding (a small illustrative sketch of this aggregation idea appears after this summary).

- **Challenge 2: Evaluating the cultural awareness of language models**
To evaluate LLMs' cultural awareness more effectively, the work provides grounded cultural scenarios drawn from real-world settings, pairing each cultural descriptor with a relevant scenario (e.g., travel advice). The authors then evaluate advanced LLMs on CultureBank and find room for improvement. They further show that training LLMs on CultureBank improves their performance on downstream culture-related tasks.

#### Implementation and Deployment
CultureBank is an open-source knowledge base built by collecting large numbers of online cultural self-narratives through the framework and used to train and fine-tune LLMs. Evaluation shows that LLMs fine-tuned on CultureBank perform better in zero-shot settings on two culture-related tasks. CultureBank's rich cultural descriptors and scenarios can be used to assess language models' cultural awareness and to point out directions for improvement.

#### Summary
The paper proposes a general pipeline for building cultural knowledge bases and uses it to create CultureBank, a knowledge base of cultural descriptors from TikTok and Reddit. It then uses CultureBank to evaluate LLMs' cultural awareness and to train more culturally aware language models, advancing the cultural awareness of future language technologies.
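
Below is a small, hypothetical Python sketch of the agreement-level idea mentioned above: grouping self-narratives that describe the same cultural practice and recording how widely each descriptor is endorsed. The grouping key and scoring rule are illustrative and are not the paper's actual pipeline.

```python
from collections import defaultdict

def aggregate_descriptors(narratives):
    """narratives: iterable of (cultural_group, descriptor, endorses) tuples,
    where `endorses` is True if the author affirms the practice.
    Returns per-(group, descriptor) mention counts and an agreement level in [0, 1]."""
    counts = defaultdict(lambda: [0, 0])          # [endorsements, total mentions]
    for group, descriptor, endorses in narratives:
        key = (group.strip().lower(), descriptor.strip().lower())
        counts[key][1] += 1
        counts[key][0] += int(endorses)
    return {
        key: {"mentions": total, "agreement": endorsed / total}
        for key, (endorsed, total) in counts.items()
    }

# Example: two of three narratives endorse the same practice -> agreement ~0.67.
bank = aggregate_descriptors([
    ("german", "arrive on time for appointments", True),
    ("german", "arrive on time for appointments", True),
    ("german", "arrive on time for appointments", False),
])
```
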
18 changes: 18 additions & 0 deletions summary_en/2024-04/2404.14464.md
@@ -0,0 +1,18 @@
#### Background
- **Background**
The paper discusses multi-hop question answering, a complex, knowledge-intensive problem. Large Language Models (LLMs), with their Chain of Thought (CoT) capability, can reason through complex questions step by step, while retrieval augmentation helps mitigate factual errors due to outdated or unknown knowledge. However, existing chain-style methods suffer from two issues: retrieved irrelevant paragraphs can mislead reasoning, and errors propagate along the chain.
- **Existing Work**
Existing work in this domain has focused on one-time retrieval methods that fall short of addressing the nuances of multi-hop questions. Iterative retrieval, which runs multiple rounds each guided by newly generated sub-questions, still suffers from the risk of cumulative errors, making it fragile when trying to obtain reliable knowledge.

#### Core Contributions
- **Dynamic retrieval framework called TREE OF REVIEWS (TOR)**
- **Challenge 1: Mitigating the effect of irrelevant paragraphs on reasoning processes**
The paper introduces a tree structure to handle retrieved paragraphs independently, reducing the misleading influence of irrelevant paragraphs on reasoning paths and diminishing the impact of individual reasoning errors on the entire process.
- **Challenge 2: Enhancing the efficiency and quality of iterative retrieval**
The paper presents two tree-based search optimization strategies: pruning, which minimizes unproductive search initiations, and effective expansion, which refines query generation for better retrieval, providing valuable insights into the optimization of iterative retrieval.

#### Implementation and Deployment
Experiments conducted on three different multi-hop question answering datasets demonstrated TOR's state-of-the-art performance in both retrieval and response generation, validating its effectiveness. The proposed pruning strategy reduces unnecessary search initiations, while effective expansion optimizes query generation to improve the quality of retrieved paragraphs. These strategies are claimed to significantly enhance retrieval quality and efficiency.

#### Summary
The paper proposes a novel iterative retrieval framework (TOR) that uses a tree structure to minimize error accumulation and incorporates optimization strategies to improve retrieval efficiency and quality. Experiments show that TOR achieves state-of-the-art performance on several datasets.
22 changes: 22 additions & 0 deletions summary_en/2024-04/2404.14469.md
@@ -0,0 +1,22 @@
#### Background
- **Background**
LLMs have seen significant advancements in processing long texts, with Key-Value (KV) caches playing a crucial role in enhancing their capabilities. However, the increase in KV cache size poses memory and time efficiency challenges as the input length grows.

- **Existing Work**
While some work has been successful in extending LLMs to handle longer contexts by addressing context maintenance and attention mechanism scalability, significant challenges remain in efficiently processing long input contexts. The KV cache in attention calculation becomes a bottleneck during inference, with per-step decoding speed increasing linearly as input length grows. Moreover, current methods primarily optimize the appended KV cache during generation steps, neglecting the pressing issue of compressing the KV cache for input sequences, which is especially challenging under noisy context scenarios.

#### Core Contributions
- **Introduced SnapKV**
- **Challenge 1: Effective reduction of KV cache size for long text inputs**
SnapKV is an innovative and fine-tuning-free approach that efficiently minimizes the KV cache size while maintaining performance comparable to baseline models in real-world applications. By automatically compressing KV caches and selecting important KV positions for each attention head, SnapKV significantly reduces the computational overhead and memory footprint when processing long input sequences.

- **Challenge 2: Compressing inputs without losing precise generation capabilities**
Drawing upon their observations, the authors devised the SnapKV algorithm, which effectively addresses the challenge of compressing the KV caches of long input sequences without compromising the model's accuracy. SnapKV demonstrates the potential to process up to 380K context tokens on a single GPU with minimal accuracy loss in the Needle-in-a-Haystack test.

#### Implementation and Deployment
SnapKV substantially reduces memory and computational overheads for long text processing, achieving a 3.6x increase in generation speed and an 8.2x enhancement in memory efficiency compared to the baseline. When combined with the HuggingFace implementation, SnapKV can handle up to 380K context tokens on a single A100-80GB GPU with minor code changes, exhibiting only negligible accuracy loss in the Needle-in-a-Haystack test. Additionally, SnapKV can be used orthogonally with other acceleration strategies such as parallel decoding, extending its performance capabilities further. Across inputs of varying lengths, formats, and contents, SnapKV maintains accuracy comparable to traditional KV caching while improving efficiency, and it can be benchmarked against previous work.

#### Summary
This paper introduces SnapKV, a novel approach to tackling the Key-Value cache problem in large language models. SnapKV intelligently compresses and selects important KV positions to significantly improve decoding speed and memory efficiency for long text processing, reducing computational costs while maintaining accuracy.
20 changes: 20 additions & 0 deletions summary_en/2024-04/2404.14809.md
@@ -0,0 +1,20 @@
#### Background
- **Background**
The article introduces the concept of graphs as fundamental data models representing various entities and their complex relationships in society and nature, such as social networks, transportation networks, financial networks, and biomedical systems. Recently, large language models (LLMs) have demonstrated significant generalization abilities in handling various natural language processing (NLP) and multi-modal tasks, answering arbitrary questions, and generating domain-specific content.

- **Existing Work**
Compared with LLMs, graph learning models have limitations, such as the need for retraining for different graph tasks and limited transferability between tasks. LLMs, by contrast, have an advantage in generalizing across graph tasks, including eliminating the need to train graph learning models and reducing the cost of manual annotation.

#### Core Contributions
- **Conducted a comprehensive survey of existing LLM research on graph data**
- **Challenge 1: Generalization challenge of graph tasks**
LLMs can address this challenge by eliminating the need for training graph learning models and reducing the cost of manual annotation.

- **Challenge 2: How to use LLMs to solve graph tasks**
The survey delves deeply into and summarizes the relevant graph analytics tasks solved by advanced LLM models and points out the existing challenges and future research directions.

#### Implementation and Deployment
The survey thoroughly evaluates the applications of LLM in graph data analytics, including LLM-based graph query processing, LLM-based graph inference and learning, and graph-LLM-based applications. It compares the superior performance of LLMs within the concept of graph foundational models with the limitations of graph learning models and outlines future research directions.

#### Summary
This work is a survey that investigates research on LLMs applied to graph data, discusses the advantages of LLMs in providing general solutions for graph tasks, and suggests future directions for research in this field.