This repository contains papers related to all kinds of LLMs.
We hope this collection encourages researchers and helps them advance their excellent work.
Theme | Source | Link | Other |
---|---|---|---|
…… | …… | …… | …… |
Descriptions | …… |
Paper | Source | Link | Other |
---|---|---|---|
A Survey on Multimodal Large Language Models for Autonomous Driving | arXiv:2311.12320 | bilibili | …… |
Descriptions | ……
Retrieval-Augmented Generation for Large Language Models: A Survey | Arxiv2023'Tongji University | …… | …… |
Descriptions | This survey provides a comprehensive overview of Retrieval-Augmented Generation (RAG) for large language models: retrieval mechanisms are integrated with the generative process to enhance performance and knowledge coverage (a minimal retrieve-then-generate sketch follows this table).
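The sketch below illustrates the basic retrieve-then-generate loop the survey is about. It is illustrative only; the placeholder `embed` function, toy corpus, and prompt template are assumptions, not anything from the paper.

```python
# Minimal retrieve-then-generate sketch (illustrative; not from the survey).
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding; a real RAG system would use a trained encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(384)
    return v / np.linalg.norm(v)

corpus = [
    "RAG combines a retriever with a text generator.",
    "LoRA adapts large models with low-rank updates.",
    "Diffusion models denoise samples step by step.",
]
doc_vecs = np.stack([embed(d) for d in corpus])

def retrieve(query: str, k: int = 2) -> list[str]:
    scores = doc_vecs @ embed(query)              # cosine similarity (unit vectors)
    return [corpus[i] for i in np.argsort(-scores)[:k]]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What does RAG combine?"))     # this prompt is then passed to an LLM
```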
Paper | Source | Link | Other |
---|---|---|---|
…… | …… | …… | …… |
Descriptions | …… |
Paper | Source | Link | Other |
---|---|---|---|
Improving Text Embeddings with Large Language Models | Arxiv2024'Microsoft | …… | …… |
ARES: An Automated Evaluation Framework for Retrieval-Augmented Generation Systems | NAACL2024'Stanford University | …… | Code: stanford-futuredata/ARES |
Descriptions | ARES (Automated RAG Evaluation System) scores retrieval-augmented generation systems on context relevance, answer faithfulness, and answer relevance using lightweight LM judges trained on synthetic data plus a small set of human annotations, and it remains accurate across tasks and domain shifts (a hedged, generic scoring sketch follows this table).
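The sketch below only mirrors the general recipe described above, with trivial word-overlap checks standing in for the fine-tuned LM judges. It is not the ARES API (see stanford-futuredata/ARES for the real framework); every name and threshold in it is illustrative.

```python
# Generic "judge each example on a few criteria, then average" sketch (not ARES).
from dataclasses import dataclass

@dataclass
class RagExample:
    question: str
    context: str
    answer: str

def overlap(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa), 1)

# Toy stand-ins; in the real framework these are small LMs trained on synthetic data.
JUDGES = {
    "context_relevance": lambda e: overlap(e.question, e.context) > 0.2,
    "answer_faithfulness": lambda e: overlap(e.answer, e.context) > 0.3,
    "answer_relevance": lambda e: overlap(e.question, e.answer) > 0.2,
}

def score_system(examples: list[RagExample]) -> dict[str, float]:
    return {name: sum(map(judge, examples)) / len(examples)
            for name, judge in JUDGES.items()}

examples = [RagExample(
    question="What does RAG combine?",
    context="RAG combines a retriever with a text generator.",
    answer="RAG combines retrieval with generation.",
)]
print(score_system(examples))
```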
Paper | Source | Link | Other |
---|---|---|---|
…… | …… | …… | …… |
Descriptions | …… |
Paper | Source | Link | Other |
---|---|---|---|
Higher Layers Need More LoRA Experts | Arxiv2024'Northwestern University | …… | …… |
Descriptions | The paper studies how to allocate LoRA (Low-Rank Adaptation) experts across Transformer layers and finds that higher layers benefit from more experts, improving the model's expressive power and adaptability (a layer-wise allocation sketch follows this table).
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression | Arxiv2023'Microsoft | …… | …… |
Descriptions | LongLLMLingua compresses prompts so that large language models (LLMs) process long contexts faster and attend better to the key information, instead of handling the full uncompressed text.
Can AI Assistants Know What They Don't Know? | Arxiv2024'Fudan University | …… | Code: Say-I-Dont-Know |
Descriptions | The paper explores whether AI assistants can recognize what they do not know, builds an "I don't know" (Idk) dataset to teach this, and shows that the aligned assistants give fewer false answers and achieve higher accuracy.
Code Llama: Open Foundation Models for Code | Arxiv2023'Meta AI | bilibili | codellama |
Descriptions | The article introduces Code Llama, a family of large language models for code developed by Meta AI and built on Llama 2; the models offer state-of-the-art performance among open models, support large input contexts, and follow instructions zero-shot on programming tasks.
Are Emergent Abilities of Large Language Models a Mirage? | NIPS2023'Stanford University | bilibili | …… |
Descriptions | The article challenges the notion that large language models (LLMs) exhibit "emergent abilities," suggesting that these abilities may be an artifact of the metrics chosen by researchers rather than inherent properties of the models themselves. Through mathematical modeling, empirical testing, and meta-analysis, the authors demonstrate that alternative metrics or improved statistical methods can eliminate the appearance of emergent abilities, casting doubt on their existence as a fundamental aspect of scaling AI models (a small worked example of the metric effect follows this table).
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models | Arxiv2023'MEGVII Technology | …… | VaryBase |
Descriptions | The article introduces Vary, a method for expanding the visual vocabulary of Large Vision-Language Models (LVLMs) to enhance dense and fine-grained visual perception capabilities for specific visual tasks, such as document-level OCR or chart understanding.
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks | Arxiv2019'UKP Lab | bilibili | sentence-transformers |
Descriptions | The paper introduces Sentence-BERT (SBERT), a modification of the BERT network that employs siamese and triplet network structures to produce semantically meaningful sentence embeddings that can be compared using cosine similarity, significantly improving the efficiency of sentence similarity search and clustering (a short sentence-transformers usage sketch follows this table).
Towards a Unified View of Parameter-Efficient Transfer Learning | ICLR2022'Carnegie Mellon University | …… | unify-parameter-efficient-tuning |
Descriptions | This paper presents a unified framework for understanding and improving various parameter-efficient transfer learning methods by viewing them all as modifications of specific hidden states in pre-trained models. It defines a set of design dimensions that differentiate the methods, and shows experimentally that the framework identifies important design choices in prior work and can instantiate new parameter-efficient tuning methods that are more effective with fewer parameters (a sketch of this unified view follows this table).
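For "Higher Layers Need More LoRA Experts", the sketch below shows the underlying idea of giving deeper layers more LoRA experts behind a router. It is an illustrative mock-up, not the paper's implementation; the class names, dimensions, and the 2-vs-6 expert schedule are assumptions.

```python
import torch
import torch.nn as nn

class LoRAExpert(nn.Module):
    """One low-rank adapter: the update B(A(x)) with a small rank r."""
    def __init__(self, dim: int, r: int = 8):
        super().__init__()
        self.A = nn.Linear(dim, r, bias=False)
        self.B = nn.Linear(r, dim, bias=False)
        nn.init.zeros_(self.B.weight)                  # experts start as a no-op
    def forward(self, x):
        return self.B(self.A(x))

class MoLoRALayer(nn.Module):
    """A frozen base projection plus a routed mixture of LoRA experts."""
    def __init__(self, dim: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.base = nn.Linear(dim, dim)
        for p in self.base.parameters():               # pretrained weights stay frozen
            p.requires_grad_(False)
        self.experts = nn.ModuleList(LoRAExpert(dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)
        self.top_k = min(top_k, num_experts)
    def forward(self, x):
        gate = torch.softmax(self.router(x), dim=-1)
        weights, indices = gate.topk(self.top_k, dim=-1)
        out = self.base(x)
        for slot in range(self.top_k):                 # add the top-k experts' updates
            idx, w = indices[..., slot], weights[..., slot:slot + 1]
            for e, expert in enumerate(self.experts):
                mask = (idx == e).unsqueeze(-1).float()
                out = out + mask * w * expert(x)
        return out

num_layers, dim = 8, 64
# Allocate more experts to higher layers, e.g. 2 in the bottom half and 6 in the top.
experts_per_layer = [2 if i < num_layers // 2 else 6 for i in range(num_layers)]
model = nn.Sequential(*(MoLoRALayer(dim, n) for n in experts_per_layer))
print(model(torch.randn(4, dim)).shape)                # torch.Size([4, 64])
```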
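For "Are Emergent Abilities of Large Language Models a Mirage?", here is a tiny worked example of the metric argument with made-up numbers: per-token accuracy that grows smoothly with scale looks abrupt once it is measured with an all-or-nothing metric such as exact match over a 10-token answer.

```python
# Illustrative numbers only (not from the paper): smooth per-token gains vs. the
# apparently "emergent" jump produced by a nonlinear, all-or-nothing metric.
per_token_accuracy = [0.30, 0.45, 0.60, 0.75, 0.90, 0.97]   # grows smoothly with scale

for p in per_token_accuracy:
    exact_match_10 = p ** 10            # whole 10-token answer must be exactly right
    print(f"per-token={p:.2f}  exact-match(10 tokens)={exact_match_10:.4f}")
# Output: 0.30 -> 0.0000, 0.60 -> 0.0060, 0.90 -> 0.3487, 0.97 -> 0.7374
# The linear metric improves steadily; the nonlinear one stays near zero and then
# shoots up, which is the kind of curve that gets read as an emergent ability.
```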
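For Sentence-BERT, a short usage sketch with the released sentence-transformers library; the checkpoint name `all-MiniLM-L6-v2` is one of the library's standard pretrained models, and running this requires the package installed plus a model download.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")    # a small SBERT-style encoder

sentences = [
    "A man is eating food.",
    "Someone is having a meal.",
    "The stock market fell sharply today.",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Cosine similarity between sentence embeddings: the paraphrase pair scores much
# higher than the unrelated pair, enabling fast similarity search and clustering.
print(util.cos_sim(embeddings[0], embeddings[1]))
print(util.cos_sim(embeddings[0], embeddings[2]))
```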
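For "Towards a Unified View of Parameter-Efficient Transfer Learning", the sketch below illustrates, in simplified form, the observation that adapters and LoRA can both be read as adding a learned low-rank modification to a hidden state, differing mainly in whether the modification is computed from the sublayer input or output and applied in parallel or sequentially. The dimensions, activation, and scaling are illustrative assumptions, not the paper's code.

```python
import torch

d, r = 64, 8                                  # hidden size and bottleneck rank
W_down, W_up = torch.randn(d, r) * 0.02, torch.zeros(r, d)
frozen_sublayer = torch.nn.Linear(d, d)       # stands in for a pretrained sublayer

def lora_style(x, scale=2.0):
    # Parallel form: the low-rank update is computed from the sublayer *input*.
    delta_h = (x @ W_down) @ W_up * scale
    return frozen_sublayer(x) + delta_h

def adapter_style(x):
    # Sequential form: the update is computed from the sublayer *output*.
    h = frozen_sublayer(x)
    delta_h = torch.relu(h @ W_down) @ W_up
    return h + delta_h

x = torch.randn(4, d)
print(lora_style(x).shape, adapter_style(x).shape)   # both: torch.Size([4, 64])
```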
Paper | Source | Link | Other |
---|---|---|---|
…… | …… | …… | …… |
Descriptions | …… |
Paper | Source | Link | Other |
---|---|---|---|
…… | …… | …… | …… |
Descriptions | …… |
Paper | Source | Link | Other |
---|---|---|---|
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning | ICLR2023 | …… | …… |
Descriptions | Diffusion models are used as a highly expressive policy class for offline reinforcement learning: the policy denoises an action conditioned on the state, which better captures multimodal behavior in the dataset and improves learning efficiency and decision-making performance (a minimal denoising-policy sketch follows this table).
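The sketch below shows the general shape of such a diffusion policy: a small network predicts the noise added to an action, conditioned on the state and diffusion step, and actions are sampled by running a reverse DDPM-style chain. It is only a sketch under assumed hyperparameters, not the paper's implementation, and it omits the training loop and any Q-learning guidance.

```python
import torch
import torch.nn as nn

T = 50                                           # number of diffusion steps (assumed)
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

class NoisePredictor(nn.Module):
    """Predicts the noise added to an action, given (noisy action, state, step)."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + 1, hidden), nn.Mish(),
            nn.Linear(hidden, hidden), nn.Mish(),
            nn.Linear(hidden, action_dim),
        )
    def forward(self, noisy_action, state, t):
        t_feat = t.float().unsqueeze(-1) / T     # crude timestep embedding
        return self.net(torch.cat([noisy_action, state, t_feat], dim=-1))

@torch.no_grad()
def sample_action(model, state, action_dim):
    """Reverse diffusion: start from Gaussian noise and iteratively denoise."""
    a = torch.randn(state.shape[0], action_dim)
    for t in reversed(range(T)):
        t_batch = torch.full((state.shape[0],), t)
        eps = model(a, state, t_batch)
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (a - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(a) if t > 0 else torch.zeros_like(a)
        a = mean + torch.sqrt(betas[t]) * noise
    return a.clamp(-1.0, 1.0)                    # actions usually live in [-1, 1]

state_dim, action_dim = 17, 6                    # e.g. a MuJoCo locomotion task
policy = NoisePredictor(state_dim, action_dim)
actions = sample_action(policy, torch.randn(4, state_dim), action_dim)
print(actions.shape)                             # torch.Size([4, 6])
```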