# Papers-Reading

## Algorithm

- Palindromic Tree
- An Introduction to Quantum Computing, Without the Physics
- The Berlekamp-Massey Algorithm revisited
- Video Stabilization Algorithm

## LLM

| Date | Paper | Key Words |
|------|-------|-----------|
| 2017.6.12 | Attention Is All You Need | Transformer & Attention |
| 2022.5.27 | FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness | Flash Attention |
| 2022.8.15 | LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale | LLM.int8 |
| 2023.7.18 | FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning | Flash Attention 2 |
| 2024.3.19 | When Do We Not Need Larger Vision Models? | Scaling on Scales |
| 2024.7.10 | PaliGemma: A versatile 3B VLM for transfer | Google small VLM: PaliGemma |
| 2024.7.12 | FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision | Flash Attention 3 (optimized for Hopper GPUs, e.g. H100) |
| 2024.7.28 | Enhancing Taobao Display Advertising with Multimodal Representations: Challenges, Approaches and Insights | Advertising with Multimodal Representations |
| 2024.8.22 | NanoFlow: Towards Optimal Large Language Model Serving Throughput | A novel serving framework: NanoFlow |
| 2024.10.3 | SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration | Sage Attention |
| 2024.11.17 | SageAttention2: Efficient Attention with Thorough Outlier Smoothing and Per-thread INT4 Quantization | Sage Attention 2 |

## About

🐬 Some papers & books I've read.
