- Blog: Evolution of LLMs
- Blog: Architecture and components
- Video: BERT
- Video: GPT-1, 2, 3
- Zhihu: Why has the Decoder-only arch become the mainstream?
- Language Models are Unsupervised Multitask Learners. [Paper][Code]
- Yuan 1.0: Large-Scale Pre-trained Language Model in Zero-Shot and Few-Shot Learning. [Paper][Code]
- (NIPS2017) Attention Is All You Need. [Paper][Code]
- (11 Oct 2018 arxiv) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. [Paper][Code]
- (ICML22) What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? [Paper][Code]
- Zhihu: Tokenization
- Zhihu: BPE
- Blog: Understanding LLM tokenization
- Video: Build GPT tokenizer
- Hugging Face NLP Course, Chapter 6
- (ACL2016) Neural Machine Translation of Rare Words with Subword Units. [Paper]
- (26 Sep 2016 arxiv) Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. [Paper]
- (ACL2018) Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates. [Paper]
- Code generation
  - Codex
  - AlphaCode
  - (MLCAD 2023) ChatEDA: A large language model powered autonomous agent for EDA. [Paper]
  - (23 May 2023 arxiv) ChipGPT: How far are we from natural language hardware design. [Paper]
  - (ICCAD23 invited) VerilogEval: Evaluating large language models for Verilog code generation. [Paper][Code]
  - (ICCAD23) GPT4AIGChip: Towards next-generation AI accelerator design automation via large language models. [Paper]
  - (MLCAD23) Chip-Chat: Challenges and opportunities in conversational hardware design. [Paper]
  - (8 Nov 2023 arxiv) AutoChip: Automating HDL generation using LLM feedback. [Paper][Code]
  - (ASP-DAC24) RTLLM: An open-source benchmark for design RTL generation with large language model. [Paper][Code]
  - (DATE23) Benchmarking large language models for automated Verilog RTL code generation. [Paper][Code]
- Code Verification & Analysis
  - (31 Oct 2023 arxiv) ChipNeMo: Domain-adapted LLMs for chip design. [Paper]
  - (28 Nov 2023 arxiv) RTLFixer: Automatically fixing RTL syntax errors with large language models. [Paper]
  - (DAC21) AutoSVA: Democratizing formal verification of RTL module interactions. [Paper]
  - (24 Jun 2023 arxiv) LLM-assisted generation of hardware assertions. [Paper]
  - (21 Aug 2023 arxiv) Unlocking hardware security assurance: The potential of LLMs. [Paper]
  - (14 Aug 2023 arxiv) DIVAS: An LLM-based end-to-end framework for SoC security analysis and policy-based protection. [Paper]
  - (2 Feb 2023 arxiv) Fixing hardware security bugs with large language models. [Paper]
- Specification Generation
  - (NIPS2011) Algorithms for Hyper-Parameter Optimization. [Paper]
  - (ACL2023) Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers. [Paper][Code]
- Combinatorial / Discrete Problem
  - (ICLR24) Prompt, Type1: Large Language Models as Optimizers. [Paper][Code]
  - (ICLR24) Bayesian Optimization, Type3: Large Language Models to Enhance Bayesian Optimization. [Paper][Code]
  - (19 Jan 2024 arxiv) Evolutionary algorithm, Type3: A match made in consistency heaven: when large language models meet evolutionary algorithms. [Paper]
  - (29 Oct arxiv) Evolutionary algorithm, Type3: Large Language Models as Evolutionary Optimizers. [Paper]
  - (8 Oct arxiv) Prompt, Type1: Towards Optimizing with Large Language Model. [Paper]
- Numerical / Continuous Problem
  - (ICLR24) Prompt, Type1: Large Language Models as Optimizers. [Paper][Code]
  - (Nature) Prompt, Type1: Mathematical discoveries from program search with large language models. [Paper]
  - (8 Jul arxiv) Prompt, Type3: Large Language Models for Supply Chain Optimization. [Paper][Code]
  - (NIPS23) Prompt, Type1: Using Large Language Models for Hyperparameter Optimization. [Paper]
  - (19 Jan 2024 arxiv) Evolutionary algorithm, Type3: A match made in consistency heaven: when large language models meet evolutionary algorithms. [Paper]
  - (22 Nov 2023 arxiv) Reinforcement learning, Type3: Large Language Model is a Good Policy Teacher for Training Reinforcement Learning Agents. [Paper][Code]
  - (25 May 2023 arxiv) Reinforcement learning, Type1: Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory. [Paper][Code]
  - (29 Oct arxiv) Evolutionary algorithm, Type3: Large Language Models as Evolutionary Optimizers. [Paper]