🚀 Feature

Add an option for models to use Classifier-Free Guidance (CFG) during inference. CFG uses a negative prompt to push inference toward following the system prompt more closely.

Support for this has also been requested at huggingface/transformers#24536 and ggml-org/llama.cpp#2083. The paper describing the technique is "Stay on topic with Classifier-Free Guidance" (arXiv:2306.17806); Section 3.4 shows their evaluation using CFG to improve chatbot responses.
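To make the technique concrete, here is a minimal sketch of the logit arithmetic the paper describes: run the model once on the guided (system-prompted) context and once on the negative or unconditional context, then blend the two next-token distributions in log-probability space. The `model` callable, the variable names, and the default `guidance_scale` below are illustrative assumptions, not an existing MLC-LLM API:

```python
import numpy as np

def log_softmax(x: np.ndarray) -> np.ndarray:
    # Numerically stable log-softmax over the vocabulary dimension.
    z = x - x.max()
    return z - np.log(np.exp(z).sum())

def cfg_combine(logits_cond: np.ndarray,
                logits_uncond: np.ndarray,
                guidance_scale: float = 1.5) -> np.ndarray:
    """Blend next-token distributions: uncond + scale * (cond - uncond).

    A scale of 1.0 reproduces ordinary (unguided) decoding; larger values
    push the output toward the conditioned prompt and away from the
    negative prompt.
    """
    lp_cond = log_softmax(logits_cond)
    lp_uncond = log_softmax(logits_uncond)
    return lp_uncond + guidance_scale * (lp_cond - lp_uncond)

# Hypothetical usage: `model` maps a list of token ids to next-token logits.
def greedy_step(model, cond_ids, uncond_ids, guidance_scale=1.5):
    scores = cfg_combine(model(cond_ids), model(uncond_ids), guidance_scale)
    return int(np.argmax(scores))  # greedy for simplicity; sampling also works
```

Note the cost: every generated token requires two forward passes, one per context.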
Motivation

MLC-LLM is memory-constrained on mobile devices. In the paper's evaluations, the response quality of LLMs using CFG was comparable on average to that of standard models roughly twice their size; for example, LLaMA-7B with CFG outperformed LLaMA-13B on the LAMBADA benchmark. This makes CFG attractive where memory, not compute, is the binding constraint, since the gain comes at the expense of inference time.
Thanks @wronkiew for raising this!
The example code by @Vermeille is very intuitive, and the idea of contrasting logits w/ and w/o prompts looks interesting.
I suppose it needs some effort on the MLC-LLM side because we need to support multiple KV caches; I'm glad to help prototype it.
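To make the multiple-KV-cache point concrete, here is a rough sketch under an assumed interface — `new_kv_cache()` and `forward(tokens, cache)` are hypothetical stand-ins, not MLC-LLM's actual API. The key invariant is that every sampled token is appended to both caches so the two streams stay position-aligned:

```python
from typing import List

class CFGDecoder:
    """Sketch of CFG decoding with two KV caches (hypothetical API).

    One stream is prefilled with the full system prompt, the other with
    the negative (or empty) prompt. Each sampled token is then fed to
    BOTH streams so their caches stay aligned position-for-position.
    """

    def __init__(self, model, guidance_scale: float = 1.5):
        self.model = model
        self.scale = guidance_scale
        self.kv_pos = model.new_kv_cache()  # assumed cache constructor
        self.kv_neg = model.new_kv_cache()

    def prefill(self, pos_ids: List[int], neg_ids: List[int]) -> None:
        # `forward` is assumed to append to the given cache and return
        # next-token logits as a 1-D array over the vocabulary.
        self.logits_pos = self.model.forward(pos_ids, self.kv_pos)
        self.logits_neg = self.model.forward(neg_ids, self.kv_neg)

    def step(self) -> int:
        # Blend raw logits for brevity; a log-softmax variant as in the
        # sketch above would match the paper's formulation more closely.
        blended = self.logits_neg + self.scale * (self.logits_pos - self.logits_neg)
        tok = int(blended.argmax())  # greedy; real sampling would go here
        self.logits_pos = self.model.forward([tok], self.kv_pos)
        self.logits_neg = self.model.forward([tok], self.kv_neg)
        return tok
```

This design doubles KV-cache memory and per-token compute, which matches the inference-time trade-off noted in the issue.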