Support DraftRetriever datastore read/write for large tokenizers and vocabulary sizes (i.e. llama3+) #23

scandukuri · 2024-11-13T06:39:36Z

This PR makes the changes needed to read and write suffixes, both in memory and on disk, for tokenizers with large vocabularies. The original implementation only supported token IDs up to Rust's u16::MAX (65,535).

Crucially, storing each token ID as a Rust i32 (instead of the original u16) lets the tool support token IDs up to i32::MAX (2,147,483,647) while still allowing negative placeholder IDs, such as -2, for padding in the implementation.
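To illustrate the range issue described above, here is a minimal sketch of round-tripping token IDs through a byte buffer as 4-byte values. It is not the DraftRetriever code: the helper names (`fits_in_u16`, `write_ids_i32`, `read_ids_i32`), the little-endian layout, and the sample ID 128_000 are assumptions chosen for the example.

```rust
/// Returns true if `id` can be stored losslessly as a u16.
/// (Hypothetical helper, for illustration only.)
fn fits_in_u16(id: i32) -> bool {
    (0..=i32::from(u16::MAX)).contains(&id)
}

/// Serialize token IDs as little-endian i32 values, 4 bytes per ID.
/// The byte order is an assumption of this sketch.
fn write_ids_i32(ids: &[i32]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(ids.len() * 4);
    for id in ids {
        buf.extend_from_slice(&id.to_le_bytes());
    }
    buf
}

/// Decode a buffer produced by `write_ids_i32`.
/// Trailing bytes that do not form a full 4-byte group are ignored.
fn read_ids_i32(buf: &[u8]) -> Vec<i32> {
    buf.chunks_exact(4)
        .map(|c| i32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect()
}

fn main() {
    // A large-vocabulary token ID, a small one, and a negative padding placeholder.
    let ids: [i32; 3] = [128_000, 42, -2];

    // Neither the large ID nor the negative placeholder fits in a u16.
    assert!(!fits_in_u16(ids[0]));
    assert!(fits_in_u16(ids[1]));
    assert!(!fits_in_u16(ids[2]));

    // With i32, all three values survive a write/read round trip.
    let bytes = write_ids_i32(&ids);
    assert_eq!(bytes.len(), ids.len() * 4);
    assert_eq!(read_ids_i32(&bytes), ids.to_vec());
}
```

The signed type is what lets a single field carry both real token IDs and negative sentinels; an unsigned 32-bit type would cover the vocabulary range but not a placeholder like -2.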

zhenyuhe00 · 2024-11-20T01:32:55Z

Hi,
Thank you for the fix. Would you consider creating a new branch? The change from "u16" to "i32" isn't needed for models with a small vocabulary, and it would increase disk storage usage unnecessarily.
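The storage concern can be made concrete with a rough calculation. The sketch below counts only the raw token ID payload and ignores any other structure in the datastore file; the function name `payload_bytes` and the corpus size are hypothetical.

```rust
use std::mem::size_of;

/// Bytes needed for the raw token ID payload alone.
/// (Hypothetical helper; a real datastore may hold additional data.)
fn payload_bytes(num_tokens: u64, bytes_per_id: u64) -> u64 {
    num_tokens * bytes_per_id
}

fn main() {
    // Assumed corpus size, chosen only for illustration.
    let num_tokens: u64 = 1_000_000_000;

    let as_u16 = payload_bytes(num_tokens, size_of::<u16>() as u64);
    let as_i32 = payload_bytes(num_tokens, size_of::<i32>() as u64);

    // Each ID grows from 2 bytes to 4, so the payload doubles.
    assert_eq!(as_i32, 2 * as_u16);
    println!("u16 payload: {} bytes, i32 payload: {} bytes", as_u16, as_i32);
}
```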

scandukuri · 2024-11-20T03:10:43Z

Yes! I can make a 'llama3' branch with the existing changes (DraftRetriever) plus the necessary changes to modeling_llama_kv.py.

zhenyuhe00 · 2024-11-20T04:04:24Z

That sounds great! Appreciate your effort.

scandukuri added 2 commits November 12, 2024 22:30

support large vocab sizes (i.e. llama3) for DraftRetriever datastore

05f57b0

updates comments to explain implementation changes

9b9cebd

scandukuri closed this Nov 20, 2024

scandukuri mentioned this pull request Nov 20, 2024

Support DraftRetriever datastore read/write for large vocab sizes (i.e. llama3+) and REST inference for llama3 #24

Merged

csAugust mentioned this pull request Nov 30, 2024

Llama3 Branch Still Suffers Segmentation Fault When Generating Datastore Using Qwen2.5 #26

Closed
