feat: expose flash_attn / cache_type_k / cache_type_v
Showing 7 changed files with 80 additions and 41 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -54,6 +54,12 @@ public LlamaContext(int id, ReactApplicationContext reactContext, ReadableMap pa | |
params.hasKey("n_threads") ? params.getInt("n_threads") : 0, | ||
// int n_gpu_layers, // TODO: Support this | ||
params.hasKey("n_gpu_layers") ? params.getInt("n_gpu_layers") : 0, | ||
// boolean flash_attn, | ||
params.hasKey("flash_attn") ? params.getBoolean("flash_attn") : false, | ||
jhen0409
Author
Member
|
||
// String cache_type_k, | ||
params.hasKey("cache_type_k") ? params.getString("cache_type_k") : "f16", | ||
// String cache_type_v, | ||
params.hasKey("cache_type_v") ? params.getString("cache_type_v") : "f16", | ||
// boolean use_mlock, | ||
params.hasKey("use_mlock") ? params.getBoolean("use_mlock") : true, | ||
// boolean use_mmap, | ||
|
@@ -382,6 +388,9 @@ protected static native long initContext( | |
int n_batch, | ||
int n_threads, | ||
int n_gpu_layers, // TODO: Support this | ||
boolean flash_attn, | ||
String cache_type_k, | ||
String cache_type_v, | ||
boolean use_mlock, | ||
boolean use_mmap, | ||
boolean vocab_only, | ||
|
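
The hunks above read three optional keys and fall back to `false` and `"f16"` when a key is absent. A minimal sketch of that defaulting behavior follows, assuming a plain `java.util.Map` as a stand-in for React Native's `ReadableMap` so that it runs without the React Native dependency. The class and method names (`ContextParamDefaults`, `flashAttn`, `cacheType`) are hypothetical and are not part of the commit.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: mirrors the defaults in the diff using a plain Map
// instead of ReadableMap (an assumption made so the sketch is self-contained).
final class ContextParamDefaults {
    private ContextParamDefaults() {}

    // flash_attn defaults to false when the key is missing or not a Boolean.
    static boolean flashAttn(Map<String, Object> params) {
        Object value = params.get("flash_attn");
        return value instanceof Boolean ? (Boolean) value : false;
    }

    // cache_type_k / cache_type_v default to "f16" when missing or not a String.
    static String cacheType(Map<String, Object> params, String key) {
        Object value = params.get(key);
        return value instanceof String ? (String) value : "f16";
    }

    public static void main(String[] args) {
        Map<String, Object> params = new HashMap<>();
        params.put("flash_attn", true);
        params.put("cache_type_k", "q8_0");

        System.out.println(flashAttn(params));                 // value supplied by caller
        System.out.println(cacheType(params, "cache_type_k")); // value supplied by caller
        System.out.println(cacheType(params, "cache_type_v")); // falls back to the default
    }
}
```

Unlike the `hasKey`/`getBoolean` pattern in the diff, this stand-in also falls back to the default when a key is present with the wrong type. That is a design choice of the sketch, not behavior of the commit.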
Does it make sense to expose this on the Android side? I do not believe flash_attn is implemented on Android, aside from the CPU implementation used for testing.
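
If the option stays exposed on Android, one possible guard is to honor the request only when the active backend is known to support it. The sketch below is an assumption-laden illustration, not something the commit does: `FlashAttnGate` and the `backendSupportsFlashAttn` flag are hypothetical, and nothing here implies that such a capability query exists in the library.

```java
// Hypothetical guard: the caller is assumed to know (by some means outside
// this sketch) whether the current backend supports flash attention.
final class FlashAttnGate {
    private FlashAttnGate() {}

    // Returns the value that would actually be forwarded to the native layer.
    static boolean effectiveFlashAttn(boolean requested, boolean backendSupportsFlashAttn) {
        if (requested && !backendSupportsFlashAttn) {
            // Fall back quietly rather than failing context creation.
            System.err.println("flash_attn requested but unsupported here; disabling it");
            return false;
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(effectiveFlashAttn(true, false));
        System.out.println(effectiveFlashAttn(true, true));
    }
}
```

Whether a silent fallback or an explicit error is preferable is a judgment call for the maintainers; the sketch only shows the fallback variant.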