Audio input, app layout tweaks, better background startup, use ExtendedTask #224
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -4,6 +4,11 @@ | |
- Added claude-3.5-sonnet model from Anthropic. | ||
- Set gpt-4o-mini as default model for OpenAI. #219 | ||
- Fixed bugs with Azure OpenAI service. #223 | ||
- Add audio input option for chat app. #224 | ||
Did you enable it in the settings? At a minimum, I need to document the instructions better. |
||
- Fix bug with chat app not loading on linux. #224 | ||
- Allow chat app to run in Positron (not yet as background job) #224 | ||
- API calls now run async with ExtendedTask. #224 | ||
I don't think async (and by extension,
This was meant for line 10.
Fair point on async. I've been using local models more, and they can be slow. The idea was to not block actions in the current session, but I don't think that adds much. I'll remove the
Update on this: ExtendedTask needs a promise to be returned. I've moved promises to Suggests, but it's not really another dependency because shiny already depends on the package. |
||
- New styling of chat app. #224 | ||
JamesHWade marked this conversation as resolved.
|
||
|
||
## gptstudio 0.4.0 | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,83 @@ | ||
#' Parse a Data URI | ||
#' | ||
#' This function parses a data URI and returns the MIME type and decoded data. | ||
#' | ||
#' @param data_uri A string. The data URI to parse. | ||
#' | ||
#' @return A list with two elements: 'mime_type' and 'data'. | ||
#' | ||
parse_data_uri <- function(data_uri) { | ||
if (is.null(data_uri) || !is.character(data_uri) || length(data_uri) != 1) { | ||
cli::cli_abort("Invalid input: data_uri must be a single character string") | ||
} | ||
|
||
match <- regexec("^data:(.+);base64,(.*)$", data_uri) | ||
if (match[[1]][1] == -1) { | ||
cli::cli_abort("Invalid data URI format") | ||
} | ||
groups <- regmatches(data_uri, match)[[1]] | ||
mime_type <- groups[2] | ||
b64data <- groups[3] | ||
# Add padding if necessary | ||
padding <- nchar(b64data) %% 4 | ||
if (padding > 0) { | ||
b64data <- paste0(b64data, strrep("=", 4 - padding)) | ||
} | ||
list(mime_type = mime_type, data = jsonlite::base64_dec(b64data)) | ||
} | ||
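The regex and padding steps in `parse_data_uri()` can be tried in isolation. The following is a minimal sketch, not the package's own test: the sample URIs are made up for illustration, and the decoding step assumes `jsonlite` is installed. It also shows that the greedy `(.+)` group keeps any MIME parameters (such as `codecs=opus`) as part of `mime_type`.

```r
# Same pattern as parse_data_uri(): capture the MIME type and the base64 payload.
pattern <- "^data:(.+);base64,(.*)$"

# Hypothetical sample input: "Hello World" in base64 with the trailing "=" stripped.
uri <- "data:text/plain;base64,SGVsbG8gV29ybGQ"

groups <- regmatches(uri, regexec(pattern, uri))[[1]]
mime_type <- groups[2]
b64 <- groups[3]

# Base64 text must be a multiple of 4 characters long; restore missing "=" padding.
remainder <- nchar(b64) %% 4
if (remainder > 0) {
  b64 <- paste0(b64, strrep("=", 4 - remainder))
}

# Decoding needs jsonlite (assumed to be available, as in the function above).
if (requireNamespace("jsonlite", quietly = TRUE)) {
  decoded <- rawToChar(jsonlite::base64_dec(b64))
}

# The (.+) group is greedy, so MIME parameters stay attached to the type.
uri_with_params <- "data:audio/webm;codecs=opus;base64,AAAA"
mime_with_params <- regmatches(
  uri_with_params,
  regexec(pattern, uri_with_params)
)[[1]][2]

# A string that is not a data URI yields -1 as the first match position.
no_match <- regexec(pattern, "not a data URI")[[1]][1]
```

If callers need the bare media type, splitting `mime_type` on `;` and taking the first element would be one option, though the function above does not do this.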
|
||
#' Transcribe Audio from Data URI Using OpenAI's Whisper Model | ||
#' | ||
#' This function takes an audio file in data URI format, converts it to WAV, and | ||
#' sends it to OpenAI's transcription API to get the transcribed text. | ||
#' | ||
#' @param audio_input A string. The audio data in data URI format. | ||
#' @param api_key A string. Your OpenAI API key. Defaults to the OPENAI_API_KEY | ||
#' environment variable. | ||
#' | ||
#' @return A string containing the transcribed text. | ||
#' | ||
#' @export | ||
#' | ||
#' @examples | ||
#' \dontrun{ | ||
#' audio_uri <- "data:audio/webm;base64,SGVsbG8gV29ybGQ=" # Example data URI | ||
#' transcription <- transcribe_audio(audio_uri) | ||
#' print(transcription) | ||
#' } | ||
#' | ||
transcribe_audio <- function(audio_input, api_key = Sys.getenv("OPENAI_API_KEY")) { | ||
parsed <- parse_data_uri(audio_input) | ||
|
||
temp_webm <- tempfile(fileext = ".webm") | ||
temp_wav <- tempfile(fileext = ".wav") | ||
writeBin(parsed$data, temp_webm) | ||
system_result <- #nolint | ||
system2("ffmpeg", | ||
args = c("-i", temp_webm, "-acodec", "pcm_s16le", "-ar", "44100", temp_wav), # nolint | ||
stdout = TRUE, | ||
stderr = TRUE | ||
) | ||
|
||
if (!file.exists(temp_wav)) { | ||
cli::cli_abort("Failed to convert audio: {system_result}") | ||
} | ||
|
||
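req <- request("https://api.openai.com/v1/audio/transcriptions") %>% | ||
req_auth_bearer_token(api_key) %>% | ||
# Keep HTTP errors as responses so the resp_is_error() check below can run | ||
req_error(is_error = function(resp) FALSE) %>% | ||
req_body_multipart( | ||
file = curl::form_file(temp_wav), | ||
model = "whisper-1", | ||
response_format = "text" | ||
) | ||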
|
||
resp <- req_perform(req) | ||
|
||
if (resp_is_error(resp)) { | ||
cli::cli_abort("API request failed: {resp_status_desc(resp)}") | ||
} | ||
|
||
user_prompt <- resp_body_string(resp) | ||
file.remove(temp_webm, temp_wav) | ||
invisible(user_prompt) | ||
} |
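In `transcribe_audio()` the temporary files are removed only on the success path, so an abort during conversion or the API call would leave them behind. One way this could be handled is with `on.exit()`. The sketch below is an assumption-laden illustration rather than the PR's code: `convert_with_cleanup()`, `ffmpeg_converter()`, and `tracking_converter()` are hypothetical helpers, and a plain file copy stands in for ffmpeg so the example runs without it.

```r
# Hypothetical helper: write raw audio to a temp file, run a converter,
# and always remove both temp files, even if an error is raised.
convert_with_cleanup <- function(raw_audio, converter) {
  temp_in <- tempfile(fileext = ".webm")
  temp_out <- tempfile(fileext = ".wav")
  on.exit(unlink(c(temp_in, temp_out)), add = TRUE)

  writeBin(raw_audio, temp_in)
  converter(temp_in, temp_out)

  if (!file.exists(temp_out)) {
    stop("Failed to convert audio", call. = FALSE)
  }
  readBin(temp_out, what = "raw", n = file.size(temp_out))
}

# Hypothetical converter mirroring the ffmpeg call above, with an explicit
# check that ffmpeg is on the PATH before it is invoked.
ffmpeg_converter <- function(input, output) {
  if (!nzchar(Sys.which("ffmpeg"))) {
    stop("ffmpeg was not found on the PATH", call. = FALSE)
  }
  system2(
    "ffmpeg",
    args = c("-i", shQuote(input), "-acodec", "pcm_s16le",
             "-ar", "44100", shQuote(output)),
    stdout = TRUE,
    stderr = TRUE
  )
}

# Stand-in converter for the demo: copies the file and records the paths used.
seen_paths <- character()
tracking_converter <- function(input, output) {
  seen_paths <<- c(input, output)
  file.copy(input, output)
}

result <- convert_with_cleanup(as.raw(1:4), tracking_converter)

# A converter that produces no output triggers the error path.
failure_message <- tryCatch(
  convert_with_cleanup(as.raw(1:4), function(input, output) {
    seen_paths <<- c(input, output)
    invisible(NULL)
  }),
  error = function(e) conditionMessage(e)
)
```

Passing the converter as an argument keeps the cleanup logic testable on machines without ffmpeg; in real use `ffmpeg_converter` would be supplied instead of the stand-in.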
I think we should mostly avoid adding new dependencies. For icons we already have fontawesome (which we would already have because of shiny) and for promises I'll make a comment on the ExtendedTask part
promises is already a dependency for shiny, but I'll see if I can get around bsicons.