tidyllm 0.3.0 represents a major milestone for tidyllm.
The largest changes compared to 0.2.0 are:
- New Verb-Based Interface: Users can now use verbs like chat(), embed(), send_batch(), check_batch(), and fetch_batch() to interact with APIs. These functions always work with a combination of verbs and providers:
  - Verbs (e.g., chat(), embed(), send_batch()) define the type of action you want to perform.
  - Providers (e.g., openai(), claude(), ollama()) are an argument of verbs; they specify the API that handles the action and take provider-specific arguments.
Each verb and provider combination routes the interaction to provider-specific functions like openai_chat() or claude_chat() that do the work in the background. These functions can also be called directly as an alternative, more verbose and provider-specific interface.
# Old approach (still works, but issues a deprecation warning)
llm_message("Hello World") |>
  openai(.model = "gpt-4o")
# Recommended verb-based approach
llm_message("Hello World") |>
  chat(openai(.model = "gpt-4o"))
# Or even configuring a provider outside of the verb
my_ollama <- ollama(.model = "llama3.2-vision:90B",
                    .ollama_server = "https://ollama.example-server.de",
                    .temperature = 0)
llm_message("Hello World") |>
  chat(my_ollama)
# Alternative approach: use the more verbose provider-specific functions
llm_message("Hello World") |>
  openai_chat(.model = "gpt-4o")
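The routing idea behind verbs and providers can be pictured with a small toy model. This is only a sketch in base R and not tidyllm's actual implementation: toy_provider(), toy_chat(), toy_openai_chat() and toy_ollama_chat() are hypothetical names invented for illustration, and no API is called.

```r
# Toy provider constructor (hypothetical): stores a provider name and its arguments
toy_provider <- function(name, ...) {
  structure(list(name = name, args = list(...)), class = "toy_provider")
}

# Toy provider-specific workers, standing in for functions like openai_chat()
toy_openai_chat <- function(msg, .model = "model-a") {
  paste0("[openai/", .model, "] ", msg)
}
toy_ollama_chat <- function(msg, .model = "model-b") {
  paste0("[ollama/", .model, "] ", msg)
}

# Toy verb: picks the worker that belongs to the provider and forwards its arguments
toy_chat <- function(msg, provider) {
  workers <- list(openai = toy_openai_chat, ollama = toy_ollama_chat)
  worker <- workers[[provider$name]]
  if (is.null(worker)) stop("No chat worker for provider: ", provider$name)
  do.call(worker, c(list(msg), provider$args))
}

reply <- "Hello World" |>
  toy_chat(toy_provider("openai", .model = "gpt-4o"))
```

The verb stays the same for every provider; only the provider object decides which worker runs and with which arguments.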
- The old functions (openai(), claude(), etc.) still work if you directly supply an LLMMessage as argument, but issue deprecation warnings when used directly for chat.
- Users are encouraged to transition to the new interface for future-proof workflows.
- The output format of embedding APIs was changed from a matrix to a tibble with one row per input, consisting of an input column and a list column that holds the embedding vector for that input.
- R6-based saved LLMMessage objects are no longer compatible with the new version. Saved objects from earlier versions need to be re-created.
- gemini() and perplexity() as new supported API providers. gemini() brings interesting video and audio features as well as search grounding to tidyllm. perplexity() also offers well-cited, search-grounded assistant replies.
- Batch processing for mistral()
- New metadata-extraction function get_reply_metadata() to get information on token usage, or on other relevant metadata (like sources used for grounding)
- Refactored Package Internals:
  - Transitioned from R6 to S7 for the main LLMMessage class, improving maintainability, interoperability, and future-proofing.
  - Consolidated all API-specific functionality into dedicated files
- Batch API functions for the Mistral API
- Search grounding with the .grounding_threshold argument added to the gemini_chat() function, allowing you to use Google searches to ground the responses of Gemini models in search results. For example, asking about the maintainer of an obscure R package works with grounding, but leads only to a hallucination without it:
llm_message("What is tidyllm and who maintains this package?") |>
  gemini_chat(.grounding_threshold = 0.3)
- Perplexity as additional API provider available through perplexity_chat(). The neat feature of Perplexity is the up-to-date web search it does with detailed citations. Cited sources are available in the api_specific list column of get_metadata().
- .json_schema support for ollama() available with Ollama 0.5.0
- Metadata extraction is now handled by API-specific methods. get_metadata() returns a list column with API-specific metadata
- Switch from R6 to S7 for the main LLMMessage class
- Several bug fixes for df_llm_message()
- API formatting methods are now in the code files for API providers
- Rate-limit header extraction for tracking and streaming callback generation are now methods for APIProvider classes
- All API-specific code is now in the api_openai.R, api_gemini.R, etc. files
- Support for as_tibble() S3 generic for LLMMessage
- Rate limit tracking and output for verbose mode in API functions moved to a single function track_rate_limit()
- Unnecessary .onattach() removed
removed - Bugfix in callback method of Gemini streaming responses (still not ideal, but works)
- Embedding functions refactored to reduce repeated code
- Small test code to look at potential interoperability with elmer for using elmer-type schemata.
- API-key check moved into API-object method
- Slight refactoring for batch functions (there is still quite a bit of potential to reduce duplication)
- Old R6-based LLMMessage objects are not compatible with the new version anymore! This also applies to saved objects, like lists of batch files.
- Google Gemini now supports working with multiple files in one message for the file upload functionality
here::here("local_wip","example.mp3") |> gemini_upload_file()
here::here("local_wip","legrille.mp4") |> gemini_upload_file()
file_tibble <- gemini_list_files()
llm_message("What are these two files about?") |>
gemini_chat(.fileid=file_tibble$name)
Better embedding functions with improved output and error handling and new documentation. New article on using embeddings with tidyllm. Support for embedding models on Azure with azure_openai_embedding().
- The output format of embed() and the related API-specific functions was changed from a matrix to a tibble with one row per input, consisting of an input column and a list column that holds the embedding vector for that input.
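To make the new layout concrete, here is a minimal sketch built from toy numbers rather than a real embedding call. The column names input and embeddings, and the orientation of the old matrix (one column per input), are assumptions made for illustration.

```r
library(tibble)

# Toy stand-in for the old matrix output (assumed: one column per input)
inputs <- c("first text", "second text")
emb_matrix <- matrix(c(0.1, 0.2, 0.3,
                       0.4, 0.5, 0.6),
                     nrow = 3, ncol = 2)

# New-style layout: one row per input, embedding vector kept in a list column
emb_tbl <- tibble(
  input      = inputs,
  embeddings = lapply(seq_len(ncol(emb_matrix)), function(j) emb_matrix[, j])
)

# Rebuild a plain matrix (rows = inputs) when a downstream function needs one
emb_back <- do.call(rbind, emb_tbl$embeddings)
```

Keeping each vector next to its input makes it harder to lose track of which embedding belongs to which text, while a matrix remains one do.call(rbind, ...) away.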
One disadvantage of the first iteration of the new interface was that all arguments that needed to be passed to provider-specific functions were going through the provider function. This feels unintuitive, because users expect common arguments (e.g., .model, .temperature) to be set directly in main verbs like chat() or send_batch(). Moreover, provider functions don't expose arguments for autocomplete, making it harder for users to explore options. Therefore, the main API verbs now directly accept common arguments and check them against the available arguments for each API.
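One way to picture such an argument check is to compare the names passed to a verb with the formal arguments of the provider-specific function. The sketch below is a toy in base R under that assumption; toy_verb() and toy_ollama_worker() are hypothetical names, and this is not a description of how tidyllm performs the check internally.

```r
# Hypothetical provider-specific worker with its own set of arguments
toy_ollama_worker <- function(.llm, .model = "llama3.2", .temperature = NULL) {
  paste0("[", .model, "] ", .llm)
}

# Toy verb: accepts common arguments directly and validates them against the worker
toy_verb <- function(.llm, .worker, ...) {
  args <- list(...)
  unsupported <- setdiff(names(args), names(formals(.worker)))
  if (length(unsupported) > 0) {
    stop("Unsupported argument(s) for this provider: ",
         paste(unsupported, collapse = ", "))
  }
  do.call(.worker, c(list(.llm), args))
}

# Supported arguments pass through to the worker
ok <- toy_verb("Hello World", toy_ollama_worker, .model = "llama3.2", .temperature = 0)

# An argument the worker does not know is rejected with a readable message
bad <- tryCatch(
  toy_verb("Hello World", toy_ollama_worker, .top_k = 5),
  error = function(e) conditionMessage(e)
)
```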
- New error message for not setting a provider in main verbs
- Missing export of main verbs fixed
- Wrong documentation fixed
tidyllm has introduced a verb-based interface overhaul to provide a more intuitive and flexible user experience. Previously, provider-specific functions like claude(), openai(), and others were directly used for chat-based workflows. Now, these functions primarily serve as provider configuration for some general verbs like chat().
- New Verb-Based Interface: Users can now use verbs like chat(), embed(), send_batch(), check_batch(), and fetch_batch() to interact with APIs. These functions always work with a combination of verbs and providers:
  - Verbs (e.g., chat(), embed(), send_batch()) define the type of action you want to perform.
  - Providers (e.g., openai(), claude(), ollama()) are an argument of verbs; they specify the API that handles the action and take provider-specific arguments.
Each verb and provider combination routes the interaction to provider-specific functions like openai_chat() or claude_chat() that do the work in the background. These functions can also be called directly as an alternative, more verbose and provider-specific interface.
# Old approach (still works, but issues a deprecation warning)
llm_message("Hello World") |>
  openai(.model = "gpt-4o")
# Recommended verb-based approach
llm_message("Hello World") |>
  chat(openai(.model = "gpt-4o"))
# Or even configuring a provider outside of the verb
my_ollama <- ollama(.model = "llama3.2-vision:90B",
                    .ollama_server = "https://ollama.example-server.de",
                    .temperature = 0)
llm_message("Hello World") |>
  chat(my_ollama)
# Alternative approach: use the more verbose provider-specific functions
llm_message("Hello World") |>
  openai_chat(.model = "gpt-4o")
- Backward Compatibility:
  - The old functions (openai(), claude(), etc.) still work if you directly supply an LLMMessage as argument, but issue deprecation warnings when used directly for chat.
  - Users are encouraged to transition to the new interface for future-proof workflows.
- Added functions to work with the Google Gemini API, with the new gemini() main API function
- Support for the file upload workflows for Gemini:
# Upload a file for use with Gemini
upload_info <- gemini_upload_file("example.mp3")
# Make the file available during a Gemini API call
llm_message("Summarize this speech") |>
  gemini(.fileid = upload_info$name)
# Delete the file from the Google servers
gemini_delete_file(upload_info$name)
- Brings video and audio support to tidyllm
- Google Gemini is the second API to fully support tidyllm_schema()
- gemini() requests allow for a wide range of file types that can be used for context in messages
- Supported document formats for gemini() file workflows:
  - PDF: application/pdf
  - TXT: text/plain
  - HTML: text/html
  - CSS: text/css
  - Markdown: text/md
  - CSV: text/csv
  - XML: text/xml
  - RTF: text/rtf
- Supported code formats for gemini() file workflows:
  - JavaScript: application/x-javascript, text/javascript
  - Python: application/x-python, text/x-python
- Supported image formats for gemini() file workflows:
  - PNG: image/png
  - JPEG: image/jpeg
  - WEBP: image/webp
  - HEIC: image/heic
  - HEIF: image/heif
- Supported video formats for gemini() file workflows:
  - MP4: video/mp4
  - MPEG: video/mpeg
  - MOV: video/mov
  - AVI: video/avi
  - FLV: video/x-flv
  - MPG: video/mpg
  - WEBM: video/webm
  - WMV: video/wmv
  - 3GPP: video/3gpp
- Supported audio formats for gemini() file workflows:
  - WAV: audio/wav
  - MP3: audio/mp3
  - AIFF: audio/aiff
  - AAC: audio/aac
  - OGG Vorbis: audio/ogg
  - FLAC: audio/flac
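Before uploading, it can be handy to check locally whether a file falls under one of the listed types. The sketch below does this with a small lookup table in base R. The mapping from file extensions to the MIME types above is an assumption made for illustration, only a subset of the list is included, and lookup_mime() is a hypothetical helper that is not part of tidyllm.

```r
# Subset of the MIME types listed above, keyed by an assumed file extension
supported_mime <- c(
  pdf = "application/pdf",
  txt = "text/plain",
  csv = "text/csv",
  png = "image/png",
  mp4 = "video/mp4",
  mp3 = "audio/mp3",
  wav = "audio/wav"
)

# Hypothetical helper: returns the MIME type for a path, or NA if it is not in the table
lookup_mime <- function(path) {
  ext <- tolower(tools::file_ext(path))
  if (!ext %in% names(supported_mime)) return(NA_character_)
  unname(supported_mime[ext])
}

mp3_type  <- lookup_mime("example.mp3")
docx_type <- lookup_mime("notes.docx")
```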
- Added get_metadata() function to retrieve and format metadata from LLMMessage objects.
- Enhanced the print method for LLMMessage to support printing metadata, controlled via the new tidyllm_print_metadata option or a new .meta argument for the print method.
conversation <- llm_message("Write a short poem about software development") |>
  claude()
# Get metadata on token usage and model as a tibble
get_metadata(conversation)
# Or print it with the message
print(conversation, .meta = TRUE)
# Or always print it
options(tidyllm_print_metadata = TRUE)
- Fixed a bug in send_openai_batch() caused by a missing .json argument not being passed for messages without schema
New CRAN release. Largest changes compared to 0.1.0:
Major Features:
- Batch Request Support: Added support for batch requests with both Anthropic and OpenAI APIs, enabling large-scale request handling.
- Schema Support: Improved structured outputs in JSON mode with advanced .json_schema handling in openai(), enhancing support for well-defined JSON responses.
- Azure OpenAI Integration: Introduced azure_openai() function for accessing the Azure OpenAI service, with full support for rate-limiting and batch operations tailored to Azure’s API structure.
- Embedding Model Support: Added embedding generation functions for the OpenAI, Ollama, and Mistral APIs, supporting message content and media embedding.
- Mistral API Integration: New mistral() function provides full support for Mistral models hosted in the EU, including rate-limiting and streaming capabilities.
- PDF Batch Processing: Introduced the pdf_page_batch() function, which processes PDFs page by page, allowing users to define page-specific prompts for detailed analysis.
- Support for OpenAI-compatible APIs: Introduced a .compatible argument (and flexible url and path) in openai() to allow compatibility with third-party OpenAI-compatible APIs.
Improvements:
- API Format Refactoring: Complete refactor of to_api_format() to reduce code duplication, simplify API format generation, and improve maintainability.
- Improved Error Handling: Enhanced input validation and error messaging for all API functions, making troubleshooting easier.
- Rate-Limiting Enhancements: Updated rate limiting to use httr2::req_retry() in addition to the rate-limit tracking functions in tidyllm, using 429 headers to wait for rate limit resets.
- Expanded Testing: Added comprehensive tests for API functions using httptest2
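As an illustration of the retry mechanism mentioned in the rate-limiting item, the following sketch attaches a retry policy to an httr2 request. It is a minimal example and not tidyllm's internal code: the URL is a placeholder, the request is only built and never sent, and it is assumed that the API reports the wait time in seconds in a Retry-After header.

```r
library(httr2)

# Placeholder endpoint, used only to build (not send) a request
req <- request("https://api.example.com/v1/chat") |>
  req_retry(
    max_tries = 3,
    # Treat HTTP 429 (too many requests) as a transient, retryable failure
    is_transient = function(resp) resp_status(resp) == 429,
    # Wait as long as the Retry-After header says (assumed to be in seconds);
    # returning NA falls back to httr2's default backoff
    after = function(resp) {
      wait <- resp_header(resp, "Retry-After")
      if (is.null(wait)) NA else as.numeric(wait)
    }
  )
```

APIs differ in which headers they use to announce a reset, so the after function is the part that would need adjusting per provider.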
Breaking Changes:
- Redesigned Reply Functions: get_reply() was split into get_reply() for text outputs and get_reply_data() for structured outputs, improving type stability compared to an earlier function that had different outputs based on a .json argument.
- Deprecation of chatgpt(): The chatgpt() function has been deprecated in favor of openai() for feature alignment and improved consistency.
Minor Updates and Bug Fixes:
- Expanded PDF Support in llm_message(): Allows extraction of specific page ranges from PDFs, improving flexibility in document handling.
- New ollama_download_model() function to download models from the Ollama API
- All sequential chat API functions now support streaming
- Support for both the Anthropic and the OpenAI batch request API added
- New .compatible argument in openai() to allow working with compatible third-party APIs
- Complete refactor of to_api_format(): API format generation now has much less code duplication and is more maintainable.
- get_reply() was split into two type-stable functions: get_reply() for text and get_reply_data() for structured outputs.
- Rate limiting updated to use httr2::req_retry(): Rate limiting now relies on the 429 response headers where an API provides them.
- Enhanced Input Validation: All API functions now have improved input validation, ensuring better alignment with API documentation
- Improved error handling: More human-readable error messages for failed requests from the API
- Advanced JSON Mode in openai(): The openai() function now supports advanced .json_schemas, allowing structured output in JSON mode for more precise responses.
- Reasoning Models Support: Support for O1 reasoning models has been added, with better handling of system prompts in the openai() function.
- Streaming callback functions refactored: Given that the streaming callback format for OpenAI, Mistral and Groq is nearly identical, the three now rely on the same callback function.
- chatgpt() Deprecated: The chatgpt() function has been deprecated in favor of openai(). Users should migrate to openai() to take advantage of the new features and enhancements.
- Better Error Handling: The openai(), ollama(), and claude() functions now return more informative error messages when API calls fail, helping with debugging and troubleshooting.
- Embedding Models Support: Embedding model support for three APIs:
  - Embedding functions process message histories and combine text from message content and media attachments for embedding models.
  - ollama_embedding() to generate embeddings using the Ollama API.
  - openai_embedding() to generate embeddings using the OpenAI API.
  - mistral_embedding() to generate embeddings using the Mistral API.
- PDF Page Support in llm_message(): The llm_message() function now supports specifying a range of pages in a PDF by passing a list with filename, start_page, and end_page. This allows users to extract and process specific pages of a PDF.
- PDF Page Batch Processing: Introduced the pdf_page_batch() function, which processes PDF files page by page, extracting text and converting each page into an image, allowing for a general prompt or page-specific prompts. The function generates a list of LLMMessage objects that can be sent to an API and work with the batch-API functions in tidyllm.
- Support for the Mistral API: New mistral() function to use Mistral models on La Plateforme on servers hosted in the EU, with rate-limiting and streaming support.
- Message Retrieval Functions: Added functions to retrieve single messages from conversations:
  - last_user_message() pulls the last message the user sent.
  - get_reply() gets the assistant reply at a given index of assistant messages.
  - get_user_message() gets the user message at a given index of user messages.
- Easier Troubleshooting in API functions: All API functions now support the .dry_run argument, allowing users to generate an httr2 request for easier debugging and inspection.
- API Function Tests: Implemented httptest2-based tests with mock responses for all API functions, covering both basic functionality and rate-limiting.
- New Ollama functions:
  - Model Download: Introduced the ollama_download_model() function to download models from the Ollama API. It supports a streaming mode that provides live progress bar updates on the download progress.
- Refactoring of llm_message()
- The groq() function now supports images.
- More complete streaming support across API functions.
- Groq Models: System prompts are no longer sent for Groq models, since many models on Groq do not support them and all multimodal models on Groq disallow them.
- New unit tests for llm_message().
- Improvements in streaming functions.
- JSON Mode: JSON mode is now more widely supported across all API functions, allowing for structured outputs when APIs support them. The .json argument is now passed only to API functions, specifying how the API should respond, and it is no longer needed in last_reply().
- Improved last_reply() Behavior: The behavior of the last_reply() function has changed. It now automatically handles JSON replies by parsing them into structured data and falling back to raw text in case of errors. You can still force raw text replies even for JSON output using the .raw argument.
- last_reply(): The .json argument is no longer used, and JSON replies are automatically parsed. Use .raw to force raw text replies.
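The parse-then-fall-back behavior described for last_reply() can be sketched in a few lines. This is a toy illustration using jsonlite, not the package's own code: parse_reply() is a hypothetical helper, and only the .raw argument name is taken from the text above.

```r
library(jsonlite)

# Hypothetical helper: try to parse a reply as JSON, fall back to the raw text on failure
parse_reply <- function(text, .raw = FALSE) {
  if (.raw) return(text)
  tryCatch(
    fromJSON(text),
    error = function(e) text
  )
}

parsed   <- parse_reply('{"answer": 42}')               # parsed into a list
fallback <- parse_reply("not json at all")              # parsing fails, raw text is returned
raw_text <- parse_reply('{"answer": 42}', .raw = TRUE)  # parsing skipped on request
```

With a fallback of this kind the return type depends on the reply content, which is the sort of instability the later split into get_reply() and get_reply_data() removes.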