-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
.Net: Add CachedContent Property to GeminiPromptExecutionSettings (#1…
…0268) ### Motivation and Context **Why is this change required?** This change introduces a new `CachedContent` field to the `GeminiPromptExecutionSettings` class. The addition enables context caching, a feature that optimizes the handling of repeated static content in requests with high input token counts. **What problem does it solve?** Repeatedly processing substantial, static context data (e.g., lengthy documents, audio files, or video files) in requests can be resource-intensive and costly. Context caching addresses this by allowing the reuse of shared context across multiple requests, improving performance and cost-efficiency. **What scenario does it contribute to?** Context caching is particularly well-suited for scenarios where a substantial initial context is repeatedly referenced by shorter requests. Use cases include: - **Chatbots with extensive system instructions**: Allows reuse of complex system configurations across multiple user interactions. - **Repetitive analysis of lengthy video files**: Enables efficient analysis by caching video-related metadata or transcriptions. - **Recurring queries against large document sets**: Improves efficiency for workflows requiring repeated access to large document collections. - **Frequent code repository analysis or bug fixing**: Facilitates the reuse of large codebases as cached context for debugging or analysis tasks. *No open issues are linked to this change.* --- ### Description This PR adds a `CachedContent` field to the `GeminiPromptExecutionSettings` class. The field allows users to reference cached context items such as text blocks, audio files, or video files in prompt requests. **Key features:** - The minimum size of a context cache is 32,768 tokens. - Cached content is retained for a default duration of 60 minutes, with the option to configure expiration. - Cached content is billed efficiently: the initial creation call is charged at the standard rate, while subsequent references to the cache are billed at a reduced rate. By enabling context caching, users can optimize cost and resource usage for workflows requiring substantial and repeat context data. --- ### Contribution Checklist - [x] The code builds clean without any errors or warnings - [x] The PR follows the [SK Contribution Guidelines](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md) and the [pre-submission formatting script](https://github.com/microsoft/semantic-kernel/blob/main/CONTRIBUTING.md#development-scripts) raises no violations - [x] All unit tests pass, and I have added new tests where possible - [x] I didn't break anyone 😄 --- ### Demo <img width="1465" alt="Screenshot 2025-01-23 at 11 46 17 PM" src="https://github.com/user-attachments/assets/9d070c69-a482-4a40-a912-2b17ddf4c7ab" /> <img width="841" alt="Screenshot 2025-01-23 at 11 50 29 PM" src="https://github.com/user-attachments/assets/aac961ec-edf8-4458-a815-57e46f2a5789" /> --- ### Notes 1. The CachedContent is only available via `v1beta1` endpoint 2. [Overview of context caching in Gemini](https://cloud.google.com/vertex-ai/generative-ai/docs/context-cache/context-cache-overview?hl=en) 3. [cachedContent field in the body](https://cloud.google.com/vertex-ai/generative-ai/docs/reference/rest/v1beta1/projects.locations.endpoints/generateContent) for the `v1beta1` endpoint --------- Co-authored-by: Roger Barreto <[email protected]>
- Loading branch information
1 parent
471d9a8
commit 6fbbb44
Showing
11 changed files
with
294 additions
and
10 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.