Integrate OpenAI Audio Client and Introduce ChatClientBase for Enhanced Separation of Concerns #9

Cyb3rWard0g · 2024-12-23T03:12:39Z

Summary

This PR adds OpenAI’s audio client integration into the project, enabling speech generation, transcription, and translation functionalities. Additionally, it introduces a new ChatClientBase class to handle chat-specific logic, separating it from the LLMClientBase to ensure modularity and maintainability.

Key Changes

OpenAI and Azure OpenAI Audio Client:
- Unified client to support both OpenAI and Azure OpenAI APIs.
- Configurable settings for API key, base URL, Azure-specific options (e.g., endpoint, deployment, version).
Speech Generation:
- Converts text into audio using OpenAI endpoints.
- Handles large text inputs by splitting into manageable chunks for efficient processing.
- Supports incremental file saving or in-memory audio composition.
Transcription and Translation:
- Transcription: Converts audio files into text using OpenAI transcription models.
- Translation: Translates audio content into English with OpenAI translation models.
Dynamic Request Validation:
- Utilizes Pydantic models for validating and structuring requests.
- Ensures accurate and error-free communication with OpenAI APIs.
Improved Logging and Error Handling:
- Detailed logging for tracking API interactions and debugging issues.
- Robust error handling with meaningful feedback for failed operations.
Introduction of ChatClientBase:
- New base class to encapsulate chat-specific functionality, such as Prompty integration and prompt templates.
- Keeps LLMClientBase focused on general-purpose LLM features, avoiding chat-specific dependencies.
- Supports loading and configuring Prompty sources for chat-based workflows.

Impact

Enhanced Capabilities: Enables a wide range of use cases, from text-to-speech applications to audio-to-text processing and translations.
Separation of Concerns: By introducing ChatClientBase, the project achieves better modularity and clarity between general LLM functionalities and chat-specific features.
Flexibility: Provides seamless integration with both OpenAI and Azure OpenAI, catering to diverse deployment needs.
Ease of Use: Simplifies interaction with OpenAI’s audio APIs through structured requests and automated chunk handling.
Scalability: Handles large text inputs and ensures consistent performance with efficient chunk processing and file management.

These updates not only add advanced audio processing capabilities but also improve the project’s maintainability and adaptability by refactoring chat-specific logic into a dedicated base class.

Cyb3rWard0g added 3 commits December 11, 2024 12:42

Refactor: Split LLMClientBase into LLMClientBase and ChatClientBase.

5be41c5

Updated CONTRIBUTING to clean local and remote branches

01134ad

Added OAI Audio Client support

5e5eebf

Cyb3rWard0g merged commit 3258b55 into main Dec 23, 2024

Cyb3rWard0g deleted the feature/oai-audio-client branch December 27, 2024 07:27

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Integrate OpenAI Audio Client and Introduce ChatClientBase for Enhanced Separation of Concerns #9

Integrate OpenAI Audio Client and Introduce ChatClientBase for Enhanced Separation of Concerns #9

Cyb3rWard0g commented Dec 23, 2024

Integrate OpenAI Audio Client and Introduce ChatClientBase for Enhanced Separation of Concerns #9

Integrate OpenAI Audio Client and Introduce ChatClientBase for Enhanced Separation of Concerns #9

Conversation

Cyb3rWard0g commented Dec 23, 2024

Summary

Key Changes

Impact