Add OpenAI Integration #1
The interesting part about using a model to distill tokens while maintaining human readability reminds me of the concept of "dual-purpose encoding", where the same content serves both human and machine needs effectively. Using a local model would also be more cost-effective. Perhaps we could explore a format that includes:

````
# [filename]
@semantictags: [local-model-generated-tokens]
@type: [file-type]
@path: [relative-path]
@summary: [local-model-generated-brief]
```content```
````

This could allow the same output to stay readable for humans while giving models compact, semantically tagged context.
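As a rough illustration, here is a minimal Python sketch of how such a header could be rendered. The function names (`build_header`, `wrap_file`) are hypothetical, and the semantic tags and summary are assumed to come from a local model upstream rather than being generated here.

```python
# Hypothetical sketch: render the proposed dual-purpose header for one file.
from pathlib import Path


def build_header(path: Path, tags: list[str], summary: str) -> str:
    """Render the metadata header lines for a single source file."""
    return "\n".join([
        f"# {path.name}",
        f"@semantictags: [{', '.join(tags)}]",
        f"@type: [{path.suffix.lstrip('.') or 'unknown'}]",
        f"@path: [{path.as_posix()}]",
        f"@summary: [{summary}]",
    ])


def wrap_file(path: Path, tags: list[str], summary: str) -> str:
    """Combine the header with the fenced file content."""
    return f"{build_header(path, tags, summary)}\n```\n{path.read_text()}\n```"
```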
Building on these observations about SummarizeGPT's role in LLM pair programming and the potential for dual-purpose encoding, I've been considering a systematic approach through a flexible configuration system. Rather than jumping straight to implementing semantic enhancements, I believe we should first establish a robust configuration framework that can support various levels of enhancement while maintaining the tool's simplicity. Here's what I'm envisioning:

```yaml
# Example config structure (config.yaml)
summarize_gpt:
  default:
    encoding: cl100k_base
    max_lines: null
    semantic:
      enabled: false
      model: "sentence-transformers/all-MiniLM-L6-v2"
      token_limit: 100
  semantic:
    enabled: true
    model: ${SEMANTIC_MODEL_PATH}  # from env
    api_key: ${OPENAI_API_KEY}     # from env
```

This would come with a clear configuration precedence, for example environment variables overriding the project file, and the project file overriding built-in defaults.
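To make that precedence concrete, here is a rough sketch of what a loader could look like. `load_config`, `expand_env`, and `DEFAULTS` are hypothetical names rather than existing SummarizeGPT code, and PyYAML is assumed as a dependency.

```python
# Hypothetical config loader: built-in defaults < project file < environment.
import os

import yaml  # PyYAML, assumed as a dependency

DEFAULTS = {
    "encoding": "cl100k_base",
    "max_lines": None,
    "semantic": {"enabled": False},
}


def expand_env(value):
    """Recursively expand ${VAR} references from the environment."""
    if isinstance(value, str):
        return os.path.expandvars(value)
    if isinstance(value, dict):
        return {key: expand_env(val) for key, val in value.items()}
    if isinstance(value, list):
        return [expand_env(item) for item in value]
    return value


def load_config(path=".summarizegpt.yaml", profile="default"):
    """Merge the chosen profile over the defaults, then expand env vars."""
    merged = dict(DEFAULTS)
    try:
        with open(path) as handle:
            raw = yaml.safe_load(handle) or {}
        merged.update(raw.get("summarize_gpt", {}).get(profile, {}))
    except FileNotFoundError:
        pass  # no project file yet: fall back to built-in defaults
    return expand_env(merged)
```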
This configuration-first approach would provide the foundation needed to support both the current functionality and future semantic enhancements, while giving users fine-grained control over how they want to use the tool.

An additional benefit of this YAML-based approach is that it enables effortless re-summarization of projects. By storing the configuration details in the project's directory, teams can maintain consistent summarization settings across multiple runs and between different team members. This is particularly valuable when working with large codebases, or when you need to regenerate summaries after code changes while keeping the same semantic analysis parameters and exclusion rules.

The local .summarizegpt.yaml effectively serves as both a configuration cache and a project-level standard, ensuring that everyone working with the codebase gets the same context representation when using SummarizeGPT. This consistency is crucial for maintaining effective LLM pair programming practices across a team.

Thoughts?
I think I will try to get #4 looked at. I see the benefit in my own flows of a configuration-specific setup. We can use this as a basis for giving SummarizeGPT the ability to execute code summarization, so the tool can become a more effective dual-purpose encoding tool. Thanks again for using this tool!
Get descriptions of Python files from OpenAI and add them to the markdown file.
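A minimal sketch of what that integration could look like, assuming the official `openai` Python SDK (v1+) with `OPENAI_API_KEY` set in the environment; the model name, prompt, and function names here are placeholders rather than settled choices.

```python
# Hypothetical sketch: describe a Python file via OpenAI and append the
# result to a markdown file. Model choice and prompt are placeholders.
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def describe_python_file(py_path: Path) -> str:
    """Ask the model for a short description of one Python file."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[
            {"role": "system",
             "content": "Describe this Python file in two or three sentences."},
            {"role": "user", "content": py_path.read_text()},
        ],
    )
    return response.choices[0].message.content


def append_description(md_path: Path, py_path: Path) -> None:
    """Append the generated description under a per-file heading."""
    with md_path.open("a") as handle:
        handle.write(f"\n## {py_path.name}\n\n{describe_python_file(py_path)}\n")
```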