Langchain integration #512
Conversation
Tested initial migration to LangChain, removed comments and logging used for debugging
Tested initial migration to LangChain, removed comments and logging used for debugging
Renames "qa" prompt to "clarify", since it's used in the "clarify" step, asking for clarification
need to fix typing issues
Hey, the failing tests don't have anything to do with the changed code, so I'd rather not fix them here.
Test 1:
Test 2:
gpt_engineer/ai.py (Outdated)
```python
# initialize token usage log
self.cumulative_prompt_tokens = 0
self.cumulative_completion_tokens = 0
self.cumulative_total_tokens = 0
self.token_usage_log = []

# ToDo: Adapt to arbitrary model, currently assumes model using tiktoken
```
+1 on this. Having a hard requirement on tiktoken could reduce the usability of this feature.
Unfortunately, LangChain doesn't give access to its models' tokenizers. So we would either (i) wait until LC does so, or (ii) implement each tokenizer for each model into gpt-engineer. Imo (ii) doesn't make sense because we want to let LC handle all the interactions with models, and focus on coding-specific stuff.
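For context, a minimal sketch of what the tiktoken-based counting in question looks like (illustrative only; `num_tokens` is a hypothetical helper, though `tiktoken.encoding_for_model` is the real tiktoken API):

```python
import tiktoken

def num_tokens(text: str, model: str = "gpt-4") -> int:
    # encoding_for_model only knows OpenAI models; it raises a KeyError
    # for anything else -- the hard requirement discussed above.
    encoding = tiktoken.encoding_for_model(model)
    return len(encoding.encode(text))

print(num_tokens("Hello, world!"))
```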
gpt_engineer/ai.py (Outdated)
```python
def deserialize_messages(jsondictstr: str) -> List[Message]:
    return AI.parse_langchain_basemessages(
        messages_from_dict(json.loads(jsondictstr))
    )
```
Can we just do `return list(messages_from_dict(json.loads(jsondictstr)))`?
And delete below function
Unfortunately no. Doing `list(...)` gives a `List[BaseMessage]`, but we need a `List[Message]` (which is defined as `List[Union[AIMessage, HumanMessage, SystemMessage]]`).

The complexity comes from the fact that `List` is [invariant](https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)), meaning `List[Some_Subclass]` is not accepted where `List[Some_Class]` is expected.

For context:
- `BaseMessage` is a LangChain class and the superclass of the LC classes `AIMessage`, `HumanMessage`, and `SystemMessage`
- `Message` is defined in this PR and is an alias for `Union[AIMessage, HumanMessage, SystemMessage]`

We have to choose between one of these options:
1. Don't do typing at all.
2. Use the same type as LC: use `Sequence[BaseMessage]` in gpt-engineer. This works because `Sequence`, unlike `List`, is covariant. But `Sequence` is immutable, so everywhere we add to the messages, we would have to convert to `List`, do the operation, then re-convert to `Sequence`. Imo this makes the code complex and repetitive.
3. Parse at the point of interaction with LC: parse between LC's `Sequence[BaseMessage]` and our `List[Message]` only at the LC interface. Then we can keep our code simple.

I think 3 is cleanest, so I chose that. What are your thoughts?
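To make the variance issue concrete, here is a small illustrative sketch (not code from this PR; it assumes the pre-0.1 `langchain.schema` import path). mypy rejects the `List` call but accepts the `Sequence` one, while both are fine at runtime:

```python
from typing import List, Sequence, Union

from langchain.schema import AIMessage, BaseMessage, HumanMessage, SystemMessage

Message = Union[AIMessage, HumanMessage, SystemMessage]

def join_as_list(messages: List[BaseMessage]) -> str:
    return "\n".join(m.content for m in messages)

def join_as_seq(messages: Sequence[BaseMessage]) -> str:
    return "\n".join(m.content for m in messages)

msgs: List[Message] = [SystemMessage(content="sys"), HumanMessage(content="hi")]

join_as_list(msgs)  # mypy error: List is invariant, List[Message] != List[BaseMessage]
join_as_seq(msgs)   # OK: Sequence is covariant, so List[Message] is accepted
```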
Right, the typing. My thinking is that we use `typing.cast()` (and `assert isinstance(...)` if necessary)? Otherwise the code is very confusing to the reader and I'd even prefer doing 1! (which is a legit option: just adding `# type: ignore`)
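A minimal sketch of the `cast()` suggestion, under the same assumptions as the sketch above (not the exact PR code):

```python
import json
from typing import List, Union, cast

from langchain.schema import AIMessage, HumanMessage, SystemMessage, messages_from_dict

Message = Union[AIMessage, HumanMessage, SystemMessage]

def deserialize_messages(jsondictstr: str) -> List[Message]:
    # cast() only informs the type checker; at runtime the List[BaseMessage]
    # returned by messages_from_dict passes through unchanged.
    return cast(List[Message], messages_from_dict(json.loads(jsondictstr)))
```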
gpt_engineer/ai.py (Outdated)
```python
def last_message_content(self, messages: List[Message]) -> str:
    m = messages[-1].content.strip()
    logging.info(m)
    return m
```
I think this function makes the code more difficult to read for a human. Would prefer to remove it! And `logging.info(m)` will be confusing, no?
Great on 1. On the other, it's just that it's:
- More characters to type
- Needs an external dependency (`AI`)
  - which is an object? (even though it doesn't need to be initialized)
- Hides away unnecessarily how simple everything actually is: it's just a list of messages! (knowing this makes it easier to understand the code)
Okay, get your point, will change!
Good job @UmerHA! Soon there 🏃

The big concern from my side is the "model_dir" and "model_name" and that gpt4 is not supported. Actually, it's not on our roadmap to add open source models. Also, I'd like to keep the code as simple as possible, and I'm not sure it is worth it for anyone to use local models? GPT4 is much better.

I'd propose we remove that part for now and potentially bring the local model stuff in a separate PR. Thoughts?
Should be a model for gpt4.
My general perspective: I'm not sure what disadvantage there would be in including it. I can think of five contributors who would like to see it included to facilitate their current efforts. From what I understand, this functionality was a key motivation for @UmerHA to do the work he has done. By including it, you will likely accelerate GPT Engineer development. I personally have been talking to a few companies that would like to pilot something like GPT-Engineer and possibly subsidize its development, as long as they can be confident their codebase isn't going to end up being used as the next generation of training data (i.e. not using third-party APIs). Facilitating people testing alternative models will also likely benefit GPT Engineer in the long run as the models inevitably become more accessible on more common hardware. And the more companies feel comfortable they can control where their codebase is going, the more corporate resources will be put behind the project.
@AntonOsika Taking a step back: what's the reason for integrating with LangChain if we don't want other models?

Apart from that, I agree that gpt4 is currently the best model, but I would expect that non-OpenAI models (incl. open-source) will get more capable over time, so I would vote to enable them.

Also, is there a roadmap I can consult? Of course I don't want to build stuff that's not on the roadmap. :)
Hey, cool, I didn't know there was a gpt4 model accessible in this way? Does it work out of the box with this approach?

As long as it doesn't interfere with the current way of using it, it's great; happy to merge after some other review comments are addressed. It's just that the way of putting messages together seems like it shouldn't be supported by gpt4?

Also, thanks both for contributions on this.
Worked last time I checked, but doesn't appear to work at the moment with any branches of the gpt-engineer code base. Is it just me, or did OpenAI quietly roll back access to gpt-4 via the API?
Another (recommended) alternative would be to use a data model like ChatMessageHistory, already provided by LangChain, or roll your own.
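For reference, a minimal sketch of that suggestion (LangChain's `ChatMessageHistory` as exposed in mid-2023; the import path has moved in newer releases):

```python
from langchain.memory import ChatMessageHistory

history = ChatMessageHistory()
history.add_user_message("Generate a TODO app.")
history.add_ai_message("Sure! Which framework should I use?")

# .messages is a plain list of HumanMessage/AIMessage objects, so the
# PR's serialization helpers could operate on it directly.
print(history.messages)
```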
Incorporated feedback:
- deleted unnecessary functions & logger.info
- used LangChain ChatLLM instead of LLM to naturally communicate with gpt-4
- deleted loading model from yaml file, as LC doesn't offer this for ChatModels
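Roughly, the switch to a chat model means talking to gpt-4 through role-tagged messages instead of a flat prompt string. A sketch using LangChain's mid-2023 `ChatOpenAI` API (illustrative, not the exact PR code):

```python
from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

# Chat models consume a list of role-tagged messages, which is gpt-4's
# native interface; a plain LLM wrapper would flatten them into one string.
chat = ChatOpenAI(model_name="gpt-4", temperature=0.1, streaming=True)

messages = [
    SystemMessage(content="You are a helpful coding assistant."),
    HumanMessage(content="Write hello world in Python."),
]
answer = chat(messages)  # returns an AIMessage
print(answer.content)
```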
Hey @AntonOsika, I incorporated your feedback. I also removed the …
Co-authored-by: Anton Osika <[email protected]>
return "\n".join( | ||
[f"{message['role']}:\n\n{message['content']}" for message in messages] | ||
) | ||
|
This breaks tests. See CI. Any reason for changing it?
Deleting `format_messages` is not the issue. That function is not used anywhere. That's why I deleted it.

The test `test_collect_learnings` breaks because in the test we're doing sth like

```python
code = "this is output\n\nit contains code"
code_with_extra_stuff = "extra stuff" + json.dumps(code)
assert code in code_with_extra_stuff
```

but `json.dumps` escapes the `\n` and turns it into `\\n`. That's one of the failing tests mentioned above which, iiuc, don't relate to the changes in this PR.

I will fix the tests by changing `assert code in code_with_extra_stuff` to `assert json.dumps(code) in code_with_extra_stuff`.
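For illustration, the escaping behavior in plain Python (a generic demo, not from the test suite):

```python
import json

code = "this is output\n\nit contains code"
dumped = "extra stuff" + json.dumps(code)

print(code in dumped)              # False: json.dumps escaped \n to \\n
print(json.dumps(code) in dumped)  # True: compare the escaped form instead
```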
Cool! 🏃 To be honest, I think the AI abstraction is a bit too complex right now and would like a proposal for a refactor. But I think it's OK we merge after we address the current comments. As @hrobbins points out, I think ChatMessageHistory would be great.
- Fixed failing test
- Removed parsing complexity by using # type: ignore
- Replaced every occurrence of ai.last_message_content with its content
Let me know when you want me to take a look! 🙏
@AntonOsika You can take a look :)
Happy birthday @UmerHA! Same with me – very happy to get on top of PRs and see this. Let's merge 🏃
* Added LangChain integration
* Fixed issue created by git checkin process
* Added ':' to characters to remove from end of file path
* Tested initial migration to LangChain, removed comments and logging used for debugging
* Tested initial migration to LangChain, removed comments and logging used for debugging
* Converted camelCase to snake_case
* Turns out we need the exception handling
* Testing Hugging Face Integrations via LangChain
* Added LangChain loadable models
* Renames "qa" prompt to "clarify", since it's used in the "clarify" step, asking for clarification
* Fixed loading model yaml files
* Fixed streaming
* Added modeldir cli option
* Fixed typing
* Fixed interaction with token logging
* Fix spelling + dependency issues + typing
* Fix spelling + tests
* Removed unneeded logging which caused test to fail
* Cleaned up code
* Incorporated feedback: deleted unnecessary functions & logger.info; used LangChain ChatLLM instead of LLM to naturally communicate with gpt-4; deleted loading model from yaml file, as LC doesn't offer this for ChatModels
* Update gpt_engineer/steps.py (Co-authored-by: Anton Osika <[email protected]>)
* Incorporated feedback: fixed failing test; removed parsing complexity by using # type: ignore; replaced every occurrence of ai.last_message_content with its content
* Fixed test
* Update gpt_engineer/steps.py

Co-authored-by: H <[email protected]>
Co-authored-by: Anton Osika <[email protected]>
Strongly based on #466 - shoutout to @hrobbins!
What it does:
Enables using any model that's available in LangChain as a backend.
How to use it:
- `gpt-engineer <path_to_project> gpt-4`
- Or: create a `<model_name>.yaml` file with the model configs. Then `gpt-engineer <path_to_project> <model_name> --modeldir <path_to_model_config_file>`
How it works:
`ai.py` has been adapted to use LangChain. In the future, more model configs can be added, so more models are available by default.
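For illustration, the message (de)serialization at the LangChain boundary ends up looking roughly like this (simplified from the PR's final state, which uses `# type: ignore` instead of a parsing helper; assumes the pre-0.1 `langchain.schema` import path):

```python
import json
from typing import List, Union

from langchain.schema import (
    AIMessage,
    HumanMessage,
    SystemMessage,
    messages_from_dict,
    messages_to_dict,
)

Message = Union[AIMessage, HumanMessage, SystemMessage]

def serialize_messages(messages: List[Message]) -> str:
    # messages_to_dict yields plain dicts, so the chat log is JSON-storable
    return json.dumps(messages_to_dict(messages))

def deserialize_messages(jsondictstr: str) -> List[Message]:
    return messages_from_dict(json.loads(jsondictstr))  # type: ignore

# Round trip: serialize -> deserialize -> serialize is stable
msgs: List[Message] = [SystemMessage(content="sys"), HumanMessage(content="hi")]
assert serialize_messages(deserialize_messages(serialize_messages(msgs))) == serialize_messages(msgs)
```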