v2.3

@oobabooga released this 15 Jan 01:05
7e80266

Changes

  • Major UI optimization: use the morphdom library to make incremental updates to the Chat tab during streaming (#6653). With this:
    • CPU usage is drastically reduced for long contexts or high tokens/second.
    • The UI no longer becomes sluggish in those scenarios.
    • You can select and copy text or code from previous messages during streaming, because morphdom's "morphing" operations leave those elements untouched. Only what has changed gets updated.
  • Add a button to copy the raw message content below each chat message.
  • Add a button to regenerate the reply below the last chat message.
  • Activate "auto_max_new_tokens" by default, to avoid having to "continue" the chat reply every 512 tokens.
  • Installer:
    • Update Miniconda to 24.11.1 (latest version). Note: Miniconda is only used during the initial setup.
    • Make the checksum verification for the Miniconda installer more robust on Windows, to account for systems where it was previously failing to execute at all.
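
To illustrate the checksum item above, here is a minimal Python sketch of a verification step that fails closed: if the hash cannot be computed at all, the installer is rejected rather than silently accepted. This is only an illustration of the idea. The release notes do not describe the actual mechanism used by the Windows installer script, and the names `sha256_of_file` and `verify_installer` are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_installer(path: Path, expected_sha256: str) -> bool:
    """Hypothetical helper: True only if the file exists and its hash matches.

    Any failure to compute the hash counts as a failed verification,
    so a broken hashing step can never be mistaken for a pass.
    """
    try:
        actual = sha256_of_file(path)
    except OSError:
        # Missing or unreadable file: fail closed.
        return False
    # Normalize case and whitespace, since tools differ in how they print digests.
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A caller would pass the path of the downloaded installer and the published SHA-256 string, and abort the setup when the function returns `False`.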

Bug fixes

Backend updates

  • Transformers: bump to 4.48.
  • flash-attention: bump to 2.7.3.