v0.1.0-20240802 release (#140) #146
Closed
Co-authored-by: Ceceliachenen, wwxxzz, paradiseHIT, shubao.sx, aero-xi, ranxia, CharlieKoo, zhangdingchu, zt2645802240, Yue Fei, 筱文
☂️ Python Coverage
Overall Coverage
New files: no new covered files. Modified files: no covered modified files.
Bugfix: a case where file encodings cannot be detected by chardet (#61)
Bugfix: connection error for long-running upload tasks (#62)
Fix connection error for longtime job
fix testcase bugs
support num workers for embedding model
Refactor query api and add dataframe UI
Refactor query api
Remove embedding workers
Add file: file_utils.py (#63)
Fix connection error for longtime job
fix testcase bugs
support num workers for embedding model
Refactor query api and add dataframe UI
Refactor query api
Remove embedding workers
Add file_utils
Remove local storage and enable Elasticsearch hybrid query mode (#60)
Add gpu dockerfile
Fix bug
Fix gb2312
Update embedding batch size
Set default embedding and llm model
Update docker tag
Fix hologres check
Update registry
Fix bug
Fix tests
Add queue
Update batch size
Add async interface
Fix index conflict
Add change index parameter for FAISS
Fix batch size
Update
Modify async upload to sync (#64)
Modify async upload to sync
fix failed test
Fix faiss_path not effective in retrieval (#65)
Add API to support upload local files (#67)
support upload file via API
add Readme for upload API
refactor query api
modify load_knowledge with session_config
use tempfile.mkdtemp() to store upload files
add docker image timezone for China (#68)
add image zone for China
remove unused ENV
load data pipeline supports read config (#70)
Add gpu docker image timezone for China (#74)
Add fast bm25 (#66)
Add fast bm25
Fix bm25 bug
Fix bug
Fix test
Update readme and configuration (#77)
fix demo.toml typo, and add comments for settings.toml for embedding
update readme, add load data
Update docker.yml
Enable multiple workers to improve perf (#75)
Add fast bm25
Update
Fix bug
Fix bm25 bug
Fix bug
Refine code
Update multi-process
Add API to support upload local files (#67)
support upload file via API
add Readme for upload API
refactor query api
modify load_knowledge with session_config
use tempfile.mkdtemp() to store upload files
add docker image timezone for China (#68)
add image zone for China
remove unused ENV
load data pipeline supports read config (#70)
Add gpu docker image timezone for China (#74)
Add fast bm25 (#66)
Add fast bm25
Fix bm25 bug
Fix bug
Fix test
Update dockerfile
Fix bug
Update
Update docker file
Fix empty file bug
Fix local index error
Fix lint
Decouple gradio and backend
Add ui build
Add gunicorn
Fix gunicorn
Update nginx
add nginx image
Fix deployment issue
Fix upload
Add guides for env and docker (#81)
Add guides for env
add guides for docker build
Add README
Add config guide cn&en (#82)
add es setting
add es setting
add elasticsearch test
add es test
add and modify es_tokenizer test
add and modify es_tokenizer test
modify test_as_tokenizer
add skipif
fix test linter fails
fix lint problem
update test_as_analyzer
add config_guide
add navigation into readme
Add doc reference for rag query (#84)
Support evaluation for generated and open datasets (#83)
Refactor evaluation module
add UI: eval_tab
support eval UI
tmp eval
remove eval web
Support evaluation
fix pytest
Add OpenDataSet class
Fix oss url for miracl dataset (#86)
Fix es upload failure in ui (#85)
Fix PaiEas LLM achat func (#88)
Milvus support sparse search (#87)
Upload multiple files in single API call (#89)
Milvus support sparse search
aload fix
Upload multiple files in one api call
Remove notebooks
Fix tests
Fix http timeout issue
Add client default timeout limitation and support UI interactive (#90)
Add client default timeout limitation and support UI interactive
support interactive mode for vectordb type
Fix ui issue (#91)
Fix deps and add gpu ci tests (#92)
Fix deps and add gpu ci tests
Don't send report in 2nd pipeline
Fix empty response for empty knowledge base (#93)
Fix empty response for empty knowledge base
Add constant for empty response message
Fix duplicate nodes (#94)
Add error handling (#96)
Add error handling
Add upload error msg
fix data_loader (#95)
fix data_loader
fix data_loader
fix data_loader
fix data_loader
Set proper log levels (#98)
Adjust config instruction and add es instruction (#99)
add es setting
add es setting
add elasticsearch test
add es test
add and modify es_tokenizer test
add and modify es_tokenizer test
modify test_as_tokenizer
add skipif
fix test linter fails
fix lint problem
update test_as_analyzer
add config_guide
add navigation into readme
adjust config guide and add es instruction
Log stacktrace for failed requests (#100)
Load milvus collection by default (#101)
Log stacktrace for failed requests
Load milvus collection by default
Rename & Relocate figures in md (#102)
add es setting
add es setting
add elasticsearch test
add es test
add and modify es_tokenizer test
add and modify es_tokenizer test
modify test_as_tokenizer
add skipif
fix test linter fails
fix lint problem
update test_as_analyzer
add config_guide
add navigation into readme
adjust config guide and add es instruction
modify md figures
minor modification
change md path and name
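Modify docker startup command for the Windows platform (#104)
Modify docker startup command for the Windows platform
Modify docker startup command for the Windows platform
Modify docker startup command for the Windows platform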
make format
make format, nothing changed
download models from oss automatically (#97)
download models from oss automatically
download models from oss automatically
download models from oss automatically
download models from oss automatically
download models from modelscope
download models from modelscope
fix readme
Fix bug in downloading models (#106)
Fix bug
Fix log
Fix download
Add markdown reader (#105)
fix pdf reader (#107)
Personal/ranxia/pdf table summary fix (#109)
fix pdf reader
fix pdf reader table summary
Add page number to file_name (#110)
Support stream response for LLM (PaiEAS && DashScope) (#112)
Support stream response for LLM (PaiEAS && DashScope)
Add PaiEas LLM old file
Add image node processor (#114)
Fit image in response
Add image insert
Fix llm max-token
Fix UI bug (#115)
Fix bugs for chinese escaped string in API header (#117)
Fix bidi version (#119)
Add fix version
Update poetry.lock
Update streaming response to body field use server sent events (#120)
Fix streaming
Fix llm and vector query
Address comment
Remove extra print
Support simple-weighted-reranker and similarity-threshold (#116)
Support normalized cosine_sim score for different vectorDB
Support simple-weighted-reranker and similarity-threshold
[Todo] Support ES hybrid search
Support Milvus
fix path
fix open dataset
Fix url for du-retrieval dataset
Restore setting
Fix reviews
Apply node_id for weighted_reranker
jsonl reader (#124)
jsonl reader
jsonl reader
Support function_calling with booking demo tools (#122)
Add booking system demo for function_calling
Support customized function calling tools
Add testcase for agent and llm
Fix test
Fix async test
Add readme for function calling
Add readme for function calling
Remove ref figs
Add nodes enhancement by raptor (#111)
add raptor
add raptor ui support
fix logger bug
add node_enhancement class and modify test
fix node_enhancement setting bug
lint adjustment
poetry lock
fix poetry.lock
fix poetry issues
add a param
add token calculation for Chinese and adjust context_window
update tokenization_qwen
update file_path
merge feature and update poetry.lock
exclude pytest since no vocab file in the test env
exclude qwen.tiktoken
delete assert
Add weather tool (#125)
weather ok
fix bug
space bug
Don't use parallel to build BM25 index when data size is big (#108)
Add opensearch (#127)
Add open search. Not tested
Fix
Fix config
update docker's readme (#126)
update docker's readme
change network back
change network back
change network back
Create ci.yml (#131)
Update CI & PR pipelines (#132)
Update CI
Fix ci
Fix a few ui bugs (#133)
Support RDS postgres vector store (#134)
support RDS postgres for store engine
Format
support table
Make format
Fix minor bugs (#135)
Fix bug
Fix index bug
Update password field
Add pre-commit
Remove upload button
Refine upload
Fix pg connection string
Fix empty response for score_threshold (#136)
Fix empty response for score_threshold
Modify empty response info
Modify empty response info
fix table_reader in pdf_reader (#128)
fix table_reader in pdf_reader
fix table_reader in pdf_reader
fix table_reader in pdf_reader
fix table_reader in pdf_reader
fix table_reader in pdf_reader
fix table_reader in pdf_reader
add "enable_ocr" and "enable_table_summary" (#138)
add "enable_ocr" and "enable_table_summary"
add "enable_ocr" and "enable_table_summary"
add "enable_ocr" and "enable_table_summary"
Add release pipeline and fix some bugs (#137)
Fix bug
Add release pipeline
Update
Update
Fix bug
Fix login
Fix empty tag
Update
Fix ui issue
Add base version tag
Fix specific version
Use pg hybrid retrieval directly
Fix image tag
Fix llm config (#139)
Fix toml merge bug (#142)
Fix configuration conflict (#143)
Fix merge bug
Fix version conflict for config file
Resolve snapshot merge conflict
Fix space outage in github runner (#144)
Fix merge bug
Fix version conflict for config file
Resolve snapshot merge conflict
Update yaml