-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backport PR #2807 to release/v1.7 for add qbg-handler #2822
Backport PR #2807 to release/v1.7 for add qbg-handler #2822
Conversation
* add qbg-handler * apply coderabbit suggestion --------- Co-authored-by: Yusuke Kato <[email protected]>
Deploying vald with
|
Latest commit: |
4c80a24
|
Status: | ✅ Deploy successful! |
Preview URL: | https://ab69d7aa.vald.pages.dev |
Branch Preview URL: | https://backport-release-v1-7-featur-f4u5.vald.pages.dev |
📝 WalkthroughWalkthroughThis pull request introduces significant changes to the Rust agent implementation, focusing on enhancing the agent's functionality and error handling. The changes primarily involve updating the agent's core components to support a new QBG (Quantum Bit Graph) service, improving concurrency, and expanding error handling capabilities. The modifications span multiple files across the agent and algorithm libraries, introducing new methods, error types, and service implementations. Changes
Sequence DiagramsequenceDiagram
participant Client
participant Agent
participant QBGService
participant Index
Client->>Agent: Send Request
Agent->>QBGService: Process Request
QBGService->>Index: Perform Operation
Index-->>QBGService: Return Result
QBGService-->>Agent: Generate Response
Agent-->>Client: Send Response
Possibly related PRs
Suggested Labels
Suggested Reviewers
✨ Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (28)
rust/bin/agent/src/handler.rs (3)
28-28
: Switch from a Box to Arc.This facilitates concurrent access to the
ANN
object. Confirm you have considered the performance overhead of multiple readers vs. single writer usage.
36-42
: Constructor signature improvement.Accepting
impl algorithm::ANN
by value is good for flexibility. Consider adding error handling or validations if needed in the future.
44-44
: Initialization with Arc::new(RwLock::new(s)).Double-check if there's any scenario where immediate writes are needed. If reads dominate,
RwLock
is beneficial. Otherwise, a simplerMutex
might suffice.rust/libs/algorithm/src/lib.rs (3)
18-18
: Additional imports.The import of
i64
seems redundant if only standard numeric types are needed. Verify usage to prevent confusion.
39-58
: Extended Display match arms.Good job providing detailed error messages. If you need more context, consider including additional fields or data in these variants.
79-83
: Additional state/tracking methods.Methods like
len
,is_indexing
,is_saving
enhance observability. Consider bridging to metrics if needed for performance tracking.rust/bin/agent/src/main.rs (3)
27-28
: Mock service struct.Retaining
_MockService
is fine for testing or placeholder usage. Ensure it remains current with changes to theANN
trait.
31-35
: QBGService struct.Defining
path
,index
, andproperty
is a solid approach for QBG usage. Consider if any additional fields (e.g., caching, concurrency controls) might be beneficial.
37-75
: _MockService implementation uses placeholders.All methods currently call
todo!()
, meaning they are unimplemented. Verify that these stubs are never invoked in production.Would you like help generating a more complete mock implementation for functional testing?
rust/bin/agent/src/handler/index.rs (2)
30-87
: Robust error handling increate_index
This implementation uses thorough error details for different known error scenarios, demonstrating good clarity for debugging.
Consider augmenting with telemetry/logging for partial progress or time taken by index creation, to aid in performance tracing and debugging.
92-118
: Concerns about sync vs. async insave_index
Usingself.s.write().await
will block other concurrent writers. This is acceptable if index saving is typically short-lived, but might cause delays if large data sets are saved frequently.
You might consider an internal queue or job-based approach for saving to avoid potential bottlenecks.rust/bin/agent/src/handler/remove.rs (1)
48-67
: Handling empty UUID
Great approach for returning a descriptive error whenuuid.len() == 0
. However, consider trimming whitespace to avoid edge cases like" "
.rust/bin/agent/src/handler/insert.rs (1)
73-152
: Insertion error handling
Thematch
arms for various error cases add clarity and granularity to the error responses. All known error types appear covered.
Consider an integration test verifying each error arm to ensure full coverage and correct client-facing responses.rust/bin/agent/src/handler/update.rs (6)
16-24
: Favor a structured logger over println statementsUsing
println!
for logging can be insufficient and difficult to track at scale. Consider using a structured logging framework (e.g.,tracing
orlog
) to enable log levels, filtering, and richer context.
32-32
: Fix spelling typo in log output"Recieved" should be "Received" for clarity and correctness.
- println!("Recieved a request from {:?}", request.remote_addr()); + println!("Received a request from {:?}", request.remote_addr());
34-37
: Improve validation error consistencyWhen
config
is missing, the error string is "Missing configuration in request". Consider including more details (e.g., which field is missing) to align with other error messages that provide granular context.
41-66
: Refactor repeated error handling codeThe code for building
ErrorDetails
and returningStatus
is repeated. Extract these patterns into a helper function to reduce duplication and improve maintainability.
74-94
: Validate zero-length UUID as early as possibleThe check for an empty UUID is correctly handled. However, you may want to unify this check with other argument validations earlier in the request lifecycle, such as immediately after retrieving
uuid
, for clarity and consistency.
95-180
: Use a shared error-handling utility to handle all error variantsMultiple error variants share very similar logic for constructing error details and returning a
Status
. Consider creating a shared routine to mapError
variants to aStatus
with error details. This approach follows DRY principles and makes it easier to add new error variants.rust/bin/agent/src/handler/object.rs (4)
25-30
: Implement theexists
method or clarify its unavailabilityThe
exists
method is left as atodo!()
. Consider implementing it or explicitly documenting why it is unavailable to set clear expectations for callers.
35-35
: Correct minor spelling mistake"Recieved" should be "Received."; also consider switching to a structured logger as recommended in other files.
- println!("Recieved a request from {:?}", request.remote_addr()); + println!("Received a request from {:?}", request.remote_addr());
70-76
: Return consistent error details for GetObjectCurrently, a simple
Status::new(Code::NotFound, ...)
is returned whenuuid
is missing in storage. Aligning your error details approach (e.g., usingErrorDetails
,resource_info
, andrequest_info
) makes debugging easier and code more consistent with the rest of the repository.
88-93
: Track incomplete streaming & timestamp methodsThe methods
stream_get_object
,stream_list_object
, andget_timestamp
are placeholders (todo!()
). Mark them as not implemented or stub them out with partial logic so that any future maintainers or users get clearer feedback if these methods are called.Also applies to: 97-102, 104-109
rust/bin/agent/src/handler/search.rs (1)
73-159
: Consider consolidating error handling.The error handling logic contains repetitive code patterns for constructing error details. Consider extracting this into a helper function.
+fn create_error_details( + err: Error, + domain: &str, + config: &search::Config, + req: &search::Request, + resource_type: String, + resource_name: String, +) -> ErrorDetails { + let metadata = HashMap::new(); + let mut err_details = ErrorDetails::new(); + err_details.set_error_info(err.to_string(), domain, metadata); + err_details.set_request_info( + config.request_id, + String::from_utf8(req.encode_to_vec()) + .unwrap_or_else(|_| "<invalid UTF-8>".to_string()), + ); + err_details.set_resource_info(resource_type, resource_name, "", ""); + err_details +}rust/bin/agent/src/handler/upsert.rs (4)
41-73
: Extract error messages as constants.The error messages and resource types are hardcoded. Consider extracting them as constants for better maintainability and reusability:
const ERR_MSG_DIMENSION_SIZE: &str = "Upsert API Incompatible Dimension Size detected"; const RESOURCE_TYPE_SUFFIX: &str = "/qbg.Upsert";
74-94
: Use is_empty() for UUID validation.Replace
uuid.len() == 0
with the more idiomaticuuid.is_empty()
:- if uuid.len() == 0 { + if uuid.is_empty() {
95-123
: Consider extracting request creation logic.The update and insert request creation logic could be extracted into helper functions for better readability and maintainability:
impl super::Agent { fn create_update_request(&self, vector: Option<_>, config: upsert::Config) -> update::Request { update::Request { vector, config: Some(update::Config { skip_strict_exist_check: true, filters: config.filters, timestamp: config.timestamp, disable_balanced_update: config.disable_balanced_update, }), } } fn create_insert_request(&self, vector: Option<_>, config: upsert::Config) -> insert::Request { insert::Request { vector, config: Some(insert::Config { skip_strict_exist_check: true, filters: config.filters, timestamp: config.timestamp, }), } } }
124-159
: Extract common error details construction logic.The error details construction pattern is repeated multiple times. Consider extracting it into a helper method:
impl super::Agent { fn create_error_details( &self, uuid: String, req: &upsert::Request, domain: &str, rt_name: &str, error_reason: String, ) -> ErrorDetails { let mut err_details = ErrorDetails::new(); let metadata = HashMap::new(); let resource_type = format!("{}{}", self.resource_type, rt_name); let resource_name = format!("{}: {}({})", self.api_name, self.name, self.ip); err_details.set_error_info(error_reason, domain, metadata); err_details.set_request_info( uuid, String::from_utf8(req.encode_to_vec()) .unwrap_or_else(|_| "<invalid UTF-8>".to_string()), ); err_details.set_resource_info(resource_type, resource_name, "", ""); err_details } }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
rust/Cargo.lock
is excluded by!**/*.lock
📒 Files selected for processing (14)
rust/bin/agent/Cargo.toml
(1 hunks)rust/bin/agent/src/handler.rs
(1 hunks)rust/bin/agent/src/handler/index.rs
(2 hunks)rust/bin/agent/src/handler/insert.rs
(1 hunks)rust/bin/agent/src/handler/object.rs
(1 hunks)rust/bin/agent/src/handler/remove.rs
(1 hunks)rust/bin/agent/src/handler/search.rs
(2 hunks)rust/bin/agent/src/handler/update.rs
(1 hunks)rust/bin/agent/src/handler/upsert.rs
(1 hunks)rust/bin/agent/src/main.rs
(1 hunks)rust/libs/algorithm/src/lib.rs
(1 hunks)rust/libs/algorithms/qbg/src/input.cpp
(3 hunks)rust/libs/algorithms/qbg/src/input.h
(1 hunks)rust/libs/algorithms/qbg/src/lib.rs
(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
- GitHub Check: Analyze (go)
- GitHub Check: Cloudflare Pages
🔇 Additional comments (30)
rust/bin/agent/src/handler.rs (2)
19-19
: No issues with the new module.The addition of the
object
module is straightforward. No immediate concerns are apparent for this change alone.
24-25
: Imports for Arc and RwLock.Introducing
Arc
andRwLock
indicates a shift towards a concurrency-friendly design. Verify that no other concurrency primitives are in conflict.✅ Verification successful
Concurrency primitives are correctly chosen and implemented.
The combination of
Arc
withtokio::sync::RwLock
is the idiomatic choice for shared state in an async Rust context, with no conflicting primitives present. The implementation follows async best practices by using Tokio's async-aware RwLock.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check that no conflicting concurrency abstractions are used in the codebase. rg --multiline 'Mutex<|std::sync::Mutex|'Length of output: 70437
Script:
#!/bin/bash # Search for various mutex patterns in Rust files rg --type rust -C 3 'std::sync::Mutex|parking_lot::Mutex|tokio::sync::Mutex|std::sync::RwLock|parking_lot::RwLock' rust/Length of output: 122
Script:
#!/bin/bash # Check file content to verify path and examine synchronization patterns cat rust/bin/agent/src/handler.rsLength of output: 1395
rust/libs/algorithm/src/lib.rs (2)
22-31
: New error variants introduced.These specific error types improve clarity. Ensure they are consistently used throughout the code to avoid generic error handling.
64-77
: Expansion of the ANN trait.Introducing these methods is cohesive with typical ANN workflows (create, save, insert, update, remove, search, get). Ensure consistent error handling.
rust/bin/agent/src/main.rs (5)
19-19
: Added import for Distance.Importing
Distance
is necessary for building the search results. Looks good.
21-22
: New imports for the QBG library.Including
qbg::index::Index
andqbg::property::Property
is aligned with the introduction of QBG functionalities.
222-228
: Agent instantiation with QBGService.This final linking step is correct. Good job ensuring consistent naming for resource_type and api_name.
76-103
:⚠️ Potential issueQBGService concurrency usage.
Implementation unwraps many results from the underlying
Index
. Ensure that these unwrap calls are safe from panics. Consider using proper error handling or mapping to the custom error variants.- .unwrap(); + .map_err(|e| /* handle or map to custom error variants */)?Likely invalid or redundant comment.
208-221
: Index and property initialization.The steps for QBG index creation are explicit. Verify concurrency constraints: if multiple threads need to access index creation, consider thread-safe patterns.
✅ Verification successful
Current implementation correctly handles concurrency constraints
The QBGService instance is properly managed through single ownership, being moved into the Agent handler. The async gRPC server handles requests sequentially for this single instance, requiring no additional thread-safety patterns.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Checking for concurrent usage of QBGService components rg --context 5 "QBGService"Length of output: 1653
Script:
#!/bin/bash # Check for concurrent/async usage patterns rg -B 5 -A 5 "async|tokio|thread|Arc|Mutex" rust/bin/agent/src/main.rs # Check Index and Property type definitions ast-grep --pattern 'struct Index' ast-grep --pattern 'struct Property'Length of output: 466
Script:
#!/bin/bash # Check for server/handler usage patterns rg -B 10 -A 10 "Server::bind|service_fn|handle|route" rust/bin/agent/src/main.rsLength of output: 1036
rust/bin/agent/src/handler/index.rs (2)
16-24
: Imports verification
These new imports accurately align with the subsequent usage in the file. No issues found here.
135-146
:index_info
correctness
Fetching multiple index statistics under a shared read lock is correct and thread-safe. Good usage of the read lock to avoid blocking other read operations during the info retrieval.rust/bin/agent/src/handler/remove.rs (3)
16-24
: Import usage
All imports line up with the error handling requirements below.
30-47
: Request checks and error returns
The conditions for missing configuration or ID are handled promptly. Good short-circuit pattern for invalid inputs.
68-147
: Detailed error-handling withremove
Excellent usage of domain-specific errors. Each branch clearly communicates the nature of the failure.
Consider adding a structured log or trace event to capture these error states for external monitoring.rust/bin/agent/src/handler/insert.rs (4)
16-25
: Import details
These imports are consistent with the new error handling approach.
30-40
: Parameter usage
Switching from_request
torequest
and actually leveraging it for logging is an improvement in clarity.
42-45
: Missing vector field
Promptly returning an error if the vector is missing helps keep the logic simple and prevents null-dereference risk.
46-72
: Dimension mismatch check
Implementing a dedicated error path for incompatible dimension sizes is a key correctness safeguard against data corruption.rust/bin/agent/src/handler/update.rs (1)
183-188
: Potential large payload in error logsIn some errors above, the entire request is logged, which can contain large vectors or sensitive data. Make sure that including the vector in error details and logs doesn’t violate any security or privacy requirements for your organization or users.
rust/libs/algorithms/qbg/src/input.h (1)
111-113
: Good use of const-correctnessAdding
const
qualifiers to thesearch
,get_object
, andget_dimension
methods ensures they don’t accidentally modify theIndex
state, promoting safer parallel usage and more comprehensible code.rust/bin/agent/src/handler/search.rs (3)
31-34
: Improved configuration validation.The new implementation properly handles missing configuration by using a match statement instead of directly unwrapping, which prevents potential panics.
66-72
: LGTM! Search parameters are correctly passed.The search implementation correctly uses all parameters from the configuration.
37-64
: Verify the resource type change from NGT to QBG.The resource type has been changed from "/ngt.Search" to "/qbg.Search". This change needs to be consistent across the codebase.
✅ Verification successful
Resource type transition from NGT to QBG is complete and consistent
The change from "/ngt.Search" to "/qbg.Search" has been verified to be consistent across the codebase, with no remaining NGT references and proper QBG resource types used in all handlers.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any remaining references to "ngt.Search" that might need updating rg "ngt\.Search" --type rustLength of output: 30
Script:
#!/bin/bash # Search for the new pattern and broader NGT/QBG references echo "=== Searching for qbg.Search ===" rg "qbg\.Search" --type rust echo -e "\n=== Searching for resource_type.*ngt" ===" rg "resource_type.*ngt" --type rust -i echo -e "\n=== Searching for resource_type.*qbg" ===" rg "resource_type.*qbg" --type rust -i echo -e "\n=== Searching for type-related NGT references ===" rg "type.*ngt" --type rust -iLength of output: 2479
rust/libs/algorithms/qbg/src/input.cpp (1)
336-336
: LGTM! Improved const-correctness.The addition of
const
qualifiers tosearch
,get_object
, andget_dimension
methods is a good improvement as these operations don't modify the index state.Also applies to: 390-390, 406-406
rust/libs/algorithms/qbg/src/lib.rs (3)
98-105
: LGTM! FFI interface aligns with C++ implementation.The method signatures now correctly reflect the const-qualified methods in the C++ code.
373-398
: LGTM! Improved error handling in search implementation.The implementation properly handles errors and unwraps the inner index safely.
109-112
: Verify thread safety implementation.The addition of
Send
andSync
implementations forProperty
andIndex
enables concurrent access. Please ensure all methods are actually thread-safe.✅ Verification successful
Thread safety implementation appears correct from Rust perspective
The implementation uses proper ownership semantics with
UniquePtr
andPin<&mut>
for exclusive access. Thread safety ultimately depends on the C++ implementation of these types, but the Rust-side wrapping is correctly designed.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Search for any internal mutability (Cell, RefCell, Mutex, etc.) that might affect thread safety rg "Cell|RefCell|Mutex|RwLock" --type rustLength of output: 240
Script:
#!/bin/bash # Find Property and Index type definitions and their methods ast-grep --pattern 'struct Property {$$$}' ast-grep --pattern 'struct Index {$$$}' ast-grep --pattern 'impl $_ for Property {$$$}' ast-grep --pattern 'impl $_ for Index {$$$}' # Look for unsafe blocks and raw pointer usage rg "unsafe|\\*mut|\\*const" --type rust -A 5 rust/libs/algorithms/qbg/src/Length of output: 2870
rust/bin/agent/Cargo.toml (1)
25-25
: Verify QBG version compatibility.The addition of the QBG dependency is correct, but please ensure version 0.1.0 is compatible with the rest of the project dependencies.
rust/bin/agent/src/handler/upsert.rs (2)
16-24
: Well-organized imports!The imports are logically grouped and include all necessary dependencies for error handling and protobuf types.
Line range hint
160-183
: TODO implementations align with PR objectives.The
stream_upsert
andmulti_upsert
methods are marked astodo!()
, which aligns with the PR objectives stating that Multi/Stream API implementation is not yet completed.
let hostname = cargo::util::hostname()?; | ||
let domain = hostname.to_str().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace unwrap with proper error handling for hostname.
Using unwrap()
on hostname.to_str()
could panic if the hostname contains invalid UTF-8 characters. Consider proper error handling:
- let domain = hostname.to_str().unwrap();
+ let domain = hostname.to_str().unwrap_or("<invalid-hostname>");
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
let hostname = cargo::util::hostname()?; | |
let domain = hostname.to_str().unwrap(); | |
let hostname = cargo::util::hostname()?; | |
let domain = hostname.to_str().unwrap_or("<invalid-hostname>"); |
println!("Recieved a request from {:?}", request.remote_addr()); | ||
let req = request.get_ref(); | ||
let config = match req.config.clone() { | ||
Some(cfg) => cfg, | ||
None => return Err(Status::invalid_argument("Missing configuration in request")), | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Replace println with proper logging framework.
Using println!
for request logging is not recommended in production code. Consider using a proper logging framework like tracing
or log
for better observability and control over log levels.
- println!("Recieved a request from {:?}", request.remote_addr());
+ tracing::debug!("Received a request from {:?}", request.remote_addr());
Also, there's a typo in "Recieved" -> "Received".
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
println!("Recieved a request from {:?}", request.remote_addr()); | |
let req = request.get_ref(); | |
let config = match req.config.clone() { | |
Some(cfg) => cfg, | |
None => return Err(Status::invalid_argument("Missing configuration in request")), | |
}; | |
tracing::debug!("Received a request from {:?}", request.remote_addr()); | |
let req = request.get_ref(); | |
let config = match req.config.clone() { | |
Some(cfg) => cfg, | |
None => return Err(Status::invalid_argument("Missing configuration in request")), | |
}; |
[CHATOPS:HELP] ChatOps commands.
|
Description
Related Issue
Versions
Checklist
Special notes for your reviewer
Summary by CodeRabbit
Release Notes
New Features
Improvements
Performance
Bug Fixes
These changes represent a significant upgrade to the agent's functionality and reliability.