-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added logging for DatasetNotFound
error
#7778
Conversation
cvat/apps/engine/log.py
Outdated
base_info = f"[{entity}.id={entity_id} format.name={format_name} exc={base_error}]" | ||
dir_tree = directory_tree( | ||
path=dir_path, | ||
max_depth=5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a class constant or constructor parameter with default value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@azhavoro
Do you mean a param in DatasetLogManager for max depth?
cvat/apps/engine/utils.py
Outdated
for item in os.listdir(path): | ||
full_path = os.path.join(path, item) | ||
if os.path.isdir(full_path): | ||
output += directory_tree(full_path, max_depth, current_depth + 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't using os.walk be simpler and more efficient way here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, Ive tried to use walk from datumaro which is based on os.walk and also supports depth
@coderabbitai help |
CodeRabbit Commands
|
@coderabbitai review |
WalkthroughThe updates across several files in the CVAT application enhance error logging for dataset imports. New functionalities include logging detailed errors when datasets are not found and adjusting settings to enable or disable this feature. Additionally, these changes integrate a new logging class and utilize directory tree structures for better error insight, all controllable via environment settings in Docker. Changes
Recent Review DetailsConfiguration used: CodeRabbit UI Files selected for processing (1)
Files skipped from review as they are similar to previous changes (1)
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
cvat/apps/engine/log.py
Outdated
class DatasetLogManager: | ||
def __init__(self) -> None: | ||
self.glob = logging.getLogger("dataset_logger") | ||
|
||
def log_import_error(self, entity, entity_id, format_name, base_error, dir_path) -> None: | ||
base_info = f"[{entity}.id={entity_id} format.name={format_name} exc={base_error}]" | ||
dir_tree = directory_tree( | ||
path=dir_path, | ||
max_depth=5, | ||
) | ||
log_error = f"{base_info} \nDirectory tree:\n{dir_tree}" | ||
self.glob.error(log_error) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider adding a class constant or constructor parameter for the logger name.
This would increase flexibility by allowing different logger configurations without modifying the class code directly. Here's a suggested change:
- self.glob = logging.getLogger("dataset_logger")
+ DEFAULT_LOGGER_NAME = "dataset_logger"
+ def __init__(self, logger_name=DEFAULT_LOGGER_NAME) -> None:
+ self.glob = logging.getLogger(logger_name)
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
class DatasetLogManager: | |
def __init__(self) -> None: | |
self.glob = logging.getLogger("dataset_logger") | |
def log_import_error(self, entity, entity_id, format_name, base_error, dir_path) -> None: | |
base_info = f"[{entity}.id={entity_id} format.name={format_name} exc={base_error}]" | |
dir_tree = directory_tree( | |
path=dir_path, | |
max_depth=5, | |
) | |
log_error = f"{base_info} \nDirectory tree:\n{dir_tree}" | |
self.glob.error(log_error) | |
class DatasetLogManager: | |
DEFAULT_LOGGER_NAME = "dataset_logger" | |
def __init__(self, logger_name=DEFAULT_LOGGER_NAME) -> None: | |
self.glob = logging.getLogger(logger_name) | |
def log_import_error(self, entity, entity_id, format_name, base_error, dir_path) -> None: | |
base_info = f"[{entity}.id={entity_id} format.name={format_name} exc={base_error}]" | |
dir_tree = directory_tree( | |
path=dir_path, | |
max_depth=5, | |
) | |
log_error = f"{base_info} \nDirectory tree:\n{dir_tree}" | |
self.glob.error(log_error) | |
cvat/apps/engine/utils.py
Outdated
def directory_tree(path, max_depth=None, current_depth=0) -> str: | ||
if not os.path.exists(path): | ||
return f"No such file or directory: {path}" | ||
|
||
output = "" | ||
if current_depth > 0: | ||
output += "| " * (current_depth - 1) + "|- " | ||
|
||
output += os.path.basename(path) + "/\n" | ||
|
||
if max_depth is None or current_depth < max_depth: | ||
for item in os.listdir(path): | ||
full_path = os.path.join(path, item) | ||
if os.path.isdir(full_path): | ||
output += directory_tree(full_path, max_depth, current_depth + 1) | ||
else: | ||
output += "| " * current_depth + "|- " + item + "\n" | ||
return output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimize the directory_tree
function using os.walk
.
Using os.walk
could simplify and potentially optimize the directory tree generation. Here's a suggested implementation:
- for item in os.listdir(path):
- full_path = os.path.join(path, item)
- if os.path.isdir(full_path):
- output += directory_tree(full_path, max_depth, current_depth + 1)
- else:
- output += "| " * current_depth + "|- " + item + "\n"
+ for root, dirs, files in os.walk(path, topdown=True):
+ for name in dirs:
+ output += "| " * current_depth + "|- " + name + "/\n"
+ for name in files:
+ output += "| " * current_depth + "|- " + name + "\n"
+ break # This ensures only one level is processed at a time, similar to the original functionality
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation.
def directory_tree(path, max_depth=None, current_depth=0) -> str: | |
if not os.path.exists(path): | |
return f"No such file or directory: {path}" | |
output = "" | |
if current_depth > 0: | |
output += "| " * (current_depth - 1) + "|- " | |
output += os.path.basename(path) + "/\n" | |
if max_depth is None or current_depth < max_depth: | |
for item in os.listdir(path): | |
full_path = os.path.join(path, item) | |
if os.path.isdir(full_path): | |
output += directory_tree(full_path, max_depth, current_depth + 1) | |
else: | |
output += "| " * current_depth + "|- " + item + "\n" | |
return output | |
def directory_tree(path, max_depth=None, current_depth=0) -> str: | |
if not os.path.exists(path): | |
return f"No such file or directory: {path}" | |
output = "" | |
if current_depth > 0: | |
output += "| " * (current_depth - 1) + "|- " | |
output += os.path.basename(path) + "/\n" | |
if max_depth is None or current_depth < max_depth: | |
for root, dirs, files in os.walk(path, topdown=True): | |
for name in dirs: | |
output += "| " * current_depth + "|- " + name + "/\n" | |
for name in files: | |
output += "| " * current_depth + "|- " + name + "\n" | |
break # This ensures only one level is processed at a time, similar to the original functionality | |
return output |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Out of diff range and nitpick comments (1)
cvat/settings/base.py (1)
471-478
: The new logging configuration for dataset handling looks good. Consider prefixing the handler name withCVAT_
to maintain consistency with other CVAT settings.
Motivation and context
This PR adds logs for further invistigation of
DatasetNotFound
error which frequently appears in case of importing datasets in various formatsHow has this been tested?
Checklist
develop
branch[ ] I have created a changelog fragment[ ] I have updated the documentation accordingly[ ] I have added tests to cover my changes[ ] I have linked related issues (see GitHub docs)[ ] I have increased versions of npm packages if it is necessary(cvat-canvas,
cvat-core,
cvat-data and
cvat-ui)
License
Feel free to contact the maintainers if that's a concern.
Summary by CodeRabbit
New Features
Enhancements
Configuration Changes
CVAT_LOG_IMPORT_ERRORS
to control the logging of dataset import errors, set to'true'
by default in the Docker configuration.