
Release v2.12.1 #7811

Merged
merged 27 commits into master from release-2.12.1 on Apr 29, 2024
Conversation

cvat-bot[bot] (Contributor) commented Apr 26, 2024

Fixed

cvat-bot bot and others added 27 commits April 15, 2024 20:20
…ts (#7669)

### Motivation and context

Main Issue: #7571 
Related Issue: #2339 

I've reproduced the issue mentioned in #7571 when exporting and
importing annotations using both the Datumaro and COCO 1.0 formats.
Specifically, the "Switch outside" attribute isn't being applied as
expected. After some investigation, I pinpointed the root cause: the
"outside" attribute is absent from the exported annotations.

To address this, I've made adjustments to the binding.py file to bypass
the track_id during annotation import. This modification appears to
solve the issue regarding the "Switch outside" attribute. However, it
introduces a new concern: the potential loss of information, including
keyframes and track_id.
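
For illustration, here is a minimal, self-contained sketch of the workaround's idea: dropping `track_id` during import so shapes are not merged into tracks. The helper and data below are hypothetical, not the actual binding.py code.

```python
from typing import Any, Dict, Iterable, List

def strip_track_ids(raw_shapes: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop track_id so each shape imports as a standalone annotation.
    As noted above, this trades away keyframe/track information."""
    cleaned = []
    for shape in raw_shapes:
        shape = dict(shape)          # avoid mutating the caller's data
        shape.pop('track_id', None)  # bypass track grouping on import
        cleaned.append(shape)
    return cleaned

# Hypothetical example: the second shape would otherwise join track 3.
shapes = [
    {'type': 'rectangle', 'frame': 0, 'points': [0, 0, 10, 10]},
    {'type': 'rectangle', 'frame': 1, 'points': [1, 1, 11, 11], 'track_id': 3},
]
print(strip_track_ids(shapes))  # both shapes come back without a track_id
```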

While this workaround offers a temporary fix, I'm contemplating a more
holistic approach that entails properly handling the "outside" attribute
during both the export and import processes of annotations. This method
could preserve the integrity of the annotations while ensuring the
functionality of the "Switch outside" attribute.

I'm reaching out for feedback or suggestions on my proposed solution. Is
there a preference for one of these approaches, or might there be
another avenue I haven't considered?

Looking forward to your insights.

### Checklist
- [x] I submit my changes into the `develop` branch
- [x] I have created a changelog fragment
- [ ] I have updated the documentation accordingly
- [x] I have added tests to cover my changes
- [x] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))
- [ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/opencv/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/opencv/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/opencv/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/opencv/cvat/tree/develop/cvat-ui#versioning))

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/opencv/cvat/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern.

---------

Co-authored-by: Maxim Zhiltsov <[email protected]>
### Motivation and context
After analyzing the current implementation, it turned out that we
evaluate the queryset and iterate over it only once, when merging table
rows and initializing the custom structure for storing objects. This
means we can disable Django's default queryset caching, which reduces
the amount of memory used when there is a large number of objects.
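
As a minimal sketch of the difference (the model name and processing function are placeholders, not the real CVAT code): by default Django caches every fetched row on the queryset, while `QuerySet.iterator()` streams rows in chunks without populating that cache.

```python
# Placeholder model/processing; the real code lives in dataset_manager/task.py.
rows = LabeledShape.objects.values('id', 'frame', 'points').order_by('frame')

# Default: every row fetched here is also cached on `rows`, so all
# ~1.7M dicts stay in memory for the queryset's lifetime.
for row in rows:
    process(row)

# With .iterator(): rows are streamed from the DB cursor in chunks and
# never cached on the queryset, keeping memory roughly constant.
for row in rows.iterator(chunk_size=2000):
    process(row)
```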

### How has this been tested?
I've checked the amount of memory used, the number of queries to the
database, and the required time.
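
The line-by-line reports below match memory_profiler's output format; assuming that is the tool used, here is a minimal usage sketch:

```python
# pip install memory-profiler
from memory_profiler import profile

@profile  # prints a line-by-line "Mem usage / Increment / Occurrences" table
def build_rows():
    # Allocate something sizable so the report shows a clear increment.
    return [dict(id=i, frame=i % 100) for i in range(1_000_000)]

if __name__ == '__main__':
    build_rows()
```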

**Without iterator:**
Memory usage: 18GB
<details>
  <summary>Profile details</summary>
    
```
filename: /home/maya/Documents/cvat/cvat/apps/dataset_manager/task.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
    116    290.7 MiB    290.7 MiB           1   @Profile
    117                                         def _merge_table_rows(rows, keys_for_merge, field_id):
    118                                             # It is necessary to keep a stable order of original rows
    119                                             # (e.g. for tracked boxes). Otherwise prev_box.frame can be bigger
    120                                             # than next_box.frame.
    121    290.7 MiB      0.0 MiB           1       merged_rows = OrderedDict()
    122                                         
    123                                             # Group all rows by field_id. In grouped rows replace fields in
    124                                             # accordance with keys_for_merge structure.
    125  19053.0 MiB  17140.9 MiB     1721804       for row in rows: #.iterator():
    126  19053.0 MiB   -567.5 MiB     1721803           row_id = row[field_id]
    127  19053.0 MiB   -567.6 MiB     1721803           if not row_id in merged_rows:
    128  19053.0 MiB    127.0 MiB      373063               merged_rows[row_id] = dotdict(row)
    129  19053.0 MiB   -162.9 MiB      746126               for key in keys_for_merge:
    130  19053.0 MiB   -111.0 MiB      373063                   merged_rows[row_id][key] = []
    131                                         
    132  19053.0 MiB  -1121.3 MiB     3443606           for key in keys_for_merge:
    133  19053.0 MiB  -2753.5 MiB    10330818               item = dotdict({v.split('__', 1)[-1]:row[v] for v in keys_for_merge[key]})
    134  19053.0 MiB   -485.9 MiB     1721803               if item.id is not None:
    135  19053.0 MiB   -525.5 MiB     1573530                   merged_rows[row_id][key].append(item)
    136                                         
    137                                             # Remove redundant keys from final objects
    138  19053.0 MiB      0.0 MiB           7       redundant_keys = [item for values in keys_for_merge.values() for item in values]
    139  19053.0 MiB  -5728.4 MiB      373064       for i in merged_rows:
    140  19053.0 MiB -22913.7 MiB     1492252           for j in redundant_keys:
    141  19053.0 MiB -17185.2 MiB     1119189               del merged_rows[i][j]
    142                                         
    143  19052.9 MiB     -0.0 MiB           1       return list(merged_rows.values())
```

```
filename: /home/maya/Documents/cvat/cvat/apps/dataset_manager/task.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   749    290.7 MiB    290.7 MiB           1       @Profile
   750                                             def init_from_db(self):
   751    290.7 MiB      0.0 MiB           1           self._init_tags_from_db()
   752  19052.3 MiB  18761.7 MiB           1           self._init_shapes_from_db()
   753  19052.3 MiB      0.0 MiB           1           self._init_tracks_from_db()
   754  19052.3 MiB      0.0 MiB           1           self._init_version_from_db()
```
</details>

![Screenshot from 2024-04-10 13-57-33](https://github.com/cvat-ai/cvat/assets/49038720/9bc60a21-6676-422c-a938-39b17ff484b2)
**With iterator:**
Memory usage: 5.5GB
<details>
  <summary>Profile details</summary>
    
```
filename: /home/maya/Documents/cvat/cvat/apps/dataset_manager/task.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   116    290.9 MiB    290.9 MiB           1   @Profile
   117                                         def _merge_table_rows(rows, keys_for_merge, field_id):
   118                                             # It is necessary to keep a stable order of original rows
   119                                             # (e.g. for tracked boxes). Otherwise prev_box.frame can be bigger
   120                                             # than next_box.frame.
   121    290.9 MiB      0.0 MiB           1       merged_rows = OrderedDict()
   122                                         
   123                                             # Group all rows by field_id. In grouped rows replace fields in
   124                                             # accordance with keys_for_merge structure.
   125   4345.7 MiB   3783.7 MiB     1721804       for row in rows.iterator():
   126   4345.7 MiB     24.2 MiB     1721803           row_id = row[field_id]
   127   4345.7 MiB      0.0 MiB     1721803           if not row_id in merged_rows:
   128   4345.7 MiB     78.9 MiB      373063               merged_rows[row_id] = dotdict(row)
   129   4345.7 MiB      5.9 MiB      746126               for key in keys_for_merge:
   130   4345.7 MiB      0.0 MiB      373063                   merged_rows[row_id][key] = []
   131                                         
   132   4345.7 MiB      0.3 MiB     3443606           for key in keys_for_merge:
   133   4345.7 MiB    152.9 MiB    10330818               item = dotdict({v.split('__', 1)[-1]:row[v] for v in keys_for_merge[key]})
   134   4345.7 MiB      9.0 MiB     1721803               if item.id is not None:
   135   4345.7 MiB      0.0 MiB     1573530                   merged_rows[row_id][key].append(item)
   136                                         
   137                                             # Remove redundant keys from final objects
   138   4345.7 MiB      0.0 MiB           7       redundant_keys = [item for values in keys_for_merge.values() for item in values]
   139   4345.7 MiB      0.0 MiB      373064       for i in merged_rows:
   140   4345.7 MiB      0.0 MiB     1492252           for j in redundant_keys:
   141   4345.7 MiB      0.0 MiB     1119189               del merged_rows[i][j]
   142                                         
   143   4348.5 MiB      2.8 MiB           1       return list(merged_rows.values())
```
```
Filename: /home/maya/Documents/cvat/cvat/apps/dataset_manager/task.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   590    290.9 MiB    290.9 MiB           1       @Profile
   591                                             def _init_shapes_from_db(self):
   592    290.9 MiB      0.0 MiB           3           db_shapes = self.db_job.labeledshape_set.prefetch_related(
   593    290.9 MiB      0.0 MiB           1               "label",
   594    290.9 MiB      0.0 MiB           1               "labeledshapeattributeval_set"
   595    290.9 MiB      0.0 MiB           2           ).values(
   596    290.9 MiB      0.0 MiB           1               'id',
   597    290.9 MiB      0.0 MiB           1               'label_id',
   598    290.9 MiB      0.0 MiB           1               'type',
   599    290.9 MiB      0.0 MiB           1               'frame',
   600    290.9 MiB      0.0 MiB           1               'group',
   601    290.9 MiB      0.0 MiB           1               'source',
   602    290.9 MiB      0.0 MiB           1               'occluded',
   603    290.9 MiB      0.0 MiB           1               'outside',   
   604    290.9 MiB      0.0 MiB           1               'z_order',
   605    290.9 MiB      0.0 MiB           1               'rotation',
   606    290.9 MiB      0.0 MiB           1               'points',
   607    290.9 MiB      0.0 MiB           1               'parent',
   608    290.9 MiB      0.0 MiB           1               'labeledshapeattributeval__spec_id',
   609    290.9 MiB      0.0 MiB           1               'labeledshapeattributeval__value',
   610    290.9 MiB      0.0 MiB           1               'labeledshapeattributeval__id',
   611    290.9 MiB      0.0 MiB           1               ).order_by('frame')                         
   618   4328.6 MiB   4037.7 MiB       2           db_shapes = _merge_table_rows(
   619    290.9 MiB      0.0 MiB           1               rows=db_shapes,
   620    290.9 MiB      0.0 MiB           1               keys_for_merge={
   621    290.9 MiB      0.0 MiB           1                   'labeledshapeattributeval_set': [
   622                                                                       'labeledshapeattributeval__spec_id',
   623                                                                       'labeledshapeattributeval__value',
   624                                                                       'labeledshapeattributeval__id',
   625                                                                  ],
   626                                                                  },
   627    290.9 MiB      0.0 MiB           1               field_id='id',
   628                                                 )
   649   4385.6 MiB      0.0 MiB           1           serializer = serializers.LabeledShapeSerializerFromDB(list(shapes.values()), many=True)
   650   5990.2 MiB   1604.6 MiB           1           self.ir_data.shapes = serializer.data
```
```
Filename: /home/maya/Documents/cvat/cvat/apps/dataset_manager/task.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
   749    290.6 MiB    290.6 MiB           1       @Profile
   750                                             def init_from_db(self):
   751    290.9 MiB      0.3 MiB           1           self._init_tags_from_db()
   752   5990.2 MiB   5699.3 MiB           1           self._init_shapes_from_db()
   753   5990.2 MiB      0.0 MiB           1           self._init_tracks_from_db()
   754   5990.2 MiB      0.0 MiB           1           self._init_version_from_db()
```
</details>

![Screenshot from 2024-04-10 12-36-09](https://github.com/cvat-ai/cvat/assets/49038720/430ac2d0-e92b-43fb-ab17-481e287b6447)

### Checklist
- [x] I submit my changes into the `develop` branch
- [x] I have created a changelog fragment
~~- [ ] I have updated the documentation accordingly~~
~~- [ ] I have added tests to cover my changes~~
~~- [ ] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))~~
~~- [ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/cvat-ai/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/cvat-ai/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/cvat-ai/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/cvat-ai/cvat/tree/develop/cvat-ui#versioning))~~

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/cvat-ai/cvat/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern.

There are three changes here:

* Don't save the bundle to a file. There seems to be no reason to do
that; the bundle is very small (<8000 bytes), and we already load it
fully into memory to generate the ETag, so just generate it in memory
and keep it there.

This also fixes a possible race condition where a server process starts
to read the bundle file right as another server process starts up and
overwrites it.

* Generate the bundle on-demand rather than at startup. This obviates
the need for the `IAM_OPA_BUNDLE` environment variable.

* Generate the ETag once, instead of on every request. There's no reason
to hash the same bundle over and over.
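
A rough sketch of how these three points can combine, assuming the bundle is a gzipped tarball of `.rego` files (the `RULE_PATHS` constant and function name are illustrative, not the actual CVAT code):

```python
import hashlib
import io
import tarfile
from functools import lru_cache
from pathlib import Path

RULE_PATHS = [Path('rules/utils.rego')]  # stand-in for the real rule list

@lru_cache(maxsize=1)
def get_opa_bundle():
    """Build the bundle in memory on first use (on-demand, never written
    to disk) and compute its ETag exactly once; both are cached."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode='w:gz') as tar:
        for path in RULE_PATHS:
            tar.add(path, arcname=path.name)
    bundle = buffer.getvalue()
    etag = hashlib.sha256(bundle).hexdigest()
    return bundle, etag
```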
…7734)

This improves the dependency graph between IAM and the other apps. IAM
is a generic app, so it should not depend on specific apps, like
`webhooks`.

It's also easier to work with a bunch of smaller files rather than one
giant one.

Having permissions classes in separate modules introduces the risk that
they will never be loaded directly. To mitigate that, add a function to
ensure permissions are loaded and call it from every app.

In a future PR, I plan to move the `.rego` files into their respective
apps, as well.
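
One plausible shape for that loader (the module-path convention and the usage shown are assumptions, not the actual CVAT code):

```python
import importlib

def load_app_permissions(app_config) -> None:
    """Force-import an app's permissions module so its permission
    classes register themselves as an import side effect."""
    importlib.import_module(f'{app_config.name}.permissions')

# Hypothetical usage from an app's AppConfig:
#
# class WebhooksConfig(AppConfig):
#     name = 'cvat.apps.webhooks'
#
#     def ready(self):
#         load_app_permissions(self)
```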
In many cases where the code says "authorized"/"authorization", what is
actually meant is "authenticated"/"authentication". This patch fixes
such mistakes across the codebase.
### Motivation and context
This PR adds logs for further investigation of the `DatasetNotFound`
error, which frequently appears when importing datasets in various
formats.
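
For context, a self-contained sketch of the kind of logging being added; the logger name, file path, error type, and import call here are placeholders, not the actual CVAT configuration:

```python
import logging

class DatasetNotFoundError(Exception):
    """Placeholder for the real error raised by the import pipeline."""

logger = logging.getLogger('dataset_import_errors')
handler = logging.FileHandler('dataset_import_errors.log')
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(handler)

def import_dataset(path: str, format_name: str) -> None:
    raise DatasetNotFoundError(f'no dataset found at {path}')  # stand-in failure

try:
    import_dataset('/data/task_1', 'COCO 1.0')
except DatasetNotFoundError:
    # logger.exception records the traceback alongside the message.
    logger.exception('Dataset import failed (format: %s)', 'COCO 1.0')
```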

### How has this been tested?

### Checklist
- [x] I submit my changes into the `develop` branch
- ~~[ ] I have created a changelog fragment~~
- ~~[ ] I have updated the documentation accordingly~~
- ~~[ ] I have added tests to cover my changes~~
- ~~[ ] I have linked related issues (see [GitHub docs](https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword))~~
- ~~[ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/cvat-ai/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/cvat-ai/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/cvat-ai/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/cvat-ai/cvat/tree/develop/cvat-ui#versioning))~~

### License

- [x] I submit _my code changes_ under the same [MIT License](https://github.com/cvat-ai/cvat/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern.


## Summary by CodeRabbit


- **New Features**
  - Added detailed error logging for dataset import errors to enhance troubleshooting and user feedback.
- **Enhancements**
  - Introduced a new logging configuration for handling dataset import errors, including file location and formatting details.
- **Configuration Changes**
  - Added an environment variable `CVAT_LOG_IMPORT_ERRORS` to control the logging of dataset import errors, set to `'true'` by default in the Docker configuration.

coderabbitai[bot] (Contributor) commented Apr 26, 2024

**Important**: Auto Review Skipped (bot user detected). To trigger a single review, invoke the `@coderabbitai review` command.



cvat-bot[bot] merged commit 6990168 into master on Apr 29, 2024 (29 checks passed).
cvat-bot[bot] deleted the release-2.12.1 branch on April 29, 2024 at 09:06.