Add attribute support for auto-annotation functions #9090

SpecLad · 2025-02-11T12:57:55Z

Motivation and context

Remove one of the long-standing limitations on auto-annotation functions by adding the necessary validation and remapping logic to support attribute specifications and values. Add a utility module for attributes with functionality I needed, but felt didn't belong in the auto-annotation layer.

Adds the necessary code to support using functions with attributes via agents, as well. The necessary server-side code will be submitted to the private repository later; until that is merged, attempts to create native functions with attributes will be rejected.

How has this been tested?

Unit tests and manual testing.

Checklist

I submit my changes into the develop branch
I have created a changelog fragment
I have updated the documentation accordingly
I have added tests to cover my changes
~~[ ] I have linked related issues (see GitHub docs)~~

License

I submit my code changes under the same MIT License that covers the project.
Feel free to contact the maintainers if that's a concern.

codecov-commenter · 2025-02-11T15:37:30Z

Codecov Report

Attention: Patch coverage is 31.72043% with 127 lines in your changes missing coverage. Please review.

Project coverage is 73.81%. Comparing base (397a915) to head (1c6212c).
Report is 2 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #9090      +/-   ##
===========================================
- Coverage    73.97%   73.81%   -0.16%     
===========================================
  Files          430      431       +1     
  Lines        44631    44794     +163     
  Branches      3892     3892              
===========================================
+ Hits         33017    33066      +49     
- Misses       11614    11728     +114

Components	Coverage Δ
cvat-ui	`77.48% <ø> (-0.02%)`	⬇️
cvat-server	`70.80% <31.72%> (-0.26%)`	⬇️

zhiltsov-max · 2025-02-13T17:05:36Z

cvat-sdk/cvat_sdk/auto_annotation/driver.py

 from .exceptions import BadFunctionError
 from .interface import DetectionFunction, DetectionFunctionContext, DetectionFunctionSpec


+@attrs.frozen
+class _AttributeNameMapping:


Shouldn't these classes called *Mapping be collections.abc.Mapping-compatible?

Well... it's a more abstract notion of mapping. The idea is that each of the XNameMapping classes defines how names within X should be mapped, so it parallels the structure of X. So, for example, if you want to map attribute.name, you look up attribute_nm.name (where attribute_nm is an _AttributeNameMapping object).

Maybe they shouldn't be called *Mapping, but I couldn't think of anything better when I created them.
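To illustrate the "parallel structure" idea described above, here is a minimal sketch. It does not reproduce the classes in driver.py: the class names, the `map_attribute` helper, and the rule that unlisted attributes keep their own name are all assumptions made for the example, and stdlib dataclasses stand in for attrs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AttributeNameMapping:
    # Hypothetical: the name this attribute should have on the dataset side.
    name: str


@dataclass(frozen=True)
class LabelNameMapping:
    # Hypothetical: mirrors a label, which has a name and a set of attributes.
    name: str
    attributes: dict[str, AttributeNameMapping] = field(default_factory=dict)

    def map_attribute(self, function_attr_name: str) -> AttributeNameMapping:
        # Assumption of this sketch: attributes without an explicit entry
        # map to themselves.
        return self.attributes.get(
            function_attr_name, AttributeNameMapping(function_attr_name)
        )


# The function calls the attribute "colour"; the dataset calls it "color".
label_nm = LabelNameMapping(
    name="cat",
    attributes={"colour": AttributeNameMapping("color")},
)

# To map attribute.name, look up the matching mapping object and read its .name.
mapped_color = label_nm.map_attribute("colour").name
mapped_name = label_nm.map_attribute("name").name
```

These objects are not `collections.abc.Mapping` instances; each one only describes how names at one level of the structure are translated.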

zhiltsov-max · 2025-02-14T10:47:14Z

tests/python/sdk/test_auto_annotation.py

+                cvataa.skeleton(
+                    123,  # cat
+                    [
+                        # head
+                        cvataa.keypoint(
+                            10,
+                            [10, 10],
+                            attributes=[
+                                cvataa.attribute_val(1, "5"),  # size
+                                cvataa.attribute_val(2, "forward"),  # orientation
+                            ],
+                        ),
+                        # tail
+                        cvataa.keypoint(30, [30, 30]),
+                    ],
+                    attributes=[
+                        cvataa.attribute_val(1, "calico"),  # color
+                        cvataa.attribute_val(2, "McFluffy"),  # name
+                    ],


I think it would be convenient to pass attributes as a dict with attr name as a key, what do you think? Probably, they are unique within a label.

The same idea is valid for labels as well, but as we can have nested ones with the same names, we should check for name overlaps and also support a list of strings (e.g. ["parent name", "sublabel name"] or the inverse) to explicitly resolve such conflicts.

> I think it would be convenient to pass attributes as a dict with attr name as a key, what do you think?

I think it's irrelevant. detect outputs LabeledShapeRequests, and attribute values inside a LabeledShapeRequest are identified by IDs.

Furthermore, I'm not sure this would be a meaningful improvement. The attribute identifier (be it name or number) has to occur at least twice in a function (in the spec and in detect), so in a real-life function I would expect it to be factored out as a named constant. And once you have a constant, it doesn't matter if its value is a name or a number.

> The same idea is valid for labels as well

NNs usually output labels as numbers, so I think requiring names in detect would be a downgrade in convenience.

> attribute values inside a LabeledShapeRequest are identified by IDs.

So, basically, this is the problem I'm talking about. If we make attribute ids optional in the attr spec, we can leave only names in the user code.

> expect it to be factored out as a named constant. And once you have a constant, it doesn't matter if its value is a name or a number.

Ok, but if you have a number, it means you have to define and maintain it somewhere. If you have only a name, you don't need to map it to a number. To me, returning

    attributes={
        "color": "calico",
        "name": "McFluffy",
    },

or, more realistically:

    attributes={
        "color": output_tensor[dim_id],
        "name": output_tensor[another_dim_id],
    },

looks significantly better than

    attributes=[
        cvataa.attribute_val(self.attr_name_map["color"], output_tensor[dim_id]),
        cvataa.attribute_val(self.attr_name_map["name"], output_tensor[another_dim_id]),
    ],

Note that even the tests currently carry comments giving the class and attribute names at every use.

> So, basically, this is the problem I'm talking about. If we make attribute ids optional in the attr spec, we can leave only names in the user code.

How are you proposing to do that? AttributeValRequest has a mandatory spec_id field (and no spec_name field).

> Looks significantly better than

I agree, but I don't think it's the whole picture (also, I don't think you'd need anything like self.attr_name_map). Here's what a full function with attributes would look like, in my view:

    ATTR_ID_COLOR = 1

    spec = DetectionFunctionSpec(... attributes=[cvataa.radio_attribute_spec("color", ATTR_ID_COLOR, ...)] ...)

    def detect(...):
        ... attributes=[cvataa.attribute_val(ATTR_ID_COLOR, ...)] ...

And here's what it would look like with attribute names only:

    ATTR_NAME_COLOR = "color"

    spec = DetectionFunctionSpec(... attributes=[cvataa.radio_attribute_spec(ATTR_NAME_COLOR, ...)] ...)

    def detect(...):
        ... attributes={ATTR_NAME_COLOR: ...} ...

which is nicer, sure, but I don't think it's a substantial enough improvement.

In this example the map and id declaration is converted into a manual declaration of a list of constants. You still need to define and maintain these ids regardless of the declaration style. Technically, in both cases you could write without a constant:

    spec = DetectionFunctionSpec(... attributes=[cvataa.radio_attribute_spec("color", 42, ...)] ...)

    def detect(...):
        ... attributes={42: ...} ...

and

    spec = DetectionFunctionSpec(... attributes=[cvataa.radio_attribute_spec("color", ...)] ...)

    def detect(...):
        ... attributes={"color": ...} ...

However, with ids you're basically forced to introduce a constant or a mapping or a comment to make the code clear. With attribute names, you can just write the code naturally, even though it's not a good programming style.

In both cases, supporting a dict of attributes reduces boilerplate code, so it looks like a good idea.

Okay, I tried to find a solution by adding a more complicated helper function. It takes a dictionary and converts it to a list of AttributeValRequests. This function also does type conversions, so it should also cover your concerns from another thread.

However, the attributes are still identified by number. I don't think I can support string identifiers without making everything more complicated than it's worth.

Ok, I think we can stop at this point. It still can be improved, but it's better to do this in a separate PR.

zhiltsov-max · 2025-02-14T10:55:19Z

tests/python/sdk/test_auto_annotation.py

+                    attributes=[
+                        cvataa.select_attribute_spec("color", 1, ["gray", "calico"]),
+                        cvataa.text_attribute_spec("name (should be ignored)", 2),
+                    ],


*_attribute_spec - consider making id optional. Probably, names are unique within a label.

sonarqubecloud · 2025-02-19T11:07:43Z

Quality Gate passed

Issues
5 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

SpecLad force-pushed the aa-attributes branch 3 times, most recently from 2cb88b3 to 6681b9b, February 11, 2025 13:32

SpecLad marked this pull request as ready for review February 11, 2025 13:37

SpecLad requested review from zhiltsov-max, bsekachev and nmanovic as code owners February 11, 2025 13:37

zhiltsov-max reviewed Feb 13, 2025


zhiltsov-max reviewed Feb 13, 2025
