
Handle skeleton encoding internally #1970

Merged
merged 40 commits into develop from elizabeth/handle-skeleton-encoding-internally
Sep 25, 2024

Conversation

eberrigan
Contributor

@eberrigan eberrigan commented Sep 20, 2024

Description

Looking at our fly_skeleton_legs.json test data:
Our final json_str has keys directed, graph, links, multigraph, nodes.

  • directed is a boolean. (UNCHANGED from input graph)
  • graph is a dict with keys name and num_edges_inserted. (UNCHANGED from input graph)
  • links is a list of dicts with keys edge_insert_idx, key, source, target, and type for each edge.
  • multigraph is a boolean = True. (UNCHANGED from input graph)
  • nodes is a list of dicts with key id for each node.
| input object to JSON encoder | python object location in graph dict | encoded object as JSON string |
| --- | --- | --- |
| Node(name='neck', weight=1.0) | "links": [{..., "source": Node(name='neck', weight=1.0), ...}] | {"py/object": "sleap.skeleton.Node", "py/state": {"py/tuple": ["neck", 1.0]}} |
| Node(name='head', weight=1.0) | "links": [{..., "target": Node(name='head', weight=1.0), ...}] | {"py/object": "sleap.skeleton.Node", "py/state": {"py/tuple": ["head", 1.0]}} |
| <EdgeType.BODY: 1> | "links": [{..., "type": <EdgeType.BODY: 1>, ...}] | {"py/reduce": [{"py/type": "sleap.skeleton.EdgeType"}, {"py/tuple": [1]}, null, null, null]} |
| Node(name='head', weight=1.0) | "nodes": [{'id': Node(name='head', weight=1.0)}, ...] | {"id": {"py/id": 2}} |
  • This is because Skeleton._graph is passed through json_graph.node_link_data, which returns a dictionary with node-link formatted data (see the networkx documentation).
# networkx's node_link_data (signature as in networkx 2.x; body abridged):
def node_link_data(
    G, attrs=None, *, source="source", target="target", name="id", key="key", link="links"
):
    multigraph = G.is_multigraph()

    # Allow 'key' to be omitted from attrs if the graph is not a multigraph.
    key = None if not multigraph else key
    if len({source, target, key}) < 3:
        raise nx.NetworkXError("Attribute names are not unique.")
    data = {
        "directed": G.is_directed(),
        "multigraph": multigraph,
        "graph": G.graph,
        "nodes": [{**G.nodes[n], name: n} for n in G],
    }
    if multigraph:
        data[link] = [
            {**d, source: u, target: v, key: k}
            for u, v, k, d in G.edges(keys=True, data=True)
        ]
    else:
        data[link] = [{**d, source: u, target: v} for u, v, d in G.edges(data=True)]
    return data
  • Note that our node-link data has the key links and not link: the parameter is named link, but its default value is "links" (the sketch below confirms the output keys).
    • We do not have networkx pinned. I am not sure why that hasn't been a problem.
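A minimal sketch (assuming networkx 2.x, with a tiny multigraph standing in for Skeleton._graph) confirming the shape of the returned dict:

import networkx as nx
from networkx.readwrite import json_graph

# Build a tiny multigraph resembling Skeleton._graph.
G = nx.MultiDiGraph(name="demo")
G.add_edge("neck", "head", key=0, edge_insert_idx=1)

data = json_graph.node_link_data(G)
print(sorted(data))   # ['directed', 'graph', 'links', 'multigraph', 'nodes']
print(data["links"])  # [{'edge_insert_idx': 1, 'source': 'neck', 'target': 'head', 'key': 0}]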
  • Then the node-link formatted data is passed to jsonpickle.encode(data).
    • Serializes the SLEAP class Node. Each Node object, which has attributes name and weight, is encoded as a Python object sleap.skeleton.Node. The Node's internal state is represented as a Python tuple containing the name (as a string) and weight (as a numerical value): ["name", weight].
      • In the case where the node is an int, such as when the skeleton has a node_to_idx mapping (sleap/skeleton.py, lines 1002 to 1006 at 3c7f5af; a small illustration follows):

        jsonpickle.set_encoder_options("simplejson", sort_keys=True, indent=4)
        if node_to_idx is not None:
            indexed_node_graph = nx.relabel_nodes(
                G=self._graph, mapping=node_to_idx
            )  # map nodes to int

        i.e. when the skeleton is from Labels, the node is returned as an integer without the name and weight attributes.
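A sketch (not SLEAP's code) of what nx.relabel_nodes does here, using strings as stand-ins for Node objects:

import networkx as nx

G = nx.MultiDiGraph()
G.add_edge("head", "neck")  # stand-ins for Node objects

node_to_idx = {"head": 0, "neck": 1}
indexed = nx.relabel_nodes(G, mapping=node_to_idx)
print(list(indexed.nodes))  # [0, 1] -- plain ints, no name/weight to serialize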
    • Serializes the SLEAP class EdgeType. EdgeType inherits from Python's Enum class. The possible edge types are BODY = 1 and SYMMETRY = 2. When serialized, EdgeType is represented using Python's reduce function, storing the type as sleap.skeleton.EdgeType and the value as a Python tuple containing the Enum value.
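A hedged sketch of that py/reduce encoding (the exact output varies across jsonpickle versions; the legacy files contain the form shown in the table above):

from enum import Enum

import jsonpickle

class EdgeType(Enum):  # stand-in for sleap.skeleton.EdgeType
    BODY = 1
    SYMMETRY = 2

print(jsonpickle.encode(EdgeType.BODY))
# e.g. {"py/reduce": [{"py/type": "__main__.EdgeType"}, {"py/tuple": [1]}, null, null, null]}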
  • If the object has been "seen" before, it will not be encoded as the full JSON string but referenced by its py/id, which starts at 1 and indexes the objects in the order they are seen so that the second time the first object is used, it will be referenced as {"py/id": 1}.
    • This is tricky because looking at fly_skeleton_legs.json we can see that the first time a node is used in a source or target in the links list, it is encoded as a py/object with a py/state. Then it is referenced later on using the py/id. nodes then uses only the py/id to reference the Node.
    • However, this is not what you would expect from looking at the order of the graph passed from json_graph.node_link_data() to jsonpickle.encode() (shown below), which has nodes first and links after.
    • The same is true for the order of source, target, and type: in the graph below, type comes before source and target, so unless it is explicitly ordered after source and target, the py/ids will not be consistent with the legacy data.
    • For this reason, a method called _encode_links is added to the SkeletonEncoder and used within the classmethod encode to encode the links first; within each link, the source and target are encoded before the type (a sketch of this bookkeeping follows the graph dump below).
    • If this exact order weren't followed and the py/ids didn't match the legacy data, an EdgeType would be decoded as a Node and the edges would not be formed correctly in the Skeleton decoded with Skeleton.from_json.
{'directed': True, 'multigraph': True, 'graph': {'name': 'skeleton_legs.mat', 'num_edges_inserted': 23}, 'nodes': [{'id': Node(name='head', weight=1.0)}, {'id': Node(name='neck', weight=1.0)}, {'id': Node(name='thorax', weight=1.0)}, {'id': Node(name='abdomen', weight=1.0)}, {'id': Node(name='wingL', weight=1.0)}, {'id': Node(name='wingR', weight=1.0)}, {'id': Node(name='forelegL1', weight=1.0)}, {'id': Node(name='forelegL2', weight=1.0)}, {'id': Node(name='forelegL3', weight=1.0)}, {'id': Node(name='forelegR1', weight=1.0)}, {'id': Node(name='forelegR2', weight=1.0)}, {'id': Node(name='forelegR3', weight=1.0)}, {'id': Node(name='midlegL1', weight=1.0)}, {'id': Node(name='midlegL2', weight=1.0)}, {'id': Node(name='midlegL3', weight=1.0)}, {'id': Node(name='midlegR1', weight=1.0)}, {'id': Node(name='midlegR2', weight=1.0)}, {'id': Node(name='midlegR3', weight=1.0)}, {'id': Node(name='hindlegL1', weight=1.0)}, {'id': Node(name='hindlegL2', weight=1.0)}, {'id': Node(name='hindlegL3', weight=1.0)}, {'id': Node(name='hindlegR1', weight=1.0)}, {'id': Node(name='hindlegR2', weight=1.0)}, {'id': Node(name='hindlegR3', weight=1.0)}], 'links': [{'edge_insert_idx': 1, 'type': <EdgeType.BODY: 1>, 'source': Node(name='neck', weight=1.0), 'target': Node(name='head', weight=1.0), 'key': 0}, {'edge_insert_idx': 0, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='neck', weight=1.0), 'key': 0}, {'edge_insert_idx': 2, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='abdomen', weight=1.0), 'key': 0}, {'edge_insert_idx': 3, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='wingL', weight=1.0), 'key': 0}, {'edge_insert_idx': 4, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='wingR', weight=1.0), 'key': 0}, {'edge_insert_idx': 5, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='forelegL1', weight=1.0), 'key': 0}, {'edge_insert_idx': 8, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='forelegR1', weight=1.0), 'key': 0}, {'edge_insert_idx': 11, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='midlegL1', weight=1.0), 'key': 0}, {'edge_insert_idx': 14, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='midlegR1', weight=1.0), 'key': 0}, {'edge_insert_idx': 17, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='hindlegL1', weight=1.0), 'key': 0}, {'edge_insert_idx': 20, 'type': <EdgeType.BODY: 1>, 'source': Node(name='thorax', weight=1.0), 'target': Node(name='hindlegR1', weight=1.0), 'key': 0}, {'edge_insert_idx': 6, 'type': <EdgeType.BODY: 1>, 'source': Node(name='forelegL1', weight=1.0), 'target': Node(name='forelegL2', weight=1.0), 'key': 0}, {'edge_insert_idx': 7, 'type': <EdgeType.BODY: 1>, 'source': Node(name='forelegL2', weight=1.0), 'target': Node(name='forelegL3', weight=1.0), 'key': 0}, {'edge_insert_idx': 9, 'type': <EdgeType.BODY: 1>, 'source': Node(name='forelegR1', weight=1.0), 'target': Node(name='forelegR2', weight=1.0), 'key': 0}, {'edge_insert_idx': 10, 'type': <EdgeType.BODY: 1>, 'source': Node(name='forelegR2', weight=1.0), 'target': Node(name='forelegR3', weight=1.0), 'key': 0}, {'edge_insert_idx': 12, 'type': <EdgeType.BODY: 1>, 'source': Node(name='midlegL1', weight=1.0), 'target': Node(name='midlegL2', weight=1.0), 
'key': 0}, {'edge_insert_idx': 13, 'type': <EdgeType.BODY: 1>, 'source': Node(name='midlegL2', weight=1.0), 'target': Node(name='midlegL3', weight=1.0), 'key': 0}, {'edge_insert_idx': 15, 'type': <EdgeType.BODY: 1>, 'source': Node(name='midlegR1', weight=1.0), 'target': Node(name='midlegR2', weight=1.0), 'key': 0}, {'edge_insert_idx': 16, 'type': <EdgeType.BODY: 1>, 'source': Node(name='midlegR2', weight=1.0), 'target': Node(name='midlegR3', weight=1.0), 'key': 0}, {'edge_insert_idx': 18, 'type': <EdgeType.BODY: 1>, 'source': Node(name='hindlegL1', weight=1.0), 'target': Node(name='hindlegL2', weight=1.0), 'key': 0}, {'edge_insert_idx': 19, 'type': <EdgeType.BODY: 1>, 'source': Node(name='hindlegL2', weight=1.0), 'target': Node(name='hindlegL3', weight=1.0), 'key': 0}, {'edge_insert_idx': 21, 'type': <EdgeType.BODY: 1>, 'source': Node(name='hindlegR1', weight=1.0), 'target': Node(name='hindlegR2', weight=1.0), 'key': 0}, {'edge_insert_idx': 22, 'type': <EdgeType.BODY: 1>, 'source': Node(name='hindlegR2', weight=1.0), 'target': Node(name='hindlegR3', weight=1.0), 'key': 0}]}
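A minimal sketch of the first-seen py/id bookkeeping described above (hypothetical helper names, not the PR's actual implementation): ids start at 1, repeated objects become {"py/id": n} back-references, and each link is walked source → target → type so the ids land in the legacy order.

from enum import Enum
from typing import Any, Dict, List


class Node:
    """Stand-in for sleap.skeleton.Node."""

    def __init__(self, name: str, weight: float = 1.0):
        self.name = name
        self.weight = weight


class EdgeType(Enum):
    """Stand-in for sleap.skeleton.EdgeType."""

    BODY = 1
    SYMMETRY = 2


_seen: Dict[int, int] = {}  # id(obj) -> py/id, assigned in first-seen order


def _encode_obj(obj: Any) -> dict:
    """Fully encode obj on first sight; emit a py/id back-reference afterwards."""
    if id(obj) in _seen:
        return {"py/id": _seen[id(obj)]}
    _seen[id(obj)] = len(_seen) + 1
    if isinstance(obj, EdgeType):
        return {
            "py/reduce": [
                {"py/type": "sleap.skeleton.EdgeType"},
                {"py/tuple": [obj.value]},
                None, None, None,
            ]
        }
    return {
        "py/object": "sleap.skeleton.Node",
        "py/state": {"py/tuple": [obj.name, obj.weight]},
    }


def encode_links(links: List[dict]) -> List[dict]:
    """Encode source, then target, then type so py/ids match the legacy files."""
    encoded = []
    for link in links:
        out = {
            "source": _encode_obj(link["source"]),
            "target": _encode_obj(link["target"]),
            "type": _encode_obj(link["type"]),
        }
        out.update({k: v for k, v in link.items() if k not in out})
        encoded.append(out)
    return encoded


head, neck = Node("head"), Node("neck")
links = [{"source": neck, "target": head, "type": EdgeType.BODY, "key": 0}]
print(encode_links(links))  # neck -> py/id 1, head -> 2, EdgeType.BODY -> 3

Encoding the nodes list afterwards then yields only {"py/id": ...} entries, which is exactly the pattern fly_skeleton_legs.json shows.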
  • A test has been added which loads a skeleton from a JSON file, gets the graph with json_graph.node_link_data from networkx, encodes the graph with the new SkeletonEncoder.encode() method, then uses Skeleton.from_json to deserialize the JSON string (with jsonpickle.decode, then json_graph.node_link_graph). The Skeleton.matches() method is used to test the equivalence of the deserialized Skeletons.
    [X] This test could be parametrized with various skeleton fixtures.
    [X] We should make sure to do this with a template fixture at least.
    [X] An additional test could be added to check that the serialized JSON strings are equivalent.

Types of changes

  • Bugfix
  • New feature
  • Refactor / Code style update (no logical changes)
  • Build / CI changes
  • Documentation Update
  • Other (explain)

Does this address any currently open issues?

#1470 #1918

This pull request accompanies #1961 in handling JSON encoding and decoding internally instead of relying on jsonpickle.

Outside contributors checklist

  • Review the guidelines for contributing to this repository
  • Read and sign the CLA and add yourself to the authors list
  • Make sure you are making a pull request against the develop branch (not main). Also you should start your branch off develop
  • Add tests that prove your fix is effective or that your feature works
  • Add necessary documentation (if appropriate)

Thank you for contributing to SLEAP!

❤️

Summary by CodeRabbit

  • New Features

    • Introduced a new SkeletonEncoder class for enhanced encoding of skeleton representations.
    • Added methods for encoding Node and EdgeType objects into JSON formats.
  • Chores

    • Updated version constraints for the attrs package in environment configuration files.
  • Tests

    • Added new tests for validating the encoding and decoding of Skeleton objects.

coderabbitai bot commented Sep 20, 2024

Walkthrough

A new class named SkeletonEncoder has been introduced in the sleap/skeleton.py file, replacing the existing jsonpickle.encode functionality with a custom encoder for converting Python objects into JSON strings. The SkeletonEncoder includes methods for encoding Node and EdgeType objects, while the to_json method in the Skeleton class has been updated to utilize this new encoder. Additionally, tests for encoding and decoding Skeleton objects have been added, and the attrs package version constraints have been specified more clearly in the environment files.

Changes

| File | Change Summary |
| --- | --- |
| sleap/skeleton.py | Added new class SkeletonEncoder for custom encoding; modified to_json method to use SkeletonEncoder. |
| environment.yml | Updated attrs package version constraint to >=21.2.0,<=21.4.0. |
| environment_no_cuda.yml | Updated attrs package version constraint to >=21.2.0,<=21.4.0. |
| tests/test_skeleton.py | Added tests for encoding and decoding Skeleton objects using SkeletonEncoder. |

Possibly related PRs

  • Handle skeleton decoding internally #1961: The introduction of the SkeletonDecoder class in this PR is directly related to the SkeletonEncoder class in the main PR, as both classes are involved in the (de)serialization of Node and EdgeType objects within the sleap/skeleton.py file.

Suggested reviewers

  • talmo

Poem

In the code where rabbits play,
A new encoder hops today.
Skeletons now dance with glee,
Encoding joyfully, you see!
With every node, a match so bright,
In the world of bytes, all feels right! 🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 743e406 and 83a2704.

📒 Files selected for processing (1)
  • sleap/skeleton.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • sleap/skeleton.py



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 3c7f5af and 034cffb.

Files selected for processing (1)
  • sleap/skeleton.py (1 hunks)

Comment on lines 88 to 89
class SkeletonEncoder:
"""Replace json.pickle.encode with our own encoder."""

Add the missing encoder implementation.

The SkeletonEncoder class is currently empty. To fulfill its purpose of replacing json.pickle.encode with a custom encoder, please add the necessary encoding logic within the class definition.


codecov bot commented Sep 20, 2024

Codecov Report

Attention: Patch coverage is 98.41270% with 1 line in your changes missing coverage. Please review.

Project coverage is 75.50%. Comparing base (7ed1229) to head (83a2704).
Report is 52 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| sleap/skeleton.py | 98.41% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #1970      +/-   ##
===========================================
+ Coverage    73.30%   75.50%   +2.19%     
===========================================
  Files          134      133       -1     
  Lines        24087    24625     +538     
===========================================
+ Hits         17658    18593     +935     
+ Misses        6429     6032     -397     



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Outside diff range and nitpick comments (3)
sleap/skeleton.py (3)

88-117: LGTM! Remember to implement the _encode method.

The SkeletonEncoder class is well-documented and provides a clear explanation of its purpose and the encoding format. The encode class method, _encode_node, and _encode_edge_type methods are implemented correctly.

However, the _encode method, which is intended for the actual encoding logic, is currently unimplemented. Please ensure that this method is implemented to complete the functionality of the SkeletonEncoder class.

Do you want me to generate the implementation for the _encode method or open a GitHub issue to track this task?


119-121: LGTM! Consider adding a comment to clarify the purpose of _encoded_strings.

The __init__ method is correctly defined and initializes the _encoded_strings instance variable. However, the purpose of this variable is not clear from the provided context.

Consider adding a comment to explain how _encoded_strings is intended to be used within the SkeletonEncoder class.


Line range hint 1090-1115: LGTM! Remember to remove or comment out debug print statements before production.

The added debug print statements in the to_json method do not alter the existing functionality and can be useful for logging and debugging purposes during development.

However, these print statements may clutter the output and should be removed or commented out before deploying the code to production to avoid unnecessary logging.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 034cffb and dd4865d.

Files selected for processing (1)
  • sleap/skeleton.py (3 hunks)
Additional context used
Ruff
sleap/skeleton.py

138-138: SyntaxError: Expected an indented block after function definition

Additional comments not posted (4)
sleap/skeleton.py (4)

122-133: LGTM! Remember to implement the _encode method.

The encode class method is correctly defined and documented. It creates an instance of the SkeletonEncoder class and calls the _encode method on the instance, which is the expected behavior.

However, since the _encode method is currently unimplemented, calling the encode method will raise an error. Please ensure that the _encode method is implemented to enable the functionality of the encode method.


135-137: Reminder: Implement the _encode method.

The _encode method is correctly defined with the appropriate parameters and return type. However, the method is currently unimplemented, as indicated by the empty body.

Please ensure that the _encode method is implemented to enable the functionality of the SkeletonEncoder class, as it is responsible for the actual encoding logic.


138-151: LGTM!

The _encode_node method is correctly defined and documented. It takes a Node object as input and returns a dictionary representation of the node, following the expected format.

The dictionary representation includes the py/object key set to sleap.Skeleton.Node and the py/state key containing a tuple of the node's name and weight, which aligns with the provided encoding format.

Tools
Ruff

138-138: SyntaxError: Expected an indented block after function definition


153-168: LGTM!

The _encode_edge_type method is correctly defined and documented. It takes an EdgeType object as input and returns a dictionary representation of the edge type, following the expected format.

The dictionary representation includes the py/reduce key, which contains a list of dictionaries representing the type and value of the edge type. This aligns with the provided encoding format for EdgeType objects.

@eberrigan eberrigan marked this pull request as draft September 21, 2024 04:39
@eberrigan eberrigan requested a review from roomrys September 24, 2024 04:50

json_str = jsonpickle.encode(data)
# json_str = jsonpickle.encode(data)
encoder = SkeletonEncoder()
Contributor Author

SkeletonEncoder.encode(data) instead

@eberrigan eberrigan marked this pull request as ready for review September 24, 2024 23:04

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Outside diff range and nitpick comments (5)
environment_no_cuda.yml (1)

Line range hint 1-47: Consider reviewing version constraints for consistency

While the change to the attrs package version constraint is good, it might be beneficial to review the version constraints of other dependencies in this file for consistency. Some packages have specific version constraints, while others have more relaxed ones. Consider standardizing the approach to version constraints across all dependencies to ensure optimal compatibility and reproducibility.

tests/test_skeleton.py (2)

12-36: LGTM: Comprehensive test for Skeleton encoding and decoding.

This test function thoroughly checks the encoding and decoding process for a Skeleton object loaded from a JSON file. It verifies both object equality and JSON string equality, which is excellent.

A minor suggestion for improvement:

Consider adding an assertion to check that the encoded_json_str is not empty before parsing it. This could help catch potential encoding failures more explicitly:

assert encoded_json_str, "Encoded JSON string should not be empty"

This assertion could be added just after line 23.


39-67: LGTM: Well-structured parameterized test for multiple Skeleton fixtures.

This parameterized test function provides excellent coverage by testing the encoding and decoding process for multiple Skeleton fixtures. It verifies both object equality and JSON representation equality, which is thorough.

A suggestion for improvement:

Consider adding a check for the encoded_json_str to ensure it's not empty, similar to the suggestion for the previous test:

assert encoded_json_str, f"Encoded JSON string should not be empty for {skeleton_fixture_name}"

This assertion could be added just after line 54.

Additionally, it might be beneficial to add a comment explaining the purpose of using json.loads when comparing the JSON strings, as it normalizes the string representations for comparison.

sleap/skeleton.py (1)

Line range hint 1211-1237: Remove debug print statements.

The debug print statements added to the to_json method are helpful for development and debugging but should be removed or commented out in production code. Consider using a logging framework for debugging in the future, which allows for easier management of debug output.

Apply this diff to remove the debug print statements:

-            print(f"indexed_node_graph: {indexed_node_graph}")
-            print(f"indexed_node_graph: {indexed_node_graph}")
-        print(f"graph: {graph}")
-            print(f"data: {data}")
-            print(f"data: {data}")
-        print(f"json_str: {json_str}")
skeletons.ipynb (1)

44-379: Consider reducing debug output for clarity

The extensive debug output from the encoding process can clutter the notebook and make it less readable. Consider adjusting the logging level or redirecting debug information to a log file to improve clarity.

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between dd4865d and 4c8bdd6.

Files selected for processing (5)
  • environment.yml (1 hunks)
  • environment_no_cuda.yml (1 hunks)
  • skeletons.ipynb (1 hunks)
  • sleap/skeleton.py (3 hunks)
  • tests/test_skeleton.py (1 hunks)
Additional comments not posted (6)
environment_no_cuda.yml (1)

14-14: Approved: Explicit version constraint for attrs package

The change from - conda-forge::attrs >=21.2.0 #,<=21.4.0 to - conda-forge::attrs >=21.2.0,<=21.4.0 is a good improvement. It explicitly sets both lower and upper bounds for the attrs package version, which helps ensure compatibility and reproducibility of the environment.

Benefits of this change:

  1. Prevents potential issues with newer, untested versions of attrs.
  2. Ensures consistency across different installations of the project.
  3. Follows best practices for specifying dependency versions in conda environment files.
environment.yml (1)

13-13: Approve version constraint update for attrs package.

The change to specify both lower and upper bounds for the attrs package version is a good practice in dependency management. It helps ensure compatibility and stability across different environments.

To ensure this change doesn't conflict with other dependencies, please run the following script:

This script will help identify any potential conflicts or usage patterns that might be affected by the version constraint update.

Verification successful

Approve version constraint update for attrs package.

The updated version constraint attrs >=21.2.0,<=21.4.0 is compatible with the existing dependencies, including cattrs ==1.1.1. This ensures stability and compatibility across the codebase without introducing conflicts.

  • No conflicts detected with other dependencies.
  • attrs is consistently used across multiple modules, and the version range is appropriate.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify compatibility of attrs version with other dependencies

# Test: Check if any other packages directly depend on attrs
rg --type yaml -i 'attrs' environment.yml

# Test: Look for any potential conflicts in other files
rg --type python -i 'import attr' -A 5

Length of output: 190


Script:

#!/bin/bash
# Description: Verify compatibility of attrs version with other dependencies

# Test: Check if any other packages directly depend on attrs
rg --type yaml -i 'attrs' environment.yml

# Test: Look for any potential conflicts in other files
rg --type py -i 'import attr' -A 5

Length of output: 19024

tests/test_skeleton.py (2)

5-5: LGTM: New imports are appropriate for the added tests.

The new imports of json and SkeletonEncoder are necessary for the newly added test functions. They are correctly placed and follow the existing import style.

Also applies to: 9-9


1-67: Overall: Excellent addition of tests for SkeletonEncoder functionality.

The new test functions test_decoded_encoded_Skeleton_from_load_json and test_decoded_encoded_Skeleton provide thorough coverage for the new SkeletonEncoder functionality. They test both single file loading and multiple fixtures, ensuring robustness of the encoding and decoding process.

The tests are well-structured, use appropriate assertions, and align well with the PR objectives. The parameterized test allows for easy expansion of test cases in the future.

Minor suggestions for improvement have been made to enhance error catching and clarity. These changes, while not critical, would further strengthen the test suite.

Great job on implementing these comprehensive tests!

sleap/skeleton.py (2)

88-289: LGTM: SkeletonEncoder implementation is complete and well-structured.

The SkeletonEncoder class has been fully implemented, addressing the previous comment about adding the missing encoder implementation. The class provides a comprehensive set of methods for encoding various data types and objects, including a reference system for previously seen objects. The implementation follows good practices for custom JSON encoding.


1235-1236: LGTM: SkeletonEncoder integration in to_json method.

The to_json method has been successfully updated to use the new SkeletonEncoder instead of jsonpickle.encode. This change aligns with the goal of replacing the existing functionality with a custom encoder.

skeletons.ipynb (outdated)
Comment on lines 400 to 402
"# Save the encoded json string to a file\n",
"with open(\"encoded_json_str.json\", \"w\") as f:\n",
" f.write(encoded_json_str)"

⚠️ Potential issue

Add exception handling when writing to a file

Incorporating exception handling when writing the encoded JSON string to a file can prevent the program from crashing due to unexpected I/O errors.

Apply this diff to add exception handling:

+try:
-with open("encoded_json_str.json", "w") as f:
-    f.write(encoded_json_str)
+    with open("encoded_json_str.json", "w") as f:
+        f.write(encoded_json_str)
+except IOError as e:
+    print(f"An error occurred while writing to 'encoded_json_str.json': {e}")

Committable suggestion was skipped due to low confidence.

skeletons.ipynb (outdated)
Comment on lines 422 to 425
"# Get the skeleton from the encoded json string\n",
"decoded_skeleton = Skeleton.from_json(encoded_json_str)\n",
"decoded_skeleton"
]

⚠️ Potential issue

Add assertion to verify skeleton integrity after decoding

To ensure that the encoding and decoding processes maintain the skeleton's integrity, add an assertion to confirm that the decoded skeleton matches the original.

Apply this diff to add the assertion:

 skeleton = Skeleton.load_json(fly_skeleton_legs_json)
 decoded_skeleton = Skeleton.from_json(encoded_json_str)
+assert skeleton.matches(decoded_skeleton), "Decoded skeleton does not match the original."
 decoded_skeleton

Committable suggestion was skipped due to low confidence.

skeletons.ipynb (outdated)
Comment on lines 433 to 451
"# def test_SkeletonEncoder(fly_legs_skeleton_json):\n",
"# \"\"\"\n",
"# Test SkeletonEncoder.encode method.\n",
"# \"\"\"\n",
"# # Get the skeleton from the fixture\n",
"# skeleton = Skeleton.load_json(fly_legs_skeleton_json)\n",
"# # Get the graph from the skeleton\n",
"# indexed_node_graph = skeleton._graph\n",
"# graph = json_graph.node_link_data(indexed_node_graph)\n",
"\n",
"# # Encode the graph as a json string to test .encode method\n",
"# encoder = SkeletonEncoder()\n",
"# encoded_json_str = encoder.encode(graph)\n",
"\n",
"# # Get the skeleton from the encoded json string\n",
"# decoded_skeleton = Skeleton.from_json(encoded_json_str)\n",
"\n",
"# # Check that the decoded skeleton is the same as the original skeleton\n",
"# assert skeleton.matches(decoded_skeleton)"

🛠️ Refactor suggestion

Uncomment and integrate the test for SkeletonEncoder

The test function for SkeletonEncoder.encode is currently commented out. Enabling this test would automate verification of the encoding and decoding processes, ensuring data integrity.

Apply this diff to uncomment and update the test function:

-# def test_SkeletonEncoder(fly_legs_skeleton_json):
-#     """
-#     Test SkeletonEncoder.encode method.
-#     """
-#     # Get the skeleton from the fixture
-#     skeleton = Skeleton.load_json(fly_legs_skeleton_json)
-#     # Get the graph from the skeleton
-#     indexed_node_graph = skeleton._graph
-#     graph = json_graph.node_link_data(indexed_node_graph)
-
-#     # Encode the graph as a json string to test .encode method
-#     encoder = SkeletonEncoder()
-#     encoded_json_str = encoder.encode(graph)
-
-#     # Get the skeleton from the encoded json string
-#     decoded_skeleton = Skeleton.from_json(encoded_json_str)
-
-#     # Check that the decoded skeleton is the same as the original skeleton
-#     assert skeleton.matches(decoded_skeleton)
+def test_SkeletonEncoder():
+    """
+    Test SkeletonEncoder.encode method.
+    """
+    # Get the skeleton from the fixture
+    fly_legs_skeleton_json = "tests/data/skeleton/fly_skeleton_legs.json"
+    skeleton = Skeleton.load_json(fly_legs_skeleton_json)
+    # Get the graph from the skeleton
+    indexed_node_graph = skeleton._graph
+    graph = json_graph.node_link_data(indexed_node_graph)
+
+    # Encode the graph as a json string to test .encode method
+    encoder = SkeletonEncoder()
+    encoded_json_str = encoder.encode(graph)
+
+    # Get the skeleton from the encoded json string
+    decoded_skeleton = Skeleton.from_json(encoded_json_str)
+
+    # Check that the decoded skeleton is the same as the original skeleton
+    assert skeleton.matches(decoded_skeleton)


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (1)
tests/test_skeleton.py (1)

1-10: Organize imports according to PEP 8 guidelines

The new imports have been added correctly, but we can improve the overall organization of the import statements to follow PEP 8 guidelines more closely. This will enhance readability and maintainability.

Consider reorganizing the imports as follows:

import copy
import json
import os

import jsonpickle
import pytest
from networkx.readwrite import json_graph

from sleap.skeleton import Skeleton, SkeletonEncoder

This organization:

  1. Groups standard library imports first.
  2. Separates third-party library imports with a blank line.
  3. Places local imports (from the sleap package) last.
  4. Combines imports from the same module on a single line.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 4c8bdd6 and 6444378.

📒 Files selected for processing (2)
  • sleap/skeleton.py (4 hunks)
  • tests/test_skeleton.py (1 hunks)
🔇 Additional comments not posted (11)
tests/test_skeleton.py (1)

1-60: Overall assessment: Well-implemented tests for SkeletonEncoder

The new test functions test_decoded_encoded_Skeleton_from_load_json and test_decoded_encoded_Skeleton are valuable additions to the test suite. They effectively verify the functionality of the new SkeletonEncoder and ensure that Skeleton objects can be correctly encoded and decoded.

Key strengths:

  1. Comprehensive testing of encoding and decoding for different Skeleton objects.
  2. Use of parameterized testing to cover multiple scenarios.
  3. Verification of both object equality and JSON representation consistency.

The suggested improvements in the previous comments will further enhance the robustness and clarity of these tests. Great job on expanding the test coverage!

sleap/skeleton.py (10)

88-121: LGTM: SkeletonEncoder class initialization and docstring.

The class docstring provides a clear explanation of the purpose and functionality of the SkeletonEncoder. The __init__ method initializes the _encoded_objects dictionary to manage object references during encoding.


123-136: LGTM: encode class method implementation.

The encode class method provides a clean interface for encoding data. It creates an instance of the encoder, calls the private _encode method, and then uses json.dumps to create the final JSON string.


138-167: LGTM: _encode method implementation.

The _encode method handles different types of objects (dict, list, EdgeType, Node) appropriately. It includes special handling for dictionaries containing 'nodes' and 'links' keys, which is likely specific to the skeleton structure.


169-206: LGTM: _encode_links method implementation.

The _encode_links method ensures that the links (edges) are encoded in a specific order (source, target, type, other attributes). This maintains consistency in the JSON output.


208-235: LGTM: _encode_node method implementation.

The _encode_node method handles both Node objects and integer indices. It checks for previous encoding to avoid redundancy and uses a specific format for encoding Node objects.


237-260: LGTM: _encode_edge_type method implementation.

The _encode_edge_type method properly encodes EdgeType objects, handling both first-time encoding and references to previously encoded objects.


262-278: LGTM: _get_or_assign_id method implementation.

This method manages the assignment of py/id for objects, ensuring unique identification and handling both new and previously encoded objects.


280-291: LGTM: _is_first_encoding method implementation.

This simple method checks if an object is being encoded for the first time, which is crucial for the reference system used in the encoder.


1146-1146: LGTM: Updated to_json method using SkeletonEncoder.

The to_json method has been updated to use the new SkeletonEncoder class. The changes include:

  1. Using SkeletonEncoder.encode(data) instead of jsonpickle.encode(data).
  2. Proper handling of node_to_idx mapping when provided.
  3. Maintaining backwards compatibility by only including description and preview_image fields for template skeletons.

These changes improve the JSON encoding process while maintaining compatibility with existing data formats.

Also applies to: 1210-1213, 1232-1232


Line range hint 88-1232: Overall implementation of SkeletonEncoder and integration with Skeleton class is well-done.

The new SkeletonEncoder class provides a custom JSON encoding solution for the Skeleton class, replacing the previous jsonpickle.encode functionality. The implementation is thorough, handling various object types and maintaining a reference system for efficient encoding. The integration with the Skeleton class, particularly in the to_json method, is done smoothly while maintaining backwards compatibility.

Key improvements:

  1. Custom encoding for Node and EdgeType objects.
  2. Efficient handling of repeated objects using a reference system.
  3. Maintaining order of attributes in encoded links.
  4. Backwards compatibility for non-template skeletons.

The changes should result in more efficient and controlled JSON encoding for Skeleton objects without breaking existing functionality.

Comment on lines +12 to +29
def test_decoded_encoded_Skeleton_from_load_json(fly_legs_skeleton_json):
"""
Test Skeleton decoded from SkeletonEncoder.encode matches the original Skeleton.
"""
# Get the skeleton from the fixture
skeleton = Skeleton.load_json(fly_legs_skeleton_json)
# Get the graph from the skeleton
indexed_node_graph = skeleton._graph
graph = json_graph.node_link_data(indexed_node_graph)

# Encode the graph as a json string to test .encode method
encoded_json_str = SkeletonEncoder.encode(graph)

# Get the skeleton from the encoded json string
decoded_skeleton = Skeleton.from_json(encoded_json_str)

# Check that the decoded skeleton is the same as the original skeleton
assert skeleton.matches(decoded_skeleton)

🛠️ Refactor suggestion

Enhance test coverage with additional assertions

The test function effectively verifies that the encoded and decoded Skeleton matches the original. However, we can improve it by adding more specific assertions:

  1. Assert that the encoded JSON string is not empty.
  2. Compare the number of nodes and edges in the original and decoded skeletons.
  3. Verify that the node names and edge connections are preserved.

Consider adding these assertions to strengthen the test:

def test_decoded_encoded_Skeleton_from_load_json(fly_legs_skeleton_json):
    skeleton = Skeleton.load_json(fly_legs_skeleton_json)
    indexed_node_graph = skeleton._graph
    graph = json_graph.node_link_data(indexed_node_graph)

    encoded_json_str = SkeletonEncoder.encode(graph)
    assert encoded_json_str, "Encoded JSON string should not be empty"

    decoded_skeleton = Skeleton.from_json(encoded_json_str)

    assert skeleton.matches(decoded_skeleton)
    assert len(skeleton.nodes) == len(decoded_skeleton.nodes), "Number of nodes should match"
    assert len(skeleton.edges) == len(decoded_skeleton.edges), "Number of edges should match"
    assert set(n.name for n in skeleton.nodes) == set(n.name for n in decoded_skeleton.nodes), "Node names should match"
    assert set(skeleton.edge_names) == set(decoded_skeleton.edge_names), "Edge connections should match"

Comment on lines +32 to +60
@pytest.mark.parametrize(
"skeleton_fixture_name", ["flies13_skeleton", "skeleton", "stickman"]
)
def test_decoded_encoded_Skeleton(skeleton_fixture_name, request):
"""
Test Skeleton decoded from SkeletonEncoder.encode matches the original Skeleton.
"""
# Use request.getfixturevalue to get the actual fixture value by name
skeleton = request.getfixturevalue(skeleton_fixture_name)

# Get the graph from the skeleton
indexed_node_graph = skeleton._graph
graph = json_graph.node_link_data(indexed_node_graph)

# Encode the graph as a json string to test .encode method
encoded_json_str = SkeletonEncoder.encode(graph)

# Get the skeleton from the encoded json string
decoded_skeleton = Skeleton.from_json(encoded_json_str)

# Check that the decoded skeleton is the same as the original skeleton
assert skeleton.matches(decoded_skeleton)

# Now make everything into a JSON string
skeleton_json_str = skeleton.to_json()
decoded_skeleton_json_str = decoded_skeleton.to_json()

# Check that the JSON strings are the same
assert json.loads(skeleton_json_str) == json.loads(decoded_skeleton_json_str)

🛠️ Refactor suggestion

Enhance parameterized test with additional assertions and error messages

The test function effectively verifies that the encoded and decoded Skeleton matches the original across multiple fixtures. To further improve its robustness:

  1. Add assertions for the number of nodes and edges.
  2. Verify that node names and edge connections are preserved.
  3. Include more descriptive error messages in assertions.

Consider enhancing the test function as follows:

@pytest.mark.parametrize(
    "skeleton_fixture_name", ["flies13_skeleton", "skeleton", "stickman"]
)
def test_decoded_encoded_Skeleton(skeleton_fixture_name, request):
    skeleton = request.getfixturevalue(skeleton_fixture_name)
    indexed_node_graph = skeleton._graph
    graph = json_graph.node_link_data(indexed_node_graph)

    encoded_json_str = SkeletonEncoder.encode(graph)
    assert encoded_json_str, f"Encoded JSON string for {skeleton_fixture_name} should not be empty"

    decoded_skeleton = Skeleton.from_json(encoded_json_str)

    assert skeleton.matches(decoded_skeleton), f"Decoded {skeleton_fixture_name} should match the original"
    assert len(skeleton.nodes) == len(decoded_skeleton.nodes), f"Number of nodes in {skeleton_fixture_name} should match"
    assert len(skeleton.edges) == len(decoded_skeleton.edges), f"Number of edges in {skeleton_fixture_name} should match"
    assert set(n.name for n in skeleton.nodes) == set(n.name for n in decoded_skeleton.nodes), f"Node names in {skeleton_fixture_name} should match"
    assert set(skeleton.edge_names) == set(decoded_skeleton.edge_names), f"Edge connections in {skeleton_fixture_name} should match"

    skeleton_json_str = skeleton.to_json()
    decoded_skeleton_json_str = decoded_skeleton.to_json()

    assert json.loads(skeleton_json_str) == json.loads(decoded_skeleton_json_str), f"JSON representations of {skeleton_fixture_name} should match"

These changes will provide more detailed information if a test fails, making it easier to identify and fix issues.

@roomrys roomrys merged commit ef803f6 into develop Sep 25, 2024
9 checks passed
@roomrys roomrys deleted the elizabeth/handle-skeleton-encoding-internally branch September 25, 2024 22:48
@roomrys
Collaborator

roomrys commented Sep 25, 2024

roomrys added a commit that referenced this pull request Dec 19, 2024
* Remove no-op code from #1498

* Add options to set background color when exporting video (#1328)

* implement #921

* simplified form / refractor

* Add test function and update cli docs

* Improve test function to check background color

* Improve comments

* Change background options to lowercase

* Use coderabbitai suggested `fill`

---------

Co-authored-by: Shrivaths Shyam <[email protected]>
Co-authored-by: Liezl Maree <[email protected]>

* Increase range on batch size (#1513)

* Increase range on batch size

* Set maximum to a factor of 2

* Set default callable for `match_lists_function` (#1520)

* Set default for `match_lists_function`

* Move test code to official tests

* Check using expected values

* Allow passing in `Labels` to `app.main` (#1524)

* Allow passing in `Labels` to `app.main`

* Load the labels object through command

* Add warning when unable to switch back to CPU mode

* Replace (broken) `--unrag` with `--ragged` (#1539)

* Fix unrag always set to true in sleap-export

* Replace unrag with ragged

* Fix typos

* Add function to create app (#1546)

* Refactor `AddInstance` command (#1561)

* Refactor AddInstance command

* Add staticmethod wrappers

* Return early from set_visible_nodes

* Import DLC with uniquebodyparts, add Tracks (#1562)

* Import DLC with uniquebodyparts, add Tracks

* add tests

* correct tests

* Make the hdf5 videos store as int8 format (#1559)

* make the hdf5 video dataset type as proper int8 by padding with zeros

* add gzip compression

* Scale new instances to new frame size (#1568)

* Fix typehinting in `AddInstance`

* brought over changes from my own branch

* added suggestions

* Ensured google style comments

---------

Co-authored-by: roomrys <[email protected]>
Co-authored-by: sidharth srinath <[email protected]>

* Fix package export (#1619)

Add check for empty videos

* Add resize/scroll to training GUI (#1565)

* Make resizable training GUI and add adaptive scroll bar

* Set a maximum window size

---------

Co-authored-by: Liezl Maree <[email protected]>

* support loading slp files with non-compound types and str in metadata (#1566)

Co-authored-by: Liezl Maree <[email protected]>

* change inference pipeline option to tracking-only (#1666)

change inference pipeline none option to tracking-only

* Add ABL:AOC 2023 Workshop link (#1673)

* Add ABL:AOC 2023 Workshop link

* Trigger website build

* Graceful failing with seeking errors (#1712)

* Don't try to seek to faulty last frame on provider initialization

* Catch seeking errors and pass

* Lint

* Fix IndexError for hdf5 file import for single instance analysis files (#1695)

* Fix hdf5 read for single instance analysis files

* Add test

* Small test files

* removing unneccessary fixtures

* Replace imgaug with albumentations (#1623)

What's the worst that could happen?

* Initial commit

* Fix augmentation

* Update more deps requirements

* Use pip for installing albumentations and avoid reinstalling OpenCV

* Update other conda envs

* Fix out of bounds albumentations issues and update dependencies (#1724)

* Install albumentations using conda-forge in environment file

* Conda install albumentations

* Add ndx-pose to pypi requirements

* Keep out of bounds points

* Black

* Add ndx-pose to conda install in environment file

* Match environment file without cuda

* Ordered dependencies

* Add test

* Delete comments

* Add conda packages to mac environment file

* Order dependencies in pypi requirements

* Add tests with zeroes and NaNs for augmentation

* Back

* Black

* Make comment one line

* Add todo for later

* Black

* Update to new TensorFlow conda package (#1726)

* Build conda package locally

* Try 2.8.4

* Merge develop into branch to fix dependencies

* Change tensorflow version to 2.7.4 in where conda packages are used

* Make tensorflow requirements in pypi looser

* Conda package has TensorFlow 2.7.0 and h5py and numpy installed via conda

* Change tensorflow version in `environment_no_cuda.yml` to test using CI

* Test new sleap/tensorflow package

* Reset build number

* Bump version

* Update mac deps

* Update to Arm64 Mac runners

* pin `importlib-metadata`

* Pin more stuff on mac

* constrain `opencv` version due to new qt dependencies

* Update more mac stuff

* Patches to get to green

* More mac skipping

---------

Co-authored-by: Talmo Pereira <[email protected]>
Co-authored-by: Talmo Pereira <[email protected]>

* Fix CI on macosx-arm64 (#1734)

* Build conda package locally

* Try 2.8.4

* Merge develop into branch to fix dependencies

* Change tensorflow version to 2.7.4 in where conda packages are used

* Make tensorflow requirements in pypi looser

* Conda package has TensorFlow 2.7.0 and h5py and numpy installed via conda

* Change tensorflow version in `environment_no_cuda.yml` to test using CI

* Test new sleap/tensorflow package

* Reset build number

* Bump version

* Update mac deps

* Update to Arm64 Mac runners

* pin `importlib-metadata`

* Pin more stuff on mac

* constrain `opencv` version due to new qt dependencies

* Update more mac stuff

* Patches to get to green

* More mac skipping

* Re-enable mac tests

* Handle GPU re-init

* Fix mac build CI

* Widen tolerance for movenet correctness test

* Fix build ci

* Try for manual build without upload

* Try to reduce training CI time

* Rework actions

* Fix miniforge usage

* Tweaks

* Fix build ci

* Disable manual build

* Try merging CI coverage

* GPU/CPU usage in tests

* Lint

* Clean up

* Fix test skip condition

* Remove scratch test

---------

Co-authored-by: eberrigan <[email protected]>

* Add option to export to CSV via sleap-convert and API (#1730)

* Add csv as a format option

* Add analysis to format

* Add csv suffix to output path

* Add condition for csv analysis file

* Add export function to Labels class

* delete print statement

* lint

* Add `analysis.csv` as parametrize input for `sleap-convert` tests

* test `export_csv` method added to `Labels` class

* black formatting

* use `Path` to construct filename

* add `analysis.csv` to cli guide for `sleap-convert`

---------

Co-authored-by: Talmo Pereira <[email protected]>

* Only propagate Transpose Tracks when propagate is checked (#1748)

Fix always-propagate transpose tracks issue

* View Hyperparameter nonetype fix (#1766)

Pass config getter argument to fetch hyperparameters

* Adding ragged metadata to `info.json` (#1765)

Add ragged metadata to info.json file

* Add batch size to GUI for inference (#1771)

* Fix conda builds (#1776)

* test conda packages in a test environment as part of CI

* do not test sleap import using conda build

* use github environment variables to define build path for each OS in the matrix and add print statements for testing

* figure out paths one OS at a time

* github environment variables work in subsequent steps not current step

* use local builds first

* print env info

* try simple environment creation

* try conda instead of mamba

* fix windows build path

* fix windows build path

* add comment to reference pull request

* remove test stage from conda build for macs and test instead by creating the environment in a workflow

* test workflow by pushing to current branch

* test conda package on macos runner

* Mac build does not need nvidia channel

* qudida and albumentations are conda installed now

* add comment with original issue

* use python 3.9

* use conda match specifications syntax

* make print statements more readable for troubleshooting python versioning

* clean up build file

* update version for pre-release

* add TODO

* add tests for conda packages before uploading

* update ci comments and branches

* remove macos test of pip wheel since python 3.9 is not supported by setup-python action

* Upgrade build actions for release (#1779)

* update `build.yml` so it matches updates from `build_manual.yml`

* test `build.yml` without uploading

* test again using build_manual.yml

* build pip wheel with Ubuntu and turn off caching so build.yml exactly matches build_manual.yml

* `build.yml` on release only and upload

* testing caching

* `use-only-tar-bz2: true` makes environment unsolvable, change it back

* Update .github/workflows/build_manual.yml

Co-authored-by: Liezl Maree <[email protected]>

* Update .github/workflows/build.yml

Co-authored-by: Liezl Maree <[email protected]>

* bump pre-release version

* fix version for pre-release

* run build and upload on release!

* try setting `CACHE_NUMBER` to 1 with `use-only-tar-bz2` set to true

* increasing the cache number to reset the cache does work when `use-only-tar-bz2` is set to true

* publish and upload on release only

---------

Co-authored-by: Liezl Maree <[email protected]>

* Add ZMQ support via GUI and CLI (#1780)

* Add ZMQ support via GUI and CLI, automatic port handler, separate utils module for the functions

* Change menu name to match deleting predictions beyond max instance (#1790)

Change menu and function names

* Fix website build and remove build cache across workflows (#1786)

* test with build_manual on push

* comment out caching in build manual

* remove cache step from build manual since environment resolves when this is commented out

* comment out cache in build ci

* remove cache from build on release

* remove cache from website build

* test website build on push

* add name to checkout step

* update checkout to v4

* update checkout to v4 in build ci

* remove cache since build ci works without it

* update upload-artifact to v4 in build ci

* update second checkout to v4 in build ci

* update setup-python to v5 in build ci

* update download-artifact to v4 in build ci

* update checkout to v4 in build ci

* update checkout to v4 in website build

* update setup-miniconda to v3.0.3 in website build

* update actions-gh-pages to v4 in website build

* update actions checkout and setup-python in ci

* update checkout action in ci to v4

* pip install lxml[html_clean] because of error message during action

* add error message to website to explain why pip install lxml[html_clean]

* remove my branch for pull request

* Bump to 1.4.1a1 (#1791)

* bump versions to 1.4.1a1

* we can change the version on the installation page since this will be merged into the develop branch and not main

* Fix windows conda package upload and build ci (#1792)

* windows OS is 2022 not 2019 on runner

* upload windows conda build manually but not pypi build

* remove comment and run build ci

* change build manual back so that it doesn't upload

* remove branch from build manual

* update installation docs for 1.4.1a1

* Fix zmq inference (#1800)

* Ensure that we always pass in the zmq_port dict to LossViewer

* Ensure zmq_ports has correct keys inside LossViewer

* Use specified controller and publish ports for first attempted addresses

* Add test for ports being set in LossViewer

* Add max attempts to find unused port

* Fix find free port loop and add for controller port also

* Improve code readability and reuse

* Improve error message when unable to find free port
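
The port-handling commits above boil down to probing successive ports with a bounded number of attempts and failing with a clear error. A minimal sketch of that pattern (the function name and defaults are illustrative, not the actual `LossViewer` code):

```python
import zmq

def find_free_port(start_port: int, max_attempts: int = 10) -> int:
    """Return the first port at or above start_port that we can bind to."""
    context = zmq.Context.instance()
    for offset in range(max_attempts):
        port = start_port + offset
        socket = context.socket(zmq.PUB)
        try:
            socket.bind(f"tcp://127.0.0.1:{port}")
            socket.unbind(f"tcp://127.0.0.1:{port}")
            return port
        except zmq.ZMQError:
            continue  # Port already in use; try the next one.
        finally:
            socket.close()
    raise RuntimeError(f"No free port found after {max_attempts} attempts")
```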

* Set selected instance to None after removal (#1808)

* Add test that selected instance set to None after removal

* Set selected instance to None after removal

* Add `InstancesList` class to handle backref to `LabeledFrame` (#1807)

* Add InstancesList class to handle backref to LabeledFrame

* Register structure/unstructure hooks for InstancesList

* Add tests for the InstanceList class

* Handle case where instances are passed in but labeled_frame is None

* Add tests for relevant methods in LabeledFrame

* Delegate setting frame to InstancesList

* Add test for PredictedInstance.frame after complex merge

* Add todo comment to not use Instance.frame

* Add test for InstancesList.remove

* Use normal list for informative `merged_instances`

* Add test for copy and clear

* Add copy and clear methods, use normal lists in merge method
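
The commits above introduce `InstancesList`, a list subclass that owns the `Instance.frame` backref so it cannot drift out of sync with the containing `LabeledFrame`. A minimal sketch of the core idea (simplified; the real class also registers cattrs structure/unstructure hooks and `copy`/`clear` methods):

```python
class InstancesList(list):
    """List of instances that maintains the Instance -> LabeledFrame backref."""

    def __init__(self, instances=(), labeled_frame=None):
        super().__init__(instances)
        self.labeled_frame = labeled_frame  # Setter also updates backrefs.

    @property
    def labeled_frame(self):
        return self._labeled_frame

    @labeled_frame.setter
    def labeled_frame(self, frame):
        self._labeled_frame = frame
        for instance in self:
            instance.frame = frame

    def append(self, instance):
        instance.frame = self._labeled_frame
        super().append(instance)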

* Bump to v1.4.1a2 (#1835)

bump to 1.4.1a2

* Updated trail length viewing options (#1822)

* updated trail length options

* Updated trail length options in the view menu

* Updated `prefs` to include length info from `preferences.yaml`

* Added trail length as method of `MainWindow`

* Updated trail length documentation

* black formatting

---------

Co-authored-by: Keya Loding <[email protected]>

* Handle case when no frame selection for trail overlay (#1832)

* Menu option to open preferences directory and update to util functions to pathlib (#1843)

* Add menu to view preferences directory and update to pathlib

* text formatting

* Add `Keep visualizations` checkbox to training GUI (#1824)

* Renamed save_visualizations to view_visualizations for clarity

* Added Delete Visualizations button to the training pipeline gui, exposed del_viz_predictions config option to the user

* Reverted view_ back to save_ and changed new training checkbox to Keep visualization images after training.

* Fixed keep_viz config option state override bug and updated keep_viz doc description

* Added test case for reading training CLI argument correctly

* Removed unnecessary testing code

* Creating test case to check for viz folder

* Finished tests to check CLI argument reading and viz directory existence

* Use empty string instead of None in cli args test

* Use keep_viz_images false in almost all test configs (except test to override config)

---------

Co-authored-by: roomrys <[email protected]>

* Allowing inference on multiple videos via `sleap-track` (#1784)

* implementing proposed code changes from issue #1777

* comments

* configuring output_path to support multiple video inputs

* fixing errors from preexisting test cases

* Test case / code fixes

* extending test cases for mp4 folders

* test case for output directory

* black and code rabbit fixes

* code rabbit fixes

* as_posix errors resolved

* syntax error

* adding test data

* black

* output error resolved

* edited for push to dev branch

* black

* errors fixed, test cases implemented

* invalid output test and invalid input test

* deleting debugging statements

* deleting print statements

* black

* deleting unnecessary test case

* implemented tmpdir

* deleting extraneous file

* fixing broken test case

* fixing test_sleap_track_invalid_output

* removing support for multiple slp files

* implementing talmo's comments

* adding comments

* Add object keypoint similarity method (#1003)

* Add object keypoint similarity method

* fix max_tracking

* correct off-by-one error

* correct off-by-one error
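
PR #1003 adds an object keypoint similarity (OKS) option for instance matching in the tracker. A sketch of COCO-style OKS, which this similarity method is modeled on (the array shapes and NaN handling here are assumptions, not the exact SLEAP code):

```python
import numpy as np

def object_keypoint_similarity(ref_pts, query_pts, scale, kappa=1.0):
    """COCO-style OKS between two instances.

    ref_pts, query_pts: (n_nodes, 2) arrays with NaN for missing points.
    scale: object scale (e.g. sqrt of the bounding-box area).
    """
    d2 = np.sum((ref_pts - query_pts) ** 2, axis=-1)        # Squared distances.
    visible = ~np.isnan(d2)                                  # Both points present.
    oks = np.exp(-d2[visible] / (2 * scale**2 * kappa**2))   # Per-keypoint score.
    return oks.mean() if visible.any() else 0.0
```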

* Generate suggestions using max point displacement threshold (#1862)

* create functions max_point_displacement and _max_point_displacement_video, add them to the yaml file, and create a test for the new function (still needs editing)

* remove unnecessary for loop, calculate proper displacement, adjusted tests accordingly

* Increase range for displacement threshold

* Fix frames not found bug

* Return the latter frame index

* Lint

---------

Co-authored-by: roomrys <[email protected]>
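
A sketch of the displacement computation described above, returning the latter frame index of each pair that exceeds the threshold (the `(frames, nodes, 2)` layout is an assumption):

```python
import numpy as np

def max_point_displacement_frames(points, threshold):
    """points: (n_frames, n_nodes, 2) array of tracked coordinates.

    Returns indices of the latter frame of each consecutive pair whose
    maximum node displacement exceeds the threshold.
    """
    diffs = np.diff(points, axis=0)                 # (n_frames - 1, n_nodes, 2)
    dists = np.linalg.norm(diffs, axis=-1)          # (n_frames - 1, n_nodes)
    max_disp = np.nanmax(dists, axis=-1)            # (n_frames - 1,)
    return np.nonzero(max_disp > threshold)[0] + 1  # +1 -> latter frame index
```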

* Added Three Different Cases for Adding a New Instance (#1859)

* implemented paste with offset

* right-clicking and choosing the default option pastes the new instance at the cursor location

* modified the logic for creating a new instance

* refined the logic

* fixed the logic for right click

* refined logic for adding a new instance at a specific location

* Remove print statements

* Comment code

* Ensure that we choose a non-NaN reference node

* Move OOB nodes to closest in-bounds position

---------

Co-authored-by: roomrys <[email protected]>

* Allow csv and text file support on sleap track (#1875)

* initial changes

* csv support and test case

* increased code coverage

* Error fixing, black, deletion of (self-written) unused code

* final edits

* black

* documentation changes

* documentation changes

* Fix GUI crash on scroll (#1883)

* Only pass wheelEvent to children that can handle it

* Add test for wheelEvent

* Fix typo to allow rendering videos with mp4 (Mac) (#1892)

Fix typo to allow rendering videos with mp4

* Do not apply offset when double clicking a `PredictedInstance` (#1888)

* Add offset argument to newInstance and AddInstance

* Apply offset of 10 for Add Instance menu button (Ctrl + I)

* Add offset for docks Add Instance button

* Make the QtVideoPlayer context menu unit-testable

* Add test for creating a new instance

* Add test for "New Instance" button in `InstancesDock`

* Fix typo in docstring

* Add docstrings and typehinting

* Remove unused imports and sort imports

* Refactor video writer to use imageio instead of skvideo (#1900)

* modify `VideoWriter` to use imageio with ffmpeg backend

* check to see if ffmpeg is present

* use the new check for ffmpeg

* import imageio.v2

* add imageio-ffmpeg to environments to test

* using avi format for now

* remove SKvideo videowriter

* test `VideoWriterImageio` minimally

* add more documentation for ffmpeg

* default format for ffmpeg should be mp4

* print using `IMAGEIO` when using ffmpeg

* mp4 for ffmpeg

* use mp4 ending in test

* test `VideoWriterImageio` with avi file extension

* test video with odd size

* remove redundant filter since imageio-ffmpeg resizes automatically

* black

* remove unused import

* use logging instead of print statement

* import cv2 is needed for resize

* remove logging
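
The refactor above replaces skvideo with imageio's ffmpeg backend. A minimal sketch of the new write path (the availability check and writer options are illustrative, not the exact `VideoWriterImageio` code):

```python
import imageio.v2 as iio
import numpy as np

def has_ffmpeg() -> bool:
    """Check that the imageio ffmpeg plugin (and its binary) is available."""
    try:
        import imageio_ffmpeg

        imageio_ffmpeg.get_ffmpeg_exe()
        return True
    except Exception:
        return False

if has_ffmpeg():
    writer = iio.get_writer("clip.mp4", fps=30)
    for _ in range(10):
        frame = np.zeros((240, 320, 3), dtype=np.uint8)  # Dummy frame.
        writer.append_data(frame)
    writer.close()
```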

* Use `Video.from_filename` when structuring videos (#1905)

* Use Video.from_filename when structuring videos

* Modify removal_test_labels to have extension in filename

* Use | instead of + in key commands (#1907)

* Use | instead of + in key commands

* Lint
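
Under Qt 6, combining a modifier and a key with `+` is no longer supported, so shortcuts are composed with `|` instead. A representative example (not a specific SLEAP binding):

```python
from qtpy.QtCore import Qt
from qtpy.QtGui import QKeySequence

# Old style, breaks under PySide6:  QKeySequence(Qt.CTRL + Qt.Key_S)
shortcut = QKeySequence(Qt.CTRL | Qt.Key_S)
```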

* Replace QtDesktop widget in preparation for PySide6 (#1908)

* Replace to-be-deprecated QDesktopWidget

* Remove unused imports and sort remaining imports

* Remove unsupported |= operand to prepare for PySide6 (#1910)

Fixes TypeError: unsupported operand type(s) for |=: 'int' and 'Option'

* Use positional argument for exception type (#1912)

traceback.format_exception changed its first positional argument's name from etype to exc between Python 3.7 and 3.10
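
Calling it with positional arguments sidesteps the rename, since only the keyword name changed:

```python
import sys
import traceback

try:
    1 / 0
except ZeroDivisionError:
    etype, value, tb = sys.exc_info()
    # Works on both old and new Pythons; only the keyword name changed.
    lines = traceback.format_exception(etype, value, tb)
```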

* Replace all Video structuring with Video.cattr() (#1911)

* Remove unused AsyncVideo class (#1917)

Remove unused AsyncVideo

* Refactor `LossViewer` to use matplotlib (#1899)

* use updated syntax for QtAgg backend of matplotlib

* start adding features to `MplCanvas` to replace QtCharts features in `LossViewer` (untested)

* remove QtCharts imports and replace with MplCanvas

* remove QtCharts imports and replace with MplCanvas

* start using MplCanvas in LossViewer instead of QtCharts (untested)

* use updated syntax

* Uncomment all commented out QtChart

* Add debug code

* Refactor monitor to use LossViewer._init_series method

* Add monitor only debug code

* Add methods for setting up axes and legend

* Add the matplotlib canvas to the widget

* Resize axis with data (no log support yet)

* Try using PathCollection for "batch"

* Get "batch" plotting with ax.scatter (no log support yet)

* Add log support

* Add a _resize_axis method

* Modify init_series to work for ax.plot as well

* Use matplotlib to plot epoch_loss line

* Add method _add_data_to_scatter

* Add _add_data_to_plot method

* Add docstring to _resize_axes

* Add matplotlib plot for val_loss

* Add matplotlib scatter for val_loss_best

* Avoid errors with setting log scale before any positive values

* Add x and y axes labels

* Set title (removing html tags)

* Add legend

* Adjust positioning of plot

* Lint

* Leave MplCanvas unchanged

* Removed unused training_monitor.LossViewer

* Resize fonts

* Move legend outside of plot

* Add debug code for monitor aesthetics

* Use latex formatting to bold parts of title

* Make axes aesthetic

* Add midpoint grid lines

* Set initial limits on x and y axes to be 0+

* Ensure x axis minimum is always resized to 0+

* Adjust plot to account for plateau patience title

* Add debug code for plateau patience title line

* Lint

* Set thicker line width

* Remove unused import

* Set log axis on initialization

* Make tick labels smaller

* Move plot down a smidge

* Move ylabel left a bit

* Lint

* Add class LossPlot

* Refactor LossViewer to use LossPlot

* Remove QtCharts code

* Remove debug codes

* Allocate space for figure items based on item's size

* Refactor LossPlot to use underscores for internal methods

* Ensure y_min, y_max not equal
Otherwise we get an unnecessary terminal message:
UserWarning: Attempting to set identical bottom == top == 3.0 results in singular transformations; automatically expanding.
  self.axes.set_ylim(y_min, y_max)

---------

Co-authored-by: roomrys <[email protected]>
Co-authored-by: roomrys <[email protected]>
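
The refactor above replaces QtCharts with a matplotlib canvas embedded directly in the Qt GUI. A minimal sketch of the embedding pattern behind `MplCanvas` (simplified; the real `LossPlot` adds scatter/line series, log scaling, and legend handling):

```python
from matplotlib.backends.backend_qtagg import FigureCanvasQTAgg
from matplotlib.figure import Figure

class MplCanvas(FigureCanvasQTAgg):
    """A matplotlib figure usable as an ordinary Qt widget."""

    def __init__(self, parent=None, width=5, height=3, dpi=100):
        figure = Figure(figsize=(width, height), dpi=dpi)
        self.axes = figure.add_subplot(111)
        super().__init__(figure)
        self.setParent(parent)
```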

* Refactor `LossViewer` to use underscores for internal method names (#1919)

Refactor LossViewer to use underscores for internal method names

* Manually handle `Instance.from_predicted` structuring when not `None` (#1930)

* Use `tf.math.mod` instead of `%` (#1931)

* Option for Max Stride to be 128 (#1941)

Co-authored-by: Max  Weinberg <[email protected]>

* Add discussion comment workflow (#1945)

* Add a bot to autocomment on workflow

* Use github markdown warning syntax

* Add a multiline warning

* Change happy coding to happy SLEAPing

Co-authored-by: Talmo Pereira <[email protected]>

---------

Co-authored-by: roomrys <[email protected]>
Co-authored-by: Talmo Pereira <[email protected]>

* Add comment on issue workflow (#1946)

* Add workflow to test conda packages (#1935)

* Add missing imageio-ffmpeg to meta.ymls (#1943)

* Update installation docs 1.4.1 (#1810)

* [wip] Updated installation docs

* Add tabs for different OS installations

* Move installation methods to tabs

* Use tabs.css

* Fix styling error (line under last tab in terminal hint)

* Add installation instructions before TOC

* Replace mamba with conda

* Lint

* Find good light colors
that were not switching when changing dark/light themes

* Get color scheme switching
with dark/light toggle button

* Upgrade website build dependencies

* Remove seemingly unneeded dependencies from workflow

* Add myst-nb>=0.16.0 lower bound

* Trigger dev website build

* Fix minor typo in css

* Add miniforge and one-liner installs for package managers

---------

Co-authored-by: roomrys <[email protected]>
Co-authored-by: Talmo Pereira <[email protected]>

* Add imageio dependencies for pypi wheel (#1950)

Add imageio dependencies for pypi wheel

Co-authored-by: roomrys <[email protected]>

* Do not always color skeletons table black (#1952)

Co-authored-by: roomrys <[email protected]>

* Remove no module named work error (#1956)

* Do not always color skeletons table black

* Remove offending (possibly unneeded) line
that causes the "no module named work" error to print in the terminal

* Remove offending (possibly unneeded) line
that causes the "no module named work" error to print in the terminal

* Remove accidentally added changes

* Add (failing) test to ensure menu-item updates with state change

* Reconnect callback for menu-item (using lambda)

* Add (failing) test to ensure menu-item updates with state change

Do not assume initial state

* Reconnect callback for menu-item (using lambda)

---------

Co-authored-by: roomrys <[email protected]>

* Add `normalized_instance_similarity` method  (#1939)

* Add normalize function

* Expose normalization function

* Fix tests

* Expose object keypoint sim function

* Fix tests

* Handle skeleton decoding internally (#1961)

* Reorganize (and add) imports

* Add (and reorganize) imports

* Modify decode_preview_image to return bytes if specified

* Implement (minimally tested) replace_jsonpickle_decode

* Add support for using idx_to_node map
i.e. loading from Labels (slp file)

* Ignore None items in reduce_list

* Convert large function to SkeletonDecoder class

* Update SkeletonDecoder.decode docstring

* Move decode_preview_image to SkeletonDecoder

* Use SkeletonDecoder instead of jsonpickle in tests

* Remove unused imports

* Add test for decoding dict vs tuple pystates
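
The commits above replace jsonpickle on the decode side with a `SkeletonDecoder`. The subtle part is resolving `py/id` back-references to already-decoded `Node` objects, including both tuple- and dict-style `py/state` payloads. A minimal sketch, assuming py/ids count decoded objects in encounter order (a simplification of jsonpickle's memo behavior):

```python
from sleap.skeleton import Node

def decode_node(decoded_objects: list, entry: dict) -> Node:
    """Resolve one encoded node: a full py/object or a py/id back-reference."""
    if "py/object" in entry:
        state = entry["py/state"]
        if "py/tuple" in state:
            name, weight = state["py/tuple"]  # Tuple-style py/state.
        else:
            name, weight = state["name"], state["weight"]  # Dict-style py/state.
        node = Node(name=name, weight=weight)
        decoded_objects.append(node)
        return node
    return decoded_objects[entry["py/id"] - 1]  # py/ids are 1-based.
```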

* Handle skeleton encoding internally (#1970)

* start class `SkeletonEncoder`

* _encoded_objects needs to be a dict to add to

* add notebook for testing

* format

* fix type in docstring

* finish classmethod for encoding Skeleton as a json string

* test encoded Skeleton as json string by decoding it

* add test for decoded encoded skeleton

* update jupyter notebook for easy testing

* constraining attrs in dev environment to make sure decode format is always the same locally

* encode links first, then encode source, then target, then type

* save first encoding statically as an input to _get_or_assign_id so that we do not always get py/id

* save first encoding statically

* first encoding is passed to _get_or_assign_id

* use first_encoding variable to determine if we should assign a py/id

* add print statements for debugging

* update notebook for easy testing

* black

* remove comment

* adding attrs constraint to show this passes for certain attrs versions only

* add import

* switch out jsonpickle.encode

* oops remove import

* can attrs be unconstrained?

* forgot comma

* pin attrs for testing

* test Skeleton from json, from template, and with symmetries

* use SkeletonEncoder.encode

* black

* try removing None values in EdgeType reduced

* Handle case when nodes are replaced by integer indices from caller

* Remove prototyping notebook

* Remove attrs pins

* Remove sort keys (which flips the necessary ordering of our py/ids)

* Do not add extra indents to encoded file

* Only append links after fully encoded (fat-finger)

* Remove outdated comment

* Lint

---------

Co-authored-by: Talmo Pereira <[email protected]>
Co-authored-by: roomrys <[email protected]>
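
Many of the commits above deal with when the encoder may emit a `py/id` reference rather than a full `py/object`: only an object that has already been encoded once gets a back-reference, which is why the encoding order (links first, then source, target, type) matters and why `sort_keys` had to go. A minimal sketch of that bookkeeping (simplified; names follow the commit messages, not the exact implementation):

```python
class SkeletonEncoder:
    """Sketch of py/id bookkeeping: the first encounter of an object is
    encoded in full; later encounters become {"py/id": n} references."""

    def __init__(self):
        self._encoded_objects = {}  # id(obj) -> assigned py/id

    def encode_node(self, node):
        obj_id = id(node)
        if obj_id not in self._encoded_objects:
            # First encoding: assign the next py/id and emit the full state.
            self._encoded_objects[obj_id] = len(self._encoded_objects) + 1
            return {
                "py/object": "sleap.skeleton.Node",
                "py/state": {"py/tuple": [node.name, node.weight]},
            }
        # Already encoded once: emit only the back-reference.
        return {"py/id": self._encoded_objects[obj_id]}
```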

* Pin ndx-pose<0.2.0 (#1978)

* Pin ndx-pose<0.2.0

* Typo

* Sort encoded `Skeleton` dictionary for backwards compatibility  (#1975)

* Add failing test to check that encoded Skeleton is sorted

* Sort Skeleton dictionary before encoding

* Remove unused import
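
Since re-enabling `sort_keys` would flip which object is seen first (and therefore which gets the `py/id`), PR #1975 instead sorts the skeleton dictionary itself before it is handed to the encoder. A minimal recursive sort sketch (illustrative, not the exact implementation):

```python
def recursively_sort(obj):
    """Sort dictionary keys at every level so the encoded output is
    deterministic and backwards compatible."""
    if isinstance(obj, dict):
        return {key: recursively_sort(obj[key]) for key in sorted(obj)}
    if isinstance(obj, list):
        return [recursively_sort(item) for item in obj]
    return obj
```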

* Disable comment bot for now

* Fix COCO Dataset Loading for Invisible Keypoints (#2035)

Update coco.py

# Fix COCO Dataset Loading for Invisible Keypoints

## Issue
When loading COCO datasets, keypoints marked as invisible (flag=0) are currently skipped and later placed randomly within the instance's bounding box. However, in COCO format, these keypoints may still have valid coordinate information that should be preserved (see toy_dataset for expected vs. current behavior).

## Changes
Modified the COCO dataset loading logic to:
- Check if invisible keypoints (flag=0) have non-zero coordinates
- If coordinates are (0,0), skip the point (existing behavior)
- If coordinates are not (0,0), create the point at those coordinates but mark it as not visible
- Maintain existing behavior for visible (flag=2) and labeled-but-invisible (flag=1) keypoints (see the sketch below)
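
A minimal sketch of the visibility handling described above (COCO keypoints come in (x, y, v) triplets; the function name and return shape are illustrative):

```python
def parse_keypoint(x, y, v):
    """Return a point dict, or None to skip the keypoint.

    v == 0: not labeled -> keep only if coordinates are non-zero.
    v == 1: labeled but not visible.
    v == 2: labeled and visible.
    """
    if v == 0 and x == 0 and y == 0:
        return None  # No coordinate information: skip (existing behavior).
    return {"x": x, "y": y, "visible": v == 2}
```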

* Lint

* Add tracking score as seekbar header options (#2047)

* Add `tracking_score` as a constructor arg for `PredictedInstance`

* Add `tracking_score` to ID models

* Add fixture with tracking scores

* Add tracking score to seekbar header

* Add bonsai guide for sleap docs (#2050)

* [WIP] Add bonsai guide page

* Add more information to the guide with images

* add branch for website build

* Typos

* fix links

* Include suggestions

* Add more screenshots and refine the doc

* Remove branch from website workflow

* Completed documentation edits suggested by the reviewer and review bot

---------

Co-authored-by: Shrivaths Shyam <[email protected]>
Co-authored-by: Liezl Maree <[email protected]>

* Don't mark complete on instance scaling (#2049)

* Add check for instances with track assigned before training ID models (#2053)

* Add menu item for deleting instances beyond frame limit (#1797)

* Add menu item for deleting instances beyond frame limit

* Add test function to test the instances returned

* typos

* Update docstring

* Add frame range form

* Extend command to use frame range

---------

Co-authored-by: Talmo Pereira <[email protected]>

* Highlight instance box on hover (#2055)

* Make node marker and label sizes configurable via preferences (#2057)

* Make node marker and label sizes configurable via preferences

* Fix test

* Enable touchpad pinch to zoom (#2058)

* Fix import PySide2 -> qtpy (#2065)

* Fix import PySide2 -> qtpy

* Remove unnecessary print statements.

* Add channels for pip conda env (#2067)

* Add channels for pypi conda env

* Trigger dev website build

* Separate the video name and its filepath columns in `VideoTablesModel` (#2052)

* add option to show video names with filepath

* add doc

* new feature added successfully

* delete unnecessary code

* remove attributes from video object

* Update dataviews.py

* remove all properties

* delete toggle option

* remove video show

* fix the order of the columns

* remove options

* Update sleap/gui/dataviews.py

Co-authored-by: Liezl Maree <[email protected]>

* Update sleap/gui/dataviews.py

Co-authored-by: Liezl Maree <[email protected]>

* use pathlib instead of substrings

* Update dataviews.py

Co-authored-by: Liezl Maree <[email protected]>

* Use Path instead of pathlib.Path
and sort imports and remove unused imports

* Use item.filename instead of getattr

---------

Co-authored-by: Liezl Maree <[email protected]>
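
The pathlib commits above derive the video's display name and its directory from the filename instead of slicing substrings. The pattern, for reference (the column mapping in the comments is an assumption):

```python
from pathlib import Path

filepath = Path("/data/session1/video.mp4")
name = filepath.name           # "video.mp4"     -> the video name column
folder = str(filepath.parent)  # "/data/session1" -> the filepath column
```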

* Make status bar dependent on UI mode (#2063)

* remove bug for dark mode

* fix toggle case

---------

Co-authored-by: Liezl Maree <[email protected]>

* Bump version to 1.4.1 (#2062)

* Bump version to 1.4.1

* Trigger conda/pypi builds (no upload)

* Trigger website build

* Add dev channel to installation instructions

---------

Co-authored-by: Talmo Pereira <[email protected]>

* Add -c sleap/label/dev channel for win/linux
- also trigger website build

---------

Co-authored-by: Scott Yang <[email protected]>
Co-authored-by: Shrivaths Shyam <[email protected]>
Co-authored-by: getzze <[email protected]>
Co-authored-by: Lili Karashchuk <[email protected]>
Co-authored-by: Sidharth Srinath <[email protected]>
Co-authored-by: sidharth srinath <[email protected]>
Co-authored-by: Talmo Pereira <[email protected]>
Co-authored-by: KevinZ0217 <[email protected]>
Co-authored-by: Elizabeth <[email protected]>
Co-authored-by: Talmo Pereira <[email protected]>
Co-authored-by: eberrigan <[email protected]>
Co-authored-by: vaibhavtrip29 <[email protected]>
Co-authored-by: Keya Loding <[email protected]>
Co-authored-by: Keya Loding <[email protected]>
Co-authored-by: Hajin Park <[email protected]>
Co-authored-by: Elise Davis <[email protected]>
Co-authored-by: gqcpm <[email protected]>
Co-authored-by: Andrew Park <[email protected]>
Co-authored-by: roomrys <[email protected]>
Co-authored-by: MweinbergUmass <[email protected]>
Co-authored-by: Max  Weinberg <[email protected]>
Co-authored-by: DivyaSesh <[email protected]>
Co-authored-by: Felipe Parodi <[email protected]>
Co-authored-by: croblesMed <[email protected]>