Miscellaneous QOL #467

talmo · 2021-02-04T06:55:35Z

Laundry list of small features and tweaks:

High level APIs:

sleap.load_video(): create a sleap.Video from filename
sleap.load_config(): load training job config json from model folders
sleap.load_file() now takes a match_to kwarg to align data structure instances with other labels for easy comparison and manipulation.
sleap.versions(): print package/OS versions
Labels, LabeledFrame, Instance, PredictedInstance, Video and Skeleton now all have readable __repr__ and __str__, plus __len__, for easy inspection of contents (nodes, visible parts).
Labels.extract() to pull out a subset of frames into new labels while retaining metadata from the base labels.
Labels.export() to save to analysis HDF5
Labels.numpy(), LabeledFrame.numpy() to convert main data structures to numpy
Labels.describe() now gives a more comprehensive overview of a dataset.
Labels.load_deeplabcut_folder() to interactively import a maDLC dataset.

Core workflow

Add the ability to add any frame as a suggestion and to modify the suggestion list. This makes it possible to use the suggestions as a "labeling queue" that tracks both labeled frames and frames intended to be labeled (either detected algorithmically or manually selected as frames of interest by the user). The list also serves as a convenient target for inference for initialization/reprediction after training.
Add Labels.clear_suggestions(), Labels.unlabeled_suggestions, Labels.get_unlabeled_suggestion_inds(), Labels.remove_empty_frames(), LabelsReader.from_user_labeled_frames(), LabelsReader.from_unlabeled_suggestions() to support a suggestion-driven workflow.
Remote workflow: Support for packaging output model folders as a zip file after training for easy download. Support for zip files as input to sleap.load_model().
Remote workflow: Support for packaging training + inference data, training job configurations, and CLI launch scripts into a single zip file for easy upload to Colab or another remote node.
File -> Save as... now uses labels.v000.slp as the default filename and scans the containing directory for labels following this pattern to increment the version number automatically. Saving the labels to a new version now takes just two keyboard strokes (Ctrl + Shift + S -> Enter by default). These two changes should direct users towards label versioning best practices.
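The auto-increment behavior of Save as... can be sketched roughly as follows. This is a minimal illustration and not the SLEAP GUI code: the function name `next_versioned_filename` and the regular expression are assumptions, and only the `labels.v000.slp` naming pattern comes from the description above.

```python
import re
from pathlib import Path

# Assumed pattern for names such as "labels.v000.slp" (hypothetical regex).
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)\.v(?P<version>\d+)\.slp$")


def next_versioned_filename(directory, stem="labels"):
    """Return the next unused "<stem>.vNNN.slp" name in `directory`."""
    versions = []
    for path in Path(directory).glob(f"{stem}.v*.slp"):
        match = VERSION_PATTERN.match(path.name)
        if match and match.group("stem") == stem:
            versions.append(int(match.group("version")))
    # Start at v000 when nothing matches, otherwise go one past the highest.
    next_version = max(versions) + 1 if versions else 0
    return f"{stem}.v{next_version:03d}.slp"
```

Scanning for the highest existing version, rather than counting files, keeps the numbering monotonic even if an intermediate version was deleted.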

GUI/UX

Window state (panel/tab locations and visibility) is now persisted across sessions and restored when the GUI is opened.
Add option to adjust node marker size to make them easier to see on high resolution displays. This and other visualization settings are now appropriately persisted across sessions.
Adjusted keyboard shortcuts to more sensible defaults and added ability to reset to "factory defaults".
Added progress indicators for long-running operations such as exporting training packages and running inference. Inference progress is also displayed in the terminal and in notebooks.
Training and inference now have a Cancel button which immediately kills the subprocess and cleans up temp files or incomplete outputs.
Training package exporting options are moved to a submenu with clearer descriptions of what they will save.
Moved random flip augmentation options to a simpler dropdown menu (none, horizontal, vertical) rather than two checkboxes (enable/disable + horizontal/vertical).
Add name and description fields to TrainingJobConfig to make it easier to annotate preset profiles with information relevant for users to select between them. This will be displayed in the GUI in a future update to drive a simplified training dialog workflow.
Move track/identity-related options to Tracks menu and add a toggle for labels propagation. When disabled, setting track identity no longer affects subsequent frames.

Training/inference

Training CLI no longer overrides parameters set to true in the config with the false defaults of flags that were not passed (e.g., config.outputs.save_visualizations works when --save_viz is not passed).
Training CLI no longer requires the positional labels_path arg since this can be specified in the config like the validation and test splits.
Training will now delete the visualizations subfolders if config.outputs.delete_viz_images is true (the default). These subfolders were often larger than the models themselves and are usually only used during GUI-based training for live visualization.
Training will now zip the model folder if self.config.outputs.zip_outputs is true to support the core remote workflow.
sleap.load_model() now supports loading models from zip files to support the core remote workflow.
Massive refactor of the sleap.nn.inference submodule: deleted unused classes and methods and moved a lot of functionality to the Predictor base class.
Predictors now use tf.keras.Model.predict_on_batch() in inference loops, which drastically improves speed by reusing the cached traced/autographed model call on every batch.
Inference now supports progress reporting using rich or by outputting JSON-encoded updates to stdout which can be captured by any caller. FPS and ETA are now calculated using a recent rolling buffer for more accurate estimates without needing to wait for startup/autograph amortization.
Reworked inference CLI (sleap-track) to make it much, much clearer and follow a more linear workflow. Deprecated many redundant args (still supported but hidden from sleap-track -h).
Inference CLI now takes a video or labels file as the positional data_path argument (e.g., sleap-track -m path/to/model "labels.slp") without using the --labels flag.
Inference CLI now uses a single LabelsReader/VideoReader provider to iterate over all inference targets. Previously, this was restarted per video, slowing down the whole process.
Inference on suggestions now only predicts on suggestions associated with unlabeled frames.
Inference CLI now stores much more metadata in Labels.provenance before saving, including system info, paths, sleap version, timestamps, etc.
TrainingJobConfig now caches the sleap version and the filename it was loaded from and saved to during I/O ops.
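The rolling-buffer FPS/ETA estimate and the JSON progress output described above can be illustrated with a small sketch. Everything here is an assumption for illustration: the `RollingProgress` class, the `report` helper, the window size and the JSON field names are hypothetical and are not the schema that sleap-track emits.

```python
import json
import time
from collections import deque


class RollingProgress:
    """Track throughput over the most recent updates only (hypothetical helper)."""

    def __init__(self, n_total, window=30):
        self.n_total = n_total
        self.n_processed = 0
        # Each entry is (timestamp, cumulative frames processed).
        self.history = deque(maxlen=window)

    def update(self, n_frames, now=None):
        """Record a finished batch and return the current status dict."""
        now = time.perf_counter() if now is None else now
        self.n_processed += n_frames
        self.history.append((now, self.n_processed))
        return self.status()

    def status(self):
        fps, eta = None, None
        if len(self.history) >= 2:
            (t0, n0), (t1, n1) = self.history[0], self.history[-1]
            elapsed = t1 - t0
            if elapsed > 0:
                # Rate over the window, so slow startup batches age out.
                fps = (n1 - n0) / elapsed
                if fps > 0:
                    eta = (self.n_total - self.n_processed) / fps
        return {
            "n_processed": self.n_processed,
            "n_total": self.n_total,
            "fps": fps,
            "eta": eta,
        }


def report(status):
    """Emit one JSON object per line so a calling process can parse stdout."""
    print(json.dumps(status), flush=True)
```

A caller would typically run `report(tracker.update(len(batch)))` once per batch. Because the rate is computed only over the buffer, the slow first batches (model tracing, warm-up) stop influencing the estimate once they fall out of the window.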

Fixes

Instance counting no longer double counts Instances/PredictedInstances. This affects the model folder auto-naming, labeled instance count in suggestions table and more.
sleap-label will now run in CPU-only mode to prevent pre-allocating all the GPU memory if any tensorflow ops are called.
sleap.load_model() will now default to disabling pre-allocation of all the GPU memory even if called interactively to prevent the same issue.
Fix dataset splitting to ensure there is a minimum of one sample per train/val split. Supports training with a single(!) label now.
Suggestions row is now selected appropriately when navigating suggestions.
Flip augmentation is now applied correctly across all model types.
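The dataset splitting fix can be illustrated with a short sketch that guarantees at least one sample in each split. This is not the SLEAP implementation: the function name `split_indices` is hypothetical, and reusing the single sample for both splits when only one label exists is an assumption about how such a case could be handled.

```python
import random


def split_indices(n_samples, val_fraction=0.1, seed=None):
    """Split range(n_samples) into (train, val) with at least one index each."""
    if n_samples < 1:
        raise ValueError("Need at least one sample to split.")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    if n_samples == 1:
        # Assumed special case: reuse the single sample for both splits.
        return indices, list(indices)
    n_val = round(n_samples * val_fraction)
    # Clamp so that neither split ends up empty.
    n_val = min(max(n_val, 1), n_samples - 1)
    return indices[n_val:], indices[:n_val]
```

Without the clamp, a small dataset with a 10% validation fraction would round to zero validation samples.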

- Rewrote CLI.
- Now uses more standardized methods for data loading, model building, and inference.
- Remove most of the dynamically generated args in favor of a flat list of args.
- Deprecate a bunch of redundant args. These still work, they're now just hidden from the help.
- Enable single provider inference for labels rather than predicting video-by-video.
- More informative logging.
- Add option for removing empty frames. By default it keeps empty frames (#396)
- Add a lot more provenance information.
- Unified inference progress bar for GUI, console AND notebooks (#453)!
- JSON progress output for custom handling via stdout capture.
- Add Predictor.from_model_paths() constructor for single entrypoint instantiation of subclasses from paths.
- Remove unused imports and MockPredictor class.
- Add peak_threshold to load_model() high level API.
- Docstrings and typing
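One common way to keep deprecated arguments working while hiding them from the help text is `argparse.SUPPRESS`. The sketch below assumes that approach; the parser, its `prog` name and the `parse` helper are hypothetical and are not the actual sleap-track code. Only the `-m` flag, the positional data path and the `--labels` flag are taken from the description above.

```python
import argparse


def build_parser():
    """Build a small parser with one deprecated, hidden argument."""
    parser = argparse.ArgumentParser(prog="track-sketch")
    parser.add_argument(
        "data_path", nargs="?", default=None, help="Video or labels file."
    )
    parser.add_argument(
        "-m", "--model", dest="models", action="append",
        help="Path to a model folder.",
    )
    # Deprecated alias: still parsed, but omitted from the -h output.
    parser.add_argument("--labels", default=None, help=argparse.SUPPRESS)
    return parser


def parse(argv):
    """Parse arguments and fold the deprecated flag into data_path."""
    args = build_parser().parse_args(argv)
    if args.data_path is None and args.labels is not None:
        args.data_path = args.labels
    return args
```

With this layout, `parse(["-m", "path/to/model", "labels.slp"])` and `parse(["-m", "path/to/model", "--labels", "labels.slp"])` resolve to the same `data_path`, while only the first form is advertised in the help.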

- Generates training labels package with embedded images, training job configs, training scripts, and inference scripts
- Packages everything into a zip file
- Displays dialog with options to open containing folder or open browser to pre-specified Colab notebook for remote training + copies path to clipboard (user just needs to upload the package and run)
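The packaging step above amounts to collecting a folder of files into one archive. A minimal sketch using the standard library `zipfile` module is shown here. The `make_package` function is hypothetical and says nothing about which files SLEAP actually places in the package.

```python
import zipfile
from pathlib import Path


def make_package(folder, zip_path):
    """Zip every file under `folder`, storing paths relative to it."""
    folder = Path(folder)
    zip_path = Path(zip_path)
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            # Skip directories and the archive itself if it lives in `folder`.
            if path.is_file() and path.resolve() != zip_path.resolve():
                zf.write(path, arcname=path.relative_to(folder).as_posix())
    return zip_path
```

Storing relative paths means the archive unpacks to the same layout on the remote machine regardless of where the folder lived locally.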

* Add track indices to instance cropper
* Add class vector generator
* Split class vectors correctly in instance cropper
* Move head output layer construction to heads module
  - Heads now subclass a base Head class
  - Naming doesn't include _0 anymore since we don't have any multi-output models for now.
  - Better input validation in Model.from_config constructor
  - Add loss weight to all heads in config
  - Test coverage for heads and (minimally for) model
* Add topdown config, head and model
  - Rename multiclass to multiclass_bottomup
* Add trainer
* Data pipeline
* Apply black to 'sleap' and 'tests' (#465)
  Co-authored-by: Arie Matsliah <[email protected]>
* Fix model creation and add pooling param to head
* Symmetry-aware flip augmentation (#455)
* Implement symmetry-aware instance reflection
* Fix symmetries sometimes not being returned uniquely
* Add fancier indexing to instances
* Add random flipping transformer
* Fix failing linux test
  - Make sure indices are all cast to int32
* Add vertical flip
* Add flip augmentation to config, GUI and pipeline builders
* Update profiles with default fields
  Co-authored-by: ariematsliah-princeton <[email protected]>
* Multi-size videos in data pipelines (#440)
  Add support for variable size videos within the same dataset by matching their size with padding or resizing
  Co-authored-by: Arie Matsliah <[email protected]>
* Type check + Lint in CI (#470)
* Try lint and typecheck in CI workflow
* update
* nit
* continue on MyPy errors
* test
* correct
* correct
* correct
  Co-authored-by: Arie Matsliah <[email protected]>
* Rename predictors for consistency with inference layers
  - TopdownPredictor -> TopDownPredictor
  - BottomupPredictor -> BottomUpPredictor
* Create PULL_REQUEST_TEMPLATE.md
* Update authors list (#471)
  Co-authored-by: Arie Matsliah <[email protected]>
* Add CLA (#473)
* Add CLA
* update links
  Co-authored-by: Arie Matsliah <[email protected]>
* Update PULL_REQUEST_TEMPLATE.md
* Miscellaneous QOL (#467)
  Pre-1.1.0 update features (changelist in #467)
* Bump pre-release version
* Add back load_model that got lost in the merge
  - Add detection of bottomup and topdown multi-class model loading
* Fix more missing things post-merge
* Fix lint
* Fix training from config
* Add inference
* Tweak describe tensor to accept nested tuples/dicts
* Lint
* Fix test
* Lint
* Fix load video dataset arg
* Fix inference
* Fix evals
* Add BU MC to evals
* Remove batch norm from TD MC head
* Add option to disable batch norm in pretrained models
* Add track matching when merging labels
* Don't error when training finishes with no inference target

Co-authored-by: ariematsliah-princeton <[email protected]>
Co-authored-by: Arie Matsliah <[email protected]>

talmo added 30 commits February 3, 2021 12:42

Add high level sleap.load_video method

cad7a83

Version printer

7690ecc

- Adjust version detection in setup.py

Merge remote-tracking branch 'origin/develop' into talmo/misc_qol

2156664

Only use CPU in GUI process to prevent memory preallocation hogging gpu

77c6b9f

Fix some instance counting and add more info to labels.describe()

6646ba6

Add high level config loader

227db94

Filter suggestions by user labeled state in inference

5910d7f

Save viz CLI flag in training will not overwrite config

6475007

Add missing load_config import

824e104

Add tracks to labels.describe()

29cc9da

Add safe getter for labels

6ea65b8

Suggestions table will only count user instances

684a745

Fix safe getter

21a1398

Add percent labeled to suggestions table

123405b

Tweak status bar message

349d180

- Add selected frame count - Add total videos

Random flip is now a dropdown to be less confusing

8334410

Display correct head config when viewing trained model hyperparams

6ec78aa

Add match_to arg to high level load_file()

e90155c

Add missing reprs/str for main data structures

81dae9a

- Instance, PredictedInstance, Skeleton - Tweak Video str - Expand labels.describe() - Add symmetry_names property to Skeleton - Add scores property to PredictedInstance - Typos, docstrings, formatting

Remove repr test, tweak setup.py

1013178

Move export package commands to submenu with clearer labels

2809ac2

Add auto-increment version in save as... dialog

3ea3d27

- Default filename is now "labels.v000.slp" to encourage versioning best practices

Add version printing and cute welcome to sleap-label launch

a71fb0f

Menu label tweak

f642a48

Fix row selection when navigating suggestions

0e40b80

Add console message when configs are loaded/saved

13c10cd

Set keyboard shortcuts to more sensible defaults

e70ac93

- Add button to reset shortcuts to defaults - Add dialog to notify user about needing to restart app

Add option to reset application preferences

1d2d479

Add GUI state saving and restore

37fe53d

- Add --reset flag in case things get messed up

Add buttons to add/remove/clear all suggestions

16e7f37

talmo added 6 commits February 5, 2021 02:25

Black formatting

53155ed

Fix canceling training

61a870a

Move a few commands to standalone functions

1d11cd7

- export_dataset_gui for saving labels with embedded images with a progress bar - open_website to launch browser from URL - open_file to launch native system file browser - copy_to_clipboard to copy string to system clipboard

Black formatting

f904f03

talmo marked this pull request as ready for review February 5, 2021 10:12

talmo requested review from arie-matsliah and removed request for arie-matsliah February 5, 2021 10:12

talmo marked this pull request as draft February 5, 2021 17:16

talmo added 16 commits February 5, 2021 13:51

Fix random flip augmentation in pipeline building

bd3a0c2

Add split indices to config and support reading from indices

8fe4f3b

- Add more logging to training data builders - Persist indices if split by fraction

Add Labels.extract utility

9902c8d

Skeleton docstring cleanup and add __len__

bd4f575

Fix adding instances when skeleton is empty

5725baa

Move track options to new Tracks menu

f4cc909

- Add propagate track labels option to allow for direct set of track without affecting other frames - Fix state persistence in prefs for some options

Add track deletion menu

1aaa3ff

Fix instance creation with empty skeleton.

f2a4a64

- Docstring formatting

Add __len__ to Instance and fix blank instance generation

ababbf0

Black formatting

649213f

Fix test and special case in data split function

5051bd6

Add Labels.with_user_labels_only()

f1f29e4

Fix instance filtering and clean up docstrings in LabeledFrame

0861e62

Fix training data readers

acde4ba

- Split AFTER filtering for user labels - Count dataset sizes by the reader objects instead of labels to account for possible filtering

Fix inference on labels when skeleton is not yet set

f4277d5

Lint!!

d1f6187

talmo marked this pull request as ready for review February 8, 2021 16:27

talmo merged commit e1b8c62 into develop Feb 8, 2021

talmo deleted the talmo/misc_qol branch February 8, 2021 20:28
