Add option to train AI using Tasks and their bounding boxes #8310

Merged
merged 16 commits on Jan 16, 2025
Changes from 4 commits
1 change: 1 addition & 0 deletions CHANGELOG.unreleased.md
@@ -22,6 +22,7 @@ For upgrade instructions, please check the [migration guide](MIGRATIONS.released
- Renamed "resolution" to "magnification" in more places within the codebase, including local variables. [#8168](https://github.com/scalableminds/webknossos/pull/8168)
- Layer names are now allowed to contain `$` as special characters. [#8241](https://github.com/scalableminds/webknossos/pull/8241)
- Datasets can now be renamed and can have duplicate names. [#8075](https://github.com/scalableminds/webknossos/pull/8075)
- Starting an ai training job using multiple annotations now also support inputting task ids and also considers their task bounding box. [#8310](https://github.com/scalableminds/webknossos/pull/8310)
Contributor
🛠️ Refactor suggestion

Improve changelog entry clarity and correctness.

The changelog entry contains grammatical issues and redundant wording. Consider this revision:

-- Starting an ai training job using multiple annotations now also support inputting task ids and also considers their task bounding box. [#8310](https://github.com/scalableminds/webknossos/pull/8310)
++ Starting an AI training job using multiple annotations now supports inputting task-IDs and considers their task bounding boxes. [#8310](https://github.com/scalableminds/webknossos/pull/8310)

Changes made:

  • Fixed subject-verb agreement ("supports")
  • Capitalized "AI"
  • Hyphenated "task-IDs"
  • Removed redundant "also"
  • Made "bounding box" plural to match multiple tasks
🧰 Tools
🪛 LanguageTool

[style] ~25-~25: You’ve already used the word ‘also’ once in your sentence, so using it again may be redundant.
Context: ...now also support inputting task ids and also considers their task bounding box. [#8310](https:...

(REDUNDANT_FILLER)

- Improved the default colors for skeleton trees. [#8228](https://github.com/scalableminds/webknossos/pull/8228)
- Allowed to train an AI model using differently sized bounding boxes. We recommend all bounding boxes to have equal dimensions or to have dimensions which are multiples of the smallest bounding box. [#8222](https://github.com/scalableminds/webknossos/pull/8222)
- Within the bounding box tool, the cursor updates immediately after pressing `ctrl`, indicating that a bounding box can be moved instead of resized. [#8253](https://github.com/scalableminds/webknossos/pull/8253)
2 changes: 1 addition & 1 deletion app/controllers/AiModelController.scala
@@ -153,7 +153,7 @@ class AiModelController @Inject()(
)
existingAiModelsCount <- aiModelDAO.countByNameAndOrganization(request.body.name,
request.identity._organization)
_ <- bool2Fox(existingAiModelsCount == 0) ?~> "model.nameInUse"
_ <- bool2Fox(existingAiModelsCount == 0) ?~> "aiModel.nameInUse"
Member
Thanks for fixing this!

newTrainingJob <- jobService
.submitJob(jobCommand, commandArgs, request.identity, dataStore.name) ?~> "job.couldNotRunTrainModel"
newAiModel = AiModel(
1 change: 1 addition & 0 deletions conf/messages
@@ -373,5 +373,6 @@ shortLink.notFound=No shortlink with this key could be found
aiModel.delete.referencedByInferences=Cannot delete AI models that are referenced by existing inferences.
aiModel.notFound=Could not find requested AI model.
aiModel.training.zeroAnnotations=Need at least one training annotation for model training.
aiModel.nameInUse=The AI model name is already in use. Please choose a different name.

aiInference.notFound=Could not find requested AI inference.
@@ -396,7 +396,7 @@ function TaskTypeCreateView({ taskTypeId, history }: Props) {
valuePropName="checked"
>
<Checkbox>
Allow Volume Interpolation
Allow Volume Interpolation{" "}
<Tooltip
title="When enabled, it suffices to only label every 2nd slice. The skipped slices will be filled automatically by interpolating between the labeled slices."
placement="right"
7 changes: 5 additions & 2 deletions frontend/javascripts/admin/voxelytics/ai_model_list_view.tsx
@@ -12,7 +12,10 @@ import { JobState } from "admin/job/job_list_view";
import { Link } from "react-router-dom";
import { useGuardedFetch } from "libs/react_helpers";
import { PageNotAvailableToNormalUser } from "components/permission_enforcer";
import { type AnnotationInfoForAIJob, TrainAiModelTab } from "oxalis/view/jobs/train_ai_model";
import {
type AnnotationInfoForAITrainingJob,
TrainAiModelTab,
} from "oxalis/view/jobs/train_ai_model";
import { getMagInfo, getSegmentationLayerByName } from "oxalis/model/accessors/dataset_accessor";
import type { Vector3 } from "oxalis/constants";
import type { Key } from "react";
@@ -106,7 +109,7 @@ export default function AiModelListView() {

function TrainNewAiJobModal({ onClose }: { onClose: () => void }) {
const [annotationInfosForAiJob, setAnnotationInfosForAiJob] = useState<
AnnotationInfoForAIJob<APIAnnotation>[]
AnnotationInfoForAITrainingJob<APIAnnotation>[]
>([]);

const getMagsForSegmentationLayer = (annotationId: string, layerName: string) => {
68 changes: 48 additions & 20 deletions frontend/javascripts/oxalis/view/jobs/train_ai_model.tsx
@@ -47,6 +47,7 @@ import { computeArrayFromBoundingBox } from "libs/utils";
import { MagSelectionFormItem } from "components/mag_selection";
import { MagInfo } from "oxalis/model/helpers/mag_info";
import { V3 } from "libs/mjs";
import { getAnnotationsForTask } from "admin/api/tasks";

const { TextArea } = Input;
const FormItem = Form.Item;
@@ -55,10 +56,12 @@ const FormItem = Form.Item;
// only the APIAnnotations of the given annotations to train on are loaded from the backend.
// Thus, the code needs to handle both HybridTracing | APIAnnotation where APIAnnotation is missing some information.
// Therefore, volumeTracings with the matching volumeTracingMags are needed to get more details on each volume annotation layer and its magnifications.
// The userBoundingBoxes are needed for checking for equal bounding box sizes. As training on fallback data is supported and an annotation is not required to have VolumeTracings,
// As the userBoundingBoxes should have multiple sizes of the smallest one, a check with a warning should be included.
Member
Suggested change
// As the userBoundingBoxes should have multiple sizes of the smallest one, a check with a warning should be included.
// As the userBoundingBoxes should have extents that are multiples of the smallest extent, a check with a warning should be included.
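For reference, a minimal standalone sketch of the kind of check described above; the helper name and the extent representation are assumptions for illustration only, not part of this PR (the real check is presumably checkBoundingBoxesForErrorsAndWarnings further down in this file):

type Vector3 = [number, number, number];

// Warn when a bounding box extent is not an integer multiple of the
// smallest extent along the same axis. Zero-sized extents also trigger
// a warning, since the modulo test cannot hold for them.
function getExtentMultipleWarnings(extents: Vector3[]): string[] {
  if (extents.length === 0) return [];
  const smallestPerAxis = [0, 1, 2].map((axis) =>
    Math.min(...extents.map((extent) => extent[axis])),
  );
  const warnings: string[] = [];
  extents.forEach((extent, index) => {
    const isMultiple = extent.every(
      (value, axis) => smallestPerAxis[axis] > 0 && value % smallestPerAxis[axis] === 0,
    );
    if (!isMultiple) {
      warnings.push(
        `Bounding box ${index} has an extent that is not a multiple of the smallest extent.`,
      );
    }
  });
  return warnings;
}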

// As training on fallback data is supported and an annotation is not required to have VolumeTracings,
// it is necessary to save userBoundingBoxes separately and not load them from volumeTracings entries to support skeleton only annotations.
// Moreover, in case an annotations is a task, its task bounding box should also be used for training.
Member
Suggested change
// Moreover, in case an annotations is a task, its task bounding box should also be used for training.
// Moreover, in case an annotation is a task, its task bounding box should also be used for training.

// Note that a copy of the userBoundingBoxes is included in each volume and skeleton tracing of an annotation. Thus, it doesn't matter from which the userBoundingBoxes are taken.
export type AnnotationInfoForAIJob<GenericAnnotation> = {
export type AnnotationInfoForAITrainingJob<GenericAnnotation> = {
annotation: GenericAnnotation;
dataset: APIDataset;
volumeTracings: VolumeTracing[];
@@ -175,8 +178,8 @@ export function TrainAiModelTab<GenericAnnotation extends APIAnnotation | Hybrid
getMagsForSegmentationLayer: (annotationId: string, layerName: string) => MagInfo;
onClose: () => void;
ensureSavedState?: (() => Promise<void>) | null;
annotationInfos: Array<AnnotationInfoForAIJob<GenericAnnotation>>;
onAddAnnotationsInfos?: (newItems: Array<AnnotationInfoForAIJob<APIAnnotation>>) => void;
annotationInfos: Array<AnnotationInfoForAITrainingJob<GenericAnnotation>>;
onAddAnnotationsInfos?: (newItems: Array<AnnotationInfoForAITrainingJob<APIAnnotation>>) => void;
}) {
const [form] = Form.useForm();

@@ -495,7 +498,7 @@ export function CollapsibleWorkflowYamlEditor({
}

function checkAnnotationsForErrorsAndWarnings<T extends HybridTracing | APIAnnotation>(
annotationsWithDatasets: Array<AnnotationInfoForAIJob<T>>,
annotationsWithDatasets: Array<AnnotationInfoForAITrainingJob<T>>,
): {
hasAnnotationErrors: boolean;
errors: string[];
@@ -596,31 +599,42 @@ function checkBoundingBoxesForErrorsAndWarnings(
function AnnotationsCsvInput({
onAdd,
}: {
onAdd: (newItems: Array<AnnotationInfoForAIJob<APIAnnotation>>) => void;
onAdd: (newItems: Array<AnnotationInfoForAITrainingJob<APIAnnotation>>) => void;
}) {
const [value, setValue] = useState("");
const onClickAdd = async () => {
const newItems = [];
const annotationIdsForTraining = [];
const unfinishedTasks = [];

const lines = value
.split("\n")
.map((line) => line.trim())
.filter((line) => line !== "");
for (const annotationUrlOrId of lines) {
if (annotationUrlOrId.includes("/")) {
newItems.push({
annotationId: annotationUrlOrId.split("/").at(-1) as string,
});
for (const taskOrAnnotationIdOrUrl of lines) {
if (taskOrAnnotationIdOrUrl.includes("/")) {
annotationIdsForTraining.push(taskOrAnnotationIdOrUrl.split("/").at(-1) as string);
} else {
newItems.push({
annotationId: annotationUrlOrId,
});
let isTask = true;
try {
const annotations = await getAnnotationsForTask(taskOrAnnotationIdOrUrl);
const finishedAnnotations = annotations.filter(({ state }) => state === "Finished");
if (annotations.length > 0) {
annotationIdsForTraining.push(...finishedAnnotations.map(({ id }) => id));
} else {
unfinishedTasks.push(taskOrAnnotationIdOrUrl);
}
} catch (_e) {
isTask = false;
}
Contributor
🛠️ Refactor suggestion

Improve error handling in task validation.

The catch block silently swallows all errors to determine if the input is a task. This could hide real API errors or other issues.

Consider this improvement:

-        try {
-          const annotations = await getAnnotationsForTask(taskOrAnnotationIdOrUrl);
-          const finishedAnnotations = annotations.filter(({ state }) => state === "Finished");
-          if (annotations.length > 0) {
-            annotationIdsForTraining.push(...finishedAnnotations.map(({ id }) => id));
-          } else {
-            unfinishedTasks.push(taskOrAnnotationIdOrUrl);
-          }
-        } catch (_e) {
-          isTask = false;
-        }
+        try {
+          const annotations = await getAnnotationsForTask(taskOrAnnotationIdOrUrl);
+          const finishedAnnotations = annotations.filter(({ state }) => state === "Finished");
+          if (annotations.length > 0) {
+            annotationIdsForTraining.push(...finishedAnnotations.map(({ id }) => id));
+          } else {
+            unfinishedTasks.push(taskOrAnnotationIdOrUrl);
+          }
+        } catch (error) {
+          if (error.response?.status === 404) {
+            isTask = false;
+          } else {
+            throw new Error(`Failed to fetch task ${taskOrAnnotationIdOrUrl}: ${error.message}`);
+          }
+        }
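Note that in strict TypeScript the catch variable is typed as unknown, so error.response?.status and error.message in the suggestion above would need narrowing first. A minimal sketch of such a guard; the shape of the error object is an assumption about the request layer, not something defined in this PR:

function getHttpStatus(error: unknown): number | undefined {
  if (typeof error === "object" && error !== null && "status" in error) {
    const status = (error as { status: unknown }).status;
    return typeof status === "number" ? status : undefined;
  }
  return undefined;
}

// Possible usage inside the catch block, treating 404 as "not a task":
// } catch (error) {
//   if (getHttpStatus(error) === 404) {
//     isTask = false;
//   } else {
//     throw error;
//   }
// }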

if (!isTask) {
annotationIdsForTraining.push(taskOrAnnotationIdOrUrl);
}
}
}

const newAnnotationsWithDatasets = await Promise.all(
newItems.map(async (item) => {
const annotation = await getAnnotationInformation(item.annotationId);
annotationIdsForTraining.map(async (annotationId) => {
const annotation = await getAnnotationInformation(annotationId);
const dataset = await getDataset(annotation.datasetId);

const volumeServerTracings: ServerVolumeTracing[] = await Promise.all(
@@ -651,6 +665,16 @@ function AnnotationsCsvInput({
);
}
}
if (annotation.task?.boundingBox) {
const largestId = Math.max(...userBoundingBoxes.map(({ id }) => id));
userBoundingBoxes.push({
name: "Task Bounding Box",
boundingBox: Utils.computeBoundingBoxFromBoundingBoxObject(annotation.task.boundingBox),
color: [0, 0, 0],
isVisible: true,
id: largestId + 1,
});
}

return {
annotation,
@@ -663,14 +687,18 @@
};
}),
);

if (unfinishedTasks.length > 0) {
Toast.warning(
`The following tasks have no finished annotations: ${unfinishedTasks.join(", ")}`,
);
}
onAdd(newAnnotationsWithDatasets);
};
return (
<div>
<FormItem
name="annotationCsv"
label="Annotations CSV"
label="Annotations or Tasks CSV"
hasFeedback
initialValue={value}
rules={[
Expand All @@ -693,7 +721,7 @@ function AnnotationsCsvInput({
>
<TextArea
className="input-monospace"
placeholder="annotationUrlOrId"
placeholder="taskOrAnnotationIdOrUrl"
autoSize={{
minRows: 6,
}}