[Bug]: WinError183 on OpenVino Export mode (Windows) #1385

ggiret-thinkdeep · 2023-10-05T09:19:57Z

Describe the bug

When I run the first notebook (Getting Started), I get a WinError 183 at the end of training because my OS is Windows.

Dataset

N/A

Model

N/A

Steps to reproduce the behavior

Use Windows
Run the first notebook, 001_get_started
Set export_mode to "openvino"
At the end of training, on Windows only, a WinError 183 is raised

The code performs a rename whose target name matches a file that already exists. On Linux, an existing target file is overwritten without warning (provided the user has the required permissions). On Windows, the same call raises a FileExistsError (WinError 183).

I will propose a fix for Windows. The fix must also preserve the expected behavior on Linux when the user does not have the required permissions.
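To illustrate the platform difference described above, here is a minimal sketch that moves a temporary file onto an existing target with `Path.replace`, which overwrites the destination on both Linux and Windows. The helper name `move_overwriting` and the `demo` function are hypothetical and used only for illustration; this is not necessarily the change made in the actual fix.

```python
import tempfile
from pathlib import Path


def move_overwriting(src: Path, dst: Path) -> Path:
    """Move src onto dst, overwriting dst if it exists (hypothetical helper)."""
    # Path.rename raises FileExistsError on Windows when dst already exists.
    # Path.replace wraps os.replace, which overwrites an existing file on
    # both POSIX and Windows.
    return src.replace(dst)


def demo() -> str:
    """Recreate the tmp.xml -> model.xml situation in a scratch directory."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "tmp.xml"
        dst = Path(tmp) / "model.xml"
        src.write_text("new")
        dst.write_text("old")  # the target already exists, as in the traceback
        move_overwriting(src, dst)
        assert not src.exists()
        return dst.read_text()


if __name__ == "__main__":
    print(demo())
```

`Path.replace` can still fail, for example with a PermissionError when the user lacks write access or, on Windows, when the target is open in another process, so the permission case mentioned above still surfaces as an error.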

OS information

OS: [Windows 11]
Python version: [3.10.13]
Anomalib version: [0.7.0]
PyTorch version: [2.0.0]
CUDA/cuDNN version: [11.7]
GPU models and configuration: [GeForce RTX 2070]

Expected behavior

No error

Screenshots

No response

Pip/GitHub

pip

What version/branch did you use?

No response

Configuration YAML

dataset:
  name: mvtec
  format: mvtec
  path: ./datasets/MVTec
  task: segmentation
  category: bottle
  train_batch_size: 32
  eval_batch_size: 32
  num_workers: 8
  image_size: 256 # dimensions to which images are resized (mandatory)
  center_crop: 224 # dimensions to which images are center-cropped after resizing (optional)
  normalization: imagenet # data distribution to which the images will be normalized: [none, imagenet]
  transform_config:
    train: null
    eval: null
  test_split_mode: from_dir # options: [from_dir, synthetic]
  test_split_ratio: 0.2 # fraction of train images held out testing (usage depends on test_split_mode)
  val_split_mode: same_as_test # options: [same_as_test, from_test, synthetic]
  val_split_ratio: 0.5 # fraction of train/test images held out for validation (usage depends on val_split_mode)
  tiling:
    apply: false
    tile_size: null
    stride: null
    remove_border_count: 0
    use_random_tiling: False
    random_tile_count: 16

model:
  name: patchcore
  backbone: wide_resnet50_2
  pre_trained: true
  layers:
    - layer2
    - layer3
  coreset_sampling_ratio: 0.1
  num_neighbors: 9
  normalization_method: min_max # options: [null, min_max, cdf]

metrics:
  image:
    - F1Score
    - AUROC
  pixel:
    - F1Score
    - AUROC
  threshold:
    method: adaptive #options: [adaptive, manual]
    manual_image: null
    manual_pixel: null

visualization:
  show_images: False # show images on the screen
  save_images: True # save images to the file system
  log_images: True # log images to the available loggers (if any)
  image_save_path: null # path to which images will be saved
  mode: full # options: ["full", "simple"]

project:
  seed: 0
  path: ./results

logging:
  logger: [] # options: [comet, tensorboard, wandb, csv] or combinations.
  log_graph: false # Logs the model graph to respective logger.

optimization:
  export_mode: null # options: onnx, openvino

# PL Trainer Args. Don't add extra parameter here.
trainer:
  enable_checkpointing: true
  default_root_dir: null
  gradient_clip_val: 0
  gradient_clip_algorithm: norm
  num_nodes: 1
  devices: 1
  enable_progress_bar: true
  overfit_batches: 0.0
  track_grad_norm: -1
  check_val_every_n_epoch: 1 # Don't validate before extracting features.
  fast_dev_run: false
  accumulate_grad_batches: 1
  max_epochs: 1
  min_epochs: null
  max_steps: -1
  min_steps: null
  max_time: null
  limit_train_batches: 1.0
  limit_val_batches: 1.0
  limit_test_batches: 1.0
  limit_predict_batches: 1.0
  val_check_interval: 1.0 # Don't validate before extracting features.
  log_every_n_steps: 50
  accelerator: auto # <"cpu", "gpu", "tpu", "ipu", "hpu", "auto">
  strategy: null
  sync_batchnorm: false
  precision: 32
  enable_model_summary: true
  num_sanity_val_steps: 0
  profiler: null
  benchmark: false
  deterministic: false
  reload_dataloaders_every_n_epochs: 0
  auto_lr_find: false
  replace_sampler_ddp: true
  detect_anomaly: false
  auto_scale_batch_size: false
  plugins: null
  move_metrics_to_cpu: false
  multiple_trainloader_mode: max_size_cycle

Logs

C:\Users\ggire\anaconda3\envs\MeshStream\lib\site-packages\anomalib\deploy\export.py:218: UserWarning: Transform CenterCrop is not supported currently
  warn(f"Transform {transform} is not supported currently")
C:\Users\ggire\anaconda3\envs\MeshStream\lib\site-packages\anomalib\deploy\export.py:218: UserWarning: Transform ToTensorV2 is not supported currently
  warn(f"Transform {transform} is not supported currently")
---------------------------------------------------------------------------
FileExistsError                           Traceback (most recent call last)
Cell In[9], line 3
      1 # start training
      2 trainer = Trainer(**config.trainer, callbacks=callbacks)
----> 3 trainer.fit(model=model, datamodule=datamodule)

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\trainer\trainer.py:608, in Trainer.fit(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    606 model = self._maybe_unwrap_optimized(model)
    607 self.strategy._lightning_module = model
--> 608 call._call_and_handle_interrupt(
    609     self, self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
    610 )

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\trainer\call.py:38, in _call_and_handle_interrupt(trainer, trainer_fn, *args, **kwargs)
     36         return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
     37     else:
---> 38         return trainer_fn(*args, **kwargs)
     40 except _TunerExitException:
     41     trainer._call_teardown_hook()

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\trainer\trainer.py:650, in Trainer._fit_impl(self, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path)
    643 ckpt_path = ckpt_path or self.resume_from_checkpoint
    644 self._ckpt_path = self._checkpoint_connector._set_ckpt_path(
    645     self.state.fn,
    646     ckpt_path,  # type: ignore[arg-type]
    647     model_provided=True,
    648     model_connected=self.lightning_module is not None,
    649 )
--> 650 self._run(model, ckpt_path=self.ckpt_path)
    652 assert self.state.stopped
    653 self.training = False

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\trainer\trainer.py:1112, in Trainer._run(self, model, ckpt_path)
   1108 self._checkpoint_connector.restore_training_state()
   1110 self._checkpoint_connector.resume_end()
-> 1112 results = self._run_stage()
   1114 log.detail(f"{self.__class__.__name__}: trainer tearing down")
   1115 self._teardown()

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\trainer\trainer.py:1191, in Trainer._run_stage(self)
   1189 if self.predicting:
   1190     return self._run_predict()
-> 1191 self._run_train()

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\trainer\trainer.py:1214, in Trainer._run_train(self)
   1211 self.fit_loop.trainer = self
   1213 with torch.autograd.set_detect_anomaly(self._detect_anomaly):
-> 1214     self.fit_loop.run()

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\loops\loop.py:206, in Loop.run(self, *args, **kwargs)
    203         break
    204 self._restarting = False
--> 206 output = self.on_run_end()
    207 return output

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\loops\fit_loop.py:323, in FitLoop.on_run_end(self)
    320 log.detail(f"{self.__class__.__name__}: train run ended")
    322 # hook
--> 323 self.trainer._call_callback_hooks("on_train_end")
    324 self.trainer._call_lightning_module_hook("on_train_end")
    325 self.trainer._call_strategy_hook("on_train_end")

File ~\anaconda3\envs\MeshStream\lib\site-packages\pytorch_lightning\trainer\trainer.py:1394, in Trainer._call_callback_hooks(self, hook_name, *args, **kwargs)
   1392     if callable(fn):
   1393         with self.profiler.profile(f"[Callback]{callback.state_key}.{hook_name}"):
-> 1394             fn(self, self.lightning_module, *args, **kwargs)
   1396 if pl_module:
   1397     # restore current_fx when nested context
   1398     pl_module._current_fx_name = prev_fx_name

File ~\anaconda3\envs\MeshStream\lib\site-packages\anomalib\utils\callbacks\export.py:46, in ExportCallback.on_train_end(self, trainer, pl_module)
     43 logger.info("Exporting the model")
     44 Path(self.dirpath).mkdir(parents=True, exist_ok=True)
---> 46 export(
     47     task=trainer.datamodule.test_data.task,
     48     input_size=self.input_size,
     49     transform=trainer.datamodule.test_data.transform.to_dict(),
     50     model=pl_module,
     51     export_root=self.dirpath,
     52     export_mode=self.export_mode,
     53 )

File ~\anaconda3\envs\MeshStream\lib\site-packages\anomalib\deploy\export.py:126, in export(task, transform, input_size, model, export_mode, export_root)
    124     onnx_path = export_to_onnx(model, input_size, export_path)
    125     if export_mode == ExportMode.OPENVINO:
--> 126         export_to_openvino(export_path, onnx_path, metadata, input_size)
    128 else:
    129     raise ValueError(f"Unknown export mode {export_mode}")

File ~\anaconda3\envs\MeshStream\lib\site-packages\anomalib\deploy\export.py:182, in export_to_openvino(export_path, onnx_path, metadata, input_size)
    180 optimize_command = ["mo", "--input_model", str(onnx_path), "--output_dir", str(export_path)]
    181 subprocess.run(optimize_command, check=True)  # nosec
--> 182 _add_metadata_to_ir(str(export_path) + f"/{onnx_path.with_suffix('.xml').name}", metadata, input_size)

File ~\anaconda3\envs\MeshStream\lib\site-packages\anomalib\deploy\export.py:235, in _add_metadata_to_ir(xml_file, metadata, input_size)
    233 serialize(model, str(tmp_xml_path))
    234 #Path(xml_file).unlink(missing_ok=True)
--> 235 tmp_xml_path.rename(xml_file)
    236 # since we create new openvino IR files, we don't need the bin file. So we delete it.
    237 tmp_xml_path.with_suffix(".bin").unlink()

File ~\anaconda3\envs\MeshStream\lib\pathlib.py:1234, in Path.rename(self, target)
   1224 def rename(self, target):
   1225     """
   1226     Rename this path to the target path.
   1227 
   (...)
   1232     Returns the new Path instance pointing to the target path.
   1233     """
-> 1234     self._accessor.rename(self, target)
   1235     return self.__class__(target)

FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'results\\patchcore\\mvtec\\bottle\\run\\weights\\openvino\\tmp.xml' -> 'results\\patchcore\\mvtec\\bottle\\run\\weights\\openvino/model.xml'

Code of Conduct

I agree to follow this project's Code of Conduct

blaz-r · 2023-10-18T08:14:13Z

Hello. If I understand correctly this was fixed in #1386 so it can be closed?

ggiret-thinkdeep · 2023-10-18T08:46:15Z

Exactly yes

ggiret-thinkdeep closed this as completed Oct 18, 2023
