Versions scans wrapped in try-except (#80)

* try-except on versions scan
* exit if no models could be accessed
* fixing bug for wrong list of available versions
* exit if no models for config file option
* documentation update
openvinotoolkit · Jul 17, 2019 · commit 0049437
1 parent 6649085

Showing 3 changed files with 89 additions and 14 deletions.
diff --git a/README.md b/README.md
@@ -141,13 +141,54 @@ docker logs ie-serving
 
 
 ### Model import issues
-OpenVINO&trade; model server will fail to start when any of the defined model cannot be loaded successfully. The root cause of
-the failure can be determined based on the collected logs on the console or in the log file.
+OpenVINO&trade; Model Server loads all defined model versions according
+to the configured [version policy](docs/docker_container.md#model-version-policy).
+A model version is represented by a numerical directory in the model path,
+containing OpenVINO model files with .bin and .xml extensions.
+
+Below are examples of incorrect structure:
+```bash
+models/
+├── model1
+│   ├── 1
+│   │   ├── ir_model.bin
+│   │   └── ir_model.xml
+│   └── 2
+│       ├── somefile.bin
+│       └── anotherfile.txt
+└── model2
+    ├── ir_model.bin
+    ├── ir_model.xml
+    └── mapping_config.json
+```
+
+In the above scenario, the server will detect only version `1` of `model1`.
+Directory `2` does not contain valid OpenVINO model files, so it won't
+be detected as a valid model version.
+For `model2`, the files are correct, but they are not in a numerical directory.
+The server will not detect any version in `model2`.
+
+When a new model version is detected, the server loads the model files
+and starts serving the new model version. This operation might fail for the following reasons:
+- there is a problem with accessing model files (e.g. due to network connectivity issues
+to the remote storage or insufficient permissions)
+- model files are malformed and cannot be imported by the Inference Engine
+- the model requires a custom CPU extension
+
+In all those situations, the root cause is reported in the server logs or in the response from a call
+to the GetModelStatus function.
+
+A model version that is detected but not loaded will not be served and will report status
+`LOADING` with the error message: `Error occurred while loading version`.
+When the model files become accessible or are fixed, the server will try to
+load them again on the next [version update](docs/docker_container.md#updating-model-versions)
+attempt.
+
+At startup, the server enables the gRPC and REST API endpoints after all configured models and detected model versions
+are loaded successfully (in the AVAILABLE state).
+
+The server will fail to start if it cannot list the content of configured model paths.
 
-The following problem might occur during model server initialization and model loading:
-* Missing model files in the location specified in the configuration file.
-* Missing version sub-folders in the model folder.
-* Model files require custom CPU extension.
 
 ### Client request issues
 When the model server starts successfully and all the models are imported, there could be a couple of reasons for errors 

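Note on the README change above: the version-detection rule it describes can be illustrated with a small sketch. This is not the server's actual scanning code. The function names `detect_versions` and `demo` are hypothetical, and the rule used here (a purely numeric sub-directory holding at least one `.bin` and one `.xml` file) is an assumption read from the README text.

```python
import os
import tempfile


def detect_versions(model_dir):
    """Return the sorted version numbers found directly under model_dir.

    Assumption: a version is a sub-directory with a purely numeric name
    that contains at least one .bin file and one .xml file.
    """
    versions = []
    for entry in os.listdir(model_dir):
        path = os.path.join(model_dir, entry)
        if not (entry.isdecimal() and os.path.isdir(path)):
            continue  # plain files and non-numeric directories are ignored
        extensions = {os.path.splitext(name)[1] for name in os.listdir(path)}
        if {".bin", ".xml"} <= extensions:
            versions.append(int(entry))
    return sorted(versions)


def demo():
    """Recreate the README's example tree and scan both models."""
    layout = [
        "model1/1/ir_model.bin",
        "model1/1/ir_model.xml",
        "model1/2/somefile.bin",
        "model1/2/anotherfile.txt",
        "model2/ir_model.bin",
        "model2/ir_model.xml",
        "model2/mapping_config.json",
    ]
    with tempfile.TemporaryDirectory() as root:
        for rel_path in layout:
            full_path = os.path.join(root, rel_path)
            os.makedirs(os.path.dirname(full_path), exist_ok=True)
            open(full_path, "w").close()  # empty placeholder file
        return {
            name: detect_versions(os.path.join(root, name))
            for name in ("model1", "model2")
        }
```

Under these assumptions, `demo()` reports version `1` for `model1` and no versions for `model2`, which matches the behaviour the README describes.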
diff --git a/ie_serving/main.py b/ie_serving/main.py
@@ -76,7 +76,8 @@ def parse_config(args):
                                            'base_path'],
                                        batch_size=batch_size,
                                        model_version_policy=model_ver_policy)
-            models[config['config']['name']] = model
+            if model is not None:
+                models[config['config']['name']] = model
         except ValidationError as e_val:
             logger.warning("Model version policy for model {} is invalid. "
                            "Exception: {}".format(config['config']['name'],
@@ -85,6 +86,10 @@ def parse_config(args):
             logger.warning("Unexpected error occurred in {} model. "
                            "Exception: {}".format(config['config']['name'],
                                                   e))
+    if not models:
+        logger.info("Could not access any of provided models. Server will "
+                    "exit now.")
+        sys.exit()
     if args.rest_port > 0:
         process_thread = threading.Thread(target=start_web_rest_server,
                                           args=[models, args.rest_port])
@@ -112,7 +117,13 @@ def parse_one_model(args):
         logger.error("Unexpected error occurred. "
                      "Exception: {}".format(e))
         sys.exit()
-    models = {args.model_name: model}
+    models = {}
+    if model is not None:
+        models[args.model_name] = model
+    else:
+        logger.info("Could not access provided model. Server will exit now.")
+        sys.exit()
+
     if args.rest_port > 0:
         process_thread = threading.Thread(target=start_web_rest_server,
                                           args=[models, args.rest_port])

diff --git a/ie_serving/models/model.py b/ie_serving/models/model.py
@@ -58,13 +58,20 @@ def build(cls, model_name: str, model_directory: str, batch_size,
         logger.info("Server start loading model: {}".format(model_name))
         version_policy_filter = cls.get_model_version_policy_filter(
             model_version_policy)
-        versions_attributes, available_versions = cls.get_version_metadata(
-            model_directory, batch_size, version_policy_filter)
+
+        try:
+            versions_attributes, available_versions = cls.get_version_metadata(
+                model_directory, batch_size, version_policy_filter)
+        except Exception as error:
+            logger.error("Error occurred while getting versions "
+                         "of the model {}".format(model_name))
+            logger.error("Failed reading model versions from path: {} "
+                         "with error {}".format(model_directory, str(error)))
+            return None
+
         versions_attributes = [version for version in versions_attributes
                                if version['version_number']
                                in available_versions]
-        available_versions = [version_attributes['version_number'] for
-                              version_attributes in versions_attributes]
         versions_statuses = dict()
         for version in available_versions:
             versions_statuses[version] = ModelVersionStatus(model_name,
@@ -73,6 +80,9 @@ def build(cls, model_name: str, model_directory: str, batch_size,
         engines = cls.get_engines_for_model(versions_attributes,
                                             versions_statuses)
 
+        available_versions = [version_attributes['version_number'] for
+                              version_attributes in versions_attributes]
+
         model = cls(model_name=model_name, model_directory=model_directory,
                     available_versions=available_versions, engines=engines,
                     batch_size=batch_size,
@@ -81,10 +91,23 @@ def build(cls, model_name: str, model_directory: str, batch_size,
         return model
 
     def update(self):
-        versions_attributes, available_versions = self.get_version_metadata(
-            self.model_directory, self.batch_size, self.version_policy_filter)
+        try:
+            versions_attributes, available_versions = \
+                self.get_version_metadata(
+                    self.model_directory,
+                    self.batch_size,
+                    self.version_policy_filter)
+        except Exception as error:
+            logger.error("Error occurred while getting versions "
+                         "of the model {}".format(self.model_name))
+            logger.error("Failed reading model versions from path: {} "
+                         "with error {}".format(self.model_directory,
+                                                str(error)))
+            return
+
         if available_versions == self.versions:
             return
+
         logger.info("Server start updating model: {}".format(self.model_name))
         to_create, to_delete = self._mark_differences(available_versions)
         logger.debug("Server will try to add {} versions".format(to_create))
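
Note on the pattern in this commit: `Model.build` now returns `None` when the version scan raises, and the callers in `main.py` skip `None` models and exit when nothing is left. A minimal standalone sketch of that flow follows. The names `build_model`, `collect_models`, `failing_scan`, and `start` are hypothetical stand-ins, not functions from `ie_serving`, and the non-zero exit code is an assumption (the diff calls `sys.exit()` with no argument).

```python
import logging
import sys

logger = logging.getLogger("sketch")


def build_model(name, scan_versions):
    """Return a model record, or None if the version scan raises."""
    try:
        versions = scan_versions()
    except Exception as error:
        logger.error("Failed reading versions of model %s: %s", name, error)
        return None
    return {"name": name, "versions": versions}


def collect_models(scanners):
    """Keep only the models whose version scan succeeded."""
    models = {}
    for name, scan_versions in scanners.items():
        model = build_model(name, scan_versions)
        if model is not None:
            models[name] = model
    return models


def failing_scan():
    # Stand-in for an unreachable or unreadable model directory.
    raise PermissionError("cannot list model directory")


def start(scanners):
    """Exit when no model could be accessed, otherwise return the models."""
    models = collect_models(scanners)
    if not models:
        logger.error("Could not access any of the provided models.")
        sys.exit(1)  # assumption: signal failure with a non-zero exit code
    return models
```

For example, `collect_models({"good": lambda: [1, 2], "bad": failing_scan})` keeps only the `good` entry, while `start({"bad": failing_scan})` raises `SystemExit`.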