
Commit

Merge pull request #15 from spi-x-i/fixing-documentation
Fixing explanation about model loading phase
spi-x-i authored Jun 20, 2017
2 parents 50a229a + db46a47 commit b0c2b9b
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -125,7 +125,7 @@ object FlinkJpmmlExample {
 Some useful insights from the code:
 - in order to load the PMML model, you need to specify only the PMML source path
 - `ModelReader` is a lazy reader and it provides the right reading abstraction to TaskManagers
-- The resulting `PMMLModel` will be **loaded once** for each TaskManager running on your architecture at
+- The resulting `PMMLModel` will be **loaded by once** factory for each TaskManager running on your architecture at
 _construction time_
 - the `PmmlModel.predict` method expects Flink Vectors as input event and, if you want to manage NaNs, an optional
 replace value;
@@ -155,9 +155,9 @@ the library will take care how to load the model in full compliance of the under
 (e.g. HDFS, Alluxio)
 - `ModelReader` is the object implementing the previous behavior; it will provide the loading methods but will read it
 _lazily_, _i.e._ only when the transformation will be applied
-- The `PMMLModel` will be loaded once for each TaskManager running on your architecture; that means if you have an
-active TaskManager _A_ made up of 4 TaskSlots, your TM will load **one and only one** shared copy of the model; this is
-crucial in order to let the system scale (still simple PMML models can grow to several hundreds of MegaBytes
+- The `PMMLModel` will be loaded by a singleton model factory for each TaskManager running on your architecture; that means if you have an
+active TaskManager _A_ made up of 4 TaskSlots, your TM will load the model from a single loader entity; this is
+crucial in order to let the system scale in thread-safety (still simple PMML models can grow to several hundreds of MegaBytes
 proportionally to the model size, meaning a big load in memory terms)
 - the `PmmlModel.predict` method expects Flink Vectors as input events; this choice let us to leverage the underlying
 Breeze implementation and **no reflection** will be applied at all; moreover, the user don't have to specify any
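
The changed README lines describe a model that is loaded through a single factory per TaskManager and shared by its task slots. As a rough illustration of that idea only, here is a minimal Scala sketch. `ToyModel`, `ToyModelFactory` and the model path are hypothetical names invented for this example; they are not the library's `PmmlModel` or `ModelReader`, and the library's real loading code may differ. The sketch assumes that all task slots of a TaskManager run as threads in one JVM and share one class loader, so a Scala `object` acts as the single loader entity.

```scala
import java.util.concurrent.ConcurrentHashMap
import java.util.concurrent.atomic.AtomicInteger
import java.util.function.{Function => JFunction}

// Hypothetical stand-in for a parsed PMML model (not the library's PmmlModel).
final case class ToyModel(path: String)

// Hypothetical factory: a Scala `object` is initialised once per class loader,
// so every thread in the same JVM goes through this single loader entity.
object ToyModelFactory {
  private val cache = new ConcurrentHashMap[String, ToyModel]()

  // Counts how many times a model was actually built; for demonstration only.
  val loads = new AtomicInteger(0)

  // computeIfAbsent is atomic per key, so concurrent callers trigger one load.
  def getOrLoad(path: String): ToyModel =
    cache.computeIfAbsent(path, new JFunction[String, ToyModel] {
      override def apply(p: String): ToyModel = {
        loads.incrementAndGet()
        ToyModel(p) // a real loader would read and parse the PMML document here
      }
    })
}

object SingletonLoadDemo {
  def main(args: Array[String]): Unit = {
    // Four threads stand in for the four TaskSlots of TaskManager A.
    val slots = (1 to 4).map { _ =>
      new Thread(new Runnable {
        override def run(): Unit = {
          ToyModelFactory.getOrLoad("/hypothetical/model.pmml")
          ()
        }
      })
    }
    slots.foreach(_.start())
    slots.foreach(_.join())
    println(s"actual loads: ${ToyModelFactory.loads.get}")
  }
}
```

Nothing is loaded until the first call to `getOrLoad`, which loosely mirrors the lazy reading described for `ModelReader`. Keeping the cache keyed by path also means a large model occupies memory once per JVM rather than once per slot.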
