-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
The EncoderRegion, a generic plugin container for encoders. #657
Conversation
I am not quite ready to push this yet ... First I want to makes sure that everyone likes the idea of incorporating the encoders into NetworkAPI as plug-ins. Then I need to add the ability to plug-in the python written encoders. Questions:
|
I see python already handles this by allowing the SDR buffer to be an optional argument. |
Encoders can have dimensions. The dimensions are typically derived from the raw sensory inputs (which can have dimensions) [1], and from the parameters to the encoder [2]. Both of these things are typically known before the encoder is constructed. However IIRC, all of the encoders currently in htm.core have a single dimension. |
Thanks for your reply. I was hoping that you would respond. :-)
This is sort of what I thought. This means that the encoder itself knows this info. In my NetworkAPI situation, any encoder algorithm module should be able to be plugged into the new EncoderRegion but the EncoderRegion knows nothing about what any particular encoder needs. It does know:
So, it needs a way get this info from the encoder itself. I do not want to require a lot of new code in each encoder to provide this info but I do need something. So, here is my proposal for C++ encoders:
With this information I can build the parameter structure to initialize the encoder and prepare the input to be passed to it. Is there a better way to do this? |
I am struggling with how to be able to plug-in encoders written in python. I can use inspection to find the fields that it expects in initialization but types and default values are a little hard to derive without making some code changes. Looking for ideas. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a quick review of the code, I'll need to think a bit more about the conceptual questions you are asking.
v2.0.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OT, going back with ver?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The VERSION file is changed by the build every time it runs.
I don't know if that is good or bad that it does that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add it to gitignore? Or is VERSION to be manually set for the Releases? Then builds should not auto-increment the value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The builds do not auto increment. They just find the most recent tag and stick that in the VERSION file. The VERSION file is then used by everything that needs version during the build.
I think it is useful to know what was the current tag at the point a build occurred.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't the latest current tag something like v2.0.10, or so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Depends on what tags it finds on your branch. If it does not find any it remains unchanged.
* --------------------------------------------------------------------- */ | ||
|
||
/** @file | ||
* Defines the base class for all encoders. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is renamed from BaseEncoder
to GenericEnc
? why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The GenericEncoder name is just a temporary name until all encoders are converted. Then we can change its name its name back to BaseEncoder.
(int)((char *)&args_.n - (char *)&args_), \ | ||
BasicType::getType(typeid(args_.n)), \ | ||
std::to_string(args_.n)} \ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please mention in some comment that this code & structs are used for the NetworkAPI / EncoderRegions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please mention in some comment that this code & structs are used for the NetworkAPI / EncoderRegions
Yes I will, assuming that we use this way of obtaining the spec.
std::string getName() const override { return "RDSE"; } | ||
|
||
// A discription of the fields in the RDSE_Parameters structure. | ||
ParameterDescriptor getDescriptor() override { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc that this is for netAPI/Regions.
Ok if the override is optional, I wouldn't be too happy if it were required for each new enc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok if the override is optional, I wouldn't be too happy if it were required for each new enc
If we are to use an encoder as a plugin to networkAPI with (Spec) then yes, all encoders will be required to provide this information. Otherwise, networkAPI does not know how to call it.
|
||
const RDSE_Parameters ¶meters = args_; | ||
|
||
void encode(Real64 input, SDR &output) override; | ||
|
||
void encode(Real64 input, SDR &output); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why isn't this required (override) anymore? The c++ API uses this method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why isn't this required (override) anymore? The c++ API uses this method.
Well, that would mean that ALL encoders must require an input of a Real64. I think that only applies to the ScalerEncoder and RDSE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was the reason why BaseEncoder was templated, #657 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understand, but I cannot use a templeted BaseEncoder with polymorphism. So it we have void encode(Real64 input, SDR &output);
it cannot use override. I know...its a bummer.
|
||
#include <htm/utils/Log.hpp> | ||
|
||
// from http://stackoverflow.com/a/9096509/1781435 | ||
#define stringify(x) #x | ||
#define expand_and_stringify(x) stringify(x) | ||
|
||
static bool startsWith(const std::string &s, const std::string &prefix) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, startsWith_
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, startsWith_
Good point.
/** | ||
* Regpresents a boolean value. | ||
*/ | ||
typedef bool Bool; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo above, and do we want a Bool
, why not just bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I needed to be able to create getPrameterBool( ) with a macro. Otherwise it was getParameterbool() and I had to define a function with that name that called getPrameterBool().
I am beginning to wonder if all of this under-the-cover magic is useful. I am doing this in order to build the Spec that each Region object must provide in a static function. This is a hold-over from the OPF framework which called this function to get the info to fill in the GUI all of the options available for configuring the Region. The NetworkAPI engine also used this info in an attempt to be more intelligent about setting up the buffers and doing error checks at configuration time rather than run time. But I wonder if this is over-kill. What if we ripped out the Spec...what would we loose? What would we gain?
This would mean discarding parts of the API but I am not sure that it would be all the visible to external functions and processes other than that the Spec is no longer available. Each Region might have a little more code but in general the NetworkAPI would be simpler and easier to understand. I don't know... what do you think. |
Here is another idea. Rather than re-write NetworkAPI to get rid of the Spec, what if we required each Region to provide just a manually created JSON string that described the region .. and in the case of an EncoderRegion, a JSON string provided by each encoder that described that encoder. This would work for code in C++ and in Python. The down side is that I would still have trouble generically creating and accessing the parameter structure that we currently use to pass parameters to the C++ encoders. What is passed to Regions is the ValueMap which is basically a **kwargs. What I want to do is pass this on to the encoders. This is what the python encoders are expecting. But the C++ encoders use that parameter structure. That was an elegant solution for C++ apps that can create an instance of the structure and fill in the data. But for generic interfaces that cannot include the .hpp file it becomes a problem. Let me think about what might need to be the minimum needed in a JSON descriptor string if we did not use a structure for the parameters. |
Here is what a descriptor string might look like in YAML syntax if one were written for the ScalarEncoder.
I don't know. This seems like a lot of work just to support the creation of the Spec object. |
Obviously this is going to take some time to decide how to proceed. So I am going to put this on hold and take a smaller project; That of changing the Yaml parser so that it does not require the Spec during the parsing. It uses the Spec to determine the type of each field. I can remove the Spec requirement by parsing into a ValueMap object that is a map of strings rather than a map of native type fields. When a field is extracted from the map it will convert it to whatever type the caller needs. |
I created an issue (#664 ) regarding the ValueMap and the Spec so we can talk about this one part. |
So, I will be making a Region for each encoder rather than a universal EncoderRegion. This is what the Spec would look like for ScalarEncoder if I use the yaml parser and took advantage of the default fields, did not show the description fields, and tightened up the syntax a little: (Shown here in JSON syntax).
Should I go with this approach? |
thanks David for continuing the work on this! Although at first I was advocating for a Universal Enc region, I think for now the 1:1 approach will be better 👍
not sure if Encoder handles PS: Scalar has some nice properties etc, but for real-world inputs we'll use RDSE, so maybe start with that one first? (And we might not need ScalarE region at all?) |
You are correct. Each iteration only uses 1 Real64 value. Along this topic... I have been thinking that a small Region that takes a stream input and outputs as a link would be handy. |
Lets close this because I think each encoder will be its own region. |
This is a first crack at a plugin container for the encoder implementations. Feel free to recommend suggestions. The basic idea is that when you declare a region for an encoder using addRegion( ) you use the type
EncoderRegion:<encoder name>
. This will load the EncoderRegion implementation and load the requested encoder algorithm module.The GenericEncoder class is the base class for encoders that can be plugins. Once all of them are plugins we can change its name to BaseEncoder. The BaseEncoder cannot use a class template if it is to be used with polymorphism. So there are some differences between the two classes.
I have not yet added the Python bindings so that encoders can be loaded from python code as well.
You will notice that I had to add a little block of code to each encoder module so that the generic code in the EncoderRegion can create the expected configuration structure without having a reference to the class. I implemented this on ScalerEncoder and RDSE encoder. The Document encoder will have to wait until I extend the parameter handling a little more.