The EncoderRegion, a generic plugin container for encoders. #657

Closed
wants to merge 5 commits into from
2 changes: 2 additions & 0 deletions .clang-format
@@ -2,4 +2,6 @@
 Language: Cpp
 BasedOnStyle: LLVM
 DisableFormat: false
+IndentWidth: 2
+ColumnLimit: 120
 ...
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-v2.0.11
+v2.0.0
Member

Off-topic: is the version going backwards?

Author

The VERSION file is changed by the build every time it runs.
I don't know whether that is good or bad.

Member

Should we add it to .gitignore? Or is VERSION meant to be set manually for releases? In that case, builds should not auto-increment the value.

Author

The builds do not auto-increment. They just find the most recent tag and write it to the VERSION file. The VERSION file is then used by everything that needs the version during the build.
I think it is useful to know which tag was current at the point a build occurred.

Member

Isn't the latest tag something like v2.0.10?

Author

It depends on which tags it finds on your branch. If it does not find any, the file remains unchanged.

4 changes: 3 additions & 1 deletion bindings/py/cpp_src/bindings/encoders/py_RDSE.cpp
@@ -108,7 +108,9 @@ fields are filled in automatically.)");
     py_RDSE.def_property_readonly("size",
         [](RDSE &self) { return self.size; });

-    py_RDSE.def("encode", &RDSE::encode, R"()");
+    py_RDSE.def("encode", [](RDSE &self, htm::Real64 value, htm::SDR* sdr) {
+        self.encode( value, *sdr );
+    });

     py_RDSE.def("encode", [](RDSE &self, Real64 value) {
         auto sdr = new SDR({self.size});
8 changes: 5 additions & 3 deletions bindings/py/cpp_src/bindings/encoders/py_ScalarEncoder.cpp
@@ -115,12 +115,14 @@ fields are filled in automatically.)");
     py_ScalarEnc.def_property_readonly("size",
         [](const ScalarEncoder &self) { return self.size; });

-    py_ScalarEnc.def("encode", &ScalarEncoder::encode, R"()");
+    py_ScalarEnc.def("encode", [](ScalarEncoder &self, htm::Real64 value, htm::SDR* sdr) {
+        self.encode( value, *sdr );
+    });

     py_ScalarEnc.def("encode", [](ScalarEncoder &self, htm::Real64 value) {
         auto output = new SDR( self.dimensions );
         self.encode( value, *output );
-        return output; },
-    R"()");
+        return output;
+    });
   }
 }
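Both binding files now expose two `encode` overloads: one fills an SDR supplied by the caller, the other allocates the SDR and returns it. The sketch below shows the same pairing in plain C++ without pybind11. `ToySDR` and `ToyEncoder` are hypothetical stand-ins, not the htm.core classes, and the bucket arithmetic is only a placeholder for a real encoding.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for htm::SDR: just a dense bit buffer.
struct ToySDR {
  explicit ToySDR(std::size_t n) : dense(n, 0) {}
  std::vector<unsigned char> dense;
};

// Hypothetical stand-in for an encoder with a fixed, non-zero output size.
struct ToyEncoder {
  std::size_t size;

  // Form 1: the caller supplies the output buffer, which is overwritten.
  // Assumes value is finite and non-negative (placeholder logic only).
  void encode(double value, ToySDR &output) const {
    output.dense.assign(size, 0);
    const std::size_t bucket = static_cast<std::size_t>(value) % size;
    output.dense[bucket] = 1;
  }

  // Form 2: allocate the output, then delegate to form 1.
  std::unique_ptr<ToySDR> encode(double value) const {
    auto output = std::make_unique<ToySDR>(size);
    encode(value, *output);
    return output;
  }
};
```

Keeping the allocating form as a thin wrapper over the in-place form means the encoding logic lives in one place, which is the same shape the two lambdas in the bindings have.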
5 changes: 5 additions & 0 deletions src/CMakeLists.txt
@@ -83,6 +83,7 @@ set(algorithm_files

 set(encoders_files
   htm/encoders/BaseEncoder.hpp
+  htm/encoders/GenericEncoder.hpp
   htm/encoders/ScalarEncoder.cpp
   htm/encoders/ScalarEncoder.hpp
   htm/encoders/RandomDistributedScalarEncoder.hpp
@@ -110,6 +111,8 @@ set(engine_files
   htm/engine/RegionImplFactory.hpp
   htm/engine/RegisteredRegionImpl.hpp
   htm/engine/RegisteredRegionImplCpp.hpp
+  htm/engine/RegisteredEncoder.hpp
+  htm/engine/RegisteredEncoderCpp.hpp
   htm/engine/Spec.cpp
   htm/engine/Spec.hpp
   htm/engine/YAMLUtils.cpp
@@ -150,6 +153,8 @@ set(os_files
 )

 set(regions_files
+  htm/regions/EncoderRegion.cpp
+  htm/regions/EncoderRegion.hpp
   htm/regions/ScalarSensor.cpp
   htm/regions/ScalarSensor.hpp
   htm/regions/SPRegion.cpp
139 changes: 139 additions & 0 deletions src/htm/encoders/GenericEncoder.hpp
@@ -0,0 +1,139 @@
+/* ---------------------------------------------------------------------
+ * HTM Community Edition of NuPIC
+ * Copyright (C) 2019, David McDougall
+ *
+ * This program is free software: you can redistribute it and/or modify
+ * it under the terms of the GNU Affero Public License version 3 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU Affero Public License for more details.
+ *
+ * You should have received a copy of the GNU Affero Public License
+ * along with this program. If not, see http://www.gnu.org/licenses.
+ * --------------------------------------------------------------------- */
+
+/** @file
+ * Defines the base class for all encoders.
Member

Why is this renamed from BaseEncoder to GenericEncoder?

Author

The GenericEncoder name is just a temporary name until all encoders are converted. Then we can change its name back to BaseEncoder.

+ */
+
+#ifndef NTA_GENERIC_ENCODER
+#define NTA_GENERIC_ENCODER
+
+#include <htm/ntypes/BasicType.hpp>
+#include <htm/types/Sdr.hpp>
+#include <string>
+
+namespace htm {
+
+struct BaseParameters {
+};
+
+// This structure is populated with the FIELD( ) macro
+typedef struct {
+  std::string name;
+  int offset;
+  NTA_BasicType type;
+  std::string default_value;
+} ParameterDescriptorFields;
+typedef struct {
+  NTA_BasicType expectedInputType;
+  size_t expectedInputSize;
+  char *parameterStruct;
+  size_t parameterSize;
+  std::map<std::string, ParameterDescriptorFields> parameters;
+} ParameterDescriptor;
+
+#define FIELD(n) \
+  { #n, \
+    { #n, \
+      (int)((char *)&args_.n - (char *)&args_), \
+      BasicType::getType(typeid(args_.n)), \
+      std::to_string(args_.n)} \
+  }
Member

Please mention in a comment that this code and these structs are used for the NetworkAPI / EncoderRegion.

Author

> please mention in some comment that this code & structs are used for the NetworkAPI / EncoderRegions

Yes I will, assuming that we use this way of obtaining the spec.
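For readers following this thread, here is a rough sketch of what an offset-based descriptor like the one built by `FIELD()` makes possible: a generic caller can set a parameter by name without knowing the concrete struct. Everything below (`ToyParameters`, `ToyType`, `ToyField`, `describeToyParameters`, `setByName`) is hypothetical and only illustrates the idea. It uses `offsetof` rather than the pointer subtraction in the macro, and it is not the PR's implementation.

```cpp
#include <cstddef>
#include <cstring>
#include <map>
#include <string>

// Hypothetical parameter struct; the field names are illustrative only.
struct ToyParameters {
  unsigned size = 0;
  float sparsity = 0.0f;
};

enum class ToyType { UInt, Real };

struct ToyField {
  std::size_t offset; // byte offset of the member inside ToyParameters
  ToyType type;       // how to interpret the bytes at that offset
};

// Build the name -> (offset, type) table, the role FIELD() plays in the PR.
inline std::map<std::string, ToyField> describeToyParameters() {
  return {
      {"size", {offsetof(ToyParameters, size), ToyType::UInt}},
      {"sparsity", {offsetof(ToyParameters, sparsity), ToyType::Real}},
  };
}

// Set a member by name through its offset. Returns false for unknown names.
inline bool setByName(ToyParameters &p, const std::string &name, double value) {
  const auto table = describeToyParameters();
  const auto it = table.find(name);
  if (it == table.end()) {
    return false;
  }
  char *base = reinterpret_cast<char *>(&p);
  if (it->second.type == ToyType::UInt) {
    const unsigned v = static_cast<unsigned>(value);
    std::memcpy(base + it->second.offset, &v, sizeof v);
  } else {
    const float v = static_cast<float>(value);
    std::memcpy(base + it->second.offset, &v, sizeof v);
  }
  return true;
}
```

A region that parses YAML or JSON parameters could walk such a table and fill the struct before handing it to the encoder, which appears to be the intent of `parameterStruct` and `parameters` in `ParameterDescriptor`.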



+/**
+ * Base class for all encoders that can be used as a plugin for EncoderRegion.
+ * An encoder converts a value to a sparse distributed representation.
+ *
+ * Subclasses must implement method encode and Serializable interface.
+ * Subclasses can optionally implement method reset.
+ *
+ * There are several critical properties which all encoders must have:
+ *
+ * 1) Semantic similarity: Similar inputs should have high overlap. Overlap
+ *    decreases smoothly as inputs become less similar. Dissimilar inputs have
+ *    very low overlap so that the output representations are not easily confused.
+ *
+ * 2) Stability: The representation for an input does not change during the
+ *    lifetime of the encoder.
+ *
+ * 3) Sparsity: The output SDR should have a similar sparsity for all inputs and
+ *    have enough active bits to handle noise and subsampling.
+ *
+ * Reference: https://arxiv.org/pdf/1602.05925.pdf
+ */
+class GenericEncoder : public Serializable {
+public:
+  virtual std::string getName() const = 0;
+
+  /**
+   * Members dimensions & size describe the shape of the encoded output SDR.
+   * This is the total number of bits in the result.
+   */
+  const std::vector<UInt> &dimensions = dimensions_;
+  const UInt &size = size_;
+
+  virtual void reset() {}
+
+  // for use by generic EncoderRegion to
+  // generically pass parameter structure to encoder
+  virtual void initialize(BaseParameters *ptrToParameters) = 0;
+  virtual ParameterDescriptor getDescriptor() = 0; // parameter descriptors
+  virtual void encode(void *input, size_t input_count,
+                      SDR &output) = 0; // untyped encode( ) call.
+
+  virtual ~GenericEncoder() {}
+
+  // overridden by including the macro CerealAdapter in subclass.
+  virtual void cereal_adapter_save(ArWrapper &a) const {};
+  virtual void cereal_adapter_load(ArWrapper &a){};
+
+  // encoders should override
+  virtual bool operator==(const GenericEncoder &other) const { return true; }
+
+protected:
+  GenericEncoder() {
+    // initialize with dummy dimensions.
+    // Must set real dimensions later by calling init_base() again.
+    std::vector<UInt> dim({0});
+    init_base(dim);
+  }
+
+  GenericEncoder(const std::vector<UInt> dimensions) {
+    init_base(dimensions);
+  }
+
+  void init_base(const std::vector<UInt> dimensions) {
+    dimensions_ = dimensions;
+    size_ = SDR(dimensions).size;
+  }
+
+
+
+
+private:
+  std::vector<UInt> dimensions_;
+  UInt size_;
+};
+
+
+
+
+
+
+} // end namespace htm
+#endif // NTA_GENERIC_ENCODER
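The header declares an untyped `encode(void *input, size_t input_count, SDR &output)` so that a generic region can drive any encoder. A subclass would presumably check the input and forward to its typed `encode`. The sketch below illustrates that forwarding with hypothetical stand-in types (`ToyOutput`, `ToyGenericEncoder`, `ToyScalarEncoder`); it takes `const void *` for safety and throws on a bad input count, both of which are assumptions rather than what the PR does.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for htm::SDR.
struct ToyOutput {
  std::vector<unsigned char> dense;
};

// Hypothetical stand-in for the untyped part of the GenericEncoder interface.
class ToyGenericEncoder {
public:
  virtual ~ToyGenericEncoder() = default;
  virtual void encode(const void *input, std::size_t input_count, ToyOutput &output) = 0;
};

class ToyScalarEncoder : public ToyGenericEncoder {
public:
  explicit ToyScalarEncoder(std::size_t size) : size_(size) {}

  // Typed entry point: sets one bit, clamped to [0, size_ - 1].
  void encode(double value, ToyOutput &output) {
    output.dense.assign(size_, 0);
    if (size_ == 0) {
      return;
    }
    std::size_t bucket = 0; // NaN and non-positive values land in bucket 0
    if (value >= static_cast<double>(size_ - 1)) {
      bucket = size_ - 1;
    } else if (value > 0.0) {
      bucket = static_cast<std::size_t>(value);
    }
    output.dense[bucket] = 1;
  }

  // Untyped entry point: validate, cast, and forward to the typed overload.
  void encode(const void *input, std::size_t input_count, ToyOutput &output) override {
    if (input == nullptr || input_count != 1) {
      throw std::invalid_argument("expected exactly one double");
    }
    encode(*static_cast<const double *>(input), output);
  }

private:
  std::size_t size_;
};
```

The caller is responsible for passing a pointer of the type the encoder expects; in the PR that expectation is advertised through `expectedInputType` and `expectedInputSize` in the descriptor.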
61 changes: 45 additions & 16 deletions src/htm/encoders/RandomDistributedScalarEncoder.cpp
@@ -27,37 +27,44 @@
 using namespace std;
 using namespace htm;

-RandomDistributedScalarEncoder::RandomDistributedScalarEncoder(
-    const RDSE_Parameters &parameters)
-  { initialize( parameters ); }
+RandomDistributedScalarEncoder::RandomDistributedScalarEncoder() {}
+// Note: This encoder is not useable until initialize() is called.

-void RandomDistributedScalarEncoder::initialize( const RDSE_Parameters &parameters)
-{
+RandomDistributedScalarEncoder::RandomDistributedScalarEncoder( const RDSE_Parameters &params) {
+  initialize( params );
+}
+
+RandomDistributedScalarEncoder::RandomDistributedScalarEncoder(ArWrapper &wrapper) {
+  cereal_adapter_load(wrapper);
+}
+
+
+void RandomDistributedScalarEncoder::initialize(const RDSE_Parameters &params) {
   // Check size parameter
-  NTA_CHECK( parameters.size > 0u );
+  NTA_CHECK(params.size > 0u);

   // Initialize parent class.
-  BaseEncoder<Real64>::initialize({ parameters.size });
+  GenericEncoder::init_base({params.size});

   // Check other parameters
   UInt num_active_args = 0;
-  if( parameters.activeBits > 0u) { num_active_args++; }
-  if( parameters.sparsity > 0.0f) { num_active_args++; }
+  if (params.activeBits > 0u) { num_active_args++; }
+  if (params.sparsity > 0.0f) { num_active_args++; }
   NTA_CHECK( num_active_args != 0u )
       << "Missing argument, need one of: 'activeBits' or 'sparsity'.";
   NTA_CHECK( num_active_args == 1u )
       << "Too many arguments, choose only one of: 'activeBits' or 'sparsity'.";

   UInt num_resolution_args = 0;
-  if( parameters.radius > 0.0f) { num_resolution_args++; }
-  if( parameters.category ) { num_resolution_args++; }
-  if( parameters.resolution > 0.0f) { num_resolution_args++; }
+  if (params.radius > 0.0f) { num_resolution_args++; }
+  if (params.category) { num_resolution_args++; }
+  if (params.resolution > 0.0f) { num_resolution_args++; }
   NTA_CHECK( num_resolution_args != 0u )
       << "Missing argument, need one of: 'radius', 'resolution', 'category'.";
   NTA_CHECK( num_resolution_args == 1u )
       << "Too many arguments, choose only one of: 'radius', 'resolution', 'category'.";

-  args_ = parameters;
+  args_ = params;
   // Finish filling in all of parameters.

   // Determine number of activeBits.
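The checks in this hunk enforce that exactly one parameter from each mutually exclusive group is set, by counting how many are non-zero. A stripped-down sketch of that pattern follows. `ToyRdseParams` and `checkExactlyOne` are hypothetical names, and `std::invalid_argument` stands in for `NTA_CHECK`, whose failure behaviour is not shown in this diff.

```cpp
#include <stdexcept>

// Hypothetical subset of the RDSE parameters; 0 means "not set".
struct ToyRdseParams {
  unsigned activeBits = 0;
  float sparsity = 0.0f;
};

// Throws unless exactly one of the two mutually exclusive fields is set.
inline void checkExactlyOne(const ToyRdseParams &p) {
  unsigned count = 0;
  if (p.activeBits > 0u) { ++count; }
  if (p.sparsity > 0.0f) { ++count; }
  if (count == 0u) {
    throw std::invalid_argument("Missing argument, need one of: 'activeBits' or 'sparsity'.");
  }
  if (count > 1u) {
    throw std::invalid_argument("Too many arguments, choose only one of: 'activeBits' or 'sparsity'.");
  }
}
```

Counting first and testing afterwards gives separate "missing" and "too many" messages, which is friendlier than a single combined check.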
@@ -88,8 +95,7 @@ void RandomDistributedScalarEncoder::initialize( const RDSE_Parameters &paramete
   }
 }

-void RandomDistributedScalarEncoder::encode(Real64 input, SDR &output)
-{
+void RandomDistributedScalarEncoder::encode(Real64 input, SDR &output) {
   // Check inputs
   NTA_CHECK( output.size == size );
   if( isnan(input) ) {
@@ -126,13 +132,36 @@ void RandomDistributedScalarEncoder::encode(Real64 input, SDR &output)
   output.setDense( data );
 }

+// bool RandomDistributedScalarEncoder::equals(const BaseEncoder &other) const
+// {
+
+bool RandomDistributedScalarEncoder::operator==(const GenericEncoder &other) const {
+  if (other.getName() != getName())
+    return false;
+  const RandomDistributedScalarEncoder &o = static_cast<const RandomDistributedScalarEncoder &>(other);
+  if (parameters.size == o.parameters.size &&
+      parameters.activeBits == o.parameters.activeBits &&
+      parameters.sparsity == o.parameters.sparsity &&
+      parameters.radius == o.parameters.radius &&
+      parameters.resolution == o.parameters.resolution &&
+      parameters.category == o.parameters.category &&
+      parameters.seed == o.parameters.seed)
+    return true;
+  return false;
+}
+
+
 std::ostream & htm::operator<<(std::ostream & out, const RandomDistributedScalarEncoder &self)
 {
-  out << "RDSE ";
+  out << self.getName();
   out << " size: " << self.parameters.size << ",\n";
   out << " activeBits: " << self.parameters.activeBits << ",\n";
   out << " resolution: " << self.parameters.resolution << ",\n";
   out << " category: " << self.parameters.category << ",\n";
   out << " seed: " << self.parameters.seed << std::endl;
   return out;
 }
+
+// Register RandomDistributedScalarEncoder for Cereal Serialization
+CEREAL_REGISTER_TYPE(RandomDistributedScalarEncoder);
+CEREAL_REGISTER_POLYMORPHIC_RELATION(GenericEncoder, RandomDistributedScalarEncoder)
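The new `operator==` compares against the base type, checks `getName()` first, and only then downcasts with `static_cast`. A minimal sketch of that pattern is below, with hypothetical classes (`ToyBase`, `ToyRdse`, `ToyOther`) and a hypothetical helper `sameEncoder`. The cast is only safe if every class returns a unique name; `dynamic_cast` would be the defensive alternative if that cannot be guaranteed.

```cpp
#include <string>

class ToyBase {
public:
  virtual ~ToyBase() = default;
  virtual std::string getName() const = 0;
  virtual bool operator==(const ToyBase &other) const = 0;
};

class ToyRdse : public ToyBase {
public:
  ToyRdse(unsigned size, unsigned seed) : size_(size), seed_(seed) {}
  std::string getName() const override { return "ToyRdse"; }

  bool operator==(const ToyBase &other) const override {
    // The name check guards the downcast; names must be unique per class.
    if (other.getName() != getName()) {
      return false;
    }
    const auto &o = static_cast<const ToyRdse &>(other);
    return size_ == o.size_ && seed_ == o.seed_;
  }

private:
  unsigned size_;
  unsigned seed_;
};

class ToyOther : public ToyBase {
public:
  std::string getName() const override { return "ToyOther"; }
  bool operator==(const ToyBase &other) const override {
    return other.getName() == getName();
  }
};

// Compare through base references, as a generic region would.
inline bool sameEncoder(const ToyBase &a, const ToyBase &b) { return a == b; }
```

Comparing through base references also sidesteps the reversed-candidate ambiguity that C++20 can report when both operands have the derived type.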