Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New YAML parser interface #715

Merged
merged 15 commits into from
Oct 21, 2019
Merged

New YAML parser interface #715

merged 15 commits into from
Oct 21, 2019

Conversation

dkeeney
Copy link

@dkeeney dkeeney commented Oct 12, 2019

Continuation of PR #282

src/htm/ntypes/Value.hpp Outdated Show resolved Hide resolved
@@ -46,7 +45,7 @@ SPRegion::SPRegion(const ValueMap &values, Region *region)
args_.columnCount = values.getScalarT<UInt32>("columnCount", 0);
args_.potentialRadius = values.getScalarT<UInt32>("potentialRadius", 0);
args_.potentialPct = values.getScalarT<Real32>("potentialPct", 0.5);
args_.globalInhibition = values.getScalarT<bool>("globalInhibition", true);
args_.globalInhibition = values.getScalarT<bool>("globalInhibition", false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use True as default, local inh is quite slow, and global is the default in the SP class.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use True as default,

Well...the problem is that this PR does change behavior slightly. In the past the defaults from the Spec were incorporated in values. The default given here was never used. The new behavior is that the defaults in the Spec are NOT included in values and therefore this default is now being used.

What I did was change these values to be the same as what was given in the Spec so that the default parameters passed on to the SP algorithm are unchanged.

Now, if you want to change the default to True in this case, then you must also change the expected results in the tests. I agree that the default here should be the same as the defaults used by SP but that sounds like a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I did was change these values to be the same as what was given in the Spec so that the default parameters passed on to the SP algorithm are unchanged.

ok, thanks for explanation, this is fine. also for other on the same topic 👍

I agree that the default here should be the same as the defaults used by SP but that sounds like a separate PR.

yep, let's keep that for another pr

src/htm/regions/SPRegion.cpp Show resolved Hide resolved
src/htm/regions/ScalarSensor.cpp Show resolved Hide resolved
src/htm/regions/TMRegion.cpp Show resolved Hide resolved
src/test/unit/engine/CppRegionTest.cpp Show resolved Hide resolved
@dkeeney
Copy link
Author

dkeeney commented Oct 15, 2019

@breznak I need your help. I know I must be doing something stupid but I cannot find it.

When I run the python unit tests I get 4 errors in NetworkTest.py. lines 164, 244, 264, 328. All three are the same error.

>     r = net.addRegion("test", "py.LinkRegion", "")
E     RuntimeError: TypeError: __init__() got multiple values for argument 'self'

J:\Projects\AI\htmG\bindings\py\tests\regions\network_test.py:328: RuntimeError

I need someone with more Python experience to take a look at this. It could be something to do with the bindings for addRegion( ) but I made no changes to it.

Where I did make changes is in bindings\py\cpp_src\plugin\PyBindRegion.cpp where I changed the building of the parameters when C++ code calls Python implemented regions. The original logic did not allow nesting of parameter structures. Calls to addRegion( ) would cause this to be called if the region is written in python.

I will appreciate any help I can get on this.

@dkeeney
Copy link
Author

dkeeney commented Oct 16, 2019

I gave up too soon...found the problem.
Of course it was someplace I was not expecting it to be...in the code that processed the Spec being read from the python code. The changes in my code exposed a bug over in that code.
Anyway... I working on making the corrections....

@breznak
Copy link
Member

breznak commented Oct 17, 2019

Do you intend to keep options for both libraries, or we decide for one over another?

@dkeeney
Copy link
Author

dkeeney commented Oct 17, 2019

Do you intend to keep options for both libraries, or we decide for one over another?

I don't know yet.

  • yaml-cpp still has the crash problem.
  • libyaml does not have the crash problem.
  • yaml-cpp is based on YAML 1.2
  • libyaml is based on YAML 1.1

At no time will we have both libraries compiled at the same time.
But I am still doing A - B tests to see which one seems to work best for us.
Eventually I will probably decide on one and remove the interface for the other but for now I would like to keep the options open.

@dkeeney
Copy link
Author

dkeeney commented Oct 17, 2019

There are still some things I do not like about the Value class.

  • its name, 'Value'. I am not imaginative enough to come up with a better name.
  • I use this 'assigned_' pointer which I do not like. Perhaps there is a way to do this with shared_ptr.

At the moment I am working on PyBindRegion.cpp and trying to get it to work with the new Value class. The GridCellLocationRegion.py uses parameters orientation and scale which are arrays of floats. That was not handled correctly.

But in general, the Spec limits the complexity of the parameter structure to flat mapped values or one dimensional arrays. I would expect new Regions (or encoders) to need a much richer parameter structure. But for now we are limited to this.

@dkeeney
Copy link
Author

dkeeney commented Oct 19, 2019

I want to tighten up the Value class before we merge.
I want to see if I can get rid of the assigned_ pointer. It might simplify a lot of code if the zombie object can be moved into the std::map without changing its address...so that things that point to it do not need to change their pointer after an assignment.

breznak
breznak previously approved these changes Oct 19, 2019
Copy link
Member

@breznak breznak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Big congratulations for making this work, David! 🎉

I'm happy to see the cleanup and the new code has a really helpful docs 👍

  • small comments below
  • README might need updates as it mentions libyaml-cpp
  • I'm looking forward to the simplification with pointers you're mentioning
  • are we completely happy with libayml, or it too has its pros & cons?

Thank you!

elif(EXISTS ${REPOSITORY_DIR}/build/ThirdParty/share/libyaml.tar.gz)
set(URL ${REPOSITORY_DIR}/build/ThirdParty/share/libyaml.tar.gz)
else()
set(URL "https://github.com/yaml/libyaml/archive/master.zip")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use a fixed release/tag/commit

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use a fixed release/tag/commit

The current release is 0..2.2 but it did not build.
so as mentioned in the comments, we have to use the master repository until they release again.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to use the master repository until they release again.

yes, but use specific commit that works, so the "version" is fixed

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I do that? I think I can only have access to whatever is the latest in the archives.

src/htm/ntypes/Value.cpp Show resolved Hide resolved
| Value object - a container for data structures that have been parsed
| from a YAML or JSON string. It can hold any structure that can be
| parsed from these but does not implement Alias or reference features.
| YAML is a superset of JSON format but we only use the JSON compatible features.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use a JSON parser then? (not that we need, if libyaml works good)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we use a JSON parser

Yes, we could. But I would rather handle both JSON and Yaml.

@dkeeney
Copy link
Author

dkeeney commented Oct 20, 2019

I was able to get rid of the assigned_ feature so this is a little cleaner. Not sure it is much simpler however.

The zombies make it difficult to follow the code but I could not think of a better way to have a place to hold the keys and parent from a lookup until an assignment is made. yaml-cpp and rapidyaml used zombies as well. libyaml did not have a tree.

While I was at it, I added some thread locks so I think this should be threadsafe. I did not confirm that it is threadsafe...that would be a major test effort that is probably not worth it at this point.

@dkeeney
Copy link
Author

dkeeney commented Oct 20, 2019

The downloader for yaml-cpp is still in the package although not used. I would like to keep that for a little while to make sure that libyaml continues to work ok for us.

With the addition of this package, the end users of the library should not see anything different. However it does a few things for us.

  • cleans up some rather ugly code.
  • encapsulates the yaml parser so that in the future we have the option of switching to a different yaml parser with minimum effort if we find there is a problem.
  • Yaml (and JSON) parsing no longer uses the Spec so it can be used for things other than Region parameters. The Value structure is a universal container for parsed Yaml.
  • Moves the processing of parameters with the region Spec closer to the Region implementation so that the Region can have more flexibility in the layout of the parameters. It can now do nested sequences and maps.

In general it was much more work that should have been devoted to a low priority bug. :-)

@dkeeney
Copy link
Author

dkeeney commented Oct 20, 2019

Oh, another idea as to how to simplify some more...
...the parent_ field points to the parent's core not the parent Value object.
Let me see if that helps.

@dkeeney
Copy link
Author

dkeeney commented Oct 20, 2019

Yes, using the parent's core as the parent_ pointer on each node helped somewhat.

Copy link
Member

@breznak breznak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does look pretty good!

Thanks for the improved doc & the thread-safety is a nice touch too!

@dkeeney dkeeney merged commit 23168cf into master Oct 21, 2019
@dkeeney dkeeney deleted the yamlProblem2 branch October 21, 2019 12:55
@breznak breznak mentioned this pull request Oct 21, 2019
1 task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants