Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Yaml Crash Problem WIP #282

Closed
wants to merge 17 commits into from
Closed

Yaml Crash Problem WIP #282

wants to merge 17 commits into from

Conversation

dkeeney
Copy link

@dkeeney dkeeney commented Feb 26, 2019

This PR is to investigate the problem where Yaml crashes when running in debug mode on Linux.
No fix yet.

Fixes #218

@dkeeney dkeeney mentioned this pull request Feb 26, 2019
@dkeeney
Copy link
Author

dkeeney commented Feb 27, 2019

Ok, after debugging around in Yaml on Ubuntu I am left without a clue.

What I know:

  • The text being parsed is "{count: 64}".
  • Using GNU gcc 8.2.0 compiler
  • Yaml crashes about 6 call levels down inside Yaml in node.h line 45. This is the node class.
  • This class has two members;
    • m_ref which is a reference node. This is a shared_ptr. This looks fine
    • m_dependencies is a std::set of pointers to other nodes. This is scrambled and we crash when getting an iterator to it.
  • Just prior to the crash, we call into set_type(type) on the node class. At that point, m_dependencies is clean and is an empty set.
  • we then call the mark_defined( ) method on the same class. Before doing anything in that method m_dependencies is scrambled. The m_ref member is still clean.

I see nothing that can cause the class to get messed up just by making a call....no other logic involved that I can see. The curious thing is that it crashes only when called from within our code (their unit tests run just fine), and only in debug mode on Linux. To me this sounds like the compiler got something wrong.

@dkeeney
Copy link
Author

dkeeney commented Feb 27, 2019

I can try another compiler but it seems like we had this crash before going to a newer compiler.

So I am left without a fix.

I will look around and see if there is another Yaml parser package that we could substitute.

@dkeeney
Copy link
Author

dkeeney commented Feb 27, 2019

Ok, here is what I am finding:

  • For C++, yaml-cpp is about the only game in town.
  • Nobody else seems to be having a crash like ours. That makes me suspicious that there is something we are not doing right. There is one thing that most people do not do is compile it with -fPIC option. But, we need that because this code needs to go into our Python extension library which is basically a shared library.
  • We call yaml-cpp in just one place:
      const YAML::Node doc = YAML::Load(parameter_string);
    
    The returned doc is a tree of parsed nodes. The crash is inside the Load( ) function. Nothing strange about how it is called.
  • As an alternative there is libyaml which is a yaml parser written in C which is quite popular. We could add a C++ wrapper. This is a SAX parser which means it performs callbacks for each field during the parse rather than returning a nested tree structure like yaml-cpp. I could adapt our code to use this without too much problem since what we do is iterate yaml-cpp's structure and build another one anyway. It might actually be faster to use callbacks but since this is only used at startup, speed is not really an issue.

At this point, I am ready to try the libyaml parser. I will do this in parallel so that I don't take out yaml-cpp until I am sure libyaml will work for us.

Sorry to be so verbose on this topic but I wanted to document the decisions.

this should fix yaml-cpp bug,
gtest might need to match Debug/Release of project
determine by CMAKE_BUILD_TYPE,
IDEs will need to rebuild,fix if Debug/Release is switched
@breznak breznak mentioned this pull request Feb 28, 2019
1 task
@@ -47,36 +49,15 @@ execute_process(COMMAND ${CMAKE_COMMAND}
if(result)
message(FATAL_ERROR "CMake step for Third Party builds failed: ${result}")
endif()
if(MSVC)
# for MSVC builds we need to build both Release and Debug builds
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not do this (double Release, Debug build). The IDE should fix that, or one can call Clean and Rebuild..
This way, the CI build time is artificially doubled

src/nupic/engine/YAMLUtils.cpp Outdated Show resolved Hide resolved
@breznak
Copy link
Member

breznak commented Feb 28, 2019

@dkeeney can you push the path that forces yaml-cpp to be built with Release, please? I'm not sure I've implemented it well

@dkeeney
Copy link
Author

dkeeney commented Feb 28, 2019

I am mothballing this PR. I will come back to it when there is less high priority work to do.
Basically the code here will create a wrapper around libyaml so that it looks like yaml-cpp.
The code is just a rough cut...have not even tried to compile.

@dkeeney
Copy link
Author

dkeeney commented Feb 28, 2019

@dkeeney can you push the path that forces yaml-cpp to be built with Release, please?

I tried that....it still crashes if the main program is Debug.
I know...this is really strange.

@breznak
Copy link
Member

breznak commented Sep 17, 2019

@dkeeney should we close this PR? As a way forward could be use the new yaml lib in #218 ?

@dkeeney
Copy link
Author

dkeeney commented Sep 17, 2019

As a way forward could be use the new yaml lib

I am in the process of testing with RapidYaml. Finding a few snags preventing me from pushing it to github. Trying to find a work-around. This parser is so new that there are a few rough edges yet but I like it so far.

If I cannot get this to work, there is another yaml parser I could try.

So, lets keep #282 since it is what I am working on.

@breznak
Copy link
Member

breznak commented Sep 17, 2019

Trying to find a work-around. This parser is so new that there are a few rough edges yet but I like it so far.

that's even better new, cool! Glad you like the parser. So let's see how well it fares for you.

@dkeeney
Copy link
Author

dkeeney commented Sep 17, 2019

I think I hit a wall with RapidYaml.

Exceptions are the killer. If there are any kind of error RapidYaml expect your entire program to die. I tried various ways to try and capture the error and throw an exception but it does not work. Most of the code is C code so it does not really support exceptions. But they should at least attempt to return an error code ... but they do not. I suspect they cannot guarantee that their internal data structures are still valid after an error is detected.

Anyway, I posted my review in their issues page in hopes that they will address this.
biojppm/rapidyaml#12

@dkeeney
Copy link
Author

dkeeney commented Sep 17, 2019

Ok, where to go from here...

My current thinking is to attempt to use libyaml. This is a SAX parser which means it gives callbacks for each thing it parses so you can build your own structure. It is a little extra code but not bad.

But in the mean time, I would like to take advantage of the work I did on RapidYaml and replace our current yaml code with one that does not use the Spec. It can still be yam-cpp but its interface is encapsulated into a single .cpp file,

@dkeeney dkeeney changed the title Yaml Crash Problem Yaml Crash Problem WIP Sep 19, 2019
@dkeeney dkeeney self-assigned this Sep 19, 2019
@dkeeney
Copy link
Author

dkeeney commented Sep 19, 2019

This push contains the code I am experimenting with. It contains all three yaml parsers although only one at a time is compiled. None of them actually compile at this point.
The three parsers are:

  • yaml-cpp This is our current parser but it is wrapped in a new class for encapsulation and removal of the Spec from the process.
  • RapidYaml This wrapper is mostly complete but ran into a problem with the lack of exceptions. If there is any kind of error the entire program halts. In all other regards this parser looks good. I am keeping this around with the hope that the RapidYaml developers may correct the problem.
  • libyaml This is a SAX parser. The Value.cpp file for this is mostly a placeholder. I will work on this more after yaml-cpp is working again.

The objectives of this project are:

  • Removes dependency on Spec for parameter parsing.
  • Automatic download of the selected Yaml parser dependency
  • Deletes YamlUtils.cpp/hpp, Scalar.cpp/hpp
  • Replaces Value.cpp/hpp with a common Value.hpp and its implementation in separate Value.cpp's using each of the three parsers.
  • Each parser is encapsulated in its copy of Value.cpp so as not to be exposed to the rest of the code.
  • CMakeLists.txt - bump CMake version to 3.9 to accommodate RapidYaml.

@dkeeney
Copy link
Author

dkeeney commented Sep 19, 2019

I am taking a break and traveling up to San Jose to see my grandson this weekend. I will continue work when I return.

Copy link
Member

@breznak breznak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks David, esp. for all the work with testing 3 yaml parsers!
I like the idea of hiding yaml parsing under Value.hpp, hope you get one of the parsers working (let's see if rapidYaml replies; do we need to handle crashes in yaml parsing, or could we just hard fail on error?)

@@ -29,6 +29,7 @@ set($ENV{HTM_CPP} ${REPOSITORY_DIR})
#
option(FORCE_CPP11 "Force compiler to use C++11 standard." OFF)
option(FORCE_BOOST "Force compiler to install and use Boost." OFF)
set(YAML_PARSER "yamlcpp" CACHE STRING "Specify the YAML parser to use 'yamlcpp', 'RapidYaml', 'libyaml', default 'yamlcpp'." )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i know this is a WIP, eventually we'll want to have just 1 parser.

htm/ntypes/Value.cpp
htm/ntypes/Value-yamlcpp.cpp
htm/ntypes/Value-rapidyaml.cpp
htm/ntypes/Value-libyaml.cpp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO, only one will be used when finalized

NTA_BasicType dataType;
// 1 = scalar; > 1 = array o fixed sized; 0 = array of unknown size
// TODO: should be size_t? Serialization issues?
size_t count;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guess it's size_t for the >1 case, likely we'd never exceed an UInt, but saving a few bytes in serialized state is imho not worth a change (so I'd leave as is)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

size_t in Spec

I did not make that change. That is in the master.
But it does not matter, eventually I hope to remove Spec if I can.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, I was replying to the TODO (which was new?)

src/htm/regions/VectorFileEffector.hpp Show resolved Hide resolved
src/test/CMakeLists.txt Show resolved Hide resolved
const char *s1 = "10";
vm.parse(s1);
EXPECT_TRUE(vm.isScalar());
UInt32 u = vm.as<UInt32>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to keep the value map with separate parse, "as" steps, or could we use just vm.as<T>("str") ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to keep the value map with separate parse, "as" steps, or could we use just vm.as("str") ?

I don't follow the question. Are you proposing the argument to the as( arg ) be a string to be parsed? I guess it could. This would only be useful for things like default values from the Spec stored as Yaml or JSON strings. But perhaps it might have use elsewhere.

The test is checking the ability of the root of the tree to be a scalar. Normally it is a map or a sequence but could be any of the three or empty. If you know ahead of time that a string contains a scalar then you don't need to parse...just use std:: conversion functions. But if you don't know then you will need to check if the parsed node is a scalar before using the as( ).

There are lots of tests yet to be added.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the reason I started with tests of the root being a scalar is that RapidYaml required the root node to be a map or sequence. If you parsed a string that was really a scalar it would create a sequence and add the scalar as the first node. So you could not tell the difference between "123" and "[123]". Scalar operations on the root would fail. So I had to find a work-around.

Yaml-cpp allowed the root to be scalar. libyaml does not have a node, you have to create it yourself...which is what our Value class is.


vm.parse("- 1");
EXPECT_TRUE(vm.isSequence());
u = vm[0].as<UInt32>();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't "- 1" just fail? and why "1" is in [0], then "-" is then in [1]..parser working right-to-left?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't "- 1" just fail? and why "1" is in [0], then "-" is then in [1]..parser working right-to-left?

In YAML, a leading "- " (a dash followed by a space) means this is an element of a sequence.
The construct vm[0] means give me the first element of the sequence which should be the scaler "1".

@breznak
Copy link
Member

breznak commented Sep 19, 2019

Thanks, I've reviewed the recent changes. Have a nice trip!

@dkeeney
Copy link
Author

dkeeney commented Sep 19, 2019

The key thing to evaluate is Value.hpp since this should define the functions available for the parameter facility. ValueMap is the root and Value is a node in the parsed tree. ValueMap is actually just a Value object as well.

Right now all parameter handling is flat. In other words, the root is a map and all children are scalars. I want to be able to allow just about any structure that could be described by Yaml or JSON to be used as the set of parameters. Some of the new parsers will require it.

The inspiration for the feature set is taken from the YAML::Node in https://github.com/jbeder/yaml-cpp/wiki/Tutorial. I have yet to implement the operator= which will allow individual nodes to be created and modified after, or instead of, the parse.

Originally I was going to just use YAML::Node or a subclass of it, but that would lock us into yaml-cpp. So that is why we need the Value class to hide the implementation.

@dkeeney
Copy link
Author

dkeeney commented Sep 19, 2019

(let's see if rapidYaml replies; do we need to handle crashes in yaml parsing, or could we just hard fail on error?)

Well, this means the app terminates with no chance of catching and continuing. Makes it rather hard to do negative tests because the test stops. If you had a string like "abc" and tried to use vm.as() the program dies.

@dkeeney
Copy link
Author

dkeeney commented Sep 29, 2019

pushing this PR so that I can switch to Linux. I want to see if this fixes the Yaml crash problem in Linux debug mode.

@dkeeney
Copy link
Author

dkeeney commented Oct 11, 2019

I am getting some unexpected failures while locally testing my Yaml changes.
After refreshing the master in my local repository, the TMRegionTest is failing with the following:

ERR:  CHECK FAILED: "minThreshold <= activationThreshold"  [J:\Projects\AI\htmF\src\htm\algorithms\TemporalMemory.cpp line 107]
  • minThreshold default changed from 8 to 10 but the Spec default was not changed to match.
  • I get the "minThreshold <= activationThreshold" error from TM.
  • There are similar discrepancies in SP.

How could these errors have gotten into the master without being detected by the CI?
Are the all of the Regression tests being ran?

@dkeeney
Copy link
Author

dkeeney commented Oct 11, 2019

I don't know the cause yet but I cannot locally run the python tests using the command python setup.py test.

  • Running on Windows.
  • The cwd is at the top of the repository.
  • python is 3.7.4.
  • The links shown below do contain the modules.

The C++ unit tests do run successfully but when it gets to the python tests it give me the following:

======================================================= ERRORS ========================================================
_______________________________ ERROR collecting bindings/py/tests/sparse_link_test.py ________________________________
ImportError while importing test module 'J:\Projects\AI\htmF\bindings\py\tests\sparse_link_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
sparse_link_test.py:22: in <module>
    from htm.bindings.regions.PyRegion import PyRegion
E   ModuleNotFoundError: No module named 'htm.bindings.regions'
_____________________________ ERROR collecting bindings/py/tests/regions/network_test.py ______________________________
ImportError while importing test module 'J:\Projects\AI\htmF\bindings\py\tests\regions\network_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
regions\network_test.py:23: in <module>
    from htm.bindings.regions.PyRegion import PyRegion
E   ModuleNotFoundError: No module named 'htm.bindings.regions'
_____________________________ ERROR collecting bindings/py/tests/regions/pyregion_test.py _____________________________
ImportError while importing test module 'J:\Projects\AI\htmF\bindings\py\tests\regions\pyregion_test.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
regions\pyregion_test.py:20: in <module>
    from htm.bindings.regions.PyRegion import PyRegion
E   ModuleNotFoundError: No module named 'htm.bindings.regions'
------------------ generated xml file: J:\Projects\AI\htmF\bindings\py\tests\junit-test-results.xml -------------------
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=============================================== 3 error in 3.91 seconds ===============================================

@dkeeney
Copy link
Author

dkeeney commented Oct 11, 2019

This was a false alarm.
I cloned a fresh repository and everything builds and all tests run successfully.
So, I conclude that my branch is corrupted somehow. I will create a new branch

@dkeeney
Copy link
Author

dkeeney commented Oct 12, 2019

Continued in PR #715

@dkeeney dkeeney closed this Oct 12, 2019
@dkeeney dkeeney deleted the yamlProblem branch October 12, 2019 20:28
@breznak
Copy link
Member

breznak commented Oct 13, 2019

I don't know the cause yet but I cannot locally run the python tests using the command python setup.py test.

I was getting these by a) forgetting to uninstall some older htm.core version in the pyenv, b) some part compiled with different c++/py compiler/version.

@dkeeney
Copy link
Author

dkeeney commented Oct 14, 2019

Deleting the old versions of htm.core and and starting over with a new branch seemed to fix it.
I wonder if we should mention in the Readme that they might need to delete the old htm.core versions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Yaml-cpp bug in debug mode
2 participants