GTSAM segfault on noiseModel::Diagonal::Sigmas #75

Closed
jyiyang opened this issue Jun 17, 2019 · 24 comments

Comments

@jyiyang

jyiyang commented Jun 17, 2019

Just migrated from bitbucket repo commit c844966 to github newest version. GTSAM segfaults on running the code. GDB backtrace shows

Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x21) at malloc.c:2951
2951	malloc.c: No such file or directory.
#0  __GI___libc_free (mem=0x21) at malloc.c:2951;
#1  0x00007f675f9126b6 in gtsam::noiseModel::Diagonal::Sigmas(Eigen::Matrix<double, -1, 1, 0, -1, 1> const&, bool) () from /usr/local/lib/libgtsam.so.4;
#2  0x00007f675f8145d5 in _GLOBAL__sub_I_lago.cpp () from /usr/local/lib/libgtsam.so.4;
#3  0x00007f67605c56ba in call_init (l=<optimized out>, argc=argc@entry=9, argv=argv@entry=0x7ffe55b7b708, env=env@entry=0x7ffe55b7b758) at dl-init.c:72;
#4  0x00007f67605c57cb in call_init (env=0x7ffe55b7b758, argv=0x7ffe55b7b708, argc=9, l=<optimized out>) at dl-init.c:30;
#5  _dl_init (main_map=0x7f67607dc168, argc=9, argv=0x7ffe55b7b708, env=0x7ffe55b7b758) at dl-init.c:120;
#6  0x00007f67605b5c6a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2;
#7  0x0000000000000009 in ?? ()
#8  0x00007ffc4f6f5d8c in ?? ()
#9  0x00007ffc4f6f5dd5 in ?? ()
#10 0x00007ffc4f6f5dd8 in ?? ()
#11 0x00007ffc4f6f5e0c in ?? ()
#12 0x00007ffc4f6f5e0f in ?? ()

However, I have no problem with build c844966. GTSAM was built with TBB enabled and MKL disabled. I tried both the system Eigen and the Eigen bundled with the package, but neither works. Would appreciate any help!

@varunagrawal
Collaborator

The error says it can't find malloc.h which is a std library. This seems like a linker error on your system. Can you please verify that gcc can see the path to /usr/include?

@jyiyang
Author

jyiyang commented Jun 17, 2019

Yes, I have no problem running gtsam on my original project using build c844966. The crash only occurs when I switch to the newest commit.

I will try to bisect the commit once I get back to my computer.

@jlblancoc
Member

jlblancoc commented Jun 17, 2019 via email

@jyiyang
Author

jyiyang commented Jun 18, 2019

Disabling march=native works. Thanks!

Just a bit curious why it doesn't work..

@jyiyang jyiyang closed this as completed Jun 18, 2019
@jlblancoc
Member

jlblancoc commented Jun 18, 2019

Still not 100% clear, so I think this one should remain open ;-)

Probably it's caused by different flags while compiling different sources/projects that use Eigen...
Please, provide more details on how to reproduce the crash: was it a gtsam example or custom project? etc.

PS: It seems it's a custom project... Please, try to create the minimum CMakeLists.txt + main.cpp that fails to reproduce the problem.

@jlblancoc jlblancoc reopened this Jun 18, 2019
@nachovizzo
Contributor

nachovizzo commented Jul 5, 2019

I'm facing the exact same issue here. I somehow managed to build the simplest example to show the segfault

NOTE: This same example does not crash with version c844966, only with 4.0.0. This could be a starting point for finding where the issue is.

CMakeLists.txt

# ~~~
# @file      CMakeLists.txt
# @author    Ignacio Vizzo     [[email protected]]
#
# Copyright (c) 2019 Ignacio Vizzo, all rights reserved
cmake_minimum_required(VERSION 3.10)
project(gtsam_crash)

find_package(GTSAM REQUIRED)
include_directories(${GTSAM_INCLUDE_DIRS})

add_executable(crash_test main.cpp)
target_link_libraries(crash_test gtsam)

And the example program:

// @file      main.cpp
// @author    Ignacio Vizzo     [[email protected]]
//
// Copyright (c) 2019 Ignacio Vizzo, all rights reserved
#include <gtsam/geometry/Pose3.h>
#include <gtsam/linear/NoiseModel.h>
#include <gtsam/nonlinear/NonlinearFactor.h>
#include <gtsam/slam/BetweenFactor.h>

#include <Eigen/Core>

int main() {
  auto model =
      gtsam::noiseModel::Gaussian::Information(Eigen::Matrix3d::Identity());

  gtsam::NonlinearFactor::shared_ptr factor(
      new gtsam::BetweenFactor<gtsam::Pose3>(0, 0, gtsam::Pose3(), model));
  return 0;
}

With the following traceback to the crash

Program received signal SIGSEGV, Segmentation fault.
__GI___libc_free (mem=0x20) at malloc.c:3103
3103    malloc.c: No such file or directory.
(gdb) bt
#0  __GI___libc_free (mem=0x20) at malloc.c:3103
#1  0x00007ffff7829d2c in gtsam::noiseModel::Diagonal::Sigmas(Eigen::Matrix<double, -1, 1, 0, -1, 1> const&, bool) () from /usr/local/lib/libgtsam.so.4
#2  0x00007ffff7660c34 in _GLOBAL__sub_I_lago.cpp () from /usr/local/lib/libgtsam.so.4
#3  0x00007ffff7de5733 in call_init (env=0x7fffffffd7b8, argv=0x7fffffffd7a8, argc=1, l=<optimized out>) at dl-init.c:72
#4  _dl_init (main_map=0x7ffff7ffe170, argc=1, argv=0x7fffffffd7a8, env=0x7fffffffd7b8) at dl-init.c:119
#5  0x00007ffff7dd60ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#6  0x0000000000000001 in ?? ()
#7  0x00007fffffffdc69 in ?? ()
#8  0x0000000000000000 in ?? ()

@nachovizzo
Contributor

I've found a simpler example that also segfaults on version 4.0.0 but not on c844966 (the backtrace is exactly the same)

// @file      main.cpp
// @author    Ignacio Vizzo     [[email protected]]
//
// Copyright (c) 2019 Ignacio Vizzo, all rights reserved
#include <gtsam/slam/BetweenFactor.h>
int main() { auto factor(new gtsam::BetweenFactor<double>(1, 2, 0.0)); }

@varunagrawal
Collaborator

Thank you @nachovizzo, these are super helpful. We'll investigate this ASAP.

@jlblancoc
Member

Thanks @nachovizzo for the test!

I can't reproduce the crash... my settings are:

  • Ubuntu 18.04 64bit
  • Both, gtsam from this PPA and built locally (with MARCH_NATIVE ON, Eigen 3.3.7)

It doesn't fail in either of those two settings.

Please, report the outputs from:

  • From your build dir of gtsam, do VERBOSE=1 cmake .
  • From the build dir of your test program:
make clean
VERBOSE=1 make

Also, just out of curiosity, check whether using libgtsam-dev from the PPA repository also crashes. Note that after installing it, you will have to either uninstall it or explicitly set GTSAM_DIR= to go back to using your own local build of the library.

@ke-sun

ke-sun commented Jul 21, 2019

Try disabling march=native. Not a real solution, but will help to confirm a usual suspect...

Just to add more information to this. If both gtsam lib and the custom example are built with the -march=native flag on, the function noiseModel::Diagonal::Sigmas() still works fine.

@jlblancoc
Member

That's really relevant, thanks @ke-sun !
Since the recent switch to CMake PUBLIC exported build flags, it should be impossible for a user program to link against a version of gtsam that was built with "march=native" without automatically inheriting that flag...

Whoever was able to bypass this exported flag, please share with us how you included gtsam in your project. The current GTSAMConfig.cmake file should be in charge of importing the exported CMake targets, including all PUBLIC build flags, so if that doesn't work, either there is a bug somewhere or the user project is not importing gtsam correctly via find_package(GTSAM)...

Regarding the opening issue report above, if nobody manages to reproduce the crash, I think this one could be closed after some prudent time.

@ke-sun

ke-sun commented Jul 22, 2019

@jlblancoc
I didn't compile the custom example with cmake toolchain, but directly used g++ instead (since there is only one cpp file in my "project" 😄 ). Therefore, I have to explicitly set the -march=native flag.

@fishbotics

fishbotics commented Aug 15, 2019

FWIW I am also getting this same issue. Here's the stacktrace from gdb for me:

#0  __GI___libc_free (mem=0x21) at malloc.c:2951
#1  0x000000000044cc47 in Eigen::internal::handmade_aligned_free (ptr=0x68dc80) at /usr/local/include/gtsam/3rdparty/Eigen/Eigen/src/Core/util/Memory.h:98
#2  0x000000000044cc62 in Eigen::internal::aligned_free (ptr=0x68dc80) at /usr/local/include/gtsam/3rdparty/Eigen/Eigen/src/Core/util/Memory.h:179
#3  0x000000000044def3 in Eigen::internal::conditional_aligned_free<true> (ptr=0x68dc80) at /usr/local/include/gtsam/3rdparty/Eigen/Eigen/src/Core/util/Memory.h:230
#4  0x00007ffff793a491 in Eigen::internal::conditional_aligned_delete_auto<double, true> (ptr=0x68dc80, size=1) at /usr/local/include/gtsam/3rdparty/Eigen/Eigen/src/Core/util/Memory.h:416
#5  0x00007ffff795dd1d in Eigen::DenseStorage<double, -1, -1, 1, 0>::~DenseStorage (this=0x7fffffffd9e0, __in_chrg=<optimized out>)
    at /usr/local/include/gtsam/3rdparty/Eigen/Eigen/src/Core/DenseStorage.h:542
#6  0x00007ffff795c95a in Eigen::PlainObjectBase<Eigen::Matrix<double, -1, 1, 0, -1, 1> >::~PlainObjectBase (this=0x7fffffffd9e0, __in_chrg=<optimized out>)
    at /usr/local/include/gtsam/3rdparty/Eigen/Eigen/src/Core/PlainObjectBase.h:98
#7  0x00007ffff795c976 in Eigen::Matrix<double, -1, 1, 0, -1, 1>::~Matrix (this=0x7fffffffd9e0, __in_chrg=<optimized out>) at /usr/local/include/gtsam/3rdparty/Eigen/Eigen/src/Core/Matrix.h:178
#8  0x00007ffff0a47d3e in gtsam::noiseModel::Constrained::MixedSigmas (sigmas=...) at /home/afishman/Repositories/gtsam/gtsam/linear/NoiseModel.h:419
#9  0x00007ffff0a6ac87 in gtsam::noiseModel::Diagonal::Sigmas (sigmas=..., smart=true) at /home/afishman/Repositories/gtsam/gtsam/linear/NoiseModel.cpp:271
#10 0x00007ffff0ba8754 in __static_initialization_and_destruction_0 (__initialize_p=1, __priority=65535) at /home/afishman/Repositories/gtsam/gtsam/slam/lago.cpp:37
#11 0x00007ffff0ba8a32 in _GLOBAL__sub_I_lago.cpp(void) () at /home/afishman/Repositories/gtsam/gtsam/slam/lago.cpp:399
#12 0x00007ffff7de76ca in call_init (l=<optimized out>, argc=argc@entry=1, argv=argv@entry=0x7fffffffdbc8, env=env@entry=0x7fffffffdbd8) at dl-init.c:72
#13 0x00007ffff7de77db in call_init (env=0x7fffffffdbd8, argv=0x7fffffffdbc8, argc=1, l=<optimized out>) at dl-init.c:30
#14 _dl_init (main_map=0x7ffff7ffe168, argc=1, argv=0x7fffffffdbc8, env=0x7fffffffdbd8) at dl-init.c:120
#15 0x00007ffff7dd7c6a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#16 0x0000000000000001 in ?? ()
#17 0x00007fffffffdfcb in ?? ()
#18 0x0000000000000000 in ?? ()
(gdb) Quit
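
Editor's sketch (not part of the original comment): frames #0 and #1 above show `handmade_aligned_free` handing the address `0x21` to `free`. One plausible reading, which nobody in the thread confirms, is that the vector was allocated by code compiled without the wider alignment (plain `malloc`) and released by code compiled with it (an "aligned free" that expects the original pointer to be stored in the word just before the aligned block). The code below is a hedged re-creation of that general scheme, not Eigen's actual source; every name in it (`sketch_aligned_malloc`, `pointer_passed_to_free`, `simulate_mismatched_free`, `kAlign`) is hypothetical. The value `0x21` in the simulation stands in for allocator bookkeeping, on the assumption that glibc keeps a size-plus-flags word in front of each block.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumption: 32-byte alignment, as one would want for AVX-sized loads.
constexpr std::size_t kAlign = 32;

// Over-allocate, round up to the next kAlign boundary, and remember the
// pointer that malloc really returned in the word just before the block.
void* sketch_aligned_malloc(std::size_t size) {
  void* original = std::malloc(size + kAlign);
  if (original == nullptr) return nullptr;
  const auto addr = reinterpret_cast<std::uintptr_t>(original);
  void* aligned = reinterpret_cast<void*>(
      (addr & ~static_cast<std::uintptr_t>(kAlign - 1)) + kAlign);
  *(static_cast<void**>(aligned) - 1) = original;
  return aligned;
}

// The matching free trusts the word stored before the block.
void* pointer_passed_to_free(void* aligned) {
  return *(static_cast<void**>(aligned) - 1);
}

void sketch_aligned_free(void* aligned) {
  if (aligned != nullptr) std::free(pointer_passed_to_free(aligned));
}

// Safe simulation of the mismatch: fake_block[0] plays the role of whatever
// the allocator keeps in front of a block obtained from plain malloc.
// Nothing is freed here; we only compute what *would* be passed to free.
void* simulate_mismatched_free() {
  void* fake_block[2] = {
      reinterpret_cast<void*>(static_cast<std::uintptr_t>(0x21)), nullptr};
  void* plain_ptr = &fake_block[1];  // what a plain-malloc caller would hold
  return pointer_passed_to_free(plain_ptr);
}
```

When allocation and release come from the same pair, the round trip is fine. When a plain pointer reaches the aligned free, the word read back is not a heap pointer at all, which would be consistent with `free(mem=0x21)` in the trace.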

@jlblancoc
Member

Hi @fishyai ,

What are your settings?
OS version, compiler version, gtsam version?
Are you using the PPA prebuilt version or building yourself?
If the latter: what version of Eigen3 did Cmake find? Did you enable mtune=native?
Are you using cmake for your own project?

A minimal set of CMakeLists.txt + test.cpp that crashes, together with all the information above, would be the best way to debug this.

PS: Recently I found a crash very similar to this one, and it turned out the problem was using different GCC versions (5 vs 7) for gtsam and my project, and enabling the native optimization. Avoiding those two solved the issue, but I would be happier with a better understanding of what exactly is the "MUSTN'T DO"...

@fishbotics

I think it's related to your explanation above. I was trying to include GTSAM and DART in the same project: https://dartsim.github.io/.

I'm guessing it has something to do with different Eigen settings. All I did was link the two libraries into another project and include a header from both in the same file. That was enough to cause the error. I can make a Docker example to show, but probably won't get to it for a couple weeks.

Thanks for your help!

@varunagrawal
Collaborator

Can we close this?

@zhangxiaoya

Disabling 'march=native' works for me. For reference, see https://github.com/borglab/gtsam/pull/4/files

@mertkaraoglu

mertkaraoglu commented Mar 24, 2021

This issue is still relevant with "4.0.3 release". Disabling "march=native" solves it.

@ProfFan
Collaborator

ProfFan commented Mar 24, 2021

Note to all future visitors: if two of your dependencies use Eigen, make sure they are using the same version and use the same compilation settings. This is an inherent issue with a header-only library (Eigen) and is in no way related to GTSAM.

NOTE: it is irrelevant whether you are using a dynamic library. Eigen with different settings = FAIL.

There is a flag in GTSAM that makes GTSAM use the system Eigen. If you link other projects and GTSAM at the same time, make sure that you enable this.
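
Editor's sketch (not part of the original comment): to see whether two builds could disagree in the way described above, it can help to print which SIMD feature macros the compiler defined for a given translation unit. The snippet below uses only the predefined macros `__SSE2__`, `__AVX__` and `__AVX512F__`, which GCC and Clang set according to flags such as `-march=native`. The mapping to 16/32/64 bytes is an assumption about how a library like Eigen would typically pick its maximum alignment, and the names `SimdFingerprint`, `this_translation_unit`, `compatible` and `describe` are hypothetical.

```cpp
#include <string>

// Compile-time view of the SIMD features enabled for THIS translation unit.
struct SimdFingerprint {
  bool sse2;
  bool avx;
  bool avx512f;
  int likely_max_align_bytes;  // assumed mapping: AVX-512 -> 64, AVX -> 32, else 16
};

SimdFingerprint this_translation_unit() {
  SimdFingerprint f{false, false, false, 16};
#if defined(__SSE2__)
  f.sse2 = true;
#endif
#if defined(__AVX__)
  f.avx = true;
  f.likely_max_align_bytes = 32;
#endif
#if defined(__AVX512F__)
  f.avx512f = true;
  f.likely_max_align_bytes = 64;
#endif
  return f;
}

// Two builds are treated as compatible here only if they agree on alignment.
bool compatible(const SimdFingerprint& a, const SimdFingerprint& b) {
  return a.likely_max_align_bytes == b.likely_max_align_bytes;
}

// Human-readable summary, e.g. for logging at program start-up.
std::string describe(const SimdFingerprint& f) {
  return std::string("sse2=") + (f.sse2 ? "1" : "0") +
         " avx=" + (f.avx ? "1" : "0") +
         " avx512f=" + (f.avx512f ? "1" : "0") +
         " align=" + std::to_string(f.likely_max_align_bytes);
}
```

Compiling this once with the library's flags and once with the client project's flags, then comparing the two `describe` strings, is a cheap first check before digging into a debugger. It only inspects compiler macros, so it says nothing about which Eigen version each side picked up.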

@Bazs

Bazs commented Jul 1, 2021

Note on the "disable march=native" solution: this will decrease the runtime performance of GTSAM according to the documentation here: https://github.com/borglab/gtsam/pull/4/files

I've found that enabling march=native in my project which uses GTSAM instead of disabling it in GTSAM also solves the problem for me, and keeps/improves the runtime performance. This is at the cost of potential loss of binary portability, i.e. the executables compiled on one platform this way may not run on another.

To enable march=native if you are using CMake, add

add_compile_options("-march=native")

to your main CMakeLists.txt.

This is still a workaround, since it seems GTSAM should be adding this compile option to any dependent project according to #75 (comment), but this doesn't seem to always work.

@jlblancoc
Member

Hi,

This is still a workaround, since it seems GTSAM should be adding this compile option to any dependent project according to #75 (comment), but this doesn't seem to always work.

Please, if you could put together a minimal example of where a build of gtsam with march=native enabled does not propagate the flag to a client project, it would be great for debugging the issue. Thanks.

@nubertj

nubertj commented Jun 12, 2023

Hi @jlblancoc,
For me, when using catkin, the flag does not seem to propagate to the dependent projects. Is this possible, and does it match your experience?

Note, in this case I include the dependent packages only by using catkin_package( CATKIN_DEPENDS ${CATKIN_PACKAGE_DEPENDENCIES} ... instead of find(...)

@jlblancoc
Member

Hi @nubertj

First, if your problem is because of using the march-native flag while building GTSAM, try disabling it and rebuild everything to see if everything works first. That flag may add some performance gain, but it's probably not worth the problems in early prototyping stages.
Next, double check your gtsam build has the cmake variable to use the system Eigen3 version to "ON" to avoid other nasty errors.

And if you are using ROS 1 (as it seems from you mentioning catkin), let's hope this PR I made yesterday is accepted in time for GTSAM binary packages to get published directly by ROS 1 buildfarm, so all you would need to do is sudo apt install ros-$ROS_DISTRO-gtsam and add it as just another ros / cmake dependency to your project without all these troubles. If the PR is accepted ~today, the binary packages will be available within a few days. Otherwise, it will take 1+ month until the next ROS sync.

@nubertj

nubertj commented Jun 14, 2023

Dear @jlblancoc,
Thanks a lot for your reply!

First, if your problem is because of using the march-native flag while building GTSAM, try disabling it and rebuilding everything to see if everything works first.

I have been using GTSAM and catkin regularly for two years, and everything works perfectly fine without march=native.

Now I tried it again after some time to potentially get a small speedup boost, but I get the above-mentioned issues. I even set all the build flags in EVERY library down the line that are either directly or indirectly depending on GTSAM or any of the intermediate libraries. But still, the same issue persists...
I could not make it work reliably as soon as gtsam is built with march=native due to some weird and unpredictable Eigen issues. I even tried to enforce the alignment manually, but this did not change much.

Next, double check your gtsam build has the cmake variable to use the system Eigen3 version to "ON" to avoid other nasty errors.

This is the case, we build GTSAM always with the system eigen.

Meanwhile, I am just using GTSAM still without march native, all good.
Thank you!
