Compile error with DLBFoam STANDALONE Lapack #21

Closed
kazumamatata opened this issue Jun 14, 2022 · 27 comments

Comments

@kazumamatata

Hello again,

I wanted to report another issue I encountered when compiling DLBFoam v1.1_OF8.
To avoid Intel MKL or other compilation problems, I am now trying to compile DLBFoam v1.1 with the STANDALONE LAPACK.
I am running OpenFOAM 8 provided by the supercomputer center, which uses the gcc compiler and Intel MPI. (It should be compiled properly and also validated.)

However, there are some error messages regarding LAPACK.

I assume they are caused by the wrong LAPACK version. So my question is: which LAPACK version have you tested?
My version is 3.10, downloaded from: http://www.netlib.org/lapack/

Here is the error message: What do you think?

[s50001@cx0010 DLBFoam]$ ./Allwmake --platform STANDALONE
wmake libso src/thermophysicalModels/chemistryModel
wmake libso src/ODE_DLB
wmakeLnInclude: linking include files to ./lnInclude
Making dependency list for source file seulex_LAPACK.C
g++ -std=c++11 -m64 -Dlinux64 -DWM_ARCH_OPTION=64 -DWM_DP -DWM_LABEL_SIZE=32 -Wall -Wextra -Wold-style-cast -Wnon-virtual-dtor -Wno-unused-parameter -Wno-invalid-offsetof -Wno-attributes -O3  -DNoRepository -ftemplate-depth-100 -I/usr/include    -I/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include  -DDEBUG=0  -IlnInclude -I. -I/work/opt/local/apps/gcc/4.8.5/impi/2019.9.304/openfoam/8/OpenFOAM-8/src/OpenFOAM/lnInclude -I/work/opt/local/apps/gcc/4.8.5/impi/2019.9.304/openfoam/8/OpenFOAM-8/src/OSspecific/POSIX/lnInclude   -fPIC -c ODESolvers/ODESolver/ODESolver.C -o Make/linux64GccDPInt32Opt/ODESolvers/ODESolver/ODESolver.o
g++ -std=c++11 -m64 -Dlinux64 -DWM_ARCH_OPTION=64 -DWM_DP -DWM_LABEL_SIZE=32 -Wall -Wextra -Wold-style-cast -Wnon-virtual-dtor -Wno-unused-parameter -Wno-invalid-offsetof -Wno-attributes -O3  -DNoRepository -ftemplate-depth-100 -I/usr/include    -I/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include  -DDEBUG=0  -IlnInclude -I. -I/work/opt/local/apps/gcc/4.8.5/impi/2019.9.304/openfoam/8/OpenFOAM-8/src/OpenFOAM/lnInclude -I/work/opt/local/apps/gcc/4.8.5/impi/2019.9.304/openfoam/8/OpenFOAM-8/src/OSspecific/POSIX/lnInclude   -fPIC -c ODESolvers/ODESolver/ODESolverNew.C -o Make/linux64GccDPInt32Opt/ODESolvers/ODESolver/ODESolverNew.o
g++ -std=c++11 -m64 -Dlinux64 -DWM_ARCH_OPTION=64 -DWM_DP -DWM_LABEL_SIZE=32 -Wall -Wextra -Wold-style-cast -Wnon-virtual-dtor -Wno-unused-parameter -Wno-invalid-offsetof -Wno-attributes -O3  -DNoRepository -ftemplate-depth-100 -I/usr/include    -I/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include  -DDEBUG=0  -IlnInclude -I. -I/work/opt/local/apps/gcc/4.8.5/impi/2019.9.304/openfoam/8/OpenFOAM-8/src/OpenFOAM/lnInclude -I/work/opt/local/apps/gcc/4.8.5/impi/2019.9.304/openfoam/8/OpenFOAM-8/src/OSspecific/POSIX/lnInclude   -fPIC -c ODESolvers/seulex_LAPACK/seulex_LAPACK.C -o Make/linux64GccDPInt32Opt/ODESolvers/seulex_LAPACK/seulex_LAPACK.o
ODESolvers/seulex_LAPACK/seulex_LAPACK.C: In member function ‘bool Foam::seulex_LAPACK::seul(Foam::scalar, const scalarField&, Foam::label, Foam::scalar, Foam::label, Foam::scalarField&, const scalarField&) const’:
ODESolvers/seulex_LAPACK/seulex_LAPACK.C:157:53: error: too few arguments to function ‘void dgetrs_(const char*, const int*, const int*, const double*, const int*, const int*, double*, const int*, int*, size_t)’
     dgetrs_(&TRANS,&N,&NRHS,A,&LDA,IPIV,b,&LDB,&INFO);
                                                     ^
In file included from /work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:11:0,
                 from /work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapacke.h:36,
                 from ODESolvers/seulex_LAPACK/seulex_LAPACK.H:58,
                 from ODESolvers/seulex_LAPACK/seulex_LAPACK.C:28:
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:4043:42: note: declared here
 #define LAPACK_dgetrs_base LAPACK_GLOBAL(dgetrs,DGETRS)
                                          ^
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapacke_mangling.h:12:39: note: in definition of macro ‘LAPACK_GLOBAL’
 #define LAPACK_GLOBAL(lcname,UCNAME)  lcname##_
                                       ^
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:4044:6: note: in expansion of macro ‘LAPACK_dgetrs_base’
 void LAPACK_dgetrs_base(
      ^
ODESolvers/seulex_LAPACK/seulex_LAPACK.C:211:61: error: too few arguments to function ‘void dgetrs_(const char*, const int*, const int*, const double*, const int*, const int*, double*, const int*, int*, size_t)’
             dgetrs_(&TRANS,&N,&NRHS,A,&LDA,IPIV,b,&LDB,&INFO);
                                                             ^
In file included from /work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:11:0,
                 from /work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapacke.h:36,
                 from ODESolvers/seulex_LAPACK/seulex_LAPACK.H:58,
                 from ODESolvers/seulex_LAPACK/seulex_LAPACK.C:28:
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:4043:42: note: declared here
 #define LAPACK_dgetrs_base LAPACK_GLOBAL(dgetrs,DGETRS)
                                          ^
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapacke_mangling.h:12:39: note: in definition of macro ‘LAPACK_GLOBAL’
 #define LAPACK_GLOBAL(lcname,UCNAME)  lcname##_
                                       ^
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:4044:6: note: in expansion of macro ‘LAPACK_dgetrs_base’
 void LAPACK_dgetrs_base(
      ^
ODESolvers/seulex_LAPACK/seulex_LAPACK.C:268:57: error: too few arguments to function ‘void dgetrs_(const char*, const int*, const int*, const double*, const int*, const int*, double*, const int*, int*, size_t)’
         dgetrs_(&TRANS,&N,&NRHS,A,&LDA,IPIV,b,&LDB,&INFO);
                                                         ^
In file included from /work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:11:0,
                 from /work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapacke.h:36,
                 from ODESolvers/seulex_LAPACK/seulex_LAPACK.H:58,
                 from ODESolvers/seulex_LAPACK/seulex_LAPACK.C:28:
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:4043:42: note: declared here
 #define LAPACK_dgetrs_base LAPACK_GLOBAL(dgetrs,DGETRS)
                                          ^
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapacke_mangling.h:12:39: note: in definition of macro ‘LAPACK_GLOBAL’
 #define LAPACK_GLOBAL(lcname,UCNAME)  lcname##_
                                       ^
/work/00/gs50/s50001/lapack-3.10.1/LAPACKE/include/lapack.h:4044:6: note: in expansion of macro ‘LAPACK_dgetrs_base’
 void LAPACK_dgetrs_base(
      ^
make: *** [Make/linux64GccDPInt32Opt/ODESolvers/seulex_LAPACK/seulex_LAPACK.o] Error 1
@blttkgl
Collaborator

blttkgl commented Jun 14, 2022

Are you testing the standalone installation on your cluster or on a personal laptop? If the former, I'd advise sticking to the intel-mkl or openblas packages pre-installed on the cluster by your IT team and avoiding a standalone LAPACK installation. If you're on your PC, try:

(sudo) apt-get install liblapacke-dev

to install the lapacke-dev package, which at the moment works very well on my ubuntu 22.04 desktop.

@kahilah

kahilah commented Jun 14, 2022

I fully agree with blttkgl: first test this on your own machine, and on the cluster use the system installation of MKL or OpenBLAS instead of your own installation of LAPACK. That way you get the best performance and fewer problems.

Based on the error message, I would make a wild guess that your LAPACK compilation or its linking has some issues. The error message states "too few arguments to function ‘void dgetrs_" and, if I recall correctly, this may happen when you are linking via the Fortran interface of the LAPACK library instead of the traditional C interface (the difference is in the number of arguments). So perhaps your compilation environment on the cluster was not consistent between compiling LAPACK, OpenFOAM and DLBFoam?

@kazumamatata
Author

On my Ubuntu machine DLBFoam could be compiled!
I used (sudo) apt-get install liblapacke-dev and that worked well.

On the supercomputer system I have these two options:

1. OpenFOAM/8 (gcc, compilation by IT) + STANDALONE (compilation by myself)
2. OpenFOAM/8 (icc, compilation by myself) + MKL (compilation by IT)

Currently I am trying option 1. Issue #20 is option 2.

On the supercomputer, @kahilah, you were right. I had probably compiled only LAPACK (Fortran), not LAPACKE (for C).

Now the compilation succeeded! However, there is an error in the PSRTest.

Validating enthalpy computation at T=298.15
PASSED (error 6.23195e-08)

Validating enthalpy computation at T=298.15
PASSED (error 9.5815e-07)
./PSRTest.bin: symbol lookup error: /work/gs50/s50001/OpenFOAM/s50001-8/platforms/linux64GccDPInt32Opt/lib/libODE_DLB.so: undefined symbol: dgetrf_

Validation FAILED.
See output above for further information.

Test error control on mechanism consistency in pyjacLoadBalancedChemistryModel:

I will also try the tutorial later

log.Allwmake.standalone.txt

@kahilah

kahilah commented Jun 15, 2022

The error message states: "undefined symbol: dgetrf_",

where dgetrf_ is a LAPACK function. When a symbol is undefined, it suggests that the library linking was not successful. Most probably the LAPACK paths are wrong, or not set in the library search path (e.g. the LD_LIBRARY_PATH environment variable), so your system can't find the right library. Another option is that the LAPACK compilation was not successful, so DLBFoam finds the header files but the library file is not working.

When you compile or run DLBFoam, are you sure that your shell environment is well defined? That is, when you compile something, do you know which compiler is active, and if you have several versions of the same library (like LAPACK), do you know that you refer to the right one?

I recommend compiling the LAPACK test and tutorial files (available in the LAPACK sources) so you know that your LAPACK is functioning as expected.
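
One quick way to narrow down an undefined-symbol error like this is to check whether the LAPACK shared library you built actually exports the symbol. The sketch below is only an illustration: the function name exports_symbol is hypothetical, and it assumes the binutils nm tool is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if shared library $1 defines dynamic symbol $2.
# Returns 2 if the library file does not exist, 1 if the symbol is not found.
exports_symbol() {
    local lib="$1" sym="$2"
    [ -f "$lib" ] || return 2
    # nm -D lists the dynamic symbol table; --defined-only drops undefined entries.
    # The symbol name is the last field; any "@version" suffix is stripped.
    nm -D --defined-only "$lib" 2>/dev/null |
        awk -v s="$sym" '{ n = $NF; sub(/@.*/, "", n); if (n == s) found = 1 }
                         END { exit !found }'
}
```

For example, `exports_symbol /path/to/liblapack.so dgetrf_ && echo found`, where the library path is a placeholder for your own build. Running ldd on libODE_DLB.so additionally shows which libraries the loader resolves at run time.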

@moreff
Member

moreff commented Oct 4, 2022

I got some reports from different people and was able to reproduce the issue myself on Ubuntu 22.04.

It is caused by a change to the lapack.h header file in LAPACK 3.9.1 (Ubuntu 22.04 uses netlib's LAPACK 3.10), which introduced additional string-length parameters in function declarations via the LAPACK_FORTRAN_STRLEN_END macro. In netlib's LAPACK 3.10 this was then extended to the dgetrs_ function, which is used here.

Perhaps, it can be fixed in a similar way to this fix in OpenCV library.
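
To see whether a given installation is affected, one can look for the macro in the installed header. This is a rough sketch only: the function name header_has_strlen_args is hypothetical, and finding the macro only shows that the header uses trailing string-length arguments somewhere, not that dgetrs_ specifically is declared with one.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the given lapack.h mentions the
# LAPACK_FORTRAN_STRLEN_END macro, i.e. the header declares at least some
# routines with trailing string-length arguments.
header_has_strlen_args() {
    local header="$1"
    [ -f "$header" ] && grep -q 'LAPACK_FORTRAN_STRLEN_END' "$header"
}
```

Usage would look like `header_has_strlen_args /path/to/lapack.h && echo "newer interface"`, with the path pointing at the lapack.h that the compiler actually picks up (see the include chain in the error log).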

@drhcelik

drhcelik commented Oct 4, 2022

This is a known issue; I discussed it with @moreff about a month ago.
I work with Intel MKL on my PC, and the intel-mkl package that Ubuntu 22.04 provides works fine with the current DLBFoam as well.
Basically, run sudo apt install intel-mkl and then set your $MKLROOT environment variable to /usr/include/mkl

@blttkgl
Collaborator

blttkgl commented Oct 5, 2022

Yes, I feel that at this point the standalone installation does not need to be supported if the devs think it's too big of a hassle. A couple of notes on why it exists in the first place, to my knowledge:

1- Intel MKL was not directly included in the package manager until Ubuntu 20.04. You could still install it, but you needed to jump through some hoops. Having a standalone version made things easier for a user with limited Unix experience.

2- There were benchmarks all around showing Intel explicitly throttling the efficiency of the MKL libraries on non-Intel architectures (e.g., AMD). This is the reason why we have an OpenBLAS option as well: CSC's Mahti has AMD processors, and using OpenBLAS is much faster over there.

@mactone

mactone commented Oct 14, 2022

I have the same issue when compiling.
I am using Ubuntu 22.04 LTS with OpenFOAM v9.
I followed the steps in the YouTube video https://www.youtube.com/watch?v=1hwSffkuuY8

(sudo) apt-get install liblapacke-dev
./Allwmake --clean --platform STANDALONE

I still have exactly the same issue as @kazumamatata

Do I need to compile as root? I compile it in my run directory.

@blttkgl
Collaborator

blttkgl commented Oct 14, 2022

Hey @mactone, please check the response by Ilya in #21 (comment) and the proposed temporary fix by Hasan in #21 (comment). You can use Intel MKL instead of the standalone LAPACK until the issue is fixed.

Bulut

@mactone

mactone commented Oct 14, 2022

Thank you Bulut.
I tried Hasan's way: I installed MKL and built with --platform=MKL
It failed somewhere else, as follows:

In file included from ODESolvers/seulex_LAPACK/seulex_LAPACK.C:28 ODESolvers/seulex_LAPACK/seulex_LAPACK.H:55:10: fatal error: mkl_lapack.h: No such file or directory

I checked that my $MKLROOT is /usr/include/mkl/

There is an mkl_lapack.h in this path.

@drhcelik
I tested this in a virtual machine, and at least on my system the MKL header files end up in that folder. One way to check is to see where libmkl-dev, which is installed along with the intel-mkl package, puts its files. So please try dpkg -L libmkl-dev; this will show you where the header file is, and you can then set that folder as your MKLROOT.
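
On systems without dpkg, or when MKL was installed some other way, a plain search for the header gives the same information. A minimal sketch, assuming a hypothetical helper name find_header_dir and search roots that you choose yourself:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the directory of the first file named $1 found
# under the search roots given as the remaining arguments.
# Returns 1 (and prints nothing) if no such file is found.
find_header_dir() {
    local name="$1"
    shift
    local hit
    # "|| true" keeps the function usable under "set -e -o pipefail".
    hit=$(find "$@" -type f -name "$name" 2>/dev/null | head -n 1) || true
    [ -n "$hit" ] || return 1
    dirname "$hit"
}
```

For instance, `find_header_dir mkl_lapack.h /usr/include /opt` prints the folder holding the header, if any; the two roots here are only examples.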

@kahilah

kahilah commented Oct 14, 2022

As users seem to have issues with these paths, we should probably extend the Allwmake MKL/OpenBLAS checks to look not only at whether the variable exists but also at whether the correct files are located in that path.

@mactone

mactone commented Oct 17, 2022

DLBfoam
I solved the MKL question, but got another error and can't go on. Has anyone solved this issue?

@drhcelik

What is your gcc version and what is your operating system?
For instance, CentOS has several bugs related to gcc (I am not sure whether they have been fixed), and this issue might be related to that.

@mactone

mactone commented Oct 17, 2022

I am using Ubuntu 22.04 LTS with gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0

@mactone

mactone commented Oct 23, 2022

Although I can't get DLBFoam built on my Ubuntu workstation, it builds smoothly with OpenFOAM 9 on Windows.

@mactone

mactone commented Oct 24, 2022

I tried removing openfoam9 and liblapacke-dev and reinstalling them.
I still have the same issue as @kazumamatata

I wonder how @kazumamatata solved the issue?!

@drhcelik

Dear mactone,

I think the problem you are having is not related to DLBFoam, so you should be able to use DLBFoam, although I haven't tested it myself.

From your error message, it looks like the MINSIGSTKSZ variable cannot be used in a constexpr context. There might be an update to Catch2 that I am not aware of. As a very dirty and temporary solution, you should be able to compile the unit tests if you modify these lines:

Line 11258: static constexpr std::size_t sigStackSize = 32768;

Line 11317: char FatalConditionHandler::altStackMem[32768] = {};

However, as I said, this may not be the correct approach and you might miss some tests; I haven't checked anything myself. You may try it, and maybe we should investigate this in another issue.

Regards

@mactone

mactone commented Oct 24, 2022

@drhcelik In my previous attempt I built with MKL as the platform, and it seems that c_pyjac_test was built. Can I use DLBFoam without finishing the whole Allwmake process?

To use the trick you mentioned, do I need to change the catch.hpp file as you suggested?
Thank you.

@drhcelik

I don't have access to Ubuntu 22.04 right now, so I am not able to test it, but if you already have a DLBFoam case you may test with it, or you may use one of the cases in this repository. It looks like you are able to compile the libraries but fail at the tests. Yes, you may try to modify those lines in catch.hpp. I think the problem occurs at these lines:

static constexpr std::size_t sigStackSize =

char FatalConditionHandler::altStackMem[sigStackSize] = {};

This problem is not related to MKL or DLBFoam; I think it is related to Catch2. But as I said, this is probably not the most elegant way to do it, so just modify the lines to test whether it works.

@mactone

mactone commented Oct 25, 2022

Thank you @drhcelik, I finally resolved the issue through your instruction.

@blttkgl
Collaborator

blttkgl commented Oct 25, 2022

Perhaps we could itemize the actions needed to close this issue and assign them to someone before this thread spirals out of control? @moreff @kahilah any suggestions on where to start? I am in favor of removing the STANDALONE option if MKL's performance on AMD is now comparable to that on Intel architectures.

@kahilah

kahilah commented Oct 25, 2022

I think there are two questions here:

  1. Shall we support STANDALONE?
  • I think standalone should still be included, as there is no guarantee that MKL works everywhere. Perhaps MKL could be the new default, or we mention these latest issues clearly in the README as known issues.
  2. How to help users having trouble with their library linking and paths?
  • I think a simple shell script in Allwmake should be enough to capture most issues related to e.g. MKL. Of course we can clean this up and extend it to test include files as well, but typically LIB is the challenge for users.

See example I just put together quickly:

MKL_LIBS=("libmkl_intel_ilp64.so" "libmkl_sequential.so" "libmkl_core.so")
# System libraries (libpthread, libm, libdl) are not installed under MKLROOT, so they are not checked here.

# Iterate over every array element; a bare $MKL_LIBS would expand to the first one only.
for LIBI in "${MKL_LIBS[@]}"
do
    LIB_PATH="$MKLROOT/lib/intel64/$LIBI"

    if [ ! -f "$LIB_PATH" ] ; then
        echo "ERROR: $LIB_PATH not found."
        echo "If your installation is not under default MKLROOT, please comment out this function."
        exit 1
    fi
done

@drhcelik

As I remember, sudo apt install intel-mkl installs everything into /usr/include/mkl, and the intel64 folder sits directly in that folder. When intel-mkl is installed through Intel's own packages, however, the intel64 folder is installed under $MKLROOT/lib.

@blttkgl
Collaborator

blttkgl commented Oct 25, 2022

I think we need to decide how much hand-holding we want to provide to the user. In most cases MKL comes preinstalled and MKLROOT is available if the compilation is on a cluster, and the assumption should be that users know basic UNIX and/or have access to their IT support if they are attempting to compile this on a cluster. For personal computer usage we could just provide a simple standalone compilation with the stock library. We could even consider shipping the standalone version with a dynamic LAPACK library, since performance is not a concern for a PC installation using standalone.

@kahilah

kahilah commented Oct 25, 2022

I agree with @blttkgl: one should not build this around how things look on certain distros. The compilation scripts should use the $MKLROOT variable, and the README should state that it is the user's responsibility to set it correctly in their shell. Then we can give examples for that.
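
As one possible example of that kind, a build script could fail early with a clear message when the variable is missing, and leave everything else to the user. This is only a sketch; the function name require_mklroot is hypothetical and not part of the current Allwmake.

```shell
#!/usr/bin/env bash
# Hypothetical guard for a build script: fail early when MKLROOT is unset
# or does not point to an existing directory.
require_mklroot() {
    if [ -z "${MKLROOT:-}" ]; then
        echo "ERROR: MKLROOT is not set. Please export it in your shell." >&2
        return 1
    fi
    if [ ! -d "$MKLROOT" ]; then
        echo "ERROR: MKLROOT=$MKLROOT is not a directory." >&2
        return 1
    fi
}
```

A user would then run something like `export MKLROOT=/usr/include/mkl` (the location reported earlier in this thread for the Ubuntu package) before building, and the script would call `require_mklroot || exit 1` before compiling anything.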

@arintanen
Member

Hi,
Some recent LAPACK interfaces require the length of the char* array to be passed as the last argument, which you can see in your error message:

error: too few arguments to function ‘void dgetrs_(const char*, const int*, const int*, const double*, const int*, const int*, double*, const int*, int*, size_t)’

Here the last argument should be 1, denoting the length of the const char* argument.
OpenFOAM 8 is no longer supported, but this issue was fixed in later versions. See commit abbb26f
