Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use both absolute and relative header paths in header matching #362

Merged
merged 16 commits into from
Jan 1, 2025

Conversation

Tal500
Copy link
Contributor

@Tal500 Tal500 commented Aug 6, 2024

Fixes #402

This fix an issue of importing the same header by different addresses.
The issue is trickier than seems, because if a user used the include dir flag -I... with an absolute path, then every matched header was detected to be absolute, but if the import was a relative import to the source file and the source file isn't with an absolute path, than the header is identified by the combined relative path.

See https://sourceforge.net/p/cppcheck/discussion/general/thread/c01f80556b/

Note - I tried to make the code compatible with Windows, but was tested only on linux.

This fix an issue of importing the same header by different addresses.
The issue is trickier than seems, because if a user used the include dir flag `-I...` with an absolute path, then every matched header was detected to be absolute, but if the import was a relative import to the source file and the source file isn't with an absolute path, than the header is identified by the combined relative path.
See https://sourceforge.net/p/cppcheck/discussion/general/thread/c01f80556b/
@firewave
Copy link
Collaborator

firewave commented Aug 6, 2024

This change might be problematic because it is obviously requires a test but so far there are no tests at all utilizing includes (see #261 for a related discussion).

This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?

@Tal500
Copy link
Contributor Author

Tal500 commented Aug 6, 2024

This change might be problematic because it is obviously requires a test but so far there are no tests at all utilizing includes (see #261 for a related discussion).

This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?

What about these (failed) tests? https://ci.appveyor.com/project/danmar/simplecpp/builds/50353854

@danmar
Copy link
Owner

danmar commented Aug 12, 2024

This should probably be implemented by introducing Python tests similar to what Cppcheck is doing. @danmar would that be acceptable for tests which actually require (multiple) files?

sure, it would definitely make sense to include actual files in some python tests but as @Tal500 pointed out there are some tests for the handling at least.

What about these (failed) tests?

yes imho those are valuable also.

@Tal500
Copy link
Contributor Author

Tal500 commented Aug 14, 2024

@danmar if I guess correctly, the failure is due to that the header files are not really in the working directory or so, right?

@danmar
Copy link
Owner

danmar commented Aug 16, 2024

if I guess correctly, the failure is due to that the header files are not really in the working directory or so, right?

yes that sounds probable to me.

I don't know how your changes will affect the output from simplecpp if all filepaths will always be absolute that is also unfortunate. Is the output changed?

@Tal500
Copy link
Contributor Author

Tal500 commented Aug 18, 2024

I don't know how your changes will affect the output from simplecpp if all filepaths will always be absolute that is also unfortunate. Is the output changed?

Yes, except for the system headers that were not found in the scan paths, e.g. STL headers (that simplecpp doesn't open them at all and it's not consider to be an error).

We could have a more complicated solution, that uses two header attributes:

  1. Name: The (first) included path, perhaps normalized
  2. Path: The absolute path that it is loaded from

The first attribute is mandatory, and the second one is optional and exists only when the header is found.
That way, the error messages could still be looking the same as for today, but still correct.

An alternative solution is to have a user configuration that toggles the include path absoluteness.

@Tal500
Copy link
Contributor Author

Tal500 commented Sep 4, 2024

We could have a more complicated solution, that uses two header attributes:
@danmar what do you think about this solution? Do you have a better idea?

@danmar
Copy link
Owner

danmar commented Sep 28, 2024

We could have a more complicated solution, that uses two header attributes:
@danmar what do you think about this solution

I am very sorry for the delay. I was at cppcon last week and had a presentation.

As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.

@danmar
Copy link
Owner

danmar commented Sep 28, 2024

@Tal500 just making sure you see my comment.

@Tal500
Copy link
Contributor Author

Tal500 commented Sep 29, 2024

I am very sorry for the delay. I was at cppcon last week and had a presentation.

I'd like too see it when it will be published!

As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.

I'll look into that more

@danmar
Copy link
Owner

danmar commented Sep 29, 2024

I am very sorry for the delay. I was at cppcon last week and had a presentation.

I'd like too see it when it will be published!

👍

for your information here is a short description:
https://cppcon2024.sched.com/event/1gZdy/building-cppcheck-what-we-learned-from-17-years-of-development

I guess it will take a while until it is published.

@Tal500
Copy link
Contributor Author

Tal500 commented Nov 17, 2024

We could have a more complicated solution, that uses two header attributes:
@danmar what do you think about this solution

I am very sorry for the delay. I was at cppcon last week and had a presentation.

As far as I understand the solution you suggest makes sense to me. We don't need to have these properties for each token right.

I went on a different solution - just be conservative and check whether the relative version of the path or the absolute version of the path was already loaded (in case that the header file was found).

If the header is included as a system header, the default expansion is to absolute path.
If the header is included as a relative manner (by "") - it just simplifies the path - and do not change its path absoluteness (relative paths keep being relative, absolute paths keep being absolute).

I think that this is the sane solution for this problem.
I run the tests on Linux for both simplecpp and cppcheck, and they all passed eventually.

@Tal500 Tal500 changed the title fix: always use absolute (and simplified) header path whenever possible fix: use both always and relative header paths in header matching Nov 18, 2024
@Tal500 Tal500 changed the title fix: use both always and relative header paths in header matching fix: use both absolute and relative header paths in header matching Nov 18, 2024
@danmar
Copy link
Owner

danmar commented Nov 20, 2024

I think that this is the sane solution for this problem.

ok I didn't fully dig into your solution but I will trust you for now.

I think a pytest test would be a good idea. How is your pytest skills? I can provide a resource that will write the test if it feels complex for you.

@Tal500
Copy link
Contributor Author

Tal500 commented Nov 20, 2024

I think that this is the sane solution for this problem.

ok I didn't fully dig into your solution but I will trust you for now.

I think a pytest test would be a good idea. How is your pytest skills? I can provide a resource that will write the test if it feels complex for you.

I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?).
Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?

In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?

@danmar
Copy link
Owner

danmar commented Nov 21, 2024

Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?

If you want to create a test that creates physical files in different locations and checks how simplecpp behaves with different -I flags then I think a pytest is more flexible and more proper.

I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?).

Good. We use pytest in Cppcheck. See this for instance: https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py

There are several tests in that file that creates testfiles using tmpdir..

In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?

more or less if you read from the actual filesystem I think it's not a unit test. It's quite common that people wants to be able to compile the test binary in a arbitrary path and run it there and if it depends on physical files then that will not work.

@Tal500
Copy link
Contributor Author

Tal500 commented Dec 1, 2024

Why do you suggest pytest? Don't you rather prefer to add a simplecpp unit test?

If you want to create a test that creates physical files in different locations and checks how simplecpp behaves with different -I flags then I think a pytest is more flexible and more proper.

I know python, I think I can learn pytest a little bit if needed (where is it used and in which repo?).

Good. We use pytest in Cppcheck. See this for instance: https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py

There are several tests in that file that creates testfiles using tmpdir..

In addition, what is your plan for a unittest? I suggest to have a unittest that includes both realtive path and curDir() + "/" + realative_path, and see if it is included twice. Do you have any security concerns about using/leaking the curDir() of the user machine in the test cases?

more or less if you read from the actual filesystem I think it's not a unit test. It's quite common that people wants to be able to compile the test binary in a arbitrary path and run it there and if it depends on physical files then that will not work.

Tested locally, here are my added tests (with helper functions), append them to https://github.com/danmar/cppcheck/blob/main/test/cli/other_test.py to see the results:

def __test_relative_header_create_header(dir, with_pragma_once=True):
    header_file = os.path.join(dir, 'test.h')
    with open(header_file, 'wt') as f:
        f.write(f"""
                {"#pragma once" if with_pragma_once else ""}
                #ifndef TEST_H_INCLUDED
                #define TEST_H_INCLUDED
                #else
                #error header_was_already_included
                #endif
                """)
    return header_file, "error: #error header_was_already_included"

def __test_relative_header_format_include_path_arg(include_path):
    return f"-I{json.dumps(str(include_path))[1:-1]}"

def __test_relative_header_format_include(include, is_sys_header=False):
    if is_sys_header:
        return f"<{json.dumps(include)[1:-1]}>"
    else:
        return json.dumps(include)

def __test_relative_header_create_source(dir, include1, include2, is_include1_sys=False, is_include2_sys=False):
    src_file = os.path.join(dir, 'test.c')
    with open(src_file, 'wt') as f:
        f.write(f"""
                #undef TEST_H_INCLUDED
                #include {__test_relative_header_format_include(include1, is_include1_sys)}
                #include {__test_relative_header_format_include(include2, is_include2_sys)}
                """)
    return src_file


def test_relative_header_0_rel(tmpdir):
    _, double_include_error = __test_relative_header_create_header(tmpdir, with_pragma_once=False)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h")

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert double_include_error in stderr

def test_relative_header_0_sys(tmpdir):
    _, double_include_error = __test_relative_header_create_header(tmpdir, with_pragma_once=False)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h", is_include1_sys=True, is_include2_sys=True)

    args = [__test_relative_header_format_include_path_arg(tmpdir), test_file]

    _, _, stderr = cppcheck(args)
    assert double_include_error in stderr

def test_relative_header_1_rel(tmpdir):
    __test_relative_header_create_header(tmpdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h")

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''

def test_relative_header_1_sys(tmpdir):
    __test_relative_header_create_header(tmpdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test.h", is_include1_sys=True, is_include2_sys=True)

    args = [__test_relative_header_format_include_path_arg(tmpdir), test_file]

    _, _, stderr = cppcheck(args)
    assert stderr == ''

## TODO: the following tests should pass after applying simplecpp#362

def test_relative_header_2(tmpdir):
    header_file, _ = __test_relative_header_create_header(tmpdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", header_file)

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''

def test_relative_header_3(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    header_file, _ = __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, "test_subdir/test.h", header_file)

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''

def test_relative_header_3_inv(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    header_file, _ = __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, header_file, "test_subdir/test.h")

    args = [test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''


def test_relative_header_4(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    header_file, _ = __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, header_file, "test2.h", is_include2_sys=True)

    args = [__test_relative_header_format_include_path_arg(test_subdir), test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''



def test_relative_header_5(tmpdir):
    test_subdir = os.path.join(tmpdir, "test_subdir")
    os.mkdir(test_subdir)
    __test_relative_header_create_header(test_subdir)

    test_file = __test_relative_header_create_source(tmpdir, "test.h", "test_subdir/test.h", is_include1_sys=True)

    args = [__test_relative_header_format_include_path_arg(test_subdir), test_file]

    _, _, stderr = cppcheck(args, cwd=tmpdir)
    assert stderr == ''

@danmar
Copy link
Owner

danmar commented Dec 5, 2024

I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.

I don't have strong opinion. But you could add integration_test.py or simplecpp_test.py maybe in the root folder. The test would use the simplecpp binary rather than cppcheck.

def __test_relative_header_format_include_path_arg(include_path):
return f"-I{json.dumps(str(include_path))[1:-1]}"

I don't understand this. The -I does not expect json.

@Tal500
Copy link
Contributor Author

Tal500 commented Dec 5, 2024

I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.

Will do that, I need also to copy some verification of infra functions that I had in cppcheck pytest files.

def __test_relative_header_format_include_path_arg(include_path):
return f"-I{json.dumps(str(include_path))[1:-1]}"

I don't understand this. The -I does not expect json.

It's a common trick to extract escaped string to a string in many language to use JSON serialization for a string, and in Python it is just json.dumps(s). It's equivalent to the C++ std::quoted(). Notice that JSON serialization is more general than just quoting a string (i.e. also numbers booleans arrays and dictionaries), but this is the natural function for that. Notice also that I'm also trimming to the Python range [1:-1] in order to eliminate the trailing " from the begin and the end of the quoted string.

BTW: I'm not sure that I should have escaped the strings with the "-I..." flag, since this argument is passed directly to the process arguments, and I'm not sure if the string escapes characters (i.e. /) are getting processed.

@danmar
Copy link
Owner

danmar commented Dec 6, 2024

ok I was also confused by the {} but oops now I understand those are because it's a format string. I spontanously don't think you'll need to escape but feel free to try..

@Tal500
Copy link
Contributor Author

Tal500 commented Dec 10, 2024

I prefer that you add some pytest test file to simplecpp repo. We want to test this directly here.

I don't have strong opinion. But you could add integration_test.py or simplecpp_test.py maybe in the root folder. The test would use the simplecpp binary rather than cppcheck.

Done. Note the needed gitignore that for some reason you didn't notice

@danmar
Copy link
Owner

danmar commented Dec 12, 2024

Thanks.. test looks good imho. I have two more small comments:

  • I would like that there is an issue for this.
  • please run the test in a github action

@Tal500
Copy link
Contributor Author

Tal500 commented Dec 15, 2024

Thanks.. test looks good imho. I have two more small comments:

* I would like that there is an issue for this.

* please run the test in a github action

Done

Copy link
Owner

@danmar danmar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry for delayed review. please refactor testing.

integration_test.py Outdated Show resolved Hide resolved
@Tal500
Copy link
Contributor Author

Tal500 commented Dec 22, 2024

sorry for delayed review. please refactor testing.

Done

@danmar danmar merged commit c4b6e37 into danmar:master Jan 1, 2025
1 check passed
@danmar
Copy link
Owner

danmar commented Jan 1, 2025

@Tal500 Thank you!

@Tal500 Tal500 deleted the absolute-path-header branch January 6, 2025 16:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

absolute and relative header paths aren't resolved to be the same header
3 participants