Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Sobel and Scharr operators #392

Merged
merged 15 commits into from
Oct 29, 2019

Conversation

simmplecoder
Copy link
Contributor

@simmplecoder simmplecoder commented Sep 24, 2019

Description

This commit adds Sobel and Scharr
operators with support for 0th and 1st
degrees with other degrees planned for
later

References

https://www.researchgate.net/publication/239398674_An_Isotropic_3_3_Image_Gradient_Operator

https://www.researchgate.net/profile/Hanno_Scharr/publication/220955743_Optimal_Filters_for_Extended_Optical_Flow/links/004635151972eda98f000000/Optimal-Filters-for-Extended-Optical-Flow.pdf

Tasklist

  • Implement the operators
  • Add test case(s)
  • Ensure all CI builds pass
  • Review and approve

@simmplecoder simmplecoder added status/work-in-progress Do NOT merge yet until this label has been removed! and removed status/work-in-progress Do NOT merge yet until this label has been removed! labels Oct 1, 2019
example/Jamfile Outdated Show resolved Hide resolved
include/boost/gil/detail/math.hpp Outdated Show resolved Hide resolved
include/boost/gil/image_processing/numeric.hpp Outdated Show resolved Hide resolved
test/core/image_processing/sobel_scharr.cpp Outdated Show resolved Hide resolved
@stefanseefeld
Copy link
Member

yes, yes, I still plan to refactor the code so the numeric extension will be fused into core. Sorry for my slowness... :-(

@mloskot
Copy link
Member

mloskot commented Oct 1, 2019

@stefanseefeld

I still plan to refactor the code so the numeric extension will be fused into core

Okay, so if @simmplecoder doesn't prefer to do it now, then we can split numeric.hpp into multiple headers during or after the fuse.

@simmplecoder

This comment has been minimized.

@mloskot mloskot added the cat/feature New feature or functionality label Oct 8, 2019
@mloskot

This comment has been minimized.

Copy link
Member

@mloskot mloskot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simmplecoder LGTM.

However, please ask and wait for @stefanseefeld 's review and consider his review superior to mine.

@simmplecoder
Copy link
Contributor Author

@stefanseefeld , would you like me to unify the interface of all kernel generator functions in this PR?

@stefanseefeld
Copy link
Member

@simmplecoder , sorry, I don't entirely understand the question. What do you mean by "unify" ?

@simmplecoder
Copy link
Contributor Author

@stefanseefeld, earlier functions that generate Gaussian and mean kernels return image_view, while these return kernel_2d. Would you like me to make all functions return kernel_2d?

@stefanseefeld
Copy link
Member

Ah, yes, I think that would be preferable. Thanks,

@mloskot
Copy link
Member

mloskot commented Oct 22, 2019

@stefanseefeld

Ah, yes, I think that would be preferable.

I agree that it's better to (create and) use dedicated types for job than trying to (bend to) re-use existing types where they don't necessarily fit, neither conceptually nor physically.
Another aspect is that if we use types like image_view as general-purpose types we are somewhat tightening the coupling between functions operating on kernels and image_view preventing treating those as completely independent entities. This may cause problems in future (e.g. it may be hard to change either).

@simmplecoder
Copy link
Contributor Author

@miralshah365 , could you please check if I broke your code? I migrated small part of your code that relied on old image_view interface for kernel generators.

@simmplecoder simmplecoder added the status/work-in-progress Do NOT merge yet until this label has been removed! label Oct 23, 2019
@simmplecoder
Copy link
Contributor Author

The output of Harris and Hessian seem to be differ from before the migration to kernel_2d. I will investigate that further, but I believe the interface should be more or less stable now. Hopefully I will be able to find what's wrong until the deadline.

@mloskot
Copy link
Member

mloskot commented Oct 23, 2019

@simmplecoder If there is anything to update in @miralshah365 's code and she's offline, then don't hesitate to nudge me. I will try to take care of updating Miral's code.

AFAIS, current failures are in the kernel tests, see https://dev.azure.com/boostorg/gil/_build/results?buildId=701&view=logs&jobId=2517ed61-6924-508d-087f-7c02f775cbba

 "D:\a\1\s\boost-root\libs\gil\_build\test\core\image_processing\test_core_image_processing_simple_kernels.vcxproj" (default target) (26) ->
       (ClCompile target) -> 
         d:\a\1\s\boost-root\libs\gil\test\core\image_processing\simple_kernels.cpp(20): error C2664: 'boost::gil::kernel_2d<float,std::allocator<T>> boost::gil::generate_normalized_mean<float,std::allocator<T>>(size_t)': cannot convert argument 1 from 'boost::gil::image_view<boost::gil::gray32f_loc_t>' to 'size_t' [D:\a\1\s\boost-root\libs\gil\_build\test\core\image_processing\test_core_image_processing_simple_kernels.vcxproj]
         d:\a\1\s\boost-root\libs\gil\test\core\image_processing\simple_kernels.cpp(39): error C2664: 'boost::gil::kernel_2d<float,std::allocator<T>> boost::gil::generate_normalized_mean<float,std::allocator<T>>(size_t)': cannot convert argument 1 from 'boost::gil::image_view<boost::gil::gray32f_loc_t>' to 'size_t' [D:\a\1\s\boost-root\libs\gil\_build\test\core\image_processing\test_core_image_processing_simple_kernels.vcxproj]
         d:\a\1\s\boost-root\libs\gil\test\core\image_processing\simple_kernels.cpp(51): error C2664: 'boost::gil::kernel_2d<float,std::allocator<T>> boost::gil::generate_unnormalized_mean<float,std::allocator<T>>(size_t)': cannot convert argument 1 from 'boost::gil::image_view<boost::gil::gray32f_loc_t>' to 'size_t' [D:\a\1\s\boost-root\libs\gil\_build\test\core\image_processing\test_core_image_processing_simple_kernels.vcxproj]
         d:\a\1\s\boost-root\libs\gil\test\core\image_processing\simple_kernels.cpp(69): error C2664: 'boost::gil::kernel_2d<float,std::allocator<T>> boost::gil::generate_unnormalized_mean<float,std::allocator<T>>(size_t)': cannot convert argument 1 from 'boost::gil::image_view<boost::gil::gray32f_loc_t>' to 'size_t' [D:\a\1\s\boost-root\libs\gil\_build\test\core\image_processing\test_core_image_processing_simple_kernels.vcxproj]
         d:\a\1\s\boost-root\libs\gil\test\core\image_processing\simple_kernels.cpp(81): error C2664: 'boost::gil::kernel_2d<float,std::allocator<T>> boost::gil::generate_gaussian_kernel<float,std::allocator<T>>(size_t,double)': cannot convert argument 1 from 'boost::gil::image_view<boost::gil::gray32f_loc_t>' to 'size_t' [D:\a\1\s\boost-root\libs\gil\_build\test\core\image_processing\test_core_image_processing_simple_kernels.vcxproj]
         d:\a\1\s\boost-root\libs\gil\test\core\image_processing\simple_kernels.cpp(114): error C2664: 'boost::gil::kernel_2d<float,std::allocator<T>> boost::gil::generate_gaussian_kernel<float,std::allocator<T>>(size_t,double)': cannot convert argument 1 from 'boost::gil::image_view<boost::gil::gray32f_loc_t>' to 'size_t' [D:\a\1\s\boost-root\libs\gil\_build\test\core\image_processing\test_core_image_processing_simple_kernels.vcxproj]

@simmplecoder simmplecoder requested a review from mloskot October 23, 2019 15:34
@simmplecoder
Copy link
Contributor Author

@mloskot, I made a lot of changes after your approval, could you please have a look at new changes?

@simmplecoder
Copy link
Contributor Author

FYI, I checked the output of the Harris and Hessian with cmp --silent $old $new || echo "files are different" of commits 5abd20e (needs fixing cmake due to my bad rebase) and 42a3ea0, the files are identical

@simmplecoder simmplecoder removed the status/work-in-progress Do NOT merge yet until this label has been removed! label Oct 23, 2019
@simmplecoder
Copy link
Contributor Author

@stefanseefeld , I believe I did all the changes I wanted. The PR is ready for review.

Copy link
Member

@mloskot mloskot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!

@@ -42,7 +44,7 @@ void compute_harris_responses(
" tensor from the same image");
}

auto const window_length = weights.dimensions().x;
auto const window_length = weights.size();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor naming remark: wouldn't width or x_size fit better as name here than length?

inline void compute_hessian_responses(
GradientView ddxx,
GradientView dxdy,
GradientView ddyy,
Weights weights,
const kernel_2d<T, Allocator>& weights,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor stylistic remark: since in C++ this reads from right to left as "weights is a reference to const kernel_2d object", as we do it for pointers, this should be the East Const way: kernel_2d<T, Allocator> const& weights

You may want to join the revolution! 😉

image

/// \brief Generate mean kernel
/// \ingroup ImageProcessingMath
///
/// Fills supplied view with normalized mean
/// in which all entries will be equal to
/// \code 1 / (dst.size()) \endcode
inline void generate_normalized_mean(boost::gil::gray32f_view_t dst)
template <typename T = float, typename Allocator = std::allocator<T>>
inline kernel_2d<T, Allocator> generate_normalized_mean(std::size_t side_length)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, wouldn't width or x_size fit better as name here than length?

}
}
}

/// \brief Generate mean kernel
/// \ingroup ImageProcessingMath
///
/// Fills supplied view with normalized mean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps you could run find and replace for /// Fills supplied view changing to /// Fills kernel. This could be done even after merge as single commit with [ci skip] tag in subject line. No need to CI this kind of change.

case 1:
{
kernel_2d<T, Allocator> result(3, 1, 1);
std::copy(dx_sobel.begin(), dx_sobel.end(), result.begin());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have compile-time known std::array<float, 9> dx_sobel, this place seems like a good opportunity to add static_assert verifying dx_sobel.size() and BOOST_ASSERT verifying dx_sobel.size() == result.size().

Such assertions would also serve as code comment for a reader who doesn't have to jump to dx_sobel definition.

case 1:
{
kernel_2d<T, Allocator> result(3, 1, 1);
std::copy(dx_scharr.begin(), dx_scharr.end(), result.begin());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, I'd add some assertions.

case 1:
{
kernel_2d<T, Allocator> result(3, 1, 1);
std::copy(dy_sobel.begin(), dy_sobel.end(), result.begin());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And assertions here.

case 1:
{
kernel_2d<T, Allocator> result(3, 1, 1);
std::copy(dy_scharr.begin(), dy_scharr.end(), result.begin());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And assertions here too 😄

[](gray32f_pixel_t pixel) -> float {return pixel.at(std::integral_constant<int, 0>{}); }
);

kernel_2d<float> kernel = generate_gaussian_kernel(kernel_size, 1.0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kernel_2d<float> const kernel?

This simple addition immediately clears potential reader's question "can a thing be modified by function call that follows?".

Harris and Hessian examples now use
new interface for kernel generation
simple_kernels are now using kernel_2d
interface
Normalized mean generation had missing
return at the end of the function
This commit reacts to kernel_2d,
convolve_2d being moved to
namespace detail
@simmplecoder simmplecoder merged commit 62379dd into boostorg:develop Oct 29, 2019
@simmplecoder simmplecoder deleted the sobel-scharr branch August 1, 2020 20:08
@Sayan-Chaudhuri
Copy link

@simmplecoder I was using the sobel operator for gradient along x direction for the photo attached below. The sobel operator is of degree 1. To my surprise, after applying the sobel operator, the pixel values are coming of the magnitude of 65000 in the gradient image. Is it possible? I used the sobel_scharr.cpp file in the example folder of boost gil and simply passed the name of the operator and the name of the picture and the other arguments as mentioned. I tested the same picture by applying sobel in opencv in python and found the pixel values to be as it should be. If you can kindly help me regarding this issue,I will be highly grateful
cat

@simmplecoder
Copy link
Contributor Author

@Sayan-Chaudhuri perhaps you are using unsigned images and views? IIRC the read functions cannot read signed images, but you can convert them into signed or use color_converted_view. The destination view must be signed as well.

@simmplecoder
Copy link
Contributor Author

simmplecoder commented Apr 6, 2021 via email

@Sayan-Chaudhuri
Copy link

Thanks for taking out time to look into the matter

@meshtag
Copy link
Member

meshtag commented Apr 7, 2021

@simmplecoder @Sayan-Chaudhuri , I believe the issue was caused due to 16 in
gil::gray16_image_t dx_image(input_image.dimensions()); and gil::gray16_image_t dy_image(input_image.dimensions());.
Changing them to gil::gray8_image_t dx_image(input_image.dimensions()) and gil::gray8_image_t dy_image(input_image.dimensions()) solved the problem.

@meshtag
Copy link
Member

meshtag commented Apr 7, 2021

@simmplecoder , perhaps we can add a check inside convolve_2d() for handling such things? Is it a good idea to use channel_traits for this purpose? (/cc @mloskot @lpranam )

@Sayan-Chaudhuri
Copy link

Yes, I also found that but why did this happen?

@Sayan-Chaudhuri
Copy link

@meshtag

@meshtag
Copy link
Member

meshtag commented Apr 7, 2021

Without doing an in depth analysis, my initial thoughts suggest that we are probably looking at some kind of an overflow inside convolve_2d(). I will have to test it more in order to reach to a conclusive reason.

@Sayan-Chaudhuri
Copy link

Should I make a pull request if i find out the reason and solve it?@meshtag

@meshtag
Copy link
Member

meshtag commented Apr 7, 2021

I am waiting for feedback on this. If we all agree on the root cause of this problem and my suggestion to solve it, I would like to create a PR myself. Lets wait until we hear from more experienced developers.

@Sayan-Chaudhuri
Copy link

ok.

@simmplecoder
Copy link
Contributor Author

simmplecoder commented Apr 7, 2021

@meshtag I believe @Scramjet911 didn't have much luck with channel traits. It seems like whatever it returns is not what we need. The problem with the example is that it uses unsigned type where negative values will be present. The fix to use gray8_image_t is wrong, because it uses unsigned type too. I will PR the fix and mention everybody in it this evening.

@Scramjet911
Copy link
Contributor

Scramjet911 commented Apr 7, 2021

@meshtag I believe @Scramjet911 didn't have much luck with channel traits. It seems like whatever it returns is not what we need. The problem with the example is that it uses unsigned type where negative values will be present. The fix to use gray8_image_t is wrong, because it uses unsigned type too. I will PR the fix and mention everybody in it this evening.

@simmplecoder Channel traits was working fine, but the problem was that the destination image of that case was of floating type, giving me the wrong result. So I had to use the source channel traits for checking the constraints.

@simmplecoder
Copy link
Contributor Author

simmplecoder commented Apr 7, 2021

@Sayan-Chaudhuri @meshtag unfortunately I didn't get time to write saturate cast, but you can tweak the equation inside transform_pixels below:

#include <boost/gil/image_processing/numeric.hpp>
#include <boost/gil/extension/io/png.hpp>
#include <boost/gil/extension/numeric/convolve.hpp>
#include <string>
#include <iostream>

namespace gil = boost::gil;

int main(int argc, char* argv[])
{
    if (argc != 5)
    {
        std::cerr << "usage: " << argv[0] << ": <input.png> <sobel|scharr> <output-x.png> <output-y.png>\n";
        return -1;
    }

    gil::gray8_image_t input_image;
    gil::read_image(argv[1], input_image, gil::png_tag{});
    auto input = gil::view(input_image);
    auto filter_type = std::string(argv[2]);

    gil::gray16s_image_t dx_image(input_image.dimensions());
    auto dx = gil::view(dx_image);
    gil::gray16s_image_t dy_image(input_image.dimensions());
    auto dy = gil::view(dy_image);

    auto signed_input = gil::color_converted_view<gil::gray8s_pixel_t>(input);
    if (filter_type == "sobel")
    {
        gil::detail::convolve_2d(signed_input, gil::generate_dx_sobel(1), dx);
        gil::detail::convolve_2d(signed_input, gil::generate_dy_sobel(1), dy);
    }
    else if (filter_type == "scharr")
    {
        gil::detail::convolve_2d(signed_input, gil::generate_dx_scharr(1), dx);
        gil::detail::convolve_2d(signed_input, gil::generate_dy_scharr(1), dy);
    }
    else
    {
        std::cerr << "unrecognized gradient filter type. Must be either sobel or scharr\n";
        return -1;
    }

    gil::gray8_image_t output_image(input_image.dimensions());
    auto output = gil::view(output_image);

    gil::transform_pixels(dx, output, [](const gil::gray16s_pixel_t& dx_pixel) {
        double value = dx_pixel[0];
        return gil::gray8_pixel_t((value - std::numeric_limits<std::int16_t>::min()) / (std::numeric_limits<std::int16_t>::max()) * 255);
    });

    gil::write_view(argv[3], output, gil::png_tag{});

    gil::transform_pixels(dy, output, [](const gil::gray16s_pixel_t& dx_pixel) {
      double value = dx_pixel[0];
      return gil::gray8_pixel_t((value - std::numeric_limits<std::int16_t>::min()) / (std::numeric_limits<std::int16_t>::max()) * 255);
    });
    gil::write_view(argv[4], output, gil::png_tag{});
}

The main problem now is that the values are incorrectly scaled. I bet opencv uses saturate_cast for this, but I'm not sure. I will investigate tomorrow.

@Sayan-Chaudhuri
Copy link

Thanks @simmplecoder I shall look into it

@Sayan-Chaudhuri
Copy link

@simmplecoder I wanted to confirm that sobel and scharr currently dont work for multichannel images right?

@simmplecoder
Copy link
Contributor Author

simmplecoder commented Apr 8, 2021 via email

@Sayan-Chaudhuri
Copy link

@simmplecoder I think for proper scaling we should be using this formula
NewValue = (((OldValue - OldMin) * (NewMax - NewMin)) / (OldMax - OldMin)) + NewMin right??
In transform pixels function , I think you have used this formula but you have given the denominator wrong

@Sayan-Chaudhuri
Copy link

@simmplecoder Also,how does the convolution operator in boost deal with pixels at boundaries?

@mloskot mloskot mentioned this pull request Jan 31, 2023
3 tasks
@cgringmuth
Copy link
Contributor

cgringmuth commented Feb 1, 2023

@Sayan-Chaudhuri @meshtag unfortunately I didn't get time to write saturate cast, but you can tweak the equation inside transform_pixels below:

#include <boost/gil/image_processing/numeric.hpp>
#include <boost/gil/extension/io/png.hpp>
#include <boost/gil/extension/numeric/convolve.hpp>
#include <string>
#include <iostream>

namespace gil = boost::gil;

int main(int argc, char* argv[])
{
    if (argc != 5)
    {
        std::cerr << "usage: " << argv[0] << ": <input.png> <sobel|scharr> <output-x.png> <output-y.png>\n";
        return -1;
    }

    gil::gray8_image_t input_image;
    gil::read_image(argv[1], input_image, gil::png_tag{});
    auto input = gil::view(input_image);
    auto filter_type = std::string(argv[2]);

    gil::gray16s_image_t dx_image(input_image.dimensions());
    auto dx = gil::view(dx_image);
    gil::gray16s_image_t dy_image(input_image.dimensions());
    auto dy = gil::view(dy_image);

    auto signed_input = gil::color_converted_view<gil::gray8s_pixel_t>(input);
    if (filter_type == "sobel")
    {
        gil::detail::convolve_2d(signed_input, gil::generate_dx_sobel(1), dx);
        gil::detail::convolve_2d(signed_input, gil::generate_dy_sobel(1), dy);
    }
    else if (filter_type == "scharr")
    {
        gil::detail::convolve_2d(signed_input, gil::generate_dx_scharr(1), dx);
        gil::detail::convolve_2d(signed_input, gil::generate_dy_scharr(1), dy);
    }
    else
    {
        std::cerr << "unrecognized gradient filter type. Must be either sobel or scharr\n";
        return -1;
    }

    gil::gray8_image_t output_image(input_image.dimensions());
    auto output = gil::view(output_image);

    gil::transform_pixels(dx, output, [](const gil::gray16s_pixel_t& dx_pixel) {
        double value = dx_pixel[0];
        return gil::gray8_pixel_t((value - std::numeric_limits<std::int16_t>::min()) / (std::numeric_limits<std::int16_t>::max()) * 255);
    });

    gil::write_view(argv[3], output, gil::png_tag{});

    gil::transform_pixels(dy, output, [](const gil::gray16s_pixel_t& dx_pixel) {
      double value = dx_pixel[0];
      return gil::gray8_pixel_t((value - std::numeric_limits<std::int16_t>::min()) / (std::numeric_limits<std::int16_t>::max()) * 255);
    });
    gil::write_view(argv[4], output, gil::png_tag{});
}

The main problem now is that the values are incorrectly scaled. I bet opencv uses saturate_cast for this, but I'm not sure. I will investigate tomorrow.

I think your scaling is not correct. If you move values such that min is 0 then you just cannot scale by max. Also I think it is better to scale based on the range of image instead of the range of pixel type [-32768, 32767]. Imagine your range of your gradient is [-10, 10]. You would barely see them if you subtract -32768 and divide by 32767 . But if you map -10 -> 0 and 10 -> 255 then you can see them clearly. So my idea was to stretch the range of your gradients to max range of output type.

Here is mine version to write to png files

template<typename InputView>
void write_view(const InputView& img, const std::string& filename) {
  gil::gray8_image_t output_image(img.dimensions());
  auto output = gil::view(output_image);

  const auto [minPix, maxPix] = std::minmax_element(img.begin(), img.end());
  const auto minVal = minPix[0];
  const auto maxVal = maxPix[0];
  gil::transform_pixels(img, output, [&minVal, &maxVal](const auto& pix) {
      double value = pix[0];
      return gil::gray8_pixel_t((value - minVal) / (maxVal - minVal) * 255);
  });

  gil::write_view(filename, output, gil::png_tag{});
}

Now you have nice gradients (this image has been used):

dx dy

However, the dx and dy is still switched because of a bug. Please see this PR #723.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cat/feature New feature or functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants