Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Are the allocations from imfilter! expected? #76

Open
davidbp opened this issue Sep 10, 2018 · 8 comments
Open

Are the allocations from imfilter! expected? #76

davidbp opened this issue Sep 10, 2018 · 8 comments

Comments

@davidbp
Copy link

davidbp commented Sep 10, 2018

Hello,

I was trying to use imfilter! with an array gx preallocated before calling the function with the hope of not allocating memory. I see 2.31 MiB allocated. Why is it happening? (should I expect it? I did not expect any allocations if the output array was preallocated)

thank you

A minimal working example:

using Images
using TestImages
using BenchmarkTools

function to_gray(image)
    x = Array{ColorTypes.Gray{Float32}, 2}(image)
    return Array{Float32}(x)
end

img = testimage("mandrill");
img = to_gray(img)
direction_x =  centered([-1 0 1])
gx = imfilter(img, direction_x);

@benchmark imfilter($img, $direction_x)
@benchmark imfilter!($gx, $img, $direction_x)

Prints the following:

BenchmarkTools.Trial: 
  memory estimate:  3.31 MiB
  allocs estimate:  58
  --------------
  minimum time:     4.126 ms (0.00% GC)
  median time:      4.597 ms (0.00% GC)
  mean time:        5.200 ms (7.27% GC)
  maximum time:     10.802 ms (28.68% GC)
  --------------
  samples:          960
  evals/sample:     1


BenchmarkTools.Trial: 
  memory estimate:  2.31 MiB
  allocs estimate:  56
  --------------
  minimum time:     4.101 ms (0.00% GC)
  median time:      4.419 ms (0.00% GC)
  mean time:        4.839 ms (5.47% GC)
  maximum time:     10.787 ms (25.44% GC)
  --------------
  samples:          1032
  evals/sample:     1
@timholy
Copy link
Member

timholy commented Sep 10, 2018

It's due to padding to handle the boundary conditions. At last check it was faster to copy the entire array than it was to exploit the otherwise-clever "interior vs exterior" logic that I built into this package:

# # An optimized case that performs only "virtual padding"
# function imfilter!{S,T,N,A<:Union{FIR,FIRTiled}}(r::AbstractCPU{A},
# out::AbstractArray{S,N},
# img::AbstractArray{T,N},
# kernel::ProcessedKernel,
# border::Pad{0})
# # The fast path: handle the points that don't need padding
# iinds = map(intersect, interior(img, kernel), axes(out))
# imfilter!(r, out, img, kernel, NoPad(border), iinds)
# # The not-so-fast path: handle the edges
# # TODO: when the kernel is factored, move this logic in to each factor
# # This is especially important for bigger kernels, where the product pkernel is larger
# padded = view(img, padindices(img, border(kernel))...)
# pkernel = kernelconv(kernel...)
# _imfilter_iter!(r, out, padded, pkernel, EdgeIterator(axes(out), iinds))
# end
Perhaps you could try uncommenting it and seeing how it shakes out on 0.7/1.0? (The optimizer is quite different now, it may be better to go back to the clever solution.)

A possible detail is that you might need to be careful about your kernel. If you still discover allocations try something like this:

kernel = (ImageFiltering.ReshapedOneD{2,1}(centered([-1,0,1])),)

If you find that necessary but opaque, I can explain later.

@timholy
Copy link
Member

timholy commented Sep 10, 2018

Oh, BTW, also consider img = Gray{Float32}.(testimage("mandrill")) instead of defining to_gray.

@davidbp
Copy link
Author

davidbp commented Sep 10, 2018

I have no idea why but calling the function with the kernel you wrote
kernel = (ImageFiltering.ReshapedOneD{2,1}(centered([-1,0,1])),)

is about 5 times faster

@benchmark imfilter!(gx, img, kernel)
BenchmarkTools.Trial: 
  memory estimate:  2.02 MiB
  allocs estimate:  21
  --------------
  minimum time:     755.484 μs (0.00% GC)
  median time:      881.723 μs (0.00% GC)
  mean time:        1.020 ms (12.63% GC)
  maximum time:     3.054 ms (58.79% GC)
  --------------
  samples:          4867
  evals/sample:     1

Additional comments

  • If you have a couple of minutes I would like some hints on why is that faster or how can I think about rewriting kernels (is it explained in the documentation of ImageFiltering?).

  • If I try to apply that idea (without knowing what's going on) to compute the gradient in the y direction I get an error...

kernel_y = (ImageFiltering.ReshapedOneD{2,1}(centered([-1,0,1])'),)
#error
DimensionMismatch("must be one dimensional, got 2")
  • In any case we should definitely use this 'trick' in ImageFeatures. It would speed up quite a lot the computation of the HOG which currently does the following:
function create_descriptor(img::AbstractArray{CT, 2}, params::HOG) where CT<:Images.NumberLike
    #compute gradient
    gx = imfilter(img, centered([-1 0 1]))
    gy = imfilter(img, centered([-1 0 1]'))
    mag = hypot.(gx, gy)
    phase = orientation.(gx, gy)

    create_hog_descriptor(mag, phase, params)
end

@davidbp
Copy link
Author

davidbp commented Sep 10, 2018

In case I would like to use imfilter! or imfilter without padding how should I do it?

I have tried an example from the documentation:

k = centered([-1 0 1])
np = NoPad(Pad(:replicate))
imfilter(img, k, np)

but returns

DimensionMismatch("requested indices (1:512, 0:125) and kernel indices (Base.Slice(0:0), Base.Slice(0:0)) do not agree with indices of padded input, (Base.OneTo(512), Base.OneTo(512))")

There are many imfilter! methods in the source code. Which one should I use to avoid padding?

@timholy
Copy link
Member

timholy commented Sep 11, 2018

First some preliminaries: understand that ImageFiltering was "written against" older versions of julia; while it has been ported to 0.7/1.0 it does not take advantage of every aspect of more recent capabilities of Julia. For example, ImageFiltering places a huge premium on type-stability which was critically important in older versions of Julia. This is not as important as it used to be, but until someone spends several days taking a close look at every aspect of ImageFiltering we're stuck with the original design choices. So comments below are definitely relevant, but in some cases may only be a reflection of the current state of ImageFiltering rather than universal truths.

is about 5 times faster

interesting, for me it's about 2x faster. Still a good thing, obviously, but a much smaller improvement. (It's also much slower on my machine, and that may be relevant.)

If you have a couple of minutes I would like some hints on why is that faster or how can I think about rewriting kernels (is it explained in the documentation of ImageFiltering?).

[0 1 0] is a 2d kernel, though the first dimension is of size 1. Filtering is vastly faster for separable 2d kernels than for non-separable kernels, so ImageFiltering goes to the effort to try to discover whether your 2d kernel is separable before doing the actual filtering. The fact that the first dimension is of size 1 means, of course, that it is separable and that you only need to perform operations along dimension 2. But your version doesn't guarantee that at the level of the type system. My version does: a ReshapeOneD means "reshape a 1d vector to be a multidimensional object that is still essentially one-dimensional, but along some arbitrary dimension." In particular, ReshapeOneD{2,1}(v) means turn it into a 2d object with 1 "padding dimension" before the vector, i.e., so that its extent is along dimension 2. (With newer versions of Julia we should special case v' since that's wrapped in Adjoint but that wasn't possible in older versions of Julia so hasn't been done yet.)

When you put ReshapedOneD objects into a tuple you define a filter cascade and ImageFiltering doesn't try to "get smart" with your kernel, it takes it literally. The short-circuiting of the "cleverness" might contribute to some of the speed advantage, but probably not much. There are differences in dispatch that may be significant but perhaps not inevitable. To be honest, the dispatch rules for ImageFiltering are considerably more complicated than ideal because OffsetArrays used to have bad performance so I used a lot of workarounds. Now they don't so things could certainly be simplified.

(is it explained in the documentation of ImageFiltering?)

A few hints of this are in the docs but probably not with sufficient detail. Would love help fixing that, naturally.

If I try to apply that idea (without knowing what's going on) to compute the gradient in the y direction I get an error...

Actually the example I gave was working along dimension 2 (which is actually "horizontal" as viewed by e.g., ImageView). For filtering along the 1st dimension just use kernel = (centered([-1,0,1]),) (i.e., a vector).

In case I would like to use imfilter! or imfilter without padding how should I do it?

Inner should be useful but I see there's a (deliberate?) copy here. To be honest I don't remember why, but it could be important. (Consider deleting it and seeing what happens?) You're right that NoPad should be useful but it may be a bit more complicated because it's really intended only for internal use. You may have discovered a bug but I can't check tonight.

@davidbp
Copy link
Author

davidbp commented Sep 12, 2018

Thank you for your detailed explanation.

  • I understand that [-1 0 1] is a 2d kernel (ndims([-1 0 1]) is 2) but why can't I use [-1,0,1] (ndims([-1,0,1]) is 1) ?

I get the following error:

ArgumentError: Pad{1}(:replicate, (1,), (1,)) lacks the proper padding sizes for an array with 2 dimensions

Is it because there is no way for imfilter to know if the kernel has to be applied in the X or Y axis? (since the image is 2D but the kernel is 1D)

  • I don't see how to use ReshapedOneD to do the same with gy = imfilter(img, centered([-1 0 1]')) even though it should be straightforward.

Actually It's a bit odd you created a tuple when you created the kernel but If I don't do it it does not work.

## works
k2 = (ImageFiltering.ReshapedOneD{2,1}(centered([-1,0,1])),)
imfilter(img, k2)

## does not work
k2 = ImageFiltering.ReshapedOneD{2,1}(centered([-1,0,1]))
imfilter(img, k2)
  • I will try to do some workaround to use imfilter! without alocations with your suggestions.

Thank you for your comments :)

@timholy
Copy link
Member

timholy commented Sep 12, 2018

Looks like you need

kernel = (ImageFiltering.ReshapedOneD{2,0}(centered([-1,0,1])),)

(note this is {2,0} instead of {2,1} like before). I think we probably should make that work properly, but this will get you going.

Pro-tip: to understand what all this means, be aware that ImageFiltering has a lot of internal documentation accessible with ? and module-scoping, e.g.,

help?> ImageFiltering.ReshapedOneD
  ReshapedOneD{N,Npre}(data)

  Return an object of dimensionality N, where data must have dimensionality 1. The axes are 0:0 for the first Npre dimensions, have the axes of data for dimension Npre+1, and are 0:0 for the remaining dimensions.

  data must support eltype and ndims, but does not have to be an AbstractArray.

  ReshapedOneDs allow one to specify a "filtering dimension" for a 1-dimensional filter.

The trickiest part is that ImageFiltering has several sub-modules so you often have to figure out which one you need.

Actually It's a bit odd you created a tuple when you created the kernel but If I don't do it it does not work

Regarding tuples for the kernel, see that link to the documentation about factored kernels and above about the "filter cascade." But we should add methods that support single kernels of custom type.

In principle it shouldn't be necessary to mess around with this level of manual optimization; you're digging into the private interface of ImageFiltering, so expect to have to dive into the code for some explanations.

@yakir12
Copy link

yakir12 commented Jan 22, 2025

I'm sure much has changed since, but just to let anyone landing here know, it is entirely possible to imfilter! with zero allocations:

using ImageFiltering, ComputationalResources, BenchmarkTools

fun1(n, img, buff, kernel) = for i in 1:n
    imfilter!(buff, img, kernel)
end

fun2(n, img, buff, kernel) = for i in 1:n
    imfilter!(CPU1(Algorithm.FIR()), buff, img, kernel, NoPad(),  (40:60, 40:60))
end

sz = (100, 100)
img = rand(sz...)
buff = rand(sz...)
kernel = Kernel.DoG(1)
times = (1, 1000)

bytes1 = map(times) do n
    @ballocated fun1($n, $img, $buff, $kernel) 
end

bytes2 = map(times) do n
    @ballocated fun2($n, $img, $buff, $kernel) 
end

where:

julia> bytes1
(1205280, 1205280000)

julia> bytes2
(0, 0)

(imfilter) pkg> st
Status `~/mwe/imfilter/Project.toml`
  [ed09eef8] ComputationalResources v0.3.2
  [6a3955dd] ImageFiltering v0.7.9

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants