Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implicit use of rounding while using int64 paths / performance of repeated calculations #236

Closed
sponsoredlinks opened this issue Sep 23, 2022 · 8 comments

Comments

@sponsoredlinks
Copy link

Hello, thanks for the excellent library. We are using it in a genetic algorithm that solves using Difference() within its fitness function ie calls it repeatedly tens to hundreds of thousands of times on two sets of contours

In current code we are using int64 paths and points which do not vary often:

Subject is 8 contours of 30-40 points each
Clip is 20-40 paths of 4 points each

During testing I have noticed that there is heavy use of round(); see attached screenshot from VS profiling. I am trying to work out if this is an error in my coding (I can understand that this would happen if using doubles but it seems strange when using int64 natively)

I can't see anything in the issues that references this - I can upload some test code if you like, but was wondering firstly if this was known/expected behaviour and if it can be mitigated

image

ps. Semi-related, there is also some use of malloc/free that I have noticed - is there a way to avoid this for repeated, similar calculations? It seems that it would be nice to avoid dynamic allocation in general for performance reasons

@AngusJohnson
Copy link
Owner

AngusJohnson commented Sep 23, 2022

Hi @nmccouat.

In current code we are using int64 paths and points which do not vary often:

OK. Is it possible that subjects change more than clips?
If so, have you considered union-ing the clips before repeat calls to Difference?
Of course if clips change between differences, then that won't help.

During testing I have noticed that there is heavy use of round();

Indeed.
And removing some of these does make a very big difference in performance, with no significant cost to clipping precision. While only having done brief testing so far, I don't think it's advisable to remove all round() calls.
But the following changes in GetIntersectPoint() in clipper.engine.cpp I'm pretty sure are safe:

	Point64 GetIntersectPoint(const Active& e1, const Active& e2)
	{
          ...
			return (abs(e1.dx) < abs(e2.dx)) ?
				Point64(static_cast<int64_t>((e1.dx * q + b1)),
					static_cast<int64_t>((q))) :
				Point64(static_cast<int64_t>((e2.dx * q + b2)),
					static_cast<int64_t>((q)));
		}
	}

ps. Semi-related, there is also some use of malloc/free that I have noticed

I'm not sure what part of my code you're referring to here.
There are certainly many objects that are allocated on the heap, but I'm not aware of calling malloc/free directly.

Edit:
The changes above (removed rounds) are safe and i will update the library shortly. But this changes expected values in the CI testing which will need to be updated too.

@sponsoredlinks
Copy link
Author

sponsoredlinks commented Sep 23, 2022 via email

@AngusJohnson
Copy link
Owner

AngusJohnson commented Sep 23, 2022

The contours of both clips and subjects are generally non-overlapping - is
that a requirement for them to be unioned?

Yes.
It won't help if they're all discrete.

Edit:
Intersection: Time (secs)

Edge Count Before After
  C++ C++
0 0.00 0.00
1000 0.12 0.11
2000 0.38 0.33
3000 1.00 0.81
4000 2.82 1.99
5000 8.85 4.68
6000 18.10 11.60
7000 36.00 22.00
8000 68.00 44.00

@philstopford
Copy link
Contributor

There might be some benefit for the C# code as well.

@AngusJohnson
Copy link
Owner

There might be some benefit for the C# code as well.

Yep, there is.
I've made these changes to C# and Delphi too (though for Delphi it wasn't just a simply typecast).

@sponsoredlinks
Copy link
Author

Hi Angus, just a quick update on this - there's one more round() that causes a reasonably significant degradation based on my testing -> https://github.com/AngusJohnson/Clipper2/blob/main/CPP/Clipper2Lib/src/clipper.engine.cpp#L149

By the nature of what we're doing, our code will typically succeed even if there is slight imprecision in the rounding, so it's hard to tell if this is actually a valid change for other users. Either way, removing this call to round() results for us in a roughly 25-30% performance improvement for repeated calculations :)

@AngusJohnson
Copy link
Owner

Hi Angus, just a quick update on this - there's one more round() that causes a reasonably significant degradation based on my testing.

Yes, I was surprised I'd left that std::round in having removed most of the others. But on testing this again I now remember why it's still there. Rather than degrading performance this round very substantially improves performance. And you can see this when you run the benchmark test that's part of the ConsoleDemo1 application. Yes, this is just one test and it's probably not even typical of how the library is used. Nevertheless in this test, performance drops by half. So at least for the time being I'm leaving it alone 😜.

@sponsoredlinks
Copy link
Author

sponsoredlinks commented Oct 20, 2022

Hi Angus, I was interested as to what was happening here - the results are indeed a bit counterintuitive - so I did a bit of profiling using your demo

It seems that the issue is in the edges adjacent test loop in ProcessIntersectList - without round() being called, the loop seems to fail to match adjacent edges efficiently

For some reason I couldn't get good granular info when profiling so added the test code below to ProcessIntersectList

Results were as follows - note the super high value for count3, which probably indicates it's incorrectly failing the equality on the first attempt and subsequently having to search inefficiently?

nb. it seems that nearbyint() performs similarly to round in our test application

Output:

With std::round in TopX
50,000,001, 24,798,993, 5,576,662,582, 24,798,993
3.87 secs

Without std::round in TopX:
49,000,001, 24,408,465, 15,713,042,771, 24,408,465
9.80 secs

With nearbyint() instead of round in TopX:
50,000,001, 24,749,958, 5,228,389,583, 24,749,958
3.69 secs

		std::vector<IntersectNode>::iterator node_iter, node_iter2;
		for (node_iter = intersect_nodes_.begin();
			node_iter != intersect_nodes_.end();  ++node_iter)
		{
			static int64_t count1 = 0;
			static int64_t count2 = 0;

			bool case1 = (node_iter->edge1->next_in_ael == node_iter->edge2);
			count1++;

			if(!case1 && !(node_iter->edge1->prev_in_ael == node_iter->edge2))
			{
				count2++;

			//if (!EdgesAdjacentInAEL(*node_iter))
			//{
				node_iter2 = node_iter + 1;
				static int64_t count3 = 0;
				static int64_t count4 = 0;


				/*while(!EdgesAdjacentInAEL(*node_iter2)) ++node_iter2;*/
				while(true)
				{
					bool case2 = (node_iter2->edge1->next_in_ael == node_iter2->edge2);
					count3++;

					//std::cout << case2 << ", " << node_iter2->edge1->next_in_ael << ", " << node_iter2->edge2 << std::endl;

					if(case2 || (node_iter2->edge1->prev_in_ael == node_iter2->edge2))
					{
						count4++;
						break;
					}

					++node_iter2;
				}

				if (count1++ % 1000000 == 0)
				{
					std::cout << count1 << ", " << count2 << ", " << count3 << ", " << count4 << std::endl;
					
				}

				std::swap(*node_iter, *node_iter2);
			}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants