Alignment of multi-camera pointclouds needs work #18

jackjansen opened this issue Jul 12, 2022 · 75 comments

@jackjansen
Contributor

jackjansen commented Jul 12, 2022

We need to fix the alignment once and for all.

The original issue is https://gitlab.com/VRTogether_EU/cwipc/cwipc_util/-/issues/41 but this link is dead now.

There is an old repo for experiments at https://github.com/cwi-dis/pointclouds-alignment

New experiment data is at https://github.com/cwi-dis/cwipc_test/tree/master/pointcloud-registration-test

Edit 20-Nov-2023: most of the text above is outdated. Kept for future reference.

Current plan of attack:

  • Create an algorithm that returns the approximate misalignment of each camera (as a distance in meters)
  • Implement a few of the alignment algorithms in a way that they can be run automatically
  • Create a registration algorithm that does something like:
    • Given the misalignment from step 1, run one or more of the alignment algorithms.
    • Re-measure the misalignment. Check that it has gone down.
    • Repeat until happy, or until there are no more improvements.
    • Record the results in cameraconfig.json. Also record the current misalignment there, because it is a good value for voxelization (later, during production).
  • After all this is done look at the other registration ideas @Silvia024 found, and possibly also that algorithm that @nachoreimat found.
  • After all this is done try to automate the coarse registration with Aruco codes
  • After all that is done see whether we can use multiple Aruco codes to handle a large number of cameras with fields of view that don't fully overlap.
  • If (big if) the registration procedure works with point clouds of people (as opposed to only working with point clouds of boxes, or not working at all) we should try applying it to the cwipc-sxr captures, to see if we can get better alignment.
  • In parallel to all of the steps above we should think about whether there's a paper in here somewhere, and who should be the primary person to lead this.
@jackjansen
Contributor Author

Current best procedure (copied from Slack):

I think I've just done the best calibration we've ever done. Keeping note here so we can try again (in other words: @shirsub can try this procedure in vrsmall).

  1. Use 4 cameras, not 3.
  2. First do the coarse calibration cwipc_calibrate --kinect --nofine with the A4 sheet on the floor.
  3. Edit cameraconfig, set correct high/low, near/far, radius, erosion=2
  4. Do cwipc_view --kinect with nothing in view, and ensure there are no points. If there are points edit cameraconfig to fix.
  5. At the centerpoint stack 3 IKEA boxes so each camera sees 2 sides (and each side is seen by two cameras)
  6. cwipc_view --kinect and w to save a pointcloud.
  7. cwipc_calibrate --kinect --reuse --nocoarse --nograb pointcloud_xxxxxxxxx.ply --corr 0.05
  8. (The 0.05 is how far the per-camera pointclouds are apart in the grabbed image. For many years I picked lower numbers here all the time, but the trick is you should pick higher numbers.)
  9. Pick algorithm 6 (cumulative multiscale ICP), variant 2 (point2plane)
  10. Inspect result, after - a few times to make the points smaller. If you look at the stack of boxes from above you get a very good idea of the quality.
  11. The way the algorithm works at the moment means (I think) that it will be happy after doing 3 steps, so with the new calibration I did another grab and calibrated with --corr 0.02, which gave me a pretty good result (at least I think so).

@jackjansen
Contributor Author

I think we should start looking at opencv to help with calibration. For example, we could use an Aruco marker (see https://docs.opencv.org/4.x/d5/dae/tutorial_aruco_detection.html) to help with detection of (0, 0, 0) and do the coarse calibration automatically.
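
A minimal sketch of what that detection step could look like, assuming OpenCV ≥ 4.7 and its cv2.aruco module (the aruco API changed between OpenCV releases); the dictionary choice and the find_origin_marker helper are illustrative, not existing cwipc code:

```python
# Hypothetical sketch: find the origin ArUco marker in one camera's RGB frame.
import cv2
import numpy as np

def find_origin_marker(rgb_image: np.ndarray, marker_id: int = 0):
    """Return the (u, v) pixel corners of the requested marker, or None if it is not visible."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)  # assumed dictionary
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _rejected = detector.detectMarkers(rgb_image)
    if ids is None:
        return None
    for marker_corners, found_id in zip(corners, ids.flatten()):
        if found_id == marker_id:
            return marker_corners.reshape(4, 2)
    return None
```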

This also gives us camera positions, so we can then determine the camera ordering for the next step (which pairs of cameras we should align).

We could then capture a 3D object (maybe the boxes, but possibly a person is good enough). I have the feeling that we could use the histogram of point-to-point distances to determine how well the coarse calibration worked: I'm expecting a plateau on the left of the histogram, then a drop, then a very long tail. Somewhere along the drop would seem to be a good value for the --corr parameter.

We then use cumulative multi scale ICP or something, and after that recompute the histogram. corr should now be lower.

Then I think we repeat with the other camera pairs.

Question is what to do after all pairs have been done, because possibly the last calibration has broken the first one.

Another option could be to first compute the corr for each pair, and use that to determine the order of doing the aligning, but I'm not sure whether "best first" or "worst first" is the best option. I think "best first" but I'm not sure.

But I do think that after we've finished calibration we should repeat the calculation of corr, because this will give us the optimal cell size for voxelization.

@fonskuijk I would really like your opinion on this idea...

@jackjansen
Contributor Author

jackjansen commented Nov 12, 2023

Edit 20-Nov-2023: most of the information in this comment is now outdated. See below. Keeping it here for future historic reference.

Okay, let's use this issue to keep note of what we've done (and found out) so far.

  • There are 6 pointclouds in cwipc_test that have been grabbed in vrbig and vrsmall. One of the stack of boxes, one of Jack facing the front camera and one of Jack facing between the front camera and the next camera.
  • From visual inspection the registration in vrbig looks pretty good, the registration in vrsmall has one camera off by a few centimeters.
  • There's a script sandbox/voxelize_curve.py (now badly named) that voxelizes a point cloud with ever decreasing voxelsizes. It creates a CSV file with an entry per voxelsize, showing how many points there are, how many points with 1 contributing camera, how many points with 2 contributing cameras, etc.
  • Each voxelized point cloud is also saved.
  • The script has been run on the two boxes pointclouds, after colorizing them per camera, i.e. after running --filter 'colorize(1,"camera")' on them.
  • The resulting CSV files can be imported into voxelize_curve_spreadsheet.numbers to graph them.
  • (note that various fixes were made to cwipc_view and cwipc_downsample() to make this all work somewhat)

The results are difficult to interpret. I was expecting to see some sort of a threshold effect between cellsize > epsilon and cellsize < epsilon, where epsilon is how good the current registration is (in other words: when the cellsize is so large that points from camera 1 and camera 2 would be merged I would expect a super-linear drop in the point count).

This doesn't happen. Except for very small and very large cell sizes I'm seeing a more-or-less linear decrease in the number of points (of approximately a factor of 2, as is to be expected since I'm using sqrt(0.5) as the factor between cell sizes).

point-count-factors-per-cellsize

Next I decided to look at the number of contributing cameras for each result point (after voxelizing) at each cellsize. Again, I was expecting to see something interesting at the boundary of cellsize == epsilon: something like a sudden increase in the number of points with two contributing cameras (at the expense of the number of points with one contributing camera).

But again no such luck: things look vaguely linear. Moreover, even at small cellsizes there are quite a few points that already have three contributing cameras. This should not be possible with the boxes, I think: the cameras and the boxes were positioned such that each side could really only be seen by two cameras.

n-camera-contributions-cumulative

Either I am making a stupid mistake in my thinking, or there is another bug in the way cwipc_downsample() works. I will investigate the latter first.

Edit: there was indeed another stupid bug in cwipc_downsample() which led to sometimes too many contributing cameras being specified in a point.

Here is the correct version of the previous graph:
Screenshot 2023-11-15 at 17 01 50

@jackjansen
Contributor Author

Came up with a wild plan (thanks @Silvia024 !).

What if we take a synthetic point cloud, and then create 4 point clouds from that (North, East, South, West), with each of the new point clouds having half the points of the original (basically the ones that can be seen from that direction).

Now we can slightly perturb each of the NESW point clouds, and recombine them into a single "4-camera" point cloud.

This we can now present to voxelize_curve.py and see what it reports. And we know the ground truth, because we have done the perturbations.

So this should allow us to see what the magic numbers are that we are looking for.

Moreover, this scheme (or variations of it, for example introducing some noise) could also be used to do automatic testing of our point cloud registration algorithms (once we have written them:-).
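
A rough sketch of this generator idea, assuming the synthetic cloud is an (N, 3) NumPy array; the half-space split and the direction names are simplifications for illustration, not the actual cwipc_test code:

```python
# Each virtual camera keeps the half of the points facing it; two cameras get a
# known rigid offset so the ground-truth registration error is known.
import numpy as np

DIRECTIONS = {                       # unit view directions, y is up
    "north": np.array([0.0, 0.0, 1.0]),
    "east":  np.array([1.0, 0.0, 0.0]),
    "south": np.array([0.0, 0.0, -1.0]),
    "west":  np.array([-1.0, 0.0, 0.0]),
}

def make_four_camera_clouds(points: np.ndarray, error: float = 0.03):
    """Return {camera_name: (M, 3) cloud}; 'east' and 'west' are offset by `error` meters."""
    centered = points - points.mean(axis=0)
    clouds = {}
    for name, direction in DIRECTIONS.items():
        visible = points[centered @ direction > 0.0]   # the half facing this camera
        if name in ("east", "west"):                   # perturb two of the four cameras
            visible = visible + error * direction
        clouds[name] = visible
    return clouds
```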

@jackjansen
Contributor Author

That was a bit of a disappointment, so far. Or actually maybe it just showed our erroneous thinking:-)

Implemented the generation of registration-test pointclouds, with a settable offset for cameras 2 and 4 (cameras 1 and 3 remain correct).

Created pointclouds with an error of 0.03 and 0.003.

The results are in https://github.com/cwi-dis/cwipc_test/tree/master/pointcloud-registration-test
but the quick message is that the curves still look pretty linear. There is a clear "minimum voxelsize" where points from multiple cameras start to be combined, but this "minimum voxelsize" is quite a bit smaller than the pre-determined error with which the point clouds were generated.

I could say something (based on the data for these pointclouds) like: "As soon as more than 15% of the points come from two cameras our voxelsize is larger than the error" but this would just be data-fitting.

@Silvia024 can you have a look at the data?

@jackjansen
Contributor Author

jackjansen commented Nov 14, 2023

Another wild idea. Comments please.

For each voxelsize we voxelize the whole pointcloud, just as we do now.
But we also take the 4 per-camera pointclouds and voxelize those individually. We add up the point counts of those clouds.

We now compare the two numbers. This comparison is a measure for how much camera-combining has happened.

Edit 20-Nov-2023: We decided not to go down this path, because we dropped the idea of using voxelization to compute alignment quality.

@jackjansen
Contributor Author

jackjansen commented Nov 14, 2023

And yet another wild idea. We are now doing this on the whole pointcloud, maybe that is a bad idea because it confuses our numbers.

What if we construct the 6 pairwise point clouds (cam1+2, cam1+3, etc) and run this analysis separately on each pairwise pointcloud?

Edit: this did not work. I experimented with it by creating the generated point clouds (0.03 and 0.003) with only two cameras (dropping the other two). The curves are just as meaningless as for the full 4-camera cloud.

I am pretty sure that the same is true for the previous wild idea.

@Silvia024

Silvia024 commented Nov 14, 2023

maybe tomorrow we can chat better about this?

btw I just found this: IntelRealSense/librealsense#10795
and it seems quite relevant and it might be of interest. I will try to read it tonight

@Silvia024

I found two interesting papers...

@jackjansen
Contributor Author

jackjansen commented Nov 15, 2023

Updated two previous comments (the two wild ideas): they turned out not to work. I have also experimented with changing the "next cellsize factor" from sqrt(0.5) to sqrt(sqrt(0.5)) but there is no appreciable bump around the known epsilon with which the point clouds are created.

I now think that the whole idea of using voxelization at successively smaller sizes and looking for a discontinuity simply doesn't work.

To come up with a theory: voxelization is essentially a stochastic procedure, because it depends on the origin of the grid and the individual original points have no relationship to that grid. So whether or not two points are combined when they are less than cellsize apart is a matter of chance: if cellsize > 2*distance they are definitely combined, if cellsize < distance/sqrt(3) they are definitely not, but the area in between is grey. @Silvia024 does this sound reasonable?

@Silvia024 @troeggla please up-vote this comment if you agree that voxelization is a dead end. Then I can remove all the generated data and spreadsheets over in cwipc_test/pointcloud-registration-test (which is quite a lot).

@jackjansen
Contributor Author

My suggestion for the next try is to compute, for each point in cam1cloud, the minimum distance to any point in cam2cloud. The first thing to look at is the histogram of these distances (April 9 comment). I hope we would see a very large peak just below epsilon with a very long tail (all the cam1 points that have no corresponding cam2 point).

I think the KD-Tree is the data structure we want. It appears to be supported in SciPy; example code is in https://medium.com/@OttoYu/exploring-kd-tree-in-point-cloud-c9c767095923
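
A minimal sketch of that per-point minimum-distance computation with SciPy's KD-tree, assuming the two per-camera clouds are available as (N, 3) NumPy arrays (the helper name is made up; matplotlib is only used to illustrate the histogram):

```python
import numpy as np
from scipy.spatial import cKDTree

def nearest_distances(cam1_points: np.ndarray, cam2_points: np.ndarray) -> np.ndarray:
    """For every point of camera 1, the distance to its nearest camera-2 point."""
    tree = cKDTree(cam2_points)                 # build the KD-tree once on the target cloud
    distances, _indices = tree.query(cam1_points, k=1)
    return distances

# Example usage:
#   import matplotlib.pyplot as plt
#   plt.hist(nearest_distances(cloud1_xyz, cloud2_xyz), bins=200)
#   plt.show()
```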

@jackjansen
Contributor Author

The idea of using a KDTree and plotting a histogram of the minimum distances looks very promising.

There's a Jupyter notebook in cwipc_test, I'm going to convert it to a script.

@Silvia024

> Updated two previous comments (the two wild ideas): they turned out not to work. I have also experimented with changing the "next cellsize factor" from sqrt(0.5) to sqrt(sqrt(0.5)) but there is no appreciable bump around the known epsilon with which the point clouds are created.
>
> I now think that the whole idea of using voxelization at successively smaller sizes and looking for a discontinuity simply doesn't work.
>
> To come up with a theory: voxelization is essentially a stochastic procedure, because it depends on the origin of the grid and the individual original points have no relationship to that grid. So whether or not two points are combined when they are less than cellsize apart is a matter of chance: if cellsize > 2*distance they are definitely combined, if cellsize < distance/sqrt(3) they are definitely not, but the area in between is grey. @Silvia024 does this sound reasonable?
>
> @Silvia024 @troeggla please up-vote this comment if you agree that voxelization is a dead end. Then I can remove all the generated data and spreadsheets over in cwipc_test/pointcloud-registration-test (which is quite a lot).

the theory seems correct to me :)
agree that maybe it's not a good strategy

@jackjansen
Contributor Author

I'm getting beautiful graphs, plotting (per camera pair) the cumulative distance-to-other-cloud.

cumdist

@jackjansen
Contributor Author

All the graphs are now in cwipc_test. They look mostly usable. I'm going to change the Y axis from being counts to being fractions, that should make them even easier to interpret (I hope).

@jackjansen
Contributor Author

jackjansen commented Nov 19, 2023

For reference in further discussion below: here is the current state of the graph for the boxes in vrsmall (with one camera about 1cm off):

boxes_cumdist

And here is one of the Jack point clouds (the best one, the sideways one):

jack-sideways_cumdist

And here is the one for the generated point cloud, with a 0.03 m offset of two of the cameras:

genregtest03_cumdist

@jackjansen
Contributor Author

The first thing I would like to do is convert each of these lines into a single number, the correspondence. This number should be an upper bound for the registration error between the two cameras in that pair, for the given capture. If we are pretty sure the cameras have no significant overlap in field of view (pairs (0, 3) and (1, 2) in the graphs above) we return an invalid value (NaN or None or something like that).

Once we have this number we can do two things with it:

  • Use it as the --corr parameter to the various registration algorithms we have
  • After applying an algorithm we can recompute this correspondence number and ascertain it is lower than the previous value.

@jackjansen
Contributor Author

Idea: we should look at the derivative of these lines (which is actually the distance histogram data) and the second derivative. Possibly after smoothing the lines a little.

The second derivative should start positive, then go negative, then go positive again. Maybe that last point is the number we're interested in.
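
A sketch of that derivative idea, assuming the cumulative curve is available as per-bin counts with a known bin width (the helper name and the smoothing window are arbitrary choices, not existing code):

```python
import numpy as np

def second_derivative_crossing(cumulative_counts: np.ndarray, bin_width: float,
                               smooth_window: int = 5):
    """Distance at which the (smoothed) second derivative goes from negative back to positive."""
    kernel = np.ones(smooth_window) / smooth_window
    smoothed = np.convolve(cumulative_counts, kernel, mode="same")   # light smoothing
    first = np.gradient(smoothed, bin_width)     # roughly the histogram itself
    second = np.gradient(first, bin_width)
    for i in range(1, len(second)):
        if second[i - 1] < 0.0 <= second[i]:
            return i * bin_width
    return None                                  # no negative-to-positive crossing found
```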

@jackjansen
Contributor Author

jackjansen commented Nov 20, 2023

Unrelated idea (@Silvia024 comments please): we are currently looking at the symmetric combined difference between camera pairs.

Would it be better to do a per-camera comparison to all other cameras combined? I.e. not compute and graph the distance per camera pair, but instead look at the distance of the points of camera N to the points of all other cameras together?

That should extend the "tail" of the histogram, probably making it more linear-ish. Also, the resulting correspondence number should be more useful, because it is a number that pertains to a single camera (instead of to a pair of cameras).

@jackjansen
Contributor Author

> Would it be better to do a per-camera comparison to all other cameras combined?

Implemented, and it seems to produce much better results. So much better that we need to check that we are not looking at bullshit data.

Here are the graphs (for single-camera-to-all-others) that are for the same dataset as the pair graphs of yesterday:

boxes_cumdist_one2all

jack-sideways_cumdist_one2all

genregtest03_cumdist_one2all

@jackjansen
Contributor Author

And here are the histogram graphs of the same datasets:

boxes_histogram_one2all

jack-sideways_histogram_one2all

genregtest03_histogram_one2all

@Silvia024

> The original issue is https://gitlab.com/VRTogether_EU/cwipc/cwipc_util/-/issues/41

@jackjansen, I can't access this link. It says page not found. Maybe I don't have the right to access it?

@jackjansen
Contributor Author

> The original issue is https://gitlab.com/VRTogether_EU/cwipc/cwipc_util/-/issues/41
>
> @jackjansen, I can't access this link. It says page not found. Maybe I don't have the right to access it?

That link is dead (the gitlab site disappeared when we stopped paying for it).

@jackjansen
Contributor Author

Getting started with a guess at the correspondence (when the histogram is "over the hill", and has descended about half way down). Looks good for the generated point clouds, and usable for the captured ones (in the right ballpark).

boxes_histogram_one2all

jack-sideways_histogram_one2all

genregtest03_histogram_one2all
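
For reference, a sketch of this "over the hill, about half way down" guess as code (the helper name and bin count are arbitrary; this is the heuristic described above, not the exact cwipc implementation):

```python
import numpy as np

def guess_correspondence(distances: np.ndarray, bins: int = 200) -> float:
    """Find the histogram peak, then walk right until the count drops below half the peak."""
    counts, edges = np.histogram(distances, bins=bins)
    peak_index = int(np.argmax(counts))
    for i in range(peak_index, len(counts)):
        if counts[i] < 0.5 * counts[peak_index]:
            return float(edges[i])               # left edge of the first "half-way down" bin
    return float(edges[-1])                      # fallback: the counts never dropped that far
```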

@jackjansen
Contributor Author

Paused working on the aruco markers for now.

Implemented a manual coarse calibration algorithm (basically the same as what we have: select the four coloured corners of the calibration target).

This works, the resulting pointcloud and the correspondence graph are in pointcloud-registration-test/vrsmall-noreg/capture.

Here is the graph:

pointcloud-0001

The correspondence errors are too optimistic, but visual inspection of the graph shows that something like 0.05 is probably the correct number.

But: these pointclouds are so big (500K points per camera) that the analysis is very expensive: it takes about a minute on my M1 Mac, and I didn't wait for it on the Intel Mac. That's a shame, because for our common use case (capturing in a room where the walls are close enough to be captured by the cameras) this would actually be a very useful test capture...

@jackjansen
Contributor Author

In case we decide to go for detecting the Aruco markers in the RGB image and then finding the corresponding points in the point cloud: here are some interesting links about how the conversion of image pixel coordinates to 3D points can be done using librealsense API's:

IntelRealSense/librealsense#8221
https://medium.com/@yasuhirachiba/converting-2d-image-coordinates-to-3d-coordinates-using-ros-intel-realsense-d435-kinect-88621e8e733a
IntelRealSense/librealsense#11031
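
A sketch of that deprojection with pyrealsense2, assuming the depth frame has been aligned to the color frame (for example with rs.align(rs.stream.color)) so that the marker's (u, v) pixel indexes the depth image directly:

```python
import pyrealsense2 as rs

def pixel_to_point(depth_frame, u: int, v: int):
    """Map an RGB pixel (u, v) plus its depth to an (x, y, z) point in meters."""
    depth = depth_frame.get_distance(u, v)       # depth at that pixel, in meters
    intrinsics = depth_frame.profile.as_video_stream_profile().intrinsics
    return rs.rs2_deproject_pixel_to_point(intrinsics, [u, v], depth)
```

Note that the result is still in that camera's own coordinate system; the per-camera extrinsics from cameraconfig are then needed to map it into the shared coordinate system.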

@jackjansen
Contributor Author

Back to fine registration.

There is a thinking error in the idea that only looking at the correspondence between each (camera-N, all-cameras-except-N) is good enough.

This is shown clearly in a new dataset offline-boxes. See there for the full details, but here is a screenshot of the calibration it managed to come up with:

Screenshot 2024-01-01 at 23 55 34

Visually, it is clear what problem 1 is: the green and blue cameras together (or the red and brown cameras together) need to be moved to make things better. But looking at each camera in isolation is never going to work: if we moved the green camera to better align with the red one we would completely lose the alignment with the blue camera.

There is also a problem 2 with the analysis. Here is what our code thinks of the alignment it came up with:

captured-boxes-3_histogram_after_step_6

Those numbers (14 mm to 18 mm) are far too optimistic.

For problem 1 a potential solution is to not only look at correspondences between (camera-N, all-cameras-except-N) but also (camera-N-and-M, all-cameras-except-N-and-M) for every combination (N, M) where those two cameras are "closest to each other". If an N-and-M camera combination ends up as the candidate-to-fix we would try fixing that combined "camera" and apply the transformation to both cameras.

But this feels like a hacker-jack solution: it seems like it might fix this particular problem, but I'm not sure it is really a solution rather than a quick hack.

I have the feeling that tackling problem 2 first may be better.

Ideas, anyone?

@jackjansen
Contributor Author

Some progress with coarse calibration.

I'm manually positioning the pointcloud in the open3d viewer so that the virtual camera is approximately where the physical camera was. I then grab the RGB and D images.

I can now detect the aruco markers in the RGB image.

But this is where the good news stops: the next step is converting the u, v coordinates that the aruco detector returns (coordinates in the RGB image) back to x, y, z coordinates in the point cloud.

I think I'm doing all the right things: I'm taking the depth from D[v, u] and using the camera intrinsics to de-project, and then I'm transforming with the extrinsic matrix.

But I always end up with weird origins. I've tried transposing the extrinsic matrix, I've tried mirroring x, or mirroring y and z (which was suggested somewhere) but nothing seems to work.

@jackjansen
Contributor Author

See isl-org/Open3D#6508 for a decent description of the open3d coordinate system (RH y-UP) and its idiosyncrasies with respect to lookat and forward.

@jackjansen
Contributor Author

The problem may be that I "invert" the affine transform (the extrinsic matrix) by transposing it. That would work for a normal 3x3 matrix in our case (because we know all the matrices are size and shape preserving), but of course it doesn't work for an affine transform: I have to use another vector as the fourth column (and clear the fourth row).

Here is an explanation: http://negativeprobability.blogspot.com/2011/11/affine-transformations-and-their.html

@jackjansen
Contributor Author

And I found https://stackoverflow.com/questions/2624422/efficient-4x4-matrix-inverse-affine-transform which says a similar thing. Also, looking at the source code for open3d PointCloud.create_from_depth_image, it seems that they are indeed doing this, but they have the Eigen Affine3d transform, which has an inverse. I guess I'll have to create that by hand.
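
A small sketch of that by-hand inverse: for a rigid 4x4 extrinsic of the form [R | t; 0 1] the inverse is [Rᵀ | -Rᵀt; 0 1], which is not the same as the plain transpose:

```python
import numpy as np

def invert_rigid_transform(matrix: np.ndarray) -> np.ndarray:
    """Invert a 4x4 rotation+translation matrix without a general matrix inverse."""
    rotation = matrix[:3, :3]
    translation = matrix[:3, 3]
    inverse = np.eye(4)
    inverse[:3, :3] = rotation.T                 # inverse of a rotation is its transpose
    inverse[:3, 3] = -rotation.T @ translation   # the "other vector as the fourth column"
    return inverse
```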

@jackjansen
Contributor Author

Automatic coarse calibration based on Aruco markers is working. It's actually working so well that I have merged it back into master, at a1b2855, so this can now be considered production-ready.

@jackjansen
Contributor Author

Multi-marker coarse alignment (for camera setups where not all cameras can see the (0, 0, 0) origin aruco marker, so we use auxiliary markers) is also working well enough that I've merged it into master, at b923190.

@jackjansen
Contributor Author

So, back to fine calibration or actually first to the analysis of the current calibration.

I'm working with the offline-boxes capture, because that shows the problem most clearly. I've created plots for all analysers that we have (one2all, one2all-filtered, one2all-reverse, one2all-reverse-filtered and pairwise). All the graphs are in cwipc_test.

The most informative graph for this dataset (but note the italics) is the pairwise graph:

captured-boxes-3_histogram_paired

The "camera numbers" here are the OR of the two contributing camera numbers. As humans we can easily see that camera 1 is opposite camera 4 and camera 2 is opposite camera 8. And I also know this is correct, because I know that the cameras are placed in the order 1-2-4-8 clockwise. We can probably detect this algorithmically if we want.

As humans we can estimate the correspondences of the pairs:

  • 1 to 2 (red): about 2cm
  • 2 to 4 (olive): about 4mm
  • 4 to 8 (turquoise): at least 4cm
  • 8 to 1 (dark blue): less than 4mm

We can also see that the correspondence errors that the current "algorithm" (but really "quick hack" is a better term) has come up with are wildly wrong. Not surprising: the current "algorithm" works by finding the peak in the histogram and then moving right until the count drops below 0.5*peak.

I will experiment with mean and stddev to see if I can get some more decent numbers. Then, if they work for this dataset, they should also be tried for the captured Jack dataset.

@jackjansen
Contributor Author

jackjansen commented Jan 14, 2024

Mean and stddev by themselves are not going to work. Here are the results for the graph above:

camera 3: mean=0.02291982490598397, std=0.024645493357004753, peak=0.002320735082210554, corr=0.007816463821634046
camera 5: mean=0.04401491186517467, std=0.028213745113280272, peak=0.012704797435346417, corr=0.058097526853423245
camera 9: mean=0.016755202242697633, std=0.026204388020447566, peak=0.0018051266662951993, corr=0.002804512231245586
camera 6: mean=0.015378887548555181, std=0.023343767740899458, peak=0.001824420385722404, corr=0.003573071088354493
camera 10: mean=0.048489777693837444, std=0.028316377093057312, peak=0.0021639993720285154, corr=0.08227788490063927
camera 12: mean=0.0341007288961789, std=0.023926849570313418, peak=0.01910221049639875, corr=0.04007440525818792

The mean for the "good pairs" (6 and 9) is far too high.

And that is pretty logical, when you think about it: the long tails have an inordinate effect on the mean.

Next thing to try: first compute mean and stddev, then throw away all distances that are larger than (wild guess) mean+stddev, or maybe 2*mean. Then compute mean and stddev on the points that remain.

Edit: another thing to try is to keep only the points in the range [mean-stddev, mean+stddev].

The idea is that for the "bad pairs" this will throw away less of the points, but for the "good pairs" it will throw away more points.
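
A sketch of that iterative bracketing filter on the array of per-point distances (the helper name and the fixed iteration count are assumptions for illustration):

```python
import numpy as np

def bracketed_mean(distances: np.ndarray, iterations: int = 3):
    """Repeatedly keep only distances in [mean-std, mean+std]; return the final mean, std and count."""
    kept = np.asarray(distances, dtype=float)
    for _ in range(iterations):
        mean, std = kept.mean(), kept.std()
        selected = kept[(kept >= mean - std) & (kept <= mean + std)]
        if selected.size == 0:
            break                                # never filter away everything
        kept = selected
    return kept.mean(), kept.std(), kept.size
```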

@jackjansen
Contributor Author

Tried that. Also tried running the filtering multiple times, to see how the mean and stddev behave. Used the bracketing filter [mean-stddev, mean+stddev] on two premises:

  • It sort-of feels more mathematically correct,
  • it just so happens that for the "good pairs" std > mean so we don't throw away any "good points", while for the "bad pairs" std < mean so we throw away points on both sides, so running the filter successively should not change mean too much (while for the "good pairs" it will lower mean).

Here are the results:

camera 3: peak=0.002320735082210554, corr=0.007816463821634046
camera 3: 0 filters: mean=0.02291982490598397, std=0.024645493357004753, nPoint=81262
camera 3: 1 filters: mean=0.012951448749072387, std=0.010969866908687875, nPoint=66992
camera 3: 2 filters: mean=0.009752436971104068, std=0.005658494330190411, nPoint=52946
camera 3: 3 filters: mean=0.009181153840633022, std=0.0032857770409335132, nPoint=32361
camera 5: peak=0.012704797435346417, corr=0.058097526853423245
camera 5: 0 filters: mean=0.04401491186517467, std=0.028213745113280272, nPoint=44935
camera 5: 1 filters: mean=0.04137415890726912, std=0.016101593626257675, nPoint=26525
camera 5: 2 filters: mean=0.04045734373520693, std=0.009445048042243229, nPoint=15533
camera 5: 3 filters: mean=0.040051235318581284, std=0.005529610745340443, nPoint=8873
camera 9: peak=0.0018051266662951993, corr=0.002804512231245586
camera 9: 0 filters: mean=0.016755202242697633, std=0.026204388020447566, nPoint=81399
camera 9: 1 filters: mean=0.006063694689290081, std=0.008691325161748644, nPoint=67977
camera 9: 2 filters: mean=0.003177805834766034, std=0.0024780598363417766, nPoint=59915
camera 9: 3 filters: mean=0.002515016672254931, std=0.0011297240188485047, nPoint=52181
camera 6: peak=0.001824420385722404, corr=0.003573071088354493
camera 6: 0 filters: mean=0.015378887548555181, std=0.023343767740899458, nPoint=93198
camera 6: 1 filters: mean=0.006850735824829795, std=0.00853339253154322, nPoint=79885
camera 6: 2 filters: mean=0.003812802111714147, std=0.0027496213518108494, nPoint=69301
camera 6: 3 filters: mean=0.003089897111975529, std=0.00140128710337592, nPoint=56900
camera 10: peak=0.0021639993720285154, corr=0.08227788490063927
camera 10: 0 filters: mean=0.048489777693837444, std=0.028316377093057312, nPoint=43545
camera 10: 1 filters: mean=0.04792050129221857, std=0.016245794338419692, nPoint=25355
camera 10: 2 filters: mean=0.04781840036709909, std=0.00938153612816915, nPoint=14709
camera 10: 3 filters: mean=0.04774644484942719, std=0.005421145457380683, nPoint=8482
camera 12: peak=0.01910221049639875, corr=0.04007440525818792
camera 12: 0 filters: mean=0.0341007288961789, std=0.023926849570313418, nPoint=73397
camera 12: 1 filters: mean=0.028625500662925147, std=0.011632248990853532, nPoint=52026
camera 12: 2 filters: mean=0.027763304899151773, std=0.0064978872262983645, nPoint=33259
camera 12: 3 filters: mean=0.027708708994364492, std=0.003789831220465292, nPoint=19513

This seems to be going in the right direction: the "bad pairs" (opposing cameras) have their mean staying put at high values. The "good pairs" have their mean going down significantly, towards what appears to be a correct value. The "not so good pairs" (3 and 12) also seem to end up at decent values.

@jackjansen
Contributor Author

Partial success. That is to say: this works pretty well for camera-pair measurements on the boxes:

captured-boxes-3_histogram_paired

These are pretty believable numbers!

Unfortunately it does not work well at all for the one-to-all-others measurements:

captured-boxes-3_histogram_one2all

I think the problem is that this algorithm throws away any points that it can't match (which, in the case of this dataset, includes the mismatched "edges that are sticking out").

Let's first check how the pair-wise measurements work on the other datasets.

@jackjansen
Contributor Author

jackjansen commented Jan 15, 2024

That didn't work very well. I've now made the pair-wise measurement symmetric but this needs work: at the moment it is far too expensive.

And it is also too aggressive in trying to put as many points into the overlapping set as it can. This can be seen with the loot datasets.

We should somehow re-enable the max_distance capping of the kdtree distance finder (I disabled it for now) but still count the points that go over it.

@jackjansen
Contributor Author

For future reference: when we get back to finding the "best" algorithm to align the pointclouds we should look at point-to-plane ICP with a robust kernel. From https://www.open3d.org/docs/latest/tutorial/pipelines/robust_kernels.html#Vanilla-ICP-vs-Robust-ICP I get the impression that the robust kernel is a way to deal with noise. The referenced page uses generated noise, but of course our sensors are also noisy...
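
For later experiments, a sketch of what that would look like with Open3D's robust kernels, following the linked tutorial (source, target and the corr value are assumed inputs; normals are estimated here only because point-to-plane ICP needs them on the target):

```python
import numpy as np
import open3d as o3d

def robust_point2plane_icp(source, target, corr: float) -> np.ndarray:
    """Run point-to-plane ICP with a Tukey robust kernel; returns the 4x4 transformation."""
    target.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=2 * corr, max_nn=30))
    loss = o3d.pipelines.registration.TukeyLoss(k=corr)     # robust kernel to down-weight noise
    estimation = o3d.pipelines.registration.TransformationEstimationPointToPlane(loss)
    result = o3d.pipelines.registration.registration_icp(
        source, target, corr, np.eye(4), estimation)
    return result.transformation
```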

@jackjansen
Contributor Author

Copied from Slack:

Folks, in your research of registration algorithms, have you come across any that allow "pinning" of one of the variables? I.e. asking the algorithm to find the optimal transformation while specifying, for example, that the y-translation must be zero?
Because if that exists then we could do fine calibration in two steps:

  • First do a fine calibration of the empty capture. This will align all the floors. Then assure that the floors also fall in the plane y=0.
  • Next do the fine calibration with boxes or people or whatever, but pin y=0.

Actually, thinking a bit more, we not only want to pin the y-translation to 0 but also the x-rotation and z-rotation. So the only free variables should be y-rotation, x-translation and z-translation.
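
If no registration library turns out to support pinning directly, one crude workaround might be to run the unconstrained algorithm and then project its result onto the allowed degrees of freedom. A rough sketch (the helper is hypothetical, and only a reasonable approximation when the estimated transform is already close to a pure y-rotation plus x/z translation):

```python
import numpy as np

def pin_to_yaw_and_xz(transform: np.ndarray) -> np.ndarray:
    """Keep only rotation about y and translation in x and z; zero out everything else."""
    rotation = transform[:3, :3]
    yaw = np.arctan2(rotation[0, 2], rotation[2, 2])   # y-axis rotation angle
    cos_y, sin_y = np.cos(yaw), np.sin(yaw)
    pinned = np.eye(4)
    pinned[:3, :3] = np.array([[cos_y, 0.0, sin_y],
                               [0.0,   1.0, 0.0],
                               [-sin_y, 0.0, cos_y]])
    pinned[0, 3] = transform[0, 3]                     # keep x translation
    pinned[2, 3] = transform[2, 3]                     # keep z translation; y stays 0
    return pinned
```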

@ireneviola

We might want to change the loss function so as to only consider a 2D error - which would effectively mean that in every iteration, the algorithm would be forced to change only the parameters that are considered in the loss function, because the others would have no impact on the error. There might be other ways of writing it as an optimization problem, though; we should check it out.

@jackjansen
Contributor Author

jackjansen commented Sep 20, 2024

> The fixer managed to make them all upright, but that's where the good news stops. They're still off by quite a bit, and moreover (and worse): the analysis algorithm produces way too optimistic values.

Inspecting this issue again after half a year of inactivity, but a lot of actually using the current registration setup in production. The comment quoted above (from 11-Dec-2023) describes what is bothering us most at the moment: often, when running cwipc_register --fine, the script will report that it has managed to align all point clouds to within a few millimeters. But actual inspection of the captured point cloud clearly shows that some areas (and often important areas like the head) are off by 5-10 cm.

The "solution" we are currently using is to simply try again with the subject human in a different pose, and hoping for the best.

Fixing this, or at least showing the operator something (for example a graph of the p2p distance distribution) from which they can tell this has happened, is of paramount importance at the moment.

@jackjansen
Contributor Author

Once we have addressed the issue above (the bad numbers coming out of our analysis) my feeling is that we should move to a "mixed" upper strategy. Right now our upper strategy is either pairwise or one-to-all-others, but maybe we should first do one round of pairwise, and after that a round of one-to-all-others.

If we do the pairwise round in the right order (i.e. most overlapping pair first) I think that should get us out of the "local minimum problem" with the boxes.

The right order should be easy to compute: for each pair, compute the upper bound of the percentage/fraction of points that could possibly overlap. High-to-low is the right order.

@jackjansen
Contributor Author

Finally getting back to this. Started by forwarding the 18-alignment branch to the current master.
Will start by addressing the first 21-Sep-2024 point: finding out why our analysis is so optimistic.

@jackjansen
Contributor Author

jackjansen commented Feb 3, 2025

We're going to need some debug tools to see what is happening.

One is #136 so we can visually inspect the different tiles at the same time.

Another is that when we are paused and we use a command that should change our view (such as the colorise option from #136, or selecting a different tileset to see with the number commands) the point cloud is redrawn with the new options.

We may also want an option to cwipc_register that makes it create a log directory with everything that it has been doing at each step, including all ply files, etc.

@jackjansen
Contributor Author

Starting to experiment with capture-2024-1127-1429 to see how good we can get.

First thing noticed is that the radius-filter is messing things up: the back of Thong isn't captured. Turned off the radius filter.

Then I noticed that the "moving Aruco Marker" problem is back. Need to get rid of that too.

@jackjansen
Contributor Author

jackjansen commented Feb 24, 2025

Fixed those issues by modifying cameraconfig. The new one is attached here:

cameraconfig.json

With this cameraconfig we took a capture just after Thong clapped his hands (slightly after ts=1732710676049).

And here is the histogram plot of the final resulting distances between each camera and all others:

Image

We can see a number of (completely different) things from this plot:

  • The histogram plot is not the right one to show; we want the cumulative plot, because it would make it much easier (we hope) to interpret the results.
  • The correspondence error computed for camera 4 is far too optimistic.

We need to fix the first bullet. Then we need to decide what we do first:

  • Fix the second bullet, or
  • Have a "human inspection" outer algorithm, where the user decides which camera to try fixing next.

For reference, here is the log of the steps the algorithm took. As can be seen camera 4, which appears to be the troublemaker from human inspection of the graphs, was never re-aligned, because it always appeared to have the "best" result for its alignment:

grab: captured 972926 points, ts=1732710676183
grab: stopping
grab: stopped
cwipc_register: Saved pointcloud and cameraconfig for step3_capture_fine
cwipc_register: Use fine alignment class MultiCamera
camera 1: 0 filters: mean=0.08726267932263795, std=0.16682569169882122, nPoint=255935
camera 1: 1 filters: mean=0.03994678700741507, std=0.046734246223548034, nPoint=230698
camera 1: 2 filters: mean=0.026024080235830076, std=0.023486786270562738, nPoint=202884
camera 1: 3 filters: mean=0.01830999702134181, std=0.013575160276728114, nPoint=145884
camera 1: corr=0.03188515729806993, matched=134256, total=255935, fraction=0.524570691777209
camera 2: 0 filters: mean=0.12507721274850828, std=0.20507471253062573, nPoint=258325
camera 2: 1 filters: mean=0.03885565536668416, std=0.06712790973243157, nPoint=212320
camera 2: 2 filters: mean=0.020087208061662358, std=0.022716392718550568, nPoint=193486
camera 2: 3 filters: mean=0.011577415504397934, std=0.010806521428930899, nPoint=162272
camera 2: corr=0.02238393693332883, matched=133829, total=258325, fraction=0.5180644536920546
camera 4: 0 filters: mean=0.13837594942240514, std=0.29550212864077785, nPoint=267089
camera 4: 1 filters: mean=0.03304925221287418, std=0.07666745932378718, nPoint=233604
camera 4: 2 filters: mean=0.012464862683069351, std=0.01705470907356439, nPoint=214986
camera 4: 3 filters: mean=0.007079425048334478, std=0.006499748874885224, nPoint=188443
camera 4: corr=0.013579173923219702, matched=160356, total=267089, fraction=0.6003841416157161
camera 8: 0 filters: mean=0.19004005007503835, std=0.3443411966968814, nPoint=219267
camera 8: 1 filters: mean=0.07425747250256175, std=0.1295977439169904, nPoint=191922
camera 8: 2 filters: mean=0.0242699051546891, std=0.04346627822352093, nPoint=162575
camera 8: 3 filters: mean=0.010153926454012606, std=0.01264793364287867, nPoint=144094
camera 8: corr=0.022801860096891276, matched=126830, total=219267, fraction=0.5784272143094948
registration.MultiCamera: Before: overall correspondence error 0.024489670863628233. Per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.03188515729806993, weight=0.37648411290308587
	camnum=8, correspondence=0.022801860096891276, weight=0.2679356030621006
	camnum=2, correspondence=0.02238393693332883, weight=0.2642271128895933
	camnum=4, correspondence=0.013579173923219702, weight=0.1627484583790241
registration.MultiCamera: Step 1: camera 1, correspondence error 0.03188515729806993, overall correspondence error 0.024489670863628233
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.278176e-01, inlier_rmse=1.404606e-02, and correspondence_set size of 135087
Access transformation to get result.
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.278176e-01, inlier_rmse=1.404606e-02, and correspondence_set size of 135087
Access transformation to get result.
camera 1: 0 filters: mean=0.08515003121721029, std=0.16219971557567636, nPoint=255935
camera 1: 1 filters: mean=0.03944200513305314, std=0.04602416760871725, nPoint=230970
camera 1: 2 filters: mean=0.02564313404211889, std=0.023322835623908814, nPoint=202680
camera 1: 3 filters: mean=0.017700764204254607, std=0.01343614521583445, nPoint=147451
camera 1: corr=0.031136909420089058, matched=133711, total=255935, fraction=0.5224412448473246
camera 2: 0 filters: mean=0.1261837839728534, std=0.2063795408556695, nPoint=258325
camera 2: 1 filters: mean=0.039161718584824354, std=0.06737647626853334, nPoint=212203
camera 2: 2 filters: mean=0.020558623456058, std=0.023314484283719752, nPoint=193782
camera 2: 3 filters: mean=0.011712788038094267, std=0.011202703058570438, nPoint=161905
camera 2: corr=0.022915491096664707, matched=133953, total=258325, fraction=0.518544469176425
camera 4: 0 filters: mean=0.13674437823952734, std=0.29260257659315536, nPoint=267089
camera 4: 1 filters: mean=0.03293592517235412, std=0.07524826004957648, nPoint=233886
camera 4: 2 filters: mean=0.012751611144142284, std=0.017278143956666384, nPoint=215151
camera 4: 3 filters: mean=0.007182366444128553, std=0.006667451445125218, nPoint=187727
camera 4: corr=0.013849817889253772, matched=159701, total=267089, fraction=0.597931775550472
camera 8: 0 filters: mean=0.190009468584249, std=0.3443543199367892, nPoint=219267
camera 8: 1 filters: mean=0.07421773854638268, std=0.12959852864706503, nPoint=191920
camera 8: 2 filters: mean=0.024223135473280803, std=0.04343843611368515, nPoint=162570
camera 8: 3 filters: mean=0.010127073527619965, std=0.012597436101665258, nPoint=144125
camera 8: corr=0.022724509629285225, matched=126894, total=219267, fraction=0.5787190958967834
registration.MultiCamera: Step 1: per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.031136909420089058, weight=0.3675225186194392
	camnum=2, correspondence=0.022915491096664707, weight=0.2705229699892271
	camnum=8, correspondence=0.022724509629285225, weight=0.26703815261298075
	camnum=4, correspondence=0.013849817889253772, weight=0.16593547967402955
registration.MultiCamera: Step 2: camera 2, correspondence error 0.022915491096664707, overall correspondence error 0.02428450512243735
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.189277e-01, inlier_rmse=9.174326e-03, and correspondence_set size of 134052
Access transformation to get result.
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.189277e-01, inlier_rmse=9.174326e-03, and correspondence_set size of 134052
Access transformation to get result.
camera 1: 0 filters: mean=0.08483494273011204, std=0.1620483013680917, nPoint=255935
camera 1: 1 filters: mean=0.03926487600054994, std=0.045942779543020625, nPoint=231066
camera 1: 2 filters: mean=0.025492898181408596, std=0.023267805001552604, nPoint=202776
camera 1: 3 filters: mean=0.01748906771124196, std=0.013412983975476368, nPoint=148107
camera 1: corr=0.030902051686718328, matched=133588, total=255935, fraction=0.5219606540723231
camera 2: 0 filters: mean=0.1262190678707062, std=0.20652847717333697, nPoint=258325
camera 2: 1 filters: mean=0.0391711263301533, std=0.06759738467608756, nPoint=212212
camera 2: 2 filters: mean=0.02054506243217559, std=0.023462209014660125, nPoint=193855
camera 2: 3 filters: mean=0.011630897174464076, std=0.011242480463347224, nPoint=161926
camera 2: corr=0.0228733776378113, matched=133982, total=258325, fraction=0.5186567308622859
camera 4: 0 filters: mean=0.13675038651780155, std=0.29260050589973113, nPoint=267089
camera 4: 1 filters: mean=0.032942291281810095, std=0.0752428663397196, nPoint=233886
camera 4: 2 filters: mean=0.012765479086436991, std=0.01728549702870202, nPoint=215163
camera 4: 3 filters: mean=0.007193978172306782, std=0.006677447133917714, nPoint=187743
camera 4: corr=0.013871425306224497, matched=159725, total=267089, fraction=0.5980216332383588
camera 8: 0 filters: mean=0.1906543841919635, std=0.34475906981817683, nPoint=219267
camera 8: 1 filters: mean=0.07469297567181772, std=0.13012275745184318, nPoint=191883
camera 8: 2 filters: mean=0.024339505929850794, std=0.04363447385844152, nPoint=162386
camera 8: 3 filters: mean=0.010226977582093165, std=0.012665447779570158, nPoint=144069
camera 8: corr=0.022892425361663325, matched=126911, total=219267, fraction=0.5787966269434069
registration.MultiCamera: Step 2: per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.030902051686718328, weight=0.36472195067959823
	camnum=2, correspondence=0.0228733776378113, weight=0.2700307617298742
	camnum=8, correspondence=0.022892425361663325, weight=0.2690144151082393
	camnum=4, correspondence=0.013871425306224497, weight=0.16619644385564938
registration.MultiCamera: Step 3: camera 8, correspondence error 0.022892425361663325, overall correspondence error 0.024216661975675832
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.807714e-01, inlier_rmse=7.996496e-03, and correspondence_set size of 127344
Access transformation to get result.
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.807714e-01, inlier_rmse=7.996496e-03, and correspondence_set size of 127344
Access transformation to get result.
camera 1: 0 filters: mean=0.08480573883775841, std=0.1620665131602924, nPoint=255935
camera 1: 1 filters: mean=0.03922983300526544, std=0.04597604179420256, nPoint=231063
camera 1: 2 filters: mean=0.025450178894767753, std=0.023326021048003128, nPoint=202759
camera 1: 3 filters: mean=0.017387675001898764, std=0.013478223722260268, nPoint=148372
camera 1: corr=0.030865898724159034, matched=133465, total=255935, fraction=0.5214800632973215
camera 2: 0 filters: mean=0.12559620279149736, std=0.20564728429253104, nPoint=258325
camera 2: 1 filters: mean=0.0390352048880553, std=0.0672175007515984, nPoint=212322
camera 2: 2 filters: mean=0.02047315371380545, std=0.023432864310322352, nPoint=193868
camera 2: 3 filters: mean=0.011549480510941243, std=0.011214073033693242, nPoint=161861
camera 2: corr=0.022763553544634486, matched=133868, total=258325, fraction=0.5182154263040744
camera 4: 0 filters: mean=0.13742725991263743, std=0.29406412810899096, nPoint=267089
camera 4: 1 filters: mean=0.03322530312368369, std=0.07554521229175706, nPoint=233963
camera 4: 2 filters: mean=0.012957675344570312, std=0.01740890323778628, nPoint=215223
camera 4: 3 filters: mean=0.0073237974863557915, std=0.00673637299277873, nPoint=187617
camera 4: corr=0.01406017047913452, matched=159655, total=267089, fraction=0.5977595483153556
camera 8: 0 filters: mean=0.18983636922486904, std=0.3452892290040007, nPoint=219267
camera 8: 1 filters: mean=0.07397996868578144, std=0.1295160525258807, nPoint=192070
camera 8: 2 filters: mean=0.024127611062895588, std=0.04335302264504482, nPoint=162778
camera 8: 3 filters: mean=0.010089635934008755, std=0.012547536803761297, nPoint=144404
camera 8: corr=0.02263717273777005, matched=127114, total=219267, fraction=0.5797224388530878
registration.MultiCamera: Step 3: per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.030865898724159034, weight=0.36426682216896883
	camnum=2, correspondence=0.022763553544634486, weight=0.2687148608547355
	camnum=8, correspondence=0.02263717273777005, weight=0.26605106019808944
	camnum=4, correspondence=0.01406017047913452, weight=0.16845167592862273
registration.MultiCamera: Step 4: camera 4, correspondence error 0.01406017047913452, overall correspondence error 0.024123472521746674
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=6.012191e-01, inlier_rmse=5.755844e-03, and correspondence_set size of 160579
Access transformation to get result.
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=6.012191e-01, inlier_rmse=5.755844e-03, and correspondence_set size of 160579
Access transformation to get result.
camera 1: 0 filters: mean=0.08662557728434213, std=0.16551215651781165, nPoint=255935
camera 1: 1 filters: mean=0.039880709274299354, std=0.04655976698744746, nPoint=230882
camera 1: 2 filters: mean=0.026002624358990915, std=0.023780290730512323, nPoint=202696
camera 1: 3 filters: mean=0.017976990098078703, std=0.0138587999016265, nPoint=146282
camera 1: corr=0.031835789999705204, matched=133228, total=255935, fraction=0.5205540469259773
camera 2: 0 filters: mean=0.12559533943164594, std=0.20564740776892507, nPoint=258325
camera 2: 1 filters: mean=0.039034154467309456, std=0.06721660765597363, nPoint=212322
camera 2: 2 filters: mean=0.020471987913061256, std=0.02342908091585363, nPoint=193868
camera 2: 3 filters: mean=0.011548404986942129, std=0.011205868043515447, nPoint=161860
camera 2: corr=0.022754273030457576, matched=133896, total=258325, fraction=0.5183238168973193
camera 4: 0 filters: mean=0.13737522935643365, std=0.2931434419411899, nPoint=267089
camera 4: 1 filters: mean=0.03288494890403441, std=0.07570251691552064, nPoint=233608
camera 4: 2 filters: mean=0.012608328016293704, std=0.017160278276197607, nPoint=215023
camera 4: 3 filters: mean=0.007062456488081894, std=0.0066233372709715264, nPoint=187526
camera 4: corr=0.01368579375905342, matched=159436, total=267089, fraction=0.5969395969133884
camera 8: 0 filters: mean=0.18907186145208518, std=0.34337227269028964, nPoint=219267
camera 8: 1 filters: mean=0.07361104579638122, std=0.12892966931346186, nPoint=191915
camera 8: 2 filters: mean=0.023974464712775416, std=0.04311412789177767, nPoint=162641
camera 8: 3 filters: mean=0.01002621313535229, std=0.012568856406781095, nPoint=144279
camera 8: corr=0.022595069542133382, matched=126797, total=219267, fraction=0.5782767128660492
registration.MultiCamera: Step 4: per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.031835789999705204, weight=0.3756565032166122
	camnum=2, correspondence=0.022754273030457576, weight=0.26861006682357413
	camnum=8, correspondence=0.022595069542133382, weight=0.26549980957821573
	camnum=4, correspondence=0.01368579375905342, weight=0.16394756856218545
registration.MultiCamera: Step 5: camera 1, correspondence error 0.031835789999705204, overall correspondence error 0.024507540079677023
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.243753e-01, inlier_rmse=1.400334e-02, and correspondence_set size of 134206
Access transformation to get result.
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.243753e-01, inlier_rmse=1.400334e-02, and correspondence_set size of 134206
Access transformation to get result.
camera 1: 0 filters: mean=0.085744143195895, std=0.1629295893616391, nPoint=255935
camera 1: 1 filters: mean=0.039511715829330274, std=0.04595802880690485, nPoint=230678
camera 1: 2 filters: mean=0.025886279902809808, std=0.023489419406180306, nPoint=202767
camera 1: 3 filters: mean=0.017956762514358533, std=0.01356942217281289, nPoint=146094
camera 1: corr=0.03152618468717142, matched=133663, total=255935, fraction=0.5222536972278118
camera 2: 0 filters: mean=0.12687136098262647, std=0.20733285447164593, nPoint=258325
camera 2: 1 filters: mean=0.03937345673642397, std=0.067556080971456, nPoint=212229
camera 2: 2 filters: mean=0.020832455215532453, std=0.023624509888987198, nPoint=194056
camera 2: 3 filters: mean=0.011799441161706633, std=0.011456482620224587, nPoint=161692
camera 2: corr=0.02325592378193122, matched=133740, total=258325, fraction=0.5177199264492403
camera 4: 0 filters: mean=0.13614773009717254, std=0.29084277080503623, nPoint=267089
camera 4: 1 filters: mean=0.032746877794534585, std=0.07493767861597596, nPoint=233744
camera 4: 2 filters: mean=0.012676105267106623, std=0.01725133933258455, nPoint=215073
camera 4: 3 filters: mean=0.007098475064730394, std=0.006650654639468999, nPoint=187561
camera 4: corr=0.013749129704199392, matched=159479, total=267089, fraction=0.597100591937519
camera 8: 0 filters: mean=0.18906210509871343, std=0.3433760976908357, nPoint=219267
camera 8: 1 filters: mean=0.07359989895358737, std=0.128931325738454, nPoint=191915
camera 8: 2 filters: mean=0.02395911555900265, std=0.04310054457832315, nPoint=162639
camera 8: 3 filters: mean=0.01003126268730873, std=0.012575232086168385, nPoint=144316
camera 8: corr=0.022606494773477114, matched=126814, total=219267, fraction=0.5783542439126726
registration.MultiCamera: Step 5: per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.03152618468717142, weight=0.3721059849319554
	camnum=2, correspondence=0.02325592378193122, weight=0.2745048520886307
	camnum=8, correspondence=0.022606494773477114, weight=0.26563709066924135
	camnum=4, correspondence=0.013749129704199392, weight=0.16471000269823738
registration.MultiCamera: Step 6: camera 1, correspondence error 0.03152618468717142, overall correspondence error 0.0244992751064434
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.222303e-01, inlier_rmse=1.388377e-02, and correspondence_set size of 133657
Access transformation to get result.
RegistrationComputer_ICP_Point2Point: RegistrationComputer_ICP_Point2Point result: RegistrationResult with fitness=5.222303e-01, inlier_rmse=1.388377e-02, and correspondence_set size of 133657
Access transformation to get result.
camera 1: 0 filters: mean=0.08570856522150984, std=0.16282013235384463, nPoint=255935
camera 1: 1 filters: mean=0.03950176109090798, std=0.04594035830972884, nPoint=230675
camera 1: 2 filters: mean=0.025884192435114257, std=0.02348344326392318, nPoint=202769
camera 1: 3 filters: mean=0.017959348605171687, std=0.013564917302260449, nPoint=146057
camera 1: corr=0.03152426590743214, matched=133653, total=255935, fraction=0.5222146248070799
camera 2: 0 filters: mean=0.12693157241935005, std=0.20743039849530176, nPoint=258325
camera 2: 1 filters: mean=0.03937401916220994, std=0.0675644555255504, nPoint=212220
camera 2: 2 filters: mean=0.020838858484963026, std=0.02362778629463241, nPoint=194067
camera 2: 3 filters: mean=0.01180275307823051, std=0.011458399774637492, nPoint=161691
camera 2: corr=0.023261152852868002, matched=133735, total=258325, fraction=0.5177005709861608
camera 4: 0 filters: mean=0.13608604503595922, std=0.2907401368034397, nPoint=267089
camera 4: 1 filters: mean=0.03273966268685515, std=0.07490889046678796, nPoint=233754
camera 4: 2 filters: mean=0.012673769895350047, std=0.017246906948641092, nPoint=215072
camera 4: 3 filters: mean=0.007096693873413673, std=0.006646845878896774, nPoint=187556
camera 4: corr=0.013743539752310447, matched=159497, total=267089, fraction=0.5971679852034341
camera 8: 0 filters: mean=0.18906203631999458, std=0.3433761397946863, nPoint=219267
camera 8: 1 filters: mean=0.0735998203724273, std=0.1289313834808062, nPoint=191915
camera 8: 2 filters: mean=0.023959022832760257, std=0.043100641604935076, nPoint=162639
camera 8: 3 filters: mean=0.010030749932079227, std=0.012574562557327482, nPoint=144315
camera 8: corr=0.02260531248940671, matched=126815, total=219267, fraction=0.578358804562474
registration.MultiCamera: Step 6: per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.03152426590743214, weight=0.37208097885016417
	camnum=2, correspondence=0.023261152852868002, weight=0.27456570456985097
	camnum=8, correspondence=0.02260531248940671, weight=0.2656233765227408
	camnum=4, correspondence=0.013743539752310447, weight=0.1646445880448745
registration.MultiCamera: Step 6: Giving up: went only from 0.03152618468717142 to 0.03152426590743214
registration.MultiCamera: After 6 steps: overall correspondence error 0.02449924277602858. Per-camera correspondence, ordered worst-first:
	camnum=1, correspondence=0.03152426590743214, weight=0.37208097885016417
	camnum=2, correspondence=0.023261152852868002, weight=0.27456570456985097
	camnum=8, correspondence=0.02260531248940671, weight=0.2656233765227408
	camnum=4, correspondence=0.013743539752310447, weight=0.1646445880448745
	camindex=0, change=0.004923383770767776
	camindex=1, change=0.0035834263613848762
	camindex=2, change=0.004234437482021266
	camindex=3, change=0.0047952394388900855
registration.MultiCamera: Voxelizing with 0.03464716140173069: point count 31566, was 1000616
registration.MultiCamera: Pointcounts per tile, after voxelizing:
	tile 0: 31566
	tile 1: 6022
	tile 2: 6626
	tile 3: 644
	tile 4: 7621
	tile 5: 903
	tile 6: 198
	tile 7: 113
	tile 8: 7565
	tile 9: 25
	tile 10: 638
	tile 11: 15
	tile 12: 573
	tile 13: 56
	tile 14: 349
	tile 15: 218
cwipc_register: fine aligner ran for 107.283 seconds
cwipc_register: analyzer ran for 14.915 seconds
cwipc_register: Sorted correspondences after fine calibration
	camnum=1, correspondence=0.03152426590743214, weight=0.37208097885016417
	camnum=2, correspondence=0.023261152852868002, weight=0.27456570456985097
	camnum=8, correspondence=0.02260531248940671, weight=0.2656233765227408
	camnum=4, correspondence=0.013743539752310447, weight=0.1646445880448745
cwipc_register: Saved pointcloud and cameraconfig for step4_after_fine

@jackjansen
Contributor Author

Serious improvements. Here are the before/after correspondences (for the above dataset, and a similar capture):

cwipc_register: Sorted correspondences before fine calibration
	camnum=1, correspondence=0.029772669903727095, weight=0.32975381448799584
	camnum=4, correspondence=0.015593297675407158, weight=0.17546715061016732
	camnum=2, correspondence=0.014681153906851572, weight=0.1639654860227558
	camnum=8, correspondence=0.012609081220802731, weight=0.13812289235045297
cwipc_register: analyzer ran for 2.103 seconds
cwipc_register: Sorted correspondences after fine calibration
	camnum=1, correspondence=0.011843114105968925, weight=0.13234881827053135
	camnum=2, correspondence=0.007111442839985053, weight=0.07982065666339981
	camnum=4, correspondence=0.005414429270679016, weight=0.06188110117293287
	camnum=8, correspondence=0.004999972304687699, weight=0.05537202835582819

@jackjansen
Contributor Author

I think the only improvement that could still be done is to do different steps (the outer algorithm):

  1. Start with a synthetic floor (optionally). Call this the "current set"
  2. Find the camera that is nearest to the current set.
  3. Align it to the current set.
  4. Add it to the current set.
  5. Repeat until there are no cameras left.

Maybe after this we could do one more pass over all the cameras but I'm not sure this is worthwhile.

@jackjansen
Contributor Author

It turns out that the approach from the previous comment is indeed needed, for some situations. Because we still sometimes have the issue that the cameras are "pairwise aligned" (the problem we saw last year with the boxes datasets).

@jackjansen
Contributor Author

Unfortunately this doesn't work.

The problem is that the initial step, aligning the first two cameras, will over-enthusiastically rotate the "camera position" around the origin, to achieve the best possible overlap between the two point clouds. Which is of course completely wrong.

A possible solution may be to limit the target point cloud to just the points that could conceivably be matched to the source point cloud, but I'm not sure that will fly.

@jackjansen
Contributor Author

Hmm, thinking out loud: maybe we should pass a much smaller correspondence to the alignment step, basically reducing the points taken into account far too much, but then at least we know that the only points taken into account are points that are probably correct...

@jackjansen
Contributor Author

It may be that our analyzer functions are still too optimistic. I'm seeing a case where the distribution plot of the distances clearly doesn't correspond to reality.

@jackjansen
Contributor Author

Indeed! And it's the floor that is making it so optimistic! If I remove the floor I am getting much more realistic numbers.
