Chunk merging issue #133

Open
FaroutYLq opened this issue Jul 23, 2023 · 2 comments

@FaroutYLq
Collaborator

Cuts computation seems to fail for the following runs: [45501, 44405, 46721, 47083, 47287, 47296, 47298, 46731, 45437, 45451, 045462, 045501]. An example error is shown below.

ValueError: Cannot merge chunks with different number of items: [[046731.peak_positions_cnn: 1659907969sec 424508440 ns - 1659907990sec 899344920 ns, 22238 items, 0.0 MB/s], [046731.peak_positions_mlp: 1659907969sec 424508440 ns - 1659907990sec 899344920 ns, 22234 items, 0.0 MB/s], [046731.peak_positions_gcn: 1659907969sec 424508440 ns - 1659907990sec 899344920 ns, 22238 items, 0.0 MB/s]]
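
To see quickly which data type disagrees, the error text itself can be parsed. The following is a minimal sketch that only assumes the message format shown above; `parse_chunk_counts` and `odd_ones_out` are hypothetical helper names, not part of strax or straxer.

```python
import re
from collections import Counter

# Matches fragments such as "[046731.peak_positions_mlp: ..., 22234 items"
# (pattern is an assumption based on the error text quoted in this issue).
CHUNK_RE = re.compile(r"\[(\d+)\.(\w+):[^\]]*?(\d+) items")


def parse_chunk_counts(message):
    """Return a list of (run_id, data_type, n_items) found in the error text."""
    return [(run, dtype, int(n)) for run, dtype, n in CHUNK_RE.findall(message)]


def odd_ones_out(counts):
    """Return the entries whose item count differs from the most common count."""
    if not counts:
        return []
    most_common_n, _ = Counter(n for _, _, n in counts).most_common(1)[0]
    return [entry for entry in counts if entry[2] != most_common_n]


message = (
    "ValueError: Cannot merge chunks with different number of items: "
    "[[046731.peak_positions_cnn: 1659907969sec 424508440 ns - "
    "1659907990sec 899344920 ns, 22238 items, 0.0 MB/s], "
    "[046731.peak_positions_mlp: 1659907969sec 424508440 ns - "
    "1659907990sec 899344920 ns, 22234 items, 0.0 MB/s], "
    "[046731.peak_positions_gcn: 1659907969sec 424508440 ns - "
    "1659907990sec 899344920 ns, 22238 items, 0.0 MB/s]]"
)
counts = parse_chunk_counts(message)
print(odd_ones_out(counts))
```

For the message above, this singles out `peak_positions_mlp` in run 046731, which has 22234 items while the other two data types have 22238. When only two chunks are involved, the counts tie and the sketch cannot tell which side is wrong.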

More details can be found in the logs at /project/lgrandi/xenonnt/data_management_reprocessing/job_logs. I have already used --notlazy in straxer, and I am requesting a single core at this point.

@FaroutYLq FaroutYLq added the bug Something isn't working label Jul 23, 2023
@FaroutYLq FaroutYLq self-assigned this Jul 23, 2023
@FaroutYLq
Collaborator Author

I tried to build the data in a notebook on midway2, and the same problem shows up.

runs = ['051851', '050841', '050831', '050773', '047874', '047298', '047296', '047287', '047083', '046731', '046721', '046494', '046493', '045501', '045462', '045451', '045437', '044405']
st.make(runs, 'event_ms_naive')

ValueError: Cannot merge chunks with different number of items: [[045437.peaklets: 1656799988sec 619275800 ns - 1656800031sec 568948760 ns, 88236 items, 8.6 MB/s], [045437.peaklet_classification: 1656799988sec 619275800 ns - 1656800031sec 568948760 ns, 88235 items, 0.0 MB/s]]
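
Because `st.make` stops at the first failing run, it can help to process the runs one at a time and record which ones raise. This is a rough sketch, assuming `make` is any callable with the `(run_id, target)` call shape used above; `find_failing_runs` is a hypothetical helper, and the stand-in `fake_make` only exists to make the example self-contained.

```python
def find_failing_runs(make, runs, target):
    """Call make(run_id, target) per run; return {run_id: error message}."""
    failures = {}
    for run_id in runs:
        try:
            make(run_id, target)
        except ValueError as err:
            # Keep going so that every problematic run is reported.
            failures[run_id] = str(err)
    return failures


# Stand-in for st.make, used only for illustration:
# it pretends that run '045437' hits the chunk merging error.
def fake_make(run_id, target):
    if run_id == "045437":
        raise ValueError("Cannot merge chunks with different number of items")


failures = find_failing_runs(fake_make, ["045437", "044405"], "event_ms_naive")
print(failures)
```

In the notebook, this would presumably be called as `find_failing_runs(st.make, runs, 'event_ms_naive')`.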

@dachengx

dachengx commented Jul 25, 2023

(For the record, this is discussed here: https://xenonnt.slack.com/archives/C016DM0JPK9/p1690013077769009)

I strongly suspect that the cause of this error lies deep in the algebra libraries of the different nodes. A similar thing (https://xenonnt.slack.com/archives/C016DM0JPK9/p1683044901706209) has been observed before. So if we reprocess these three plugins on the same node, such as xenon1t, we should get no error.

The quick fix would be to simply reprocess them. For a long-term solution, however, we should test the output on different nodes, and even on the same node multiple times, to check how random this phenomenon is.
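
One possible way to run such a test is to fingerprint the output of each processing attempt and group the attempts by fingerprint. The sketch below assumes each output can be serialized to bytes (for a numpy array, for example, via `.tobytes()`); the labels and the helper names `fingerprint` and `group_by_fingerprint` are hypothetical.

```python
import hashlib
from collections import defaultdict


def fingerprint(raw_bytes):
    """Return a SHA-256 hex digest of the serialized output."""
    return hashlib.sha256(raw_bytes).hexdigest()


def group_by_fingerprint(outputs):
    """Map digest -> sorted labels of attempts that gave identical bytes.

    `outputs` maps a label such as 'node_a/try1' to the output bytes.
    """
    groups = defaultdict(list)
    for label, raw_bytes in outputs.items():
        groups[fingerprint(raw_bytes)].append(label)
    return {digest: sorted(labels) for digest, labels in groups.items()}


# Made-up bytes standing in for serialized plugin output.
outputs = {
    "node_a/try1": b"\x00\x01",
    "node_a/try2": b"\x00\x01",
    "node_b/try1": b"\x00\x02",
}
groups = group_by_fingerprint(outputs)
reproducible = len(groups) == 1
print(reproducible, sorted(groups.values()))
```

More than one group means at least one attempt differs at the byte level. Comparing the group labels then shows whether the difference follows the node or also appears between repeated attempts on the same node.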
