schema: Print an error if the qsv stats invocation fails #2110

Merged
jqnatividad merged 1 commit into dathere:master on Sep 7, 2024

abrauchli
Contributor

Aborts if either

  • the stats process is killed (e.g. by SIGKILL due to memory pressure)
  • or the stats process exits with a non-zero return code
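
For readers who want to see the two abort conditions spelled out, here is a minimal sketch using only the Rust standard library. It is not the code from this PR: the function names `check_exit_status`, `describe_termination` and `run_and_check` are hypothetical, and `sh` stands in for the `qsv stats` child process. The signal branch is Unix-only.

```rust
use std::process::{Command, ExitStatus};

/// Turn the exit status of a child process into a Result.
/// `name` is only used to build the error message.
fn check_exit_status(name: &str, status: ExitStatus) -> Result<(), String> {
    if status.success() {
        return Ok(());
    }
    match status.code() {
        // The process ran to completion but reported a failure.
        Some(code) => Err(format!("{name} exited with code: {code}")),
        // No exit code: on Unix this means the process was ended by a signal.
        None => Err(describe_termination(name, status)),
    }
}

#[cfg(unix)]
fn describe_termination(name: &str, status: ExitStatus) -> String {
    use std::os::unix::process::ExitStatusExt;
    match status.signal() {
        Some(sig) => format!("{name} terminated with signal: {sig}"),
        None => format!("{name} terminated abnormally"),
    }
}

#[cfg(not(unix))]
fn describe_termination(name: &str, _status: ExitStatus) -> String {
    format!("{name} terminated abnormally")
}

/// Hypothetical helper for illustration: run a shell snippet via `sh -c`
/// and report how it ended.
fn run_and_check(script: &str) -> Result<(), String> {
    let status = Command::new("sh")
        .args(["-c", script])
        .status()
        .map_err(|e| format!("failed to spawn sh: {e}"))?;
    check_exit_status("sh", status)
}
```

Under these assumptions, `run_and_check("exit 3")` yields an error carrying the exit code, while `run_and_check("kill -9 $$")` yields an error carrying signal 9, the same case as a child killed under memory pressure.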

@abrauchli
Contributor Author

abrauchli commented Sep 6, 2024

This is the log output for a SIGKILL run:

[2024-09-06 18:35:17.536943 +00:00] INFO [qsv::util] src/util.rs:136: Using 7 jobs...
[2024-09-06 18:39:40.314465 +00:00] ERROR [qsv::cmd::schema] src/cmd/schema.rs:139: Failed to infer schema via stats and frequency from History_List_Details_20161108.csv: qsv stats terminated with signal: 9
[2024-09-06 18:39:40.317034 +00:00] ERROR [qsv] src/main.rs:300: Failed to infer schema via stats and frequency from History_List_Details_20161108.csv: qsv stats terminated with signal: 9
[2024-09-06 18:39:40.318750 +00:00] INFO [qsv::util] src/util.rs:1176: END "schema /home/foo/projec..." elapsed: 262.8931

The contribution guide asks contributors to run cargo +nightly fmt, but doing so yields some unrelated formatting changes, which are not included in this PR.

Before merging, please check that a return code != 0 is a valid abort condition.

@jqnatividad jqnatividad merged commit cabfe5a into dathere:master Sep 7, 2024
16 checks passed
@jqnatividad
Collaborator

Thanks for your contribution @abrauchli !

Out of curiosity, can you share your qsv --version? Also, how big was the file? File size, rowcount and number of headers.

Is it indexed?

FYI - we're currently exploring ways of making stats run with arbitrarily large files, even with the advanced statistics in qsv pro...

cc @rzmk

@abrauchli
Contributor Author

abrauchli commented Sep 7, 2024

I tried a couple of versions on two different boxes, each with 32G of RAM. The csv is about 1.8G uncompressed but has > 150 columns. Once I shrank the number of columns to ~50, it ran through (this also drops the file size to ~1G).

I regularly process files up to 4G and never had issues so far, but those didn't go over 20 columns or so.

These are the qsv versions I use in production:

qsv --version
qsv 0.133.1-mimalloc--8-8;24.84 GiB-1.18 GiB-16.90 GiB-31.05 GiB (Unknown_target compiled with Rust 1.80.1) installed
qsv --version
qsv 0.133.1-mimalloc-python-3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0];to;-16-16;25.01 GiB-5.43 GiB-14.37 GiB-31.26 GiB (Unknown_target compiled with Rust 1.80.1) installed

@abrauchli abrauchli deleted the err-if-stats-fails branch September 7, 2024 14:57