-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix output_root #855
Fix output_root #855
Conversation
const auto output_root = this->tree.get_optional<std::string>("output_root"); | ||
std::string str; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stcui007, I think this should fix your issue.
std::string str; | |
std::string str = "./"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean the failing tests or something else. I don't want to write to "./" though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fairly certain the call to stat
will fail if you provide it the empty string like you are right now in the case when output_root
is not present. I think that is why the Build_and_Run_Model
action is failing.
I think you should return "./"
if output_root
is not present (boost::none
) like in the previous code. We can always assume that the directory ngen
is run from, ./
, exists.
One solution is:
if (output_root != boost::none && *output_root != "") {
// Check if the path ends with a trailing slash,
// otherwise add it.
output_root->back() == '/'
? *output_root
: *output_root + "/";
//use C system function to check if there is a dir match that defined in realization
struct stat sb;
if (stat(output_root->c_str(), &sb) == 0 && S_ISDIR(sb.st_mode))
return *output_root;
else {
throw std::runtime_error("output_root directory does not exist, please create one matching that in realization");
}
}
return "./";
Aside, I would think we would want to check that the calculated output_root
can be written to by the current user. That is out of the scope of this PR and should be handled elsewhere if deemed necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) It looks like this code would return "./" if output_root directory does not exist.
(2) This code would seem not get a chance to throw based on the outer if, which is not what we want.
(3) This block of code is a bit tricky:
output_root->back() == '/'
? *output_root
: *output_root + "/";
where does the value land?
(4) The failed test looks like come from this line in Example_model_run.yml:
./ngen data/catchment_data.geojson '' data/nexus_data.geojson '' data/example_realization_config.json
It takes realization file from the master, not from the PR, right?
(5) In the latest realization file, I set output_root to "./", so it automatically exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code handles the output root for ngen
, so if there is one in a realization config file under the key, output_root
, that one will be used, otherwise, the default is to write to the directory where ngen
was called, "./"
.
I don't think you pushed an updated data/example_realization_config.json
file. Looks like you pushed up data/example_bmi_multi_realization_config.json
and data/example_bmi_multi_realization_config_w_noah_pet_cfe.json
.
Based on the comments, I think this PR is partially solving #848 but needs a bit more discussion, since it feels like an XY problem. There's 2 separate issues here: (1) (2) If no I think the question still unanswered for this PR is: do we want |
The issue is, if we can still call it an issue, if someone runs a big job
and forgot to create the directory or entered wrong format in realization,
then when the job finishes, he ends up having a lot of problems. This PR
tries to stop that happening at the start.
It seems to me in the long run, requireing an output_root doesn't have any
downside. The decision needs to be made by Nels. Not sure if there is any
ripple effect to calibration if we change realization.
Maybe as you alluded to in (1). We could create a
directory programmatically.
…On Thu, Jul 18, 2024 at 3:15 PM Justin Singh-M. - NOAA < ***@***.***> wrote:
Based on the comments, I think this PR is partially solving #848
<#848> but needs a bit more
discussion, since it feels like an XY problem.
There's 2 separate issues here:
(1) output_root fails to write outputs if the specified path is invalid
(doesn't exist, isn't writable, etc.) as outlined in #848
<#848> -- There are two approaches
to solving this I think: either create the directory path (i.e. emulate mkdir
-p) if possible, or throw an error. This PR currently performs the latter
via stat.
(2) If no output_root is specified, potentially lots of files will be
outputted into the project directory since output_root defaults to ./, as
outline in #857 <#857>. I don't
think this is actually an issue as the solution is to just specify an
output_root in the config file.
I think the real question for this PR as is: do we want output_root to be
*optional* or *required*? Or rather, if output_root has an invalid path,
should we try to make the directories or should we always throw an error?
—
Reply to this email directly, view it on GitHub
<#855 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACA4SRK325UA36A37C3532TZNAO7NAVCNFSM6AAAAABLAWMQD6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZXGQ4DINZYGM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Great, summary @program--. I share the same feelings.
My preference would be that Likewise, if possible, emulating From a practicality stance, we are already including |
@austin Raney - NOAA Affiliate ***@***.***>
You are right, I didn't. I thought that test used the multiple bmi
realization.
On Thu, Jul 18, 2024 at 3:42 PM Shengting Cui - NOAA Affiliate <
***@***.***> wrote:
… The issue is, if we can still call it an issue, if someone runs a big job
and forgot to create the directory or entered wrong format in realization,
then when the job finishes, he ends up having a lot of problems. This PR
tries to stop that happening at the start.
It seems to me in the long run, requireing an output_root doesn't have any
downside. The decision needs to be made by Nels. Not sure if there is any
ripple effect to calibration if we change realization.
Maybe as you alluded to in (1). We could create a
directory programmatically.
On Thu, Jul 18, 2024 at 3:15 PM Justin Singh-M. - NOAA <
***@***.***> wrote:
> Based on the comments, I think this PR is partially solving #848
> <#848> but needs a bit more
> discussion, since it feels like an XY problem.
>
> There's 2 separate issues here:
>
> (1) output_root fails to write outputs if the specified path is invalid
> (doesn't exist, isn't writable, etc.) as outlined in #848
> <#848> -- There are two
> approaches to solving this I think: either create the directory path (i.e.
> emulate mkdir -p) if possible, or throw an error. This PR currently
> performs the latter via stat.
>
> (2) If no output_root is specified, potentially lots of files will be
> outputted into the project directory since output_root defaults to ./,
> as outline in #857 <#857>. I
> don't think this is actually an issue as the solution is to just specify an
> output_root in the config file.
>
> I think the real question for this PR as is: do we want output_root to
> be *optional* or *required*? Or rather, if output_root has an invalid
> path, should we try to make the directories or should we always throw an
> error?
>
> —
> Reply to this email directly, view it on GitHub
> <#855 (comment)>, or
> unsubscribe
> <https://github.com/notifications/unsubscribe-auth/ACA4SRK325UA36A37C3532TZNAO7NAVCNFSM6AAAAABLAWMQD6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZXGQ4DINZYGM>
> .
> You are receiving this because you were mentioned.Message ID:
> ***@***.***>
>
|
Yeah, that's a good point. Like you have started to do in this PR, transitioning our examples in |
@stcui007, no worries! |
@austin Raney - NOAA Affiliate ***@***.***>
I assume you know quite bit about calibration. So changing the Examples
would not affect them? We probably need to ask Nels. At the same time, we
need to carefully check our code so it doesn't have effect on that.
…On Thu, Jul 18, 2024 at 3:49 PM Austin Raney ***@***.***> wrote:
@stcui007 <https://github.com/stcui007>, no worries!
—
Reply to this email directly, view it on GitHub
<#855 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACA4SROKIXCXHVUDP2IGE4LZNAS35AVCNFSM6AAAAABLAWMQD6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZXGU2DSNJSGM>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
… On Mon, Jul 22, 2024 at 6:02 PM Justin Singh-M. - NOAA < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In include/realizations/catchment/Formulation_Manager.hpp
<#855 (comment)>:
>
+ const char* dir = str.c_str();
+
+ //use C++ system function to check if there is a dir match that defined in realization
+ struct stat sb;
+ if (stat(dir, &sb) == 0 && S_ISDIR(sb.st_mode))
+ return dir;
+ else {
+ //throw std::runtime_error("output_root directory does not exist, please create one matching that in realization");
+ mkdir(dir, 0755);
For (1), as long as we throw an error, that's fine.
For (2), where are you seeing that mkdir can create multi-level
directories?
This example results in an error:
#include <stdio.h>#include <string.h>#include <sys/stat.h>#include <errno.h>#include <unistd.h>
void test_path(const char* path) {
errno = 0;
int result_code = mkdir(path, 0775);
if (result_code == -1)
printf("[%s] error: %s\n", path, strerror(errno));
unlink(path);
}
int main(int argc, const char* argv[]) {
// This passes (unless current directory is not writable)
test_path("example_path");
// This fails unless 'example_nested_path/' is created
test_path("example_nested_path/subdir");
// This fails unless '/opt/ngen/outputs/' is created
test_path("/opt/ngen/outputs/22July2024_1");
return 0;
}
For your second point in (2), the purpose is for consumers. It's a
reasonable assumption that if output_root is set to a directory, and it
can create a top-level directory, then it should be able to create
multi-level directories. However, worst-case, as long as we're throwing the
errors, then the consumer will be informed.
—
Reply to this email directly, view it on GitHub
<#855 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACA4SRMW7TIPOHRKWFHJCBTZNWFO5AVCNFSM6AAAAABLAWMQD6VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOJSGY3TSNBVGI>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
A more careful reading of the site suggests it might have to be done
recursively.
On Tue, Jul 23, 2024 at 7:52 AM Shengting Cui - NOAA Affiliate <
***@***.***> wrote:
… This link:
https://stackoverflow.com/questions/55994752/adding-multiple-directories-to-mkdir
On Mon, Jul 22, 2024 at 6:02 PM Justin Singh-M. - NOAA <
***@***.***> wrote:
> ***@***.**** commented on this pull request.
> ------------------------------
>
> In include/realizations/catchment/Formulation_Manager.hpp
> <#855 (comment)>:
>
> >
> + const char* dir = str.c_str();
> +
> + //use C++ system function to check if there is a dir match that defined in realization
> + struct stat sb;
> + if (stat(dir, &sb) == 0 && S_ISDIR(sb.st_mode))
> + return dir;
> + else {
> + //throw std::runtime_error("output_root directory does not exist, please create one matching that in realization");
> + mkdir(dir, 0755);
>
> For (1), as long as we throw an error, that's fine.
>
> For (2), where are you seeing that mkdir can create multi-level
> directories?
>
> This example results in an error:
>
> #include <stdio.h>#include <string.h>#include <sys/stat.h>#include <errno.h>#include <unistd.h>
> void test_path(const char* path) {
> errno = 0;
> int result_code = mkdir(path, 0775);
> if (result_code == -1)
> printf("[%s] error: %s\n", path, strerror(errno));
>
> unlink(path);
> }
> int main(int argc, const char* argv[]) {
> // This passes (unless current directory is not writable)
> test_path("example_path");
>
> // This fails unless 'example_nested_path/' is created
> test_path("example_nested_path/subdir");
>
> // This fails unless '/opt/ngen/outputs/' is created
> test_path("/opt/ngen/outputs/22July2024_1");
> return 0;
> }
>
> For your second point in (2), the purpose is for consumers. It's a
> reasonable assumption that if output_root is set to a directory, and it
> can create a top-level directory, then it should be able to create
> multi-level directories. However, worst-case, as long as we're throwing the
> errors, then the consumer will be informed.
>
> —
> Reply to this email directly, view it on GitHub
> <#855 (comment)>, or
> unsubscribe
> <https://github.com/notifications/unsubscribe-auth/ACA4SRMW7TIPOHRKWFHJCBTZNWFO5AVCNFSM6AAAAABLAWMQD6VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOJSGY3TSNBVGI>
> .
> You are receiving this because you were mentioned.Message ID:
> ***@***.***>
>
|
"However, worst-case, as long as we're throwing the errors, then the
consumer will be informed."
I agree. If users are warned, then create a multi-level directory manually,
it should always work.
On Tue, Jul 23, 2024 at 8:32 AM Shengting Cui - NOAA Affiliate <
***@***.***> wrote:
… A more careful reading of the site suggests it might have to be done
recursively.
On Tue, Jul 23, 2024 at 7:52 AM Shengting Cui - NOAA Affiliate <
***@***.***> wrote:
> This link:
> https://stackoverflow.com/questions/55994752/adding-multiple-directories-to-mkdir
>
> On Mon, Jul 22, 2024 at 6:02 PM Justin Singh-M. - NOAA <
> ***@***.***> wrote:
>
>> ***@***.**** commented on this pull request.
>> ------------------------------
>>
>> In include/realizations/catchment/Formulation_Manager.hpp
>> <#855 (comment)>:
>>
>> >
>> + const char* dir = str.c_str();
>> +
>> + //use C++ system function to check if there is a dir match that defined in realization
>> + struct stat sb;
>> + if (stat(dir, &sb) == 0 && S_ISDIR(sb.st_mode))
>> + return dir;
>> + else {
>> + //throw std::runtime_error("output_root directory does not exist, please create one matching that in realization");
>> + mkdir(dir, 0755);
>>
>> For (1), as long as we throw an error, that's fine.
>>
>> For (2), where are you seeing that mkdir can create multi-level
>> directories?
>>
>> This example results in an error:
>>
>> #include <stdio.h>#include <string.h>#include <sys/stat.h>#include <errno.h>#include <unistd.h>
>> void test_path(const char* path) {
>> errno = 0;
>> int result_code = mkdir(path, 0775);
>> if (result_code == -1)
>> printf("[%s] error: %s\n", path, strerror(errno));
>>
>> unlink(path);
>> }
>> int main(int argc, const char* argv[]) {
>> // This passes (unless current directory is not writable)
>> test_path("example_path");
>>
>> // This fails unless 'example_nested_path/' is created
>> test_path("example_nested_path/subdir");
>>
>> // This fails unless '/opt/ngen/outputs/' is created
>> test_path("/opt/ngen/outputs/22July2024_1");
>> return 0;
>> }
>>
>> For your second point in (2), the purpose is for consumers. It's a
>> reasonable assumption that if output_root is set to a directory, and it
>> can create a top-level directory, then it should be able to create
>> multi-level directories. However, worst-case, as long as we're throwing the
>> errors, then the consumer will be informed.
>>
>> —
>> Reply to this email directly, view it on GitHub
>> <#855 (comment)>, or
>> unsubscribe
>> <https://github.com/notifications/unsubscribe-auth/ACA4SRMW7TIPOHRKWFHJCBTZNWFO5AVCNFSM6AAAAABLAWMQD6VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOJSGY3TSNBVGI>
>> .
>> You are receiving this because you were mentioned.Message ID:
>> ***@***.***>
>>
>
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, it soesn't make any difference.
…On Tue, Jul 23, 2024 at 4:10 PM Justin Singh-M. - NOAA < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In include/realizations/catchment/Formulation_Manager.hpp
<#855 (comment)>:
>
+ const char* dir = str.c_str();
+
+ //use C++ system function to check if there is a dir match that defined in realization
+ struct stat sb;
+ if (stat(dir, &sb) == 0 && S_ISDIR(sb.st_mode))
+ return dir;
+ else {
+ int result = mkdir(dir, 0755);
+ if (result == 0)
+ return dir;
+ else
+ throw std::runtime_error("mkdir " + std::string(dir) + " failed, please create " + std::string(dir));
Oops, I think it actually should be std::strerror
<https://en.cppreference.com/w/cpp/string/byte/strerror>, my bad.
—
Reply to this email directly, view it on GitHub
<#855 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACA4SRKIONXSXU4BSJLH54TZN3BF5AVCNFSM6AAAAABLAWMQD6VHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMZDCOJVGA4TQMBSG4>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Changed the included header. |
The logic looks fine to me. |
Looks like this adds some reasonable QoL logic to help with output management. Thanks! |
This PR fixes the problem of directory that
output_root
pointed to does not exist. If this is the case, the code creates a sub-directory based in the name given by theoutput_root
in the realization. The code defaults to writing the outputs to the project directory ifoutput_root
does not exist as in the current realization.This is an attempt to solve the problem raised in issues #857 and #848, where large number of output files are written. This is a common problem for running ngen on any large basin.
Additions
Removals
Changes
get_output_root()
function and unit test.Two realizations with
output_root
for demonstration. One realization in its original form, i.e., w/ooutput_root
for demonstrationTesting
Screenshots
Notes
Todos
Checklist
Testing checklist (automated report can be put here)
Target Environment support