-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CALET text conversion #3
Conversation
|
||
import requests | ||
|
||
from ... import conversion, utils |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a missing file that corresponds with this utils import? I see that you are importing conversion and utils from the grandparent package but I am unable to find utils.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file is in #2
} | ||
|
||
|
||
def text_to_json_calet(notice, record_number): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit unusual to me. Typically we don't put logic in init.py files.
It is much more common to see something like the following:
from my_package import main_function
main_function()
so that would end up with the functions below existing in a separate file and then being imported and called as needed.
from file_you_created import create_all_calet_jsons
create_all_calet_jsons()
There are a couple of reasons for this:
- code readability. People typically aren't looking in init files for functions. In cases where the package is larger and there might be a large quantity of functions that go in there, it can get really long and really hard to read. It isn't hard to read here because there is not a lot of code, but it's good to get in the habit of putting things in a standard place.
- Again, not really an issue here because it's a small amount of code, but because everything in this file gets run at import that means if you have a lot of code in the init file, it can eventually lead to slow import times.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially trying to configure the way you've described but I think the reason that I formatted the files this way was that I was struggling with figuring out the relatives import and tried a few different things and this format stuck. But now that I have relative imports working I could certainly change it back. I could define a conversion.py file for each subpackage which has most of the logic and keep the core stuff in the init.py
main.sh
Outdated
@@ -0,0 +1,3 @@ | |||
#!/bin/bash | |||
|
|||
python3 -m gcn_classic_text_to_json.notices.calet.__init__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this code intended to be run? Is it something that will be called from the command line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is how the code is intended to be run for now. It needs to be called from the command line with the repo directory as your root. This was just something quick that I got up so I could start testing if the conversions were accurate. If there are better ideas for how to call the subpackages, I can change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, we're going to make a few changes but let's do them one bit at a time so that they don't get jumbled together.
Let's start with these steps:
-
let's break the functions out into their own file(s) and run it like you have it here just to make sure that the imports are working properly. You can call them from the init files for now (or if you feel like you don't need the init files, you can just import them relatively instead of as a package).
-
Then go ahead and add in your PR Utility functions #2 into this PR so utils is there. Typically to get a PR approved, all the code it needs to run is expected to be in the PR. There are times that you have to line them up in order and merge in order, but it can get complicated really fast if you do that. So since it's not a ton of additional code, we'll just close Utility functions #2 and bring that into this branch and they can go in together.
-
you can remove the main.sh file because we are going to replace it. the main.sh is fine, it's just that it's not necessary to use bash to accomplish what you want to do. instead of
bash main.sh
you can just call it all with python. Here are two alternate ways to approach it:
A. Inside the gcn_classic_text_to_json let's create a file called__main__.py
and inside of that import the entry function (create_all_calet_jsons). It would look like this (but keep in mind that the import may have changed since you are moving things out into a file that isn't the init file) so it would look like this but with an updated import statement:
from gcn_classic_text_to_json.notices.calet import create_all_calet_jsons
if __name__ == "__main__":
create_all_calet_jsons()
this way when you want to run the code, you can just do python3 -m gcn_classic_text_to_json
to run it. This will only work if we leave them as packages, so if you decide that you just want to use relative imports instead of modules, try option B.
B. just create a file at the root of the project and put the import for create_all_calet_jsons import and function call like so (again keeping in mind any changes you have made to imports):
from gcn_classic_text_to_json.notices.calet import create_all_calet_jsons
create_all_calet_jsons()
when I was testing this locally, I just called it entrypoint.py and placed it at the root level. the way I call it in the terminal is python3 entrypoint.py
. You can call it whatever you want, entrypoint as a file name is just a way of saying "get to the code starting here". Another one commonly used for cli execution is cli.py. so in that case you'd run python3 cli.py
The last one is not super duper important functionally because it does get called correctly the way you have it. I just wanted to show you other ways to do it that would keep it pure python. There are many ways to create an entrypoint to your code and usually we determine which one to use based on how the code will be called. If we wanted to turn the whole thing into an importable package that we would call from another code base, we'd use yet a different way of doing it. (which I am happy to pair with you on and show you or send you reading materials just so you know what all the options are, but I didn't want it to get too confusing here. so just let me know if you'd like that).
Imports in python are sometimes hard and confusing and take time to get used to. You are very much not alone in having encountered issues with them.
I have tested your code locally and it does run just fine and it seems to be doing exactly what you want it to be doing, so excellent job on getting it all to work! If you need any help or get bogged down anywhere in this change request, please don't suffer too long. I am happy to pair with you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great!
Added CALET text conversion Added main.sh Added README with keyword mapping switched to using Beautiful Soup small changes updated README refactored code to use email library switched to using context manager to handle files Added utility functions removed parse_trigger_ids() (moved to conversion.py) added general parsing function moved utils.py fixed error with set_alert_datetime() removed redundant functions refactored code and updated main call Set up project environment Added License, updated README and added main directory updated README.md updated README and .gitignore switched to using sernum to save jsons added bs4 to poetry
Added code for CALET text conversion. The code works by calling a general text parsing function that is meant to be used by all missions and then adding additional details that are specific CALET. However if a different code structure is preferred, I can change it.