Add directory transfer support for SFTPOperator #44126
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst)
@Dawnpool You seem to have a whole non-db test suite failing. Can you check it on your local breeze and see if you can fix it?
Hi, I guess there was an issue with the test code at the time I forked the repository. It might have been resolved by merging the main branch. There is no problem in my local breeze environment for now.
Let's wait for one of the maintainers to approve the remaining workflows. That will give you some clarity if rebasing against
Approved workflows.
errors, errors everywhere :D
@Dawnpool Try reproducing your CI tests locally and figuring out what's going wrong. This seems like a good place to start: https://github.com/apache/airflow/blob/main/dev/breeze/doc/ci/08_running_ci_locally.md
The PR has conflicts that need to be resolved.
@potiuk @eladkal
I think it's a side effect of bad configuration, likely some initialization that is missing somewhere in some of the skipped provider tests. It is likely mitigated when the full tests are run, as some other tests are initializing SQLAlchemy (which we should not need at all for DB Tests). I will take a look at it later today and try to find out what it is (and it would be great not to merge it until then, as we might be able to see whether the problem is fixed once I find it).
Can you please rebase/resolve conflicts and ping me if it fails again.
@potiuk
😱
I think this should fix it: #45244
Let's see - I added the fix on top and we should see if the problem is fixed.
Nope - there is another issue
All right. I think I found it #45249 - very interesting issue (and your change accidentally revealed it because it selectively run non-db
OK. The fix is merged. You can rebase again and I 🤞 that it should work now.
Awesome work, congrats on your first merged pull request! You are invited to check our Issue Tracker for additional contributions.
Finally!
Finally! Thank you all 😁
* Add directory put and get functions for sftp provider
* Add test code
* Add directory exists check
* Fix merge conflict
* Add path exists check
Thank you for your contribution @Dawnpool! In this implementation, files are downloaded sequentially, right? If so, this approach will be quite inefficient when downloading folders that contain a large number of small files. I think we should consider expanding the method, or adding another one that downloads files concurrently. Possibly related to SFTPHookAsync.
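To make the concurrency suggestion concrete, here is a minimal sketch of fanning per-file downloads out over a thread pool. It is not code from this PR or from the SFTP provider: `fetch_all` and the `fetch_one(remote_path, local_path)` callable are hypothetical names, and whether a single SFTP connection can be shared safely across threads is not addressed here (each worker may need its own connection).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def fetch_all(
    pairs: Iterable[tuple[str, str]],
    fetch_one: Callable[[str, str], None],
    max_workers: int = 4,
) -> list[str]:
    """Run fetch_one(remote, local) for every pair on a thread pool.

    fetch_one is a hypothetical per-file download callable supplied by the
    caller. Returns the local paths in the same order as the input pairs.
    """
    pairs = list(pairs)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps input order; consuming the iterator re-raises
        # the first exception raised by a worker, if any.
        list(pool.map(lambda pair: fetch_one(*pair), pairs))
    return [local for _, local in pairs]
```

With many small files, the per-file round-trip latency tends to dominate, so overlapping the transfers is where most of the gain would come from; `max_workers` would need tuning against what the server allows.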
Hi @Dev-iL!
This PR implements directory transfer for SFTPOperator, related to an issue I raised previously.
Currently, the SFTPOperator only accepts file paths, so to transfer an entire folder you have to list every filename in it explicitly.
By adding some directory handling logic, the operator can now accept directory paths as well, allowing users to easily transfer entire folders.
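As an illustration of what directory handling involves, the sketch below walks a local folder and works out which remote directories have to exist and which file pairs have to be copied. It is only a sketch under stated assumptions, not the implementation in this PR: `plan_directory_upload` is a hypothetical helper, it uses only the standard library, and it assumes the remote side uses POSIX-style paths.

```python
import os
import posixpath


def plan_directory_upload(
    local_dir: str, remote_dir: str
) -> tuple[list[str], list[tuple[str, str]]]:
    """Return (remote directories to create, (local_file, remote_file) pairs).

    Hypothetical helper for illustration; it performs no network I/O.
    """
    remote_dirs = [remote_dir]
    file_pairs = []
    for root, dirs, files in os.walk(local_dir):
        dirs.sort()  # make the traversal order deterministic
        rel = os.path.relpath(root, local_dir)
        # Map the local subdirectory onto a POSIX-style remote path.
        if rel == ".":
            remote_root = remote_dir
        else:
            remote_root = posixpath.join(remote_dir, *rel.split(os.sep))
        # Top-down walk: a parent is always listed before its children.
        for name in dirs:
            remote_dirs.append(posixpath.join(remote_root, name))
        for name in sorted(files):
            file_pairs.append(
                (os.path.join(root, name), posixpath.join(remote_root, name))
            )
    return remote_dirs, file_pairs
```

A caller would create each entry of the first list on the server in order, then upload each pair from the second list with whatever single-file transfer function it already has.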