Automate schedule downloads #61

Merged on Sep 20, 2023, 32 commits
Commits
a2af9bf
First commit for downloading and saving schedule data
dcjohnson24 Jul 18, 2023
4b38f62
Fix syntax error
dcjohnson24 Jul 18, 2023
50a8a4e
Change version constraint of mapclassify
dcjohnson24 Jul 19, 2023
f56d0d4
remove single quote
dcjohnson24 Jul 20, 2023
140ffbc
Run as a module
dcjohnson24 Jul 20, 2023
9f00363
Add print function for saving csv to public bucket
dcjohnson24 Jul 25, 2023
12f6b08
Download schedule daily at 5:30pm UTC
dcjohnson24 Jul 25, 2023
8aa3691
Save zipfile from transitchicago.com to s3
dcjohnson24 Jul 25, 2023
2ee3d05
Change method of uploading zipfile
dcjohnson24 Jul 26, 2023
7c6a42e
Check that objects exist in bucket
dcjohnson24 Jul 27, 2023
bc91766
Change yield to print
dcjohnson24 Jul 27, 2023
c0c153c
Separate downloading zip file and saving daily summaries
dcjohnson24 Aug 8, 2023
2dc18f3
remove job dependency
dcjohnson24 Aug 8, 2023
d35f310
Add args to same line
dcjohnson24 Aug 8, 2023
ee7b057
Save realtime summary file
dcjohnson24 Aug 13, 2023
461df42
Change to string
dcjohnson24 Aug 13, 2023
e1baeaa
Correct python version name
dcjohnson24 Aug 13, 2023
398d62a
Add quotes
dcjohnson24 Aug 13, 2023
77ef708
Add environment context
dcjohnson24 Aug 13, 2023
4842fa0
Remove quotes
dcjohnson24 Aug 13, 2023
3817614
Test without environment variables
dcjohnson24 Aug 13, 2023
cfb0960
Revert "Test without environment variables"
dcjohnson24 Aug 13, 2023
c08335a
Change python version
dcjohnson24 Aug 13, 2023
1eef5d2
Loosen constraint on pandas version
dcjohnson24 Aug 14, 2023
b56f0c8
Change cta_schedule_versions to cta_data_downloads
dcjohnson24 Aug 14, 2023
665e90e
Install libgeo-dev
dcjohnson24 Aug 14, 2023
6e287ec
Back to python 3.10
dcjohnson24 Aug 14, 2023
9b04970
Change back to version constraint
dcjohnson24 Aug 14, 2023
f0bd45a
Change timezone to America/Chicago
dcjohnson24 Aug 14, 2023
cebd713
Change to correct end date for realtime data
dcjohnson24 Aug 14, 2023
20c595f
rename schedule summary function
dcjohnson24 Aug 15, 2023
4c06991
remove on push
lauriemerrell Sep 20, 2023
rename schedule summary function
dcjohnson24 committed Aug 15, 2023
commit 20c595fe7687fbc716c2da5f1808030c330be175
4 changes: 2 additions & 2 deletions .github/workflows/cta_data_downloads.yml
@@ -47,8 +47,8 @@ jobs:
       - name: 'Save schedule summaries'
         run: |
           pip install -r requirements.txt
-          python -c 'from scrape_data.cta_data_downloads import save_route_daily_summary; \
-          save_route_daily_summary()' $AWS_ACCESS_KEY_ID $AWS_SECRET_ACCESS_KEY
+          python -c 'from scrape_data.cta_data_downloads import save_sched_daily_summary; \
+          save_sched_daily_summary()' $AWS_ACCESS_KEY_ID $AWS_SECRET_ACCESS_KEY


   save-realtime-daily-summary:
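
A note on the step above: anything placed after the quoted `python -c` program is exposed to that program through `sys.argv`, with `sys.argv[0]` set to `'-c'`. The diff does not show how `scrape_data.cta_data_downloads` reads the two AWS values, so the sketch below is only an assumption about one way it could be done. `read_credentials` is a hypothetical helper, not a function from this repository.

```python
from typing import Sequence, Tuple


def read_credentials(argv: Sequence[str]) -> Tuple[str, str]:
    """Return (access_key_id, secret_access_key) from an argv-style sequence.

    Hypothetical helper, not taken from the repository. With
    `python -c '<code>' A B`, sys.argv is ['-c', 'A', 'B'], so the two
    values sit at positions 1 and 2.
    """
    if len(argv) < 3:
        raise ValueError("expected two arguments after the -c program")
    return argv[1], argv[2]


# Simulated argv, shaped like what `python -c '...' KEY_ID SECRET` produces.
example_argv = ["-c", "EXAMPLE_KEY_ID", "EXAMPLE_SECRET"]
access_key_id, secret_access_key = read_credentials(example_argv)
```

In a real run the helper would be called with `sys.argv`. If the module instead reads the credentials from environment variables, the positional arguments in the workflow step would be unused; the diff alone does not settle which applies.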
2 changes: 1 addition & 1 deletion scrape_data/cta_data_downloads.py
@@ -60,7 +60,7 @@ def save_csv_to_bucket(df: pd.DataFrame, filename: str) -> None:
         .put(Body=csv_buffer.getvalue())


-def save_route_daily_summary() -> None:
+def save_sched_daily_summary() -> None:
     data = sga.GTFSFeed.extract_data(CTA_GTFS)
     data = sga.format_dates_hours(data)
     trip_summary = sga.make_trip_summary(data)