Merging develop into master #22

Closed
109 commits
50639e1
attempt docker-registry
ChiaraFor96 Mar 14, 2024
0ea591f
attempt docker-registry
ChiaraFor96 Mar 14, 2024
4b9326c
attempt docker-registry
ChiaraFor96 Mar 14, 2024
2800e1b
attempt docker-registry
ChiaraFor96 Mar 14, 2024
7f8029e
attempt docker-registry
ChiaraFor96 Mar 14, 2024
12f6643
attempt docker-registry
ChiaraFor96 Mar 14, 2024
115f8f9
change docker registry compose
ManuelePasini Mar 15, 2024
e2a0d4d
update ports for expose docker registry
ChiaraFor96 Mar 15, 2024
702a6c1
upd
ChiaraFor96 Mar 15, 2024
11dbeec
upd
ChiaraFor96 Mar 15, 2024
ff614dd
upd
ChiaraFor96 Mar 15, 2024
a740e6c
Merge branch 'master' into develop
ManuelePasini Mar 15, 2024
dce358d
Merge branch 'develop' into develop-forresi
ChiaraFor96 Mar 15, 2024
cc65ad2
align with repo
ChiaraFor96 Mar 15, 2024
9f5cc94
change volume folder
ChiaraFor96 Mar 15, 2024
2da971c
change folder
ChiaraFor96 Mar 15, 2024
7bce64e
add comment
ChiaraFor96 Mar 15, 2024
306bca2
update docker file
ChiaraFor96 Mar 18, 2024
7eee9c5
update readme and geoserver docker file
ChiaraFor96 Mar 18, 2024
3423b6f
update readme and geoserver docker file
ChiaraFor96 Mar 18, 2024
6da610e
fix volumes
ChiaraFor96 Mar 18, 2024
d2ac997
fix port
ChiaraFor96 Mar 18, 2024
a68162b
refactor
ChiaraFor96 Mar 19, 2024
6f37cd7
geoserver with hdfs client
ChiaraFor96 Mar 19, 2024
da2d049
fix
ChiaraFor96 Mar 19, 2024
4c874c7
fix
ChiaraFor96 Mar 19, 2024
77da86c
fix
ChiaraFor96 Mar 19, 2024
c322d48
WIP geoserver
ChiaraFor96 Mar 19, 2024
a9f5a5a
WIP geoserver
ChiaraFor96 Mar 19, 2024
210d253
WIP geoserver
ChiaraFor96 Mar 19, 2024
3350dfe
WIP geoserver
ChiaraFor96 Mar 19, 2024
8aa7643
airflow remove unused volume and geoserver add script
ChiaraFor96 Mar 19, 2024
7c0ce1d
fix script mount
ChiaraFor96 Mar 19, 2024
e00dc91
fix script mount
ChiaraFor96 Mar 19, 2024
5897be7
change geoserver mount to nfs
ChiaraFor96 Mar 19, 2024
5a68964
change geoserver to be a manager
ChiaraFor96 Mar 20, 2024
a566141
manage airflow roles
ChiaraFor96 Mar 20, 2024
69475ff
pass airlfow installation
ChiaraFor96 Mar 20, 2024
de61383
pass airlfow installation
ChiaraFor96 Mar 20, 2024
fd9fbb7
small updates
ChiaraFor96 Mar 20, 2024
c06a906
wip to change postgres db
ChiaraFor96 Mar 20, 2024
5330de4
fix airflow with our postgres db
ChiaraFor96 Mar 20, 2024
e838df0
add a todo
ChiaraFor96 Mar 20, 2024
6ff5c0c
update docker airflow tentative script
ChiaraFor96 Mar 20, 2024
e5e70b2
fix docker file and update readme
ChiaraFor96 Mar 20, 2024
eb6854e
Merge branch 'master' into develop-forresi
ChiaraFor96 Mar 21, 2024
1415c59
update the readme
ChiaraFor96 Mar 22, 2024
6b7b7a4
remove unuseful code and update readme to a final version
ChiaraFor96 Mar 25, 2024
f868ba7
add constraint to fix deploy
ChiaraFor96 Mar 28, 2024
31bc52d
modified spark hist server port
ManuelePasini Apr 3, 2024
bb9e92c
Merge branch 'develop' into develop-forresi
ChiaraFor96 Apr 3, 2024
42efcfe
update airflow
ChiaraFor96 Apr 5, 2024
dfd021d
update airflow
ChiaraFor96 Apr 5, 2024
20eab33
update readme
ChiaraFor96 Apr 5, 2024
e7d14b0
add airflow env and change deploy-swarm.sh
ChiaraFor96 Apr 10, 2024
3687104
try to fix variable substitution
ChiaraFor96 Apr 10, 2024
3640b52
fix variable substitution
ChiaraFor96 Apr 10, 2024
d535c03
add mail configuration for airflow
ChiaraFor96 Apr 10, 2024
d093239
add mail configuration for airflow
ChiaraFor96 Apr 10, 2024
5fe29d4
add mail configuration for airflow
ChiaraFor96 Apr 10, 2024
47d34bd
add mail configuration for airflow
ChiaraFor96 Apr 10, 2024
bc164eb
try to fix problems
ChiaraFor96 Apr 10, 2024
d742265
try to fix problems
ChiaraFor96 Apr 10, 2024
ddd11bb
Added sftp server
ManuelePasini May 7, 2024
318f49a
Fix: port error in SFTP server
ManuelePasini May 7, 2024
d6f31f3
Dev: Updating SFTP
ManuelePasini May 7, 2024
00c4829
Fix: Updating hostkeys
ManuelePasini May 7, 2024
69da1d1
Fix: updating keys
ManuelePasini May 7, 2024
dc4809f
Fix
ManuelePasini May 7, 2024
b53b655
Fix
ManuelePasini May 7, 2024
2c30bda
DEV: implementing SFTP
ManuelePasini May 8, 2024
a4c2d2b
Working SFTP server
ManuelePasini May 8, 2024
d53d8ac
Refactor: updated NFS folders
ManuelePasini May 8, 2024
e65b603
Dev: SFTP install ACL
ManuelePasini May 9, 2024
2bcdd0e
Fix: keeping service alive
ManuelePasini May 9, 2024
4946c69
Fix: acl implementations
ManuelePasini May 9, 2024
60e1657
Merge branch 'refs/heads/develop' into develop-forresi
ChiaraFor96 May 16, 2024
094cbb2
remove .env from root and update spark version instead of using latest
ChiaraFor96 May 16, 2024
4d81bbe
update spark version
ChiaraFor96 May 16, 2024
c2e32dc
refactor and move legacy stack
ChiaraFor96 May 16, 2024
98362ac
utils add docker cleaner
ChiaraFor96 Jun 26, 2024
0bab270
update cleaner
ChiaraFor96 Jun 26, 2024
e0bc7b7
update cleaner
ChiaraFor96 Jun 26, 2024
721982f
install docker
ChiaraFor96 Jun 26, 2024
ef7ab2a
add commands script
ChiaraFor96 Jun 26, 2024
f77c130
fix commands script
ChiaraFor96 Jun 26, 2024
fd47913
change script
ChiaraFor96 Jun 26, 2024
1428902
feat: trying things
ManuelePasini Jun 27, 2024
8ab3660
updating FIWARE stack
ManuelePasini Jun 27, 2024
797669f
updating kafka stack
ManuelePasini Jun 27, 2024
1f0422e
WIP geoserver csrf
ChiaraFor96 Jul 24, 2024
b9f89de
fix geoserver csrf
ChiaraFor96 Jul 24, 2024
1f6021e
lost some updates
Jul 25, 2024
66b736d
added hue
ManuelePasini Jul 30, 2024
2628151
updated sftp port
Jul 30, 2024
1d36d2e
fix: updating hue stack
ManuelePasini Jul 30, 2024
e3f8fa7
Merge branch 'develop' of https://github.com/big-unibo/dataplatform i…
ManuelePasini Jul 30, 2024
8781a52
modified hue stack
Jul 31, 2024
0a534ed
feat: Swarm cleaner now also cleans up dangling services
ManuelePasini Sep 11, 2024
b047f49
chore: minor refactoring:
ManuelePasini Sep 12, 2024
6a57390
removed .env
Sep 12, 2024
026dafa
added .env example
Sep 12, 2024
4a084d1
chore: removing .airflow.example after mergint it with .env
Sep 12, 2024
3103141
fix: modyfing deploy swarm script for c.i.
ManuelePasini Sep 12, 2024
de74b57
fix: modifying .env for c.i.
ManuelePasini Sep 12, 2024
8c4f6a0
fix: modifying .env for c.i.
ManuelePasini Sep 12, 2024
331520c
fix: modifying .env for c.i.
ManuelePasini Sep 12, 2024
11b8a5b
fix: modifying .env for c.i.
ManuelePasini Sep 12, 2024
cbf74f4
fix: modifying .env for c.i.
ManuelePasini Sep 12, 2024
27 changes: 0 additions & 27 deletions .env

This file was deleted.

4 changes: 4 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
.env
dataplatform/.env
/dataplatform/tests/HDFD_VENV_3.7
/dataplatform/tests/HDFD_VENV_3.7
/dataplatform/multiple_stacks/runtime/
@@ -639,3 +641,5 @@ fabric.propertyDataList
/python-utils/src/main/python/gmail/credentials.json
/python-utils/src/main/python/gmail/token.pickle
*.png

*.airflow.env
51 changes: 51 additions & 0 deletions airflow_dags/README.md
@@ -0,0 +1,51 @@
# Airflow
[Installation](https://airflow.apache.org/docs/apache-airflow/stable/howto/docker-compose/index.html)

## Docker registry
- Stack: `dataplatform/multiple_stacks/registry.yaml`
- Reachable from inside the cluster at 127.0.0.0:5000
- Starting from a Dockerfile on a cluster machine:
    - `docker image build --tag 127.0.0.0:5000/IMAGE_NAME:VERSION -f PATH_DOCKERFILE .`
    - `docker push 127.0.0.0:5000/IMAGE_NAME:VERSION`
- From any other cluster machine:
    - `docker pull 127.0.0.0:5000/IMAGE_NAME:VERSION`

- To clean up data in the registry, enter the container and run:
    - `registry garbage-collect -m /etc/docker/registry/config.yml`
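
The build, push and pull steps above can be scripted. The following is only a minimal sketch: the helper names `image_ref` and `registry_commands`, as well as the image name, version and Dockerfile path in the usage part, are hypothetical and not part of this repository. The registry address is the one written in this README.

```python
from __future__ import annotations

import shlex

REGISTRY = "127.0.0.0:5000"  # registry address as written in this README


def image_ref(image_name: str, version: str, registry: str = REGISTRY) -> str:
    """Return the fully qualified reference REGISTRY/IMAGE_NAME:VERSION."""
    return f"{registry}/{image_name}:{version}"


def registry_commands(image_name: str, version: str, dockerfile: str) -> dict[str, list[str]]:
    """Build the argv lists for the build, push and pull steps (hypothetical helper)."""
    ref = image_ref(image_name, version)
    return {
        "build": ["docker", "image", "build", "--tag", ref, "-f", dockerfile, "."],
        "push": ["docker", "push", ref],
        "pull": ["docker", "pull", ref],
    }


if __name__ == "__main__":
    # Hypothetical image name, version and Dockerfile path, for illustration only.
    for step, argv in registry_commands("my-image", "0.1.0", "docker/Dockerfile").items():
        print(step, "->", shlex.join(argv))
```

Each argv list can be passed to `subprocess.run(argv, check=True)` on the relevant machine: build and push where the Dockerfile lives, pull on any other cluster machine.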

## Airflow
Start example:
- In the DAGs directory, that is `:${NFSPATH}/dataplatform_config/airflow_data/dags`:
    - create a directory for each project and put the file that generates the DAG in it (in a subdirectory)
    - example files are in the __abds-bigdata__ project, under `\cimice\src\main\resources` and `\ingestion-weather\src\main\resources`
- [DAG](https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html): A DAG (Directed Acyclic Graph) is the core concept of Airflow, collecting Tasks together, organized with dependencies and relationships to say how they should run.
- [scheduling options](https://airflow.apache.org/docs/apache-airflow/1.10.1/scheduler.html)
- Our DAGs always consist of a single task, which is a [Docker Operator](https://airflow.apache.org/docs/apache-airflow-providers-docker/1.0.2/_api/airflow/providers/docker/operators/docker/index.html)
- The name must be made up exclusively of alphanumeric characters, dashes, dots and underscores
- In particular, we use a specialization, the [Docker swarm operator](https://airflow.apache.org/docs/apache-airflow-providers-docker/stable/_api/airflow/providers/docker/operators/docker_swarm/index.html#airflow.providers.docker.operators.docker_swarm.DockerSwarmOperator),
which is useful for putting constraints on where Docker containers are spawned.
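
The naming rule above (alphanumeric characters, dashes, dots and underscores only) can be checked before a DAG file is dropped into the DAGs directory. This is a minimal sketch: `is_valid_name` is a hypothetical helper, and the ASCII-only pattern is an assumption about how strictly the rule should be read, not a copy of Airflow's own validation.

```python
import re

# Assumption: an ASCII-only reading of "alphanumeric characters, dashes, dots and underscores".
VALID_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def is_valid_name(name: str) -> bool:
    """Return True if the whole name uses only the allowed characters (hypothetical helper)."""
    return VALID_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for candidate in ("ingestion-weather.daily_v2", "ingestion weather", "cimice/dag"):
        print(candidate, is_valid_name(candidate))
```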
### DockerSwarmOperator in Airflow (some notes)
- The last line printed by the Docker container is pushed to XComs
    - To push everything written to standard output instead: `xcom_all=True`
- Constraints on CPU and memory usage
- **auto_remove**=True, removes the container once it exits (the equivalent of `docker rm`)
- **mounts**=[], the volumes to mount, each defined by "source", "target", "type", "read_only"
- **command**="Command to be run in the container", overrides the image's CMD; add a trailing space to signal that it is not a template
- **mount_tmp_dir**=False, does not mount a temporary directory
- **container_name**=similar to the task name
- **placement**
- **network_mode** and **networks**: use BIG-dataplatform-network
- For anything else, refer to the official documentation

### Trigger a DAG from a Python application
This is done in the `ingestion-weather` module of the **abds-bigdata** project,
through the `python-service-interaction-utils/src/main/python/airflow_interaction.py` service.
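
As a rough illustration of what triggering a DAG from Python can look like, the sketch below builds a request for the Airflow 2 stable REST API endpoint `POST /api/v1/dags/{dag_id}/dagRuns`. It is not necessarily how `airflow_interaction.py` works: the function name `build_trigger_request`, the host, the DAG id, the `conf` payload and the credentials are all hypothetical, and it assumes basic authentication is enabled on the Airflow API.

```python
import base64
import json
import urllib.request


def build_trigger_request(base_url: str, dag_id: str, conf: dict,
                          username: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would start a new run of dag_id."""
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # Assumption: the Airflow API accepts HTTP basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"conf": conf}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "Authorization": f"Basic {token}"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host, DAG id, payload and credentials, for illustration only.
    req = build_trigger_request(
        "http://airflow-webserver:8080", "ingestion-weather", {"day": "2024-04-10"}, "user", "password"
    )
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would actually send it; omitted so the sketch runs offline.
```

Building the request separately from sending it keeps the URL and payload easy to inspect, which helps when the DAG id or the service name of the webserver changes between clusters.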

### Common deployment errors
- Do not pass files listed in .gitignore to the container build
- Use service names and internal ports when referring to other services (do not use exposed ports)
- Set the service links in the config to the new clusters (e.g., hdfs)
- Errors in DAG import: enter the airflow-scheduler container and run `airflow scheduler`

### Possible updates
- Configure an SMTP server to [send mails on failure](https://stackoverflow.com/questions/58736009/email-on-failure-retry-with-airflow-in-docker-container)
51 changes: 0 additions & 51 deletions airflow_dags/bashscript/bash_script_example.py

This file was deleted.

3 changes: 0 additions & 3 deletions airflow_dags/bashscript/bash_script_example.sh

This file was deleted.

13 changes: 0 additions & 13 deletions airflow_dags/docker_with_code/Dockerfile

This file was deleted.

26 changes: 0 additions & 26 deletions airflow_dags/docker_with_code/docker_include_python.py

This file was deleted.

17 changes: 0 additions & 17 deletions airflow_dags/docker_with_code/python_script.py

This file was deleted.

10 changes: 0 additions & 10 deletions airflow_dags/dockerpyscript/Dockerfile

This file was deleted.

68 changes: 0 additions & 68 deletions airflow_dags/dockerpyscript/README.md

This file was deleted.

47 changes: 0 additions & 47 deletions airflow_dags/dockerpyscript/docker_operation_example.py

This file was deleted.

34 changes: 0 additions & 34 deletions airflow_dags/dockerpyscript/python_docker_example.py

This file was deleted.
