[B / Docker] Logging for api (#517)
* merge multiple experiments

commit 8652a67
Author: Judith <[email protected]>
Date:   Mon Apr 11 18:17:20 2022 +0200

    missed `handle_requests` on last commit

commit a354f89
Author: Judith <[email protected]>
Date:   Mon Apr 11 18:16:15 2022 +0200

    implement #356

commit fd54365
Author: Judith <[email protected]>
Date:   Mon Apr 11 17:56:35 2022 +0200

    docstrings

commit 9f43c9d
Author: Judith <[email protected]>
Date:   Mon Apr 11 15:07:47 2022 +0200

    fix docker test

commit b3e6e80
Merge: 6f64f9e 091eee6
Author: felix-20 <[email protected]>
Date:   Mon Apr 11 14:31:39 2022 +0200

    Merge branch 'development' into 379-multiple-experiments

commit 6f64f9e
Author: Judith <[email protected]>
Date:   Mon Apr 11 14:20:47 2022 +0200

    clean up docker port

commit 8c35e8a
Author: Judith <[email protected]>
Date:   Mon Apr 11 11:33:30 2022 +0200

    merge development

    commit 091eee6
    Author: Nikkel Mollenhauer <[email protected]>
    Date:   Mon Apr 11 10:43:00 2022 +0200

        Tests for `config_validation.py` (#404)

        * Don't get default data (shouldn't be necessary)

        * Unpack default data

        * Refactored utils functions to return dicts instead of strings

        * Adapted to new mock format

        * Adapted to new mock format

        * Some first tests

        * More tests

        * Fixed testcase-names

        * Moved file-endings to initial function call

        * Fixed tests

        * More asserts

        * More tests

        * More tests

        * More tests

        * `validate_sub_keys`-tests

    commit f8bc162
    Author: jannikgro <[email protected]>
    Date:   Fri Apr 8 13:20:42 2022 +0200

        [D] Stable baselines integration (#384)

        * refactored reinforcement learning agent to accept marketplace

        * adapted test_exampleprinter.py to marketplace initialization

        * add market option to accept continuous actions

        * fixed action space check

        * initial stable baselines integration

        * Agent init by env (#390)

        * introduced self.network in actorcritic_agent

        * added network_architecture in QLearningAgent

        * changed actorcritic_agent to network_architecture

        * set back training_scenario

        * am_configuration initialize rl-agent via marketplace

        * added final analyse to stable baselines training

        * added more stable baselines algorithms

        * added ppo algorithm

        * introduced stable_baselines_folder

        * renamed training to callback

        * satisfied linter

        * fixed loading problem

        * try to make tqdm run in stable_baselines

        * make tqdm running

        * reduced monitoring episodes in sb training

        * save model only if significantly better

        * fixed too long test time bug

        * moved back to 250 episodes testing

        * set timeout to 15 minutes

        * added first batch of fixes to @NikkelM feedback

        * added type annotations and asserts in stable_baselines_model

        * added sbtraining to training_scenario

        * applied comments in am_configuration

        * solved .dat problem and fixed crashing asserts

        * reintroduced _end_of_training

        * removed deprecated if

        * Moved '.dat' to function call instead of appending within function

        * Fixed assert

        * fixed model file ending bug

        * Add short explanation docstring

        Co-authored-by: Johann Schulze Tast <[email protected]>

        * fixed wrong docstring

        * Fixed tests

        Co-authored-by: NikkelM <[email protected]>
        Co-authored-by: Johann Schulze Tast <[email protected]>

commit 0183876
Author: Judith <[email protected]>
Date:   Mon Apr 11 11:31:15 2022 +0200

    name from `names` for container

commit ad70b7c
Author: Judith <[email protected]>
Date:   Sun Apr 10 16:32:47 2022 +0200

    fix #380

commit 8e1314c
Author: Judith <[email protected]>
Date:   Sun Apr 10 16:05:28 2022 +0200

    multiple experiments are supported on the webserver

commit 53a4000
Author: Judith <[email protected]>
Date:   Fri Apr 8 14:36:44 2022 +0200

    support for starting multiple containers on docker side

commit 3c94f43
Author: Judith <[email protected]>
Date:   Fri Apr 8 11:46:07 2022 +0200

    ability to add `DockerInfo` for multiple containers

* first attempt for websocket

* websocket on docker side working

* just send all of it if things have changed

* started on webserver side

* webserver sends push notification to user about stopped container

* commit

* websocket to ssl

* fix

* try

* started on database manager

* more db

* fix?

* debug statements

* more types

* does table exist fix

* colorful terminal output

* commit

* debug

* extra health checker

* debug for force stop

* some logging

* fix tests

* telegram notifications

* better logging

* better logging

* fix test?

* different logging level

* small fixes

* started on system monitoring

* system monitoring

* csv possibility for system data

* two buttons in webserver

* more fail-proof

* fix configuration form

* silent_starter

* logging statements for silent starter

* debug

* debug

* more debugging

* fix error

* more debugging

* more local monitoring

* adjusted logging in silent starter

* gpu

* os.system

* subprocess

* Squashed commit of the following:

commit af12e13
Merge: bc9d3a3 ab704ca
Author: Nikkel Mollenhauer <[email protected]>
Date:   Tue Jun 7 18:20:56 2022 +0200

    Merge branch '484-configuration-remove-combined-config' of https://github.com/hpi-epic/BP2021 into 484-configuration-remove-combined-config

commit bc9d3a3
Author: Nikkel Mollenhauer <[email protected]>
Date:   Tue Jun 7 18:20:50 2022 +0200

    Removed dead methods

commit ab704ca
Author: felix-20 <[email protected]>
Date:   Tue Jun 7 16:55:34 2022 +0200

    improved prefill (#496)

    * improved prefill with consideration of the current formdata

    * remove debug statements

    * remove print debug

    * Added lxml dependency

    * Removed debug comment

    Co-authored-by: Nikkel Mollenhauer <[email protected]>

commit 8839f90
Author: Judith <[email protected]>
Date:   Tue Jun 7 12:01:43 2022 +0200

    remove `config_is_final`

commit 53d79f9
Author: NikkelM <[email protected]>
Date:   Tue Jun 7 10:41:51 2022 +0200

    Review feedback by @SinNeax

commit 99ab681
Author: NikkelM <[email protected]>
Date:   Sat Jun 4 15:02:34 2022 +0200

    Fixed config validation

commit 6a09267
Merge: 179e0da e623413
Author: Judith <[email protected]>
Date:   Fri Jun 3 14:50:15 2022 +0200

    Merge branch '484-configuration-remove-combined-config' of https://github.com/hpi-epic/BP2021 into 484-configuration-remove-combined-config

commit 179e0da
Author: Judith <[email protected]>
Date:   Fri Jun 3 14:49:49 2022 +0200

    fix javascript error

commit e623413
Author: NikkelM <[email protected]>
Date:   Fri Jun 3 14:38:20 2022 +0200

    Added `config_type` field to default modelfiles

commit bd15108
Merge: dac92d3 507cbc1
Author: NikkelM <[email protected]>
Date:   Fri Jun 3 09:45:11 2022 +0200

    Merge branch '484-configuration-remove-combined-config' of https://github.com/hpi-epic/BP2021 into 484-configuration-remove-combined-config

commit dac92d3
Author: NikkelM <[email protected]>
Date:   Fri Jun 3 09:44:49 2022 +0200

    Fixed tests

commit 507cbc1
Author: Judith <[email protected]>
Date:   Fri Jun 3 07:52:30 2022 +0200

    dynamic table for sim_market possible

commit bce03bd
Author: Judith <[email protected]>
Date:   Thu Jun 2 20:15:17 2022 +0200

    fix webserver tests

commit 57dbb33
Author: Nikkel Mollenhauer <[email protected]>
Date:   Thu Jun 2 14:29:40 2022 +0200

    Webserver format

commit b3780cd
Author: Nikkel Mollenhauer <[email protected]>
Date:   Thu Jun 2 14:15:01 2022 +0200

    Fixed validation

commit c9657a9
Author: Nikkel Mollenhauer <[email protected]>
Date:   Thu Jun 2 13:50:20 2022 +0200

    Added back needed functionality of validating "complete" configs

commit dca24f3
Author: Nikkel Mollenhauer <[email protected]>
Date:   Thu Jun 2 13:34:21 2022 +0200

    Added some tests

commit fff9b47
Author: Nikkel Mollenhauer <[email protected]>
Date:   Thu Jun 2 13:12:41 2022 +0200

    Renamed `market` to `sim_market`

commit caf5c01
Merge: b8c43b7 f50fe65
Author: Nikkel Mollenhauer <[email protected]>
Date:   Thu Jun 2 11:30:18 2022 +0200

    Merge branch '484-configuration-remove-combined-config' of https://github.com/hpi-epic/BP2021 into 484-configuration-remove-combined-config

commit b8c43b7
Author: Nikkel Mollenhauer <[email protected]>
Date:   Thu Jun 2 11:29:34 2022 +0200

    New config format

commit f50fe65
Author: Judith <[email protected]>
Date:   Thu Jun 2 11:21:10 2022 +0200

    implement new config validation for webserver

commit 135e8f3
Author: Judith <[email protected]>
Date:   Wed Jun 1 20:52:12 2022 +0200

    fix webserver tests

commit d326245
Author: Judith <[email protected]>
Date:   Wed Jun 1 17:26:50 2022 +0200

    fix prefill

commit 70492c8
Author: Nikkel Mollenhauer <[email protected]>
Date:   Wed Jun 1 17:08:34 2022 +0200

    Fixed remaining tests(?)

commit 59136b1
Author: Nikkel Mollenhauer <[email protected]>
Date:   Wed Jun 1 16:35:00 2022 +0200

    Fixed most tests

commit c912309
Author: Nikkel Mollenhauer <[email protected]>
Date:   Wed Jun 1 16:11:20 2022 +0200

    Removed references to `class`-field

commit 2ccee78
Author: Nikkel Mollenhauer <[email protected]>
Date:   Wed Jun 1 15:12:56 2022 +0200

    Removed `class` keywords from config files

commit c51dab7
Author: Nikkel Mollenhauer <[email protected]>
Date:   Wed Jun 1 15:09:46 2022 +0200

    renamed rl_config to q_learning_config

commit f9aa012
Author: NikkelM <[email protected]>
Date:   Tue May 31 19:49:14 2022 +0200

    Fixed Agent_monitoring with stable_baselines agents

commit 16df8c4
Merge: 7610c6e 26cb54b
Author: Judith <[email protected]>
Date:   Tue May 31 13:50:24 2022 +0200

    Merge branch '484-configuration-remove-combined-config' of https://github.com/hpi-epic/BP2021 into 484-configuration-remove-combined-config

commit 7610c6e
Author: Judith <[email protected]>
Date:   Tue May 31 13:50:14 2022 +0200

    script for creating rl model

commit 26cb54b
Author: NikkelM <[email protected]>
Date:   Tue May 31 11:21:46 2022 +0200

    Simplified and extended key validation

commit 0fa7eb5
Author: Judith <[email protected]>
Date:   Mon May 30 14:17:22 2022 +0200

    rl config works in view

commit da05474
Author: NikkelM <[email protected]>
Date:   Fri May 27 13:37:39 2022 +0200

    New feather website

commit dd40104
Author: NikkelM <[email protected]>
Date:   Fri May 27 12:56:28 2022 +0200

    Removed debug comment, fixed policyanalyzer

commit 1d025c9
Author: NikkelM <[email protected]>
Date:   Fri May 27 12:39:17 2022 +0200

    Reintroduced `main` for testing purposes

commit c90a8d6
Author: NikkelM <[email protected]>
Date:   Fri May 27 12:26:10 2022 +0200

    Fixed config validation for webserver

commit 4107aaf
Author: NikkelM <[email protected]>
Date:   Fri May 27 10:57:04 2022 +0200

    Added some debugging to docker

commit 2588f6d
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 22:22:39 2022 +0200

    preconditions for actor critic parameters and stable baselines parameter have been created

commit 4e174e6
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:42:23 2022 +0200

    restored changes

commit 14f181d
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:37:23 2022 +0200

    restored last changes

commit 3d3ac9d
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:34:11 2022 +0200

    more information

commit 31a212f
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:31:31 2022 +0200

    more information

commit 1b139e0
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:28:02 2022 +0200

    test

commit 1e80e8c
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:25:16 2022 +0200

    next try

commit 09a433e
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:22:45 2022 +0200

    another printf try

commit de9f5eb
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:19:07 2022 +0200

    more printf debug

commit 511d38a
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:16:11 2022 +0200

    printf debug

commit 1d1a3c4
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:12:06 2022 +0200

    restored last changes

commit 61c0d8e
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:09:22 2022 +0200

    tried without reading check

commit 068abaa
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:03:54 2022 +0200

    tried with dirty fix

commit f1cdf45
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 19:02:03 2022 +0200

    debug print

commit 75a674e
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 18:56:08 2022 +0200

    restored docker_manager

commit 39bc238
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 18:51:31 2022 +0200

    tried with assert False

commit 6986513
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 18:21:59 2022 +0200

    small change in docker_manager

commit 561fc00
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 18:13:18 2022 +0200

    tried to fix config_validation

commit 3fded29
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 17:25:10 2022 +0200

    reintroduced test_hyperparameter_config_market

commit b6a4945
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 15:52:19 2022 +0200

    reintroduced test_hyperparameter_config_rl

commit 7cef3e7
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 15:19:29 2022 +0200

    added rules and verifications

commit 473ddf7
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 12:43:12 2022 +0200

    introduced JSONConfigurable class and demanded class entry in config

commit 36024c8
Author: Jan Niklas Groeneveld <[email protected]>
Date:   Thu May 26 11:50:48 2022 +0200

    solved #484 without check

* automatically write template files

* try fix pre-commit

* try fix pre-commit

* try except

* more debug

* more debug

* debugging :)

* hotfix for qlearning file

* adapting to the given agent works again

* restore docker manager and app.py

* try restore again

* webserver tests are running again

merge from dev was very very strange

* new rl_config

* precommit?

* precommit!

* assert print

Co-Authored-By: Nikkel Mollenhauer <[email protected]>

* remove print

Co-Authored-By: Nikkel Mollenhauer <[email protected]>

* new regex for matching ce agents

Co-Authored-By: Nikkel Mollenhauer <[email protected]>

* fixed fixed price agent

Co-Authored-By: Nikkel Mollenhauer <[email protected]>

* comment out assert

Co-Authored-By: Nikkel Mollenhauer <[email protected]>
Co-Authored-By: jannikgro <[email protected]>

* delete some files

* bcolors back

* readme websocket

* debug websocket

* fixed websocket

* small websocket fix

* logging for websocket

* more logging?

* logging

* logging relevant content

* reset?

* reset webserver

* relevant gitignore stuff

* some resets in docker manager

* init files again?

* separate_markets again

* remove unnecessary code

* fixed docker tests, introduce Mocked logger

* logger into DockerManager

* Fixed typo bug

* log to file when executing `docker_manager`

Co-authored-by: Nikkel Mollenhauer <[email protected]>
Co-authored-by: jannikgro <[email protected]>
Co-authored-by: Johann Schulze Tast <[email protected]>
4 people authored Jul 20, 2022
1 parent d20d2dc commit a5402ae
Showing 9 changed files with 240 additions and 184 deletions.
8 changes: 3 additions & 5 deletions .gitignore
@@ -8,12 +8,10 @@ data
# results from the tests
tests/test_results

# all webserver related directories
webserver/configurations
webserver/data
webserver/package-lock.json
webserver/node_modules
# all webserver and docker related directories
webserver/.env.txt
docker_api/.env.txt
docker_api/log_files/*

# file where the users data path is saved
recommerce/configuration/user_path.txt
16 changes: 10 additions & 6 deletions docker/app.py
@@ -1,5 +1,5 @@
# app.py
import hashlib
import logging
import os
import time

@@ -21,8 +21,12 @@
# If using a remote machine use
# uvicorn --host 0.0.0.0 app:app --reload
# instead to expose it to the local network
manager = DockerManager()
path_to_log_files = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'log_files')
if not os.path.isdir(path_to_log_files):
os.makedirs(path_to_log_files)

logger = logging.getLogger('uvicorn.error')
manager = DockerManager(logger)
app = FastAPI()


@@ -47,17 +51,15 @@ def verify_token(request: Request) -> bool:
"""
verifies for a given request that the header contains the right AUTHORIZATION_TOKEN.
Warning: This cannot be considered 100% secure, without https, any network sniffer can read the token
Args:
request (Request): The request to the API
Returns:
bool: if the given authorization token matches our authorization token.
"""
try:
token = request.headers['Authorization']
except KeyError:
print('The request did not set an Authorization header')
logger.error('The request did not set an Authorization header')
return False
master_secret_as_int = sum(ord(c) for c in os.environ['AUTHORIZATION_TOKEN'])
current_time = int(time.time() / 3600) # unix time in hours
@@ -84,6 +86,7 @@ async def start_container(num_experiments: int, config: Request, authorized: boo
if not authorized:
return JSONResponse(status_code=401, content=vars(DockerInfo('', 'Not authorized')))
all_container_infos = manager.start(config=await config.json(), count=num_experiments)

# check if all prerequisites were met
if type(all_container_infos) == DockerInfo:
return JSONResponse(status_code=404, content=vars(all_container_infos))
@@ -93,7 +96,7 @@ async def start_container(num_experiments: int, config: Request, authorized: boo
if (is_invalid_status(all_container_infos[index].status) or all_container_infos[index].data is False):
return JSONResponse(status_code=404, content=vars(all_container_infos[index]))
return_dict[index] = vars(all_container_infos[index])
print(f'successfully started {num_experiments} container')
logger.info(f'successfully started {num_experiments} container')
return JSONResponse(return_dict, status_code=200)


@@ -283,5 +286,6 @@ async def check_if_api_is_available(authorized: bool = Depends(verify_token)) ->
uvicorn.run('app:app',
host='0.0.0.0',
port=8000,
log_config='./log_api.ini',
ssl_keyfile='/etc/sslzertifikat/api_cert.key',
ssl_certfile='/etc/sslzertifikat/api_cert.crt')
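The change to `app.py` replaces `print` calls with a named logger that is handed to the manager at construction time. A minimal sketch of that pattern, using only the standard library: `Manager`, `build_logger` and the logger name `sketch.api` are hypothetical stand-ins for illustration and are not taken from the project.

```python
import io
import logging


class Manager:
    """Hypothetical stand-in for a component that receives its logger from the caller."""

    def __init__(self, logger: logging.Logger) -> None:
        self._logger = logger

    def start(self, count: int) -> int:
        # Log instead of printing, so the caller decides where messages end up.
        self._logger.info('successfully started %d container(s)', count)
        return count


def build_logger(name: str, stream: io.StringIO) -> logging.Logger:
    """Create a logger that writes INFO and above to the given stream."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the sketch independent of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter('%(levelname)s %(message)s'))
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
manager = Manager(build_logger('sketch.api', buffer))
manager.start(2)
captured = buffer.getvalue()
```

Injecting the logger keeps the component free of handler setup, which is presumably why the same class can log through the server's logger in one entry point and through a file logger in another.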
81 changes: 51 additions & 30 deletions docker/docker_manager.py
@@ -59,21 +59,24 @@ class DockerManager():
_allowed_commands = ['training', 'exampleprinter', 'agent_monitoring']
# dictionary of container_id:host-port pairs
_port_mapping = {}
_logger = None

def __new__(cls):
def __new__(cls, logger):
"""
This function makes sure that the `DockerManager` is a singleton.
Returns:
DockerManager: The DockerManager instance.
"""
cls._logger = logger
if cls._instance is None:
print('A new instance of DockerManager is being initialized')
cls._logger.info('A new instance of DockerManager is being initialized')
cls._instance = super(DockerManager, cls).__new__(cls)
cls._client = cls._get_client()

if cls._client is not None:
cls._update_port_mapping()

return cls._instance

def start(self, config: dict, count: int) -> DockerInfo or list:
@@ -96,7 +99,7 @@ def start(self, config: dict, count: int) -> DockerInfo or list:
command_id = config['environment']['task']

if command_id not in self._allowed_commands:
print(f'Command with ID {command_id} not allowed')
self._logger.warning(f'Command with ID {command_id} not allowed')
return DockerInfo(id='No container was started', status=f'Command not allowed: {command_id}')

if not self._confirm_image_exists():
@@ -105,7 +108,6 @@ def start(self, config: dict, count: int) -> DockerInfo or list:
all_container_infos = []
for _ in range(count):
# start a container for the image of the requested command
print('cuda?', is_available())
container_info: DockerInfo = self._create_container(command_id, config, use_gpu=is_available())
if 'Image not found' in container_info.status or container_info.data is False:
# something is wrong with our container
@@ -125,7 +127,7 @@ def health(self, container_id: str) -> DockerInfo:
Returns:
DockerInfo: A JSON serializable object containing the id and the status of the new container.
"""
print(f'Checking health status for: {container_id}')
self._logger.info(f'Checking health status for: {container_id}')
container: Container = self._get_container(container_id)
if not container:
return DockerInfo(container_id, status='Container not found')
@@ -206,7 +208,7 @@ def start_tensorboard(self, container_id: str) -> DockerInfo:
if container.status != 'running':
return DockerInfo(container_id, status='Container is not running. Download the data and start a tensorboard locally.')

print(f'Starting tensorboard for: {container_id}')
self._logger.info(f'Starting tensorboard for: {container_id}')
container.exec_run(cmd='tensorboard serve --host 0.0.0.0 --logdir ./results/runs', detach=True)
port = self._port_mapping[container.id]
return DockerInfo(container_id, status=container.status, data=str(port))
@@ -229,7 +231,7 @@ def get_container_logs(self, container_id: str, timestamps: bool, stream: bool,
if not container:
return DockerInfo(container_id, status='Container not found')

print(f'Getting logs for {container_id}...')
self._logger.info(f'Getting logs for {container_id}...')

logs = container.logs(stream=stream, timestamps=timestamps, tail=tail,
stderr=docker.APIClient().inspect_container(container.id)['State']['ExitCode'] != 0)
@@ -276,10 +278,10 @@ def remove_container(self, container_id: str) -> DockerInfo:

container_info = self._stop_container(container_id)
if container_info.status != 'exited':
print(f'Container not stopped successfully. Status: {container_info.status}')
self._logger.warning(f'Container not stopped successfully. Status: {container_info.status}')
return DockerInfo(id=container_id, status=f'Container not stopped successfully. Status: {container_info.status}')

print(f'Removing container: {container_id}')
self._logger.info(f'Removing container: {container_id}')
try:
exit_code = container.wait()['StatusCode']
container.remove()
@@ -297,12 +299,12 @@ def ping(self) -> bool:
Returns:
bool: If the server is running or not.
"""
print('Pinging docker server...')
self._logger.info('Pinging docker server...')
try:
return self._get_client().ping()
except Exception:
print('Docker server is not responding!')
print(f'Client is: {self._get_client()}')
self._logger.warning('Docker server is not responding!')
self._logger.info(f'Client is: {self._get_client()}')
return False

# PRIVATE METHODS
@@ -344,19 +346,19 @@ def _confirm_image_exists(self, update: bool = False) -> str:
tagged_images = [image.tags[0].rsplit(':')[0] for image in all_images if len(image.tags)]

if len(all_images) != len(tagged_images):
print('You have untagged images and may want to remove them:')
self._logger.info('You have untagged images and may want to remove them:')
for image in all_images:
if len(image.tags) == 0:
print(image.id)
self._logger.info(image.id)

if update:
print(f'{IMAGE_NAME} image will be created/updated.')
self._logger.info(f'{IMAGE_NAME} image will be created/updated.')
return self._build_image()

if IMAGE_NAME not in tagged_images:
print(f'{IMAGE_NAME} image does not exist and will be created')
self._logger.info(f'{IMAGE_NAME} image does not exist and will be created')
return self._build_image()
print(f'{IMAGE_NAME} image already exists')
self._logger.info(f'{IMAGE_NAME} image already exists')
return self._get_client().images.get(IMAGE_NAME).id[7:]

def _build_image(self) -> str:
@@ -367,7 +369,7 @@ def _build_image(self) -> str:
str: The id of the image or None if the build failed.
"""
# https://docker-py.readthedocs.io/en/stable/images.html
print(f'Building {IMAGE_NAME} image')
self._logger.info(f'Building {IMAGE_NAME} image')

# Find out if an image with the name already exists to remove it afterwards
try:
@@ -381,13 +383,13 @@ def _build_image(self) -> str:
for output in logs:
if 'stream' in output:
output_str = output['stream'].strip('\r\n').strip('\n')
print(output_str)
self._logger.info(output_str)
img = self._get_client().images.get(IMAGE_NAME)
except (docker.errors.BuildError, docker.errors.APIError) as error:
print(f'An error occurred while building the {IMAGE_NAME} image\n{error}')
self._logger.error(f'An error occurred while building the {IMAGE_NAME} image\n{error}')
return None
if old_img is not None and old_img.id != img.id:
print(f'\nA {IMAGE_NAME} image already exists, it will be overwritten')
self._logger.warning(f'\nA {IMAGE_NAME} image already exists, it will be overwritten')
self._get_client().images.remove(old_img.id[7:])
# return id without the 'sha256:'-prefix
return img.id[7:]
@@ -405,7 +407,7 @@ def _create_container(self, command_id: str, config: dict, use_gpu: bool = True)
DockerInfo: A DockerInfo object with id and status set.
"""
# https://docker-py.readthedocs.io/en/stable/containers.html
print(f'Creating container for command: {command_id}')
self._logger.info(f'Creating container for command: {command_id}')

# first update the port mapping in case containers were added/removed without our knowledge
self._update_port_mapping()
@@ -438,7 +440,7 @@ def _create_container(self, command_id: str, config: dict, use_gpu: bool = True)

upload_info = self._upload_config(container.id, command_id, config)
if not upload_info.data:
print('Failed to upload configuration file!')
self._logger.warning('Failed to upload configuration file!')
return upload_info

def _start_container(self, container_id: str) -> DockerInfo:
@@ -456,10 +458,10 @@ def _start_container(self, container_id: str) -> DockerInfo:
return DockerInfo(id=container_id, status='Container not found.')

if container.status == 'running':
print(f'Container is already running: {container_id}')
self._logger.info(f'Container is already running: {container_id}')
return DockerInfo(id=container_id, status='running')

print(f'Starting container: {container_id}')
self._logger.info(f'Starting container: {container_id}')
try:
container.start()
# Reload the attributes to get the correct status
@@ -484,6 +486,13 @@ def _get_container(self, container_id: str) -> Container:
except docker.errors.NotFound:
return None

def _get_container_exit_code(self, container: Container) -> str:
try:
exit_code = container.wait()['StatusCode']
except docker.errors.APIError as error:
exit_code = f'could not get, {error}'
return exit_code

def _stop_container(self, container_id: str) -> DockerInfo:
"""
Stop a running container.
@@ -500,7 +509,7 @@ def _stop_container(self, container_id: str) -> DockerInfo:
if not container:
return DockerInfo(container_id, status='Container not found.')

print(f'Stopping container: {container_id}')
self._logger.info(f'Stopping container: {container_id}')
try:
container.stop(timeout=10)
# Reload the attributes to get the correct status
@@ -525,7 +534,7 @@ def _upload_config(self, container_id: str, command_id: str, config_dict: dict)
if not container:
return DockerInfo(id=container_id, status='Container not found.')

print('Copying config files into container...')
self._logger.info('Copying config files into container...')
# create a directory to store the files safely
os.makedirs('config_tmp', exist_ok=True)
os.chdir('config_tmp')
@@ -557,7 +566,7 @@ def _upload_config(self, container_id: str, command_id: str, config_dict: dict)
if upload_ok:
os.chdir('..')
shutil.rmtree('config_tmp')
print('Copying config files complete.')
self._logger.info('Copying config files complete')
return DockerInfo(id=container_id, status=container.status, data=upload_ok)

@classmethod
@@ -572,10 +581,22 @@ def _update_port_mapping(cls):
running_recommerce_containers = list(cls._get_client().containers.list(filters={'label': IMAGE_NAME}))
# Get the port mapped to '6006/tcp' within the container
occupied_ports = [int(container.ports['6006/tcp'][0]['HostPort']) for container in running_recommerce_containers]
# Create a dictionary of container_id: mapped port
cls._port_mapping = dict(zip([container.id for container in running_recommerce_containers], occupied_ports))
# Create a dictionary of container_id: mapped port


if __name__ == '__main__': # pragma: no cover
manager = DockerManager()
import logging
path_to_log_files = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'log_files')
if not os.path.isdir(path_to_log_files):
os.makedirs(path_to_log_files)

logging.basicConfig(filename=os.path.join(path_to_log_files, 'docker_manager.log'),
filemode='a',
format='%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s',
datefmt='%H:%M:%S',
level=logging.DEBUG)

docker_manager_logger = logging.getLogger('docker-manager')
manager = DockerManager(docker_manager_logger)
print(manager._confirm_image_exists(update=True), '\n')
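The `docker_manager.py` diff turns the singleton's `__new__` into one that accepts a logger, and the commit message mentions a mocked logger for the tests. A rough sketch of how such a singleton can be exercised with a mock follows; `SingletonManager` and its `ping` method are simplified, hypothetical stand-ins, and the project's actual mock is not shown on this page.

```python
import logging
from unittest.mock import MagicMock


class SingletonManager:
    """Hypothetical singleton that stores the injected logger on the class."""
    _instance = None
    _logger = None

    def __new__(cls, logger):
        # The logger is replaced on every call; the instance is created only once.
        cls._logger = logger
        if cls._instance is None:
            cls._logger.info('A new instance is being initialized')
            cls._instance = super().__new__(cls)
        return cls._instance

    def ping(self) -> bool:
        self._logger.info('Pinging...')
        return True


# A mock restricted to the logging.Logger interface records calls instead of writing output.
mock_logger = MagicMock(spec=logging.Logger)
first = SingletonManager(mock_logger)
second = SingletonManager(mock_logger)
first.ping()
```

Because the logger is assigned before the instance check, constructing the manager again with a different logger swaps the logger for every holder of the shared instance. That is convenient in tests but worth keeping in mind elsewhere.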
21 changes: 21 additions & 0 deletions docker/log_api.ini
@@ -0,0 +1,21 @@
[loggers]
keys=root

[handlers]
keys=logfile

[formatters]
keys=logfileformatter

[logger_root]
level=INFO
handlers=logfile

[formatter_logfileformatter]
format=[%(asctime)s.%(msecs)03d] %(levelname)s [%(thread)d] - %(message)s

[handler_logfile]
class=handlers.RotatingFileHandler
level=DEBUG
args=('log_files/api.log','a')
formatter=logfileformatter
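An INI file of this shape can be tried outside uvicorn with `logging.config.fileConfig`. The sketch below writes a comparable config to a temporary directory and logs one line through it. The file names, the shortened format string and the extra `maxBytes`/`backupCount` arguments are assumptions added for illustration. According to the standard library documentation, `RotatingFileHandler` never rolls over while `maxBytes` is zero, so a handler created with only a file name and mode behaves like a plain appending file handler.

```python
import logging
import logging.config
import os
import tempfile

# Write a small INI logging config into a temporary directory (hypothetical paths).
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, 'api.log')
config_path = os.path.join(log_dir, 'log_sketch.ini')

config_lines = [
    '[loggers]',
    'keys=root',
    '',
    '[handlers]',
    'keys=logfile',
    '',
    '[formatters]',
    'keys=logfileformatter',
    '',
    '[logger_root]',
    'level=INFO',
    'handlers=logfile',
    '',
    '[formatter_logfileformatter]',
    'format=%(levelname)s - %(message)s',
    '',
    '[handler_logfile]',
    'class=handlers.RotatingFileHandler',
    'level=DEBUG',
    # filename, mode, maxBytes, backupCount: rotate at about 1 MiB, keep 3 old files
    f"args=({log_path!r}, 'a', 1048576, 3)",
    'formatter=logfileformatter',
]
with open(config_path, 'w') as config_file:
    config_file.write('\n'.join(config_lines) + '\n')

logging.config.fileConfig(config_path, disable_existing_loggers=False)
logging.getLogger('sketch').info('hello')
logging.shutdown()  # flush and close the file handler

with open(log_path) as log_file:
    log_text = log_file.read()
```

The handler level (`DEBUG`) is lower than the root logger level (`INFO`), so the root level is the effective filter here: debug messages are dropped before they reach the handler.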