Open ai gym wrapper + bump engine version (#52)
* wip

* Update gym.py

* .

* wip

* wip

* Update Dockerfile.gym.dev

* wip

* Update gym.py

* fwd model close

* Update dev_gym.py

* Update forward_model.py

* wip

* Update agent.py

* Update gym.py

* Update gym.py

* Update gym.py

* .

* .

* Update gym.py

* wip

* wip

* Update gym.py

* wip

* wip

* Update dev_gym.py

* wip

* wip

* wip

* wip

* Update gym.py

* wip

* Update gym.py

* Update gym.py

* Update gym.py

* .

* Bump websockets version

* Update gym.py

* Update gym.py

* wip

* wip

* gym

* close

* Update README.md

* Update README.md

* Update README.md
thegalah authored Dec 6, 2021
1 parent dfeb54b commit be04d86
Showing 12 changed files with 193 additions and 28 deletions.
28 changes: 15 additions & 13 deletions README.md
@@ -23,14 +23,15 @@ docker-compose up --abort-on-container-exit --force-recreate

 # Starter kits

-| Kit | Link | Description | Up-to-date? | Contributed by |
-| -------------- | ------------------------------------------------------------------------- | -------------------------------------------------- | ----------- | --------------------------------------- |
-| Python3 | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/python3) | Basic Python3 starter || Coder One |
-| Python3-fwd | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/python3) | Includes example for using forward model simulator || Coder One |
-| TypeScript | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/typescript) | Basic TypeScript starter || Coder One |
-| TypeScript-fwd | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/typescript) | Includes example for using forward model simulator || Coder One |
-| Go | [Link](https://github.com/CoderOneHQ/bomberland/tree/master/go) | Basic Go starter || [dtitov](https://github.com/dtitov) |
-| C++ | [Link](https://github.com/CoderOneHQ/bomberland/tree/master/C%2B%2B) | Basic C++ starter || [jfbogusz](https://github.com/jfbogusz) |
+| Kit | Link | Description | Up-to-date? | Contributed by |
+| ------------------- | ------------------------------------------------------------------------- | -------------------------------------------------- | ----------- | --------------------------------------- |
+| Python3 | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/python3) | Basic Python3 starter || Coder One |
+| Python3-fwd | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/python3) | Includes example for using forward model simulator || Coder One |
+| Python3-gym-wrapper | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/python3) | Open AI Gym wrapper || Coder One |
+| TypeScript | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/typescript) | Basic TypeScript starter || Coder One |
+| TypeScript-fwd | [Link](https://github.com/CoderOneHQ/starter-kits/tree/master/typescript) | Includes example for using forward model simulator || Coder One |
+| Go | [Link](https://github.com/CoderOneHQ/bomberland/tree/master/go) | Basic Go starter || [dtitov](https://github.com/dtitov) |
+| C++ | [Link](https://github.com/CoderOneHQ/bomberland/tree/master/C%2B%2B) | Basic C++ starter || [jfbogusz](https://github.com/jfbogusz) |

 # Contributing

@@ -42,11 +43,12 @@ For any help, please contact us directly on [Discord](https://discord.gg/NkfgvRN

 # Release Notes

-| Ver. | Changes | Date | Binary |
-| ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------- | ----------- |
-| 1523 | Forward model bug fixes + unit move blocking on moving to same cell + reset game with a set world and prng seed (See: [Docs](https://www.gocoder.one/docs/api-reference#reset-game)) | 29 Nov 2021 | [Link](https://github.com/CoderOneHQ/bomberland/releases/tag/build-1523) |
-| 1065 | Added `UNITS_PER_AGENT` environment flag (See: [Docs](https://gocoder.one/docs/api-reference#%EF%B8%8F-environment-flags)) | 9 Oct 2021 | - |
-| 974 | Added functionality: <ul><li>Reset the game without restarting engine/containers</li><li>Evaluate next state by the game engine given a state + list of actions</li></ul> See: [Docs](https://gocoder.one/docs/api-reference#-administrator-api) | 18 Sep 2021 | [Link](https://github.com/CoderOneHQ/bomberland/releases/tag/build-974) |
+| Ver. | Changes | Date | Binary |
+| ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | ------------- | ------------------------------------------------------------------------ |
+| 1555 | Changes to support open ai gym wrapper | 6th Dec 2021 | [Link](https://github.com/CoderOneHQ/bomberland/releases/tag/build-1523) |
+| 1523 | Forward model bug fixes + unit move blocking on moving to same cell + reset game with a set world and prng seed (See: [Docs](https://www.gocoder.one/docs/api-reference#reset-game)) | 29th Nov 2021 | [Link](https://github.com/CoderOneHQ/bomberland/releases/tag/build-1523) |
+| 1065 | Added `UNITS_PER_AGENT` environment flag (See: [Docs](https://gocoder.one/docs/api-reference#%EF%B8%8F-environment-flags)) | 9th Oct 2021 | - |
+| 974 | Added functionality: <ul><li>Reset the game without restarting engine/containers</li><li>Evaluate next state by the game engine given a state + list of actions</li></ul> See: [Docs](https://gocoder.one/docs/api-reference#-administrator-api) | 18th Sep 2021 | [Link](https://github.com/CoderOneHQ/bomberland/releases/tag/build-974) |

 # Discussion and Questions

8 changes: 7 additions & 1 deletion base-compose.yml
@@ -1,7 +1,7 @@
 version: "3"
 services:
   game-server:
-    image: coderone.azurecr.io/game-server:1523
+    image: coderone.azurecr.io/game-server:1555
     volumes:
       - ./logs:/app/logs

@@ -33,6 +33,12 @@ services:
       dockerfile: Dockerfile.fwd.dev
     volumes:
       - ./python3:/app
+  python3-gym-dev:
+    build:
+      context: python3
+      dockerfile: Dockerfile.gym.dev
+    volumes:
+      - ./python3:/app

   typescript-agent:
     build:
26 changes: 26 additions & 0 deletions open-ai-gym-wrapper-compose.yml
@@ -0,0 +1,26 @@
version: "3"
services:
  gym:
    extends:
      file: base-compose.yml
      service: python3-gym-dev
    environment:
      - FWD_MODEL_CONNECTION_STRING=ws://fwd-server:6969/?role=admin
    depends_on:
      - fwd-server
    networks:
      - coderone-open-ai-gym-wrapper

  fwd-server:
    extends:
      file: base-compose.yml
      service: game-server
    environment:
      - TELEMETRY_ENABLED=0
      - PORT=6969
      - WORLD_SEED=1234
      - PRNG_SEED=1234
    networks:
      - coderone-open-ai-gym-wrapper
networks:
  coderone-open-ai-gym-wrapper: null
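
The `gym` service reaches the forward model through `FWD_MODEL_CONNECTION_STRING`: the host `fwd-server` is the compose service name, the port matches `PORT=6969`, and `role=admin` is passed as a query parameter. A small sketch of how those pieces can be read back out of the URI for debugging; `describe_connection` is a hypothetical helper, not part of the starter kit.

```python
from typing import Dict, Optional, Union
from urllib.parse import parse_qs, urlsplit


def describe_connection(uri: str) -> Dict[str, Optional[Union[str, int]]]:
    """Hypothetical helper: split a forward-model URI into its parts."""
    parts = urlsplit(uri)
    # parse_qs returns a list per key; take the first "role" value if present.
    role = parse_qs(parts.query).get("role", [None])[0]
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "role": role,
    }
```

Calling it with the value from the compose file should report the `ws` scheme, the `fwd-server` host, port 6969 and the `admin` role.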
6 changes: 6 additions & 0 deletions python3/Dockerfile.gym.dev
@@ -0,0 +1,6 @@
FROM python:3.8-bullseye

COPY ./requirements.txt /app/requirements.txt
WORKDIR /app
RUN python -m pip install -r requirements.txt
ENTRYPOINT PYTHONUNBUFFERED=1 python dev_gym.py
7 changes: 7 additions & 0 deletions python3/README.md
@@ -0,0 +1,7 @@
# Overview

`agent.py` - random agent

`agent_fwd.py` - random agent that connects to forward model

`dev_gym.py` - [open ai gym wrapper](https://gym.openai.com/)
8 changes: 6 additions & 2 deletions python3/agent.py
@@ -1,3 +1,4 @@
+from typing import Union
 from game_state import GameState
 import asyncio
 import random
@@ -8,11 +9,12 @@

 actions = ["up", "down", "left", "right", "bomb", "detonate"]

+
 class Agent():
     def __init__(self):
         self._client = GameState(uri)

-        ### any initialization code can go here
+        # any initialization code can go here
         self._client.set_game_tick_callback(self._on_game_tick)

         loop = asyncio.get_event_loop()
@@ -23,7 +25,7 @@ def __init__(self):
         loop.run_until_complete(asyncio.wait(tasks))

     # returns coordinates of the first bomb placed by a unit
-    def _get_bomb_to_detonate(self, unit) -> [int, int] or None:
+    def _get_bomb_to_detonate(self, unit) -> Union[int, int] or None:
         entities = self._client._state.get("entities")
         bombs = list(filter(lambda entity: entity.get(
             "unit_id") == unit and entity.get("type") == "b", entities))
@@ -56,8 +58,10 @@ async def _on_game_tick(self, tick_number, game_state):
             else:
                 print(f"Unhandled action: {action} for unit {unit_id}")

+
 def main():
     Agent()

+
 if __name__ == "__main__":
     main()
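
A note on the annotation change above: `Union[int, int]` collapses to plain `int`, and `int or None` evaluates to `int`, so the new hint does not describe "a coordinate pair or nothing". A minimal sketch of a more conventional spelling, assuming the method is meant to return `[x, y]` or `None`. The function name `first_bomb_coordinates` is hypothetical, and the return logic is an assumption because the rest of `_get_bomb_to_detonate` lies outside the hunk.

```python
from typing import Dict, List, Optional


def first_bomb_coordinates(entities: List[Dict], unit_id: str) -> Optional[List[int]]:
    """Return [x, y] of the first bomb owned by `unit_id`, or None if it has none."""
    for entity in entities:
        # Same filter as the diff: bombs have type "b" and carry their owner's unit_id.
        if entity.get("unit_id") == unit_id and entity.get("type") == "b":
            return [entity.get("x"), entity.get("y")]
    return None
```

`Optional[Tuple[int, int]]` would work equally well if the caller does not need a mutable list.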
5 changes: 2 additions & 3 deletions python3/agent_fwd.py
@@ -1,7 +1,7 @@
+from typing import Union
 from forward_model import ForwardModel
 from game_state import GameState
 import asyncio
-import copy
 import os
 import random

@@ -27,7 +27,6 @@ def connect(self):
         loop = asyncio.get_event_loop()

         client_connection = loop.run_until_complete(self._client.connect())
-        client_fwd_connection = None

         client_fwd_connection = loop.run_until_complete(
             self._client_fwd.connect())
@@ -38,7 +37,7 @@ def connect(self):
             self._client_fwd._handle_messages(client_fwd_connection))
         loop.run_forever()

-    def _get_bomb_to_detonate(self, game_state) -> [int, int] or None:
+    def _get_bomb_to_detonate(self, game_state) -> Union[int, int] or None:
         agent_number = game_state.get("connection").get("agent_number")
         entities = self._client._state.get("entities")
         bombs = list(filter(lambda entity: entity.get(
34 changes: 34 additions & 0 deletions python3/dev_gym.py
@@ -0,0 +1,34 @@
import asyncio
from typing import Dict
from gym import Gym
import os

fwd_model_uri = os.environ.get(
"FWD_MODEL_CONNECTION_STRING") or "ws://127.0.0.1:6969/?role=admin"

mock_6x6_state: Dict = {"agents": {"a": {"agent_id": "a", "unit_ids": ["c", "e", "g"]}, "b": {"agent_id": "b", "unit_ids": ["d", "f", "h"]}}, "unit_state": {"c": {"coordinates": [0, 1], "hp": 3, "inventory": {"bombs": 3}, "blast_diameter": 3, "unit_id": "c", "agent_id": "a", "invulnerability": 0}, "d": {"coordinates": [5, 1], "hp": 3, "inventory": {"bombs": 3}, "blast_diameter": 3, "unit_id": "d", "agent_id": "b", "invulnerability": 0}, "e": {"coordinates": [3, 3], "hp": 3, "inventory": {"bombs": 3}, "blast_diameter": 3, "unit_id": "e", "agent_id": "a", "invulnerability": 0}, "f": {"coordinates": [2, 3], "hp": 3, "inventory": {"bombs": 3}, "blast_diameter": 3, "unit_id": "f", "agent_id": "b", "invulnerability": 0}, "g": {"coordinates": [2, 4], "hp": 3, "inventory": {"bombs": 3}, "blast_diameter": 3, "unit_id": "g", "agent_id": "a", "invulnerability": 0}, "h": {"coordinates": [3, 4], "hp": 3, "inventory": {"bombs": 3}, "blast_diameter": 3, "unit_id": "h", "agent_id": "b", "invulnerability": 0}}, "entities": [
{"created": 0, "x": 0, "y": 3, "type": "m"}, {"created": 0, "x": 5, "y": 3, "type": "m"}, {"created": 0, "x": 4, "y": 3, "type": "m"}, {"created": 0, "x": 1, "y": 3, "type": "m"}, {"created": 0, "x": 3, "y": 5, "type": "m"}, {"created": 0, "x": 2, "y": 5, "type": "m"}, {"created": 0, "x": 5, "y": 4, "type": "m"}, {"created": 0, "x": 0, "y": 4, "type": "m"}, {"created": 0, "x": 1, "y": 1, "type": "w", "hp": 1}, {"created": 0, "x": 4, "y": 1, "type": "w", "hp": 1}, {"created": 0, "x": 3, "y": 0, "type": "w", "hp": 1}, {"created": 0, "x": 2, "y": 0, "type": "w", "hp": 1}, {"created": 0, "x": 5, "y": 5, "type": "w", "hp": 1}, {"created": 0, "x": 0, "y": 5, "type": "w", "hp": 1}, {"created": 0, "x": 4, "y": 0, "type": "w", "hp": 1}, {"created": 0, "x": 1, "y": 0, "type": "w", "hp": 1}, {"created": 0, "x": 5, "y": 0, "type": "w", "hp": 1}, {"created": 0, "x": 0, "y": 0, "type": "w", "hp": 1}], "world": {"width": 6, "height": 6}, "tick": 0, "config": {"tick_rate_hz": 10, "game_duration_ticks": 300, "fire_spawn_interval_ticks": 2}}


def calculate_reward(state: Dict):
    # custom reward function
    return 1


async def main():
    gym = Gym(fwd_model_uri)
    await gym.connect()
    env = gym.make("bomberland-open-ai-gym", mock_6x6_state)
    for i_ in range(1000):
        actions = []
        observation, done, info = await env.step(actions)
        reward = calculate_reward(observation)

        print(f"reward: {reward} done: {done} info: {info}")
        if done:
            await env.reset()
    await gym.close()


if __name__ == "__main__":
    asyncio.run(main())
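
The `calculate_reward` stub above always returns 1 and is meant to be replaced. A minimal sketch of one possible reward, assuming the observation has the same `unit_state` shape as `mock_6x6_state`; the name `hp_advantage_reward` and its `agent_id` parameter are hypothetical.

```python
from typing import Dict


def hp_advantage_reward(state: Dict, agent_id: str = "a") -> int:
    """Hypothetical reward: our total unit HP minus the opponent's total unit HP."""
    own_hp = 0
    opponent_hp = 0
    for unit in state.get("unit_state", {}).values():
        if unit.get("agent_id") == agent_id:
            own_hp += unit.get("hp", 0)
        else:
            opponent_hp += unit.get("hp", 0)
    return own_hp - opponent_hp
```

On the starting mock state both agents have three units with 3 HP each, so this reward starts at zero and only moves once units take damage.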
15 changes: 11 additions & 4 deletions python3/forward_model.py
@@ -1,19 +1,23 @@
-import asyncio
+from typing import Dict, List
 import websockets
 import json
-import copy


+
 class ForwardModel:
     def __init__(self, connection_string: str):
         self._connection_string = connection_string
         self._next_state_callback = None
         self.connection = None

+    async def close(self):
+        if self.connection is not None:
+            await self.connection.close()
+
     def set_next_state_callback(self, next_state_callback):
         self._next_state_callback = next_state_callback

     async def connect(self):
-        self.connection = await websockets.client.connect(self._connection_string)
+        self.connection = await websockets.connect(self._connection_string)
         if self.connection.open:
             return self.connection

@@ -36,6 +40,9 @@ async def _on_data(self, data):
         elif data_type == "next_game_state":
             payload = data.get("payload")
             await self._on_next_state(payload)
+        elif data_type == "game_state":
+            # no-op
+            return
         else:
             print(f"unknown packet \"{data_type}\": {data}")

@@ -60,7 +67,7 @@ async def _on_next_state(self, payload):
     next_state call since payloads can come back in any order
     It should ideally be unique
     """
-    async def send_next_state(self, sequence_id, game_state, actions):
+    async def send_next_state(self, sequence_id: int, game_state: Dict, actions: List[Dict]):
         game_state.pop("connection", None)
         packet = {"actions": actions,
                   "type": "evaluate_next_state", "state": game_state, "sequence_id": sequence_id}
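
`send_next_state` pops `connection` from the dict it is handed, so the caller's state is modified in place. A minimal sketch of a non-mutating way to build the same packet; `build_next_state_packet` is a hypothetical helper, and serialising with `json.dumps` is an assumption since the rest of the method is outside the hunk.

```python
import json
from typing import Dict, List


def build_next_state_packet(sequence_id: int, game_state: Dict, actions: List[Dict]) -> str:
    """Hypothetical helper: serialise an evaluate_next_state request."""
    # Shallow copy so removing "connection" does not touch the caller's dict.
    state = dict(game_state)
    state.pop("connection", None)
    packet = {"actions": actions, "type": "evaluate_next_state",
              "state": state, "sequence_id": sequence_id}
    return json.dumps(packet)
```

Because the reply is matched on `sequence_id`, callers that issue several requests at once would want a distinct value per in-flight request.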
12 changes: 8 additions & 4 deletions python3/game_state.py
@@ -1,7 +1,10 @@
 import asyncio
+from typing import Union
 import websockets
 import json

+from websockets.client import WebSocketClientProtocol
+
 _move_set = set(("up", "down", "left", "right"))


@@ -15,7 +18,7 @@ def set_game_tick_callback(self, generate_agent_action_callback):
         self._tick_callback = generate_agent_action_callback

     async def connect(self):
-        self.connection = await websockets.client.connect(self._connection_string)
+        self.connection = await websockets.connect(self._connection_string)
         if self.connection.open:
             return self.connection

@@ -32,10 +35,11 @@ async def send_bomb(self, unit_id: str):
         await self._send(packet)

     async def send_detonate(self, x, y, unit_id: str):
-        packet = {"type": "detonate", "coordinates": [x, y], "unit_id": unit_id}
+        packet = {"type": "detonate", "coordinates": [
+            x, y], "unit_id": unit_id}
         await self._send(packet)

-    async def _handle_messages(self, connection: str):
+    async def _handle_messages(self, connection: WebSocketClientProtocol):
         while True:
             try:
                 raw_data = await connection.recv()
@@ -138,7 +142,7 @@ def _on_unit_action(self, action_packet):
         else:
             print(f"Unhandled agent action recieved: {action_type}")

-    def _get_new_unit_coordinates(self, coordinates, move_action) -> [int, int]:
+    def _get_new_unit_coordinates(self, coordinates, move_action) -> Union[int, int]:
         [x, y] = coordinates
         if move_action == "up":
             return [x, y+1]
70 changes: 70 additions & 0 deletions python3/gym.py
@@ -0,0 +1,70 @@
import asyncio
import json
from typing import Callable, Dict, List

import websockets
from forward_model import ForwardModel


class GymEnv():
    def __init__(self, fwd_model: ForwardModel, channel: int, initial_state: Dict, send_next_state: Callable[[Dict, List[Dict], int], Dict]):
        self._state = initial_state
        self._initial_state = initial_state
        self._fwd = fwd_model
        self._channel = channel
        self._send = send_next_state

    async def reset(self):
        self._state = self._initial_state
        print("Resetting")

    async def step(self, actions):
        state = await self._send(self._state, actions, self._channel)
        self._state = state.get("next_state")
        return [state.get("next_state"), state.get("is_complete"), state.get("tick_result").get("events")]


class Gym():
    def __init__(self, fwd_model_uri: str):
        self._client_fwd = ForwardModel(fwd_model_uri)
        self._channel_counter = 0
        self._channel_is_busy_status: Dict[int, bool] = {}
        self._channel_buffer: Dict[int, Dict] = {}
        self._client_fwd.set_next_state_callback(self._on_next_game_state)
        self._environments: Dict[str, GymEnv] = {}

    async def connect(self):
        loop = asyncio.get_event_loop()

        client_fwd_connection = await self._client_fwd.connect()

        loop = asyncio.get_event_loop()
        loop.create_task(
            self._client_fwd._handle_messages(client_fwd_connection))

    async def close(self):
        await self._client_fwd.close()

    async def _on_next_game_state(self, state):
        channel = state.get("sequence_id")
        self._channel_is_busy_status[channel] = False
        self._channel_buffer[channel] = state

    def make(self, name: str, initial_state: Dict) -> GymEnv:
        if self._environments.get(name) is not None:
            raise Exception(
                f"environment \"{name}\" has already been instantiated")
        self._environments[name] = GymEnv(
            self._client_fwd, self._channel_counter, initial_state, self._send_next_state)
        self._channel_counter += 1
        return self._environments[name]

    async def _send_next_state(self, state, actions, channel: int):
        self._channel_is_busy_status[channel] = True
        await self._client_fwd.send_next_state(channel, state, actions)
        while self._channel_is_busy_status[channel] == True:
            # TODO figure out why packets are not received without some sleep
            await asyncio.sleep(0.0001)
        result = self._channel_buffer[channel]
        del self._channel_buffer[channel]
        return result
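
`_send_next_state` waits for its reply by polling `_channel_is_busy_status` with a short `asyncio.sleep`. The sleep is what yields control to the event loop so the message-handling task can run, which is probably why packets are not received without it. One alternative, which is not what this commit does, is to keep one `asyncio.Future` per channel and resolve it from the callback. A minimal sketch; `PendingReplies` and `demo` are hypothetical names, and the delayed `resolve` call stands in for the forward model's reply.

```python
import asyncio
from typing import Any, Dict


class PendingReplies:
    """Hypothetical helper: one pending asyncio.Future per channel id."""

    def __init__(self) -> None:
        self._pending: Dict[int, "asyncio.Future[Any]"] = {}

    def expect(self, channel: int) -> "asyncio.Future[Any]":
        # Must be called from inside a running event loop.
        future = asyncio.get_running_loop().create_future()
        self._pending[channel] = future
        return future

    def resolve(self, channel: int, payload: Any) -> None:
        # Called by the message handler when the reply for `channel` arrives.
        future = self._pending.pop(channel, None)
        if future is not None and not future.done():
            future.set_result(payload)


async def demo() -> Any:
    replies = PendingReplies()
    future = replies.expect(0)
    # Stand-in for the forward model answering a little later.
    asyncio.get_running_loop().call_later(
        0.01, replies.resolve, 0, {"sequence_id": 0, "is_complete": False})
    # Awaiting the future suspends this coroutine without a polling loop.
    return await future
```

In the wrapper, `expect` would be called before sending the request and `resolve` from `_on_next_game_state`, keyed on `sequence_id`.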
2 changes: 1 addition & 1 deletion python3/requirements.txt
@@ -1,2 +1,2 @@
 asyncio==3.4.3
-websockets==8.1
+websockets==10.1
