computa_env
is a gym-style environment for developing machine learning agents that interact with a computer.
First, you'll need to install computa_env
:
pip install computer-env # not available yet
Then, you can create a new environment:
from computa_env import LocalGUIEnv
env = LocalGUIEnv()
# We also offer VNCGUIEnv, ChromeEnv, StdioEnv, and AndroidEnv
Call reset()
to start a new episode:
observation = env.reset()
Now you can interact with the environment just like any other gym environment:
while True:
action = env.action_space.sample()
observation, reward, done, info = env.step(action)
print(observation, reward, done, info)
-
General purpose gym-compatible computer interaction environment
-
Supports multiple agents sharing the same machine simultaneously (including you!)
-
LocalGUIEnv
provides system keyboard i/o, mouse i/o, and audiovideo output -
VNCGUIEnv
provides a connection to a VNC server with mouse i/o, keyboard i/o, and video output -
ChromeEnv
provides a connection to a Chrome browser with mouse i/o, keyboard i/o, and video output -
StdioEnv
provides a connection to a system process with keyboard i/o and ansi terminal output -
AndroidEnv
provides a connection to an Android device with touch i/o, keyboard input, and audiovideo output
The action sent in the step(action)
method is merely a request for the computer to execute that action. Other processes may be controlling the keyboard, mouse, audio, and other devices. This means you, other agents, or bots may be able to simultaneously operate on the same computer, but it also means the agent could be 'surprised' by the computer's behavior. For example, the release_key
action might not always result in a key_pressed
observation if another user is still holding the key down.
ComputaEnv modalities are passed in at initialization time. Subclasses like LocalComputaEnv
and VNCComputaEnv
provide convenience initialization for several standard modalities. However, you can easily introduce you own modalities by subclassing the ComputerModality
class:
# TODO make this example work
from computa_env import ComputaEnv, ComputerModality, LocalComputaEnv
class TouchModality(ComputerModality):
def __init__(self, num_fingers = 2, screen_size: Tuple[int, int]):
...
super().__init__(
name='touch',
<<<<<<< HEAD
action_space=spaces.Discrete(2), # or None if not applicable
observation_space=spaces.Box(low=0, high=1, shape=(2,), dtype=np.uint8), # or None if not applicable
=======
action_space=gym.spaces.Discrete(2), # or None if not applicable
observation_space=gym.spaces.Box(low=0, high=1, shape=(2,), dtype=np.uint8), # or None if not applicable
>>>>>>> d9f4883863ce60672e075a784f94e72269e05950
device='mouse',
device_args={'button': 1},
)
...
def observe(self):
# not needed if TouchModality were an action-only modality
...
def act(self, action):
# not needed if TouchModality were an observation-only modality
...
env = ComputaEnv(modalities=LocalComputaEnv.modalities + [MyModality()])
from computa_env import LocalComputaEnv
print(LocalComputaEnv.observation_modalities)
# TODO
print(LocalComputaEnv.action_modalities)
# TODO
from computa_env import VNCComputaEnv
print(VNCComputaEnv.observation_modalities)
# TODO
print(VNCComputaEnv.action_modalities)
# TODO
<<<<<<< HEAD
num_fingers
shouldn't be a parameter. It should be a separate modalityFinger(screen)
. Likewise, using xinput for multiple cursors, you should be able to doMouse(screen)
, and asupports_multimouse
environment will be able to respond intelligently. More generally, there should be a corresponding Environment protocol for all modalities:supports_keyboard
,supports_touch
, etc., and maybe there should be asupports_X = make_protocol(modality_X)
meta-function. =======
d9f4883863ce60672e075a784f94e72269e05950