Added Dockerfile (SakanaAI#21)
* add docker

* update docker

* update dockerfile

* Update README.md

* Update README.md

* Update README.md

---------

Co-authored-by: Cong Lu <[email protected]>
t46 and conglu1997 authored Aug 19, 2024
1 parent 6b6b456 commit 29d56ef
Showing 2 changed files with 112 additions and 1 deletion.
89 changes: 89 additions & 0 deletions Dockerfile
@@ -0,0 +1,89 @@
# Use Python 3.11 as the base image
FROM python:3.11-bullseye

# Avoid prompts from apt
ENV DEBIAN_FRONTEND=noninteractive

# Set working directory
WORKDIR /app

# Install system dependencies including texlive-full
RUN apt-get update && apt-get install -y --no-install-recommends \
    wget=1.21-1+deb11u1 \
    git=1:2.30.2-1+deb11u2 \
    build-essential=12.9 \
    libssl-dev=1.1.1w-0+deb11u1 \
    zlib1g-dev=1:1.2.11.dfsg-2+deb11u2 \
    libbz2-dev=1.0.8-4 \
    libreadline-dev=8.1-1 \
    libsqlite3-dev=3.34.1-3 \
    libncursesw5-dev=6.2+20201114-2+deb11u2 \
    xz-utils=5.2.5-2.1~deb11u1 \
    tk-dev=8.6.11+1 \
    libxml2-dev=2.9.10+dfsg-6.7+deb11u4 \
    libxmlsec1-dev=1.2.31-1 \
    libffi-dev=3.3-6 \
    liblzma-dev=5.2.5-2.1~deb11u1 \
    texlive-full=2020.20210202-3 \
    && rm -rf /var/lib/apt/lists/*

# Upgrade pip
RUN pip install --no-cache-dir --upgrade pip==24.2

# Install Python packages
RUN pip install --no-cache-dir \
    anthropic==0.34.0 \
    aider-chat==0.50.1 \
    backoff==2.2.1 \
    openai==1.40.6 \
    matplotlib==3.9.2 \
    pypdf==4.3.1 \
    pymupdf4llm==0.0.10 \
    torch==2.4.0 \
    numpy==1.26.4 \
    transformers==4.44.0 \
    datasets==2.21.0 \
    tiktoken==0.7.0 \
    wandb==0.17.7 \
    tqdm==4.66.5 \
    scikit-learn==1.5.1 \
    einops==0.8.0

# Clone and install NPEET with a specific commit
RUN git clone https://github.com/gregversteeg/NPEET.git
WORKDIR /app/NPEET
RUN git checkout 8b0d9485423f74e5eb199324cf362765596538d3 \
    && pip install .

# Clone the AI-Scientist repository
WORKDIR /app
RUN git clone https://github.com/SakanaAI/AI-Scientist.git

# Set working directory to AI-Scientist
WORKDIR /app/AI-Scientist

# Prepare NanoGPT data
RUN python data/enwik8/prepare.py && \
    python data/shakespeare_char/prepare.py && \
    python data/text8/prepare.py

# Set up baseline runs
RUN for dir in templates/*/; do \
        if [ -f "${dir}experiment.py" ]; then \
            cd "${dir}" || continue; \
            python experiment.py --out_dir run_0 && \
            python plot.py; \
            cd /app/AI-Scientist || exit; \
        fi \
    done

# Create entrypoint script
RUN printf '#!/bin/bash\n\
python launch_scientist.py "$@"\n' > /app/entrypoint.sh && \
chmod +x /app/entrypoint.sh

# Set the entrypoint
ENTRYPOINT ["/app/entrypoint.sh"]

# Set the default command to an empty array
CMD []
24 changes: 23 additions & 1 deletion README.md
@@ -28,7 +28,7 @@ We further provide all runs and data from our paper [here](https://drive.google.
10. [Grokking Through Compression: Unveiling Sudden Generalization via Minimal Description Length](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/mdl_grokking_correlation.pdf)
11. [Accelerating Mathematical Insight: Boosting Grokking Through Strategic Data Augmentation](https://github.com/SakanaAI/AI-Scientist/tree/main/example_papers/data_augmentation_grokking.pdf)

- **Note**: Caution! This codebase will execute LLM-written code. There are various risks and challenges associated with this autonomy. This includes e.g. the use of potentially dangerous packages, web access, and potential spawning of processes. Use at your own discretion. Please make sure to containerize and restrict web access appropriately.
+ **Note**: Caution! This codebase will execute LLM-written code. There are various risks and challenges associated with this autonomy. This includes e.g. the use of potentially dangerous packages, web access, and potential spawning of processes. Use at your own discretion. Please make sure to [containerize](#containerization) and restrict web access appropriately.

<p align="center">
<a href="https://github.com/SakanaAI/AI-Scientist/blob/main/example_papers/adaptive_dual_scale_denoising/adaptive_dual_scale_denoising.pdf"><img src="https://github.com/SakanaAI/AI-Scientist/blob/main/docs/anim-ai-scientist.gif" alt="Adaptive Dual Scale Denoising" width="80%" />
@@ -43,6 +43,7 @@ We further provide all runs and data from our paper [here](https://drive.google.
5. [Template Resources](#template-resources)
6. [Citing The AI Scientist](#citing-the-ai-scientist)
7. [Frequently Asked Questions](#faq)
8. [Containerization](#containerization)

## Requirements

@@ -270,3 +271,24 @@ Please refer to the instructions for different templates. In this current iterat
### How do I add support for a new foundation model?
Please see this [PR](https://github.com/SakanaAI/AI-Scientist/pull/7) for an example of how to add a new model, e.g. this time for Claude via Bedrock.
We do not advise any model that is significantly weaker than GPT-4 level for The AI Scientist.

## Containerization

We include a [community-contributed](https://github.com/SakanaAI/AI-Scientist/pull/21) `Dockerfile` that may assist with your containerization efforts.

Once you have built an image from it, you can run it like this:

```bash
# Entrypoint script: arguments after the image name are passed to launch_scientist.py
docker run -e OPENAI_API_KEY=$OPENAI_API_KEY <AI_SCIENTIST_IMAGE> \
--model "gpt-4o-2024-05-13" \
--experiment 2d_diffusion \
--num-ideas 1
```

```bash
# Interactive
docker run -it -e OPENAI_API_KEY=$OPENAI_API_KEY \
--entrypoint /bin/bash \
<AI_SCIENTIST_IMAGE>
```
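
The commands above assume an image already exists. As a minimal sketch of the full flow, the script below builds the image from the repository root and then launches one run. The tag `ai-scientist:local` and the helper names `run`, `build_image`, and `launch_experiment` are assumptions made for illustration; they are not part of the repository. The script only prints the commands by default, and `DRY_RUN=0` executes them. Because the `Dockerfile` installs `texlive-full` and runs the baseline experiments at build time, the build is likely to take a long time.

```bash
#!/usr/bin/env bash
# Sketch: build the image and launch one run.
# Prints the docker commands unless DRY_RUN=0 is set.

# Hypothetical tag standing in for <AI_SCIENTIST_IMAGE>.
IMAGE_TAG="${IMAGE_TAG:-ai-scientist:local}"
DRY_RUN="${DRY_RUN:-1}"

# Echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build from the directory that contains the Dockerfile.
build_image() {
  run docker build -t "$IMAGE_TAG" .
}

# Arguments after the image name are forwarded to launch_scientist.py
# by the entrypoint script. "-e OPENAI_API_KEY" without a value passes
# the variable through from the host environment.
launch_experiment() {
  run docker run -e OPENAI_API_KEY "$IMAGE_TAG" "$@"
}

build_image
launch_experiment --model "gpt-4o-2024-05-13" --experiment 2d_diffusion --num-ideas 1
```

Passing `-e OPENAI_API_KEY` by name keeps the key out of the printed command line and out of shell history.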
