Add Docker Env and web demo through Cog #63

Open · wants to merge 2 commits into base: `master`
12 changes: 11 additions & 1 deletion README.md
@@ -3,6 +3,15 @@ This is a reference implementation of 3D Ken Burns Effect from a Single Image [1

<a href="https://arxiv.org/abs/1909.05483" rel="Paper"><img src="http://content.sniklaus.com/kenburns/paper.jpg" alt="Paper" width="100%"></a>


## demo
Click the link below to run inference through Replicate's web demo:

[Demo and Docker image on Replicate](https://replicate.com/sniklaus/3d-ken-burns)


<a href="https://replicate.com/sniklaus/3d-ken-burns"><img src="https://replicate.com/sniklaus/3d-ken-burns/badge"></a>

## setup
Several functions are implemented in CUDA using CuPy, which is why CuPy is a required dependency. It can be installed using `pip install cupy` or alternatively using one of the provided [binary packages](https://docs.cupy.dev/en/stable/install.html#installing-cupy) as outlined in the CuPy repository. Please also make sure to have the `CUDA_HOME` environment variable configured.
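Since a misconfigured `CUDA_HOME` only surfaces later as a failed CuPy kernel build, a small hypothetical helper (not part of the repository, shown purely for illustration) can check the variable up front; the `CUDA_PATH` fallback is an assumption about how some environments are configured:

```python
import os


def cuda_home_configured(env=None):
    """Return the configured CUDA toolkit directory, or None if unset.

    CuPy's kernel compilation relies on an environment variable pointing
    at a CUDA toolkit installation, so checking it up front gives a much
    clearer error than a failed kernel build later on.
    """
    env = os.environ if env is None else env
    for key in ("CUDA_HOME", "CUDA_PATH"):
        value = env.get(key)
        if value:
            return value
    return None


if __name__ == "__main__":
    home = cuda_home_configured()
    if home is None:
        print("CUDA_HOME is not set; please configure it before running the project")
    else:
        print("using CUDA toolkit at", home)
```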

@@ -29,6 +38,7 @@ python depthestim.py --in ./images/doublestrike.jpg --out ./depthestim.npy

To benchmark the depth estimation, run `python benchmark-ibims.py` or `python benchmark-nyu.py`. You can use it to easily verify that the provided implementation runs as expected.


## colab
If you do not have a suitable environment to run this project then you could give Colab a try. It allows you to run the project in the cloud, free of charge. There are several people who provide Colab notebooks that should get you started. A few that I am aware of include one from [Arnaldo Gabriel](https://colab.research.google.com/github/agmm/colab-3d-ken-burns/blob/master/automatic-3d-ken-burns.ipynb), one from [Vlad Alex](https://towardsdatascience.com/very-spatial-507aa847179d), and one from [Ahmed Harmouche](https://github.com/wpmed92/3d-ken-burns-colab).

@@ -106,4 +116,4 @@ This is a project by Adobe Research. It is licensed under the [Creative Commons
```

## acknowledgment
The video above uses materials under a Creative Commons license or with the owner's permission, as detailed at the end.
31 changes: 31 additions & 0 deletions cog.yaml
@@ -0,0 +1,31 @@
# Configuration for Cog ⚙️
# Reference: https://github.com/replicate/cog/blob/main/docs/yaml.md

build:
  # set to true if your model requires a GPU
  gpu: true
  cuda: "11.4"

  # python version in the form '3.8' or '3.8.12'
  python_version: "3.8"

  # a list of packages in the format <package-name>==<version>
  python_packages:
    - "ipython==7.33.0"
    - "flask==2.0.3"
    - "gevent==1.3.0"
    - "moviepy==1.0.0"
    - "numpy==1.21.6"
    - "h5py==3.7.0"
    - "torch==1.11.0"
    - "torchvision==0.12.0"

  # commands run after the environment is set up
  run:
    - "apt-get update && apt-get install -y python3-opencv"
    - "pip install opencv-python"
    - "pip install scipy"
    - "pip install cupy-cuda114 --pre"

# predict.py defines how predictions are run on your model
predict: "predict.py:Predictor"
130 changes: 130 additions & 0 deletions predict.py
@@ -0,0 +1,130 @@
#!/usr/bin/env python

"""Cog-based inference for animating a still image (3D Ken Burns effect)."""
# Prediction interface for Cog ⚙️
# https://github.com/replicate/cog/blob/main/docs/python.md

import base64
import getopt
import glob
import io
import math
import os
import random
import re
import shutil
import sys
import tempfile
import time
import urllib
import warnings
import zipfile

import cupy
import cv2
import flask
import gevent
import gevent.pywsgi
import h5py
import moviepy
import moviepy.editor
import numpy
import scipy
import scipy.io
import torch
import torchvision

from cog import BasePredictor, Input, Path

# the import list is kept broad because ./common.py and the model scripts
# below are exec'd into this module's namespace and resolve names from it
warnings.filterwarnings("ignore")

##########################################################

assert int(str("").join(torch.__version__.split(".")[0:2])) >= 12  # requires at least pytorch version 1.2.0

torch.set_grad_enabled(False)  # make sure to not compute gradients for computational performance

torch.backends.cudnn.enabled = True  # make sure to use cudnn for computational performance

##########################################################

objCommon = {}

exec(open("./common.py", "r").read())

exec(open("./models/disparity-estimation.py", "r").read())
exec(open("./models/disparity-adjustment.py", "r").read())
exec(open("./models/disparity-refinement.py", "r").read())
exec(open("./models/pointcloud-inpainting.py", "r").read())

##########################################################


class Predictor(BasePredictor):
    def setup(self):
        """The networks are loaded at module import time by the exec calls above, so no further setup is needed"""
        pass

    def predict(
        self,
        image: Path = Input(description="Input image"),
    ) -> Path:
        """Run a single prediction on the model"""

        npyImage = cv2.imread(filename=str(image), flags=cv2.IMREAD_COLOR)
        intWidth = npyImage.shape[1]
        intHeight = npyImage.shape[0]

        fltRatio = float(intWidth) / float(intHeight)

        # fit the longer side to 1024 pixels while preserving the aspect ratio
        intWidth = min(int(1024 * fltRatio), 1024)
        intHeight = min(int(1024 / fltRatio), 1024)

        npyImage = cv2.resize(
            src=npyImage,
            dsize=(intWidth, intHeight),
            fx=0.0,
            fy=0.0,
            interpolation=cv2.INTER_AREA,
        )

        process_load(npyImage, {})

        objFrom = {
            "fltCenterU": intWidth / 2.0,
            "fltCenterV": intHeight / 2.0,
            "intCropWidth": int(math.floor(0.97 * intWidth)),
            "intCropHeight": int(math.floor(0.97 * intHeight)),
        }

        objTo = process_autozoom(
            {"fltShift": 100.0, "fltZoom": 1.25, "objFrom": objFrom}
        )

        npyResult = process_kenburns(
            {
                "fltSteps": numpy.linspace(0.0, 1.0, 75).tolist(),
                "objFrom": objFrom,
                "objTo": objTo,
                "boolInpaint": True,
            }
        )

        # convert BGR frames to RGB and play the zoom forward then backward for a seamless loop
        output_path = Path(tempfile.mkdtemp()) / "output.mp4"
        moviepy.editor.ImageSequenceClip(
            sequence=[
                npyFrame[:, :, ::-1]
                for npyFrame in npyResult + list(reversed(npyResult))[1:]
            ],
            fps=25,
        ).write_videofile(str(output_path))

        return output_path
    # end
# end
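For reference, the two small computations in `predict` — fitting the longer image side to 1024 pixels and ordering the rendered frames forward then backward for a seamless loop — can be sketched as standalone functions (the names here are hypothetical, for illustration only):

```python
def fit_resolution(width, height, longest=1024):
    """Scale (width, height) so the longer side equals `longest`, keeping the aspect ratio."""
    ratio = float(width) / float(height)
    return min(int(longest * ratio), longest), min(int(longest / ratio), longest)


def pingpong(frames):
    """Append the frames in reverse, skipping the last one, so the clip loops seamlessly."""
    return frames + list(reversed(frames))[1:]
```

With these definitions, a 2048x1024 input is resized to 1024x512, and the 75 rendered steps yield a 149-frame looping clip; note that `fit_resolution` also upscales smaller inputs, since the longer side is always brought to 1024.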