This repository contains a PyTorch implementation of the AAAI 2024 paper, FPRF: Feed-Forward Photorealistic Style Transfer of Large-Scale 3D Neural Radiance Fields.
[Demo video: LLFF_github.mp4]
This code was developed on Ubuntu 18.04 with Python 3.9, CUDA 11.8 and PyTorch 2.0.0. Later versions should work, but have not been tested.
- Python 3.9
- CUDA 11.8
- Single GPU with at least 24 GB of VRAM
Create and activate a virtual environment, then install PyTorch and tiny-cuda-nn:
conda create -n FPRF python=3.9
conda activate FPRF
conda install pytorch==2.0.0 torchvision==0.15.0 torchaudio==2.0.0 pytorch-cuda=11.8 -c pytorch -c nvidia
pip install git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
Install the remaining requirements with pip:
pip install -r requirements.txt
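After installation, you can sanity-check the environment with a short snippet (a minimal check, not part of the repository):

```python
# Quick environment check (illustrative, not part of the repository).
import torch
import tinycudann as tcnn  # raises ImportError if the tiny-cuda-nn bindings failed to build

print(torch.__version__)          # expect 2.0.0
print(torch.cuda.is_available())  # expect True
print(torch.cuda.get_device_properties(0).total_memory / 1024**3, "GiB VRAM")
```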
To run FPRF, please download the LLFF dataset and put it in ./data.
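The layout below is a sketch of how an LLFF scene is typically organized; the exact paths expected by the configs may differ:

```
data/
└── nerf_llff_data/
    └── flower/
        ├── images/            # full-resolution input views
        ├── images_4/          # downsampled views
        └── poses_bounds.npy   # camera poses and depth bounds
```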
Run the command below to train a stylizable 3D scene.
PYTHONPATH=. python plenoxels/main.py --config-path plenoxels/configs/final/LLFF/llff_flower.py
You can change the scene by editing config files in ./plenoxels/configs/final/LLFF/.
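The configs are plain Python files; a hypothetical excerpt is shown below (the key names are illustrative, so consult the actual files in ./plenoxels/configs/final/LLFF/ for the real ones):

```python
# Hypothetical excerpt of a scene config; key names and values are assumptions.
config = {
    "expname": "flower",                          # experiment / scene name
    "logdir": "./logs",                           # where checkpoints and renders go
    "data_dirs": ["data/nerf_llff_data/flower"],  # point this at your scene
    "num_steps": 30_000,                          # training iterations
}
```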
Run the command below to stylize a trained 3D scene with the reference images in ./references.
PYTHONPATH=. python plenoxels/main.py --config-path plenoxels/configs/final/LLFF/llff_flower.py --log-dir logs/flower --render-only
Here are the controllable hyperparameters; a toy sketch of how they interact follows the list.
PYTHONPATH=. python plenoxels/main.py --config-path plenoxels/configs/final/LLFF/llff_flower.py --log-dir logs/flower --render-only --style_path ./references --num_clusters 10 --local_global_blending_ratio 0.3 --temperature 100
- num_clusters - Number of clusters used to segment each reference image.
- local_global_blending_ratio - Ratio of the global style feature used in style transfer; 1 means using only global style features, and 0 means using only local style features.
- temperature - Temperature of the softmax operation for semantic correspondence matching.
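The sketch below illustrates temperature-scaled softmax correspondence matching and local/global style blending. It mirrors the idea described above but is not the repository's actual implementation; all names and shapes are illustrative:

```python
import torch
import torch.nn.functional as F

def stylize_features(scene_feats, style_feats, style_colors, global_style,
                     local_global_blending_ratio=0.3, temperature=100.0):
    """Toy sketch: match each scene point to reference clusters via a
    temperature-scaled softmax, then blend local and global styles.

    scene_feats:  (N, D) semantic features of scene points
    style_feats:  (K, D) features of the K reference clusters (num_clusters)
    style_colors: (K, C) style statistics of each cluster
    global_style: (C,)   style statistics of the whole reference image
    """
    # Cosine-similarity logits between scene points and style clusters.
    logits = F.normalize(scene_feats, dim=-1) @ F.normalize(style_feats, dim=-1).T
    # Higher temperature -> sharper, closer to one-to-one correspondences.
    weights = F.softmax(logits * temperature, dim=-1)  # (N, K)
    local_style = weights @ style_colors               # (N, C)
    # Blend: ratio 1 keeps only the global style, ratio 0 only local styles.
    r = local_global_blending_ratio
    return r * global_style + (1 - r) * local_style
```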
To run inference with a pretrained checkpoint, please download a model.pth file from this link and put it in ./logs/{scene}, e.g., ./logs/flower/model.pth.
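If you want to verify the download before rendering, a quick check is shown below (the structure of the checkpoint dictionary is an assumption):

```python
import torch

# Sanity check that the downloaded checkpoint deserializes;
# the exact keys inside depend on the training code.
ckpt = torch.load("./logs/flower/model.pth", map_location="cpu")
print(type(ckpt), list(ckpt.keys())[:5] if isinstance(ckpt, dict) else None)
```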
If you find our code or paper helpful, please consider citing:
@inproceedings{kim2024fprf,
title={{FPRF}: Feed-Forward Photorealistic Style Transfer of Large-Scale {3D} Neural Radiance Fields},
author={GeonU Kim and Kim Youwang and Tae-Hyun Oh},
year={2024},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
}
This work was supported by LG Display (2022008004) and by Institute of Information & Communications Technology Planning & Evaluation (IITP) grants funded by the Korea government (MSIT) (No. 2022-0-00290, Visual Intelligence for Space-Time Understanding and Generation based on Multi-layered Visual Common Sense; No. RS-2023-00225630, Development of Artificial Intelligence for Text-based 3D Movie Generation; No. 2021-0-02068, Artificial Intelligence Innovation Hub; No. 2019-0-01906, Artificial Intelligence Graduate School Program (POSTECH)).
The implementation of FPRF is largely inspired by and built upon the seminal prior work K-Planes (Fridovich-Keil et al.). We thank the authors of K-Planes for making their code public.