Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization
This repository contains the official PyTorch implementation of our AAAI2025 paper: "Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization".
The Mesorch framework employs a multi-scale parallel architecture that sets a new benchmark in image manipulation localization. By leveraging distinct frequency components and feature hierarchies, it captures both local manipulation traces and global inconsistencies, and its adaptive weighting mechanism fuses the per-branch outputs into a precise, comprehensive localization result.
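The high-level pattern can be illustrated with a minimal, self-contained sketch. Everything below (class name, branch design, layer choices) is hypothetical and only mirrors the general idea of parallel branches with different receptive fields whose predictions are fused by softmax weights; it is not the actual Mesorch implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMultiScaleFusion(nn.Module):
    """Toy stand-in: parallel branches at different dilations, softmax-weighted fusion."""
    def __init__(self, num_branches: int = 4):
        super().__init__()
        # Each "branch" is a single dilated conv here; real branches would be
        # full CNN/Transformer encoders operating at different scales.
        self.branches = nn.ModuleList(
            nn.Conv2d(3, 1, kernel_size=3, padding=2 ** i, dilation=2 ** i)
            for i in range(num_branches)
        )
        # Adaptive weighting head: one weight map per branch, predicted from the input.
        self.weight_head = nn.Conv2d(3, num_branches, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = torch.stack([b(x) for b in self.branches], dim=1)    # (B, K, 1, H, W)
        weights = F.softmax(self.weight_head(x), dim=1).unsqueeze(2)  # (B, K, 1, H, W)
        return (weights * logits).sum(dim=1)                          # fused logits (B, 1, H, W)

fused = ToyMultiScaleFusion()(torch.randn(1, 3, 256, 256))
print(fused.shape)  # torch.Size([1, 1, 256, 256])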
Note:
All code in this project is built on top of the IMDLBenCo repository.
For dataset-related issues or additional resources, please refer to that repository.
Below are the testing and training details for Mesorch based on this repository.
This document provides step-by-step instructions for setting up the environment for the project, ensuring compatibility and successful installation of required dependencies.
git clone git@github.com:scu-zjz/Mesorch.git
Then run the following commands in your terminal:
conda create -n mesorch python==3.10
conda activate mesorch
pip install torch torchvision
pip install imdlbenco
pip install "numpy<2"
To use the pretrained models, download the checkpoints from the following link:
After downloading, place the checkpoints so that the directory structure looks like this:
Mesorch/
├── ckpt_mesorch/
│ └── mesorch-98.pth
├── ckpt_mesorch_p/
│ └── mesorch_p-118.pth
├── extractor/
├── .gitignore
├── balanced_dataset.json
├── LICENSE
├── ...
├── train_mesorch_p.sh
├── train_mesorch.sh
└── train.py
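If you want to verify a downloaded checkpoint before running the test scripts, a minimal inspection with torch.load is enough. The internal key layout of the checkpoint files is not documented here, so the snippet below only lists whatever top-level keys are present:

import torch

ckpt = torch.load("ckpt_mesorch/mesorch-98.pth", map_location="cpu")
if isinstance(ckpt, dict):
    print("top-level keys:", list(ckpt.keys())[:10])
else:
    print("loaded object of type:", type(ckpt))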
All of the following examples are based on the Mesorch model; the Mesorch-P model shares the same testing procedure, with no significant differences. To run the evaluation scripts:
sh test_mesorch_f1.sh
sh test_mesorch_permute_f1.sh
sh test_robust_mesorch.sh
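The *_f1.sh scripts evaluate localization quality with an F1 score over the predicted manipulation masks. For reference, here is the standard computation, assuming the usual pixel-level definition used for localization; this is a generic sketch, not code taken from the test scripts:

import numpy as np

def pixel_f1(pred_mask: np.ndarray, gt_mask: np.ndarray, eps: float = 1e-8) -> float:
    """Pixel-level F1 between two binary masks (1 = manipulated pixel)."""
    pred = pred_mask.astype(bool)
    gt = gt_mask.astype(bool)
    tp = np.logical_and(pred, gt).sum()
    precision = tp / (pred.sum() + eps)
    recall = tp / (gt.sum() + eps)
    return float(2 * precision * recall / (precision + recall + eps))

print(pixel_f1(np.array([[1, 0], [1, 1]]), np.array([[1, 0], [0, 1]])))  # ≈ 0.8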
This part provides instructions on how to configure and execute the training shell script for this project.
To begin the training process, you need to download the pretrained weights for Segformer. Specifically, this project uses the mit-b3 model pretrained on ImageNet. Follow the instructions below to download it from the official Segformer GitHub repository:
- Visit the official Segformer GitHub repository: Segformer GitHub.
- Navigate to the "Training" section in the repository's README or directly access the download link provided for the mit-b3 model.
- Download the pretrained weights for mit-b3.
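After downloading, you can optionally confirm that the mit-b3 file loads as a weight dictionary. The path below is just an example; some releases nest the weights under a "state_dict" key, which the snippet handles:

import torch

ckpt = torch.load("mit_b3.pth", map_location="cpu")
# Fall back to the object itself if there is no nested "state_dict" entry.
state = ckpt.get("state_dict", ckpt)
num_params = sum(t.numel() for t in state.values() if torch.is_tensor(t))
print(f"{len(state)} entries, ~{num_params / 1e6:.1f}M parameters")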
To start the training process, execute the provided .sh shell script. Before running it, make sure that key parameters such as seg_pretrain_path, data_path, and test_data_path are properly configured.
Edit the following parameters in the .sh file as needed (a quick path check follows the list):
- seg_pretrain_path: This should point to the pretrained segmentation model file. Ensure the file exists at the specified location. Example:
  seg_pretrain_path="/mnt/data0/xuekang/workspace/segformer/mit_b3.pth"
- data_path: Path to the training data, either a dataset directory or a JSON index such as balanced_dataset.json. Update this to the location of your training dataset. Example:
  data_path="/mnt/data0/xuekang/workspace/Mesorch/balanced_dataset.json"
- test_data_path: This is the directory containing the testing data. Update this path to the location of your test dataset. Example:
  test_data_path="/mnt/data0/public_datasets/IML/CASIA1.0"
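Before launching, a quick check that the three configured paths actually exist can save a failed run; the values below are simply the examples given above:

from pathlib import Path

paths = {
    "seg_pretrain_path": "/mnt/data0/xuekang/workspace/segformer/mit_b3.pth",
    "data_path": "/mnt/data0/xuekang/workspace/Mesorch/balanced_dataset.json",
    "test_data_path": "/mnt/data0/public_datasets/IML/CASIA1.0",
}
for name, p in paths.items():
    print(f"{name}: {p} -> {'OK' if Path(p).exists() else 'MISSING'}")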
Once the parameters are correctly configured, execute the shell script to start the training process. Use the following command:
sh train_mesorch.sh
If you find our work interesting or helpful, please don't hesitate to give us a star🌟 and cite our paper🥰! Your support truly encourages us!
@misc{zhu2024meso,
title={Mesoscopic Insights: Orchestrating Multi-scale & Hybrid Architecture for Image Manipulation Localization},
author={Xuekang Zhu and Xiaochen Ma and Lei Su and Zhuohang Jiang and Bo Du and Xiwen Wang and Zeyu Lei and Wentao Feng and Chi-Man Pun and Jizhe Zhou},
year={2024},
eprint={2412.13753},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2412.13753},
}