[PaddlePaddle Multimodal Large Model Suite PaddleMIX Development Competition] rfc & code #890

Open
wants to merge 17 commits into
base: develop
Changes from 6 commits
1,307 changes: 1,307 additions & 0 deletions paddlemix/datacopilot/example/iqa_filter/filter_example.ipynb

Large diffs are not rendered by default.

Collaborator

Please don't commit these binary documents. You can attach them to an issue and put a link here, or package them up and I'll upload them for you later.

Author

I'll delete it ~ It's really just a PDF exported from the ipynb ~

Collaborator

Sounds good. Could those images be removed as well? Package them up and put them in an issue, then add the link in the ipynb.

Author

I'll delete them ~ The images can be downloaded from https://hf-mirror.com/datasets/adamo1139/llava-instruct-150k-with-images ~ I'll mention that in my example too ~

Binary file not shown.
1 change: 1 addition & 0 deletions paddlemix/datacopilot/example/iqa_filter/llava_tmp_10.json

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion paddlemix/datacopilot/nn/__init__.py
@@ -14,5 +14,5 @@


from ._lid import FastTextLIDModel
from .arniqa import ARNIQA
from .inscaptagger import PPInsCapTagger
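
With `ARNIQA` exported from `paddlemix.datacopilot.nn`, a filtering step could keep only the samples whose quality score clears a threshold. The sketch below is illustrative only: `filter_by_quality`, `fake_score`, the `quality` field, and the `0.5` threshold are hypothetical names and values that do not appear in this PR, and a precomputed score stands in for an actual model call.

```python
from __future__ import annotations

from typing import Callable, Iterable


def filter_by_quality(
    items: Iterable[dict],
    score_fn: Callable[[dict], float],
    threshold: float = 0.5,
) -> list[dict]:
    """Keep the items whose quality score is at least `threshold`."""
    return [item for item in items if score_fn(item) >= threshold]


def fake_score(item: dict) -> float:
    """Stand-in scorer (assumption): reads a precomputed score instead of running a model."""
    return item["quality"]


samples = [
    {"image": "a.jpg", "quality": 0.82},
    {"image": "b.jpg", "quality": 0.31},
    {"image": "c.jpg", "quality": 0.50},
]

kept = filter_by_quality(samples, fake_score, threshold=0.5)
print([s["image"] for s in kept])  # ['a.jpg', 'c.jpg']
```

Passing the scorer in as a callable keeps the filter independent of any particular model, so a real scorer could be swapped in without changing the filtering logic.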

15 changes: 15 additions & 0 deletions paddlemix/datacopilot/nn/arniqa/__init__.py
@@ -0,0 +1,15 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from .arniqa import ARNIQA
99 changes: 99 additions & 0 deletions paddlemix/datacopilot/nn/arniqa/arniqa.py
@@ -0,0 +1,99 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from __future__ import annotations

from pathlib import Path

import paddle
import paddle.nn.functional as F

from .pd_model_encoder.x2paddle_code import Sequential as encoder_paddle_model
from .pd_model_regressor.x2paddle_code import (
    TorchLinearRegression as regressor_paddle_model,
)


class ARNIQA(paddle.nn.Layer):
    """
    ARNIQA: Learning Distortion Manifold for Image Quality Assessment

    @inproceedings{agnolucci2024arniqa,
        title={ARNIQA: Learning Distortion Manifold for Image Quality Assessment},
        author={Agnolucci, Lorenzo and Galteri, Leonardo and Bertini, Marco and Del Bimbo, Alberto},
        booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
        pages={189--198},
        year={2024}
    }

    Reference:
        - Arxiv link: https://www.arxiv.org/abs/2310.14918
        - Official Github: https://github.com/miccunifi/ARNIQA
    """

    def __init__(
        self,
        default_mean: tuple[float, float, float] = (0.485, 0.456, 0.406),
        default_std: tuple[float, float, float] = (0.229, 0.224, 0.225),
        feat_dim: int = 2048,
    ):
        super(ARNIQA, self).__init__()
        # Per-channel normalization constants, shaped to broadcast over NCHW input.
        self.default_mean = paddle.to_tensor(default_mean).view([1, 3, 1, 1])
        self.default_std = paddle.to_tensor(default_std).view([1, 3, 1, 1])
        self.feat_dim = feat_dim
        self.encoder = encoder_paddle_model()
        self.regressor = regressor_paddle_model()

        # Load the converted weights stored next to this file.
        encoder_paddle_params = paddle.load(str(Path(__file__).parent / "pd_model_encoder" / "model.pdparams"))
        regressor_paddle_params = paddle.load(str(Path(__file__).parent / "pd_model_regressor" / "model.pdparams"))

        self.encoder.set_dict(encoder_paddle_params, use_structured_name=True)
        self.regressor.set_dict(regressor_paddle_params, use_structured_name=True)

    def forward(self, x: paddle.Tensor) -> paddle.Tensor:
        # Encode the image at full and at half resolution.
        x, x_ds = self._preprocess(x)

        f = F.normalize(self.encoder(x), axis=1)
        f_ds = F.normalize(self.encoder(x_ds), axis=1)
        # Concatenate both feature vectors into shape [N, feat_dim * 2].
        f_combined = paddle.hstack((f, f_ds)).reshape([-1, self.feat_dim * 2])

        score = self.regressor(f_combined)
        score = self._scale_score(score)

        return score

    def _preprocess(self, x: paddle.Tensor) -> tuple[paddle.Tensor, paddle.Tensor]:
        # Downsample before normalizing, then normalize both resolutions.
        x_ds = F.interpolate(x, scale_factor=0.5, mode="bilinear", align_corners=False)
        x = (x - self.default_mean) / self.default_std
        x_ds = (x_ds - self.default_mean) / self.default_std
        return x, x_ds

    def _scale_score(self, score: paddle.Tensor) -> paddle.Tensor:
        # Linearly map the regressor output from [1, 100] to [0, 1] (no clamping).
        new_range = (0.0, 1.0)

        # Compute scaling factors
        original_range = (1, 100)
        original_width = original_range[1] - original_range[0]
        new_width = new_range[1] - new_range[0]
        scaling_factor = new_width / original_width

        # Scale score
        scaled_score = new_range[0] + (score - original_range[0]) * scaling_factor

        return scaled_score

    def __call__(self, item: paddle.Tensor) -> paddle.Tensor:
        return self.forward(item)

    def inference(self, item: paddle.Tensor) -> paddle.Tensor:
        return self.forward(item)
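
The rescaling in `_scale_score` is a plain linear map from the range (1, 100) onto (0.0, 1.0). The sketch below reproduces that arithmetic in pure Python so the mapping can be checked by hand. The names `scale_score` and `clamp` are hypothetical helpers, not part of this PR, and the sketch divides by the original width directly rather than precomputing a scaling factor. Because the method as written does not clamp, a raw score outside the original range lands outside [0, 1]; `clamp` shows one possible guard, offered as an assumption rather than as the author's intent.

```python
from __future__ import annotations


def scale_score(
    score: float,
    original_range: tuple[float, float] = (1.0, 100.0),
    new_range: tuple[float, float] = (0.0, 1.0),
) -> float:
    """Linearly map `score` from original_range onto new_range (no clamping)."""
    original_width = original_range[1] - original_range[0]
    new_width = new_range[1] - new_range[0]
    return new_range[0] + (score - original_range[0]) * new_width / original_width


def clamp(value: float, low: float = 0.0, high: float = 1.0) -> float:
    """Optional guard (assumption): pin out-of-range values to [low, high]."""
    return max(low, min(high, value))


if __name__ == "__main__":
    # The range endpoints map to 0 and 1, the midpoint 50.5 maps to 0.5,
    # and a raw score of 120 exceeds 1 unless it is clamped.
    for raw in (1.0, 50.5, 100.0, 120.0):
        scaled = scale_score(raw)
        print(f"raw={raw:6.1f} scaled={scaled:.4f} clamped={clamp(scaled):.4f}")
```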
13 changes: 13 additions & 0 deletions paddlemix/datacopilot/nn/arniqa/pd_model_encoder/__init__.py
@@ -0,0 +1,13 @@
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.