Merge branch 'PaddlePaddle:develop' into test
mikemikimike authored Jan 7, 2025
2 parents 6aedca9 + 2bd7bff commit f548061
Showing 122 changed files with 14,067 additions and 234 deletions.
66 changes: 57 additions & 9 deletions README.md
@@ -34,23 +34,22 @@


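## 📰News
**🔥2024.11.21 - 2024.12.22 PaddleMIX Development Project Challenge (ended)**
**🔥Live course on 2025.01.07: new PaddlePaddle PP-series models released!**

- ✨「Experience Officer Recruitment」PaddleMIX Development Project Challenge
Click the link to register 🔗: https://aistudio.baidu.com/activitydetail/1503019366
🏆Submit to the PaddlePaddle Galaxy Community Project Hall; featured projects receive a PaddleMIX Experience Officer certificate and JD.com gift card rewards
Everyone is welcome to submit~
- ✨PP-DocBee: a new 'Bee'-ginning in document image understanding!
To help you quickly and thoroughly understand the **PP-DocBee document understanding model** in **PaddleMIX** and master hands-on skills, a senior Baidu R&D engineer will explain PP-DocBee's core technology in detail and demonstrate the full multimodal large model development workflow step by step at **19:00 on January 7 (Tuesday)**. Scan the QR code on the poster below to register!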
<details>
<summary>Click to expand the event poster</summary>
<p align="center">
<img src='https://github.com/user-attachments/assets/27e0bbe3-0ff8-49ef-bd39-81a31a2b288b' width="25%">
<img src='https://github.com/user-attachments/assets/3b7adc9e-c68d-44d1-9674-05b933947deb' width="80%">
</p>
</details>

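## 📣Latest Developments

<!-- 📚 "PaddleMIX 2.1, the PaddlePaddle multimodal large model development suite, released": full coverage of image, text, audio, and video scenarios, with multimodal efficiency supporting industrial innovation. Supports ultra-large-scale training, covering image-text pre-training, text-to-image, and cross-modal vision tasks, across industries such as finance, education, e-commerce, and healthcare. Join the live session at 20:00 on August 8 (Thursday) for the latest multimodal large model architectures, an in-depth look at the PaddleMIX high-performance model library, and a step-by-step demonstration of the full LLaVA training and inference workflow. [Registration link](https://www.wjx.top/vm/wKqysjx.aspx?udsid=449688) -->

**🎉 2024.01.02 Added inference and training for the in-house document understanding model [PP-DocBee](./paddlemix/examples/ppdocbee), with support for [high-performance inference](./deploy/ppdocbee)**

**🎉 2024.12.17 Support for [GOT-OCR2_0](./paddlemix/examples/GOT_OCR_2_0) inference and training**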

@@ -64,15 +63,16 @@

**🎉 2024.11.1 Support for [LLaVA-OneVision](./paddlemix/examples/llava_onevision/) and [LLaVA-Critic](./paddlemix/examples/llava_critic/) inference**


<details>
<summary>Click to expand more</summary>

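**🎉 2024.10.31 Welcoming the update of the external developers' [creative tutorial page](paddlemix_applications.md)**

* 🌟 Since the large model suite premium project collection campaign launched on September 6, we have received 30 high-quality developer projects, 25 of which have passed platform evaluation and been featured.

* 🙏 Sincere thanks to all developers for their wonderful creations based on the suite! 🚀 You are warmly invited to share your ideas too: publish your tutorials on a public web page or in the [PaddlePaddle AI Studio](https://aistudio.baidu.com/aistudio/community/multimodal?from=singlemessage) community!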

<details>
<summary>Click to expand more</summary>

**🔥2024.10.11 Released PaddleMIX v2.1**
* Supports [PaddleNLP 3.0 beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta0), so you can try its latest features early.
* Added cutting-edge models such as [Qwen2-VL](./paddlemix/examples/qwen2_vl/), [InternVL2](./paddlemix/examples/internvl2/), and [Stable Diffusion 3 (SD3)](https://github.com/PaddlePaddle/PaddleMIX/blob/develop/ppdiffusers/examples/dreambooth/README_sd3.md).
@@ -290,6 +290,7 @@ python setup.py install
<ul>
<li><a href="paddlemix/examples/groundingdino">Grounding DINO</a></li>
<li><a href="paddlemix/examples/sam">SAM</a></li>
<li><a href="paddlemix/examples/sam2">SAM2</a></li>
<li><a href="paddlemix/examples/YOLO-World">YOLO-World</a></li>
</ul>
</ul>
@@ -358,6 +359,53 @@ python setup.py install
For more model capabilities, see the [Model Capability Matrix](./paddlemix/examples/README.md)



## 📊Multimodal Data Processing Toolbox DataCopilot
<table align="center">
<tbody>
<tr align="center" valign="center">
<td>
<b>Basic Capabilities</b>
</td>
<td>
<b>Data Analysis</b>
</td>
<td>
<b>Data Generation</b>
</td>
</tr>
<tr valign="top">
<td>
<ul>
</ul>
<li><b>Documentation</b></li>
<ul>
<li><a href="paddlemix/datacopilot">DataCopilot</a></li>
</ul>
</td>
<td>
<ul>
</ul>
<li><b>Capability Tagging Model</b></li>
<ul>
<li><a href="paddlemix/datacopilot/example">PP-InsCapTagger</a></li>
</ul>
</td>
<td>
<ul>
</ul>
<li><b>Document Data Generation Solution</b></li>
<ul>
<li><a href="paddlemix/datacopilot/example">PP-InfinityDocData</a></li>
</ul>
</td>
</tr>
</tbody>
</table>

For more data-related features, see the [DataCopilot](./paddlemix/datacopilot) home page


## 🏆Featured Models | Tools

### 💎Cross-modal Task Pipeline AppFlow
22 changes: 12 additions & 10 deletions README_EN.md
@@ -37,21 +37,21 @@

**🔥PaddleMIX Development Project Challenge (November 21 - December 22, 2024)**

**🔥2024.11.21 - 2024.12.22 PaddleMIX Development Project Challenge (Ended)**

- ✨「Experience Officer Recruitment」PaddleMIX Development Project Challenge
Click the link to register 🔗: [https://aistudio.baidu.com/activitydetail/1503019366](https://aistudio.baidu.com/activitydetail/1503019366)
🏆 Submit to the PaddlePaddle Galaxy Community Project Hall to be featured and receive a PaddleMIX Experience Officer certification certificate and JD.com card incentives.
Everyone is welcome to submit~
**🔥Live Course on January 7th, 2025: New PaddlePaddle PP Series Models Released!**

- ✨PP-DocBee: A New 'Bee'-ginning in Document Image Understanding!
To help you quickly and deeply understand **PaddleMIX**'s **PP-DocBee document understanding model** and master practical skills, Baidu's senior R&D engineers will provide a detailed explanation of PP-DocBee's core technology and demonstrate the complete development process of multimodal large models at **19:00 on January 7th (Tuesday)**. Scan the QR code below to register now!
<details>
<summary>Click to view the event poster</summary>
<summary>Click to expand event poster</summary>
<p align="center">
<img src='https://github.com/user-attachments/assets/27e0bbe3-0ff8-49ef-bd39-81a31a2b288b' width="25%">
<img src='https://github.com/user-attachments/assets/3b7adc9e-c68d-44d1-9674-05b933947deb' width="80%">
</p>
</details>


## 📣 Latest Developments
**🎉 2024.01.02 Added support for [PP-DocBee](./paddlemix/examples/ppdocbee) inference and training, supporting [high-performance inference](./deploy/ppdocbee)**

**🎉 2024.12.17 Support for [InternVL2_5 (1B, 2B, 4B, 8B)](./paddlemix/examples/internvl2) inference**

@@ -63,14 +63,15 @@ Everyone is welcome to submit~

**🎉 2024.11.1 Support for [LLaVA-OneVision](./paddlemix/examples/llava_onevision/) and [LLaVA-Critic](./paddlemix/examples/llava_critic/) inference**


<details>
<summary>Click to expand more</summary>

**🎉 2024.10.31 Welcoming the update of the external developers' [creative tutorial page](paddlemix_applications.md)**
* 🌟 Since the launch of our Large Model Suite Premium Project Collection activity on September 6th, we have received 30 high-quality developer projects. Among them, 25 premium projects have successfully passed the platform evaluation and been featured.

* 🙏 We sincerely thank all developers for their wonderful creations based on our suite! 🚀 We cordially invite you to share your creativity as well - welcome to publish your tutorials on public web pages or in the [PaddlePaddle AI Studio](https://aistudio.baidu.com/aistudio/community/multimodal?from=singlemessage) community!

<details>
<summary>Click to expand more</summary>

**🔥 PaddleMIX v2.1 Released on 2024.10.11**
* Supports the [PaddleNLP 3.0 beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta0) version, allowing early access to its latest features.
* Added cutting-edge models like [Qwen2-VL](./paddlemix/examples/qwen2_vl/), [InternVL2](./paddlemix/examples/internvl2/), and [Stable Diffusion 3 (SD3)](https://github.com/PaddlePaddle/PaddleMIX/blob/develop/ppdiffusers/examples/dreambooth/README_sd3.md).
@@ -284,6 +285,7 @@ python setup.py install
<ul>
<li><a href="paddlemix/examples/groundingdino">Grounding DINO</a></li>
<li><a href="paddlemix/examples/sam">SAM</a></li>
<li><a href="paddlemix/examples/sam2">SAM2</a></li>
<li><a href="paddlemix/examples/YOLO-World">YOLO-World</a></li>
</ul>
</ul>
6 changes: 3 additions & 3 deletions build_env.sh
@@ -21,16 +21,16 @@ echo "Starting installation of PaddleMIX and its dependencies..."

# Install PaddleMIX
echo "Installing PaddleMIX..."
pip install -e .
pip install -e . --user

# Install ppdiffusers
echo "Installing ppdiffusers..."
cd ppdiffusers
pip install -e .
pip install -e . --user
cd ..
# Note: some ppdiffusers models require CUDA 11.2 or later. If your local machine does not meet this requirement, consider running training and inference on [AI Studio](https://aistudio.baidu.com/index).
# To train or run inference in **bf16**, use a GPU that supports **bf16**, such as the A100.

# Install dependencies
echo "Installing dependencies..."
pip install -r requirements.txt
pip install -r requirements.txt --user
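
Review note on the `--user` change above: `pip install --user` places packages in the per-user site-packages directory, and pip refuses a `--user` install inside a virtual environment that does not expose user site-packages. The following is a minimal sketch for checking where such an install would land. It is not part of `build_env.sh`, and the helper name `in_virtualenv` is hypothetical; it uses only the standard-library `site` and `sys` modules.

```python
import site
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv/virtualenv.

    In a virtual environment sys.prefix points at the environment,
    while sys.base_prefix points at the base installation.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    # Directory that `pip install --user` would target for this interpreter.
    print("user site-packages:", site.getusersitepackages())
    # Whether the interpreter adds the user site directory to sys.path.
    print("user site enabled: ", site.ENABLE_USER_SITE)
    print("inside virtualenv: ", in_virtualenv())
```

If `inside virtualenv` prints `True`, the `--user` variants of the commands are likely to fail unless the environment was created with access to system site-packages.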
49 changes: 49 additions & 0 deletions deploy/ppdocbee/README.md
@@ -0,0 +1,49 @@
# PP-DocBee

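## 1. Model Introduction

PP-DocBee is a multimodal large model developed in-house by the PaddleMIX team. It focuses on document understanding and performs very well on Chinese document understanding tasks. The model is based on the Qwen2-VL-2B architecture and optimized for document understanding scenarios: it was fine-tuned on nearly 5 million multimodal document-understanding samples, covering general VQA, OCR, charts, text-rich documents, math and complex reasoning, synthetic data, and plain-text data, with different training data mixing ratios. On several authoritative academic English document understanding benchmarks, PP-DocBee reaches SOTA among models of the same parameter scale on nearly all of them. On metrics for internal Chinese business scenarios, PP-DocBee also outperforms currently popular open-source and closed-source models.

## 2. Environment Setup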

- **python >= 3.10**
- **paddlepaddle-gpu >= 3.0.0b2 or the develop build**
```bash
# Example: installing the develop build of paddlepaddle-gpu
python -m pip install paddlepaddle-gpu==0.0.0.post118 -f https://www.paddlepaddle.org.cn/whl/linux/gpu/develop.html
```

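- **paddlenlp: a specific version is required**

Run the following commands from the PaddleMIX/ code directory to install the required paddlenlp version:
```bash
# Installation example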
git submodule update --init --recursive
cd PaddleNLP
git reset --hard e91c2d3d634b12769c30aa419ddf931c20b7ca9f
pip install -e .
cd csrc
python setup_cuda.py install
```

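> Note:
* Make sure the dependencies above are installed; otherwise the model cannot run. You also need to install the custom ops under paddlemix/external_ops with `python setup.py install`. If the ops still cannot be found after installation, additionally set PYTHONPATH.
* flash_attn is enabled by default; using it requires an A100/A800 or H20 GPU.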
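
The version requirements in this section can be checked before running anything else. The following is a minimal sketch, not part of the PP-DocBee code: the helper names `python_ok` and `installed_paddle_version` are hypothetical, and the script relies only on `sys.version_info` and `paddle.__version__`. It deliberately does not compare the paddle version numerically, because the develop build in the install command above is published as `0.0.0.post118`, which a naive `>= 3.0.0b2` comparison would reject.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum from the requirement list in this section


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when (major, minor) meets the minimum Python version."""
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_paddle_version():
    """Return paddle.__version__, or None when paddle cannot be imported."""
    try:
        import paddle
    except ImportError:
        return None
    return paddle.__version__


if __name__ == "__main__":
    print("python >= 3.10:", python_ok())
    # Printed for manual inspection only; develop builds do not carry a
    # release-style version number, so no automatic comparison is made.
    print("paddle version:", installed_paddle_version())
```

A result of `None` for the paddle version means paddlepaddle-gpu is not importable in the current environment.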

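## 3. High-Performance Inference

In the high-performance inference optimization of PP-DocBee, **the vision model continues to use the model implementation in PaddleMIX, while the language model calls the high-performance Qwen2 language model in PaddleNLP**, yielding a high-performance inference version of PP-DocBee.

### 3.1. High-performance inference with text and a single image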
```bash
python deploy/ppdocbee/single_image_infer.py \
--model_name_or_path PaddleMIX/PPDocBee-2B-1129 \
--dtype bfloat16 \
--benchmark True
```

- Average end-to-end latency on internal Chinese business scenarios, measured on an NVIDIA A100-SXM4-80GB:

| model       | Paddle Inference | PyTorch  | Paddle dynamic graph |
| ----------- | ---------------- | -------- | -------------------- |
| PPDocBee-2B | 0.9267 s         | 1.7114 s | 1.5935 s             |
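
To make the table easier to read, the latencies can be expressed relative to Paddle Inference. The sketch below only does arithmetic on the three numbers reported in the table; the dictionary keys are labels chosen here for illustration (the last one stands for the Paddle dynamic-graph column).

```python
# Average end-to-end latency in seconds, copied from the table above.
latency_s = {
    "Paddle Inference": 0.9267,
    "PyTorch": 1.7114,
    "Paddle dynamic graph": 1.5935,
}

baseline = latency_s["Paddle Inference"]

# Ratio of each backend's latency to the Paddle Inference latency.
speedup = {name: t / baseline for name, t in latency_s.items()}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x the Paddle Inference latency")
```

On these figures, PyTorch takes about 1.85 times and the Paddle dynamic graph about 1.72 times as long as Paddle Inference per request.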