
Commit

Merge branch 'master' into feature/add_colab_link
curioyang authored Aug 30, 2023
2 parents 58689cd + 6544a5c commit 93769df
Showing 22 changed files with 429 additions and 268 deletions.
12 changes: 11 additions & 1 deletion docs/USAGE_v2.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,15 @@ Type "help", "copyright", "credits" or "license" for more information.

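For k230 model compilation and inference, see the Jupyter script [User_guide](../examples/user_guide/k230_simulate.ipynb), which includes single-input and multi-input examples.

If you run the Jupyter script in Docker, you can refer to [Configure Jupyter lab](https://github.com/kunjing96/docker-jupyterlab#32-%E9%85%8D%E7%BD%AEjupyter-lab) to set it up.
If you run the Jupyter script in Docker, you can use the following commands and then open Jupyter Lab in a browser window.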

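```shell
# Start the container and map port 8889 to the host.
docker run -it --rm --privileged=true -p 8889:8889 --name Kendryte -v `pwd`:/mnt -w /mnt ghcr.io/kendryte/k230_sdk /bin/bash -c "/bin/bash"

# Inside the container: install and start Jupyter Lab.
pip install jupyterlab

# Jupyter Lab listens on 8888 by default; --port matches the mapping above.
jupyter-lab --ip 0.0.0.0 --port 8889 --allow-root
```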

Before running the script, modify the following to suit your needs:

Expand Down Expand Up @@ -153,6 +161,8 @@ subgraph A
end

```
##### Dynamic shape parameters
See [Dynamic shape parameter description](./shape_bucket.md).

#### Code example

Expand Down
14 changes: 13 additions & 1 deletion docs/USAGE_v2_EN.md
Expand Up @@ -46,7 +46,16 @@ Type "help", "copyright", "credits" or "license" for more information.

Model compilation and inference for k230 are demonstrated in the Jupyter script [User_guide](../examples/user_guide/k230_simulate.ipynb), which contains single-input and multi-input examples.

If you run Jupyter scripts in Docker, you can refer to [Configure Jupyter lab](https://github.com/kunjing96/docker-jupyterlab#32-%E9%85%8D%E7%BD%AEjupyter-lab) to configure them.
If you run the Jupyter script in Docker, use the following commands and then open Jupyter Lab in your browser.

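```shell
# Start the container and map port 8889 to the host.
docker run -it --rm --privileged=true -p 8889:8889 --name Kendryte -v `pwd`:/mnt -w /mnt ghcr.io/kendryte/k230_sdk /bin/bash -c "/bin/bash"

# Inside the container: install and start Jupyter Lab.
pip install jupyterlab

# Jupyter Lab listens on 8888 by default; --port matches the mapping above.
jupyter-lab --ip 0.0.0.0 --port 8889 --allow-root
```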


You need to modify the following to suit your needs before executing the script:

Expand Down Expand Up @@ -154,6 +163,9 @@ subgraph A
```

##### Dynamic shape args
Refer to [Dynamic shape args description](./shape_bucket.md).

#### Example

```python
Expand Down
48 changes: 48 additions & 0 deletions docs/shape_bucket.md
@@ -0,0 +1,48 @@
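# ShapeBucket Usage Guide

ShapeBucket is a solution for dynamic shapes: it optimizes a dynamic-shape model according to the range of the input lengths and the specified number of segments. The feature is disabled by default (false) and only takes effect when the corresponding option is turned on. Apart from setting the fields below, the workflow is the same as compiling a static model.

The corresponding fields in CompileOptions:

| Field                       | Type                  | Required | Description                                                                        |
| --------------------------- | --------------------- | -------- | ---------------------------------------------------------------------------------- |
| shape_bucket_enable         | bool                  |          | Whether to enable ShapeBucket; defaults to False. Takes effect when `dump_ir=True` |
| shape_bucket_range_info     | Dict[str, [int, int]] |          | Range of each variable in the input shape dimensions; the minimum must be >= 1     |
| shape_bucket_segments_count | int                   |          | Number of segments the input variable range is divided into                        |
| shape_bucket_fix_var_map    | Dict[str, int]        |          | Fixes variables in the shape dimensions to specific values                         |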
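
The constraints in the table can be checked before compilation. The following is a minimal sketch, not part of nncase: `check_shape_bucket_config` is a hypothetical helper, and apart from the "minimum must be >= 1" rule stated in the table, the other checks (at least one segment, max not below min, no variable both ranged and fixed) are assumptions about what a sensible configuration looks like.

```python
from typing import Dict, List


def check_shape_bucket_config(
    range_info: Dict[str, List[int]],
    segments_count: int,
    fix_var_map: Dict[str, int],
) -> None:
    """Raise ValueError if the settings look inconsistent (hypothetical helper)."""
    # Assumption: at least one segment is required.
    if segments_count < 1:
        raise ValueError("segments_count must be at least 1")
    for name, bounds in range_info.items():
        if len(bounds) != 2:
            raise ValueError(f"{name}: expected [min, max]")
        lo, hi = bounds
        # Stated in the table: the minimum must be >= 1.
        if lo < 1:
            raise ValueError(f"{name}: minimum must be >= 1, got {lo}")
        # Assumption: the range must not be empty.
        if hi < lo:
            raise ValueError(f"{name}: max {hi} is below min {lo}")
        # Assumption: a variable is either ranged or fixed, not both.
        if name in fix_var_map:
            raise ValueError(f"{name}: listed as both ranged and fixed")


# Passes silently for a configuration shaped like the ones in this document.
check_shape_bucket_config(
    {"seq_len": [1, 100], "tgt_seq_len": [1, 100]}, 2, {"batch_size": 3}
)
```

Running such a check first gives an early, readable error instead of a failure deep inside compilation.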

## onnx

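Some dimensions in a model's shape are variable names. Take the inputs of an ONNX model as an example:

> tokens: int64[batch_size, tgt_seq_len]
>
> step: float32[seq_len, batch_size]

The configuration for these inputs is as follows: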

```python
shape_bucket_options = nncase.ShapeBucketOptions()
shape_bucket_options.shape_bucket_enable = True
shape_bucket_options.shape_bucket_range_info = {"seq_len":[1, 100], "tgt_seq_len":[1, 100]}
shape_bucket_options.shape_bucket_segments_count = 2
shape_bucket_options.shape_bucket_fix_var_map = {"batch_size" : 3}
```

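The shape dimensions contain three variables: seq_len, tgt_seq_len, and batch_size. batch_size is a variable, but it is fixed at 3 in actual use, so batch_size = 3 is added to **fix_var_map** and that dimension is fixed to 3 at run time.

seq_len and tgt_seq_len do change in practice, so their actual ranges must be configured in **range_info**. **segments_count** is the number of segments: the range is divided evenly into that many parts, and compile time increases by roughly the same factor.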
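
To make the even division concrete, here is a small sketch of splitting an inclusive range into near-equal segments. It only illustrates the idea; `split_range` is a hypothetical function, and the exact boundaries nncase picks are not described here and may differ.

```python
from typing import List, Tuple


def split_range(lo: int, hi: int, segments: int) -> List[Tuple[int, int]]:
    """Split the inclusive range [lo, hi] into `segments` near-equal parts."""
    if segments < 1 or hi < lo:
        raise ValueError("need segments >= 1 and hi >= lo")
    total = hi - lo + 1
    base, extra = divmod(total, segments)
    parts = []
    start = lo
    for i in range(segments):
        # The first `extra` segments take one additional value each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more segments requested than values available
        parts.append((start, start + size - 1))
        start += size
    return parts


print(split_range(1, 100, 2))  # [(1, 50), (51, 100)]
```

With `range_info = [1, 100]` and `segments_count = 2`, this view gives two buckets; each bucket is compiled separately, which is why compile time grows with the segment count.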

## tflite

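Unlike ONNX, tflite models do not yet label dimension names in the shape. Currently only one dynamic dimension in the inputs is supported, and its name is uniformly configured as -1. The configuration is as follows:

```python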
shape_bucket_options = nncase.ShapeBucketOptions()
shape_bucket_options.shape_bucket_enable = True
shape_bucket_options.shape_bucket_range_info = {"-1":[1, 100]}
shape_bucket_options.shape_bucket_segments_count = 2
shape_bucket_options.shape_bucket_fix_var_map = {"batch_size" : 3}
```
After these options are set, the rest of the compilation flow is the same as for static shapes.
8 changes: 8 additions & 0 deletions python/nncase/__init__.py
Expand Up @@ -357,6 +357,10 @@ class CompileOptions:
    dump_asm: bool
    dump_ir: bool
    dump_dir: str
    shape_bucket_enable: bool
    shape_bucket_range_info: dict
    shape_bucket_segments_count: int
    shape_bucket_fix_var_map: dict

    def __init__(self) -> None:

Expand All @@ -375,6 +379,10 @@ def __init__(self) -> None:
        self.dump_asm = True
        self.dump_ir = False
        self.dump_dir = "tmp"
        self.shape_bucket_enable = False
        self.shape_bucket_range_info = {}
        self.shape_bucket_segments_count = 2
        self.shape_bucket_fix_var_map = {}


class ShapeBucketOptions:
Expand Down
1 change: 1 addition & 0 deletions src/Native/include/nncase/kernels/kernel_utils.h
Expand Up @@ -17,6 +17,7 @@
#include <cassert>
#include <cmath>
#include <cstddef>
#include <nncase/kernels/stackvm/resize_image.h>
#include <nncase/runtime/datatypes.h>
#include <numeric>

Expand Down
94 changes: 94 additions & 0 deletions src/Native/include/nncase/kernels/stackvm/resize_image.h
@@ -0,0 +1,94 @@
/* Copyright 2019-2023 Canaan Inc.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#pragma once
#include <nncase/runtime/stackvm/opcode.h>

using namespace nncase::runtime::stackvm;

using get_coordinate_func_t = float (*)(float, float, float, float, float,
float);
using get_nearest_pixel_func_t = int64_t (*)(float);

get_coordinate_func_t get_coordinate_from_resized(
image_resize_transformation_mode_t coordinate_transform_mode);

get_nearest_pixel_func_t
get_nearest_pixel_from_origin(image_resize_nearest_mode_t nearest_mode);

inline get_coordinate_func_t get_coordinate_from_resized(
    image_resize_transformation_mode_t coordinate_transform_mode) {
    switch (coordinate_transform_mode) {
    case image_resize_transformation_mode_t::asymmetric:
        return [](float x_resized, float x_scale, float, float, float, float) {
            return x_resized * x_scale;
        };
    case image_resize_transformation_mode_t::pytorch_half_pixel:
        return [](float x_resized, float x_scale, float length_resized, float,
                  float, float) {
            return length_resized > 1 ? (x_resized + 0.5f) * x_scale - 0.5f
                                      : 0.0f;
        };
    case image_resize_transformation_mode_t::align_corners:
        return [](float x_resized, float, float length_resized,
                  float length_original, float, float) {
            return length_resized == 1 ? 0
                                       : x_resized * (length_original - 1) /
                                             (length_resized - 1);
        };
    case image_resize_transformation_mode_t::tfcrop_and_resize:
        return [](float x_resized, float, float length_resized,
                  float length_original, float roi_start, float roi_end) {
            auto orig =
                length_resized > 1
                    ? roi_start * (length_original - 1) +
                          (x_resized * (roi_end - roi_start) *
                           (length_original - 1)) /
                              (length_resized - 1)
                    : 0.5 * (roi_start + roi_end) * (length_original - 1);
            return static_cast<float>(orig);
        };
    default: // "image_resize_transformation_mode_t::half_pixel"
        return [](float x_resized, float x_scale, float, float, float, float) {
            return ((x_resized + 0.5f) * x_scale) - 0.5f;
        };
    }
}

inline get_nearest_pixel_func_t
get_nearest_pixel_from_origin(image_resize_nearest_mode_t nearest_mode) {
    switch (nearest_mode) {
    case image_resize_nearest_mode_t::round_prefer_ceil:
        return [](float x_original) {
            return static_cast<int64_t>(std::round(x_original));
        };
    case image_resize_nearest_mode_t::floor:
        return [](float x_original) {
            return static_cast<int64_t>(std::floor(x_original));
        };
    case image_resize_nearest_mode_t::ceil:
        return [](float x_original) {
            return static_cast<int64_t>(std::ceil(x_original));
        };
    default: // default is round_prefer_floor
        return [](float x_original) {
            // for half way cases prefer floor
            if (x_original == static_cast<int64_t>(x_original) + 0.5f) {
                return static_cast<int64_t>(std::floor(x_original));
            }
            return static_cast<int64_t>(std::round(x_original));
        };
    }
}
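
The two selectors above can be mirrored in a few lines of Python to see how the modes differ on a small case. This is only an illustrative sketch: the names `COORD_MODES`, `NEAREST_MODES`, `c_round`, and `source_indices` are made up for this example, `tfcrop_and_resize` is left out because it needs ROI bounds, and the scale is assumed to be input length divided by output length (the header itself does not define it).

```python
import math
from typing import Callable, Dict, List

# (x_resized, x_scale, length_resized, length_original) -> source coordinate
CoordFn = Callable[[float, float, float, float], float]


def c_round(x: float) -> int:
    """Round half away from zero like C++ std::round (Python's round() differs)."""
    f = math.floor(x)
    diff = x - f
    if diff > 0.5 or (diff == 0.5 and x >= 0):
        return f + 1
    return f


def round_prefer_floor(x: float) -> int:
    # Exact .5 ties go down; int() truncates like static_cast<int64_t>.
    if x == int(x) + 0.5:
        return math.floor(x)
    return c_round(x)


COORD_MODES: Dict[str, CoordFn] = {
    "half_pixel": lambda x, s, lr, lo: (x + 0.5) * s - 0.5,
    "pytorch_half_pixel": lambda x, s, lr, lo: ((x + 0.5) * s - 0.5) if lr > 1 else 0.0,
    "asymmetric": lambda x, s, lr, lo: x * s,
    "align_corners": lambda x, s, lr, lo: 0.0 if lr == 1 else x * (lo - 1) / (lr - 1),
}

NEAREST_MODES: Dict[str, Callable[[float], int]] = {
    "round_prefer_floor": round_prefer_floor,
    "round_prefer_ceil": c_round,
    "floor": math.floor,
    "ceil": math.ceil,
}


def source_indices(mode: str, nearest: str, length_original: int,
                   length_resized: int) -> List[int]:
    """Source index chosen for every output position along one axis."""
    scale = length_original / length_resized  # assumed scale convention
    coord = COORD_MODES[mode]
    pick = NEAREST_MODES[nearest]
    return [pick(coord(x, scale, length_resized, length_original))
            for x in range(length_resized)]


# Upsampling a length-2 axis to length 4:
print(source_indices("asymmetric", "round_prefer_floor", 2, 4))  # [0, 0, 1, 1]
print(source_indices("asymmetric", "round_prefer_ceil", 2, 4))   # [0, 1, 1, 2]
```

The two printed rows differ only at the exact half-way coordinates 0.5 and 1.5, which is the case the `round_prefer_floor` branch handles specially.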
4 changes: 3 additions & 1 deletion src/Native/src/kernels/stackvm/optimized/opt_ops.h
Expand Up @@ -17,13 +17,13 @@
*/
#pragma once
#include <nncase/kernels/kernel_context.h>
#include <nncase/kernels/kernel_utils.h>
#include <nncase/runtime/datatypes.h>
#include <nncase/runtime/error.h>
#include <nncase/runtime/result.h>
#include <nncase/runtime/stackvm/opcode.h>
#include <nncase/tensor.h>
#include <nncase/value.h>

BEGIN_NS_NNCASE_KERNELS_MODULE(stackvm)
namespace optimized {

Expand Down Expand Up @@ -111,6 +111,8 @@ NNCASE_API result<void> resize_nearest_neighbor(
gsl::span<const size_t> in_shape, gsl::span<const size_t> in_strides,
gsl::span<const size_t> out_strides, int32_t out_h, int32_t out_w,
bool align_corners, bool half_pixel_centers,
get_coordinate_func_t get_coordinate_func,
get_nearest_pixel_func_t get_nearset_func,
kernel_context &context) noexcept;

NNCASE_API result<void>
Expand Down
36 changes: 25 additions & 11 deletions src/Native/src/kernels/stackvm/optimized/resize_image.cpp
Expand Up @@ -89,7 +89,10 @@ result<void> resize_nearest_neighbor_impl(
const T *input, T *output, gsl::span<const size_t> in_shape,
NNCASE_UNUSED gsl::span<const size_t> in_strides,
NNCASE_UNUSED gsl::span<const size_t> out_strides, int32_t out_h,
int32_t out_w, bool align_corners, bool half_pixel_centers,
int32_t out_w, NNCASE_UNUSED bool align_corners,
NNCASE_UNUSED bool half_pixel_centers,
get_coordinate_func_t get_coordinate_func,
get_nearest_pixel_func_t get_nearset_func,
NNCASE_UNUSED kernel_context &context) noexcept {
auto scales = kernels::detail::get_resize_scales(in_shape, out_h, out_w,
align_corners);
Expand All @@ -110,15 +113,23 @@ result<void> resize_nearest_neighbor_impl(
auto *output_ptr = begin_output_ptr + oc * out_image_size;

for (int oy = 0; oy < out_h; oy++) {
auto in_y = kernels::detail::get_nearest_neighbor(
oy, in_shape[2], height_scale, align_corners,
half_pixel_centers);
auto iy = get_coordinate_func(oy, height_scale, out_h,
in_shape[2], 0, 0);
int64_t in_y = get_nearset_func(iy);
if (in_y < 0)
in_y = 0;
if (in_y >= in_shape[2])
in_y = in_shape[2] - 1;
auto *in_row = input_ptr + in_y * in_shape[3];

for (int ox = 0; ox < out_w; ox++) {
auto in_x = kernels::detail::get_nearest_neighbor(
ox, in_shape[3], width_scale, align_corners,
half_pixel_centers);
auto ix = get_coordinate_func(ox, width_scale, out_w,
in_shape[3], 0, 0);
int64_t in_x = get_nearset_func(ix);
if (in_x < 0)
in_x = 0;
if (in_x >= in_shape[3])
in_x = in_shape[3] - 1;
*output_ptr++ = in_row[in_x];
}
}
Expand Down Expand Up @@ -264,10 +275,11 @@ inline result<void> resize_bilinear_impl(
half_pixel_centers, context);

#define RESIZE_NEAREST_NEIGHBOR_IMPL(type) \
resize_nearest_neighbor_impl(reinterpret_cast<const type *>(input), \
reinterpret_cast<type *>(output), in_shape, \
in_strides, out_strides, out_h, out_w, \
align_corners, half_pixel_centers, context);
resize_nearest_neighbor_impl( \
reinterpret_cast<const type *>(input), \
reinterpret_cast<type *>(output), in_shape, in_strides, out_strides, \
out_h, out_w, align_corners, half_pixel_centers, get_coordinate_func, \
get_nearset_func, context);

result<void> optimized::resize_bilinear(
typecode_t type, const gsl::byte *input, gsl::byte *output,
Expand All @@ -283,6 +295,8 @@ result<void> optimized::resize_nearest_neighbor(
gsl::span<const size_t> in_shape, gsl::span<const size_t> in_strides,
gsl::span<const size_t> out_strides, int32_t out_h, int32_t out_w,
bool align_corners, bool half_pixel_centers,
get_coordinate_func_t get_coordinate_func,
get_nearest_pixel_func_t get_nearset_func,
kernel_context &context) noexcept {
FP_OR_Q_IMPL(type, RESIZE_NEAREST_NEIGHBOR_IMPL);
}
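
The reworked loop in this file maps each output pixel to a source coordinate, rounds it with the selected nearest mode, and clamps the result into the valid index range. A minimal Python sketch of that flow for a single 2-D channel follows; `resize_nearest` and `half_pixel` are hypothetical names, and the scale is assumed to be input size divided by output size, which is not shown in the diff.

```python
import math
from typing import Callable, List


def resize_nearest(
    image: List[List[float]],
    out_h: int,
    out_w: int,
    get_coordinate: Callable[[float, float, float, float], float],
    get_nearest: Callable[[float], int],
) -> List[List[float]]:
    """Nearest-neighbor resize of one channel with pluggable coordinate/rounding."""
    in_h, in_w = len(image), len(image[0])
    height_scale = in_h / out_h  # assumed scale convention (input / output)
    width_scale = in_w / out_w
    out = []
    for oy in range(out_h):
        iy = get_nearest(get_coordinate(oy, height_scale, out_h, in_h))
        iy = min(max(iy, 0), in_h - 1)  # clamp, as the new kernel code does
        row = image[iy]
        out_row = []
        for ox in range(out_w):
            ix = get_nearest(get_coordinate(ox, width_scale, out_w, in_w))
            ix = min(max(ix, 0), in_w - 1)
            out_row.append(row[ix])
        out.append(out_row)
    return out


def half_pixel(x: float, scale: float, length_resized: float,
               length_original: float) -> float:
    return (x + 0.5) * scale - 0.5


img = [[1, 2], [3, 4]]
result = resize_nearest(img, 4, 4, half_pixel, math.floor)
# half_pixel maps output 0 to -0.25, floor gives -1, and the clamp pulls it to 0.
print(result)  # [[1, 1, 1, 2], [1, 1, 1, 2], [1, 1, 1, 2], [3, 3, 3, 4]]
```

Passing the coordinate and rounding functions as arguments keeps the loop itself independent of the mode, which is the same design choice the added `get_coordinate_func` and `get_nearset_func` parameters make.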
4 changes: 3 additions & 1 deletion src/Native/src/kernels/stackvm/reference/instance_norm.cpp
Expand Up @@ -35,10 +35,12 @@ result<void> instance_norm_impl(const float *input, const float *scale,
float epsilon) {
return apply(in_shape, [&](gsl::span<const size_t> index) -> result<void> {
auto c = index[1];
auto offi = index[0] * in_shape[1] + index[1];
auto off = offset(in_strides, index);
const auto x = input[off];
output[offset(out_strides, index)] =
scale[c] * (x - input_mean[c]) / std::sqrt(input_var[c] + epsilon) +
scale[c] * (x - input_mean[offi]) /
std::sqrt(input_var[offi] + epsilon) +
bias[c];
return ok();
});
Expand Down
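
The `offi` change above switches the mean and variance lookup from a per-channel index to a per-sample, per-channel index. A small Python sketch shows why that matters. It is an illustration only: `instance_norm` here is a hypothetical reference over nested lists of shape [N][C][L], and it assumes the statistics are stored flat with one entry per (n, c) pair and use population variance.

```python
import math
from typing import List


def instance_norm(
    x: List[List[List[float]]],
    scale: List[float],
    bias: List[float],
    epsilon: float = 1e-5,
) -> List[List[List[float]]]:
    """Instance normalization over input shaped [N][C][L]."""
    n_count, c_count = len(x), len(x[0])
    # Assumed layout: one mean/variance per (n, c) pair, stored flat.
    mean = [0.0] * (n_count * c_count)
    var = [0.0] * (n_count * c_count)
    for n in range(n_count):
        for c in range(c_count):
            vals = x[n][c]
            m = sum(vals) / len(vals)
            mean[n * c_count + c] = m
            var[n * c_count + c] = sum((v - m) ** 2 for v in vals) / len(vals)

    out = []
    for n in range(n_count):
        sample = []
        for c in range(c_count):
            offi = n * c_count + c  # same index as in the C++ fix
            denom = math.sqrt(var[offi] + epsilon)
            sample.append(
                [scale[c] * (v - mean[offi]) / denom + bias[c] for v in x[n][c]]
            )
        out.append(sample)
    return out


# Two samples with very different statistics but one shared channel.
data = [[[1.0, 3.0]], [[10.0, 30.0]]]
normed = instance_norm(data, scale=[1.0], bias=[0.0], epsilon=0.0)
print(normed)
```

Both samples normalize to the same values because each uses its own statistics; indexing the mean and variance by `c` alone would apply the first sample's statistics to the second one.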
3 changes: 3 additions & 0 deletions src/Native/src/kernels/stackvm/reference/ref_ops.h
Expand Up @@ -18,6 +18,7 @@
#pragma once
#include <nncase/kernels/apply.h>
#include <nncase/kernels/kernel_context.h>
#include <nncase/kernels/kernel_utils.h>
#include <nncase/runtime/datatypes.h>
#include <nncase/runtime/error.h>
#include <nncase/runtime/result.h>
Expand Down Expand Up @@ -345,6 +346,8 @@ NNCASE_API result<void> resize_nearest_neighbor(
gsl::span<const size_t> in_shape, gsl::span<const size_t> in_strides,
gsl::span<const size_t> out_strides, int32_t out_h, int32_t out_w,
bool align_corners, bool half_pixel_centers,
get_coordinate_func_t get_coordinate_func,
get_nearest_pixel_func_t get_nearset_func,
kernel_context &context) noexcept;

NNCASE_API result<void> reverse_sequence(
Expand Down