Commit

a blog to introduce profiling with perf in wamr
lum1n0us committed May 7, 2024
1 parent d781683 commit 1612211
Showing 3 changed files with 16,087 additions and 0 deletions.
Empty file.
122 changes: 122 additions & 0 deletions content/en/blog/wamr_support_perf/index.md
@@ -0,0 +1,122 @@
---
title: "Profile Wasm applications with perf in WAMR JIT"
description: "WAMR JIT supports linux perf"
excerpt: ""
date: 2023-12-22T13:14:15+08:00
lastmod: 2023-12-22T13:14:15+08:00
draft: false
weight: 50
images: []
categories: ["profiling", "tool"]
tags: ["profiling"]
contributors: ["lum1n0us"]
pinned: false
homepage: false
---

Profiling a Wasm application can provide valuable insights into its performance. In this blog post, we'll explore how to use [Linux perf](https://perf.wiki.kernel.org/index.php/Main_Page) to analyze Wasm applications running on WAMR with JIT compilation.

Linux perf is a versatile performance analysis tool that helps developers understand and optimize the behavior of their applications. It provides detailed information about various aspects of program execution, including CPU usage, memory access patterns, and function call traces.

## With perf report

Let's dive into profiling a Wasm application using WAMR (AOT and JIT) and Linux perf.

1. Before profiling, recompile WAMR with the LLVM JIT and AOT compilation options enabled:

```bash
$ cmake -S . -B build -DWAMR_BUILD_JIT=1 -DWAMR_BUILD_AOT=1
```

2. Use perf to record a profile:

```bash
# perf.data.raw is the perf output. It records the call stack for every sample event,
# but it can't translate jitted function addresses into jitted function names.
$ perf record -k mono -g --output=perf.data.raw -- iwasm --perf-profile <.wasm or .aot>
```

2.1 Merge the jitted symbol information

*Only needed if iwasm is running in JIT mode. AOT doesn't need this step.*

```bash
# read the jit-xxx.dump file generated by WAMR and merge the jitted symbol information
$ perf inject --jit --input=perf.data.raw --output=perf.data
```
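
The record and inject steps can be collected in a small helper. The sketch below is illustrative only: the function name `perf_profile_cmds` and the rule "a `.wasm` input runs under JIT, an `.aot` input does not" are assumptions made for this example, not part of WAMR. It only prints the commands so they can be reviewed before being run, and for AOT it records straight into _perf.data_ because there is no inject step to produce that file.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of WAMR): print the perf commands for one input file.
# Assumption: a .wasm input is executed by the JIT, a .aot input is not.
perf_profile_cmds() {
    local target="$1"
    case "$target" in
        *.wasm)
            # JIT mode: record first, then merge the jitted symbols.
            echo "perf record -k mono -g --output=perf.data.raw -- iwasm --perf-profile $target"
            echo "perf inject --jit --input=perf.data.raw --output=perf.data"
            ;;
        *.aot)
            # AOT mode: no inject step, so record straight into perf.data.
            echo "perf record -k mono -g --output=perf.data -- iwasm --perf-profile $target"
            ;;
        *)
            echo "unsupported input: $target" >&2
            return 1
            ;;
    esac
}
```

For example, `perf_profile_cmds app.wasm` prints two commands, while `perf_profile_cmds app.aot` prints one.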

You can use `perf report` to review _perf.data_. It includes performance counter profile information recorded via `perf record`.

```
76.07% 0.00% iwasm libc.so.6 [.] __libc_start_call_main
|
---__libc_start_call_main
main
|
--68.33%--app_instance_main
wasm_application_execute_main
execute_main
wasm_runtime_call_wasm
wasm_call_function
call_wasm_with_hw_bound_check
wasm_interp_call_wasm
llvm_jit_call_func_bytecode
wasm_runtime_invoke_native
push_args_end
aot_func#1
aot_func#32
|
--68.33%--aot_func#5
|
--68.32%--aot_func#4
|
--68.19%--aot_func#2
68.33% 0.00% iwasm iwasm [.] app_instance_main
|
---app_instance_main
wasm_application_execute_main
execute_main
wasm_runtime_call_wasm
wasm_call_function
call_wasm_with_hw_bound_check
wasm_interp_call_wasm
llvm_jit_call_func_bytecode
wasm_runtime_invoke_native
push_args_end
aot_func#1
aot_func#32
|
--68.33%--aot_func#5
|
--68.32%--aot_func#4
|
--68.19%--aot_func#2
```

## With Flamegraph

A [flame graph](https://github.com/brendangregg/FlameGraph) is a visualization of a program's call stacks. It gives a clear overview of where CPU time is spent during program execution.

The steps below build on the _perf.data_ generated earlier. First, download [FlameGraph](https://github.com/brendangregg/FlameGraph).

```bash
$ perf script -i perf.data > out.perf
# fold stacks
$ ./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
# render a flamegraph
$ ./FlameGraph/flamegraph.pl out.folded > perf.svg
```
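
Each line of a folded file, as produced by `stackcollapse-perf.pl`, holds one call stack with frames separated by `;`, followed by a space and the sample count. That makes the file easy to query before rendering. The sketch below is a hypothetical helper (the name `leaf_samples` is made up for this example) that sums samples by the innermost frame, giving a quick text view of where CPU time is spent.

```bash
#!/usr/bin/env bash
# Hypothetical helper: sum sample counts per leaf (innermost) frame of a folded file.
# Each input line has the form "frame1;frame2;...;frameN COUNT".
leaf_samples() {
    awk '
        NF >= 2 {
            count = $NF                      # sample count is the last field
            stack = $0
            sub(/[ \t]+[0-9]+$/, "", stack)  # drop the trailing count
            n = split(stack, frames, ";")
            total[frames[n]] += count        # frames[n] is the innermost frame
        }
        END {
            for (f in total) printf "%d %s\n", total[f], f
        }
    ' "$1" | sort -k1,1nr -k2
}
```

A typical call would be `leaf_samples out.folded | head`, which lists the hottest leaf frames first.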

Because jitted functions all have names of the form _aot_func#NUMBER_, the flamegraph is hard for developers to read. A script is available to translate these names back into the original Wasm function names.

```bash
# translate jitted function names into their original wasm function names
$ python trans_wasm_func_name.py --wabt_home <wabt installation> --folded out.folded <wasm>
# render a flamegraph from the translated stacks
$ ./FlameGraph/flamegraph.pl out.folded.translated > perf.svg
```
![example flamegraph](./perf.svg)
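
Since the translated file only exists after the translation script has run, a render step can pick its input defensively. This is a minimal sketch with a hypothetical helper name, `pick_folded`, assuming the translated file sits next to the folded file with a `.translated` suffix, as in the commands above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the folded file that should be rendered.
# Prefers <folded>.translated when it exists, otherwise falls back to <folded>.
pick_folded() {
    local folded="${1:-out.folded}"
    if [ -f "${folded}.translated" ]; then
        echo "${folded}.translated"
    else
        echo "${folded}"
    fi
}
```

It could then be used as `./FlameGraph/flamegraph.pl "$(pick_folded out.folded)" > perf.svg`.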


**For more details, please refer to the [WAMR performance tuning doc](https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/doc/perf_tune.md#7-use-linux-perf).**