Skip to content

Latest commit

 

History

History
292 lines (176 loc) · 14.3 KB

CG-09-29.md

File metadata and controls

292 lines (176 loc) · 14.3 KB

WebAssembly logo

Agenda for the September 29th video call of WebAssembly's Community Group

  • Where: zoom.us
  • When: September 29th, 4pm-5pm UTC (September 29th, 9am-10am Pacific Daylight Time)
  • Location: link on calendar invite

Registration

None required if you've attended before. Send an email to the acting WebAssembly CG chair to sign up if it's your first time. The meeting is open to CG members only.

Logistics

The meeting will be on a zoom.us video conference. Installation is required, see the calendar invite.

Agenda items

  1. Opening, welcome and roll call
    1. Opening of the meeting
    2. Introduction of attendees
  2. Find volunteers for note taking (acting chair to volunteer)
  3. Adoption of the agenda
  4. Proposals and discussions
    1. Review of action items from prior meeting.
    2. POLL: Memory 64 to phase 2 (Wouter Van Oortmerssen) [5-10 min]
    3. Presentation and feedback gathering on branch hinting (issue) (Yuri Iozzelli) [20 min]
    4. Fix typing of select (WebAssembly/reference-types#116) (Andreas Rossberg) [20 min]
    5. POLL: Relaxed dead code validation to phase 1 (Conrad Watt and Ross Tate) [5-10 min]
  5. Closure

Agenda items for future meetings

None

Schedule constraints

None

Meeting Notes

Attendees

  • Derek Schuff
  • Wouter Van Oortmerssen
  • Zalim Basharov
  • Sergey Rubanov
  • Sabine
  • Fatih Bakir
  • Conrad Watt
  • Yuri Iozzelli
  • Ingvar Stepanyan
  • Ross Tate
  • Francis McGabe
  • Paolo Severini
  • Daniel Hillerström
  • Rick
  • Sven Sauleau
  • Lars Hansen
  • Nick Fitzgerald
  • Yury Delendik
  • Ioanna Dimitriou
  • Paul Dworzanski
  • Arun Purushan
  • Thomas Lively
  • Jay Phelps
  • Daniel Wirtz
  • Benjamin Titzer
  • Luke Wagner
  • Tobias Tebbi
  • Keith Miller
  • Asumu Takikawa
  • Nabeel Al-Shamma
  • Steve Sanderson
  • Jakob Kummerow
  • Rich Winterton
  • Zhi An Ng

Memory64 proposal to phase 2 (Wouter Van Oortmerssen)

Wouter Van Oortmerssen is taking over as the champion for this proposal.

WV: This proposal now has had some discussion, and implementations in LLVM/lld and wabt; working on Binaryen currently. There is one open question right now, namely how constant offsets in loads/stores are applied. The current text says that they wrap, but discussion is happening on the issue.

KM: I haven't had a chance to look at that yet.

WV: load and store have an offset in addition to the address. So you end up with a 65 bit result, so you have to figure out how to handle it. There are several options. We could check with branches. That would be slow. We could wrap around at 64 bits, which could result in people addressing low memory unintentionally instead of high memory. So far this is the status quo, easy to implement and doesn’t require lots of checking. We could also disallow offsets in 64 bit memories. But the offset is very useful for things like accessing structs.

KM: Also thinking about the implementation questions, around VM tricks like we use in wasm32

WV: Yeah it’s definitely harder. There’s a risk that wasm64 will be slower because of the bounds checks; any input on techniques is welcome.

LW: is that issue #3 on the repo?

WV: that’s the right one, if you look at Ben’s comment, it lists 6 options to do the wrapping, for completeness sake. I recommend everyone to read those 6 options and list those in preference. So far option 2 looks like the only one that will be fast enough.

DS: I'm not hearing any objections or more questions, let's have the poll for phase 2:

Poll: Memory64 to phase 2

SF F N A SA
13 16 4 0 0

Poll passes

switching presenter; meanwhile:

Announcement from Ross Tate:

RT: SOIL Initiative is starting a seminar series. First one is Monday October 12 at 12 PM EDT/9 AM PDT Then next one on Friday Oct 23 at 4 PM EDT/1 PM PDT

First up is Thomas Lively on module splitting

Presentation and feedback-gathering on branch hinting (issue) (Yuri Iozzelli)

Yuri Iozzelli presenting

WV: is the reason for having this proposal mostly to delay compilation of these unlikely blocks? Most modern CPUs seem to ignore such branch hinting.

YI: AFAIK, x86 doesn’t really care about branch hinting, actually saying this branch is more likely doesn’t do anything. No knowledge on other archs. It’s still useful to do these kind of things, in C++20 there is new likely/unlikely for forcing compiler to move code that is not hot out of the hot path or do different instruction selection. E.g. better to do a branch on +ve or -ve version of conditions. That can inform engine to generate different instructions. In practice we know this can make a difference, we saw it in Wasm and also people are doing it for native compilation. This could also be useful for profile-guided optimization.

WV: have you compared this against optimization that LLVM can do?

YI: problem is that for Wasm, there is no control on layout of machine code. No way of saying this if block in the middle of Wasm, you can’t say where it is in the final machine code. We tried some workaround, put it in a loop, then put at the end, other downsides, loop has more instructions. These are workarounds, if we can convince v8 to do that, it might stop working next version.

PP: in gcc compatible compilers, it moves the code around, less likely branch get moved to the end. Can we implement this in clang?

YI: clang in this case produce Wasm, but how can you say what the VM will do in the final compilation?

KM: at least in JSC, we do different optimizations if we see an unlikely block, we won’t hoist things out of unlikely block into hot path, we assume that code isn’t executed a lot, we don’t want to do tail duplication etc.

PP: will affect code gen for sure, in C++ native, when you use builtin, you’re not producing instruction that is different, you are affecting codegen of the native code, in that slow path gets lower priority and not hoisted to hot path. We can do the same for Clang in Wasm. Make it similar to native compilers.

TL: we can do all that normal clang side optimizations for unlikely blocks, most of that should work today, Clang supports C++20 attributes.

PP: if it doesn’t then it’s a bug

TL: won’t be surprised if adding that information to binary, so engine can continue to make decisions based on likely/unlikely blocks, can improve perf. Open question before standardization, we want to measure how much of the win can be gotten from toolchain improvements, and how much is really locked on the engine, and we need to tell the engine that.

RW: If you want to send me some POC code, we wan look at why it got faster. We can determine if it was caches, what caused performance gains.

TT: reg allocator is affected. We put all the reg moves in deferred code, might be the biggest in V8.

YI: yea think that is it, we see reg alloc is better.

RW: you’re seeing reg alloc is better?

KM: also icache

YI: yea both of that. Knowing one of it is deferred in the end.

RW: that’s what I want to measure, to make sure we know what is helping performance.

YI: we don’t have a small test case, we can trace our actual application and get some of the codegen. I can make an artificial test case that can have an if branch taken 1% of the time and loops forever. For small examples, the reg alloc benefits won’t be realized.

DS: It looks like there is some interest in this proposal. The procedure is that you put an issue on the proposal. Next step is call to advance to stage 1, if there is enough interest, at which point you get your own forked repo.

AR: before that, another question. Previous times, anytime optimization hints came up, we ask if it can be done in a custom section, why does it have to be a language extension.

DS: It does seem similar to name section, where we do have a spec through the CG for it.

AR: then it’s a tool convention, not in the core spec. Might affect how you work on the proposal. Make sense to keep options open. This keeps coming back, we need to have a more general answer on how to deal with optimization hints. Need to have a more scalable story.

YI: that’s a possibly we didn’t talk about

RT: also some hints aren’t semantics preserving, toolchain can optimize, but engines won’t trust it, might be only useful for toolchains. Agree with AR, good space to explore

TL: moving this to phase 1 repo does not preclude any of the outcomes, good thing to do

DS: one difference between this and tool conventions is that we will expect engines to interpret this. That's like the name section but unlike the things in the tool-conventions repo where we haven't gone through any kind of standards process. Anyway, nothing needs to be done today.

TL: do we want to take a vote? We have that carve out for phase 1 (where we don't need advance notice).

DS: YI do you want to take a vote today? You’ll be the champion.

YI: I wanted to see reaction. If there is a possibly of doing a vote for Phase 1, sure.

[some other discussion]

DS: we are running a bit behind on time, maybe we take this offline and discuss more, e.g. how we want to scope this; we can easily bring it back.

Fix typing of select (WebAssembly/reference-types#116) (Andreas Rossberg)

TODO(AR): slides

WV: discussion somewhere suggestion that we might want to change how we handle unreachable in terms of type checking, if that goes through, will this still be required?

AR: this is more conservative, that will subsume this. I assume that discussion, judging from previous discussion, will take a long time, and much more implementation work. Will not want to block on that.

CW: we’re going to be talking about wider change to unreachable in the later talk. Agree that we can go ahead with this.

RT: bottom type will have long term problems, unreachable changes will resolve this. So this SGTM to as a short-term.

AR: I'm not aware of any problems with use of bottom type here?

DS: Before we go into more general issues, I'm hearing agreement on this particular tweak; we should go ahead with unanimous consent, any objections?

Add type for unreachable states (WebAssembly/design#1379) (Conrad Watt and Ross Tate)

CW presenting

AR: you keep saying there is a special case for select. There is an abstraction that pops and operand from the abstract stack, all you have is that there is an extra type that is unknown which accepts anything. Select does not require any special handling there.

CW: it's not just special casing of select, and of other instructions that consume the type select produces. There is another question of what to do with this unknown type. In the spec, you can say x subtype y then you can type this instruction. In impl, need to have short circuits.

AR: that’s why we have a bottom type, which didn’t affect anything else

RT: e.g. adding dup instructions.

CW: i think AR’s proposal can handle dup, which is why i didn’t bring it up here. AR is completely right that adding Bot type gives you a consistent view. But you’ll have to consider it when adding new operations.

AR: no, it only show up in subtyping, bot is subtype of everything. Pretty much the only place.

RT: branch on null will have to say that bot -> bot.

CW: It’s not that it is impossible to handle, it just looks like more trouble than it should be

AR: you’ll still have special case, that’s your mode

CW: you have to know the mode when doing validation…

AR: i will predict this is more complicated, because currently you can completely encapsulate in pop instruction, but now you have to make that distinction in every check you want to skip

CW: plan to talk about it in more details, not necessarily now, maybe next meeting. You say encapsulated in Pop abstraction, it is pretty big, especially if proposal add new types.

AR: pop abstraction is like 5 lines. None of these discussions is new.

CW: Specifically now, the web compat, practical pain that implementors have faced

KM: what’s the web compat issue?

AR: everyone need to agree on what is dead

CW: one path for making things simpler is not allow dead code. But now due to web compat, we cant think of that at all, since there are already modules with dead code after unreachable.

AR: some of us argued strongly against dead code, more work for codegen

CW: my point is that the discussion is simpler since some paths are closed out

KM: in JSC we ignore the stack once you get past unreachable.

CW: that’s what this proposal is suggesting

BT: don’t think you can hide things behind Pop, in particular, return call ref, you have to pop funcref off stack, then check signature against function you’re in, since it’s a tail call, you can’t hide that behind pop

CW: you can’t hide not doing it

BT: nothing to do with select

CW: we can think about a little more, we have 2 mins left, great discussion points on hopefully proposal repository and future presentations

RT: haven’t heard from implementors yet

CW: maybe they can express opinions in the form of a consensus vote

AR: there are various concerns from previous discussion, you make it observable distinction between decoding and validation. Currently not having this observable gives a lot of leeway to spec and implementation. E.g. in SIMD discussion, we keep moving stuff around, should it be validation/decoding error...

CW: if we are putting in effort into moving it around, people are already caring about it. If we are only moving a small thing like magnitude of immediates.

AR: practical issue, spec might optimize it for one way, implementation might optimize it for another way

CW: agree that there needs to be more discussion

AR: i think you’re pretty native about this, underestimating the amount of work

CW: checking immediates will let them reuse a lot of work, looked at existing implemetors

DS: clearly this needs a lot more discussion, but the bar for phase 1 is quite low.

Poll: Relaxed dead code validation to phase 1

SF F N A SA
9 15 10 0 0