Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid copy of data for UDP parsers #1657

Closed
awelzel opened this issue Jan 23, 2024 · 1 comment · Fixed by #1723
Closed

Avoid copy of data for UDP parsers #1657

awelzel opened this issue Jan 23, 2024 · 1 comment · Fixed by #1723
Assignees

Comments

@awelzel
Copy link
Contributor

awelzel commented Jan 23, 2024

Chatting with @rsmmr , one idea came up to prevent copying data for UDP/block analyzers:

auto input = hilti::rt::reference::make_value<hilti::rt::Stream>(data, size);
input->freeze();
if ( ! _parser->parse1 )
throw InvalidUnitType(
fmt("unit type '%s' cannot be used as external entry point because it requires arguments",
_parser->name));
if ( _parser->context_new ) {
if ( _context )
DRIVER_DEBUG("context was provided");
else
DRIVER_DEBUG("no context provided");
}
hilti::rt::profiler::stop(profiler);
_resumable = _parser->parse1(input, {}, _context);
if ( ! *_resumable )
hilti::rt::internalError("block-based parsing yielded");
return Done;

If any iterators into the stream are invalidated after parse1, seems the stream would not necessarily need to own the data.

This might improve performance for the spicy-quic analyzer when crunching through large transfers.

Relates to #1644

@bbannier
Copy link
Member

bbannier commented Jan 23, 2024

My takeaway from #1644 was that introducing a non-owning Chunk introduces new overhead even in code not making use of it since it partially undos (the spirit of) the optimizations done in #1607, so we probably wouldn't want to use this approach here.

Since here the Stream is always fully consumed we could instead introduce a non-owning Stream for this problem. The naive zeroth implementation could just be a non-owning class derived from Stream which stores a string_view into the data; that would still incur the overhead of creating the base Stream, but might already bring sufficient perf improvements.

@rsmmr rsmmr self-assigned this Apr 16, 2024
rsmmr added a commit that referenced this issue Apr 19, 2024
…erformance.

This does two things:

- When adding data to a stream, we now do that without copying
  anything initially. For block input (e.g., UDP) that's always fine
  because the parser will never suspend before it's fully done
  parsing; hence we can safely delete it once the parser returns. For
  stream input (e.g., TCP), we make the stream own its data later
  if (and only if) the parser suspends.

- For block input (e.g., UDP) we now keep reusing the same stream for
  subsequent blocks, instead of creating a new one each time. This
  allows the stream to reuse an allocated chunk that it may have still
  cached internally.

The result of this, plus the new chunk caching introduced earlier, is
that for a UDP flow, we never need to allocate more than one chunk,
and never need to copy any data; and for TCP it's the same as long as
parsing consumes all data before suspending (which should be a common
case), plus, when we allocate new storage we only copy data that didn't
get trimmed immediately anyways.

Closes #1657.
rsmmr added a commit that referenced this issue Apr 22, 2024
…erformance.

This does two things:

- When adding data to a stream, we now do that without copying
  anything initially. For block input (e.g., UDP) that's always fine
  because the parser will never suspend before it's fully done
  parsing; hence we can safely delete it once the parser returns. For
  stream input (e.g., TCP), we make the stream own its data later
  if (and only if) the parser suspends.

- For block input (e.g., UDP) we now keep reusing the same stream for
  subsequent blocks, instead of creating a new one each time. This
  allows the stream to reuse an allocated chunk that it may have still
  cached internally.

The result of this, plus the new chunk caching introduced earlier, is
that for a UDP flow, we never need to allocate more than one chunk,
and never need to copy any data; and for TCP it's the same as long as
parsing consumes all data before suspending (which should be a common
case), plus, when we allocate new storage we only copy data that didn't
get trimmed immediately anyways.

Closes #1657.
@bbannier bbannier linked a pull request Apr 23, 2024 that will close this issue
rsmmr added a commit that referenced this issue Apr 24, 2024
…erformance.

This does two things:

- When adding data to a stream, we now do that without copying
  anything initially. For block input (e.g., UDP) that's always fine
  because the parser will never suspend before it's fully done
  parsing; hence we can safely delete it once the parser returns. For
  stream input (e.g., TCP), we make the stream own its data later
  if (and only if) the parser suspends.

- For block input (e.g., UDP) we now keep reusing the same stream for
  subsequent blocks, instead of creating a new one each time. This
  allows the stream to reuse an allocated chunk that it may have still
  cached internally.

The result of this, plus the new chunk caching introduced earlier, is
that for a UDP flow, we never need to allocate more than one chunk,
and never need to copy any data; and for TCP it's the same as long as
parsing consumes all data before suspending (which should be a common
case), plus, when we allocate new storage we only copy data that didn't
get trimmed immediately anyways.

Closes #1657.
rsmmr added a commit that referenced this issue May 7, 2024
…erformance.

This does two things:

- When adding data to a stream, we now do that without copying
  anything initially. For block input (e.g., UDP) that's always fine
  because the parser will never suspend before it's fully done
  parsing; hence we can safely delete it once the parser returns. For
  stream input (e.g., TCP), we make the stream own its data later
  if (and only if) the parser suspends.

- For block input (e.g., UDP) we now keep reusing the same stream for
  subsequent blocks, instead of creating a new one each time. This
  allows the stream to reuse an allocated chunk that it may have still
  cached internally.

The result of this, plus the new chunk caching introduced earlier, is
that for a UDP flow, we never need to allocate more than one chunk,
and never need to copy any data; and for TCP it's the same as long as
parsing consumes all data before suspending (which should be a common
case), plus, when we allocate new storage we only copy data that didn't
get trimmed immediately anyways.

Closes #1657.
@rsmmr rsmmr closed this as completed in b64a48a May 13, 2024
rsmmr added a commit that referenced this issue May 13, 2024
* origin/topic/robin/gh-1657-udp:
  Address review feedback.
  Update Spicy runtime driver to use new stream features for improved performance.
  Give stream a method to reset it into freshly initialized state.
  Cache previously trimmed chunks inside stream for reuse.
  Extend stream API to allow for chunks that don't own their data.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants