Unicode-preserving mutators #1542

addisoncrump · 2023-09-22T01:41:24Z

This PR adds mutators which preserve the unicode categories of mutated regions.
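To illustrate what "preserving the category" means, here is a minimal sketch that is not the code in this PR. It assumes a coarse, hypothetical `CoarseClass` built from `char` predicates in the standard library as a stand-in for real Unicode general categories, and a hypothetical helper `replace_same_class` that swaps one character for a pool character of the same class.

```rust
/// Coarse character classes; a stand-in for real Unicode categories (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CoarseClass {
    Alphabetic,
    Numeric,
    Whitespace,
    Other,
}

/// Classify a character using only standard-library predicates.
pub fn classify(c: char) -> CoarseClass {
    if c.is_alphabetic() {
        CoarseClass::Alphabetic
    } else if c.is_numeric() {
        CoarseClass::Numeric
    } else if c.is_whitespace() {
        CoarseClass::Whitespace
    } else {
        CoarseClass::Other
    }
}

/// Replace the character at char index `pos` with a character from `pool`
/// that has the same coarse class. `choice` selects among the candidates
/// (a real mutator would draw it from its random source).
/// Returns `None` if `pos` is out of range or no candidate shares the class.
pub fn replace_same_class(
    input: &str,
    pos: usize,
    pool: &[char],
    choice: usize,
) -> Option<String> {
    let target = input.chars().nth(pos)?;
    let class = classify(target);
    let same: Vec<char> = pool
        .iter()
        .copied()
        .filter(|&c| classify(c) == class)
        .collect();
    if same.is_empty() {
        return None;
    }
    let replacement = same[choice % same.len()];
    Some(
        input
            .chars()
            .enumerate()
            .map(|(i, c)| if i == pos { replacement } else { c })
            .collect(),
    )
}
```

For example, replacing the digit in `"a1 b"` from a pool containing `'7'` can only yield another numeric character, so the shape of the string is kept while its bytes change.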

Cargo.toml

addisoncrump · 2023-09-23T16:47:00Z

~~This needs to be modified so that we don't require the whole input to be UTF-8 -- this requirement proves to be too strong in practice.~~ Completed.

libafl/src/mutators/string.rs

tokatoka · 2023-09-25T20:59:32Z

Perhaps the next step could be to record each token's category info when it is added to the dictionary?

libafl/src/mutators/string.rs

tokatoka · 2023-09-25T21:11:21Z

I will try this tomorrow.

tokatoka · 2023-09-25T21:26:38Z

When I add "real" UTF-8 chars to the test, I get lots of `Utf8Error`s.

tokatoka · 2023-09-25T21:37:26Z

I have now fixed the tests so that they return a proper `Utf8Error`.
The command is `cargo test mutators::string::test::mutate_hex` (with unicode enabled).

But in its current state it already panics. I'll look into that tomorrow too.

addisoncrump · 2023-09-26T07:12:48Z

Right -- I think it is not strictly possible to sanely select UTF-8 data from an input, because there's not a clear indicator for the start of a code point. This makes some mutations lead to non-UTF-8 data, which is less than optimal. Not sure how to get around this without a "this region is UTF-8" pass of some kind. That should be fairly cheap to implement, so maybe we can try this.
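A minimal sketch of what such a "this region is UTF-8" pass could look like, assuming a hypothetical helper `utf8_regions`; it is not necessarily what this PR ends up doing. It relies only on `core::str::from_utf8` and the `valid_up_to` and `error_len` methods of `Utf8Error`, and returns the byte ranges that decode cleanly so a mutator can restrict itself to them.

```rust
use core::ops::Range;
use core::str;

/// Return the byte ranges of `input` that are valid UTF-8.
/// Bytes that cannot belong to any valid sequence are skipped.
pub fn utf8_regions(input: &[u8]) -> Vec<Range<usize>> {
    let mut regions = Vec::new();
    let mut start = 0;
    while start < input.len() {
        match str::from_utf8(&input[start..]) {
            Ok(_) => {
                // Everything from `start` to the end is valid.
                regions.push(start..input.len());
                break;
            }
            Err(e) => {
                // `valid_up_to` is relative to the slice we passed in.
                let valid = e.valid_up_to();
                if valid > 0 {
                    regions.push(start..start + valid);
                }
                match e.error_len() {
                    // Skip the invalid bytes and keep scanning.
                    Some(len) => start += valid + len,
                    // The input ends in the middle of a sequence.
                    None => break,
                }
            }
        }
    }
    regions
}
```

Each returned range can be turned back into a `&str` with `str::from_utf8` without error, so selection inside a range can use `char_indices` to land on code point boundaries. The scan is linear in the input length for typical inputs, since every iteration advances `start` past at least one byte.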

libafl/src/mutators/string.rs

libafl/src/stages/string.rs

libafl/src/mutators/string.rs

addisoncrump · 2023-09-27T20:35:24Z

@tokatoka more CI failures... 😢

andreafioraldi · 2023-11-20T09:32:29Z

status?

domenukk · 2023-11-20T23:24:36Z

This is cool!

* create the string classification stage
* modify API to pre-group
* preserving mutator
* more meaningful test
* subproperty mutators + some fixes
* document, finalise, integrate with libafl_libfuzzer
* add example, fix for weird range select
* fix for introspection
* fix fuzzer build
* speed optimisation: allow, but do not require, stacking
* property => category
* token replacement
* fixup: rare case where rust does not agree on valid character
* fix CI again
* again again
* take two: dynamic unicode discovery
* oops
* fix: last byte is never selected
* opt: bias to smaller unicode categories
* fix test
* opt: precompute regions and fix tests
* cache and allow stacking
* document and update libafl_libfuzzer
* oops, use reverse
* fix bolts clippy error
* fixup part 2
* clippy
* part 2
* clippy warning allow
* clippy complaint
* use alloc not std

---------

Co-authored-by: toka <[email protected]>

domenukk reviewed Sep 22, 2023

Cargo.toml (outdated)

addisoncrump mentioned this pull request Sep 22, 2023

GeneralizationStage takes an unreasonably long time #1545

Closed

2 tasks

addisoncrump marked this pull request as draft September 23, 2023 16:46

tokatoka reviewed Sep 25, 2023

libafl/src/mutators/string.rs

tokatoka reviewed Sep 25, 2023

libafl/src/mutators/string.rs (outdated)

tokatoka reviewed Sep 25, 2023

libafl/src/mutators/string.rs (outdated)

tokatoka reviewed Sep 25, 2023

libafl/src/mutators/string.rs (outdated)

addisoncrump marked this pull request as ready for review September 27, 2023 02:36

addisoncrump force-pushed the unicode-mutator branch from 1ab2620 to 01245c5 Compare September 27, 2023 11:25

tokatoka reviewed Sep 27, 2023

libafl/src/mutators/string.rs (outdated)

tokatoka reviewed Sep 27, 2023

libafl/src/stages/string.rs

tokatoka reviewed Sep 27, 2023

libafl/src/mutators/string.rs

addisoncrump force-pushed the unicode-mutator branch from 8575e8b to 080aef5 Compare September 29, 2023 15:34

addisoncrump force-pushed the unicode-mutator branch from 080aef5 to ca2d26a Compare November 7, 2023 19:44

addisoncrump force-pushed the unicode-mutator branch 2 times, most recently from a41121d to 1e4b04c Compare November 20, 2023 14:01

addisoncrump added 6 commits November 20, 2023 21:57

create the string classification stage

4498d0e

modify API to pre-group

2dc0ba0

preserving mutator

2c74d1a

more meaningful test

5f45423

subproperty mutators + some fixes

5247201

document, finalise, integrate with libafl_libfuzzer

e225991

addisoncrump and others added 23 commits November 20, 2023 21:57

fix fuzzer build

551da89

speed optimisation: allow, but do not require, stacking

dbe1650

property => category

50b0eaa

token replacement

fc512a5

fixup: rare case where rust does not agree on valid character

639c225

fix CI again

5663de7

again again

4bb1a9a

take two: dynamic unicode discovery

02d0e60

oops

16e23d9

fix: last byte is never selected

8039107

opt: bias to smaller unicode categories

9c39911

fix test

fa0d77b

opt: precompute regions and fix tests

e7564f7

cache and allow stacking

0c36363

document and update libafl_libfuzzer

645d7f2

oops, use reverse

5b6f539

fix bolts clippy error

79ccea5

fixup part 2

bc86c59

clippy

334ffa3

part 2

92b0ece

clippy warning allow

9ecb33d

clippy complaint

be19d25

use alloc not std

93e2e2d

addisoncrump force-pushed the unicode-mutator branch from 1e4b04c to 93e2e2d Compare November 20, 2023 21:00

domenukk merged commit 281524d into main Nov 20, 2023
17 checks passed

domenukk deleted the unicode-mutator branch November 20, 2023 23:41

ThomasTNO mentioned this pull request Sep 6, 2024

Update LibAFL to latest TNO-S3/WuppieFuzz#3

Closed

ThomasTNO mentioned this pull request Sep 17, 2024

Inventorise interesting new features LibAFL TNO-S3/WuppieFuzz#15

Open
