[core] Convert limit NF4 conversion to FP16 -> NF4 #23806
Conversation
NF4 <-> FP
Reviewers: @AlexKoff88, @itikhono, @vurusovs
struct Evaluate : public element::NoAction<bool> {
    using element::NoAction<bool>::visit;

    // convert from FP (except NF4) to any other.
    template <element::Type_t ET_IN, class TI = fundamental_type_for<ET_IN>>
I've put a comment in the relevant ticket. In a nutshell, it would be better to support FP16->NF4.
Changes applied to support FP16 -> NF4. Some tests had to be removed because they use conversions that Convert does not support.
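To illustrate what FP16 -> NF4 conversion involves, here is a minimal NumPy sketch of nearest-level quantization. The 16-entry table below is the commonly cited NormalFloat4 (NF4) level table from the QLoRA work; OpenVINO's actual implementation and table may differ, so treat this as an assumption, not the reference code.

```python
import numpy as np

# Assumed NF4 level table (commonly cited NormalFloat4 values); the real
# implementation's table may differ.
NF4_LEVELS = np.array([
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
], dtype=np.float32)

def quantize_to_nf4(values: np.ndarray) -> np.ndarray:
    """Return the 4-bit code (0..15) of the nearest NF4 level for each value."""
    x = values.astype(np.float32).reshape(-1, 1)   # column of inputs
    codes = np.abs(x - NF4_LEVELS).argmin(axis=1)  # index of nearest level
    return codes.astype(np.uint8)

# Example: quantize a few FP16 values (inputs are expected pre-scaled to [-1, 1]).
codes = quantize_to_nf4(np.array([-1.0, 0.0, 1.0, 0.25], dtype=np.float16))
```

Two codes per byte packing is omitted here; the point is only that quantization is a nearest-neighbor lookup against a fixed 16-entry table.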
But can we update the test instead of removing it, to make sure that it works at the Python API level, which is important for external users such as NNCF?
Restored tests for the Python API and CPU. It looks like conversion from NF4 to F16 is required to make decompression of constants possible (other types can be handled by adding an additional Convert).
But since the CPU plugin on non-ARM devices always converts F16 to F32, I think it would be better to add support in Convert to decompress NF4 to F32 directly, without the additional conversion.
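The direct NF4 -> F32 path suggested above is essentially a table lookup into an FP32 table, so no intermediate F16 tensor (and no extra Convert) is needed. A minimal sketch, again assuming the commonly cited NF4 level table rather than OpenVINO's actual one:

```python
import numpy as np

# Assumed NF4 level table, stored directly in FP32 so decompression needs
# no intermediate F16 step (this table is an assumption, not OpenVINO's).
NF4_LEVELS_F32 = np.array([
    -1.0, -0.6961928009986877, -0.5250730514526367, -0.39491748809814453,
    -0.28444138169288635, -0.18477343022823334, -0.09105003625154495, 0.0,
    0.07958029955625534, 0.16093020141124725, 0.24611230194568634,
    0.33791524171829224, 0.44070982933044434, 0.5626170039176941,
    0.7229568362236023, 1.0,
], dtype=np.float32)

def decompress_nf4_to_f32(codes: np.ndarray) -> np.ndarray:
    """Map 4-bit NF4 codes (0..15) straight to FP32 values via table lookup."""
    return NF4_LEVELS_F32[np.asarray(codes, dtype=np.uint8)]

# Example: decode three packed codes back to FP32.
decoded = decompress_nf4_to_f32(np.array([0, 7, 15], dtype=np.uint8))
```

As the comment below notes, an efficient plugin would do this unpacking at the micro-kernel level rather than through a reference implementation, but the lookup itself is this simple.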
This is just a temporary situation. CPU will use BF16, INT8, or FP32 to unpack this type, and it should be done at the micro-kernel level (not in the reference implementation). The reference implementation can use any type. My only concern is to provide an efficient way to compress from FP16/FP32 to NF4. Ideally, we should support both.
@KodiaqQ mentioned in the ticket that these changes introduce no regression.
I think the PR can be merged, @AlexKoff88 can you confirm?
supports conversions not handled by core op
- convert requires support NF4 <-> F16
…t-to-f32-nf4
Improve test accuracy
Details:
- Limit NF4 conversion to FP16 -> NF4 in the convert operator, as values are correctly quantized. NF4 conversion to/from other types is not supported.

Tickets:
- [CVS-135304](https://jira.devtools.intel.com/browse/CVS-135304)