Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Improve base weights consistency and make sure they're never zero #11806

Conversation

koute
Copy link
Contributor

@koute koute commented Jul 8, 2022

Fixes #11804

Here's my attempt at fixing the way we generate our base weights. The fix is relatively straightforward:

  1. Grab the minimum time it takes to run the extrinsic.
  2. From all of the benchmark results substract that minimum.
  3. Run the linear regression with the intercept value forced to y=0.
  4. Set the previously calculated minimum time as the base weight.

Here's a sample of how the benchmark results look before this PR:

pallet_contracts/seal_transfer,328713000
pallet_contracts/seal_transfer/r,3335054000

pallet_contracts/seal_transfer,230924000
pallet_contracts/seal_transfer/r,3319470000

pallet_contracts/seal_transfer,281081000
pallet_contracts/seal_transfer/r,3316840000

pallet_contracts/seal_transfer,277473000
pallet_contracts/seal_transfer/r,3386931000

pallet_contracts/seal_transfer,517279000
pallet_contracts/seal_transfer/r,3350051000

pallet_contracts/seal_transfer,193294000
pallet_contracts/seal_transfer/r,3361404000

pallet_contracts/seal_transfer,137453000
pallet_contracts/seal_transfer/r,3380590000

pallet_contracts/seal_transfer,0
pallet_contracts/seal_transfer/r,3452650000

As you can see the base weights are all over the place, and we even have one zero here.

And here's how they look after this PR:

pallet_contracts/seal_transfer,349462000
pallet_contracts/seal_transfer/r,2550642000

pallet_contracts/seal_transfer,349162000
pallet_contracts/seal_transfer/r,2511251000

pallet_contracts/seal_transfer,350352000
pallet_contracts/seal_transfer/r,2537639000

pallet_contracts/seal_transfer,348642000
pallet_contracts/seal_transfer/r,2497103000

pallet_contracts/seal_transfer,349471000
pallet_contracts/seal_transfer/r,2519135000

pallet_contracts/seal_transfer,348562000
pallet_contracts/seal_transfer/r,2466470000

This PR also increases the precision of the component weights by a factor of 1000; this helps in cases where the range of inputs is very wide and prevents the component weights from becoming zero due to insufficient precision.

There might be better ways of doing this. This PR is also unfinished and should not be merged as-is - it obviously needs to be actually properly tested to make sure this isn't subtly broken is some way; some tests also still need fixing, and the code to print out errors needs to be added back.

I'm opening this mostly to discuss whether we actually want to fix it in this way, before I invest more time into it.

I don't see any outstanding issues here so the PR should be good to go.

@koute koute added A0-please_review Pull request needs code review. B7-runtimenoteworthy C1-low PR touches the given topic and has a low impact on builders. D5-nicetohaveaudit ⚠️ PR contains trivial changes to logic that should be properly reviewed. labels Jul 8, 2022
@koute koute requested review from shawntabrizi, athei and ggwpez July 8, 2022 14:51
@koute koute force-pushed the master_fix_weights_linear_regression branch from 60b67de to d1300d6 Compare July 8, 2022 14:53
@koute koute marked this pull request as draft July 8, 2022 14:56
@shawntabrizi
Copy link
Member

shawntabrizi commented Jul 9, 2022

The steps make sense to me, and seem to not break any assumptions for the benchmarking pipeline overall.

Set the previously calculated minimum time as the base weight.

Doesn't this step assume that the first benchmark run is the absolute minimum weight the extrinsic could be?

I don't think that is always true, for example, I could create a benchmark which runs between [10, 100], but actually 0 is a valid starting point, and less weight than 10.

This might just be bad benchmark design, but a consideration here.

@koute
Copy link
Contributor Author

koute commented Jul 11, 2022

The steps make sense to me, and seem to not break any assumptions for the benchmarking pipeline overall.

Great! I was very interested in seeing whether this makes sense to you before I continue with it.

Set the previously calculated minimum time as the base weight.

Doesn't this step assume that the first benchmark run is the absolute minimum weight the extrinsic could be?

I don't think that is always true, for example, I could create a benchmark which runs between [10, 100], but actually 0 is a valid starting point, and less weight than 10.

This might just be bad benchmark design, but a consideration here.

Well, yeah, it just picks the minimum of all of the runs. Perhaps not ideal, but it's the simplest and the most straightforward way of fixing this that I could think of.

It is a good point that someone might only benchmark for, say, 10, but have the extrinsic accept a 0. That, said:

  • Do we know how prevalent something like that is? Do we know how many such benchmarks we have? Do we know any parachain which has such benchmarks?
  • Since one of the main objectives of actually having the weights is to prevent attackers from exhausting a chain's resources I imagine it's better to be more conservative than not, and when in doubt to just err on the side of caution. Or in other words, when given the choice of a bad benchmark reducing your chain's throughput because it won't accept as many extrinsics, or a bad benchmark making your chain vulnerable to a denial of service attack because it will accept too many, I imagine the former should be preferable?

If such benchmarks are not widespread I think it should suffice to just update our docs regarding benchmarking to make it explicit how the base weights are calculated, and in the (hopefully?) rare case it happens to just eat the cost.

@koute koute marked this pull request as ready for review July 12, 2022 13:17
@ggwpez
Copy link
Member

ggwpez commented Jul 14, 2022

We have a quite a lot of benchmarks with components that do not start at 0.

You can find them in the Polkadot repo with the regex The range of component `\w+` is `\[[^0].
It will report more spots than benchmarks, since it searches for the component ranges.
Some good examples:

/// The range of component `v` is `[1000, 2000]`.
/// The range of component `t` is `[500, 1000]`.
/// The range of component `d` is `[5, 16]`.
fn phragmms(v: u32, _t: u32, d: u32, ) -> Weight {
   ...
}

or

/// The range of component `v` is `[1000, 2000]`.
/// The range of component `t` is `[500, 1000]`.
fn create_snapshot_internal(v: u32, t: u32, ) -> Weight {
    ...
}

Maybe we can get around this by only applying your intercept-shift-logic if the linear regression reports a negative intercept?

@koute
Copy link
Contributor Author

koute commented Jul 14, 2022

We have a quite a lot of benchmarks with components that do not start at 0.

So what was the rationale to not run those benchmarks with zero values?

Maybe we can get around this by only applying your intercept-shift-logic if the linear regression reports a negative intercept?

Hmm... well, in some cases that could work, but imagine that the linear regression returns an intercept of, say, 1 (which is possible), and then essentially we'll have the same problem. So I'm not entirely sure what the best course of action is here...


There's one more problem I've found when digging through the numbers; the slopes can also be zero for some benchmarks! Behold, the benchmark for frame_system/remark when ran on my machine:

frame_system__remark

The dots are real samples from the benchmark, while the line on the bottom is the cost calculated using the weights. This happens because the slope here is calculated as 0.4102529741049291, which is rounded down to zero. When I regenerate the graph with the proper slope it looks like this:

frame_system__remark

So, uh, this is pretty bad. Weight calculation shouldn't randomly break down like this.

@shawntabrizi
Copy link
Member

So what was the rationale to not run those benchmarks with zero values?

In many cases, the 0 value makes no sense. For example, creating a membership group with 0 users. The underlying runtime code probably would reject this extrinsic.

Regarding the slope issue, you must note that when looking at the raw, unaltered data, is is basically zero slope:

https://www.shawntabrizi.com/substrate-graph-benchmarks/old/?p=system&e=remark

image

Yes, the benchmarking process is truncating really small slopes to zero, but also these effects are nearly zero when compared to the underlying weights here.

Im not against ideas which can bubble these things up accurately, but we should not also say that it is "break"ing. The weights we are generating, even for thinks like remark, are good for what they are used for.

@ggwpez
Copy link
Member

ggwpez commented Jul 15, 2022

Somehow I also see a linear timing for System::Remark 😕, while the linear regression reports 0.

cargo r --release --features=runtime-benchmarks --bin substrate -- benchmark pallet --dev --steps=50 --repeat=20 --pallet="frame_system" '--extrinsic=remark' --execution=wasm --wasm-execution=compiled

produces on reference hardware:

Data points distribution:                                                                                                                                                              [10/685]
    b   mean µs  sigma µs       %                                                                                                                              
    0     5.419       0.1    1.8%                                                                                                                              
78643     34.82     0.072    0.2%                                                                                                                              
157286     63.95       0.1    0.1%                                                                                                                             
235929     90.53     0.083    0.0%                                                                                                                             
314572     121.4     0.137    0.1%                                                                                                                             
393215     150.1     0.176    0.1%                                                                                                                             
471858     177.3     1.756    0.9%                                                                                                                             
550501     203.3     0.149    0.0%                                                                                                                             
629144     231.6     0.161    0.0%                                                                                                                             
707787     259.8     0.128    0.0% 
786430     288.2     0.445    0.1% 
865073     316.2     0.145    0.0% 
943716     344.4     0.156    0.0% 
1022359     380.3     3.213    0.8%
1101002     401.3     0.502    0.1%
1179645     429.3     0.458    0.1%
1258288     457.6     0.191    0.0%
1336931     486.3     1.577    0.3%
1415574     513.9     0.164    0.0%
1494217     542.2     0.296    0.0%
1572860     570.5      0.26    0.0%
1651503     598.8     0.216    0.0%
1730146     634.6     7.387    1.1%
1808789     659.5     10.44    1.5%
1887432     684.4     0.928    0.1%
1966075     712.9     0.922    0.1%
2044718     741.3      0.74    0.0%
2123361     769.1     0.239    0.0%
2202004       798     0.777    0.0%
2280647       832     11.92    1.4%
2359290     865.5     9.494    1.0%
2437933     884.1     1.061    0.1%
2516576     912.2     0.983    0.1%
2595219     942.8     2.526    0.2%
2673862     993.1     14.54    1.4%
2752505      1012     13.67    1.3%
2831148      1028     4.116    0.4%            
2909791      1060     7.656    0.7%            
2988434      1091     13.12    1.2%            
3067077      1113     1.849    0.1%            
3145720      1147     11.67    1.0%            
3224363      1172     3.371    0.2%            
3303006      1200     2.122    0.1%            
3381649      1230     4.119    0.3%            
3460292      1259     4.096    0.3%            
3538935      1288     3.585    0.2%            
3617578      1317     4.995    0.3%            
3696221      1347     4.183    0.3%            
3774864      1377     9.218    0.6%            
3853507      1409     8.801    0.6%            
3932150      1435     4.814    0.3%

The timing values seem to increase from 5.4µs at b=0 to 1_435µs at maximum b=3932150.
That is quite the difference. Obviously a small slope that we can normally ignore can start to make a difference when it is multiplied with 4 million...

@koute
Copy link
Contributor Author

koute commented Jul 15, 2022

So what was the rationale to not run those benchmarks with zero values?

In many cases, the 0 value makes no sense. For example, creating a membership group with 0 users. The underlying runtime code probably would reject this extrinsic.

So can we assume that if a parameter of zero doesn't make sense and the extrinsic would reject it then in normal use it wouldn't be called with such a parameter anyway? (Or at least wouldn't be called often if the caller is non-malicious.) In which case does it really matter that the base weight would be higher than necessary? I know that in general there is no such guarantee, but it'd make sense to me to basically require that extrinsics are benchmarked with full range of expected inputs to ensure optimal weights, and if you don't benchmark with the full range then you might get slightly higher weights, but you won't risk opening yourself to a potential DoS attack.

Yes, the benchmarking process is truncating really small slopes to zero, but also these effects are nearly zero when compared to the underlying weights here.

Im not against ideas which can bubble these things up accurately, but we should not also say that it is "break"ing. The weights we are generating, even for thinks like remark, are good for what they are used for.

Sorry, I might have used a too strong of a word here. My point was more that it is a subtle landmine that someone could step on without realizing it, since whether this is a problem or not will depend on the exact range the input component can have.

Regarding the slope issue, you must note that when looking at the raw, unaltered data, is is basically zero slope:

Yes, I was looking at the raw, unaltered data. (: And if you only look at the first 16k of the input component it is indeed basically zero and doesn't matter, but the benchmark goes up to 4 million (as @ggwpez confirmed, thanks!) where it seems to start to matter.

Again, whether this is actually a problem in practice for this particular extrinsic - I haven't checked, but even if it isn't it doesn't change the fact that it is a potential subtle correctness issue that might bite someone somewhere without them realizing.

So I see two ways we could fix this: either always round up the weights to 1, or just use higher precision weights. Since our weights are multiplied by 1000 anyway I'm guessing the second option would be preferable?

@shawntabrizi
Copy link
Member

shawntabrizi commented Jul 15, 2022

@koute either solutions sound good to me.

I am still mostly in the camp that the current system is "good enough", we have plenty of full integration benchmarks which show that they are quite accurate (and in the cases they are not perfect, always pessimistic), but happy to have you find these further improvements that can be made :)

@koute
Copy link
Contributor Author

koute commented Jul 18, 2022

Okay, I've increased the precision with which the weights are returned. So now the analysis code returns the extrinsic times' already essentially premultiplied by 1000 instead of multiplying them in the writer.

From my side I think this should be good to go?

As a sanity check here is an example diff comparing weights' outputs before and after this PR:

@@ -62,27 +62,27 @@
 impl<T: frame_system::Config> WeightInfo for SubstrateWeight<T> {
 	// Storage: Recovery Proxy (r:1 w:0)
 	fn as_recovered() -> Weight {
-		(10_050_000 as Weight)
+		(10_110_000 as Weight)
 			.saturating_add(T::DbWeight::get().reads(1 as Weight))
 	}
 	// Storage: Recovery Proxy (r:0 w:1)
 	fn set_recovered() -> Weight {
-		(20_610_000 as Weight)
+		(20_140_000 as Weight)
 			.saturating_add(T::DbWeight::get().writes(1 as Weight))
 	}
 	// Storage: Recovery Recoverable (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn create_recovery(n: u32, ) -> Weight {
-		(41_123_000 as Weight)
-			// Standard Error: 14_000
-			.saturating_add((155_000 as Weight).saturating_mul(n as Weight))
+		(39_920_000 as Weight)
+			// Standard Error: 5_800
+			.saturating_add((182_163 as Weight).saturating_mul(n as Weight))
 			.saturating_add(T::DbWeight::get().reads(1 as Weight))
 			.saturating_add(T::DbWeight::get().writes(1 as Weight))
 	}
 	// Storage: Recovery Recoverable (r:1 w:0)
 	// Storage: Recovery ActiveRecoveries (r:1 w:1)
 	fn initiate_recovery() -> Weight {
-		(48_000_000 as Weight)
+		(47_270_000 as Weight)
 			.saturating_add(T::DbWeight::get().reads(2 as Weight))
 			.saturating_add(T::DbWeight::get().writes(1 as Weight))
 	}
@@ -90,9 +90,9 @@
 	// Storage: Recovery ActiveRecoveries (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn vouch_recovery(n: u32, ) -> Weight {
-		(31_502_000 as Weight)
-			// Standard Error: 15_000
-			.saturating_add((271_000 as Weight).saturating_mul(n as Weight))
+		(30_050_000 as Weight)
+			// Standard Error: 7_087
+			.saturating_add((334_068 as Weight).saturating_mul(n as Weight))
 			.saturating_add(T::DbWeight::get().reads(2 as Weight))
 			.saturating_add(T::DbWeight::get().writes(1 as Weight))
 	}
@@ -101,9 +101,9 @@
 	// Storage: Recovery Proxy (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn claim_recovery(n: u32, ) -> Weight {
-		(41_489_000 as Weight)
-			// Standard Error: 16_000
-			.saturating_add((192_000 as Weight).saturating_mul(n as Weight))
+		(40_270_000 as Weight)
+			// Standard Error: 6_915
+			.saturating_add((229_951 as Weight).saturating_mul(n as Weight))
 			.saturating_add(T::DbWeight::get().reads(3 as Weight))
 			.saturating_add(T::DbWeight::get().writes(1 as Weight))
 	}
@@ -111,9 +111,9 @@
 	// Storage: System Account (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn close_recovery(n: u32, ) -> Weight {
-		(47_169_000 as Weight)
-			// Standard Error: 16_000
-			.saturating_add((240_000 as Weight).saturating_mul(n as Weight))
+		(46_000_000 as Weight)
+			// Standard Error: 8_005
+			.saturating_add((296_745 as Weight).saturating_mul(n as Weight))
 			.saturating_add(T::DbWeight::get().reads(2 as Weight))
 			.saturating_add(T::DbWeight::get().writes(2 as Weight))
 	}
@@ -121,15 +121,15 @@
 	// Storage: Recovery Recoverable (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn remove_recovery(n: u32, ) -> Weight {
-		(44_697_000 as Weight)
-			// Standard Error: 16_000
-			.saturating_add((196_000 as Weight).saturating_mul(n as Weight))
+		(43_940_000 as Weight)
+			// Standard Error: 9_009
+			.saturating_add((169_804 as Weight).saturating_mul(n as Weight))
 			.saturating_add(T::DbWeight::get().reads(2 as Weight))
 			.saturating_add(T::DbWeight::get().writes(1 as Weight))
 	}
 	// Storage: Recovery Proxy (r:1 w:1)
 	fn cancel_recovered() -> Weight {
-		(17_090_000 as Weight)
+		(17_141_000 as Weight)
 			.saturating_add(T::DbWeight::get().reads(1 as Weight))
 			.saturating_add(T::DbWeight::get().writes(1 as Weight))
 	}
@@ -139,27 +139,27 @@
 impl WeightInfo for () {
 	// Storage: Recovery Proxy (r:1 w:0)
 	fn as_recovered() -> Weight {
-		(10_050_000 as Weight)
+		(10_110_000 as Weight)
 			.saturating_add(RocksDbWeight::get().reads(1 as Weight))
 	}
 	// Storage: Recovery Proxy (r:0 w:1)
 	fn set_recovered() -> Weight {
-		(20_610_000 as Weight)
+		(20_140_000 as Weight)
 			.saturating_add(RocksDbWeight::get().writes(1 as Weight))
 	}
 	// Storage: Recovery Recoverable (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn create_recovery(n: u32, ) -> Weight {
-		(41_123_000 as Weight)
-			// Standard Error: 14_000
-			.saturating_add((155_000 as Weight).saturating_mul(n as Weight))
+		(39_920_000 as Weight)
+			// Standard Error: 5_800
+			.saturating_add((182_163 as Weight).saturating_mul(n as Weight))
 			.saturating_add(RocksDbWeight::get().reads(1 as Weight))
 			.saturating_add(RocksDbWeight::get().writes(1 as Weight))
 	}
 	// Storage: Recovery Recoverable (r:1 w:0)
 	// Storage: Recovery ActiveRecoveries (r:1 w:1)
 	fn initiate_recovery() -> Weight {
-		(48_000_000 as Weight)
+		(47_270_000 as Weight)
 			.saturating_add(RocksDbWeight::get().reads(2 as Weight))
 			.saturating_add(RocksDbWeight::get().writes(1 as Weight))
 	}
@@ -167,9 +167,9 @@
 	// Storage: Recovery ActiveRecoveries (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn vouch_recovery(n: u32, ) -> Weight {
-		(31_502_000 as Weight)
-			// Standard Error: 15_000
-			.saturating_add((271_000 as Weight).saturating_mul(n as Weight))
+		(30_050_000 as Weight)
+			// Standard Error: 7_087
+			.saturating_add((334_068 as Weight).saturating_mul(n as Weight))
 			.saturating_add(RocksDbWeight::get().reads(2 as Weight))
 			.saturating_add(RocksDbWeight::get().writes(1 as Weight))
 	}
@@ -178,9 +178,9 @@
 	// Storage: Recovery Proxy (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn claim_recovery(n: u32, ) -> Weight {
-		(41_489_000 as Weight)
-			// Standard Error: 16_000
-			.saturating_add((192_000 as Weight).saturating_mul(n as Weight))
+		(40_270_000 as Weight)
+			// Standard Error: 6_915
+			.saturating_add((229_951 as Weight).saturating_mul(n as Weight))
 			.saturating_add(RocksDbWeight::get().reads(3 as Weight))
 			.saturating_add(RocksDbWeight::get().writes(1 as Weight))
 	}
@@ -188,9 +188,9 @@
 	// Storage: System Account (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn close_recovery(n: u32, ) -> Weight {
-		(47_169_000 as Weight)
-			// Standard Error: 16_000
-			.saturating_add((240_000 as Weight).saturating_mul(n as Weight))
+		(46_000_000 as Weight)
+			// Standard Error: 8_005
+			.saturating_add((296_745 as Weight).saturating_mul(n as Weight))
 			.saturating_add(RocksDbWeight::get().reads(2 as Weight))
 			.saturating_add(RocksDbWeight::get().writes(2 as Weight))
 	}
@@ -198,15 +198,15 @@
 	// Storage: Recovery Recoverable (r:1 w:1)
 	/// The range of component `n` is `[1, 9]`.
 	fn remove_recovery(n: u32, ) -> Weight {
-		(44_697_000 as Weight)
-			// Standard Error: 16_000
-			.saturating_add((196_000 as Weight).saturating_mul(n as Weight))
+		(43_940_000 as Weight)
+			// Standard Error: 9_009
+			.saturating_add((169_804 as Weight).saturating_mul(n as Weight))
 			.saturating_add(RocksDbWeight::get().reads(2 as Weight))
 			.saturating_add(RocksDbWeight::get().writes(1 as Weight))
 	}
 	// Storage: Recovery Proxy (r:1 w:1)
 	fn cancel_recovered() -> Weight {
-		(17_090_000 as Weight)
+		(17_141_000 as Weight)
 			.saturating_add(RocksDbWeight::get().reads(1 as Weight))
 			.saturating_add(RocksDbWeight::get().writes(1 as Weight))
 	}

And here's an example diff of what the benchmarking prints out on the stdout:

 Pallet: "pallet_recovery", Extrinsic: "remove_recovery", Lowest values: [], Highest values: [], Steps: 50, Repeat: 20
 Raw Storage Info
 ========
 Storage: Recovery ActiveRecoveries (r:1 w:0)
 Storage: Recovery Recoverable (r:1 w:1)
 
 Median Slopes Analysis
 ========
 -- Extrinsic Time --
 
 Model:
-Time ~=    44.66
-    + n    0.188
+Time ~=    44.38
+    + n    0.078
               µs
 
 Reads = 2 + (0 * n)
 Writes = 1 + (0 * n)
 
 Min Squares Analysis
 ========
 -- Extrinsic Time --
 
 Data points distribution:
     n   mean µs  sigma µs       %
-    1     44.97     0.183    0.4%
-    2     44.75     0.207    0.4%
-    3     45.64     0.173    0.3%
-    4     45.48     0.288    0.6%
-    5     45.94     0.536    1.1%
-    6     45.51     0.134    0.2%
-    7     45.82     0.244    0.5%
-    8     46.18     0.213    0.4%
-    9     46.74     0.408    0.8%
+    1     44.52     0.142    0.3%
+    2     44.68     0.135    0.3%
+    3     44.92     0.364    0.8%
+    4     44.39      0.24    0.5%
+    5     44.61     0.131    0.2%
+    6     44.75      0.27    0.6%
+    7     44.66     0.102    0.2%
+    8      45.8     0.661    1.4%
+    9     45.42     0.377    0.8%
 
 Quality and confidence:
 param     error
-n         0.016
+n         0.009
 
 Model:
-Time ~=    44.69
-    + n    0.196
+Time ~=    43.94
+    + n    0.169
               µs
 
 Reads = 2 + (0 * n)
 Writes = 1 + (0 * n)

@shawntabrizi
Copy link
Member

@koute can you bump the version of the crate?

Otherwise, looking good, will review

@koute
Copy link
Contributor Author

koute commented Jul 19, 2022

@koute can you bump the version of the crate?

Sure. Do you mean version of frame-benchmarking (from 4.0.0-dev to 6.0.0-dev)? Or do you mean the linregress dependency (since they apparently released 0.5 a week ago)?

Otherwise, looking good, will review

Thanks!

@ggwpez
Copy link
Member

ggwpez commented Jul 22, 2022

The precision increase to pico-seconds is good, thanks.
About the does-not-start-at-zero benchmarks, I dont think there is a straightforward solution.
Our standard approach would be to round it up to zero, since then we over-estimate and are therefore safe.

@shawntabrizi
Copy link
Member

bot rebase

@paritytech-processbot
Copy link

Rebased

Copy link
Member

@ggwpez ggwpez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets try it with the benchbot and some pallets.

frame/benchmarking/src/analysis.rs Show resolved Hide resolved
@ggwpez
Copy link
Member

ggwpez commented Sep 2, 2022

/cmd queue -c bench-bot $ pallet dev pallet_contracts

@command-bot
Copy link

command-bot bot commented Sep 2, 2022

@ggwpez https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/1805446 was started for your command "$PIPELINE_SCRIPTS_DIR/bench-bot.sh" pallet dev pallet_contracts. Check out https://gitlab.parity.io/parity/mirrors/substrate/-/pipelines?page=1&scope=all&username=group_605_bot to know what else is being executed currently.

Comment /cmd cancel 45-bf695fe2-d5f3-4753-9107-ca5b6deb3bb9 to cancel this command or /cmd cancel to cancel all commands in this pull request.

@command-bot
Copy link

command-bot bot commented Sep 2, 2022

@ggwpez Command "$PIPELINE_SCRIPTS_DIR/bench-bot.sh" pallet dev pallet_contracts has finished. Result: https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/1805446 has finished. If any artifacts were generated, you can download them from https://gitlab.parity.io/parity/mirrors/substrate/-/jobs/1805446/artifacts/download.

}
/// The range of component `r` is `[0, 50]`.
fn instr_select(r: u32, ) -> Weight {
Weight::from_ref_time(73_844_000 as u64)
// Standard Error: 1_000
.saturating_add(Weight::from_ref_time(1_773_000 as u64).saturating_mul(r as u64))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it's increasing quite a bit… but if the math was wrong before, then we have on choice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a shame we can't see the raw extrinsic times as we can't be sure whether this increase is due to the different way the regression is done or we've simply hit the issue from #12096. It be nice to have the bot always pass --json-file= to the benchmarking CLI and always upload those files in the artifacts.

Considering that the majority of the weights got bumped I wouldn't be surprised if the increases are unrelated to this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some bench bot does that, not sure which one 😅 Maybe the Polkadot release weight-update.
But yea, its hopefully the increase that we spotted in the other MR. Then lets go!

@shawntabrizi
Copy link
Member

bot merge

@paritytech-processbot
Copy link

Waiting for commit status.

@paritytech-processbot paritytech-processbot bot merged commit 7c2c0c1 into paritytech:master Sep 14, 2022
Lezek123 pushed a commit to Lezek123/substrate that referenced this pull request Oct 19, 2022
…ritytech#11806)

* Improve base weights consistency and make sure they're never zero

* Switch back to `linregress` crate; add back estimation errors

* Move the test into `mod tests`

* Use all of the samples instead of only the first one

* Use full precision extrinsic base weights and slopes

* Add an extra comment

* ".git/.scripts/bench-bot.sh" pallet dev pallet_contracts

Co-authored-by: parity-processbot <>
Co-authored-by: Shawn Tabrizi <[email protected]>
ark0f pushed a commit to gear-tech/substrate that referenced this pull request Feb 27, 2023
…ritytech#11806)

* Improve base weights consistency and make sure they're never zero

* Switch back to `linregress` crate; add back estimation errors

* Move the test into `mod tests`

* Use all of the samples instead of only the first one

* Use full precision extrinsic base weights and slopes

* Add an extra comment

* ".git/.scripts/bench-bot.sh" pallet dev pallet_contracts

Co-authored-by: parity-processbot <>
Co-authored-by: Shawn Tabrizi <[email protected]>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. C1-low PR touches the given topic and has a low impact on builders. D5-nicetohaveaudit ⚠️ PR contains trivial changes to logic that should be properly reviewed.
Projects
Status: done
3 participants