You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I am the author of the Fastest Method to Printing Integers and Floating-point numbers, which contains the manual modulo optimization for Ryu. When you do mod 100 div 100, you just did 2 division instructions when you only needed to do one. You're welcome. You are now apart of the Uniprinter (Universal Printer), which I'm aiming to be the world's fastest text processing library that grows from stack to heap with optional dynamic memory, compiles in about 3 seconds, and actually prints Unicode in C++ right.
I'm still looking through the algorithm for optimizations, I just cloned.
I just ran the 128-bit bit shifting tables from Puff and yes, the pattern generalizes. The uint32_t decimalLength(const uint128_t v) function is very slow.
#Bits
Max
#Bits
Max
#Bits
#Bits
Max
1
1.00E+00
33
8.59E+09
65
3.69E+19
97
2
3.00E+00
34
1.72E+10
66
7.38E+19
98
3
7.00E+00
35
3.44E+10
67
1.48E+20
99
4
1.50E+01
36
6.87E+10
68
2.95E+20
100
5
3.10E+01
37
1.37E+11
69
5.90E+20
101
6
6.30E+01
38
2.75E+11
70
1.18E+21
102
7
1.27E+02
39
5.50E+11
71
2.36E+21
103
8
2.55E+02
40
1.10E+12
72
4.72E+21
104
9
5.11E+02
41
2.20E+12
73
9.44E+21
105
10
1.02E+03
42
4.40E+12
74
1.89E+22
106
11
2.05E+03
43
8.80E+12
75
3.78E+22
107
12
4.10E+03
44
1.76E+13
76
7.56E+22
108
13
8.19E+03
45
3.52E+13
77
1.51E+23
109
14
1.64E+04
46
7.04E+13
78
3.02E+23
110
15
3.28E+04
47
1.41E+14
79
6.04E+23
111
16
6.55E+04
48
2.81E+14
80
1.21E+24
112
17
1.31E+05
49
5.63E+14
81
2.42E+24
113
18
2.62E+05
50
1.13E+15
82
4.84E+24
114
19
5.24E+05
51
2.25E+15
83
9.67E+24
115
20
1.05E+06
52
4.50E+15
84
1.93E+25
116
21
2.10E+06
53
9.01E+15
85
3.87E+25
117
22
4.19E+06
54
1.80E+16
86
7.74E+25
118
23
8.39E+06
55
3.60E+16
87
1.55E+26
119
24
1.68E+07
56
7.21E+16
88
3.09E+26
120
25
3.36E+07
57
1.44E+17
89
6.19E+26
121
26
6.71E+07
58
2.88E+17
90
1.24E+27
122
27
1.34E+08
59
5.76E+17
91
2.48E+27
123
28
2.68E+08
60
1.15E+18
92
4.95E+27
124
29
5.37E+08
61
2.31E+18
93
9.90E+27
125
30
1.07E+09
62
4.61E+18
94
1.98E+28
126
31
2.15E+09
63
9.22E+18
95
3.96E+28
127
32
4.29E+09
64
1.84E+19
96
7.92E+28
128
Mod 10 is very slow due to the number of decimals in the result (i.e. the less you have to divide by the faster their newton's method they use for the hardware division) I've benchmarked on 64-bit systems to be faster than mod 10 what is called the MSD Look-Up-Down (MSD-LUD), but I gave up on it when I discovered Puff. The only version I saved prints both ends of the number simultaneously, it is here, but that version of the algorithm has issues and Script2 is down for surgery for at least one more month to update the v1 ASCII Data Spec.
The text was updated successfully, but these errors were encountered:
I don't understand what you're talking about. Did you realize that the compiler replaces division by a constant with a multiplication? The decimalLength function is written that way because we expect ~17 digits on average after conversion if the numbers are uniformly randomly distributed.
I am the author of the Fastest Method to Printing Integers and Floating-point numbers, which contains the manual modulo optimization for Ryu. When you do mod 100 div 100, you just did 2 division instructions when you only needed to do one. You're welcome. You are now apart of the Uniprinter (Universal Printer), which I'm aiming to be the world's fastest text processing library that grows from stack to heap with optional dynamic memory, compiles in about 3 seconds, and actually prints Unicode in C++ right.
I'm still looking through the algorithm for optimizations, I just cloned.
I just ran the 128-bit bit shifting tables from Puff and yes, the pattern generalizes. The
uint32_t decimalLength(const uint128_t v)
function is very slow.Mod 10 is very slow due to the number of decimals in the result (i.e. the less you have to divide by the faster their newton's method they use for the hardware division) I've benchmarked on 64-bit systems to be faster than mod 10 what is called the MSD Look-Up-Down (MSD-LUD), but I gave up on it when I discovered Puff. The only version I saved prints both ends of the number simultaneously, it is here, but that version of the algorithm has issues and Script2 is down for surgery for at least one more month to update the v1 ASCII Data Spec.
The text was updated successfully, but these errors were encountered: