QX_4 quantization #1240
Comments
There's an interesting paper about weight 'outliers' within matrices: https://huggingface.co/blog/hf-bitsandbytes-integration So maybe implementing some hacks for better scaling of those would give better perplexity for lower-bit models.
... soo, finally 65B with 32 gigs of RAM? 😄
Here's another idea, which maybe could be combined with QX_4. FP32 has 8 bits for the exponent, 23 bits for the mantissa:
bfloat16 maintains 8 bits for the exponent, leaving 7 bits for the mantissa:
FP16 has 5 bits for the exponent, 10 bits for the mantissa:
Since the values in the LLaMA models fall roughly in the range [-2,+2], we may want to further reduce the number of bits dedicated to the exponent. I suspect that a significant range of the exponent field is currently unused. One could define a different 16-bit format:
Or an 8-bit minifloat:
@ggerganov I see you have put this into the improve integer quantization project. Do you see this as just "improve integer quantization"? Given the most recent results,
@sw Can you please elaborate on how we can use the alternative float formats you are proposing within the context of this issue? I can see that, perhaps, using your
I must admit I haven't thought this through completely. As far as performance goes, it could be implemented with a look-up table, just as FP16 currently is (except on Arm), so I don't expect it to be worse. I'm simply saying that a generic float wastes a bit of space because we're not using values |x| > 2. But of course that's not specific to your proposal; it's true of the LLaMA models in general.
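For an 8-bit minifloat such a table would have only 256 entries. A minimal sketch, assuming a hypothetical 1-bit sign / 3-bit exponent / 4-bit mantissa split with a bias chosen so values stay within roughly [-2,+2] (none of these parameters were specified above; they are illustrative assumptions):

```c
#include <math.h>

/* Decode table for a hypothetical 1-3-4 minifloat (sign/exponent/mantissa).
 * With bias 7, normal values span [2^-6, 1.9375], i.e., they stay inside
 * the ~[-2,+2] range of the LLaMA weights. */
static float minifloat_lut[256];

static void init_minifloat_lut(void) {
    for (int v = 0; v < 256; ++v) {
        const int sign = v >> 7;
        const int exp  = (v >> 4) & 0x7;
        const int mant = v & 0xF;
        float x;
        if (exp == 0) {
            x = ldexpf((float) mant / 16.0f, -6);              /* subnormals */
        } else {
            x = ldexpf(1.0f + (float) mant / 16.0f, exp - 7);  /* normals */
        }
        minifloat_lut[v] = sign ? -x : x;
    }
}
```

De-quantization would then be a single `minifloat_lut[byte]` lookup, analogous to the fp16-to-fp32 table lookup used on non-Arm targets.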
A couple of minor thoughts:
Closed via #1684
Hi @ikawrakow, I have the same concern. Could you please share the details?
The number of elements (weights) in a row must be divisible by 256. If it is not (e.g., Falcon-7B, OpenLLaMA-3B), then one can turn on |
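The constraint is trivial to check up front; a minimal sketch (the helper name is made up for illustration):

```c
#include <stdbool.h>

/* The k-quant super-blocks cover 256 weights, so each tensor row must
 * contain a whole number of super-blocks. */
static bool row_is_k_quantizable(int n_elements_per_row) {
    return n_elements_per_row % 256 == 0;
}
```

LLaMA-7B row sizes (e.g., 4096 and 11008) pass this check, while Falcon-7B's 4544-wide rows do not.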
Summary

Use 16 x 8 "super-blocks" for quantization, having one `fp16` scale for the "super-block" and 16 quantized scales, one per block of 8 model weights. This is particularly useful for 2- and 3-bit quantization, but it also outperforms the existing 4-bit quantization schemes `Q4_0` and `Q4_2`.

Details
The naming of existing `llama.cpp` quantizations follows the scheme `QX_Y`, where `X` is the number of bits used for the quants and `Y` is `0`, `1`, `2`, or `3`. When `Y` is even (0 or 2), model weights `x` are computed from the quants `q` as `x = d * q`. When `Y` is odd, `x = m + d * q` is used. If we look at the integer part of `Y/2` (`[Y/2]`), then the number of weights in a quantization block is 32 (`Q4_0`, `Q4_1`, `Q5_0`) when `[Y/2] = 0`, and 16 (`Q4_2`, `Q4_3`) when `[Y/2] = 1`. From the latest perplexity results one can see that quantization using blocks of 16 weights performs better than quantization using blocks of 32. The logical conclusion would be to look into using blocks of 8 weights. Following the existing naming convention, quantization of type `x = d * q` for blocks of 8 weights would be `QX_4`, and quantization of type `x = m + d * q` would be `QX_5`.

The problem with going to blocks of 8 weights using the same strategy as in `Q4_2` and `Q4_3` is that the bits needed to store the scale `d` (or the scale `d` and offset `m`) start becoming comparable to the number of bits used for the quants `q`. For instance, using `fp16` for the scale in a block of 8 weights requires 16 bits, while the quants need 32 bits for 4-bit quantization, so effectively 6 bits per weight (bpw).

So, after this long introduction, here is an idea for using quantization blocks of 8 weights while keeping the bpw reasonable: one can use "super-blocks" that combine `N` quantization blocks. The scale in each block of 8 weights is stored as an `int8_t`, and there is a single `fp16` per super-block that converts the quantized scales to their final value. E.g., for 4-bit quantization:
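As a sketch of what such a super-block could look like as a C struct (the type and field names are my assumptions, patterned after ggml's existing block types, not the actual implementation):

```c
#include <stdint.h>

typedef uint16_t ggml_fp16_t;   /* fp16 stored as raw bits, as ggml does */

#define QK4_4 128               /* 16 blocks x 8 weights per super-block */

typedef struct {
    ggml_fp16_t d;              /* fp16 super-block scale */
    int8_t scales[16];          /* quantized scale of each 8-weight block */
    uint8_t qs[QK4_4 / 2];      /* 4-bit quants, two per byte */
} block_q4_4;
```

`sizeof(block_q4_4)` is 82 bytes per 128 weights, i.e., 82 * 8 / 128 = 5.125 bpw. More generally, the overhead on top of the `X` quant bits is 1 + 2/N bpw, giving 1.25, 1.125, 1.0625 for N = 8, 16, 32.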
`N = 16`, i.e., there are 16 blocks of 8 weights, each having its own 8-bit quantized scale. This ends up using 5.125 bpw (4 + 1.125).

To further clarify the idea, here is a simple scalar implementation of the de-quantization for `Q4_4`:
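As a rough scalar sketch of that de-quantization (the struct layout, field names, and the `q - 8` centering are my assumptions; the super-block scale is a plain `float` here to keep the sketch short, whereas the proposal stores it as `fp16`):

```c
#include <stdint.h>

/* Assumed super-block layout: 16 blocks x 8 weights = 128 weights. */
typedef struct {
    float   d;           /* super-block scale (fp16 in the actual proposal) */
    int8_t  scales[16];  /* quantized per-block scales */
    uint8_t qs[64];      /* 128 x 4-bit quants, two per byte */
} block_q4_4;

/* De-quantize one super-block into 128 floats: x = d * scales[i] * (q - 8). */
static void dequantize_row_q4_4(const block_q4_4 *b, float *y) {
    for (int i = 0; i < 16; ++i) {
        const float d = b->d * (float) b->scales[i];   /* effective block scale */
        for (int j = 0; j < 8; ++j) {
            const int k = 8*i + j;                     /* weight index */
            const uint8_t byte = b->qs[k/2];
            const int q = (k & 1) ? (byte >> 4) : (byte & 0x0F);
            y[k] = d * (float)(q - 8);
        }
    }
}
```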
Perplexity results
I have done some experiments with this idea for 2-, 3- and 4-bit quantization and the following table summarizes the perplexity results. All calculations are with the output tensor kept as `fp16`, which adds about 200 MB to the size of the quantized model (compared to the `output.weight` tensor also being quantized):

A few observations from the experiments and existing 4- and 5-bit results:

- For the 4- and 5-bit quantizations, `x = m + d * q` (`QX_1`, `QX_3`) performs better than `x = d * q` (`QX_0`, `QX_2`, and the `QX_4` proposed here). This trend is reversed for 2- and 3-bit quantization; especially for 2-bit quantization, `Q2_1` and `Q2_3` give basically useless results.
- The `Q2_4` quantization proposed here gives much lower perplexity compared to what is reported there for `Q2_2` (my own experiment with `Q2_2` gives a 7B perplexity of 10.6271 and a 13B perplexity of 8.3552; the 30B `Q2_2` perplexity of 6.9507 reported there is higher than the 13B `Q2_4` perplexity found here).
- 9.0087 vs 8.3618 from the above table. At 3-bit quantization the difference is much smaller (e.g., 6.4433 vs 6.3559 for 7B).
- `Q4_4` is better than `Q4_0` and `Q4_2`, but the difference is much smaller than for the 2- and 3-bit quantizations.
- With `N = 8, 16, 32` (i.e., "super-blocks" of 64, 128, 256 weights), perplexity results remain effectively the same, while the extra bits per weight (extra as in addition to the `X` quantization bits) change from 1.25 to 1.125 to 1.0625. Tensor sizes are divisible by 256 for all layers in the 7B and 13B models, so one could use super-blocks of 256 instead of the 128 used here (this saves ~0.1G for the 13B model).

Here are the perplexity runs reported above:
Q2_4, 7B
main: seed = 1682671488
llama.cpp: loading model from ../models/7B/q24.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 15 (mostly Q2_4)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 59.11 KB
llama_model_load_internal: mem required = 4504.40 MB (+ 1026.00 MB per state)
llama_init_from_file: kv self size = 256.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
1.49 seconds per pass - ETA 16 minutes
[1]6.2670,[2]7.2397,[3]7.8484,[4]8.7113,[5]8.5541,[6]8.5304,[7]8.6772,[8]8.7266,[9]9.2108,[10]9.5711,[11]9.9331,[12]10.0286,[13]10.0075,[14]10.2435,[15]10.5844,[16]10.0359,[17]9.8055,[18]9.8147,[19]9.2709,[20]9.2379,[21]9.0826,[22]8.9060,[23]8.8751,[24]8.7826,[25]8.7974,[26]8.5861,[27]8.3354,[28]8.2662,[29]8.1377,[30]7.9415,[31]7.9138,[32]7.9309,[33]7.8544,[34]7.9014,[35]7.9400,[36]8.0232,[37]8.0318,[38]8.0524,[39]8.1139,[40]8.1847,[41]8.2215,[42]8.2709,[43]8.1977,[44]8.2695,[45]8.2579,[46]8.2205,[47]8.2469,[48]8.1920,[49]8.1907,[50]8.1222,[51]8.1247,[52]8.1002,[53]8.1535,[54]8.1312,[55]8.0808,[56]8.1320,[57]8.1640,[58]8.1951,[59]8.2061,[60]8.2674,[61]8.2525,[62]8.3396,[63]8.3781,[64]8.3872,[65]8.4551,[66]8.4626,[67]8.4924,[68]8.5125,[69]8.5512,[70]8.5979,[71]8.6292,[72]8.6747,[73]8.7532,[74]8.7463,[75]8.7566,[76]8.7672,[77]8.7906,[78]8.7662,[79]8.7960,[80]8.7820,[81]8.8039,[82]8.8159,[83]8.7322,[84]8.7181,[85]8.7079,[86]8.6686,[87]8.6050,[88]8.5635,[89]8.5359,[90]8.5171,[91]8.5563,[92]8.5568,[93]8.5615,[94]8.5647,[95]8.6043,[96]8.6056,[97]8.6059,[98]8.5980,[99]8.5716,[100]8.5662,[101]8.5971,[102]8.5883,[103]8.6169,[104]8.6284,[105]8.6296,[106]8.6542,[107]8.6555,[108]8.6706,[109]8.6619,[110]8.6569,[111]8.6789,[112]8.7064,[113]8.7181,[114]8.7197,[115]8.7335,[116]8.7271,[117]8.7349,[118]8.7694,[119]8.7968,[120]8.8443,[121]8.8727,[122]8.9000,[123]8.9478,[124]8.9703,[125]8.9524,[126]9.0055,[127]9.0459,[128]9.0799,[129]9.0528,[130]9.0613,[131]9.0543,[132]9.0445,[133]9.0327,[134]9.0539,[135]9.0482,[136]9.0383,[137]9.0256,[138]9.0123,[139]9.0012,[140]9.0007,[141]8.9808,[142]8.9746,[143]8.9590,[144]8.9421,[145]8.9386,[146]8.9224,[147]8.9316,[148]8.9294,[149]8.9273,[150]8.9260,[151]8.9272,[152]8.9072,[153]8.8795,[154]8.8652,[155]8.8713,[156]8.8626,[157]8.8816,[158]8.8812,[159]8.8950,[160]8.8969,[161]8.9131,[162]8.8704,[163]8.8516,[164]8.8103,[165]8.7617,[166]8.7196,[167]8.6616,[168]8.6151,[169]8.5920,[170]8.5716,[171]8.5289,[172]8.4991,[173]8.4753,[174]8.4339,[175]8.4022,[17
6]8.3800,[177]8.3518,[178]8.3213,[179]8.2960,[180]8.2789,[181]8.2471,[182]8.2158,[183]8.1923,[184]8.1901,[185]8.1759,[186]8.1755,[187]8.1800,[188]8.1783,[189]8.2022,[190]8.2040,[191]8.2313,[192]8.2490,[193]8.2748,[194]8.2906,[195]8.3192,[196]8.3382,[197]8.3638,[198]8.3830,[199]8.3825,[200]8.3868,[201]8.3827,[202]8.4192,[203]8.4279,[204]8.4391,[205]8.4521,[206]8.4600,[207]8.4536,[208]8.4630,[209]8.4689,[210]8.4726,[211]8.4873,[212]8.4963,[213]8.5092,[214]8.5160,[215]8.5206,[216]8.5372,[217]8.5580,[218]8.5740,[219]8.5719,[220]8.5640,[221]8.5553,[222]8.5496,[223]8.5326,[224]8.5205,[225]8.5155,[226]8.5383,[227]8.5536,[228]8.5615,[229]8.5653,[230]8.5606,[231]8.5816,[232]8.5697,[233]8.5431,[234]8.5208,[235]8.5105,[236]8.4995,[237]8.4836,[238]8.4880,[239]8.4663,[240]8.4513,[241]8.4579,[242]8.4639,[243]8.4602,[244]8.4461,[245]8.4448,[246]8.4290,[247]8.4139,[248]8.4028,[249]8.3996,[250]8.4048,[251]8.3959,[252]8.3912,[253]8.3790,[254]8.3743,[255]8.3575,[256]8.3323,[257]8.3143,[258]8.3039,[259]8.3025,[260]8.2925,[261]8.2885,[262]8.2801,[263]8.2747,[264]8.2566,[265]8.2547,[266]8.2506,[267]8.2400,[268]8.2504,[269]8.2489,[270]8.2463,[271]8.2539,[272]8.2614,[273]8.2570,[274]8.2603,[275]8.2742,[276]8.2809,[277]8.3035,[278]8.3175,[279]8.3288,[280]8.3324,[281]8.3439,[282]8.3494,[283]8.3672,[284]8.3763,[285]8.3869,[286]8.4031,[287]8.4049,[288]8.4160,[289]8.4034,[290]8.3824,[291]8.3640,[292]8.3433,[293]8.3263,[294]8.3285,[295]8.3263,[296]8.3322,[297]8.3320,[298]8.3405,[299]8.3354,[300]8.3234,[301]8.3182,[302]8.3099,[303]8.2992,[304]8.2859,[305]8.2847,[306]8.2684,[307]8.2696,[308]8.2736,[309]8.2506,[310]8.2432,[311]8.2368,[312]8.2386,[313]8.2294,[314]8.2276,[315]8.2045,[316]8.2053,[317]8.1839,[318]8.1579,[319]8.1772,[320]8.1937,[321]8.2004,[322]8.1927,[323]8.1903,[324]8.1923,[325]8.2089,[326]8.2077,[327]8.2130,[328]8.2180,[329]8.2283,[330]8.2368,[331]8.2554,[332]8.2503,[333]8.2620,[334]8.2542,[335]8.2433,[336]8.2464,[337]8.2399,[338]8.2423,[339]8.2345,[340]8.2289,[341]8.2386,[342]8.2404
,[343]8.2479,[344]8.2476,[345]8.2446,[346]8.2377,[347]8.2412,[348]8.2458,[349]8.2465,[350]8.2403,[351]8.2399,[352]8.2413,[353]8.2316,[354]8.2341,[355]8.2429,[356]8.2474,[357]8.2407,[358]8.2533,[359]8.2570,[360]8.2481,[361]8.2455,[362]8.2539,[363]8.2650,[364]8.2735,[365]8.2813,[366]8.2828,[367]8.2931,[368]8.2887,[369]8.2886,[370]8.2892,[371]8.2801,[372]8.2848,[373]8.2920,[374]8.2879,[375]8.2865,[376]8.2957,[377]8.2873,[378]8.2901,[379]8.2977,[380]8.2848,[381]8.2801,[382]8.2740,[383]8.2709,[384]8.2678,[385]8.2664,[386]8.2672,[387]8.2645,[388]8.2579,[389]8.2485,[390]8.2391,[391]8.2277,[392]8.2256,[393]8.2288,[394]8.2330,[395]8.2318,[396]8.2207,[397]8.2295,[398]8.2333,[399]8.2437,[400]8.2453,[401]8.2475,[402]8.2493,[403]8.2497,[404]8.2573,[405]8.2502,[406]8.2458,[407]8.2465,[408]8.2466,[409]8.2624,[410]8.2776,[411]8.2933,[412]8.3162,[413]8.3303,[414]8.3414,[415]8.3478,[416]8.3594,[417]8.3760,[418]8.3834,[419]8.3934,[420]8.4054,[421]8.4215,[422]8.4264,[423]8.4389,[424]8.4543,[425]8.4667,[426]8.4751,[427]8.4789,[428]8.4899,[429]8.4956,[430]8.5069,[431]8.5263,[432]8.5289,[433]8.5255,[434]8.5161,[435]8.5146,[436]8.5163,[437]8.5286,[438]8.5396,[439]8.5341,[440]8.5309,[441]8.5231,[442]8.5200,[443]8.5216,[444]8.5225,[445]8.5190,[446]8.5207,[447]8.5240,[448]8.5283,[449]8.5239,[450]8.5232,[451]8.5163,[452]8.5112,[453]8.5021,[454]8.4972,[455]8.4964,[456]8.5014,[457]8.5044,[458]8.5017,[459]8.5020,[460]8.5125,[461]8.5089,[462]8.5070,[463]8.5138,[464]8.5136,[465]8.5099,[466]8.5021,[467]8.5045,[468]8.5073,[469]8.5106,[470]8.5116,[471]8.5045,[472]8.5100,[473]8.5005,[474]8.5033,[475]8.5011,[476]8.5043,[477]8.4954,[478]8.4967,[479]8.5104,[480]8.5171,[481]8.5200,[482]8.5139,[483]8.5089,[484]8.5131,[485]8.5129,[486]8.5049,[487]8.5066,[488]8.5061,[489]8.4980,[490]8.4952,[491]8.4915,[492]8.4829,[493]8.4796,[494]8.4758,[495]8.4773,[496]8.4722,[497]8.4668,[498]8.4663,[499]8.4568,[500]8.4459,[501]8.4388,[502]8.4402,[503]8.4385,[504]8.4277,[505]8.4303,[506]8.4317,[507]8.4305,[508]8.4251,[509]8.
4240,[510]8.4299,[511]8.4356,[512]8.4377,[513]8.4391,[514]8.4473,[515]8.4395,[516]8.4386,[517]8.4401,[518]8.4385,[519]8.4423,[520]8.4454,[521]8.4476,[522]8.4521,[523]8.4524,[524]8.4590,[525]8.4639,[526]8.4657,[527]8.4686,[528]8.4647,[529]8.4674,[530]8.4580,[531]8.4541,[532]8.4608,[533]8.4632,[534]8.4588,[535]8.4638,[536]8.4558,[537]8.4510,[538]8.4575,[539]8.4578,[540]8.4657,[541]8.4691,[542]8.4701,[543]8.4711,[544]8.4726,[545]8.4708,[546]8.4720,[547]8.4648,[548]8.4550,[549]8.4551,[550]8.4508,[551]8.4454,[552]8.4420,[553]8.4362,[554]8.4316,[555]8.4256,[556]8.4259,[557]8.4311,[558]8.4275,[559]8.4277,[560]8.4264,[561]8.4258,[562]8.4236,[563]8.4254,[564]8.4323,[565]8.4359,[566]8.4354,[567]8.4333,[568]8.4318,[569]8.4280,[570]8.4301,[571]8.4303,[572]8.4308,[573]8.4289,[574]8.4256,[575]8.4269,[576]8.4264,[577]8.4243,[578]8.4222,[579]8.4236,[580]8.4137,[581]8.4081,[582]8.4046,[583]8.4039,[584]8.4026,[585]8.3949,[586]8.3877,[587]8.3879,[588]8.3944,[589]8.4027,[590]8.4066,[591]8.4067,[592]8.4038,[593]8.3969,[594]8.3972,[595]8.3932,[596]8.3984,[597]8.3938,[598]8.3913,[599]8.3928,[600]8.3922,[601]8.3893,[602]8.3954,[603]8.3990,[604]8.4015,[605]8.4037,[606]8.4054,[607]8.4049,[608]8.3980,[609]8.3974,[610]8.4014,[611]8.3989,[612]8.4027,[613]8.3976,[614]8.3919,[615]8.3800,[616]8.3859,[617]8.3765,[618]8.3683,[619]8.3586,[620]8.3366,[621]8.3254,[622]8.3233,[623]8.3250,[624]8.3239,[625]8.3229,[626]8.3212,[627]8.3259,[628]8.3252,[629]8.3237,[630]8.3274,[631]8.3340,[632]8.3404,[633]8.3379,[634]8.3423,[635]8.3431,[636]8.3403,[637]8.3381,[638]8.3427,[639]8.3394,[640]8.3394,[641]8.3390,[642]8.3474,[643]8.3494,[644]8.3498,[645]8.3465,[646]8.3537,[647]8.3518,[648]8.3533,[649]8.3524,[650]8.3579,[651]8.3656,[652]8.3674,[653]8.3719,[654]8.3636,[655]8.3618,
llama_print_timings: load time = 2570.97 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 906622.46 ms / 335360 tokens ( 2.70 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 933921.98 ms
Q3_4, 7B
main: seed = 1682612164
llama.cpp: loading model from junk.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 10 (mostly Q3_4)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 59.11 KB
llama_model_load_internal: mem required = 5390.48 MB (+ 1026.00 MB per state)
llama_init_from_file: kv self size = 256.00 MB
system_info: n_threads = 8 / 12 | AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
13.33 seconds per pass - ETA 2 hours 25 minutes
[1]4.5663,[2]4.9408,[3]5.8361,[4]6.5915,[5]6.6755,[6]6.6236,[7]6.8095,[8]6.9148,[9]7.2707,[10]7.5192,[11]7.7399,[12]7.7851,[13]7.7344,[14]7.8086,[15]8.0569,[16]7.6602,[17]7.5283,[18]7.4705,[19]7.0917,[20]7.0792,[21]6.9849,[22]6.8041,[23]6.7713,[24]6.6866,[25]6.6918,[26]6.5324,[27]6.3450,[28]6.2465,[29]6.1521,[30]5.9987,[31]5.9758,[32]5.9998,[33]5.9411,[34]5.9724,[35]5.9995,[36]6.0390,[37]6.0437,[38]6.0577,[39]6.0941,[40]6.1552,[41]6.1714,[42]6.2174,[43]6.1721,[44]6.2245,[45]6.2294,[46]6.2025,[47]6.2265,[48]6.1973,[49]6.2013,[50]6.1552,[51]6.1511,[52]6.1385,[53]6.1806,[54]6.1639,[55]6.1354,[56]6.1712,[57]6.1937,[58]6.2181,[59]6.2339,[60]6.2785,[61]6.2698,[62]6.3279,[63]6.3618,[64]6.3756,[65]6.4226,[66]6.4309,[67]6.4517,[68]6.4697,[69]6.4930,[70]6.5255,[71]6.5479,[72]6.5798,[73]6.6429,[74]6.6461,[75]6.6590,[76]6.6751,[77]6.6892,[78]6.6736,[79]6.7003,[80]6.6948,[81]6.7049,[82]6.7107,[83]6.6550,[84]6.6395,[85]6.6251,[86]6.6019,[87]6.5435,[88]6.5150,[89]6.4959,[90]6.4806,[91]6.5057,[92]6.4994,[93]6.4989,[94]6.4935,[95]6.5215,[96]6.5199,[97]6.5145,[98]6.5099,[99]6.4929,[100]6.4931,[101]6.5206,[102]6.5163,[103]6.5351,[104]6.5432,[105]6.5422,[106]6.5577,[107]6.5536,[108]6.5665,[109]6.5615,[110]6.5573,[111]6.5786,[112]6.6020,[113]6.6021,[114]6.5969,[115]6.6038,[116]6.5944,[117]6.6001,[118]6.6283,[119]6.6498,[120]6.6840,[121]6.6992,[122]6.7240,[123]6.7622,[124]6.7804,[125]6.7694,[126]6.8078,[127]6.8453,[128]6.8757,[129]6.8576,[130]6.8657,[131]6.8604,[132]6.8510,[133]6.8362,[134]6.8459,[135]6.8407,[136]6.8283,[137]6.8197,[138]6.8047,[139]6.7936,[140]6.7900,[141]6.7636,[142]6.7598,[143]6.7322,[144]6.7116,[145]6.7051,[146]6.6920,[147]6.6975,[148]6.6982,[149]6.6923,[150]6.6885,[151]6.6909,[152]6.6807,[153]6.6650,[154]6.6550,[155]6.6620,[156]6.6565,[157]6.6743,[158]6.6784,[159]6.6818,[160]6.6830,[161]6.6951,[162]6.6634,[163]6.6521,[164]6.6259,[165]6.5926,[166]6.5632,[167]6.5231,[168]6.4907,[169]6.4767,[170]6.4657,[171]6.4369,[172]6.4188,[173]6.4011,[174]6.3693,[175]6.3458,[176]6.3
342,[177]6.3129,[178]6.2889,[179]6.2715,[180]6.2615,[181]6.2389,[182]6.2196,[183]6.2037,[184]6.2019,[185]6.1936,[186]6.1946,[187]6.2012,[188]6.1977,[189]6.2168,[190]6.2182,[191]6.2409,[192]6.2573,[193]6.2741,[194]6.2860,[195]6.3075,[196]6.3233,[197]6.3455,[198]6.3612,[199]6.3641,[200]6.3691,[201]6.3647,[202]6.3864,[203]6.3947,[204]6.3980,[205]6.4091,[206]6.4162,[207]6.4121,[208]6.4209,[209]6.4261,[210]6.4310,[211]6.4415,[212]6.4490,[213]6.4591,[214]6.4633,[215]6.4680,[216]6.4832,[217]6.5014,[218]6.5150,[219]6.5162,[220]6.5118,[221]6.5051,[222]6.5023,[223]6.4909,[224]6.4835,[225]6.4790,[226]6.5006,[227]6.5076,[228]6.5134,[229]6.5182,[230]6.5144,[231]6.5315,[232]6.5184,[233]6.5008,[234]6.4849,[235]6.4693,[236]6.4610,[237]6.4500,[238]6.4531,[239]6.4369,[240]6.4258,[241]6.4290,[242]6.4326,[243]6.4305,[244]6.4181,[245]6.4153,[246]6.4041,[247]6.3923,[248]6.3846,[249]6.3822,[250]6.3861,[251]6.3801,[252]6.3766,[253]6.3666,[254]6.3631,[255]6.3511,[256]6.3323,[257]6.3202,[258]6.3114,[259]6.3098,[260]6.3011,[261]6.2972,[262]6.2914,[263]6.2865,[264]6.2683,[265]6.2681,[266]6.2659,[267]6.2593,[268]6.2693,[269]6.2675,[270]6.2675,[271]6.2760,[272]6.2799,[273]6.2786,[274]6.2805,[275]6.2891,[276]6.2950,[277]6.3109,[278]6.3215,[279]6.3307,[280]6.3332,[281]6.3427,[282]6.3475,[283]6.3624,[284]6.3708,[285]6.3798,[286]6.3937,[287]6.3928,[288]6.3992,[289]6.3897,[290]6.3734,[291]6.3572,[292]6.3419,[293]6.3274,[294]6.3303,[295]6.3298,[296]6.3349,[297]6.3338,[298]6.3372,[299]6.3346,[300]6.3234,[301]6.3228,[302]6.3155,[303]6.3066,[304]6.2975,[305]6.2948,[306]6.2815,[307]6.2830,[308]6.2856,[309]6.2690,[310]6.2629,[311]6.2567,[312]6.2591,[313]6.2540,[314]6.2524,[315]6.2360,[316]6.2325,[317]6.2155,[318]6.1942,[319]6.2070,[320]6.2197,[321]6.2236,[322]6.2183,[323]6.2123,[324]6.2093,[325]6.2197,[326]6.2195,[327]6.2218,[328]6.2254,[329]6.2315,[330]6.2353,[331]6.2479,[332]6.2454,[333]6.2528,[334]6.2471,[335]6.2406,[336]6.2445,[337]6.2410,[338]6.2406,[339]6.2349,[340]6.2300,[341]6.2388,[342]6.2411,[343
]6.2469,[344]6.2470,[345]6.2466,[346]6.2438,[347]6.2479,[348]6.2522,[349]6.2543,[350]6.2511,[351]6.2515,[352]6.2519,[353]6.2461,[354]6.2467,[355]6.2520,[356]6.2548,[357]6.2507,[358]6.2604,[359]6.2628,[360]6.2577,[361]6.2571,[362]6.2634,[363]6.2747,[364]6.2805,[365]6.2861,[366]6.2869,[367]6.2959,[368]6.2936,[369]6.2947,[370]6.2959,[371]6.2899,[372]6.2945,[373]6.3000,[374]6.2986,[375]6.2982,[376]6.3058,[377]6.3004,[378]6.3028,[379]6.3093,[380]6.3012,[381]6.2973,[382]6.2922,[383]6.2908,[384]6.2898,[385]6.2888,[386]6.2885,[387]6.2885,[388]6.2839,[389]6.2783,[390]6.2714,[391]6.2634,[392]6.2598,[393]6.2582,[394]6.2612,[395]6.2601,[396]6.2523,[397]6.2594,[398]6.2624,[399]6.2696,[400]6.2688,[401]6.2714,[402]6.2727,[403]6.2746,[404]6.2816,[405]6.2732,[406]6.2692,[407]6.2692,[408]6.2711,[409]6.2829,[410]6.2948,[411]6.3068,[412]6.3233,[413]6.3350,[414]6.3431,[415]6.3488,[416]6.3577,[417]6.3705,[418]6.3745,[419]6.3817,[420]6.3908,[421]6.4034,[422]6.4078,[423]6.4150,[424]6.4265,[425]6.4360,[426]6.4423,[427]6.4469,[428]6.4554,[429]6.4599,[430]6.4688,[431]6.4829,[432]6.4860,[433]6.4848,[434]6.4798,[435]6.4800,[436]6.4816,[437]6.4912,[438]6.4991,[439]6.4958,[440]6.4950,[441]6.4895,[442]6.4880,[443]6.4888,[444]6.4892,[445]6.4875,[446]6.4892,[447]6.4916,[448]6.4960,[449]6.4937,[450]6.4942,[451]6.4897,[452]6.4792,[453]6.4706,[454]6.4647,[455]6.4659,[456]6.4703,[457]6.4721,[458]6.4701,[459]6.4703,[460]6.4789,[461]6.4761,[462]6.4740,[463]6.4784,[464]6.4771,[465]6.4748,[466]6.4669,[467]6.4671,[468]6.4666,[469]6.4686,[470]6.4690,[471]6.4642,[472]6.4688,[473]6.4631,[474]6.4643,[475]6.4586,[476]6.4605,[477]6.4535,[478]6.4529,[479]6.4602,[480]6.4651,[481]6.4669,[482]6.4626,[483]6.4583,[484]6.4605,[485]6.4590,[486]6.4533,[487]6.4536,[488]6.4514,[489]6.4462,[490]6.4439,[491]6.4405,[492]6.4345,[493]6.4318,[494]6.4300,[495]6.4294,[496]6.4262,[497]6.4207,[498]6.4187,[499]6.4142,[500]6.4045,[501]6.3977,[502]6.3982,[503]6.3975,[504]6.3887,[505]6.3914,[506]6.3922,[507]6.3870,[508]6.3831,[509]6.3822,
[510]6.3861,[511]6.3908,[512]6.3944,[513]6.3960,[514]6.4024,[515]6.3970,[516]6.3961,[517]6.3972,[518]6.3967,[519]6.3997,[520]6.4024,[521]6.4039,[522]6.4071,[523]6.4077,[524]6.4132,[525]6.4167,[526]6.4175,[527]6.4192,[528]6.4137,[529]6.4147,[530]6.4096,[531]6.4080,[532]6.4132,[533]6.4158,[534]6.4138,[535]6.4163,[536]6.4104,[537]6.4078,[538]6.4132,[539]6.4142,[540]6.4182,[541]6.4187,[542]6.4199,[543]6.4211,[544]6.4221,[545]6.4199,[546]6.4205,[547]6.4159,[548]6.4104,[549]6.4102,[550]6.4072,[551]6.4035,[552]6.4016,[553]6.3975,[554]6.3949,[555]6.3915,[556]6.3912,[557]6.3940,[558]6.3902,[559]6.3897,[560]6.3893,[561]6.3894,[562]6.3868,[563]6.3866,[564]6.3912,[565]6.3934,[566]6.3934,[567]6.3908,[568]6.3910,[569]6.3895,[570]6.3927,[571]6.3928,[572]6.3939,[573]6.3937,[574]6.3899,[575]6.3895,[576]6.3898,[577]6.3881,[578]6.3859,[579]6.3864,[580]6.3795,[581]6.3757,[582]6.3744,[583]6.3751,[584]6.3752,[585]6.3680,[586]6.3610,[587]6.3616,[588]6.3663,[589]6.3722,[590]6.3754,[591]6.3775,[592]6.3755,[593]6.3719,[594]6.3725,[595]6.3698,[596]6.3737,[597]6.3711,[598]6.3683,[599]6.3702,[600]6.3694,[601]6.3680,[602]6.3704,[603]6.3732,[604]6.3745,[605]6.3776,[606]6.3799,[607]6.3788,[608]6.3751,[609]6.3755,[610]6.3790,[611]6.3770,[612]6.3797,[613]6.3758,[614]6.3703,[615]6.3627,[616]6.3654,[617]6.3589,[618]6.3537,[619]6.3478,[620]6.3331,[621]6.3260,[622]6.3239,[623]6.3254,[624]6.3260,[625]6.3260,[626]6.3248,[627]6.3273,[628]6.3271,[629]6.3268,[630]6.3300,[631]6.3358,[632]6.3411,[633]6.3395,[634]6.3428,[635]6.3431,[636]6.3407,[637]6.3375,[638]6.3405,[639]6.3373,[640]6.3381,[641]6.3380,[642]6.3446,[643]6.3467,[644]6.3480,[645]6.3463,[646]6.3508,[647]6.3474,[648]6.3484,[649]6.3487,[650]6.3527,[651]6.3582,[652]6.3594,[653]6.3633,[654]6.3566,[655]6.3559,
llama_print_timings: load time = 13794.33 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 4708961.29 ms / 335360 tokens ( 14.04 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 4740083.96 ms
Q4_4, 7B
main: seed = 1682662628
llama.cpp: loading model from ../models/7B/q44.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 14 (mostly Q4_4)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 59.11 KB
llama_model_load_internal: mem required = 6079.65 MB (+ 1026.00 MB per state)
llama_init_from_file: kv self size = 256.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
1.66 seconds per pass - ETA 18 minutes
[1]4.4671,[2]4.9332,[3]5.7966,[4]6.3903,[5]6.4889,[6]6.4445,[7]6.6354,[8]6.7448,[9]7.1008,[10]7.3327,[11]7.5469,[12]7.5702,[13]7.4939,[14]7.5431,[15]7.8020,[16]7.4115,[17]7.2997,[18]7.2633,[19]6.8989,[20]6.8895,[21]6.7942,[22]6.6200,[23]6.5946,[24]6.4943,[25]6.4910,[26]6.3306,[27]6.1518,[28]6.0550,[29]5.9628,[30]5.7997,[31]5.7669,[32]5.7880,[33]5.7247,[34]5.7606,[35]5.7797,[36]5.8213,[37]5.8266,[38]5.8396,[39]5.8747,[40]5.9274,[41]5.9369,[42]5.9771,[43]5.9365,[44]5.9939,[45]5.9977,[46]5.9743,[47]5.9951,[48]5.9683,[49]5.9721,[50]5.9325,[51]5.9288,[52]5.9188,[53]5.9631,[54]5.9476,[55]5.9211,[56]5.9503,[57]5.9739,[58]5.9943,[59]6.0098,[60]6.0524,[61]6.0460,[62]6.1021,[63]6.1367,[64]6.1507,[65]6.1955,[66]6.2021,[67]6.2187,[68]6.2364,[69]6.2616,[70]6.2919,[71]6.3106,[72]6.3415,[73]6.4011,[74]6.4065,[75]6.4199,[76]6.4314,[77]6.4421,[78]6.4259,[79]6.4530,[80]6.4448,[81]6.4560,[82]6.4602,[83]6.4082,[84]6.3919,[85]6.3798,[86]6.3574,[87]6.2929,[88]6.2639,[89]6.2441,[90]6.2288,[91]6.2508,[92]6.2444,[93]6.2448,[94]6.2434,[95]6.2717,[96]6.2711,[97]6.2662,[98]6.2604,[99]6.2463,[100]6.2476,[101]6.2717,[102]6.2658,[103]6.2852,[104]6.2931,[105]6.2930,[106]6.3089,[107]6.3065,[108]6.3193,[109]6.3124,[110]6.3074,[111]6.3307,[112]6.3508,[113]6.3523,[114]6.3484,[115]6.3549,[116]6.3466,[117]6.3512,[118]6.3800,[119]6.3996,[120]6.4352,[121]6.4512,[122]6.4768,[123]6.5140,[124]6.5320,[125]6.5220,[126]6.5619,[127]6.5989,[128]6.6290,[129]6.6127,[130]6.6231,[131]6.6199,[132]6.6103,[133]6.5962,[134]6.6053,[135]6.6013,[136]6.5899,[137]6.5826,[138]6.5660,[139]6.5555,[140]6.5498,[141]6.5198,[142]6.5166,[143]6.4876,[144]6.4677,[145]6.4586,[146]6.4460,[147]6.4515,[148]6.4512,[149]6.4450,[150]6.4413,[151]6.4425,[152]6.4321,[153]6.4147,[154]6.4056,[155]6.4131,[156]6.4087,[157]6.4255,[158]6.4298,[159]6.4351,[160]6.4369,[161]6.4483,[162]6.4189,[163]6.4069,[164]6.3823,[165]6.3515,[166]6.3244,[167]6.2874,[168]6.2551,[169]6.2412,[170]6.2300,[171]6.2027,[172]6.1858,[173]6.1688,[174]6.1387,[175]6.1174,[176]6.1
069,[177]6.0861,[178]6.0630,[179]6.0461,[180]6.0368,[181]6.0151,[182]5.9970,[183]5.9832,[184]5.9832,[185]5.9755,[186]5.9762,[187]5.9828,[188]5.9785,[189]5.9951,[190]5.9961,[191]6.0184,[192]6.0350,[193]6.0522,[194]6.0635,[195]6.0853,[196]6.1015,[197]6.1229,[198]6.1381,[199]6.1415,[200]6.1468,[201]6.1411,[202]6.1608,[203]6.1688,[204]6.1675,[205]6.1781,[206]6.1847,[207]6.1806,[208]6.1893,[209]6.1939,[210]6.1997,[211]6.2106,[212]6.2183,[213]6.2291,[214]6.2316,[215]6.2350,[216]6.2496,[217]6.2681,[218]6.2813,[219]6.2813,[220]6.2777,[221]6.2732,[222]6.2704,[223]6.2607,[224]6.2532,[225]6.2496,[226]6.2706,[227]6.2796,[228]6.2846,[229]6.2911,[230]6.2883,[231]6.3053,[232]6.2927,[233]6.2763,[234]6.2615,[235]6.2437,[236]6.2364,[237]6.2263,[238]6.2291,[239]6.2136,[240]6.2034,[241]6.2061,[242]6.2098,[243]6.2076,[244]6.1963,[245]6.1933,[246]6.1819,[247]6.1698,[248]6.1622,[249]6.1601,[250]6.1644,[251]6.1573,[252]6.1532,[253]6.1435,[254]6.1388,[255]6.1269,[256]6.1089,[257]6.0971,[258]6.0891,[259]6.0872,[260]6.0795,[261]6.0754,[262]6.0699,[263]6.0643,[264]6.0427,[265]6.0420,[266]6.0407,[267]6.0339,[268]6.0434,[269]6.0410,[270]6.0421,[271]6.0497,[272]6.0532,[273]6.0534,[274]6.0557,[275]6.0640,[276]6.0700,[277]6.0856,[278]6.0958,[279]6.1049,[280]6.1076,[281]6.1168,[282]6.1228,[283]6.1381,[284]6.1461,[285]6.1548,[286]6.1684,[287]6.1687,[288]6.1747,[289]6.1662,[290]6.1505,[291]6.1353,[292]6.1199,[293]6.1070,[294]6.1090,[295]6.1082,[296]6.1124,[297]6.1110,[298]6.1144,[299]6.1116,[300]6.1005,[301]6.1005,[302]6.0925,[303]6.0832,[304]6.0749,[305]6.0724,[306]6.0599,[307]6.0621,[308]6.0658,[309]6.0495,[310]6.0435,[311]6.0375,[312]6.0401,[313]6.0346,[314]6.0328,[315]6.0165,[316]6.0113,[317]5.9953,[318]5.9745,[319]5.9864,[320]5.9991,[321]6.0034,[322]5.9992,[323]5.9924,[324]5.9893,[325]5.9993,[326]5.9989,[327]6.0011,[328]6.0052,[329]6.0116,[330]6.0142,[331]6.0267,[332]6.0234,[333]6.0306,[334]6.0251,[335]6.0182,[336]6.0215,[337]6.0188,[338]6.0183,[339]6.0132,[340]6.0088,[341]6.0169,[342]6.0194,[343
]6.0240,[344]6.0238,[345]6.0243,[346]6.0216,[347]6.0255,[348]6.0282,[349]6.0300,[350]6.0266,[351]6.0272,[352]6.0275,[353]6.0218,[354]6.0218,[355]6.0268,[356]6.0295,[357]6.0262,[358]6.0353,[359]6.0384,[360]6.0350,[361]6.0349,[362]6.0416,[363]6.0531,[364]6.0599,[365]6.0656,[366]6.0665,[367]6.0749,[368]6.0722,[369]6.0729,[370]6.0744,[371]6.0687,[372]6.0733,[373]6.0784,[374]6.0768,[375]6.0770,[376]6.0837,[377]6.0790,[378]6.0816,[379]6.0876,[380]6.0798,[381]6.0762,[382]6.0711,[383]6.0704,[384]6.0698,[385]6.0690,[386]6.0684,[387]6.0682,[388]6.0644,[389]6.0591,[390]6.0524,[391]6.0446,[392]6.0405,[393]6.0391,[394]6.0420,[395]6.0406,[396]6.0330,[397]6.0401,[398]6.0440,[399]6.0523,[400]6.0522,[401]6.0539,[402]6.0546,[403]6.0566,[404]6.0632,[405]6.0534,[406]6.0498,[407]6.0490,[408]6.0505,[409]6.0626,[410]6.0736,[411]6.0848,[412]6.1006,[413]6.1118,[414]6.1195,[415]6.1248,[416]6.1322,[417]6.1444,[418]6.1483,[419]6.1556,[420]6.1642,[421]6.1757,[422]6.1804,[423]6.1873,[424]6.1986,[425]6.2072,[426]6.2136,[427]6.2181,[428]6.2262,[429]6.2315,[430]6.2398,[431]6.2542,[432]6.2585,[433]6.2580,[434]6.2538,[435]6.2546,[436]6.2568,[437]6.2665,[438]6.2738,[439]6.2713,[440]6.2704,[441]6.2652,[442]6.2637,[443]6.2651,[444]6.2653,[445]6.2633,[446]6.2660,[447]6.2689,[448]6.2734,[449]6.2704,[450]6.2717,[451]6.2676,[452]6.2548,[453]6.2464,[454]6.2409,[455]6.2419,[456]6.2463,[457]6.2483,[458]6.2461,[459]6.2467,[460]6.2551,[461]6.2525,[462]6.2511,[463]6.2558,[464]6.2547,[465]6.2519,[466]6.2441,[467]6.2441,[468]6.2439,[469]6.2458,[470]6.2462,[471]6.2412,[472]6.2457,[473]6.2402,[474]6.2412,[475]6.2353,[476]6.2371,[477]6.2300,[478]6.2289,[479]6.2348,[480]6.2394,[481]6.2412,[482]6.2367,[483]6.2323,[484]6.2343,[485]6.2332,[486]6.2278,[487]6.2278,[488]6.2258,[489]6.2209,[490]6.2185,[491]6.2154,[492]6.2094,[493]6.2065,[494]6.2051,[495]6.2051,[496]6.2017,[497]6.1960,[498]6.1943,[499]6.1898,[500]6.1803,[501]6.1738,[502]6.1741,[503]6.1733,[504]6.1644,[505]6.1673,[506]6.1681,[507]6.1625,[508]6.1586,[509]6.1579,
[510]6.1617,[511]6.1661,[512]6.1694,[513]6.1715,[514]6.1779,[515]6.1723,[516]6.1713,[517]6.1722,[518]6.1721,[519]6.1750,[520]6.1777,[521]6.1792,[522]6.1821,[523]6.1827,[524]6.1884,[525]6.1919,[526]6.1931,[527]6.1949,[528]6.1897,[529]6.1904,[530]6.1854,[531]6.1840,[532]6.1885,[533]6.1907,[534]6.1894,[535]6.1918,[536]6.1864,[537]6.1840,[538]6.1889,[539]6.1900,[540]6.1936,[541]6.1938,[542]6.1950,[543]6.1967,[544]6.1976,[545]6.1955,[546]6.1963,[547]6.1920,[548]6.1871,[549]6.1872,[550]6.1841,[551]6.1805,[552]6.1783,[553]6.1745,[554]6.1723,[555]6.1694,[556]6.1691,[557]6.1713,[558]6.1675,[559]6.1669,[560]6.1667,[561]6.1668,[562]6.1641,[563]6.1641,[564]6.1682,[565]6.1701,[566]6.1698,[567]6.1678,[568]6.1683,[569]6.1668,[570]6.1696,[571]6.1702,[572]6.1711,[573]6.1712,[574]6.1677,[575]6.1675,[576]6.1675,[577]6.1662,[578]6.1642,[579]6.1649,[580]6.1582,[581]6.1544,[582]6.1534,[583]6.1543,[584]6.1544,[585]6.1467,[586]6.1399,[587]6.1404,[588]6.1449,[589]6.1505,[590]6.1536,[591]6.1558,[592]6.1545,[593]6.1514,[594]6.1523,[595]6.1500,[596]6.1535,[597]6.1513,[598]6.1484,[599]6.1506,[600]6.1502,[601]6.1486,[602]6.1500,[603]6.1533,[604]6.1542,[605]6.1574,[606]6.1593,[607]6.1577,[608]6.1546,[609]6.1551,[610]6.1587,[611]6.1569,[612]6.1595,[613]6.1557,[614]6.1506,[615]6.1432,[616]6.1462,[617]6.1402,[618]6.1353,[619]6.1297,[620]6.1158,[621]6.1088,[622]6.1073,[623]6.1087,[624]6.1091,[625]6.1091,[626]6.1078,[627]6.1098,[628]6.1099,[629]6.1096,[630]6.1128,[631]6.1183,[632]6.1237,[633]6.1221,[634]6.1256,[635]6.1265,[636]6.1232,[637]6.1200,[638]6.1227,[639]6.1197,[640]6.1206,[641]6.1210,[642]6.1278,[643]6.1300,[644]6.1312,[645]6.1292,[646]6.1331,[647]6.1294,[648]6.1302,[649]6.1303,[650]6.1343,[651]6.1398,[652]6.1408,[653]6.1448,[654]6.1384,[655]6.1378,
llama_print_timings: load time = 2868.41 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 986629.03 ms / 335360 tokens ( 2.94 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1016443.58 ms
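The bracketed pairs in the logs above (e.g. `[655]6.1378`) are running perplexity values: after each 512-token chunk, perplexity is recomputed as the exponential of the mean negative log-likelihood over all tokens evaluated so far. A minimal sketch of that running computation (not llama.cpp's actual implementation, just the idea behind the numbers):

```python
import math

def running_perplexity(chunk_nlls):
    """Yield (chunk_index, perplexity-so-far) pairs, where each value is
    exp of the mean per-token negative log-likelihood over all tokens
    evaluated up to and including that chunk."""
    total_nll = 0.0
    total_tokens = 0
    for i, nlls in enumerate(chunk_nlls, start=1):
        total_nll += sum(nlls)
        total_tokens += len(nlls)
        yield i, math.exp(total_nll / total_tokens)

# Toy example: three chunks of per-token negative log-likelihoods.
chunks = [[1.8, 1.9], [1.7, 1.8], [1.9, 2.0]]
for i, ppl in running_perplexity(chunks):
    print(f"[{i}]{ppl:.4f}")
```

This is why the early values jump around and the later ones settle: each new chunk contributes a smaller and smaller fraction of the running average.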
Q2_4, 13B
main: seed = 1682672513
llama.cpp: loading model from ../models/13B/q24.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 15 (mostly Q2_4)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 73.73 KB
llama_model_load_internal: mem required = 7149.75 MB (+ 1608.00 MB per state)
llama_init_from_file: kv self size = 400.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
2.46 seconds per pass - ETA 26 minutes
[1]4.7585,[2]5.3449,[3]6.1921,[4]6.9859,[5]7.0915,[6]6.9582,[7]7.1905,[8]7.3194,[9]7.6847,[10]7.9856,[11]8.2037,[12]8.2346,[13]8.2531,[14]8.4211,[15]8.6654,[16]8.1885,[17]8.0491,[18]8.0571,[19]7.6317,[20]7.5630,[21]7.4568,[22]7.2688,[23]7.2103,[24]7.1009,[25]7.0970,[26]6.8966,[27]6.6696,[28]6.5677,[29]6.4672,[30]6.2927,[31]6.2522,[32]6.2650,[33]6.2214,[34]6.2911,[35]6.3188,[36]6.3626,[37]6.3676,[38]6.3648,[39]6.4049,[40]6.4696,[41]6.5031,[42]6.5451,[43]6.4929,[44]6.5349,[45]6.5355,[46]6.4898,[47]6.5203,[48]6.4929,[49]6.5029,[50]6.4658,[51]6.4713,[52]6.4578,[53]6.5058,[54]6.4917,[55]6.4656,[56]6.4997,[57]6.5208,[58]6.5559,[59]6.5796,[60]6.6247,[61]6.6097,[62]6.6783,[63]6.7118,[64]6.7189,[65]6.7631,[66]6.7637,[67]6.7817,[68]6.7975,[69]6.8369,[70]6.8757,[71]6.9031,[72]6.9442,[73]7.0056,[74]7.0067,[75]7.0174,[76]7.0390,[77]7.0566,[78]7.0475,[79]7.0756,[80]7.0690,[81]7.0858,[82]7.0833,[83]7.0255,[84]7.0156,[85]7.0123,[86]6.9888,[87]6.9278,[88]6.8928,[89]6.8682,[90]6.8598,[91]6.8908,[92]6.8824,[93]6.8846,[94]6.8808,[95]6.9110,[96]6.9089,[97]6.9063,[98]6.8987,[99]6.8890,[100]6.8806,[101]6.9075,[102]6.8944,[103]6.9125,[104]6.9135,[105]6.9150,[106]6.9329,[107]6.9309,[108]6.9469,[109]6.9416,[110]6.9362,[111]6.9575,[112]6.9752,[113]6.9789,[114]6.9764,[115]6.9810,[116]6.9712,[117]6.9763,[118]7.0062,[119]7.0290,[120]7.0601,[121]7.0781,[122]7.1003,[123]7.1417,[124]7.1628,[125]7.1536,[126]7.1928,[127]7.2300,[128]7.2598,[129]7.2410,[130]7.2510,[131]7.2454,[132]7.2380,[133]7.2288,[134]7.2421,[135]7.2391,[136]7.2287,[137]7.2241,[138]7.2107,[139]7.2015,[140]7.2012,[141]7.1776,[142]7.1732,[143]7.1527,[144]7.1378,[145]7.1344,[146]7.1172,[147]7.1279,[148]7.1340,[149]7.1299,[150]7.1284,[151]7.1312,[152]7.1194,[153]7.1053,[154]7.0951,[155]7.1006,[156]7.0991,[157]7.1170,[158]7.1222,[159]7.1258,[160]7.1297,[161]7.1436,[162]7.1072,[163]7.0966,[164]7.0683,[165]7.0330,[166]6.9992,[167]6.9570,[168]6.9234,[169]6.9088,[170]6.8939,[171]6.8657,[172]6.8461,[173]6.8297,[174]6.7949,[175]6.7696,[176]6.7
531,[177]6.7300,[178]6.7037,[179]6.6849,[180]6.6748,[181]6.6528,[182]6.6307,[183]6.6164,[184]6.6141,[185]6.6069,[186]6.6110,[187]6.6160,[188]6.6144,[189]6.6356,[190]6.6369,[191]6.6560,[192]6.6695,[193]6.6896,[194]6.7040,[195]6.7267,[196]6.7426,[197]6.7660,[198]6.7803,[199]6.7816,[200]6.7834,[201]6.7782,[202]6.8005,[203]6.8093,[204]6.8139,[205]6.8266,[206]6.8314,[207]6.8280,[208]6.8349,[209]6.8376,[210]6.8433,[211]6.8518,[212]6.8573,[213]6.8660,[214]6.8694,[215]6.8724,[216]6.8841,[217]6.9016,[218]6.9167,[219]6.9157,[220]6.9105,[221]6.9024,[222]6.9009,[223]6.8909,[224]6.8810,[225]6.8771,[226]6.8996,[227]6.9145,[228]6.9243,[229]6.9329,[230]6.9299,[231]6.9460,[232]6.9357,[233]6.9161,[234]6.8990,[235]6.8808,[236]6.8716,[237]6.8604,[238]6.8644,[239]6.8478,[240]6.8351,[241]6.8392,[242]6.8429,[243]6.8411,[244]6.8283,[245]6.8257,[246]6.8136,[247]6.8008,[248]6.7913,[249]6.7872,[250]6.7898,[251]6.7811,[252]6.7757,[253]6.7633,[254]6.7598,[255]6.7463,[256]6.7255,[257]6.7133,[258]6.7034,[259]6.7016,[260]6.6927,[261]6.6882,[262]6.6811,[263]6.6731,[264]6.6577,[265]6.6581,[266]6.6546,[267]6.6455,[268]6.6556,[269]6.6559,[270]6.6559,[271]6.6630,[272]6.6677,[273]6.6665,[274]6.6678,[275]6.6755,[276]6.6830,[277]6.7003,[278]6.7100,[279]6.7188,[280]6.7223,[281]6.7342,[282]6.7384,[283]6.7524,[284]6.7615,[285]6.7707,[286]6.7866,[287]6.7841,[288]6.7917,[289]6.7842,[290]6.7667,[291]6.7510,[292]6.7324,[293]6.7158,[294]6.7170,[295]6.7174,[296]6.7228,[297]6.7217,[298]6.7237,[299]6.7202,[300]6.7097,[301]6.7079,[302]6.6990,[303]6.6904,[304]6.6805,[305]6.6760,[306]6.6626,[307]6.6652,[308]6.6659,[309]6.6506,[310]6.6450,[311]6.6401,[312]6.6418,[313]6.6342,[314]6.6339,[315]6.6170,[316]6.6160,[317]6.6004,[318]6.5804,[319]6.5945,[320]6.6077,[321]6.6118,[322]6.6055,[323]6.5989,[324]6.5969,[325]6.6085,[326]6.6091,[327]6.6111,[328]6.6139,[329]6.6186,[330]6.6225,[331]6.6357,[332]6.6312,[333]6.6398,[334]6.6321,[335]6.6254,[336]6.6294,[337]6.6266,[338]6.6272,[339]6.6223,[340]6.6194,[341]6.6274,[342]6.6307,[343
]6.6368,[344]6.6358,[345]6.6355,[346]6.6318,[347]6.6356,[348]6.6404,[349]6.6428,[350]6.6401,[351]6.6418,[352]6.6438,[353]6.6379,[354]6.6392,[355]6.6448,[356]6.6477,[357]6.6436,[358]6.6529,[359]6.6557,[360]6.6506,[361]6.6486,[362]6.6565,[363]6.6678,[364]6.6738,[365]6.6800,[366]6.6819,[367]6.6929,[368]6.6892,[369]6.6907,[370]6.6922,[371]6.6857,[372]6.6918,[373]6.6976,[374]6.6955,[375]6.6940,[376]6.7022,[377]6.6962,[378]6.6974,[379]6.7035,[380]6.6939,[381]6.6905,[382]6.6856,[383]6.6836,[384]6.6840,[385]6.6822,[386]6.6813,[387]6.6816,[388]6.6758,[389]6.6701,[390]6.6638,[391]6.6557,[392]6.6529,[393]6.6534,[394]6.6558,[395]6.6539,[396]6.6467,[397]6.6557,[398]6.6598,[399]6.6702,[400]6.6694,[401]6.6701,[402]6.6706,[403]6.6732,[404]6.6793,[405]6.6683,[406]6.6646,[407]6.6647,[408]6.6652,[409]6.6781,[410]6.6897,[411]6.7017,[412]6.7187,[413]6.7319,[414]6.7406,[415]6.7476,[416]6.7561,[417]6.7672,[418]6.7697,[419]6.7761,[420]6.7854,[421]6.7970,[422]6.8007,[423]6.8079,[424]6.8207,[425]6.8308,[426]6.8387,[427]6.8423,[428]6.8514,[429]6.8557,[430]6.8635,[431]6.8784,[432]6.8802,[433]6.8788,[434]6.8730,[435]6.8730,[436]6.8755,[437]6.8857,[438]6.8954,[439]6.8912,[440]6.8895,[441]6.8831,[442]6.8802,[443]6.8809,[444]6.8829,[445]6.8802,[446]6.8811,[447]6.8832,[448]6.8871,[449]6.8847,[450]6.8841,[451]6.8796,[452]6.8733,[453]6.8643,[454]6.8579,[455]6.8578,[456]6.8623,[457]6.8644,[458]6.8620,[459]6.8618,[460]6.8698,[461]6.8655,[462]6.8631,[463]6.8662,[464]6.8657,[465]6.8642,[466]6.8564,[467]6.8586,[468]6.8591,[469]6.8619,[470]6.8626,[471]6.8578,[472]6.8629,[473]6.8565,[474]6.8588,[475]6.8554,[476]6.8561,[477]6.8481,[478]6.8469,[479]6.8548,[480]6.8607,[481]6.8621,[482]6.8569,[483]6.8536,[484]6.8568,[485]6.8558,[486]6.8488,[487]6.8492,[488]6.8468,[489]6.8411,[490]6.8390,[491]6.8360,[492]6.8294,[493]6.8257,[494]6.8236,[495]6.8228,[496]6.8191,[497]6.8134,[498]6.8116,[499]6.8061,[500]6.7967,[501]6.7880,[502]6.7892,[503]6.7873,[504]6.7777,[505]6.7790,[506]6.7806,[507]6.7771,[508]6.7734,[509]6.7717,
[510]6.7750,[511]6.7811,[512]6.7847,[513]6.7874,[514]6.7947,[515]6.7885,[516]6.7874,[517]6.7884,[518]6.7876,[519]6.7900,[520]6.7920,[521]6.7937,[522]6.7953,[523]6.7954,[524]6.8017,[525]6.8047,[526]6.8057,[527]6.8079,[528]6.8026,[529]6.8051,[530]6.7994,[531]6.7978,[532]6.8046,[533]6.8084,[534]6.8061,[535]6.8101,[536]6.8043,[537]6.8016,[538]6.8073,[539]6.8080,[540]6.8123,[541]6.8142,[542]6.8147,[543]6.8169,[544]6.8180,[545]6.8166,[546]6.8171,[547]6.8123,[548]6.8059,[549]6.8058,[550]6.8032,[551]6.7988,[552]6.7972,[553]6.7924,[554]6.7894,[555]6.7866,[556]6.7858,[557]6.7887,[558]6.7852,[559]6.7856,[560]6.7835,[561]6.7842,[562]6.7818,[563]6.7808,[564]6.7860,[565]6.7882,[566]6.7879,[567]6.7853,[568]6.7853,[569]6.7822,[570]6.7854,[571]6.7860,[572]6.7865,[573]6.7865,[574]6.7829,[575]6.7816,[576]6.7811,[577]6.7778,[578]6.7755,[579]6.7751,[580]6.7677,[581]6.7637,[582]6.7634,[583]6.7635,[584]6.7630,[585]6.7563,[586]6.7496,[587]6.7503,[588]6.7555,[589]6.7622,[590]6.7651,[591]6.7657,[592]6.7647,[593]6.7611,[594]6.7620,[595]6.7595,[596]6.7636,[597]6.7606,[598]6.7572,[599]6.7593,[600]6.7579,[601]6.7563,[602]6.7597,[603]6.7628,[604]6.7645,[605]6.7670,[606]6.7681,[607]6.7671,[608]6.7631,[609]6.7631,[610]6.7687,[611]6.7669,[612]6.7689,[613]6.7652,[614]6.7594,[615]6.7506,[616]6.7538,[617]6.7459,[618]6.7394,[619]6.7332,[620]6.7181,[621]6.7112,[622]6.7089,[623]6.7105,[624]6.7107,[625]6.7115,[626]6.7104,[627]6.7136,[628]6.7136,[629]6.7137,[630]6.7170,[631]6.7235,[632]6.7290,[633]6.7270,[634]6.7302,[635]6.7294,[636]6.7265,[637]6.7236,[638]6.7265,[639]6.7229,[640]6.7237,[641]6.7239,[642]6.7309,[643]6.7328,[644]6.7346,[645]6.7328,[646]6.7373,[647]6.7336,[648]6.7350,[649]6.7353,[650]6.7393,[651]6.7443,[652]6.7446,[653]6.7485,[654]6.7419,[655]6.7409,
llama_print_timings: load time = 4488.53 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1516595.65 ms / 335360 tokens ( 4.52 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1547168.54 ms
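A quick sanity check on the Q2_4 13B numbers above: the 335360-token figure is just 655 perplexity chunks of batch_size 512, and the reported per-token time is the total prompt-eval time divided by that count.

```python
# Sanity-check on the Q2_4 13B log above: the token count comes from
# 655 perplexity chunks of batch_size 512, and the reported per-token
# time is the total prompt-eval time divided by that count.
n_chunks, batch_size = 655, 512
n_tokens = n_chunks * batch_size           # 335360 tokens
prompt_eval_ms = 1516595.65                # from llama_print_timings

ms_per_token = prompt_eval_ms / n_tokens
print(n_tokens, round(ms_per_token, 2))    # matches the logged 4.52 ms/token
```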
Q3_4, 13B
main: seed = 1682656187
llama.cpp: loading model from ../models/13B/q34.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 10 (mostly Q3_4)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 73.73 KB
llama_model_load_internal: mem required = 8681.78 MB (+ 1608.00 MB per state)
llama_init_from_file: kv self size = 400.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
2.75 seconds per pass - ETA 30 minutes
[1]3.9691,[2]4.3609,[3]5.1508,[4]5.6357,[5]5.8245,[6]5.7682,[7]5.8930,[8]6.0160,[9]6.2782,[10]6.4992,[11]6.7043,[12]6.7592,[13]6.7085,[14]6.8204,[15]7.0162,[16]6.6686,[17]6.5717,[18]6.5474,[19]6.2354,[20]6.1972,[21]6.1250,[22]5.9506,[23]5.9279,[24]5.8369,[25]5.8490,[26]5.6949,[27]5.5136,[28]5.4184,[29]5.3375,[30]5.1977,[31]5.1640,[32]5.1797,[33]5.1311,[34]5.1711,[35]5.1947,[36]5.2175,[37]5.2144,[38]5.2114,[39]5.2423,[40]5.2872,[41]5.3119,[42]5.3487,[43]5.3108,[44]5.3543,[45]5.3571,[46]5.3300,[47]5.3585,[48]5.3381,[49]5.3477,[50]5.3142,[51]5.3187,[52]5.3117,[53]5.3565,[54]5.3451,[55]5.3237,[56]5.3481,[57]5.3650,[58]5.3886,[59]5.4044,[60]5.4383,[61]5.4305,[62]5.4858,[63]5.5112,[64]5.5220,[65]5.5596,[66]5.5598,[67]5.5773,[68]5.5892,[69]5.6185,[70]5.6480,[71]5.6698,[72]5.7054,[73]5.7564,[74]5.7630,[75]5.7734,[76]5.7888,[77]5.8019,[78]5.7873,[79]5.8141,[80]5.8085,[81]5.8180,[82]5.8151,[83]5.7677,[84]5.7567,[85]5.7513,[86]5.7341,[87]5.6712,[88]5.6305,[89]5.6089,[90]5.5978,[91]5.6195,[92]5.6135,[93]5.6146,[94]5.6130,[95]5.6419,[96]5.6394,[97]5.6359,[98]5.6316,[99]5.6235,[100]5.6211,[101]5.6452,[102]5.6405,[103]5.6557,[104]5.6603,[105]5.6634,[106]5.6775,[107]5.6761,[108]5.6907,[109]5.6889,[110]5.6830,[111]5.7020,[112]5.7191,[113]5.7178,[114]5.7158,[115]5.7198,[116]5.7080,[117]5.7090,[118]5.7330,[119]5.7506,[120]5.7808,[121]5.7966,[122]5.8185,[123]5.8563,[124]5.8753,[125]5.8695,[126]5.9059,[127]5.9397,[128]5.9679,[129]5.9551,[130]5.9628,[131]5.9577,[132]5.9536,[133]5.9413,[134]5.9490,[135]5.9496,[136]5.9397,[137]5.9349,[138]5.9208,[139]5.9126,[140]5.9112,[141]5.8840,[142]5.8817,[143]5.8558,[144]5.8403,[145]5.8328,[146]5.8204,[147]5.8257,[148]5.8279,[149]5.8249,[150]5.8232,[151]5.8275,[152]5.8208,[153]5.8108,[154]5.8054,[155]5.8122,[156]5.8106,[157]5.8269,[158]5.8289,[159]5.8305,[160]5.8342,[161]5.8453,[162]5.8188,[163]5.8082,[164]5.7861,[165]5.7595,[166]5.7354,[167]5.7028,[168]5.6739,[169]5.6606,[170]5.6517,[171]5.6296,[172]5.6167,[173]5.6035,[174]5.5753,[175]5.5553,[176]5.5
420,[177]5.5254,[178]5.5038,[179]5.4904,[180]5.4827,[181]5.4657,[182]5.4485,[183]5.4360,[184]5.4348,[185]5.4271,[186]5.4282,[187]5.4332,[188]5.4297,[189]5.4473,[190]5.4471,[191]5.4648,[192]5.4785,[193]5.4948,[194]5.5065,[195]5.5265,[196]5.5385,[197]5.5577,[198]5.5713,[199]5.5728,[200]5.5747,[201]5.5688,[202]5.5830,[203]5.5901,[204]5.5849,[205]5.5946,[206]5.5999,[207]5.5954,[208]5.6013,[209]5.6056,[210]5.6114,[211]5.6216,[212]5.6280,[213]5.6369,[214]5.6402,[215]5.6434,[216]5.6548,[217]5.6719,[218]5.6855,[219]5.6862,[220]5.6827,[221]5.6775,[222]5.6773,[223]5.6701,[224]5.6627,[225]5.6593,[226]5.6796,[227]5.6871,[228]5.6945,[229]5.7014,[230]5.6982,[231]5.7139,[232]5.7031,[233]5.6877,[234]5.6733,[235]5.6521,[236]5.6465,[237]5.6370,[238]5.6397,[239]5.6280,[240]5.6186,[241]5.6219,[242]5.6239,[243]5.6226,[244]5.6122,[245]5.6090,[246]5.5987,[247]5.5889,[248]5.5823,[249]5.5786,[250]5.5821,[251]5.5736,[252]5.5693,[253]5.5598,[254]5.5562,[255]5.5464,[256]5.5294,[257]5.5194,[258]5.5122,[259]5.5121,[260]5.5040,[261]5.4993,[262]5.4950,[263]5.4898,[264]5.4684,[265]5.4679,[266]5.4647,[267]5.4583,[268]5.4651,[269]5.4649,[270]5.4662,[271]5.4723,[272]5.4756,[273]5.4771,[274]5.4788,[275]5.4854,[276]5.4918,[277]5.5050,[278]5.5138,[279]5.5224,[280]5.5257,[281]5.5354,[282]5.5409,[283]5.5541,[284]5.5634,[285]5.5703,[286]5.5834,[287]5.5802,[288]5.5859,[289]5.5799,[290]5.5657,[291]5.5528,[292]5.5389,[293]5.5267,[294]5.5278,[295]5.5279,[296]5.5330,[297]5.5319,[298]5.5333,[299]5.5313,[300]5.5220,[301]5.5223,[302]5.5154,[303]5.5067,[304]5.4995,[305]5.4974,[306]5.4861,[307]5.4896,[308]5.4904,[309]5.4765,[310]5.4726,[311]5.4684,[312]5.4703,[313]5.4646,[314]5.4629,[315]5.4493,[316]5.4465,[317]5.4335,[318]5.4164,[319]5.4272,[320]5.4385,[321]5.4432,[322]5.4399,[323]5.4338,[324]5.4321,[325]5.4421,[326]5.4438,[327]5.4446,[328]5.4478,[329]5.4528,[330]5.4549,[331]5.4651,[332]5.4612,[333]5.4687,[334]5.4637,[335]5.4582,[336]5.4602,[337]5.4589,[338]5.4584,[339]5.4544,[340]5.4515,[341]5.4582,[342]5.4613,[343
]5.4661,[344]5.4667,[345]5.4682,[346]5.4668,[347]5.4704,[348]5.4741,[349]5.4761,[350]5.4743,[351]5.4755,[352]5.4756,[353]5.4707,[354]5.4717,[355]5.4764,[356]5.4792,[357]5.4759,[358]5.4840,[359]5.4864,[360]5.4827,[361]5.4826,[362]5.4892,[363]5.5000,[364]5.5057,[365]5.5100,[366]5.5115,[367]5.5204,[368]5.5175,[369]5.5187,[370]5.5204,[371]5.5161,[372]5.5205,[373]5.5250,[374]5.5229,[375]5.5222,[376]5.5284,[377]5.5247,[378]5.5269,[379]5.5311,[380]5.5240,[381]5.5207,[382]5.5167,[383]5.5148,[384]5.5146,[385]5.5136,[386]5.5127,[387]5.5122,[388]5.5088,[389]5.5049,[390]5.4993,[391]5.4931,[392]5.4895,[393]5.4891,[394]5.4920,[395]5.4913,[396]5.4857,[397]5.4919,[398]5.4961,[399]5.5033,[400]5.5021,[401]5.5026,[402]5.5038,[403]5.5060,[404]5.5116,[405]5.4965,[406]5.4924,[407]5.4922,[408]5.4934,[409]5.5046,[410]5.5136,[411]5.5238,[412]5.5381,[413]5.5484,[414]5.5549,[415]5.5611,[416]5.5685,[417]5.5786,[418]5.5810,[419]5.5860,[420]5.5939,[421]5.6041,[422]5.6074,[423]5.6130,[424]5.6227,[425]5.6304,[426]5.6365,[427]5.6407,[428]5.6480,[429]5.6515,[430]5.6580,[431]5.6710,[432]5.6739,[433]5.6728,[434]5.6689,[435]5.6702,[436]5.6729,[437]5.6814,[438]5.6891,[439]5.6861,[440]5.6852,[441]5.6803,[442]5.6787,[443]5.6799,[444]5.6813,[445]5.6802,[446]5.6823,[447]5.6846,[448]5.6880,[449]5.6865,[450]5.6875,[451]5.6846,[452]5.6697,[453]5.6600,[454]5.6546,[455]5.6552,[456]5.6593,[457]5.6607,[458]5.6587,[459]5.6584,[460]5.6658,[461]5.6619,[462]5.6586,[463]5.6573,[464]5.6571,[465]5.6549,[466]5.6477,[467]5.6467,[468]5.6446,[469]5.6459,[470]5.6450,[471]5.6401,[472]5.6416,[473]5.6368,[474]5.6356,[475]5.6290,[476]5.6277,[477]5.6199,[478]5.6175,[479]5.6193,[480]5.6221,[481]5.6226,[482]5.6179,[483]5.6137,[484]5.6148,[485]5.6092,[486]5.6025,[487]5.6015,[488]5.5988,[489]5.5934,[490]5.5903,[491]5.5869,[492]5.5803,[493]5.5774,[494]5.5757,[495]5.5735,[496]5.5693,[497]5.5634,[498]5.5607,[499]5.5572,[500]5.5488,[501]5.5419,[502]5.5411,[503]5.5401,[504]5.5326,[505]5.5329,[506]5.5339,[507]5.5286,[508]5.5248,[509]5.5250,
[510]5.5271,[511]5.5315,[512]5.5355,[513]5.5382,[514]5.5438,[515]5.5399,[516]5.5387,[517]5.5387,[518]5.5384,[519]5.5406,[520]5.5419,[521]5.5431,[522]5.5447,[523]5.5454,[524]5.5508,[525]5.5537,[526]5.5542,[527]5.5559,[528]5.5503,[529]5.5514,[530]5.5475,[531]5.5468,[532]5.5516,[533]5.5544,[534]5.5528,[535]5.5551,[536]5.5504,[537]5.5484,[538]5.5534,[539]5.5542,[540]5.5561,[541]5.5560,[542]5.5574,[543]5.5592,[544]5.5606,[545]5.5593,[546]5.5596,[547]5.5561,[548]5.5520,[549]5.5521,[550]5.5498,[551]5.5469,[552]5.5450,[553]5.5416,[554]5.5393,[555]5.5371,[556]5.5361,[557]5.5376,[558]5.5341,[559]5.5346,[560]5.5337,[561]5.5339,[562]5.5311,[563]5.5311,[564]5.5352,[565]5.5365,[566]5.5368,[567]5.5350,[568]5.5358,[569]5.5341,[570]5.5368,[571]5.5380,[572]5.5386,[573]5.5391,[574]5.5360,[575]5.5347,[576]5.5345,[577]5.5326,[578]5.5307,[579]5.5309,[580]5.5253,[581]5.5223,[582]5.5225,[583]5.5233,[584]5.5234,[585]5.5177,[586]5.5119,[587]5.5122,[588]5.5165,[589]5.5217,[590]5.5246,[591]5.5262,[592]5.5250,[593]5.5212,[594]5.5226,[595]5.5208,[596]5.5247,[597]5.5228,[598]5.5199,[599]5.5227,[600]5.5216,[601]5.5204,[602]5.5213,[603]5.5241,[604]5.5249,[605]5.5277,[606]5.5292,[607]5.5277,[608]5.5247,[609]5.5255,[610]5.5296,[611]5.5285,[612]5.5303,[613]5.5274,[614]5.5234,[615]5.5171,[616]5.5197,[617]5.5144,[618]5.5095,[619]5.5049,[620]5.4935,[621]5.4880,[622]5.4859,[623]5.4872,[624]5.4875,[625]5.4881,[626]5.4876,[627]5.4904,[628]5.4910,[629]5.4916,[630]5.4944,[631]5.4990,[632]5.5038,[633]5.5026,[634]5.5056,[635]5.5053,[636]5.5018,[637]5.4981,[638]5.5003,[639]5.4969,[640]5.4974,[641]5.4977,[642]5.5030,[643]5.5049,[644]5.5073,[645]5.5057,[646]5.5094,[647]5.5046,[648]5.5059,[649]5.5062,[650]5.5095,[651]5.5135,[652]5.5138,[653]5.5176,[654]5.5119,[655]5.5110,
llama_print_timings: load time = 5957.47 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1701223.56 ms / 335360 tokens ( 5.07 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1733067.74 ms
Q4_4, 13B
main: seed = 1682790225
llama.cpp: loading model from ../models/13B/q44.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32000
llama_model_load_internal: n_ctx = 512
llama_model_load_internal: n_embd = 5120
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 40
llama_model_load_internal: n_layer = 40
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 14 (mostly Q4_4)
llama_model_load_internal: n_ff = 13824
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 13B
llama_model_load_internal: ggml ctx size = 73.73 KB
llama_model_load_internal: mem required = 10213.81 MB (+ 1608.00 MB per state)
llama_init_from_file: kv self size = 400.00 MB
system_info: n_threads = 16 / 32 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 |
perplexity : calculating perplexity over 655 chunks, batch_size=512
2.72 seconds per pass - ETA 29 minutes
[1]3.7371,[2]4.2097,[3]5.0064,[4]5.3764,[5]5.5586,[6]5.4949,[7]5.6305,[8]5.7337,[9]5.9985,[10]6.2164,[11]6.4018,[12]6.4542,[13]6.4223,[14]6.5122,[15]6.7109,[16]6.3997,[17]6.3208,[18]6.2951,[19]6.0050,[20]5.9874,[21]5.9108,[22]5.7379,[23]5.7112,[24]5.6188,[25]5.6312,[26]5.4855,[27]5.3108,[28]5.2153,[29]5.1408,[30]5.0041,[31]4.9644,[32]4.9804,[33]4.9376,[34]4.9779,[35]4.9952,[36]5.0181,[37]5.0117,[38]5.0095,[39]5.0362,[40]5.0770,[41]5.1000,[42]5.1361,[43]5.0994,[44]5.1415,[45]5.1435,[46]5.1173,[47]5.1457,[48]5.1302,[49]5.1325,[50]5.1018,[51]5.1096,[52]5.1021,[53]5.1478,[54]5.1379,[55]5.1189,[56]5.1389,[57]5.1573,[58]5.1792,[59]5.1969,[60]5.2336,[61]5.2273,[62]5.2826,[63]5.3079,[64]5.3183,[65]5.3554,[66]5.3534,[67]5.3714,[68]5.3824,[69]5.4101,[70]5.4406,[71]5.4614,[72]5.4958,[73]5.5438,[74]5.5508,[75]5.5605,[76]5.5748,[77]5.5862,[78]5.5737,[79]5.5997,[80]5.5943,[81]5.6018,[82]5.5978,[83]5.5517,[84]5.5406,[85]5.5338,[86]5.5187,[87]5.4533,[88]5.4109,[89]5.3896,[90]5.3801,[91]5.4009,[92]5.3968,[93]5.3979,[94]5.3971,[95]5.4232,[96]5.4202,[97]5.4171,[98]5.4136,[99]5.4063,[100]5.4035,[101]5.4261,[102]5.4222,[103]5.4380,[104]5.4423,[105]5.4440,[106]5.4578,[107]5.4561,[108]5.4715,[109]5.4708,[110]5.4650,[111]5.4836,[112]5.4995,[113]5.4999,[114]5.4986,[115]5.5033,[116]5.4916,[117]5.4909,[118]5.5141,[119]5.5324,[120]5.5620,[121]5.5776,[122]5.5990,[123]5.6352,[124]5.6525,[125]5.6471,[126]5.6825,[127]5.7151,[128]5.7425,[129]5.7312,[130]5.7397,[131]5.7352,[132]5.7318,[133]5.7202,[134]5.7284,[135]5.7279,[136]5.7190,[137]5.7155,[138]5.7017,[139]5.6938,[140]5.6922,[141]5.6652,[142]5.6613,[143]5.6361,[144]5.6211,[145]5.6124,[146]5.6018,[147]5.6069,[148]5.6100,[149]5.6069,[150]5.6060,[151]5.6105,[152]5.6049,[153]5.5951,[154]5.5893,[155]5.5957,[156]5.5933,[157]5.6083,[158]5.6106,[159]5.6112,[160]5.6147,[161]5.6256,[162]5.6003,[163]5.5907,[164]5.5704,[165]5.5452,[166]5.5225,[167]5.4905,[168]5.4638,[169]5.4505,[170]5.4419,[171]5.4213,[172]5.4093,[173]5.3965,[174]5.3701,[175]5.3500,[176]5.3
366,[177]5.3201,[178]5.3003,[179]5.2871,[180]5.2796,[181]5.2638,[182]5.2477,[183]5.2357,[184]5.2347,[185]5.2275,[186]5.2280,[187]5.2337,[188]5.2312,[189]5.2476,[190]5.2479,[191]5.2649,[192]5.2784,[193]5.2932,[194]5.3039,[195]5.3229,[196]5.3343,[197]5.3533,[198]5.3668,[199]5.3688,[200]5.3693,[201]5.3628,[202]5.3753,[203]5.3811,[204]5.3766,[205]5.3852,[206]5.3904,[207]5.3867,[208]5.3925,[209]5.3957,[210]5.4014,[211]5.4117,[212]5.4178,[213]5.4265,[214]5.4290,[215]5.4325,[216]5.4442,[217]5.4606,[218]5.4739,[219]5.4737,[220]5.4708,[221]5.4662,[222]5.4661,[223]5.4595,[224]5.4525,[225]5.4492,[226]5.4689,[227]5.4743,[228]5.4817,[229]5.4885,[230]5.4849,[231]5.5003,[232]5.4900,[233]5.4753,[234]5.4608,[235]5.4390,[236]5.4339,[237]5.4251,[238]5.4284,[239]5.4171,[240]5.4082,[241]5.4115,[242]5.4126,[243]5.4118,[244]5.4020,[245]5.3983,[246]5.3883,[247]5.3786,[248]5.3726,[249]5.3694,[250]5.3727,[251]5.3645,[252]5.3597,[253]5.3509,[254]5.3468,[255]5.3376,[256]5.3213,[257]5.3114,[258]5.3047,[259]5.3035,[260]5.2952,[261]5.2901,[262]5.2863,[263]5.2813,[264]5.2584,[265]5.2582,[266]5.2553,[267]5.2490,[268]5.2555,[269]5.2550,[270]5.2558,[271]5.2620,[272]5.2649,[273]5.2665,[274]5.2676,[275]5.2736,[276]5.2795,[277]5.2917,[278]5.3000,[279]5.3081,[280]5.3118,[281]5.3217,[282]5.3270,[283]5.3396,[284]5.3483,[285]5.3564,[286]5.3687,[287]5.3654,[288]5.3710,[289]5.3649,[290]5.3511,[291]5.3385,[292]5.3253,[293]5.3133,[294]5.3137,[295]5.3138,[296]5.3186,[297]5.3176,[298]5.3198,[299]5.3175,[300]5.3089,[301]5.3093,[302]5.3030,[303]5.2950,[304]5.2876,[305]5.2848,[306]5.2740,[307]5.2769,[308]5.2776,[309]5.2648,[310]5.2621,[311]5.2579,[312]5.2592,[313]5.2536,[314]5.2522,[315]5.2396,[316]5.2361,[317]5.2236,[318]5.2077,[319]5.2178,[320]5.2289,[321]5.2333,[322]5.2301,[323]5.2244,[324]5.2226,[325]5.2319,[326]5.2336,[327]5.2343,[328]5.2379,[329]5.2426,[330]5.2446,[331]5.2548,[332]5.2511,[333]5.2589,[334]5.2544,[335]5.2495,[336]5.2518,[337]5.2507,[338]5.2503,[339]5.2460,[340]5.2434,[341]5.2501,[342]5.2534,[343
]5.2578,[344]5.2583,[345]5.2597,[346]5.2581,[347]5.2620,[348]5.2656,[349]5.2677,[350]5.2659,[351]5.2674,[352]5.2677,[353]5.2628,[354]5.2634,[355]5.2684,[356]5.2713,[357]5.2683,[358]5.2761,[359]5.2783,[360]5.2749,[361]5.2747,[362]5.2813,[363]5.2920,[364]5.2969,[365]5.3006,[366]5.3024,[367]5.3111,[368]5.3090,[369]5.3105,[370]5.3126,[371]5.3086,[372]5.3133,[373]5.3173,[374]5.3155,[375]5.3151,[376]5.3206,[377]5.3171,[378]5.3197,[379]5.3236,[380]5.3168,[381]5.3140,[382]5.3098,[383]5.3080,[384]5.3080,[385]5.3068,[386]5.3057,[387]5.3054,[388]5.3025,[389]5.2988,[390]5.2936,[391]5.2880,[392]5.2844,[393]5.2841,[394]5.2872,[395]5.2864,[396]5.2812,[397]5.2879,[398]5.2922,[399]5.2994,[400]5.2985,[401]5.2992,[402]5.3002,[403]5.3026,[404]5.3081,[405]5.2931,[406]5.2890,[407]5.2878,[408]5.2889,[409]5.3000,[410]5.3092,[411]5.3186,[412]5.3326,[413]5.3428,[414]5.3491,[415]5.3550,[416]5.3624,[417]5.3719,[418]5.3740,[419]5.3788,[420]5.3865,[421]5.3960,[422]5.3994,[423]5.4049,[424]5.4137,[425]5.4214,[426]5.4276,[427]5.4316,[428]5.4389,[429]5.4427,[430]5.4488,[431]5.4614,[432]5.4646,[433]5.4638,[434]5.4604,[435]5.4616,[436]5.4644,[437]5.4727,[438]5.4801,[439]5.4774,[440]5.4765,[441]5.4720,[442]5.4709,[443]5.4721,[444]5.4739,[445]5.4731,[446]5.4750,[447]5.4774,[448]5.4805,[449]5.4789,[450]5.4800,[451]5.4771,[452]5.4617,[453]5.4525,[454]5.4469,[455]5.4473,[456]5.4512,[457]5.4524,[458]5.4506,[459]5.4501,[460]5.4573,[461]5.4533,[462]5.4495,[463]5.4477,[464]5.4473,[465]5.4452,[466]5.4377,[467]5.4366,[468]5.4347,[469]5.4357,[470]5.4346,[471]5.4296,[472]5.4303,[473]5.4257,[474]5.4247,[475]5.4179,[476]5.4152,[477]5.4071,[478]5.4045,[479]5.4049,[480]5.4075,[481]5.4078,[482]5.4032,[483]5.3992,[484]5.4000,[485]5.3932,[486]5.3868,[487]5.3860,[488]5.3837,[489]5.3786,[490]5.3753,[491]5.3719,[492]5.3651,[493]5.3621,[494]5.3605,[495]5.3584,[496]5.3546,[497]5.3485,[498]5.3458,[499]5.3424,[500]5.3344,[501]5.3273,[502]5.3263,[503]5.3252,[504]5.3175,[505]5.3174,[506]5.3180,[507]5.3127,[508]5.3091,[509]5.3096,
[510]5.3118,[511]5.3160,[512]5.3199,[513]5.3222,[514]5.3277,[515]5.3237,[516]5.3227,[517]5.3225,[518]5.3226,[519]5.3247,[520]5.3260,[521]5.3270,[522]5.3283,[523]5.3290,[524]5.3345,[525]5.3372,[526]5.3377,[527]5.3393,[528]5.3339,[529]5.3348,[530]5.3311,[531]5.3306,[532]5.3354,[533]5.3382,[534]5.3363,[535]5.3384,[536]5.3341,[537]5.3323,[538]5.3373,[539]5.3381,[540]5.3398,[541]5.3396,[542]5.3409,[543]5.3431,[544]5.3444,[545]5.3433,[546]5.3435,[547]5.3403,[548]5.3362,[549]5.3363,[550]5.3343,[551]5.3318,[552]5.3299,[553]5.3270,[554]5.3247,[555]5.3228,[556]5.3221,[557]5.3237,[558]5.3205,[559]5.3207,[560]5.3194,[561]5.3195,[562]5.3168,[563]5.3166,[564]5.3209,[565]5.3219,[566]5.3225,[567]5.3206,[568]5.3216,[569]5.3201,[570]5.3228,[571]5.3241,[572]5.3251,[573]5.3254,[574]5.3226,[575]5.3207,[576]5.3201,[577]5.3185,[578]5.3166,[579]5.3164,[580]5.3112,[581]5.3082,[582]5.3083,[583]5.3092,[584]5.3098,[585]5.3040,[586]5.2987,[587]5.2987,[588]5.3031,[589]5.3080,[590]5.3109,[591]5.3125,[592]5.3114,[593]5.3075,[594]5.3089,[595]5.3073,[596]5.3114,[597]5.3095,[598]5.3062,[599]5.3088,[600]5.3079,[601]5.3068,[602]5.3067,[603]5.3094,[604]5.3099,[605]5.3124,[606]5.3137,[607]5.3123,[608]5.3095,[609]5.3104,[610]5.3144,[611]5.3130,[612]5.3151,[613]5.3122,[614]5.3083,[615]5.3025,[616]5.3051,[617]5.3002,[618]5.2960,[619]5.2916,[620]5.2808,[621]5.2758,[622]5.2741,[623]5.2754,[624]5.2758,[625]5.2766,[626]5.2763,[627]5.2790,[628]5.2798,[629]5.2802,[630]5.2832,[631]5.2876,[632]5.2923,[633]5.2911,[634]5.2940,[635]5.2936,[636]5.2901,[637]5.2864,[638]5.2885,[639]5.2854,[640]5.2859,[641]5.2863,[642]5.2913,[643]5.2930,[644]5.2947,[645]5.2933,[646]5.2967,[647]5.2916,[648]5.2927,[649]5.2929,[650]5.2959,[651]5.3000,[652]5.3004,[653]5.3042,[654]5.2988,[655]5.2981,
llama_print_timings: load time = 6350.49 ms
llama_print_timings: sample time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: prompt eval time = 1667989.72 ms / 335360 tokens ( 4.97 ms per token)
llama_print_timings: eval time = 0.00 ms / 1 runs ( 0.00 ms per run)
llama_print_timings: total time = 1699934.63 ms
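From the three "mem required" figures above, one can get a rough upper bound on the effective bits per weight of each 13B variant. This is only a ballpark, assuming ~13.0e9 parameters, and it over-counts because "mem required" also covers runtime buffers and tensors kept at higher precision (e.g. output and embedding layers):

```python
# Rough upper-bound estimate of effective bits per weight for each
# 13B run above, taken from the "mem required" figures in the logs.
# Over-counts: "mem required" also includes runtime buffers and
# tensors stored at higher precision, and 13.0e9 is approximate.
N_PARAMS = 13.0e9

def bits_per_weight(mem_mb):
    return mem_mb * 1024**2 * 8 / N_PARAMS

for name, mem_mb in [("Q2_4", 7149.75), ("Q3_4", 8681.78), ("Q4_4", 10213.81)]:
    print(f"{name}: ~{bits_per_weight(mem_mb):.2f} bits/weight")
```

The spacing between the three variants (roughly one extra bit per weight per step) lines up with the extra quantization bit each format adds.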