Windows Makefile review comments addressed. Remove extra commands and CI changes. #256

rosslwheeler · 2024-04-26T06:48:57Z

Addressed the review comments. Removed the extra rename Cuda command. Tested against CI, WSL2, Windows 11 and Windows Server 2022.

@Ricardicus - please review and let me know if you have any comments. Tried to incorporate our discussed changes. This should be cleaner now. Thanks.

…tensors to put the layernorms at the end. the training loop seems to work ok, and the tests pass and the loss and optimization looks ok, but the gradients don't match. which can't be right. so there is a bug, but it's a bit too late in the day for me to debug right now, creating a PR and going to sleep, will fix tomorrow

…prone, but i think it is done. had to bump versions on all .bin files, invalidating the previous files. re-run the python training script to re-export the new version files. let's not do too much of things like this in the future lol. actually, fun fact i had a chance to do the padded vocab really really early in the history of llm.c development, and chose not to do it, thinking i'll just do it later. i should have done it. such is life, you make mistakes, you accumulate scar tissue, and you learn, and you become better, faster, stronger. this is the mindset one must have to lead a happy and fulfilling life. it's not important that you are perfect at any point in time, it's only important that you keep improving, every day.

rosslwheeler mentioned this pull request Apr 26, 2024

Windows changes for Makefile #236

Closed

karpathy and others added 9 commits April 27, 2024 00:54

i think i am making things cleaner, but i am not fixing the problem

09d935c

i think github copilot betrayed me on this index here, i cant remember

d4a642b

fix dumb bug. i'll blame github copilot but i can't remember

e067a27

tweak the tolerances until we pass lol

9d6fd30

print more in the comparison

a58b8d5

Enable multithreading in nvcc

2954d90

Tested locally and reduced compilation time by 200ms. Unfortunately for me, upgrading to 12.4 made my compilation times 2x slower, but at least this can make it a bit faster.

Addressed review comments. Remove extra commands and CI changes.

5ed4364

Addressed the review comments. Remove the extra rename Cuda command. Tested against CI, WSL2, Windows 11 and Windows Server 2022.

Resolving conflicts in Makefile

25240ec

rosslwheeler force-pushed the makefile_windows_fixes branch from e4ce057 to 25240ec Compare April 27, 2024 21:41

karpathy and others added 19 commits April 27, 2024 23:17

fix a really bad bug in how i was checking the gradients, where i loaded them in the old order, so yeah...

0062707

bring back original ordering. i also had to bump the thresholds by 3X for some tensors and i don't exactly know why sad

9a91b40

adjust comment

82d7907

Include the float4 kernel

cfccd82

amend the float4 kernel

61c5c05

amend the float4 kernel

1c7d23a

allow user to make different precisions, add prints and error handling around precisions

4f7d8d9

reshuffle the ifdefs to make bf16 the default if no PRECISION is requested via defines

a3f5ad9

profile and test only use bf16. but the train script can be run with fp32 or bf16 or fp16. fp16 will error, though

9d70d9a

Merge pull request karpathy#265 from karpathy/feature/load_bf16

d95b8d8

load bf16 directly, and some "quality of life" handling of fp32/fp16/bf16 precisions

make padded vocab fixes in the .c code as well, i missed it in the previous PR, should satisfy the CI now

b7972ff

Merge branch 'encoder_forward-float4' of https://github.com/lancerts/llm.c into lancerts-encoder_forward-float4

4b6a532

incorporate faster encoder_forward kernel to fp32 CUDA version

327eef3

Merge branch 'lancerts-encoder_forward-float4'

4b6f68a

Merge pull request karpathy#269 from ChrisDryden/patch-3

10aa24e

Enable multithreading in nvcc

Updating the CI to build different precisions

b522333

as promised, cleanup enabled by padding :)

ca48791

and even more cleanup

4c295c7

karpathy and others added 11 commits April 28, 2024 20:20

add small comment on -t=0

49228b0

Merge branch 'cleanup' of https://github.com/ngc92/llm.c into ngc92-cleanup

5185656

Merge branch 'ngc92-cleanup'

66a92c6

Merge pull request karpathy#279 from Ricardicus/prec-ci

18b41b4

Updating the CI to build different precisions

moved checked helper functions into a separate file

4a3c278

Merge branch 'split-file' of https://github.com/ngc92/llm.c into ngc92-split-file

b1c80e9

add comments pointing to the definition of the utils functions

c20497c

Merge branch 'ngc92-split-file'

50acc12

Separates out common error-checking wrapper utils that are broadly useful across all files

Addressed review comments. Remove extra commands and CI changes.

dc0295c

Addressed the review comments. Remove the extra rename Cuda command. Tested against CI, WSL2, Windows 11 and Windows Server 2022.

Resolving conflicts in Makefile

f265f60

Merge branch 'makefile_windows_fixes' of https://github.com/rosslwheeler/llm.c into makefile_windows_fixes

dd9bb3e

rosslwheeler closed this Apr 29, 2024

rosslwheeler deleted the makefile_windows_fixes branch April 29, 2024 09:22
