Windows Makefile review comments addressed. Remove extra commands and CI changes. #256
Closed
rosslwheeler wants to merge 39 commits into karpathy:master from rosslwheeler:makefile_windows_fixes
…tensors to put the layernorms at the end. the training loop seems to work ok, the tests pass, and the loss and optimization look ok, but the gradients don't match, which can't be right. so there is a bug, but it's a bit too late in the day for me to debug right now. creating a PR and going to sleep, will fix tomorrow
Tested locally and reduced compilation time by 200ms. Unfortunately for me, upgrading to 12.4 made my compilation times 2x slower, but at least this can make it a bit faster
Addressed the review comments. Remove the extra rename Cuda command. Tested against CI, WSL2, Windows 11 and Windows Server 2022.
rosslwheeler force-pushed the makefile_windows_fixes branch from e4ce057 to 25240ec on April 27, 2024 21:41
…ded them in the old order, so yeah...
… for some tensors and i don't exactly know why sad
…g around precisions
…ested via defines
…fp32 or bf16 or fp16. fp16 will error, though
load bf16 directly, and some "quality of life" handling of fp32/fp16/bf16 precisions
…prone, but i think it is done. had to bump versions on all .bin files, invalidating the previous files. re-run the python training script to re-export the new version files. let's not do too much of things like this in the future lol. actually, fun fact: i had a chance to do the padded vocab really really early in the history of llm.c development, and chose not to do it, thinking i'll just do it later. i should have done it. such is life, you make mistakes, you accumulate scar tissue, and you learn, and you become better, faster, stronger. this is the mindset one must have to lead a happy and fulfilling life. it's not important that you are perfect at any point in time, it's only important that you keep improving, every day.
…evious PR, should satisfy the CI now
…llm.c into lancerts-encoder_forward-float4
Enable multithreading in nvcc
Updating the CI to build different precisions
Separates out common error-checking wrapper utils that are broadly useful across all files
Addressed the review comments. Remove the extra rename Cuda command. Tested against CI, WSL2, Windows 11 and Windows Server 2022.
…ler/llm.c into makefile_windows_fixes
Addressed the review comments. Removed the extra rename Cuda command. Tested against CI, WSL2, Windows 11 and Windows Server 2022.
@Ricardicus - please review and let me know if you have any comments. Tried to incorporate our discussed changes. This should be cleaner now. Thanks.