amo33/README.md

Nice to meet you! 👋

🔭 I’m currently working on model serving optimization and training acceleration.

Systems for AI are my main research topic. vLLM, FlashAttention, and Megatron-LM are on my watch list, LOL.

I'm also dissecting best-practice CUDA implementations, and I'm really interested in parallel programming and low-level (OS-level) programming.

📫 How to reach me: email me at [email protected] :) ...

CUDA, Docker, PyTorch, ONNX, C++

Pinned

  1. executorch Public

    Forked from pytorch/executorch

    On-device AI across mobile, embedded and edge for PyTorch

    C++

  2. flatflow Public

    Forked from 9rum/flatflow

    A learned system for parallel training of deep neural networks

    Python

  3. NeMo Public

    Forked from NVIDIA/NeMo

    A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

    Python

  4. onnx-mlir Public

    Forked from onnx/onnx-mlir

    Representation and Reference Lowering of ONNX Models in MLIR Compiler Infrastructure

    C++

  5. onnxruntime Public

    Forked from microsoft/onnxruntime

    ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

    C++