
xujz18/README.md

Hi, welcome to my GitHub 👋

I am Jiazheng Xu, a third-year PhD student at Tsinghua University.

  • 🔭 Interested in multimodal generative models, especially RLHF and alignment. Find my up-to-date publication list on Google Scholar!
  • 🌱 Some works I am proud to have led on RLHF for multimodal generative models:
    • ImageReward (NeurIPS'23): the first general-purpose text-to-image human preference reward model (RM) for RLHF, outperforming CLIP/BLIP/Aesthetic by 30% in terms of human preference prediction.
    • VisionReward: a fine-grained and multi-dimensional reward model for image and video generation, outperforming VideoScore by 17.2% and enabling multi-objective optimization.
  • 🌱 I'm also honored to work with the team on multimodal foundation models:
    • CogVLM (NeurIPS'24): a powerful open-source visual language model (VLM), which achieves state-of-the-art performance on 10 classic cross-modal benchmarks.
    • CogAgent (CVPR'24): a visual agent that can return a plan, the next action, and specific operations with coordinates for any given task on any GUI screenshot, enhancing GUI-related question-answering capabilities.
    • CogVideoX: a large-scale diffusion transformer model designed for generating videos from text prompts.
  • 💬 Feel free to drop me an email for:
    • Any form of collaboration
    • Any issues with my work or code
    • Interesting ideas to discuss, or just a chat

Pinned

  1. THUDM/VisionReward Public

    VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

    Python 62

  2. THUDM/ImageReward Public

    [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation

    Python 1.2k 65

  3. THUDM/SwissArmyTransformer Public

    SwissArmyTransformer is a flexible and powerful library to develop your own Transformer variants.

    Python 1k 97

  4. THUDM/CogView Public

    Text-to-image generation. The repo for the NeurIPS 2021 paper "CogView: Mastering Text-to-Image Generation via Transformers".

    Python 1.7k 176