Hi! I am Yuanjun Chai 柴源君 (pronounced Y-wen Joon, Ch-eye), aka Allen. I am currently an MSEE student at the University of Washington, Seattle (Go Huskies!).

I graduated with highest honors from Xidian University with a bachelor’s degree. My thesis on image inpainting received invaluable support from Chao Dong of SIAT-CAS. I have also been lucky to collaborate with Chao Dong and Yu Qiao from CAS on image and video super-resolution, Jason Cheung from HKU on AI healthcare, and Yue Gao from Tsinghua University on computer vision.

Previously, I worked as a machine learning engineer for over 3 years at IT companies such as VMware AI Lab, focusing on LLMs, agents, and RAG. Before that, I worked at YeahMobi, an affiliate of Alibaba Group, as a machine learning scientist. There I was responsible for all technical development of the AIGC platform Kreado AI, including AI Video Creation, AI model, and custom clone services.

Research Interests:

  • Computer Vision: low- and high-level vision, text-to-image, Vision-Language Model (VLM)
  • NLP: Language Model, LLM diversity
  • Embodiment: sim2real, diffusion policy, Vision-Language-Action Model (VLA)

Beyond research, I am also eager to turn new technologies into products that make people’s lives better. To that end, I co-founded a start-up, INGREM Inc., to help people with high-level paraplegia use computers through a precise eye-control platform.

🔥 News

  • 2025.04:  🔥🔥 Our arXiv paper DiffPure-VLM on safeguarding Vision-Language Models has been released!
  • 2024.09:  🥰🥰 Heading to the University of Washington! I am so excited to start my new research journey at UW!
  • 2022.07:  🎉🎉 Thrilled to join VMware as an MLE! We work on interesting projects on our own LLM platform, like h2oGPT (⭐️8k+).
  • 2021.04:  👏👏 Glad to receive a fully-funded PhD offer from the University of Hong Kong (HKU)!
  • 2021.01:  🥰🥰 Our eye-control platform has helped more than 300 people with high-level paraplegia!
  • 2020.08:  🎉🎉 Our IKC (CVPR) project on real-world super-resolution has earned ⭐️200+.

📝 Publications

Vision-Language Model

arXiv

Safeguarding Vision-Language Models: Mitigating Vulnerabilities to Gaussian Noise in Perturbation-based Attacks

Jiawei Wang, Yushen Zuo, Yuanjun Chai, Zhendong Liu, Yicheng Fu, Yichun Feng, Kin-Man Lam

Project | GitHub

  • We propose the Robust-VLGuard dataset and the DiffPure-VLM defense framework to address the vulnerability of vision-language models (VLMs) to Gaussian noise and adversarial perturbation attacks. Fine-tuning with Gaussian noise augmentation significantly reduces the attack success rate of VLMs on datasets such as CIFAR-10 and ImageNet (for example, the attack success rate of MiniGPT-4 on the RealToxicityPrompts benchmark dropped from 44.1% to 16.5%). We then use the distribution-shifting capability of the diffusion model DiffPure to convert adversarial noise into Gaussian noise, which further improves the defense, especially under strong attacks (for example, the attack success rate of InternVL2 dropped from 57.3% to 36.1% when ϵ=64/255).

👁️ Computer Vision

CVPR

IKC: Blind Super-Resolution With Iterative Kernel Correction

Jinjin Gu, Hannan Lu, Wangmeng Zuo, Chao Dong

Project

  • We propose an Iterative Kernel Correction (IKC) method for blur kernel estimation in the blind SR problem, where the blur kernels are unknown. We observe that kernel mismatch produces regular artifacts (either over-sharpening or over-smoothing), which can be used to correct inaccurate blur kernels. We therefore introduce an iterative correction scheme, IKC, that achieves better results than direct kernel estimation. We further propose an effective SR network architecture, named SFTMD, that uses spatial feature transform (SFT) layers to handle multiple blur kernels.

🖥️ Innovative Products

🧑‍🎨 AIGC (Generative Model)


Kreado AI: AIGC Platform for Marketing Content Generation

Yuanjun Chai and YeahMobi Inc.

  • We propose a new AIGC platform for marketing content generation, named Kreado AI. Kreado AI is a hybrid worldwide AIGC platform that combines many AIGC functions:
    • Virtual Avatar (talking-face generation, speech synthesis, LLM)
    • AI model (text-to-image, LoRA, ControlNet)
    • Custom clone services (image-to-video, voice clone)
    • Many AI tools and AI property
  • I mainly focus on improving the Virtual Avatar and AI model algorithms, and I collaborate with the system architect on improving the overall architecture. Our users span Europe, Africa, Southeast Asia, and the Americas, with quarterly revenue exceeding US$1 million.

🧙‍♂️ RAG-based LLM


h2oGPT: RAG-based LLM Platform for AI Cloud Native and Private AI

Yuanjun Chai

Project

  • We develop a new RAG-based LLM platform for AI cloud native and private AI. The platform can leverage diverse LLMs with extended data sources such as PDFs, code bases, datasets, and internet links. I am responsible for all RAG-based LLM algorithm development as well as industrial deployment. Functionality includes:
    • QA and chat with extended dataset extraction
    • Content summarization
    • Content generation
    • AI agent
  • As a next step, we plan to research multi-modal LLMs for AI cloud native.

👨‍⚕️ AI healthcare & Charity


Face Control: Fine Facial Control Platform for High-paraplegia Disabled People

Yuanjun Chai, Ingrem Inc.

  • I co-founded a start-up, Ingrem, with other hardcore guys. We aim to build a complete bed for the daily living and entertainment of people with high-level paraplegia. I am responsible for developing the software, an eye and facial control platform. Based on computer vision algorithms, the system helps disabled users control the mouse and keyboard precisely with their facial features (such as the eyebrows, eyes, and mouth). Together, our platform and bed make everyday computer use and social networking far more accessible. We believe technology makes people’s lives better, and we are making it happen!

🎖 Honors and Awards

  • 2021.04 Obtained a fully-funded PhD return offer from the Li Ka Shing Faculty of Medicine, University of Hong Kong.
  • 2019.06 Outstanding Undergraduate Student Award.
  • 2019.06 Outstanding Undergraduate Thesis Award (10/5000), Topic: Image Inpainting Based on Deep Learning.
  • 2018.08 Cambridge Summer AI Academic Programme Excellent Student, with A+ in all Artificial Intelligence classes.
  • 2017.05 Gold Medal, National Computer Design Contest, for Birdsong Recognition with Machine Learning.

🎓 Education

  • 2015.08 - 2019.06, Undergraduate, Xidian University.
  • 2018.08 - 2018.09, Summer School, University of Cambridge (with Prof. Pietro Lio).
  • 2012.08 - 2015.06, High School Affiliated to Northwestern University

🧑‍💻 Professional Experience

  • 2022.07 - 2024.09, Senior Machine Learning Engineer, VMware AI Lab
  • 2021.07 - 2022.07, Senior Machine Learning Scientist, YeahMobi – Alibaba Group.
  • 2019.05 - 2022.07, Research Assistant at CAS, Tsinghua University, and HKU (received a return PhD offer).

🏃‍♂️ Hobbies

My hobbies include Fencing🤺, Basketball🏀, Swimming🏊, Guitar🎸, and Motorcycle🏍️. In high school, I won my first gold medal in Fencing🤺 at the National Province Games🏅.

I also have my lovely cats🐱.