Xingchen Zhao

I am a machine learning engineer at Meta working on GenAI model and serving infrastructure for creative generation systems and agentic workflows. My work spans diffusion transformers, LLM serving, post-training and RL, multimodal video editing, robotics perception, and real-time AI tutors.

I have 6+ years of experience across ML research and engineering, including 3+ years in GenAI, LLMs, agents, image/video generation, post-training, inference, and infrastructure. Previously, I worked on Text-to-Edit at TikTok, barcode-less package identification at Amazon Robotics, and real-time AI avatar tutors at Thinkverse, where I was the first founding engineer.

I was a PhD candidate in Machine Learning at Northeastern University, advised by Raymond Fu at SMILE Lab / FuLab, and completed an MS in Computer Engineering there. I was also an ML visiting scholar at UC San Diego.

Work Experience

Applied research and engineering across production GenAI, multimodal video systems, robotics perception, and real-time AI tutoring.

2025 - Present

Meta / Monetization GenAI

Building large-scale GenAI model and serving infrastructure for creative generation systems. Work spans CUDA/Triton sparse attention kernels for Diffusion Transformer inference, adaptive sparse attention for video diffusion models, vLLM-based benchmarking for dense and MoE LLM serving, disaggregated prefill analysis, and RL/post-training infrastructure with VERL, GRPO, multi-objective RL, tool calling, and LLM-as-judge evaluation for agentic workflows.

2024 - 2025

TikTok / ByteDance GenAI

Post-trained a multimodal LLM for controllable end-to-end video ad creation. The system turns product context, free-form instructions, video understanding signals, and slow-fast temporal evidence into structured edit drafts for advertising workflows.

2023 - 2024

Amazon Robotics

Developed deployment-oriented perception models for package identification in robotics environments, including real-time edge localization and optimized ONNX/TensorRT inference for barcode-less and barcode-based package understanding.

2023

Thinkverse / Real-time AI Tutors

First founding engineer at an AI tutoring platform. Designed and built the end-to-end architecture for real-time avatar math tutors: student speech passes through ASR, LLM reasoning, and voice generation, then drives a lip-synced avatar rendered from a user-provided character image.

Research

My research began with domain adaptation and generalization, then moved toward multimodal generation, controllable video editing, and deployed AI systems.

Academic foundation

I was a PhD candidate in Machine Learning at Northeastern University, advised by Raymond Fu in SMILE Lab / FuLab, and completed an MS in Computer Engineering there. Before that, I was an ML visiting scholar at UC San Diego.

Research thread

Domain adaptation, domain generalization, medical imaging robustness, semantic segmentation, multimodal video understanding, and LLM-driven controllable editing. The common thread is making learned systems work outside narrow benchmark assumptions.

Publications & Patent

Selected research outputs across multimodal generation, video editing, domain adaptation, and generalization.

2025

Controllable End-to-End Video Ad Creation via Multimodal LLMs

Text-to-Edit · arXiv

Dabing Cheng, Haosen Zhan, Xingchen Zhao, Guisheng Liu, Zemin Li, Jinghui Xie, Zhao Song, Weiguo Feng, Bingyue Peng

2024

Unsupervised Domain Adaptation for Semantic Segmentation with Pseudo Label Self-Refinement

Pseudo Label Self-Refinement · WACV

Xingchen Zhao, Niluthpol Chowdhury Mithun, Abhinav Rajvanshi, Han-Pang Chiu, Supun Samarasekera

2024

Unsupervised Domain Adaptation of Models with Pseudo-Label Curation

US Patent Application 20240312197

Han-Pang Chiu, Niluthpol C. Mithun, Supun Samarasekera, Abhinav Rajvanshi, Xingchen Zhao, Md Nazmul Karim

2024

Simultaneous Selection and Adaptation of Source Data via Four-Level Optimization

Four-Level Optimization · TACL

Pengtao Xie, Xingchen Zhao, Xuehai He

2023

Domain Adversarial Neural Networks for Domain Generalization: When It Works and How to Improve

DANN-DG · Machine Learning

Anthony Sicilia, Xingchen Zhao, Seong Jae Hwang

2023

Improve the Performance of CT-Based Pneumonia Classification via Source Data Reweighting

Source Data Reweighting · Scientific Reports

Pengtao Xie, Xingchen Zhao, Xuehai He

2022

Test-time Fourier Style Calibration for Domain Generalization

Fourier Style Calibration · IJCAI

Xingchen Zhao, Chang Liu, Anthony Sicilia, Seong Jae Hwang, Yun Fu

2022

A Close Look at Spatial Modeling: From Attention to Convolution

Spatial Modeling · arXiv

Xu Ma, Huan Wang, Can Qin, Kunpeng Li, Xingchen Zhao, Jie Fu, Yun Fu

2021

Robust White Matter Hyperintensity Segmentation on Unseen Domain

White Matter Hyperintensity Segmentation · ISBI

Xingchen Zhao, Anthony Sicilia, Davneet Minhas, Erin O'Connor, Howard Aizenstein, William Klunk, Dana Tudorascu, Seong Jae Hwang

2021

PAC Bayesian Performance Guarantees for Deep Stochastic Networks in Medical Imaging

PAC-Bayesian Medical Imaging

Anthony Sicilia, Xingchen Zhao, Anastasia Sosnovskikh, Seong Jae Hwang

2021

Multi-Domain Learning by Meta-Learning: Taking Optimal Steps in Multi-Domain Loss Landscapes by Inner-Loop Learning

Multi-Domain Meta-Learning · ISBI

Anthony Sicilia, Xingchen Zhao, Davneet Minhas, Erin O'Connor, Howard Aizenstein, William Klunk, Dana Tudorascu, Seong Jae Hwang

2020

Learning by Ignoring, with Application to Domain Adaptation

Learning by Ignoring · arXiv

Xingchen Zhao, Xuehai He, Pengtao Xie

Contact

I am interested in building reliable multimodal AI systems, agentic workflows, and deployment infrastructure for frontier AI products.