Ruiying Ma

I'm a CS PhD student in the EPIC Data Lab at UC Berkeley, advised by Prof. Aditya Parameswaran. I received a B.Eng. in Computer Science and Technology from Tsinghua University (Yao Class) in June 2025. Prior to that, I visited UC Berkeley as an undergraduate student researcher. I was also a research intern in the Systems and Networking Research Group at Microsoft Research Asia, where I was fortunate to work with Dr. Chieh-Jan Mike Liang, Prof. Francis Y. Yan, and Yanjie Gao.

Email / Google Scholar / GitHub / LinkedIn

Research

I am interested in LLM-driven data systems, including both the development of LLM agents for data systems and the design of data systems that better support LLMs. My work focuses on studying and building agents for accurate and scalable data processing, as well as designing data structures and algorithms that enable efficient LLM-driven data processing.
- Data Agent: Benchmarking LLM agents on a wide range of data processing tasks to inform better data agent design.
- SHED: Designing algorithms and data structures that leverage the hierarchical structure of documents to improve document processing and understanding, balancing correctness and efficiency with theoretical guarantees.
- MetaMuse: An LLM-driven framework for creative system algorithm design.

Publications

Algorithm Generation via Creative Ideation
Ruiying Ma, Chieh-Jan Mike Liang, Yanjie Gao, Francis Y. Yan
arXiv preprint, 2025
arXiv

Querying Templatized Document Collections with Large Language Models
Yiming Lin, Madelon Hulsebos, Ruiying Ma, Shreya Shankar, Sepanta Zeighami, Aditya G. Parameswaran, Eugene Wu
ICDE, 2025
paper / arXiv

Source code from Jon Barron's website.