Ruiying Ma

I'm a CS PhD student in the EPIC Data Lab at UC Berkeley, advised by Prof. Aditya Parameswaran. I received a B.Eng. in Computer Science and Technology from Tsinghua University, IIIS (Yao Class), in June 2025. Prior to that, I visited UC Berkeley as a student researcher, working on databases for unstructured data. I was also a research intern in the Systems and Networking Research Group at Microsoft Research Asia, where I worked on caching problems with Dr. Chieh-Jan Mike Liang, Prof. Francis Y. Yan, and Yanjie Gao.

Email  /  Google Scholar  /  GitHub  /  LinkedIn

Research

I am interested in databases and how they can be combined with LLMs. Currently, I formulate, analyze, and leverage the structures found in real-world PDFs to improve the efficiency and accuracy of document analytics.

Publications

Querying Templatized Document Collections with Large Language Models
Yiming Lin, Madelon Hulsebos, Ruiying Ma, Shreya Shankar, Sepanta Zeighami, Aditya G. Parameswaran, Eugene Wu
ICDE, 2025
paper / arXiv

Source code from Jon Barron's website.