Research
I am interested in computer systems, especially databases. Currently, I'm working on databases for PDF documents, a common category of unstructured data. Specifically, I formulate, analyze, and leverage various structures from real-world PDFs to improve the efficiency and accuracy of document analytics. I'm also working on a classic computer systems problem: caching. I design and analyze new cache replacement policies that achieve state-of-the-art efficiency on evolving system workloads.
Publications
Querying Templatized Document Collections with Large Language Models
Yiming Lin, Madelon Hulsebos, Ruiying Ma, Shreya Shankar, Sepanta Ziegham, Aditya G. Parameswaran, Eugene Wu
ICDE, 2025
paper / arXiv
By leveraging semantic hierarchical structures from templatized documents, we design ZenDB, a document analytics system with a novel query engine, for accurate and cost-effective (~31x cost savings) query execution.