One focus of my prior research is understanding document text jointly with its visual layout. From manuscripts earlier than A.D. 200 to corporate reports in the 21st century, documents have always been typeset with intricate layouts that are optimized for human readability but difficult for machines to perceive. I design methods and tools that help parse these complex documents; representative work includes LayoutParser and VILA.
Recently, I have also become interested in Human-AI Collaboration. From interfaces (e.g., PAWLS) to algorithms like OLALA, my goal is to enable better communication between humans and machine learning models, and to use AI to empower humans in work and creative endeavors.
The LayoutParser project was featured in the Stanford Digital Economy Seminar Series.
Gave a talk on LayoutParser and Document Intelligence at the Online Seminar in Economics + Data Science @ ETH Zurich.
The LayoutParser talk video is available on YouTube.
Please check out our new paper on scientific document parsing: Incorporating Visual Layout Structures for Scientific Text Classification.
The LayoutParser paper was accepted as an oral paper at ICDAR 2021.
A Unified Toolkit for Deep Learning Based Document Image Analysis, Online Seminar in Economics + Data Science @ ETH Zurich
Information Retrieval over Historical Scans with Non-trivial Layouts, Kaggle Days Meetup Boston
Deep Learning-based Framework for Automatic Damage Detection in Aircraft Engine Borescope Inspection, ICNC 2019, Honolulu, Hawaii