My general research interests are in the area of text information management. Text information, notably the many types of online information, is continuously produced everywhere and in many forms. Its dramatic growth has changed many aspects of daily life. Even in our own research, online text information, such as scientific literature and technical wikis, is becoming increasingly useful. Search engines, as the most useful tools for helping users find and access text information, have already made a huge impact in the real world. However, many challenges remain in making search more accurate, efficient, and intelligent, and in going beyond search to discover, analyze, and summarize useful knowledge from the information found. Thus, there is an urgent need to develop more powerful tools to manage and make use of large-scale text information.

The goal of my research is to develop both principled methodologies and innovative applications for automatically processing, managing, accessing, and analyzing large-scale text information, discovering knowledge from it, and summarizing it. More broadly, this goal generalizes to any symbolic information. Several disciplines play important roles along this research pipeline, including natural language processing, database management, information retrieval, and data mining, with machine learning and statistical models serving as a solid theoretical foundation. I have therefore been collaborating with researchers in many of these areas. My academic training and industry internship experiences have led me to develop a research methodology that emphasizes both theoretical principles and practical applications. My internships at Microsoft Research and Yahoo! Research taught me the importance of motivating my research with real-world problems that matter to ordinary users, and of evaluating my results on large-scale real data and with user judgments. In solving research problems, I favor principled approaches and often absorb and adapt ideas and techniques from multiple disciplines.

