Statement of Purpose - Draft 3

Revised statement of purpose for grad school application


Statement of Purpose

I am currently an undergraduate student majoring in Statistics and Computer Science at the University of Illinois Urbana-Champaign (UIUC). During my four years of college study, I have completed several advanced courses in artificial intelligence and computer vision and maintained a high GPA (3.95/4.0). Beyond coursework, I have held two software engineering internships: one as an Android developer at NetEase and the other as a machine learning data analyst at VMware. After finishing this industry experience, I participated in three different artificial intelligence research projects. I want to pursue a Master of Computer Science degree with a concentration in machine perception and general-purpose computer vision, teaching computers to learn new things and understand their surroundings. I believe these fields are critical for the next generation of technology products that will enrich and facilitate people’s lives.

My motivation to explore computer science stems from my life goal. I have always considered myself fortunate to have not only a happy and fulfilling family, but also a clear lifelong ambition to fight for: I dream of becoming a successful entrepreneur who, like Steve Jobs, could stand on the stage of a product event and launch a revolutionary product. An augmented reality (AR) device, I believe, is the product that could change everyone’s lifestyle and perspective on technology. However, AR devices have a long way to go before becoming mature products. Since such a device must “see” the real-world environment in which it operates, I plan to begin by studying state-of-the-art work in computer vision.

To better understand how to enable a machine to recognize the real world, I started my research at the Intelligent Motion Lab in my junior year, under the supervision of Prof. Kris Hauser and Prof. Yuxiong Wang. Our project’s primary objective is to develop a new few-shot continual learning system that addresses practical problems. I mainly focused on designing and implementing a system that enables robots to learn new object categories from human-provided annotations (masks and labels). We realized that asking users to draw a full mask for a real object would be prohibitively costly, so I integrated into our system an interactive segmentation model (RITM) that can generate a high-quality mask from just a few clicks. Another practical issue is that, while a novel object is being trained, the system should continue to perform inference rather than halt until training is done. Therefore, I helped my team design and develop a new continual-learning benchmark for this realistic setting, treating inference latency during the few-shot training period as a crucial factor. We required the system’s inference output to have relatively low latency at all times in order to ensure the robot’s normal functionality. In addition, I added a video object segmentation (VOS) tracking component to increase instant segmentation accuracy, which improved system performance by enabling our robot to execute the relevant task immediately after an annotation is provided. In this project, I worked as a full-stack developer who not only handled frontend and backend development but also enhanced our system in various ways based on my solid academic foundation.

After establishing a foundation in continual learning and semantic segmentation, I intended to dive deeper into artificial intelligence and gain more experience working on research papers. Following my junior year, I was selected to work as an undergraduate research assistant with Prof. Tarek Abdelzaher to investigate graph adversarial learning. We aim to detect noisy labels in knowledge graphs and improve the model’s robustness through structure learning. I was responsible for building a “config + trainer” deep learning infrastructure and incorporating numerous baselines, attack methods, and our own method into the experiment codebase. While preparing the experiments, I found that very few academic papers focus on this problem. Therefore, I conducted extensive studies of graph data after global attacks and identified several generic patterns. Based on these patterns, we devised a few new optimization constraints and ultimately outperformed most existing approaches. Our paper, “Robust Reasoning over Noisy Knowledge Graphs via Structure Learning,” is due to be submitted to ACL 2023 by the end of December. This project gave me a deeper understanding of research, especially the entire pipeline from a vague idea to a highly specific and comprehensive article, and strengthened my motivation to work in a research lab developing new technologies and products.

Working on graph adversarial learning made me aware of the fragility of deep learning models and triggered my curiosity about AI’s black box. Why does a model make a particular prediction? Which variables determine its outcome? In my senior year, I therefore chose to return to the field of computer vision and began exploring the model’s black box. Supervised by Prof. Derek Hoiem, we are building a pipeline for concept-based classification that interprets a model’s predictions by measuring the contributions of different object concepts (e.g., “bird” could be divided into “wings,” “beak,” and “feathers”). Hence, one of the most important parts of this project is figuring out how to derive these concepts from the training data. We decided to extract one layer of CLIP’s image encoder at inference time and cut the latent feature map into several patches. The k-means algorithm is then used to cluster the patches and build up the concepts. I implemented this concept-extraction process and constructed a similarity dataset for training. What excites me most is that the test accuracy of our attribute classification pipeline was comparable to that of previous end-to-end models, and the concept-based pipeline lets us determine the contribution of each concept at inference time using SHAP or Grad-CAM. I am still working on this project with a plan to submit our work to ICCV 2023, and I am eager to uncover more interesting applications for our attribute classification pipeline.

Professional training at a top institution has been a key priority on my agenda. Carnegie Mellon University, as one of the world’s best universities in computer science, is the ideal choice for me. I explored the program brochure and was excited to find courses like Visual Learning and Recognition and Geometry-based Methods in Vision, which will lay a solid foundation for the modeling and sophisticated computation required to build augmented reality. These courses will bring me closer to understanding 3D vision and real machine perception technology. The Master of Science in Computer Science program at CMU embodies my dream program because of the rigorous coursework it offers and because of the school’s research facility, the Augmented Perception Lab. Since 2014, I have held the objective of creating a wonderful product that can impact people’s lives; I have never lost sight of this goal and continue to strive in related fields. I wish to help machines perceive the world, because a deeper understanding of the world would allow them to better serve human life. Therefore, I look forward to becoming part of the most dynamic computer science community in the world and hope to meet like-minded friends with whom to pursue our dreams. Overall, I believe CMU’s abundant resources and collaborative environment can provide the best guidance for my academic and career path, and I would also love to give back to the CMU community with my positive spirit and ceaseless hard work.