Statement of Purpose - Draft 2

Graduate School Application · 13 Nov 2022

I am an undergraduate student majoring in Statistics and Computer Science at the University of Illinois Urbana-Champaign (UIUC). During my four years of college study, I have completed several advanced courses on artificial intelligence and computer vision and maintained a high GPA (3.95/4.0). Beyond coursework, I completed two software engineering internships: one as an Android developer at NetEase and another as a machine learning data analyst at VMware. After my time in industry, I participated in three research projects whose primary focus is artificial intelligence. I want to pursue a Master of Computer Science degree with a concentration on teaching computers to learn new things and understand their surroundings, drawing on research in machine perception, explainable artificial intelligence, and general-purpose computer vision. I believe these fields are critical to the next generation of technology products that will enrich and facilitate people’s lives.

My motivation to pursue a master’s degree in computer science stems from my life goal. I have always considered myself fortunate to have not only a happy and fulfilling family, but also a clear lifelong ambition to fight for: I dream of being a successful entrepreneur who, like Steve Jobs, can stand on the stage of a product event and launch a product that revolutionizes the world. The AR device, I believe, is the product that could change everyone’s lifestyle and perspective on technology. However, AR glasses have a long way to go before becoming a mature product. Since an AR device needs to “see” the real-world environment in which it operates, I planned to start by learning the state-of-the-art work in the computer vision field.

To better understand how to enable a machine to recognize the real world, I started my research at the Intelligent Motion Lab in my junior year, under the supervision of Prof. Kris Hauser and Prof. Yuxiong Wang. I mainly focused on designing and implementing a continual few-shot learning segmentation system (CLSS) that helps robots continuously learn to recognize new object categories using our incremental learning algorithm (GAPS). Our pipeline performs semantic-level segmentation and gives the robot the capacity to classify new objects based on human instructions. When a user notices an unrecognized object, they can use CLSS to provide an annotation (mask and label) to the robot, and the robot learns the object at the same time, without halting. We integrated an interactive segmentation model (RITM) into our continual learning pipeline to reduce the user’s effort when annotating. Additionally, we incorporated a video tracking component (VOS) to improve the accuracy of instant segmentation. In this project, I worked as a full-stack developer: I handled both the frontend and backend development of our system, studied GAPS, RITM, and VOS in depth, and modified them to fit into the overall system. In addition to designing the whole pipeline, I assisted my team in developing an incremental learning benchmark that treats inference latency during few-shot training as a crucial factor, since in real-world settings a robot needs the inference result immediately to determine its subsequent actions.

After establishing a basic knowledge of continual learning and semantic segmentation, I wanted to dive deeper into artificial intelligence and gain more experience working on research papers. In the summer of 2022, I was selected to work as an undergraduate research assistant with Professor Tarek Abdelzaher to investigate graph adversarial learning on dynamic link prediction. In this project, I was responsible for constructing our experiment infrastructure and reimplementing numerous baselines (VGAE/Evolve-GCN/Euler) and attack methods such as random-attack and meta-attack. While preparing the experiments, I found that very few academic papers focus mainly on adversarial defense for dynamic link prediction in graph neural networks. Therefore, I conducted extensive research on graph data after global attacks on other GNN tasks and eventually identified several generic patterns (e.g., attacked graphs tend to be denser). Based on these patterns, we devised a few new optimization constraints and ultimately outperformed most existing approaches. Our paper on building a novel model to defend against adversarial global attacks on knowledge graphs is due to be submitted to XXXX by the end of December this year. This project gave me a deeper understanding of research, especially the entire pipeline of a research project from a vague notion to a specific, thorough paper, and strengthened my passion for working in a research lab to develop new technologies and products.

Working on graph adversarial learning made me aware of the fragility of deep learning models and, at the same time, triggered my curiosity about AI’s black box. Why does a model make a given prediction? Which variable determines its outcome? In my senior year, I therefore chose to return to the field of computer vision and started exploring the model’s black box. Supervised by Prof. Derek Hoiem and Dr. Yao Xiao, our team is building a pipeline for concept-based classification. We wish to interpret a model’s prediction by measuring the contribution of different object concepts (e.g., “bird” could be divided into “wings”, “beak”, and “feathers”). Hence, one of the most important parts of this project is figuring out how to extract these concepts from training data. We decided to extract the output of one layer of CLIP’s image encoder at inference time and cut this latent map into several patches. The k-means algorithm is then used to cluster the patches and build up the concepts. I implemented this concept extraction process and constructed a similarity dataset for training. What excites me most is that the test accuracy of our attribute classification pipeline is comparable to that of previous end-to-end models, and that we can use this concept-based pipeline to determine the contribution of each concept at inference time using SHAP or Grad-CAM. I am still working on this project, with a plan to submit our work to ICCV 2023, and I am eager to uncover more interesting applications for our attribute classification pipeline.

With a growth mindset, I have made professional education at a higher institution a key priority on my agenda. Carnegie Mellon University, widely regarded as one of the world’s best universities in computer science, is the ideal option for me. I explored the program brochure and was thrilled to find courses like Visual Learning and Recognition and Geometry-based Methods in Vision, which will lay a solid foundation for the modeling and sophisticated calculations required in the development of augmented reality, bringing me closer to understanding 3D vision and real machine perception technology. The Master of Science in Computer Science program at CMU stands out as my dream program not only because of the rigorous coursework it offers, but also because of the school’s attractive research facility, the Augmented Perception Lab. Since 2014, I have held the ambition of developing a wonderful AR product that has a positive impact on people’s lives; I have never lost sight of this goal and continue to strive in related fields. I wish to enable machines to perceive the world, because a deeper understanding of the world would allow them to better serve human life. Therefore, I look forward to becoming a part of the most dynamic computer science community in the world and hope to meet like-minded friends with whom to pursue our dreams. Overall, I believe CMU’s abundant resources and collaborative environment can provide the best guidance for my academic and career path, and I would also love to give back to the CMU community with my positive spirit and ceaseless hard work.