My current research interests include multimodal understanding, large-scale/distributed deep learning, and connections between machine learning and neuroscience. I was first introduced to machine learning by Pieter Abbeel’s Introduction to AI (CS188) in early 2016, which convinced me to abruptly drop my physics research and focus solely on ML. This was around the same time I was applying for graduate school, and I naively assumed I could just apply to a few top schools and get accepted no problem, due to my background in physics. I was promptly rejected by every school I applied to (in hindsight, this was an obvious outcome).
Facing a looming existential crisis, I decided the most interesting problem for me at the time was training a model that I could have a conversation with. I also figured it couldn’t hurt my resume. I proceeded to essentially lock myself in my room for a few months, doing nothing but implementing sequence-to-sequence models (which were quite new at the time, circa early 2017) in TensorFlow. The result of this effort was the DeepChatModels repository, which ended up earning a little over 300 stars on GitHub and my first job offer, at an NLP startup in Cambridge, MA.
Although doing ML in the fast-paced environment of a startup was fun, I still really just wanted to do research. However, as many are now aware, it often feels like the prerequisite for doing research in ML is already having a strong track record of research in ML. Anyway, after about a year of startup life, I somehow convinced an NLP team at Apple to hire me. On that team I focused primarily on language modeling and building up training infrastructure (this was the pre-HuggingFace era).
In early 2021, after having done some investigations into VQA and video captioning, I switched to another applied research team at Apple where I could focus solely on these kinds of multimodal investigations. The main project I’m able to share is our work on training and deploying CLIP in a multi-task setting on iOS (check out our blog post here).
I’ve recently transitioned to the new Machine Learning Research organization at Apple, directed by Samy Bengio. My current focus is longer-term research on understanding and improving large multimodal, multi-task models.
NOTE: Everything below is quite old and from my time at Berkeley.
What You’ll Find on This Site
Below are links to the different sections of this site and brief summaries of the corresponding content.
- Research: Brief descriptions of selected research projects of mine.
- Notes: Descriptions of and links to PDFs of my course notes, all written in LaTeX.
- Posts: A mix of miscellaneous blog posts and mini-projects. These are currently quite outdated; I hope to post new content soon!
I plan to move these to the Research section eventually, but for now I just want to get them up here.
- Toy Model Event Generation (LBNL)
- Lattice QCD Computation of the Proton Isovector Scalar Charge (BNL)
- Upsilon Polarization and Event Activity
Similarly, here are links to various presentations (posters, slides, etc.) I’ve given on my research.