Arvind Rajaraman


I am an engineer on Databricks' Applied AI team, working on LLM post-training, evaluation, and deployment. I am also continuing research at Berkeley Artificial Intelligence Research with Professor Anca Dragan. Broadly, I am interested in building effective learning and reasoning systems, whether that be LLMs, digital agents, or embodied robots.

Industry Experience. I was a Machine Learning Scientist Intern at Atlassian, where I worked on large language model (LLM) infrastructure. In 2022, I interned at Nuro, where I worked on low-latency video streaming and model uncertainty estimation. In 2021, I was at NVIDIA working on vision models for autonomous driving.

Other Experience. I completed my undergrad at UC Berkeley, where I was the Head TA for Berkeley's CS 188 (Artificial Intelligence) and CS 189 (Machine Learning), and on the executive board of Machine Learning at Berkeley (ML@B). I am an Accel Scholar, Conviction Fellow, and part of Berkeley's Management, Entrepreneurship, and Technology (M.E.T.) Program.

Email  /  CV  /  Twitter  /  GitHub  /  LinkedIn  /  Devpost


Engineering Experience

My engineering experience is primarily in high-performance systems for machine learning, from autonomous vehicles to, more recently, large language models (LLMs).

Databricks (Current)
Software Engineer
Applied AI Team

Applied AI works on LLM {evaluation, post-training, & deployment} for {search, text-to-SQL, code correction, & code generation}. I am involved in technical efforts across the stack.
Atlassian
Machine Learning Scientist Intern
Core Machine Learning Team

Worked on a search relevance algorithm, RLAIF (reinforcement learning with AI feedback) infrastructure, text-to-SQL, and chatbots for question answering.
Nuro
Software Engineer Intern
Fleet Infrastructure Team

Worked on video streaming infrastructure, model uncertainty estimation, and auto-labeling for video classification tasks.
NVIDIA
Software Engineer Intern
Autonomous Vehicles Division, DriveIX

Worked on AutoML for hyperparameter tuning of vision models, increasing data fidelity of vision data, and ML engineering infrastructure.
Segmed (YC W20)
Software Engineer Intern

Worked on authentication, authorization, and developer productivity tools.

Research Experience

I am excited by the prospect of embodied robots that can generalize easily to unseen tasks and environments, in order to become widely useful to humans. My research interests include deep reinforcement learning, unsupervised learning, language modeling, and human-robot interaction.

More specifically, I am interested in creating embodied agents that model human learning, effectively representing humans' goals, intent, and biases. Because language is inherently information-dense, abstractable, highly available from a data standpoint, and rich in knowledge about what is useful to humans, I am interested in building learning systems that use language to interact with humans, represent knowledge, and plan.

Discovering Skills with Language
Arvind Rajaraman, Vivek Myers, Anca Dragan
Project in progress

Using language to scale unsupervised reinforcement learning and learn skills more useful to humans.

Explicit vs. Implicit Modeling of Human Internal State for Robot Planning
Arvind Rajaraman, Ran (Thomas) Tian, Anca Dragan, Andrea Bajcsy
[Presentation]

A new method for robots to collaborate with humans by co-evolving a sequence model that estimates a human's internal state (with a model-based prior) and a robotic influence policy.

Teaching

Instructors of each course are listed in parentheses.

CS 189: Introduction to Machine Learning
Head Teaching Assistant, Fall 2023 (Jitendra Malik, Jennifer Listgarten)
Head Teaching Assistant, Spring 2023 (Jonathan Shewchuk)
Teaching Assistant, Fall 2022 (Jitendra Malik, Jennifer Listgarten)
CS 188: Introduction to Artificial Intelligence
Head Teaching Assistant, Summer 2022 (Yanlai Yang, Angela Liu)
Teaching Assistant, Spring 2022 (Stuart Russell, Dawn Song)
CS 70: Discrete Mathematics and Probability Theory
Academic Intern, Spring 2021 (Shyam Parekh, Satish Rao)

Selected Side Projects and Open-Source Contributions

Below is a set of selected side projects. To see more, visit my GitHub and Devpost.

* Indicates equal contribution and co-authorship.

Origin
Best Frontier Tech Hack, Stanford TreeHacks 2023
[Blog Post] [Devpost] [Code] [Tweet]

Built an LLM-based browser extension that cleans up your tabs and builds context-aware workspaces. Won Best Frontier Tech Hack from Pear VC and received an investment offer at a $2.5 million valuation. Also received interest from Sequoia and a shout-out from Harrison Chase (creator of LangChain). 70+ stars on GitHub.
Verbal Coding
Winner of Education Track and Best Use of Google Cloud, HackNYU 2019
[Devpost]

Developed a verbal code editor that uses NLP to convert spoken pseudocode into well-formed Python code. Continued work and received mentorship from MIT Professor Kyle Keane.

Some other projects I pursued are below. Any awards won are noted in parentheses.

  • Ephemeral (Best Use of Together.ai, TreeHacks 2024) - agentic meeting assistant that contributes to the meeting conversation and automates mundane tasks.
  • BiteBuddy (Best Use of Reflex, CalHacks 2023) - meal planner app with social networking integrations.
  • Unscrambit (First Place, JumpStart Hackathon 2020) - code analysis app that uses NLP to identify common algorithms implemented in one's codebase.
  • Autodeploy (2023) - developer tool that automatically creates Terraform files from natural language descriptions and an analysis of one's codebase.
  • Crib (2020) - smart lock that uses real-time crime data to automatically lock your front door.
  • Disperse (2020) - grocery store search app that ranks places in order of least crowded to most crowded. Built during the COVID-19 pandemic to decrease infection rates.
  • NextEniac (2018) - grade calculation and insights tool used by 1,000 students at my high school.
  • Buzz (2016) - social networking app that makes the shopping experience social.
  • Formulate (2013) - my first substantial coding project, which solved my Pre-Algebra homework.

Miscellaneous

I was previously the Vice President of Machine Learning at Berkeley (ML@B), which is Berkeley's undergraduate ML group. I taught introductory ML workshops across the Bay Area, ran an internal new member education program, and managed $100,000 of finances.

Below are some links to content I've developed:


Website template from Jon Barron