I am a final-year PhD student in EECS at UC Berkeley, advised by Prof. Ken Goldberg. My research focuses on general robot learning. Recently, I have worked on in-context imitation learning for robotics, alignment among touch, vision, and language, and whole-body control of a quadruped.
In-Context Imitation Learning via Next-Token Prediction
Letian Fu*, Huang Huang*, Gaurav Datta*, Lawrence Yunliang Chen, William Chung-Ho Panitch, Fangchen Liu, Hui Li, Ken Goldberg (*Equal contribution)
arXiv preprint arXiv:2408.15980, 2024, Website
We explore how to enhance next-token prediction models to perform in-context imitation learning on a real robot. We propose the In-Context Robot Transformer (ICRT), a causal transformer that generalizes to unseen tasks by conditioning on prompts of sensorimotor trajectories from the new task, composed of image observations, actions, and state tuples collected through human teleoperation.
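A minimal sketch of the idea, prompt-conditioned next-token prediction over interleaved sensorimotor tokens (the interleaving scheme, encoders, and dimensions below are illustrative assumptions, not the exact ICRT architecture):

```python
import torch
import torch.nn as nn

class SensorimotorTransformer(nn.Module):
    def __init__(self, obs_dim=512, state_dim=8, action_dim=7,
                 d_model=256, n_layers=4, n_heads=8, max_len=1536):
        super().__init__()
        # Project each modality into a shared token space.
        self.obs_proj = nn.Linear(obs_dim, d_model)      # image features
        self.state_proj = nn.Linear(state_dim, d_model)  # robot state
        self.act_proj = nn.Linear(action_dim, d_model)   # actions
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.action_head = nn.Linear(d_model, action_dim)

    def forward(self, obs, state, action):
        # obs: (B, T, obs_dim); state: (B, T, state_dim); action: (B, T, action_dim).
        # Prompt demonstrations and the current trajectory are concatenated along T.
        B, T, _ = obs.shape
        toks = torch.stack(
            [self.obs_proj(obs), self.state_proj(state), self.act_proj(action)],
            dim=2,
        ).reshape(B, 3 * T, -1)  # interleave (obs, state, action) per timestep
        toks = toks + self.pos_emb(torch.arange(3 * T, device=toks.device))
        # Causal mask: each token attends only to the past, prompt included.
        mask = torch.triu(
            torch.full((3 * T, 3 * T), float("-inf"), device=toks.device), diagonal=1
        )
        h = self.backbone(toks, mask=mask)
        # Predict action a_t from the state token at timestep t.
        return self.action_head(h[:, 1::3, :])

model = SensorimotorTransformer()
obs, state, act = torch.randn(2, 10, 512), torch.randn(2, 10, 8), torch.randn(2, 10, 7)
loss = nn.functional.mse_loss(model(obs, state, act), act)  # behavior-cloning loss
```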
A touch, vision, and language dataset for multimodal alignment
Letian Fu, Gaurav Datta*, Huang Huang*, William Chung-Ho Panitch*, Jaimyn Drake*, Joseph Ortiz, Mustafa Mukadam, Mike Lambeta, Roberto Calandra, Ken Goldberg (*Equal contribution)
ICML 2024, Oral Presentation, Website
We introduce the Touch-Vision-Language (TVL) dataset, which combines paired tactile and visual observations with both human-annotated and VLM-generated tactile-semantic labels. We then leverage a contrastive learning approach to train a CLIP-aligned tactile encoder and fine-tune an open-source LLM for a tactile description task. Our results show that incorporating tactile information allows our model to significantly outperform state-of-the-art VLMs (including the label-generating model) on a tactile understanding task.
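A minimal sketch of the CLIP-style contrastive alignment step (the encoder outputs and dimensions here are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

def infonce(tactile_emb, vision_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired tactile/vision embeddings."""
    t = F.normalize(tactile_emb, dim=-1)
    v = F.normalize(vision_emb, dim=-1)
    logits = t @ v.T / temperature  # (B, B) scaled cosine similarities
    labels = torch.arange(t.shape[0], device=t.device)
    # Matched pairs sit on the diagonal; treat alignment as classification
    # in both directions, as in CLIP.
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.T, labels)) / 2

tactile = torch.randn(32, 512)  # output of the trainable tactile encoder
vision = torch.randn(32, 512)   # output of a frozen pretrained vision encoder
loss = infonce(tactile, vision)
```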
Manipulator as a Tail: Promoting Dynamic Stability for Legged Locomotion
Huang Huang, Antonio Loquercio, Ashish Kumar, Neerja Thakkar, Ken Goldberg, Jitendra Malik
ICRA 2024, Website
Is an arm on a legged robot a liability or an asset for locomotion? Biological systems evolved additional limbs beyond legs that facilitate postural control. This work shows how a manipulator can be an asset for legged locomotion at high speeds or under external perturbations, where the arm serves a role beyond manipulation.
Learning Self-Supervised Representations from Vision and Touch for Active Sliding Perception of Deformable Surfaces
Justin Kerr*, Huang Huang*, Albert Wilcox, Ryan Hoque, Jeffrey Ichnowski, Roberto Calandra, and Ken Goldberg (*Equal contribution)
RSS 2023, Paper
We learn a self-supervised representation across touch and vision using a contrastive loss, collecting vision-tactile pairs in a self-supervised way on a real robot. The learned representation is then used in downstream active perception tasks without fine-tuning.
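As a sketch of how a frozen cross-modal representation can drive downstream perception without fine-tuning, one can retrieve the visual patch whose embedding best matches the current tactile reading (the encoders below are stand-ins for the trained model):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def localize_touch(tactile_encoder, vision_encoder, tactile_img, vision_patches):
    """Return the index of the visual patch best matching the tactile input."""
    t = F.normalize(tactile_encoder(tactile_img), dim=-1)    # (1, D)
    v = F.normalize(vision_encoder(vision_patches), dim=-1)  # (N, D)
    return (v @ t.T).squeeze(-1).argmax().item()             # best-matching patch

enc = torch.nn.Linear(64, 128)  # stand-in encoder for illustration
idx = localize_touch(enc, enc, torch.randn(1, 64), torch.randn(10, 64))
```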
Evo-NeRF: Evolving NeRF for Sequential Robot Grasping
Justin Kerr, Letian Fu, Huang Huang, Yahav Avigal, Matthew Tancik, Jeffrey Ichnowski, Angjoo Kanazawa, Ken Goldberg
CoRL 2022, Oral Presentation, OpenReview
We propose Evo-NeRF, which adds geometry regularizations that improve performance in rapid-capture settings, achieving real-time, updateable scene reconstruction for sequentially grasping table-top transparent objects. We also train a NeRF-adapted grasping network that learns to ignore floaters.
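To illustrate this style of geometry regularization, here is a generic penalty that pushes accumulated per-ray opacity toward 0 or 1 so that semi-transparent floaters are discouraged (a stand-in sketch, not Evo-NeRF's exact regularization term):

```python
import torch

def opacity_entropy_reg(weights, eps=1e-6):
    """weights: (num_rays, num_samples) volume-rendering weights per ray."""
    alpha = weights.sum(dim=-1).clamp(eps, 1 - eps)  # accumulated opacity per ray
    # Binary-entropy penalty: low when a ray is fully opaque or fully empty.
    return -(alpha * alpha.log() + (1 - alpha) * (1 - alpha).log()).mean()

reg = opacity_entropy_reg(torch.rand(1024, 64) * 0.01)  # example ray weights
```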
Real2Sim2Real: Self-Supervised Learning of Physical Single-Step Dynamic Actions for Planar Robot Casting
Vincent Lim*, Huang Huang*, Lawrence Yunliang Chen, Jonathan Wang, Jeffrey Ichnowski, Daniel Seita, Michael Laskey, Ken Goldberg (*Equal contribution)
ICRA 2022, Paper
We collect planar robot casting data in the real world in a self-supervised way and use it to tune a simulation in Isaac Gym. We then collect additional data in the tuned simulator. Combined with upsampled real data, we learn a policy for planar robot casting that reaches a given target, attaining a median error distance (as a percentage of cable length) of 8% to 14%.
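A minimal sketch of the real-to-sim tuning step, with a placeholder rollout standing in for the Isaac Gym simulator and illustrative parameter names:

```python
import numpy as np
from scipy.optimize import differential_evolution

real_actions = np.random.rand(50, 2)    # logged self-supervised casting actions
real_endpoints = np.random.rand(50, 2)  # observed cable endpoint positions

def simulate(params, actions):
    """Stand-in for an Isaac Gym rollout returning endpoint positions."""
    stiffness, damping, friction = params
    return actions * stiffness - damping * friction  # placeholder dynamics

def sim_real_gap(params):
    # Mean distance between simulated and real outcomes for the same actions.
    sim_endpoints = simulate(params, real_actions)
    return np.mean(np.linalg.norm(sim_endpoints - real_endpoints, axis=-1))

result = differential_evolution(sim_real_gap, bounds=[(0.1, 10.0)] * 3)
tuned_params = result.x  # parameters used for the tuned simulator
```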