Few-shot transfer of tool-use skills using human demonstrations with proximity and tactile sensing

Marina Y. Aoyama1, Sethu Vijayakumar1, Tetsuya Narita2

1The University of Edinburgh, 2Sony Group Corporation

IEEE Robotics and Automation Letters (RA-L) 2025

Paper Video

Our few-shot approach teaches a robot to manipulate new tools from only a handful of human demonstrations!

Abstract

Tools extend the manipulation abilities of robots, much like they do for humans. Despite human expertise in tool manipulation, teaching robots these skills faces challenges. The complexity arises from the interplay of two points of contact: one between the robot and the tool, and another between the tool and the environment. Tactile and proximity sensors play a crucial role in identifying these complex contacts. However, learning tool manipulation with a small amount of real-world data using these sensors remains challenging due to the large sim-to-real gap and sensor noise. To address this, we propose a few-shot tool-use skill transfer framework using multimodal sensing. The framework involves pre-training the base policy to capture contact states common in tool-use skills in simulation and fine-tuning it with human demonstrations collected in the real-world target domain to bridge the domain gap. We validate that this framework enables teaching surface-following tasks using tools with diverse physical and geometric properties with a small number of demonstrations on the Franka Emika robot arm. Our analysis suggests that the robot acquires new tool-use skills by transferring the ability to recognise tool-environment contact relationships from pre-trained to fine-tuned policies. Additionally, integrating proximity and tactile sensors enhances the identification of contact states and environmental geometry.

Summary Video (with voice 🔈)

Method

Our framework pre-trains a base policy in simulation to capture contact states that are common across tool-use skills, then fine-tunes it with a small number of human demonstrations collected in the real-world target domain to bridge the sim-to-real gap.

Data collection

Pre-training in sim
Fine-tuning in real
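The pre-train-then-fine-tune recipe can be illustrated with a deliberately simplified sketch: a linear behaviour-cloning policy trained on abundant synthetic "sim" data, then adapted with a small learning rate on a handful of "real" demonstrations from a shifted domain. All dimensions, learning rates, and the linear model are our own illustrative assumptions, not the paper's architecture or sensor pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for multimodal observations (e.g. proximity + tactile
# features) and end-effector actions; dimensions are illustrative only.
OBS_DIM, ACT_DIM = 8, 2

def make_data(n, w_true, noise):
    """Sample (observation, action) pairs from a linear expert w_true."""
    X = rng.normal(size=(n, OBS_DIM))
    Y = X @ w_true + noise * rng.normal(size=(n, ACT_DIM))
    return X, Y

def train(w, X, Y, lr, steps):
    """Plain gradient descent on the mean-squared behaviour-cloning loss."""
    for _ in range(steps):
        grad = X.T @ (X @ w - Y) / len(X)
        w = w - lr * grad
    return w

def mse(w, X, Y):
    return float(np.mean((X @ w - Y) ** 2))

# "Sim" domain: plentiful data from related contact-rich tool-use skills.
w_sim = rng.normal(size=(OBS_DIM, ACT_DIM))
X_sim, Y_sim = make_data(2000, w_sim, 0.01)

# "Real" target domain: only a few human demonstrations, governed by a
# shifted mapping (a crude stand-in for the sim-to-real gap).
w_real = w_sim + 0.3 * rng.normal(size=(OBS_DIM, ACT_DIM))
X_demo, Y_demo = make_data(10, w_real, 0.05)

# Pre-train the base policy in simulation ...
w_pre = train(np.zeros((OBS_DIM, ACT_DIM)), X_sim, Y_sim, lr=0.1, steps=500)

# ... then fine-tune on the few real demonstrations with a small step size,
# so the policy adapts to the target domain without discarding what it
# learned about the shared structure of the task.
w_ft = train(w_pre, X_demo, Y_demo, lr=0.05, steps=200)

loss_pre = mse(w_pre, X_demo, Y_demo)
loss_ft = mse(w_ft, X_demo, Y_demo)
```

Because the sim and real mappings share most of their structure, the pre-trained weights start close to the target, and a few low-learning-rate steps on the demonstrations suffice to reduce the error on the real-domain data, which is the intuition behind transferring the base policy rather than learning from demonstrations alone.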

Results

All videos are shown at 1.0× speed!

Four surface following tasks

Small brush
Wide brush
Broom
Sponge

Online adaptation

Changing inclination
Deformable surface
Step painting

Demo only (baseline) vs. Pre-train+Demo (proposed)

Demo only (baseline)
Presses too hard, causing tool slippage.
Pre-train+Demo (proposed)

Robustness to external disturbances

Citation

@article{aoyama2025few,
  title={Few-shot transfer of tool-use skills using human demonstrations with proximity and tactile sensing},
  author={Aoyama, Marina Y and Vijayakumar, Sethu and Narita, Tetsuya},
  journal={IEEE Robotics and Automation Letters},
  year={2025},
  publisher={IEEE}
}