
Boosting Small Language Models in Robotics Task Planning via LLMs as a Data Generator

Author(s)
Choi, Gawon
Advisor
Ahn, Hye-Min
Issued Date
2025-02
URI
https://scholarworks.unist.ac.kr/handle/201301/86453
http://unist.dcollection.net/common/orgView/200000865605
Abstract
In robotics, Large Language Models (LLMs) have been noted as task planners for robots due to their advanced perception and reasoning capabilities based on Chain-of-Thought (CoT). However, we claim that LLMs are over-specified for robotic task planning for two reasons. First, the language commands given to robots have much lower linguistic complexity than what LLMs can handle. Second, modern robots in practice are primarily tailored to a specific environment (e.g., tabletop, kitchen), as opposed to LLMs, which are domain-agnostic. We further believe that within a specific environment, small language models (LMs) have the potential to be effective robot task planners. To demonstrate this, we introduce a comprehensive framework that covers the entire workflow of training small LMs as environment-specific task planners, from generating datasets for task planning to fine-tuning small LMs on these datasets, based on knowledge distillation [1]. We refer to the synthetic dataset generated from this framework as the COmmand-STeps Dataset (COST), which contains commands to robots and the corresponding actionable plans to execute those commands. In this framework, both data collection via LLMs and post-processing are automated, allowing anyone to build their own COST dataset for any environment. We generate COST datasets for the kitchen and tabletop environments as examples, and evaluate their effectiveness by comparing the task planning performance of LLMs against small LMs fine-tuned on the COST datasets. As a result, we find that fine-tuned GPT-2-medium performs comparably to GPT-3.5 in both environments.
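To make the command-steps idea concrete, the sketch below shows one plausible way a COST-style record (a natural-language command paired with an ordered list of executable steps) could be serialized into a single training string for causal-LM fine-tuning of a small model such as GPT-2. The field names (`command`, `steps`), the prompt template, and the end-of-plan marker are illustrative assumptions, not the thesis's actual data format.

```python
# Hypothetical serialization of a COST-style example into one training
# string for causal-LM fine-tuning. The "Command:/Plan:" template and the
# <|endofplan|> marker are assumptions for illustration only.

def serialize_cost_example(example: dict) -> str:
    """Flatten a command + step list into a single prompt/target string."""
    numbered_steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(example["steps"], start=1)
    )
    return f"Command: {example['command']}\nPlan:\n{numbered_steps}\n<|endofplan|>"

# Example record for a tabletop environment.
sample = {
    "command": "Put the apple in the bowl.",
    "steps": ["pick up apple", "move to bowl", "place apple in bowl"],
}

text = serialize_cost_example(sample)
print(text)
```

Strings of this shape could then be tokenized and fed to a standard language-modeling fine-tuning loop; at inference time the model is prompted with the `Command:`/`Plan:` prefix and generation is stopped at the end-of-plan marker.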
Publisher
Ulsan National Institute of Science and Technology
Degree
Master
Major
Graduate School of Artificial Intelligence
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.