| dc.description.abstract |
In robotics, Large Language Models (LLMs) have been adopted as task planners for robots due to their advanced perception and reasoning capabilities based on Chain-of-Thought (CoT) prompting. However, we argue that LLMs are over-specified for robotic task planning for two reasons. First, the language commands given to robots have far lower linguistic complexity than what LLMs can handle. Second, modern robots in practice are typically tailored to a specific environment (e.g., tabletop, kitchen), as opposed to LLMs, which are domain-agnostic. We therefore believe that, within a specific environment, small language models (LMs) have the potential to be effective robot task planners. To demonstrate this, we introduce a comprehensive framework that covers the entire workflow of training small LMs as environment-specific task planners, from generating task-planning datasets to fine-tuning small LMs on them, based on knowledge distillation [1]. We refer to the synthetic dataset generated by this framework as the COmmand-STeps Dataset (COST), which contains commands to robots and the corresponding actionable plans to execute them. In this framework, both data collection via LLMs and post-processing are automated, allowing anyone to build their own COST dataset for any environment. As examples, we generate COST datasets for the kitchen and tabletop environments and evaluate their effectiveness by comparing the task-planning performance of LLMs with that of small LMs fine-tuned on the COST datasets. We find that a fine-tuned GPT2-medium performs comparably to GPT-3.5 in both environments. |
- |
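As a minimal sketch of the command-steps pairing the abstract describes, the snippet below serializes one robot command and its ordered action steps into a single training string for causal-LM fine-tuning. The tag names, layout, and the `format_example` helper are assumptions for illustration, not the paper's actual COST format.

```python
# Hypothetical sketch of a COST-style training example: one natural-language
# command paired with the ordered, actionable steps that execute it.
# The serialization format below is an assumption, not the paper's own.

def format_example(command, steps):
    """Serialize a (command, steps) pair into one fine-tuning string."""
    lines = [f"<command> {command}"]
    for i, step in enumerate(steps, start=1):
        lines.append(f"<step {i}> {step}")
    return "\n".join(lines)

example = format_example(
    "Make a cup of coffee",
    ["Pick up the mug", "Place the mug under the machine", "Press the brew button"],
)
print(example)
```

A small LM such as GPT2-medium could then be fine-tuned on many such strings, with the steps distilled from an LLM's plans for the target environment.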