Assessing Vision-Language Models for Failure Detection in Robotic Manipulation

Chowdhury, Md Sameer Iqbal; Au, Tsz-Chiu

doi:10.1109/MPULS.2026.3659245

Scholarworks@UNIST

UNIST Library

Full metadata record

DC Field	Value	Language
dc.citation.endPage	65	-
dc.citation.number	1	-
dc.citation.startPage	63	-
dc.citation.title	IEEE PULSE	-
dc.citation.volume	17	-
dc.contributor.author	Chowdhury, Md Sameer Iqbal	-
dc.contributor.author	Au, Tsz-Chiu	-
dc.date.accessioned	2026-04-30T14:30:06Z	-
dc.date.available	2026-04-30T14:30:06Z	-
dc.date.created	2026-04-27	-
dc.date.issued	2026-01	-
dc.description.abstract	Vision-language models (VLMs) offer transformative potential for robotics, but their deployment is constrained by performance limitations. In safety-critical manipulation, a model must recognize its own limitations to prevent a catastrophic failure. We conduct a systematic study of VLMs for robotic failure detection, evaluating six architectures on real-world trajectories. We put forward a decision-making process that allows a VLM to evaluate whether it can successfully complete a task, and if not, pause its operation and hand over the task to human operators. Our results show that well-calibrated VLMs can be trustworthy partners that know exactly when to ask for help.	-
dc.identifier.bibliographicCitation	IEEE PULSE, v.17, no.1, pp.63 - 65	-
dc.identifier.doi	10.1109/MPULS.2026.3659245	-
dc.identifier.issn	2154-2287	-
dc.identifier.scopusid	2-s2.0-105035817775	-
dc.identifier.uri	https://scholarworks.unist.ac.kr/handle/201301/91611	-
dc.identifier.wosid	001740998900007	-
dc.language	English	-
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC	-
dc.title	Assessing Vision-Language Models for Failure Detection in Robotic Manipulation	-
dc.type	Article	-
dc.description.isOpenAccess	FALSE	-
dc.relation.journalWebOfScienceCategory	Engineering, Biomedical	-
dc.relation.journalResearchArea	Engineering	-
dc.type.docType	Article	-
dc.description.journalRegisteredClass	scie	-
dc.description.journalRegisteredClass	scopus	-
dc.subject.keywordAuthor	Foundation models	-
dc.subject.keywordAuthor	Robots	-
dc.subject.keywordAuthor	Vision language model	-
dc.subject.keywordAuthor	Failure analysis	-
dc.subject.keywordAuthor	Semantic communication	-
dc.subject.keywordAuthor	Visual perception	-
dc.subject.keywordAuthor	Safety	-
dc.subject.keywordAuthor	Monte Carlo methods	-
dc.subject.keywordAuthor	Heuristic algorithms	-
dc.subject.keywordAuthor	End effectors	-
dc.subject.keywordAuthor	Bayes methods	-
dc.subject.keywordAuthor	Calibration	-
dc.subject.keywordAuthor	Medical robotics	-
dc.subject.keywordAuthor	Computer architecture	-

