Visual Recognition for Autonomous Vehicles

Author(s)
Jo, Ahra
Advisor
Hwang, Sung Ju
Issued Date
2016-08
URI
https://scholarworks.unist.ac.kr/handle/201301/72085
http://unist.dcollection.net/jsp/common/DcLoOrgPer.jsp?sItemId=000002301152
Abstract
Over the past several years, the computer vision community has intensively researched object detection and classification. These techniques can be applied in many fields, but are particularly important to autonomous vehicle development. In early work in this area, many researchers sought higher accuracy on still images, and the recently developed Convolutional Neural Network (CNN) stands as a great achievement. However, a model trained this way is suitable only for photographs, because its training data consists of many photographs. To achieve higher performance on dash-cam videos, we need to acquire and process a large volume of dash-cam training data. For supervised learning, each object must be labelled; however, the cost of this labelling is very high, because a video has many frames and each frame contains many objects.
To reduce the cost of labelling and to automatically improve the accuracy of the object detection and classification system, the present study proposes incremental learning for an object recognition system. The proposed model is based on the Faster R-CNN (Regions with Convolutional Neural Network features) model for object detection and recognition. My main idea can be divided into two parts.
The first is detection with object tracking. Faster R-CNN's object detector depends solely on an object proposal network, so it sometimes misses many objects. In the proposed method, this shortcoming is overcome by exploiting a property of video: because video has a time dimension, objects in the previous and current frames are correlated. Accordingly, an object tracker is used, and taking the union of the object proposals and the object tracking results yields remarkable progress in object detection.
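The union of proposal and tracking results described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the box format, the IoU threshold, and the helper names are all assumptions.

```python
# Sketch: merge detector proposals with tracker predictions from the
# previous frame. A tracked box is kept only when no proposal already
# covers it, so the tracker recovers objects the proposal network missed.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def merge_detections(proposals, tracked, iou_thresh=0.5):
    """Union of detector proposals and tracked boxes (threshold assumed)."""
    merged = list(proposals)
    for t in tracked:
        if all(iou(t, p) < iou_thresh for p in proposals):
            merged.append(t)
    return merged
```

For example, if the detector proposes one box and the tracker carries over that same box plus a second object the detector missed, the merged set contains both objects without duplicating the first.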
The second is object classification with incremental learning. Previous works train a model on a fixed training set and focus on optimizing the model for that set; to improve accuracy or add training data, the model must be retrained, which requires separating training time from testing time. Thus, a system that learns while it runs, improving accuracy in real time, is proposed here. If an input object's output score is higher than a threshold, the object is retained for reuse as training data. This method improves mAP.
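The confidence-thresholded harvesting step can be sketched as below. The threshold value, the detection tuple layout, and the function name are illustrative assumptions, not the thesis configuration.

```python
# Sketch: keep only high-confidence detections for reuse as training data,
# as in the incremental-learning step. THRESHOLD is an assumed cutoff.

THRESHOLD = 0.9

def harvest_pseudo_labels(detections, threshold=THRESHOLD):
    """detections: list of (box, label, score); return (box, label) pairs
    whose score clears the threshold, for use as new training examples."""
    return [(box, label) for box, label, score in detections
            if score >= threshold]

# Example: only the confident detection survives the filter.
dets = [((0, 0, 10, 10), "car", 0.97),
        ((5, 5, 20, 20), "person", 0.42)]
pseudo = harvest_pseudo_labels(dets)
```

The harvested pairs would then be fed back into the detector's training set, so the model improves without a separate offline labelling pass.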

For autonomous vehicles to become widely available, safety is paramount. State-of-the-art systems are based on sensors or simple visual detection: for example, the distance between vehicles is measured using radar, and lane departure is detected using a line detection algorithm. However, each of these is a single function; such a system can detect an imminent head-on collision but cannot respond quickly enough to more complex dangerous situations. Therefore, a hazard and accident prediction and categorization system is proposed here. The proposed system can detect a variety of hazards and accidents, and it can categorize the kind of accident, involving not only the user's car but also other vehicles. The system focuses on giving warning information to drivers.
The algorithm used here is based on a CNN-LSTM model trained on many accident videos. Two models are used in this system. The first is a hazard or accident detection model: each clip is labelled with an accident probability, and the final layer is a Euclidean loss layer for model training. This enables the model to predict the probability of a hazard or accident; if its output is over a threshold, a hazard is identified and the clip is passed to the accident categorization model.
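The Euclidean loss and the threshold gate between the two models can be sketched as below. This is an assumption-laden illustration: the loss formulation follows the common half-mean-squared-error convention, and the 0.5 gate threshold is a placeholder, not the thesis value.

```python
import numpy as np

# Sketch: Euclidean (L2) regression loss on the predicted accident
# probability, plus the threshold gate that forwards hazardous clips to
# the accident-categorization model. Threshold value is assumed.

def euclidean_loss(pred, target):
    """Half mean squared error between prediction and label, a common
    formulation of the Euclidean loss layer."""
    d = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return 0.5 * float(np.mean(d ** 2))

def gate_hazard(prob, threshold=0.5):
    """Forward the clip to the categorization model only when the
    predicted hazard probability exceeds the threshold."""
    return prob > threshold
```

A perfect prediction gives zero loss, and only clips whose predicted probability clears the gate reach the second model.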
The second is an accident categorization model. My system can detect nine kinds of accidents: forward collision, a car cutting in, intersection accident, jaywalking, collision with a two-wheeled vehicle, collision between another vehicle and a two-wheeled vehicle, another car slipping, another car reversing, and rollover of another car. The algorithm applies a max-pooling layer after the LSTM layers so that the most striking features are propagated.
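The temporal max-pooling step after the LSTM layers can be sketched as follows. The feature shapes (T frames by D channels) are illustrative assumptions; the sketch only shows why max-pooling keeps each channel's most striking activation across the clip.

```python
import numpy as np

# Sketch: max-pool per-frame LSTM features over time, so the clip-level
# descriptor retains the strongest activation of each feature channel.

def temporal_max_pool(lstm_outputs):
    """lstm_outputs: (T, D) array of per-frame features -> (D,) clip feature."""
    return np.max(np.asarray(lstm_outputs, dtype=float), axis=0)
```

For a two-frame clip with features [1, 5] and [3, 2], the pooled clip feature is [3, 5]: each channel keeps its peak value regardless of which frame produced it.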
Publisher
Ulsan National Institute of Science and Technology (UNIST)
Degree
Doctor
Major
Department of Electrical and Computer Engineering
