Current service robots perform flawless demos under close human supervision, but they often fail when working autonomously in long-term deployments. This dissertation studies the failures in long-term autonomous service robots and proposes methods to improve their reliability.
We built TritonBot, a receptionist and tour-guide robot, as a realistic example to discover the failure modes of a long-term autonomous service robot. TritonBot recognizes people's faces, talks to people, and shows people the labs and facilities in a university building. We deployed TritonBot for hundreds of hours to identify failure modes and common issues on service robots.
Following the experience from TritonBot, we designed two reliability engineering methods to improve the robustness of service robots. First, we found software encapsulation and dynamic orchestration streamline development workflows and avoid resource contention in service robots. Software encapsulation allows the developers to pack software into self-contained containers and simplify development workflows, and dynamic orchestration schedules the components on demand to avoid CPU/memory resource contention. We developed Rorg, a Linux container-based scheme to manage software components on service robots. Second, we found simulating a broad spectrum of rare failures at system level exposes design flaws and improves the robustness of service robots. Design errors in robotics are challenging to discover due to the need for extensive and resource-demanding testing. Broad-spectrum system-level failure injection exposes both software- and hardware-related design flaws and assists developers in reproducing rare failures, verifying the fixes, and testing the robustness of a robot system. We implemented RoboVac, an extensible and convenient fault injection framework that works at the system level and covers many failure patterns seen in long-term autonomous service robot deployments.
After working with TritonBot for two years and implementing these reliability engineering methods, we derived a set of design principles for long-term autonomous service robots at different levels of the system hierarchy. These design principles guide robust and reliable long-term autonomous service robot designs.
We use a set of automated tools, engineering methods, and design principles to build service robots that are available 24x7, and we call it "Reliability Engineering for Long-term Deployment of Autonomous Service Robots."
Google's multilingual speech recognition system combines low-level acoustic signals with language-specific recognizer signals to better predict the language of an utterance. This paper presents our experience with different signal combination methods to improve overall language identification accuracy. We compare the performance of a lattice-based ensemble model and a deep neural network model to combine signals from recognizers with that of a baseline that only uses low-level acoustic signals. Experimental results show that the deep neural network model outperforms the lattice-based ensemble model and reduces the error rate from 5.5% in the baseline to 4.3%, a 21.8% relative reduction.
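The relative reduction quoted above follows directly from the two error rates. A minimal sketch of the arithmetic, where the function name `relative_reduction` is an illustrative choice and only the 5.5% and 4.3% figures come from the abstract:

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Return the fraction of the baseline error removed by the improved model."""
    return (baseline - improved) / baseline


baseline_error = 5.5  # percent, acoustic-only baseline (from the abstract)
dnn_error = 4.3       # percent, deep neural network combiner (from the abstract)

reduction = relative_reduction(baseline_error, dnn_error)
print(f"{reduction:.1%}")  # prints 21.8%
```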
While many researchers have built service robot prototypes that work perfectly under close human supervision, deploying an autonomous robot in an open environment for a long time is rarely trivial. This paper presents our experience with TritonBot, a long-term autonomous receptionist and tour guide robot. We deployed TritonBot as an example to study reliability challenges in long-term autonomous service robots. Over the past two years, we have regularly performed maintenance, fixed issues, and rolled out new features. In the process, we identified reliability engineering challenges in three aspects of long-term autonomy: scalability, resilience, and learning; we also formulated techniques to confront these challenges. Our experience shows that proper engineering practices and design principles reduce manual interventions and increase general reliability in long-term autonomous service robot deployments.
Scaling up the software system on service robots significantly increases the maintenance burden on developers and the risk of resource contention on the robot's onboard computer. As a result, developers spend much time configuring, deploying, and monitoring the robot software system, and robots consume significant computing resources when all software processes are running. We propose Rorg, a Linux container-based scheme to manage, schedule, and monitor software components on service robots. Although Linux containers are already widely used in cloud environments, the technique is challenging to adopt efficiently in service robot systems due to the unique characteristics of service robots: they are multi-purpose systems with limited resources and high performance requirements. To pave the way for efficient use of Linux containers on service robots, we propose a programmable container management interface and a resource time-sharing mechanism integrated with the Robot Operating System (ROS). Rorg allows developers to pack software into self-contained images and run them in isolated environments using Linux containers; it also allows the robot to turn software components on and off on demand to avoid resource contention. We evaluate Rorg with a long-term autonomous tour guide robot: it manages 41 software components on the robot and relieves our maintenance burden, and it reduces CPU usage by 45.5% and memory usage by 16.5% on average.
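The idea of turning components on and off on demand can be illustrated with a small reference-counting sketch. This is not Rorg's actual interface: the class `OnDemandComponents`, its `acquire`/`release` methods, the component name, and the reference-counting policy are all assumptions made for illustration, and the `start`/`stop` callbacks stand in for whatever would launch or stop a container.

```python
class OnDemandComponents:
    """Start a component when its first user appears, stop it when the last leaves.

    Hypothetical sketch: `start` and `stop` are caller-supplied callbacks.
    """

    def __init__(self, start, stop):
        self._start = start
        self._stop = stop
        self._users = {}  # component name -> number of active users

    def acquire(self, name):
        count = self._users.get(name, 0)
        if count == 0:
            self._start(name)  # first user: bring the component up
        self._users[name] = count + 1

    def release(self, name):
        count = self._users.get(name, 0)
        if count == 0:
            raise ValueError(f"{name} is not in use")
        if count == 1:
            self._stop(name)  # last user gone: free CPU and memory
            del self._users[name]
        else:
            self._users[name] = count - 1

    def running(self):
        return sorted(self._users)


events = []
manager = OnDemandComponents(
    start=lambda name: events.append(("start", name)),
    stop=lambda name: events.append(("stop", name)),
)
manager.acquire("face_recognition")  # hypothetical component name
manager.acquire("face_recognition")  # second user, no second start
manager.release("face_recognition")
manager.release("face_recognition")  # last user leaves, component stops
print(events)
```

With this policy a component consumes resources only while some behavior needs it, which is the effect the abstract attributes to on-demand scheduling.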
Service robots perform flawless demos while their developers are keeping a close eye on them, but they often fail when working autonomously, especially in a long-term deployment. To study failure modes and human-robot interaction patterns in long-term deployments, we built TritonBot, a long-term autonomous robot working as a building receptionist and a tour guide. It recognizes people’s faces, talks to them, and guides them to the labs and facilities in an office building. This paper presents the design of TritonBot and the lessons we learned from the first month of deployment with respect to technical and human-robot interaction aspects. TritonBot and its variant BoxBot have worked for 108.7 hours, actively interacted with people for 22.1 hours, greeted people 2950 times, guided 150 tours, and traveled 9.9 kilometers. We share the components of TritonBot under an open licence to help the community replicate the TritonBot platform and to inspire long-term autonomy and human-robot interaction research.
Algorithms such as Bag of Words and Simhash have been widely used in image recognition. To achieve better performance as well as energy efficiency, a hardware implementation of these two algorithms is proposed in this paper. To the best of our knowledge, this is the first time that these algorithms have been implemented in hardware for image recognition. The proposed implementation is able to generate a fingerprint of an image and accurately find the closest match in the database. It is implemented on Xilinx’s Virtex-6 SX475T FPGA. Tradeoffs between high performance and low hardware overhead are obtained through proper parallelization. Experimental results show that the proposed implementation can process 1,018 images per second, approximately 17.8x faster than software on Intel's 12-thread Xeon X5650 processor. In addition, its power consumption is 0.35x that of the software-based implementation. Thus, the overall advantage in energy efficiency is as much as 46x. The proposed architecture is scalable and is able to meet various requirements of image recognition.
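The fingerprint-and-match flow described above can be sketched in software. This is a generic Simhash over a bag-of-words histogram, not the paper's hardware design: the 64-bit width, the BLAKE2b word hash, and the names `simhash`, `hamming`, and `closest` are assumptions chosen for illustration.

```python
import hashlib

BITS = 64  # assumed fingerprint width


def word_hash(word_id: int) -> int:
    """Hash a visual-word ID to a 64-bit integer (BLAKE2b is an illustrative choice)."""
    digest = hashlib.blake2b(str(word_id).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def simhash(word_counts: dict[int, int]) -> int:
    """Fold a bag-of-words histogram into one fingerprint, weighting by word count."""
    totals = [0] * BITS
    for word_id, count in word_counts.items():
        h = word_hash(word_id)
        for i in range(BITS):
            totals[i] += count if (h >> i) & 1 else -count
    fingerprint = 0
    for i, total in enumerate(totals):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def closest(query: int, database: dict[str, int]) -> str:
    """Return the name of the database entry with the smallest Hamming distance."""
    return min(database, key=lambda name: hamming(query, database[name]))


# Hypothetical histograms: visual-word ID -> occurrence count.
database = {
    "image_a": simhash({1: 5, 2: 3, 9: 1}),
    "image_b": simhash({4: 2, 7: 6, 8: 4}),
}
query = simhash({1: 5, 2: 3, 9: 1})
print(closest(query, database))
```

Similar histograms yield fingerprints that differ in few bits, so matching reduces to XOR and bit counting, operations that map naturally onto parallel hardware.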
Most implementations of boundary scan chains have a fixed length, typically hundreds or thousands of cells. Because the whole chain is scanned every time, many clock cycles are wasted when only a small part of it is of interest. In this paper, a novel structure for a configurable boundary scan chain is proposed. Its length and content can be reconfigured without interrupting the chip’s functionality. Experimental results show that the maximum frequency can be as high as 510.4 MHz for a fully configurable chain with 512 cells in a 32 nm process, which is 15.7x better than the intuitive method. The proposed structure has been applied to a processor prototype design and is expected to meet the requirements of different applications.
This paper proposes a high-performance hardware architecture for the Speeded Up Robust Features (SURF) algorithm based on OpenSURF. To achieve a high processing frame rate, the hardware architecture is designed with several characteristics. First, a sliding window method is proposed to extract feature points in parallel at selected scale levels. As a result, the time cost of feature extraction can be greatly reduced. Second, a data reuse strategy is proposed in orientation generation and descriptor generation to reduce the number of memory accesses. In this way, 3.87x and 2.25x speedups are achieved, respectively. Third, the integral image is segmented and buffered in different memory blocks to support multiple data accesses in one clock cycle, which further reduces the overall computation time of our implementation. The hardware architecture is implemented on an XC6VSX475T FPGA at 156 MHz, and its maximum frame rate for VGA-format images reaches 356 frames per second (fps), which is 6.25 times the frame rate of OpenSURF running on a server with a Xeon 5650 processor and 6 times the reported frame rate of a recent implementation on three Virtex-4 FPGAs.
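SURF relies on the integral image because any box sum then needs only four lookups, which is why serving several integral-image reads in one clock cycle pays off. A minimal software sketch of that data structure follows; it illustrates the general technique rather than the paper's memory layout, and the function names are illustrative.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) table where ii[y][x] is the sum of img[0:y][0:x]."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] using exactly four table lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
table = integral_image(image)
print(box_sum(table, 1, 1, 3, 3))  # 5 + 6 + 8 + 9 = 28
```

The four reads in `box_sum` hit four different table positions, so splitting the table across separate memory blocks lets hardware fetch them concurrently.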
In this paper, a high-performance match engine for content-based image retrieval (CBIR) is proposed. Highly customized floating-point (FP) units are designed to provide the dynamic range and precision of standard FP units with considerably less area. Match calculation arrays with various architectures and scales are designed and evaluated. A CBIR system is built on a 12-FPGA cluster. Inter-FPGA connections are based on standard 10-Gigabit Ethernet. The whole FPGA cluster can compare a query image against 150 million library images within 10 seconds, based on detailed local features. Compared with a solution based on an Intel Xeon 5650 server, our implementation is 11.35 times faster and 34.81 times more power efficient.