On a Usefulness Metric for Robots

The last ten years have seen a surge of new domestic robots developed for deployment in the home. Many of these robots are adorned with fancy human-like arms, state-of-the-art sensor arrays and the most powerful computers. It is, however, not obvious that adorning a robot with the fanciest hardware or the most complex arm configurations makes it any better at performing the household tasks it is purchased to do. Even worse, such a robot, by virtue of its expensive motors, compute and sensor suite, would cost an arm, a leg, and perhaps a couple of kidneys. I’d even go so far as to argue that tossing complex high degree-of-freedom arms, expensive sensor suites and energy-hungry high-performance compute into the poor robot indirectly contributes to its inability to efficiently perform the very tasks we purchase it to perform. Like the proverbial spoiled, overfed English bulldog, it lazes on its rug all day, eating turkey, making poop and barking at strangers. It needs to be spanked back into action!

How do we spank our robots back into action? Or, more accurately, how do we spank the imagination of robot engineers back into action, to dispel the lazy idea of building every domestic robot in the image of human anatomy rather than optimizing for the morphology that best suits the tasks the robot is being built to perform? The answer I propose in this post is Metrics! More specifically, a Usefulness Metric.

The Usefulness Metric should ideally measure how much work the robot can get done within a particular period of time. The greater the Usefulness Score of a robot, the more useful it is.

The right way to think of a robot is to view it as a tool. Like a blender, or a sieve, or a fleshlight, or a coffee maker. A tool that is built to perform a task or a specific set of tasks. The usefulness of a tool is judged by how well it is able to accomplish the task it is built to do. A rice-cooker that is unable to cook rice to the right softness is cast out into the bottomless pit, where there is weeping and gnashing of teeth. As roboticists, viewing robots in the same light would enable us to make objective decisions, quickly blot out bad ideas and make rapid progress in building efficient, cost-effective robots that can actually do useful stuff.

How do we determine a robot’s Usefulness Score? I propose the following method:

Decide on the tasks you’d like to benchmark the robot on. Let $T$ represent the number of tasks.
For each task $t$, get the robot to perform the task $N_t$ times. Track the time each run takes.
Let $S_t$ represent the number of successful runs of task $t$, where $S_t \leq N_t$.
Let $avg\_time_t$ represent the average duration, in seconds, of the $S_t$ successful runs of task $t$. If $S_t = 0$, set $avg\_time_t$ to some pre-determined timeout, in seconds, say $1800s$. (The task then contributes zero to the score, since $S_t / N_t = 0$; the timeout only keeps the average defined.)
The Usefulness Score, $U$, is then expressed as
$U = \frac{100}{T} \sum_{t=1}^{T} \frac{S_t}{N_t} \cdot \frac{1}{avg\_time_t}$
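
For anyone who would rather compute this than work it out by hand, here is a minimal Python sketch of the method above. The `TaskTrials` container, its field names and the `usefulness_score` function are my own illustrative choices, not part of any robot's software. It also assumes that only the durations of successful runs are recorded, so that $S_t$ is simply the number of recorded durations.

```python
from dataclasses import dataclass
from typing import Sequence

# Assumed default timeout in seconds, taken from the example value in the text.
TIMEOUT_S = 1800.0


@dataclass
class TaskTrials:
    """Results of running one task N_t times (hypothetical container)."""
    name: str
    num_trials: int                        # N_t
    success_durations_s: Sequence[float]   # one entry per successful run, so S_t = len(...)


def usefulness_score(tasks: Sequence[TaskTrials], timeout_s: float = TIMEOUT_S) -> float:
    """Return U = (100 / T) * sum over tasks of (S_t / N_t) * (1 / avg_time)."""
    if not tasks:
        raise ValueError("at least one task is required")

    total = 0.0
    for task in tasks:
        successes = len(task.success_durations_s)  # S_t
        if task.num_trials <= 0:
            raise ValueError(f"{task.name}: number of trials must be positive")
        if successes > task.num_trials:
            raise ValueError(f"{task.name}: more successes than trials")
        if any(d <= 0 for d in task.success_durations_s):
            raise ValueError(f"{task.name}: durations must be positive")

        # Average over successful runs only; fall back to the timeout when
        # there were none, which keeps the division below well defined.
        avg_time = sum(task.success_durations_s) / successes if successes else timeout_s
        total += (successes / task.num_trials) / avg_time

    return 100.0 / len(tasks) * total
```

A task with no successful runs adds nothing to the sum, so in this sketch the timeout value never changes the score; it is kept only to mirror the method as stated.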

Considering two tasks, cleaning the dining table and putting clothes in the laundry machine, the Usefulness Score for Stretch, a domestic robot, could be calculated as follows:

Number of tasks, T = 2; number of trials per task, N = 5; timeout = 1800s.

Let’s assume that Stretch successfully cleans the dining table in 4 out of 5 tries, with an average time of 600s across the successful tries, and puts clothes into the laundry machine in 3 out of 5 tries, with an average time of 480s across the successful tries. Stretch’s Usefulness Score is then

$U_{stretch} = \frac{100}{2} \cdot (\frac{4}{5} \cdot \frac{1}{600} + \frac{3}{5} \cdot \frac{1}{480})$

$U_{stretch} \approx 0.129$

For another, more ‘sophisticated’ robot, PR2, let’s assume that it successfully cleans the dining table in 2 out of 5 tries, with an average time of 1600s across the successful tries, and puts clothes into the laundry machine in 1 out of 5 tries, taking 1800s for that one successful try. PR2’s Usefulness Score is computed as

$U_{PR2} = \frac{100}{2} \cdot (\frac{2}{5} \cdot \frac{1}{1600} + \frac{1}{5} \cdot \frac{1}{1800})$

$U_{PR2} \approx 0.018$
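
For anyone checking the arithmetic, the two sums reduce as follows, using the assumed numbers above (exact fractions first, then rounded to three decimal places):

$U_{stretch} = 50 \cdot \left(\frac{1}{750} + \frac{1}{800}\right) = 50 \cdot \frac{31}{12000} = \frac{31}{240} \approx 0.129$

$U_{PR2} = 50 \cdot \left(\frac{1}{4000} + \frac{1}{9000}\right) = 50 \cdot \frac{13}{36000} = \frac{13}{720} \approx 0.018$

On these hypothetical numbers, Stretch scores roughly seven times higher than PR2.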

In these hypothetical experimental runs, we can conclude that, despite its simplicity, Stretch is a more useful robot than PR2. Not to bash on PR2, but I strongly believe that if these experiments were actually performed, Stretch would have the higher Usefulness Score.

It’s glaringly obvious that I haven’t once mentioned the role of control, perception and planning algorithms in influencing the Usefulness Score of robots. My reason is that the complexity and tractability of control and planning algorithms depend heavily on the morphology of the robot. Due to its simplicity, the feedback control loop of a robot like Stretch could be orders of magnitude faster than that of PR2, where the controller has to tame all 14 joints that make up PR2’s arms. Similarly, full-body motion planning would be much faster with Stretch. Since Stretch is just as physically capable as PR2 of performing the specified cleaning and laundry tasks, I’d argue that Stretch is the more useful robot for these tasks.

“But Alphonsus!”, you might retort, “PR2 can grasp my shorts in many different postures and fancy human-like ways thanks to its 7-DOF arms!”

“True”, I’d respond. “But is that why you’d spend $250,000 on a PR2? Isn’t the robot’s job to perform the tasks it was purchased to do, regardless of how it manages to do it?”

Whether or not you agree with my assessment, the benefit of the Usefulness Score is that it will enable us to have these objective conversations and make the right design choices when building the domestic robots that will loiter in our homes in the years to come.

Do you agree? Please let me know in the comments.
