This is an introduction to the paper: “Efficient human-machine control with asymmetric marginal reliability input devices”
Most of the time, user interfaces have fairly reliable inputs. Small levels of error are tolerable when we have simple correction features like undo. But in some contexts, like brain-computer interfaces (BCIs), input is very noisy. This tends to result in a frustrating user experience, and a variety of correction methods are used to reduce errors. Unfortunately, undo-style correction can be inefficient when controls are very unreliable; standard techniques can result in correction cascades, where attempts to undo previous actions introduce new errors that themselves need to be undone.
To get a feel for how the ideas work in a user interface, watch the video, then play with the interactive “noisy guessing game” demo in the section below.
A simple abstract model of a “marginally usable” interface is a pair of buttons. We can assume a user can elect to press either button, but there is some probability that the input will be flipped at random because the input device mis-recognises an intention.
A non-invasive EEG-based BCI, like one which classifies changes in rhythms in the motor cortex when imagining movement, fits this model. You imagine wiggling your left hand, and the weak neural signals associated with this imaginary movement are picked up by electrodes on the scalp and eventually the BCI registers that you “hit the left button”, with some probability.
An interface like this might produce binary decisions with an error rate of 5% for a “good” user. But performance varies hugely among different users, and even across one user on different days. Other users might have error rates of the order of 25%. Given that each “noisy button push” might take a second or more to produce, it is imperative to have efficient ways of achieving error-free control. The problem is how to optimally decode the noise-corrupted inputs to recover user intention.
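As a concrete sketch (my own illustration, not code from the paper), the noisy button can be simulated as a binary symmetric channel that flips each intended press with some probability f:

import random

def noisy_button(intended_bit, f=0.25):
    """Simulate a noisy button: return the intended bit,
    flipped with probability f (a binary symmetric channel)."""
    flip = random.random() < f
    return intended_bit ^ flip

# a "good" user might see f = 0.05; a poor session might see f = 0.25
presses = [noisy_button(1, f=0.25) for _ in range(10)]
print(presses)  # roughly a quarter of these will be 0 instead of 1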
We can see this as a guessing game played between a user and a system: the user thinks of a number x, the system asks binary questions (“is x to the left of this point?”), and some of the answers it receives are lies.
If it weren't for the lying, the optimal solution — in terms of the fewest questions to get within some tolerance ϵ of x — would be simple bisection. In the input device problem the “lies” become bit flips induced by noise. Horstein's algorithm [Horstein1963], discussed below, is the (information-theoretic) optimal solution to the noisy version if the “lies” are made randomly.
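As a sketch of the noise-free case (an illustration of bisection, not code from the paper), each truthful answer halves the interval, so reaching a tolerance ϵ takes about log2(1/ϵ) questions:

def bisect_guess(answer, eps):
    """Guess a hidden x in [0, 1] to within eps, where answer(m)
    truthfully reports whether x lies to the left of m."""
    lo, hi = 0.0, 1.0
    questions = 0
    while hi - lo > eps:
        m = (lo + hi) / 2
        if answer(m):   # x is left of m
            hi = m
        else:           # x is right of m
            lo = m
        questions += 1
    return (lo + hi) / 2, questions

# hidden x = 0.3, tolerance 0.01: takes ceil(log2(100)) = 7 questions
print(bisect_guess(lambda m: 0.3 < m, 0.01))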
To get a feel for Horstein's algorithm, try the demo:
Returning to our user abstraction: if I, the user, want to do something useful with a computer — “start music playing”, say — we need a way of mapping noisy button pushes onto UI options. This process should be efficient and robust.
Ideally, it would also be probabilistic, so that it gives a probability distribution over options. This makes it easier to incorporate prior models about what options I might want, or utility functions about which options are most “valuable” (or dangerous!) in a consistent way.
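A minimal sketch of what such a probabilistic update could look like (my own illustration under assumed likelihoods, not the paper's implementation): maintain a distribution over options and reweight it by the likelihood of each noisy observation:

def update(prior, likelihood):
    """One Bayesian step: prior and likelihood are dicts keyed by option."""
    posterior = {o: prior[o] * likelihood[o] for o in prior}
    z = sum(posterior.values())  # normalising constant
    return {o: v / z for o, v in posterior.items()}

# e.g. three options, with "play music" given a higher prior
belief = {"play music": 0.5, "lights on": 0.25, "call help": 0.25}
# a (hypothetical) noisy observation that weakly favours "lights on"
belief = update(belief, {"play music": 0.3, "lights on": 0.5, "call help": 0.2})
print(belief)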
Assume we have a collection of UI options denoted X = {x_1, x_2, x_3, …}. I have an intention to select a specific X = x_i. How can we update the conditional distribution P(X = x_i | X_{t-1}, Y_t), where Y_t is the input from a user (e.g. a BCI signal) at time t and X_{t-1} is the belief distribution at the previous time step (or a prior before we begin interaction)? If we could do that, we could then:
From an information theory perspective, we can view the human-machine loop as a way of coding for a noisy, bandwidth-limited input channel. How can we efficiently and robustly transport intention to system state? We can map common user interface elements onto the standard elements of a communication system:
We will focus on developing channel codes for low-reliability (high noise) interfaces. And we will explore how feedback control allows all of this coding to be done on the system-side, without introducing additional mental demands on the user.
There are two general concepts we can apply to build a robust decoder. We can use:
We often encounter asymmetric interfaces, where a rich, virtually noise-free visual display is coupled with a low-bandwidth, noisy input device (for example, a low-bandwidth BCI with a high-capacity visual display). This leads us to the question: how can we rebalance the control loop so it leans more heavily on the feedback path?
Framing this probabilistically, how can we optimally perform online probabilistic updates of a distribution over UI options? What input should be elicited from the user via the feedback path, and how should we decode this input to update the probability distribution? This is a question of information theory, and Shannon [Shannon1948] showed that, no matter how high the noise level, it is always possible to communicate with arbitrarily low error rates. The question then becomes:
Horstein [Horstein1963] showed an optimal code for noisy channels with noise-free feedback. This is a kind of posterior matching feedback code [ShayevitzFeder2008] which is provably optimal if the assumptions about noise-free feedback and perfectly known channel properties hold. It is the optimal solution to the noisy guessing game introduced above.
There are a few key assumptions for this to work:
Assume we have an input device like the noisy button, where a user's input is a binary decision: left or right of a dividing line. We assume we want to select one of N options, where for simplicity we can assume N = 2^k. That means we have to transmit exactly k bits of information from a user's head to the system state to make a selection (k could be fractional, if we want). We will have some residual probability that the decoded symbol is incorrect; we denote this e_k. We can control this error level by asking the user to “confirm” their decision with extra information; we denote by β the number of bits of additional confirmation.
We will have a channel which flips a fraction f of inputs. We configure the decoder by telling it what fraction of flips to expect, f'. The Horstein decoder is optimal if f = f'. We continue observing inputs from the user until we are sufficiently sure (i.e. the entropy is low enough) to make a final decision: starting from a uniform distribution on [0, 1] (differential entropy 0), accumulating k + β bits of information corresponds to the differential entropy dropping to -(k + β).
A decoder is specified by the tuple (k, β, f').
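To get a feel for what noise costs (a back-of-envelope sketch of my own, not a figure quoted from the paper): a binary symmetric channel with flip rate f has capacity C = 1 - H(f) bits per input, so delivering the k + β bits of evidence the decoder waits for needs at least around (k + β)/C noisy button pushes:

from math import log2

def bsc_capacity(f):
    """Capacity (bits per use) of a binary symmetric channel with flip rate f."""
    if f in (0.0, 1.0):
        return 1.0
    h = -f * log2(f) - (1 - f) * log2(1 - f)  # binary entropy H(f)
    return 1 - h

k, beta = 4, 2
for f in (0.05, 0.25):
    print(f, (k + beta) / bsc_capacity(f))  # ~8.4 and ~31.8 inputs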
Pseudo-code for the algorithm is:
function horstein(k, beta, f')
    p = 1 - f'
    F_0 = line_segment(0,0 => 1,1)         # uniform CDF over [0, 1]
    while entropy(F_i) > -(k + beta) do
        median = find_median(F_i)
        display(median, targets)
        bit = receive_input()
        left, right = split(F_i, median)
        if bit == 0 then                   # input says: target is left
            left = 2p * left
            right = 2(1-p) * right
        else                               # input says: target is right
            left = 2(1-p) * left
            right = 2p * right
        end if
        F_i+1 = join(left, right)
    end while
    return median
end function
(see below for real Python code)
The key step is splitting the distribution function at the median, then scaling the left and right segments proportionally to the probability of error.
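As a tiny worked example of this step (my own numbers): suppose the decoder expects a flip rate f' = 0.2 and the user indicates “left”. Each half holds probability mass 0.5 before the split, so rescaling by 2p and 2(1 - p) keeps the total at 1:

f_prime = 0.2                    # assumed expected flip rate f'
p = 1 - f_prime                  # probability the observed bit is truthful
left_mass = right_mass = 0.5     # mass either side of the median
# the user's (noisy) input says the target is to the LEFT:
left_mass, right_mass = 2 * p * left_mass, 2 * (1 - p) * right_mass
print(left_mass, right_mass)     # 0.8 0.2 -- still sums to 1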
The effect of the algorithm is to gradually concentrate probability mass around the user's intended target. Because a full history is maintained, multiple hypotheses can be retained during selection.
Even in the presence of errors, this process converges to a distribution concentrated on the user's intended selection, given a sufficient number of steps and provided the assumptions about the channel statistics hold. The example below shows the PDF as noisy button pushes are registered:
This gives a robust bisection method that will tolerate any level of error in the inputs and can produce output with a bounded residual error level e_k — which we can choose to make very small. To make this into a user interface, we can:
This works, but has a couple of issues:
These issues can be mitigated with a few design tweaks:
That leads to a final probabilistic spatial interface for unreliable binary inputs: a pair of Horstein decoders, using diagonal splits and linear zooming for display.
If you have questions like:
then you can find all of the answers in the full paper [Williamson2020].
Sure, the code is just below :) It will make sense to use this type of decoder in an interface if:
It will be particularly useful if:
The following is a basic implementation of Horstein's algorithm. No care is taken to be efficient or numerically stable.
from math import log2
def f(xs, ys, y_test):
    """Invert the piecewise-linear CDF: return (index, x) for the
    point where the CDF first reaches the value y_test."""
    for i in range(len(xs) - 1):
        x, nx = xs[i], xs[i + 1]
        y, ny = ys[i], ys[i + 1]
        if y < y_test <= ny:
            slope = (ny - y) / (nx - x)
            return i + 1, x + (y_test - y) / slope
def split(xs, ys, at, p):
    """Split the CDF at the quantile "at" (0.5 = the median),
    reweighting the left side's mass to p, the right side's to 1 - p."""
    i, x_split = f(xs, ys, at)
    xs = xs[:i] + [x_split] + xs[i:]
    # each side holds mass 0.5 when at = 0.5, so scaling by 2p / 2(1-p)
    # leaves the distribution correctly normalised
    left_y = [2 * y * p for y in ys[:i] + [at]]
    right_y = [1 - (2 * (1 - y) * (1 - p)) for y in ys[i:]]
    return xs, left_y + right_y
def entropy(xs, ys):
    """Return the differential entropy of the PDF (the slopes of the
    piecewise-linear CDF are the piecewise-constant density values)."""
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
              for i in range(len(xs) - 1)]
    hs = [
        p * log2(p) * (xs[i + 1] - xs[i])
        for i, p in enumerate(slopes) if p != 0
    ]
    return -sum(hs)
def horstein(k, beta, f0, f1, elicit):
    """Horstein decoding loop.
    k: symbol length (bits)
    beta: confirmation bits
    f0, f1: expected bit error rate (BER) for each input
    elicit: function(m_i) which should return 1 or 0
            (or True and False)
    """
    # start from a uniform belief: a straight-line CDF on [0, 1]
    xs, ys = [0.0, 1.0], [0.0, 1.0]
    # posterior weight of the half indicated by the observed bit
    # (Bayes' rule, with equal mass either side of the median)
    p = (1 - f0) / ((1 - f0) + f1)
    q = (1 - f1) / ((1 - f1) + f0)
    # stop once k + beta bits of information have been accumulated
    while entropy(xs, ys) > -(k + beta):
        _, m_i = f(xs, ys, 0.5)
        # get input: is target left or right of the median?
        bit = elicit(m_i)
        if bit:
            xs, ys = split(xs, ys, 0.5, 1 - q)
        else:
            xs, ys = split(xs, ys, 0.5, p)
    # return the posterior median as the point estimate
    return f(xs, ys, 0.5)[1]
def demo_elicit(m_i):
    # a noise-free demo "user" whose hidden target is x = 0.71875:
    # answer 1 ("target is right of the median") whenever m_i < target
    return m_i < 0.71875
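For example (with an assumed 10% error rate on each input, and k = 4, β = 2 of my own choosing), the decoder should return an estimate close to the demo user's target:

print(horstein(4, 2, 0.1, 0.1, demo_elicit))  # ~0.71875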