Intro: What the heck is NUI?
When we communicate with others, we look at each other, shake hands, and start a conversation. What if we could interact with machines in the same way? Natural user interface (NUI) is an innovative, multimodal interaction system that uses gaze, gesture, and voice to control computers. These days it's not really hard to find NUI in everyday life. The most common type of natural interface is the voice assistant - people talk to smart speakers or voice assistant apps just like they would talk to another human.
Amazon Alexa is probably the most well-known product using NUI
Another type of NUI is gesture. Gestures are replacing the keyboard and mouse, enabling humans to use their natural hand movements to control machines. There have been a series of cool projects that illustrate the future of NUI. One of the earliest gesture controller efforts is Myoband. (Yes, the artifact I brought to class.) Myoband detects the electrical signals coming from the arm muscles and converts human gestures into digital input. Users can manually map keyboard/mouse controls to their hand gestures.
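To make the idea of mapping gestures to keyboard/mouse input more concrete, here is a minimal Python sketch of how a recognized pose could be translated into a media-player keypress. The pose names, the stubbed pose stream, and the print-based `send_key` helper are my own assumptions for illustration, not Myoband's actual SDK.

```python
# A minimal sketch of mapping armband poses to media-player keys.
# Pose names and the stubbed pose stream are illustrative assumptions;
# a real setup would read poses from the armband's own SDK.

import time

# Hypothetical pose names -> keyboard keys a media player understands.
POSE_TO_KEY = {
    "fist": "space",        # play / pause
    "wave_right": "right",  # skip forward
    "wave_left": "left",    # skip back
    "fingers_spread": "f",  # toggle fullscreen
}

def read_next_pose():
    """Stand-in for the armband's pose stream (normally EMG-driven)."""
    demo_poses = ["fist", "wave_right", "fist", "fingers_spread"]
    for pose in demo_poses:
        time.sleep(0.5)
        yield pose

def send_key(key):
    """Pretend to press a key; swap in a real input library if desired."""
    print(f"pressing '{key}'")

if __name__ == "__main__":
    for pose in read_next_pose():
        key = POSE_TO_KEY.get(pose)
        if key:
            send_key(key)
```

In a real setup, `send_key` would call an input-injection library and the pose stream would come from the armband's gesture classifier; the point here is just how thin the gesture-to-input mapping layer can be.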
https://www.youtube.com/watch?v=HD-2NWvjuSA
Myoband is one of the early gesture-control efforts, letting users control digital media with hand movements.
The Future of NUI
Thanks to wearable devices, NUI is evolving into a multimodal interaction system. For example, when you are curious about a flower blooming in your garden, you can 'ask' your mobile device what it is by gazing at the flower or pointing at it with your own fingers. That's just like asking your teacher what the flower is. Multimodal NUI systems will enable designers and creative technologists to create fully immersive digital experiences.
Microsoft Hololens is the most well-known mixed reality headset that lets users control augmented digital content with NUI.
Hololens allows users to manipulate digital content placed in the real world without using a keyboard or mouse. Users can feel that they are controlling digital content with their hands. To interact with an object, users gaze at it and then either move it with their fingers or speak to it. Hololens was originally designed to support immersive learning and vocational training by overlaying digital instructions onto physical surfaces. The TV series The Good Doctor shows how Hololens can be used for a surgery briefing.
BMW's vision video on in-vehicle NUI shows the future of smart cars
BMW recently announced the BMW Natural Interaction system at Mobile World Congress 2019, which combines voice control with natural gesture control and gaze recognition to enable multimodal interaction. The first NUI features are promised to be available in the BMW iNEXT electric car from 2021. (link) The vision video shows how drivers can communicate with their car through the most natural human interactions: pointing with a finger, asking a question, and gazing at a target place.
Playing with Myoband
Over the past week, I've been playing with Myoband and reflecting on what worked and what didn't. Myoband doesn't take a lot of time to set up. After finishing the connection setup and calibration, I could control Netflix and Google Slides. Myoband supports multimodal gesture interactions - I can combine two or more different types of gestures to manipulate screen UIs.


Opening the application menu at the top by raising the hand + pulling it down

Controlling the volume by making a fist + rotating the hand
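Out of curiosity about how such a combination might work under the hood, here is a rough Python sketch of a compound gesture: holding a fist gates a continuous signal (wrist roll) that drives the volume. The pose names, roll angles, and `set_volume` stub are illustrative assumptions, not Myoband's actual implementation.

```python
# Sketch of a compound gesture: a held pose ("fist") gates a continuous
# signal (wrist roll angle) that is mapped onto the system volume.
# Pose names, angles, and the volume function are illustrative assumptions.

def set_volume(level):
    """Stand-in for a real volume API; just reports the new level."""
    print(f"volume -> {level}%")

def clamp(value, low=0, high=100):
    return max(low, min(high, value))

def run(events, volume=50, degrees_per_step=15):
    """Consume (pose, roll_degrees) samples; adjust volume while a fist is held."""
    last_roll = None
    for pose, roll in events:
        if pose != "fist":
            last_roll = None          # releasing the fist ends the adjustment
            continue
        if last_roll is not None:
            steps = (roll - last_roll) / degrees_per_step
            volume = clamp(volume + int(steps * 5))
            set_volume(volume)
        last_roll = roll
    return volume

if __name__ == "__main__":
    # Simulated stream: make a fist, rotate the wrist 45 degrees, then release.
    samples = [("rest", 0), ("fist", 0), ("fist", 15), ("fist", 30),
               ("fist", 45), ("rest", 45)]
    run(samples)
```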
I didn't have any of the physical objects that the band supports, but with Myoband you can interact with physical objects as well. I learned that Myoband also supports the Parrot drones we used for drone racing last week - it certainly makes you feel like a Jedi.

"Use the force, Luke!"
Limitations of NUI
It's ironic to say this, but Myoband shows a steep learning curve for a couple of reasons. First, most products don't support a truly natural interface, as the gestures are still very much artificial. Take Myoband, for example - to select something, users are required to either pinch their fingers together or make a fist. That's not how we select things in the real world. To open an app or the main menu, you spread all your fingers, which is hard to associate with the gestures we use every day (normally we spread all our fingers to count to five or wave hi to others).

On top of that, there are typically three to five 'artificial' gestures to learn in order to fully interact with these devices, which is a burden on our muscle memory. When these gestures are combined with eye tracking or voice control, they become even harder to remember. To make matters worse, each application is often configured to use a different set of interactions. For example, the motion sets for pausing videos differ between Netflix and YouTube when using Myoband.
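To illustrate that inconsistency concretely, here is a hypothetical sketch of what per-app gesture bindings end up looking like. The specific bindings below are made up, but the shape of the problem is the same: one intent maps to a different motion in each app.

```python
# Hypothetical per-app gesture bindings, illustrating the inconsistency
# problem: the same intent ("pause") is bound to different motions.
APP_BINDINGS = {
    "netflix": {"pause": "fist",           "seek": "wave_right"},
    "youtube": {"pause": "fingers_spread", "seek": "fist + rotate"},
}

def gesture_for(app, intent):
    """Look up which motion an app expects for a given intent."""
    return APP_BINDINGS[app][intent]

print(gesture_for("netflix", "pause"))  # fist
print(gesture_for("youtube", "pause"))  # fingers_spread
```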
In summary
Despite its learning curve, NUI is still a very attractive system to explore. That said, there are a couple of things I learned while playing with it. First, the human brain doesn't have enough capacity to remember an extensive set of gestures, especially if they differ from the ones we use in everyday life. If more than three gestures are required, keep them as close to natural gestures as possible. I think ethnography would be one of the best ways to learn deeply about human behaviors and incorporate them into gesture controls. And second, it's very important to keep the interaction set universal, no matter which apps or devices use it.