HoloLens: The UI of the Future

19 July 2017
Reading time: 4 minutes

The HoloLens experience is awesome. I still love trying out new apps and immersing myself in Augmented and Mixed Reality. Existing UX concepts are still being adapted to the HoloLens, and new ideas abound.

Two aspects regularly dispel the illusion, however. As has been written countless times, the field of view is just too small. Several companies are working on this problem from different angles, and I am certain it will be solved soon. The second aspect is the tediousness of the Air Tap. It is hard to target a specific point in space with my nose while executing a non-trivial gesture in front of myself. I keep wanting to nod affirmatively, which ends up clicking on some random widget border. It is also the single most difficult thing to teach a customer trying out the HoloLens for the first time.
The clicker is a highly underrated device that helps a lot with this. But you still need to fix your gaze (actually your nose) and hold it steady.

Voice input is great and fits AR well, but selecting a specific point or area will always be needed, and voice is not well suited to that. Also, voice input is not always preferable (e.g. on the train), and in an office context, separating actual voice commands from background chatter will prove challenging. A brain-computer interface (BCI) is the next big thing, but it will probably be a while before it becomes actually useful.

Therefore, this blog post is an appeal to collaborate on finding new concepts to complement voice input for AR! Here’s what I have heard of or thought up myself:

Gesture Tracking with the HoloLens

Whether you are using the HoloLens built-in tracking, Leap Motion, Kinect, or some of the newer devices around, this is what most users expect. At least half of the first-time users try the Minority-Report-Hand-Wave at some point. This is a clear indication that this gesture is intuitive and should be supported somehow. On the other hand, just as with voice input, it is hard to tell a gesture directed at the device from a wave directed at the co-worker walking past the friendly Holo-Zombie.

Gaze tracking solo

How about letting us just shake or nod our heads to say no or yes to a dialogue question? Again, it might be hard to prevent false positives, and it is not culture-invariant. But the approach feels very intuitive to me.
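To make the false-positive concern a little more concrete, here is a minimal sketch in Python. It assumes the application can sample head pitch and yaw in degrees over a short time window; the function names and thresholds are invented for illustration and are not part of any HoloLens API. It only answers "yes" or "no" after several alternating swings on one axis, so a single glance down and back up is ignored.

```python
from typing import Sequence


def count_swings(samples: Sequence[float], min_swing_deg: float) -> int:
    """Count movements of at least min_swing_deg that alternate in direction."""
    if not samples:
        return 0
    swings = 0
    direction = 0          # +1 rising, -1 falling, 0 not yet known
    extreme = samples[0]   # starting point or last turning point
    for value in samples[1:]:
        if direction != 0 and (value - extreme) * direction > 0:
            extreme = value                      # still moving the same way
        elif abs(value - extreme) >= min_swing_deg:
            direction = 1 if value > extreme else -1
            extreme = value
            swings += 1                          # large move in a new direction
    return swings


def classify_head_gesture(
    pitch_deg: Sequence[float],
    yaw_deg: Sequence[float],
    min_swing_deg: float = 8.0,   # hypothetical threshold, needs tuning
    min_swings: int = 3,          # hypothetical threshold, needs tuning
) -> str:
    """Return "yes" for a nod, "no" for a shake, "none" otherwise."""
    nods = count_swings(pitch_deg, min_swing_deg)
    shakes = count_swings(yaw_deg, min_swing_deg)
    if nods >= min_swings and nods > shakes:
        return "yes"
    if shakes >= min_swings and shakes > nods:
        return "no"
    return "none"
```

The cultural caveat still applies: which axis means "yes" would have to be a setting, not a constant.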

Game controllers

Modern game consoles sport advanced input devices with a variety of concepts that are well suited to interacting with a 3D environment. They are relatively inexpensive and easy to integrate. They are very specific to games, however, and not equally well suited to the different tasks in a business application. Also, they are often bulky, and it may not be desirable to carry them at all times.

Specific UIs on general-purpose devices

But most of us are carrying a highly integrated package of several useful sensors with us at all times. I am talking about smartphones, of course! A smartphone can give us basic hand tracking via its accelerometers, may be able to localise itself independently via its camera, and, best of all, it has a highly sensitive touch display, which may be used as a slider, to track a gaze-independent spatial cursor, or to display pop-up dialogue buttons. It can even vibrate to let you know it is offering input choices.
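As a rough illustration of the gaze-independent cursor, the following Python sketch maps a touch drag on the phone to a cursor on a virtual plane in front of the user. Everything here is an assumption made for the example: the `Cursor` type, the gain of one millimetre per pixel, and the plane size are hypothetical values, and how the drag deltas reach the headset is left open.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cursor:
    """Cursor position on a virtual plane, in metres from its centre."""
    x_m: float
    y_m: float


def move_cursor(
    cursor: Cursor,
    dx_px: float,
    dy_px: float,
    metres_per_px: float = 0.001,   # hypothetical gain
    half_width_m: float = 0.5,      # hypothetical plane size
    half_height_m: float = 0.3,
) -> Cursor:
    """Apply a touch drag (in screen pixels) and keep the cursor on the plane."""

    def clamp(value: float, limit: float) -> float:
        return max(-limit, min(limit, value))

    # Screen y grows downwards, plane y grows upwards, hence the sign flip.
    return Cursor(
        x_m=clamp(cursor.x_m + dx_px * metres_per_px, half_width_m),
        y_m=clamp(cursor.y_m - dy_px * metres_per_px, half_height_m),
    )
```

Because the cursor moves relative to its last position, the user never has to look at the phone, and the head is free to look elsewhere.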

Spatial UIs

Then again, 2D is dead. Therefore, most of our time-honoured (or time-worn?) 2D widgets might be decrepit as well. We probably should think about new ways to interact with a scene that do not involve having to look at a few microradians of dihedral angle to make our will known.
Whenever a decision needs to be made, I could take a decisive action representing my choice, walking to a specific place, picking up a specific tool, or just looking at a specific object.
Where screen real estate is the limit on desktop and mobile, we have all the world as a stage in 3D!

The app "HoloRepair" in an early stage

Eliminate binary Workflows

Guessing the user’s choices for them based on position and gaze direction is prone to errors. Therefore, recovery from incorrect guesses must be graceful and unobtrusive for the user. Walking away or looking away must undo everything that has been triggered before.
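One way to picture this is a stack of undo steps that is unwound when the user leaves. The Python sketch below assumes the app can pair every guessed action with a function that reverts it; the class and method names are hypothetical and only meant to illustrate the idea.

```python
from typing import Callable, List


class TentativeActions:
    """Actions triggered by a guess, kept revertible until confirmed."""

    def __init__(self) -> None:
        self._undo_stack: List[Callable[[], None]] = []

    def trigger(self, do: Callable[[], None], undo: Callable[[], None]) -> None:
        """Run a guessed action and remember how to revert it."""
        do()
        self._undo_stack.append(undo)

    def confirm(self) -> None:
        """The guess was right: keep the effects and forget the undo steps."""
        self._undo_stack.clear()

    def user_left(self) -> None:
        """The user walked or looked away: revert in reverse order."""
        while self._undo_stack:
            self._undo_stack.pop()()
```

An app would call `trigger` whenever position or gaze suggests a choice, `confirm` once the user clearly acts on it, and `user_left` when they turn away, so a wrong guess leaves no trace.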

We have already expanded on some of these concepts at our AR-Camp and my colleagues Michael and Ines will blog about this soon. Please share your thoughts in the comments or just let me know on Twitter!

Comments (4)



19 July 2017 at 14:44

Hey Jörg, great article, congratulations. Looking forward to more…


Stefan Roth

9 August 2017 at 14:20

Hi Jörg,

Great article – as always!
Just an addition regarding game controllers: some colleagues built a quite impressive VR app in which they used the two Vive controllers, one for each hand, both equipped with several buttons. It sounds promising at first glance, but experience with the first test users turned them all away from that approach: users simply did not understand which button or even which controller was meant for which purpose, and as a consequence messed things up. Therefore they moved away from a multi-selection controller towards a single-click controller supported by an additional virtual tool belt fixed to the user’s hip.
So sticking to single-gesture apps might well be beneficial. I’m not saying in all cases; as usual, it depends a lot on the test person. I personally would never have had the idea to nod my head as a “yes”, but as I said, that differs from person to person.


Jörg Neulist

10 August 2017 at 08:56

Hello Stefan,
thank you for your contribution!
I am trying to explore different options here. Of course, some will prove viable, others will not. But we clearly need to open our minds here. Our current UX approaches target completely different devices, and we need to come up with a sensible way to interact with 3D space.
We also need to be prepared for a learning curve. A pre-smartphone person faced with a current Android device will certainly be overwhelmed. The same goes for users faced with new UX patterns. That does not necessarily mean that the patterns are bad.
VI rocks, but no one gets it without learning it first…
Best regards,

Brewster Barclay

21 August 2018 at 16:00

Hi Jorg,
Have you ever looked at the ring controller from Spatialize? https://spatialize.xyz
Looked very interesting when I saw a demo.

