One of the biggest advantages of Augmented Reality is that most users can use it quite intuitively.
Now that we at Zühlke have a Microsoft HoloLens in our hands (or rather, on our heads), we can finally see what the HoloLens brings to the table, quite literally. From all the tech news we had seen and read before, we expected this to be basically another type of AR goggles. We were wrong. This is an entirely new breed of animal. It’s no wonder that Microsoft uses the term “Mixed Reality” here instead of “Augmented Reality”: the HoloLens doesn’t just augment elements of the real world – it creates its own holographic world that interacts with the real world. Only after trying it out yourself can you grasp the concept behind the HoloLens. Stefan Roth already talked about this in his first HoloLens blog post (in German).
One of the main features of HoloLens is its ability to get a three-dimensional perception of your surroundings and track the movement of your head in space. Using the information about its surroundings and its own position, the HoloLens can then overlay a portion of your optical field of view with holograms that can interact with real-world objects. For instance, a hologram can be placed on the wall or sit on your table’s surface, just like a real-world object. Wherever you place a hologram, it stays – the HoloLens memorizes the room’s layout, and it can also recognize a room it’s been used in before, even if it was turned off in the meantime.
To interact with holograms, HoloLens uses three main input methods:
- Gaze: You move your head to use the direction HoloLens is facing as a pointer. This way you can either mark holograms for interaction or move holograms around that are tied to your direction of view.
- Gestures: You can perform hand gestures inside the field of view to interact with holograms. Aside from a “Bloom” gesture that the system uses for menu navigation, the main gestures are Air Tap and Hold & Drag.
The Air Tap gesture executes a distinct action, usually one that’s related to the hologram you’re currently looking at. It doesn’t matter where you perform the gesture, as long as it’s inside the field of view.
The Hold & Drag gesture can be used to manipulate holograms with your hand. For example, you can pick up a hologram and move it around, or you can grab an edge like a handle to rotate the hologram. When using this gesture, HoloLens tracks the position of your hand in space.
- Voice commands: You can just speak certain key phrases to trigger actions in an app. For instance, an app can remove the holograms placed in the scene when the user says “reset scene”.
The combination of these interaction methods and the knowledge about your surroundings allows HoloLens to create a truly immersive user experience with holograms.
Developing for HoloLens
For an early development version of a device as potent and versatile as this, the documentation and tools provided for development are very well-made. Despite being completely new to HoloLens development myself, I’ve been able to build a small demo app that showcases the device’s most important features in just two days – watch the video:
There is a good set of documentation available in the form of Microsoft’s Holographic Academy, which is basically a set of tutorials to get you started with the most important features of HoloLens. Your main tools for development are Unity3D and Visual Studio.
We have already experienced how quickly you can implement augmented reality apps at our education camp. Crafting your holographic apps with Unity is a breeze as well. You just set up your scene and scripts in Unity3D as usual: The main camera represents the HoloLens, and all objects in the scene appear as holograms when viewed in HoloLens. You just have to make sure to use bright colors for your objects, as dark/black colors will appear transparent.
The HoloLens APIs available in your C# Unity scripts let you implement your scene interactions in no time. The gaze origin and direction can be taken from the main camera and used for ray-casting operations with the Unity API. Voice command key phrases are simply defined as plain old strings and handed to a KeywordRecognizer, which triggers its OnPhraseRecognized event whenever it hears a key phrase. The well-known HoloLens gestures can be used with a GestureRecognizer: just set up the recognizer and bind your actions to the events you’re interested in. The Air Tap gesture is mapped to a Tap event, whereas Hold & Drag is mapped to Manipulation and Navigation events. All gesture events provide you with the gaze ray at the time of recognition, and the Manipulation/Navigation events also provide the position of the user’s hand in space.
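To make this concrete, here’s a minimal sketch of a single Unity script wiring up all three input methods. The class, field and handler names are my own, and the gaze/gesture wiring is only illustrative; the API types come from the Unity 5.4-era UnityEngine.Windows.Speech and UnityEngine.VR.WSA.Input namespaces, so treat this as a sketch rather than a drop-in implementation:

```csharp
using UnityEngine;
using UnityEngine.Windows.Speech;   // KeywordRecognizer
using UnityEngine.VR.WSA.Input;     // GestureRecognizer (namespace of this HoloLens era)

// Hypothetical behaviour that wires up gaze, gestures and voice.
public class InteractionSetup : MonoBehaviour
{
    KeywordRecognizer keywordRecognizer;
    GestureRecognizer gestureRecognizer;
    GameObject gazedObject;  // hologram currently hit by the gaze ray

    void Start()
    {
        // Voice: key phrases are plain old strings.
        keywordRecognizer = new KeywordRecognizer(new[] { "reset scene" });
        keywordRecognizer.OnPhraseRecognized += args =>
        {
            if (args.text == "reset scene") { /* remove placed holograms */ }
        };
        keywordRecognizer.Start();

        // Gestures: register only for the events you're interested in.
        gestureRecognizer = new GestureRecognizer();
        gestureRecognizer.SetRecognizableGestures(
            GestureSettings.Tap | GestureSettings.ManipulationTranslate);
        gestureRecognizer.TappedEvent += (source, tapCount, headRay) =>
        {
            // Air Tap: act on whatever the gaze is currently pointing at.
        };
        gestureRecognizer.ManipulationUpdatedEvent += (source, cumulativeDelta, headRay) =>
        {
            // Hold & Drag: cumulativeDelta is the hand's offset in space
            // since the gesture started.
        };
        gestureRecognizer.StartCapturingGestures();
    }

    void Update()
    {
        // Gaze: on HoloLens, the main camera's pose is the user's head pose.
        RaycastHit hitInfo;
        gazedObject = Physics.Raycast(Camera.main.transform.position,
                                      Camera.main.transform.forward, out hitInfo)
            ? hitInfo.collider.gameObject
            : null;
    }
}
```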
It’s theoretically possible to implement custom gestures as well, for example a “swipe” gesture. However, there is currently no gesture-recording tool of any kind, so gestures have to be implemented manually. To make this work properly across the differences from human to human, machine learning would be required. Microsoft therefore discourages custom gestures and recommends sticking with the well-known gestures that all apps use, as users are already familiar with them. But maybe we’ll see some new built-in gestures for HoloLens in the future.
The Intricacies of Spatial Recognition
Using the depth-field information perceived by HoloLens takes a lot more effort than using the input mechanisms. You have to set up a SurfaceObserver, make it poll periodically for spatial-mapping information, and process each surface update into a Unity GameObject with a mesh renderer and collider as needed. This requires a lot of boilerplate code. Fortunately for us, Microsoft provides a collection of useful assets in the form of the HoloToolkit for Unity, which takes most of the work off our hands. Just add the SpatialMapping prefab from the Toolkit to your scene, set its material and configure it via the SpatialMapping class defined in the accompanying script file. This already provides you with a surface mesh that is updated periodically and can be used with Unity’s physics engine, for example.
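To give a feel for the boilerplate the Toolkit saves you, here’s a rough sketch of the manual SurfaceObserver workflow. The polling interval, observation radius and triangle density are arbitrary example values, and the bookkeeping is heavily abbreviated; the types come from the Unity 5.4-era UnityEngine.VR.WSA namespace:

```csharp
using UnityEngine;
using UnityEngine.VR.WSA;  // SurfaceObserver, SurfaceData, WorldAnchor

public class ManualSpatialMapping : MonoBehaviour
{
    SurfaceObserver observer;
    float lastUpdate;

    void Start()
    {
        observer = new SurfaceObserver();
        // Observe a sphere of space around the user (radius is app-specific).
        observer.SetVolumeAsSphere(Vector3.zero, 3.0f);
    }

    void Update()
    {
        // Poll for surface changes periodically, e.g. every 3 seconds.
        if (Time.time - lastUpdate > 3.0f)
        {
            observer.Update(OnSurfaceChanged);
            lastUpdate = Time.time;
        }
    }

    void OnSurfaceChanged(SurfaceId id, SurfaceChange change,
                          Bounds bounds, System.DateTime updateTime)
    {
        if (change == SurfaceChange.Added || change == SurfaceChange.Updated)
        {
            // For each changed surface, a GameObject with mesh components
            // has to be created (or reused) and filled asynchronously.
            var surface = new GameObject("Surface-" + id.handle);
            var data = new SurfaceData(id,
                surface.AddComponent<MeshFilter>(),
                surface.AddComponent<WorldAnchor>(),
                surface.AddComponent<MeshCollider>(),
                1000,   // triangles per cubic meter
                true);  // bake a collider so the physics engine can use it
            observer.RequestMeshAsync(data, OnMeshReady);
        }
        // Removed surfaces would need their GameObjects destroyed,
        // existing surfaces cached and reused, and so on.
    }

    void OnMeshReady(SurfaceData data, bool success, float elapsedSeconds) { }
}
```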
Unfortunately, though, the surface data is pretty coarse-grained. The precision on walls or floors is sufficient for most applications, but at object edges, such as the edge of a table, the default surface mesh is somewhat bumpy. Finer details, such as door knobs, are mostly ignored. This can be a problem in some cases, for example when you want real-world objects to properly occlude your holograms. The mesh tessellation can be tuned to some extent to use more or fewer triangles per cubic meter, but even with high triangle counts, polygons are still about the size of a smartphone on average. Microsoft therefore recommends performing custom optimization on the spatial perception data and the spatial surface mesh if you want straight object edges. Of course, this requires some mathematical proficiency.
Running Your App
With a few tweaks to the project and build settings, the app can be brought to the HoloLens in a two-step process: Unity builds your app into a .NET solution to open in Visual Studio, and from Visual Studio you deploy the app to HoloLens. As HoloLens apps are basically UWP apps, your machine has to run Windows 10 to deploy to HoloLens. For deployment, you have several options. The first one is deployment via USB, which worked best for me (except in some cases where an error kept showing up until I restarted Visual Studio). This option also allows for debugging, at least in theory – I haven’t yet figured out a way to step through my Unity scripts. Another option is deployment over WiFi. Unfortunately, this didn’t work on my machine, as deployment was always aborted in the middle of the process for some reason I couldn’t figure out. Given the HoloLens’s short battery life of 2–3 hours, however, I’d recommend deploying over USB anyway to charge the battery in the process.
The third option, packaged deployment, is the one that’s needed to make your app appear in the app browser, so you can start it up at any time without a connection to Visual Studio. After setting up your project for packaging and exporting the package, you can access the App Manager on HoloLens via your browser to upload the app package (appxbundle).
If you don’t have a HoloLens available for development, you can also run your apps on the Hyper-V-based HoloLens Emulator. It supports most of the required features: Walking and looking around, performing Air Tap, Drag and Bloom gestures. Voice input is taken from your machine’s microphone. You can even load a virtual scene as a background for using spatial recognition features.
It’s easy to get overwhelmed by the euphoria of trying out all the abilities of the HoloLens. In our company, many colleagues who tried it out immediately came up with ideas for cool showcases to implement with the device. However, when thinking about potential applications, you always have to keep the limitations of the HoloLens in mind.
For head tracking, the HoloLens uses four “environment understanding” cameras together with an inertial measurement unit, which in combination work quickly and very reliably. Depth perception, however, is provided by an IR-based time-of-flight depth sensor like the one used in the newest version of Kinect. Naturally, this sensor has the same limitations as Kinect: for example, it cannot “see” glass surfaces or black surfaces that absorb all light. Surfaces in direct sunlight are also problematic, mainly because you can hardly see the holograms in front of a bright background. That’s why the HoloLens should only be used indoors.
The small field of view (or rather, field of rendering) also has some implications. When developing apps, you have to take care that holograms are small enough and/or far enough away that they’re not cut off at the edges. The distance is also limited in both directions: Microsoft recommends rendering holograms at least 85 centimeters away from the user, because the optical focus is fixed at about 2 meters. With objects closer than 85 cm, the user’s eyes will have trouble refocusing between holograms and the real world. The maximum distance for rendering holograms is theoretically unlimited, but the volume in which the HoloLens collects spatial perception data isn’t: the depth perception volume around the user has a radius of about 3 to 10 meters, depending on the settings of the app.
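One common way to enforce the minimum distance in Unity is to set the camera’s near clipping plane, so that anything closer than the recommended 85 cm simply isn’t rendered. A one-line sketch (the class name is my own):

```csharp
using UnityEngine;

public class HoloCameraSetup : MonoBehaviour
{
    void Start()
    {
        // Clip away geometry closer than the recommended 85 cm, where the
        // fixed ~2 m optical focus would otherwise strain the user's eyes.
        Camera.main.nearClipPlane = 0.85f;
    }
}
```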
There seems to be no easy way yet to combine the HoloLens with “classic” (marker-based) Augmented Reality. It would be great, for example, if the HoloLens could recognize the surfaces of certain objects to automatically place holograms on them. This would be especially helpful for working around the rather coarse-grained spatial recognition. Vuforia is already addressing this issue: a Vuforia SDK with HoloLens support is in the making and has already been demonstrated publicly, though it isn’t yet available to developers.
On the development front, it would be great to see some more API documentation. The tutorials are great to get you started, but the workings of the API can only be gleaned from code examples.
The Game Has Changed
Considering that this is still the developer version of the HoloLens we’re using here, it’s amazing how potent this device is, how reliably it works, and how easy it is to create holographic applications for it. Sure, the device and its tools still leave room for improvement: a wider field of view would be great, as would more seamless tooling for deploying and debugging. But the HoloLens still proves to be a game changer. The possibilities it opens up are huge. A lot of our customers have already shown interest in the device or have even come up with ideas for holographic apps of their own.
We can’t wait to see what great immersive experiences we’ll be creating in the future, and what improvements of the HoloLens itself are yet to come!