
Quick and useful: How we created an AR app in only three days

By Robin Wiegand & Jonas Wisplinghoff

One of the biggest advantages of Augmented Reality is that most users can use it quite intuitively. That’s why many existing and planned applications of this technology are in the field of learning and knowledge transfer.

However, when it comes to creating apps for Augmented Reality, it seems to be the other way around: even experienced programmers consider this a difficult and complicated task. Yet Apple and Google are making things much easier with ARKit and ARCore. So we challenged ourselves: can we build a genuinely useful AR app in only three days?

Insight in brief

  • As a use case, we came up with augmenting the doorplates attached to the doors of the conference rooms in the hotel.
  • All in all, the most difficult part of our project was to automatically read the texts.
  • ARKit has low entry barriers, especially when it comes to recognizing and augmenting objects.

Our challenge took place at our annual Zühlke Camp from May 2 to 4, 2018. Staff from all our German locations gathered in Leimen, near Heidelberg, for three days of learning, socializing and partying. The AR track hosted several projects, ranging from investigating shared experiences with the HoloLens and Windows immersive headsets to playing around with the Holo-Stylus and the Myo wristband.


1. Recognize the doorplate

Our choice, however, was to dive into mobile AR, mainly ARKit (iOS) and ARCore (Android). We decided to give ARKit a try first.

[Image: doorplate reading “Raum Elba”]

As a use case, we came up with augmenting the doorplates attached to the doors of the conference rooms in the hotel. Different interest groups gathered in each conference room to work on specific topics. Our app extends the doorplate so that the user can see which interest groups are meeting behind the corresponding door. To recognize the doorplates, we use the new image recognition APIs in ARKit 1.5.

To make ARKit recognize images, you just have to include them in your app’s asset catalog and add them to the ARWorldTrackingConfiguration of your ARSession.

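A minimal sketch of that setup might look as follows. The reference image group name “AR Resources” and the `sceneView` outlet are assumptions; adjust both to your project.

```swift
import ARKit

// Assumption: an ARSCNView outlet named sceneView and a reference
// image group called "AR Resources" in the asset catalog.
let configuration = ARWorldTrackingConfiguration()

// Load the reference images bundled with the app and register them
// for detection; referenceImages(inGroupNamed:bundle:) returns nil
// if the group does not exist.
if let referenceImages = ARReferenceImage.referenceImages(
        inGroupNamed: "AR Resources", bundle: nil) {
    configuration.detectionImages = referenceImages
}

// Start (or restart) the AR session with image detection enabled.
sceneView.session.run(configuration)
```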

When ARKit recognizes one of the images in the camera stream, you get a callback to the renderer(_:didAdd:for:) delegate method. The SCNNode delivers the real-world coordinates of the recognized image. These coordinates can be used to add 3D objects to the AR view using the SceneKit framework.

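The delegate callback can be sketched like this; the semi-transparent highlight plane is just an illustrative example of anchoring content to the detected image.

```swift
import ARKit
import SceneKit

// Sketch of the ARSCNViewDelegate callback: when ARKit detects one of
// the reference images, it adds an ARImageAnchor to the session.
func renderer(_ renderer: SCNSceneRenderer,
              didAdd node: SCNNode, for anchor: ARAnchor) {
    guard let imageAnchor = anchor as? ARImageAnchor else { return }

    // The reference image carries its physical size in meters, so the
    // plane can match the real doorplate exactly.
    let size = imageAnchor.referenceImage.physicalSize
    let plane = SCNPlane(width: size.width, height: size.height)
    plane.firstMaterial?.diffuse.contents =
        UIColor.blue.withAlphaComponent(0.3)

    let planeNode = SCNNode(geometry: plane)
    // SCNPlane is vertical by default; rotate it to lie flat on the
    // detected image.
    planeNode.eulerAngles.x = -.pi / 2
    node.addChildNode(planeNode)
}
```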

2. Read doorplate via OCR

In order to find out which room was scanned, we planned to use OCR (optical character recognition) on the camera image. First, we planned to use the iOS frameworks Vision and Core ML for this. The Vision framework supports text detection, but not recognition. So we added a pretrained Core ML model for text recognition and fed the cropped images of the letters into Core ML for classification. Unfortunately, the model did not work well enough to give reliable results. If you are further interested in solving OCR tasks with Core ML, Martin Mitrevski’s article is a very good starting point.

We used Microsoft’s Computer Vision API as a fallback. You can upload any image to the cloud-based API, and the service will analyze the visual content in different ways based on user choices (e.g. OCR, detecting human faces, flagging adult content). After processing the image, the developer gets the response as a JSON object. To get your images analyzed, you can register for a free trial.

So, whenever ARKit recognizes the image target, we send a frame of the video to the Cognitive Services API and get back all strings found in the image.

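A hedged sketch of that request is below. The endpoint region and the API key are placeholders, and the parsing follows the documented OCR response layout (regions → lines → words); check the current API reference before relying on it.

```swift
import Foundation

// Sketch: send a JPEG frame to Microsoft's Computer Vision OCR
// endpoint and flatten the JSON response into a list of words.
// "westeurope" and the key header value are placeholders.
func recognizeText(in imageData: Data,
                   completion: @escaping ([String]) -> Void) {
    var request = URLRequest(url: URL(string:
        "https://westeurope.api.cognitive.microsoft.com/vision/v2.0/ocr")!)
    request.httpMethod = "POST"
    request.setValue("application/octet-stream",
                     forHTTPHeaderField: "Content-Type")
    request.setValue("<your-api-key>",
                     forHTTPHeaderField: "Ocp-Apim-Subscription-Key")
    request.httpBody = imageData

    URLSession.shared.dataTask(with: request) { data, _, _ in
        guard let data = data,
              let json = (try? JSONSerialization.jsonObject(with: data))
                  as? [String: Any],
              let regions = json["regions"] as? [[String: Any]] else {
            completion([])
            return
        }
        // Flatten regions → lines → words into plain strings.
        let words = regions
            .compactMap { $0["lines"] as? [[String: Any]] }.joined()
            .compactMap { $0["words"] as? [[String: Any]] }.joined()
            .compactMap { $0["text"] as? String }
        completion(words)
    }.resume()
}
```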

3. Augment the scene with additional information

Now that we have read the room’s name, we can show additional information to the user. We implemented a simple mapping between room name and interest-group shortcut. We want to show this information right below the Zühlke logo as 3D text.

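This step could be sketched as follows; the dictionary contents, function name and positioning offsets are illustrative, not the project’s actual values.

```swift
import SceneKit
import UIKit

// Illustrative mapping from room name to interest-group shortcut.
let roomToGroup = ["Elba": "AR Track"]

// Build a 3D text node for a recognized room name, intended to be
// attached as a child of the image anchor's node.
func textNode(forRoom room: String) -> SCNNode? {
    guard let group = roomToGroup[room] else { return nil }

    let text = SCNText(string: group, extrusionDepth: 1)
    text.font = UIFont.systemFont(ofSize: 10)

    let node = SCNNode(geometry: text)
    // SCNText geometry is large in scene units; scale it down to a
    // few centimeters and place it just below the doorplate.
    node.scale = SCNVector3(0.002, 0.002, 0.002)
    node.position = SCNVector3(0, -0.05, 0)
    return node
}
```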

4. Conclusion

All in all, the most difficult part of our project was automatically reading the texts. Apple’s Vision framework can detect whether an image contains text; actually recognizing that text, however, is not yet as easy. Until that changes, cloud services are a good alternative for this task. This also shows how closely the further spread of Augmented Reality applications is linked to advances in technologies such as data analytics and artificial intelligence.

ARKit has low entry barriers, especially when it comes to recognizing and augmenting objects. As a developer, you can make rapid progress with the framework, and it is pretty easy to use once you get familiar with the coordinate system. The web is full of useful tutorials, which makes it even easier. That is why we managed to finish this AR app within only three days – and had lots of fun doing it.

The project is open-sourced on GitHub.


Robin Wiegand

Expert Software Engineer
Contact person for Germany robin.wiegand@zuehlke.com +49 6196 777 54 356

Robin Wiegand is an Expert Software Engineer in the field of mobile applications. Since 2014 he has been involved in the development of apps for iOS and Android devices. Since mobile apps rarely operate independently, his experience in cloud technologies benefits the entire development process. Due to his many years of experience with Xamarin projects (Xamarin.Native & Xamarin.Forms), Robin’s current focus is in this area. Robin Wiegand holds a Master of Science degree in media computer science.


Jonas Wisplinghoff

Expert Software Engineer
Contact person for Germany jonas.wisplinghoff@zuehlke.com +49 6196 777 54 333

Jonas Wisplinghoff is a Software Engineer with a focus on mobile application development. This includes native development for iOS and Android as well as cross-platform development with different frameworks and toolchains. Across multiple projects he has gained experience in different areas of the mobile app development cycle, for example in quality assurance and in connectivity to other devices via Bluetooth LE. Jonas Wisplinghoff holds a master’s degree in media and computer science.