Rebuilding the HoloLens scanning effect with RoomAlive Toolkit

The initial video that introduced the HoloLens to the world contains a short clip that visualizes how the device sees its environment: a pattern of large and smaller triangles gradually overlays the real-world objects in the video. I decided to recreate this effect in real life with a projection mapping setup consisting of a projector and a Kinect V2 sensor.

HoloLens room scan

Prototyping in Shadertoy

First I experimented with the idea by prototyping a pixel shader in Shadertoy, an online tool that lets developers prototype, test and share pixel shaders using WebGL. I started with a raymarching example by Iñigo Quilez and set up a small scene with a floor, a wall and a bench. The 3D world coordinates calculated by the raymarcher could then be used to overlay the triangle effect; the raymarched geometry would later be replaced by geometry scanned with the Kinect V2. The screenshot below shows what the effect looks like. The source code of this shader can be found on the Shadertoy website.

Shadertoy Room Scanning Shader
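The prototype itself is a GLSL pixel shader on Shadertoy; as a rough illustration of the raymarching step that recovers the 3D world coordinates, here is a minimal sphere-tracing sketch in Python. The scene SDF and all constants are placeholders for illustration, not the actual shader code:

```python
import math

def scene_sdf(p):
    """Signed distance to a minimal scene: a floor plane at y=0 and a
    wall plane at z=5 (illustrative stand-ins for the floor/wall/bench
    of the Shadertoy prototype)."""
    x, y, z = p
    floor_d = y          # distance to the plane y = 0
    wall_d = 5.0 - z     # distance to the plane z = 5
    return min(floor_d, wall_d)

def raymarch(origin, direction, max_steps=128, eps=1e-4, max_dist=50.0):
    """Sphere tracing: step along the ray by the SDF value until a
    surface is hit; returns the 3D world-space hit point (or None).
    This world coordinate is what the scan effect is overlaid on."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        d = scene_sdf(p)
        if d < eps:
            return p
        t += d
        if t > max_dist:
            break
    return None

# A ray from eye height looking diagonally downwards hits the floor.
hit = raymarch((0.0, 1.5, 0.0), (0.0, -math.sqrt(0.5), math.sqrt(0.5)))
```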

Projection mapping with RoomAlive Toolkit

During Build 2015, Microsoft open-sourced the RoomAlive Toolkit, a library that contains the mathematical building blocks for creating RoomAlive-like experiences. It includes tools to automatically calibrate multiple Kinects and projectors into a single shared coordinate system, so that each projector can project onto the correct location in a room, even onto dynamic geometry. The toolkit also ships with an example that reprojects the recorded image through a pixel shader effect. I used this example to apply the previously prototyped scan-effect shader to live-scanned 3D geometry.

Source code on GitHub

Bring Your Own Beamer

The installation was shown at the Bring Your Own Beamer event held on September 25th 2015 in Utrecht, The Netherlands. For this event I made some small artistic adjustments. In the original video the scanning of the world appears to start from the position of the person wearing the HoloLens; in the installation, visitors could trigger the scanning effect with their feet. The effect starts at the triggered location and expands across the floor, up their legs and over any other geometry in the room.


The distance from the camera determines the base color used for a particular scan. Multiple scans interfere with each other, generating a colorful experience. The video shows how part of the floor and part of the wall are mapped with a single vertically mounted projector. People particularly liked playing with the latency of the projection onto their bodies by moving quickly.
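The trigger-and-expand behavior described above can be sketched as a small function: the effect grows as a sphere from the trigger point, and the distance to the camera picks the base hue. This is a hypothetical reconstruction, not the installation's actual shader; the names and constants (`speed`, the `0.15` hue scale) are made up for illustration:

```python
import math
import colorsys

def scan_color(world_pos, trigger_pos, cam_pos, t, speed=1.0):
    """Return an RGB color for a world point once the scan wavefront
    (expanding from trigger_pos at `speed` m/s) has reached it, or
    None while it has not. The camera distance drives the hue, so
    multiple overlapping scans get different base colors."""
    if math.dist(world_pos, trigger_pos) > t * speed:
        return None                                  # not yet scanned
    hue = (math.dist(world_pos, cam_pos) * 0.15) % 1.0
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)

# A point 2 m from the trigger lights up after 2 s at 1 m/s:
c = scan_color((2.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.5, -1.0), 2.0)
```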


Real life Portal; a holographic window using Kinect

The game Portal (released in 2007 by Valve) is known for its gameplay, in which portals can be used to teleport between different locations. Portals were rendered as virtual windows into the connected location, surrounded by the well-known orange and blue rings. The game spawned a huge number of memes, fan art and YouTube videos that used elements from the game.

Portal by Valve

Real life portals without trickery

The Kinect V2 is a sensor that records a 3D view of the world in real time. It can also track users and estimate their body pose. This makes it possible to perform head tracking and reconstruct a camera view into a 3D world as if looking through a virtual window. By using one Kinect for head tracking and another for reconstructing the 3D view, the virtual window effect of a portal can be created in reality. Using both Kinects for both 3D reconstruction and head tracking yields a two-way portal effect.
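Rendering a view through a head-tracked virtual window typically comes down to an off-axis (asymmetric) perspective projection driven by the tracked head position. Below is a sketch of how the frustum extents could be derived from the eye position and the window rectangle; this is not the code used in the video, and all names and values are illustrative:

```python
def offaxis_frustum(eye, win_lo, win_hi, near):
    """Asymmetric ('off-axis') frustum for a head-tracked virtual
    window. The window is an axis-aligned rectangle in the z=0 plane
    spanning (win_lo[0], win_lo[1]) to (win_hi[0], win_hi[1]); the eye
    sits at z > 0 in front of it. Returns (left, right, bottom, top)
    at the near clipping plane, as consumed by a glFrustum-style
    projection matrix."""
    ex, ey, ez = eye
    scale = near / ez            # project the window edges to the near plane
    left = (win_lo[0] - ex) * scale
    right = (win_hi[0] - ex) * scale
    bottom = (win_lo[1] - ey) * scale
    top = (win_hi[1] - ey) * scale
    return left, right, bottom, top

# A centered eye gives a symmetric frustum; moving the head to the
# right skews the frustum, shifting what is visible through the window.
sym = offaxis_frustum((0.0, 0.0, 1.0), (-0.5, -0.5), (0.5, 0.5), 0.1)
shifted = offaxis_frustum((0.2, 0.0, 1.0), (-0.5, -0.5), (0.5, 0.5), 0.1)
```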

Hardware setup

In the setup used for recording the video, two Kinect V2 sensors were used. My laptop was connected to a projector that projected onto the wall, while the PC displayed the other, much smaller portal. Smaller portals hide more of the other side of the scene and allow for larger head movements; with a bigger portal you run into the limitations of the Kinect's field of view much earlier.

Instead of transferring the recorded 3D scene, I swapped the Kinects and only transferred the recorded body frames over a network connection to minimize latency. This limits the maximum distance at which the portals can be placed from each other (about 7 meters when using USB 3 extension cables).

A portal opens as soon as a user is detected by the Kinect, so proper head tracking can be done. For the video I used the right-hand joint to control the camera, so the viewer can see what the effect looks like when head tracking is applied.

Holographic window

Holography is quickly becoming the buzzword of 2015, and it's getting harder to keep a clear understanding of what holographic actually means (and yes, I've abused the term too). I like Oliver Kreylos' view on the term holography (see: What is holographic, and what isn't?).

Since both worlds are already rendered in 3D it is a small step to add stereo rendering. For instance with a Microsoft HoloLens. This brings us closer to a holographic window.
Here’s the checklist that Oliver Kreylos uses in his article:

  1. Perspective foreshortening: farther away objects appear smaller
    Check, due to perspective projection used in rendering
  2. Occlusion: nearer objects hide farther objects
    Check, due to the 3D reconstructed world, though objects can be occluded from the sensor's viewpoint
  3. Binocular parallax / stereopsis: left and right eyes see different views of the same objects
    Check, when using a stereo display
  4. Monocular (motion) parallax: objects shift depending on how far away they are when head is moved
    Check, due to head tracking
  5. Convergence: eyes cross when focusing on close objects
    Check, when using a stereo display
  6. Accommodation: eyes’ lenses change focus depending on objects’ distances
    No check, since a stereo display presents everything at a single focal distance

Natural user interface

Looking through a window is a familiar experience for almost everyone. Imagine being in a Skype conversation and having the ability to move your head to see who your caller is looking at or talking to (when it’s not you). A holographic window has the power to give people the feeling of being in the same space and allows for interesting new interactions. Anyone care for a game of portal tennis?


Kinect V1 and Kinect V2 fields of view compared


With the impending release of the new Kinect for Windows this summer, I took a closer look at the differences in field of view between the old and the new Kinect for Windows.
A well-known improvement of the new Kinect for Windows sensor is the higher resolution of the image and depth streams. Some of the extra pixels cover the extra viewing area resulting from the increased horizontal and vertical fields of view of both the color and depth cameras; the rest contribute to a higher precision of what the cameras can see.

This article is based on preliminary software and/or hardware and APIs are preliminary and subject to change.

Color image

The old Kinect has a color image resolution of 640 x 480 pixels with a field of view of 62 x 48.6 degrees, resulting in an average of about 10 x 10 pixels per degree (see source 1).

The new Kinect has a color image resolution of 1920 x 1080 pixels and a field of view of 84.1 x 53.8 degrees, resulting in an average of about 22 x 20 pixels per degree (see source 2).

This improves the color image detail by a factor of two in both the horizontal and vertical directions. That is a welcome improvement for scenarios that use the color image for taking pictures or videos, background removal (green screening), face recognition and more.

Depth image

The old Kinect has a depth image resolution of 320 x 240 pixels with a field of view of 58.5 x 46.6 degrees, resulting in an average of about 5 x 5 pixels per degree (see source 1).

The new Kinect has a depth image resolution of 512 x 424 pixels with a field of view of 70.6 x 60 degrees, resulting in an average of about 7 x 7 pixels per degree (see source 2).

This does not seem like a large improvement, but the depth images of the old and new Kinect cannot be compared that easily. Because the new Kinect uses time-of-flight as the core mechanism for depth retrieval, each pixel in its 512 x 424 depth image contains a directly measured depth value (z-coordinate) with much higher precision than the depth image of the Kinect V1. The depth image of the old Kinect is based on the structured light technique, which results in an interpolated depth image based on far fewer samples than the depth image resolution suggests.
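The pixels-per-degree figures quoted above follow directly from dividing resolution by field of view; a quick check in Python using the numbers from sources 1 and 2:

```python
def pixels_per_degree(width_px, height_px, hfov_deg, vfov_deg):
    """Average angular resolution of a camera, in pixels per degree,
    for the horizontal and vertical directions."""
    return width_px / hfov_deg, height_px / vfov_deg

# Values from the article (sources 1 and 2):
v1_color = pixels_per_degree(640, 480, 62.0, 48.6)    # ~ (10.3, 9.9)
v2_color = pixels_per_degree(1920, 1080, 84.1, 53.8)  # ~ (22.8, 20.1)
v1_depth = pixels_per_degree(320, 240, 58.5, 46.6)    # ~ (5.5, 5.2)
v2_depth = pixels_per_degree(512, 424, 70.6, 60.0)    # ~ (7.3, 7.1)
```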

Kinect field of view explorer


I built a tool based on WebGL / three.js that lets you explore the differences between the old and new Kinect, their positioning and how this influences what they can see. You can switch between the fields of view of the different Kinect sensors and tweak the height and tilt of the sensor. The tool calculates the intersection of the view frustum with the floor and displays the width of the intersection and its distance from the sensor.
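The floor-intersection part of such a tool boils down to simple trigonometry: the top and bottom rays of the vertical field of view hit the floor at distances determined by the mounting height and tilt. A sketch (the function name and example values are my own, not the tool's code):

```python
import math

def floor_coverage(height_m, tilt_deg, vfov_deg):
    """Horizontal distances at which the bottom and top rays of the
    vertical field of view hit the floor, for a sensor mounted at
    height_m metres and tilted down by tilt_deg degrees. Returns
    (near, far); far is None when the top ray points at or above the
    horizon and never reaches the floor."""
    top = math.radians(tilt_deg - vfov_deg / 2.0)     # angle below horizontal
    bottom = math.radians(tilt_deg + vfov_deg / 2.0)
    near = height_m / math.tan(bottom)
    far = height_m / math.tan(top) if top > 0 else None
    return near, far

# Kinect V2 depth camera (60 deg vertical FOV) at 1 m, tilted down 40 deg:
near, far = floor_coverage(1.0, 40.0, 60.0)
```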

The tool was tested to work with Mozilla Firefox 27 and Google Chrome 33 and Internet Explorer 11.

Open Kinect FOV explorer

Data sources

1 Mentioned values were retrieved from a Kinect V1 sensor with help of the Kinect V1 SDK. The KinectSensor object contains a ColorImageStream and DepthImageStream that both contain a FrameWidth and FrameHeight in pixels and the NominalHorizontalFieldOfView and NominalVerticalFieldOfView in degrees. The DepthImageStream also contains values for the MinDepth and MaxDepth in millimeters.

2 Mentioned values were retrieved from a Kinect V2 sensor with help of the Kinect V2 SDK. The KinectSensor object contains a ColorFrameSource and DepthFrameSource that both contain a FrameDescription containing the Width and Height in pixels and the HorizontalFieldOfView and VerticalFieldOfView in degrees. The DepthFrameSource also reports the DepthMinReliableDistance and DepthMaxReliableDistance in millimeters.



Live Kinect holography experiment


I had some fun together with my children and created a live holographic display.
Kinect holography uses a technique commonly known as Pepper’s Ghost. It was invented more than 150 years ago and is often used in theme parks or museums. A recent trend is to use it for product displays with animated special effects.

Kinect background removal

One of the things that can easily be done with the Kinect SDK is extracting a single person from the live image feed. I modified one of the samples in the Kinect SDK to show the background-removed image fullscreen on a black background.
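The SDK sample itself is written in C#; conceptually, the masking step uses the SDK's body index data, where the value 255 marks background pixels. A hedged Python sketch of that idea (the helper and the data layout are illustrative, not the sample's code):

```python
def remove_background(color_pixels, body_index, background=(0, 0, 0)):
    """Sketch of Kinect-style background removal: the body index frame
    marks, per pixel, which tracked body it belongs to (255 means
    background in the Kinect V2 SDK). Pixels not on a body are replaced
    by black, as in the fullscreen Pepper's Ghost setup. Assumes the
    color pixels are already mapped into body-index (depth) space."""
    return [px if idx != 255 else background
            for px, idx in zip(color_pixels, body_index)]

# Two person pixels surrounded by background:
frame = [(10, 10, 10), (200, 150, 120), (210, 160, 130), (20, 20, 20)]
indices = [255, 0, 0, 255]
masked = remove_background(frame, indices)
```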

The PC sits on top of a Duplo structure with its display pointing down; the image is reflected in a glass plane tilted at 45 degrees.
The kids are wearing light-colored clothes so they reflect better, and I used an extra light placed on the floor in front of them.

The real magic happens if there is an object behind the glass plane that the person in front of the Kinect can sit in or stand on.


Switched to WordPress

I decided to switch to a WordPress-based website to make it easier to manage content and simply to make it more modern. I want to add a few of my old VRML projects as YouTube videos.

The glory days of VRML are over. I had a lot of fun with it, but now it’s time to move on.