Wednesday, June 30, 2010

Fishtank VR project with OpenCV and OpenGL (INCOMPLETE)


This idea is not new... basically I wanted to mimic Johnny Lee's Wiimote head-tracking desktop VR display, using face tracking instead of the Wiimote. I've found previous solutions by Ziyan Zhou and others; there is even a commercial product, FaceAPI.

* Camera access (OpenCV)
* Face detector / feature tracker (OpenCV)
* Trajectory smoothing for the face position (OpenCV)
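The smoothing step can be as simple as an exponential moving average over the detected face position. Here is a minimal sketch of that idea; the `Smoother` struct and `smooth` function are my own names, not part of OpenCV (which also offers a Kalman filter if something more principled is wanted):

```c
#include <assert.h>
#include <math.h>

/* Simple exponential moving average over the (x, y) face position
   returned by the detector. alpha in (0, 1] controls how strongly
   each new measurement pulls the smoothed estimate. */
typedef struct { float x, y; int initialized; } Smoother;

void smooth(Smoother *s, float alpha, float mx, float my)
{
    if (!s->initialized) {          /* first sample: adopt it directly */
        s->x = mx; s->y = my;
        s->initialized = 1;
        return;
    }
    s->x += alpha * (mx - s->x);    /* move a fraction alpha toward the */
    s->y += alpha * (my - s->y);    /* new measurement on each frame    */
}
```

A small alpha gives a steadier but laggier camera; a large alpha follows the head quickly but passes detector jitter through to the rendered view.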

Face detectors:

Feature tracking (manually select the eyes):

Setup of OpenGL camera for fishtank VR

One of the key points for this illusion to work is to set up and move the camera in accordance with the viewer's position.

In OpenGL the center of projection (the eye) is fixed at world coordinates (0,0,0), and we can manipulate the projection by setting the limits of the viewing volume, that is: left/right, bottom/top and near/far (left/right and bottom/top are specified on the near plane).
There are two interpretations of this figure. The first is to view it as the virtual camera model described for OpenGL: in this setting the projection plane is the screen, and the center of projection changes as the viewer moves.
In the second interpretation we think of the screen as a window, and the center of projection as the eye of the observer positioned with respect to the screen. The image is acquired by projection onto the retina inside the eye, and to simulate the window the retinal image must be projected back onto the "window" plane (the projection lines are the same).
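Under the window interpretation, the frustum bounds follow from similar triangles between the screen edges and the near plane. This is a sketch of that mapping, assuming the eye position (ex, ey, ez) is measured from the screen centre (with ez the eye-to-screen distance) and with helper names of my own:

```c
#include <assert.h>
#include <math.h>

typedef struct { double left, right, bottom, top; } FrustumBounds;

/* Map an eye position to off-axis glFrustum bounds. halfW/halfH are half
   the physical screen width/height; znear is the chosen near-plane
   distance. Each screen edge, seen from the eye, is scaled onto the near
   plane by the factor znear / ez (similar triangles). */
FrustumBounds off_axis_bounds(double ex, double ey, double ez,
                              double halfW, double halfH, double znear)
{
    double s = znear / ez;          /* screen edge -> near plane scale */
    FrustumBounds b;
    b.left   = (-halfW - ex) * s;
    b.right  = ( halfW - ex) * s;
    b.bottom = (-halfH - ey) * s;
    b.top    = ( halfH - ey) * s;
    return b;
}
```

Note that moving the eye to the left (ex < 0) makes the right bound grow and the left bound shrink, i.e. the window reveals more of the right side of the scene, which is exactly the compensation rule discussed next.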

To simulate the movement of the viewer we must change the parameters of the projection, but also move the world to compensate. In particular, if the viewer moves to the LEFT then we must see more of the right side of the scene; if the viewer approaches the screen, then the distance to the near plane must be reduced.
Observe in the figure that the movement of the camera induces a shift in the scene.
These rules are captured by the following lines of code, where the position of the viewer is given as signed (X,Y,Z) with respect to an arbitrary origin centered in front of the screen.

/* Adapt the projection to the new eye position */

  glFrustum( -1.0 + X, 1.0 + X,   /* left, right */
             -1.0 + Y, 1.0 + Y,   /* bottom, top */
             ZNEAR_proj - Z,      /* Z near: this is the projection plane */
             ZFAR );              /* Z far */

/* Compensate for the movement of the eye */
  glTranslatef(X, Y, Z);

UPDATE: this version handles the distance to the screen better. Modifying the projection matrix also makes it possible to show objects in front of the NEAR plane.

/* Adapt the projection to the new eye position */

  glFrustum( -1.0 + X, 1.0 + X,   /* left, right */
             -1.0 + Y, 1.0 + Y,   /* bottom, top */
             ZNEAR * CTE / Z,     /* Z near: this is the projection plane */
             ZFAR );              /* Z far */

/* Modify the projection matrix so that clipping happens at ZNEAR_clip
   instead of at the projection plane */
ZNEAR_clip = 0.1;

/* Recover the current matrix */
GLfloat projectionMatrix[16];
glGetFloatv(GL_PROJECTION_MATRIX, projectionMatrix);

/* Modify the entries that depend on the near/far clip distances */
projectionMatrix[10] = -(ZFAR + ZNEAR_clip) / (ZFAR - ZNEAR_clip);
projectionMatrix[14] = -2.0 * ZFAR * ZNEAR_clip / (ZFAR - ZNEAR_clip);

/* Load the modified matrix */
glMatrixMode(GL_PROJECTION);
glLoadMatrixf(projectionMatrix);
glMatrixMode(GL_MODELVIEW);

/* Compensate for the movement of the eye */
glTranslatef(X, Y, ZNEAR - ZNEAR * CTE / Z);
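The reason the "modify the entries" step touches so little of the matrix: in the column-major matrix built by glFrustum, only entries m[10] and m[14] depend on the near and far clip distances, while the remaining entries fix the shape of the frustum. A small sketch (the helper name is mine) recomputing just those two entries:

```c
#include <assert.h>
#include <math.h>

/* Recompute the two near/far-dependent entries of a column-major
   glFrustum-style projection matrix. Overwriting them with a smaller
   near value moves the clipping plane closer to the eye without
   changing the projection geometry itself, so objects in front of the
   original NEAR plane remain visible. */
void set_clip_planes(float m[16], float n, float f)
{
    m[10] = -(f + n) / (f - n);
    m[14] = -2.0f * f * n / (f - n);
}
```

In GL terms the usage would be: read back the matrix with glGetFloatv(GL_PROJECTION_MATRIX, ...), call a helper like this with ZNEAR_clip and ZFAR, and reload it with glLoadMatrixf.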

Interesting improvements:

Mulder, J. D. and van Liere, R. 2000. Enhancing Fish Tank VR. In Proceedings of the IEEE Virtual Reality 2000 Conference (March 18-22, 2000). IEEE Computer Society, Washington, DC, 91.

About projection matrices:


  1. What does CTE stand for in the above implementation?

     CTE means constant; I used CTE ~= 1000 for scaling the displacements.