This specification describes support for accessing a 3D camera for face tracking and recognition on the Web.

This document was published by the Crosswalk Project as an API Draft. If you wish to make comments regarding this document, please send them to crosswalk-dev@lists.crosswalk-project.org. All comments are welcome.

Introduction

The APIs described in this document are exposed through the realsense.Face module.

Interfaces

FaceModule

The FaceModule interface provides methods to track and recognize faces for augmented reality applications.

The MediaStream (described in [[!GETUSERMEDIA]]) passed to the constructor must have at least one video track; otherwise an exception will be thrown.
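
For example, a FaceModule instance can be constructed from a getUserMedia stream (a minimal sketch; error handling elided):

          navigator.mediaDevices.getUserMedia({video: true})
              .then(function(stream) {
                // The stream carries at least one video track,
                // so the constructor will not throw.
                var ft = new realsense.Face.FaceModule(stream);
                ft.onready = function(e) {
                  // Safe to call start() once the ready event fires.
                  ft.start();
                };
              });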

Promise<void> start()

Start running the face module on the previewStream with the current configuration.

This method returns a promise. The promise will be fulfilled if there are no errors. The promise will be rejected with a DOMException if there is a failure. Note: call this method only after the ready event has fired; otherwise an ErrorEvent will be dispatched.

Promise<void> stop()

Stop the face module and reset the face configuration to its defaults.

This method returns a promise. The promise will be fulfilled if there are no errors. The promise will be rejected with a DOMException if there is a failure.

Promise<ProcessedSample> getProcessedSample(optional boolean getColor, optional boolean getDepth)

Get a processed sample, including the resulting face data along with the processed color/depth images (optional).

This method returns a promise. The promise will be fulfilled with a ProcessedSample combining the processed color/depth images (only if requested and available) and the face module tracking/recognition output data if there are no errors. The promise will be rejected with a DOMException if there is a failure.

optional boolean getColor

A flag indicating whether to acquire the color image data. The default value is false.

optional boolean getDepth

A flag indicating whether to acquire the depth image data. The default value is false.
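
For instance, a processedsample event handler might request the processed color image along with the face data (a sketch; error handling elided):

          ft.onprocessedsample = function(e) {
            // Ask for the color image but not the depth image.
            ft.getProcessedSample(true, false).then(function(sample) {
              console.log('Detected faces: ' + sample.faces.length);
              if (sample.color) {
                // sample.color is an Image dictionary (format, width, height, data).
                console.log('Color image: ' + sample.color.width + 'x' + sample.color.height);
              }
            });
          };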

readonly attribute FaceConfiguration configuration

The interface to configure FaceModule.

readonly attribute Recognition recognition

The interface to access the face recognition feature.

readonly attribute MediaStream previewStream

The MediaStream instance passed to the constructor.

attribute EventHandler onready

A property used to set the EventHandler (described in [[!HTML]]) for the Event that is dispatched to FaceModule to indicate that it's ready to start because the previewStream has been started.

attribute EventHandler onended

A property used to set the EventHandler (described in [[!HTML]]) for the Event that is dispatched to FaceModule to indicate that the previewStream has ended and FaceModule has already detached from it completely.

attribute EventHandler onerror

A property used to set the EventHandler (described in [[!HTML]]) for the ErrorEvent that is dispatched to FaceModule when there is an error.

attribute EventHandler onprocessedsample

A property used to set the EventHandler (described in [[!HTML]]) for the Event that is dispatched to FaceModule when a new processed sample is ready.

attribute EventHandler onalert

A property used to set the EventHandler (described in [[!HTML]]) for the AlertEvent that is dispatched to FaceModule when an alert occurs.

AlertEvent interface

readonly attribute AlertType typeLabel

The label of the alert event.

readonly attribute long timeStamp

The time stamp when the event occurred, in units of 100 nanoseconds.

readonly attribute long faceId

The identifier of the relevant face, if applicable and known.
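
A sketch of an alert handler using these attributes:

          ft.onalert = function(event) {
            // typeLabel is one of the AlertType enum values.
            switch (event.typeLabel) {
              case 'new-face-detected':
                console.log('New face ' + event.faceId + ' at ' + event.timeStamp);
                break;
              case 'face-lost':
                console.log('Face ' + event.faceId + ' is lost');
                break;
            }
          };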

Recognition interface

The Recognition interface provides methods to access the face recognition feature.

Promise<long> registerUserByFaceID(long faceId)

Register a detected face in the recognition database.

This method returns a promise. The promise will be fulfilled with the user identifier registered in the recognition database if there are no errors. The promise will be rejected with a DOMException if there is a failure.

long faceId

The face identifier, which can be obtained from the detected face data FaceData.

Promise<void> unregisterUserByID(long userId)

Unregister a user from the recognition database.

This method returns a promise. The promise will be fulfilled if there are no errors. The promise will be rejected with a DOMException if there is a failure.

long userId

The user identifier in the recognition database, which can be obtained from the face recognition data RecognitionData or from the return value of registerUserByFaceID.
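
A sketch combining both methods, assuming face is a FaceData object taken from a ProcessedSample:

          // Register the detected face and remember the returned user id.
          ft.recognition.registerUserByFaceID(face.faceId).then(function(userId) {
            console.log('Registered as user ' + userId);
            // The same id can later be removed from the database.
            return ft.recognition.unregisterUserByID(userId);
          }).then(function() {
            console.log('User unregistered');
          });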

FaceConfiguration interface

The FaceConfiguration interface provides methods to configure FaceModule.

Promise<void> set(FaceConfigurationData config)

Set configuration values.

This method returns a promise. The promise will be fulfilled if there are no errors. The promise will be rejected with a DOMException if there is a failure.

FaceConfigurationData config

The face configuration to apply. Note: some configuration items, such as TrackingModeType, will not take effect while the face module is running. To change them, stop the face module first, as shown in the sketch below.
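
For example, changing the tracking mode requires stopping the module first (a sketch; assumes the module is restarted afterwards):

          // TrackingModeType cannot be changed while the module is running,
          // so stop it first. Note that stop() resets the configuration.
          ft.stop().then(function() {
            return ft.configuration.set({mode: 'color-depth'});
          }).then(function() {
            return ft.start();
          });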

Promise<FaceConfigurationData> getDefaults()

Get the default configuration values.

This method returns a promise. The promise will be fulfilled with the default face configuration if there are no errors. The promise will be rejected with a DOMException if there is a failure.

Promise<FaceConfigurationData> get()

Get current effective configuration values.

This method returns a promise. The promise will be fulfilled with the current effective face configuration if there are no errors. The promise will be rejected with a DOMException if there is a failure.

Dictionaries

Image

PixelFormat format
long width
long height
ArrayBuffer data
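
A sketch that draws an rgb32 Image onto a canvas, assuming the rgb32 pixel layout is byte-compatible with the RGBA layout expected by ImageData:

          function drawColorImage(image, canvas) {
            canvas.width = image.width;
            canvas.height = image.height;
            var context = canvas.getContext('2d');
            var imageData = context.createImageData(image.width, image.height);
            // Copy the pixel buffer; assumes 4 bytes per pixel in RGBA order.
            imageData.data.set(new Uint8ClampedArray(image.data));
            context.putImageData(imageData, 0, 0);
          }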

Rect

long x
long y
long w
long h

Point3DFloat

double x
double y
double z

Point2DFloat

double x
double y

AlertConfiguration

boolean? newFaceDetected
Enable new_face_detected alert.
boolean? faceOutOfFov
Enable face_out_of_fov alert.
boolean? faceBackToFov
Enable face_back_to_fov alert.
boolean? faceOccluded
Enable face_occluded alert.
boolean? faceNoLongerOccluded
Enable face_no_longer_occluded alert.
boolean? faceLost
Enable face_lost alert.

DetectionConfiguration

boolean? enable
Enable face detection feature.
long? maxFaces
Maximum number of faces to be tracked.

LandmarksConfiguration

boolean? enable
Enable face landmarks feature.
long? maxFaces
Maximum number of faces to be tracked.
long? numLandmarks
Maximum number of landmarks to be tracked.

RecognitionConfiguration

boolean? enable
Enable face recognition feature.

FaceConfigurationData

TrackingModeType? mode
The face tracking input data mode.
TrackingStrategyType? strategy
The face tracking strategy.
AlertConfiguration? alert

The structure describing the alert enable/disable status.

DetectionConfiguration? detection

The structure describing the face detection configuration parameters.

LandmarksConfiguration? landmarks

The structure describing the face landmarks configuration parameters.

RecognitionConfiguration? recognition

The structure describing the face recognition configuration parameters.

DetectionData

Rect boundingRect
The bounding box of the detected face.
double avgDepth
The average depth of the detected face.

LandmarkPoint

LandmarkType type
Landmark point type.
long confidenceImage
The confidence score from 0 to 100 of the image coordinates.
long confidenceWorld
The confidence score from 0 to 100 of the world coordinates.
Point3DFloat coordinateWorld
The world coordinates of the landmark point.
Point2DFloat coordinateImage
The color image coordinates of the landmark point.

LandmarksData

sequence<LandmarkPoint> points
All landmark points of the detected face.

RecognitionData

long userId
The user identifier in the recognition database.

FaceData

long? faceId
ID of the detected face.
DetectionData? detection
Detection data of the detected face.
LandmarksData? landmarks
Landmarks data of the detected face.
RecognitionData? recognition
Recognition result data of the detected face.

ProcessedSample

Image? color
Color stream image.
Image? depth
Depth stream image.
sequence<FaceData> faces
All the detected faces.
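
A sketch iterating over the faces of a ProcessedSample and reading the optional members:

          processedSample.faces.forEach(function(face) {
            if (face.detection) {
              var rect = face.detection.boundingRect;
              console.log('Face ' + face.faceId + ' at (' + rect.x + ', ' + rect.y +
                          '), average depth: ' + face.detection.avgDepth);
            }
            if (face.landmarks) {
              console.log(face.landmarks.points.length + ' landmark points');
            }
            if (face.recognition) {
              console.log('Recognized as user ' + face.recognition.userId);
            }
          });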

Enumerators

TrackingModeType enum

color

Require color data at the module input to run face algorithms.

color-depth

Require color and depth data at the module input to run face algorithms.

TrackingStrategyType enum

appearance-time

Track faces based on the time of their appearance in the scene.

closest-farthest

Track faces from the closest to the farthest.

farthest-closest

Track faces from the farthest to the closest.

left-right

Track faces from left to right.

right-left

Track faces from right to left.

AlertType enum

new-face-detected

A new face has entered the field of view and its position and bounding rectangle are available.

face-out-of-fov

A tracked face is out of the field of view (even slightly).

face-back-to-fov

A tracked face is fully back in the field of view.

face-occluded

A face is occluded (even slightly) by an object or a hand.

face-no-longer-occluded

A face is no longer occluded by any object or hand.

face-lost

A face could not be detected for too long and will be ignored.

PixelFormat enum

rgb32

The 32-bit RGB32 color format.

depth

The depth map data, as 16-bit unsigned integers.

LandmarkType enum

not-named

Unspecified.

eye-right-center

The center of the right eye.

eye-left-center

The center of the left eye.

eyelid-right-top

The top of the right eyelid.

eyelid-right-bottom

The bottom of the right eyelid.

eyelid-right-right

The right point of the right eyelid.

eyelid-right-left

The left point of the right eyelid.

eyelid-left-top

The top of the left eyelid.

eyelid-left-bottom

The bottom of the left eyelid.

eyelid-left-right

The right point of the left eyelid.

eyelid-left-left

The left point of the left eyelid.

eyebrow-right-center

The center of the right eyebrow.

eyebrow-right-right

The right point of the right eyebrow.

eyebrow-right-left

The left point of the right eyebrow.

eyebrow-left-center

The center of the left eyebrow.

eyebrow-left-right

The right point of the left eyebrow.

eyebrow-left-left

The left point of the left eyebrow.

nose-tip

The topmost point of the nose in the Z dimension.

nose-top

The top of the nose.

nose-bottom

The bottom of the nose.

nose-right

The right point of the nose.

nose-left

The left point of the nose.

lip-right

The right corner of the lips.

lip-left

The left corner of the lips.

upper-lip-center

The center of the upper lip.

upper-lip-right

The right part of the upper lip.

upper-lip-left

The left part of the upper lip.

lower-lip-center

The center of the lower lip.

lower-lip-right

The right part of the lower lip.

lower-lip-left

The left part of the lower lip.

face-border-top-right

The top-right point of the face border.

face-border-top-left

The top-left point of the face border.

chin

The bottom chin point.

Examples

Start/Stop face module

          var previewStream;
          var ft;
          // The face module requires a stream with at least one video track.
          var constraints = {video: true};
          var startButton = document.getElementById('startButton');
          var stopButton = document.getElementById('stopButton');

          function errorCallback(error) {
            console.log('getUserMedia failed: ' + error); 
          }

          // Start the stream first, then start the face module.
          startButton.onclick = function(e) {
            navigator.mediaDevices.getUserMedia(constraints)
                .then(function(stream) {
                  // Wire the media stream into a <video> element for preview.
                  previewStream = stream;
                  var previewVideo = document.querySelector('#previewVideo');
                  previewVideo.srcObject = stream;
                  previewVideo.play();

                  try {
                    ft = new realsense.Face.FaceModule(stream);
                  } catch (e) {
                    console.log('Failed to create face module: ' + e.message);
                    return;
                  }

                  ft.onready = function(e) {
                    console.log('Face module ready to start');
                    // The stream is ready; we can start the face module now.
                    ft.start().then(
                        function() {
                          console.log('Face module start succeeds');},
                        function(e) {
                          console.log('Face module start failed: ' + e.message);}); 
                  };

                  ft.onprocessedsample = function(e) {
                    console.log('Got face module processedsample event.');
                    ft.getProcessedSample(false, false).then(function(processedSample) {
                      console.log('Got face module processedsample data.');
                      // Use the processedSample.faces data as needed.
                      console.log('Number of detected faces: ' + processedSample.faces.length);
                      // All available detection/landmarks/recognition data of every
                      // face can be read from processedSample.faces.
                      // Please refer to the FaceData dictionary.
                    }, function(e) {
                      console.log('Failed to get processed sample: ' + e.message);});
                  };

                  ft.onerror = function(e) {
                    console.log('Got face module error event: ' + e.message);
                  };

                  ft.onended = function(e) {
                    console.log('Face module ended without an explicit stop');
                  };

                }, errorCallback);
          };

          function stopPreviewStream() {
            if (previewStream) {
              previewStream.getTracks().forEach(function(track) {
                track.stop();
              });
              if (ft) {
                // Remove listeners as we don't care about the events.
                ft.onerror = null;
                ft.onprocessedsample = null;
                ft = null;
              }
            }
            previewStream = null;
          }

          // Stop face module and stream.
          stopButton.onclick = function(e) {
            if (!ft) return;
            ft.stop().then(
                function() {
                  console.log('Face module stop succeeds');
                  stopPreviewStream();},
                function(e) {
                  console.log('Face module stop failed');
                  stopPreviewStream();});
          };
        

Face module configuration

          var setConfButton = document.getElementById('setConfButton');
          var getConfButton = document.getElementById('getConfButton');
          var getDefaultConfButton = document.getElementById('getDefaultConfButton');

          // Set configuration. A simple configuration example is shown below.
          // Please refer to the FaceConfigurationData dictionary for confData details.
          var confData = {
            // Set face tracking strategy.
            strategy: 'right-left',
            // Disable landmarks.
            landmarks: {
              enable: false
            }, 
            // Enable recognition.
            recognition: {
              enable: true
            }
          };
          setConfButton.onclick = function(e) {
            ft.configuration.set(confData).then(
                function() {
                  console.log('set configuration succeeds');},
                function(e) {
                  console.log(e.message);});
          };

          // Get current configuration.
          getConfButton.onclick = function(e) {
            ft.configuration.get().then(
                function(confData) {
                  // Use the confData values as needed.
                  console.log('get current configuration succeeds');},
                function(e) {
                  console.log('get configuration failed: ' + e.message);});
          };

          // Get default configuration.
          getDefaultConfButton.onclick = function(e) {
            ft.configuration.getDefaults().then(
                function(confData) {
                  // Use the confData values as needed.
                  console.log('get default configuration succeeds');},
                function(e) {
                  console.log('get default configuration failed: ' + e.message);});
          };
        

Acknowledgements