
What is the Mojo Facial Expression Recognition API?

Facial expression recognition software is valuable in many different scenarios, such as remote and hybrid teaching, online surveys, user testing, and product analytics.

Emotion AI made easy

Learn more about Use Cases on the Hoomano website

This documentation contains:

  • Quickstarts: step-by-step instructions that let you set up your environment, make calls to the service and get results in a short period of time.
  • Concepts: in-depth explanations of the service's functionality and features.
  • Tutorials: longer guides that show you how to use this service as a component in broader business solutions.
  • How-to guides: instructions for using the service in more specific or customized ways.

What does Mojo Facial Expression API provide?

You use the Mojo Facial Expression API to detect facial expressions in an image. At a minimum, each detected face corresponds to a set of anonymized facial keypoints: the coordinates of the 478 facial landmarks located on the face in each frame. From this time series of coordinates, the API estimates the facial expressions.
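For illustration only, one frame's anonymized keypoint set could be represented as in the sketch below; the field names and placeholder values are assumptions for this example, not the actual payload format.

```javascript
// Hypothetical shape of the anonymized keypoints for a single frame:
// 478 landmarks with normalized coordinates, and no image data at all.
const frameKeypoints = {
  timestamp: Date.now(), // capture time in milliseconds
  landmarks: Array.from({ length: 478 }, () => ({
    x: Math.random(), // placeholder values; a real extractor returns
    y: Math.random(), // the detected landmark positions for the face
    z: Math.random(),
  })),
};
```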

In the API response, you will find:

  • emotions listed with their detection confidence for the given frame. Confidence scores are normalized between 0 and 1, where 1 represents the highest probability of appearance and 0 the lowest. The emotions returned are amusement, surprise, and confusion.
  • social cues with their prediction value and status: attention, engagement, and interaction_status. An illustrative response shape is sketched after this list.
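As a rough illustration of the description above, a per-frame response could look like the sketch below. The field names and nesting are assumptions, not the exact schema returned by the service.

```javascript
// Illustrative response shape (field names are assumptions, not the documented schema).
const exampleResponse = {
  emotions: {
    amusement: 0.72, // confidence scores normalized between 0 and 1
    surprise: 0.05,
    confusion: 0.01,
  },
  social_cues: {
    attention: { value: 0.86, status: "high" },
    engagement: { value: 0.64, status: "medium" },
    interaction_status: { value: 1, status: "interacting" },
  },
};
```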

More info

See the "Concepts" section for more details.

How does it work in practice?

The API is composed of a frontend and a backend.

The frontend API (for example, the JavaScript API) runs on the front end of the application. It accesses the camera and extracts keypoints from the image.

This is done entirely on the end user's device; no images are sent to the server.

The keypoints are sent to the Mojo Cloud server as soon as they are computed; the rate depends on the end user's device, ranging from 10 to 30 Hz on recent smartphones and laptops. The recognition component runs the analysis and sends the result through a SocketIO server, which is accessible from the client's backend.
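As a minimal sketch of this flow, assuming a hypothetical extractKeypoints() helper for the on-device landmark extraction and illustrative endpoint and event names (none of these are the documented Mojo Cloud API), the frontend could stream keypoints and listen for results roughly like this:

```javascript
import { io } from "socket.io-client";

// Placeholder for the frontend API's local landmark extraction (hypothetical).
async function extractKeypoints(videoElement) {
  // A real implementation returns the 478 {x, y, z} landmarks detected on-device.
  return Array.from({ length: 478 }, () => ({ x: 0, y: 0, z: 0 }));
}

// Endpoint, auth scheme, and event names below are assumptions for illustration.
const socket = io("https://mojo-cloud.example.com", {
  auth: { token: "YOUR_API_KEY" },
});

async function streamFrame(videoElement) {
  // Only anonymized keypoints leave the device, never the image itself.
  const landmarks = await extractKeypoints(videoElement);
  socket.emit("keypoints", { timestamp: Date.now(), landmarks });
}

// The recognition component pushes its analysis back through SocketIO.
socket.on("expression_result", (result) => {
  console.log(result.emotions, result.social_cues);
});

// Send frames as fast as the device allows (typically 10-30 Hz).
setInterval(() => streamFrame(document.querySelector("video")), 100);
```

A production integration would throttle to the device's actual capture rate and handle reconnection, but the overall pattern stays the same: keypoints go up over the socket, expression results come back.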

How it works