Detect Faces with ML Kit on iOS

You can use ML Kit to detect faces in images and video.

See the ML Kit quickstart sample on GitHub for an example of this API in use.

Before you begin

  1. If you have not already added Firebase to your app, do so by following the steps in the getting started guide.
  2. Include the ML Kit libraries in your Podfile:
    pod 'Firebase/Core'
    pod 'Firebase/MLVision'
    pod 'Firebase/MLVisionFaceModel'
    After you install or update your project's Pods, be sure to open your Xcode project using its .xcworkspace.
  3. In your app, import Firebase:


    import Firebase


    @import Firebase;
  4. Disable bitcode generation for your project from the Build Settings > Build Options section of your project settings.

On-device face detection

Configure the face detector

To change any of the face detector's default settings, specify them with a VisionFaceDetectorOptions object before you apply face detection to an image. You can change the following settings:

Detection mode: fast (default) | accurate

Favor speed or accuracy when detecting faces.

Detect landmarks: none (default) | all

Whether to attempt to identify facial "landmarks": eyes, ears, nose, cheeks, mouth.

Classify faces: none (default) | all

Whether to classify faces into categories such as "smiling" and "eyes open".

Minimum face size: CGFloat (default: 0.1)

The minimum size, relative to the image, of faces to detect.

Enable face tracking: false (default) | true

Whether to assign faces an ID, which can be used to track faces across images.

For example, to change all of the default settings, build a VisionFaceDetectorOptions object as follows:


let options = VisionFaceDetectorOptions()
options.modeType = .accurate
options.landmarkType = .all
options.classificationType = .all
options.minFaceSize = CGFloat(0.2)
options.isTrackingEnabled = true


FIRVisionFaceDetectorOptions *options = [[FIRVisionFaceDetectorOptions alloc] init];
options.modeType = FIRVisionFaceDetectorModeAccurate;
options.landmarkType = FIRVisionFaceDetectorLandmarkAll;
options.classificationType = FIRVisionFaceDetectorClassificationAll;
options.minFaceSize = (CGFloat) 0.2f;
options.isTrackingEnabled = YES;
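
To get a feel for the minimum face size setting, the following sketch converts the relative value into an approximate pixel width. It assumes the ratio is measured against the image width, which is an interpretation and not stated in this guide. The helper name minimumDetectableFaceWidth is hypothetical and not part of ML Kit.

```swift
import CoreGraphics

/// Hypothetical helper, not part of ML Kit.
/// Assumption: minFaceSize is a ratio of face width to image width.
func minimumDetectableFaceWidth(imageWidth: CGFloat, minFaceSize: CGFloat) -> CGFloat {
    return imageWidth * minFaceSize
}

// For a 1280-pixel-wide image, the default setting (0.1) corresponds to
// faces about 128 pixels wide; raising it to 0.2 doubles that threshold.
let defaultThreshold = minimumDetectableFaceWidth(imageWidth: 1280, minFaceSize: 0.1)
let raisedThreshold = minimumDetectableFaceWidth(imageWidth: 1280, minFaceSize: 0.2)
```

Under that assumption, a larger value skips small, distant faces, which may reduce the work the detector has to do.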

Run the face detector

To detect faces in an image, pass the image as a UIImage or a CMSampleBufferRef to the VisionFaceDetector's detect(in:) method:

  1. Get an instance of VisionFaceDetector:


    lazy var vision = Vision.vision()
    let faceDetector = vision.faceDetector(options: options)


    FIRVision *vision = [FIRVision vision];
    FIRVisionFaceDetector *faceDetector = [vision faceDetector];
    // Or, to change the default settings:
    // FIRVisionFaceDetector *faceDetector =
    //     [vision faceDetectorWithOptions:options];
  2. Create a VisionImage object using a UIImage or a CMSampleBufferRef.

    To use a UIImage:

    1. If necessary, rotate the image so that its imageOrientation property is .up.
    2. Create a VisionImage object using the correctly-rotated UIImage. Do not specify any rotation metadata—the default value, .topLeft, must be used.


      let image = VisionImage(image: uiImage)


      FIRVisionImage *image = [[FIRVisionImage alloc] initWithImage:uiImage];
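
    One possible way to do the rotation in step 1 is to redraw the image with UIKit, which bakes the orientation into the pixel data. This is a minimal sketch, not the only approach; the helper name uprightImage(from:) is hypothetical.

    ```swift
    import UIKit

    /// Hypothetical helper: returns an image whose imageOrientation is .up.
    func uprightImage(from image: UIImage) -> UIImage {
        // Nothing to do if the image is already upright.
        guard image.imageOrientation != .up else { return image }

        // Keep the original scale so the pixel dimensions are preserved.
        let format = UIGraphicsImageRendererFormat.default()
        format.scale = image.scale
        let renderer = UIGraphicsImageRenderer(size: image.size, format: format)

        // draw(in:) honors imageOrientation, so the rendered copy is upright.
        return renderer.image { _ in
            image.draw(in: CGRect(origin: .zero, size: image.size))
        }
    }
    ```

    The result can then be passed to VisionImage(image:) as shown above.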

    To use a CMSampleBufferRef:

    1. Create a VisionImageMetadata object that specifies the orientation of the image data contained in the CMSampleBufferRef buffer.

      For example, if the image data must be rotated clockwise by 90 degrees to be upright:


      let metadata = VisionImageMetadata()
      metadata.orientation = .rightTop  // Row 0 is on the right and column 0 is on the top


      // Row 0 is on the right and column 0 is on the top
      FIRVisionImageMetadata *metadata = [[FIRVisionImageMetadata alloc] init];
      metadata.orientation = FIRVisionDetectorImageOrientationRightTop;
    2. Create a VisionImage object using the CMSampleBufferRef object and the rotation metadata:


      let image = VisionImage(buffer: bufferRef)
      image.metadata = metadata


      FIRVisionImage *image = [[FIRVisionImage alloc] initWithBuffer:buffer];
      image.metadata = metadata;
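
    When the buffer comes from the camera, the right orientation value depends on how the device is held and which camera is in use. The following sketch shows one plausible mapping. The function name imageOrientation(deviceOrientation:cameraPosition:) is hypothetical, and the mapping itself is an assumption for a typical AVCaptureSession setup, so verify it on a device.

    ```swift
    import AVFoundation
    import UIKit
    import Firebase

    /// Hypothetical helper: picks an orientation for camera frames.
    /// Assumption: frames arrive in the sensor's native (landscape) layout.
    func imageOrientation(
        deviceOrientation: UIDeviceOrientation,
        cameraPosition: AVCaptureDevice.Position
    ) -> VisionDetectorImageOrientation {
        let isFront = cameraPosition == .front
        switch deviceOrientation {
        case .portrait:
            return isFront ? .leftTop : .rightTop
        case .landscapeLeft:
            return isFront ? .bottomLeft : .topLeft
        case .portraitUpsideDown:
            return isFront ? .rightBottom : .leftBottom
        case .landscapeRight:
            return isFront ? .topRight : .bottomRight
        case .faceDown, .faceUp, .unknown:
            // No reliable orientation; fall back to a fixed value.
            return .leftTop
        @unknown default:
            return .leftTop
        }
    }
    ```

    For a back camera held in portrait, this returns .rightTop, which matches the 90-degree clockwise example above.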
  3. Then, pass the image to the detect(in:) method:


    faceDetector.detect(in: image) { faces, error in
      guard error == nil, let faces = faces, !faces.isEmpty else {
        // ...
        return
      }

      // Faces detected
      // ...
    }


    [faceDetector detectInImage:image
                     completion:^(NSArray<FIRVisionFace *> *faces,
                                  NSError *error) {
      if (error != nil) {
        return;
      } else if (faces != nil) {
        // Recognized faces
      }
    }];

Get information about detected faces

If the face detection operation succeeds, the face detector passes an array of VisionFace objects to the completion handler. Each VisionFace object represents a face that was detected in the image. For each face, you can get its bounding coordinates in the input image, as well as any other information you configured the face detector to find. For example:


for face in faces {
  // Boundaries of face in image
  let frame = face.frame

  if face.hasHeadEulerAngleY {
    let rotY = face.headEulerAngleY  // Head is rotated to the right rotY degrees
  }
  if face.hasHeadEulerAngleZ {
    let rotZ = face.headEulerAngleZ  // Head is tilted sideways rotZ degrees
  }

  // If landmark detection was enabled (mouth, ears, eyes, cheeks, and
  // nose available):
  if let leftEye = face.landmark(ofType: .leftEye) {
    let leftEyePosition = leftEye.position
  }

  // If classification was enabled:
  if face.hasSmilingProbability {
    let smileProb = face.smilingProbability
  }
  if face.hasRightEyeOpenProbability {
    let rightEyeOpenProb = face.rightEyeOpenProbability
  }

  // If face tracking was enabled:
  if face.hasTrackingID {
    let trackingId = face.trackingID
  }
}


for (FIRVisionFace *face in faces) {
  // Boundaries of face in image
  CGRect frame = face.frame;

  if (face.hasHeadEulerAngleY) {
    CGFloat rotY = face.headEulerAngleY;  // Head is rotated to the right rotY degrees
  }
  if (face.hasHeadEulerAngleZ) {
    CGFloat rotZ = face.headEulerAngleZ;  // Head is tilted sideways rotZ degrees
  }

  // If landmark detection was enabled (mouth, ears, eyes, cheeks, and
  // nose available):
  FIRVisionFaceLandmark *leftEar = [face landmarkOfType:FIRFaceLandmarkTypeLeftEar];
  if (leftEar != nil) {
    FIRVisionPoint *leftEarPosition = leftEar.position;
  }

  // If classification was enabled:
  if (face.hasSmilingProbability) {
    CGFloat smileProb = face.smilingProbability;
  }
  if (face.hasRightEyeOpenProbability) {
    CGFloat rightEyeOpenProb = face.rightEyeOpenProbability;
  }

  // If face tracking was enabled:
  if (face.hasTrackingID) {
    NSInteger trackingID = face.trackingID;
  }
}
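
Because each face's frame is expressed in the coordinates of the input image, drawing it over a scaled preview requires a conversion. The following sketch shows one way to do that for an image displayed aspect-fit and centered in a view. The helper name viewRect(forFaceFrame:imageSize:viewSize:) is hypothetical, and the sketch assumes the frame and imageSize use the same units.

```swift
import CoreGraphics

/// Hypothetical helper: maps a face frame from image coordinates to the
/// coordinates of a view that shows the image aspect-fit and centered.
func viewRect(forFaceFrame frame: CGRect, imageSize: CGSize, viewSize: CGSize) -> CGRect {
    guard imageSize.width > 0, imageSize.height > 0 else { return .zero }

    // Aspect-fit uses the smaller of the two scale factors.
    let scale = min(viewSize.width / imageSize.width,
                    viewSize.height / imageSize.height)

    // Letterbox offsets that center the scaled image in the view.
    let xOffset = (viewSize.width - imageSize.width * scale) / 2
    let yOffset = (viewSize.height - imageSize.height * scale) / 2

    return CGRect(x: frame.origin.x * scale + xOffset,
                  y: frame.origin.y * scale + yOffset,
                  width: frame.width * scale,
                  height: frame.height * scale)
}
```

For example, a 200 x 200 frame at (100, 100) in a 1000 x 500 image maps to a 100 x 100 rectangle at (50, 175) in a 500 x 500 view.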