在 Android 上使用 Firebase ML 來辨識圖片中的文字

您可以使用 Firebase ML 辨識圖片中的文字。Firebase ML 提供一般用途 API,可用於辨識圖片中的文字,例如路標上的文字,以及經過最佳化調整的 API,可用於辨識文件中的文字。

事前準備

  1. 如果您尚未將 Firebase 新增至 Android 專案,請新增 Firebase
  2. 模組 (應用程式層級) Gradle 檔案 (通常是 <project>/<app-module>/build.gradle.kts<project>/<app-module>/build.gradle) 中,加入 Android 版 Firebase ML Vision 程式庫的依附元件。建議您使用 Firebase Android BoM 來控管程式庫版本。
    dependencies {
        // Import the BoM for the Firebase platform
        implementation(platform("com.google.firebase:firebase-bom:33.7.0"))
    
        // Add the dependency for the Firebase ML Vision library
        // When using the BoM, you don't specify versions in Firebase library dependencies
        implementation 'com.google.firebase:firebase-ml-vision'
    }

    只要使用 Firebase Android BoM,應用程式就會一律使用相容的 Firebase Android 程式庫版本。

    (替代做法)  使用 BoM 新增 Firebase 程式庫依附元件

    如果您選擇不使用 Firebase BoM,則必須在依附元件行中指定每個 Firebase 程式庫版本。

    請注意,如果您在應用程式中使用多個 Firebase 程式庫,強烈建議您使用 BoM 來管理程式庫版本,確保所有版本皆相容。

    dependencies {
        // Add the dependency for the Firebase ML Vision library
        // When NOT using the BoM, you must specify versions in Firebase library dependencies
        implementation 'com.google.firebase:firebase-ml-vision:24.1.0'
    }
    想尋找 Kotlin 專屬的程式庫模組嗎?2023 年 10 月 (Firebase BoM 32.5.0)起,Kotlin 和 Java 開發人員都可以依附主要程式庫模組 (詳情請參閱這項計畫的常見問題)。
  3. 如果您尚未為專案啟用雲端 API,請立即啟用:

    1. 開啟 Firebase 控制台的 Firebase ML API 頁面
    2. 如果您尚未將專案升級至 Blaze 定價方案,請按一下「Upgrade」進行升級 (只有在專案未採用 Blaze 方案時,系統才會提示您升級)。

      只有 Blaze 級別專案可以使用雲端 API。

    3. 如果您尚未啟用雲端 API,請按一下「啟用雲端 API」

您現在可以開始辨識圖片中的文字。

輸入圖片規範

  • 如要讓 Firebase ML 準確辨識文字,輸入圖片必須包含由足夠像素資料代表的文字。以拉丁文為例,每個字元應至少為 16 x 16 像素。若是中文、日文和韓文,每個字元應為 24x24 像素。對於所有語言而言,字元大小超過 24x24 像素通常不會帶來任何準確度優勢。

    舉例來說,如果要掃描名片,640x480 的圖片可能會佔用整個圖片的寬度,如要掃描以 A4 大小紙張列印的文件,可能需要 720 x 1280 像素的圖片。

  • 若圖片對焦不佳,文字辨識準確度可能會降低。如果您無法取得可接受的結果,請嘗試要求使用者重新拍攝圖片。


辨識圖片中的文字

如要辨識圖片中的文字,請按照下方說明執行文字辨識器。

1. 執行文字辨識器

如要辨識圖片中的文字,請從裝置上的 Bitmapmedia.ImageByteBuffer、位元組陣列或檔案建立 FirebaseVisionImage 物件。接著,將 FirebaseVisionImage 物件傳遞至 FirebaseVisionTextRecognizerprocessImage 方法。

  1. 從圖片建立 FirebaseVisionImage 物件。

    • 如要從 media.Image 物件建立 FirebaseVisionImage 物件 (例如從裝置相機擷取圖片時),請將 media.Image 物件和圖片的旋轉角度傳遞至 FirebaseVisionImage.fromMediaImage()

      如果您使用 CameraX 程式庫,OnImageCapturedListenerImageAnalysis.Analyzer 類別會為您計算旋轉值,因此您只需在呼叫 FirebaseVisionImage.fromMediaImage() 之前,將旋轉值轉換為 Firebase MLROTATION_ 常數:

      Kotlin

      private class YourImageAnalyzer : ImageAnalysis.Analyzer {
          private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
              0 -> FirebaseVisionImageMetadata.ROTATION_0
              90 -> FirebaseVisionImageMetadata.ROTATION_90
              180 -> FirebaseVisionImageMetadata.ROTATION_180
              270 -> FirebaseVisionImageMetadata.ROTATION_270
              else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
          }
      
          override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
              val mediaImage = imageProxy?.image
              val imageRotation = degreesToFirebaseRotation(degrees)
              if (mediaImage != null) {
                  val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
                  // Pass image to an ML Vision API
                  // ...
              }
          }
      }

      Java

      private class YourAnalyzer implements ImageAnalysis.Analyzer {
      
          private int degreesToFirebaseRotation(int degrees) {
              switch (degrees) {
                  case 0:
                      return FirebaseVisionImageMetadata.ROTATION_0;
                  case 90:
                      return FirebaseVisionImageMetadata.ROTATION_90;
                  case 180:
                      return FirebaseVisionImageMetadata.ROTATION_180;
                  case 270:
                      return FirebaseVisionImageMetadata.ROTATION_270;
                  default:
                      throw new IllegalArgumentException(
                              "Rotation must be 0, 90, 180, or 270.");
              }
          }
      
          @Override
          public void analyze(ImageProxy imageProxy, int degrees) {
              if (imageProxy == null || imageProxy.getImage() == null) {
                  return;
              }
              Image mediaImage = imageProxy.getImage();
              int rotation = degreesToFirebaseRotation(degrees);
              FirebaseVisionImage image =
                      FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
              // Pass image to an ML Vision API
              // ...
          }
      }

      如果您未使用可提供圖片旋轉角度的相機程式庫,可以根據裝置旋轉角度和裝置中相機感應器的方向計算:

      Kotlin

      private val ORIENTATIONS = SparseIntArray()
      
      init {
          ORIENTATIONS.append(Surface.ROTATION_0, 90)
          ORIENTATIONS.append(Surface.ROTATION_90, 0)
          ORIENTATIONS.append(Surface.ROTATION_180, 270)
          ORIENTATIONS.append(Surface.ROTATION_270, 180)
      }
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      @Throws(CameraAccessException::class)
      private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          val deviceRotation = activity.windowManager.defaultDisplay.rotation
          var rotationCompensation = ORIENTATIONS.get(deviceRotation)
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
          val sensorOrientation = cameraManager
              .getCameraCharacteristics(cameraId)
              .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          val result: Int
          when (rotationCompensation) {
              0 -> result = FirebaseVisionImageMetadata.ROTATION_0
              90 -> result = FirebaseVisionImageMetadata.ROTATION_90
              180 -> result = FirebaseVisionImageMetadata.ROTATION_180
              270 -> result = FirebaseVisionImageMetadata.ROTATION_270
              else -> {
                  result = FirebaseVisionImageMetadata.ROTATION_0
                  Log.e(TAG, "Bad rotation value: $rotationCompensation")
              }
          }
          return result
      }

      Java

      private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
      static {
          ORIENTATIONS.append(Surface.ROTATION_0, 90);
          ORIENTATIONS.append(Surface.ROTATION_90, 0);
          ORIENTATIONS.append(Surface.ROTATION_180, 270);
          ORIENTATIONS.append(Surface.ROTATION_270, 180);
      }
      
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      private int getRotationCompensation(String cameraId, Activity activity, Context context)
              throws CameraAccessException {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
          int rotationCompensation = ORIENTATIONS.get(deviceRotation);
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
          int sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION);
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          int result;
          switch (rotationCompensation) {
              case 0:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  break;
              case 90:
                  result = FirebaseVisionImageMetadata.ROTATION_90;
                  break;
              case 180:
                  result = FirebaseVisionImageMetadata.ROTATION_180;
                  break;
              case 270:
                  result = FirebaseVisionImageMetadata.ROTATION_270;
                  break;
              default:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  Log.e(TAG, "Bad rotation value: " + rotationCompensation);
          }
          return result;
      }

      接著,將 media.Image 物件和旋轉值傳遞至 FirebaseVisionImage.fromMediaImage()

      Kotlin

      val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)

      Java

      FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
    • 如要從檔案 URI 建立 FirebaseVisionImage 物件,請將應用程式背景資訊和檔案 URI 傳遞至 FirebaseVisionImage.fromFilePath()。這在您使用 ACTION_GET_CONTENT 意圖,提示使用者從相片庫應用程式中選取圖片時,非常實用。

      Kotlin

      val image: FirebaseVisionImage
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri)
      } catch (e: IOException) {
          e.printStackTrace()
      }

      Java

      FirebaseVisionImage image;
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri);
      } catch (IOException e) {
          e.printStackTrace();
      }
    • 如要從 ByteBuffer 或位元組陣列建立 FirebaseVisionImage 物件,請先計算圖片旋轉角度,如上文所述的 media.Image 輸入資料。

      接著,請建立 FirebaseVisionImageMetadata 物件,其中包含圖片的高度、寬度、顏色編碼格式和旋轉角度:

      Kotlin

      val metadata = FirebaseVisionImageMetadata.Builder()
          .setWidth(480) // 480x360 is typically sufficient for
          .setHeight(360) // image recognition
          .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
          .setRotation(rotation)
          .build()

      Java

      FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
              .setWidth(480)   // 480x360 is typically sufficient for
              .setHeight(360)  // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build();

      使用緩衝區或陣列和中繼資料物件,建立 FirebaseVisionImage 物件:

      Kotlin

      val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
      // Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)

      Java

      FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
      // Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);
    • 如要從 Bitmap 物件建立 FirebaseVisionImage 物件,請按照下列步驟操作:

      Kotlin

      val image = FirebaseVisionImage.fromBitmap(bitmap)

      Java

      FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);
      Bitmap 物件所代表的圖片必須是直立的,不需要額外旋轉。

  2. 取得 FirebaseVisionTextRecognizer 的例項。

    Kotlin

    val detector = FirebaseVision.getInstance().cloudTextRecognizer
    // Or, to change the default settings:
    // val detector = FirebaseVision.getInstance().getCloudTextRecognizer(options)
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    val options = FirebaseVisionCloudTextRecognizerOptions.Builder()
        .setLanguageHints(listOf("en", "hi"))
        .build()

    Java

    FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
            .getCloudTextRecognizer();
    // Or, to change the default settings:
    //   FirebaseVisionTextRecognizer detector = FirebaseVision.getInstance()
    //          .getCloudTextRecognizer(options);
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    FirebaseVisionCloudTextRecognizerOptions options = new FirebaseVisionCloudTextRecognizerOptions.Builder()
            .setLanguageHints(Arrays.asList("en", "hi"))
            .build();
  3. 最後,將圖片傳遞至 processImage 方法:

    Kotlin

    val result = detector.processImage(image)
        .addOnSuccessListener { firebaseVisionText ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }

    Java

    Task<FirebaseVisionText> result =
            detector.processImage(image)
                    .addOnSuccessListener(new OnSuccessListener<FirebaseVisionText>() {
                        @Override
                        public void onSuccess(FirebaseVisionText firebaseVisionText) {
                            // Task completed successfully
                            // ...
                        }
                    })
                    .addOnFailureListener(
                            new OnFailureListener() {
                                @Override
                                public void onFailure(@NonNull Exception e) {
                                    // Task failed with an exception
                                    // ...
                                }
                            });

2. 從已辨識的文字區塊中擷取文字

如果文字辨識作業成功,系統會將 FirebaseVisionText 物件傳遞至成功事件監聽器。FirebaseVisionText 物件包含圖片中辨識到的完整文字,以及零個或多個 TextBlock 物件。

每個 TextBlock 都代表一個矩形文字區塊,其中可能包含零個或多個 Line 物件。每個 Line 物件都包含零個或多個 Element 物件,代表字詞和類似字詞的實體 (日期、數字等)。

您可以為每個 TextBlockLineElement 物件取得在該區域中辨識到的文字,以及該區域的邊界座標。

例如:

Kotlin

val resultText = result.text
for (block in result.textBlocks) {
    val blockText = block.text
    val blockConfidence = block.confidence
    val blockLanguages = block.recognizedLanguages
    val blockCornerPoints = block.cornerPoints
    val blockFrame = block.boundingBox
    for (line in block.lines) {
        val lineText = line.text
        val lineConfidence = line.confidence
        val lineLanguages = line.recognizedLanguages
        val lineCornerPoints = line.cornerPoints
        val lineFrame = line.boundingBox
        for (element in line.elements) {
            val elementText = element.text
            val elementConfidence = element.confidence
            val elementLanguages = element.recognizedLanguages
            val elementCornerPoints = element.cornerPoints
            val elementFrame = element.boundingBox
        }
    }
}

Java

String resultText = result.getText();
for (FirebaseVisionText.TextBlock block: result.getTextBlocks()) {
    String blockText = block.getText();
    Float blockConfidence = block.getConfidence();
    List<RecognizedLanguage> blockLanguages = block.getRecognizedLanguages();
    Point[] blockCornerPoints = block.getCornerPoints();
    Rect blockFrame = block.getBoundingBox();
    for (FirebaseVisionText.Line line: block.getLines()) {
        String lineText = line.getText();
        Float lineConfidence = line.getConfidence();
        List<RecognizedLanguage> lineLanguages = line.getRecognizedLanguages();
        Point[] lineCornerPoints = line.getCornerPoints();
        Rect lineFrame = line.getBoundingBox();
        for (FirebaseVisionText.Element element: line.getElements()) {
            String elementText = element.getText();
            Float elementConfidence = element.getConfidence();
            List<RecognizedLanguage> elementLanguages = element.getRecognizedLanguages();
            Point[] elementCornerPoints = element.getCornerPoints();
            Rect elementFrame = element.getBoundingBox();
        }
    }
}

後續步驟


辨識文件圖片中的文字

如要辨識文件中的文字,請按照下文所述設定及執行文件文字辨識器。

下文所述的文件文字辨識 API 提供的介面,旨在讓您更方便處理文件圖片。不過,如果您偏好 FirebaseVisionTextRecognizer API 提供的介面,可以改用該介面掃描文件,方法是設定雲端文字辨識器,以使用密集文字模型

如要使用文件文字辨識 API,請按照下列步驟操作:

1. 執行文字辨識器

如要辨識圖片中的文字,請使用 Bitmapmedia.ImageByteBuffer、位元組陣列或裝置上的檔案,建立 FirebaseVisionImage 物件。接著,將 FirebaseVisionImage 物件傳遞至 FirebaseVisionDocumentTextRecognizerprocessImage 方法。

  1. 從圖片建立 FirebaseVisionImage 物件。

    • 如要從 media.Image 物件建立 FirebaseVisionImage 物件 (例如從裝置相機擷取圖片時),請將 media.Image 物件和圖片的旋轉角度傳遞至 FirebaseVisionImage.fromMediaImage()

      如果您使用 CameraX 程式庫,OnImageCapturedListenerImageAnalysis.Analyzer 類別會為您計算旋轉值,因此您只需在呼叫 FirebaseVisionImage.fromMediaImage() 之前,將旋轉值轉換為 Firebase MLROTATION_ 常數:

      Kotlin

      private class YourImageAnalyzer : ImageAnalysis.Analyzer {
          private fun degreesToFirebaseRotation(degrees: Int): Int = when(degrees) {
              0 -> FirebaseVisionImageMetadata.ROTATION_0
              90 -> FirebaseVisionImageMetadata.ROTATION_90
              180 -> FirebaseVisionImageMetadata.ROTATION_180
              270 -> FirebaseVisionImageMetadata.ROTATION_270
              else -> throw Exception("Rotation must be 0, 90, 180, or 270.")
          }
      
          override fun analyze(imageProxy: ImageProxy?, degrees: Int) {
              val mediaImage = imageProxy?.image
              val imageRotation = degreesToFirebaseRotation(degrees)
              if (mediaImage != null) {
                  val image = FirebaseVisionImage.fromMediaImage(mediaImage, imageRotation)
                  // Pass image to an ML Vision API
                  // ...
              }
          }
      }

      Java

      private class YourAnalyzer implements ImageAnalysis.Analyzer {
      
          private int degreesToFirebaseRotation(int degrees) {
              switch (degrees) {
                  case 0:
                      return FirebaseVisionImageMetadata.ROTATION_0;
                  case 90:
                      return FirebaseVisionImageMetadata.ROTATION_90;
                  case 180:
                      return FirebaseVisionImageMetadata.ROTATION_180;
                  case 270:
                      return FirebaseVisionImageMetadata.ROTATION_270;
                  default:
                      throw new IllegalArgumentException(
                              "Rotation must be 0, 90, 180, or 270.");
              }
          }
      
          @Override
          public void analyze(ImageProxy imageProxy, int degrees) {
              if (imageProxy == null || imageProxy.getImage() == null) {
                  return;
              }
              Image mediaImage = imageProxy.getImage();
              int rotation = degreesToFirebaseRotation(degrees);
              FirebaseVisionImage image =
                      FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
              // Pass image to an ML Vision API
              // ...
          }
      }

      如果您未使用可提供圖片旋轉角度的相機程式庫,可以根據裝置旋轉角度和裝置中相機感應器的方向計算:

      Kotlin

      private val ORIENTATIONS = SparseIntArray()
      
      init {
          ORIENTATIONS.append(Surface.ROTATION_0, 90)
          ORIENTATIONS.append(Surface.ROTATION_90, 0)
          ORIENTATIONS.append(Surface.ROTATION_180, 270)
          ORIENTATIONS.append(Surface.ROTATION_270, 180)
      }
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      @Throws(CameraAccessException::class)
      private fun getRotationCompensation(cameraId: String, activity: Activity, context: Context): Int {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          val deviceRotation = activity.windowManager.defaultDisplay.rotation
          var rotationCompensation = ORIENTATIONS.get(deviceRotation)
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          val cameraManager = context.getSystemService(CAMERA_SERVICE) as CameraManager
          val sensorOrientation = cameraManager
              .getCameraCharacteristics(cameraId)
              .get(CameraCharacteristics.SENSOR_ORIENTATION)!!
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          val result: Int
          when (rotationCompensation) {
              0 -> result = FirebaseVisionImageMetadata.ROTATION_0
              90 -> result = FirebaseVisionImageMetadata.ROTATION_90
              180 -> result = FirebaseVisionImageMetadata.ROTATION_180
              270 -> result = FirebaseVisionImageMetadata.ROTATION_270
              else -> {
                  result = FirebaseVisionImageMetadata.ROTATION_0
                  Log.e(TAG, "Bad rotation value: $rotationCompensation")
              }
          }
          return result
      }

      Java

      private static final SparseIntArray ORIENTATIONS = new SparseIntArray();
      static {
          ORIENTATIONS.append(Surface.ROTATION_0, 90);
          ORIENTATIONS.append(Surface.ROTATION_90, 0);
          ORIENTATIONS.append(Surface.ROTATION_180, 270);
          ORIENTATIONS.append(Surface.ROTATION_270, 180);
      }
      
      /**
       * Get the angle by which an image must be rotated given the device's current
       * orientation.
       */
      @RequiresApi(api = Build.VERSION_CODES.LOLLIPOP)
      private int getRotationCompensation(String cameraId, Activity activity, Context context)
              throws CameraAccessException {
          // Get the device's current rotation relative to its "native" orientation.
          // Then, from the ORIENTATIONS table, look up the angle the image must be
          // rotated to compensate for the device's rotation.
          int deviceRotation = activity.getWindowManager().getDefaultDisplay().getRotation();
          int rotationCompensation = ORIENTATIONS.get(deviceRotation);
      
          // On most devices, the sensor orientation is 90 degrees, but for some
          // devices it is 270 degrees. For devices with a sensor orientation of
          // 270, rotate the image an additional 180 ((270 + 270) % 360) degrees.
          CameraManager cameraManager = (CameraManager) context.getSystemService(CAMERA_SERVICE);
          int sensorOrientation = cameraManager
                  .getCameraCharacteristics(cameraId)
                  .get(CameraCharacteristics.SENSOR_ORIENTATION);
          rotationCompensation = (rotationCompensation + sensorOrientation + 270) % 360;
      
          // Return the corresponding FirebaseVisionImageMetadata rotation value.
          int result;
          switch (rotationCompensation) {
              case 0:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  break;
              case 90:
                  result = FirebaseVisionImageMetadata.ROTATION_90;
                  break;
              case 180:
                  result = FirebaseVisionImageMetadata.ROTATION_180;
                  break;
              case 270:
                  result = FirebaseVisionImageMetadata.ROTATION_270;
                  break;
              default:
                  result = FirebaseVisionImageMetadata.ROTATION_0;
                  Log.e(TAG, "Bad rotation value: " + rotationCompensation);
          }
          return result;
      }

      接著,將 media.Image 物件和旋轉值傳遞至 FirebaseVisionImage.fromMediaImage()

      Kotlin

      val image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation)

      Java

      FirebaseVisionImage image = FirebaseVisionImage.fromMediaImage(mediaImage, rotation);
    • 如要從檔案 URI 建立 FirebaseVisionImage 物件,請將應用程式背景資訊和檔案 URI 傳遞至 FirebaseVisionImage.fromFilePath()。這在您使用 ACTION_GET_CONTENT 意圖,提示使用者從相片庫應用程式中選取圖片時,非常實用。

      Kotlin

      val image: FirebaseVisionImage
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri)
      } catch (e: IOException) {
          e.printStackTrace()
      }

      Java

      FirebaseVisionImage image;
      try {
          image = FirebaseVisionImage.fromFilePath(context, uri);
      } catch (IOException e) {
          e.printStackTrace();
      }
    • 如要從 ByteBuffer 或位元組陣列建立 FirebaseVisionImage 物件,請先計算圖片旋轉角度,如上文所述的 media.Image 輸入資料。

      接著,請建立 FirebaseVisionImageMetadata 物件,其中包含圖片的高度、寬度、顏色編碼格式和旋轉角度:

      Kotlin

      val metadata = FirebaseVisionImageMetadata.Builder()
          .setWidth(480) // 480x360 is typically sufficient for
          .setHeight(360) // image recognition
          .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
          .setRotation(rotation)
          .build()

      Java

      FirebaseVisionImageMetadata metadata = new FirebaseVisionImageMetadata.Builder()
              .setWidth(480)   // 480x360 is typically sufficient for
              .setHeight(360)  // image recognition
              .setFormat(FirebaseVisionImageMetadata.IMAGE_FORMAT_NV21)
              .setRotation(rotation)
              .build();

      使用緩衝區或陣列和中繼資料物件,建立 FirebaseVisionImage 物件:

      Kotlin

      val image = FirebaseVisionImage.fromByteBuffer(buffer, metadata)
      // Or: val image = FirebaseVisionImage.fromByteArray(byteArray, metadata)

      Java

      FirebaseVisionImage image = FirebaseVisionImage.fromByteBuffer(buffer, metadata);
      // Or: FirebaseVisionImage image = FirebaseVisionImage.fromByteArray(byteArray, metadata);
    • 如要從 Bitmap 物件建立 FirebaseVisionImage 物件,請按照下列步驟操作:

      Kotlin

      val image = FirebaseVisionImage.fromBitmap(bitmap)

      Java

      FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(bitmap);
      Bitmap 物件所代表的圖片必須是直立的,不需要額外旋轉。

  2. 取得 FirebaseVisionDocumentTextRecognizer 的例項:

    Kotlin

    val detector = FirebaseVision.getInstance()
        .cloudDocumentTextRecognizer
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    val options = FirebaseVisionCloudDocumentRecognizerOptions.Builder()
        .setLanguageHints(listOf("en", "hi"))
        .build()
    val detector = FirebaseVision.getInstance()
        .getCloudDocumentTextRecognizer(options)

    Java

    FirebaseVisionDocumentTextRecognizer detector = FirebaseVision.getInstance()
            .getCloudDocumentTextRecognizer();
    // Or, to provide language hints to assist with language detection:
    // See https://cloud.google.com/vision/docs/languages for supported languages
    FirebaseVisionCloudDocumentRecognizerOptions options =
            new FirebaseVisionCloudDocumentRecognizerOptions.Builder()
                    .setLanguageHints(Arrays.asList("en", "hi"))
                    .build();
    FirebaseVisionDocumentTextRecognizer detector = FirebaseVision.getInstance()
            .getCloudDocumentTextRecognizer(options);

  3. 最後,將圖片傳遞至 processImage 方法:

    Kotlin

    detector.processImage(myImage)
        .addOnSuccessListener { firebaseVisionDocumentText ->
            // Task completed successfully
            // ...
        }
        .addOnFailureListener { e ->
            // Task failed with an exception
            // ...
        }

    Java

    detector.processImage(myImage)
            .addOnSuccessListener(new OnSuccessListener<FirebaseVisionDocumentText>() {
                @Override
                public void onSuccess(FirebaseVisionDocumentText result) {
                    // Task completed successfully
                    // ...
                }
            })
            .addOnFailureListener(new OnFailureListener() {
                @Override
                public void onFailure(@NonNull Exception e) {
                    // Task failed with an exception
                    // ...
                }
            });

2. 從已辨識的文字區塊中擷取文字

如果文字辨識作業成功,系統會傳回 FirebaseVisionDocumentText 物件。FirebaseVisionDocumentText 物件包含圖片中辨識到的完整文字,以及反映已辨識文件結構的物件階層:

您可以為每個 BlockParagraphWordSymbol 物件取得在該區域中辨識到的文字,以及該區域的邊界座標。

例如:

Kotlin

val resultText = result.text
for (block in result.blocks) {
    val blockText = block.text
    val blockConfidence = block.confidence
    val blockRecognizedLanguages = block.recognizedLanguages
    val blockFrame = block.boundingBox
    for (paragraph in block.paragraphs) {
        val paragraphText = paragraph.text
        val paragraphConfidence = paragraph.confidence
        val paragraphRecognizedLanguages = paragraph.recognizedLanguages
        val paragraphFrame = paragraph.boundingBox
        for (word in paragraph.words) {
            val wordText = word.text
            val wordConfidence = word.confidence
            val wordRecognizedLanguages = word.recognizedLanguages
            val wordFrame = word.boundingBox
            for (symbol in word.symbols) {
                val symbolText = symbol.text
                val symbolConfidence = symbol.confidence
                val symbolRecognizedLanguages = symbol.recognizedLanguages
                val symbolFrame = symbol.boundingBox
            }
        }
    }
}

Java

String resultText = result.getText();
for (FirebaseVisionDocumentText.Block block: result.getBlocks()) {
    String blockText = block.getText();
    Float blockConfidence = block.getConfidence();
    List<RecognizedLanguage> blockRecognizedLanguages = block.getRecognizedLanguages();
    Rect blockFrame = block.getBoundingBox();
    for (FirebaseVisionDocumentText.Paragraph paragraph: block.getParagraphs()) {
        String paragraphText = paragraph.getText();
        Float paragraphConfidence = paragraph.getConfidence();
        List<RecognizedLanguage> paragraphRecognizedLanguages = paragraph.getRecognizedLanguages();
        Rect paragraphFrame = paragraph.getBoundingBox();
        for (FirebaseVisionDocumentText.Word word: paragraph.getWords()) {
            String wordText = word.getText();
            Float wordConfidence = word.getConfidence();
            List<RecognizedLanguage> wordRecognizedLanguages = word.getRecognizedLanguages();
            Rect wordFrame = word.getBoundingBox();
            for (FirebaseVisionDocumentText.Symbol symbol: word.getSymbols()) {
                String symbolText = symbol.getText();
                Float symbolConfidence = symbol.getConfidence();
                List<RecognizedLanguage> symbolRecognizedLanguages = symbol.getRecognizedLanguages();
                Rect symbolFrame = symbol.getBoundingBox();
            }
        }
    }
}

後續步驟