To train an image labeling model, you provide AutoML Vision Edge with a set of images and corresponding labels. AutoML Vision Edge uses this dataset to train a new model in the cloud, which you can use for on-device image labeling in your app.
Before you begin
If you don't already have a Firebase project, create one in the Firebase console.
Familiarize yourself with the guidelines presented in Inclusive ML guide - AutoML.
1. Assemble your training data
First, you need to put together a training dataset of labeled images. Keep the following guidelines in mind:
The images must be in one of the following formats: JPEG, PNG, GIF, BMP, ICO.
Each image must be 30 MB or smaller. AutoML Vision Edge downscales most images during preprocessing, so providing very high-resolution images generally doesn't improve accuracy.
Include at least 10, and preferably 100 or more, examples of each label.
If your Firebase project is on the Spark plan, your project can have at most one dataset, with a maximum of 1,000 images.
Include multiple angles, resolutions, and backgrounds for each label.
The training data should be as close as possible to the data on which predictions are to be made. For example, if your use case involves blurry and low-resolution images (such as from a security camera), your training data should be composed of blurry, low-resolution images.
The models generated by AutoML Vision Edge are optimized for photographs of objects in the real world. They might not work well for X-rays, hand drawings, scanned documents, receipts, and so on.
Also, the models generally can't predict labels that humans can't assign. If a human can't assign a label after looking at an image for 1-2 seconds, the model likely can't be trained to do it either.
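The format, size, and per-label guidelines above can be checked locally before uploading. The following is a minimal sketch, not part of any Firebase tooling: the function name check_dataset and the one-directory-per-label layout are assumptions made for illustration, the format check looks only at file extensions, and "30 MB" is interpreted as 30 MiB.

```python
from collections import Counter
from pathlib import Path

# Limits taken from the guidelines above.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".ico"}
MAX_BYTES = 30 * 1024 * 1024  # assumption: "30 MB" read as 30 MiB
MIN_PER_LABEL = 10


def check_dataset(root):
    """Return a list of human-readable problems found under `root`.

    Assumes a hypothetical layout with one subdirectory per label.
    """
    problems = []
    counts = Counter()
    for label_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        counts[label_dir.name] = 0
        for image in sorted(p for p in label_dir.iterdir() if p.is_file()):
            # Extension check only; the file contents are not inspected.
            if image.suffix.lower() not in ALLOWED_EXTENSIONS:
                problems.append(f"{image}: unsupported format")
                continue
            if image.stat().st_size > MAX_BYTES:
                problems.append(f"{image}: larger than 30 MB")
                continue
            counts[label_dir.name] += 1
    for label, n in counts.items():
        if n < MIN_PER_LABEL:
            problems.append(
                f"{label}: only {n} usable images (need at least {MIN_PER_LABEL})"
            )
    return problems
```

Calling check_dataset("my_training_data") returns an empty list when every label directory passes these three checks. The remaining guidelines (variety of angles and backgrounds, similarity to production data) still need a manual review.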
When you have your training images ready, prepare them to import into Firebase. You have three options:
Option 1: Structured zip archive
Organize your training images into directories, each named after a label and containing images that are examples of that label. Then, compress the directory structure into a zip archive. For example:
my_training_data.zip
  |____accordion
  | |____001.jpg
  | |____002.jpg
  | |____003.jpg
  |____bass_guitar
  | |____hofner.gif
  | |____p-bass.png
  |____clavier
    |____well-tempered.jpg
    |____well-tempered (1).jpg
    |____well-tempered (2).jpg
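One way to produce an archive with this layout is Python's standard zipfile module. This is a sketch under assumptions: build_training_zip is a hypothetical helper, and the images are assumed to already sit in one directory per label under a common root.

```python
import zipfile
from pathlib import Path


def build_training_zip(root, zip_path):
    """Zip `root` so each label directory sits at the top level of the archive.

    `zip_path` should be outside `root`, otherwise the archive would
    end up containing itself.
    """
    root = Path(root)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for image in sorted(root.rglob("*")):
            if image.is_file():
                # Stored as "<label>/<file name>", relative to the dataset root.
                archive.write(image, image.relative_to(root).as_posix())
    return zip_path
```

For example, build_training_zip("my_training_data", "my_training_data.zip") stores accordion/001.jpg and the other images with the label directory as the first path component.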
Option 2: Cloud Storage with CSV index
Blaze plan only: Upload your training images to Google Cloud Storage and prepare a CSV file listing the URL of each image, and, optionally, the correct labels for each image. This option is helpful when using very large datasets.
For example, upload your images to Cloud Storage, and prepare a CSV file like the following:
gs://your-training-data-bucket/001.jpg,accordion
gs://your-training-data-bucket/002.jpg,accordion
gs://your-training-data-bucket/003.jpg,accordion
gs://your-training-data-bucket/hofner.gif,bass_guitar
gs://your-training-data-bucket/p-bass.png,bass_guitar
gs://your-training-data-bucket/well-tempered.jpg,clavier
gs://your-training-data-bucket/well-tempered%20(1).jpg,clavier
gs://your-training-data-bucket/well-tempered%20(2).jpg,clavier
The images must be stored in a bucket that's part of your Firebase project's corresponding Cloud project.
See Preparing your training data in the Cloud AutoML Vision documentation for more information about preparing the CSV file.
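A CSV index like the one above can be generated from a local copy of the images. The sketch below rests on several assumptions: write_csv_index is a hypothetical helper, the local images are organized in one directory per label, every image is uploaded to the root of the bucket under its own file name (so file names must be unique across labels), and spaces are percent-encoded while parentheses are kept, mirroring the example rows.

```python
import csv
from pathlib import Path
from urllib.parse import quote


def write_csv_index(root, bucket, csv_path):
    """Write one "gs://<bucket>/<file>,<label>" row per image and return the rows."""
    rows = []
    # "*/*" matches entries exactly one level below root: <label>/<file>.
    for image in sorted(p for p in Path(root).glob("*/*") if p.is_file()):
        label = image.parent.name
        # Encode spaces as %20 but leave "(" and ")" untouched.
        uri = f"gs://{bucket}/{quote(image.name, safe='()')}"
        rows.append((uri, label))
    with open(csv_path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return rows
```

Uploading the images themselves is a separate step and is not shown here.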
Option 3: Unlabeled images
Label your training images in the Firebase console after you upload them, either individually or in an unstructured zip file. See the next step.
2. Train your model
Next, train a model using your images:
Open the ML Kit section of the Firebase console. Select your project when prompted.
Click the AutoML tab, and then click Create your first AutoML dataset or Add another dataset. After you provide a name for the new dataset, the console will guide you through the following steps:
- Upload either the training images or a CSV file containing the Cloud Storage locations you uploaded them to. See Assemble your training data.
- After the upload completes, verify the training data and label any unlabeled images.
- Start training a model using the training data. You can configure the following settings, which govern the performance of the generated model:
  Latency and package size: The model configuration to use. You can train faster, smaller models when low latency or small package size is important, or slower, larger models when accuracy is most important.
  Training time: The maximum time, in compute hours, to spend training the model. More training time generally results in a more accurate model.
Note that training can be completed in less than the specified time if the system determines that the model is optimized and additional training would not improve accuracy. You are billed only for the hours actually used.
Typical training times
Very small sets: 1 hour
500 images: 2 hours
1,000 images: 3 hours
5,000 images: 6 hours
10,000 images: 7 hours
50,000 images: 11 hours
100,000 images: 13 hours
1,000,000 images: 18 hours
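When choosing a training-time budget, the typical training times listed above can serve as a rough lookup. The sketch below is only an illustration: typical_training_hours is a hypothetical helper, rounding up to the next listed dataset size is an assumption (the source gives no guidance for sizes between rows), and the "very small sets" entry is left out because it has no numeric threshold.

```python
import bisect

# (dataset size in images, typical training hours), copied from the list above.
TYPICAL_HOURS = [
    (500, 2),
    (1_000, 3),
    (5_000, 6),
    (10_000, 7),
    (50_000, 11),
    (100_000, 13),
    (1_000_000, 18),
]


def typical_training_hours(num_images):
    """Return the typical hours for the smallest listed size >= num_images.

    Sizes beyond the largest listed entry fall back to the last row.
    """
    sizes = [size for size, _ in TYPICAL_HOURS]
    index = min(bisect.bisect_left(sizes, num_images), len(sizes) - 1)
    return TYPICAL_HOURS[index][1]
```

Because billing covers only the hours actually used, rounding up in this way errs toward a budget that is more likely to let training finish.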
3. Evaluate your model
When training completes, you can click the model on the dataset details page to see performance metrics for the model.
One important use of this page is to determine the score threshold that works best for your model. The score threshold is the minimum confidence the model must have for it to assign a label to an image. By moving the score threshold slider, you can see how different thresholds affect the model’s performance. Model performance is measured using two metrics: precision and recall.
In the context of image classification, precision is the ratio of the number of images that were correctly labeled to the number of images the model labeled given the selected threshold. When a model has high precision, it assigns labels incorrectly less often (fewer false positives).
Recall is the ratio of the number of images that were correctly labeled to the number of images that had content the model should have been able to label. When a model has high recall, it fails to assign any label less often (fewer false negatives).
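The two definitions can be made concrete with a small calculation. This is a simplified sketch, not the console's implementation: it assumes one predicted label and one true label per image, assumes every image in the evaluation set has content the model should be able to label, and uses made-up sample data. The name precision_recall is hypothetical.

```python
def precision_recall(predictions, threshold):
    """Compute (precision, recall) at a score threshold.

    `predictions` is a list of (confidence, predicted_label, true_label).
    """
    # Images the model actually labels at this threshold.
    labeled = [(pred, true) for score, pred, true in predictions if score >= threshold]
    correct = sum(1 for pred, true in labeled if pred == true)
    # Convention chosen here: with no labels assigned there are no false positives.
    precision = correct / len(labeled) if labeled else 1.0
    recall = correct / len(predictions) if predictions else 0.0
    return precision, recall


# Made-up evaluation results for illustration only.
SAMPLES = [
    (0.9, "accordion", "accordion"),
    (0.8, "clavier", "clavier"),
    (0.6, "clavier", "accordion"),
    (0.4, "bass_guitar", "bass_guitar"),
]
```

With this sample data, raising the threshold from 0.3 to 0.7 moves precision from 0.75 to 1.0 while recall drops from 0.75 to 0.5, which is the trade-off the score threshold slider exposes.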
When you find a score threshold that produces metrics you're comfortable with, make note of it; you will use the score threshold to configure the model in your app. (You can use this tool any time to get an appropriate threshold value.)
4. Publish or download your model
If you are satisfied with the model's performance and want to use it in an app, publish the model, download the model, or both.
By publishing the model to Firebase, you can update the model without releasing a new app version, and you can use Remote Config and A/B Testing to dynamically serve different models to different sets of users.
If you choose to only provide the model by hosting it with Firebase, and not bundle it with your app, you can reduce the initial download size of your app. Keep in mind, though, that if the model is not bundled with your app, any model-related functionality will not be available until your app downloads the model for the first time.
By bundling your model with your app, you can ensure your app's ML features still work when the Firebase-hosted model isn't available.
If you both publish the model and bundle it with your app, the app will use the latest version available.
To download or publish the model, on the model's Evaluation page, click Use model.
When you download the model, you get a zip archive containing the model file, labels file, and manifest file. ML Kit needs all three files to load the model from local storage.
When you publish the model to Firebase, you specify a name for the model. You will use this name to refer to the model when you load it with the SDK.
Appendix: Where are my files stored?
If you're on the Blaze plan, ML Kit creates a new Cloud Storage bucket for AutoML Vision Edge data.
The Firebase console serves images from this bucket when you browse datasets, so there might be associated network usage charges if you exceed the free usage quota.
If you're on the Spark or Flame plan, Firebase stores your AutoML Vision Edge data internally instead of using your project's Cloud Storage.