TensorFlow Lite Tutorial: How to Get Up and Running

The TensorFlow framework, developed by Google, is one of the most important platforms for deep neural network research and for training machine learning models. Generally, it runs on the server side using powerful GPUs, consuming a lot of power and memory.

However, as the use of edge devices, smartphones, and microcontrollers continues to rise, they’ve become important platforms for machine learning as well. Using only TensorFlow made it difficult to implement or deploy high-performing deep learning models on embedded devices. For example, how do you run computationally heavy processes on a device with restricted computing power? The limited resources of edge devices necessitated the creation of TensorFlow Lite to solve these problems.

In this tutorial, we’ll discuss how to get TensorFlow Lite up and running on your device.

What is TensorFlow Lite used for?

According to TensorFlow Lite’s official documentation, it’s a “set of tools that enable on-device machine learning by helping developers run their models on mobile, embedded, and edge devices.”

TensorFlow Lite is a machine learning framework regarded as a lightweight version of TensorFlow. Correspondingly, it operates on devices with minimal processing power. It runs trained machine learning models on smartphones (Android and iOS), microcontrollers, and IoT devices and computers (Linux), which are areas where TensorFlow has limitations.

How to get TensorFlow Lite up and running on your device


The first step to using TensorFlow Lite on your device is to install it. TensorFlow Lite can be installed on a variety of platforms including Android, iOS, and Linux.


Android

Before you install TensorFlow Lite on Android, ensure you have Android Studio 4.2 or higher and Android SDK version 21 or higher installed.

To install TensorFlow Lite on Android, use the following steps:

  • Add the following dependency to your app’s gradle file:
implementation 'org.tensorflow:tensorflow-lite:2.8.0'
  • Sync the gradle files to update your project.

Check out the Android quickstart documentation for more details.


iOS

TensorFlow Lite can be added to Swift or Objective-C projects. For Bazel developers, add the TensorFlow Lite dependency to your target in the BUILD file, depending on whether you’re using Swift, Objective-C, or C/C++.


Swift:

  deps = [
      "//tensorflow/lite/swift:TensorFlowLite",
  ],

Objective-C:

  deps = [
      "//tensorflow/lite/objc:TensorFlowLite",
  ],

# Using C API directly
  deps = [
      "//tensorflow/lite/c:c_api",
  ],

# Using C++ API directly
  deps = [
      "//tensorflow/lite:framework",
  ],

Check out the iOS quickstart documentation for more details.

Linux-based devices

Install the TensorFlow Lite interpreter with Python using the simplified Python package, tflite-runtime.

Install with pip:

python3 -m pip install tflite-runtime

Import with tflite_runtime as follows:

import tflite_runtime.interpreter as tflite

Getting a trained model

The next step is to get a trained model that would run on the device. There are three main ways to do this:

  1. Using a pretrained TensorFlow Lite model

  2. Training a custom TensorFlow Lite model using TensorFlow

  3. Converting a TensorFlow model to TensorFlow Lite

Use a pretrained model

Pretrained TensorFlow Lite models are models that were previously trained to do a specific task. Using a pretrained TensorFlow Lite model is the easiest and fastest way to get a trained model for deployment. They are deployed largely as they come, with little to no modification, so they offer only limited customization.

It’s important to point out that TensorFlow Lite has a gallery of sample applications that implement different on-device machine learning use cases. Some examples include image classification, object detection, gesture recognition, and speech recognition. Each use case has options to try on Android, iOS, and Raspberry Pi.

To integrate the machine learning model into an application, first download the pretrained TensorFlow Lite model of your choice from the gallery. Then proceed to use the TensorFlow Lite Task Library to add the model to the application. You can choose from Android, iOS, and Python libraries.

Here’s an example using task vision.

Android dependency

dependencies {
    implementation 'org.tensorflow:tensorflow-lite-task-vision'
}


iOS dependency

pod 'TensorFlowLiteTaskVision'


Python dependency

pip install tflite-support

Train a custom TensorFlow Lite model with TensorFlow

What happens when a developer needs a trained model that isn’t available as a pretrained use case? In such cases, they can build a unique, custom model; however, this cannot be done directly with TensorFlow Lite. Instead, the model is trained with TensorFlow using a special library called Model Maker, which applies a technique called transfer learning, and is then exported in the TensorFlow Lite format.

There are two steps to training a custom TensorFlow Lite model.

First, the developer needs to collect and label the training data. Second, they train the model using the TensorFlow Lite Model Maker library with TensorFlow.

The TensorFlow Lite Model Maker makes the process of training a TensorFlow Lite model easier. Basically, it uses the transfer learning technique to lessen the time it takes to train the data and decrease the amount of data needed. The Model Maker library supports machine learning tasks like object detection, BERT question answer, audio classification, and text classification.

Custom model training is best done on PCs or devices with powerful GPUs. Google Colab is one such platform. It’s a cloud-based Jupyter Notebook environment that allows the execution of Python code. It offers both free and paid GPUs to train machine learning models.

To get started, install the Model Maker using pip:

pip install tflite-model-maker

Or clone the source code from GitHub and install:

git clone https://github.com/tensorflow/examples
cd examples/tensorflow_examples/lite/model_maker/pip_package
pip install -e .

Convert a TensorFlow model to TensorFlow Lite

In cases where a developer requires a model that is not enabled by the TensorFlow Lite Model Maker and does not have a pretrained version, it’s best to build the model in TensorFlow and convert it to TensorFlow Lite using the TensorFlow Lite converter. Tools like the Keras API can be used to build the model in TensorFlow before converting it to TensorFlow Lite.

In particular, it’s important to evaluate the model’s contents to determine whether they are compatible with the TensorFlow Lite format before starting the conversion workflow. Typically, a standard model supported by the TensorFlow Lite runtime environment converts directly, whereas models that use operations outside the supported set require more advanced conversion.
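The conversion workflow itself is short. The sketch below builds a tiny, purely illustrative Keras model and converts it; in practice you would pass in your own trained model instead.

```python
# Sketch: converting a TensorFlow (Keras) model to TensorFlow Lite.
# The tiny model here is purely illustrative; substitute your trained model.
import tensorflow as tf

# Build (or load) a standard TensorFlow model with the Keras API
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation='softmax'),
])

# Convert it to the compressed FlatBuffer format used by TensorFlow Lite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()

# Save the .tflite file for deployment on a device
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
```

The result is a single .tflite FlatBuffer file, ready to be loaded by the interpreter on the target device.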

Running inference

Inference is the method of running a TensorFlow Lite model on a device to generate predictions based on the data given to a model. The TensorFlow Lite interpreter runs the inference.

To use the interpreter, follow these steps:

  • Load the model (either the pretrained, custom-built, or converted model) with the .tflite extension into memory.

  • Allocate memory for the input and output tensors.

  • Run inference on the input data. This involves using the TensorFlow Lite API to execute the model.

  • Interpret the output.

How do I use the TensorFlow Lite model?

You can use TensorFlow Lite models for a variety of inference tasks. For example, you can deploy real-time machine learning applications like an image classifier that is trained to identify and classify any image a smartphone captures.

Can I train a model with TensorFlow Lite?

No. You cannot train a model with TensorFlow Lite. TensorFlow Lite is a framework for running machine learning models on embedded devices. As mentioned earlier, to train a model, you would have to do so with TensorFlow, which has a much broader range of capabilities and is optimized for training models on powerful hardware such as GPUs.

To emphasize, TensorFlow Lite does not train models; rather, it deploys already-trained models. As a matter of fact, TensorFlow trains the model and converts it to TensorFlow Lite using the TensorFlow Lite converter. TensorFlow Lite then optimizes the model for running on mobile and other edge devices.

How is TensorFlow Lite different from TensorFlow?

Both TensorFlow and TensorFlow Lite are machine learning frameworks by Google, but a couple of things set them apart. We’ll look at some of the differences in this section.

  • TensorFlow runs on the server side, on local machines, cloud clusters, CPUs, and GPUs, while TensorFlow Lite targets devices with limited computational power, like smartphones, microcontrollers, and the Raspberry Pi.

  • TensorFlow Lite has low latency because inference runs on the device itself, so there is little or no data movement to and from a server. That makes it ideal for real-time use cases.

  • TensorFlow Lite converts TensorFlow models to TensorFlow Lite using a TensorFlow Lite converter. This converter generates an optimized, compressed FlatBuffer file format known as .tflite. However, there is no TensorFlow Lite to TensorFlow converter.

  • Comparatively, TensorFlow Lite models run faster on-device than full TensorFlow models because they are optimized to consume less power and memory.

Wrapping up

In this article, we’ve discussed how to get TensorFlow Lite up and running on your device. To summarize, we covered the steps for installing TensorFlow Lite, the various formats for getting and building a model, and how to run or deploy the model on your device using the TensorFlow Lite interpreter.

Ultimately, with these steps, you should be able to start running machine learning models on your mobile and embedded devices using TensorFlow Lite.

For more technical tutorials, check out our blog.

About the author

This post was written by Barinedum Sambaris. Barine worked as a web developer from 2013 to 2018, building frontend websites using HTML5, CSS3, JavaScript, and Bootstrap. Since 2019, she has worked with Python as a data analyst. She is eager to demonstrate how to combine the best of web development with data visualization.