
Larq Compute Engine Inference

To perform an inference with Larq Compute Engine (LCE), we use the TensorFlow Lite interpreter. An LCE-compatible TensorFlow Lite interpreter drives the Larq model inference and uses LCE custom operators instead of built-in TensorFlow Lite operators for each applicable subgraph of the model.

This guide describes how to create a TensorFlow Lite interpreter with the LCE custom ops registered and how to perform an inference with a converted Larq model using the LCE C++ API.

Load and run a model in C++

Running an inference with TensorFlow Lite consists of multiple steps, which are comprehensively described in the TensorFlow Lite inference guide. Below we list these steps, plus one additional step to register the LCE custom operators using the LCE C++ function RegisterLCECustomOps():

(1) Load FlatBuffer model:

// Load model
std::unique_ptr<tflite::FlatBufferModel> model =
    tflite::FlatBufferModel::BuildFromFile(filename);

(2) Build the BuiltinOpResolver with registered LCE operators:

// Create a builtin OpResolver
tflite::ops::builtin::BuiltinOpResolver resolver;

// Register the LCE custom ops
compute_engine::tflite::RegisterLCECustomOps(&resolver);
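This snippet assumes the header declaring RegisterLCECustomOps() is included; the path below reflects the LCE source tree and should be verified against the release you build with:

// Declares compute_engine::tflite::RegisterLCECustomOps()
// (path as found in the LCE repository; verify for your version).
#include "larq_compute_engine/tflite/kernels/lce_ops_register.h"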

(3) Build an Interpreter with custom OpResolver:

// Build the interpreter
tflite::InterpreterBuilder builder(*model, resolver);
std::unique_ptr<tflite::Interpreter> interpreter;
builder(&interpreter);

(4) Set input tensor values:

// Resize input tensors, if desired, then allocate tensor buffers.
interpreter->AllocateTensors();

float* input = interpreter->typed_input_tensor<float>(0);
// Fill `input`.
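If the model should run with a different input shape than the one stored in the FlatBuffer, the input tensor can be resized before calling AllocateTensors(). The 1x224x224x3 shape below is purely a placeholder:

// Hypothetical resize of the first input to 1x224x224x3, followed by
// (re-)allocation of the tensor buffers.
interpreter->ResizeInputTensor(interpreter->inputs()[0], {1, 224, 224, 3});
interpreter->AllocateTensors();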

(5) Invoke inference:

interpreter->Invoke();

(6) Read inference results:

float* output = interpreter->typed_output_tensor<float>(0);
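
Putting the six steps together, a complete inference program might look like the sketch below. The command-line handling, error returns, and the printed output value are illustrative additions rather than part of the LCE API; the include paths should be checked against your TensorFlow Lite and LCE versions.

#include <cstdio>
#include <memory>

#include "larq_compute_engine/tflite/kernels/lce_ops_register.h"
#include "tensorflow/lite/interpreter.h"
#include "tensorflow/lite/kernels/register.h"
#include "tensorflow/lite/model.h"

int main(int argc, char* argv[]) {
  if (argc != 2) return 1;
  const char* filename = argv[1];  // Path to the converted Larq model.

  // (1) Load the FlatBuffer model.
  std::unique_ptr<tflite::FlatBufferModel> model =
      tflite::FlatBufferModel::BuildFromFile(filename);
  if (model == nullptr) return 1;

  // (2) Build the OpResolver and register the LCE custom ops.
  tflite::ops::builtin::BuiltinOpResolver resolver;
  compute_engine::tflite::RegisterLCECustomOps(&resolver);

  // (3) Build the interpreter.
  tflite::InterpreterBuilder builder(*model, resolver);
  std::unique_ptr<tflite::Interpreter> interpreter;
  builder(&interpreter);
  if (interpreter == nullptr) return 1;

  // (4) Allocate tensor buffers and set the input values.
  if (interpreter->AllocateTensors() != kTfLiteOk) return 1;
  float* input = interpreter->typed_input_tensor<float>(0);
  // Fill `input` with your data here.

  // (5) Invoke inference.
  if (interpreter->Invoke() != kTfLiteOk) return 1;

  // (6) Read the inference results.
  float* output = interpreter->typed_output_tensor<float>(0);
  printf("First output value: %f\n", output[0]);

  return 0;
}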

To build the inference binary with Bazel, link it against the //larq_compute_engine/tflite/kernels:lce_op_kernels target. See LCE minimal for example code.
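
As a rough sketch, a Bazel target for such a binary could be declared as follows; the target and source file names are placeholders, and only the lce_op_kernels dependency comes from this guide:

cc_binary(
    name = "lce_inference",       # placeholder target name
    srcs = ["lce_inference.cc"],  # placeholder source file
    deps = [
        # LCE kernels target required for the registered custom ops.
        "//larq_compute_engine/tflite/kernels:lce_op_kernels",
    ],
)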