Trouble compiling a TensorFlow Lite file to work on the Coral TPU

【Question title】: Trouble compiling tensorflow lite file to work on coral tpu
【Posted】: 2019-05-28 21:55:07
【Question description】:

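I am trying to get a simple model to compile so that it can run on the Coral TPU. So far I have frozen the graph and converted it to a tflite file, but when I run the file through the Edge TPU Model Compiler, it gives me a fairly unhelpful error message.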

COMPILING FAILED
Something went wrong. Couldn't compile model.
Please make sure your model meets the requirements.
See the log below for more compilation details.
If you believe your model meets the requirements but you still receive this error,
email support at coral-support@google.com.

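I emailed them, and they said to use /tensorflow/lite/tools:visualize to see what is wrong with the model. (I could not get that to work either, but it seems I should ask a separate question to get help with bazel.)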

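I have already done quantization-aware training on the model following this site, and I have run the tflite file with random input and it seems to work. (I was worried that part of the problem with the TPU Model Compiler was that I am behind a proxy, so I ran someone else's file through it and it compiled successfully.)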

Here is the eval graph:

import pandas as pd
import sys
import tensorflow as tf
import numpy as np
from tensorflow.python.tools import inspect_checkpoint as chkp
from sklearn.model_selection import train_test_split


# test data
seed = 141  # np.random.seed() returns None, so keep the integer for random_state
np.random.seed(seed)

features = pd.read_csv(sys.argv[1], sep=',', index_col=0)
labels = pd.read_csv(sys.argv[2], sep=',', index_col=0)
train_input, test_input, train_labels, test_labels = train_test_split(features, labels, test_size=0.2, random_state=seed)

def neuron_layer(X, n_neurons, name, activation=None):
    with tf.name_scope(name):
        n_inputs = int(X.get_shape()[1])
        W = tf.Variable(tf.zeros([n_inputs, n_neurons]), name="kernal")
        b = tf.Variable(tf.zeros([n_neurons]), name="bias")
        Z = tf.matmul(X, W) + b
        if activation is not None:
            return activation(Z)
        else:
            return Z

X = tf.placeholder(tf.float32, (1, 701), name="X")
n_outputs = 2
n_hidden1 = 700
n_hidden2 = 701
with tf.name_scope("dnn"):
    hidden1 = neuron_layer(X, n_hidden1, name="hidden1", activation=tf.nn.relu)
    # hidden2 = neuron_layer(hidden1, n_hidden2, name="hidden2", activation=tf.nn.relu)
    # trying only one layer
    logits = neuron_layer(hidden1, n_outputs, name="outputs")

with tf.name_scope("final_eval"):
    output = tf.argmax(logits, axis=1, name="output")


# Call the eval rewrite which rewrites the graph in-place with
# FakeQuantization nodes and fold batchnorm for eval.
g = tf.get_default_graph()
tf.contrib.quantize.create_eval_graph(input_graph=g)

# Add ops to save and restore all the variables.
saver = tf.train.Saver()
eval_graph_file = "eval_graph.pb"

#handles different tensorboard runs
from datetime import datetime
now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "eval/{}/run-{}".format(root_logdir, now)

file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())
with tf.Session() as sess:
    saver.restore(sess, "./nbtf/nothing_but_tf_model.ckpt")

    # Save the checkpoint and eval graph proto to disk for freezing
    # and providing to TFLite.
    with open(eval_graph_file, 'w+') as f:
        f.write(str(g.as_graph_def()))
    saver.save(sess, "./nbtf/eval/eval.ckpt")
    pred = output.eval(feed_dict={X: [test_input.values[45]]})
    print(pred, test_labels.values[45])

Then I freeze it like this:

 freeze_graph --input_graph=eval_graph.pb --input_checkpoint=nbtf\eval\eval.ckpt --output_graph=frozen_eval_graph.pb --output_node_names=final_eval/output

Then I convert it with this:

toco --graph_def_file=frozen_eval_graph.pb --output_file=tflite_model.tflite --input_format=TENSORFLOW_GRAPHDEF --output_format=TFLITE --inference_type=QUANTIZED_UINT8 --input_array=X --output_array=final_eval/output --std_dev_value=127 --mean_value=127
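As a side note that is not part of the original post: the `--mean_value` and `--std_dev_value` flags describe how the uint8 input maps back to real numbers, via real = (quantized - mean) / std_dev. The sketch below works that out for the values used in the command above. The function names are made up for illustration.

```python
def input_quant_params(mean_value, std_dev_value):
    """Return (scale, zero_point) implied by real = (q - mean) / std_dev."""
    scale = 1.0 / std_dev_value
    zero_point = mean_value
    return scale, zero_point


def dequantize(q, scale, zero_point):
    """Map a quantized integer value back to a real number."""
    return scale * (q - zero_point)


# Values taken from the toco command above.
scale, zero_point = input_quant_params(mean_value=127, std_dev_value=127)

# Real-valued range that the uint8 input 0..255 can represent.
lo = dequantize(0, scale, zero_point)
hi = dequantize(255, scale, zero_point)
print("scale:", scale, "zero_point:", zero_point)
print("representable input range:", lo, "to", hi)
```

With mean 127 and std_dev 127, the input covers roughly -1.0 to 1.008, which is consistent with the quantization pair reported for `X` in the tensor dump further down. If the training data is not scaled to about that range, these two values would need to change.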

[Image: TensorBoard graph]

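I just want this file to compile; it does not have to be perfect or anything.

Thanks for your help.

Edit:

I tried two things. First, I printed out the tensors from the tflite file (I tried to use the Visualize.py tool, but I am behind a proxy and had a lot of trouble getting it to work), and I got this: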

{'name': 'X', 'index': 0, 'shape': array([  1, 701]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.007874015718698502, 127)}
{'name': 'dnn/fully_connected/MatMul_bias', 'index': 1, 'shape': array([702]), 'dtype': <class 'numpy.int32'>, 'quantization': (3.750092218979262e-05, 0)}
{'name': 'dnn/fully_connected/Relu', 'index': 2, 'shape': array([  1, 702]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.035464514046907425, 0)}
{'name': 'dnn/fully_connected/weights_quant/FakeQuantWithMinMaxVars/transpose', 'index': 3, 'shape': array([702, 701]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.004762616939842701, 121)}
{'name': 'dnn/fully_connected_1/MatMul_bias', 'index': 4, 'shape': array([703]), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0001283923047594726, 0)}
{'name': 'dnn/fully_connected_1/Relu', 'index': 5, 'shape': array([  1, 703]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.019155390560626984, 0)}
{'name': 'dnn/fully_connected_1/weights_quant/FakeQuantWithMinMaxVars/transpose', 'index': 6, 'shape': array([703, 702]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.0036203034687787294, 120)}
{'name': 'dnn/outputs/MatMul_bias', 'index': 7, 'shape': array([2]), 'dtype': <class 'numpy.int32'>, 'quantization': (3.3737978810677305e-05, 0)}
{'name': 'dnn/outputs/add', 'index': 8, 'shape': array([1, 2]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.055008530616760254, 131)}
{'name': 'dnn/outputs/weights_quant/FakeQuantWithMinMaxVars/transpose', 'index': 9, 'shape': array([  2, 703]), 'dtype': <class 'numpy.uint8'>, 'quantization': (0.0017612784868106246, 110)}
{'name': 'final_eval/output', 'index': 10, 'shape': array([1, 1]), 'dtype': <class 'numpy.int64'>, 'quantization': (0.0, 0)}
{'name': 'final_eval/output/dimension', 'index': 11, 'shape': array([], dtype=int32), 'dtype': <class 'numpy.int32'>, 'quantization': (0.0, 0)}
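The post does not show how this dump was produced. Its format resembles what `tf.lite.Interpreter.get_tensor_details()` returns, so the sketch below assumes that API (on older TensorFlow 1.x releases the class lived under `tf.contrib.lite`). The helper names are hypothetical, and the sample entries are copied from the dump above so the grouping can be tried without a model file.

```python
from collections import defaultdict

import numpy as np


def load_tensor_details(model_path):
    """Read tensor metadata from a .tflite file (assumes tf.lite.Interpreter)."""
    # Imported lazily so the grouping helper below works without TensorFlow.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return interpreter.get_tensor_details()


def group_by_dtype(details):
    """Group tensor names by the name of their dtype."""
    groups = defaultdict(list)
    for detail in details:
        groups[np.dtype(detail["dtype"]).name].append(detail["name"])
    return dict(groups)


# Three entries copied from the dump above (only the fields used here).
sample = [
    {"name": "X", "dtype": np.uint8},
    {"name": "dnn/outputs/MatMul_bias", "dtype": np.int32},
    {"name": "final_eval/output", "dtype": np.int64},
]
groups = group_by_dtype(sample)
print(groups)
```

On a real file this would be called as `group_by_dtype(load_tensor_details("tflite_model.tflite"))`, using the output name from the toco command earlier. It gives a quick view of which tensors are not uint8.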

I think the problem is that the MatMul_bias tensors are not being converted to uint8 (which the Coral TPU needs).

I don't know how to fix this.
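One way to sanity-check that suspicion, offered as a sketch and not as part of the original post: in TensorFlow Lite's quantization scheme, the bias of a quantized fully connected layer is stored as int32 with zero point 0 and a scale equal to the input scale times the weight scale. The code below checks the values from the dump against that rule. The pairing of each layer with its input tensor is inferred from the tensor names, so treat it as an assumption.

```python
import math

# (input_scale, weight_scale, bias_scale), copied from the tensor dump above.
# The input of each layer is assumed to be the previous layer's output.
layers = {
    "dnn/fully_connected": (
        0.007874015718698502, 0.004762616939842701, 3.750092218979262e-05),
    "dnn/fully_connected_1": (
        0.035464514046907425, 0.0036203034687787294, 0.0001283923047594726),
    "dnn/outputs": (
        0.019155390560626984, 0.0017612784868106246, 3.3737978810677305e-05),
}


def bias_scale_matches(input_scale, weight_scale, bias_scale, rel_tol=1e-6):
    """Check bias_scale == input_scale * weight_scale up to float32 rounding."""
    return math.isclose(input_scale * weight_scale, bias_scale, rel_tol=rel_tol)


results = {name: bias_scale_matches(*scales) for name, scales in layers.items()}
for name, ok in results.items():
    print(name, "bias scale equals input scale * weight scale:", ok)
```

If all three checks pass, the int32 bias tensors look like what the converter normally emits for a quantized model, so they may not be the cause of the compile failure.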

The other change I tried was using TensorFlow's slim.fully_connected instead of my own custom fully connected layer. (It has the same problem, though.)

【Question discussion】:

Tags: python tensorflow tensorflow-lite tpu

【Solution 1】:

OK, I was able to compile the file without any problem; I just needed to use the offline compiler. That seems to have solved it completely.
