Java TensorFlow
I've created a little web server that loads in a frozen graph and processes the images that are sent to it. It can be found here.
When I first set it up I was really disappointed with the performance; performance on mobile phones was better than what I was getting...
My code was straight out of the example pages, e.g.
    try (Session session = new Session(this.graph)) {
        outputs = session
            .runner()
            .feed("image_tensor", tensor)
            .fetch("detection_scores")
            .fetch("detection_classes")
            .fetch("detection_boxes")
            .run();
    }
It was taking an age. At first I thought it was down to the library not being compiled for my CPU, as it was warning me on startup, so I recompiled using the awesome documentation, basically:

    bazel build --config opt //tensorflow/java:tensorflow //tensorflow/java:libtensorflow_jni

It was still roughly the same. Not that, then. But how bad was it?
    2018-08-23 19:01:31.971 INFO 34476 --- [nio-9000-exec-8] u.c.s.t.c.TensorflowImageEvaluator : session time:0
    2018-08-23 19:01:31.972 INFO 34476 --- [nio-9000-exec-8] u.c.s.t.c.TensorflowImageEvaluator : runner setup time:1
    2018-08-23 19:01:36.050 INFO 34476 --- [nio-9000-exec-8] u.c.s.t.c.TensorflowImageEvaluator : run time:4078
    2018-08-23 19:01:36.051 INFO 34476 --- [nio-9000-exec-8] u.c.s.t.c.TensorflowImageEvaluator : results time:0
That's right, just over 4 seconds... I thought this was supposed to be quick, so what was I doing wrong? Basically, not reusing the session. Once the session is reused, just look what happens on the same image. Here I'm using a set of prepared sessions in a BlockingQueue, so that each session is only ever used by one thread at a time:
    Session session = this.sessions.take();
    Session.Runner runner = session
        .runner()
        .feed("image_tensor", tensor)
        .fetch("detection_scores")
        .fetch("detection_classes")
        .fetch("detection_boxes");
    start = logTimeDiff("runner setup time:", start);
    outputs = runner.run();
    start = logTimeDiff("run time:", start);

Just remember to pop the session back in at the end!
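A minimal sketch of that take-and-return pattern, with the return done in a finally block so a failed run can't leak a session out of the pool. This is an illustration under assumptions, not the code from the server: ResourcePool, withResource and available are made-up names, and the generic type R stands in for Session so the example runs without TensorFlow on the classpath.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;

// Hypothetical helper: a fixed-size pool of long-lived resources.
// R stands in for org.tensorflow.Session in this sketch.
final class ResourcePool<R> {
    private final BlockingQueue<R> idle;

    ResourcePool(List<R> resources) {
        // Capacity equals the pool size, so handing a borrowed
        // resource back with offer() always succeeds.
        this.idle = new ArrayBlockingQueue<>(resources.size(), false, resources);
    }

    // Borrows a resource, runs the work, and always returns the resource.
    <T> T withResource(Function<R, T> work) {
        R resource;
        try {
            resource = idle.take(); // blocks until a resource is free
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag set
            throw new IllegalStateException("interrupted while waiting for a resource", e);
        }
        try {
            return work.apply(resource);
        } finally {
            idle.offer(resource); // pop it back in, even if the work threw
        }
    }

    // Number of resources currently sitting idle in the pool.
    int available() {
        return idle.size();
    }
}
```

With real sessions, the function passed to withResource would hold the runner().feed(...).fetch(...).run() chain from above.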
    2018-08-23 19:01:57.739 INFO 34476 --- [io-9000-exec-10] u.c.s.t.c.TensorflowImageEvaluator : session time:0
    2018-08-23 19:01:57.739 INFO 34476 --- [io-9000-exec-10] u.c.s.t.c.TensorflowImageEvaluator : runner setup time:0
    2018-08-23 19:01:57.791 INFO 34476 --- [io-9000-exec-10] u.c.s.t.c.TensorflowImageEvaluator : run time:51
    2018-08-23 19:01:57.791 INFO 34476 --- [io-9000-exec-10] u.c.s.t.c.TensorflowImageEvaluator : results time:0
Turns out it *is* fast after all!
My guess is that it's loading in the graph on the first run, so keeping the session around saves that cost... boom! Fast runs!
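The timing lines in the logs come from the logTimeDiff helper, whose body isn't shown in the post. A rough sketch of what such a helper could look like, assuming it reports the elapsed milliseconds since the given start and hands back a fresh start for the next step. The Timing class name, the use of System.nanoTime and printing to standard output instead of a logger are all assumptions made to keep the example self-contained.

```java
// Hypothetical reconstruction of a logTimeDiff-style helper.
final class Timing {
    // Prints the label followed by the milliseconds elapsed since startNanos,
    // then returns the current time so the caller can time the next step.
    static long logTimeDiff(String label, long startNanos) {
        long now = System.nanoTime();
        long elapsedMillis = (now - startNanos) / 1_000_000L;
        System.out.println(label + elapsedMillis); // the real code presumably uses a logger
        return now;
    }
}
```

Usage would follow the same shape as the snippets above: take a start value from System.nanoTime(), then reassign it with start = Timing.logTimeDiff("run time:", start) after each step.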