Latest DNN work

It’s been a while since I’ve posted my status, and I’ve been far too busy to include all of the work with various AI/ML conferences and implementations, but since I’ve been doing a lot of work specifically on LSTM implementations I wanted to include some notes for both my future self, and my partner when he starts spinning up some of the same code.

Having identified a few primary use cases for our work; high dimensional trajectories through belief space, word embedding search and classification, and time series analysis we’ve been focusing a little more intently on some specific implementations for each capability. While Phil has been leading the charge with the trajectories in belief space, and we both did a bunch of work in the previous sprint preparing for integration of our word embedding project into the production platform, I have started focusing more heavily on time series analysis.

There are a variety of reasons that this particular niche is useful to focus on, but we have a number of real world / real data examples where we need to either perform time series classification, or time series prediction. These cases range from financial data (such as projected planned/actual deltas), to telemetry anomaly detection for satellites or aircraft, among others. In the past some of our work with ML classifiers has been simple feed forward systems (classic multi layer perceptrons), naive Bayesian, or logistic regression.

I’ve been coming up to speed on deep learning, becoming familiar with both the background, and mathematical underpinings. Btw, for those looking for an excellent start to ML I highly recommend Patrick Winston (MIT) videos:

Over the course of several months I did pretty constant research all the way through the latest arXiv papers. I was particularly interested in Hinton’s papers on capsule networks as it has some direct applicability to some of our work. Here is a article summing up the capsule networks:

I did some research into the progress of current deep learning frameworks as well, looking specifically at examples which were suited to production deployment at scale over frameworks most optimal for single researchers solving pet problems. Our focus is much more on the “applied ML” side of things rather than purely academic. The last time we did a comprehensive deep learning framework “bake off” we came to a strong conclusion that Google TensorFlow was the best choice for our environment, and my recent research validated that assumption was still correct. In addition to providing TensorFlow Serving to serve your own models in production stacks, most cloud hosting environments (Google, AWS, etc) have options for directly running TF models either serverless (AWS lambda functions) or through a deployment/hosting solution (AWS SageMaker).

The reality is that lots of what makes ML difficult boils down to things like training lifecycle, versioning, deployment, security, and model optimization. Some aspects of this are increasingly becoming commodity available through hosting providers which frees up data scientists to work on their data sets and improving their models. Speaking of models, on our last pass at implementing some TensorFlow models we used raw TensorFlow I think right after 1.0 had released. The documentation was pretty shabby, and even simple things weren’t super straightforward. When I went to install and set up a new box this time with TensorFlow 1.4, I went ahead and used Keras as well. Keras is an abstraction API over top of computational graph software (either TensorFlow default, or Theano). Installation is easy, with a couple of minor notes.

Note #1: You MUST install the specific versions listed. I cannot stress this enough. In particular the cuDNN and CUDA Toolkit are updated frequently and if you blindly click through their download links you will get a newer version which is not compatible with the current versions of TensorFlow and Keras. The software is all moving very rapidly, so its important to use the compatible versions.

Note #2: Some examples may require the MKL dependency for Numpy. This is not installed by default. See: which will send you here for the necessary WHL file:

Note #3: You will need to run the TensorFlow install as sudo/administrator or get permission errors.

Once these are installed there is a full directory of Keras examples here:

This includes basic examples of most of the basic DNN types supported by Keras as well as some datasets for use such as MNIST for CNNs. When it comes to just figuring out “does everything I just installed run?” these will work just fine.