# Phil 3.3.17

7:00 – 8:00 Research

• Finished formats and determined requirements for journals. Here’s the blog entry with all the information.

8:30 – 4:00 BRC

• CS231n: Convolutional Neural Networks for Visual Recognition (Stanford)
• So this is going to seem very newbie, but I’ve finally figured out how to populate a dictionary of arrays:
import numpy as np

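# Avoid naming the variable `dict`, which would shadow the built-in type.
docs = {'doc1': [], 'doc2': [], 'doc3': []}

# Replace each placeholder list with an array of five random values.
for doc in docs:
    docs[doc] = np.random.rand(5)

for doc in docs:
    print("{0}: {1}".format(doc, docs[doc]))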
• It turns out that you don’t need to declare the keys up front: you can assign the array at the moment the key is first defined. Here’s how you do it programmatically:
import numpy as np

# Start empty; keys are created as they are assigned.
docs = {}

for i in range(5):
    name = 'doc_{0}'.format(i)
    docs[name] = np.random.rand(5)

for doc in docs:
    print("{0}: {1}".format(doc, docs[doc]))
• Which gives the following results
doc_0: [ 0.53396248  0.10014123  0.40849079  0.76243954  0.29396581]
doc_2: [ 0.21438903  0.68745032  0.1640486   0.51779412  0.05844617]
doc_1: [ 0.36181216  0.78839326  0.90174006  0.29013203  0.76752794]
doc_3: [ 0.44230569  0.63054045  0.80872794  0.83048027  0.87243106]
doc_4: [ 0.08283319  0.72717925  0.29242797  0.90089588  0.34012144]
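The same dictionary can also be built in a single expression with a dict comprehension. This is only a sketch: the seed value and the names `rng` and `docs` are my own choices for illustration, and sorting the keys is just one way to get a stable print order (the results above come back in arbitrary order).

```python
import numpy as np

# Seeded generator so repeated runs give the same numbers (the seed is arbitrary).
rng = np.random.RandomState(0)

# Build all five entries in one expression; each value is a length-5 array.
docs = {'doc_{0}'.format(i): rng.rand(5) for i in range(5)}

# Iterate over the sorted keys so the rows print in a predictable order.
for name in sorted(docs):
    print("{0}: {1}".format(name, docs[name]))
```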
• Continuing to walk through fully_connected.py along with the tutorial
• math_ops.py – TF doc looks very handy
• gen_nn_ops.py – TF doc looks like the rest of the code we’ll need
• ReLU. The Rectified Linear Unit has become very popular in the last few years. It computes the function $f(x) = \max(0, x)$. In other words, the activation is simply thresholded at zero (see image above on the left). There are several pros and cons to using the ReLUs: (Def from here)
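To make the ReLU definition concrete, here is a minimal NumPy sketch of the thresholding. The function name `relu` and the sample values are my own; TensorFlow has its own version as `tf.nn.relu`, which this does not try to reproduce.

```python
import numpy as np

def relu(x):
    """Element-wise max(0, x): negative inputs become zero, the rest pass through."""
    return np.maximum(0, x)

# Sample inputs on both sides of zero (values chosen only for illustration).
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(x))
```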
• Discovered the Large-Scale Linear Model tutorial. It looks similar-ish to clustering. These are some of the features in tf.contrib.learn, which is also the home of the kmeans clusterer
• Feature columns and transformations

Much of the work of designing a linear model consists of transforming raw data into suitable input features. tf.learn uses the FeatureColumn abstraction to enable these transformations.

A FeatureColumn represents a single feature in your data. A FeatureColumn may represent a quantity like ‘height’, or it may represent a category like ‘eye_color’ where the value is drawn from a set of discrete possibilities like {‘blue’, ‘brown’, ‘green’}.

In the case of both continuous features like ‘height’ and categorical features like ‘eye_color’, a single value in the data might get transformed into a sequence of numbers before it is input into the model. The FeatureColumn abstraction lets you manipulate the feature as a single semantic unit in spite of this fact. You can specify transformations and select features to include without dealing with specific indices in the tensors you feed into the model.
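As a rough illustration of "a single value transformed into a sequence of numbers", here is a one-hot encoding of the ‘eye_color’ example in plain Python. This is not how FeatureColumn is implemented; the `one_hot` helper and the `EYE_COLORS` list are hypothetical names used only to show the idea.

```python
# Hypothetical vocabulary for the 'eye_color' feature, taken from the text above.
EYE_COLORS = ['blue', 'brown', 'green']

def one_hot(value, vocabulary):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    return [1.0 if value == item else 0.0 for item in vocabulary]

print(one_hot('brown', EYE_COLORS))  # [0.0, 1.0, 0.0]
```

The point of the abstraction is that you keep referring to ‘eye_color’ as one feature, even though the model sees three numbers.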

• WOOHOO! Found what I was looking for!
• The input function must return a dictionary of tensors. Each key corresponds to the name of a FeatureColumn. Each key’s value is a tensor containing the values of that feature for all data instances. See Building Input Functions with tf.contrib.learn for a more comprehensive look at input functions, and input_fn in the linear models tutorial code for an example implementation of an input function.
• So, working with that assumption, here’s a dictionary of tensors.
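import numpy as np
import tensorflow as tf

sess = tf.Session()

# Named `tensors` rather than `dict` to avoid shadowing the built-in type.
tensors = {}

for i in range(5):
    name = 'doc_{0}'.format(i)
    # The dtype is inferred from the NumPy array (float64). To force another
    # type, pass it by keyword (dtype=...), since the second positional
    # argument of tf.Variable is `trainable`, not the dtype.
    var = tf.Variable(np.random.rand(5))
    tensors[name] = var

init = tf.global_variables_initializer()
sess.run(init)

# sess.run accepts a dict of fetches and returns a dict of NumPy arrays.
print("{0}".format(sess.run(tensors)).replace("]),", "])\n"))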
• Which, remarkably enough, runs and produces the following!
{'doc_2': array([ 0.17515295,  0.93597391,  0.38829954,  0.49664442,  0.07601639])
'doc_0': array([ 0.40410072,  0.24565424,  0.9089159 ,  0.02825472,  0.28945943])
'doc_1': array([ 0.060302  ,  0.58108026,  0.21500697,  0.40784728,  0.89955796])
'doc_4': array([ 0.42359652,  0.0212912 ,  0.38216499,  0.5089103 ,  0.5616441 ])
'doc_3': array([ 0.41851737,  0.76488499,  0.63983758,  0.17332712,  0.07856653])}
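The input-function contract quoted above (one key per feature, one value holding that feature for all data instances) can be sketched without TensorFlow. Everything here is an assumption for illustration: the `ROWS` data and column names are made up, NumPy arrays stand in for tensors, and the sketch only covers the feature dictionary, leaving out labels.

```python
import numpy as np

# Hypothetical rows of raw data; the column names are made up for illustration.
ROWS = [
    {'height': 1.70, 'eye_color': 'blue'},
    {'height': 1.82, 'eye_color': 'brown'},
    {'height': 1.65, 'eye_color': 'green'},
]

def input_fn(rows):
    """Turn a list of row dicts into a dict of per-feature columns.

    NumPy arrays stand in for tensors here; in TensorFlow each array
    would be wrapped in something like tf.constant before being returned.
    """
    names = rows[0].keys()
    return {name: np.array([row[name] for row in rows]) for name in names}

features = input_fn(ROWS)
for name in sorted(features):
    print("{0}: {1}".format(name, features[name]))
```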