Author Archives: pgfeldman

Phil 3.15.17

7:00 – 8:00 Research

Information Wars: A Window into the Alternative Media Ecosystem
- By Kate Starbird, (Scholar)
  Professor, Human Centered Design & Engineering, University of Washington
  
  Crisis Informatics, Crowdsourcing, Crowdwork, CSCW, Human Computer Interaction
Started reviewing the Tumblr trolling paper. Set up a Tumblr account

8:30 – 5:00 BRC

Heath was able to upgrade to Python 3.5.2
- Ran array_thoughts. Numbers are better than my laptop
- Attempting just_dbscan: Some hiccups due to compiling from sources. (No module named _bz7). Stalled? Sent many links.
- Success! Heath installed a binary Python rather than compiling from sources. A little faster than my laptop. No GPUs, CPU, not memory bound.
Continuing my tour of the SciPy Lecture Notes
Figuring out what a matplotlib backend is
Looks like there are multiple ways to serve graphics: http://matplotlib.org/faq/howto_faq.html#howto-webapp
More on typing Python

Class creation, inheritance and superclass overloading, with type hints:

class Student(object):
    name = 'noName'
    age = -1
    major = 'unset'

    def __init__(self, name: str):
        self.name = name

    def set_age(self, age: int):
        self.age = age

    def set_major(self, major: str):
        self.major = major

    def to_string(self) -> str:
        return "name = {0}\nage = {1}\nmajor = {2}"\
            .format(self.name, self.age, self.major)


class MasterStudent(Student):
    internship = 'mandatory, from March to June'

    def to_string(self) -> str:
        return "{0}\ninternship = {1}"\
            .format(Student.to_string(self), self.internship)


anna = MasterStudent('anna')
print(anna.to_string())

Finished the Python part, Numpy next

Figured out how to to get a matrix shape, (again, with type hints):

import numpy as np


def set_array_sequence(mat: np.ndarray):
    for i in range(mat.shape[0]):
        for j in range(mat.shape[1]):
            mat[i, j] = i * 10 + j


a = np.zeros([10, 3])
set_array_sequence(a)
print(a.shape)
print(a)

Phil Pi Day!

Research

I got accepted into the Collective Intelligence conference!
Working on LaTex formatting. Slow but steady.
Ok, the whole doc is in, but the 2 column charts are not locating well. I need to rerig them so that they are single column. Fixed! Not sure about the gray bg. Maybe an outline instead?

Phil 3.13.17

7:00 – 8:00, 5:00 – 7:00 Research

Back to learning LaTex. Read the docs, which look reasonable, if a little clunkey.
Working out how to integrate RevEx
Spent a while looking at Overleaf and ShareLatex, but decided that I like TexStudio better. Used the MikTex package manager to download revtex 4.1.
Looked for “aiptemplate.tex” and “aipsamp.tex” and found them with all associated files here: ftp://ftp.tug.org/tex/texlive/Contents/live/texmf-dist/doc/latex/revtex/sample/aip. And it pretty much just worked. Now I need to start stuffing text into the correct places.

8:30 – 2:30 BRC

Got a response from the datapipeline folks about their demo code. sked them to update the kmeans_single_iteration.py and functions.py files.

The SciKit DBSCAN is very fast

setup duration for 10000 points = 0.003002166748046875
DBSCAN duration for 10000 points = 1.161818265914917

Drilling down into the documentation. Starting with the SciPy Lecture Notes

Python has native support for imaginary numbers. Huh.
Static typing is also coming. This is allowed, but doesn’t seem to do anything yet:
```
def calcL2Dist(t1:List[float], t2:List[float]) -> float:
```

This is really nice:

In [35]: def variable_args(*args, **kwargs):
   ....:     print 'args is', args
   ....:     print 'kwargs is', kwargs
   ....:

In [36]: variable_args('one', 'two', x=1, y=2, z=3)
args is ('one', 'two')
kwargs is {'y': 2, 'x': 1, 'z': 3}

in my ongoing urge to have interactive applications, I found Bokeh, which seems to create javascript??? More traditionally, wxPython appears to be a set of bindings to the wxWidgets library. Installed, but I had to grab the compiled wheel from here (as per S.O.). I think I’m going to look closely at Bokeh though, if it can talk to the running Python, then we could have some nice diagnostics. And the research browser could possibly work through this interface as well.

Phil 3.10.17

Elbow Tickets!

7:00 – 8:00 Research

artisopensource.net
Accurat is a global, data-driven research, design, and innovation firm with offices in Milan and New York.
Formatting paper for Phys Rev E. Looks like it’s gotta be LaTex, or more specifically, RevTex. My entry about formats
- Downloaded RevTex
- How to get Google Docs to LaTex
- Introduction to LaTex
- Installing TexX slooooow…
- That literally took hours. Don’t install the normal ‘big!’ default install?
- Installing pandoc
- Tried to just export a PDF, but that choked. reading the manual at C:\texlive\2016\tlpkg\texworks\texworks-help\TeXworks-manual\en
- Compiled the converted doc! Not that I actually know what all this stuff does yet…
- And then I thought, ‘gee, this is more like coding – I wonder if there is a plugin for IntelliJ?’. Yest, but this page ->BEST DEVELOPMENT SETUP FOR LATEX – says to use texStudio. downloading to try. This seems to be very nice. Not sure if it will work without a LaTexInstall, but I’ll tray that on my home box. It would be a much faster install if it did. And it’s been updated very recently – Jan 2017
  - Aaaand the answer is no, it needs an install. Trying MikTex this time. Well that’s a LOT faster!

8:30 – 10:30, 11:00 – 2:00 BRC

This looks very cool: t-SNE algo in R and Python, made with same dataset

Phil 3.9.17

7:00 – 7:30, 4:00-5:30 Research

Short session this morning. Dr. Appt at 8:00
Sending Sunstein letter. done!
Lee Boot visualization

9:30 – 3:30BRC

Neat thing from Flickr on finding similar images.
How to install pyLint as an external tool in IntelliJ.

How to find out where your python modules are installed:

C:\Windows\system32>pip3 show pylint
Name: pylint
Version: 1.6.5
Summary: python code static checker
Home-page: https://github.com/PyCQA/pylint
Author: Python Code Quality Authority
Author-email: code-quality@python.org
License: GPL
Location: c:\users\philip.feldman\appdata\local\programs\python\python35\lib\site-packages
Requires: colorama, mccabe, astroid, isort, six

Looking at building scikit DBSCAN clusterer. I think the plan will be to initially use TF as IO. read in the protobuf and eval() out the matrix to scikit. Do the clustering in scikit, and then use TF to write out the results. Since TF and scikit are very similar, that should aid in the transfer from Python to TF, while allowing for debugging and testing in the beginning. And we can then benchmark.
Working on running the scikit.learn plot_dbscan example, and broke the scipy install. Maybe use the Windows installers? Not sure what that might break. Will try again and follow error messages first.
This looks like the fix: http://stackoverflow.com/questions/28190534/windows-scipy-install-no-lapack-blas-resources-found
- Sorry to necro, but this is the first google search result. This is the solution that worked for me:
  1. Download numpy+mkl wheel from http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy. Use the version that is the same as your python version (check using python -V). Eg. if your python is 3.5.2, download the wheel which shows cp35
  2. Open command prompt and navigate to the folder where you downloaded the wheel. Run the command: pip install [file name of wheel]
  3. Download the SciPy wheel from: http://www.lfd.uci.edu/~gohlke/pythonlibs/#scipy (similar to the step above).
  4. As above, pip install [file name of wheel]
got a new error where
```
TypeError: unorderable types: str() < int()
```
- After some searching, here’s the SO answer
- Changed line 406 from fixes.py from:
```
if np_version < (1, 12, 0):
```
  into
```
if np_version < (1, 12):
```
- Success!!!
Sprint Review

Phil 3.8.17

7:00 – 8:00 Research

Tweaking the Sunstein letter
Trying to decide what to do next. There is a good deal of work that can be done in the model, particularly with antibelief. Totalitarianism may actually go further?
- Arendt says:The advantages of a propaganda that constantly “adds the power of organization”[58] to the feeble and unreliable voice of argument, and thereby realizes, so to speak, on the spur of the moment, whatever it says, are obvious beyond demonstration. Foolproof against arguments based on a reality which the movements promised to change, against a counterpropaganda disqualified by the mere fact that it belongs to or defends a world which the shiftless masses cannot and will not accept, it can be disproved only by another, a stronger or better, reality.
  - [58] Hadamovsky, op. cit., p. 21. For totalitarian purposes it is a mistake to propagate their ideology through teaching or persuasion. In the words of Robert Ley, it can be neither “taught” nor “learned,” but only “exercised” and “practiced” (see Der Weg zur Ordensburg, undated).
- On the same page: The moment the movement, that is, the fictitious world which sheltered them, is destroyed, the masses revert to their old status of isolated individuals who either happily accept a new function in a changed world or sink back into their old desperate superfluousness. The members of totalitarian movements, utterly fanatical as long as the movement exists, will not follow the example of religious fanatics and die the death of martyrs (even though they were only too willing to die the death of robots). [59]
  - [59] R. Hoehn, one of the outstanding Nazi political theorists, interpreted this lack of a doctrine or even a common set of ideals and beliefs in the movement in his Reichsgemeinschaft and Volksgeme’mschaft, Hamburg, 1935: “From the point of view of a folk community, every community of values is destructive”
This implies that there a stage where everything outside the cluster is attacked and destroyed, rather than avoided. So there’s actually four behaviors: Explore, Confirm, Avoid, and something like Lie/Destroy/Adhere. This last option cuts the Gordian Knot of game theory – its premise of making decisions with incomplete information – by substituting self-fulfilling fictional information that IS complete. And here, diversity won’t help. It literally is the enemy.
And this is an emergent phenomenon. From Konrad Heiden’s Der Führer. Hitler’s Rise to Power: Propaganda is not “the art of instilling an opinion in the masses. Actually it is the art of receiving an opinion from the masses.”

8:30 – 6:00 BRC

Figured out part of my problem. The native python math is sloooooow. Using numpy makes everything acceptably fast. I’m not sure if I’m doing anything more than calculating in numpy and then sticking the result in TensorFlow, but it’s a start. Anyway, here’s the working code:

import time
import numpy as np
import tensorflow as tf

def calcL2Dist(t1, t2):
    sub = np.subtract(t1, t2)
    squares = np.square(sub)
    dist = np.sum(squares)
    return dist

def createCompareMat(sourceMat, rows):
    resultMat = np.zeros([rows, rows])
    for i in range(rows):
        for j in range(rows):
            if i != j:
                t1 = sourceMat[i]
                t2 = sourceMat[j]
                dist = calcL2Dist(t1, t2)
                resultMat[i, j] = dist
    return resultMat

def createSequenceMatrix(rows, cols, scalar=1.0):
    mat = np.zeros([rows, cols])
    for i in range(rows):
        for j in range(cols):
            val = (i+1)*10 + j
            mat[i, j] = val * scalar
    return mat

for t in range(5, 8):
    side = (t*100)

    sourceMat = createSequenceMatrix(side, side)

    resultMat = tf.Variable(sourceMat) # Use variable

    start = time.time()
    with tf.Session() as sess:
        tf.global_variables_initializer().run() # need to initialize all variables

        distMat = createCompareMat(sourceMat=sourceMat, rows=side)

        resultMat.assign(distMat).eval()
        result = resultMat.eval()
        #print('modified resultMat:\n', result)
        #print('modified sourceMat:\n', sourceMat)
    stop = time.time()
    duration = stop-start
    print("{0} cells took {1} seconds".format(side*side, duration))

Working on the Sprint review. I think we’re in a reasonably good place. We can do our clustering using scikit, at speeds that are acceptable even on my laptop. Initially, we’ll use TF mostly for transport between systems, and then backfill capability.
This is really important for the Research Browser concept:
- Travis Mandel
- Where to Add Actions in Human-in-the-Loop Reinforcement Learning

Phil 3.7.17

7:00 – 8:00 Research

The meeting with Don went well. We’re going to submit to Physical Review E. I need to fix a chart and then we need to make the paper more ‘Math-y’
Creating a copy of the paper for PRE – done
Fix the whisker chart – done
Compose a letter to Cass Sunstein asking for his input. Drafted. Getting sanity checks
On Building a “Fake News” Classification Model

8:30 – 6:00 BRC

Ran into an unexpected problem, the creation of the TF graph for my dictionary is taking exponential time to construct. SAD!
Debugging TF slides. Includes profiler. Pick up here tomorrow

Phil 3.6.17

6:30 – 7:00 , 4:00 – 6:00 Research

Meeting with Don
Something that’s built from agent-based models? Toward a Formal, Visual Framework of Emergent Cognitive Development of Scholars
Detecting and visualizing emerging trends and transient patterns in scientific literature
Agent-based computing from multi-agent systems to agent-based models: a visual survey
Muaz A. Niazi – Complex Adaptive Systems, Agent-based Modeling, Complex Networks, Communication Networks, Cognitive Agent-based Computing

7:30 – 3:30, BRC

From LearningTensorflow.com: KMeans tutorial. Looks pretty good
This looks interesting: Large-Scale Evolution of Image Classifiers: Neural networks have proven effective at solving difficult problems but designing their architectures can be challenging, even for image classification problems alone. Evolutionary algorithms provide a technique to discover such networks automatically. Despite significant computational requirements, we show that evolving models that rival large, hand-designed architectures is possible today. We employ simple evolutionary techniques at unprecedented scales to discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial initial conditions. To do this, we use novel and intuitive mutation operators that navigate large search spaces. We stress that no human participation is required once evolution starts and that the output is a fully-trained model.

Working on calculating distance between two vectors. Oddly, these do not seem to be library functions. This seems to be the way to do it:

def calcL2Dist(t1, t2):
    dist = -1.0
    sub = tf.subtract(t1, t2)
    squares = tf.square(sub)
    sum = tf.reduce_sum(squares)
    return sum

Now I’m trying to build a matrix of distances. Got it working after some confusion. Here’s the full code. Note that the ‘source’ matrix is declared as a constant, since it’s immutable(?)

import numpy as np
import tensorflow as tf;

def calcL2Dist(t1, t2):
    dist = -1.0
    sub = tf.subtract(t1, t2)
    squares = tf.square(sub)
    dist = tf.reduce_sum(squares)
    return dist

def initDictRandom(rows = 3, cols = 5, prefix ="doc_"):
    dict = {}
    for i in range(rows):
        name = prefix+'{0}'.format(i)
        dict[name] = tf.Variable(np.random.rand(cols), tf.float32)
    return dict

def initDictSeries(rows = 3, cols = 5, offset=1, prefix ="doc_"):
    dict = {}
    for i in range(rows):
        name = prefix+'{0}'.format(i)
        array = []
        for j in range(cols):
            array.append ((i+offset)*10 + j)
        #dict[name] = tf.Variable(np.random.rand(cols), tf.float32)
        dict[name] = tf.constant(array, tf.float32)
    return dict

def createCompareDict(sourceDict):
    distCompareDict = {}
    keys = sourceDict.keys();
    for n1 in keys:
        for n2 in keys:
            if n1 != n2:
                name = "{0}_{1}".format(n1, n2)
                t1 = sourceDict[n1]
                t2 = sourceDict[n2]
                dist = calcL2Dist(t1, t2)
                distCompareDict[name] = tf.Variable(dist, tf.float32)
    return distCompareDict

sess = tf.InteractiveSession()
dict = initDictSeries(cols=3)
dict2 = createCompareDict(dict)
init = tf.global_variables_initializer()
sess.run(init)


print("{0}".format(sess.run(dict)).replace("),", ")\n"))
print("{0}".format(sess.run(dict2)).replace(",", "])\n"))

Results:

{'doc_0': array([ 10.,  11.,  12.], dtype=float32)
 'doc_2': array([ 30.,  31.,  32.], dtype=float32)
 'doc_1': array([ 20.,  21.,  22.], dtype=float32)}
{'doc_1_doc_2': 300.0])
 'doc_0_doc_2': 1200.0])
 'doc_1_doc_0': 300.0])
 'doc_0_doc_1': 300.0])
 'doc_2_doc_1': 300.0])
 'doc_2_doc_0': 1200.0}

Looks like the data structures that are used in the tutorials are all using panda.
- Successfully installed pandas-0.19.2

Phil 3.4.17

Helena sent a nice link to Announcing New Research: “A Field Guide to Fake News” it’s from First Draft News, which purports to be Essential resources for reporting and sharing information that emerges online
Woke up this morning thinking about Pervasive Serendipity, which relates to Cass Sunstein’s musings on a Serendipity bar/button. Added that to the paper
Also added the Lynch paper on online clustering in Egypt

Phil 3.3.17

7:00 – 8:00 Research

Finished formats and determine requirements for journals. Here’s the blog entry with all the information

8:30 – 4:00 BRC

CS231n: Convolutional Neural Networks for Visual Recognition (Stanford)
- Assignments on Github This is basically the textbook, going from Python Tutorial through deep NNs

So this is going to seem very newbie, but I’ve finally figured out how to populate a dictionary of arrays:

import numpy as np

dict = {'doc1':[], 'doc2':[], 'doc3':[]}

for doc in dict:
    dict[doc] = np.random.rand(5)

for doc in dict:
    print("{0}: {1}".format(doc, dict[doc]))

It turns out that you HAVE to set the array relationship when the key is defined. Here’s how you do it programmatically

import numpy as np

dict = {}

for i in range(5):
    name = 'doc_{0}'.format(i)
    dict[name] = np.random.rand(5)

for doc in dict:
    print("{0}: {1}".format(doc, dict[doc]))

Which gives the following results

doc_0: [ 0.53396248  0.10014123  0.40849079  0.76243954  0.29396581]
doc_2: [ 0.21438903  0.68745032  0.1640486   0.51779412  0.05844617]
doc_1: [ 0.36181216  0.78839326  0.90174006  0.29013203  0.76752794]
doc_3: [ 0.44230569  0.63054045  0.80872794  0.83048027  0.87243106]
doc_4: [ 0.08283319  0.72717925  0.29242797  0.90089588  0.34012144]

Continuing to walk through fully_connected.py along with the tutorial
- math_ops.py – TF doc looks very handy
- gen_nn_ops.py – TF doc looks like the rest of the coed we’ll need
- ReLU. The Rectified Linear Unit has become very popular in the last few years. It computes the function f(x)=max(0,x)”>f(x)=max(0,x). In other words, the activation is simply thresholded at zero (see image above on the left). There are several pros and cons to using the ReLUs: (Def from here)
Discovered the Large-Scale Linear Model tutorial. It looks similar-ish to clustering. These are some of the features in tf.contrib.learn, which is also the home of the kmeans clusterer
- Feature columns and transformations
  
  Much of the work of designing a linear model consists of transforming raw data into suitable input features. tf.learn uses the FeatureColumn abstraction to enable these transformations.
  
  A FeatureColumn represents a single feature in your data. A FeatureColumn may represent a quantity like ‘height’, or it may represent a category like ‘eye_color’ where the value is drawn from a set of discrete possibilities like {‘blue’, ‘brown’, ‘green’}.
  
  In the case of both continuous features like ‘height’ and categorical features like ‘eye_color’, a single value in the data might get transformed into a sequence of numbers before it is input into the model. The FeatureColumn abstraction lets you manipulate the feature as a single semantic unit in spite of this fact. You can specify transformations and select features to include without dealing with specific indices in the tensors you feed into the model.
- WOOHOO! Found what I was looking for!
  - The input function must return a dictionary of tensors. Each key corresponds to the name of a FeatureColumn. Each key’s value is a tensor containing the values of that feature for all data instances. See Building Input Functions with tf.contrib.learn for a more comprehensive look at input functions, and input_fn in the linear models tutorial code for an example implementation of an input function.
  - So, working with that assumption, here’s a dictionary of tensors.
```
import numpy as np
import tensorflow as tf;

sess = tf.Session()

dict = {}

for i in range(5):
    name = 'doc_{0}'.format(i)
    var = tf.Variable(np.random.rand(5), tf.float32)
    dict[name] = var

init = tf.global_variables_initializer()
sess.run(init)

print("{0}".format(sess.run(dict)).replace("]),", "])\n"))
```
  - Which, remarkably enough, runs and produces the following!
```
{'doc_2': array([ 0.17515295,  0.93597391,  0.38829954,  0.49664442,  0.07601639])
 'doc_0': array([ 0.40410072,  0.24565424,  0.9089159 ,  0.02825472,  0.28945943])
 'doc_1': array([ 0.060302  ,  0.58108026,  0.21500697,  0.40784728,  0.89955796])
 'doc_4': array([ 0.42359652,  0.0212912 ,  0.38216499,  0.5089103 ,  0.5616441 ])
 'doc_3': array([ 0.41851737,  0.76488499,  0.63983758,  0.17332712,  0.07856653])}
```

Phil 3.2.17

7:00 – 8:00 Research

Scheduled a meeting with Don for Monday at 4:00

Working on finding submission formats for my top 3

Physical Review E

Author page
Here’s the format
- My guess is that there will have to be equations for neighbor calculation (construct a vector from visible neighbors and slew heading and speed) plus maybe a table for the figure 8? Not sure how to do that since the populations had no overlap.
Length FAQ Looks like 4500 words

Rapid Communication 4500 words

Comment / Reply 3500 words

Include:

Any text in the body of the article
Any text in a figure caption or table caption
Any text in a footnote or an endnote
I’m at 3073 words in the content.

Here’s the figure word eqivalents:

figure	xsize	ysize	aspect	one col	two cols
10	6.69	4.03	1.66	110.26	401.04
9	8.13	2.85	2.85	72.56	250.24
8	6.28	5.14	1.22	142.79	531.14
7	8.78	2.94	2.98	70.31	241.23
6	6.64	3.97	1.67	109.74	398.97
5	6.80	3.89	1.75	105.79	383.15
4	8.13	2.85	2.85	72.56	250.24
3	8.13	2.85	2.85	72.56	250.24
2	8.13	2.85	2.85	72.56	250.24
1	7.26	5.44	1.33	132.40	489.59
				961.52	3446.08

So it looks like the word count is between 4,034 and 6,519

IEEE Transactions on Automatic Control
- Instructions for full papers
  - PDF
  - Manuscript style is in section C. References are like ACM
  - Normally 12 pages and no more than 16
  - A mandatory page charge is imposed on all accepted full papers exceeding 12 Transactions formatted pages including illustrations, biographies and photos . The charge is $125 per page for each page over the first 12 pages and is a prerequisite for publication. A maximum of 4 such additional pages (for a total of 16 pages) is allowed.
  - Note that the authors will be asked to submit a single-column double-spaced version of their paper as well, under Supplementary Materials
  - To enhance the appearance of your paper on IEEEXplore®, a Graphical Abstract can be displayed along with traditional text. The Graphical Abstract should provide a clear, visual summary of your paper’s findings by means of an image, animation, video, or audio clip. NOTE: The graphical abstract is considered a part of the technical content of the paper, and you must provide it for peer review during the paper submission process.
- Submission policy
- MSWord template and Instructions on How to Create Your Paper
- Guidelines for graphics and charts
Journal of Political Philosophy (Not sure if it makes sense, but this was where The Law of Group Polarization was published)
- Author Guidelines
- Manuscripts accepted for publication must be put into JPP house style, as follows:
  - SPELLING AND PUNCTUATION: Authors may employ either American or English forms, provided that style is used consistently throughout their submission.
  - FOOTNOTES: Should be numbered consecutively. Authors may either:
    - employ footnotes of the traditional sort, containing all bibliographic information within them; or else
    - collect all bibliographic information into a reference list at the end of the article, to which readers should be referred by footnotes (NOT in-text reference) of the form ‘Barry 1965, p. 87’.
  - BIBLIOGRAPHIC INFORMATION: should be presented in either of the following formats:
    - If incorporated into the footnotes themselves:
      Jürgen Habermas, Legitimation Crisis, trans. Thomas McCarthy (London: Heinemann, 1976), p. 68.
      Louise Antony, ‘The socialization of epistemology’, Oxford Handbook of Contextual Political Analysis, ed. by Robert E. Goodin and Charles Tilly (Oxford: Oxford University Press, 2006, pp.58-77, at p. 62.
      John Rawls ‘Justice as fairness’, Philosophical Review, 67 (1958), 164-94 at p. 185.
    - If collected together in a reference list at the end of the article:
      Habermas, Jurgen. 1976. Legitimation Crisis, trans. Thomas McCarthy. London: Heinemann.
      Antony, Louise. 2006. The socialization of epistemology. Pp. 58-77 in Oxford Handbook of Contextual Political Analysis, ed. by Robert E. Goodin and Charles Tilly. Oxford: Oxford University Press.
      Rawls, John. 1958. Justice as Fairness. Philosophical Review, 67, 164-94.
    - In footnotes/references, spelling should follow the original while punctuation should conform to the style adopted in the body of the text, being either American (double quotation marks outside closing commas and full stops) or English (single quotation marks inside them).For Survey Articles or Debates, option (ii) – i.e., the reference list at the end of the article, together with the corresponding footnote style – is preferred.
- Nature (Yeah, I know. But as a letter?)
  - Letters are 4 pages, articles are 5
  - ‘For authors’ site map
  - Presubmission enquiries are not required for Articles or Letters, and can be difficult to assess reliably; Nature editors cannot make an absolute commitment to have a contribution refereed before seeing the entire paper.
  - Editorial process
  - Letters
    - Letters are short reports of original research focused on an outstanding finding whose importance means that it will be of interest to scientists in other fields.
      They do not normally exceed 4 pages of Nature, and have no more than 30 references. They begin with a fully referenced paragraph, ideally of about 200 words, but certainly no more than 300 words, aimed at readers in other disciplines. This paragraph starts with a 2-3 sentence basic introduction to the field; followed by a one-sentence statement of the main conclusions starting ‘Here we show’ or equivalent phrase; and finally, 2-3 sentences putting the main findings into general context so it is clear how the results described in the paper have moved the field forwards.
      
      Please refer to our annotated example to see how the summary paragraph for a Letter should be constructed.
      
      The rest of the text is typically about 1,500 words long. Any discussion at the end of the text should be as succinct as possible, not repeating previous summary/introduction material, to briefly convey the general relevance of the work.
      
      Letters typically have 3 or 4 small display items (figures or tables).
      
      Word counts refer to the text of the paper. References, title, author list and acknowledgements do not have to be included in total word counts

8:30 – 5:30 BRC

Just read Gregg’s response to the white paper. He seems to think that TF is just deep NN. Odd
Working through fully_connected_feed.py from the TF Mechanics 101 tutorial

Multiple returns works in python:

def placeholder_inputs(batch_size):
    images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                           Mnist.IMAGE_PIXELS))
    labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
    return images_placeholder, labels_placeholder

images_placeholder, labels_placeholder = placeholder_inputs(FLAGS.batch_size)

The logit (/ˈloʊdʒɪt/ loh-jit) function is the inverse of the sigmoidal “logistic” function or logistic transform used in mathematics, especially in statistics. When the function’s parameter represents a probability $p$ , the logit function gives the log-odds, or the logarithm of the odds $p /(1 - p)$ .^[1]

In this case,

logits =  Tensor("softmax_linear/add:0", shape=(100, 10), dtype=float32)

Here are some of the other variables:

images_placeholder =  Tensor("Placeholder:0", shape=(100, 784), dtype=float32)
labels_placeholder =  Tensor("Placeholder_1:0", shape=(100,), dtype=int32)
logits =  Tensor("softmax_linear/add:0", shape=(100, 10), dtype=float32)
loss =  Tensor("xentropy_mean:0", shape=(), dtype=float32)
train_op =  name: "GradientDescent"
op: "AssignAdd"
input: "global_step"
input: "GradientDescent/value"
attr {
  key: "T"
  value {
    type: DT_INT32
  }
}
attr {
  key: "_class"
  value {
    list {
      s: "loc:@global_step"
    }
  }
}
attr {
  key: "use_locking"
  value {
    b: false
  }
}

eval_correct =  Tensor("Sum:0", shape=(), dtype=int32)
summary =  Tensor("Merge/MergeSummary:0", shape=(), dtype=string)

Note that everything is a Tensor except the train_op, which is declared as follows

# Add to the Graph the Ops that calculate and apply gradients.
train_op = Mnist.training(loss, FLAGS.learning_rate)
print("train_op = ", train_op)

It looks like dictionaries are the equivalent of may labeled matrices

def fill_feed_dict(data_set, images_pl, labels_pl):
    """Fills the feed_dict for training the given step.
    A feed_dict takes the form of:
    feed_dict = {
        : ,
        ....
    }
    Args:
      data_set: The set of images and labels, from input_data.read_data_sets()
      images_pl: The images placeholder, from placeholder_inputs().
      labels_pl: The labels placeholder, from placeholder_inputs().
    Returns:
      feed_dict: The feed dictionary mapping from placeholders to values.
    """
    # Create the feed_dict for the placeholders filled with the next
    # `batch size` examples.
    images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size,
                                                   FLAGS.fake_data)
    feed_dict = {
        images_pl: images_feed,
        labels_pl: labels_feed,
    }
    return feed_dict

lookup_ops seems to have the pieces we want. Now I just have to make it run…

Training?

https://www.udacity.com/course/deep-learning–ud730
Started the above. So far so good. Nice softmax example
Finished one-hot

Last-second proposal writing

Phil 3.1.17

It’s March and no new wars! Hooray!

7:00 – 8:00 Research

Added an excerpt (the last half or so) from Home to Roost to the paper. Disturbingly appropriate.

Continuing conference/journal spreadsheet. Here’s the sorted list:

journals	website	impact factor	papers collected
Physical Review E	http://journals.aps.org/pre/	2.252	7
IEEE Transactions on Automatic Control	http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=9	2.777	4
Nature	http://www.nature.com/nature/index.html	42.351	2
The Journal of Mathematical Sociology	http://www.tandfonline.com/toc/gmas20/current	0.68	2
IEEE Transactions on Industrial Informatics	http://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=9424	4.708	1
Expert Systems with Applications	https://www.journals.elsevier.com/expert-systems-with-applications	2.981	1
Public Opinion Quarterly	https://academic.oup.com/poq	2.825	1
Communications in Mathematical Physics	https://link.springer.com/journal/220	2.375	1
Journal of Conflict Resolution	http://journals.sagepub.com/home/jcr	1.687	1
Physics Letters A	http://www.sciencedirect.com/science/journal/03759601	1.677	1
Journal of Artifical Societies and Social Simulation	http://jasss.soc.surrey.ac.uk/	1.42	1
Behavioural Processes	http://www.sciencedirect.com/science/journal/03766357	1.318	1
Journal of Political Philosophy	http://onlinelibrary.wiley.com/journal/10.1111/(ISSN)1467-9760	1.044	1
The Knowledge Engineering Review	https://www.cambridge.org/core/journals/knowledge-engineering-review	1.039	1
Proceedings of the National Academy of Sciences of the United States of America	http://www.pnas.org/		1
PlosOne	http://journals.plos.org/plosone/		1

Next step is to find submission formats. The obvious targets are Phys Rev E and IEEE TAC. I’m tempted by the JofPP, since that was where the original group polarization paper was published. And I need to understand what a Nature letter is.

8:30 – 4:30 BRC

More TensorFlow

MNIST tutorial – clear, but a LOT of stuff
Neural Networks and Deep Learning is an online book referenced in the TF documentation (at least the softmax chapter)
A one-hot vector is a vector which is 0 in most dimensions, and 1 in a single dimension. In this case, the nth digit will be represented as a vector which is 1 in the nth dimension. For example, 3 would be [0,0,0,1,0,0,0,0,0,0]. Consequently, mnist.train.labels is a [55000, 10] array of floats.
If you want to assign probabilities to an object being one of several different things, softmax is the thing to do, because softmax gives us a list of values between 0 and 1 that add up to 1. Even later on, when we train more sophisticated models, the final step will be a layer of softmax.
```
x = tf.placeholder(tf.float32, [None, 784])
```
We represent this as a 2-D tensor of floating-point numbers, with a shape [None, 784]. (Here None means that a dimension can be of any length.)
A good explanation of cross-entropy, apparently.
tf.reduce_mean

Success!!! Here’s the code:

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)

x = tf.placeholder(tf.float32, [None, 784])

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

y = tf.nn.softmax(tf.matmul(x, W) + b)

y_ = tf.placeholder(tf.float32, [None, 10]) #note that y_ means 'y prme'

cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

sess = tf.InteractiveSession()

tf.global_variables_initializer().run()

for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

And here are the results:

C:\Users\philip.feldman\AppData\Local\Programs\Python\Python35\python.exe C:/Development/Sandboxes/TensorflowPlayground/HelloPackage/MNIST_tutorial.py
Extracting MNIST_data/train-images-idx3-ubyte.gz
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

0.9192

Working on the advanced tutorial. Fixed fully_connected_feed.py to work with local data.
And then my brain died

Phil 2.28.17

7:00 – 8:30 Research

Sent a note to Don about getting together on Thursday
Added to the list of journals and conferences. Included the Journal of Political Philosophy, which is where The Law of GP was originally published

9:00 – 4:30 BRC

Installing Tensorflow as per the instructions for CUDA and Anaconda
- May need to install Python 2.7 version due to TF incompatibility with prebuilt python distros
- Looks like I need visual studio for CUDA support, which is excessive. Going to try the CPU-only version
- Installing Anaconda3-4.3.0.1-Windows-x86_64.exe. Nope, based on Aaron’s experiences, I’m going to install natively

Tensorflow native pip installation

Uninstalled old Python
Installed python-3.5.2-amd64.exe from here
Did the cpu install:
```
 pip3 install --upgrade tensorflow
```

Ran the ‘hello world’ program

import tensorflow as tf
hello = tf.constant('Hello, TensorFlow!')
sess = tf.Session()
print(sess.run(hello))

Success!!!

E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "BestSplits" device_type: "CPU"') for unknown op: BestSplits
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "CountExtremelyRandomStats" device_type: "CPU"') for unknown op: CountExtremelyRandomStats
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "FinishedNodes" device_type: "CPU"') for unknown op: FinishedNodes
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "GrowTree" device_type: "CPU"') for unknown op: GrowTree
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "ReinterpretStringToFloat" device_type: "CPU"') for unknown op: ReinterpretStringToFloat
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "SampleInputs" device_type: "CPU"') for unknown op: SampleInputs
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "ScatterAddNdim" device_type: "CPU"') for unknown op: ScatterAddNdim
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TopNInsert" device_type: "CPU"') for unknown op: TopNInsert
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TopNRemove" device_type: "CPU"') for unknown op: TopNRemove
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "TreePredictions" device_type: "CPU"') for unknown op: TreePredictions
E c:\tf_jenkins\home\workspace\release-win\device\cpu\os\windows\tensorflow\core\framework\op_kernel.cc:943] OpKernel ('op: "UpdateFertileSlots" device_type: "CPU"') for unknown op: UpdateFertileSlots
b'Hello, Tensorflow!'
>>>

The errors are some kind of cruft that has been fixed in the nightly build as per this thread
Got my new Python running in IntelliJ
Working through the tutorials. So far so good, and the support for matrices is very nice
- Getting Started
Some Tensorflow stuff from O’Reilly
- Hello, TensorFlow
- Learning TensorFlow – A guide to building deep learning systems
Some proposal work slinging text on cognitive computing

Phil 2.27.17

7:00 – 8:30 Research

Call Senate, house about CBP, SCOTUS, etc
Add Antibelief, Leader, etc. mentions to Future work

9:00 – 6:00 BRC

How to invoke a trained TensorFlow model from Java programs
Deploying TF in production. Serving up predictions in production
- 14:34 (video)
TF Javadoc: https://www.tensorflow.org/api_docs/java/reference/org/tensorflow/package-summary
TF jar: https://github.com/tensorflow/tensorflow/tree/master/tensorflow/java
Or if you want to call the C++ via JNI: https://www.tensorflow.org/extend/language_bindings
Tensorflow Serving (microservice)
Tensorflow security (WSO2)
Hadoop data filters to train the model
XLA compiler github root
And we wrote a 5 page white paper

Phil 2.24.17

7:00 – 8:00 Research

Continuing paper.
Robert Mercer: the big data billionaire waging war on mainstream media
Defense Against the Dark Arts: Networked Propaganda and Counter-Propaganda
- Jonathan Stray
Hannah Arendt: From an Interview
- They learn whom to kill and how to kill and how to do it together. This is the much talked about Gleichschaltung—the coordination process. You are coordinated not with the powers that be, but with your neighbor—coordinated with the majority.
The fake news phenomenon: How it spreads, and how to fight it
Downward comparison principles in social psychology.
GovTrack.us. Looks like a good source for parsable xml-formatted data for:

Congressional Bills	04-Jan-2017 02:05	–
Bill Status	04-Jan-2017 11:14	–
Bill Summaries	04-Jan-2017 05:13	–
Commerce Business Daily	19-Mar-2012 05:52	–
Code of Federal Regulations (Annual Edition)	22-Feb-2016 03:06	–
Electronic Code of Federal Regulations	20-Sep-2016 09:11	–
Federal Register	31-Dec-2016 08:37	–
United States Government Manual	14-Feb-2017 06:49	–
House Rules and Manual	13-Oct-2016 06:13	–
Privacy Act Issuances	09-Feb-2016 06:41	–
Public Papers of the Presidents of the United States	17-Jan-2017 09:32	–
Supreme Court Decisions 1937-1975 (FLITE)

8:30 – 4:30 BRC

More TensorFlow
- https://www.youtube.com/watch?v=kAOanJczHA0&list=PLOU2XLYxmsIKGc_NBoIhTn2Qhraji53cv
- Speed vs. Memory compiler options
- Unrolling and vectorizing
- Long Short Term Memory (LSTM) overview
- https://www.youtube.com/watch?v=t64ortpgS-E&list=PLOU2XLYxmsIKGc_NBoIhTn2Qhraji53cv&index=5
Need to try turning the integrity data into an angle radius and DBSCAN on Monday
Writing up the justification/needs for going to TensorFlow/GPU

viztales

Dimension reduction, State, Orientation, and Speed

Author Archives: pgfeldman

Phil 3.15.17

Phil Pi Day!

Phil 3.13.17

Phil 3.10.17

Phil 3.9.17

Phil 3.8.17

Phil 3.7.17

Phil 3.6.17

Phil 3.4.17

Phil 3.3.17

Phil 3.2.17

Phil 3.1.17

Phil 2.28.17

Phil 2.27.17

Phil 2.24.17

Rapid Communication	4500 words
Comment / Reply	3500 words