Monthly Archives: September 2019

Phil 9.30.19

7:00 – 7:00 ASRC GOES

  • Dissertation
  • Evolutionary hyperparameter tuning. It’s working (60% is already better than my hand-tuned efforts), but there’s a problem with the fitness values. Also, I want to always save the best chromosome
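Always keeping the best chromosome (elitism) can be sketched like this; a toy loop, not the actual optimizer code, with all function names illustrative:

```python
import random

def evolve(population, fitness, crossover, mutate, generations=30, rng=None):
    """Toy GA loop with an elite slot: the best chromosome seen so far is
    copied into every new generation, so it can never be lost to breeding."""
    rng = rng or random.Random(0)
    best = max(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        if fitness(ranked[0]) > fitness(best):
            best = ranked[0]
        parents = ranked[:max(2, len(ranked) // 2)]
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(len(population) - 1)]
        population = [best] + children  # elite slot
    return best, population
```

With the elite slot, the reported best fitness is monotone nondecreasing across generations, which also makes the fitness-value problem above easier to spot.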

Training

  • Reread weapons paper
  • Meeting with Aaron M – going to try to rework the paper a bit for ICSE 2020. Deadline is Oct 29.
    • Some interesting discussion on how review systems should work
    • Also some thoughts about how military AI in a hyperkinetic environment would have to negotiate cease-fires, sue for peace, etc.

RANDOM.ORG offers true random numbers to anyone on the Internet. The randomness comes from atmospheric noise, which for many purposes is better than the pseudo-random number algorithms typically used in computer programs. People use RANDOM.ORG for holding drawings, lotteries and sweepstakes, to drive online games, for scientific applications and for art and music. The service has existed since 1998 and was built by Dr Mads Haahr of the School of Computer Science and Statistics at Trinity College, Dublin in Ireland. Today, RANDOM.ORG is operated by Randomness and Integrity Services Ltd.

Phil 9.22.19

Getting ready for a fun trip: VA

12th International Conference on Agents and Artificial Intelligence – Dammit, the papers are due October 4th. This would be a perfect venue for the GPT2 agents

Novelist Cormac McCarthy’s tips on how to write a great science paper

Unveiling the relation between herding and liquidity with trader lead-lag networks

  • We propose a method to infer lead-lag networks of traders from the observation of their trade record as well as to reconstruct their state of supply and demand when they do not trade. The method relies on the Kinetic Ising model to describe how information propagates among traders, assigning a positive or negative “opinion” to all agents about whether the traded asset price will go up or down. This opinion is reflected by their trading behavior, but whenever the trader is not active in a given time window, a missing value will arise. Using a recently developed inference algorithm, we are able to reconstruct a lead-lag network and to estimate the unobserved opinions, giving a clearer picture about the state of supply and demand in the market at all times.
    We apply our method to a dataset of clients of a major dealer in the Foreign Exchange market at the 5 minutes time scale. We identify leading players in the market and define a herding measure based on the observed and inferred opinions. We show the causal link between herding and liquidity in the inter-dealer market used by dealers to rebalance their inventories.
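As a rough illustration of the mechanism the abstract describes, one parallel update of a kinetic Ising opinion model might look like this (couplings and temperature are made-up placeholders, not the paper's inferred values):

```python
import math
import random

def kinetic_ising_step(spins, J, beta=1.0, rng=None):
    """One synchronous update: each agent's opinion spins[i] in {-1, +1}
    is redrawn from a logistic probability driven by the field that the
    lead-lag couplings J[i][j] apply to the previous opinions."""
    rng = rng or random.Random(0)
    n = len(spins)
    new_spins = []
    for i in range(n):
        field = sum(J[i][j] * spins[j] for j in range(n))
        p_up = 1.0 / (1.0 + math.exp(-2.0 * beta * field))
        new_spins.append(1 if rng.random() < p_up else -1)
    return new_spins
```

With strong couplings from a leading trader, followers snap to the leader's opinion within a step or two, which is the lead-lag structure the paper infers from trade records.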

Phil 9.20.19

7:00 – 5:00 ASRC GOES

  • Maryland Anatomy Board (Juan Ortega – Acting Director Anatomical Services Division) Dept of vital records 410 764 2922 – Hopefully done on this
  • Dissertation
    • Rewrote abstract
    • Tweaked games and finished maps?
  • Create TimeSeriesML2 for TF2 – done, and everything is working!
    • Copy project
    • Rename
    • Point to new
  • Write pitch for ASRC funding NZ trip – done
  • Got my linux box working for vacation

Phil 9.19.19

ASRC AI Workshop 8:00 – 3:00

  • Maryland Anatomy Board Dept of vital records 410 764 2922
  • I remember this! Seeking New Physics
  • Dissertation? Some progress on the game section
  • Working on integrating the test DNN into the EO.
    • Need to add a few columns for the output that have the step and set membership.
    • Need to not run genomes that have already been run? Or maybe use an average? More output to spreadsheets for now, but I have to think about this more
    • Ok, I was expecting this:

path_error

Phil 9.18.19

7:00 – 5:00 ASRC GOES

  • Dept of vital records 410 764 2922 maryland.gov
  • Dissertation
  • EvolutionaryOptimizer
    • Work on getting code to work with perceptron model
    • Need to record accuracy and time to determine a fitness value. Something like -time*(1 – efficiency) – time. We want short-and-accurate to win over long-and-accurate. I’ll need to play around in Excel.
    • A new penalty-based wrapper fitness function for feature subset selection with evolutionary algorithms
      • Feature subset selection is an important preprocessing task for any real life data mining or pattern recognition problem. Evolutionary computational (EC) algorithms are popular as a search algorithm for feature subset selection. With the classification accuracy as the fitness function, the EC algorithms end up with feature subsets having considerably high recognition accuracy but the number of residual features also remain quite high. For high dimensional data, reduction of number of features is also very important to minimize computational cost of overall classification process. In this work, a wrapper fitness function composed of classification accuracy with another penalty term which penalizes for large number of features has been proposed. The proposed wrapper fitness function is used for feature subset evaluation and subsequent selection of optimal feature subset with several EC algorithms. The simulation experiments are done with several benchmark data sets having small to large number of features. The simulation results show that the proposed wrapper fitness function is efficient in reducing the number of features in the final selected feature subset without significant reduction of classification accuracy. The proposed fitness function has been shown to perform well for high-dimensional data sets with dimension up to 10,000.
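A penalty-style wrapper fitness along the lines of the note above, rewarding accuracy while penalizing training time so that short-and-accurate beats long-and-accurate, might look like this (the weight is a guess to tune in Excel):

```python
def fitness(accuracy, train_time, max_time, time_weight=0.5):
    """Higher is better: accuracy minus a normalized time penalty, echoing
    the accuracy-plus-penalty wrapper fitness idea from the paper above."""
    return accuracy - time_weight * (train_time / max_time)
```

With time_weight = 0.5, a model that is a few points more accurate but takes most of the time budget still loses to a fast one, which is the intended behavior.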

Phil 9.17.19

7:00 – 6:00 ASRC GOES

  • Dept of vital records 410 764 2922 maryland.gov
  • Working from home today, waiting for a delivery
  • Meet with Will at 3:00. Went smoothly this time.
  • Send Aaron a note that I’ll miss next week and maybe the week after
  • Dissertation – slow going. Wrote a few paragraphs on lists and stories. Need to put the section together on games
  • EvolutionaryOptimizer – done? It’s working nicely on the test set. You can see that it doesn’t always come up with the best answer, but it’s always close and often much faster:
  • Need to write the fitness function that builds and evaluates the model
  • Worked on getting TF 2.0 installed using my instructions, but the TF 2.0 build is broken? Ah, I see that we are now at RC1. Changing the instructions.
  • Everything works now, but my day is done. Need to update my install at work tomorrow.

 

Phil 9.16.19

7:00 – 8:00 ASRC GOES

This makes me happy. Older, but not slower. Yet.

Strava

  • Maryland Anatomy Board Dept of vital records 410 764 2922 – Never got called back
  • Ping Antonio about virtual crowdsourcing of opinion
  • Dissertation – write up dissertation house one-pager
  • Optimizer
    • Generating chromosome sequences.
    • Created a fitness landscape to evaluate

FitnessLandscape
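Chromosome breeding and mutation for the optimizer can be sketched as follows (illustrative only, not the actual EvolutionaryOptimizer code):

```python
import random

def crossover(a, b, rng):
    """Single-point crossover: the child takes a prefix of one parent
    chromosome and the matching suffix of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(chromosome, gene_choices, rate, rng):
    """Independently replace each gene with a random legal value at the
    given mutation rate."""
    return [rng.choice(gene_choices) if rng.random() < rate else g
            for g in chromosome]
```

Running these repeatedly over the fitness landscape above is what the breeding/mutation work amounts to.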

  • Working on breeding and mutation
  • ML Seminar
    • Status, and a few more Andrew Ng segments. How to debug gradient descent
  • Meeting With Aaron M
    • Nice chat
    • GARY MARCUS is a scientist, best-selling author, and entrepreneur. He is Founder and CEO of Robust.AI.
    • His newest book, co-authored with Ernest Davis, Rebooting AI: Building Machines We Can Trust aims to shake up the field of artificial intelligence.
    • Don’t put the transformer research in the dissertation
  • Evolution of Representations in the Transformer (nice looking blog post of deeper paper)
    • We look at the evolution of representations of individual tokens in Transformers trained with different training objectives (MT, LM, MLM – BERT-style) from the Information Bottleneck perspective and show, that:
      • LMs gradually forget past when forming future;
      • for MLMs, the evolution has the two stages of context encoding and token reconstruction;
      • MT representations get refined with context, but less processing is happening.
  • Different Spirals of Sameness: A Study of Content Sharing in Mainstream and Alternative Media
    • In this paper, we analyze content sharing between news sources in the alternative and mainstream media using a dataset of 713K articles and 194 sources. We find that content sharing happens in tightly formed communities, and these communities represent relatively homogeneous portions of the media landscape. Through a mix-method analysis, we find several primary content sharing behaviors. First, we find that the vast majority of shared articles are only shared with similar news sources (i.e. same community). Second, we find that despite these echo-chambers of sharing, specific sources, such as The Drudge Report, mix content from both mainstream and conspiracy communities. Third, we show that while these differing communities do not always share news articles, they do report on the same events, but often with competing and counter-narratives. Overall, we find that the news is homogeneous within communities and diverse in between, creating different spirals of sameness.
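The gradient-descent debugging material from the ML seminar above usually boils down to numerical gradient checking; a minimal central-difference version:

```python
def numerical_gradient(f, x, eps=1e-5):
    """Central-difference estimate of the gradient of f at x; comparing this
    against an analytic gradient is the standard way to debug backprop or a
    hand-rolled gradient-descent step."""
    grad = []
    for i in range(len(x)):
        x_plus, x_minus = list(x), list(x)
        x_plus[i] += eps
        x_minus[i] -= eps
        grad.append((f(x_plus) - f(x_minus)) / (2 * eps))
    return grad
```

If the analytic and numerical gradients disagree by more than a few parts in a thousand, the gradient code is the first suspect.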

Phil 9.14.19

FBMisinfo

This document describes the Facebook Privacy-Protected URLs-light release, resulting from a collaboration between Facebook and Social Science One. It was originally prepared for Social Science One grantees and describes the dataset’s scope, structure, and fields.

As part of this project, we are pleased to announce that we are making data from the URLs service available to the broader academic community for projects concerning the effect of social media on elections and democracy. This unprecedented dataset consists of web page addresses (URLs) that have been shared on Facebook starting January 1, 2017 through to and including February 19, 2019. URLs are included if shared by more than on average 100 unique accounts with public privacy settings. Read the complete Request for Proposals for more information.

Phil 9.12.19

7:00 – 4:30 ASRC GOES

  • FractalNet: Ultra-Deep Neural Networks without Residuals
    • We introduce a design strategy for neural network macro-architecture based on self-similarity. Repeated application of a simple expansion rule generates deep networks whose structural layouts are precisely truncated fractals. These networks contain interacting subpaths of different lengths, but do not include any pass-through or residual connections; every internal signal is transformed by a filter and nonlinearity before being seen by subsequent layers. In experiments, fractal networks match the excellent performance of standard residual networks on both CIFAR and ImageNet classification tasks, thereby demonstrating that residual representations may not be fundamental to the success of extremely deep convolutional neural networks. Rather, the key may be the ability to transition, during training, from effectively shallow to deep. We note similarities with student-teacher behavior and develop drop-path, a natural extension of dropout, to regularize co-adaptation of subpaths in fractal architectures. Such regularization allows extraction of high-performance fixed-depth subnetworks. Additionally, fractal networks exhibit an anytime property: shallow subnetworks provide a quick answer, while deeper subnetworks, with higher latency, provide a more accurate answer.
  • Structural diversity in social contagion
    • The concept of contagion has steadily expanded from its original grounding in epidemic disease to describe a vast array of processes that spread across networks, notably social phenomena such as fads, political opinions, the adoption of new technologies, and financial decisions. Traditional models of social contagion have been based on physical analogies with biological contagion, in which the probability that an individual is affected by the contagion grows monotonically with the size of his or her “contact neighborhood”—the number of affected individuals with whom he or she is in contact. Whereas this contact neighborhood hypothesis has formed the underpinning of essentially all current models, it has been challenging to evaluate it due to the difficulty in obtaining detailed data on individual network neighborhoods during the course of a large-scale contagion process. Here we study this question by analyzing the growth of Facebook, a rare example of a social process with genuinely global adoption. We find that the probability of contagion is tightly controlled by the number of connected components in an individual’s contact neighborhood, rather than by the actual size of the neighborhood. Surprisingly, once this “structural diversity” is controlled for, the size of the contact neighborhood is in fact generally a negative predictor of contagion. More broadly, our analysis shows how data at the size and resolution of the Facebook network make possible the identification of subtle structural signals that go undetected at smaller scales yet hold pivotal predictive roles for the outcomes of social processes.
    • Add this to the discussion section – done
  • Dissertation
    • Started on the theory section, then realized the background section didn’t set it up well. So worked on the background instead. I put in a good deal on how individuals and groups interact with the environment differently and how social interaction amplifies individual contribution through networking.
  • Quick meetings with Don and Aaron
  • Time prediction (sequence to sequence) with Keras perceptrons
  • This was surprisingly straightforward
    • There was some initial trickiness in getting the IDE to work with the TF2.0 RC0 package:
      import tensorflow as tf
      from tensorflow import keras
      from tensorflow_core.python.keras import layers

      The first coding step was to generate the data. In this case I’m building a numpy matrix that has ten variations on math.sin(), using our timeseriesML utils code. A loop creates a new frequency for each variation, which is sent off to get back a pandas DataFrame that in this case has 10 sequence rows, each 2*sequence_length samples long (the first half becomes the input, the second half the target). First, we set the global sequence_length:

      sequence_length = 100

      then we create the function that will build and concatenate our numpy matrices:

      def generate_train_test(num_functions, rows_per_function, noise=0.1) -> (np.ndarray, np.ndarray, np.ndarray):
          ff = FF.float_functions(rows_per_function, 2*sequence_length)
          npa = None
          for i in range(num_functions):
              mathstr = "math.sin(xx*{})".format(0.005*(i+1))
              #mathstr = "math.sin(xx)"
              df2 = ff.generateDataFrame(mathstr, noise=0.1)
              npa2 = df2.to_numpy()
              if npa is None:
                  npa = npa2
              else:
                  ta = np.append(npa, npa2, axis=0)
                  npa = ta
      
          split = np.hsplit(npa, 2)
          return npa, split[0], split[1]

      Now, we build the model. We’re using keras from the TF 2.0 RC0 build, so things look slightly different:

      model = tf.keras.Sequential()
      # Add a densely-connected layer with sequence_length units to the model:
      model.add(layers.Dense(sequence_length, activation='relu', input_shape=(sequence_length,)))
      # Add another:
      model.add(layers.Dense(200, activation='relu'))
      # Add a linear output layer with sequence_length units:
      model.add(layers.Dense(sequence_length))
      
      loss_func = tf.keras.losses.MeanSquaredError()
      opt_func = tf.keras.optimizers.Adam(0.01)
      model.compile(optimizer= opt_func,
                    loss=loss_func,
                    metrics=['accuracy'])

      We can now fit the model to the generated data:

      full_mat, train_mat, test_mat = generate_train_test(10, 10)
      
      model.fit(train_mat, test_mat, epochs=10, batch_size=2)

      There is noise in the data, so the accuracy is not bang on, but the loss is nice. We can see this better in the plots above, which were created using this function:

      def plot_mats(mat:np.ndarray, cluster_size:int, title:str, fig_num:int):
          plt.figure(fig_num)
      
          i = 0
          for row in mat:
              cstr = "C{}".format(int(i/cluster_size))
              plt.plot(row, color=cstr)
              i += 1
      
          plt.title(title)

      Which is called just before the program completes:

      if show_plots:
          plot_mats(full_mat, 10, "Full Data", 1)
          plot_mats(train_mat, 10, "Input Vector", 2)
          plot_mats(test_mat, 10, "Output Vector", 3)
          plot_mats(predict_mat, 10, "Predict", 4)
          plt.show()
    • That’s it! Full listing below:
import tensorflow as tf
from tensorflow import keras
from tensorflow_core.python.keras import layers
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import timeseriesML.generators.float_functions as FF


sequence_length = 100

def generate_train_test(num_functions, rows_per_function, noise=0.1) -> (np.ndarray, np.ndarray, np.ndarray):
    ff = FF.float_functions(rows_per_function, 2*sequence_length)
    npa = None
    for i in range(num_functions):
        mathstr = "math.sin(xx*{})".format(0.005*(i+1))
        #mathstr = "math.sin(xx)"
        df2 = ff.generateDataFrame(mathstr, noise=0.1)
        npa2 = df2.to_numpy()
        if npa is None:
            npa = npa2
        else:
            ta = np.append(npa, npa2, axis=0)
            npa = ta

    split = np.hsplit(npa, 2)
    return npa, split[0], split[1]

def plot_mats(mat:np.ndarray, cluster_size:int, title:str, fig_num:int):
    plt.figure(fig_num)

    i = 0
    for row in mat:
        cstr = "C{}".format(int(i/cluster_size))
        plt.plot(row, color=cstr)
        i += 1

    plt.title(title)

model = tf.keras.Sequential()
# Add a densely-connected layer with sequence_length units to the model:
model.add(layers.Dense(sequence_length, activation='relu', input_shape=(sequence_length,)))
# Add another:
model.add(layers.Dense(200, activation='relu'))
# Add a linear output layer with sequence_length units:
model.add(layers.Dense(sequence_length))

loss_func = tf.keras.losses.MeanSquaredError()
opt_func = tf.keras.optimizers.Adam(0.01)
model.compile(optimizer= opt_func,
              loss=loss_func,
              metrics=['accuracy'])

full_mat, train_mat, test_mat = generate_train_test(10, 10)

model.fit(train_mat, test_mat, epochs=10, batch_size=2)
model.evaluate(train_mat, test_mat)

# test against freshly generated data
full_mat, train_mat, test_mat = generate_train_test(10, 10)
predict_mat = model.predict(train_mat)

show_plots = True
if show_plots:
    plot_mats(full_mat, 10, "Full Data", 1)
    plot_mats(train_mat, 10, "Input Vector", 2)
    plot_mats(test_mat, 10, "Output Vector", 3)
    plot_mats(predict_mat, 10, "Predict", 4)
    plt.show()



Phil 9.11.19


7:00 – 4:00 ASRC GOES

  • Model:DLG3501W SKU:6181264
  • Maryland Anatomy Board Dept of vital records 410 764 2922
  • arxiv-vanity.com  arXiv Vanity renders academic papers from arXiv as responsive web pages so you don’t have to squint at a PDF.
    • It works ok. Tables and caption alignment are a problem for now, but it sounds great for phones
  • DeepPrivacy: A Generative Adversarial Network for Face Anonymization
    • We propose a novel architecture which is able to automatically anonymize faces in images while retaining the original data distribution. We ensure total anonymization of all faces in an image by generating images exclusively on privacy-safe information. Our model is based on a conditional generative adversarial network, generating images considering the original pose and image background. The conditional information enables us to generate highly realistic faces with a seamless transition between the generated face and the existing background. Furthermore, we introduce a diverse dataset of human faces, including unconventional poses, occluded faces, and a vast variability in backgrounds. Finally, we present experimental results reflecting the capability of our model to anonymize images while preserving the data distribution, making the data suitable for further training of deep learning models. As far as we know, no other solution has been proposed that guarantees the anonymization of faces while generating realistic images.
  • Introducing a Conditional Transformer Language Model for Controllable Generation
    • CTRL is a 1.6 billion-parameter language model with powerful and controllable artificial text generation that can predict which subset of the training data most influenced a generated text sequence. It provides a potential method for analyzing large amounts of generated text by identifying the most influential source of training data in the model. Trained with over 50 different control codes, the CTRL model allows for better human-AI interaction because users can control the generated content and style of the text, as well as train it for multitask language generation. Finally, it can be used to improve other natural language processing (NLP) applications either through fine-tuning for a specific task or through transfer of representations that the model has learned.
  • Dissertation
    • Started to put together my Linux laptop for vacation writing
    • More SIH section
  • Verify that timeseriesML can be used as a library
  • Perceptron curve prediction
  • AI/ML status meetings
  • Helped Vadim with some python issues

Phil 9.10.19

ASRC GOES 7:00 – 5:30

  • Got a mention in an article on Albawaba – When the Only Option is ‘Not to Play’? Autonomous Weapons Systems Debated in Geneva 
  • Dissertation – more SIH
  • Just saw this: On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
    • We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher rouge scores. Note: The abstract above was not written by the authors, it was generated by one of the models presented in this paper.
  • Working on packaging timeseriesML. I think it’s working!

TimeSeriesML

  • I’ll try it out when I get back after lunch
  • Meeting with Vadim
    • Showed him around and provided svn access
  • Model:DLG3501W SKU:6181264