Category Archives: Tensorflow

Phil 5.15.20

Fridays are hard. I feel like I need a break from pushing this rock uphill alone. Nice day for a ride tomorrow, so a few of us will probably meet up.

D20

Zach seems to be making progress in fits and starts. No word from Aaron
One way to make the system more responsive is to see if the rates are above or below the regression. Above can be flagged.
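
A minimal sketch of that flagging idea, assuming a plain least-squares line over the whole series and a hypothetical rule that flags a region when the mean of its last few points sits above the line. The names `fit_line`, `flag_above_regression` and the sample rates are made up for illustration; they are not the D20 code.

```python
from typing import List, Tuple

def fit_line(ys: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept, with x = 0, 1, 2, ..."""
    n = len(ys)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flag_above_regression(ys: List[float], last_n: int = 3) -> bool:
    """Flag a series when the mean residual of its last_n points is positive."""
    slope, intercept = fit_line(ys)
    n = len(ys)
    residuals = [ys[x] - (slope * x + intercept) for x in range(n - last_n, n)]
    return sum(residuals) / last_n > 0

# Hypothetical daily rates: flat, then a late jump
rates = [1.0, 1.1, 0.9, 1.0, 1.1, 1.0, 1.6, 1.9, 2.3]
print(flag_above_regression(rates))
```

Looking only at the most recent points is the design choice here: a region that has run above its own trend for the last few days is flagged even when the overall fit still looks tame.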

GPT-2 Agents

More PGNtoEnglish. Getting close I think.
Added pawn attack moves (diagonals)
Adding comment regex – done

Some problem handling this:

Evaluating move [Re1 Qb6]
search at (-6, -6) for black queen out of bounds
search at (6, -6) for black queen out of bounds
search at (0, -6) for black queen out of bounds
search at (7, 7) for black queen out of bounds
search at (-7, -7) for black queen out of bounds
search at (7, -7) for black queen out of bounds
search at (7, 0) for black queen out of bounds
search at (0, -7) for black queen out of bounds
raw: white: Re1, black: Qb6
	expanded: white:  Fred Van der Vliet moves white rook from f1 to e1.
	black: unset

GOES

Need to make the output of the generator work as input to the discriminator.
So I need to get from the input vector of latent noise to an output that is the size of the real data. It’s easy to do with Dense, but Dense and Conv1D don’t get along. I think I can get around that by reshaping the output of the Dense layer to something that a Conv1D can take. But that probably loses a lot of information, since each neuron will have some of each noise sample in it. But the information is noise in the first place, so it’s just resampled noise? The other option is to upsample, but that requires the latent vector to divide evenly into the input vector for the discriminator.

Here’s my code that does the change from a dense to Conv1D:

self.g_model.add(Dense(self.vector_size*self.num_samples, activation='relu', batch_input_shape=(self.latent_dim, self.num_samples)))
self.g_model.add(Reshape(target_shape=(self.vector_size, self.num_samples)))
self.g_model.add(Conv1D(filters=self.vector_size, kernel_size=5, activation='tanh', batch_input_shape=(self.vector_size, self.num_samples, 1)))

The code that produces the latent noise is:

    def generate_latent_points(self, span:float=1.0) -> np.ndarray:
        # note: the noise is scaled by self.span; the span argument is not used here
        x_input = np.random.randn(self.latent_dim * self.num_samples)*self.span
        # reshape into a batch of inputs for the network
        x_input = x_input.reshape(self.latent_dim, self.num_samples)
        return x_input

The “real” values are:

real_matrix = 
[[-0.34737792 -0.7081109   0.93673414 -0.071527   -0.87720268]
 [ 0.99876073 -0.46088645 -0.61516785  0.97288676 -0.19455964]
 [ 0.97121222 -0.18755778 -0.81510907  0.8659679   0.09436946]
 [-0.72593361 -0.32328777  0.99500398 -0.50484775 -0.57482239]
 [ 0.72944027 -0.92555418  0.04089262  0.89151951 -0.78289867]
 [ 0.79514567 -0.88231211 -0.06080288  0.93291797 -0.71565884]
 [ 0.78083404 -0.89301473 -0.03758353  0.92429527 -0.73170157]
 [ 0.08266474 -0.94058595  0.70017899  0.3578314  -0.9979998 ]
 [-0.39534886 -0.67069473  0.95356385 -0.12295042 -0.85123299]
 [ 0.73424796  0.31175013 -0.99371562  0.5153131   0.56482379]]

The latent values are (note that the matrix is transposed):

latent_matrix = 
[[  8.73701754   6.10841293   9.31566343  -2.00751851   0.10715919
    6.94580853  -6.95308374   6.97502697 -11.09777023  -8.79311041]
 [ -3.61789323   0.11091496  10.94717459   3.14579647 -13.23974342
    2.78914476   9.40101397 -17.75756896   2.87461527   6.65877192]
 [  5.77331701   7.71326491   9.9877786   -3.81972802  -5.86490109
   -6.68585542 -13.59478633  -7.66952834 -10.78863284   5.9248856 ]
 [ -3.05226511  -5.36347909   1.3377953   14.87752343  -0.21993387
  -13.47737126   1.39357385  -1.85004465   6.83400948   1.21105276]]

The values created by the generator are:

predict_matrix = 
[[[-0.9839389   0.18747564 -0.9449842  -0.66334486 -0.9822154 ]]
 [[ 0.9514655  -0.9985579   0.76473945 -0.9985249  -0.9828463 ]]
 [[-0.58794653 -0.9982161   0.9855345  -0.93976855 -0.9999758 ]]
 [[-0.9987122   0.9480774  -0.80395573 -0.999845    0.06755089]]]

So now I need to get the number of rows up to the same value as the real data
Ok, so here’s how that works. We use tf.keras.layers.Reshape(), which is pretty simple. You simply put most of the shape you want as the single argument. So for these experiments, I had ten rows of 5 features, plus an extra dimension. So you would think that reshape(10,5,1) would be what you want.

Au contraire! Keras wants to be able to have flexibility, so the batch dimension is left to vary and is not part of the argument. The argument is actually (5, 1). Here are two versions. First is a generator using a Dense network:

def define_generator_Dense(self) -> Sequential:
    self.g_model_Dense = Sequential()

    self.g_model_Dense.add(Dense(4, activation='relu', kernel_initializer='he_uniform', input_dim=self.latent_dim))
    self.g_model_Dense.add(Dropout(0.2))
    self.g_model_Dense.add(Dense(self.vector_size, activation='tanh')) # activation was linear
    self.g_model_Dense.add(Reshape((self.vector_size, 1)))
    print("g_model_Dense.output_shape = {}".format(self.g_model_Dense.output_shape))

    # compile model
    loss_func = tf.keras.losses.BinaryCrossentropy()
    opt_func = tf.keras.optimizers.Adam(0.001)
    self.g_model_Dense.compile(loss=loss_func, optimizer=opt_func)

    return self.g_model_Dense

Second is a network using Conv1D layers

def define_generator_Dense_to_CNN(self) -> Sequential:
    self.g_model_Dense_CNN = Sequential()
    self.g_model_Dense_CNN.add(Dense(self.num_samples * self.vector_size, activation='relu', batch_input_shape=(self.num_samples, self.latent_dim)))
    self.g_model_Dense_CNN.add(Reshape(target_shape=(self.num_samples, self.vector_size)))
    self.g_model_Dense_CNN.add(Conv1D(filters=self.vector_size, kernel_size=self.num_samples, activation='tanh', batch_input_shape=(self.num_samples, self.vector_size, 1))) # activation was linear
    self.g_model_Dense_CNN.add(Reshape((self.vector_size, 1)))
    #self.g_model.add(UpSampling1D(size=2))

    # compile model
    loss_func = tf.keras.losses.BinaryCrossentropy()
    opt_func = tf.keras.optimizers.Adam(0.001)
    self.g_model_Dense_CNN.compile(loss=loss_func, optimizer=opt_func)
    return self.g_model_Dense_CNN
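
The Reshape behavior described above (the target shape leaves out the batch dimension) can be mimicked in plain NumPy. This is only an analogy, not the Keras implementation, and `keras_style_reshape` is a hypothetical helper name:

```python
import numpy as np

def keras_style_reshape(batch: np.ndarray, target_shape: tuple) -> np.ndarray:
    """Reshape every sample to target_shape, leaving the leading batch axis alone."""
    return batch.reshape((batch.shape[0],) + tuple(target_shape))

batch = np.zeros((10, 5))                  # ten rows of five features
out = keras_style_reshape(batch, (5, 1))   # the argument describes ONE sample
print(out.shape)
```

The ten rows ride along as the batch axis, which is why the argument is (5, 1) and not (10, 5, 1).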

Both evaluated correctly against the discriminator, so I should be able to train the whole GAN, once it’s assembled. But that is not something to start at 4:30 on a Friday afternoon!

real predict = (10, 1)[[0.42996567]
 [0.55048925]
 [0.56003207]
 [0.40951794]
 [0.5600004 ]
 [0.5098837 ]
 [0.4046895 ]
 [0.41493616]
 [0.4196912 ]
 [0.5080263 ]]
gdense_mat predict = (10, 1)[[0.48928624]
 [0.5       ]
 [0.4949373 ]
 [0.5       ]
 [0.5973854 ]
 [0.61968124]
 [0.49698165]
 [0.5       ]
 [0.5183723 ]
 [0.4212265 ]]
gdcnn_mat predict = (10, 1)[[0.48057705]
 [0.5026125 ]
 [0.51943815]
 [0.4902147 ]
 [0.5988    ]
 [0.39476413]
 [0.49915075]
 [0.49861506]
 [0.55501187]
 [0.54503495]]

Phil 5.14.20

GPT-2 Agents

Adding hints and meta information. Still need to handle special pawn moves and pull out comments

/[^{\}]+(?=})/g, from https://stackoverflow.com/questions/413071/regex-to-get-string-between-curly-braces
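
A sketch of how that pattern might be used from Python to pull PGN comments out of the move text. The JavaScript `/.../g` wrapper is dropped because `re.findall` already returns every match. `pull_comments` and the sample move text are hypothetical, not the PGNtoEnglish code:

```python
import re

# Runs of non-brace characters that are immediately followed by a closing brace
COMMENT_RE = re.compile(r"[^{}]+(?=})")

def pull_comments(move_text: str):
    """Return (comments, move text with the {...} comments removed)."""
    comments = [c.strip() for c in COMMENT_RE.findall(move_text)]
    stripped = re.sub(r"\{[^{}]*\}", "", move_text)
    # collapse the whitespace left behind by the removed comments
    return comments, " ".join(stripped.split())

# Hypothetical PGN fragment
comments, moves = pull_comments("1. d4 {queen's pawn} Nf6 2. c4 g6 {heading for a King's Indian}")
print(comments)
print(moves)
```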

Amusingly, my simple parser is now at 560 LOC and counting

GOES

Working on creating a discriminator using Conv1D layers.
- With some help from Aaron, I got the discriminator working. There are some issues. I’m currently using batch_input_shape rather than input_shape, which means that a pre-sized batch is compiled in. The second issue is that the discriminator requires a 3D tensor to be fed in, which can’t be produced naturally with Dense/MLP. That means the Generator also has to use Conv1D at least at the output layer
- I think this post: How to Develop 1D Convolutional Neural Network Models for Human Activity Recognition should help, but I don’t think I have the cognitive ability at the end of the day. Tomorrow

Phil 5.12.20

D20

#COVID Aseel’s docs don’t seem to be in the proper unicode? I tried downloading a version of the Quran from here, and that seems to be working. Hmmm. Trying to train on the Quran with these args:

--output_dir=output --model_type=gpt2 --model_name_or_path=gpt2 --per_gpu_train_batch_size=1 --do_train --train_data_file=..\input\quran-simple.txt

GPT-2 Agents

Added basic moves for all the pieces. Still need to handle hints

Evaluating move [d4 Nf6]
 Fred Van der Vliet moves white pawn from d2 to d4.
 Loek Van Wely moves black knight from g8 to f6.
Evaluating move [c4 g6]
 Fred Van der Vliet moves white pawn from c2 to c4.
 Loek Van Wely moves black pawn from g7 to g6.
Evaluating move [g3 Bg7]
 Fred Van der Vliet moves white pawn from g2 to g3.
 Loek Van Wely moves black bishop from f8 to g7.
Evaluating move [Bg2 O-O]
 Fred Van der Vliet moves white bishop from f1 to g2.
 Loek Van Wely kingside castles.

GOES

Working on NoiseGAN
- Seems to be training without blowing up….
- It ran, but the results are weird.

acc_loss

- As you can see, the fake data seems to have learned the noise well, but the scale is wrong.
- It does seem to be able to learn about the scale though:

- Adding dropout seems to help:

Noise_trained

The discriminator so far:

self.d_model = Sequential()
self.d_model.add(Dense(64, activation='relu', kernel_initializer='he_uniform', input_dim=self.vector_size))
self.d_model.add(Dropout(0.2))
self.d_model.add(Dense(25, activation='relu', kernel_initializer='he_uniform', input_dim=self.vector_size))
self.d_model.add(Dropout(0.2))
self.d_model.add(Dense(1, activation='sigmoid'))
# compile model
self.d_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Noise_trained

Just found out about TF-GAN from this Google course on GANs
- Mode Collapse is why the GAN keeps generating a single waveform
GANSynth: Making music with GANs
- In this post, we introduce GANSynth, a method for generating high-fidelity audio with Generative Adversarial Networks (GANs).
10 Lessons I Learned Training GANs for one Year
Advanced Topics in GANs
10:00 meeting with Vadim – nope

Phil 5.11.20

Cut my hair for the second time. It looks ok from the front…

I’m also having dreams with crowds in them. Saturday night I dreamed I was at some job with a lot of people in a large building. Last night I dreamed I was sharing a dorm at the Naval Academy?

A foolproof way to shrink deep learning models

Train the model, prune its weakest connections, retrain the model at its fast, early training rate, and repeat, until the model is as tiny as you want.
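
A toy sketch of just the pruning step, assuming "weakest" means smallest absolute weight (the linked article may define it differently). The train and retrain steps are left out, and `prune_smallest` is a hypothetical name:

```python
from typing import List

def prune_smallest(weights: List[float], fraction: float) -> List[float]:
    """Zero out the given fraction of weights with the smallest magnitude."""
    k = int(len(weights) * fraction)
    if k == 0:
        return list(weights)
    # indices ordered from weakest to strongest connection
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

print(prune_smallest([0.5, -0.01, 0.2, -0.9], 0.5))
```

In the loop described above this would run once per round, with retraining in between.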

Graph Neural Networks (GNN)

Graph neural networks (GNNs) are connectionist models that capture the dependence of graphs via message passing between the nodes of graphs. Unlike standard neural networks, graph neural networks retain a state that can represent information from its neighborhood with arbitrary depth.

D20

Zach’s having issues getting the map to work on mobile
Need to start pulling off controlled entities like China and Diamond Princess
Made a duplicate of the trending code to play with

GPT-2 Agents

More PGNtoEnglish
I have pawns and knights moving!

chessboard

With expanded text!
- ‘Fred Van der Vliet moves white pawn from d2 to d4’
- ‘Loek Van Wely moves black knight from g8 to f6’

GOES

Continue with NoiseGAN
Isolating noise. Done!

noise

Now I need to subsample to produce the training and test sets. Seems to be working
Fitting the timeseries sampling into the GAN

Try training the GAN?

Fika

Community Spaces for Interdisciplinary Science and Engagement
- Dr. Lisa Scheifele is an Associate Professor at Loyola University Maryland and head of the Build-a-Genome research network, where her research focuses on designing and programming cells for new and complex functions. She is also Executive Director at the Baltimore Underground Science Space (BUGSS) community lab. BUGSS provides unique and creative projects to members of the public who have few other opportunities to engage with modern science. As an informal and nontraditional science space, BUGSS’ activities blend biotechnology research, computational tools, artistic expression, and design principles to accomplish interdisciplinary projects driven by community interest and need.

Phil 5.8.20

D20

Really have to fix the trending. Places like Brazil, where the disease is likely to be chronic, are not working any more
Aaron and I agree if the site’s not updated by 5/15 to pull it down

GPT-2 Agents

More PGNtoEnglish
Worked out a way to search for pieces in a rules-based range. It’ll work for pawns, knights, and kings right now. Will need to add rooks, bishops and queens
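
A rough sketch of what a rules-based range search could look like, assuming the board is a dict from square name to piece label and that the search runs backwards from the destination square. All names here are hypothetical; this is not the PGNtoEnglish code:

```python
from typing import Dict, List, Optional, Tuple

FILES = "abcdefgh"

# (file offset, rank offset) pairs for a single knight jump
KNIGHT_OFFSETS = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def offset_square(square: str, d_file: int, d_rank: int) -> Optional[str]:
    """Return the square at the given offset, or None if it is off the board."""
    file_idx = FILES.index(square[0]) + d_file
    rank = int(square[1]) + d_rank
    if 0 <= file_idx < 8 and 1 <= rank <= 8:
        return FILES[file_idx] + str(rank)
    return None

def find_origins(board: Dict[str, str], piece: str, target: str,
                 offsets: List[Tuple[int, int]]) -> List[str]:
    """Look backwards from target along each offset for squares holding piece."""
    origins = []
    for d_file, d_rank in offsets:
        candidate = offset_square(target, d_file, d_rank)
        if candidate is not None and board.get(candidate) == piece:
            origins.append(candidate)
    return origins

# Hypothetical sparse board: only the two black knights on their home squares
board = {"b8": "black knight", "g8": "black knight"}
print(find_origins(board, "black knight", "f6", KNIGHT_OFFSETS))
```

Pawns and kings would only need different offset lists. Rooks, bishops and queens would need to repeat an offset along a ray until a piece or the edge of the board is hit.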

#COVID

Try finetuning the model on Arabic to see what happens. Don’t see the txt files?

GOES

The time taken for all the DB calls is substantial. I need to change the Measurements class so that there is a set of master Measurements that are big enough to subsample other Measurements from. Done. Much faster!

Start building noise query, possibly using a high pass filter? Otherwise, subtract the “real” signal from the simulated one
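
The subtraction option, sketched with synthetic stand-ins for the two query results. The real series come from InfluxDB; the period of 8 samples, the noise level and the function name are assumptions for illustration:

```python
import math
import random
from typing import List

def isolate_noise(noisy: List[float], clean: List[float]) -> List[float]:
    """Point-by-point difference between the noisy and the clean signal."""
    if len(noisy) != len(clean):
        raise ValueError("signals must be the same length")
    return [n - c for n, c in zip(noisy, clean)]

# Stand-ins for the "sin" and "noisy_sin" series
rng = random.Random(1)
clean_sin = [math.sin(2 * math.pi * i / 8) for i in range(100)]
noisy_sin = [c + rng.gauss(0.0, 0.1) for c in clean_sin]

noise = isolate_noise(noisy_sin, clean_sin)
print(max(abs(v) for v in noise))
```

This only works when both series are sampled at the same timestamps, which is one reason to debug the queries first.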

Starting with the subtraction, since I have to set up queries anyway, and this will help me debug them
Created NoiseGAN class that extends OneDGAN
Pulling over table building code from InfluxTestTrainBase()
Success!

"D:\Program Files\Python37\python.exe" D:/Development/Sandboxes/Influx2_ML/Influx2_ML/NoiseGAN.py
2020-05-08 14:45:36.077292: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
OneDGAN.reset()
NoiseGAN.reset()
query = from(bucket:"org_1_bucket") |> range(start:2020-04-13T13:30:00Z, stop:2020-04-13T13:40:00Z) |> filter(fn:(r) => r.type == "noisy_sin" and (r.period == "8"))
vector size = 100, query returns = 590

Probably a good place to stop for the day

10:00 Meeting. Vadim seems to be making good progress. Check in on Tuesday

Phil 5.7.20

D20

Everything is silent again.

GPT-2 Agents

Continuing with PGNtoEnglish
- Building out move text
- Changing board to a dataframe, since I can display it as a table in pyplot – done!

chessboard

Here’s the code for making the chesstable table in pyplot:

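from typing import List

import pandas as pd
import matplotlib.pyplot as plt

# 'pieces' is not shown in this post; it appears to be an Enum of piece labels defined elsewhere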

class Chessboard():
    board:pd.DataFrame
    rows:List
    cols:List

    def __init__(self):
        self.reset()

    def reset(self):
        self.cols = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h']
        self.rows = [8, 7, 6, 5, 4, 3, 2, 1]
        self.board = df = pd.DataFrame(columns=self.cols, index=self.rows)
        for number in self.rows:
            for letter in self.cols:
                df.at[number, letter] = pieces.NONE.value

        self.populate_board()
        self.print_board()

    def populate_board(self):
        self.board.at[1, 'a'] = pieces.WHITE_ROOK.value
        self.board.at[1, 'h'] = pieces.WHITE_ROOK.value
        self.board.at[1, 'b'] = pieces.WHITE_KNIGHT.value
        self.board.at[1, 'g'] = pieces.WHITE_KNIGHT.value
        self.board.at[1, 'c'] = pieces.WHITE_BISHOP.value
        self.board.at[1, 'f'] = pieces.WHITE_BISHOP.value
        self.board.at[1, 'd'] = pieces.WHITE_QUEEN.value
        self.board.at[1, 'e'] = pieces.WHITE_KING.value

        self.board.at[8, 'a'] = pieces.BLACK_ROOK.value
        self.board.at[8, 'h'] = pieces.BLACK_ROOK.value
        self.board.at[8, 'b'] = pieces.BLACK_KNIGHT.value
        self.board.at[8, 'g'] = pieces.BLACK_KNIGHT.value
        self.board.at[8, 'c'] = pieces.BLACK_BISHOP.value
        self.board.at[8, 'f'] = pieces.BLACK_BISHOP.value
        self.board.at[8, 'd'] = pieces.BLACK_QUEEN.value
        self.board.at[8, 'e'] = pieces.BLACK_KING.value

        for letter in self.cols:
            self.board.at[2, letter] = pieces.WHITE_PAWN.value
            self.board.at[7, letter] = pieces.BLACK_PAWN.value

    def print_board(self):
        fig, ax = plt.subplots()

        # hide axes
        fig.patch.set_visible(False)
        ax.axis('off')
        ax.axis('tight')

        ax.table(cellText=self.board.values, colLabels=self.cols, rowLabels=self.rows, loc='center')

        fig.tight_layout()

        plt.show()

GOES

Continuing with the MLP sequence-to-sequence NN
Writing
Reading
- Hmm. Just realized that the input vector being defined by the query is a bit problematic. I think I need to define the input vector size and then ensure that the query creates sufficient points. Fixed. It now stores the model with the specified input vector size:

model_name

And here’s the loaded model in newly-retrieved data:

Here’s the model learning two waveforms. Went from 400×2 neurons to 3200×2:

Combining with GAN
- Subtract the sin from the noisy_sin to get the noise and train on that
Start writing paper? What are other venues beyond GVSETS?
2:00 status meeting

JuryRoom

3:30 Meeting
6:00 Meeting

Phil 5.6.20

#COVID

I looked at the COVID-19-TweetIDs GitHub project, and it is in fact lists of ids:

1219755883690774529
1219755875407224832
1219755707001659393
1219755610494861312
1219755586272813057
1219755378428338181
1219755293397012480
1219755288988798981
1219755197645279233
1219755157438828545

These can work by appending that number to the string “twitter.com/anyuser/status/”, like this: twitter.com/anyuser/status/1219755883690774529

The way to get the text in Python appears to be tweepy. This snippet from stackoverflow appears to show how to do it, but I haven’t verified yet.

import tweepy
consumer_key = "xxxx"  # placeholder credentials
consumer_secret = "xxxx"
access_token = "xxxx"
access_token_secret = "xxxx"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

tweets = api.statuses_lookup(id_list) # id_list is the list of tweet ids
tweet_txt = []
for i in tweets:
    tweet_txt.append(i.text)
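
The id files are long, and as far as I know the lookup endpoint behind statuses_lookup only takes a limited number of ids per request (100 is the figure I remember, so treat it as an assumption). A small sketch of batching the ids and building the URLs described above. No network calls are made, and `batches` and `status_url` are hypothetical helper names:

```python
from typing import Iterator, List

def batches(ids: List[int], size: int = 100) -> Iterator[List[int]]:
    """Yield consecutive slices of ids, each at most size long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]

def status_url(tweet_id: int) -> str:
    """Build the browser URL for one tweet id."""
    return "twitter.com/anyuser/status/{}".format(tweet_id)

id_list = [1219755883690774529, 1219755875407224832, 1219755707001659393]
for batch in batches(id_list, size=2):
    # with a live API object, this is where api.statuses_lookup(batch) would go
    print([status_url(i) for i in batch])
```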

GPT-2 Agents

Continuing with PGNtoEnglish
- Figuring out how to parse the moves text, using the wonderful regex101 site
4:30 meeting
- We set up an Overleaf project with the goal to submit to the Harvard/Kennedy Misinformation Review
- We talked about the GPT-2 as a way of clustering tweets. Going to try finetuning with some Arabic novels first to see if it can work in that language

GOES

Continuing with the MLP sequence-to-sequence NN
- Getting the data to fit into nice, rectangular arrays, which is not straightforward, since the time window of the query can return a varying number of results. So I have to run the query, then trim the arrays down so that they are all the length of the shortest. Here are the results:

I’ve got the training and prediction working pretty well. Stopping for the day
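
The trimming step mentioned above (cut every query result down to the length of the shortest) might look like this. A minimal sketch with made-up data, where `trim_to_shortest` is a hypothetical name:

```python
from typing import List

def trim_to_shortest(rows: List[List[float]]) -> List[List[float]]:
    """Cut every row to the length of the shortest so the result is rectangular."""
    if not rows:
        return []
    shortest = min(len(r) for r in rows)
    return [r[:shortest] for r in rows]

# Three query results of different lengths
ragged = [[0.0, 0.1, 0.2, 0.3], [1.0, 1.1, 1.2], [2.0, 2.1, 2.2, 2.3, 2.4]]
print(trim_to_shortest(ragged))
```

Trimming from the end keeps the start of each window aligned, at the cost of throwing away a few trailing points.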

Tomorrow I’ll get the models to write out and read in
2:00 status meeting
- Two weeks to getting the sim running?

Phil 5.5.20

D20

Just goes to show that you shouldn’t take regression fits as correct

GPT-2 Agents

More PGNtoEnglish
Discovered typing.TextIO. I love typing to death 🙂
Finished parsing meta information

#COVID

twitter-text-python is a Tweet parser and formatter for Python. Extract users, hashtags, URLs and format as HTML for display. PyPI release: https://pypi.org/project/twitter-text-python/

GOES

Progress meeting with Vadim and Isaac
Train and save a 2-layer, 400 neuron MLP. No ensembles for now
Set up GAN to add noise

Phil 5.4.20

It is a Chopin sort of morning

D20

Zach got maps and lists working over the weekend. Still a lot more to do though
Need to revisit the math to work over the past n days

GPT-2 Agents

Working on PGN to English.
- Added game class that contains all the information for a game and reads it in. Games are created and managed by the PGNtoEnglish class
Rebased the transformers project. It updates fast

GOES

Figure out how to save and load models. I’m really not sure what to save, since you need access to the latent space and the discriminator? So far, it’s:

def save_models(self, directory:str, prefix:str):
    p = os.getcwd()
    os.chdir(directory)
    self.d_model.save("{}_discriminator.tf".format(prefix))
    self.g_model.save("{}_generator.tf".format(prefix))
    self.gan_model.save("{}_GAN.tf".format(prefix))
    os.chdir(p)

def load_models(self, directory:str, prefix:str):
    p = os.getcwd()
    os.chdir(directory)
    self.d_model = tf.keras.models.load_model("{}_discriminator.tf".format(prefix))
    self.g_model = tf.keras.models.load_model("{}_generator.tf".format(prefix))
    self.gan_model = tf.keras.models.load_model("{}_GAN.tf".format(prefix))
    os.chdir(p)

Here’s the initial run. Very nice for 10,000 epochs!

- And here’s the results from the loaded model:

GAN_trained

- The discriminator works as well:
```
real accuracy = 100.00%, fake accuracy = 100.00%
real loss = 0.0154, fake loss = 0.0947%
```
- An odd thing is that I can save the GAN model, but can’t load it?
```
ValueError: An empty Model cannot be used as a Layer.
```
  I can rebuild it from the loaded generator and discriminator models though
Set up MLP to convert low-fidelity sin waves to high-fidelity
- Get the training and test data from InfluxDB
  - input is square, output is sin, and the GAN should be noisy_sin minus sin. Randomly move the sample through the domain
- Got the queries working:
- Train and save a 2-layer, 400 neuron MLP. No ensembles for now
Set up GAN to add noise

Fika

Ask question about what the ACM and CHI are doing, beyond providing publication venues, to fight misinformation that lets millions of people find fabricated evidence that supports dangerous behavior.
Effects of Credibility Indicators on Social Media News Sharing Intent
- In recent years, social media services have been leveraged to spread fake news stories. Helping people spot fake stories by marking them with credibility indicators could dissuade them from sharing such stories, thus reducing their amplification. We carried out an online study (N = 1,512) to explore the impact of four types of credibility indicators on people’s intent to share news headlines with their friends on social media. We confirmed that credibility indicators can indeed decrease the propensity to share fake news. However, the impact of the indicators varied, with fact checking services being the most effective. We further found notable differences in responses to the indicators based on demographic and personal characteristics and social media usage frequency. Our findings have important implications for curbing the spread of misinformation via social media platforms.

Phil 5.1.20

Geez, it’s May! What a weird time

D20

Chatted with Zach. He’s bogged down in database issues, but I think it’s coming along

GPT-2 Agents

Upgrade TF, Torch, transformers, Nvidia, and CUDA on laptop
Set up input and output files
Pull char count of probe out and add that to the total generated
Try training on Moby Dick as per these instructions
- The following example fine-tunes GPT-2 on WikiText-2. We’re using the raw WikiText-2 (no tokens were replaced before the tokenization). The loss here is that of causal language modeling.
```
export TRAIN_FILE=/path/to/dataset/wiki.train.raw
export TEST_FILE=/path/to/dataset/wiki.test.raw

python run_language_modeling.py \
    --output_dir=output \
    --model_type=gpt2 \
    --model_name_or_path=gpt2 \
    --do_train \
    --train_data_file=$TRAIN_FILE \
    --do_eval \
    --eval_data_file=$TEST_FILE
```
  This takes about half an hour to train on a single K80 GPU and about one minute for the evaluation to run. It reaches a score of ~20 perplexity once fine-tuned on the dataset.

Ran with this command

python run_language_modeling.py --output_dir=output .\gpt2data\moby_dick_model --model_type=gpt2 --model_name_or_path=gpt2 --do_train --train_data_file=.\gptdata\moby_dick_train.txt --do_eval --eval_data_file=.\gptdata\moby_dick_test.txt

Which started the task correctly, but…

RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 8.00 GiB total capacity; 6.26 GiB already allocated; 77.55 MiB free; 6.31 GiB reserved in total by PyTorch)

Guess I’ll try running it on my work machine. If it runs there, I guess it’s time to upgrade my graphics card

That was not the problem! There is something going on with batch size. Added per_gpu_train_batch_size=1
Couldn’t use links. os.path.isfile() chokes

The model doesn’t seem to be saved? Looks like it is:

05/01/2020 09:43:49 - INFO - transformers.trainer -   Saving model checkpoint to output
05/01/2020 09:43:49 - INFO - transformers.configuration_utils -   Configuration saved in output\config.json
05/01/2020 09:43:50 - INFO - transformers.modeling_utils -   Model weights saved in output\pytorch_model.bin
05/01/2020 09:43:50 - INFO - __main__ -   *** Evaluate ***
05/01/2020 09:43:50 - INFO - transformers.trainer -   ***** Running Evaluation *****
05/01/2020 09:43:50 - INFO - transformers.trainer -     Num examples = 97
05/01/2020 09:43:50 - INFO - transformers.trainer -     Batch size = 16
Evaluation: 100%|██████████| 7/7 [00:06<00:00,  1.00it/s]
05/01/2020 09:43:57 - INFO - __main__ -   ***** Eval results *****
05/01/2020 09:43:57 - INFO - __main__ -     perplexity = 43.311306196182095

Found it. It defaults to the output directory in transformers/examples

To get this version, which is a PyTorch model, you have to add the ‘from_pt=True‘ argument:

model = TFGPT2LMHeadModel.from_pretrained("../data/moby_dick_model", pad_token_id=tokenizer.eos_token_id, from_pt=True)

And the results are great!

I enjoy walking with my cute dog:
	[0]: I enjoy walking with my cute dog, and then I like to take pictures! But, as for you, you will have to go all the way round for the proper weather! Here, I have some water in my belly! How am I
	[1]: I enjoy walking with my cute dog when I walk in the yard, and when we have been going in, I am always excited to try a little bit of the wildest stuff. I like to see my dogs do it. I like
	[2]: I enjoy walking with my cute dog because he has no fear of you leaving him alone. In that case, let me explain that I am a retired Sperm Whale in my Sperm Whale breeding herd. I was recently the leader of the

Far out in the uncharted backwaters of the unfashionable end:
	[0]: Far out in the uncharted backwaters of the unfashionable end of the Indian Ocean, you will see whales of many great variety. “Wherever they go, their mouths may be wide open, or they may be so packed
	[1]: Far out in the uncharted backwaters of the unfashionable end of the planet. On his way, it seemed that he was about to embark upon something which no mortal could have foreseen; it being the Cape Horn of the Pacific
	[2]: Far out in the uncharted backwaters of the unfashionable end. A curious discovery is made of the whale-whale. How much is he? I wonder how many sperm whales have there! I am still trying to get

It was a pleasure to burn. :
	[0]: It was a pleasure to burn. His teeth were the first thing to slide down to the side of his cheeks—a pointless thing—while my face stood there in this hideous position. It was my last, and only,
	[1]: It was a pleasure to burn. But, as the day wore on, another peculiarity was discovered in the method. When this first method was advanced to be used for preparing the best lye, it was found that it was, instead
	[2]: It was a pleasure to burn. “Sir, “aye, that’s true—” said I with a sort of exasperation. I then took one of the other boats and in a very similar

It was a bright cold day in April, and the clocks were striking thirteen. :
	[0]: It was a bright cold day in April, and the clocks were striking thirteen. It seemed that Captain Peleg had had just arrived, and was sitting in his Captain-Commander's cabin, and was trying to get up some time; but Pe
	[1]: It was a bright cold day in April, and the clocks were striking thirteen. One of us, who had been living in the tent for six days, still felt like the moon. I saw him. I saw him again. He looked just like
	[2]: It was a bright cold day in April, and the clocks were striking thirteen. “Good afternoon, sir, it was the very first Sabbath of the year, and the New Year is the first time the people of the world have an

Need to get the chess database and build a corpus. Working on a PGN to English translator. Doesn’t look toooooo bad

GOES

- Continue with GANs. Maybe explore 1D CNNs?
- The high-frequency run actually looks pretty good:
  
  I think it may be a better use of my time to assemble all the components for a first pass proof-of-concept

10:00 Meeting with Vadim and Isaac
- I walked through the whole controller architecture from the base class to the running version. Vadim will start implementing a Sim2 version using the base classes and the dictionary. Then we can work on writing to and reading from InfluxDB

Phil 4.30.20

Had some kind of power hiccup this morning and discovered that my computer was connected to the surge-suppressor part of the UPS. My box is now most unhappy as it recovers. On the plus side, computers recover from this sort of thing now.

D20

Fixed the neighbor list and was pleasantly surprised that it worked for the states

GPT-2 Agents

Set up input and output files
Pull char count of probe out and add that to the total generated

Start looking into finetuning

Here are all the Hugging Face examples

export TRAIN_FILE=/path/to/dataset/wiki.train.raw
export TEST_FILE=/path/to/dataset/wiki.test.raw

python run_language_modeling.py \
    --output_dir=output \
    --model_type=gpt2 \
    --model_name_or_path=gpt2 \
    --do_train \
    --train_data_file=$TRAIN_FILE \
    --do_eval \
    --eval_data_file=$TEST_FILE

run_language_modeling.py source in GitHub

Tried running without any arguments as a sanity check, and got this: huggingface ImportError: cannot import name ‘MODEL_WITH_LM_HEAD_MAPPING’. Turns out that it won’t work without PyTorch being installed. Everything seems to be working now:

usage: run_language_modeling.py [-h] [--model_name_or_path MODEL_NAME_OR_PATH]
                                [--model_type MODEL_TYPE]
                                [--config_name CONFIG_NAME]
                                [--tokenizer_name TOKENIZER_NAME]
                                [--cache_dir CACHE_DIR]
                                [--train_data_file TRAIN_DATA_FILE]
                                [--eval_data_file EVAL_DATA_FILE]
                                [--line_by_line] [--mlm]
                                [--mlm_probability MLM_PROBABILITY]
                                [--block_size BLOCK_SIZE] [--overwrite_cache]
                                --output_dir OUTPUT_DIR
                                [--overwrite_output_dir] [--do_train]
                                [--do_eval] [--do_predict]
                                [--evaluate_during_training]
                                [--per_gpu_train_batch_size PER_GPU_TRAIN_BATCH_SIZE]
                                [--per_gpu_eval_batch_size PER_GPU_EVAL_BATCH_SIZE]
                                [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
                                [--learning_rate LEARNING_RATE]
                                [--weight_decay WEIGHT_DECAY]
                                [--adam_epsilon ADAM_EPSILON]
                                [--max_grad_norm MAX_GRAD_NORM]
                                [--num_train_epochs NUM_TRAIN_EPOCHS]
                                [--max_steps MAX_STEPS]
                                [--warmup_steps WARMUP_STEPS]
                                [--logging_dir LOGGING_DIR]
                                [--logging_first_step]
                                [--logging_steps LOGGING_STEPS]
                                [--save_steps SAVE_STEPS]
                                [--save_total_limit SAVE_TOTAL_LIMIT]
                                [--no_cuda] [--seed SEED] [--fp16]
                                [--fp16_opt_level FP16_OPT_LEVEL]
                                [--local_rank LOCAL_RANK]
run_language_modeling.py: error: the following arguments are required: --output_dir

And I still haven’t broken my text generation code. Astounding!

Moby Dick from Gutenberg
Chess
Covid tweets

Here’s the cite:

@article{Wolf2019HuggingFacesTS,
  title={HuggingFace's Transformers: State-of-the-art Natural Language Processing},
  author={Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and R{\'e}mi Louf and Morgan Funtowicz and Jamie Brew},
  journal={ArXiv},
  year={2019},
  volume={abs/1910.03771}
}

GOES

Set up meeting with Isaac and Vadim for control
Continue with GAN
- Struggled with getting training to work for a while. I started by getting all the code to work, which included figuring out how the class labels worked (they just classify “real” vs “fake”). Then my results were terrible, basically noise. So I went back and parameterized the training and real data generation to try it on a smaller vector size. That seems to be working. Here’s the untrained model on a time series four elements long:
- And here’s the result after 10,000 epochs and a batch size of 64:
- That’s clearly not an accident. So progress!
- Played around with options based on this post and changed my Adam value from 0.01 to 0.001, and the output function from linear to tanh based on this random blog post. Better!
- I do not understand the loss/accuracy behavior though
  
  I think this is a good starting point! This is 16 points, and clearly the real loss function is still improving:
- Adding more variety of inputs:
- Tried adding layers. Nope, it generalized to a single sine wave
- Trying a bigger latent space of 16 dimensions up from 5:
- Splitting the difference and trying 8. Let’s see 5 again?
- Hmmm. I think I like the 16 better. Let’s go back to that with a batch size of 128 rather than 64. Better? I think?
- Let’s see what more samples do. Let’s try 100! Bad move. Let’s try 20, with a bigger random offset
- Ok, as a last thing for the day, I’m going to try more epochs. Going from 10,000 to 50,000:
- It definitely finds the best curve to forge. Have to think about that
Status report – done
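
A minimal sketch of the real-vs-fake labelling from the GAN notes above, in plain Python so it runs without TensorFlow. The batch size of 64 and the four-element series come from the notes. The sine-with-random-offset form of the real data, the half-and-half split, and all the function names are assumptions for illustration, not the code I actually ran.

```python
import math
import random

REAL_LABEL = 1  # discriminator target for samples drawn from the real data
FAKE_LABEL = 0  # discriminator target for generator output


def real_sample(vector_size, rng):
    """One 'real' series: a sine wave with a random phase offset (assumed form)."""
    offset = rng.uniform(0.0, 2.0 * math.pi)
    return [math.sin(offset + i * 2.0 * math.pi / vector_size)
            for i in range(vector_size)]


def fake_sample(vector_size, rng):
    """Stand-in for an untrained generator: uniform noise in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(vector_size)]


def discriminator_batch(batch_size, vector_size, rng):
    """Return (samples, labels): half real, half fake, shuffled together."""
    half = batch_size // 2
    rows = [(real_sample(vector_size, rng), REAL_LABEL) for _ in range(half)]
    rows += [(fake_sample(vector_size, rng), FAKE_LABEL)
             for _ in range(batch_size - half)]
    rng.shuffle(rows)
    samples = [row[0] for row in rows]
    labels = [row[1] for row in rows]
    return samples, labels


rng = random.Random(0)
samples, labels = discriminator_batch(batch_size=64, vector_size=4, rng=rng)
print(len(samples), len(samples[0]), sum(labels))
```

Shrinking `vector_size` is the same trick as in the notes: a four-element series makes it easy to eyeball whether the labels and shapes line up before scaling back up.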

Phil 4.29.20

D20

Waiting on maps

Adjust the neighbor list to look like this:

"United States of America": [
        "US",
        "Mexico",
        "Canada",
        "Cuba",
        "Honduras",
        "Dominican Republic",
        "Panama",
        "Colombia",
        "Ecuador",
        "Peru"
    ],

GPT-2 Agents

Trying this tutorial: How to generate text: using different decoding methods for language generation with Transformers. Very straightforward, with good examples that work!

Hooray! Installed and running!
Working with multiple inputs!

Examples:

I enjoy walking with my cute dog:
	[0]: I enjoy walking with my cute dog but also want to talk more about the dog experience. He wants to know how we feel and I'm sure he'll be impressed by our friendship! I've had him in our home from time to time and have
	[1]: I enjoy walking with my cute dog when I'm in town. His cute face really captures my life in a beautiful way. So much so that the dogs that have come before me feel very comfortable when I'm walking around. The dogs that I've
	[2]: I enjoy walking with my cute dog because she has no fear of people seeing her so they don't even think it's a threat, especially those of us who live near her in her area, where our dogs are raised to be the best, most

Far out in the uncharted backwaters of the unfashionable end:
	[0]: Far out in the uncharted backwaters of the unfashionable end of the world, as you wander through the barren wastes, I will tell you what happens there. The story is straightforward: you walk down a winding path up a hill
	[1]: Far out in the uncharted backwaters of the unfashionable end of the planet has taken a brave life. But it's in fact the one the world's most prolific land scientists have long wondered. Dr. Eric Shiek,
	[2]: Far out in the uncharted backwaters of the unfashionable end. The cold of night glints in the dark. The sun, hissing, is rising. The wind blows from the deep. You're in a room covered with water-

It was a pleasure to burn. :
	[0]: It was a pleasure to burn. "Ahaha. That's why I went to the hospital. It's still not been cleared by the state police, but I'm sure they will. They had me at my desk, and we were
	[1]: It was a pleasure to burn. Puertorrell: It seemed to me that this was something I had wanted to do since I was little. My mother had said that you should never let your father and mother do anything to him
	[2]: It was a pleasure to burn.

It was a bright cold day in April, and the clocks were striking thirteen. :
	[0]: It was a bright cold day in April, and the clocks were striking thirteen. It was a bad day in the capital. One of the clerks said the office had been closed for nearly an hour. The clerk pointed out that there had been a police
	[1]: It was a bright cold day in April, and the clocks were striking thirteen. One of us knew something about the clocks—the number that we were to see a thousand times was thirty-three and two, and I knew it was an alarm clock
	[2]: It was a bright cold day in April, and the clocks were striking thirteen. In the center of the chamber, an ice cube was placed in the ice and the ice cube was melted at the same time. There were three lines, one on top
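
The three different continuations per prompt come from sampling rather than greedy decoding. Here is a toy sketch of the top-k and top-p (nucleus) filtering that the tutorial describes, in plain Python over a made-up six-word vocabulary. The vocabulary, the logits, and the function names are all hypothetical, and this is a simplified illustration of the idea, not the Transformers implementation.

```python
import math
import random


def softmax(logits):
    """Convert raw scores to probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k, top_p):
    """Keep the top_k most likely tokens, renormalize, then keep the smallest
    prefix whose cumulative probability reaches top_p. Returns {index: prob}."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_total = sum(probs[i] for i in ranked)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i] / k_total
        if mass >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_token(logits, top_k, top_p, rng):
    """Draw one token index from the filtered distribution."""
    dist = filter_top_k_top_p(softmax(logits), top_k, top_p)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


vocab = ["dog", "cat", "walk", "park", "the", "zebra"]  # hypothetical vocabulary
logits = [2.0, 1.5, 1.0, 0.5, 0.0, -3.0]                # hypothetical scores
rng = random.Random(42)
dist = filter_top_k_top_p(softmax(logits), top_k=4, top_p=0.9)
picks = [vocab[sample_token(logits, top_k=4, top_p=0.9, rng=rng)] for _ in range(3)]
print(picks)
```

With `top_k=4` the two least likely words can never be drawn, which is the point: the tail of the distribution is cut off, but there is still enough randomness among the survivors to get a different continuation each time.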

Also interesting: The Current Best of Universal Word Embeddings and Sentence Embeddings

GOES

Build sequence 2 sequence GAN, or at least start
- Make real data generator – done
- Make generator input – done
- Make generator that outputs num_samples x vec_size
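
A shape-only sketch of the last two items, in plain Python so it runs without TensorFlow: latent noise goes in as num_samples x latent_dim and has to come out as num_samples x vec_size. The single random dense layer with a tanh output is a stand-in I made up for illustration, not the actual generator, and the sizes (20 samples, 5 latent dimensions, 16 points) are just example values.

```python
import math
import random


def latent_points(num_samples, latent_dim, rng):
    """Gaussian noise, one latent vector per sample: num_samples x latent_dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
            for _ in range(num_samples)]


def make_untrained_generator(latent_dim, vec_size, rng):
    """Hypothetical stand-in for the generator: one random dense layer + tanh."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(vec_size)]
               for _ in range(latent_dim)]

    def generate(latent_batch):
        out = []
        for z in latent_batch:
            # Each output element is a weighted sum of the latent vector.
            row = [math.tanh(sum(z[i] * weights[i][j] for i in range(latent_dim)))
                   for j in range(vec_size)]
            out.append(row)
        return out

    return generate


rng = random.Random(1)
num_samples, latent_dim, vec_size = 20, 5, 16
generator = make_untrained_generator(latent_dim, vec_size, rng)
fake = generator(latent_points(num_samples, latent_dim, rng))
print(len(fake), len(fake[0]))  # rows x columns of the generator output
```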

2:00 Meeting
- Went over status and gave kudos to Vadim
3:00 Meeting
- Discussed slide deck. T sent email to management to get info about the audience. We were told to not proceed
4:00 Meeting
- Went over the big data RFI. My involvement will be minimal, since it’s not algorithms, but infrastructure. Sounds like a submission though
Write Status for April
GVSETS paper deadline has been extended to June 1. Same template as before

Phil 4.28.20

ACSOS

Upload paper to Overleaf – done!

D20

Fix bug using this:

slope, intercept, r_value, p_value, std_err = stats.linregress(xsub, ysub)
# slope, intercept = np.polyfit(x, y, 1)
yn = np.polyval([slope, intercept], xsub)

steps = 0
if slope < 0:
    steps = abs(y[-1] / slope)

reg_x = []
reg_y = []
start = len(yl) - max_samples
yval = intercept + slope * start
for i in range(start, len(yl)-offset):
    reg_x.append(i)
    reg_y.append(yval)
    yval += slope

Anything else?

GPT-2 Agents

Install and test GPT-2 Client
Failed spectacularly. It depends on a lot of TF1.x items, like tensorflow.contrib.training. There is an issue request in.
Checked out the project to see if anything could be done. “Fixed” the contrib library, but that just exposed other things. Uninstalled.
Tried using the upgrade tool described here, which did absolutely nothing, as near as I can tell

GOES

Continue figuring out GANs
Here are results using 2 latent dimensions, a matching hint, a line hint, and no hint
Here are results using 5 latent dimensions, a matching hint, a line hint, and no hint
Meeting at 10:00 with Vadim and Isaac
- Wound up going over Isaac’s notes for Yaw Flip and learned a lot. He’s going to see if he can get the algorithm used for the maneuver. If so, we can build the control behavior around that. The goal is to minimize energy and indirectly fuel costs

Phil 4.27.20

Took the motorcycle for its weekly spin and rode past the BWI terminal. By far the most Zombie Apocalypse thing I’ve seen so far.

The repository contains an ongoing collection of tweets IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020.

D20

Reworked regression code to only use the last 14 days of data. It seems to take the slowing rate change into account better
That could be a nice interactive feature to add to the website. A js version of regression curve fitting is here.
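
A runnable sketch of the 14-day idea, assuming the approach is simply "fit a line to the most recent window and project it forward". It uses a hand-rolled least-squares fit as a stand-in for stats.linregress so it runs without SciPy. The function names and the example rates are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def recent_trend(values, max_samples=14):
    """Fit only the last max_samples points and, if the trend is falling,
    estimate how many steps until the last observed value reaches zero."""
    start = max(0, len(values) - max_samples)
    xs = list(range(start, len(values)))
    ys = values[start:]
    slope, intercept = fit_line(xs, ys)
    steps_to_zero = None
    if slope < 0:
        steps_to_zero = abs(ys[-1] / slope)
    return slope, intercept, steps_to_zero


# Hypothetical daily rates: a 20-day rise followed by a steady 14-day decline.
rates = [float(v) for v in range(10, 30)] + [30.0 - 2.0 * d for d in range(14)]
slope, intercept, steps = recent_trend(rates)
print(round(slope, 3), round(intercept, 3), steps)
```

Fitting the whole series here would average the rise and the decline together. Restricting the window to the last 14 points lets the fit follow the recent slowdown, which is the behavior noted above.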

ACSOS

Got Antonio’s revisions back and enbiggened the two chats for better readability

GPT-2 Agents

Going to try the GPT-2 Client and see how it works.
Whoops, needs TF 2.1. Upgraded that and the drivers – done

GOES

Step through the GAN code and look for ways of restricting the latent space to being near the simulation output
Here’s the GAN trying to fit a bit of a sine wave from the beginning of the day
And here’s the evolution of the GAN using hints and 5 latent dimensions from the end of the day:

And here are the accuracy outputs:

epoch = 399, real accuracy = 87.99999952316284%, fake accuracy = 37.99999952316284%
epoch = 799, real accuracy = 43.99999976158142%, fake accuracy = 56.99999928474426%
epoch = 1199, real accuracy = 81.00000023841858%, fake accuracy = 25.999999046325684%
epoch = 1599, real accuracy = 81.00000023841858%, fake accuracy = 40.99999964237213%
epoch = 1999, real accuracy = 87.99999952316284%, fake accuracy = 25.999999046325684%
epoch = 2399, real accuracy = 89.99999761581421%, fake accuracy = 20.000000298023224%
epoch = 2799, real accuracy = 87.00000047683716%, fake accuracy = 46.00000083446503%
epoch = 3199, real accuracy = 80.0000011920929%, fake accuracy = 47.999998927116394%
epoch = 3599, real accuracy = 76.99999809265137%, fake accuracy = 43.99999976158142%
epoch = 3999, real accuracy = 68.99999976158142%, fake accuracy = 30.000001192092896%
epoch = 4399, real accuracy = 75.0%, fake accuracy = 33.000001311302185%
epoch = 4799, real accuracy = 63.999998569488525%, fake accuracy = 28.00000011920929%
epoch = 5199, real accuracy = 50.0%, fake accuracy = 56.00000023841858%
epoch = 5599, real accuracy = 36.000001430511475%, fake accuracy = 56.00000023841858%
epoch = 5999, real accuracy = 49.000000953674316%, fake accuracy = 60.00000238418579%
epoch = 6399, real accuracy = 34.99999940395355%, fake accuracy = 58.99999737739563%
epoch = 6799, real accuracy = 70.99999785423279%, fake accuracy = 43.00000071525574%
epoch = 7199, real accuracy = 70.99999785423279%, fake accuracy = 30.000001192092896%
epoch = 7599, real accuracy = 47.999998927116394%, fake accuracy = 50.0%
epoch = 7999, real accuracy = 40.99999964237213%, fake accuracy = 52.99999713897705%
epoch = 8399, real accuracy = 23.000000417232513%, fake accuracy = 82.99999833106995%
epoch = 8799, real accuracy = 23.000000417232513%, fake accuracy = 75.0%
epoch = 9199, real accuracy = 31.00000023841858%, fake accuracy = 69.9999988079071%
epoch = 9599, real accuracy = 37.99999952316284%, fake accuracy = 68.00000071525574%
epoch = 9999, real accuracy = 23.000000417232513%, fake accuracy = 83.99999737739563%
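
For reading those numbers, here is a small sketch of how I take the two accuracies to be defined. This is an assumed convention, not pulled from the training code: real accuracy is the share of real samples the discriminator scores above 0.5, and fake accuracy is the share of generated samples it scores at or below 0.5. The scores are invented.

```python
def split_accuracy(real_scores, fake_scores, threshold=0.5):
    """Return (real_accuracy, fake_accuracy) as percentages.

    real_scores: discriminator outputs for real samples (target label 1)
    fake_scores: discriminator outputs for generated samples (target label 0)
    """
    real_hits = sum(1 for s in real_scores if s > threshold)
    fake_hits = sum(1 for s in fake_scores if s <= threshold)
    real_acc = 100.0 * real_hits / len(real_scores)
    fake_acc = 100.0 * fake_hits / len(fake_scores)
    return real_acc, fake_acc


# Hypothetical scores: the generator fools the discriminator on most fakes.
real_scores = [0.9, 0.6, 0.4, 0.3]
fake_scores = [0.7, 0.8, 0.6, 0.2]
real_acc, fake_acc = split_accuracy(real_scores, fake_scores)
print(f"real accuracy = {real_acc}%, fake accuracy = {fake_acc}%")
# prints: real accuracy = 50.0%, fake accuracy = 25.0%
```

Under that convention, both numbers sitting near 50% means the discriminator cannot separate real from generated. A low real accuracy paired with a high fake accuracy, as in the last few epochs above, means it is calling most things fake.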

Found a bug in the short-regression code. Need to roll in the fix

Here’s the working code:

slope, intercept, r_value, p_value, std_err = stats.linregress(xsub, ysub)
# slope, intercept = np.polyfit(x, y, 1)
yn = np.polyval([slope, intercept], xsub)

steps = 0
if slope < 0:
    steps = abs(y[-1] / slope)

reg_x = []
reg_y = []
start = len(yl) - max_samples
yval = intercept + slope * start
for i in range(start, len(yl)-offset):
    reg_x.append(i)
    reg_y.append(yval)
    yval += slope

Phil 4.24.20

It is very wet today

radar

Spent far too much time trying to upload a picture to the graduation site. It appears to be broken

D20

Changed the CONTROLLED days to < 2, since things are generally looking better

ACSOS

Sent the revised draft to Antonio

GPT-2 Agents

Found what appears to be just what I’m looking for. Searching on GitHub for GPT-2 tensorflow led me to this project, GPT-2 Client. I’ll give that a try and see how it works. The developer, Rishabh Anand, seems to have solid skills, so I have some hope that this could work. I do not have the energy to start this on a Friday and then switch to GANs for the rest of the day. Sunday looks like another wet one, so maybe then.

GOES

More looking at layers. This is Imagenet’s block3_conv3

Advanced CNNs
Start GANS? Yes!
- Got this version working. Now I need to step through it. But here are some plots of it learning:

- I had dreams about this, so I’m going to record the thinking here:
  - An MLP should be able to get from a simple simulation (square wave) to a more accurate(?) simulation (sine wave). The data set is various start points and frequency queries into the DB, with matching (“real”/noisy) as the test. My intuition is that the noise will be lost, so that’s the part we’re going to have to get back with the GAN.
  - So I think there is a two-step process
    - Train the initial NN that will produce the generalized solution
    - Use the output of the NN and the “real” data to train the GAN for fine tuning
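
A sketch of what the training pairs for the first step might look like, assuming the "simple simulation" is a square wave and the "real" data is a sine wave of the same start point and frequency with a little Gaussian noise added. The function names, noise level, step size, and frequency range are all made up for illustration.

```python
import math
import random


def square_wave(start, frequency, num_points, step=0.1):
    """Simple simulation: the sign of a sine wave, as +1.0 or -1.0."""
    return [1.0 if math.sin(frequency * (start + i * step)) >= 0.0 else -1.0
            for i in range(num_points)]


def noisy_sine(start, frequency, num_points, rng, noise=0.05, step=0.1):
    """'Real' data: a sine wave plus Gaussian noise (assumed noise model)."""
    return [math.sin(frequency * (start + i * step)) + rng.gauss(0.0, noise)
            for i in range(num_points)]


def make_pairs(num_pairs, num_points, rng):
    """Build (input, target) pairs over random start points and frequencies."""
    pairs = []
    for _ in range(num_pairs):
        start = rng.uniform(0.0, 2.0 * math.pi)
        frequency = rng.uniform(0.5, 2.0)
        pairs.append((square_wave(start, frequency, num_points),
                      noisy_sine(start, frequency, num_points, rng)))
    return pairs


rng = random.Random(7)
pairs = make_pairs(num_pairs=8, num_points=16, rng=rng)
print(len(pairs), len(pairs[0][0]), len(pairs[0][1]))
```

An MLP trained on pairs like these with a mean-squared-error loss would tend toward the clean sine and average the noise away, which is the gap the GAN step is supposed to fill.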

viztales

Dimension reduction, State, Orientation, and Speed
