Monthly Archives: April 2020

Phil 4.30.20

Had some kind of power hiccup this morning and discovered that my computer was connected to the surge-suppressor part of the UPS. My box is now most unhappy as it recovers. On the plus side, computers recover from this sort of thing now.


  • Fixed the neighbor list and was pleasantly surprised that it worked for the states


  • Set up input and output files
  • Pull char count of probe out and add that to the total generated
  • Start looking into finetuning
    • Here are all the huggingface examples
      • export TRAIN_FILE=/path/to/dataset/wiki.train.raw
        export TEST_FILE=/path/to/dataset/wiki.test.raw
        python \
            --output_dir=output \
            --model_type=gpt2 \
            --model_name_or_path=gpt2 \
            --do_train \
            --train_data_file=$TRAIN_FILE \
            --do_eval \
      • source in GitHub
      • Tried running without any arguments as a sanity check, and got this: huggingface ImportError: cannot import name ‘MODEL_WITH_LM_HEAD_MAPPING’. Turns out that it won’t work without PyTorch being installed. Everything seems to be working now:
        usage: [-h] [--model_name_or_path MODEL_NAME_OR_PATH]
                                        [--model_type MODEL_TYPE]
                                        [--config_name CONFIG_NAME]
                                        [--tokenizer_name TOKENIZER_NAME]
                                        [--cache_dir CACHE_DIR]
                                        [--train_data_file TRAIN_DATA_FILE]
                                        [--eval_data_file EVAL_DATA_FILE]
                                        [--line_by_line] [--mlm]
                                        [--mlm_probability MLM_PROBABILITY]
                                        [--block_size BLOCK_SIZE] [--overwrite_cache]
                                        --output_dir OUTPUT_DIR
                                        [--overwrite_output_dir] [--do_train]
                                        [--do_eval] [--do_predict]
                                        [--per_gpu_train_batch_size PER_GPU_TRAIN_BATCH_SIZE]
                                        [--per_gpu_eval_batch_size PER_GPU_EVAL_BATCH_SIZE]
                                        [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
                                        [--learning_rate LEARNING_RATE]
                                        [--weight_decay WEIGHT_DECAY]
                                        [--adam_epsilon ADAM_EPSILON]
                                        [--max_grad_norm MAX_GRAD_NORM]
                                        [--num_train_epochs NUM_TRAIN_EPOCHS]
                                        [--max_steps MAX_STEPS]
                                        [--warmup_steps WARMUP_STEPS]
                                        [--logging_dir LOGGING_DIR]
                                        [--logging_steps LOGGING_STEPS]
                                        [--save_steps SAVE_STEPS]
                                        [--save_total_limit SAVE_TOTAL_LIMIT]
                                        [--no_cuda] [--seed SEED] [--fp16]
                                        [--fp16_opt_level FP16_OPT_LEVEL]
                                        [--local_rank LOCAL_RANK]
        error: the following arguments are required: --output_dir

        And I still haven’t broken my text generation code. Astounding!

    • Moby Dick from Gutenberg
    • Chess
    • Covid tweets
    • Here’s the cite:
        title={HuggingFace's Transformers: State-of-the-art Natural Language Processing},
        author={Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and R'emi Louf and Morgan Funtowicz and Jamie Brew},


  • Set up meeting with Isaac and Vadim for control
  • Continue with GAN
    • Struggled with getting training to work for a while. I started by getting all the code to work, which included figuring out how the class labels worked (they just classify “real” vs “fake”). Then my results were terrible, basically noise. So I went back and parameterized the training and real data generation to try it on a smaller vector size. That seems to be working. Here’s the untrained model on a time series four elements long: Four_element_untrained
    • And here’s the result after 10,000 epochs and a batch size of 64: Four_element_trained
    • That’s clearly not an accident. So progress!
    • Played around with options based on this post and changed my Adam value from 0.01 to 0.001, and the output function from linear to tanh based on this random blog post. Better! Four_element_trained
    • I do not understand the loss/accuracy behavior though

      I think this is a good starting point! This is 16 points, and clearly the real loss function is still improving: Four_element_trainedacc_loss

    • Adding more variety of inputs: GAN_trained
    • Trying adding layers. Nope, it generalized to a single sin wave
    • Trying a bigger latent space of 16 dimensions up from 5: GAN_trained
    • Splitting the difference and trying 8. Let’s see 5 again? GAN_trained
    • Hmmm. I think I like the 16 better. Let’s go back to that with a batch size of 128 rather than 64. Better? I think?
    • Let’s see what more samples does. Let’s try 100! Bad move. Let’s try 20, with a bigger random offset GAN_trained
    • Ok, as a last thing for the day, I’m going to try more epochs. Going from 10,000 to 50,000:
    • It definitely finds the best curve to forge. Have to think about that
  • Status report – done

Phil 4.29.20


  • Waiting on maps
  • Adjust the neighbor list to look like this:
    "United States of America": [
            "Dominican Republic",


GPT-2 Agents

  • Trying this tutorial: How to generate text: using different decoding methods for language generation with Transformers. Very straightforward, with good examples that work!
    • Hooray! Installed and running!
    • Working with multiple inputs!
    • Examples:
      I enjoy walking with my cute dog:
      	[0]: I enjoy walking with my cute dog but also want to talk more about the dog experience. He wants to know how we feel and I'm sure he'll be impressed by our friendship! I've had him in our home from time to time and have
      	[1]: I enjoy walking with my cute dog when I'm in town. His cute face really captures my life in a beautiful way. So much so that the dogs that have come before me feel very comfortable when I'm walking around. The dogs that I've
      	[2]: I enjoy walking with my cute dog because she has no fear of people seeing her so they don't even think it's a threat, especially those of us who live near her in her area, where our dogs are raised to be the best, most
      Far out in the uncharted backwaters of the unfashionable end:
      	[0]: Far out in the uncharted backwaters of the unfashionable end of the world, as you wander through the barren wastes, I will tell you what happens there. The story is straightforward: you walk down a winding path up a hill
      	[1]: Far out in the uncharted backwaters of the unfashionable end of the planet has taken a brave life. But it's in fact the one the world's most prolific land scientists have long wondered. Dr. Eric Shiek,
      	[2]: Far out in the uncharted backwaters of the unfashionable end. The cold of night glints in the dark. The sun, hissing, is rising. The wind blows from the deep. You're in a room covered with water-
      It was a pleasure to burn. :
      	[0]: It was a pleasure to burn. "Ahaha. That's why I went to the hospital. It's still not been cleared by the state police, but I'm sure they will. They had me at my desk, and we were
      	[1]: It was a pleasure to burn. Puertorrell: It seemed to me that this was something I had wanted to do since I was little. My mother had said that you should never let your father and mother do anything to him
      	[2]: It was a pleasure to burn.
      It was a bright cold day in April, and the clocks were striking thirteen. :
      	[0]: It was a bright cold day in April, and the clocks were striking thirteen. It was a bad day in the capital. One of the clerks said the office had been closed for nearly an hour. The clerk pointed out that there had been a police
      	[1]: It was a bright cold day in April, and the clocks were striking thirteen. One of us knew something about the clocks—the number that we were to see a thousand times was thirty-three and two, and I knew it was an alarm clock
      	[2]: It was a bright cold day in April, and the clocks were striking thirteen. In the center of the chamber, an ice cube was placed in the ice and the ice cube was melted at the same time. There were three lines, one on top
  • Also interesting: The Current Best of Universal Word Embeddings and Sentence Embeddings
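The decoding tutorial linked above covers sampling with top-k and top-p (nucleus) filtering. As a rough illustration of what those two filters do to a next-token distribution, here is a minimal sketch in plain Python. It is not the Transformers implementation: the function names and the toy probabilities are made up for illustration, and a real model works on logits over the whole vocabulary.

```python
import random

def top_k_top_p_filter(probs, top_k=0, top_p=1.0):
    """Keep the top_k most likely tokens, then the smallest set of those
    whose cumulative probability reaches top_p, and renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]
    # renormalize after the top-k cut so top_p applies to what is left
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}

def sample_token(probs, rng):
    """Draw one token according to the (filtered) probabilities."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

# toy next-token distribution (made up for illustration)
next_token_probs = {"dog": 0.5, "cat": 0.25, "walk": 0.125, "the": 0.0625, "zebra": 0.0625}

rng = random.Random(0)
filtered = top_k_top_p_filter(next_token_probs, top_k=3, top_p=0.7)
samples = [sample_token(filtered, rng) for _ in range(5)]
print(filtered)
print(samples)
```

With these settings only "dog" and "cat" survive the filter, so the unlikely tokens can never be sampled, which is the point of both methods.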


  • Build sequence 2 sequence GAN, or at least start
    • Make real data generator – done
    • Make generator input – done
    • Make generator that outputs num_samples x vec_size
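The first two items above (real data and generator input) can be sketched without any ML library. This is a guess at the shape of the data, not the code used here: the function names, the sine-wave spacing, and the Gaussian latent points are assumptions for illustration. The label 1 marks "real" samples, matching a discriminator that only separates real from fake.

```python
import math
import random

def generate_real_samples(num_samples, vec_size, rng):
    """Return num_samples sine-wave snippets of length vec_size, each starting
    at a random phase offset, plus a list of labels (1 = real)."""
    step = 2.0 * math.pi / vec_size  # assumed spacing: one full cycle per vector
    samples = []
    for _ in range(num_samples):
        offset = rng.uniform(0.0, 2.0 * math.pi)
        samples.append([math.sin(offset + i * step) for i in range(vec_size)])
    labels = [1] * num_samples
    return samples, labels

def generate_latent_points(num_samples, latent_dim, rng):
    """Return num_samples random vectors of length latent_dim to feed the generator."""
    return [[rng.gauss(0.0, 1.0) for _ in range(latent_dim)] for _ in range(num_samples)]

rng = random.Random(1)
real, real_labels = generate_real_samples(num_samples=3, vec_size=4, rng=rng)
latent = generate_latent_points(num_samples=3, latent_dim=5, rng=rng)
print(len(real), len(real[0]), len(latent), len(latent[0]))
```

The generator network itself would then map each latent vector to a vec_size output, giving the num_samples x vec_size shape mentioned in the last item.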


  • 2:00 Meeting
    • Went over status and gave kudos to Vadim
  • 3:00 Meeting
    • Discussed slide deck. T sent email to management to get info about the audience. We were told to not proceed
  • 4:00 Meeting
    • Went over the big data RFI. My involvement will be minimal, since it’s not algorithms, but infrastructure. Sounds like a submission though
  • Write Status for April
  • GVSETS paper deadline has been extended to June 1. Same template as before

Phil 4.28.20


  • Upload paper to Overleaf – done!


  • Fix bug using this:
    slope, intercept, r_value, p_value, std_err = stats.linregress(xsub, ysub)
    # slope, intercept = np.polyfit(x, y, 1)
    yn = np.polyval([slope, intercept], xsub)
    steps = 0
    if slope < 0:
        steps = abs(y[-1] / slope)
    reg_x = []
    reg_y = []
    start = len(yl) - max_samples
    yval = intercept + slope * start
    for i in range(start, len(yl)-offset):
        yval += slope
  • Anything else?

GPT-2 Agents

  • Install and test GPT-2 Client
  • Failed spectacularly. It depends on a lot of TF1.x items. There is an issue request in.
  • Checked out the project to see if anything could be done. “Fixed” the contrib library, but that just exposed other things. Uninstalled.
  • Tried using the upgrade tool described here, which did absolutely nothing, as near as I can tell


  • Continue figuring out GANs
  • Here are results using 2 latent dimensions, a matching hint, a line hint, and no hint
  • Here are results using 5 latent dimensions, a matching hint, a line hint, and no hint
  • Meeting at 10:00 with Vadim and Isaac
    • Wound up going over Isaac’s notes for Yaw Flip and learned a lot. He’s going to see if he can get the algorithm used for the maneuver. If so, we can build the control behavior around that. The goal is to minimize energy and indirectly fuel costs


Phil 4.27.20

Took the motorcycle for its weekly spin and rode past the BWI terminal. By far the most Zombie Apocalypse thing I’ve seen so far.

The repository contains an ongoing collection of tweets IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020.


  • Reworked regression code to only use the last 14 days of data. It seems to take the slowing rate change into account better
  • That could be a nice interactive feature to add to the website. A js version of regression curve fitting is here.


  • Got Antonio’s revisions back and enbiggened the two chats for better readability

GPT-2 Agents

  • Going to try the GPT-2 Client and see how it works.
  • Whoops, needs TF 2.1. Upgraded that and the drivers – done


  • Step through the GAN code and look for ways of restricting the latent space to being near the simulation output
  • Here’s the GAN trying to fit a bit of a sin wave from the beginning of the day: GAN2Sin
  • And here’s the evolution of the GAN using hints and 5 latent dimensions from the end of the day: GAN_fit
  • And here are the accuracy outputs:
    epoch = 399, real accuracy = 87.99999952316284%, fake accuracy = 37.99999952316284%
    epoch = 799, real accuracy = 43.99999976158142%, fake accuracy = 56.99999928474426%
    epoch = 1199, real accuracy = 81.00000023841858%, fake accuracy = 25.999999046325684%
    epoch = 1599, real accuracy = 81.00000023841858%, fake accuracy = 40.99999964237213%
    epoch = 1999, real accuracy = 87.99999952316284%, fake accuracy = 25.999999046325684%
    epoch = 2399, real accuracy = 89.99999761581421%, fake accuracy = 20.000000298023224%
    epoch = 2799, real accuracy = 87.00000047683716%, fake accuracy = 46.00000083446503%
    epoch = 3199, real accuracy = 80.0000011920929%, fake accuracy = 47.999998927116394%
    epoch = 3599, real accuracy = 76.99999809265137%, fake accuracy = 43.99999976158142%
    epoch = 3999, real accuracy = 68.99999976158142%, fake accuracy = 30.000001192092896%
    epoch = 4399, real accuracy = 75.0%, fake accuracy = 33.000001311302185%
    epoch = 4799, real accuracy = 63.999998569488525%, fake accuracy = 28.00000011920929%
    epoch = 5199, real accuracy = 50.0%, fake accuracy = 56.00000023841858%
    epoch = 5599, real accuracy = 36.000001430511475%, fake accuracy = 56.00000023841858%
    epoch = 5999, real accuracy = 49.000000953674316%, fake accuracy = 60.00000238418579%
    epoch = 6399, real accuracy = 34.99999940395355%, fake accuracy = 58.99999737739563%
    epoch = 6799, real accuracy = 70.99999785423279%, fake accuracy = 43.00000071525574%
    epoch = 7199, real accuracy = 70.99999785423279%, fake accuracy = 30.000001192092896%
    epoch = 7599, real accuracy = 47.999998927116394%, fake accuracy = 50.0%
    epoch = 7999, real accuracy = 40.99999964237213%, fake accuracy = 52.99999713897705%
    epoch = 8399, real accuracy = 23.000000417232513%, fake accuracy = 82.99999833106995%
    epoch = 8799, real accuracy = 23.000000417232513%, fake accuracy = 75.0%
    epoch = 9199, real accuracy = 31.00000023841858%, fake accuracy = 69.9999988079071%
    epoch = 9599, real accuracy = 37.99999952316284%, fake accuracy = 68.00000071525574%
    epoch = 9999, real accuracy = 23.000000417232513%, fake accuracy = 83.99999737739563%
  • Found a bug in the short-regression code. Need to roll the fix into the regression code
  • Here’s the working code:
    slope, intercept, r_value, p_value, std_err = stats.linregress(xsub, ysub)
    # slope, intercept = np.polyfit(x, y, 1)
    yn = np.polyval([slope, intercept], xsub)
    steps = 0
    if slope < 0:
        steps = abs(y[-1] / slope)
    reg_x = []
    reg_y = []
    start = len(yl) - max_samples
    yval = intercept + slope * start
    for i in range(start, len(yl)-offset):
        yval += slope
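For reference, the idea of fitting only the most recent days and projecting how long a falling trend takes to reach zero can be sketched as follows. This is a self-contained illustration, not the code above: it swaps scipy's linregress for a hand-written least-squares fit so it runs with the standard library alone, and the names fit_line and project_recent and the daily values are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def project_recent(values, max_samples=14):
    """Fit a line to the last max_samples values and estimate the steps to zero."""
    start = max(0, len(values) - max_samples)
    xsub = list(range(start, len(values)))
    ysub = values[start:]
    slope, intercept = fit_line(xsub, ysub)
    # only a falling trend ever reaches zero
    steps_to_zero = abs(ysub[-1] / slope) if slope < 0 else 0
    reg_x = xsub
    reg_y = [intercept + slope * x for x in xsub]
    return slope, intercept, steps_to_zero, reg_x, reg_y

# made-up daily values that fall by 2 per day
daily = [100 - 2 * i for i in range(30)]
slope, intercept, steps, reg_x, reg_y = project_recent(daily, max_samples=14)
print(slope, intercept, steps)
```

Restricting the fit to the last 14 points means that older, faster-changing data no longer drags the slope around.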


Phil 4.24.20

It is very wet today


Spent far too much time trying to upload a picture to the graduation site. It appears to be broken


  • Changed the CONTROLLED days to < 2, since things are generally looking better


  • Sent the revised draft to Antonio

GPT-2 Agents

  • Found what appears to be just what I’m looking for. Searching on GitHub for GPT-2 tensorflow led me to this project, GPT-2 Client. I’ll give that a try and see how it works. The developer, Rishabh Anand, seems to have solid skills so I have some hope that this could work. I do not have the energy to start this on a Friday and then switch to GANs for the rest of the day. Sunday looks like another wet one, so maybe then.


More looking at layers. This is Imagenet’s block3_conv3: block_3_conv2

  • Advanced CNNs
  • Start GANS? Yes!
    • Got this version working. Now I need to step through it. But here are some plots of it learning:
    • I had dreams about this, so I’m going to record the thinking here:
      • An MLP should be able to get from a simple simulation (square wave) to a more accurate(?) simulation (sin wave). The data set is various start points and frequency queries into the DB, with matching (“real”/noisy) as the test. My intuition is that the noise will be lost, so that’s the part we’re going to have to get back with the GAN.
      • So I think there is a two-step process
        • Train the initial NN that will produce the generalized solution
        • Use the output of the NN and the “real” data to train the GAN for fine tuning

Phil 4.23.20

Transformer Architecture: The Positional Encoding

  • In this article, I don’t plan to explain its architecture in depth as there are currently several great tutorials on this topic (here, here, and here), but alternatively, I want to discuss one specific part of the transformer’s architecture – the positional encoding.


  • Add centroids for states – done
  • Return the number of neighbors as an argument – done
  • Chatted with Aaron and Zach. More desire to continue than abandon


  • More revisions. Swap steps for discussion and future work


    • IRS proposal went in yesterday
    • Continue with GANs
    • Using the VGG model now with much better results. Also figured out how to load weights and read the probabilities in the output layer: vgg
    • Same thing using the pre-trained model from Keras:
      from tensorflow.keras.applications.vgg16 import VGG16
      # prebuild model with pre-trained weights on imagenet
      model = VGG16(weights='imagenet', include_top=True)
      model.compile(optimizer='sgd', loss='categorical_crossentropy')


    • Trying to visualize a layer using this code. And using that code as a starting point, I had to explore how to slice up the tensors in the right way. A CNN layer has a set of “filters” that contain a square set of pixels. The data is stored as an array of pixels at each x, y, coordinate, so I had to figure out how to get one image at a time. Here’s my toy:
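      import numpy as np
      import matplotlib.pyplot as plt
      n_rows = 4
      n_cols = 8
      depth = 4
      my_list = []
      for r in range(1, n_rows):
          row = []
          for c in range(1, n_cols):
              cell = []
              for d in range(depth):
                  # assumed fill value (this line is missing in the original);
                  # any number that varies with r, c, and d will do
                  cell.append(r * c * (d + 1))
              row.append(cell)
          my_list.append(row)
      nl = np.array(my_list)  # shape is (rows, cols, depth)
      for d in range(depth):
          # slice out one depth layer at a time, like one filter's image
          print("\nlayer {} = \n{}".format(d, nl[:, :, d]))
          plt.imshow(nl[:, :, d], aspect='auto', cmap='plasma')
          plt.show()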
    • This gets features from a cat image at one of the pooling layers. The color map is completely arbitrary:
      # get the features from this block
      features = model.predict(x)
      farray = np.array(features[0])
      print("{}".format(farray[:, :, 0]))
      for d in range(4):
         plt.imshow(farray[:, :, d], aspect='auto', cmap='plasma')
    • But we get some cool pix!

Phil 4.22.20

  • Amsterdam, 24 April 2020​
  • This workshop aims to bring together researchers and practitioners from the emerging fields of Graph Representation Learning and Geometric Deep Learning. The workshop will feature invited talks and a poster session. There will be ample opportunity for discussion and networking.​
  • Invited talks will be live-streamed on YouTube:
  • Looking for an online seminar that presents the latest advances in reinforcement learning theory? You just found it! We aim to bring you a virtual seminar (approximately) every Tuesday at 5pm UTC featuring the latest work in theoretical reinforcement learning.


  • Added P-threshold to json file. I’m concerned that everyone is too busy to participate any more. Aaron hasn’t even asked about the project since he got better and is complaining about how overworked he is. Zach seems to be equally busy. If no one steps up by the end of the week, I think it’s time to either take over the project entirely or shut it down.


  • Started working on Antonio’s changes
  • Changed the MappApp so that the trajectory lines are blue


  • Finish CNN chapter
  • Enable Tensorflow profiling
    • Installed the plugin: pip install tensorboard_plugin_profile
    • Updated setup_tensorboard():
      import os
      from typing import List
      import tensorflow as tf

      def setup_tensorboard(dir_str: str, windows_slashes:bool = True) -> List:
          if windows_slashes:
              dir_str = dir_str.replace("/", "\\")
              print("no file {} at {}".format(dir_str, os.getcwd()))
          # use TensorBoard, princess Aurora!
          # profile_batch = '500,510' profiles batches 500 through 510
          callbacks = [tf.keras.callbacks.TensorBoard(log_dir=dir_str, profile_batch = '500,510')]
          return callbacks
  • Huh. Looks like scipy.misc.imresize() and scipy.misc.imread() are both deprecated and out of the library. Trying opencv
    • pip install opencv-python
    • Here’s how I did it, with some debugging to verify that everything was working correctly thrown in:
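      import cv2
      import numpy as np
      img_names = ['cat.jpg', 'steam-locomotive.jpg']
      img_list = []
      for name in img_names:
          img = cv2.imread(name)
          res = np.array(cv2.resize(img, dsize=(32, 32), interpolation=cv2.INTER_CUBIC))
          # debugging: write the resized image back out to check it by eye
          cv2.imwrite(name.replace(".jpg","_32x32.jpg"), res)
          img_list.append(res)
      # swap the row and column axes of each image, then scale pixels to 0..1
      imgs = np.transpose(img_list, (0, 2, 1, 3))
      imgs = np.array(imgs) / 255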
  • This forced me to go down a transpose() in multiple dimensions rabbit hole that’s worth documenting. First, here’s code that takes some tiny images in an array and transposes them:
    import numpy as np
    img_list = [
        # image 1
        [[[10, 20, 30],
          [11, 21, 31],
          [12, 22, 32],
          [13, 23, 33]],
         [[255, 255, 255],
          [48, 45, 58],
          [101, 150, 205],
          [255, 255, 255]],
         [[255, 255, 255],
          [43, 56, 75],
          [77, 110, 157],
          [255, 255, 255]],
         [[255, 255, 255],
          [236, 236, 238],
          [76, 104, 139],
          [255, 255, 255]]],
        # image 2
        [[[100, 200, 300],
          [101, 201, 301],
          [102, 202, 302],
          [103, 203, 303]],
         [[159, 146, 145],
          [89, 74, 76],
          [207, 207, 210],
          [212, 203, 203]],
         [[145, 155, 164],
          [52, 40, 36],
          [166, 160, 163],
          [136, 132, 134]],
         [[61, 56, 60],
          [36, 32, 35],
          [202, 195, 195],
          [172, 165, 177]]]]
    np_imgs = np.array(img_list)
    print("np_imgs shape = {}".format(np_imgs.shape))
    imgs = np.transpose(img_list, (0, 2, 1, 3))
    print("imgs shape = {}".format(imgs.shape))
    #imgs = np.array(imgs) / 255
    print("pix 0: \n{}".format(np_imgs[0]))
    print("transposed pix 0: \n{}".format(imgs[0]))
    print("pix 1: \n{}".format(np_imgs[1]))
    print("transposed pix 1: \n{}".format(imgs[1]))
  • So, this is a four-dimensional array with a shape of (2, 4, 4, 3). What we want to do is swap the rows and columns of each image (the inner 4, 4) by transposing them. Strictly speaking, that mirrors each image across its diagonal; a true 90 degree rotation would also need a flip. The way to understand Numpy’s transpose is that it reorders the axes. The trick is understanding how.
  • For this matrix, applying a transpose that does nothing means writing this:
    imgs = np.transpose(img_list, (0, 1, 2, 3))
  • Think of it as an identity transpose. What we want to do is reverse the order of the inner 4, 4, which we do like this:
    imgs = np.transpose(img_list, (0, 2, 1, 3))
  • That’s it! Now the second “4” will be transposed with the first “4”. You can do this with any of the axes. So
    imgs = np.transpose(img_list, (3, 2, 1, 0))
  • Reverses everything!
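A smaller array makes the axis bookkeeping easier to check by hand. The sketch below uses two made-up 2x3 single-channel "images" and shows the identity order, the row/column swap, and the full reversal, along with how the swap relates to np.rot90.

```python
import numpy as np

# two tiny 2x3 "images" with 1 channel, so every value is easy to track
imgs = np.arange(2 * 2 * 3 * 1).reshape(2, 2, 3, 1)

identity = np.transpose(imgs, (0, 1, 2, 3))       # same axis order: nothing changes
swapped = np.transpose(imgs, (0, 2, 1, 3))        # swap the row and column axes
reversed_axes = np.transpose(imgs, (3, 2, 1, 0))  # reverse every axis

print(imgs.shape, swapped.shape, reversed_axes.shape)

# element [n, r, c, ch] of the original is element [n, c, r, ch] after the swap
print(imgs[1, 0, 2, 0] == swapped[1, 2, 0, 0])

# a transpose mirrors each image across its diagonal;
# flipping the new row axis as well gives a 90 degree rotation
rotated = np.rot90(imgs, k=1, axes=(1, 2))
print(np.array_equal(rotated, swapped[:, ::-1, :, :]))
```

The tuple passed to np.transpose says where each output axis comes from, so (0, 2, 1, 3) leaves the image index and the channels alone and only exchanges rows with columns.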
  • Ok, so things are working, but the results are crap. Not really worrying about it for now because it’s CIFAR and I always have this problem:
    ./images\airplane.jpg = [8] ship
    ./images\automobile.jpg = [0] airplane 
    ./images\bird.jpg = [4] deer
    ./images\cat.jpg = [0] airplane 
    ./images\cat2.jpg = [6] frog
    ./images\cat3.jpg = [8] ship
    ./images\deer.jpg = [8] ship
    ./images\dog.jpg = [2] bird
    ./images\horse.jpg = [8] ship
    ./images\ship.jpg = [0] airplane 
    ./images\steam-locomotive.jpg = [2] bird
    ./images\truck.jpg = [3] cat
    [8 0 4 0 6 8 8 2 8 0 2 3]
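The bracketed numbers in that output are CIFAR-10 class indices. A short sketch of turning the index array back into names and counting hits follows. The expected labels are read off the file names, and the steam locomotive is left as None since it has no CIFAR-10 class of its own.

```python
# CIFAR-10 class names in index order
CIFAR10_LABELS = ["airplane", "automobile", "bird", "cat", "deer",
                  "dog", "frog", "horse", "ship", "truck"]

# predicted class indices copied from the output above
predicted = [8, 0, 4, 0, 6, 8, 8, 2, 8, 0, 2, 3]

# expected names taken from the image file names (None = no matching class)
expected = ["airplane", "automobile", "bird", "cat", "cat", "cat",
            "deer", "dog", "horse", "ship", None, "truck"]

names = [CIFAR10_LABELS[i] for i in predicted]
correct = sum(1 for n, e in zip(names, expected) if n == e)
print(names)
print("{} of {} correct".format(correct, len(predicted)))
```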


  • Meeting

Phil 4.21.20


  • Add P-threshold value to the json meta values. Items lower than this should be pale colors with black text

ML Seminar

  • finish reading paper, put together some quick slides – Done! Had to turn one table into charts
  • Good discussion. It makes me think that the way to have agents play games is to learn on text up to the latest point in the game as probe, and then add the question to the end of the probe, something like “<game text> + <character name>”. Have to try that with some novel text and Talk to Transformer. Here’s an example from the beginning of O Henry’s The Skylight Room. The text begins: First Mrs. Parker would show you the double parlours. You would not dare to interrupt her description of their advantages and of the merits of the gentleman who had occupied them for eight years. Then you would manage to stammer forth the confession that you were neither a doctor nor a dentist. Mrs. Parker’s
    • O Henry: manner of receiving the admission was such that you could never afterward entertain the same feeling toward your parents, who had neglected to train you up in one of the professions that fitted Mrs. Parker’s parlours.
    • TTT 1: eyes would turn on you with the ferocity of her surgeon’s. “If I see a cripple with no teeth, I get him to be my apprentice, and he stays a month. I put him to work in a laundry. I give him a salary of seven hundred pounds a year, with a board and everything to live upon.”
    • TTT 2: memory is rather dim, and she was not accustomed to be questioned directly; and she was anxious to see that you could remember as little as possible. You felt sure that by telling a story from childhood that had so often caught your fancy, you would escape this forced duty. So you went in, decided to visit the fair each evening, and spent an hour or two putting it off.
    • TTT 3: vicarage would be a long, square room overlooking a grassy acre. Not less than four hundred acres were required to cover the floor of the park with his striking landscape. The grounds were monotonous, both in their beauty and in their repetition. To name a single flower would be so pedestrian an exercise that I am surprised that any of my country neighbours ever took that course. 
  • So there is something there, but working out the right probe is still an issue.


  • CNNs
  • The frontier of simulation-based inference
    • Many domains of science have developed complex simulations to describe phenomena of interest. While these simulations provide high-fidelity models, they are poorly suited for inference and lead to challenging inverse problems. We review the rapidly developing field of simulation-based inference and identify the forces giving new momentum to the field. Finally, we describe how the frontier is expanding so that a broad audience can appreciate the profound change these developments may have on science.

Phil 4.20.20


  • Reading the Distill article on Gaussian processes (highlighted page here)
  • Copy over neural-tangents code from notebook to IDE
  • Working on regression
  • Ran into a problem with Tensorboard
    Traceback (most recent call last):
      File "d:\program files\python37\lib\", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "d:\program files\python37\lib\", line 85, in _run_code
        exec(code, run_globals)
      File "D:\Program Files\Python37\Scripts\tensorboard.exe\", line 7, in 
      File "d:\program files\python37\lib\site-packages\tensorboard\", line 75, in run_main, flags_parser=tensorboard.configure)
      File "d:\program files\python37\lib\site-packages\absl\", line 299, in run
        _run_main(main, args)
      File "d:\program files\python37\lib\site-packages\absl\", line 250, in _run_main
      File "d:\program files\python37\lib\site-packages\tensorboard\", line 289, in main
        return runner(self.flags) or 0
      File "d:\program files\python37\lib\site-packages\tensorboard\", line 305, in _run_serve_subcommand
        server = self._make_server()
      File "d:\program files\python37\lib\site-packages\tensorboard\", line 409, in _make_server
        self.flags, self.plugin_loaders, self.assets_zip_provider
      File "d:\program files\python37\lib\site-packages\tensorboard\backend\", line 183, in standard_tensorboard_wsgi
        flags, plugin_loaders, data_provider, assets_zip_provider, multiplexer
      File "d:\program files\python37\lib\site-packages\tensorboard\backend\", line 272, in TensorBoardWSGIApp
        tbplugins, flags.path_prefix, data_provider, experimental_plugins
      File "d:\program files\python37\lib\site-packages\tensorboard\backend\", line 345, in __init__
        "Duplicate plugins for name %s" % plugin.plugin_name
    ValueError: Duplicate plugins for name projector
  • After poking around a bit online with the “Duplicate plugins for name %s” % plugin.plugin_name ValueError: Duplicate plugins for name projector, I found this diagnostic, which basically asked me to reinstall everything*. That didn’t work, so I went into the Python37\Lib\site-packages and deleted by hand. Tensorboard now runs, but now I need to upgrade my cuda so that I have cudart64_101.dll
    • Installed the minimum set of items from the Nvidia Package Launcher (cuda_10.1.105_418.96_win10.exe)
    • Installed the cuDNN drivers from here:
    • The regular (e.g. MNIST) demos work, but when I try the distribution code I get this error: tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op ‘NcclAllReduce’. It turns out that there are only two viable MirroredStrategy operations for Windows, and the default is not one of them. These are the valid calls:
      distribution = tf.distribute.MirroredStrategy(cross_device_ops=tf.distribute.ReductionToOneDevice())
      distribution = tf.distribute.MirroredStrategy(cross_device_ops=tf.distribute.HierarchicalCopyAllReduce())
    • And this call is not
      # distribution = tf.distribute.MirroredStrategy(cross_device_ops=tf.distribute.NcclAllReduce()) # <-- not valid for Windows
  • Funny thing. After reinstalling and getting everything to work, I tried the diagnostic again. It seems it always says to reinstall everything
  • And Tensorboard is working! Here’s the call that puts data in the directory:
    linear_est = tf.estimator.LinearRegressor(feature_columns=feature_columns, model_dir = 'logs/boston/')
  • And when launched on the command line pointing at the same directory:
    D:\Development\Tutorials\Deep Learning with TensorFlow 2 and Keras\Chapter 3>tensorboard --logdir=.\logs\boston
    2020-04-20 11:36:42.999208: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library cudart64_101.dll
    W0420 11:36:46.005735 18544] Found more than one graph event per run, or there was a metagraph containing a graph_def, as well as one or more graph events.  Overwriting the graph with the newest event.
    W0420 11:36:46.006743 18544] Found more than one metagraph event per run. Overwriting the metagraph with the newest event.
    Serving TensorBoard on localhost; to expose to the network, use a proxy or pass --bind_all
    TensorBoard 2.1.1 at http://localhost:6006/ (Press CTRL+C to quit)
  • I got this! tensoboard
  • Of course, we’re not done yet. When attempting to use the Keras callback, I get the following error: tensorflow.python.eager.profiler.ProfilerNotRunningError: Cannot stop profiling. No profiler is running. It turns out that you have to specify the log folder like this
      • command line:
        tensorboard --logdir=.\logs
      • in code:
        logpath = '.\\logs'



  • That seems to be working! RunningTBNN
  • Finished regression chapter


  • Submitted RFI response for review


  • Got Antonio’s comments back


  • Need to work on the math to find second bumps
    • If the rate has been < x% (maybe 2.5%), calculate an offset that leaves a value of 100 for each day. When the rate jumps more than y% (e.g. 100 – 120 = 20%), freeze that number until the rate settles down again and repeat the process
    • Change the number of samples to be the last x days
  • Work with Zach to get maps up?

ML seminar

Phil 4.19.20

This is interesting: Online Town

  • Online Town is a video-calling space that lets multiple people hold separate conversations in parallel. It lets you walk in, out and around those conversations just as easily as you would in real life.

More JAX and infinite-width networks. Get the code from the notebook and get it working in the IDE

Phil 4.18.20

Cross-Platform State Propaganda: Russian Trolls on Twitter and YouTube during the 2016 U.S. Presidential Election

  • This paper investigates online propaganda strategies of the Internet Research Agency (IRA)—Russian “trolls”—during the 2016 U.S. presidential election. We assess claims that the IRA sought either to (1) support Donald Trump or (2) sow discord among the U.S. public by analyzing hyperlinks contained in 108,781 IRA tweets. Our results show that although IRA accounts promoted links to both sides of the ideological spectrum, “conservative” trolls were more active than “liberal” ones. The IRA also shared content across social media platforms, particularly YouTube—the second-most linked destination among IRA tweets. Although overall news content shared by trolls leaned moderate to conservative, we find troll accounts on both sides of the ideological spectrum, and these accounts maintain their political alignment. Links to YouTube videos were decidedly conservative, however. While mixed, this evidence is consistent with the IRA’s supporting the Republican campaign, but the IRA’s strategy was multifaceted, with an ideological division of labor among accounts. We contextualize these results as consistent with a pre-propaganda strategy. This work demonstrates the need to view political communication in the context of the broader media ecology, as governments exploit the interconnected information ecosystem to pursue covert propaganda strategies

JAX Paper


  • Get centroids working – done!
    • Fixed names
    • For each country
  • Work on a “score” that looks at the projections for countries with larger(?) populations where the days to zero is less than 15. Do a distribution and then score

ML Seminar

  • Started to look at the neural tangents library.
  • Installed
  • Did a first pass through the Colab notebook. Need to put this in my IDE

Phil 4.17.20

Can You Beat COVID-19 Without a Lockdown? Sweden Is Trying

I dug into the predictions that we generate. Comparing Finland, Norway, and Sweden, it looks like something that Sweden did could result in about 2,600 people dying who don’t have to:




  • IRS proposal – done!
  • A better snippet: the best way to cheat on taxes is  to deliberately lie to the IRS about what you earned over a year, what you spent over a year, and the ways you would fill out those forms. This is where “time of year” really comes into play. The IRS assumes you worked on April 15 through the 15th of the following year in order to report and pay taxes on your actual income from April 15 through the following year. I’ve put some pictures and thoughts below. There are some really great readers who have put some excellent guides and resources out there on this topic. If you have any additional questions, please feel free to leave a comment below and I will do my best to answer them.
  • Another good snippet: The best way to cheat on taxes is  to set up an LLC or other tax-sheltered company that makes up for your sloth in paying business taxes. By doing this, you can deduct the business expenses and pay your taxes at a much lower tax rate, while also getting a tax refund. So, for example, if your net operating income for 2014 was $5,000 and you think you should owe about $2,000 in taxes for 2015, I suggest you set up a  S-Corporation   for 2015 that only owes $500 in taxes. Then, you can send the IRS a check for the difference between the $2,000 difference you owe them and the $5,000 net operating income for 2015.


  • Finish first pass? Done! And sent to Antonio!


Shortcut Learning in Deep Neural Networks

  • Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today’s machine intelligence. Numerous success stories have rapidly spread all over science, industry and society, but its limitations have only recently come into focus. In this perspective we seek to distil how many of deep learning’s problem can be seen as different symptoms of the same underlying problem: shortcut learning. Shortcuts are decision rules that perform well on standard benchmarks but fail to transfer to more challenging testing conditions, such as real-world scenarios. Related issues are known in Comparative Psychology, Education and Linguistics, suggesting that shortcut learning may be a common characteristic of learning systems, biological and artificial alike. Based on these observations, we develop a set of recommendations for model interpretation and benchmarking, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications.

Phil 4.16.20

Fix siding!

SageMath. More on SageTeX here


  • Playing around with something to indicate the linear fit to the data. Trying P value
  • Updated UI code so that the P value will display on the next build
  • Hopefully we try the world map code today?
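As a rough illustration of putting a P value on a linear fit, here is a minimal sketch that uses a permutation test on the least-squares slope. This is a swapped-in technique, chosen because it needs only the standard library. It is not necessarily how the P value in the UI is computed, and `slope`, `permutation_p_value`, and the sample numbers are made up for the example.

```python
import random


def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Two-sided permutation P value for the slope (hypothetical helper)."""
    rng = random.Random(seed)
    observed = abs(slope(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        # Shuffling ys breaks any real relationship with xs
        rng.shuffle(shuffled)
        if abs(slope(xs, shuffled)) >= observed:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero
    return (hits + 1) / (trials + 1)


# Made-up data: a steady upward trend with a small alternating wobble
days = list(range(10))
trend = [2.0 * d + 1.0 + (0.5 if d % 2 else -0.5) for d in days]
p_trend = permutation_p_value(days, trend)
```

A small value means that shuffled data almost never produces a slope as steep as the observed one, which is the sense in which the fit is unlikely to be an accident of ordering.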



  • Learning more about multiple inputs to embedding and had to get the keras.utils.plot_model working, which failed with this error: ImportError: Failed to import pydot. You must install pydot and graphviz for `pydotprint` to work. So I pip installed both, and had the same problem.
  • Had problems running the distribution samples. Upgraded tf to version 2.1. No problems and better performance
  • Finished chapter 2


  • Struggled with picture placement. Moving on.
  • Finished first pass. I need to add more ABM text, but I’m down to 10 pages plus references!

Multi-input and multi-output models

  • Here’s a good use case for the functional API: models with multiple inputs and outputs. The functional API makes it easy to manipulate a large number of intertwined datastreams. Let’s consider the following model. We seek to predict how many retweets and likes a news headline will receive on Twitter. The main input to the model will be the headline itself, as a sequence of words, but to spice things up, our model will also have an auxiliary input, receiving extra data such as the time of day when the headline was posted, etc. The model will also be supervised via two loss functions. Using the main loss function earlier in a model is a good regularization mechanism for deep models.


Phil 4.15.20

Fix siding from wind!


  • Talked to Aaron about taking a derivative of the regression slope to see what it looks like. There may be common features in the pattern of rates, or of the slopes of the regressions changing over time
  • Still worried about countries that don’t report well. I’d like to be able to use rates from neighboring countries as some kind of check
  • Got the first pass on a world map json file done
  • Spread of SARS-CoV-2 in the Icelandic Population
    • As of April 4, a total of 1221 of 9199 persons (13.3%) who were recruited for targeted testing had positive results for infection with SARS-CoV-2. Of those tested in the general population, 87 (0.8%) in the open-invitation screening and 13 (0.6%) in the random-population screening tested positive for the virus. In total, 6% of the population was screened. Most persons in the targeted-testing group who received positive tests early in the study had recently traveled internationally, in contrast to those who tested positive later in the study. Children under 10 years of age were less likely to receive a positive result than were persons 10 years of age or older, with percentages of 6.7% and 13.7%, respectively, for targeted testing; in the population screening, no child under 10 years of age had a positive result, as compared with 0.8% of those 10 years of age or older. Fewer females than males received positive results both in targeted testing (11.0% vs. 16.7%) and in population screening (0.6% vs. 0.9%). The haplotypes of the sequenced SARS-CoV-2 viruses were diverse and changed over time. The percentage of infected participants that was determined through population screening remained stable for the 20-day duration of screening.


  • Finished first pass of the lit review. Now at 13 pages


  • Start looking at GANs. Also work on fixing Optevolver for multiple CPUs
    • Starting Deep Learning with TensorFlow 2 and Keras: Regression, ConvNets, GANs, RNNs, NLP, and more with TensorFlow 2 and the Keras API, 2nd Edition. Chapter six is GANs, which is what I’m interested in, but I’m ok with getting some review in first.
    • Working on embeddings with the IMDB sentiment analysis project. It’s the first time I’ve seen an embedding layer which is 1) Cool, and 2) Something to play with. I’d noticed when I was working with Word2Vec for my research that embeddings didn’t seem to change shape much as a function of the number of dimensions. It seemed like a lot of information was being kept at very low dimensions, like three, rather than the more accepted 128 or so:


    • Well, this example gave me an opportunity to test that with some accuracy numbers. Here’s what I get:


    • That is super interesting. It basically means that model building, testing, and visualization can happen at low dimensions. That makes everything faster, and scaling the dimensions back up would likely give about a 10% improvement as one of the last steps.
    • Continuing with book.
  • Wrote up a response to Mike M’s questions about the white paper. Probably pointless, and has pretty much wasted my afternoon. And it was pointless! Now what?
  • Slides for John?

Phil 4.14.20

Fix siding from wind!


  • I want to try taking a second derivative of the rates to see what it looks like. There may be common features in the pattern of rates, or of the slopes of the regressions changing over time
  • I’m also getting worried about countries that don’t report well. I’d like to be able to use rates from neighboring countries as some kind of check
  • Work with Zach on cleanup and map integration?
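A minimal sketch of the idea in the first bullet above: fit a least-squares line to a sliding window of the rate series, then difference the resulting slopes to see how they change over time, which approximates a second derivative. The window length, the function names, and the sample series are assumptions for illustration only.

```python
def window_slope(ys):
    """Least-squares slope of ys against day indices 0, 1, 2, ..."""
    n = len(ys)
    mx, my = (n - 1) / 2, sum(ys) / n
    sxy = sum((i - mx) * (y - my) for i, y in enumerate(ys))
    sxx = sum((i - mx) ** 2 for i in range(n))
    return sxy / sxx


def rolling_slopes(series, window=5):
    """Slope of the regression line over each consecutive window."""
    return [window_slope(series[i:i + window])
            for i in range(len(series) - window + 1)]


def differences(values):
    """First difference: how much each value changed from the one before."""
    return [b - a for a, b in zip(values, values[1:])]


# Made-up series with quadratic growth, so the slopes rise by a constant step
rates = [float(d * d) for d in range(10)]
slopes = rolling_slopes(rates, window=5)
slope_changes = differences(slopes)
```

On real data the slope changes would be noisy, so a longer window or a second smoothing pass is probably needed before comparing patterns between countries.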

COVID Twitter

  • Finished ingesting the new data. It took almost 24 hours


  • Finished first pass of the introduction. Still at 14 pages