Category Archives: Simulation

Phil 8.13.20

Ride through the park today and ask about pavilion rental – done

Māori Pronunciation

Iñupiaq (Inupiatun)

GPT-2 Agents

  • Rewrite intro, including the finding that these texts seem to be matched in some way. Done
  • Uploaded the new version to ArXiv. Should be live by tomorrow
  • Read Language Models as Knowledge Bases, and added to the lit review.
  • Discovered Antoine Bosselut, who was lead author on the following papers. Need to add them to the future work section
    • Dynamic Knowledge Graph Construction for Zero-shot Commonsense Question Answering
      • Understanding narratives requires dynamically reasoning about the implicit causes, effects, and states of the situations described in text, which in turn requires understanding rich background knowledge about how the social and physical world works. At the core of this challenge is how to access contextually relevant knowledge on demand and reason over it.
        In this paper, we present initial studies toward zero-shot commonsense QA by formulating the task as probabilistic inference over dynamically generated commonsense knowledge graphs. In contrast to previous studies for knowledge integration that rely on retrieval of existing knowledge from static knowledge graphs, our study requires commonsense knowledge integration where contextually relevant knowledge is often not present in existing knowledge bases. Therefore, we present a novel approach that generates contextually relevant knowledge on demand using generative neural commonsense knowledge models.
        Empirical results on the SocialIQa and StoryCommonsense datasets in a zero-shot setting demonstrate that using commonsense knowledge models to dynamically construct and reason over knowledge graphs achieves performance boosts over pre-trained language models and using knowledge models to directly evaluate answers.
    • COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
      • We present the first comprehensive study on automatic knowledge base construction for two prevalent commonsense knowledge graphs: ATOMIC (Sap et al., 2019) and ConceptNet (Speer et al., 2017). Contrary to many conventional KBs that store knowledge with canonical templates, commonsense KBs only store loosely structured open-text descriptions of knowledge. We posit that an important step toward automatic commonsense completion is the development of generative models of commonsense knowledge, and propose COMmonsEnse Transformers (COMET) that learn to generate rich and diverse commonsense descriptions in natural language. Despite the challenges of commonsense modeling, our investigation reveals promising results when implicit knowledge from deep pre-trained language models is transferred to generate explicit knowledge in commonsense knowledge graphs. Empirical results demonstrate that COMET is able to generate novel knowledge that humans rate as high quality, with up to 77.5% (ATOMIC) and 91.7% (ConceptNet) precision at top 1, which approaches human performance for these resources. Our findings suggest that using generative commonsense models for automatic commonsense KB completion could soon be a plausible alternative to extractive methods.


  • 10:00 sim status meeting – planning to fully evaluate off-axis rotation by Monday, then characterize Rwheel contribution, adjust the control system and start commanding vehicle rotations by the end of the week? Seems ambitious, but what the hell.
  • 2:00 status meeting
  • Anything about GVSETS? Yup: Meeting Wed 9/16/2020 9:00 AM – 10:00 AM


  • 5:30 meeting. Discuss proposal and additional meetings


  • Transfer more content

Phil 8.11.20

Zero-Shot Learning in Modern NLP

  • In this post, I will present a few techniques, both from published research and our own experiments at Hugging Face, for using state-of-the-art NLP models for sequence classification without large annotated training sets.

Found a really good dashboard for US economic indicators:



  • I think I realize my problem about the second axis. It’s not rotating around the origin, so the vectors that I’m using to create the rotation vectors are not right.
  • Fixed! Here are some rotations (180 around Z, 90 around x, and 360 around z, 180 around x)
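The off-origin problem described above comes down to translating the pivot to the origin, rotating, and translating back. Here is a minimal 2D sketch of that idea; `rotate_about` and its arguments are hypothetical names for illustration, not the simulator's code:

```python
import math

def rotate_about(point, pivot, degrees):
    """Rotate a 2D point counterclockwise around an arbitrary pivot."""
    a = math.radians(degrees)
    # Step 1: shift so the pivot sits at the origin
    dx, dy = point[0] - pivot[0], point[1] - pivot[1]
    # Step 2: rotate about the origin
    rx = dx * math.cos(a) - dy * math.sin(a)
    ry = dx * math.sin(a) + dy * math.cos(a)
    # Step 3: shift back
    return (rx + pivot[0], ry + pivot[1])

p = rotate_about((2.0, 1.0), (1.0, 1.0), 90)
print(p)
```

Skipping steps 1 and 3 rotates the point around the world origin instead, which is the kind of error that makes the derived rotation vectors come out wrong.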

GPT-2 Agents

  • I did 11 runs of S/He walked into the room and made word clouds:
  • I’m going to re-run this on my GPT-2 so I can have a larger N. Just need to do some things to the test code to output to a file


  • Finished the last review. The last paper was an ontological model with no computation in it
  • Uploaded and finished!

ML seminar

  • I have access to the Twitter data now. Need to download and store it in the db
  • Presentation next week

Phil 8.6.20

Coronavirus: The viral rumours that were completely wrong (BBC)

An ocean of Books (Google Arts & Culture Experiments)


Hopfield Networks is All You Need

  • We show that the transformer attention mechanism is the update rule of a modern Hopfield network with continuous states. This new Hopfield network can store exponentially (with the dimension) many patterns, converges with one update, and has exponentially small retrieval errors. The number of stored patterns is traded off against convergence speed and retrieval error. The new Hopfield network has three types of energy minima (fixed points of the update): (1) global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points which store a single pattern. Transformer and BERT models operate in their first layers preferably in the global averaging regime, while they operate in higher layers in metastable states. The gradient in transformers is maximal for metastable states, is uniformly distributed for global averaging, and vanishes for a fixed point near a stored pattern. Using the Hopfield network interpretation, we analyzed learning of transformer and BERT models. Learning starts with attention heads that average and then most of them switch to metastable states. However, the majority of heads in the first layers still averages and can be replaced by averaging, e.g. our proposed Gaussian weighting. In contrast, heads in the last layers steadily learn and seem to use metastable states to collect information created in lower layers. These heads seem to be a promising target for improving transformers. Neural networks with Hopfield networks outperform other methods on immune repertoire classification, where the Hopfield net stores several hundreds of thousands of patterns. We provide a new PyTorch layer called “Hopfield”, which allows to equip deep learning architectures with modern Hopfield networks as a new powerful concept comprising pooling, memory, and attention.

Can GPT-3 Make Analogies? by Melanie Mitchell (Medium, Aug 2020)


  • Going to try to get the translator working and inserting best effort into the DB. Then we can make queries for the good results. Done! Here’s a shot of it chunking away. About one translation a second:



  • Work on quaternion frame tracking
  • This might help with visualization:
  • Updating my work box. Had a weird experience upgrading pip. It hit a permissions issue and failed out without rolling back. I had to use to get it back
  • Looking good:



  • 5:30(?) meeting
  • Project grant application


  • Write review – done. One to go!


Phil 7.30.20

GPT-2 Agents

  • Writing up graph creation


  • More coordinate transforms. I think a good way to do this is to determine the desired target vector, take the cross product of that with the current vector and use that as the axis of rotation. Recalculate each frame to keep the rotations on track
  • Think I got the hang of the quaternions for the target frame. I need to put things together in a nice class:
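The cross-product approach above can be sketched as follows: the axis is the normalized cross product of the current and target vectors, the angle comes from their dot product, and the pair is packed into a unit quaternion. The function names are hypothetical, and the sketch assumes the two vectors are nonzero and not parallel or anti-parallel (where the cross product vanishes and another axis has to be chosen):

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def quat_between(current, target):
    """Unit quaternion (w, x, y, z) that rotates `current` onto `target`."""
    c, t = normalize(current), normalize(target)
    axis = normalize(cross(c, t))                      # axis of rotation
    angle = math.acos(max(-1.0, min(1.0, dot(c, t))))  # clamp for safety
    s = math.sin(angle / 2.0)
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(q, v):
    """Rotate vector v by unit quaternion q."""
    w, u = q[0], q[1:]
    uv = cross(u, v)
    uuv = cross(u, uv)
    # v' = v + 2w(u x v) + 2 u x (u x v)
    return tuple(v[i] + 2.0 * (w * uv[i] + uuv[i]) for i in range(3))

q = quat_between((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(rotate(q, (1.0, 0.0, 0.0)))
```

Recomputing `quat_between` every frame from the latest current vector, as the note suggests, keeps small integration errors from accumulating.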



  • Sent Shimei a note about online CHI experiences in the current issue of Interactions
  • Start writing review


  • 5:30 Meeting tonight

Phil 7.24.20

I had home-grown tomatoes this morning!

And I hung up my shiny new diploma!

GPT-2 Agents

  • I think it’s time to start writing the paper. Something like Synthetic Agents in Language Models: Navigating belief
    • Using the IEEE(ACSOS) template
    • Set up the paper with authors and dummy text. Starting to fill in the pieces
  • Writing the methods section and needed to count the number of games (#draw + #resigns). The easiest way to do this was just to count all the word frequencies. Here are the top terms:
    to : 1474559
    from : 1472081
    moves : 1472071
    white : 1062561
    black : 1056840
    pawn : 392494
    in : 330044
    move : 307701
    takes : 307166
    rook : 258476
    knight : 250998
    bishop : 225442
    queen : 175254
    king : 173837
    pawn. : 145164
    check. : 91512


  • The list goes on a while. The most mentioned squares are d4 (56,224), d5 (53,986), and f6 (48,772)
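A word-frequency pass like the one above can be sketched with `collections.Counter`. The sample text is made up for illustration, and splitting on whitespace is an assumption that matches the list above, where `pawn` and `pawn.` are counted separately:

```python
from collections import Counter

def word_counts(text):
    """Count lowercased, whitespace-delimited tokens; punctuation stays attached."""
    return Counter(text.lower().split())

# Hypothetical sample of generated game text
sample = ("White moves pawn from e2 to e4. Black moves pawn from e7 to e5. "
          "Black resigns. White moves knight from g1 to f3. Game ends in a draw.")

counts = word_counts(sample)

# A game ends in either a draw or a resignation, so the number of games
# is the sum of those two terms (ignoring trailing punctuation)
ENDINGS = ("draw", "resigns")
num_games = sum(n for word, n in counts.items() if word.strip(".,") in ENDINGS)

for word, n in counts.most_common(3):
    print("{} : {}".format(word, n))
print("games = {}".format(num_games))
```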

God help me, I’m updating my IDE


  • Asked Vadim to exercise the satellite through +/- 90
  • Need to start working on the mapping of rwheels to inertial(?) frame. The thing is, the yaw axis rotates 360 degrees every day, so what frame do we use? My thinking is that the inertial frame (as defined by the star tracker) is unchanging, but we have a rotating frame inside that. The satellite’s moves are relative to that rotating frame plus the inertial frame. So the satellite’s first task is to keep its orientation relative to the rotating frame, then execute commands with respect to that frame. So a stacked matrix of inertial frame, Earth frame, vehicle matrix and then a matrix for each of the rwheels?
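One way to picture the stacked-matrix idea is as a product of rotation matrices, outermost frame first. The sketch below is only an illustration under simplifying assumptions: the rotating frame turns about the inertial z axis once per 24 hours, the commanded maneuver is also a yaw about z, and all names are hypothetical.

```python
import math

def rot_z(degrees):
    """3x3 rotation matrix about the z axis."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def vehicle_to_inertial(hours_elapsed, commanded_yaw_deg):
    # Rotating frame inside the inertial frame: one turn per day (assumed)
    rotating = rot_z(360.0 * hours_elapsed / 24.0)
    # Vehicle command expressed relative to the rotating frame
    vehicle = rot_z(commanded_yaw_deg)
    # Stack: inertial <- rotating <- vehicle
    return matmul(rotating, vehicle)

m = vehicle_to_inertial(6.0, 90.0)
x_axis_in_inertial = apply(m, [1.0, 0.0, 0.0])
print(x_axis_in_inertial)
```

A per-wheel matrix would be multiplied on the right of `vehicle` in the same way, so each wheel's axis can be expressed in any of the outer frames.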

Phil 7.20.20

My guess is that, barring interference of some kind, all US cities will have something like what’s going on in Portland by election day

GPT-2 Agents

  • Back from break, and thinking about what to do next. I think the first thing to do is simply gather more data from the model. Right now I have about 1,500 GPT-2 moves and about 190,000 human moves. I’m increasing the number of predictions to 1,000 by adding a batch size value; without batching I got out-of-memory errors.
  • I had started the run in the morning and was almost done when a power failure hit and the UPS didn’t work. Ordered a new UPS. Tried to be clever about finishing off the last piece of data but left in the code that truncated the table. Ah, well. Starting over.
  • Next is to adjust the queries so that the populations are more similar. The GPT-2 moves come from the following prompts:
    probe_list = ['The game begins as ', 'In move 10', 'In move 20', 'In move 30', 'In move 40', 'White takes black ', 'Black takes white ', 'Check. ']

    That means I should adjust my queries of the human data to reflect those biases, something like:

    select * from table_actual where move_number = 1 order by move_number limit 50;

    which should match the probe ‘The game begins as ’.

  • I’d also like to run longer, full games (look for ‘resigns’, ‘draw’, or ‘wins’) and parse them, but that’s for later.
  • Need to figure out the statistics to compare populations. I think I’m going to take some time and look through the NIST Engineering Statistics Handbook
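The probe-to-query matching described above can be sketched for the probes that name a move number. `table_actual` and `move_number` come from the query above; the `description` column, the sample rows, and the mapping dictionary are assumptions made so the example runs against an in-memory SQLite database:

```python
import sqlite3

# Assumed mapping from move-number probes to the move_number column
probe_to_move = {'The game begins as ': 1, 'In move 10': 10, 'In move 20': 20,
                 'In move 30': 30, 'In move 40': 40}

def query_for_probe(conn, probe, limit=50):
    """Return up to `limit` human moves that match a move-number probe."""
    sql = "select * from table_actual where move_number = ? limit ?"
    return conn.execute(sql, (probe_to_move[probe], limit)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("create table table_actual (move_number integer, description text)")
conn.executemany("insert into table_actual values (?, ?)", [
    (1, "White moves pawn from e2 to e4."),
    (1, "White moves pawn from d2 to d4."),
    (10, "Black moves knight from b8 to c6."),
    (20, "White moves rook from a1 to d1."),
])

opening_rows = query_for_probe(conn, 'The game begins as ')
move10_rows = query_for_probe(conn, 'In move 10')
print(len(opening_rows), len(move10_rows))
```

The capture and check probes would need a text match on the move description rather than a move number, which is left out here.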



  • Vadim seems to have made progress. Need to set up a meeting to chat and catch up
  • 2:00 meeting with V & E. Good progress!
  • GVSETS has been moved to Nov 3. Speaking of which, I’ll need to compare simulated and actual maneuvers, so stats here too. Now that the moves are cooking I’ll start on the stats


Phil 7.2.20

Emergence of polarized ideological opinions in multidimensional topic spaces

  • Opinion polarization is on the rise, causing concerns for the openness of public debates. Additionally, extreme opinions on different topics often show significant correlations. The dynamics leading to these polarized ideological opinions pose a challenge: How can such correlations emerge, without assuming them a priori in the individual preferences or in a preexisting social structure? Here we propose a simple model that reproduces ideological opinion states found in survey data, even between rather unrelated, but sufficiently controversial, topics. Inspired by skew coordinate systems recently proposed in natural language processing models, we solidify these intuitions in a formalism where opinions evolve in a multidimensional space where topics form a non-orthogonal basis. The model features a phase transition between consensus, opinion polarization, and ideological states, which we analytically characterize as a function of the controversialness and overlap of the topics. Our findings shed light upon the mechanisms driving the emergence of ideology in the formation of opinions.

DtZ has broken



  • Continue working on the trajectory. I think that a plot that works entirely on distance to target can result in spirals, so there needs to be some kind of system that looks at the distance to the center line first, and if there is a fail, move the last node from the trajectory list to a dirty list. Then the search restores the cur node to the previous, and continue the search with the trajectory and dirty list nodes ignored?
  • Found an example to fix: A6 – H7
    • get_closest_node() line = [337.0, 44.0, 581.0, 499.0], cur_node = h1, node_list = [‘a6’, ‘b6’, ‘c7’, ‘d7’, ‘e6’, ‘c5’, ‘b7’, ‘g7’, ‘h6’, ‘g6’, ‘c6’, ‘e7’, ‘f7’, ‘g8’, ‘f6’, ‘d8’, ‘a8’, ‘e8’, ‘d6’, ‘b4’, ‘b8’, ‘c8’, ‘c4’, ‘e5’, ‘d5’, ‘d4’, ‘b5’, ‘c3’, ‘e4’, ‘f5’, ‘f8’, ‘f4’, ‘g5’, ‘g4’, ‘h5’, ‘h4’, ‘f3’, ‘d3’, ‘c2’, ‘e3’, ‘d2’, ‘e2’, ‘b2’, ‘b1’, ‘c1’, ‘e1’, ‘d1’, ‘a1’, ‘f1’, ‘g3’, ‘h3’, ‘g2’, ‘f2’, ‘g1’, ‘h2’, ‘h1’]
    • It does fine until it gets to E6, where it chooses c5
    • Adding a target distance-based search if the distance to line search fails seems to have fixed it:
      nlist = list(nx.all_neighbors(self.gml_model, cur_node))
      print("\tneighbors = {}".format(nlist))
      dist_dict = {}
      sx, sy = self.get_center(cur_node)
      for n in nlist:
          if n not in node_list:
              newx, newy = self.get_center(n)
              newa = [newx, newy]
              print("\tline dist checking {} at {}".format(n, newa))
              x, y = self.point_to_line([l[0], l[1]], [l[2], l[3]], newa)
              ca = [x, y]
              ib = self.is_between([sx, sy], [l[2], l[3]], [x, y])
              if ib:
                  # option 1: Find the closest to the line
                  dist = np.linalg.norm(np.array(newa)-np.array(ca))
                  dist_dict[n] = dist
                  print("\tis BETWEEN = {}, dist = {}".format(ib, dist))
      if len(dist_dict) == 0:
          ta = [self.get_center(self.target_node)]
          for n in nlist:
              if n not in node_list:
                  newx, newy = self.get_center(n)
                  newa = [newx, newy]
                  print("\ttarget dist checking {} at {}".format(n, newa))
                  # option 2: Find the closest to the target node
                  dist = np.linalg.norm(np.array(newa)-np.array(ta))
                  dist_dict[n] = dist
                  print("\tis CLOSEST: dist = {}".format(dist))
  • Got legal trajectories working. Below is a set of jumps that are legal (rook to c1, bishop to e3 and then h6, then rook the rest of the way). I think I also want to sort based on closest distance to the current node.
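The two-stage selection above (closest to the line first, closest to the target as a fallback) relies on a point-to-line projection and a between-ness test. The standalone versions below are hypothetical stand-ins for `point_to_line()` and `is_between()`, whose real implementations are not shown in these notes:

```python
import math

def point_to_line(a, b, p):
    """Project point p onto the infinite line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return (a[0] + t * dx, a[1] + t * dy)

def is_between(a, b, p):
    """True if the projection of p falls within the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    return 0.0 <= t <= 1.0

def pick_next(candidates, start, line_start, line_end, target):
    """candidates maps node name -> (x, y) center."""
    dists = {}
    for name, pos in candidates.items():
        proj = point_to_line(line_start, line_end, pos)
        # Option 1: only nodes that make progress along the line qualify
        if is_between(start, line_end, proj):
            dists[name] = math.dist(pos, proj)
    if not dists:
        # Option 2: nothing qualified, so fall back to distance to the target
        dists = {name: math.dist(pos, target) for name, pos in candidates.items()}
    return min(dists, key=dists.get)

on_line = pick_next({"n1": (5, 1), "n2": (5, 3), "n3": (-2, 0.5)},
                    (0, 0), (0, 0), (10, 0), (10, 0))
fallback = pick_next({"a": (-1, 0), "b": (-5, 0)},
                     (0, 0), (0, 0), (10, 0), (10, 0))
print(on_line, fallback)
```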



  • Add InfluxDB streaming to DD
  • 10:00 Sim meeting
  • 2:00 Status meeting

Phil 5.14.20

GPT-2 Agents

  • Adding hints and meta information. Still need to handle special pawn moves and pull out comments
    • /[^{\}]+(?=})/g, from
  • Amusingly, my simple parser is now at 560 LOC and counting
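The JavaScript-style pattern above matches runs of non-brace characters that are followed by a closing brace, which is how PGN marks comments. A rough Python equivalent, with `split_comments` as a hypothetical helper name:

```python
import re

# Python version of /[^{\}]+(?=})/g
COMMENT = re.compile(r"[^{}]+(?=\})")

def split_comments(movetext):
    """Return (movetext without comments, list of comment strings)."""
    comments = COMMENT.findall(movetext)
    # Remove the braces and their contents, then collapse the leftover spaces
    stripped = re.sub(r"\{[^{}]*\}", "", movetext)
    return " ".join(stripped.split()), comments

moves, comments = split_comments("1. e4 {best by test} e5 {solid} 2. Nf3")
print(moves)
print(comments)
```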


  • Working on creating a discriminator using Conv1D layers.
    • With some help from Aaron, I got the discriminator working. There are some issues. I’m currently using batch_input_shape rather than input_shape, which means that a pre-sized batch is compiled in. The second issue is that the discriminator requires a 3D vector to be fed in, which can’t be produced naturally with Dense/MLP. That means the Generator also has to use Conv1D at least at the output layer
    • I think this post: How to Develop 1D Convolutional Neural Network Models for Human Activity Recognition should help, but I don’t think I have the cognitive ability at the end of the day. Tomorrow

Phil 5.8.20


  • Really have to fix the trending. Places like Brazil, where the disease is likely to be chronic, are not working any more
  • Aaron and I agree if the site’s not updated by 5/15 to pull it down

GPT-2 Agents

  • More PGNtoEnglish
  • Worked out a way to search for pieces in a rules-based range. It’ll work for pawns, knights, and kings right now. Will need to add rooks, bishops and queens
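Since PGN usually gives only the destination square, a rules-based range search works backwards from the target to the squares a piece could have come from. The sketch below covers knights and kings with fixed offsets; the board representation and function names are assumptions, not the PGNtoEnglish code:

```python
KNIGHT_OFFSETS = [(1, 2), (2, 1), (2, -1), (1, -2),
                  (-1, -2), (-2, -1), (-2, 1), (-1, 2)]
KING_OFFSETS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                if (dx, dy) != (0, 0)]

def squares_in_range(target, offsets):
    """Squares from which a piece with these offsets could reach `target`."""
    col, row = ord(target[0]) - ord('a'), int(target[1]) - 1
    squares = []
    for dx, dy in offsets:
        c, r = col + dx, row + dy
        if 0 <= c < 8 and 0 <= r < 8:  # stay on the board
            squares.append(chr(ord('a') + c) + str(r + 1))
    return squares

def find_piece(board, piece, target, offsets):
    """board maps square -> piece name (hypothetical representation)."""
    return [sq for sq in squares_in_range(target, offsets)
            if board.get(sq) == piece]

board = {'g1': 'white knight', 'b1': 'white knight', 'e1': 'white king'}
sources = find_piece(board, 'white knight', 'f3', KNIGHT_OFFSETS)
print(sources)
```

Rooks, bishops, and queens need a ray walk that stops at the first occupied square instead of a fixed offset list.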


  • Try finetuning the model on Arabic to see what happens. Don’t see the txt files?


  • The time taken for all the DB calls is substantial. I need to change the Measurements class so that there is a set of master Measurements that are big enough to subsample other Measurements from. Done. Much faster!
  • Start building noise query, possibly using a high pass filter? Otherwise, subtract the “real” signal from the simulated one
    • Starting with the subtraction, since I have to set up queries anyway, and this will help me debug them
    • Created NoiseGAN class that extends OneDGAN
    • Pulling over table building code from InfluxTestTrainBase()
    • Success!
    • "D:\Program Files\Python37\python.exe" D:/Development/Sandboxes/Influx2_ML/Influx2_ML/
      2020-05-08 14:45:36.077292: I tensorflow/stream_executor/platform/default/] Successfully opened dynamic library cudart64_101.dll
      query = from(bucket:"org_1_bucket") |> range(start:2020-04-13T13:30:00Z, stop:2020-04-13T13:40:00Z) |> filter(fn:(r) => r.type == "noisy_sin" and (r.period == "8"))
      vector size = 100, query returns = 590
    • Probably a good place to stop for the day
  • 10:00 Meeting. Vadim seems to be making good progress. Check in on Tuesday
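The master-measurement and subtraction ideas above can be sketched without a database: build one long clean/noisy pair once, then cut aligned windows out of both and subtract. The sizes (590 samples, windows of 100, period 8) echo the log output above; the generator function, noise level, and seed are assumptions:

```python
import math
import random

def make_master(n, period, noise_sd, rng):
    """One long clean sine series plus a noisy copy (stand-in for the DB query)."""
    clean = [math.sin(2.0 * math.pi * i / period) for i in range(n)]
    noisy = [c + rng.gauss(0.0, noise_sd) for c in clean]
    return clean, noisy

def subsample(series, start, size):
    """Cut a window out of the master series instead of querying again."""
    return series[start:start + size]

rng = random.Random(1)
clean, noisy = make_master(590, 8, 0.1, rng)

vector_size = 100
start = rng.randrange(0, len(clean) - vector_size + 1)
clean_win = subsample(clean, start, vector_size)
noisy_win = subsample(noisy, start, vector_size)

# The noise is what is left after removing the clean signal
noise = [n - c for n, c in zip(noisy_win, clean_win)]
print(len(noise), max(abs(v) for v in noise))
```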

Phil 5.4.20

It is a Chopin sort of morning


  • Zach got maps and lists working over the weekend. Still a lot more to do though
  • Need to revisit the math to work over the past days

GPT-2 Agents

  • Working on PGN to English.
    • Added game class that contains all the information for a game and reads it in. Games are created and managed by the PGNtoEnglish class
  • Rebased the transformers project. It updates fast


  • Figure out how to save and load models. I’m really not sure what to save, since you need access to the latent space and the discriminator? So far, it’s:
    def save_models(self, directory:str, prefix:str):
        p = os.getcwd()
    def load_models(self, directory:str, prefix:str):
        p = os.getcwd()
        self.d_model = tf.keras.models.load_model("{}}".format(prefix))
        self.g_model = tf.keras.models.load_model("{}}".format(prefix))
        self.gan_model = tf.keras.models.load_model("{}}".format(prefix))
    • Here’s the initial run. Very nice for 10,000 epochs!


    • And here’s the results from the loaded model:


    • The discriminator works as well:
      real accuracy = 100.00%, fake accuracy = 100.00%
      real loss = 0.0154, fake loss = 0.0947%
    • An odd thing is that I can save the GAN model, but can’t load it?
      ValueError: An empty Model cannot be used as a Layer.

      I can rebuild it from the loaded generator and discriminator models though

  • Set up MLP to convert low-fidelity sin waves to high-fidelity
    • Get the training and test data from InfluxDB
      • input is square, output is sin, and the GAN should be noisy_sin minus sin. Randomly move the sample through the domain
    • Got the queries working:
    • Train and save a 2-layer, 400 neuron MLP. No ensembles for now
  • Set up GAN to add noise


  • Ask question about what the ACM and CHI are doing, beyond providing publication venues, to fight misinformation that lets millions of people find fabricated evidence that supports dangerous behavior.
  • Effects of Credibility Indicators on Social Media News Sharing Intent
    • In recent years, social media services have been leveraged to spread fake news stories. Helping people spot fake stories by marking them with credibility indicators could dissuade them from sharing such stories, thus reducing their amplification. We carried out an online study (N = 1,512) to explore the impact of four types of credibility indicators on people’s intent to share news headlines with their friends on social media. We confirmed that credibility indicators can indeed decrease the propensity to share fake news. However, the impact of the indicators varied, with fact checking services being the most effective. We further found notable differences in responses to the indicators based on demographic and personal characteristics and social media usage frequency. Our findings have important implications for curbing the spread of misinformation via social media platforms.

Phil 4.30.20

Had some kind of power hiccup this morning and discovered that my computer was connected to the surge-suppressor part of the UPS. My box is now most unhappy as it recovers. On the plus side, computers recover from this sort of thing now.


  • Fixed the neighbor list and was pleasantly surprised that it worked for the states


  • Set up input and output files
  • Pull char count of probe out and add that to the total generated
  • Start looking into finetuning
    • Here are all the Hugging Face examples
      • export TRAIN_FILE=/path/to/dataset/wiki.train.raw
        export TEST_FILE=/path/to/dataset/wiki.test.raw
        python \
            --output_dir=output \
            --model_type=gpt2 \
            --model_name_or_path=gpt2 \
            --do_train \
            --train_data_file=$TRAIN_FILE \
            --do_eval \
      • source in GitHub
      • Tried running without any arguments as a sanity check, and got this: huggingface ImportError: cannot import name ‘MODEL_WITH_LM_HEAD_MAPPING’. Turns out that it won’t work without PyTorch being installed. Everything seems to be working now:
        usage: [-h] [--model_name_or_path MODEL_NAME_OR_PATH]
                                        [--model_type MODEL_TYPE]
                                        [--config_name CONFIG_NAME]
                                        [--tokenizer_name TOKENIZER_NAME]
                                        [--cache_dir CACHE_DIR]
                                        [--train_data_file TRAIN_DATA_FILE]
                                        [--eval_data_file EVAL_DATA_FILE]
                                        [--line_by_line] [--mlm]
                                        [--mlm_probability MLM_PROBABILITY]
                                        [--block_size BLOCK_SIZE] [--overwrite_cache]
                                        --output_dir OUTPUT_DIR
                                        [--overwrite_output_dir] [--do_train]
                                        [--do_eval] [--do_predict]
                                        [--per_gpu_train_batch_size PER_GPU_TRAIN_BATCH_SIZE]
                                        [--per_gpu_eval_batch_size PER_GPU_EVAL_BATCH_SIZE]
                                        [--gradient_accumulation_steps GRADIENT_ACCUMULATION_STEPS]
                                        [--learning_rate LEARNING_RATE]
                                        [--weight_decay WEIGHT_DECAY]
                                        [--adam_epsilon ADAM_EPSILON]
                                        [--max_grad_norm MAX_GRAD_NORM]
                                        [--num_train_epochs NUM_TRAIN_EPOCHS]
                                        [--max_steps MAX_STEPS]
                                        [--warmup_steps WARMUP_STEPS]
                                        [--logging_dir LOGGING_DIR]
                                        [--logging_steps LOGGING_STEPS]
                                        [--save_steps SAVE_STEPS]
                                        [--save_total_limit SAVE_TOTAL_LIMIT]
                                        [--no_cuda] [--seed SEED] [--fp16]
                                        [--fp16_opt_level FP16_OPT_LEVEL]
                                        [--local_rank LOCAL_RANK] error: the following arguments are required: --output_dir

        And I still haven’t broken my text generation code. Astounding!

    • Moby Dick from Gutenberg
    • Chess
    • Covid tweets
    • Here’s the cite:
        title={HuggingFace's Transformers: State-of-the-art Natural Language Processing},
        author={Thomas Wolf and Lysandre Debut and Victor Sanh and Julien Chaumond and Clement Delangue and Anthony Moi and Pierric Cistac and Tim Rault and Rémi Louf and Morgan Funtowicz and Jamie Brew},


  • Set up meeting with Issac and Vadim for control
  • Continue with GAN
    • Struggled with getting training to work for a while. I started by getting all the code to work, which included figuring out how the class labels worked (they just classify “real” vs “fake”). Then my results were terrible, basically noise. So I went back and parameterized the training and real data generation to try it on a smaller vector size. That seems to be working. Here’s the untrained model on a time series four elements long: Four_element_untrained
    • And here’s the result after 10,000 epochs and a batch size of 64: Four_element_trained
    • That’s clearly not an accident. So progress!
    • Played around with options based on this post and changed my Adam value from 0.01 to 0.001, and the output function from linear to tanh based on this random blog post. Better! Four_element_trained
    • I do not understand the loss/accuracy behavior though

      I think this is a good starting point! This is 16 points, and clearly the real loss function is still improving: Four_element_trainedacc_loss

    • Adding more variety of inputs: GAN_trained
    • Trying adding layers. Nope, it generalized to a single sin wave
    • Trying a bigger latent space of 16 dimensions, up from 5: GAN_trained
    • Splitting the difference and trying 8. Let’s see 5 again? GAN_trained
    • Hmmm. I think I like the 16 better. Let’s go back to that with a batch size of 128 rather than 64. Better? I think?
    • Let’s see what more samples does. Let’s try 100! Bad move. Let’s try 20, with a bigger random offset GAN_trained
    • Ok, as a last thing for the day, I’m going to try more epochs. Going from 10,000 to 50,000:
    • It definitely finds the best curve to forge. Have to think about that
  • Status report – done

Phil 4.28.20


  • Upload paper to Overleaf – done!


  • Fix bug using this:
    slope, intercept, r_value, p_value, std_err = stats.linregress(xsub, ysub)
    # slope, intercept = np.polyfit(x, y, 1)
    yn = np.polyval([slope, intercept], xsub)
    steps = 0
    if slope < 0:
        steps = abs(y[-1] / slope)
    reg_x = []
    reg_y = []
    start = len(yl) - max_samples
    yval = intercept + slope * start
    for i in range(start, len(yl)-offset):
        yval += slope
  • Anything else?

GPT-2 Agents

  • Install and test GPT-2 Client
  • Failed spectacularly. It depends on a lot of TF1.x items, like There is an issue request in.
  • Checked out the project to see if anything could be done. “Fixed” the contrib library, but that just exposed other things. Uninstalled.
  • Tried using the upgrade tool described here, which did absolutely nothing, as near as I can tell


  • Continue figuring out GANs
  • Here are results using 2 latent dimensions, a matching hint, a line hint, and no hint
  • Here are results using 5 latent dimensions, a matching hint, a line hint, and no hint
  • Meeting at 10:00 with Vadim and Isaac
    • Wound up going over Isaac’s notes for Yaw Flip and learned a lot. He’s going to see if he can get the algorithm used for the maneuver. If so, we can build the control behavior around that. The goal is to minimize energy and indirectly fuel costs


Phil 4.27.20

Took the motorcycle for its weekly spin and rode past the BWI terminal. By far the most Zombie Apocalypse thing I’ve seen so far.

The repository contains an ongoing collection of tweets IDs associated with the novel coronavirus COVID-19 (SARS-CoV-2), which commenced on January 28, 2020.


  • Reworked regression code to only use the last 14 days of data. It seems to take the slowing rate change into account better
  • That could be a nice interactive feature to add to the website. A js version of regression curve fitting is here.


  • Got Antonio’s revisions back and enbiggened the two chats for better readability

GPT-2 Agents

  • Going to try the GPT-2 Client and see how it works.
  • Whoops, needs TF 2.1. Upgraded that and the drivers – done


  • Step through the GAN code and look for ways of restricting the latent space to being near the simulation output
  • Here’s the GAN trying to fit a bit of a sin wave from the beginning of the day: GAN2Sin
  • And here’s the evolution of the GAN using hints and 5 latent dimensions from the end of the day: GAN_fit
  • And here are the accuracy outputs:
    epoch = 399, real accuracy = 87.99999952316284%, fake accuracy = 37.99999952316284%
    epoch = 799, real accuracy = 43.99999976158142%, fake accuracy = 56.99999928474426%
    epoch = 1199, real accuracy = 81.00000023841858%, fake accuracy = 25.999999046325684%
    epoch = 1599, real accuracy = 81.00000023841858%, fake accuracy = 40.99999964237213%
    epoch = 1999, real accuracy = 87.99999952316284%, fake accuracy = 25.999999046325684%
    epoch = 2399, real accuracy = 89.99999761581421%, fake accuracy = 20.000000298023224%
    epoch = 2799, real accuracy = 87.00000047683716%, fake accuracy = 46.00000083446503%
    epoch = 3199, real accuracy = 80.0000011920929%, fake accuracy = 47.999998927116394%
    epoch = 3599, real accuracy = 76.99999809265137%, fake accuracy = 43.99999976158142%
    epoch = 3999, real accuracy = 68.99999976158142%, fake accuracy = 30.000001192092896%
    epoch = 4399, real accuracy = 75.0%, fake accuracy = 33.000001311302185%
    epoch = 4799, real accuracy = 63.999998569488525%, fake accuracy = 28.00000011920929%
    epoch = 5199, real accuracy = 50.0%, fake accuracy = 56.00000023841858%
    epoch = 5599, real accuracy = 36.000001430511475%, fake accuracy = 56.00000023841858%
    epoch = 5999, real accuracy = 49.000000953674316%, fake accuracy = 60.00000238418579%
    epoch = 6399, real accuracy = 34.99999940395355%, fake accuracy = 58.99999737739563%
    epoch = 6799, real accuracy = 70.99999785423279%, fake accuracy = 43.00000071525574%
    epoch = 7199, real accuracy = 70.99999785423279%, fake accuracy = 30.000001192092896%
    epoch = 7599, real accuracy = 47.999998927116394%, fake accuracy = 50.0%
    epoch = 7999, real accuracy = 40.99999964237213%, fake accuracy = 52.99999713897705%
    epoch = 8399, real accuracy = 23.000000417232513%, fake accuracy = 82.99999833106995%
    epoch = 8799, real accuracy = 23.000000417232513%, fake accuracy = 75.0%
    epoch = 9199, real accuracy = 31.00000023841858%, fake accuracy = 69.9999988079071%
    epoch = 9599, real accuracy = 37.99999952316284%, fake accuracy = 68.00000071525574%
    epoch = 9999, real accuracy = 23.000000417232513%, fake accuracy = 83.99999737739563%
  • Found a bug in the short-regression code. Need to roll the fix into the regression code
  • Here’s the working code:
    slope, intercept, r_value, p_value, std_err = stats.linregress(xsub, ysub)
    # slope, intercept = np.polyfit(x, y, 1)
    yn = np.polyval([slope, intercept], xsub)
    steps = 0
    if slope < 0:
        steps = abs(y[-1] / slope)
    reg_x = []
    reg_y = []
    start = len(yl) - max_samples
    yval = intercept + slope * start
    for i in range(start, len(yl)-offset):
        yval += slope
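The short-regression idea (fit only the most recent samples, then estimate how many steps remain until the trend reaches zero) can be sketched as below. This is not the site's code: it swaps `stats.linregress` for a hand-rolled least-squares fit so it runs without SciPy, and `days_to_zero` and the sample series are hypothetical:

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def days_to_zero(y, max_samples=14):
    """Fit the last `max_samples` points and project the trend to zero."""
    ysub = y[-max_samples:]
    xsub = list(range(len(y) - len(ysub), len(y)))
    slope, intercept = fit_line(xsub, ysub)
    # Only a downward trend ever reaches zero
    steps = abs(y[-1] / slope) if slope < 0 else 0
    return slope, intercept, steps

# Flat for ten days, then falling by 5 per day for fourteen days
y = [100] * 10 + [100 - 5 * i for i in range(1, 15)]
slope, intercept, steps = days_to_zero(y)
print(slope, steps)
```

Fitting the whole series would blend the flat stretch into the slope and overestimate the time to zero, which is the motivation for using only the last 14 days.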


Phil 4.22.20

  • Amsterdam, 24 April 2020​
  • This workshop aims to bring together researchers and practitioners from the emerging fields of Graph Representation Learning and Geometric Deep Learning. The workshop will feature invited talks and a poster session. There will be ample opportunity for discussion and networking.​
  • Invited talks will be live-streamed on YouTube:
  • Looking for an online seminar that presents the latest advances in reinforcement learning theory? You just found it! We aim to bring you a virtual seminar (approximately) every Tuesday at 5pm UTC featuring the latest work in theoretical reinforcement learning.


  • Added P-threshold to json file. I’m concerned that everyone is too busy to participate any more. Aaron hasn’t even asked about the project since he got better and is complaining about how overworked he is. Zach seems to be equally busy. If no one steps up by the end of the week, I think it’s time to either take over the project entirely or shut it down.


  • Started working on Antonio’s changes
  • Changed the MappApp so that the trajectory lines are blue


  • Finish CNN chapter
  • Enable TensorFlow profiling
    • Installed the plugin: pip install tensorboard_plugin_profile
    • Updated setup_tensorboard():
      import os
      from typing import List
      import tensorflow as tf

      def setup_tensorboard(dir_str: str, windows_slashes: bool = True) -> List:
          if windows_slashes:
              dir_str = dir_str.replace("/", "\\")
          # assumption: the message is only meant for a missing log directory
          if not os.path.isdir(dir_str):
              print("no file {} at {}".format(dir_str, os.getcwd()))
          # use TensorBoard, princess Aurora!
          callbacks = [tf.keras.callbacks.TensorBoard(log_dir=dir_str, profile_batch = '500,510')]
          return callbacks
  • Huh. Looks like scipy.misc.imresize() and scipy.misc.imread() were both deprecated and have been removed from the library. Trying OpenCV
    • pip install opencv-python
    • Here’s how I did it, with some debugging to verify that everything was working correctly thrown in:
      import cv2
      import numpy as np

      img_names = ['cat.jpg', 'steam-locomotive.jpg']
      img_list = []
      for name in img_names:
          img = cv2.imread(name)
          res = np.array(cv2.resize(img, dsize=(32, 32), interpolation=cv2.INTER_CUBIC))
          # debugging: write the resized image back out to check it by eye
          cv2.imwrite(name.replace(".jpg","_32x32.jpg"), res)
          img_list.append(res)
      # swap the row and column axes of every image, then scale pixels to [0, 1]
      imgs = np.transpose(img_list, (0, 2, 1, 3))
      imgs = np.array(imgs) / 255
  • This forced me to go down a transpose() in multiple dimensions rabbit hole that’s worth documenting. First, here’s code that takes some tiny images in an array and transposes them:
    import numpy as np
    img_list = [
        # image 1
        [[[10, 20, 30],
          [11, 21, 31],
          [12, 22, 32],
          [13, 23, 33]],
         [[255, 255, 255],
          [48, 45, 58],
          [101, 150, 205],
          [255, 255, 255]],
         [[255, 255, 255],
          [43, 56, 75],
          [77, 110, 157],
          [255, 255, 255]],
         [[255, 255, 255],
          [236, 236, 238],
          [76, 104, 139],
          [255, 255, 255]]],
        # image 2
        [[[100, 200, 300],
          [101, 201, 301],
          [102, 202, 302],
          [103, 203, 303]],
         [[159, 146, 145],
          [89, 74, 76],
          [207, 207, 210],
          [212, 203, 203]],
         [[145, 155, 164],
          [52, 40, 36],
          [166, 160, 163],
          [136, 132, 134]],
         [[61, 56, 60],
          [36, 32, 35],
          [202, 195, 195],
          [172, 165, 177]]]]
    np_imgs = np.array(img_list)
    print("np_imgs shape = {}".format(np_imgs.shape))
    imgs = np.transpose(img_list, (0, 2, 1, 3))
    print("imgs shape = {}".format(imgs.shape))
    #imgs = np.array(imgs) / 255
    print("pix 0: \n{}".format(np_imgs[0]))
    print("transposed pix 0: \n{}".format(imgs[0]))
    print("pix 1: \n{}".format(np_imgs[1]))
    print("transposed pix 1: \n{}".format(imgs[1]))
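  • So, this is a four-dimensional array with a shape of (2, 4, 4, 3): two images, each 4 rows by 4 columns, with 3 color channels per pixel. What we want to do is transpose each image (the inner 4, 4), which flips it across its main diagonal. That is the same as a 90-degree rotation combined with a mirror flip, not a pure rotation. The way to understand NumPy’s transpose is that it permutes the axes: the tuple you pass in lists the old axis that ends up in each new position. The trick is understanding how.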
  • For this matrix, applying a transpose that does nothing means writing this:
    imgs = np.transpose(img_list, (0, 1, 2, 3))
  • Think of it as an identity transpose. What we want to do is swap the two inner axes (the 4, 4), which we do like this:
    imgs = np.transpose(img_list, (0, 2, 1, 3))
  • That’s it! Now the second “4” will be transposed with the first “4”. You can do this with any of the axes. So
    imgs = np.transpose(img_list, (3, 2, 1, 0))
  • Reverses everything!
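  • To make the axis bookkeeping easier to see, here is a small sketch that uses non-square images, so that a swap of the inner axes shows up in the shape. The array contents and the variable names (identity, swapped, reversed_axes) are made up for illustration:

```python
import numpy as np

# Two tiny "images", 2 rows by 3 columns, 1 channel:
# shape is (batch, rows, cols, channels) = (2, 2, 3, 1)
imgs = np.arange(2 * 2 * 3 * 1).reshape(2, 2, 3, 1)

identity = np.transpose(imgs, (0, 1, 2, 3))       # axes left where they are
swapped = np.transpose(imgs, (0, 2, 1, 3))        # rows and columns exchanged
reversed_axes = np.transpose(imgs, (3, 2, 1, 0))  # all four axes reversed

print("identity shape = {}".format(identity.shape))
print("swapped shape = {}".format(swapped.shape))
print("reversed shape = {}".format(reversed_axes.shape))

# Element [b, c, r, ch] of the swapped array is element [b, r, c, ch] of the original
print(swapped[1, 2, 0, 0] == imgs[1, 0, 2, 0])

# For a single 2-D image, the transpose equals a 90-degree rotation plus a flip
first = imgs[0, :, :, 0]
print(np.array_equal(first.T, np.flipud(np.rot90(first))))
```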
  • Ok, so things are working, but the results are crap. Not really worrying about it for now because it’s CIFAR and I always have this problem:
    ./images\airplane.jpg = [8] ship
    ./images\automobile.jpg = [0] airplane 
    ./images\bird.jpg = [4] deer
    ./images\cat.jpg = [0] airplane 
    ./images\cat2.jpg = [6] frog
    ./images\cat3.jpg = [8] ship
    ./images\deer.jpg = [8] ship
    ./images\dog.jpg = [2] bird
    ./images\horse.jpg = [8] ship
    ./images\ship.jpg = [0] airplane 
    ./images\steam-locomotive.jpg = [2] bird
    ./images\truck.jpg = [3] cat
    [8 0 4 0 6 8 8 2 8 0 2 3]
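  • A quick way to put a number on "crap" is to score the output above against the file names. This is only a sketch: it assumes each file name (minus any trailing digits) is the true class, uses the CIFAR-10 label order implied by the output, and skips steam-locomotive.jpg because it has no matching class. The helper expected_label is hypothetical:

```python
CIFAR10_LABELS = ["airplane", "automobile", "bird", "cat", "deer",
                  "dog", "frog", "horse", "ship", "truck"]

# File names and predicted class indices copied from the output above
filenames = [r"./images\airplane.jpg", r"./images\automobile.jpg",
             r"./images\bird.jpg", r"./images\cat.jpg", r"./images\cat2.jpg",
             r"./images\cat3.jpg", r"./images\deer.jpg", r"./images\dog.jpg",
             r"./images\horse.jpg", r"./images\ship.jpg",
             r"./images\steam-locomotive.jpg", r"./images\truck.jpg"]
predictions = [8, 0, 4, 0, 6, 8, 8, 2, 8, 0, 2, 3]

def expected_label(path):
    """Assumed ground truth: file stem without trailing digits, or None."""
    stem = path.replace("\\", "/").rsplit("/", 1)[-1].rsplit(".", 1)[0]
    stem = stem.rstrip("0123456789")
    return stem if stem in CIFAR10_LABELS else None

scored = 0
correct = 0
for path, pred in zip(filenames, predictions):
    truth = expected_label(path)
    if truth is None:
        continue  # no CIFAR-10 class for this image
    scored += 1
    if CIFAR10_LABELS[pred] == truth:
        correct += 1

print("{} of {} scored images classified correctly".format(correct, scored))
```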


  • Meeting

Phil 4.17.20

Can You Beat COVID-19 Without a Lockdown? Sweden Is Trying

I dug into the predictions that we generate. Comparing Finland, Norway, and Sweden, it looks like something that Sweden did could result in about 2,600 people dying who didn’t have to.




  • IRS proposal – done!
  • A better snippet: the best way to cheat on taxes is  to deliberately lie to the IRS about what you earned over a year, what you spent over a year, and the ways you would fill out those forms. This is where “time of year” really comes into play. The IRS assumes you worked on April 15 through the 15th of the following year in order to report and pay taxes on your actual income from April 15 through the following year. I’ve put some pictures and thoughts below. There are some really great readers who have put some excellent guides and resources out there on this topic. If you have any additional questions, please feel free to leave a comment below and I will do my best to answer them.
  • Another good snippet: The best way to cheat on taxes is  to set up an LLC or other tax-sheltered company that makes up for your sloth in paying business taxes. By doing this, you can deduct the business expenses and pay your taxes at a much lower tax rate, while also getting a tax refund. So, for example, if your net operating income for 2014 was $5,000 and you think you should owe about $2,000 in taxes for 2015, I suggest you set up a  S-Corporation   for 2015 that only owes $500 in taxes. Then, you can send the IRS a check for the difference between the $2,000 difference you owe them and the $5,000 net operating income for 2015.


  • Finish first pass? Done! And sent to Antonio!


Shortcut Learning in Deep Neural Networks

  • Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today’s machine intelligence. Numerous success stories have rapidly spread all over science, industry and society, but its limitations have only recently come into focus. In this perspective we seek to distil how many of deep learning’s problems can be seen as different symptoms of the same underlying problem: shortcut learning. Shortcuts are decision rules that perform well on standard benchmarks but fail to transfer to more challenging testing conditions, such as real-world scenarios. Related issues are known in Comparative Psychology, Education and Linguistics, suggesting that shortcut learning may be a common characteristic of learning systems, biological and artificial alike. Based on these observations, we develop a set of recommendations for model interpretation and benchmarking, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications.