Phil 5.1.20

Geez, it’s May! What a weird time


  • Chatted with Zach. He’s bogged down in database issues, but I think it’s coming along

GPT-2 Agents

  • Upgrade TF, Torch, transformers, Nvidia, and CUDA on laptop
  • Set up input and output files
  • Pull char count of probe out and add that to the total generated
  • Try training on Moby Dick as per these instructions
    • The following example fine-tunes GPT-2 on WikiText-2. We’re using the raw WikiText-2 (no tokens were replaced before the tokenization). The loss here is that of causal language modeling.
      export TRAIN_FILE=/path/to/dataset/wiki.train.raw
      export TEST_FILE=/path/to/dataset/wiki.test.raw
      python run_language_modeling.py \
          --output_dir=output \
          --model_type=gpt2 \
          --model_name_or_path=gpt2 \
          --do_train \
          --train_data_file=$TRAIN_FILE \
          --do_eval \
          --eval_data_file=$TEST_FILE

      This takes about half an hour to train on a single K80 GPU and about one minute for the evaluation to run. It reaches a score of ~20 perplexity once fine-tuned on the dataset.

  • Ran with this command
    python run_language_modeling.py --output_dir=output .\gpt2data\moby_dick_model --model_type=gpt2 --model_name_or_path=gpt2 --do_train --train_data_file=.\gptdata\moby_dick_train.txt --do_eval --eval_data_file=.\gptdata\moby_dick_test.txt

    Which started the task correctly, but…

    RuntimeError: CUDA out of memory. Tried to allocate 96.00 MiB (GPU 0; 8.00 GiB total capacity; 6.26 GiB already allocated; 77.55 MiB free; 6.31 GiB reserved in total by PyTorch)

    Guess I’ll try running it on my work machine. If it runs there, I guess it’s time to upgrade my graphics card

  • That was not the problem! It's the batch size, not the graphics card. Added --per_gpu_train_batch_size=1 to the command
  • Couldn’t use links. os.path.isfile() chokes on them
  • The model doesn’t seem to be saved? Looks like it is:
    05/01/2020 09:43:49 - INFO - transformers.trainer -   Saving model checkpoint to output
    05/01/2020 09:43:49 - INFO - transformers.configuration_utils -   Configuration saved in output\config.json
    05/01/2020 09:43:50 - INFO - transformers.modeling_utils -   Model weights saved in output\pytorch_model.bin
    05/01/2020 09:43:50 - INFO - __main__ -   *** Evaluate ***
    05/01/2020 09:43:50 - INFO - transformers.trainer -   ***** Running Evaluation *****
    05/01/2020 09:43:50 - INFO - transformers.trainer -     Num examples = 97
    05/01/2020 09:43:50 - INFO - transformers.trainer -     Batch size = 16
    Evaluation: 100%|██████████| 7/7 [00:06<00:00,  1.00it/s]
    05/01/2020 09:43:57 - INFO - __main__ -   ***** Eval results *****
    05/01/2020 09:43:57 - INFO - __main__ -     perplexity = 43.311306196182095
  • Found it. It defaults to the output directory in transformers/examples
  • To get this version, which is a PyTorch model, you have to add the ‘from_pt=True’ argument:
    model = TFGPT2LMHeadModel.from_pretrained("../data/moby_dick_model", pad_token_id=tokenizer.eos_token_id, from_pt=True)
  • And the results are great!
    I enjoy walking with my cute dog:
    	[0]: I enjoy walking with my cute dog, and then I like to take pictures! But, as for you, you will have to go all the way round for the proper weather! Here, I have some water in my belly! How am I
    	[1]: I enjoy walking with my cute dog when I walk in the yard, and when we have been going in, I am always excited to try a little bit of the wildest stuff. I like to see my dogs do it. I like
    	[2]: I enjoy walking with my cute dog because he has no fear of you leaving him alone. In that case, let me explain that I am a retired Sperm Whale in my Sperm Whale breeding herd. I was recently the leader of the
    Far out in the uncharted backwaters of the unfashionable end:
    	[0]: Far out in the uncharted backwaters of the unfashionable end of the Indian Ocean, you will see whales of many great variety. “Wherever they go, their mouths may be wide open, or they may be so packed
    	[1]: Far out in the uncharted backwaters of the unfashionable end of the planet. On his way, it seemed that he was about to embark upon something which no mortal could have foreseen; it being the Cape Horn of the Pacific
    	[2]: Far out in the uncharted backwaters of the unfashionable end. A curious discovery is made of the whale-whale. How much is he? I wonder how many sperm whales have there! I am still trying to get
    It was a pleasure to burn. :
    	[0]: It was a pleasure to burn. His teeth were the first thing to slide down to the side of his cheeks—a pointless thing—while my face stood there in this hideous position. It was my last, and only,
    	[1]: It was a pleasure to burn. But, as the day wore on, another peculiarity was discovered in the method. When this first method was advanced to be used for preparing the best lye, it was found that it was, instead
    	[2]: It was a pleasure to burn. “Sir, “aye, that’s true—” said I with a sort of exasperation. I then took one of the other boats and in a very similar
    It was a bright cold day in April, and the clocks were striking thirteen. :
    	[0]: It was a bright cold day in April, and the clocks were striking thirteen. It seemed that Captain Peleg had had just arrived, and was sitting in his Captain-Commander's cabin, and was trying to get up some time; but Pe
    	[1]: It was a bright cold day in April, and the clocks were striking thirteen. One of us, who had been living in the tent for six days, still felt like the moon. I saw him. I saw him again. He looked just like
    	[2]: It was a bright cold day in April, and the clocks were striking thirteen. “Good afternoon, sir, it was the very first Sabbath of the year, and the New Year is the first time the people of the world have an
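
The perplexity in the evaluation log above can be related back to the loss. This is a minimal sketch, assuming the script reports perplexity as the exponential of the mean per-token cross-entropy (the standard definition for causal language modeling). The function names are made up for illustration and are not part of transformers.

```python
import math


def perplexity_from_loss(mean_loss: float) -> float:
    """Perplexity is exp(mean per-token cross-entropy), with the loss in nats."""
    return math.exp(mean_loss)


def loss_from_perplexity(perplexity: float) -> float:
    """Invert the relationship to recover the mean loss from a reported perplexity."""
    return math.log(perplexity)


# Value copied from the Moby Dick eval log
reported_perplexity = 43.311306196182095
implied_loss = loss_from_perplexity(reported_perplexity)
print(f"implied mean eval loss: {implied_loss:.4f} nats")
```

Under that assumption, the Moby Dick perplexity of ~43 corresponds to a mean eval loss of about 3.77 nats. The ~20 quoted for WikiText-2 would correspond to about 3.0.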


  • Need to get the chess database and build a corpus. Working on a PGN to English translator. Doesn’t look toooooo bad
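
A rough sketch of what the PGN to English step could look like, using only the standard library. This is not the translator being written here, just an illustration under stated assumptions: it handles plain SAN movetext (move numbers, captures, castling, promotion, check and mate), and ignores tag pairs, variations, and annotation glyphs like "!" or "?". All names (describe_move, translate_movetext, and so on) are hypothetical.

```python
import re
from typing import List

PIECES = {"K": "king", "Q": "queen", "R": "rook", "B": "bishop", "N": "knight"}
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

# piece letter, optional disambiguation, optional capture, target square, optional promotion
SAN = re.compile(
    r"^(?P<piece>[KQRBN])?(?P<origin>[a-h]?[1-8]?)(?P<capture>x)?"
    r"(?P<to>[a-h][1-8])(?:=(?P<promo>[QRBN]))?$"
)


def describe_origin(origin: str) -> str:
    """Turn a SAN disambiguation hint into words ('' if there is none)."""
    if not origin:
        return ""
    if len(origin) == 2:
        return f" on {origin}"
    if origin.isdigit():
        return f" on rank {origin}"
    return f" on the {origin}-file"


def describe_move(san: str, color: str) -> str:
    """Translate a single SAN move such as 'Bxc6+' into an English phrase."""
    suffix = ", checkmate" if san.endswith("#") else ", check" if san.endswith("+") else ""
    core = san.rstrip("+#")
    if core == "O-O":
        return f"{color} castles kingside{suffix}"
    if core == "O-O-O":
        return f"{color} castles queenside{suffix}"
    m = SAN.match(core)
    if m is None:
        raise ValueError(f"unrecognized move: {san!r}")
    piece = PIECES.get(m.group("piece"), "pawn")  # no piece letter means a pawn move
    verb = "takes on" if m.group("capture") else "moves to"
    text = f"{color} {piece}{describe_origin(m.group('origin'))} {verb} {m.group('to')}"
    if m.group("promo"):
        text += f" and promotes to a {PIECES[m.group('promo')]}"
    return text + suffix


def translate_movetext(movetext: str) -> List[str]:
    """Translate a PGN movetext string into one English phrase per half-move."""
    tokens = re.sub(r"\{[^}]*\}", " ", movetext).split()  # drop {comments}
    phrases = []
    ply = 0
    for token in tokens:
        token = re.sub(r"^\d+\.+", "", token)  # strip move numbers like '1.' or '3...'
        if not token or token in RESULTS:
            continue
        color = "white" if ply % 2 == 0 else "black"
        phrases.append(describe_move(token, color))
        ply += 1
    return phrases


for phrase in translate_movetext("1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Bxc6 dxc6 5. O-O"):
    print(phrase)
```

Emitting one phrase per half-move keeps the output easy to join into sentences or paragraphs later, which is probably what matters for building a training corpus.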


    • Continue with GANS. Maybe explore 1D CNNs?
    • The high-frequency run actually looks pretty good

      I think it may be a better use of my time to assemble all the components for a first-pass proof-of-concept

  • 10:00 Meeting with Vadim and Isaac
    • I walked through the whole controller architecture from the base class to the running version. Vadim will start implementing a Sim2 version using the base classes and the dictionary. Then we can work on writing to and reading from InfluxDB