Category Archives: Torch

Phil 7.3.20

Today is a federal holiday, so no rocket science

Hugging Face now has a pipeline interface that is pretty abstract. This works:

from transformers import pipeline

translator = pipeline("translation_en_to_fr")
print(translator("Hugging Face is a technology company based in New York and Paris", max_length=40))

[{'translation_text': 'Hugging Face est une entreprise technologique basée à New York et à Paris.'}]

Wow: GPT-3 writes code!

DtZ is back up! Too many countries have the disease and the histories had to be cropped to stay under the data cap for the free service

GPT-2 Agents

Work on more granular path finding
- Going to try the hypotenuse of distance to source and line first – nope
- Trying to compute each of the two distances separately and do a nested sort
- I had a problem where I was checking to see whether a point was between the current node and the target node using the original line between the source and target nodes. Except that I was checking on a line from the current node to the target, and failing the test. Oops! Fixed
- I went back to the hypotenuse version now that the in_between test isn’t broken and look at that!

granular

- Added the option for coarse or granular paths
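A minimal sketch of the hypotenuse idea from the notes above, not the actual project code. It assumes 2D points, reads "distance to source" as the distance to the current node, and scores each candidate by the hypotenuse of that distance and its distance to the original source-target line. The names dist_to_line, in_between, and granular_path are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def dist_to_line(p: Point, a: Point, b: Point) -> float:
    """Perpendicular distance from p to the infinite line through a and b."""
    length = math.hypot(b[0] - a[0], b[1] - a[1])
    if length == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    cross = (b[0] - a[0]) * (a[1] - p[1]) - (a[0] - p[0]) * (b[1] - a[1])
    return abs(cross) / length


def in_between(p: Point, a: Point, b: Point) -> bool:
    """True if the projection of p onto segment a-b lies past a and up to b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    denom = abx * abx + aby * aby
    if denom == 0:
        return False
    t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / denom
    return 0.0 < t <= 1.0


def granular_path(source: Point, target: Point, nodes: List[Point]) -> List[Point]:
    """Greedily step from source to target through nearby in-between nodes."""
    path = [source]
    current = source
    remaining = [n for n in nodes if n != source]
    while current != target:
        # Candidates must lie between the current node and the target.
        candidates = [n for n in remaining if in_between(n, current, target)]
        if not candidates:
            path.append(target)
            break

        def score(n: Point) -> float:
            # Hypotenuse of (distance to current node, distance to the
            # original source-target line). This pairing is an assumption.
            step = math.hypot(n[0] - current[0], n[1] - current[1])
            return math.hypot(step, dist_to_line(n, source, target))

        current = min(candidates, key=score)
        remaining.remove(current)
        path.append(current)
    return path
```

Each chosen node is removed from the pool, so the loop always terminates. Nodes that are far from the source-target line are penalized by the second term of the score and tend to be skipped.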
Start thinking about topic extraction for a given corpus

#COVID

Evaluate Arabic to English translation. Got it working!

from transformers import MarianTokenizer, MarianMTModel
from typing import List
src = 'ar'  # source language
trg = 'en'  # target language
sample_text = "لم يسافر أبي إلى الخارج من قبل"
sample_text2 = "الصحة_السعودية تعلن إصابة أربعيني بفيروس كورونا بالمدينة المنورة حيث صنفت عدواه بحالة أولية مخالطة الإبل مشيرة إلى أن حماية الفرد من(كورونا)تكون باتباع الإرشادات الوقائية والمحافظة على النظافة والتعامل مع #الإبل والمواشي بحرص شديد من خلال ارتداء الكمامة "
mname = f'Helsinki-NLP/opus-mt-{src}-{trg}'

model = MarianMTModel.from_pretrained(mname)
tok = MarianTokenizer.from_pretrained(mname)
batch = tok.prepare_translation_batch(src_texts=[sample_text2])  # don't need tgt_text for inference
gen = model.generate(**batch)  # for forward pass: model(**batch)
words: List[str] = tok.batch_decode(gen, skip_special_tokens=True) 
print(words)

It took a few tries to find the right model. The naming here is very haphazard.

Asked for a sanity check from the group

This:

الصحة_السعودية تعلن إصابة أربعيني بفيروس كورونا بالمدينة المنورة حيث صنفت عدواه بحالة أولية مخالطة الإبل مشيرة إلى أن حماية الفرد من(كورونا)تكون باتباع الإرشادات الوقائية والمحافظة على النظافة والتعامل مع #الإبل والمواشي بحرص شديد من خلال ارتداء الكمامة

Translates to this:

Saudi health announces a 40-year-old corona virus in the city of Manora, where his enemy was classified as a primary camel conglomerate, indicating that the protection of the individual from Corona would be through preventive guidance, hygiene, and careful handling of the Apple and the cattle by wearing the gag.

Write a script that takes a batch of rows and adds translations until all the rows in the table are complete
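One way the batch script could look, as a rough sketch under stated assumptions. The table name tweets and its columns (id, text, translation) are hypothetical, SQLite stands in for whatever database is actually used, and translate is any callable that maps a list of source strings to a list of translated strings.

```python
import sqlite3
from typing import Callable, List


def fill_translations(
    conn: sqlite3.Connection,
    translate: Callable[[List[str]], List[str]],
    batch_size: int = 10,
) -> int:
    """Translate untranslated rows in batches until none remain.

    Assumes a hypothetical table tweets(id, text, translation) where
    translation is NULL for rows that still need work.
    """
    total = 0
    while True:
        rows = conn.execute(
            "SELECT id, text FROM tweets WHERE translation IS NULL LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            break
        translations = translate([text for _, text in rows])
        # Store an empty string for a missing result so the row is not
        # selected again and the loop cannot spin forever.
        conn.executemany(
            "UPDATE tweets SET translation = ? WHERE id = ?",
            [(t or "", row_id) for t, (row_id, _) in zip(translations, rows)],
        )
        conn.commit()
        total += len(rows)
    return total
```

The translate argument could wrap the Marian tokenizer and model shown earlier. Keeping it as a parameter lets the batching loop be checked with a stub before any model is loaded.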

Book chat

List of folks who would be interesting to interview
- Stuart Russell
- Stuart Kauffman
- Alex (Sandy) Pentland
- Kate Starbird
- Joanna Bryson
- Daniel DeNicola
- Margaret Gilbert
- Joseph Liechty / Cecelia Clegg
- Rebecca Solnit
- Zeynep Tufekci
- Christian Jacob (directeur de recherche at the Centre national de la recherche scientifique in Paris)
- Ezra Klein

Phil 1.2.20

7:00 – 4:30 ASRC PhD

More highlighting and slides. Once I get through the Background section, I’ll write the overview, then repeat that pattern.
- I’m tweaking too much text to keep the markup version. Sigh.
- Finished Background and sent that to Wayne
GPT-2 Agents. See if we can get multiple texts generated – nope
- Build a corpus of .txt files
- Try running them through LMN
No NOAA meeting
No ORCA meeting

Phil 1.30.19

7:00 – 7:00 ASRC PhD

Nice visualization, with map-like aspects: The Climate Learning Tree
Dissertation
- Start JuryRoom section – done!
- Finished all content!
GPT-2 Agents
- Download big model and try to run it
- Move models and code out of the transformers project
GOES
- Learning by Cheating (sounds like a mechanism for simulation to work with)
  - Vision-based urban driving is hard. The autonomous system needs to learn to perceive the world and act in it. We show that this challenging learning problem can be simplified by decomposing it into two stages. We first train an agent that has access to privileged information. This privileged agent cheats by observing the ground-truth layout of the environment and the positions of all traffic participants. In the second stage, the privileged agent acts as a teacher that trains a purely vision-based sensorimotor agent. The resulting sensorimotor agent does not have access to any privileged information and does not cheat. This two-stage training procedure is counter-intuitive at first, but has a number of important advantages that we analyze and empirically demonstrate. We use the presented approach to train a vision-based autonomous driving system that substantially outperforms the state of the art on the CARLA benchmark and the recent NoCrash benchmark. Our approach achieves, for the first time, 100% success rate on all tasks in the original CARLA benchmark, sets a new record on the NoCrash benchmark, and reduces the frequency of infractions by an order of magnitude compared to the prior state of the art.
Meeting with Aaron
- Overview at the beginning of each chapter – look at Aaron’s chapter 5 for example intro and summary.
- Callouts in text should match the label
- Use \hfill to right-justify
- Footnote goes after punctuation
- Punctuation goes inside quotes
- For monospaced URLs use \texttt{} (perma.cc)
- Indent blockquotes 1/2 more tab
- Non-breaking spaces on names
- Increase figure sizes in intro

viztales

Dimension reduction, State, Orientation, and Speed
