Monthly Archives: March 2019

Phil 3.22.19

7:00 – ASRC PhD

  • Morning thought – primordial movement is navigation at a more generalized level. At this level, only broad features are visible, so it’s only easy to jump to a large, common feature like fear-of-the-other. For humans, I think there are very few accessible features at this level.
  • Adversarial attacks on medical machine learning
    • With public and academic attention increasingly focused on the new role of machine learning in the health information economy, an unusual and no-longer-esoteric category of vulnerabilities in machine-learning systems could prove important. These vulnerabilities allow a small, carefully designed change in how inputs are presented to a system to completely alter its output, causing it to confidently arrive at manifestly wrong conclusions. These advanced techniques to subvert otherwise-reliable machine-learning systems—so-called adversarial attacks—have, to date, been of interest primarily to computer science researchers (1). However, the landscape of often-competing interests within health care, and billions of dollars at stake in systems’ outputs, implies considerable problems. We outline motivations that various players in the health care system may have to use adversarial attacks and begin a discussion of what to do about them. Far from discouraging continued innovation with medical machine learning, we call for active engagement of medical, technical, legal, and ethical experts in pursuit of efficient, broadly available, and effective health care that machine learning will enable.
  • Jumping straight into coding
    • Moved the menu bars around
      • Added a “Chains” menu that runs sequences of analytics
      • Added an NLP menu for BOW and TF-IDF
      • Working on parsing the selected listings to add to the various lists
      • Added a “search” button to search for neighbors of terms in the spaces and places text areas

Phil 3.21.19

YouTube

ASRC PhD 7:00 – 6:00

  • Worked a little on the iConference slides. Found the section on subcritical/supercritical in Kaufman’s book
  • Working on drawing the player panalyzers
  • Visualizing a NetworkX graph in the Notebook with D3.js
    • In this recipe, we will create a graph in Python with NetworkX and visualize it in the Jupyter Notebook with D3.js.
  • Bokeh is an interactive visualization python library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of versatile graphics, and to extend this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.
    • Visualizing Network Graphs
      • Bokeh has added native support for creating network graph visualizations with configurable interactions between edges and nodes.
  • Got the BOW terms out of all the panalyzers.
  • Working on setting child embeddings
  • Scikit has TF-IDF and BOW vectorizers
  • Players plotted in the master embedding space: PlayerChildEmbedding
  • I just realized that I can plot the space and place terms in the same way, which should make some nice figures in the paper.
  • Here’s a plot of the different groups. Note that it is much wider. Something happened? Child_Embeddings
  • Anyway, I’m done for the day. Tomorrow I’ll start to add stop words and look for neighbor terms (And plot them!) SQ_GUI

Phil 3.20.19

ASRC PhD 7:00 – 2:00, NASA 2:00 – 4:00

  • Add Text area for “selected”, with a combobox for “Place”, “Space”, and “Ignore”. Ignore words get split(), added to a sorted list view, and saved in the config.xml
  • Add text area for program output as well as console. Text lists for users can be shown here. Selected terms from here can be selected and added to the ignore list
  • Draw the following as colored points on the embeddings
    • Top Room (bow/tf-idf) terms
    • Top Group (bow/tf-idf)
    • Place terms and Space terms (differentiated by room/group?)

NASA

  • Split my time between TL and AIMS through May, see what has the most traction

Meeting with Aaron M.

  • Nice discussion. He thinks I’m on a good track, and agrees with using mixed methods as a step towards full automation

Made some tweaks to Antonio’s reply article

Phil 3.19.19

7:00 – 5:00, 6:30 – 7:15 ASRC PhD

  • Algorithmic Rationality: Game Theory with Costly Computation
    • We develop a general game-theoretic framework for reasoning about strategic agents performing possibly costly computation. In this framework, many traditional game-theoretic results (such as the existence of a Nash equilibrium) no longer hold. Nevertheless, we can use the framework to provide psychologically appealing explanations of observed behavior in well-studied games (such as finitely repeated prisoner’s dilemma and rock-paper-scissors). Furthermore, we provide natural conditions on games sufficient to guarantee that equilibria exist.
  • NVIDIA has an IoT NN chipset as well
  • Recreate DB with file pull. Success! Here’s the code that works. It takes a downloaded Slack chat session and iterates over that. When it finds a message with a “files” subobject, it does the following. Shout out to the wonderful requests library:
    for mf in entry["files"]:
        mf["user"] = entry["user"]
        mf["ts"] = entry["ts"]
        if mf["pretty_type"] == "Post":
            if use_local :
                to_return += mf["preview"]
            else:
                url = mf["url_private"]
                try:
                    result = requests.get(url, headers={'Authorization': 'Bearer %s' % self.slack_token})
                    dict = json.loads(result.text)
                    root = dict["root"]
                    to_return = ""
                    ch_list = root["children"]
                    for ch in ch_list:
                        if ch["type"] == 'p':
                            to_return += "{} ".format(ch["text"])
                    print("handle_file_entries(): text = {}".format(to_return))
                except requests.exceptions.RequestException as err:
                    print("Got a RequestException: {}".format(err))
                    to_return += mf["preview"]
  • Whoops, tymora3 doesn’t have the “Near the port city of Waterdeep in Faerun” phrase. Switching to “young man in a well crafted tunic”
  • Add explicit file opening and saving – done
  • Add “Embedding Dimensions” field – done
  • Example xml file:
    <config>
       <name>test_2019.03.19_11.24</name>
       <buckets>10</buckets>
       <similarity>0.01</similarity>
       <dimensions>3</dimensions>
       <dbs>
          <db>ab_slack</db>
          <db>phpbb</db>
       </dbs>
       <channels>
          <channel>Group 1</channel>
          <channel>tymora1</channel>
          <channel>tymora2</channel>
          <channel>tymora3</channel>
          <channel>peanutgallery</channel>
       </channels>
       <places>
          <place>Orc</place>
          <place>Goblin</place>
          <place>stairs</place>
          <place>orb</place>
          <place>statues</place>
          <place>troll</place>
          <place>Grogg</place>
       </places>
       <spaces>
          <space>fight</space>
          <space>diplomacy</space>
          <space>sing</space>
          <space>sleep</space>
       </spaces>
       <splits>
          <split>young man in a well crafted tunic</split>
          <split>brightly glowing blue orb</split>
          <split>large scaled troll sleeping</split>
          <split>the sea of gold coins and gems filling it</split>
          <split>Two women lounge on chairs across from each other</split>
       </splits>
    </config>
  • Reading and writing data are done. Now to start slicing and displaying data interactively

CSAGUI

  • Data slicing
    • Build embedding for all docs
    • Starting to add bucketing code. First step is to get ignore text from individual users, second task is to have single, room aligned buckets, gust so we can see what that looks like
  • JuryRoom meeting at 6:30
    • Added my req’s
      • Groups can stay together
      • Threaded questions in sequential or random order
      • Voting for post, rather than yes/no questions
    • Tony raised a bunch of points about how the conversation could be gamed. My response was that we should build something simple and then try to game it to see what affordances we need
    • Extended discussion on display real estate – how do we show the “starred” posts
      • Tony mentioned the idea that starred posts could fade if they languish without additional stars
    • Panos mentioned the idea of a countdown clock to pressure a vote
    • We walked through the implementation issues. The estimated framework is 3-tier, with a relational DB, Node on the server, and a web client running a framework like React. The goal is a “Hello World” application that we can log into and create accounts running by next week
    • I pointed back to the original JuryRoom document’s strawman representation of the schema.
  • Synonyms for fringe
    • Synonyms: Noun
      • border, borderline, bound, boundary, brim, circumference, compass, confines, edge, edging, end, frame, hem, margin, perimeter, periphery, rim, skirt, skirting, verge
    • Synonyms: Verb
      • abut, adjoin, border (on), butt (on or against), flank, join, march (with), neighbor, skirt, touch, verge (on)

Phil 3.18.19

ASRC PhD 7:00 – 6:00

  • SlackToDb
    • Pull down text – Done, I hope. The network here has bad problems with TLS resolution. Will try from home
    • Link sequential posts – done
    • Add word lists for places and spaces (read from file, also read embeddings)
      • Writing out the config file – done
    • Add field for similarity distance threshold. Changing this lists nearby words in the embedding space. These terms are used for trajectory generation and centrality tables.
    • Add plots for place/space words
    • Add phrase-based splitting to find rooms. Buckets work within these splits. Text before the first split and after the last split isn’t used (For embedding, centrality, etc.)
    • Add phrase-based trimming. Test before one and after the other isn’t used
    • Stub out centrality for (embedded) terms and (concatenated, bucketed, and oriented) documents
    • Look into 3d for tkinter (from scratch)

       

    • Progress for the day:

SAGUI

Phil 3.17.19

Got a really good idea about doing a hybrid coding model using embeddings. We start with a list of “place terms” and a list of “space terms”. We then use the embedded representation (vector) of those terms to find the adjacent terms. This is a sort of automated “snowball sampling”, where terms can lead to other terms. Once we have these terms, we use them as queries into the database to find the campaign and the timestamp for each. We use these to create the trajectories and maps.

This is a pretty straightforward code and a set of queries to write, and I have high confidence that it will work, and provide a novel, understandable method of producing a nice ‘mixed method’ process that is also grounded completely in the corpora.

Phil 3.15.19

7:00 – ASRC

  • Downloaded the JuryRoom spec from Waikato and sent my sympathies for Christchurch
  • Worked on getting cosine distance working – Done. Also created spreadsheets of the distances between posts and list the posts on a tab in the spreadsheet. I strip out the words that aren’t used to make the vectors so the posts look a little funny, but the gist is there:

Phil 3.14.19

ASRC AIMS 7:00 – 4:00, PhD ML, 4:30 –

Phil 3.13.19

7:00 – 5:00 ASRC AIMS

SAv3.13

  • Got the db reading in and creating PostAnalyzer objects for each user by channel
  • Need to also create a PostAnalyzer that contains the entire set of runs. Since that crosses DBs, I think the best way to do this is to create a method that lets me load additional data into an existing instance
    • Added load_data() method to PostAnalyzer. Seems to be working
    • The GUI code was getting ugly with the analytics, so I did some refactoring and now have an MVC architecture and am happier
  • Create the master embedding – done!!!! The number of points seems low (98), but I’ll look at that tomorrow.Embedding
  • Compare user average vectors in a user x user matrix
  • Compare post average vectors in a post x post matrix
  • Missed the JuryRoom Skype last night. Aaron was there though. Need to catch up
    • Quick notes for JuryRoom:
      • The votes should be for a posted response, not a yes/no to the original question
      • Groups should be able stick together if they want
      • Topics should be “threadable” for groups, with defined and randomized order
  • Steve S. Is going to read the paper and make suggestions
  • Here’s how you import into postgres: .\pg_restore.exe -h localhost -p 5433 -U postgres -d GEMSEC_logs -v “D:/Development/A2P/GEMSEC_logs/greatdb.backup”
  • Aaron’s blog is up!

GAN_Fashion

Click to see trajectories through fashion space (paper)

Phil 3.12.19

7:00 – 4:00 ASRC PhD

TFK

d1dpqqlxgaansuo

Phil 3.11.19

7:00 – 10:00 ASRC PhD. Fun, long day.

Phil 3.10.19

Learning to Speak and Act in a Fantasy Text Adventure Game

  • We introduce a large scale crowdsourced text adventure game as a research platform for studying grounded dialogue. In it, agents can perceive, emote, and act whilst conducting dialogue with other agents. Models and humans can both act as characters within the game. We describe the results of training state-of-the-art generative and retrieval models in this setting. We show that in addition to using past dialogue, these models are able to effectively use the state of the underlying world to condition their predictions. In particular, we show that grounding on the details of the local environment, including location descriptions, and the objects (and their affordances) and characters (and their previous actions) present within it allows better predictions of agent behavior and dialogue. We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relate to agents that can talk and act successfully.

New run in the dungeon. Exciting!

Finished my pass through Antonio’s paper

Zoe Keating (May 1) or Imogen Heap (May 3)?

Phil 3.9.19

Understanding China’s AI Strategy

  • In my interactions with Chinese government officials, they demonstrated remarkably keen understanding of the issues surrounding AI and international security. It is clear that China’s government views AI as a high strategic priority and is devoting the required resources to cultivate AI expertise and strategic thinking among its national security community. This includes knowledge of U.S. AI policy discussions. I believe it is vital that the U.S. policymaking community similarly prioritize cultivating expertise and understanding of AI developments in China.

Russian Trolls Shift Strategy to Disrupt U.S. Election in 2020

  • Russian internet trolls appear to be shifting strategy in their efforts to disrupt the 2020 U.S. elections, promoting politically divisive messages through phony social media accounts instead of creating propaganda themselves, cybersecurity experts say.

Backup phone

Work on SASO paper – started

Rachel’s dungeon run is tomorrow! Maybe cross 10,000 posts?

Look at using BERT and the full Word2Vec model for analyzing posts

The Promise of Hierarchical Reinforcement Learning

  • To really understand the need for a hierarchical structure in the learning algorithm and in order to make the bridge between RL and HRL, we need to remember what we are trying to solve: MDPs. HRL methods learn a policy made up of multiple layers, each of which is responsible for control at a different level of temporal abstraction. Indeed, the key innovation of the HRL is to extend the set of available actions so that the agent can now choose to perform not only elementary actions, but also macro-actions, i.e. sequences of lower-level actions. Hence, with actions that are extended over time, we must take into account the time elapsed between decision-making moments. Luckily, MDP planning and learning algorithms can easily be extended to accommodate HRL.

Phil 3.7.19

Day 2 of the TF Dev summit. Worth the money, though much less research-y and more implementation and production-y

Google Cloud has Fedramp certification, which it does see details here.

Live Transcribe

Coral: On Device Transfer learning (paper)

TF 2.0 API \changes and Behavior changes

  • Best practices (link: )
  • Declare variables at the beginning of the code
  • Keras Functional API
    • The Keras functional API is the way to go for defining complex models, such as multi-output models, directed acyclic graphs, or models with shared layers.
  • Autograd can automatically differentiate native Python and Numpy code. It can handle a large subset of Python’s features, including loops, ifs, recursion and closures, and it can even take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation), which means it can efficiently take gradients of scalar-valued functions with respect to array-valued arguments, as well as forward-mode differentiation, and the two can be composed arbitrarily. The main intended application of Autograd is gradient-based optimization. For more information, check out the tutorial and the examples directory.
  • JAX is Autograd and XLA, brought together for high-performance machine learning research. With its updated version of Autograd, JAX can automatically differentiate native Python and NumPy functions. It can differentiate through loops, branches, recursion, and closures, and it can take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation) via grad as well as forward-mode differentiation, and the two can be composed arbitrarily to any order.
  • Effective TF 2.0: There are multiple changes in TensorFlow 2.0 to make TensorFlow users more productive. TensorFlow 2.0 removes redundant APIs, makes APIs more consistent (Unified RNNsUnified Optimizers), and better integrates with the Python runtime with Eager execution.