Category Archives: research

Phil 3.20.19

ASRC PhD 7:00 – 2:00, NASA 2:00 – 4:00

  • Add Text area for “selected”, with a combobox for “Place”, “Space”, and “Ignore”. Ignore words get split(), added to a sorted list view, and saved in the config.xml
  • Add text area for program output as well as console. Text lists for users can be shown here. Selected terms from here can be selected and added to the ignore list
  • Draw the following as colored points on the embeddings
    • Top Room (bow/tf-idf) terms
    • Top Group (bow/tf-idf)
    • Place terms and Space terms (differentiated by room/group?)
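A minimal sketch of the ignore-list handling described above: split the selected text, merge into a sorted list, and persist it to config.xml. The `<ignores>` element name and the `add_ignore_terms` function are my own assumptions; the example config below doesn't show the ignore section yet.

```python
import xml.etree.ElementTree as ET

def add_ignore_terms(selected_text: str, config_path: str = "config.xml"):
    """Split the selected text into terms, merge them into the sorted
    ignore list, and save the result back to the config file.
    NOTE: the <ignores>/<ignore> element names are assumptions."""
    tree = ET.parse(config_path)
    root = tree.getroot()
    ignores = root.find("ignores")
    if ignores is None:
        ignores = ET.SubElement(root, "ignores")
    existing = {i.text for i in ignores.findall("ignore")}
    merged = sorted(existing | set(selected_text.split()))
    ignores.clear()  # rebuild the element with the merged, sorted terms
    for term in merged:
        ET.SubElement(ignores, "ignore").text = term
    tree.write(config_path)
    return merged
```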

Phil 3.19.19

7:00 – 5:00, 6:30 – 7:15 ASRC PhD

  • Algorithmic Rationality: Game Theory with Costly Computation
    • We develop a general game-theoretic framework for reasoning about strategic agents performing possibly costly computation. In this framework, many traditional game-theoretic results (such as the existence of a Nash equilibrium) no longer hold. Nevertheless, we can use the framework to provide psychologically appealing explanations of observed behavior in well-studied games (such as finitely repeated prisoner’s dilemma and rock-paper-scissors). Furthermore, we provide natural conditions on games sufficient to guarantee that equilibria exist.
  • NVIDIA has an IoT NN chipset as well
  • Recreate DB with file pull. Success! Here’s the code that works. It takes a downloaded Slack chat session and iterates over that. When it finds a message with a “files” subobject, it does the following. Shout out to the wonderful requests library:
    for mf in entry["files"]:
        mf["user"] = entry["user"]
        mf["ts"] = entry["ts"]
        if mf["pretty_type"] == "Post":
            if use_local:
                to_return += mf["preview"]
            else:
                url = mf["url_private"]
                try:
                    result = requests.get(url, headers={'Authorization': 'Bearer %s' % self.slack_token})
                    post_dict = json.loads(result.text)  # renamed: don't shadow the dict builtin
                    for ch in post_dict["root"]["children"]:
                        if ch["type"] == 'p':  # paragraph nodes carry the post text
                            to_return += "{} ".format(ch["text"])
                    print("handle_file_entries(): text = {}".format(to_return))
                except requests.exceptions.RequestException as err:
                    print("Got a RequestException: {}".format(err))
                    to_return += mf["preview"]  # fall back to the bundled preview
  • Whoops, tymora3 doesn’t have the “Near the port city of Waterdeep in Faerun” phrase. Switching to “young man in a well crafted tunic”
  • Add explicit file opening and saving – done
  • Add “Embedding Dimensions” field – done
  • Example xml file:
    <config>
       <name>test_2019.03.19_11.24</name>
       <buckets>10</buckets>
       <similarity>0.01</similarity>
       <dimensions>3</dimensions>
       <dbs>
          <db>ab_slack</db>
          <db>phpbb</db>
       </dbs>
       <channels>
          <channel>Group 1</channel>
          <channel>tymora1</channel>
          <channel>tymora2</channel>
          <channel>tymora3</channel>
          <channel>peanutgallery</channel>
       </channels>
       <places>
          <place>Orc</place>
          <place>Goblin</place>
          <place>stairs</place>
          <place>orb</place>
          <place>statues</place>
          <place>troll</place>
          <place>Grogg</place>
       </places>
       <spaces>
          <space>fight</space>
          <space>diplomacy</space>
          <space>sing</space>
          <space>sleep</space>
       </spaces>
       <splits>
          <split>young man in a well crafted tunic</split>
          <split>brightly glowing blue orb</split>
          <split>large scaled troll sleeping</split>
          <split>the sea of gold coins and gems filling it</split>
          <split>Two women lounge on chairs across from each other</split>
       </splits>
    </config>
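The example config file can be read back with the standard-library ElementTree module. A minimal sketch; the `load_config` name and the flat-dict return shape are my own choices:

```python
import xml.etree.ElementTree as ET

def load_config(path: str) -> dict:
    """Read an experiment config like the XML example into a plain dict."""
    root = ET.parse(path).getroot()
    return {
        "name": root.findtext("name"),
        "buckets": int(root.findtext("buckets")),
        "similarity": float(root.findtext("similarity")),
        "dimensions": int(root.findtext("dimensions")),
        "dbs": [d.text for d in root.findall("dbs/db")],
        "channels": [c.text for c in root.findall("channels/channel")],
        "places": [p.text for p in root.findall("places/place")],
        "spaces": [s.text for s in root.findall("spaces/space")],
        "splits": [s.text for s in root.findall("splits/split")],
    }
```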
  • Reading and writing data are done. Now to start slicing and displaying data interactively

CSAGUI

  • Data slicing
    • Build embedding for all docs
    • Starting to add bucketing code. The first step is to get ignore text from individual users; the second is to have single, room-aligned buckets, just so we can see what that looks like
  • JuryRoom meeting at 6:30
    • Added my req’s
      • Groups can stay together
      • Threaded questions in sequential or random order
      • Voting for post, rather than yes/no questions
    • Tony raised a bunch of points about how the conversation could be gamed. My response was that we should build something simple and then try to game it to see what affordances we need
    • Extended discussion on display real estate – how do we show the “starred” posts
      • Tony mentioned the idea that starred posts could fade if they languish without additional stars
    • Panos mentioned the idea of a countdown clock to pressure a vote
    • We walked through the implementation issues. The estimated framework is 3-tier, with a relational DB, Node on the server, and a web client running a framework like React. The goal is a “Hello World” application that we can log into and create accounts running by next week
    • I pointed back to the original JuryRoom document’s strawman representation of the schema.
  • Synonyms for fringe
    • Synonyms: Noun
      • border, borderline, bound, boundary, brim, circumference, compass, confines, edge, edging, end, frame, hem, margin, perimeter, periphery, rim, skirt, skirting, verge
    • Synonyms: Verb
      • abut, adjoin, border (on), butt (on or against), flank, join, march (with), neighbor, skirt, touch, verge (on)
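The room-aligned bucketing step under "Data slicing" above could be sketched like this: each room's ordered post list gets divided into a fixed number of contiguous buckets (the bucket count comes from the config, e.g. 10), so analytics line up across rooms of different lengths. The `bucket_posts` name and proportional-index scheme are my own assumptions:

```python
def bucket_posts(posts, bucket_count):
    """Split an ordered list of posts into bucket_count roughly equal,
    contiguous buckets so rooms of different lengths stay aligned."""
    buckets = [[] for _ in range(bucket_count)]
    for i, post in enumerate(posts):
        # proportional index: post i of n lands in bucket i*b//n
        idx = min(i * bucket_count // len(posts), bucket_count - 1)
        buckets[idx].append(post)
    return buckets
```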

Phil 3.18.19

ASRC PhD 7:00 – 6:00

  • SlackToDb
    • Pull down text – Done, I hope. The network here has bad problems with TLS resolution. Will try from home
    • Link sequential posts – done
    • Add word lists for places and spaces (read from file, also read embeddings)
      • Writing out the config file – done
    • Add field for similarity distance threshold. Changing this lists nearby words in the embedding space. These terms are used for trajectory generation and centrality tables.
    • Add plots for place/space words
    • Add phrase-based splitting to find rooms. Buckets work within these splits. Text before the first split and after the last split isn’t used (For embedding, centrality, etc.)
    • Add phrase-based trimming. Text before one phrase and after the other isn’t used
    • Stub out centrality for (embedded) terms and (concatenated, bucketed, and oriented) documents
    • Look into 3d for tkinter (from scratch)
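The similarity-distance-threshold field above could drive a lookup like this: list every term whose cosine distance to a selected term falls under the threshold. A sketch assuming the embedding is a dict of term to vector (the `nearby_terms` name is hypothetical):

```python
import math

def nearby_terms(term, embeddings, threshold):
    """Return terms whose cosine distance to `term` is below the
    threshold. `embeddings` maps term -> list of floats (assumed)."""
    def cos_dist(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return 1.0 - dot / (na * nb)
    target = embeddings[term]
    return sorted(other for other, vec in embeddings.items()
                  if other != term and cos_dist(target, vec) < threshold)
```

Lowering the threshold tightens the neighborhood used for trajectory generation and the centrality tables.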


    • Progress for the day:

SAGUI

Phil 3.17.19

Got a really good idea about doing a hybrid coding model using embeddings. We start with a list of “place terms” and a list of “space terms”. We then use the embedded representation (vector) of those terms to find the adjacent terms. This is a sort of automated “snowball sampling”, where terms can lead to other terms. Once we have these terms, we use them as queries into the database to find the campaign and the timestamp for each. We use these to create the trajectories and maps.

This is a pretty straightforward code and a set of queries to write, and I have high confidence that it will work, and provide a novel, understandable method of producing a nice ‘mixed method’ process that is also grounded completely in the corpora.
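The snowball step could be sketched as repeated neighbor expansion from the seed lists. Here `nearest` is a hypothetical function from a term to its nearby terms in the embedding space (e.g. backed by the similarity-threshold lookup):

```python
def snowball(seed_terms, nearest, rounds=2):
    """Expand a seed term list by repeatedly pulling in embedding
    neighbors. `nearest` maps a term to a list of nearby terms; the
    interface is an assumption. Stops early if no new terms appear."""
    found = set(seed_terms)
    frontier = set(seed_terms)
    for _ in range(rounds):
        new_terms = set()
        for term in frontier:
            new_terms.update(nearest(term))
        frontier = new_terms - found  # only genuinely new terms continue
        found |= frontier
        if not frontier:
            break
    return sorted(found)
```

The returned terms then become the queries into the database to recover campaign and timestamp for each.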

Phil 3.15.19

7:00 – ASRC

  • Downloaded the JuryRoom spec from Waikato and sent my sympathies for Christchurch
  • Worked on getting cosine distance working – Done. Also created spreadsheets of the distances between posts and list the posts on a tab in the spreadsheet. I strip out the words that aren’t used to make the vectors so the posts look a little funny, but the gist is there:
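The post-to-post distance spreadsheet could be produced roughly like this, assuming each post has already been reduced to an equal-length term-count vector (the `distance_matrix` name is my own):

```python
import math

def distance_matrix(post_vectors):
    """Build a symmetric post x post cosine-distance matrix from a
    list of equal-length count vectors, one per post."""
    def cos_dist(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return 1.0 - dot / (na * nb)
    n = len(post_vectors)
    return [[cos_dist(post_vectors[i], post_vectors[j]) for j in range(n)]
            for i in range(n)]
```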

Phil 3.14.19

ASRC AIMS 7:00 – 4:00, PhD ML, 4:30 –

Phil 3.13.19

7:00 – 5:00 ASRC AIMS

SAv3.13

  • Got the db reading in and creating PostAnalyzer objects for each user by channel
  • Need to also create a PostAnalyzer that contains the entire set of runs. Since that crosses DBs, I think the best way to do this is to create a method that lets me load additional data into an existing instance
    • Added load_data() method to PostAnalyzer. Seems to be working
    • The GUI code was getting ugly with the analytics, so I did some refactoring and now have an MVC architecture and am happier
  • Create the master embedding – done!!!! The number of points seems low (98), but I’ll look at that tomorrow. Embedding
  • Compare user average vectors in a user x user matrix
  • Compare post average vectors in a post x post matrix
  • Missed the JuryRoom Skype last night. Aaron was there though. Need to catch up
    • Quick notes for JuryRoom:
      • The votes should be for a posted response, not a yes/no to the original question
      • Groups should be able to stick together if they want
      • Topics should be “threadable” for groups, with defined and randomized order
  • Steve S. Is going to read the paper and make suggestions
  • Here’s how you import into postgres: .\pg_restore.exe -h localhost -p 5433 -U postgres -d GEMSEC_logs -v "D:/Development/A2P/GEMSEC_logs/greatdb.backup"
  • Aaron’s blog is up!

GAN_Fashion

Click to see trajectories through fashion space (paper)

Phil 3.12.19

7:00 – 4:00 ASRC PhD

TFK


Phil 3.11.19

7:00 – 10:00 ASRC PhD. Fun, long day.

Phil 3.10.19

Learning to Speak and Act in a Fantasy Text Adventure Game

  • We introduce a large scale crowdsourced text adventure game as a research platform for studying grounded dialogue. In it, agents can perceive, emote, and act whilst conducting dialogue with other agents. Models and humans can both act as characters within the game. We describe the results of training state-of-the-art generative and retrieval models in this setting. We show that in addition to using past dialogue, these models are able to effectively use the state of the underlying world to condition their predictions. In particular, we show that grounding on the details of the local environment, including location descriptions, and the objects (and their affordances) and characters (and their previous actions) present within it allows better predictions of agent behavior and dialogue. We analyze the ingredients necessary for successful grounding in this setting, and how each of these factors relate to agents that can talk and act successfully.

New run in the dungeon. Exciting!

Finished my pass through Antonio’s paper

Zoe Keating (May 1) or Imogen Heap (May 3)?

Phil 3.4.19

7:00 – 5:00 ASRC

  • Build an interactive SequenceAnalyzer. The adjustments are
    • Number of buckets
    • Percentages for each analytic (percentage to keep/discard)
    • Selectable skip words that can be added to a list (in the db?)
  • Algorithm
    1. Find the most common words across all groups, these are skip_words
    2. Find the most common words along the entire series of posts per player and eliminate them
    3. Find the most common/central words across all sequences and keep those as belief places
    4. For each sequence by group, find the most common/central words after the belief places. These are the belief spaces.
    5. Build an adjacency matrix of players, groups, places and spaces
    6. Build submatrices for centrality calculations? This could be done instead of simply finding the most common words
    7. Possible word2vec variations?
      1. It seems to me that I might be able to use direction cosines and dynamic time warping to calculate the similarity of posts and align them better than the overall scaling that I’m doing now. DM posts introducing a room should align perfectly, and then other scaling could happen between those areas of greatest alignment
  • Display
    • Menu:
      • Save spreadsheet (includes config, included words, posts(?), trajectories)
      • load data
      • select database
      • select group within db
      • load/save config file
      • clear all
    • Fields
      • percent for A1, A2, A3, A4
      • Centrality/Sum switch
      • BOW/TF-IDF switch
      • Word2vec switch?
    • Textarea (areas? tabbed?)
      • Table with rows as sequence step. Columns are grouped by places, spaces, groups, and players
  • Worked on Antonio’s paper – got a first draft of the introduction and motivation
  • BAA
    • Upload latex and references to laptop
  • Haircut! Pack!
  • Model-Based Reinforcement Learning for Atari
    • Model-free reinforcement learning (RL) can be used to learn effective policies for complex tasks, such as Atari games, even from image observations. However, this typically requires very large amounts of interaction — substantially more, in fact, than a human would need to learn the same games. How can people learn so quickly? Part of the answer may be that people can learn how the game works and predict which actions will lead to desirable outcomes. In this paper, we explore how video prediction models can similarly enable agents to solve Atari games with orders of magnitude fewer interactions than model-free methods. We describe Simulated Policy Learning (SimPLe), a complete model-based deep RL algorithm based on video prediction models and present a comparison of several model architectures, including a novel architecture that yields the best results in our setting. Our experiments evaluate SimPLe on a range of Atari games and achieve competitive results with only 100K interactions between the agent and the environment (400K frames), which corresponds to about two hours of real-time play.
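Steps 1–2 of the SequenceAnalyzer algorithm above (most-common words across all groups become skip words) could start as simply as a Counter pass. The `posts_by_group` shape and the `top_n` cutoff are assumptions:

```python
from collections import Counter

def find_skip_words(posts_by_group, top_n=20):
    """Step 1: words most common across ALL groups become skip words.
    posts_by_group maps group name -> list of post strings (assumed);
    top_n is a stand-in for the percentage fields in the GUI."""
    counts = Counter()
    for posts in posts_by_group.values():
        for post in posts:
            counts.update(post.lower().split())
    return {word for word, _ in counts.most_common(top_n)}
```

The same pass, restricted to a single player's post series, would cover step 2; the survivors feed the belief-place and belief-space selection.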


Phil 3.3.19

Once more, icky weather makes me productive

  • Ingested all the runs into the db. We are at 7,246 posts
  • Reworking the 5 bucket analysis
  • Building better ignore files and rebuilding bucket spreadsheets. It turns out that for tymora1, names took up 25% of the BOW, so I increased the fraction saved to the trimmed spreadsheets to 50%
  • Building bucket spreadsheets and saving the centrality vector
  • Here’s what I’ve got so far: ThreeRuns
  • Trajectories: Trajectories
  • First map: firstMap
  • Here it is annotated: firstMapAnnotated
  • Some thoughts. I think this is still “zoomed out” too far. Changing the granularity should help some. I need to automate some of my tools though. The other issue is how I’m assembling my sequences.

Phil 3.2.19

Updating SheetToMap to take comma separated cell names. Lines 180 – 193. I think I’ll need an iterating compare function. Nope, wound up doing something simpler

for (String colName : colNames) {
    String curCells = tm.get(colName);
    String[] cellArray = curCells.split("\\|\\|"); // new: split the "||"-separated cell names
    for(String curCell : cellArray) {
        addNode(curCell, rowName);
        if (prevCell != null && !curCell.equals(prevCell)) {
            String edgeName = curCell + "+" + prevCell;
            if (graph.getEdge(edgeName) == null) {
                try {
                    graph.addEdge(edgeName, curCell, prevCell);
                    System.out.println("adding edge [" + edgeName + "]");
                } catch (EdgeRejectedException e) {
                    System.out.println("didn't add edge [" + edgeName + "]");
                }
            }
        }
        prevCell = curCell;
    }

    //System.out.print(curCell + ", ");
    prevCell = cellArray[0];
    col++;
}

Updating GPM to generate comma separated cell names in trajectories

  • need to get the previous n cell names
  • Need to change the cellName val in FlockingBeliefCA to be a stack of tail length. Done.
  • Parsed the strings in SheetToMap. Each cell has a root name (the first) which connects to the roots of the previous cell. The root then links to the subsequent names in the chain of names that are separated by “||”
    "cell_[4, 5]||cell_[4, 4]||cell_[4, 3]||cell_[4, 2]||cell_[4, 1]"
  • Seems to be working: tailtest

Phil 3.1.19

7:00 – ASRC

  • Got accepted to the TF dev conference. The flight out is expensive… Sent Eric V. a note asking for permission to go, but bought tix anyway given the short fuse
  • Downloaded the full slack data
  • Working on white paper. The single file was getting unwieldy, so I broke it up
  • Found Speeding up Parliamentary Decision Making for Cyber Counter-Attack, which argues for the possibility of pre-authorizing automated response
  • Up to six pages. In the middle of the cyberdefense section