Category Archives: Development

Phil 3.20.19

ASRC PhD 7:00 – 2:00, NASA 2:00 – 4:00

  • Add a text area for “selected” terms, with a combobox for “Place”, “Space”, and “Ignore”. Ignore words get split(), added to a sorted list view, and saved in config.xml
  • Add a text area for program output in addition to the console. Per-user term lists can be shown here, and terms can be selected from it and added to the ignore list
  • Draw the following as colored points on the embeddings (see the sketch after this list):
    • Top Room (bow/tf-idf) terms
    • Top Group (bow/tf-idf)
    • Place terms and Space terms (differentiated by room/group?)
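
A minimal sketch of the colored-point overlay, assuming the embedding has already been reduced to 2D; the coordinates and term lists here (embedding_2d, term_groups) are made-up placeholders, not real output:

    import matplotlib.pyplot as plt

    # hypothetical inputs: a 2D coordinate per term, plus the term groups to highlight
    embedding_2d = {"orc": (0.1, 0.8), "fight": (0.3, 0.2), "orb": (0.7, 0.5), "sing": (0.9, 0.1)}
    term_groups = {
        "top room (tf-idf)": (["orc"], "red"),
        "top group (tf-idf)": (["orb"], "orange"),
        "place terms": (["orc", "orb"], "blue"),
        "space terms": (["fight", "sing"], "green"),
    }

    fig, ax = plt.subplots()
    for label, (terms, color) in term_groups.items():
        xs = [embedding_2d[t][0] for t in terms if t in embedding_2d]
        ys = [embedding_2d[t][1] for t in terms if t in embedding_2d]
        ax.scatter(xs, ys, c=color, label=label, alpha=0.6)
    for term, (x, y) in embedding_2d.items():
        ax.annotate(term, (x, y))   # label every point with its term
    ax.legend()
    plt.show()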

Phil 3.19.19

7:00 – 5:00, 6:30 – 7:15 ASRC PhD

  • Algorithmic Rationality: Game Theory with Costly Computation
    • We develop a general game-theoretic framework for reasoning about strategic agents performing possibly costly computation. In this framework, many traditional game-theoretic results (such as the existence of a Nash equilibrium) no longer hold. Nevertheless, we can use the framework to provide psychologically appealing explanations of observed behavior in well-studied games (such as finitely repeated prisoner’s dilemma and rock-paper-scissors). Furthermore, we provide natural conditions on games sufficient to guarantee that equilibria exist.
  • NVIDIA has an IoT NN chipset as well
  • Recreate the DB with a file pull. Success! Here’s the code that works: it takes a downloaded Slack chat session, iterates over it, and when it finds a message with a “files” sub-object, does the following. Shout out to the wonderful requests library:
    # assumes json and requests are imported at module level
    for mf in entry["files"]:
        mf["user"] = entry["user"]
        mf["ts"] = entry["ts"]
        if mf["pretty_type"] == "Post":
            if use_local:
                # use the preview text bundled with the export
                to_return += mf["preview"]
            else:
                # pull the full post body from Slack
                url = mf["url_private"]
                try:
                    result = requests.get(url, headers={'Authorization': 'Bearer %s' % self.slack_token})
                    post_dict = json.loads(result.text)  # renamed so it doesn't shadow the dict builtin
                    root = post_dict["root"]
                    to_return = ""  # rebuild the text from the post's children
                    ch_list = root["children"]
                    for ch in ch_list:
                        if ch["type"] == 'p':  # keep only paragraph elements
                            to_return += "{} ".format(ch["text"])
                    print("handle_file_entries(): text = {}".format(to_return))
                except requests.exceptions.RequestException as err:
                    print("Got a RequestException: {}".format(err))
                    to_return += mf["preview"]  # fall back to the preview text
  • Whoops, tymora3 doesn’t have the “Near the port city of Waterdeep in Faerun” phrase. Switching to “young man in a well crafted tunic”
  • Add explicit file opening and saving – done
  • Add “Embedding Dimensions” field – done
  • Example xml file:
    <config>
       <name>test_2019.03.19_11.24</name>
       <buckets>10</buckets>
       <similarity>0.01</similarity>
       <dimensions>3</dimensions>
       <dbs>
          <db>ab_slack</db>
          <db>phpbb</db>
       </dbs>
       <channels>
          <channel>Group 1</channel>
          <channel>tymora1</channel>
          <channel>tymora2</channel>
          <channel>tymora3</channel>
          <channel>peanutgallery</channel>
       </channels>
       <places>
          <place>Orc</place>
          <place>Goblin</place>
          <place>stairs</place>
          <place>orb</place>
          <place>statues</place>
          <place>troll</place>
          <place>Grogg</place>
       </places>
       <spaces>
          <space>fight</space>
          <space>diplomacy</space>
          <space>sing</space>
          <space>sleep</space>
       </spaces>
       <splits>
          <split>young man in a well crafted tunic</split>
          <split>brightly glowing blue orb</split>
          <split>large scaled troll sleeping</split>
          <split>the sea of gold coins and gems filling it</split>
          <split>Two women lounge on chairs across from each other</split>
       </splits>
    </config>
  • Reading and writing data are done. Now to start slicing and displaying data interactively
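
A minimal sketch of reading that config back in with xml.etree.ElementTree; the config.xml file name is an assumption:

    import xml.etree.ElementTree as ET

    def load_config(filename="config.xml"):
        """Read a config file like the example above into a plain dict."""
        root = ET.parse(filename).getroot()
        return {
            "name": root.findtext("name"),
            "buckets": int(root.findtext("buckets")),
            "similarity": float(root.findtext("similarity")),
            "dimensions": int(root.findtext("dimensions")),
            "dbs": [e.text for e in root.findall("dbs/db")],
            "channels": [e.text for e in root.findall("channels/channel")],
            "places": [e.text for e in root.findall("places/place")],
            "spaces": [e.text for e in root.findall("spaces/space")],
            "splits": [e.text for e in root.findall("splits/split")],
        }

    if __name__ == "__main__":
        print(load_config())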

(screenshot: CSAGUI)

  • Data slicing
    • Build embedding for all docs
    • Starting to add bucketing code. The first step is to get ignore text from individual users; the second task is to have single, room-aligned buckets, just so we can see what that looks like (see the bucketing sketch after this list)
  • JuryRoom meeting at 6:30
    • Added my req’s
      • Groups can stay together
      • Threaded questions in sequential or random order
      • Voting for post, rather than yes/no questions
    • Tony raised a bunch of points about how the conversation could be gamed. My response was that we should build something simple and then try to game it to see what affordances we need
    • Extended discussion on display real estate – how do we show the “starred” posts
      • Tony mentioned the idea that starred posts could fade if they languish without additional stars
    • Panos mentioned the idea of a countdown clock to pressure a vote
    • We walked through the implementation issues. The estimated framework is 3-tier, with a relational DB, Node on the server, and a web client running a framework like React. The goal is to have a “Hello World” application, which we can log into and create accounts on, running by next week
    • I pointed back to the original JuryRoom document’s strawman representation of the schema.
  • Synonyms for fringe
    • Synonyms: Noun
      • border, borderline, bound, boundary, brim, circumference, compass, confines, edge, edging, end, frame, hem, margin, perimeter, periphery, rim, skirt, skirting, verge
    • Synonyms: Verb
      • abut, adjoin, border (on), butt (on or against), flank, join, march (with), neighbor, skirt, touch, verge (on)
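
Picking up the bucketing item under “Data slicing” above, a minimal sketch of single, room-aligned buckets, assuming posts are already (timestamp, user, text) tuples with numeric timestamps and the room boundaries are known:

    def bucket_posts(posts, room_start, room_end, num_buckets=10):
        """Split the posts that fall inside a room into num_buckets equal time slices."""
        width = (room_end - room_start) / num_buckets
        buckets = [[] for _ in range(num_buckets)]
        for ts, user, text in posts:
            if room_start < ts <= room_end:   # half-open interval (> start, <= end)
                index = min(int((ts - room_start) / width), num_buckets - 1)
                buckets[index].append((ts, user, text))
        return buckets

    # toy example
    posts = [(1.0, "phil", "we enter the cave"),
             (4.5, "Lorelai", "I draw my sword"),
             (9.0, "Poe", "the troll wakes up")]
    for i, b in enumerate(bucket_posts(posts, room_start=0.0, room_end=10.0, num_buckets=5)):
        print(i, [text for _, _, text in b])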

Phil 3.18.19

ASRC PhD 7:00 – 6:00

  • SlackToDb
    • Pull down text – Done, I hope. The network here has bad problems with TLS resolution. Will try from home
    • Link sequential posts – done
    • Add word lists for places and spaces (read from file, also read embeddings)
      • Writing out the config file – done
    • Add field for similarity distance threshold. Changing this lists nearby words in the embedding space. These terms are used for trajectory generation and centrality tables.
    • Add plots for place/space words
    • Add phrase-based splitting to find rooms. Buckets work within these splits. Text before the first split and after the last split isn’t used (for embedding, centrality, etc.). See the sketch at the end of this list
    • Add phrase-based trimming. Text before one phrase and after the other isn’t used
    • Stub out centrality for (embedded) terms and (concatenated, bucketed, and oriented) documents
    • Look into 3d for tkinter (from scratch)
  • Progress for the day: (screenshot: SAGUI)
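
A minimal sketch of the phrase-based splitting and trimming from the list above, assuming the posts are plain strings and the split phrases come from the <splits> list in the config:

    def split_into_rooms(posts, split_phrases):
        """Group posts into 'rooms' bounded by the split phrases.

        Text before the first split phrase and after the last one is dropped,
        matching the trimming rule above.
        """
        rooms = []
        current = None
        for post in posts:
            if any(phrase in post for phrase in split_phrases):
                if current is not None:
                    rooms.append(current)   # close out the previous room
                current = []                # a split phrase starts a new room
            elif current is not None:       # posts before the first split are skipped
                current.append(post)
        # text accumulated after the last split phrase is deliberately dropped
        return rooms

    posts = ["setup chatter",
             "young man in a well crafted tunic",
             "we talk to him", "he points to the stairs",
             "brightly glowing blue orb",
             "we study the orb",
             "large scaled troll sleeping",
             "closing chatter"]
    print(split_into_rooms(posts, ["well crafted tunic", "blue orb", "troll sleeping"]))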

Phil 3.17.19

Got a really good idea about doing a hybrid coding model using embeddings. We start with a list of “place terms” and a list of “space terms”. We then use the embedded representation (vector) of those terms to find the adjacent terms. This is a sort of automated “snowball sampling”, where terms can lead to other terms. Once we have these terms, we use them as queries into the database to find the campaign and the timestamp for each. We use these to create the trajectories and maps.

This is pretty straightforward code and a set of queries to write, and I have high confidence that it will work and provide a novel, understandable way of producing a nice ‘mixed methods’ process that is also grounded completely in the corpora.
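
A minimal sketch of that snowball step, using plain cosine similarity over a toy embedding; the vectors, the threshold, and the table/column names in the query are placeholders rather than the real model or schema:

    import numpy as np

    # hypothetical embedding: term -> vector (in practice this comes from the trained model)
    embeddings = {
        "orc":    np.array([0.9, 0.1, 0.0]),
        "goblin": np.array([0.8, 0.2, 0.1]),
        "troll":  np.array([0.7, 0.3, 0.1]),
        "sing":   np.array([0.0, 0.1, 0.9]),
    }

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def snowball(seed_terms, threshold=0.9, rounds=2):
        """Grow the seed list by pulling in terms whose similarity to a found term clears the threshold."""
        found = set(seed_terms)
        for _ in range(rounds):
            new_terms = {t for t in embeddings for s in found
                         if t not in found and cosine(embeddings[t], embeddings[s]) >= threshold}
            if not new_terms:
                break
            found |= new_terms
        return found

    terms = snowball(["orc"])
    print(terms)  # e.g. {'orc', 'goblin', 'troll'}

    # the expanded terms then become queries for campaign and timestamp
    # (table and column names are placeholders)
    query = "SELECT campaign, ts FROM posts WHERE text LIKE %s"
    # e.g. cursor.execute(query, ('%' + term + '%',)) for each term in terms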

Phil 3.15.19

7:00 – ASRC

  • Downloaded the JuryRoom spec from Waikato and sent my sympathies for Christchurch
  • Worked on getting cosine distance working – Done. Also created spreadsheets of the distances between posts and listed the posts on a tab in the spreadsheet. I strip out the words that aren’t used to make the vectors, so the posts look a little funny, but the gist is there.
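
A minimal sketch of the distance spreadsheet, using scikit-learn for the vectors and openpyxl for the workbook; the posts and the output file name are stand-ins:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_distances
    from openpyxl import Workbook

    # stand-in posts; in practice these come from the database
    posts = ["we attack the orc with swords",
             "the orc swings back at us",
             "we sing a song of diplomacy to the goblin king"]

    # TF-IDF vectors, then an NxN matrix of cosine distances between posts
    vectors = TfidfVectorizer(stop_words="english").fit_transform(posts)
    distances = cosine_distances(vectors)

    wb = Workbook()
    dist_sheet = wb.active
    dist_sheet.title = "distances"
    for row in distances.tolist():
        dist_sheet.append([round(d, 3) for d in row])

    # second tab listing the posts themselves
    post_sheet = wb.create_sheet("posts")
    for i, post in enumerate(posts):
        post_sheet.append([i, post])

    wb.save("post_distances.xlsx")   # placeholder file name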

Phil 3.14.19

ASRC AIMS 7:00 – 4:00, PhD ML, 4:30 –

Phil 3.12.19

7:00 – 4:00 ASRC PhD

TFK


Phil 3.11.19

7:00 – 10:00 ASRC PhD. Fun, long day.

Phil 3.3.19

Once more, icky weather makes me productive

  • Ingested all the runs into the db. We are at 7,246 posts
  • Reworking the 5 bucket analysis
  • Building better ignore files and rebuilding bucket spreadsheets. It turns out that for tymora1, names took up 25% of the BOW, so I increased the fraction saved to the trimmed spreadsheets to 50% (see the sketch after this list)
  • Building bucket spreadsheets and saving the centrality vector
  • Here’s what I’ve got so far: ThreeRuns
  • Trajectories: Trajectories
  • First map: firstMap
  • Here it is annotated: firstMapAnnotated
  • Some thoughts. I think this is still “zoomed out” too far. Changing the granularity should help some. I need to automate some of my tools though. The other issue is how I’m assembling my sequences.
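
A minimal sketch of the ignore-and-trim step referenced above, assuming the ignore words (e.g. character names) come from the ignore file and that “fraction saved” means the top fraction of the remaining vocabulary:

    from collections import Counter

    def trimmed_bow(posts, ignore_words, keep_fraction=0.5):
        """Build a bag of words, drop ignored terms, and keep the top fraction by count."""
        counts = Counter()
        for post in posts:
            for word in post.lower().split():
                if word not in ignore_words:
                    counts[word] += 1
        keep = max(1, int(len(counts) * keep_fraction))
        return counts.most_common(keep)

    posts = ["Grogg attacks the troll",
             "the troll sleeps near the stairs",
             "Grogg climbs the stairs"]
    print(trimmed_bow(posts, ignore_words={"grogg", "the"}))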

Phil 3.2.19

Updating SheetToMap to take comma-separated cell names. Lines 180 – 193. I think I’ll need an iterating compare function. Nope, wound up doing something simpler:

for (String colName : colNames) {
    String curCells = tm.get(colName);
    String[] cellArray = curCells.split("\\|\\|"); // <-- new! split the "||"-chained cell names
    for (String curCell : cellArray) {
        addNode(curCell, rowName);
        if (prevCell != null && !curCell.equals(prevCell)) {
            String edgeName = curCell + "+" + prevCell;
            if (graph.getEdge(edgeName) == null) {
                try {
                    graph.addEdge(edgeName, curCell, prevCell);
                    System.out.println("adding edge [" + edgeName + "]");
                } catch (EdgeRejectedException e) {
                    System.out.println("didn't add edge [" + edgeName + "]");
                }
            }
        }
        prevCell = curCell;
    }

    //System.out.print(curCell + ", ");
    prevCell = cellArray[0]; // link the next column back to this cell's root name
    col++;
}

Updating GPM to generate comma-separated cell names in trajectories

  • Need to get the previous n cell names
  • Need to change the cellName val in FlockingBeliefCA to be a stack of tail length. Done.
  • Parsed the strings in SheetToMap. Each cell has a root name (the first in the chain), which connects to the root of the previous cell. The root then links to the subsequent names in its chain, which are separated by “||”:
    "cell_[4, 5]||cell_[4, 4]||cell_[4, 3]||cell_[4, 2]||cell_[4, 1]"
  • Seems to be working: tailtest

Phil 3.1.19

7:00 – ASRC

  • Got accepted to the TF dev conference. The flight out is expensive… Sent Eric V. a note asking for permission to go, but bought tix anyway given the short fuse
  • Downloaded the full slack data
  • Working on white paper. The single file was getting unwieldy, so I broke it up
  • Found Speeding up Parliamentary Decision Making for Cyber Counter-Attack, which argues for the possibility of pre-authorizing automated response
  • Up to six pages. In the middle of the cyberdefense section

Phil 2.28.19

7:00 – very, very late ASRC

  • Tomorrow is March! I need to write a few paragraphs for Antonio this weekend
  • YouTube stops recommending alt-right channels
    • For the first two weeks of February, YouTube was recommending videos from at least one of these major alt-right channels on more than one in every thirteen randomly selected videos (7.8%). From February 15th, this number has dropped to less than one in two hundred and fifty (0.4%).
  • Working on text splitting Group1 in the PHPBB database
    • Updated the view so the same queries work
    • Discovered that you can do this: …, “message” as type, …. That gives you a column named type filled with “message”. Via Stack Overflow (see the sketch after this list)
    • Mostly working, though I’m missing the last bucket for some reason. But it’s a good overlap with the Slack data.
    • Was debugging on my office box and was wondering where all the data after the troll was! Oops, not loaded
    • Changed the time tests to be > ts1 and <= ts2
  • Working on the white paper. Deep into strategy, Cyberdefense, and the evolution towards automatic active response in cyber.
  • Looooooooooooooooooooooooooong meeting of Shimei’s group. Interesting but difficult paper: Learning Dynamic Embeddings from Temporal Interaction Networks
  • Emily’s run in the dungeon finishes tonight!
  • Looks like I’m going to the TF Dev conference after all….
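
A minimal sketch of the constant-column trick and the half-open time test mentioned above; it uses sqlite3 just to show the query shape, and the table and column names are placeholders rather than the real phpBB schema:

    import sqlite3  # stand-in for the real database connection

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (ts REAL, author TEXT, body TEXT)")
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)",
                     [(1.0, "phil", "we enter"),
                      (2.5, "Linda", "I check for traps"),
                      (4.0, "Poe", "onward")])

    # every row of this query carries the constant column type = 'message',
    # and the time test is half-open (> ts1, <= ts2)
    query = """
        SELECT ts, author, body, 'message' AS type
        FROM posts
        WHERE ts > ? AND ts <= ?
        ORDER BY ts
    """
    for row in conn.execute(query, (1.0, 4.0)):
        print(row)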

Phil 2.27.19

7:00 – 5:30 ASRC

  • Getting closer to the goal by being less capable
    • Understanding how systems with many semi-autonomous parts reach a desired target is a key question in biology (e.g., Drosophila larvae seeking food), engineering (e.g., driverless navigation), medicine (e.g., reliable movement for brain-damaged individuals), and socioeconomics (e.g., bottom-up goal-driven human organizations). Centralized systems perform better with better components. Here, we show, by contrast, that a decentralized entity is more efficient at reaching a target when its components are less capable. Our findings reproduce experimental results for a living organism, predict that autonomous vehicles may perform better with simpler components, offer a fresh explanation for why biological evolution jumped from decentralized to centralized design, suggest how efficient movement might be achieved despite damaged centralized function, and provide a formula predicting the optimum capability of a system’s components so that it comes as close as possible to its target or goal.
  • Nice chat with Greg last night. He likes the “Bones in a Hut” and “Stampede Theory” phrases. It turns out the domains are available…
    • Thinking that the title of the book could be “Stampede Theory: Why Group Think Happens, and why Diversity is the First, Best Answer”. Maybe structure the iConference talk around that as well.
  • Guidance from Antonio: In the meantime, if you have an idea on how to structure the Introduction, please go on considering that we want to put the decision logic inside each Autonomous Car that will be able to select passengers and help them in a self-organized manner.
  • Try out the splitter on the Tymora1 text.
    • Incorporate the ignore.xml when reading the text
    • If things look promising, then add changes to the phpbb code and try on that text as well.
    • At this point I’m just looking at overlapping lists of words that become something like a sand chart. I wonder if I can use the eigenvector centrality values as a percentage connectivity/weight (see the sketch after this list)? Weights
    • Ok – I have to say that I’m pretty happy with this. These are centrality values using the top 25% BOW from the Slack text of Tymora1. I think that the way to use this is to have each group be an “agent” that has a cluster of words for each step: Top 10
    • Based on this, I’d say add an “Evolving Networks of words” section to the dissertation. Have to find that WordRank paper
  • Working on white paper. Lit review today, plus fix anything that I might have broken…
    • Added section on cybersecurity that got lost in the update fiasco
    • Aaron found a good paper on the lack of advantage that the US has in AI, particularly wrt China
  • Avoiding working on white paper by writing a generator for Aaron. Done!
  • Cortex is an open-source platform for building, deploying, and managing machine learning applications in production. It is designed for any developer who wants to build machine learning powered services without having to worry about infrastructure challenges like configuring data pipelines, continuous deployment, and dependency management. Cortex is actively maintained by Cortex Labs. We’re a venture-backed team of infrastructure engineers and we’re hiring.
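
On the eigenvector-values-as-weights question above, a minimal sketch using networkx eigenvector centrality normalized to percentages; the word co-occurrence graph here is a toy stand-in:

    import networkx as nx

    # toy co-occurrence graph; in practice the nodes would be the top-25% BOW terms
    G = nx.Graph()
    G.add_edges_from([("orc", "fight"), ("orc", "stairs"), ("fight", "troll"),
                      ("troll", "sleep"), ("stairs", "orb")])

    centrality = nx.eigenvector_centrality(G, max_iter=500)

    # normalize so the values sum to 100 and read as percentage weights
    total = sum(centrality.values())
    weights = {word: 100.0 * c / total for word, c in centrality.items()}
    for word, w in sorted(weights.items(), key=lambda kv: -kv[1]):
        print("{}: {:.1f}%".format(word, w))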

Phil 2.26.19

7:00 – 3:00 ASRC

    • Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel. It’s free and open source.
    • More white paper. Add Flynn’s thoughts about cyber security – see notes from yesterday
    • Reconnected with Antonio. He’d like me to write the introduction and motivation for his SASO paper
    • Add time bucketing to postanalyzer. I’m really starting to want to add a UI
      • Looks done. Try it out next time
        Running query for Poe in subject peanutgallery between 23:56 and 00:45
        Running query for Dungeon Master in subject peanutgallery between 23:56 and 00:45
        Running query for Lord Javelin in subject peanutgallery between 23:56 and 00:45
        Running query for memoriesmaze in subject peanutgallery between 23:56 and 00:45
        Running query for Linda in subject peanutgallery between 23:56 and 00:45
        Running query for phil in subject peanutgallery between 23:56 and 00:45
        Running query for Lorelai in subject peanutgallery between 23:56 and 00:45
        Running query for Bren'Dralagon in subject peanutgallery between 23:56 and 00:45
        Running query for Shelton Herrington in subject peanutgallery between 23:56 and 00:45
        Running query for Keiri'to in subject peanutgallery between 23:56 and 00:45
    • More white paper. Got through the introduction and background. Hopefully I didn’t lose anything when I had to resynchronize with the repository that I hadn’t updated from