Category Archives: Python

Phil 3.20.19

ASRC PhD 7:00 – 2:00, NASA 2:00 – 4:00

  • Add Text area for “selected”, with a combobox for “Place”, “Space”, and “Ignore”. Ignore words get split(), added to a sorted list view, and saved in the config.xml
  • Add text area for program output as well as console. Text lists for users can be shown here. Selected terms from here can be selected and added to the ignore list
  • Draw the following as colored points on the embeddings
    • Top Room (bow/tf-idf) terms
    • Top Group (bow/tf-idf)
    • Place terms and Space terms (differentiated by room/group?)

Phil 3.19.19

7:00 – 5:00, 6:30 – 7:15 ASRC PhD

  • Algorithmic Rationality: Game Theory with Costly Computation
    • We develop a general game-theoretic framework for reasoning about strategic agents performing possibly costly computation. In this framework, many traditional game-theoretic results (such as the existence of a Nash equilibrium) no longer hold. Nevertheless, we can use the framework to provide psychologically appealing explanations of observed behavior in well-studied games (such as finitely repeated prisoner’s dilemma and rock-paper-scissors). Furthermore, we provide natural conditions on games sufficient to guarantee that equilibria exist.
  • NVIDIA has an IoT NN chipset as well
  • Recreate DB with file pull. Success! Here’s the code that works. It takes a downloaded Slack chat session and iterates over that. When it finds a message with a “files” subobject, it does the following. Shout out to the wonderful requests library:
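    # Fragment of a class method; needs `import json` and `import requests` at module level
    for mf in entry["files"]:
        # Copy the parent message's author and timestamp onto the file record
        mf["user"] = entry["user"]
        mf["ts"] = entry["ts"]
        if mf["pretty_type"] == "Post":
            if use_local:
                # Use the preview text already present in the downloaded export
                to_return += mf["preview"]
            else:
                # Otherwise fetch the full post from Slack using the private URL
                url = mf["url_private"]
                try:
                    result = requests.get(url, headers={'Authorization': 'Bearer %s' % self.slack_token})
                    post = json.loads(result.text)
                    root = post["root"]
                    to_return = ""
                    ch_list = root["children"]
                    # Concatenate the text of every paragraph ('p') child
                    for ch in ch_list:
                        if ch["type"] == 'p':
                            to_return += "{} ".format(ch["text"])
                    print("handle_file_entries(): text = {}".format(to_return))
                except requests.exceptions.RequestException as err:
                    # On a network failure, fall back to the preview
                    print("Got a RequestException: {}".format(err))
                    to_return += mf["preview"]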
  • Whoops, tymora3 doesn’t have the “Near the port city of Waterdeep in Faerun” phrase. Switching to “young man in a well crafted tunic”
  • Add explicit file opening and saving – done
  • Add “Embedding Dimensions” field – done
  • Example xml file:
          <channel>Group 1</channel>
          <split>young man in a well crafted tunic</split>
          <split>brightly glowing blue orb</split>
          <split>large scaled troll sleeping</split>
          <split>the sea of gold coins and gems filling it</split>
          <split>Two women lounge on chairs across from each other</split>
  • Reading and writing data are done. Now to start slicing and displaying data interactively


  • Data slicing
    • Build embedding for all docs
    • Starting to add bucketing code. First step is to get ignore text from individual users, second task is to have single, room aligned buckets, just so we can see what that looks like
  • JuryRoom meeting at 6:30
    • Added my req’s
      • Groups can stay together
      • Threaded questions in sequential or random order
      • Voting for post, rather than yes/no questions
    • Tony raised a bunch of points about how the conversation could be gamed. My response was that we should build something simple and then try to game it to see what affordances we need
    • Extended discussion on display real estate – how do we show the “starred” posts
      • Tony mentioned the idea that starred posts could fade if they languish without additional stars
    • Panos mentioned the idea of a countdown clock to pressure a vote
    • We walked through the implementation issues. The estimated framework is 3-tier, with a relational DB, Node on the server, and a web client running a framework like React. The goal is to have a “Hello World” application, which we can log into and create accounts in, running by next week
    • I pointed back to the original JuryRoom document’s strawman representation of the schema.
  • Synonyms for fringe
    • Synonyms: Noun
      • border, borderline, bound, boundary, brim, circumference, compass, confines, edge, edging, end, frame, hem, margin, perimeter, periphery, rim, skirt, skirting, verge
    • Synonyms: Verb
      • abut, adjoin, border (on), butt (on or against), flank, join, march (with), neighbor, skirt, touch, verge (on)

Phil 3.18.19

ASRC PhD 7:00 – 6:00

  • SlackToDb
    • Pull down text – Done, I hope. The network here has bad problems with TLS resolution. Will try from home
    • Link sequential posts – done
    • Add word lists for places and spaces (read from file, also read embeddings)
      • Writing out the config file – done
    • Add field for similarity distance threshold. Changing this lists nearby words in the embedding space. These terms are used for trajectory generation and centrality tables.
    • Add plots for place/space words
    • Add phrase-based splitting to find rooms. Buckets work within these splits. Text before the first split and after the last split isn’t used (For embedding, centrality, etc.)
    • Add phrase-based trimming. Text before one and after the other isn’t used
    • Stub out centrality for (embedded) terms and (concatenated, bucketed, and oriented) documents
    • Look into 3d for tkinter (from scratch)
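A minimal sketch of the phrase-based splitting idea from the list above, assuming the whole conversation is available as one string. The function name `split_into_rooms` and the sample text are hypothetical, not taken from SlackToDb. Text before the first split phrase and after the last one is dropped, as described.

```python
def split_into_rooms(text, split_phrases):
    """Return (phrase, segment) pairs for the text between consecutive split phrases.

    Text before the first phrase and after the last phrase is discarded.
    Phrases that do not occur in the text are skipped.
    """
    # Locate each phrase and order the ones that were found by position
    positions = sorted(
        (text.find(p), p) for p in split_phrases if text.find(p) >= 0
    )
    rooms = []
    # Each room runs from one split phrase up to the start of the next
    for (start, phrase), (end, _next_phrase) in zip(positions, positions[1:]):
        rooms.append((phrase, text[start:end].strip()))
    return rooms


# Hypothetical sample text built around split phrases like those in the config file
sample_text = (
    "Preamble chatter. A young man in a well crafted tunic greets you. "
    "You see a brightly glowing blue orb on the table. "
    "There is a large scaled troll sleeping here. Goodbye."
)
sample_phrases = [
    "young man in a well crafted tunic",
    "brightly glowing blue orb",
    "large scaled troll sleeping",
]
rooms = split_into_rooms(sample_text, sample_phrases)
for phrase, segment in rooms:
    print("{!r}: {}".format(phrase, segment))
```

With three phrases this yields two rooms, since the last phrase only marks where the final room ends.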


    • Progress for the day:


Phil 3.17.19

Got a really good idea about doing a hybrid coding model using embeddings. We start with a list of “place terms” and a list of “space terms”. We then use the embedded representation (vector) of those terms to find the adjacent terms. This is a sort of automated “snowball sampling”, where terms can lead to other terms. Once we have these terms, we use them as queries into the database to find the campaign and the timestamp for each. We use these to create the trajectories and maps.

This is pretty straightforward code and a set of queries to write, and I have high confidence that it will work and provide a novel, understandable ‘mixed method’ process that is also grounded completely in the corpora.
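Here is a rough sketch of the “snowball” step, assuming the embeddings are available as a dict of term to NumPy vector. The function names, the similarity threshold, and the toy 2D vectors are all made up for illustration; a real run would use the trained embedding and a tuned threshold.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def snowball_terms(seeds, embeddings, threshold=0.95, max_rounds=3):
    """Grow a seed list by repeatedly adding terms close to the newest additions."""
    found = {s for s in seeds if s in embeddings}
    frontier = set(found)
    for _ in range(max_rounds):
        new_terms = set()
        for term in frontier:
            for other, vec in embeddings.items():
                if other in found:
                    continue
                if cosine_similarity(embeddings[term], vec) >= threshold:
                    new_terms.add(other)
        if not new_terms:
            break  # nothing adjacent was left to add
        found |= new_terms
        frontier = new_terms  # next round only expands from the new terms
    return found


# Toy 2D embedding (hypothetical values, for illustration only)
toy_embeddings = {
    "room": np.array([1.0, 0.0]),
    "hall": np.array([0.95, 0.1]),
    "corridor": np.array([0.85, 0.3]),
    "argue": np.array([0.0, 1.0]),
    "debate": np.array([0.1, 0.95]),
}
place_terms = snowball_terms(["room"], toy_embeddings)
space_terms = snowball_terms(["argue"], toy_embeddings)
print(sorted(place_terms), sorted(space_terms))
```

In the toy data “corridor” is not within the threshold of “room”, but it is within the threshold of “hall”, so it only gets picked up in the second round. That is the term-leads-to-term behavior described above. The resulting term lists would then become the database queries.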

Phil 3.15.19

7:00 – ASRC

  • Downloaded the JuryRoom spec from Waikato and sent my sympathies for Christchurch
  • Worked on getting cosine distance working – Done. Also created spreadsheets of the distances between posts and listed the posts on a tab in the spreadsheet. I strip out the words that aren’t used to make the vectors so the posts look a little funny, but the gist is there:
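For reference, a small sketch of cosine distance between two posts using bag-of-words counts. This is not the code that produced the spreadsheets; the vocabulary, the sample posts, and the function names are assumptions for illustration. Words outside the vocabulary are dropped before the vectors are built, which mirrors the stripping mentioned above.

```python
import math
from collections import Counter


def bow(text, vocabulary):
    """Count only the words that are part of the vocabulary."""
    return Counter(w for w in text.lower().split() if w in vocabulary)


def cosine_distance(c1, c2):
    """1 - cosine similarity of two word-count vectors (1.0 if either is empty)."""
    dot = sum(c1[w] * c2[w] for w in c1)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    if n1 == 0 or n2 == 0:
        return 1.0
    return 1.0 - dot / (n1 * n2)


# Hypothetical vocabulary and posts
vocabulary = {"troll", "orb", "gold", "door"}
post_a = bow("The troll guards the orb", vocabulary)
post_b = bow("A troll sleeps near the orb", vocabulary)
post_c = bow("We open the door", vocabulary)

dist_ab = cosine_distance(post_a, post_b)  # same vocabulary words, so close to 0
dist_ac = cosine_distance(post_a, post_c)  # no shared vocabulary words
print(dist_ab, dist_ac)
```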

Phil 3.14.19

ASRC AIMS 7:00 – 4:00, PhD ML, 4:30 –

Phil 3.12.19

7:00 – 4:00 ASRC PhD



Phil 3.11.19

7:00 – 10:00 ASRC PhD. Fun, long day.

Phil 3.3.19

Once more, icky weather makes me productive

  • Ingested all the runs into the db. We are at 7,246 posts
  • Reworking the 5 bucket analysis
  • Building better ignore files and rebuilding bucket spreadsheets. It turns out that for tymora1, names took up 25% of the BOW, so I increased the fraction saved to the trimmed spreadsheets to 50%
  • Building bucket spreadsheets and saving the centrality vector
  • Here’s what I’ve got so far: ThreeRuns
  • Trajectories: Trajectories
  • First map: firstMap
  • Here it is annotated: firstMapAnnotated
  • Some thoughts. I think this is still “zoomed out” too far. Changing the granularity should help some. I need to automate some of my tools though. The other issue is how I’m assembling my sequences.
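A sketch of the ignore-list and trimming step mentioned above, under one assumption that should be stated loudly: here “fraction saved” is read as the fraction of distinct terms kept after ranking by count. The actual tool may define it differently. The function name and sample posts are hypothetical.

```python
from collections import Counter


def trimmed_bow(posts, ignore_words, fraction=0.5):
    """Build a bag of words without the ignored terms and keep the top fraction."""
    ignore = {w.lower() for w in ignore_words}
    counts = Counter(
        w for post in posts for w in post.lower().split() if w not in ignore
    )
    if not counts:
        return []
    # ASSUMPTION: the fraction applies to the number of distinct terms
    keep = max(1, int(len(counts) * fraction))
    return counts.most_common(keep)


# Hypothetical posts; character names go into the ignore list
sample_posts = ["Poe opens the door", "Linda opens the chest", "the door creaks"]
top_terms = trimmed_bow(sample_posts, ignore_words=["Poe", "Linda", "the"], fraction=0.5)
print(top_terms)
```

Names dominate a role-playing transcript, so removing them first keeps the trimmed list from being used up by who is speaking rather than what is being discussed.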

Phil 3.1.19

7:00 – ASRC

  • Got accepted to the TF dev conference. The flight out is expensive… Sent Eric V. a note asking for permission to go, but bought tix anyway given the short fuse
  • Downloaded the full slack data
  • Working on white paper. The single file was getting unwieldy, so I broke it up
  • Found Speeding up Parliamentary Decision Making for Cyber Counter-Attack, which argues for the possibility of pre-authorizing automated response
  • Up to six pages. In the middle of the cyberdefense section

Phil 2.28.19

7:00 – very, very, late ASRC

  • Tomorrow is March! I need to write a few paragraphs for Antonio this weekend
  • YouTube stops recommending alt-right channels
    • For the first two weeks of February, YouTube was recommending videos from at least one of these major alt-right channels on more than one in every thirteen randomly selected videos (7.8%). From February 15th, this number has dropped to less than one in two hundred and fifty (0.4%).
  • Working on text splitting Group1 in the PHPBB database
    • Updated the view so the same queries work
    • Discovered that you can do this: …, “message” as type, …. That gives you a column of type filled with “message”. Via stackoverflow
    • Mostly working, I’m missing the last bucket for some reason. But it’s good overlap with the Slack data.
    • Was debugging on my office box, and was wondering where all the data after the troll was! Ooops, not loaded
    • Changed the time tests to be > ts1 and <= ts2
  • Working on the white paper. Deep into strategy, Cyberdefense, and the evolution towards automatic active response in cyber.
  • Looooooooooooooooooooooooooong meeting of Shimei’s group. Interesting but difficult paper: Learning Dynamic Embeddings from Temporal Interaction Networks
  • Emily’s run in the dungeon finishes tonight!
  • Looks like I’m going to the TF Dev conference after all….
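The two SQL notes from the text-splitting work above (a literal string selected as a column, and the `> ts1` and `<= ts2` time test) can be sketched together. This uses an in-memory SQLite table as a stand-in for the PHPBB database; the table, its columns, and the function name are hypothetical.

```python
import sqlite3

# Hypothetical stand-in table; the real data lives in the PHPBB database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (author TEXT, ts REAL, body TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [("Poe", 10.0, "first"), ("Linda", 20.0, "second"), ("Poe", 30.0, "third")],
)


def posts_between(conn, ts1, ts2):
    """Rows with ts1 < ts <= ts2, plus a constant 'type' column."""
    return conn.execute(
        "SELECT author, 'message' AS type, ts, body FROM posts "
        "WHERE ts > ? AND ts <= ? ORDER BY ts",
        (ts1, ts2),
    ).fetchall()


first_bucket = posts_between(conn, 0.0, 20.0)
second_bucket = posts_between(conn, 20.0, 30.0)
print(first_bucket)
print(second_bucket)
```

Because the lower bound is exclusive and the upper bound is inclusive, a post that lands exactly on a boundary timestamp (20.0 here) shows up in one bucket only.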

Phil 2.27.19

7:00 – 5:30 ASRC

  • Getting closer to the goal by being less capable
    • Understanding how systems with many semi-autonomous parts reach a desired target is a key question in biology (e.g., Drosophila larvae seeking food), engineering (e.g., driverless navigation), medicine (e.g., reliable movement for brain-damaged individuals), and socioeconomics (e.g., bottom-up goal-driven human organizations). Centralized systems perform better with better components. Here, we show, by contrast, that a decentralized entity is more efficient at reaching a target when its components are less capable. Our findings reproduce experimental results for a living organism, predict that autonomous vehicles may perform better with simpler components, offer a fresh explanation for why biological evolution jumped from decentralized to centralized design, suggest how efficient movement might be achieved despite damaged centralized function, and provide a formula predicting the optimum capability of a system’s components so that it comes as close as possible to its target or goal.
  • Nice chat with Greg last night. He likes the “Bones in a Hut” and “Stampede Theory” phrases. It turns out the domains are available…
    • Thinking that the title of the book could be “Stampede Theory: Why Group Think Happens, and why Diversity is the First, Best Answer”. Maybe structure the iConference talk around that as well.
  • Guidance from Antonio: In the meantime, if you have an idea on how to structure the Introduction, please go on considering that we want to put the decision logic inside each Autonomous Car that will be able to select passengers and help them in a self-organized manner.
  • Try out the splitter on the Tymora1 text.
    • Incorporate the ignore.xml when reading the text
    • If things look promising, then add changes to the phpbb code and try on that text as well.
    • At this point I’m just looking at overlapping lists of words that become something like a sand chart. I wonder if I can use the Eigenvector values to become a percentage connectivity/weight? Weights
    • Ok – I have to say that I’m pretty happy with this. These are centrality values computed from the top 25% of the BOW from the Slack text of Tymora1. I think that the way to use this is to have each group be an “agent” that has a cluster of words for each step: Top 10
    • Based on this, I’d say add an “Evolving Networks of words” section to the dissertation. Have to find that WordRank paper
  • Working on white paper. Lit review today, plus fix anything that I might have broken…
    • Added section on cybersecurity that got lost in the update fiasco
    • Aaron found a good paper on the lack of advantage that the US has in AI, particularly wrt China
  • Avoiding working on white paper by writing a generator for Aaron. Done!
  • Cortex is an open-source platform for building, deploying, and managing machine learning applications in production. It is designed for any developer who wants to build machine learning powered services without having to worry about infrastructure challenges like configuring data pipelines, continuous deployment, and dependency management. Cortex is actively maintained by Cortex Labs. We’re a venture-backed team of infrastructure engineers and we’re hiring.

Phil 2.26.19

7:00 – 3:00 ASRC

    • Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel. It’s free and open source.
    • More white paper. Add Flynn’s thoughts about cyber security – see notes from yesterday
    • Reconnected with Antonio. He’d like me to write the introduction and motivation for his SASO paper
    • Add time bucketing to postanalyzer. I’m really starting to want to add a UI
      • Looks done. Try it out next time
        Running query for Poe in subject peanutgallery between 23:56 and 00:45
        Running query for Dungeon Master in subject peanutgallery between 23:56 and 00:45
        Running query for Lord Javelin in subject peanutgallery between 23:56 and 00:45
        Running query for memoriesmaze in subject peanutgallery between 23:56 and 00:45
        Running query for Linda in subject peanutgallery between 23:56 and 00:45
        Running query for phil in subject peanutgallery between 23:56 and 00:45
        Running query for Lorelai in subject peanutgallery between 23:56 and 00:45
        Running query for Bren'Dralagon in subject peanutgallery between 23:56 and 00:45
        Running query for Shelton Herrington in subject peanutgallery between 23:56 and 00:45
        Running query for Keiri'to in subject peanutgallery between 23:56 and 00:45
    • More white paper. Got through the introduction and background. Hopefully didn’t lose anything when I had to resynchronize with the repository that I hadn’t updated from


Phil 2.25.19

7:00 – 2:30 ASRC TL

2:30 – 4:30 PhD

  • Fix directory code of LMN so that it remembers the input and output directories – done
  • Add time bucketing capabilities. Do this by taking the complete conversation and splitting the results into N sublists. Take the beginning and ending time from each list and then use those to set the timestamp start and stop for each player’s posts.
  • Thinking about a time-series LMN tool that can chart the relative occurrence of the sorted terms over time. I think this could be done with tkinter. I would need to create an executable as described here, though the easiest answer seems to be pyinstaller.
  • Here are two papers that show the advantages of herding over nomadic behavior:
    • Phagotrophy by a flagellate selects for colonial prey: A possible origin of multicellularity
      • Predation was a powerful selective force promoting increased morphological complexity in a unicellular prey held in constant environmental conditions. The green alga, Chlorella vulgaris, is a well-studied eukaryote, which has retained its normal unicellular form in cultures in our laboratories for thousands of generations. For the experiments reported here, steady-state unicellular C. vulgaris continuous cultures were inoculated with the predator Ochromonas vallescia, a phagotrophic flagellated protist (‘flagellate’). Within less than 100 generations of the prey, a multicellular Chlorella growth form became dominant in the culture (subsequently repeated in other cultures). The prey Chlorella first formed globose clusters of tens to hundreds of cells. After about 10–20 generations in the presence of the phagotroph, eight-celled colonies predominated. These colonies retained the eight-celled form indefinitely in continuous culture and when plated onto agar. These self-replicating, stable colonies were virtually immune to predation by the flagellate, but small enough that each Chlorella cell was exposed directly to the nutrient medium.
    • De novo origins of multicellularity in response to predation
      • The transition from unicellular to multicellular life was one of a few major events in the history of life that created new opportunities for more complex biological systems to evolve. Predation is hypothesized as one selective pressure that may have driven the evolution of multicellularity. Here we show that de novo origins of simple multicellularity can evolve in response to predation. We subjected outcrossed populations of the unicellular green alga Chlamydomonas reinhardtii to selection by the filter-feeding predator Paramecium tetraurelia. Two of five experimental populations evolved multicellular structures not observed in unselected control populations within ~750 asexual generations. Considerable variation exists in the evolved multicellular life cycles, with both cell number and propagule size varying among isolates. Survival assays show that evolved multicellular traits provide effective protection against predation. These results support the hypothesis that selection imposed by predators may have played a role in some origins of multicellularity. SpontaniousClustering
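Going back to the time-bucketing item earlier in this entry (split the complete conversation into N sublists, then take the beginning and ending time of each), here is a minimal sketch. It assumes each post is a dict with a numeric `ts` field; the function name and the sample data are hypothetical.

```python
def time_buckets(posts, n):
    """Split posts (sorted by timestamp) into n sublists and return (start, stop) per sublist."""
    ordered = sorted(posts, key=lambda p: p["ts"])
    size, extra = divmod(len(ordered), n)
    buckets = []
    start = 0
    for i in range(n):
        # The first `extra` sublists get one additional post each
        end = start + size + (1 if i < extra else 0)
        chunk = ordered[start:end]
        if chunk:
            buckets.append((chunk[0]["ts"], chunk[-1]["ts"]))
        start = end
    return buckets


# Hypothetical conversation: ten posts with timestamps 1..10
conversation = [{"ts": t, "text": "post {}".format(t)} for t in range(1, 11)]
buckets = time_buckets(conversation, 3)
print(buckets)
```

Each (start, stop) pair can then be used as the timestamp range when querying an individual player's posts, so every player is sliced on the same time boundaries.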