Category Archives: Python

Phil 2.27.19

7:00 – 5:30 ASRC

  • Getting closer to the goal by being less capable
    • Understanding how systems with many semi-autonomous parts reach a desired target is a key question in biology (e.g., Drosophila larvae seeking food), engineering (e.g., driverless navigation), medicine (e.g., reliable movement for brain-damaged individuals), and socioeconomics (e.g., bottom-up goal-driven human organizations). Centralized systems perform better with better components. Here, we show, by contrast, that a decentralized entity is more efficient at reaching a target when its components are less capable. Our findings reproduce experimental results for a living organism, predict that autonomous vehicles may perform better with simpler components, offer a fresh explanation for why biological evolution jumped from decentralized to centralized design, suggest how efficient movement might be achieved despite damaged centralized function, and provide a formula predicting the optimum capability of a system’s components so that it comes as close as possible to its target or goal.
  • Nice chat with Greg last night. He likes the “Bones in a Hut” and “Stampede Theory” phrases. It turns out the domains are available…
    • Thinking that the title of the book could be “Stampede Theory: Why Group Think Happens, and why Diversity is the First, Best Answer”. Maybe structure the iConference talk around that as well.
  • Guidance from Antonio: In the meantime, if you have an idea on how to structure the Introduction, please go on considering that we want to put the decision logic inside each Autonomous Car that will be able to select passengers and help them in a self-organized manner.
  • Try out the splitter on the Tymora1 text.
    • Incorporate the ignore.xml when reading the text
    • If things look promising, then add changes to the phpbb code and try on that text as well.
    • At this point I’m just looking at overlapping lists of words that become something like a sand chart. I wonder if I can use the Eigenvector values to become a percentage connectivity/weight?
    • Ok – I have to say that I’m pretty happy with this. These are centrality using top 25% BOW from the Slack text of Tymora1. I think that the way to use this is to have each group be an “agent” that has a cluster of words for each step.
    • Based on this, I’d say add an “Evolving Networks of words” section to the dissertation. Have to find that WordRank paper
  • Working on white paper. Lit review today, plus fix anything that I might have broken…
    • Added section on cybersecurity that got lost in the update fiasco
    • Aaron found a good paper on the lack of advantage that the US has in AI, particularly wrt China
  • Avoiding working on white paper by writing a generator for Aaron. Done!
  • Cortex is an open-source platform for building, deploying, and managing machine learning applications in production. It is designed for any developer who wants to build machine learning powered services without having to worry about infrastructure challenges like configuring data pipelines, continuous deployment, and dependency management. Cortex is actively maintained by Cortex Labs. We’re a venture-backed team of infrastructure engineers and we’re hiring.
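
On the question above about turning Eigenvector values into a percentage connectivity/weight: a minimal sketch, assuming the word network is an undirected co-occurrence graph stored as an adjacency dict. The function name `centrality_percent` and the toy graph are hypothetical, and power iteration is just one standard way to approximate eigenvector centrality, not necessarily how the charts in this post were produced.

```python
def centrality_percent(adj, iterations=100):
    """Approximate eigenvector centrality by power iteration and
    rescale the scores so that they sum to 100."""
    nodes = sorted(adj)
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Each node keeps its own score and adds its neighbours' scores.
        # Iterating with (A + I) instead of A avoids oscillation on
        # bipartite graphs and leaves the leading eigenvector unchanged.
        nxt = {n: score[n] + sum(score[m] for m in adj[n]) for n in nodes}
        total = sum(nxt.values())
        score = {n: v / total for n, v in nxt.items()}
    return {n: 100.0 * v for n, v in score.items()}


# Toy co-occurrence graph (made-up words, symmetric edges)
graph = {
    "orc": {"attack", "stairs", "vines"},
    "attack": {"orc", "charge"},
    "stairs": {"orc"},
    "vines": {"orc"},
    "charge": {"attack"},
}
weights = centrality_percent(graph)
for word, pct in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{word:8s} {pct:5.1f}%")
```

Because the scores are normalized to sum to 100, they can be read as each word's share of the network's total connectivity, which is the sort of value that could feed a stacked chart.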

Phil 2.26.19

7:00 – 3:00 ASRC

    • Django is a high-level Python Web framework that encourages rapid development and clean, pragmatic design. Built by experienced developers, it takes care of much of the hassle of Web development, so you can focus on writing your app without needing to reinvent the wheel. It’s free and open source.
    • More white paper. Add Flynn’s thoughts about cyber security – see notes from yesterday
    • Reconnected with Antonio. He’d like me to write the introduction and motivation for his SASO paper
    • Add time bucketing to postanalyzer. I’m really starting to want to add a UI
      • Looks done. Try it out next time
        Running query for Poe in subject peanutgallery between 23:56 and 00:45
        Running query for Dungeon Master in subject peanutgallery between 23:56 and 00:45
        Running query for Lord Javelin in subject peanutgallery between 23:56 and 00:45
        Running query for memoriesmaze in subject peanutgallery between 23:56 and 00:45
        Running query for Linda in subject peanutgallery between 23:56 and 00:45
        Running query for phil in subject peanutgallery between 23:56 and 00:45
        Running query for Lorelai in subject peanutgallery between 23:56 and 00:45
        Running query for Bren'Dralagon in subject peanutgallery between 23:56 and 00:45
        Running query for Shelton Herrington in subject peanutgallery between 23:56 and 00:45
        Running query for Keiri'to in subject peanutgallery between 23:56 and 00:45
    • More white paper. Got through the introduction and background. Hopefully didn’t lose anything when I had to resynchronize with the repository that I hadn’t updated from

 

Phil 2.25.19

7:00 – 2:30 ASRC TL

2:30 – 4:30 PhD

  • Fix directory code of LMN so that it remembers the input and output directories – done
  • Add time bucketing capabilities. Do this by taking the complete conversation and splitting the results into N sublists. Take the beginning and ending time from each list and then use those to set the timestamp start and stop for each player’s posts.
  • Thinking about a time-series LMN tool that can chart the relative occurrence of the sorted terms over time. I think this could be done with tkinter. I would need to create an executable as described here, though the easiest answer seems to be pyinstaller.
  • Here are two papers that show the advantages of herding over nomadic behavior:
    • Phagotrophy by a flagellate selects for colonial prey: A possible origin of multicellularity
      • Predation was a powerful selective force promoting increased morphological complexity in a unicellular prey held in constant environmental conditions. The green alga, Chlorella vulgaris, is a well-studied eukaryote, which has retained its normal unicellular form in cultures in our laboratories for thousands of generations. For the experiments reported here, steady-state unicellular C. vulgaris continuous cultures were inoculated with the predator Ochromonas vallescia, a phagotrophic flagellated protist (‘flagellate’). Within less than 100 generations of the prey, a multicellular Chlorella growth form became dominant in the culture (subsequently repeated in other cultures). The prey Chlorella first formed globose clusters of tens to hundreds of cells. After about 10–20 generations in the presence of the phagotroph, eight-celled colonies predominated. These colonies retained the eight-celled form indefinitely in continuous culture and when plated onto agar. These self-replicating, stable colonies were virtually immune to predation by the flagellate, but small enough that each Chlorella cell was exposed directly to the nutrient medium.
    • De novo origins of multicellularity in response to predation
      • The transition from unicellular to multicellular life was one of a few major events in the history of life that created new opportunities for more complex biological systems to evolve. Predation is hypothesized as one selective pressure that may have driven the evolution of multicellularity. Here we show that de novo origins of simple multicellularity can evolve in response to predation. We subjected outcrossed populations of the unicellular green alga Chlamydomonas reinhardtii to selection by the filter-feeding predator Paramecium tetraurelia. Two of five experimental populations evolved multicellular structures not observed in unselected control populations within ~750 asexual generations. Considerable variation exists in the evolved multicellular life cycles, with both cell number and propagule size varying among isolates. Survival assays show that evolved multicellular traits provide effective protection against predation. These results support the hypothesis that selection imposed by predators may have played a role in some origins of multicellularity.
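
The time bucketing idea from earlier in this entry (split the complete conversation into N sublists, then use each sublist's first and last timestamps as the start and stop for each player's posts) could be sketched like this. It is a minimal sketch, assuming posts are `(timestamp, player, text)` tuples; the function names and the sample conversation are made up for illustration and are not the postanalyzer code.

```python
from datetime import datetime, timedelta


def time_buckets(posts, n):
    """Split time-ordered (timestamp, player, text) posts into n nearly
    equal sublists and return (start, stop, sublist) for each non-empty one."""
    posts = sorted(posts, key=lambda p: p[0])
    size, extra = divmod(len(posts), n)
    buckets, begin = [], 0
    for i in range(n):
        # The first `extra` buckets take one additional post each
        end = begin + size + (1 if i < extra else 0)
        chunk = posts[begin:end]
        if chunk:
            buckets.append((chunk[0][0], chunk[-1][0], chunk))
        begin = end
    return buckets


def posts_for_player(posts, player, start, stop):
    """Select one player's posts whose timestamp lies in [start, stop]."""
    return [p for p in posts if p[1] == player and start <= p[0] <= stop]


# Made-up conversation: one post per minute, alternating players
t0 = datetime(2019, 2, 25, 23, 56)
conversation = [
    (t0 + timedelta(minutes=i), "phil" if i % 2 else "Linda", f"post {i}")
    for i in range(10)
]
for start, stop, chunk in time_buckets(conversation, 3):
    mine = posts_for_player(conversation, "phil", start, stop)
    print(f"{start:%H:%M} to {stop:%H:%M}: {len(chunk)} posts, {len(mine)} by phil")
```

Bucketing by post count rather than by fixed clock intervals means every bucket holds roughly the same amount of text, at the cost of buckets covering unequal spans of time.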

Phil 2.24.19

It is a miserable, rainy morning, so I’m working on extracting text blocks for analytics. Once I try the various packages on those blocks, I’ll work on breaking them into blocks.

Ok, that’s coming along well. Here’s an example:

Bren'Dralagon: Pushing through the vines, he steps out to meet the Orc..
(unknown distance clarity, if possible, rush down the stairs to the attack)

Bren'Dralagon: kk

Shelton Herrington: RIP

Keiri'to: first blood

Bren'Dralagon: *Hmm, my tailor will have questions on where that came from*

Shelton Herrington: how far across is the hazard? impossible to jump over?

Shelton Herrington: ok

Bren'Dralagon: close enough to attack?

Shelton Herrington: understood, just checking

Bren'Dralagon: if charging is allowed, since i just moved forward and would be turning i doubt it?, i'll charge

Lorelai: I thought the vines were (mostly) gone?

Shelton Herrington: *"this ingress is a formidable enemy"*

Bren'Dralagon: *Remind me to have those stairs cleaned. I know a guy*

Shelton Herrington: do i have a line of sight to either?

Now that I have some text, I’ll try the tools listed here: linguisticanalysistools.org. The whole suite is known as the Suite of Automatic Linguistic Analysis Tools (SALAT).

Which means… (bear with me here)

That these are tools for creating word salat!

I’ll be here all night folks. Be sure to try the fish…

Played with the tools, but I need a list of words to analyze the docs with respect to. LMN does a good job of this, so I tried it using the broken-out player and DM. It looks super interesting. This is BOW with the non-topic words “these, those, get, etc” ignored:

LMN-tymora1

Based on what I see here, I’m going to work on the bucketing and see if the top words change over time. If they do, then we can build a map in fewer steps
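
To check whether the top words change over time, a rough sketch: count bag-of-words frequencies per bucket while skipping an ignore list, then compare the top terms between buckets. The `IGNORE` set here is a stand-in (it is not the contents of ignore.xml), and `top_words` and the sample text are hypothetical.

```python
import re
from collections import Counter

# Stand-in ignore list of non-topic words
IGNORE = {"these", "those", "get", "the", "a", "to", "i"}


def top_words(texts, k=3, ignore=IGNORE):
    """Return the k most frequent non-ignored words in a list of strings."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in ignore)
    return [w for w, _ in counts.most_common(k)]


# Two made-up buckets of posts from different points in a session
early = ["the vines block the stairs", "cut the vines", "vines again"]
late = ["charge the orc", "the orc attacks", "orc down"]
print(top_words(early, 1), top_words(late, 1))
```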

Phil 2.22.19

7:00 – 4:00 ASRC

  • Running Ellen’s dungeon tonight
  • Wondering what to do next. Look at text analytics? List is in this post.
  • But before we do that, I need to extract from the DB posts as text. And now I have something to do!
    • Sheesh – tried to update the database and had all kinds of weird problems. I wound up re-ingesting everything from the Slack files. This seems to work fine, so I exported that to replace the .sql file that may have been causing all the trouble.
  • Here’s a thing using the JAX library, which I’m becoming interested in: Meta-Learning in 50 Lines of JAX
    • The focus of Machine Learning (ML) is to imbue computers with the ability to learn from data, so that they may accomplish tasks that humans have difficulty expressing in pure code. However, what most ML researchers call “learning” right now is but a very small subset of the vast range of behavioral adaptability encountered in biological life! Deep Learning models are powerful, but require a large amount of data and many iterations of stochastic gradient descent (SGD). This learning procedure is time-consuming and once a deep model is trained, its behavior is fairly rigid; at deployment time, one cannot really change the behavior of the system (e.g. correcting mistakes) without an expensive retraining process. Can we build systems that can learn faster, and with less data?
  • Meta-Learning: Learning to Learn Fast
    • A good machine learning model often requires training with a large number of samples. Humans, in contrast, learn new concepts and skills much faster and more efficiently. Kids who have seen cats and birds only a few times can quickly tell them apart. People who know how to ride a bike are likely to discover the way to ride a motorcycle fast with little or even no demonstration. Is it possible to design a machine learning model with similar properties — learning new concepts and skills fast with a few training examples? That’s essentially what meta-learning aims to solve.
  • Meta learning is everywhere! Learning to Generalize from Sparse and Underspecified Rewards
    • In “Learning to Generalize from Sparse and Underspecified Rewards“, we address the issue of underspecified rewards by developing Meta Reward Learning (MeRL), which provides more refined feedback to the agent by optimizing an auxiliary reward function. MeRL is combined with a memory buffer of successful trajectories collected using a novel exploration strategy to learn from sparse rewards.
  • Lingvo: A TensorFlow Framework for Sequence Modeling
    • While Lingvo started out with a focus on NLP, it is inherently very flexible, and models for tasks such as image segmentation and point cloud classification have been successfully implemented using the framework. Distillation, GANs, and multi-task models are also supported. At the same time, the framework does not compromise on speed, and features an optimized input pipeline and fast distributed training. Finally, Lingvo was put together with an eye towards easy productionisation, and there is even a well-defined path towards porting models for mobile inference.
  • Working on white paper. Still reading Command Dysfunction and making notes. I think I’ll use the idea of C&C combat as the framing device of the paper. Started to write more bits
  • What, if anything, can the Pentagon learn from this war simulator?
    • It is August 2010, and Operation Glacier Mantis is struggling in the fictional Saffron Valley. Coalition forces moved into the valley nine years ago, but peace negotiations are breaking down after a series of airstrikes result in civilian casualties. Within a few months, the Coalition abandons Saffron Valley. Corruption sapped the reputation of the operation. Troops are called away to a different war. Operation Glacier Mantis ends in total defeat.
  • Created a post for Command Dysfunction here. Finished.

Phil 2.14.19

7:00 – 7:00 ASRC

  • Worked on the whitepaper. Going down the chain of consequences with respect to adding AI to military systems in the light of the Starcraft2 research.
  • Maps of Meaning: The Architecture of Belief
    • A 1999 book by Canadian clinical psychologist and psychology professor Jordan Peterson. The book describes a comprehensive theory for how people construct meaning, in a way that is compatible with the modern scientific understanding of how the brain functions.[1] It examines the “structure of systems of belief and the role those systems play in the regulation of emotion”,[2] using “multiple academic fields to show that connecting myths and beliefs with science is essential to fully understand how people make meaning”.[3] Wikipedia
  • Continuing with Clockwork Muse review. Finished the overview and theoretical takes. Continuing on the notes, which is going slow because of bad text scanning
  • JAX is Autograd and XLA, brought together for high-performance machine learning research. With its updated version of Autograd, JAX can automatically differentiate native Python and NumPy functions. It can differentiate through loops, branches, recursion, and closures, and it can take derivatives of derivatives of derivatives. It supports reverse-mode differentiation (a.k.a. backpropagation) via grad as well as forward-mode differentiation, and the two can be composed arbitrarily to any order. What’s new is that JAX uses XLA to compile and run your NumPy programs on GPUs and TPUs. Compilation happens under the hood by default, with library calls getting just-in-time compiled and executed. But JAX also lets you just-in-time compile your own Python functions into XLA-optimized kernels using a one-function API, jit. Compilation and automatic differentiation can be composed arbitrarily, so you can express sophisticated algorithms and get maximal performance without leaving Python.
  • Working on white paper lit review
    • An Evolutionary Algorithm that Constructs Recurrent Neural Networks
      • Standard methods for simultaneously inducing the structure and weights of recurrent neural networks limit every task to an assumed class of architectures. Such a simplification is necessary since the interactions between network structure and function are not well understood. Evolutionary computations, which include genetic algorithms and evolutionary programming, are population-based search methods that have shown promise in many similarly complex tasks. This paper argues that genetic algorithms are inappropriate for network acquisition and describes an evolutionary program, called GNARL, that simultaneously acquires both the structure and weights for recurrent networks. GNARL’s empirical acquisition method allows for the emergence of complex behaviors and topologies that are potentially excluded by the artificial architectural constraints imposed in standard network induction methods
    • Added Evolutionary Deep Learning and Deep RTS to the references
  • Better Language Models and Their Implications
    • We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization — all without task-specific training.
  • Shimei seminar – 4:30 – 7:00

Phil 2.13.19

7:00 – 7:00 ASRC IRAD TL

  • The Digital Clockwork Muse: A Computational Model of Aesthetic Evolution
    • This paper presents a computational model of creativity that attempts to capture within a social context an important aspect of the art and design process: the search for novelty. The computational model consists of multiple novelty-seeking agents that can assess the interestingness of artworks. The agents can communicate to particularly interesting artworks to others. Agents can also communicate to reward other agents for finding interesting artworks. We present the results from running experiments to investigate the effects of searching for different degrees of novelty on the artworks produced and the social organisation of the agents.
  • Upload the rest of Slack Tymora.
  • Create some txt files and feed into LMN. I’m thinking of by player and then by slice. Do this for both PHPBB and Slack data. Other ideas
    • Look into coherence measures
    • Are there economic models of attention? (arXiv)
    • TAACO is an easy to use tool that calculates 150 indices of both local and global cohesion, including a number of type-token ratio indices (including specific parts of speech, lemmas, bigrams, trigrams and more), adjacent overlap indices (at both the sentence and paragraph level), and connectives indices.
    • CRAT is an easy to use tool that includes over 700 indices related to lexical sophistication, cohesion and source text/summary text overlap. CRAT is particularly well suited for the exploration of writing quality as it relates to summary writing.
    •  TAALED is an analysis tool designed to calculate a wide variety of lexical diversity indices. Homographs are disambiguated using part of speech tags, and indices are calculated using lemma forms. Indices can also be calculated using all lemmas, content lemmas, or function lemmas. Also available is diagnostic output which allows the user to see how TAALED processed each word.
    • TAALES is a tool that measures over 400 classic and new indices of lexical sophistication, and includes indices related to a wide range of sub-constructs.  TAALES indices have been used to inform models of second language (L2) speaking proficiency, first language (L1) and L2 writing proficiency, spoken and written lexical proficiency, genre differences, and satirical language.
    • SEANCE is an easy to use tool that includes 254 core indices and 20 component indices based on recent advances in sentiment analysis. In addition to the core indices, SEANCE allows for a number of customized indices including filtering for particular parts of speech and controlling for instances of negation.
    • TAASSC is an advanced syntactic analysis tool. It measures a number of indices related to syntactic development. Included are classic indices of syntactic complexity (e.g., mean length of T-unit) and fine-grained indices of phrasal (e.g., number of adjectives per noun phrase) and clausal (e.g., number of adverbials per clause) complexity. Also included are indices that are grounded in usage-based perspectives to language acquisition that rely on frequency profiles of verb argument constructions.
    • GAMET is an easy to use tool that provides incidence counts for structural and mechanics errors in texts including grammar, spelling, punctuation, white space, and repetition errors. The tool also provides line output for the errors flagged in the text.
    • Comparison of Top 6 Python NLP Libraries
      • NLTK (Natural Language Toolkit) is used for such tasks as tokenization, lemmatization, stemming, parsing, POS tagging, etc. This library has tools for almost all NLP tasks.
      • Spacy is the main competitor of the NLTK. These two libraries can be used for the same tasks.
      • Scikit-learn provides a large library for machine learning. The tools for text preprocessing are also presented here.
      • Gensim is the package for topic and vector space modeling, document similarity.
      • The general mission of the Pattern library is to serve as the web mining module. So, it supports NLP only as a side task.
      • Polyglot is the yet another python package for NLP. It is not very popular but also can be used for a wide range of the NLP tasks.
  • Continuing writing Clockwork Muse review
  • Reading Attachment 1 to BAA FA8750-18-S-7014. “While white papers will be considered if received prior to 6:00 PM Eastern Standard Time (EST) on 30 Sep 2022, the following submission dates are suggested to best align with projected funding:” 
    • FY20 – 15 April 2019
  • AIMS/ML Meeting. Not sure what the outcome was, other than folks are covered for this quarter?
  • Long, wide ranging meeting with Wayne at Frisco’s. Gave him an account on Antibubbles.com. And it seems like we won first place for Blue Sky papers?

Phil 2.12.19

7:00 – 4:30 ASRC IRAD

  • Talked with Eric yesterday. Going to write up a white paper about teachable AI. Two-to-three-week effort
  • Speaking of which, The Evolved Transformer
    • Recent works have highlighted the strengths of the Transformer architecture for dealing with sequence tasks. At the same time, neural architecture search has advanced to the point where it can outperform human-designed models. The goal of this work is to use architecture search to find a better Transformer architecture. We first construct a large search space inspired by the recent advances in feed-forward sequential models and then run evolutionary architecture search, seeding our initial population with the Transformer. To effectively run this search on the computationally expensive WMT 2014 English-German translation task, we develop the progressive dynamic hurdles method, which allows us to dynamically allocate more resources to more promising candidate models. The architecture found in our experiments – the Evolved Transformer – demonstrates consistent improvement over the Transformer on four well-established language tasks: WMT 2014 English-German, WMT 2014 English-French, WMT 2014 English-Czech and LM1B. At big model size, the Evolved Transformer is twice as efficient as the Transformer in FLOPS without loss in quality. At a much smaller – mobile-friendly – model size of ~7M parameters, the Evolved Transformer outperforms the Transformer by 0.7 BLEU on WMT’14 English-German.
  • Finished running Tymora1 on Slack. Downloaded, though the download didn’t include research_notes. Hmmm. Looks like I can’t make it public, either.
  • Thinking about writing a tagging app, possibly with a centrality capability.
  • Started on the Teachable AI paper. The rough outline is there, and I have a good set of references.

Phil 2.11.19

7:00 – 5:00 ASRC IRAD (TL)

  • Gen Studio is a way to navigate between designs in latent space. It is a prototype concept which was created over a two-day hackathon with collaborators across The Met, Microsoft, and MIT.
  • Write up Clockwork Muse
  • Continue with parsing, storing and report generation of slack data. Aaron had the idea that multiple statements by one person should be combined into a single post. Need to think about how that works in the report generation. Since the retrieved list is ordered by timestamp, the naive implementation is to accumulate text into a single post as long as the same person is “talking”
  • Pinged back to Panos about JuryRoom. The original email evaporated, so I tried again…
  • Setting up a meeting with Wayne for Wednesday?
  • Fika – nope, meeting with Eric instead. The goal is to write up a whitepaper for human in the loop AI
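
The naive implementation of Aaron's idea (accumulate text into a single post as long as the same person is "talking") could look something like the sketch below. It assumes the posts arrive as `(timestamp, user, text)` tuples already ordered by timestamp; `merge_consecutive` and the sample rows are hypothetical, not the report-generation code.

```python
from itertools import groupby


def merge_consecutive(posts):
    """Collapse runs of consecutive posts by the same user into one post.

    posts: list of (timestamp, user, text), already ordered by timestamp.
    """
    merged = []
    # groupby only groups adjacent items, which is exactly the
    # "same person is still talking" rule
    for user, run in groupby(posts, key=lambda p: p[1]):
        run = list(run)
        text = " ".join(p[2] for p in run)
        merged.append((run[0][0], user, text))  # keep the first timestamp
    return merged


sample = [
    (1, "Shelton Herrington", "how far across is the hazard?"),
    (2, "Shelton Herrington", "ok"),
    (3, "Bren'Dralagon", "close enough to attack?"),
    (4, "Shelton Herrington", "understood, just checking"),
]
for ts, user, text in merge_consecutive(sample):
    print(ts, user, text)
```

One open design choice is which timestamp the combined post should carry; this sketch keeps the first one, which matters if the merged posts are later time-bucketed.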

Phil 2.8.19

7:00 – 6:00 ASRC IRAD TL

  • Need to ping Eric about tasking. Suggest time series prediction. Speaking of which, Transformers (post 1 and post 2) may be much better than LSTMs for series prediction.
    • The Transformer model in Attention is all you need: a Keras implementation.
      • A Keras+TensorFlow Implementation of the Transformer: “Attention is All You Need” (Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin, arxiv, 2017)
    • keras-transformer 0.17.0
      • Implementation of transformer for translation-like tasks.
    • The other option is “teachable” ML systems using evolution. There is a lot of interesting older work in this area:
      • Particle swarms for feedforward neural network training
      • Evolving artificial neural networks
      • Training Feedforward Neural Networks Using Genetic Algorithms.
        • Multilayered feedforward neural networks possess a number of properties which make them particularly suited to complex pattern classification problems. However, their application to some real world problems has been hampered by the lack of a training algorithm which reliably finds a nearly globally optimal set of weights in a relatively short time. Genetic algorithms are a class of optimization procedures which are good at exploring a large and complex space in an intelligent way to find values close to the global optimum. Hence, they are well suited to the problem of training feedforward networks. In this paper, we describe a set of experiments performed on data from a sonar image classification problem. These experiments both 1) illustrate the improvements gained by using a genetic algorithm rather than backpropagation and 2) chronicle the evolution of the performance of the genetic algorithm as we added more and more domain-specific knowledge into it.
  • Add writing to the db from within the program, download the latest slack bundle, and try storing it!
  • Read in test-dungeon-1 and realized that there is no explicit link between the channel and the message in the data, so I added fields for the current directory and the current file
  • Ok, everything seems to be working. I had a few trips around the block getting a unique id for messages, but that seems ok now.
  • Created view(s), where I learned how to use conditionals and was happy:
    -- Sanity checks on the raw tables
    SELECT * FROM t_message;
    SELECT * FROM t_message_files;
    
    -- One row per user; fall back to the account name when display_name is empty or NULL
    CREATE OR REPLACE VIEW user_view AS
    SELECT u.id, p.email, p.real_name,
           (CASE WHEN p.display_name > '' THEN p.display_name ELSE u.name END) AS username
    FROM t_user u
           INNER JOIN t_user_profile p ON u.id = p.parent_id;
    
    SELECT * FROM user_view;
    
    -- One row per message; prefer the subtype when one is set
    CREATE OR REPLACE VIEW post_view AS
    SELECT FROM_UNIXTIME(p.ts) AS post_time, p.dirname AS post_topic, p.text AS post_text, u.username,
           (CASE WHEN p.subtype > '' THEN p.subtype ELSE p.type END) AS type
    FROM t_message p
           INNER JOIN user_view u ON p.user = u.id;
    
    SELECT * FROM post_view ORDER BY post_time LIMIT 1000;

     

  • Need to put together a strawman invitation that also has checkboxes for BB-based and/or Slack-based preferences and why a user might choose one over the other. Nope, not yet
  • Got the Slack academic discount!

Phil 2.7.19

7:00 – 7:00 ASRC IRAD TL

  • Continuing with Slack to DB process. I should be done with channels, and now I need to get conversations done.
    • The secondary tables that point to the primary user and conversation tables and the tertiary tables that point at them need to be looked at based on what happens when we go past the 10k limit (assuming that I can’t get the discount on the Standard Plan). REPLACE INTO won’t work with an auto incrementing primary key
    • Got all the parts working, now I need to automate and try out on Tymora1
    • Need to write up a letter for Don to sign – done
    • I think Emily is having a run tonight? Nope
    • Added a research_notes section to Slack for Aaron and me right now. I think I’ll invite Wayne as well – done! Need to know
  • Submitted expenses for TL trip
  • Was officially not invited to the TF dev conf
  • Shimei’s group meeting 4:30 – 7:00
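
One likely reason REPLACE INTO doesn't play well with an auto-incrementing primary key, as noted above: REPLACE deletes the conflicting row and inserts a fresh one, so the row gets a new id and anything in a secondary table that pointed at the old id is left dangling. The sketch below shows the effect with Python's built-in sqlite3 as a stand-in for MySQL (MySQL's REPLACE is also a delete followed by an insert). The `slack_id` column is a hypothetical natural key, and the tables are cut down to the bare minimum.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t_user ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "slack_id TEXT UNIQUE, name TEXT)")  # slack_id is hypothetical
con.execute("CREATE TABLE t_user_profile (parent_id INTEGER, email TEXT)")


def user_id(slack_id):
    """Look up the auto-generated id for a given natural key."""
    cur = con.execute("SELECT id FROM t_user WHERE slack_id = ?", (slack_id,))
    return cur.fetchone()[0]


# First import: a user plus a profile row that points at it
con.execute("INSERT INTO t_user (slack_id, name) VALUES ('U1', 'phil')")
old_id = user_id("U1")
con.execute("INSERT INTO t_user_profile VALUES (?, 'phil@example.com')", (old_id,))

# Second import of the same user: REPLACE deletes the old row, then inserts
con.execute("REPLACE INTO t_user (slack_id, name) VALUES ('U1', 'phil')")
new_id = user_id("U1")

# Profile rows whose parent no longer exists
orphans = con.execute(
    "SELECT COUNT(*) FROM t_user_profile "
    "WHERE parent_id NOT IN (SELECT id FROM t_user)").fetchone()[0]
print("old id:", old_id, "new id:", new_id, "orphaned profiles:", orphans)
```

If re-imports are needed, an update-in-place statement keyed on the natural key keeps the generated id stable, so the secondary and tertiary tables stay attached.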

Phil 2.6.19

7:00 – 5:00 ASRC IRAD (TL)

  • The role of maps in spatial knowledge acquisition
    • The Cartographic Journal
    • One goal of cartographic research is to improve the usefulness of maps. To do so, we must consider the process of spatial knowledge acquisition, the role of maps in that process, and the content of cognitive representations derived. Research from psychology, geography, and other disciplines related to these issues is reviewed. This review is used to suggest potential new directions for research with particular attention to spatial problem solving and geographic instruction. A classroom experiment related to these issues is then described. The experiment highlights some of the implications that a concern for the process of spatial knowledge acquisition will have on questions and methods of cartographic research as well as on the use of maps in geographic instruction. It also provides evidence of independent but interrelated verbal and spatial components of regional images that can be altered by directed map work.
  • It’s Not A Lie If You Believe It: Lying and Belief Distortion Under Norm-Uncertainty
    • This paper focuses on norm-following considerations as motivating behavior when lying opportunities are present. To obtain evidence on what makes it harder/easier to lie, we hypothesize that subjects might use belief-manipulation in order to justify their lying. We employ a two-stage variant of a cheating paradigm, in which subjects’ beliefs are elicited in stage 1 before performing the die task in stage 2. In stage 1: a) we elicit the subjects’ beliefs about majoritarian (i) behavior or (ii) normative beliefs in a previous session, and b) we vary whether participants are (i) aware or (ii) unaware of the upcoming opportunity to lie. We show that belief manipulation happens, and takes the form of people convincing themselves that lying behavior is widespread. In contrast with beliefs about the behavior of others, we find that beliefs about their normative convictions are not distorted, since believing that the majority disapproves of lying does not inhibit own lying. These findings are consistent with a model where agents are motivated by norm-following concerns, and honest behavior is a strong indicator of disapproval of lying but disapproval of lying is not a strong indicator of honest behavior. We provide evidence that supports this hypothesis.
  • Sent a note to Slack, asking for an academic plan. They do, and there are forms to fill out. I need to send Don some text that he can send back to me on letterhead.
  • Looks like I’m not going to the TF Dev Conf this year…
  • Continuing with the INSERT code
  • Meeting in Greenbelt to discuss… what, exactly?
  • Got a cool book: A Programmer’s Introduction to Mathematics
  • Got my converter creating error-free SQL!
  • Working on reading channel data into the db. Possibly done, but I’m afraid to run it so late in the day. I have chores!
  • Reviewing proposal for missing citations – done

Phil 2.5.19

7:00 – 5:00 ASRC IRAD

  • Got the parser to the point that it’s creating query strings, but I need to escape the text properly
  • Created an ab_slack MySQL db
  • Added “parent_id” and an auto increment ID to any of the arrays that are associated with the Slack data
  • Reviewing sections 1-3 – done
  • Figure out some past performance – done
  • Work on the CV. Add the GF work and A2P ML work. – done
  • Start reimbursement for NJ trip
  • Accidentally managed to start a $45/month subscription to the IEEE digital library. It really reeks of deceptive practices. There is nothing on the subscription page that informs you that this is a $45/month, 6-month minimum purchase. I’m about to contact the Maryland deceptive practices people to see if there is legal action that can be brought.
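On the escaping problem noted above: if the inserts are run from Python rather than written out to a .sql file, one option is to let the driver do the quoting with a parameterized query. This is only a minimal sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL, and the table and column names (t_message, parent_id, text) are hypothetical. Most MySQL drivers use %s placeholders where sqlite3 uses ?.

```python
import sqlite3

# Hypothetical table, loosely modelled on the notes: an auto-increment id
# plus a parent_id that points back at the record the row came from.
SCHEMA = """
CREATE TABLE t_message (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    parent_id INTEGER,
    text TEXT
)
"""


def insert_message(conn, parent_id, text):
    """Insert one row and return its generated id.

    The values are passed separately from the SQL string, so the driver
    handles quoting and no manual escaping is needed.
    """
    cur = conn.execute(
        "INSERT INTO t_message (parent_id, text) VALUES (?, ?)",
        (parent_id, text),
    )
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real db
conn.execute(SCHEMA)

# Text that would break a naively concatenated query string
tricky = "It's a \"quoted\" string; DROP TABLE t_message; --"
row_id = insert_message(conn, None, tricky)
child_id = insert_message(conn, row_id, "reply to the first message")

stored = conn.execute(
    "SELECT text FROM t_message WHERE id = ?", (row_id,)
).fetchone()[0]
```

The tricky string comes back unchanged, and the second row carries the first row's id as its parent_id.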

Phil 1.30.19

7:00 – 4:00 ASRC IRAD

Teaching a neural network to drive a car. It’s a simple network with a fixed number of hidden nodes (no NEAT), and no bias. Yet it manages to drive the cars fast and safe after just a few generations. Population is 650. The network evolves through random mutation (no cross-breeding). Fitness evaluation is currently done manually as explained in the video.
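A rough sketch of the mutation-only scheme described above: a fixed-size network with no bias terms, no cross-breeding, and each generation built from mutated copies of the best performers. The video scores fitness by hand, so the automatic toy_fitness here is an assumption, as are the layer sizes, mutation rate, and the much smaller population.

```python
import math
import random


def make_network(n_in, n_hidden, n_out, rng):
    """Two weight matrices (input->hidden, hidden->output), no bias terms."""
    return [
        [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)],
        [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)],
    ]


def forward(net, inputs):
    """Feed inputs through each layer with a tanh activation."""
    activations = inputs
    for layer in net:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)))
            for row in layer
        ]
    return activations


def mutate(net, rng, rate=0.1, scale=0.5):
    """Return a copy of net with some weights nudged by Gaussian noise."""
    return [
        [[w + rng.gauss(0, scale) if rng.random() < rate else w for w in row]
         for row in layer]
        for layer in net
    ]


# Assumed toy task: steer toward whichever side has more room.
SAMPLES = [(0.0, 1.0), (1.0, 0.0), (0.5, 0.5), (0.2, 0.8), (0.9, 0.1)]


def toy_fitness(net):
    """Negative squared error against a made-up steering target."""
    return -sum((forward(net, [l, r])[0] - (l - r)) ** 2 for l, r in SAMPLES)


def evolve(fitness, rng, pop_size=50, generations=20, keep=5):
    """Mutation-only evolution: the top `keep` networks survive unchanged,
    and the rest of the population is mutated copies of them."""
    population = [make_network(2, 4, 1, rng) for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        best_history.append(fitness(ranked[0]))
        parents = ranked[:keep]
        population = parents + [
            mutate(rng.choice(parents), rng) for _ in range(pop_size - keep)
        ]
    return max(population, key=fitness), best_history


best, history = evolve(toy_fitness, random.Random(0))
```

Because the parents are carried over unchanged, the best score in history can never drop from one generation to the next.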

  • This interactive balance between evolution and learning is exactly the sort of interaction that I think should be at the core of the research browser. The only addition is the ability to support groups collaboratively interacting with the information so that multiple analysts can train the system.
  • A quick thing on the power of belief spaces from a book review about, of all things, Hell. One of the things that gives dimension to a belief space is the fact that people show up.
    • Soon, he’d left their church and started one of his own, where he proclaimed his lenient gospel, pouring out pity and anger for those Christians whose so-called God was a petty torturer, until his little congregation petered out. Assured salvation couldn’t keep people in pews, it turned out. The whole episode, in its intensity and its focus on the stakes of textual interpretation, was reminiscent of Lucas Hnath’s recent play “The Christians,” about a pastor who comes out against Hell and sparks not relief but an exegetical nightmare.
  • Web Privacy Measurement in Real-Time Bidding Systems. A Graph-Based Approach to Rtb System Classification.
    • In the doctoral thesis, Robbert J. van Eijk investigates the advertisements online that seem to follow you. The technology enabling the advertisements is called Real-Time Bidding (RTB). An RTB system is defined as a network of partners enabling big data applications within the organizational field of marketing. The system aims to improve sales by real-time data-driven marketing and personalized (behavioral) advertising. The author applies network science algorithms to arrive at measuring the privacy component of RTB. In the thesis, it is shown that cluster-edge betweenness and node betweenness support us in understanding the partnerships of the ad-technology companies. From our research it transpires that the interconnection between partners in an RTB network is caused by the data flows of the companies themselves due to their specializations in ad technology. Furthermore, the author provides that a Graph-Based Methodological Approach (GBMA) controls the situation of differences in consent implementations in European countries. The GBMA is tested on a dataset of national and regional European news websites.
  • Continuing with Tkinter and ttk
      • That was easy!
        • app3
      • And now there is a scrollbar, which is a little odd to add. The listbox and the scrollbar are separate components that you have to explicitly link to each other and place in the same ttk.Frame:
    # make the frame for the listbox and the scroller to live in
    self.lbox_frame = ttk.Frame(self.content_frame)
    
    # place the frame 
    self.lbox_frame.grid(column=0, row=0, rowspan=6, sticky=(N,W,E,S))
    
    # create the listbox and the scrollbar
    self.lbox = Listbox(self.lbox_frame, listvariable=self.cnames, height=5)
    lbox_scrollbar = ttk.Scrollbar(self.lbox_frame, orient=VERTICAL, command=self.lbox.yview)
    
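    # after both components have been made, have the lbox point at the scroller
    self.lbox['yscrollcommand'] = lbox_scrollbar.set

    # place both widgets side by side inside the shared frame
    self.lbox.grid(column=0, row=0, sticky=(N,W,E,S))
    lbox_scrollbar.grid(column=1, row=0, sticky=(N,S))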

     

    • If you get this wrong, then you can end up with a scrollbar in some other Frame, connected to your target. Here’s what happens if the parent is root:
      • badscroller
    • And here is where it’s in the lbox frame as in the code example above:
      • goodscroller
    • The fully formed examples are no more. Putting together a menu app with text. Got the text running with a scrollbar, and everything makes sense. Next is the menus… scrollingtext
    • Here’s the version of the app with working menus: slackdbio
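For reference, the listbox/scrollbar linkage from the notes above can be pulled into one self-contained helper. This is a sketch, not the code from the app: the function name build_scrolling_listbox and its return values are made up for illustration. Both widgets get the same ttk.Frame as parent, and each is told about the other.

```python
import tkinter as tk
from tkinter import ttk


def build_scrolling_listbox(parent, names):
    """Create a frame holding a Listbox and a vertical Scrollbar.

    Hypothetical helper: returns (frame, listbox, scrollbar, names_var).
    """
    # the frame for the listbox and the scroller to live in
    frame = ttk.Frame(parent)
    frame.grid(column=0, row=0, sticky="nsew")

    # the listbox reads its items from this variable; it is returned so the
    # caller can keep a reference and update the list later
    names_var = tk.StringVar(master=parent, value=tuple(names))

    # both widgets share the same parent frame
    lbox = tk.Listbox(frame, listvariable=names_var, height=5)
    scrollbar = ttk.Scrollbar(frame, orient=tk.VERTICAL, command=lbox.yview)

    # link in the other direction: the listbox reports its view to the scroller
    lbox["yscrollcommand"] = scrollbar.set

    # place them side by side and let the listbox take any extra space
    lbox.grid(column=0, row=0, sticky="nsew")
    scrollbar.grid(column=1, row=0, sticky="ns")
    frame.columnconfigure(0, weight=1)
    frame.rowconfigure(0, weight=1)

    return frame, lbox, scrollbar, names_var
```

To try it, create a root with tk.Tk(), call build_scrolling_listbox(root, [...]) with more than five names so the scrollbar has something to do, and then call root.mainloop().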
  • For seminar: Predictive Analysis by Leveraging Temporal User Behavior and User Embeddings
    • The rapid growth of mobile devices has resulted in the generation of a large number of user behavior logs that contain latent intentions and user interests. However, exploiting such data in real-world applications is still difficult for service providers due to the complexities of user behavior over a sheer number of possible actions that can vary according to time. In this work, a time-aware RNN model, TRNN, is proposed for predictive analysis from user behavior data. First, our approach predicts next user action more accurately than the baselines including the n-gram models as well as two recently introduced time-aware RNN approaches. Second, we use TRNN to learn user embeddings from sequences of user actions and show that overall the TRNN embeddings outperform conventional RNN embeddings. Similar to how word embeddings benefit a wide range of tasks in natural language processing, the learned user embeddings are general and could be used in a variety of tasks in the digital marketing area. This claim is supported empirically by evaluating their utility in user conversion prediction, and preferred application prediction. According to the evaluation results, TRNN embeddings perform better than the baselines including Bag of Words (BoW), TFIDF and Doc2Vec. We believe that TRNN embeddings provide an effective representation for solving practical tasks such as recommendation, user segmentation and predictive analysis of business metrics.

Phil 1.29.19

7:00 – 5:30 ASRC IRAD

  • Theories of Error Back-Propagation in the Brain
    • This review article summarises recently proposed theories on how neural circuits in the brain could approximate the error back-propagation algorithm used by artificial neural networks. Computational models implementing these theories achieve learning as efficient as artificial neural networks, but they use simple synaptic plasticity rules based on activity of presynaptic and postsynaptic neurons. The models have similarities, such as including both feedforward and feedback connections, allowing information about error to propagate throughout the network. Furthermore, they incorporate experimental evidence on neural connectivity, responses, and plasticity. These models provide insights on how brain networks might be organised such that modification of synaptic weights on multiple levels of cortical hierarchy leads to improved performance on tasks.
  • Interactive Machine Learning by Visualization: A Small Data Solution
    • Machine learning algorithms and traditional data mining process usually require a large volume of data to train the algorithm-specific models, with little or no user feedback during the model building process. Such a “big data” based automatic learning strategy is sometimes unrealistic for applications where data collection or processing is very expensive or difficult, such as in clinical trials. Furthermore, expert knowledge can be very valuable in the model building process in some fields such as biomedical sciences. In this paper, we propose a new visual analytics approach to interactive machine learning and visual data mining. In this approach, multi-dimensional data visualization techniques are employed to facilitate user interactions with the machine learning and mining process. This allows dynamic user feedback in different forms, such as data selection, data labeling, and data correction, to enhance the efficiency of model building. In particular, this approach can significantly reduce the amount of data required for training an accurate model, and therefore can be highly impactful for applications where large amount of data is hard to obtain. The proposed approach is tested on two application problems: the handwriting recognition (classification) problem and the human cognitive score prediction (regression) problem. Both experiments show that visualization supported interactive machine learning and data mining can achieve the same accuracy as an automatic process can with much smaller training data sets.
  • Shifted Maps: Revealing spatio-temporal topologies in movement data
    • We present a hybrid visualization technique that integrates maps into network visualizations to reveal and analyze diverse topologies in geospatial movement data. With the rise of GPS tracking in various contexts such as smartphones and vehicles there has been a drastic increase in geospatial data being collected for personal reflection and organizational optimization. The generated movement datasets contain both geographical and temporal information, from which rich relational information can be derived. Common map visualizations perform especially well in revealing basic spatial patterns, but pay less attention to more nuanced relational properties. In contrast, network visualizations represent the specific topological structure of a dataset through the visual connections of nodes and their positioning. So far there has been relatively little research on combining these two approaches. Shifted Maps aims to bring maps and network visualizations together as equals. The visualization of places shown as circular map extracts and movements between places shown as edges, can be analyzed in different network arrangements, which reveal spatial and temporal topologies of movement data. We implemented a web-based prototype and report on challenges and opportunities about a novel network layout of places gathered during a qualitative evaluation.
    • Demo!
  • More Tkinter.
    • Starting Modern Tkinter for Busy Python Developers
    • Spent a good deal of time working through how to get an image to appear. There are two issues:
      • Loading file formats:
        from tkinter import *
        from tkinter import ttk
        from PIL import Image, ImageTk
      • This is because Tkinter doesn’t natively load much beyond GIF (plus PNG and PGM/PPM with Tk 8.6). However, there is the Python Imaging Library (PIL), which handles many more formats, including JPEG. Since the original PIL is no longer maintained, install Pillow instead. It looks like the import and bindings are the same.
      • dealing with garbage collection (“self” keeps the pointer alive):
        image = Image.open("hal.jpg")
        self.photo = ImageTk.PhotoImage(image)
        ttk.Label(mainframe, image=self.photo).grid(column=1, row=1, sticky=(W, E))
      • The issue is that the widget only holds the Tk name of the image, not a Python reference to the PhotoImage object. If the local variable that contains the reference goes out of scope, Python’s garbage collector scoops the object up and the underlying Tk image is deleted before the picture can even appear, so the label has nothing to draw. If you store the reference on the instance (i.e. self.xxx), then the reference is maintained and everything works.
    • The relevant stack overflow post.
    • A pretty picture of everything working:
      • app
  • The 8.6.9 Tk/Ttk documentation
  • Looks like there are some WYSIWYG tools for building pages. PyGubu looks like it’s got the most recent activity.
  • Now my app resizes on grid layouts: app2