  • Had a thought that the incomprehension that comes from misalignment that Stephens shows resembles polarizing light. I need to add a slider that enables influence as a function of alignment. Done
    • Getting the direction cosine between the source and target belief
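      // Direction cosine between this agent's belief and the other agent's (both unit vectors)
      double interAgentDotProduct = unitOrientVector.dotProduct(otherUnitOrientVector);
      // Clamp to [-1, 1] on both sides so rounding error can't make acos() return NaN
      double cosTheta = Math.max(-1.0, Math.min(1.0, interAgentDotProduct));
      // Angle between the beliefs in degrees: 0 = aligned, 180 = opposed
      double beliefAlignment = Math.toDegrees(Math.acos(cosTheta));
      // Normalize to [0, 1]: 1 = aligned, 0 = opposed
      double interAgentAlignment = (1.0 - beliefAlignment/180.0);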
    • Adding a global variable that sets how much influence (0% – 100%) an opposing agent has. Just setting it to on/off for now, because the effects are actually pretty subtle
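A minimal sketch of how the alignment value above could gate influence. This is not the code in the sim: the class name, `influenceWeight`, and the `opposingInfluence` parameter (standing in for the slider / global variable) are hypothetical, and the linear blend is just one plausible choice.

```java
class AlignmentInfluence {
    /** Maps the angle between two belief vectors to [0, 1]: 1 = same direction, 0 = opposite. */
    static double alignment(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("belief vectors must be non-zero");
        }
        // Normalize and clamp on both sides so acos() never sees a value outside [-1, 1]
        double cosTheta = Math.max(-1.0, Math.min(1.0, dot / Math.sqrt(normA * normB)));
        return 1.0 - Math.acos(cosTheta) / Math.PI;
    }

    /**
     * Assumed blend: opposingInfluence = 1.0 means alignment is ignored (full influence),
     * opposingInfluence = 0.0 means a fully opposed agent has no influence at all.
     */
    static double influenceWeight(double alignment, double opposingInfluence) {
        return opposingInfluence + (1.0 - opposingInfluence) * alignment;
    }
}
```

With the on/off setting described above, `opposingInfluence` would only ever be 0.0 or 1.0; intermediate values are what a slider would add.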
    Need to reduce complexity and add clearly labeled sections, in particular methods
  I need to start paying attention to attention
  • Also, keeping this on the list: How social media took us from Tahrir Square to Donald Trump by Zeynep Tufekci
  • Social Identity Threat Motivates Science – Discrediting Online Comments
    • Experiencing social identity threat from scientific findings can lead people to cognitively devalue the respective findings. Three studies examined whether potentially threatening scientific findings motivate group members to take action against the respective findings by publicly discrediting them on the Web. Results show that strongly (vs. weakly) identified group members (i.e., people who identified as “gamers”) were particularly likely to discredit social identity threatening findings publicly (i.e., studies that found an effect of playing violent video games on aggression). A content analytical evaluation of online comments revealed that social identification specifically predicted critiques of the methodology employed in potentially threatening, but not in non-threatening research (Study 2). Furthermore, when participants were collectively (vs. self-) affirmed, identification did no longer predict discrediting posting behavior (Study 3). These findings contribute to the understanding of the formation of online collective action and add to the burgeoning literature on the question why certain scientific findings sometimes face a broad public opposition.

There was no collusion “…” Anyone involved in that meddling to justice.

Premises for Data Science Magical Realism

  • What follows are some premises for data science magical realism stories based (very, very loosely) on experiences I’ve had or heard about — premises, that is, for stories about impossible, absurd, magical things happening to data scientists in ordinary data science situations. Enjoy!
  • More from David Masad

Program Synthesis in 2017-18

  • A high-level overview of the recent ideas and representative papers in program synthesis as of mid-2018.
  • Alex (Oleksandr) Polozov, a researcher in the Deep Procedural Intelligence group at Microsoft Research AI, Redmond. I work on neural program synthesis from input-output examples and natural language, intersections of machine learning and software engineering, and neuro-symbolic architectures. I am particularly interested in combining neural and symbolic techniques to tackle the next generation of AI problems, including program synthesis, planning, and reasoning.

UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction | SciPy 2018 | (video) (paper)

  • UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP as described has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.
  • This could be nice for building maps

  • Progress on getting my keys back!
  • Got everyone’s response on the Doodle, but only 4 of the 5 line up…
  • Finish first pass through PhD review slides
  • Start SASO slides and poster?
  • Continue with exporting terms from the sim and importing them into Python. One of the things that will matter is the tagging of the data with the seed terms from the sim as well as the cell name, so that reconstructions can be compared for accuracy.
  • Added the cell location to each <sampleData> so that there can be some kind of tagging/ground truth about the maps we’re inferring.
  • Working on iterating through the etree hierarchy. I can now read in the file, parse it and get elements that I’m looking for.
  • Tomorrow will be pulling the seed words out of the code in an ordered list. Generated sentences will need to be timestamped so that conversations can be reconstructed. That being said, it could be interesting to take seed words out of a generated sentence and add them to the embedding seed words. Something to think about.
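A rough sketch of the read-and-tag step, written in Java with the standard DOM parser rather than the Python etree code I'm actually using. Only the `<sampleData>` element name comes from the export; the `cell` attribute and the idea that the text sits in the element body are assumptions about the file layout.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class SampleDataReader {
    /** Returns one {cell, text} pair per <sampleData> element, in document order. */
    static List<String[]> readSamples(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("sampleData");
            List<String[]> samples = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                // "cell" is an assumed attribute name for the cell location (the ground truth tag)
                samples.add(new String[] { e.getAttribute("cell"), e.getTextContent().trim() });
            }
            return samples;
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse sample XML", ex);
        }
    }
}
```

Keeping the cell tag next to each sample is what would let a reconstructed map be scored against where the agent actually was.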

  • Twitter experiment on a fake Gary Indiana secession. IFTTT retweeting leads to interesting behavior.
  • Fixed FlockingShape casting by adding a customDrawStep(GraphicsContext gc) to the SmartShape base class that’s called from draw().
  • Add records to each agent that store a list of source and agent influences at each time sample. It should include the name of the item and the amount of influence. Probably save as an XML file, since it has too many dimensions. The file could then be used to create terms or spreadsheets.
    • Started on the CAInfluence class, which will be added to CA classes in an ArrayList in BaseCA
  • More file conversion with Bob – and everything worked great until I tried one after Bob left. Ka-BOOM!
    • Installed all the packages to get everything to run in the debugger. Found what appears to be a perfectly good line “range” that causes the problem? Will start debugging on Wednesday.
  • Project MERCATOR proposal
  • Meeting with Sy
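For the per-agent influence records mentioned above, here is a minimal sketch of what the bookkeeping might look like. `InfluenceLog` and `Entry` are hypothetical stand-ins, not the actual CAInfluence class, and the XML element and attribute names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class InfluenceLog {
    /** One sample: when it happened, what influenced the agent, and by how much. */
    static final class Entry {
        final double time;
        final String sourceName;
        final double amount;

        Entry(double time, String sourceName, double amount) {
            this.time = time;
            this.sourceName = sourceName;
            this.amount = amount;
        }
    }

    private final String agentName;
    private final List<Entry> entries = new ArrayList<>();

    InfluenceLog(String agentName) {
        this.agentName = agentName;
    }

    void record(double time, String sourceName, double amount) {
        entries.add(new Entry(time, sourceName, amount));
    }

    /** Escapes the characters that would break an XML attribute value. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Serializes the log as one <agent> element with one <influence> child per sample. */
    String toXml() {
        StringBuilder sb = new StringBuilder();
        sb.append("<agent name=\"").append(escape(agentName)).append("\">\n");
        for (Entry e : entries) {
            sb.append(String.format(Locale.ROOT,
                    "  <influence time=\"%.2f\" source=\"%s\" amount=\"%.3f\"/>\n",
                    e.time, escape(e.sourceName), e.amount));
        }
        sb.append("</agent>\n");
        return sb.toString();
    }
}
```

One flat `<influence>` row per (time, source) pair keeps the file easy to pivot into a spreadsheet later.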

  • Add an attractor scalar for agents that’s normally zero. A vector to each agent within the SIH is calculated and scaled by the attractor scalar. That vector is then added to the direction vector to the agent – done
  • Remove the heading influence based on site – done
  • Add a white circle to the center of the agent that is the size of the attraction scalar. Done
  • Add attraction radius slider that is independent of the SIH. -done
  • Add a ‘site trajectory’ to the spreadsheet that will have the site lists (and their percentage?)
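The attractor step above can be sketched roughly like this. It is a 2-D illustration under my own assumptions, not the sim code: the SIH (or the separate attraction radius) is modeled as a plain distance cutoff, the unit vector toward each neighbor is scaled by that neighbor's attractor scalar, and the result is added to the agent's own direction vector. All names are hypothetical.

```java
class AttractorSketch {
    /**
     * Returns the agent's heading after adding attraction toward every neighbor
     * that lies within the given radius. The result is not re-normalized.
     */
    static double[] attractedHeading(double[] pos, double[] heading,
                                     double[][] neighborPos, double[] neighborAttractor,
                                     double radius) {
        double[] result = heading.clone();
        for (int i = 0; i < neighborPos.length; i++) {
            double dx = neighborPos[i][0] - pos[0];
            double dy = neighborPos[i][1] - pos[1];
            double dist = Math.hypot(dx, dy);
            // Skip coincident agents (no direction) and agents outside the radius
            if (dist == 0.0 || dist > radius) {
                continue;
            }
            // Unit vector toward the neighbor, scaled by its attractor scalar (normally zero)
            result[0] += neighborAttractor[i] * dx / dist;
            result[1] += neighborAttractor[i] * dy / dist;
        }
        return result;
    }
}
```

Because the scalar is normally zero, the default behavior is unchanged; only agents that have been given a non-zero attractor pull on their neighbors.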
  There is now an opportunity for a poster and a demo at SASO
  Add stories, lists and maps to implication slides – done
  • Schooling Fish May Offer Insights Into Networked Neurons
    • Iain Couzin is deciphering the rules that govern group behavior. The results might provide a fresh perspective on how networks of neurons work together.
  • City arts and lectures: The New Science Of Psychedelics With Michael Pollan
    • Psychedelics reduce activity in the sections of the brain that have to do with the sense of self. Pollan thinks that this also happens with certain types of rhythmic music and in crowd situations. This could be related to stampedes and flocking.
    • LSD May Chip Away at the Brain’s “Sense of Self” Network
      • Brain imaging suggests LSD’s consciousness-altering traits may work by hindering some brain networks and boosting overall connectivity
  • Add an attractor scalar for agents that’s normally zero. A vector to each agent within the SIH is calculated and scaled by the attractor scalar. That vector is then added to the direction vector to the agent – done?
  • Remove the heading influence based on site – done
  • Add a white circle to the center of the agent that is the size of the attraction scalar. Done
  • Add a ‘site trajectory’ to the spreadsheet that will have the site lists (and their percentage?)
  • Worked on A2P white paper with Aaron.
  • Worked on a response to Dr. Li’s response

  • More Bit by Bit. Reading the section on ethics. It strikes me that simulation could be a way to cut the PII Gordian Knot in some conditions. If a simulation can be developed that generates statistically similar data to the desired population, then the simulated data and the simulation code can be released to the research community. The dataset becomes infinite and adjustable, while the PII data can be held back. Machine learning systems trained on the simulated data can then be evaluated on the confidential data. The differences in the classification by the ML systems between real data and simulated data can also provide insight into the gaps in fidelity of the simulated data, which would provide an ongoing improvement to the simulation, which could in turn be released to the community.
  • Continuing with the cleanup of the SASO paper. Mostly done, but it still needs some trimming of redundant bits and the “One Simple Trick” paragraph.
    • Come up with 3-5 options for a finished state for the dissertation. It probably ranges from “pure theory” through “instance based on theory” to “a map generated by the system that matches the theory”
    • Once the SASO paper is in, set up a “wine and cheese” get together for the committee to go over the current work and discuss changes to the next phase
    • Start on a new IRB. Emphasize how everyone will have the same system to interact with, though their interactions will be different. Emphasize that the system has to allow open interaction to provide the best chance to realize theoretical results.
    • Will and I are on the hook for a Fika about LaTeX

  • Applications of big social media data analysis: An overview
    • Over the last few years, online communication has moved toward user-driven technologies, such as online social networks (OSNs), blogs, online virtual communities, and online sharing platforms. These social technologies have ushered in a revolution in user-generated data, online global communities, and rich human behavior-related content. Human-generated data and human mobility patterns have become important steps toward developing smart applications in many areas. Understanding human preferences is important to the development of smart applications and services to enable such applications to understand the thoughts and emotions of humans, and then act smartly based on learning from social media data. This paper discusses the role of social media data in comprehending online human data and in consequently different real applications of SM data for smart services are executed.
  • Explainable, Interactive Deep Learning
    • Recently, deep learning has been advancing the state of the art in artificial intelligence to yet another level, and humans are relying more and more on the outputs generated by artificial intelligence techniques than ever before. However, even with such unprecedented advancements, the lack of interpretability on the decisions made by deep learning models and no control over their internal processes act as a major drawback when utilizing them to critical decision-making processes such as precision medicine and law enforcement. In response, efforts are being made to make deep learning interpretable and controllable by humans. In this paper, we review recent studies relevant to this direction and discuss potential challenges and future research directions.
  • Building successful online communities: Evidence-based social design (book review)
    • In Building Successful Online Communities (2012), Robert Kraut, Paul Resnick, and their collaborators set out to draw links between the design of socio-technical systems with findings from social psychology and economics. Along the way, they set out a vision for the role of social sciences in the design of systems like mailing lists, discussion forums, wikis, and social networks, offering a way that behavior on those platforms might inform our understanding of human behavior.
  • Since I’ve forgotten my Angular stuff, reviewing UltimateAngular, Angular Fundamentals course. Finished the ‘Getting Started’ section
  • Strip out Gutenberg text from corpora – done!

  • Some new papers from ICLR 2018
  • Need to write up a quick post for communicating between Angular and a (PHP) server, with an optional IntelliJ configuration section
  JuryRoom this morning and then GANs + Agents this afternoon?
  • Next steps for JuryRoom
    • Start up the AngularPro course
    • Set up PHP access to DB, returning JSON objects
  • Starting Agent/GAN project
    • Need to set up an ACM paper to start dumping things into – done.
    • Looking for a good source for Jack London. Gutenberg looks nice, but there is a no-scraping rule, so I guess we’ll do this by hand…
    • We will need to check for redundant short stories
    • We will need to strip the front and back matter that pertains to Project Gutenberg
  • Fika: Accessibility at the Intersection of Users and Data
    • Nice talk and followup discussion with Dr. Hernisa Kacorri, who’s combining machine learning and HCC
      • My research goal is to build technologies that address real-world problems by integrating data-driven methods and human-computer interaction. I am interested in investigating human needs and challenges that may benefit from advancements in artificial intelligence. My focus is both in building new models to address these challenges and in designing evaluation methodologies that assess their impact. Typically my research involves application of machine learning and analytics research to benefit people with disabilities, especially assistive technologies that model human communication and behavior such as sign language avatars and independent mobility for the blind.

    • Good discussion with Aaron about the agents navigating embedding space. This would be a great example of creating “more realistic” data from simulation that bridges the gap between simulation and human data. This becomes the basis for work producing text for inputs such as DHS input streams.
      • Get the embedding space from the Jack London corpora (crawl here)
      • Train a classifier that recognizes JL using the embedding vectors instead of the words. This allows for contextual closeness. Additionally, it might allow a corpus to be trained “at once” as a pattern in the embedding space using CNNs.
      • Train an NN (what type?) to produce sentences that contain words sent by agents that fool the classifier
      • Record the sentences as the trajectories
      • Reconstruct trajectories from the sentences and compare to the input
      • Some thoughts WRT generating Twitter data
        • Closely aligned agents can retweet (alignment measure?)
        • Less closely aligned agents can mention/respond, and also add their tweet
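A first stab at the retweet/mention rule above. The alignment measure is still an open question, so cosine similarity is used here purely as a placeholder, and the thresholds passed to `choose` are arbitrary example values. Every name in this sketch is hypothetical.

```java
class TweetActionSketch {
    enum Action { RETWEET, MENTION, IGNORE }

    /** Cosine similarity in [-1, 1]; returns 0 if either vector has zero length. */
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / Math.sqrt(normA * normB);
    }

    /**
     * Closely aligned agents retweet; less closely aligned agents mention/respond
     * (and would add their own tweet); everyone else ignores the tweet.
     */
    static Action choose(double alignment, double retweetThreshold, double mentionThreshold) {
        if (alignment >= retweetThreshold) {
            return Action.RETWEET;
        }
        if (alignment >= mentionThreshold) {
            return Action.MENTION;
        }
        return Action.IGNORE;
    }
}
```

For example, `choose(cosine(a, b), 0.9, 0.5)` would retweet only for nearly parallel belief vectors. Whether a hard cutoff or a probability makes more sense is something to work out with the sim.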
    Handed off the proposal to Red Team. Still need to rework the Exec Summary. Nope. Doesn't matter that the current exec summary does not comply with the requirements.
    • A dog with high social influence creates an adorable stampede:
    • Using Machine Learning to Replicate Chaotic Attractors and Calculate Lyapunov Exponents from Data
      • This is a paper that describes how ML can be used to predict the behavior of chaotic systems. An implication is that this technique could be used for early classification of nomadic/flocking/stampede behavior
    • Visualizing a Thinker’s Life
      • This paper presents a visualization framework that aids readers in understanding and analyzing the contents of medium-sized text collections that are typical for the opus of a single or few authors. We contribute several document-based visualization techniques to facilitate the exploration of the work of the German author Bazon Brock by depicting various aspects of its texts, such as the TextGenetics that shows the structure of the collection along with its chronology. The ConceptCircuit augments the TextGenetics with entities – persons and locations that were crucial to his work. All visualizations are sensitive to a wildcard-based phrase search that allows complex requests towards the author’s work. Further development, as well as expert reviews and discussions with the author Bazon Brock, focused on the assessment and comparison of visualizations based on automatic topic extraction against ones that are based on expert knowledge.


  • Society for Personality and Social Psychology
    • The mission of SPSP is to advance the science, teaching, and application of social and personality psychology. SPSP members aspire to understand individuals in their social contexts for the benefit of all people.
    • Social psychology is the scientific study of how people’s thoughts, feelings, and behaviors are influenced by the actual, imagined, or implied presence of others.
  • Rebecca Hofstein Grady
    • I am interested in the ways that bias and motivation can affect our reasoning and memory to influence the judgments and decisions that we make.  In particular, I am currently studying how these biases apply to real-world situations, such as political conflicts, hiring decisions, and legal decision-making.  I explore not only how biases affect decision-making but what people think about their own biases and the best ways to help correct them.
    • Data from a pre-publication independent replication initiative examining ten moral judgement effects

    • BIC_102 (page 102)
    • BIC107 (pg 107)
    • BIC107b (pg 107)
    • Sociality: Coordinating bodies, minds and groups
      • Human interaction, as opposed to aggregation, occurs in face-to-face groups. “Sociality theory” proposes that such groups have a nested, hierarchical structure, consisting of a few basic variations, or “core configurations.” These function in the coordination of human behavior, and are repeatedly assembled, generation to generation, in human ontogeny, and in daily life. If face-to-face groups are “the mind’s natural environment,” then we should expect human mental systems to correlate with core configurations. Features of groups that recur across generations could provide a descriptive paradigm for testable and non-intuitive evolutionary hypotheses about social and cognitive processes. This target article sketches three major topics in sociality theory, roughly corresponding to the interests of biologists, psychologists, and social scientists. These are (1) a multiple levels-of-selection view of Darwinism, part group selectionism, part developmental systems theory; (2) structural and psychological features of repeatedly assembled, concretely situated face-to-face coordination; and (3) superordinate, “unsituated” coordination at the level of large-scale societies. Sociality theory predicts a tension, perhaps unresolvable, between the social construction of knowledge, which facilitates coordination within groups, and the negotiation of the habitat, which requires some correspondence with contingencies in specific situations. This tension is relevant to ongoing debates about scientific realism, constructivism, and relativism in the philosophy and sociology of knowledge.
        • These definitions seem to span atomic (mother/child, etc), small group (situated, environmental), and societal (unsituated, normative)
      • Coordination occurs to the extent that knowledge and practice domains overlap or are complementary. I suggest that values serve as a medium. Humans live in a value-saturated environment; values are known from interactions with people, natural objects, and artifacts
        • Dimension reduction
  •  I’m starting to think that agents as gradient descent machines within networks is something to look for:
    • Individual Strategy Update and Emergence of Cooperation in Social Networks
      • In this article, we critically study whether social networks can explain the emergence of cooperative behavior. We carry out an extensive simulation program in which we study the most representative social dilemmas. For the Prisoner’s Dilemma, it turns out that the emergence of cooperation is dependent on the microdynamics. On the other hand, network clustering mostly facilitates global cooperation in the Stag Hunt game, whereas degree heterogeneity promotes cooperation in Snowdrift dilemmas. Thus, social networks do not promote cooperation in general, because the macro-outcome is not robust under change of dynamics. Therefore, having specific applications of interest in mind is crucial to include the appropriate microdetails in a good model.
    • Alex Peysakhovich and Adam Lerer
      • Prosocial learning agents solve generalized Stag Hunts better than selfish ones
        • Deep reinforcement learning has become an important paradigm for constructing agents that can enter complex multi-agent situations and improve their policies through experience. One commonly used technique is reactive training – applying standard RL methods while treating other agents as a part of the learner’s environment. It is known that in general-sum games reactive training can lead groups of agents to converge to inefficient outcomes. We focus on one such class of environments: Stag Hunt games. Here agents either choose a risky cooperative policy (which leads to high payoffs if both choose it but low payoffs to an agent who attempts it alone) or a safe one (which leads to a safe payoff no matter what). We ask how we can change the learning rule of a single agent to improve its outcomes in Stag Hunts that include other reactive learners. We extend existing work on reward-shaping in multi-agent reinforcement learning and show that that making a single agent prosocial, that is, making them care about the rewards of their partners can increase the probability that groups converge to good outcomes. Thus, even if we control a single agent in a group making that agent prosocial can increase our agent’s long-run payoff. We show experimentally that this result carries over to a variety of more complex environments with Stag Hunt-like dynamics including ones where agents must learn from raw input pixels.
      • The Good, the Bad, and the Unflinchingly Selfish: Cooperative Decision-Making Can Be Predicted with High Accuracy Using Only Three Behavioral Types
        • The human willingness to pay costs to benefit anonymous others is often explained by social preferences: rather than only valuing their own material payoff, people also care in some fashion about the outcomes of others. But how successful is this concept of outcome-based social preferences for actually predicting out-of-sample behavior? We investigate this question by having 1067 human subjects each make 20 cooperation decisions, and using machine learning to predict their last 5 choices based on their first 15. We find that decisions can be predicted with high accuracy by models that include outcome-based features and allow for heterogeneity across individuals in baseline cooperativeness and the weights placed on the outcome-based features (AUC=0.89). It is not necessary, however, to have a fully heterogeneous model — excellent predictive power (AUC=0.88) is achieved by a model that allows three different sets of baseline cooperativeness and feature weights (i.e. three behavioral types), defined based on the participant’s cooperation frequency in the 15 training trials: those who cooperated at least half the time, those who cooperated less than half the time, and those who never cooperated. Finally, we provide evidence that this inclination to cooperate cannot be well proxied by other personality/morality survey measures or demographics, and thus is a natural kind (or “cooperative phenotype”)
        • “least”, “intermediate” and “most” cooperative. Doesn’t give percentages, though it says that 17.8% were cooperative?
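The three behavioral types in that abstract are defined purely by cooperation frequency in the 15 training trials, so the binning step is easy to sketch. This covers only the grouping, not the paper's predictive model; the class, enum, and helper names are mine.

```java
class CooperationTypes {
    enum Type { NEVER, LESS_THAN_HALF, AT_LEAST_HALF }

    /** Bins a participant by how often they cooperated in the training trials. */
    static Type classify(boolean[] trainingDecisions) {
        int cooperated = 0;
        for (boolean decision : trainingDecisions) {
            if (decision) {
                cooperated++;
            }
        }
        if (cooperated == 0) {
            return Type.NEVER;
        }
        // "At least half the time": compare 2 * count against the trial count to avoid fractions
        return (2 * cooperated >= trainingDecisions.length)
                ? Type.AT_LEAST_HALF
                : Type.LESS_THAN_HALF;
    }

    /** Helper for examples: the first `count` of `total` decisions are cooperative. */
    static boolean[] firstNCooperative(int count, int total) {
        boolean[] decisions = new boolean[total];
        for (int i = 0; i < count; i++) {
            decisions[i] = true;
        }
        return decisions;
    }
}
```

With 15 trials, 8 or more cooperative choices land in `AT_LEAST_HALF`, 1 through 7 in `LESS_THAN_HALF`, and 0 in `NEVER`.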


  • Talk: Susan Gregurick
    • All of Us research program
    • Opioid epidemic – trajectory modeling?
    • PZM21 computational drug
    • Develop advanced software and tools. Specialized generalizable and accessible tools for biomedicine (finding stream). Includes mobile, data indexing, etc.
    • NIH Data Fellows? Postdocs to senior industry
    • T32 funding? Mike Summers at UMBC
    • (look for data?)
    • Primary supporter for machine learning is NIMH (imaging), then NIGMS, and NCI. Team science (Multi-PI) is a developing thing
    • $400m in computing enabled interactions (human in the loop decision tools). Research Browser?
    • Big Data to Knowledge Initiative (BD2K)
    • Interagency Modeling and Analysis Group (IMAG) imagewiki
    • funding:
    • NIH RePorter Check out matchmaker. What’s the ranking algorithm?
    • NIDDK predictive analytics for budgeting <- A2P-ish?
    • Most of this requires preliminary data and papers to be considered for funding. There is one opportunity for getting funding to get preliminary data. Need to get more specific info here.
    • Each SRO normalizes grade as a percentile, not the score, since some places inflate, and others are hard.
    • Richard Aargon at NIGMS
    • Office of behavioral and social science – NIH center Francis Collins. Also agent-based simulation
    • Really wants a Research Browser to go through proposals
    • IRB – you can email and chat with the board if you have a tricky study

  • UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction
    • UMAP (Uniform Manifold Approximation and Projection) is a novel manifold learning technique for dimension reduction. UMAP is constructed from a theoretical framework based in Riemannian geometry and algebraic topology. The result is a practical scalable algorithm that applies to real world data. The UMAP algorithm is competitive with t-SNE for visualization quality, and arguably preserves more of the global structure with superior run time performance. Furthermore, UMAP as described has no computational restrictions on embedding dimension, making it viable as a general purpose dimension reduction technique for machine learning.
  • How Prevalent are Filter Bubbles and Echo Chambers on Social Media? Not as Much as Conventional Wisdom Has It
    • Yet, as Rasmus points out, conventional wisdom seems to be stuck with the idea that social media constitute filter bubbles and echo chambers, where most people only, or mostly, see political content they already agree with. It is definitely true that there is a lot of easily accessible, clearly identifiable, highly partisan content on social media. It is also true that, to some extent, social media users can make choices as to which sources they follow and engage with. Whether people use these choice affordances solely to flock to content reinforcing their political preferences and prejudices, filtering out or avoiding content that espouses other viewpoints, is, however, an empirical question—not a destiny inscribed in the way social media and their algorithms function.
  • He Predicted The 2016 Fake News Crisis. Now He’s Worried About An Information Apocalypse.
    • That future, according to Ovadya, will arrive with a slew of slick, easy-to-use, and eventually seamless technological tools for manipulating perception and falsifying reality, for which terms have already been coined — “reality apathy,” “automated laser phishing,” and “human puppets.”
  • Begin trimming paper – good progress.
  • Add a slider that lets the user interactively move a token along the selected trajectory path – done. Yes, it looks like a golf ball on a tee…
  • The social structural foundations of adaptation and transformation in social–ecological systems
    • Social networks are frequently cited as vital for facilitating successful adaptation and transformation in linked social–ecological systems to overcome pressing resource management challenges. Yet confusion remains over the precise nature of adaptation vs. transformation and the specific social network structures that facilitate these processes. Here, we adopt a network perspective to theorize a continuum of structural capacities in social–ecological systems that set the stage for effective adaptation and transformation. We begin by drawing on the resilience literature and the multilayered action situation to link processes of change in social–ecological systems to decision making across multiple layers of rules underpinning societal organization. We then present a framework that hypothesizes seven specific social–ecological network configurations that lay the structural foundation necessary for facilitating adaptation and transformation, given the type and magnitude of human action required. A key contribution of the framework is explicit consideration of how social networks relate to ecological structures and the particular environmental problem at hand. Of the seven configurations identified, three are linked to capacities conducive to adaptation and three to transformation, and one is hypothesized to be important for facilitating both processes.
  • Starting to trim paper down to three pages
  • Starting on CHIIR slide stack – Still need to add future work
  • Rwanda radio transcripts
    • From October 1993 to late 1994, RTLM was used by Hutu leaders to advance an extremist Hutu message and anti-Tutsi disinformation, spreading fear of a Tutsi genocide against Hutu, identifying specific Tutsi targets or areas where they could be found, and encouraging the progress of the genocide. In April 1994, Radio Rwanda began to advance a similar message, speaking for the national authorities, issuing directives on how and where to kill Tutsis, and congratulating those who had already taken part.
  • Fika
    Set up Fika Writing group that will meet Wednesdays at 4:00. We'll see how that goes.
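Going back to the trajectory slider noted above: one way to place the token is to treat the slider value as a fraction of the path's total length and interpolate along the polyline. This is a 2-D sketch under that assumption, not the code in the app, and `TrajectoryToken` and `positionAt` are hypothetical names.

```java
class TrajectoryToken {
    /**
     * Returns the point at fraction t (0 = start, 1 = end) of the path's arc length.
     * Each path entry is an {x, y} pair.
     */
    static double[] positionAt(double[][] path, double t) {
        if (path.length == 0) {
            throw new IllegalArgumentException("path must have at least one point");
        }
        t = Math.max(0.0, Math.min(1.0, t));

        // Total length of the polyline
        double total = 0.0;
        for (int i = 1; i < path.length; i++) {
            total += Math.hypot(path[i][0] - path[i - 1][0], path[i][1] - path[i - 1][1]);
        }

        // Walk the segments until the remaining distance falls inside one of them
        double remaining = t * total;
        for (int i = 1; i < path.length; i++) {
            double dx = path[i][0] - path[i - 1][0];
            double dy = path[i][1] - path[i - 1][1];
            double segLen = Math.hypot(dx, dy);
            if (segLen > 0.0 && remaining <= segLen) {
                double f = remaining / segLen;
                return new double[] { path[i - 1][0] + f * dx, path[i - 1][1] + f * dy };
            }
            remaining -= segLen;
        }
        // Single-point paths and rounding at t = 1 end up here
        return path[path.length - 1].clone();
    }
}
```

Interpolating by arc length rather than by point index keeps the token moving at a steady rate even when the trajectory samples are unevenly spaced.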

  • Working on the 3D mapping app.
    • Reading in single spreadsheet with nomad graph info
    • Building a NodeInfo inner class to keep the nomad positions for the other populations
    • Working! 2018-02-07
    • Better: 2018-02-07 (2)
    • Resisting the urge to code more and getting back to the extended abstract. I also need to add a legend to the above pix.
  • Back to extended abstract
    • Added results and future work section
    • got all the pictures in
    • Currently at 3 pages plus. Not horrible.
  • Demographics and Dynamics of Mechanical Turk Workers
    • There are about 100K-200K unique workers on Amazon. On average, there are 2K-5K workers active on Amazon at any given time, which is equivalent to having 10K-25K full-time employees. On average, 50% of the worker population changes within 12-18 months. Workers exhibit widely different patterns of activity, with most workers being active only occasionally, and few workers being very active. Combining our results with the results from Hara et al, we see that MTurk has a yearly transaction volume of a few hundreds of millions of dollars.