
Phil 5.2.20

It did a great job of summarizing my last paper: “The paper describes how to build maps of human belief as expressed through textual interaction in ways analogous to physical maps.”

Phil 4.5.20

The initial version of DaysToZero is up! Working on adding states now

Got USA data working. New York looks very bad.

Evaluating the fake news problem at the scale of the information ecosystem

“Fake news,” broadly defined as false or misleading information masquerading as legitimate news, is frequently asserted to be pervasive online with serious consequences for democracy. Using a unique multimode dataset that comprises a nationally representative sample of mobile, desktop, and television consumption, we refute this conventional wisdom on three levels. First, news consumption of any sort is heavily outweighed by other forms of media consumption, comprising at most 14.2% of Americans’ daily media diets. Second, to the extent that Americans do consume news, it is overwhelmingly from television, which accounts for roughly five times as much news consumption as online. Third, fake news comprises only 0.15% of Americans’ daily media diet. Our results suggest that the origins of public misinformedness and polarization are more likely to lie in the content of ordinary news or the avoidance of news altogether than in overt fakery.

Phil 2.16.20

Bringing Stories Alive: Generating Interactive Fiction Worlds

World building forms the foundation of any task that requires narrative intelligence. In this work, we focus on procedurally generating interactive fiction worlds—text-based worlds that players “see” and “talk to” using natural language. Generating these worlds requires referencing everyday and thematic commonsense priors in addition to being semantically consistent, interesting, and coherent throughout. Using existing story plots as inspiration, we present a method that first extracts a partial knowledge graph encoding basic information regarding world structure such as locations and objects. This knowledge graph is then automatically completed utilizing thematic knowledge and used to guide a neural language generation model that fleshes out the rest of the world. We perform human participant-based evaluations, testing our neural model’s ability to extract and fill-in a knowledge graph and to generate language conditioned on it against rule-based and human-made baselines. Our code is available at this https URL.

The Obligation To Experiment

Tech companies should test the effects of their products on our safety and civil liberties. We should also test them ourselves.

Ran through the presentation with David. He pointed out that stampedes bouncing off the edge of the environment look like flocking, so I generated a new map where the stampede gathers and runs off the edge

[Figure: flock_2_stampede_map_legend]

Phil 2.11.20

7:00 – 9:00 ASRC GOES

The brains of birds synchronize when they sing duets

When a male or female white-browed sparrow-weaver begins its song, its partner joins in at a certain time. They duet with each other by singing in turn and precisely in tune. A team led by researchers from the Max Planck Institute for Ornithology in Seewiesen used mobile transmitters to simultaneously record neural and acoustic signals from pairs of birds singing duets in their natural habitat. They found that the nerve cell activity in the brain of the singing bird changes and synchronizes with its partner when the partner begins to sing. The brains of both animals then essentially function as one, which leads to the perfect duet. (original article: Duets recorded in the wild reveal that interindividually coordinated motor control enables cooperative behavior)

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering

Top-down visual attention mechanisms have been used extensively in image captioning and visual question answering (VQA) to enable deeper image understanding through fine-grained analysis and even multiple steps of reasoning. In this work, we propose a combined bottom-up and top-down attention mechanism that enables attention to be calculated at the level of objects and other salient image regions. This is the natural basis for attention to be considered. Within our approach, the bottom-up mechanism (based on Faster R-CNN) proposes image regions, each with an associated feature vector, while the top-down mechanism determines feature weightings. Applying this approach to image captioning, our results on the MSCOCO test server establish a new state-of-the-art for the task, achieving CIDEr / SPICE / BLEU-4 scores of 117.9, 21.5 and 36.9, respectively. Demonstrating the broad applicability of the method, applying the same approach to VQA we obtain first place in the 2017 VQA Challenge.

Defense
- Need to think about how to discuss the idea that maps like the T-O and belief space maps (flocking and stampeding projections?) are attention maps as well. Emphasizing well-triangulated but less-attended areas is a potential good. Compare to how maps opened up areas for exploration and exploitation, but this is constructive and not extractive
Admin – done
Walkthrough of Aaron’s slides
- Showed him how to outline boxes and reduce the filesize
Shimei’s group
- Walkthrough of the slides
- Strengthen the connection between the sim and the human study

Phil 2.9.20

In Data Voids: Where Missing Data Can Easily Be Exploited, Golebiewski teams up with danah boyd (Microsoft Research; Data & Society) to demonstrate how data voids are exploited by manipulators eager to expose people to problematic content including falsehoods, misinformation, and disinformation.

Data voids are often difficult to detect. Most can be harmless until something happens that causes lots of people to search for the same term, such as a breaking news event, or a reporter using an unfamiliar phrase. In some cases, manipulators work quickly to produce conspiratorial content to fill a void, whereas other data voids, such as those from outdated terms, are filled slowly over time. Data voids are compounded by the fraught pathways of search-adjacent recommendation systems such as auto-play, auto-fill, and trending topics, each of which is vulnerable to manipulation.

Persuading Algorithms With an AI Nudge: Fact-Checking Can Reduce the Spread of Unreliable News. It Can Also Do the Opposite.

Tesla Autopilot Duped By ‘Phantom’ Images: Researchers were able to fool popular autopilot systems into perceiving projected images as real – causing the cars to brake or veer into oncoming traffic lanes.

Phil 1.17.20

An ant colony has memories that its individual members don’t have

Like a brain, an ant colony operates without central control. Each is a set of interacting individuals, either neurons or ants, using simple chemical interactions that in the aggregate generate their behaviour. People use their brains to remember. Can ant colonies do that?

7:00 – ASRC

Dissertation
- More edits
- Changed all the overviews so that they also reference the section by name. It reads better now, I think
- Meeting with Thom
GPT-2 Agents
GSAW Slide deck

Phil 1.3.20

7:00 – 5:00 ASRC PhD

Diversity promotes collective intelligence in large groups but harms small ones
- Diverse groups are often said to be less susceptible to decision errors resulting from herding and polarization. Thus, the fact that many modern interactions happen in a digital world, where filter bubbles and homophily bring people together, is an alarming yet poorly understood phenomenon. But online interactions are also characterized by unprecedented scale, where thousands of individuals can exchange ideas simultaneously. Evidence in collective intelligence however suggests that small (rather than large) groups tend to do better in complex information environments. Here, we adopt the well-established framework of social learning theory (from the fields of ecology and cultural evolution) to explore the causal link between diversity and performance as a function of group size. In this pre-registered study, we experimentally manipulate both group diversity and group size, and measure individual and group performance in realistic geo-political judgements. We find that diversity hinders the performance of individuals in small groups, but improves it in large groups. Furthermore, aggregating opinions of modular crowds composed of small independent but homogeneous groups achieves better results than using non-modular diverse ones. The results are explained by greater conflict of opinion in diverse groups, which negatively impacts small (but not large) groups. The present work sheds light on the causal mechanisms underlying the success (or lack thereof) of diverse groups in digital environments, and suggests that diversity research can benefit from adopting a wider social learning perspective.
“I Just Google It”: Folk Theories of Distributed Discovery
- A significant minority of people do not follow news regularly, and a growing number rely on distributed discovery (especially social media and search engines) to stay informed. Here, we analyze folk theories of news consumption. On the basis of an inductive analysis of 43 in-depth interviews with infrequent users of conventional news, we identify three complementary folk theories (“news finds me,” “the information is out there,” and “I don’t know what to believe”) that consumers draw on when making sense of their information environment. We show that the notion of folk theories helps unpack the different, complementary, sometimes contradictory cultural resources people rely on as they navigate digital media and public affairs, and we argue that studying those who rarely engage directly with news media but do access information via social media and search provides a critical case study of the dynamics of an environment increasingly defined by platforms.

Dissertation

Working on Lit Review overview

Fixed the margins for blockquotes by creating a more flexible changemargin command

\def\changemargin#1#2{\list{}{\rightmargin#2\leftmargin#1}\item[]}
\let\endchangemargin=\endlist

Which is used like this

\begin{changemargin}{1.5cm}{1.5cm} 
	They were one man, not thirty. For as the one ship that held them all; though it was put together of all contrasting things---oak, and maple, and pine wood; iron, and pitch, and hemp---yet all these ran into each other in the one concrete hull, which shot on its way, both balanced and directed by the long central keel; even so, all the individualities of the crew, this man’s valor, that man’s fear; guilt and guiltiness, all varieties were welded into oneness, and were all directed to that fatal goal which Ahab their one lord and keel did point to.
\end{changemargin}

Fixed a bunch of things, including blockquotes
Added ch_lit_review_overview.tex
- Biological Basis – done
- Human Belief Spaces – done
- Dimension Reduction – done
- Orientation – done
- Velocity – done
- Social Influence Horizon – done
- Bones in a hut – started

1:00 Dentist

Phil 10.26.19

The dynamics of norm change in the cultural evolution of language

What happens when a new social convention replaces an old one? While the possible forces favoring norm change—such as institutions or committed activists—have been identified for a long time, little is known about how a population adopts a new convention, due to the difficulties of finding representative data. Here, we address this issue by looking at changes that occurred to 2,541 orthographic and lexical norms in English and Spanish through the analysis of a large corpus of books published between the years 1800 and 2008. We detect three markedly distinct patterns in the data, depending on whether the behavioral change results from the action of a formal institution, an informal authority, or a spontaneous process of unregulated evolution. We propose a simple evolutionary model able to capture all of the observed behaviors, and we show that it reproduces quantitatively the empirical data. This work identifies general mechanisms of norm change, and we anticipate that it will be of interest to researchers investigating the cultural evolution of language and, more broadly, human collective behavior.

When Hillclimbers Beat Genetic Algorithms in Multimodal Optimization

It has been shown in the past that a multistart hillclimbing strategy compares favourably to a standard genetic algorithm with respect to solving instances of the multimodal problem generator. We extend that work and verify if the utilization of diversity preservation techniques in the genetic algorithm changes the outcome of the comparison. We do so under two scenarios: (1) when the goal is to find the global optimum, (2) when the goal is to find all optima.
A mathematical analysis is performed for the multistart hillclimbing algorithm and a thorough empirical study is conducted for solving instances of the multimodal problem generator with increasing number of optima, both with the hillclimbing strategy as well as with genetic algorithms with niching. Although niching improves the performance of the genetic algorithm, it is still inferior to the multistart hillclimbing strategy on this class of problems.
An idealized niching strategy is also presented and it is argued that its performance should be close to a lower bound of what any evolutionary algorithm can do on this class of problems.
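To make the multistart hillclimbing idea concrete, here is a minimal sketch. The landscape is a toy I made up for illustration (bit strings scored by how many bits match the nearest of several random peaks), not necessarily the paper's multimodal problem generator. The names `make_landscape`, `hillclimb`, and `multistart_hillclimb` are hypothetical.

```python
import random

def make_landscape(n_bits, n_peaks, rng):
    """Toy multimodal fitness (an assumption): bits matching the nearest random peak."""
    peaks = [tuple(rng.randint(0, 1) for _ in range(n_bits)) for _ in range(n_peaks)]

    def fitness(x):
        # n_bits means x sits exactly on one of the peaks
        return max(sum(a == b for a, b in zip(x, p)) for p in peaks)

    return peaks, fitness

def hillclimb(start, fitness):
    """Keep single-bit flips that strictly improve fitness until none do."""
    x = list(start)
    best = fitness(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            x[i] ^= 1                  # try flipping bit i
            f = fitness(x)
            if f > best:
                best, improved = f, True
            else:
                x[i] ^= 1              # undo a flip that did not help
    return tuple(x), best

def multistart_hillclimb(n_bits, fitness, restarts, rng):
    """Run independent climbs from random starts and collect the distinct optima."""
    found = set()
    for _ in range(restarts):
        start = [rng.randint(0, 1) for _ in range(n_bits)]
        optimum, _ = hillclimb(start, fitness)
        found.add(optimum)
    return found

rng = random.Random(0)
peaks, fitness = make_landscape(16, 4, rng)
optima = multistart_hillclimb(16, fitness, 50, rng)
```

In this toy every peak has the same height, so "find the global optimum" is satisfied by any single climb, while "find all optima" depends on how many restarts land in each basin. A generator with unequal peak heights would separate the two scenarios more sharply.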

Phil 7.9.19

7:00 – 5:30 ASRC GEOS

BP&S is “on hold” in ArXiv. Hoping that it’s because of the overlap with DfS. I took the mapping text out of the DfS paper and resubmitted. Once that’s done I can send Antonio a link and get advice.
Code review with Chris
Contact David and see if he’s ok with July 23 – Nope. Trying Aaron M. as a replacement
More dissertation. Folded in most of the BP&S paper
Look! More mapping of latent spaces! Unsupervised word embeddings capture latent knowledge from materials science literature
- Here we show that materials science knowledge present in the published literature can be efficiently encoded as information-dense word embeddings (vector representations of words) without human labelling or supervision. Without any explicit insertion of chemical knowledge, these embeddings capture complex materials science concepts such as the underlying structure of the periodic table and structure–property relationships in materials. Furthermore, we demonstrate that an unsupervised method can recommend materials for functional applications several years before their discovery. This suggests that latent knowledge regarding future discoveries is to a large extent embedded in past publications. Our findings highlight the possibility of extracting knowledge and relationships from the massive body of scientific literature in a collective manner, and point towards a generalized approach to the mining of scientific literature.
More Panda3D
- Intervals and sequences
- Panda3D forum
- Programming with Panda3D
  - Well, this is looking a lot like the way I would have written it
  - You can convert a NodePath into a “regular” pointer at any time by calling nodePath.node(). However, there is no unambiguous way to convert back. That’s important: sometimes you need a NodePath, sometimes you need a node pointer. Because of this, it is recommended that you store NodePaths, not node pointers. When you pass parameters, you should probably pass NodePaths, not node pointers. The callee can always convert the NodePath to a node pointer if it needs to.
- Huh. It looks like there is no support for procedurally generated primitives. Well, I know what I’m going to be doing…
  - Origin – done
  - Grid
  - Cube (x, y, z size), color (texture?), Boolean for endcaps
  - Cylinder (radius+steps, length), color
  - Sphere (radius+steps), color
  - Skybox (texture)
  - Then try making a satellite from parts
- JuryRoom Meeting
  - A lot of discussion on UI issues – how to vote for/against, the right panel layout, and the questions that should be asked for Chris’ study

Phil 5.28.19

7:00 – 5:00 ASRC NASA GEOS

Factors Motivating Customization and Echo Chamber Creation Within Digital News Environments
- With the influx of content being shared through social media, mobile apps, and other digital sources – including fake news and misinformation – most news consumers experience some degree of information overload. To combat these feelings of unease associated with the sheer volume of news content, some consumers tailor their news ecosystems and purposefully include or exclude content from specific sources or individuals. This study explores customization on social media and news platforms through a survey (N = 317) of adults regarding their digital news habits. Findings suggest that consumers who diversify their online news streams report lower levels of anxiety related to current events and highlight differences in reported anxiety levels and customization practices across the political spectrum. This study provides important insights into how perceived information overload, anxiety around current events, political affiliations and partisanship, and demographic characteristics may contribute to tailoring practices related to news consumption in social media environments. We discuss these findings in terms of their implications for industry, policy, and theory.
More JASSS paper
Installing new IntelliJ and re-indexing
Discovered a few bugs with the JsonUtils.find. Fixed and submitted a version to StackOverflow. Eeeep!

Phil 11.24.18

Semantics-Space-Time Cube. A Conceptual Framework for Systematic Analysis of Texts in Space and Time

We propose an approach to analyzing data in which texts are associated with spatial and temporal references with the aim to understand how the text semantics vary over space and time. To represent the semantics, we apply probabilistic topic modeling. After extracting a set of topics and representing the texts by vectors of topic weights, we aggregate the data into a data cube with the dimensions corresponding to the set of topics, the set of spatial locations (e.g., regions), and the time divided into suitable intervals according to the scale of the planned analysis. Each cube cell corresponds to a combination (topic, location, time interval) and contains aggregate measures characterizing the subset of the texts concerning this topic and having the spatial and temporal references within these location and interval. Based on this structure, we systematically describe the space of analysis tasks on exploring the interrelationships among the three heterogeneous information facets, semantics, space, and time. We introduce the operations of projecting and slicing the cube, which are used to decompose complex tasks into simpler subtasks. We then present a design of a visual analytics system intended to support these subtasks. To reduce the complexity of the user interface, we apply the principles of structural, visual, and operational uniformity while respecting the specific properties of each facet. The aggregated data are represented in three parallel views corresponding to the three facets and providing different complementary perspectives on the data. The views have similar look-and-feel to the extent allowed by the facet specifics. Uniform interactive operations applicable to any view support establishing links between the facets. The uniformity principle is also applied in supporting the projecting and slicing operations on the data cube. 
We evaluate the feasibility and utility of the approach by applying it in two analysis scenarios using geolocated social media data for studying people’s reactions to social and natural events of different spatial and temporal scales.
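A rough sketch of the cube idea, to see what "projecting" and "slicing" mean in practice. This is my own minimal reading of the abstract, not the authors' system: cells are keyed by (topic, location, time interval) and hold summed topic weights. The function names, the `AXES` table, and the sample records are all made up for illustration.

```python
from collections import defaultdict

AXES = {"topic": 0, "location": 1, "time": 2}

def build_cube(records):
    """Aggregate (topic_weights, location, time_bin) records into cube cells."""
    cube = defaultdict(float)
    for topic_weights, location, time_bin in records:
        for topic, weight in topic_weights.items():
            cube[(topic, location, time_bin)] += weight
    return dict(cube)

def slice_cube(cube, axis, value):
    """Keep only the cells whose coordinate on `axis` equals `value`."""
    i = AXES[axis]
    return {key: v for key, v in cube.items() if key[i] == value}

def project_cube(cube, keep):
    """Sum out every axis that is not named in `keep`."""
    idx = [AXES[a] for a in keep]
    out = defaultdict(float)
    for key, v in cube.items():
        out[tuple(key[i] for i in idx)] += v
    return dict(out)

# Hypothetical texts: topic weights, a region, and a time interval index
records = [
    ({"flood": 0.75, "traffic": 0.25}, "north", 0),
    ({"flood": 0.5, "traffic": 0.5}, "north", 1),
    ({"traffic": 1.0}, "south", 0),
]
cube = build_cube(records)
by_topic = project_cube(cube, ["topic"])
north_over_time = project_cube(slice_cube(cube, "location", "north"), ["time"])
```

Slicing and then projecting is how a complex question ("how does the topic mix in the north change over time?") decomposes into two simple operations, which seems to be the point of the framework.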

Phil 11.20.18

7:00 – 3:30 ASRC PhD/NASA

Disrupting the Coming Robot Stampedes: Designing Resilient Information Ecologies got accepted to the iConference! Time to start thinking about the slide deck…
- Workshop: Online nonsense: tools and teaching to combat fake news on the Web
  - How can we raise the quality of what we find on the Web? What software might we build, what education might we try to provide, and what procedures (either manual or mechanical) might be introduced? What are the technical and legal issues that limit our responses? The speakers will suggest responses to problems, and we’ll ask the audience what they would do in specific circumstances. Examples might include anti-vaccination pages, nonstandard cancer treatments, or climate change denial. We will compare with past history, such as the way CB radio became useless as a result of too much obscenity and abuse, or the way the Hearst newspapers created the Spanish-American War. We’ll report out the suggestions and evaluations of the audience.
SocialOcean: Visual Analysis and Characterization of Social Media Bubbles
- Social media allows citizens, corporations, and authorities to create, post, and exchange information. The study of its dynamics will enable analysts to understand user activities and social group characteristics such as connectedness, geospatial distribution, and temporal behavior. In this context, social media bubbles can be defined as social groups that exhibit certain biases in social media. These biases strongly depend on the dimensions selected in the analysis, for example, topic affinity, credibility, sentiment, and geographic distribution. In this paper, we present SocialOcean, a visual analytics system that allows for the investigation of social media bubbles. There exists a large body of research in social sciences which identifies important dimensions of social media bubbles (SMBs). While such dimensions have been studied separately, and also some of them in combination, it is still an open question which dimensions play the most important role in defining SMBs. Since the concept of SMBs is fairly recent, there are many unknowns regarding their characterization. We investigate the thematic and spatiotemporal characteristics of SMBs and present a visual analytics system to address questions such as: What are the most important dimensions that characterize SMBs? and How SMBs embody in the presence of specific events that resonate with them? We illustrate our approach using three different real scenarios related to the single event of Boston Marathon Bombing, and political news about Global Warming. We perform an expert evaluation, analyze the experts’ feedback, and present the lessons learned.
More Grokking. We’re at backpropagation, and I’m not seeing it yet. The pix are cool though.
Continuing Characterizing Online Public Discussions through Patterns of Participant Interactions.
- This paper introduces a computational framework to characterize public discussions, relying on a representation that captures a broad set of social patterns which emerge from the interactions between interlocutors, comments and audience reactions. (Page 198:1)
- we use it to predict the eventual trajectory of individual discussions, anticipating future antisocial actions (such as participants blocking each other) and forecasting a discussion’s growth (Page 198:1)
- platform maintainers may wish to identify salient properties of a discussion that signal particular outcomes such as sustained participation [9] or future antisocial actions [16], or that reflect particular dynamics such as controversy [24] or deliberation [29]. (Page 198:1)
- Systems supporting online public discussions have affordances that distinguish them from other forms of online communication. Anybody can start a new discussion in response to a piece of content, or join an existing discussion at any time and at any depth. Beyond textual replies, interactions can also occur via reactions such as likes or votes, engaging a much broader audience beyond the interlocutors actively writing comments. (Page 198:2)
  - This is why JuryRoom would be distinctly different. Its unique affordances should create unique, hopefully clearer results.
- This multivalent action space gives rise to salient patterns of interactional structure: they reflect important social attributes of a discussion, and define axes along which discussions vary in interpretable and consequential ways. (Page 198:2)
- Our approach is to construct a representation of discussion structure that explicitly captures the connections fostered among interlocutors, their comments and their reactions in a public discussion setting. We devise a computational method to extract a diverse range of salient interactional patterns from this representation—including but not limited to the ones explored in previous work—without the need to predefine them. We use this general framework to structure the variation of public discussions, and to address two consequential tasks predicting a discussion’s future trajectory: (a) a new task aiming to determine if a discussion will be followed by antisocial events, such as the participants blocking each other, and (b) an existing task aiming to forecast the growth of a discussion [9]. (Page 198:2)
- We find that the features our framework derives are more informative in forecasting future events in a discussion than those based on the discussion’s volume, on its reply structure and on the text of its comments (Page 198:2)
- we find that mainstream print media (e.g., The New York Times, The Guardian, Le Monde, La Repubblica) is separable from cable news channels (e.g., CNN, Fox News) and overtly partisan outlets (e.g., Breitbart, Sean Hannity, Robert Reich) on the sole basis of the structure of the discussions they trigger (Figure 4). (Page 198:2)
- These studies collectively suggest that across the broader online landscape, discussions take on multiple types and occupy a space parameterized by a diversity of axes—an intuition reinforced by the wide range of ways in which people engage with social media platforms such as Facebook [25]. With this in mind, our work considers the complementary objective of exploring and understanding the different types of discussions that arise in an online public space, without predefining the axes of variation. (Page 198:3)
- Many previous studies have sought to predict a discussion’s eventual volume of comments with features derived from their content and structure, as well as exogenous information [8, 9, 30, 69, inter alia]. (Page 198:3)
- Many such studies operate on the reply-tree structure induced by how successive comments reply to earlier ones in a discussion rooted in some initial content. Starting from the reply-tree view, these studies seek to identify and analyze salient features that parameterize discussions on platforms like Reddit and Twitter, including comment popularity [72], temporal novelty [39], root-bias [28], reply-depth [41, 50] and reciprocity [6]. Other work has taken a linear view of discussions as chronologically ordered comment sequences, examining properties such as the arrival sequence of successive commenters [9] or the extent to which commenters quote previous contributions [58]. The representation we introduce extends the reply-tree view of comment-to-comment. (Page 198:3)
- Our present approach focuses on representing a discussion on the basis of its structural rather than linguistic attributes; as such, we offer a coarser view of the actions taken by discussion participants that more broadly captures the nature of their contributions across contexts which potentially exhibit large linguistic variation. (Page 198:4)
- This representation extends previous computational approaches that model the relationships between individual comments, and more thoroughly accounts for aspects of the interaction that arise from the specific affordances offered in public discussion venues, such as the ability to react to content without commenting. Next, we develop a method to systematically derive features from this representation, hence producing an encoding of the discussion that reflects the interaction patterns encapsulated within the representation, and that can be used in further analyses. (Page 198:4)
- In this way, discussions are modelled as collections of comments that are connected by the replies occurring amongst them. Interpretable properties of the discussion can then be systematically derived by quantifying structural properties of the underlying graph: for instance, the indegree of a node signifies the propensity of a comment to draw replies. (Page 198:5)
  - Quick responses that reflect a high degree of correlation would be tight. A long-delayed “like” could be slack?
- For instance, different interlocutors may exhibit varying levels of engagement or reciprocity. Activity could be skewed towards one particularly talkative participant or balanced across several equally-prolific contributors, as can the volume of responses each participant receives across the many comments they may author. (Page 198: 5)
- We model this actor-focused view of discussions with a graph-based representation that augments the reply-tree model with an additional superstructure. To aid our following explanation, we depict the representation of an example discussion thread in Figure 1 (Page 198: 6)
- Relationships between actors are modeled as the collection of individual responses they exchange. Our representation reflects this by organizing edges into hyperedges: a hyperedge between a hypernode C and a node c′ contains all responses an actor directed at a specific comment, while a hyperedge between two hypernodes C and C′ contains the responses that actor C directed at any comment made by C′ over the entire discussion. (Page 198: 6)
  - I think that this can be represented as a tensor (hyperdimensional or flattened) with each node having a value if there is an intersection. There may be an overall scalar that allows each type of interaction to be adjusted as a whole
- The mixture of roles within one discussion varies across different discussions in intuitively meaningful ways. For instance, some discussions are skewed by one particularly active participant, while others may be balanced between two similarly-active participants who are perhaps equally invested in the discussion. We quantify these dynamics by taking several summary statistics of each in/outdegree distribution in the hypergraph representation, such as their maximum, mean and entropy, producing aggregate characterizations of these properties over an entire discussion. We list all statistics computed in the appendices (Table 4). (Page 198: 6, 7)
- To interpret the structure our model offers and address potentially correlated or spurious features, we can perform dimensionality reduction on the feature set our framework yields. In particular, let $X$ be an $N \times k$ matrix whose $N$ rows each correspond to a thread represented by $k$ features. We perform a singular value decomposition on $X$ to obtain a $d$-dimensional representation $X \approx \hat{X} = U S V^{T}$, where rows of $U$ are embeddings of threads in the induced latent space and rows of $V$ represent the hypergraph-derived features. (Page 198: 9)
  - This lets us find the hyperplane of the map we want to build
- Community-level embeddings. We can naturally extend our method to characterize online discussion communities—interchangeably, discussion venues—such as Facebook Pages. To this end, we aggregate representations of the collection of discussions taking place in a community, hence providing a representation of communities in terms of the discussions they foster. This higher level of aggregation lends further interpretability to the hypergraph features we derive. In particular, we define the embedding $\bar{U}_C$ of a community $C$ containing threads $\{t_1, t_2, \ldots, t_n\}$ as the average of the corresponding thread embeddings $U_{t_1}, U_{t_2}, \ldots, U_{t_n}$, scaled to unit $\ell_2$ norm. Two communities $C_1$ and $C_2$ that foster structurally similar discussions then have embeddings $\bar{U}_{C_1}$ and $\bar{U}_{C_2}$ that are close in the latent space. (Page 198: 9)
  - And this may let us place small maps in a larger map. Not sure if the dimensions will line up though
- The set of threads to a post may be algorithmically re-ordered based on factors like quality [13]. However, subsequent replies within a thread are always listed chronologically. We address elements of such algorithmic ranking effects in our prediction tasks (§5). (Page 198: 10)
- Taken together, these filtering criteria yield a dataset of 929,041 discussion threads. (Page 198: 10)
  - And approximately 9,290,410 posts. At an average of 18 words per post (Monitoring Trends on Facebook), that’s a corpus of 167,227,380 words
- We now apply our framework to forecast a discussion’s trajectory—can interactional patterns signal future thread growth or predict future antisocial actions? We address this question by using the features our method extracts from the 10-comment prefix to predict two sets of outcomes that occur temporally after this prefix. (Page 198: 10)
  - These are behavioral trajectories, though not belief trajectories. Maps of these behaviors could probably be built, too.
- For instance, news articles on controversial issues may be especially susceptible to contentious discussions, but this should not translate to barring discussions about controversial topics outright. Additionally, in large-scale social media settings such as Facebook, the content spurring discussions can vary substantially across different sub-communities, motivating the need to seek adaptable indicators that do not hinge on content specific to a particular context. (Page 198: 11)
- Classification protocol. For each task, we train logistic regression classifiers that use our full set of hypergraph-derived features, grid-searching over hyperparameters with 5-fold cross-validation and enforcing that no Page spans multiple folds.¹³ We evaluate our models on a (completely fresh) heldout set of thread pairs drawn from the subsequent week of data (Nov. 8-14, 2017), addressing a model’s potential dependence on various evolving interface features that may have been deployed by Facebook during the time spanned by the training data. (Page 198: 11)
  - We use logistic regression classifiers from scikit-learn with l2 loss, standardizing features and grid-searching over C = {0.001, 0.01, 1}. In the bag-of-words models, we tf-idf transform features, set a vocabulary size of 5,000 words and additionally grid-search over the maximum document frequency in {0.25, 0.5, 1}. (Page 198: 11, footnote 13)
- We test a model using the temporal rate of commenting, which was shown to be a much stronger signal of thread growth than the structural properties considered in prior work [9] (Page 198: 12)
- Table 3 shows Page-macroaveraged heldout accuracies for our prediction tasks. The feature set we extract from our hypergraph significantly outperforms all of the baselines in each task. This shows that interactional patterns occurring within a thread’s early activity can signal later events, and that our framework can extract socially and structurally-meaningful patterns that are informative beyond coarse counts of activity volume, the reply-tree alone and the order in which commenters contribute, along with a shallow representation of the linguistic content discussed. (Page 198: 12)
  - So triangulation from a variety of data sources produces more accurate results in this context, and probably others. Not a surprising finding, but important to show
- We find that in almost all cases, our full model significantly outperforms each subcomponent considered, suggesting that different parts of the hypergraph framework add complementary information across these tasks. (Page 198: 13)
- Having shown that our approach can extract interaction patterns of practical importance from individual threads, we now apply our framework to explore the space of public discussions occurring on Facebook. In particular, we identify salient axes along which discussions vary by qualitatively examining the latent space induced from the embedding procedure described in §3, with d = 7 dimensions. Using our methodology, we recover intuitive types of discussions, which additionally reflect our priors about the venues which foster them. This analysis provides one possible view of the rich landscape of public discussions and shows that our thread representation can structure this diverse space of discussions in meaningful ways. This procedure could serve as a starting point for developing taxonomies of discussions that address the wealth of structural interaction patterns they contain, and could enrich characterizations of communities to systematically account for the types of discussions they foster. (Page 198: 14)
  - ^^^Show this to Wayne!^^^
- The emergence of these groupings is especially striking since our framework considers just discussion structure without explicitly encoding for linguistic, topical or demographic data. In fact, the groupings produced often span multiple languages—the cluster of mainstream news sites at the top includes French (Le Monde), Italian (La Repubblica) and German (SPIEGEL ONLINE) outlets; the “sports” region includes French (L’EQUIPE) as well as English outlets. This suggests that different types of content and different discussion venues exhibit distinctive interactional signatures, beyond lexical traits. Indeed, an interesting avenue of future work could further study the relation between these factors and the structural patterns addressed in our approach, or augment our thread representation with additional contextual information. (Page 198: 15)
- Taken together, we can use the features, threads and Pages which are relatively salient in a dimension to characterize a type of discussion. (Page 198: 15)
- To underline this finer granularity, for each examined dimension we refer to example discussion threads drawn from a single Page, The New York Times (https://www.facebook.com/nytimes), which are listed in the footnotes. (Page 198: 15)
  - Common starting point. Do they find consensus, or show how the dimensions reduce?
- Focused threads tend to contain a small number of active participants replying to a large proportion of preceding comments; expansionary threads are characterized by many less-active participants concentrating their responses on a single comment, likely the initial one. We see that (somewhat counterintuitively) meme-sharing discussion venues tend to have relatively focused discussions. (Page 198: 15)
  - These are two sides of the same dimension-reduction coin. A focused thread should be using the dimension-reduction tool of open discussion that requires the participants to agree on what they are discussing. As such it refines ideas and would produce more meme-compatible content. Expansive threads are dimension reducing to the initial post. The subsequent responses go in too many directions to become a discussion.
- Threads at one end (blue) have highly reciprocal dyadic relationships in which both reactions and replies are exchanged. Since reactions on Facebook are largely positive, this suggests an actively supportive dynamic between actors sharing a viewpoint, and tend to occur in lifestyle-themed content aggregation sub-communities as well as in highly partisan sites which may embody a cohesive ideology. In threads at the other end (red), later commenters tend to receive more reactions than the initiator and also contribute more responses. Inspecting representative threads suggests this bottom-heavy structure may signal a correctional dynamic where late arrivals who refute an unpopular initiator are comparatively well-received. (Page 198: 17)
- This contrast reflects an intuitive dichotomy of one- versus multi-sided discussions; interestingly, the imbalanced one-sided discussions tend to occur in relatively partisan venues, while multi-sided discussions often occur in sports sites (perhaps reflecting the diversity of teams endorsed in these sub-communities). (Page 198: 17)
  - This means that we can identify one-sided behavior and then use that to look at the underlying information. No need to look in diverse areas, they are taking care of themselves. This is ecosystem management 101, where things like algae blooms and invasive species need to be recognized and then managed.
- We now seek to contrast the relative salience of these factors after controlling for community: given a particular discussion venue, is the content or the commenter more responsible for the nature of the ensuing discussions? (Page 198: 17)
- This suggests that, perhaps somewhat surprisingly, the commenter is a stronger driver of discussion type. (Page 198: 18)
  - I can see that. The initial commenter is kind of a gate-keeper to the discussion. A low-dimension, incendiary comment that is already aligned with one group (“lock her up”), will create one kind of discussion, while a high-dimensional, nuanced post will create another.
- We provide a preliminary example of how signals derived from discussion structure could be applied to forecast blocking actions, which are potential symptoms of low-quality interactions (Page 198: 18)
- Important references
  - Cornell Conversational Analysis Toolkit
    - This toolkit contains tools to extract conversational features and analyze social phenomena in conversations. Several large conversational datasets are included together with scripts exemplifying the use of the toolkit on these datasets.
  - [5] Detecting Platform Effects in Online Discussions
  - [6] To Thread or Not to Thread: The Impact of Conversation Threading on Online Discussion
  - [8] Predicting responses to microblog posts
  - [9] Characterizing and curating conversation threads
  - [12] Higher-order organization of complex networks
  - [13] Discussion quality diffuses in the digital public square
  - [16] Anyone can become a troll: Causes of trolling behavior in online discussions
  - [24] Quantifying controversy in social media
  - [28] Statistical analysis of the social network and discussion threads in Slashdot
  - [29] The structure of political discussion networks: A model for the analysis of online deliberation
  - [30] Exploring Text Virality in Social Networks
  - [39] Dynamics of conversations
  - [49] Conversational Markers of Constructive Discussions
  - [50] Reply trees in Twitter: Data analysis and branching process models
  - [58] Quotes Reveal Community Structure and Interaction Dynamics
  - [69] Predicting the volume of comments on online news stories
  - [72] From user comments to on-line conversations
  - [75] Democracy, deliberation and design: The case of online discussion forums
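
The community-level embedding quoted above (average the thread embeddings of a community, then scale to unit ℓ2 norm) is simple enough to sketch in plain Python. This is only an illustration, not the paper's code: the 3-dimensional thread vectors are made up and stand in for rows of U, and the function names `community_embedding` and `cosine_similarity` are hypothetical.

```python
import math


def community_embedding(thread_embeddings):
    """Average the thread embeddings, then scale the mean to unit l2 norm."""
    n = len(thread_embeddings)
    dims = len(thread_embeddings[0])
    mean = [sum(vec[i] for vec in thread_embeddings) / n for i in range(dims)]
    norm = math.sqrt(sum(x * x for x in mean))
    if norm == 0.0:
        return mean  # all-zero average: nothing to scale
    return [x / norm for x in mean]


def cosine_similarity(a, b):
    """For unit-length vectors the cosine is just the dot product."""
    return sum(x * y for x, y in zip(a, b))


# Made-up thread embeddings (assumption: stand-ins for rows of U)
news_a = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
news_b = [[0.7, 0.3, 0.0], [1.0, 0.0, 0.2]]
sports = [[0.0, 0.9, 0.4], [0.1, 0.8, 0.5]]

emb_a = community_embedding(news_a)
emb_b = community_embedding(news_b)
emb_s = community_embedding(sports)

print("news_a vs news_b:", round(cosine_similarity(emb_a, emb_b), 3))
print("news_a vs sports:", round(cosine_similarity(emb_a, emb_s), 3))
```

Communities whose threads look alike end up close together (high cosine), which is the property that might let small maps be placed inside a larger one, provided the dimensions of the underlying thread embeddings are shared.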

Phil 11.12.18

7:00 – 7:00 ASRC PhD

Call Tim Ellis – done
Tags – done
Bills – nope, including MD EV paperwork – done
Get oil change kit from Bob’s – closed
Fika – done
Finish Similar neural responses predict friendship – Done!
Discrete hierarchical organization of social group sizes
- The ‘social brain hypothesis’ for the evolution of large brains in primates has led to evidence for the coevolution of neocortical size and social group sizes, suggesting that there is a cognitive constraint on group size that depends, in some way, on the volume of neural material available for processing and synthesizing information on social relationships. More recently, work on both human and non-human primates has suggested that social groups are often hierarchically structured. We combine data on human grouping patterns in a comprehensive and systematic study. Using fractal analysis, we identify, with high statistical confidence, a discrete hierarchy of group sizes with a preferred scaling ratio close to three: rather than a single or a continuous spectrum of group sizes, humans spontaneously form groups of preferred sizes organized in a geometrical series approximating 3–5, 9–15, 30–45, etc. Such discrete scale invariance could be related to that identified in signatures of herding behaviour in financial markets and might reflect a hierarchical processing of social nearness by human brains.
Work on Antonio’s paper – good progress
Aaron added a lot of content to Belief Spaces, and we got together to discuss. Probably the best thing to come out of the discussion was an approach to the dungeons that at one end is an acyclic, directed, linear graph of connected nodes. The map will be a line, with any dilemma discussions connected with the particular nodes. At the other end is an open environment. In between are various open and closed graphs that we can classify with some level of complexity.
One of the things that might be interesting to examine is the distance between nodes, and how that affects behavior
Need to mention that D&D are among the oldest “digital residents” of the internet, with decades-old artifacts.
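
A rough sketch of how the dungeon graphs might be classified, from the linear chain at one end to more open layouts. Everything here is an assumption for illustration: dungeons are given as lists of directed (from, to) edges, and "complexity" is taken to be the number of independent cycles in the undirected version of the graph (edges minus nodes plus connected components). Other measures would work just as well.

```python
def is_linear_chain(edges):
    """True if the directed edges form one path n0 -> n1 -> ... -> nk."""
    nodes = {n for edge in edges for n in edge}
    successor = {}
    has_predecessor = set()
    for src, dst in edges:
        if src in successor or dst in has_predecessor:
            return False  # a branch or a merge, so not a line
        successor[src] = dst
        has_predecessor.add(dst)
    starts = nodes - has_predecessor
    if len(starts) != 1:
        return False
    # Walk from the single start node and make sure every node is reached
    current = next(iter(starts))
    visited = 1
    while current in successor:
        current = successor[current]
        visited += 1
    return visited == len(nodes)


def independent_cycles(edges):
    """Edges - nodes + components, ignoring edge direction (hypothetical measure)."""
    undirected = {frozenset(edge) for edge in edges}
    nodes = {n for edge in edges for n in edge}
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for edge in undirected:
        members = list(edge)
        parent[find(members[0])] = find(members[-1])
    components = len({find(n) for n in nodes})
    return len(undirected) - len(nodes) + components


linear_dungeon = [("entrance", "hall"), ("hall", "vault"), ("vault", "exit")]
looping_dungeon = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]

for name, dungeon in [("linear", linear_dungeon), ("looping", looping_dungeon)]:
    print(name, is_linear_chain(dungeon), independent_cycles(dungeon))
```

The linear dungeon has zero independent cycles and maps to a line. Higher counts mean more ways to wander, which is one candidate axis for ordering the in-between graphs.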

Phil 11.7.18

Let the House Subcommittee investigations begin! Also, better redistricting?

7:00 – 5:00 ASRC PhD/BD

Rather than Deep Learning with Keras, I’m starting on Grokking Deep Learning. I need better grounding
- Installed Jupyter
After lunch, send follow-up emails to the technical POCs. This will be the basis for the white paper: Tentative findings/implications for design. Modify it on the blog page first and then use it to create the LaTeX doc. Make that one project, with different mains that share overlapping content.
Characterizing Online Public Discussions through Patterns of Participant Interactions
- Public discussions on social media platforms are an intrinsic part of online information consumption. Characterizing the diverse range of discussions that can arise is crucial for these platforms, as they may seek to organize and curate them. This paper introduces a computational framework to characterize public discussions, relying on a representation that captures a broad set of social patterns which emerge from the interactions between interlocutors, comments and audience reactions. We apply our framework to study public discussions on Facebook at two complementary scales. First, we use it to predict the eventual trajectory of individual discussions, anticipating future antisocial actions (such as participants blocking each other) and forecasting a discussion’s growth. Second, we systematically analyze the variation of discussions across thousands of Facebook sub-communities, revealing subtle differences (and unexpected similarities) in how people interact when discussing online content. We further show that this variation is driven more by participant tendencies than by the content triggering these discussions.
More latent space flocking from Innovation Hub
- You Share Everything With Your Bestie. Even Brain Waves.
  - Scientists have found that the brains of close friends respond in remarkably similar ways as they view a series of short videos: the same ebbs and swells of attention and distraction, the same peaking of reward processing here, boredom alerts there. The neural response patterns evoked by the videos — on subjects as diverse as the dangers of college football, the behavior of water in outer space, and Liam Neeson trying his hand at improv comedy — proved so congruent among friends, compared to patterns seen among people who were not friends, that the researchers could predict the strength of two people’s social bond based on their brain scans alone.
- Similar neural responses predict friendship
  - Human social networks are overwhelmingly homophilous: individuals tend to befriend others who are similar to them in terms of a range of physical attributes (e.g., age, gender). Do similarities among friends reflect deeper similarities in how we perceive, interpret, and respond to the world? To test whether friendship, and more generally, social network proximity, is associated with increased similarity of real-time mental responding, we used functional magnetic resonance imaging to scan subjects’ brains during free viewing of naturalistic movies. Here we show evidence for neural homophily: neural responses when viewing audiovisual movies are exceptionally similar among friends, and that similarity decreases with increasing distance in a real-world social network. These results suggest that we are exceptionally similar to our friends in how we perceive and respond to the world around us, which has implications for interpersonal influence and attraction.
- Brain-to-Brain coupling: A mechanism for creating and sharing a social world
  - Cognition materializes in an interpersonal space. The emergence of complex behaviors requires the coordination of actions among individuals according to a shared set of rules. Despite the central role of other individuals in shaping our minds, most cognitive studies focus on processes that occur within a single individual. We call for a shift from a single-brain to a multi-brain frame of reference. We argue that in many cases the neural processes in one brain are coupled to the neural processes in another brain via the transmission of a signal through the environment. Brain-to-brain coupling constrains and simplifies the actions of each individual in a social network, leading to complex joint behaviors that could not have emerged in isolation.
Started reading Similar neural responses predict friendship

Phil 11.6.18

7:00 – 2:00 ASRC PhD/BD

Today’s big thought: Maps are going to be easier than I thought. We’ve been doing them for thousands of years with board games.
Worked with Aaron on slides, including finding fault detection using our technologies. There is quite a bit, with pioneering work from NASA
Uploaded documents – done
Called and left messages for Dr. Wilkins and Dr. Palazzolo. Need to send a follow-up email to Dr. Palazzolo and start on the short white papers
Leaving early to vote
The following two papers seem to be addressing edge stiffness
Model of the Information Shock Waves in Social Network Based on the Special Continuum Neural Network
- The article proposes a special class of continuum neural network with varying activation thresholds and a specific neuronal interaction mechanism as a model of message distribution in social networks. Activation function for every neuron is fired as a decision of the specific systems of differential equations which describe the information distribution in the chain of the network graph. This class of models allows to take into account the specific mechanisms for transmitting messages, where individuals who, receiving a message, initially form their attitude towards it, and then decide on the further transmission of this message, provided that the corresponding potential of the interaction of two individuals exceeds a certain threshold level. The authors developed the original algorithm for calculating the time moments of message distribution in the corresponding chain, which comes to the solution of a series of Cauchy problems for systems of ordinary nonlinear differential equations.
A cost-effective algorithm for inferring the trust between two individuals in social networks
- The popularity of social networks has significantly promoted online individual interaction in the society. In online individual interaction, trust plays a critical role. It is very important to infer the trust among individuals, especially for those who have not had direct contact previously in social networks. In this paper, a restricted traversal method is defined to identify the strong trust paths from the truster and the trustee. Then, these paths are aggregated to predict the trust rate between them. During the traversal on a social network, interest topics and topology features are comprehensively considered, where weighted interest topics are used to measure the semantic similarity between users. In addition, trust propagation ability of users is calculated to indicate micro topology information of the social network. In order to find the – most trusted neighbors, two combination strategies for the above two factors are proposed in this paper. During trust inference, the traversal depth is constrained according to the heuristic rule based on the “small world” theory. Three versions of the trust rate inference algorithm are presented. The first algorithm merges interest topics and topology features into a hybrid measure for trusted neighbor selection. The other two algorithms consider these two factors in two different orders. For the purpose of performance analysis, experiments are conducted on a public and widely-used data set. The results show that our algorithms outperform the state-of-the-art algorithms in effectiveness. In the meantime, the efficiency of our algorithms is better than or comparable to those algorithms.
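
To get a feel for the general idea in this abstract (restricted traversal, depth limit, aggregated paths), here is a small sketch. It is not the paper's algorithm. The assumptions are mine: trust is a number in [0, 1] on each directed edge, only edges at or above `min_trust` count as "strong", a path's trust is the product of its edge values, and the paths are aggregated by a plain average. The interest-topic similarity part is left out entirely.

```python
def trust_path_scores(graph, truster, trustee, max_depth, min_trust):
    """Yield the score of every simple path from truster to trustee.

    Only edges with trust >= min_trust are followed, and paths longer
    than max_depth edges are cut off (a stand-in for the depth heuristic).
    """
    def walk(node, score, depth, seen):
        if node == trustee:
            yield score
            return
        if depth == max_depth:
            return
        for neighbor, trust in graph.get(node, {}).items():
            if trust >= min_trust and neighbor not in seen:
                yield from walk(neighbor, score * trust, depth + 1, seen | {neighbor})

    yield from walk(truster, 1.0, 0, {truster})


def infer_trust(graph, truster, trustee, max_depth=3, min_trust=0.5):
    """Average the path scores; None when no strong path exists."""
    scores = list(trust_path_scores(graph, truster, trustee, max_depth, min_trust))
    if not scores:
        return None
    return sum(scores) / len(scores)


# Made-up trust network: graph[a][b] is how much a trusts b
graph = {
    "alice": {"bob": 0.9, "carol": 0.6},
    "bob": {"dave": 0.8},
    "carol": {"dave": 0.5, "erin": 0.4},
    "erin": {"dave": 1.0},
}

print(infer_trust(graph, "alice", "dave"))
print(infer_trust(graph, "alice", "dave", max_depth=1))
```

The weak carol-to-erin edge is never followed, so only two paths contribute. With a depth limit of one edge there is no path at all, which is where the edge stiffness question comes in: the threshold decides how far trust can propagate.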
Back to LSTMs. Made a numeric version of “all work and no play” in the jack_torrance generator
Reading in and writing out weight files. The predictions seem to be working well, but I have no insight into the arguments that go into the LSTM model. Going to revisit the Deep Learning with Keras book
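
For reference, a minimal sketch of what a numeric version of the text could look like before it goes into an LSTM: each word becomes an integer id, and sliding windows of ids are paired with the id that follows. This is a guess at the general shape, not the actual jack_torrance code. The full sentence, the window size of 4, and the helper names are assumptions.

```python
def build_vocab(words):
    """Map each distinct word to an integer id, in order of first appearance."""
    vocab = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def make_windows(ids, window):
    """Pair every run of `window` ids with the id that comes right after it."""
    count = len(ids) - window
    inputs = [ids[i:i + window] for i in range(count)]
    targets = [ids[i + window] for i in range(count)]
    return inputs, targets


# Assumed source text: the sentence repeated, as in the movie manuscript
text = "all work and no play makes jack a dull boy " * 3
words = text.split()

vocab = build_vocab(words)
ids = [vocab[word] for word in words]
inputs, targets = make_windows(ids, window=4)

print(vocab)
print(inputs[0], "->", targets[0])
```

The `inputs` and `targets` lists are what would then be reshaped into arrays and handed to the model for training, with the window size corresponding to the number of timesteps.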

viztales

Dimension reduction, State, Orientation, and Speed