Category Archives: Fact Checking

Phil 5.1.19

7:00 – 7:00 ASRC NASA AIMS

Added lit review section to the dissertation, and put the seven steps of sectarianism in.
Spent most of yesterday helping Aaron with TimeSeriesML. Currently working on a JSON util that gets the value at a provided path.
Had to set up python at the module and not project level, which was odd. Here’s how: www.jetbrains.com/help/idea/2016.1/configuring-global-project-and-module-sdks.html#module_sdk

Done!

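    # Methods of JsonUtils; needs `from typing import Dict, List` at module level
    def lfind(self, query_list:List, target_list:List, targ_str:str = "???"):
        # Match the first query element against each target element in turn
        for tval in target_list:
            if isinstance(tval, dict):
                result = self.dfind(query_list[0], tval, targ_str)
                # Keep scanning if this element did not match
                if result is not None:
                    return result
            elif tval == query_list[0]:
                return tval
        return None

    def dfind(self, query_dict:Dict, target_dict:Dict, targ_str:str = "???"):
        for key, qval in query_dict.items():
            # print("key = {}, qval = {}".format(key, qval))
            if key not in target_dict:
                return None
            tval = target_dict[key]
            if isinstance(qval, dict):
                return self.dfind(qval, tval, targ_str)
            elif isinstance(qval, list):
                return self.lfind(qval, tval, targ_str)
            else:
                # targ_str marks the value being asked for
                if qval == targ_str:
                    return tval
                # A literal that differs means this branch is not a match
                if qval != tval:
                    return None
        return None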

    def find(self, query_dict:Dict):
        # pprint.pprint(query_dict)
        result = self.dfind(query_dict, self.json_dict)
        return result

It’s called like this:

ju = JsonUtils("../../data/output_data/lstm_structure.json")
# ju.pprint()
result = ju.find({"config":[{"class_name":"Masking", "config":{"batch_input_shape": "???"}}]})
print("result 1 = {}".format(result))
result = ju.find({"config":[{"class_name":"Masking", "config":{"mask_value": "???"}}]})
print("result 2 = {}".format(result))

Here are the results:

result 1 = [None, 12, 1]
result 2 = 666.0
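
Since the JSON file itself isn't shown here, this is a minimal standalone sketch of the same query-by-example idea that runs against an inline dictionary. The function names (`find_in_dict`, `find_in_list`) and the `model` dictionary are hypothetical stand-ins, not the actual JsonUtils class or the real lstm_structure.json. The sketch assumes the list search should keep going when an element doesn't match.

```python
from typing import Any, Dict, List, Optional

PLACEHOLDER = "???"  # marks the value the query is asking for


def find_in_dict(query: Dict, target: Dict) -> Optional[Any]:
    """Walk query and target together; return the target value at the placeholder."""
    for key, qval in query.items():
        if key not in target:
            return None
        tval = target[key]
        if isinstance(qval, dict):
            return find_in_dict(qval, tval) if isinstance(tval, dict) else None
        if isinstance(qval, list):
            return find_in_list(qval, tval) if isinstance(tval, list) else None
        if qval == PLACEHOLDER:
            return tval
        if qval != tval:
            return None  # a literal in the query did not match
    return None


def find_in_list(query: List, target: List) -> Optional[Any]:
    """Try the first query element against every element of the target list."""
    for tval in target:
        if isinstance(tval, dict):
            result = find_in_dict(query[0], tval)
            if result is not None:
                return result
        elif tval == query[0]:
            return tval
    return None


# Hypothetical stand-in for the model structure file (not the real data)
model = {
    "class_name": "Sequential",
    "config": [
        {"class_name": "LSTM", "config": {"units": 64}},
        {"class_name": "Masking",
         "config": {"batch_input_shape": [None, 12, 1], "mask_value": 666.0}},
    ],
}

shape = find_in_dict(
    {"config": [{"class_name": "Masking",
                 "config": {"batch_input_shape": PLACEHOLDER}}]}, model)
mask = find_in_dict(
    {"config": [{"class_name": "Masking",
                 "config": {"mask_value": PLACEHOLDER}}]}, model)
print("shape = {}, mask = {}".format(shape, mask))
```

One limitation of this approach: a stored value that is itself `None` can't be told apart from "no match".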

Got Aaron’s code running!
Meeting with Joel
- A quicker demo than I was expecting, though I was able to walk through how to create and use Corpus Manager and LMN. Also, we got a bug where the column index for the eigenvector didn’t exist. Fixed that in JavaUtils.math.Labeled2DMatrix.java
Meeting with Wayne
- Walked through the JASSS paper. Need to make sure that the lit review is connected and in the proper order
- Changed the title of the dissertation to
  - Stampede Theory: Mapping Dangerous Misinformation at Scale
- Solidifying defense over the winter break, with diploma in the Spring
- Mentioned the “aikido with drones” concept. Need to make an image. Actually, I wonder if there is a way for that model to be used for actually getting a grant to explore weaponized AI in a way that isn’t directly mappable to weapons systems, but is close enough to reality that people will get the point.
- Also discussed the concept of managing runaway AI with the Sanhedrin-17a concept, where unanimous agreement to convict means acquittal. Cities had Sanhedrin of 23 Judges and the Great Sanhedrin had 71 Judges en.wikipedia.org/wiki/Sanhedrin
  - Rav Kahana says: In a Sanhedrin where all the judges saw fit to convict the defendant in a case of capital law, they acquit him. The Gemara asks: What is the reasoning for this halakha? It is since it is learned as a tradition that suspension of the trial overnight is necessary in order to create a possibility of acquittal. The halakha is that they may not issue the guilty verdict on the same day the evidence was heard, as perhaps over the course of the night one of the judges will think of a reason to acquit the defendant. And as those judges all saw fit to convict him they will not see any further possibility to acquit him, because there will not be anyone arguing for such a verdict. Consequently, he cannot be convicted.
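
A rough sketch of how the unanimity rule could look as a decision function over a panel of independent evaluators. The function name and the simple-majority threshold are assumptions for illustration only. The passage above gives just the "unanimous conviction means acquittal" part, and this is not a rendering of the full Sanhedrin voting rules.

```python
from typing import Sequence


def sanhedrin_verdict(votes_to_convict: Sequence[bool]) -> str:
    """Return 'convict' or 'acquit' for a panel of independent judges.

    Assumption: a simple majority convicts. The one rule taken from the
    text is that a unanimous vote to convict results in acquittal, since
    nobody on the panel argued the other side.
    """
    if not votes_to_convict:
        raise ValueError("need at least one judge")
    n_convict = sum(1 for v in votes_to_convict if v)
    if n_convict == len(votes_to_convict):
        return "acquit"  # unanimity: no dissent was heard
    return "convict" if n_convict > len(votes_to_convict) / 2 else "acquit"


# A panel of 23, as in the city courts mentioned above
print(sanhedrin_verdict([True] * 23))            # unanimous
print(sanhedrin_verdict([True] * 22 + [False]))  # one dissenter
```

The point for AI oversight would be that total agreement among supposedly independent evaluators is treated as a warning sign rather than as strong evidence.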

Phil 8.30.18

7:00 – 5:00 ASRC MKT

Target Blue Sky paper for iSchool/iConference 2019: The chairs are particularly looking for “Blue Sky Ideas” that are open-ended, possibly even “outrageous” or “wacky,” and present new problems, new application domains, or new methodologies that are likely to stimulate significant new research.
I’m thinking of a paper that works through the ramifications of this diagram as it relates to people and machines. With humans, who respond slowly over spongy, switched networks, the flocking area is large. With a monolithic, densely connected system, it’s going to be a straight line from nomadic to stampede.
- Length: Up to 4 pages (excluding references)
- Submission deadline: October 1, 2018
- Notification date: mid-November, 2018
- Final versions due: December 14, 2018
- First versions will be submitted using .pdf. Final versions must be submitted in .doc, .docx or LaTeX.
More good stuff on BBC Business Daily Trolling for Cash
- Anger and animosity is prevalent online, with some people even seeking it out. It’s present on social media of course as well as many online forums. But now outrage has spread to mainstream media outlets and even the advertising industry. So why is it so lucrative? Bonny Brooks, a writer and researcher at Newcastle University explains who is making money from outrage. Neuroscientist Dr Dean Burnett describes what happens to our brains when we see a comment designed to provoke us. And Curtis Silver, a tech writer for KnowTechie and ForbesTech, gives his thoughts on what we need to do to defend ourselves from this onslaught of outrage.
Exposure to Opposing Views can Increase Political Polarization: Evidence from a Large-Scale Field Experiment on Social Media
- Christopher Bail (Scholar)
- There is mounting concern that social media sites contribute to political polarization by creating “echo chambers” that insulate people from opposing views about current events. We surveyed a large sample of Democrats and Republicans who visit Twitter at least three times each week about a range of social policy issues. One week later, we randomly assigned respondents to a treatment condition in which they were offered financial incentives to follow a Twitter bot for one month that exposed them to messages produced by elected officials, organizations, and other opinion leaders with opposing political ideologies. Respondents were re-surveyed at the end of the month to measure the effect of this treatment, and at regular intervals throughout the study period to monitor treatment compliance. We find that Republicans who followed a liberal Twitter bot became substantially more conservative post-treatment, and Democrats who followed a conservative Twitter bot became slightly more liberal post-treatment. These findings have important implications for the interdisciplinary literature on political polarization as well as the emerging field of computational social science.
Setup gcloud tools on laptop – done
Setup Tensorflow on laptop. Gave up on using CUDA 9.1, but got tf doing ‘hello, tensorflow’
Marcom meeting – 2:00
Get the concept of behaviors being a more scalable, dependable way of vetting information.
- E.g., watching the DISI of outrage as manifested in trolling
  - “Uh. . . . not to be nitpicky,,,,,but…the past tense of drag is dragged, not drug.”: An overview of trolling strategies
    - Dr Claire Hardaker (Scholar) (Blog)
      - I primarily research aggression, deception, and manipulation in computer-mediated communication (CMC), including phenomena such as flaming, trolling, cyberbullying, and online grooming. I tend to take a forensic linguistic approach, based on a corpus linguistic methodology, but due to the multidisciplinary nature of my research, I also inevitably branch out into areas such as psychology, law, and computer science.
    - This paper investigates the phenomenon known as trolling — the behaviour of being deliberately antagonistic or offensive via computer-mediated communication (CMC), typically for amusement’s sake. Having previously started to answer the question, what is trolling? (Hardaker 2010), this paper seeks to answer the next question, how is trolling carried out? To do this, I use software to extract 3,727 examples of user discussions and accusations of trolling from an eighty-six million word Usenet corpus. Initial findings suggest that trolling is perceived to broadly fall across a cline with covert strategies and overt strategies at each pole. I create a working taxonomy of perceived strategies that occur at different points along this cline, and conclude by refining my trolling definition.
    - Citing papers
FireAnt (Filter, Identify, Report, and Export Analysis Toolkit) is a freeware social media and data analysis toolkit with built-in visualization tools including time-series, geo-position (map), and network (graph) plotting.
Fix marquee – done
Export to ppt – done!
- include videos – done
- Center title in ppt:
  - model considerations – done
  - diversity injection – done
Got the laptop running Python and Tensorflow. Had a stupid problem where I accidentally made a virtual environment and keras wouldn’t work. Removed, re-connected and restarted IntelliJ and everything is working!

Phil 3.7.18

7:00 – 5:00 ASRC MKT

Some surprising snow
Meeting with Sy at 1:30 slides
Meeting with Dr. DesJardins at 4:00
Nice chat with Wajanat about the presentation of the Saudi Female self in physical and virtual environments
- Scholar search
Sprint planning
- Finish ONR Proposal VP-331
- CHIIR VP-332
- Prep for TF dev conf VP-334
- TF dev conf VP-334
Working on the ONR proposal
Oxford Internet Institute – Computational Propaganda Research Project
- The Computational Propaganda Research Project (COMPROP) investigates the interaction of algorithms, automation and politics. This work includes analysis of how tools like social media bots are used to manipulate public opinion by amplifying or repressing political content, disinformation, hate speech, and junk news. We use perspectives from organizational sociology, human computer interaction, communication, information science, and political science to interpret and analyze the evidence we are gathering. Our project is based at the Oxford Internet Institute, University of Oxford.
- Polarization, Partisanship and Junk News Consumption over Social Media in the US
  - What kinds of social media users read junk news? We examine the distribution of the most significant sources of junk news in the three months before President Donald Trump’s first State of the Union Address. Drawing on a list of sources that consistently publish political news and information that is extremist, sensationalist, conspiratorial, masked commentary, fake news and other forms of junk news, we find that the distribution of such content is unevenly spread across the ideological spectrum. We demonstrate that (1) on Twitter, a network of Trump supporters shares the widest range of known junk news sources and circulates more junk news than all the other groups put together; (2) on Facebook, extreme hard right pages—distinct from Republican pages—share the widest range of known junk news sources and circulate more junk news than all the other audiences put together; (3) on average, the audiences for junk news on Twitter share a wider range of known junk news sources than audiences on Facebook’s public pages
  - Need to look at the variance in the articles. Are these topical stampedes? Or is this source-oriented?
Understanding and Addressing the Disinformation Ecosystem
- This workshop brings together academics, journalists, fact-checkers, technologists, and funders to better understand the challenges produced by the current disinformation ecosystem. The facilitated discussions will highlight relevant research, share best-practices, identify key questions of scholarly and practical concern regarding the nature and implications of the disinformation ecosystem, and outline a potential research agenda designed to answer these questions.
More BIC
- The psychology of group identity allows us to understand that group identification can be due to factors that have nothing to do with the individual preferences. Strong interdependence and other forms of common individual interest are one sort of favouring condition, but there are many others, such as comembership of some existing social group, sharing a birthday, and the artificial categories of the minimal group paradigm. (pg 150)
- Wherever we may expect group identity we may also expect team reasoning. The effect of team reasoning on behavior is different from that of individualistic reasoning. We have already seen this for Hi-Lo. This has wide implications. It makes the theory of team reasoning a much more powerful explanatory and predictive theory than it would be if it came on line only in games with the right kind of common interest. To take just one example, if management brings it about so that the firm’s employees identify with the firm, we may expect for them to team-reason and so to make choices that are not predicted by the standard theories of rational choice. (pg 150)
- As we have seen, the same person passes through many group identities in the flux of life, and even on a single occasion more than one of these identities may be stimulated. So we will need a model of identity in which the probability of a person’s identification is distributed over not just two alternatives-personal self-identity or identity with a fixed group-but, in principle, arbitrarily many. (pg 151)

Phil 1.24.18

7:00 – 5:00 ASRC MKT

H1: Groups are defined by a common location, orientation, and velocity (LOV) through a navigable physical or cognitive space. The amount of group cohesion and identification is proportional to the amount of similarity along all three axes.
H2: Group Behavior emerges from mutual influence, based on awareness and trust. Mutual influence is facilitated by Dimension Reduction: The lower the number of dimensions, the easier it is to produce a group.
H3: Group behavior has three distinct patterns: Nomadic, Flocking and Stampeding. These behaviors are dictated by the level of trust and awareness between individuals having similar LOVs
- H3a: The trustworthiness of the underlying information space can be inferred from the group behaviors through belief space. All agents seek out fitness peaks (reward gradients) and avoid valleys (risk gradients) within the space. (Risk = negative heading alignment, increase speed. Reward = positive heading alignment, decrease speed.)
  - Nomadic emphasizes environmental gradients as an individual or small group of agents. This supports the broadest awareness of the belief space, though it may be difficult to infer fitness peaks. Gradient discovery is less influenced by additional social effects.
  - Flocking behavior results from environmentally constrained social gradient seeking. For example, distance attenuates social influence. If an agent finds a risk or reward, that information cascades through the population as a function of the environmental constraints. (Note: In-group and out group could be manifestations of pure social gradient creation.)
  - Stampede emphasizes social gradients. This becomes easier as groups become larger and a strong ‘social reality’ occurs. When social influence is dominant at the expense of environmental awareness, a runaway stampede can occur. The beliefs and associated information that underlie a stampede can be inferred to be untrustworthy.
H4: Individual trajectories through these spaces, when combined with large numbers of other individual trajectories produce maps which reflect the dimensions that define the groups in that space.
These conclusions can be derived through
- Game theory (individual – Multi-armed bandit, coordinating [HiLo] – using only diagonals)
- Ecology (Explore/Exploit, Schooling and flocking for gradient detection)
- Sociology (Conflict, consensus, and compromise)
- Computer Science (“intelligence is computation – an expensive physical process – and therefore limited”)
- Simulation (Current work)
- Computer-mediated communication (Future work)
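
A toy sketch of the H3 idea, under loud assumptions: agents sit at fixed random positions, each has only a heading, and the influence radius stands in for "awareness". None of this is the actual simulator. It only shows how widening social influence moves a population from scattered headings toward a single shared one.

```python
import math
import random


def mean_alignment(headings):
    """Length of the mean heading vector: near 0 when scattered, 1 when aligned."""
    n = len(headings)
    mx = sum(math.cos(a) for a in headings) / n
    my = sum(math.sin(a) for a in headings) / n
    return math.hypot(mx, my)


def run(radius, n_agents=50, steps=20, seed=1):
    """Each step, every agent adopts the mean heading of itself plus all
    neighbours within `radius` (hypothetical influence rule)."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n_agents)]
    headings = [rng.uniform(-math.pi, math.pi) for _ in range(n_agents)]
    for _ in range(steps):
        updated = []
        for i, (xi, yi) in enumerate(pos):
            sx, sy = math.cos(headings[i]), math.sin(headings[i])
            for j, (xj, yj) in enumerate(pos):
                if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                    sx += math.cos(headings[j])
                    sy += math.sin(headings[j])
            updated.append(math.atan2(sy, sx))
        headings = updated
    return mean_alignment(headings)


for r in (0.0, 0.2, 2.0):
    print("radius {:.1f} -> alignment {:.3f}".format(r, run(r)))
```

With a radius of zero nobody influences anybody (nomadic-like). With a radius larger than the whole space, everyone locks onto one heading after a single step (stampede-like).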
Continuing with BIC
Fundamentals of Data Visualization
- I’m very excited to announce my latest project, a book on data visualization. The working title is “Fundamentals of Data Visualization”. The book will be published with O’Reilly, and a preview is available here. The entire book is written in R Markdown, and the figures are made with ggplot2. The source for the book is available on github.
Sex differences in the use of social information emerge under conditions of risk
- Social learning provides an effective route to gaining up-to-date information, particularly when information is costly to obtain asocially. Theoretical work predicts that the willingness to switch between using asocial and social sources of information will vary between individuals according to their risk tolerance. We tested the prediction that, where there are sex differences in risk tolerance, altering the variance of the payoffs of using asocial and social information differentially influences the probability of social information use by sex. In a computer-based task that involved building a virtual spaceship, men and women (N = 88) were given the option of using either asocial or social sources of information to improve their performance. When the asocial option was risky (i.e., the participant’s score could markedly increase or decrease) and the social option was safe (i.e., their score could slightly increase or remain the same), women, but not men, were more likely to use the social option than the asocial option. In all other conditions, both women and men preferentially used the asocial option to a similar degree.
Thinking Fast and Slow on Networks: Co-evolution of Cognition and Cooperation in Structured Populations
- In line with past work in well-mixed populations, we find that selection favors either the intuitive defector (ID) strategy which never deliberates, or the dual-process cooperator (DC) strategy which intuitively cooperates but uses deliberation to switch to defection in Prisoner’s Dilemma games. We find that sparser networks (i.e. smaller average degree) facilitate the success of DC over ID, while also reducing the level of deliberation that DC agents engage in; and that these results generalize across different kinds of networks.
Joanna J Bryson 7:30 AM – 24 Jan 2018: This didn’t happen because humans are evil. It happens because intelligence is computation—an expensive physical process—and therefore limited. Thread very worth reading.
A bit more Angular
Compared the speed of execution for LSTM on my and Aaron’s boxes. His newer card is a bit faster than my TITAN
Most of the day was spent putting together the ppt for the ML/AI workshop on Monday

Phil 12.7.17

ASRC MKT 7:00 – 4:30

Association of moral values with vaccine hesitancy
Online extremism and the communities that sustain it: Detecting the ISIS supporting community on Twitter
Continuing Schooling as a strategy for taxis in a noisy environment here
Consensus and Cooperation in Networked Multi-Agent Systems
- This paper provides a theoretical framework for analysis of consensus algorithms for multi-agent networked systems with an emphasis on the role of directed information flow, robustness to changes in network topology due to link/node failures, time-delays, and performance guarantees. An overview of basic concepts of information consensus in networks and methods of convergence and performance analysis for the algorithms are provided. Our analysis framework is based on tools from matrix theory, algebraic graph theory, and control theory. We discuss the connections between consensus problems in networked dynamic systems and diverse applications including synchronization of coupled oscillators, flocking, formation control, fast consensus in smallworld networks, Markov processes and gossip-based algorithms, load balancing in networks, rendezvous in space, distributed sensor fusion in sensor networks, and belief propagation. We establish direct connections between spectral and structural properties of complex networks and the speed of information diffusion of consensus algorithms. A brief introduction is provided on networked systems with nonlocal information flow that are considerably faster than distributed systems with lattice-type nearest neighbor interactions. Simulation results are presented that demonstrate the role of smallworld effects on the speed of consensus algorithms and cooperative control of multivehicle formations.
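
To make the consensus idea concrete, here is a minimal sketch of the standard discrete-time averaging update on an undirected graph, x_i(k+1) = x_i(k) + eps * sum_j (x_j(k) - x_i(k)). The ring topology, step size and starting values are assumptions chosen for illustration. The step size is kept below 1 / (maximum degree), the usual condition for this update to converge to the average.

```python
from typing import Dict, List


def consensus_step(x: List[float], neighbors: Dict[int, List[int]],
                   eps: float) -> List[float]:
    """One synchronous update: each node moves toward its neighbours' values."""
    return [xi + eps * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)]


def ring(n: int) -> Dict[int, List[int]]:
    """Undirected ring: node i is linked to i-1 and i+1 (hypothetical topology)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


def run_consensus(x0: List[float], neighbors: Dict[int, List[int]],
                  eps: float = 0.3, steps: int = 200) -> List[float]:
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, neighbors, eps)
    return x


# Six nodes with different starting opinions; max degree is 2, so eps = 0.3 < 1/2
values = run_consensus([0.0, 1.0, 2.0, 3.0, 4.0, 5.0], ring(6))
print(values)
```

Because the links are symmetric, the sum of the values is preserved at every step, so the agreed value is the mean of the starting values.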
Found this in the citations of the above paper with terms “belief space flocking”: Spatial Coordination Games for Large-Scale Visualization
- Dimensionality reduction (’visualization’) is a central problem in statistics. Several of the most popular solutions grew out of interaction metaphors (springs, boids, neurons, etc.) We show that the problem can be framed as a game of coordination and solved with standard game-theoretic concepts. Nodes that are close in a (high-dimensional) graph must coordinate in a (low-dimensional) screen position. We derive a game solution, a GPU-parallel implementation and report visualization experiments in several datasets. The solution is a very practical application of game-theory in an important problem, with fast and low-stress embeddings.
Lots of progress on the White Paper. Aaron wants to split out the WordRank work and the mapping work as two separate epochs. He thinks they may be easier to pitch than the phased approach
Some discussion on how explore/exploit is a bad metaphor due to the bad associations with exploit
Added a SIGINT use case
Discussed the ‘map weaving from trajectories’ concept

Phil 11.14.17

7:00 – 4:00 ASRC MKT

Reinforcement Learning: An Introduction (2nd Edition)
- Richard S. Sutton (Scholar): I am seeking to identify general computational principles underlying what we mean by intelligence and goal-directed behavior. I start with the interaction between the intelligent agent and its environment. Goals, choices, and sources of information are all defined in terms of this interaction. In some sense it is the only thing that is real, and from it all our sense of the world is created. How is this done? How can interaction lead to better behavior, better perception, better models of the world? What are the computational issues in doing this efficiently and in realtime? These are the sort of questions that I ask in trying to understand what it means to be intelligent, to predict and influence the world, to learn, perceive, act, and think. In practice, I work primarily in reinforcement learning as an approach to artificial intelligence. I am exploring ways to represent a broad range of human knowledge in an empirical form–that is, in a form directly in terms of experience–and in ways of reducing the dependence on manual encoding of world state and knowledge.
- Andrew G. Barto : Most of my recent work has been about extending reinforcement learning methods so that they can work in real-time with real experience, rather than solely with simulated experience as in many of the most impressive applications to date. Of particular interest to me at present is what psychologists call intrinsically motivated behavior, meaning behavior that is done for its own sake rather than as a step toward solving a specific problem of clear practical value. What we learn during intrinsically motivated behavior is essential for our development as competent autonomous entities able to efficiently solve a wide range of practical problems as they arise. Recent work by my colleagues and me on what we call intrinsically motivated reinforcement learning is aimed at allowing artificial agents to construct and extend hierarchies of reusable skills that form the building blocks for open-ended learning. Visit the Autonomous Learning Laboratory page for some more details.
There was a piece on BBC Business Daily on social network moderators. Aside from it being a horrible job, the show touched on how international criminal cases often rest on video uploaded to services like Twitter and Facebook. This process worked as long as the moderators were human and could tell the difference between criminal activity and the documentation of criminal activity, but now with ML solutions being implemented, these videos are being deleted. First, this shows how ad-hoc the usage of these networks is as a place for legal and journalistic activity. Second, it shows the need for a mechanism that is built to support these activities, where there is a more expansive role of reporter/researcher and editor. This is near the center of gravity for the TACJOUR project.
Flying home yesterday, I was thinking about how the maps need to get built. One way of thinking about it is that you are given a set of directions that run through a geographic area and have to build a map from that. We know the adjacencies by the sequence of the directions. It follows that we should be able to build a map by overlaying all the routes in an n-dimensional space. I was then reading Technical Perspective: Exploring a Kingdom by Geodesic Measures, and at least some of the concepts appear related. In the case of the game at least, we have the center ‘post’, which is the discussion starting point. The discussion is (can be) a random walk towards the poles created in that iteration. Multiple walks create multiple paths over this unknown Manifold. I’m thinking that this should be enough information to build a self organizing map. This might help: Visual analysis of self-organizing maps
- Had some discussions with Aaron about this. It should be pretty straightforward to build a map, grid or hex that trajectories can be recorded from. Then the trajectories can be used to reconstruct the map. Success is evaluated by the similarity between the source map and the reconstructed one.
- I could also add recorded trajectories to the generated spreadsheet. It could be a list of cells that the agent traverses. Comparing explore, flocking and stampede behaviors in their reconstructed maps?
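
A minimal sketch of the record-then-reconstruct test described above, under stated assumptions: the source map is a small square grid, trajectories are plain random walks, the "map" is just the set of cell adjacencies, and similarity is the Jaccard overlap of the two adjacency sets. All function names are hypothetical. This overlays routes directly and does not attempt a self-organizing map.

```python
import random
from typing import FrozenSet, List, Set, Tuple

Cell = Tuple[int, int]
Edge = FrozenSet[Cell]


def grid_edges(width: int, height: int) -> Set[Edge]:
    """Source map: every pair of horizontally or vertically adjacent cells."""
    edges = set()
    for x in range(width):
        for y in range(height):
            if x + 1 < width:
                edges.add(frozenset({(x, y), (x + 1, y)}))
            if y + 1 < height:
                edges.add(frozenset({(x, y), (x, y + 1)}))
    return edges


def random_walk(width: int, height: int, steps: int,
                rng: random.Random) -> List[Cell]:
    """Record the list of cells an agent traverses."""
    x, y = rng.randrange(width), rng.randrange(height)
    path = [(x, y)]
    for _ in range(steps):
        moves = [(x + dx, y + dy)
                 for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                 if 0 <= x + dx < width and 0 <= y + dy < height]
        x, y = rng.choice(moves)
        path.append((x, y))
    return path


def reconstruct(trajectories: List[List[Cell]]) -> Set[Edge]:
    """Adjacency is inferred from consecutive cells in each trajectory."""
    return {frozenset({a, b})
            for path in trajectories for a, b in zip(path, path[1:])}


def similarity(source: Set[Edge], rebuilt: Set[Edge]) -> float:
    """Jaccard overlap between source and reconstructed adjacencies."""
    return len(source & rebuilt) / len(source | rebuilt)


rng = random.Random(0)
source = grid_edges(4, 4)
few = reconstruct([random_walk(4, 4, 5, rng)])
many = reconstruct([random_walk(4, 4, 50, rng) for _ in range(200)])
print(similarity(source, few), similarity(source, many))
```

Swapping the random walk for trajectories recorded under explore, flocking and stampede behaviors would give one similarity score per behavior to compare.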
Continuing with From Keyword Search to Exploration
- The mSpace Browser is a multi faceted column based client for exploring large data sets in the way that makes sense to you. You decide the columns and the order that best suits your browsing needs.
- Yippy search
- Exalead search
- pg 62, animation
Continuing along with Angular
Multiple discussions with Aaron about next steps, particularly for anomaly detection

Phil 11.9.17

Instagram, Meme Seeding, and the Truth about Facebook Manipulation, Pt. 1

Jonathan Albright is the Research Director at the Tow Center for Digital Journalism. Previously an assistant professor of media analytics in the school of communication at Elon University, Dr. Albright’s work focuses on the analysis of socially-mediated news events, misinformation/propaganda, and trending topics, applying a mixed-methods, investigative data-driven storytelling approach.
The last couple of weeks have brought us the first new major revelations about the reach and scope of the IRA media influence campaign. Yet the most important development about the ongoing Facebook investigation isn’t the tenfold increase in the company’s updated estimate of the organic reach of “ads” on its platform.

While the estimate increasing the reach of IRA content from 10 million people to 126 million people is surely a leap, after last week’s testimony, the real question we should be asking is: how did we suddenly arrive at 150 million?

The answer is Instagram.

Reading The Group Polarization Phenomenon working on the PolarizationGame. Some thoughts:

There needs to be a way for each player to state their support/oppose state on a slider before the debate begins. We could even color code the threads using that information, though maybe only when viewing after the debate is complete.
What about teams?

The Emergence of a Fovea while Learning to Attend

Everything is about how we deal as individuals and groups with imperfect information. Which is why an attention-based economy is crazy.

Identifying Dogmatism in Social Media: Signals and Models

We explore linguistic and behavioral features of dogmatism in social media and construct statistical models that can identify dogmatic comments. Our model is based on a corpus of Reddit posts, collected across a diverse set of conversational topics and annotated via paid crowdsourcing. We operationalize key aspects of dogmatism described by existing psychology theories (such as over-confidence), finding they have predictive power. We also find evidence for new signals of dogmatism, such as the tendency of dogmatic posts to refrain from signaling cognitive processes. When we use our predictive model to analyze millions of other Reddit posts, we find evidence that suggests dogmatism is a deeper personality trait, present for dogmatic users across many different domains, and that users who engage on dogmatic comments tend to show increases in dogmatic posts themselves.

Phil 10.27.17

7:00 – 5:00 ASRC MKT

Nicely written paper on GANs:
- Abstract: We describe a new training methodology for generative adversarial networks. The key idea is to grow both the generator and discriminator progressively: starting from a low resolution, we add new layers that model increasingly fine details as training progresses. This both speeds the training up and greatly stabilizes it, allowing us to produce images of unprecedented quality, e.g., CELEBA images at 1024². We also propose a simple way to increase the variation in generated images, and achieve a record inception score of 8.80 in unsupervised CIFAR10. Additionally, we describe several implementation details that are important for discouraging unhealthy competition between the generator and discriminator. Finally, we suggest a new metric for evaluating GAN results, both in terms of image quality and variation. As an additional contribution, we construct a higher-quality version of the CELEBA dataset.
- With cool video
- And code
Working on adding UI and batch interaction for the adversarial herding
- Enable/disable switch – Done
- Field for power – don’t know what the scale should be so no slider yet – Done
- Map<String, Map<FlockingShape, weight>>. If this doesn’t work, make shape comparable by name. Done!
```
HashMap<FlockingShape, Double> alignedShapeMap;
if(flock.size() > 0 && !alignedFlockMap.containsKey(flockName)){
    alignedShapeMap = new HashMap<>();
    alignedFlockMap.put(flockName, alignedShapeMap);
}else{
    alignedShapeMap = alignedFlockMap.get(flockName);
}
```
- Do I want to delay the triggering of the herding on a separate timer? Waiting on this.
- It’s done, and the results are kind of scary. If I set the weight of the herder to 15, I can change the flocking behavior of the default to echo chamber.
- Normal:
- Herding weight set to 15, other options the same:
Did some additional tweaking to see if having highly-weighted herders ignore each other (they would be coordinated through C&C) would have any effect. It doesn’t. There is enough interaction through the regular populations to keep the alignment space reduced.
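
A toy illustration of why a single heavily weighted herder can dominate, assuming all-to-all influence and headings only. The function names and the ten-agent setup are hypothetical and this is not the simulator's code. The flock starts out heading "north", the herder always points "east", and each step every agent adopts the weighted mean heading.

```python
import math


def step(headings, herder_heading, herder_weight):
    """All-to-all alignment: every agent turns to the weighted mean heading,
    where the herder's heading counts `herder_weight` times."""
    sx = sum(math.cos(a) for a in headings) + herder_weight * math.cos(herder_heading)
    sy = sum(math.sin(a) for a in headings) + herder_weight * math.sin(herder_heading)
    new_heading = math.atan2(sy, sx)
    return [new_heading] * len(headings)


def herded_fraction(herder_weight, n_agents=10, steps=1):
    """Cosine between the flock heading and the herder heading after `steps`:
    0.0 means unchanged (still north), 1.0 means fully turned east."""
    headings = [math.pi / 2] * n_agents  # flock starts heading north
    for _ in range(steps):
        headings = step(headings, 0.0, herder_weight)  # herder points east
    return math.cos(headings[0])


for weight in (1, 15):
    print("weight {:2d}: 1 step {:.3f}, 20 steps {:.3f}".format(
        weight, herded_fraction(weight), herded_fraction(weight, steps=20)))
```

In this toy version a herder at weight 15 outvotes the ten regular agents from the first step, while at weight 1 it only nudges them.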
It looks like there is a ‘sick echo chamber’ pattern. If the borders are reflective, and the herding weight + influence radius is great enough, then a wall-hugging pattern will emerge.
- The influence weight is sort of a credibility score. An agent that has a lot of followers, or says a lot of the things that I agree with, has a lot of influence weight. The range weight is reach.
- Since a troll farm or botnet can be regarded as a single organization, interacting with any one of the agents is really interacting with the root entity. So a herding agent has high influence and high reach. The high reach explains the border hugging behavior.
- It’s like there’s someone at the back of the stampede yelling YOU’RE GOING THE RIGHT WAY! KEEP AT IT! And they never go off the cliff because they are a swarm. Or, it never goes off the cliff, because it manifests as a swarm.
- A loud, distributed voice pointing in a bad direction means wall hugging. Note that there is some kind of floating point error that lets wall huggers creep off the edge.
- With a respawn border, we get the situation where the overall heading of the flock doesn’t change even as it gets destroyed as it goes over the border. Again, since the herding algorithm is looking at the overall population, it never crosses the border but influences all the respawned agents to head towards the same edge.
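
A minimal Python sketch of the weighting idea in the notes above, not the simulator’s actual code. It assumes that an agent’s next heading is the influence-weighted average of the headings it can see, and that a herder behaves like one extra neighbor that always points along the mean heading of the whole population. The function names and the sample headings are made up for illustration.

```python
import math

def mean_heading(headings):
    """Average a list of headings (radians) by summing unit vectors."""
    x = sum(math.cos(h) for h in headings)
    y = sum(math.sin(h) for h in headings)
    return math.atan2(y, x)

def next_heading(neighbors):
    """neighbors: list of (heading_radians, influence_weight) pairs."""
    x = sum(w * math.cos(h) for h, w in neighbors)
    y = sum(w * math.sin(h) for h, w in neighbors)
    return math.atan2(y, x)

# Hypothetical local neighborhood: three unit-weight agents with spread-out headings.
local = [(0.0, 1.0), (math.pi / 2, 1.0), (math.pi, 1.0)]

# Hypothetical population whose overall heading is roughly 0 radians.
population = [0.1, -0.1, 0.2, -0.2, 0.0]
herder = (mean_heading(population), 15.0)  # weight of 15, as in the notes

without_herder = next_heading(local)
with_herder = next_heading(local + [herder])
print(math.degrees(without_herder), math.degrees(with_herder))
```

Under these assumptions the single heavily weighted herder pulls the agent’s heading most of the way to the population heading, which is consistent with the reduced alignment space described above.
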
Paper thoughts:
- Armies have different patterns from emergent groups. They are imposed formations and reflect a commander’s will
- From a distance, they look different, but close up, they may look the same. One of the reasons for the success of the Roman Legion was the use of formations against the less sophisticated structures of their adversaries [ref]

Phil 9.21.17

6:00 – 10:30, 1:00 – 6:00 ASRC MKT

I think there is a difference between exploring, a deliberate exposure to things unknown, and serendipity, an accidental encounter with the unknown. In the first case, the mind is prepared for the situation. In the second, the mind needs to be receptive to the serendipity. I think that design may matter a lot here. A serendipitous result low on a list may not have the same impact as a point on a map or a line in a story.
Oxford English Dictionary’s definitions of:
- serendipity: “the faculty of making happy and unexpected discoveries by accident”.
- explore: An act of exploring an unfamiliar place; an exploration, an excursion.
- discover: To disclose, reveal, etc., to others or (later) oneself; to find out.
- sagacity: Acuteness of mental discernment; aptitude for investigation or discovery; keenness and soundness of judgement in the estimation of persons and conditions, and in the adaptation of means to ends; penetration, shrewdness.
- synchronicity: the phenomenon of events which coincide in time and appear meaningfully related but have no discoverable causal connection.
Skimming these
- The bohemian bookshelf: supporting serendipitous book discoveries through information visualization
  - A Thudt, U Hinrichs, S Carpendale
  - Serendipity, a trigger of exciting discoveries when we least expect it, is currently being discussed as an often neglected but still important factor in information seeking processes, research, and ideation. In this paper we explore serendipity as an information visualization goal. In particular, we introduce the Bohemian Bookshelf visualization that aims to support serendipitous exploration of digital book collections. The Bohemian Bookshelf consists of five interlinked visualizations, each representing a unique (over)view of the collection. It facilitates serendipitous discoveries by (1) offering multiple access points by providing visualizations of different perspectives on the book collection, (2) enticing curiosity through abstract, metaphorical, and visually distinct representations of the collection, (3) highlighting alternate adjacencies between books, (4) providing multiple pathways for exploring the data collection in a flexible way, (5) supporting immediate previews of books, and (6) enabling a playful approach to information exploration. Our design goals and their exploration through the Bohemian Bookshelf visualization opens up a discussion on how to promote serendipity through information visualization.
  - six design goals that we have derived for promoting serendipitous discoveries through information visualization.
  - Austin coined the term altamirage that describes serendipitous discoveries as a result of chance paired with individual traits of the exploring person [2, 29].
  - This is closely related to the notion of synchronicity where related ideas may manifest as simultaneous occurrences that seem acausal but still meaningful [29].
  - The prevalence of these ideas of chance, fortuity, and coincidence in the discussion around serendipity has led to a tendency to trivialize this complex concept by assuming that serendipity can be supported simply through the introduction of randomness.
  - The design of the Bohemian Bookshelf offers multiple pathways through the book collection by (1) providing multiple interactive overviews of the book collection that can guide the information seeker into different and interesting directions, (2) the presentation of adjacent data that can act as visual signposts providing alternatives for the viewer to move through the dataset by following up on related books, and (3) emphasizing cross visualization attributes by mutual highlighting as in coordinated views [3, 7]
  - multiple pathways through the book collection that can provide guidance in a serendipitous way. The visual overviews can provide one way of exploring books. For instance, visitors can systematically browse through all books of their favourite colour and, in this way, possibly encounter books that are of interest to them but that they did not think of to search for directly. Furthermore, emphasizing adjacent books can be considered as visual signposts. For instance, following up on highlighted books in the Book Pile is likely to rapidly guide people serendipitously to different topical areas of the book collection. As a third approach to multiple pathways, all visualizations of the Bohemian Bookshelf are interlinked with each other. Therefore, every selection of a book in one visualization can be considered a cross road to the other visualizations that highlight this selection as well in their particular context.
  - We deliberately designed the Bohemian Bookshelf to provide multiple overviews of the entire book collection to provide opportunities to discover unexpected trends and relations within the collection.
- Discovery is never by chance: designing for (un)serendipity – finished. Good paper!
  - P André, J Teevan, ST Dumais
  - Serendipity has a long tradition in the history of science as having played a key role in many significant discoveries. Computer scientists, valuing the role of serendipity in discovery, have attempted to design systems that encourage serendipity. However, that research has focused primarily on only one aspect of serendipity: that of chance encounters. In reality, for serendipity to be valuable chance encounters must be synthesized into insight. In this paper we show, through a formal consideration of serendipity and analysis of how various systems have seized on attributes of interpreting serendipity, that there is a richer space for design to support serendipitous creativity, innovation and discovery than has been tapped to date. We discuss how ideas might be encoded to be shared or discovered by “association-hunting” agents. We propose considering not only the inventor’s role in perceiving serendipity, but also how that inventor’s perception may be enhanced to increase the opportunity for serendipity. We explore the role of environment and how we can better enable serendipitous discoveries to find a home more readily and immediately.
    - there is “no discovery of a thing you are looking for”
    - However, most systems designed to induce or facilitate serendipity have focused on the first aspect, subtly encouraging chance encounters, while ignoring the second part, making use of those encounters in a productive way.
    - Especially, however, we want to offer approaches to get at the desired effect of serendipity: insight
    - For us, serendipity is:
      1. the finding of unexpected information (relevant to the goal or not) while engaged in any information activity,
      2. the making of an intellectual leap of understanding with that information to arrive at an insight
    - In our study, a number of participants remarked that they thought of themselves as ‘serendipitous’, and were surprised to find no instances of it in their search behaviour.
      - This is because exploring is not serendipity. See first point above
    - Click entropy, a direct measure of how varied the result clicks are for the query, was found to be significant. That is, a positive correlation between entropy and the number of potentially serendipitous results suggests that people may have clicked varied results not just because they could not find what they wanted, but because they considered more things interesting, or were more willing to go off at a tangent.
    - Arguably however, almost all visualization systems are designed to support such a goal: identifying interesting, but unknown, trends or patterns in data that would not have been visible otherwise.
    - Erdelez’s [12] so-called ‘super-encounterers’, encountering unexpected information on a regular basis, even counting on it as an important element in information acquisition.
    - Instead of treating serendipity as arcane, mysterious and accidental, we embrace the ability of computers to help us perceive connections and opportunities in various pieces of information
    - presenting such information to users has the potential to increase the overall information the user must interact with. This can lead to two problems: distraction or overload, and the negative consequences of incorrect or problematic recommendations or assumptions
    - It is widely acknowledged that serendipitous discoveries are preceded by a period of preparation and incubation [7]. They are, in that respect, not as ‘serendipitous’ as we might expect, being the product of mental preparation as well as of an open and questioning mind
    - The challenge from a design perspective may not necessarily be discovering domain literature opportunities, but defining mechanisms for presenting these suggestions in ways that are effective for the investigator. Further to creating a reading list is defining the space to deliver them opportunistically
    - This idea again supposes a form of common language model, a way to express interest or expertise in particular areas, and a way to search for results.
    - In this spectrum, we have also demonstrated that computer science has spent most of its design effort perhaps overly focused on trying to create insight (effect of serendipity), by recreating the cause (chance), rather than on, for instance, increasing the rate and accuracy of proposed candidates for serendipitous insight, or developing domain expertise
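
The click-entropy measure mentioned in the notes on the André et al. paper can be sketched in a few lines of Python. This assumes the usual Shannon-entropy form over the results clicked for a single query; the paper’s exact definition is not reproduced in these notes, and the function name and the sample URLs are hypothetical.

```python
import math
from collections import Counter

def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the clicked results for one query."""
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no clicks, so no variation to measure
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Everyone clicks the same result: no variation.
focused = click_entropy(["a.com", "a.com", "a.com", "a.com"])
# Clicks spread evenly over four results: maximum variation for four options.
varied = click_entropy(["a.com", "b.com", "c.com", "d.com"])
print(focused, varied)
```

Higher values mean the clicks for a query are spread over more results, which is the variation the paper found to correlate with potentially serendipitous results.
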

Ordered this, too: Information Visualization: Beyond the Horizon. Has quite a bit on maps that’s going to be needed in the implications for design section
What is a Diagram?
- This paper responds to renewed interest in the centuries old question of what is a diagram. Existing status of our understanding of diagrams is seen as unsatisfactory and confusing. This paper responds to this by proposing a framework for understanding diagrams based on symbolic and spatial mapping. The framework deals with some complex problems any useful definition of diagrams has to deal with. These problems are the variety of diagrams, meaningful dynamics of diagramming, handling change in diagrams in a well formed way, and all of this in the context of semantically mixed diagrams. A brief description of the framework is given discussing how it addresses the problems.
Supporting serendipity: Using ambient intelligence to augment user exploration for data mining and web browsing.
- Has some very Research-Browser-ish bits in it
- an agent-based system to support internet browsing. It models the user’s behaviour to look ahead at linked web pages and their word frequencies, using a Bayesian approach to determine relevance. It then colours links on the page depending on their relevance. In evaluation, the colouring was seen as successful, with people tending to follow the strongly advised links most of the time.
Retroactive answering of search queries
- Major search engines currently use the history of a user’s actions (e.g., queries, clicks) to personalize search results. In this paper, we present a new personalized service, query-specific web recommendations (QSRs), that retroactively answers queries from a user’s history as new results arise. The QSR system addresses two important subproblems with applications beyond the system itself: (1) Automatic identification of queries in a user’s history that represent standing interests and unfulfilled needs. (2) Effective detection of interesting new results to these queries. We develop a variety of heuristics and algorithms to address these problems, and evaluate them through a study of Google history users. Our results strongly motivate the need for automatic detection of standing interests from a user’s history, and identifies the algorithms that are most useful in doing so. Our results also identify the algorithms, some which are counter-intuitive, that are most useful in identifying interesting new results for past queries, allowing us to achieve very high precision over our data set.

Phil 9.12.17

7:00 – 5:00 ASRC MKT

Meeting with Wayne yesterday after Fika. Get him a draft by the end of the week to discuss Monday?
More writing
Herding in humans (Ramsey M. Raafat, Nick Chater, and Chris Frith)
- Herding is a form of convergent social behaviour that can be broadly defined as the alignment of the thoughts or behaviours of individuals in a group (herd) through local interaction and without centralized coordination. We suggest that herding has a broad application, from intellectual fashion to mob violence; and that understanding herding is particularly pertinent in an increasingly interconnected world. An integrated approach to herding is proposed, describing two key issues: mechanisms of transmission of thoughts or behaviour between agents, and patterns of connections between agents. We show how bringing together the diverse, often disconnected, theoretical and methodological approaches illuminates the applicability of herding to many domains of cognition and suggest that cognitive neuroscience offers a novel approach to its study.
Alignment in social interactions (M.Gallotti, M.T.Fairhurst, C.D.Frith)
- According to the prevailing paradigm in social-cognitive neuroscience, the mental states of individuals become shared when they adapt to each other in the pursuit of a shared goal. We challenge this view by proposing an alternative approach to the cognitive foundations of social interactions. The central claim of this paper is that social cognition concerns the graded and dynamic process of alignment of individual minds, even in the absence of a shared goal. When individuals reciprocally exchange information about each other’s minds processes of alignment unfold over time and across space, creating a social interaction. Not all cases of joint action involve such reciprocal exchange of information. To understand the nature of social interactions, then, we propose that attention should be focused on the manner in which people align words and thoughts, bodily postures and movements, in order to take one another into account and to make full use of socially relevant information.
Herding and escaping responses of juvenile roundfish to square mesh window in a trawl cod end (This is the only case I can find of 3-D stampeding. Note the [required?] dimension reduction)
- The movements of juvenile roundfish, mainly haddock Melanogrammus aeglefinus and whiting Merlangius merlangus, reacting to a square mesh window in the cod end of a bottom trawl were observed during fishing experiments in the North Sea. Two typical behavioral responses of roundfish are described as the herding response and the escaping response, which were analyzed from video recordings by time sequences of the movement parameters. It was found that most of the actively escaping fish approached the square mesh window at right angles by swimming straight ahead with very little change in direction, while most of the herded fish approached the net at obtuse angles and retreated by sharp turning. The herding and escaping responses showed significant difference when characterized by frequency distributions of swimming speed and angular velocity, and both responses showed large and irregular variations in swimming movement parameters like the panic erratic responses. It is concluded that an escaping or herding response to the square mesh window could be decided by an interaction between the predictable parameters that describe the stimuli of net and angular changes of fish response, such as approaching angle, turning angle and angular velocity.
Assessing the Effect of “Disputed” Warnings and Source Salience on Perceptions of Fake News Accuracy
- What are effective techniques for combating belief in fake news? Tagging fake articles with “Disputed by 3rd party fact-checkers” warnings and making articles’ sources more salient by adding publisher logos are two approaches that have received large-scale rollouts on social media in recent months. Here we assess the effect of these interventions on perceptions of accuracy across seven experiments (total N=7,534). With respect to disputed warnings, we find that tagging articles as disputed did significantly reduce their perceived accuracy relative to a control without tags, but only modestly (d=.20, 3.7 percentage point decrease in headlines judged as accurate). Furthermore, we find a backfire effect – particularly among Trump supporters and those under 26 years of age – whereby untagged fake news stories are seen as more accurate than in the control. We also find a similar spillover effect for real news, whose perceived accuracy is increased by the presence of disputed tags on other headlines. With respect to source salience, we find no evidence that adding a banner with the logo of the headline’s publisher had any impact on accuracy judgments whatsoever. Together, these results suggest that the currently deployed approaches are not nearly enough to effectively undermine belief in fake news, and new (empirically supported) strategies are needed.
Some meetings on marketing. Looks like we’re trying to get on this panel. Wrote bioblurbs!
More writing. Reasonable progress.

Phil 6.15.16

7:00 – 10:00, 12:00 – 4:00 VTX

Got the official word that I should be charging the project for research. Saved the email this time.
Continuing to work on the papers list
And in the process of looking at Daniele Quercia’s work, I found Auralist: introducing serendipity into music recommendation which was cited by
An investigation on the serendipity problem in recommender systems. Which has the following introduction:
- In the book ‘‘The Filter Bubble: What the Internet Is Hiding from You’’, Eli Pariser argues that Internet is limiting our horizons (Parisier, 2011). He worries that personalized filters, such as Google search or Facebook delivery of news from our friends, create individual universes of information for each of us, in which we are fed only with information we are familiar with and that confirms our beliefs. These filters are opaque, that is to say, we do not know what is being hidden from us, and may be dangerous because they threaten to deprive us from serendipitous encounters that spark creativity, innovation, and the democratic exchange of ideas. Similar observations have been previously made by Gori and Witten (2005) and extensively developed in their book ‘‘Web Dragons, Inside the Myths of Search Engine Technology’’ (Witten, Gori, & Numerico, 2006), where the metaphor of search engines as modern dragons or gatekeepers of a treasure is justified by the fact that ‘‘the immense treasure they guard is society’s repository of knowledge’’ and all of us accept dragons as mediators when having access to that treasure. But most of us do not know how those dragons work, and all of us (probably the search engines’ creators, either) are not able to explain the reason why a specific web page ranked first when we issued a query. This gives rise to the so called bubble of Web visibility, where people who want to promote visibility of a Web site fight against heuristics adopted by most popular search engines, whose details and biases are closely guarded trade secrets.
- Added both papers to the corpus. Need to read and code. What I’m doing is different in that I want to add a level of interactivity to the serendipity display that looks for user patterns in how they react to the presented serendipity and incorporate that pattern into a trustworthiness evaluation of the web content. I’m also doing it in Journalism, which is a bit different in its constraints. And I’m trying to tie it back to Group Polarization and opinion drift.
Also, Raz Schwartz at Facebook: Editorial Algorithms: Using Social Media to Discover and Report Local News
Working on getting all html and pdf files in one matrix
Spent the day chasing down a bug where if the string being annotated is too long (I’ve set the number of words to 60), then we skip. This leads to a divide by zero issue. Fixed now

Phil 5.5.16

7:00 – 5:30 VTX

Continuing An Introduction to the Bootstrap.
This helped a lot. I hope it’s right…
Had a thought about how to build the Bootstrap class. Build it using RealVector and then use Interface RealVectorPreservingVisitor to do whatever calculation is desired. Default methods for Mean, Median, Variance and StdDev. It will probably need arguments for max iteration and epsilon.
Didn’t do that at all. Wound up using ArrayRealVector for the population and Percentile to hold the mean and variance values. I can add something else later
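
For reference, the resampling idea can be sketched in plain Python. This is not the Java class described above (which uses ArrayRealVector and Percentile); it is a minimal percentile bootstrap of the mean, with made-up data, hypothetical function names, and a simple index lookup in place of interpolated percentiles.

```python
import random
import statistics

def bootstrap_means(population, iterations=1000, seed=0):
    """Resample with replacement and return the sorted list of resample means."""
    rng = random.Random(seed)
    n = len(population)
    means = []
    for _ in range(iterations):
        resample = [rng.choice(population) for _ in range(n)]
        means.append(statistics.fmean(resample))
    means.sort()
    return means

def percentile_interval(sorted_values, lower=2.5, upper=97.5):
    """Read off the values at the given percentiles (index lookup, no interpolation)."""
    last = len(sorted_values) - 1
    lo = sorted_values[round(last * lower / 100)]
    hi = sorted_values[round(last * upper / 100)]
    return lo, hi

population = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0, 12.0]  # made-up counts
means = bootstrap_means(population)
low, high = percentile_interval(means)
print("sample mean:", statistics.fmean(population))
print("95% bootstrap interval for the mean:", (low, high))
```

The same loop works for the median or variance by swapping the statistic computed on each resample.
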
To capture how the centrality affects the makeup of the data in a matrix, I think it makes sense to use the normalized eigenvector to multiply the counts in the initial matrix and submit that population (the whole matrix) to the Bootstrap.
Meeting with Wayne? Need to finish tool updates though.
Got bogged down in understanding the Percentile class and how binomial distributions work.
Built and then fixed a copy ctor for Labeled2DMatrix.
Testing. It looks ok, but I want to try multiplying the counts by the eigenVec. Tomorrow.
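
A rough Python sketch of what multiplying the counts by the eigenvector could look like. Several things here are assumptions rather than the actual Java implementation: that the eigenvector comes from a symmetric row-by-row matrix derived as counts times its transpose, that it is found by power iteration, and that each row of counts is scaled by its own eigenvector entry before the whole matrix is flattened into one population. All names and numbers are hypothetical.

```python
def row_similarity(counts):
    """Derive a symmetric row-by-row matrix as counts times its transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in counts] for r1 in counts]

def principal_eigenvector(sym, iterations=200):
    """Power iteration on a square symmetric matrix; returns a unit-length vector."""
    n = len(sym)
    vec = [1.0 / n ** 0.5] * n
    for _ in range(iterations):
        nxt = [sum(sym[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm == 0.0:
            return vec  # all-zero matrix: keep the starting vector
        vec = [v / norm for v in nxt]
    return vec

def weighted_population(counts):
    """Scale each row by its eigenvector entry and flatten to one population."""
    vec = principal_eigenvector(row_similarity(counts))
    return [vec[i] * value for i, row in enumerate(counts) for value in row]

counts = [[3, 0, 1],
          [2, 1, 0],
          [0, 0, 4]]  # made-up counts
population = weighted_population(counts)
print(population)
```

The flattened list is what would then be handed to the bootstrap as the population.
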

Phil 5.2.16

7:00 – 3:00 VTX

How to get funding using Web of Science
http://www.grants.gov/web/grants/search-grants.html
http://www.research.gov/
Finished Supporting Reflective Public Thought with ConsiderIt
- Watched the ConsiderIt demo. I love the histogram that shows how the issue polarization is characterized.
Back to Informed Citizenship in a Media-Centric Way of Life
- Page 225 – Conclusions: As prescriptive as it may sound, it is time to suspend the normative traditions that envelop journalism and democracy, take stock of how knowledge is explicated and operationalized, and calibrate research practice to accommodate an explication of informed citizenship and democratic participation fitted to contemporary life. Doing so strays from the dominant research paradigm, grounded in convictions about the supremacy of rational thought, verbal information, news as cold hard facts, and electoral activities as the gold standard of participatory practices. We advanced arguments for a departure from tradition and elaborated on how the very notions of informed citizenship and political participation are mutating in (and because of) the current media environment.
- And this is kind of scary: Freedom is on the longest global downward trajectory in 40 years (Freedom House, 2011), democratic failure is at the highest rate since the mid-1980s (Diamond, 1999), and there are indicators of qualitative erosion in democratic practice worldwide (Bertelsmann Foundation, 2012). The people’s view on democratic life appears tepid, in several parts of the world, there are reports of a so-called authoritarian nostalgia among citizens who live in Asian countries that are transforming to democratic systems of governance (Chang, Chu, & Park, 2007) while a mere half (or fewer) of Russians, Poles, Ukrainians, and Indonesians expressed strong support for democratic rule (World Public Opinion.org, 2015).
  - Make America Great Again.
- Done. Reading this makes me feel more like a connectivist/AI revolution is coming that will either tend towards isolating us more or finding ways to bring us together. The thing is that we’re wired to do both. So this really is a design problem.
————————————
Well drat, was going to do some light work on developing the ranking app, but it looks like I forgot to check in the latest version of Java Utils
Installed Launch4j
TODO:
- Add a ‘session name’ text field – done
- Add an ‘interactive’ checkbox. If it’s selected, then a change in the weight slider will fire calculate(). Done
- Fixed the ‘Reset Weights’
- Got the ‘Use Unit Weights’ option working. I just replace all the non-zero values in the derived symmetric matrix with 1.0. I have a suspicion that this will come back to bite me, but for now I can’t think of a reason. The only thing that I really don’t like is that there is no obvious change in the data. The ‘Weights’ column actually means ‘scalar’. The issue is that the whole matrix would have to be shown, since the weight exists at the intersection of two items. So a row or column is sort of a sum of weights.
- Start TF-IDF app. It should do the following:
  - Take a list of URIs (local or remote, pdf, html, text). These are the documents
  - Read each of the documents into a data structure that has
    - Document title
    - Keywords (if called out)
    - Word list (lemmatized)
      - Word
      - Document count
      - Parts Of Speech(?)
  - Run TF-IDF to produce an ordered list of terms
  - Build a co-occurrence matrix of terms and documents
  - Output matrix to Excel.
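
A small Python sketch of the matrix steps in the list above, using plain lists instead of the Java matrix classes. It assumes the documents have already been read and tokenized, builds a term-by-document count matrix, derives a symmetric term-by-term matrix from it, and applies the ‘Use Unit Weights’ idea of replacing every non-zero value with 1.0. The function names and the sample documents are hypothetical, and the Excel output step is left out.

```python
def term_document_matrix(docs, terms):
    """Rows are terms, columns are documents, cells are raw term counts."""
    return [[doc.count(term) for doc in docs] for term in terms]

def symmetric_from(matrix):
    """Term-by-term matrix: the count matrix times its transpose."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix] for r1 in matrix]

def unit_weights(matrix):
    """Replace every non-zero value with 1.0 and leave zeros as 0.0."""
    return [[1.0 if value != 0 else 0.0 for value in row] for row in matrix]

# Made-up, already-tokenized documents.
docs = [["flock", "herd", "flock"],
        ["herd", "echo"],
        ["echo", "chamber", "echo"]]
terms = sorted({term for doc in docs for term in doc})

matrix = term_document_matrix(docs, terms)
sym = symmetric_from(matrix)
unit = unit_weights(sym)
for term, row in zip(terms, unit):
    print(term, row)
```

With unit weights only the presence of a connection between two terms survives, which matches the concern above that the change is not obvious unless the whole matrix is shown.
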
The end of a good day:

Phil 4.29.16

7:00 – 5:00 VTX

Expense reports and timesheets! Done.
Continuing Informed Citizenship in a Media-Centric Way of Life
- The pertinence interface may be an example of a UI affording the concept of monitorial citizenship.
  - Page 219: The monitorial citizen, in Schudson’s (1998) view, does environmental surveillance rather than gathering in-depth information. By implication, citizens have social awareness that spans vast territory without having in-depth understanding of specific topics. Related to the idea of monitorial instead of informed citizenship, Pew Center (2008) data identified an emerging group of young (18–34) mobile media users called news grazers. These grazers find what they need by switching across media platforms rather than waiting for content to be served.
- Page 222: Risk as Feelings. The abstract is below. There is an emotional hacking aspect here that traditional journalism has used (heuristically?) for most(?) of its history.
  - Virtually all current theories of choice under risk or uncertainty are cognitive and consequentialist. They assume that people assess the desirability and likelihood of possible outcomes of choice alternatives and integrate this information through some type of expectation-based calculus to arrive at a decision. The authors propose an alternative theoretical perspective, the risk-as-feelings hypothesis, that highlights the role of affect experienced at the moment of decision making. Drawing on research from clinical, physiological, and other subfields of psychology, they show that emotional reactions to risky situations often diverge from cognitive assessments of those risks. When such divergence occurs, emotional reactions often drive behavior. The risk-as-feelings hypothesis is shown to explain a wide range of phenomena that have resisted interpretation in cognitive–consequentialist terms.
- At page 223 – Elections as the canon of participation
Working on getting tables to sort – Done
Loading excel file -done
Calculating – done
Using weights -done
Reset weights – done
Saving (don’t forget to add sheet with variables!) – done
Wrapped in executable – done
Uploading to dropbox. Wow – the files with JavaFX are *much* bigger than Swing.

Phil 4.27.16

7:00 – 5:30 VTX

Finished A fistful of bitcoins: characterizing payments among men with no names
- In reading the discussion about ‘peeling’, I wonder if in a similar way, if someone returns to a story repeatedly, would an adversary be able to find out anything useful? Or, if Bitcoin were used to pay for stories, would tracking transactions do anything as well? One of the nice things about using aliases for BC addresses is that other than the initial mapping, the address can be hidden in the system.
- Page 93: ...even the most motivated Bitcoin users (i.e., criminals) are engaging in idioms of use that allow us to erode their anonymity.
  - This is an important point. As with biometrics at the small scale, we are identifiable through our behaviors. In this case, idioms or patterns of usage.
Rating app
- Add people – done
- Add John’s suggestions – done
- Build and deploy – Done. Waiting on Andy.
Write up TF_IDF story
- Basic capability – 11 points
  - The initial part of the effort is to scan over the collection of documents and produce a list of words ordered by TF-IDF. This means iterating over all the documents and producing a Set<String> of words that are then run over the set of documents. The output should be an excel file that lists the documents in the corpus, and the list of words.
    - Documents should be listed in a file (xml?) as URIs. HTML docs can be read by jsoup, PDF by PDFBox.
    - The TF-IDF algorithm is discussed here: https://guendouz.wordpress.com/2015/02/17/implementation-of-tf-idf-in-java/
- Pull pages from approved flags – 3 points
  - The second part of the effort is to use Jeremy’s REST interface to extract the URLs of ‘cleared’ flags to use as the input to the app, via the input file (or call from within the app, though there may be certs issues)
- Report with new term recommendations – 3 points
  - Using the rating app, we should be able to try these new terms and see if they improve results. One of the items will need to be returned from the DB (it’s already stored in the QueryObject2) so we can see if we’re getting cleaner results.
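
The ‘basic capability’ step in the story above can be sketched in Python. This is an illustration under stated assumptions, not the planned Java implementation: documents arrive already tokenized (the jsoup and PDFBox reading is skipped), term frequency is the count divided by document length, inverse document frequency is the natural log of documents over documents containing the term, and a term’s corpus score is the sum of its per-document TF-IDF values. The function name and sample documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf_ranking(docs):
    """docs: list of token lists. Returns (term, score) pairs, highest score first."""
    n_docs = len(docs)
    # Number of documents that contain each term at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[term])
            scores[term] += tf * idf
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Made-up, already-tokenized documents.
docs = [["herding", "agents", "flock", "agents"],
        ["agents", "echo", "chamber"],
        ["agents", "herding", "weight"]]
ranking = tf_idf_ranking(docs)
for term, score in ranking:
    print(term, round(score, 4))
```

A term that appears in every document, such as ‘agents’ here, scores zero, which is the behavior that makes the ranking useful for recommending new search terms.
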
LanguageModelNetworks
- Read in a spreadsheet (xls and xlsx)
- Write out spreadsheets (page containing the data information)
  - File
  - User
  - Date run
  - Settings used
- allow for manipulation of row and column values (in this case, papers and codes, but the possibilities are endless)
  - Select the value to manipulate (reset should be an option)
  - Spinner/entry field to set changes (original value in label)
  - ‘Calculate’ button
  - Sorted list(s) of rows and columns. (indicate +/- change in rank)
- Reset all button
- Normalize all button
- Progress for today! Lots of wiring up to do though:
