Category Archives: Lit Review

Phil 6.7.18

7:00 – 4:30 ASRC MKT

  • Che Dorval
  • Done with the whitepaper! Submitted! Yay! Add to ADP
  • The SLT meeting went well, apparently. Need to determine next steps
  • Back to Bit by Bit. Reading about mass collaboration. eBird looks very interesting. All kinds of social systems involved here.
    • Research
      • Deep Multi-Species Embedding
        • Understanding how species are distributed across landscapes over time is a fundamental question in biodiversity research. Unfortunately, most species distribution models only target a single species at a time, despite strong ecological evidence that species are not independently distributed. We propose Deep Multi-Species Embedding (DMSE), which jointly embeds vectors corresponding to multiple species as well as vectors representing environmental covariates into a common high-dimensional feature space via a deep neural network. Applied to bird observational data from the citizen science project eBird, we demonstrate how the DMSE model discovers inter-species relationships to outperform single-species distribution models (random forests and SVMs) as well as competing multi-label models. Additionally, we demonstrate the benefit of using a deep neural network to extract features within the embedding and show how they improve the predictive performance of species distribution modelling. An important domain contribution of the DMSE model is the ability to discover and describe species interactions while simultaneously learning the shared habitat preferences among species. As an additional contribution, we provide a graphical embedding of hundreds of bird species in the Northeast US.
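A rough sketch of the core DMSE idea (not the authors' actual deep architecture): species and sites both get vectors in a shared feature space, a dot product scores how well a species fits a site, and similarity between species vectors surfaces inter-species relationships. All names, dimensions, and values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 species and 3 environmental covariates, all embedded
# in a shared 4-dimensional feature space (the real DMSE learns these with a
# deep neural network; here they are just random placeholders).
n_species, n_env, dim = 5, 3, 4
species_vecs = rng.normal(size=(n_species, dim))
env_weights = rng.normal(size=(n_env, dim))

def occurrence_scores(env_covariates):
    """Score each species at a site: dot product between the site's
    embedded environment vector and every species vector."""
    site_vec = env_covariates @ env_weights   # (dim,)
    return species_vecs @ site_vec            # (n_species,)

def cooccurrence(i, j):
    """Inter-species affinity: cosine similarity of two species embeddings."""
    a, b = species_vecs[i], species_vecs[j]
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scores = occurrence_scores(np.array([0.2, -1.0, 0.5]))
```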
  • Start fixing This one Simple Trick
    • Highlighted all the specified changes. There are a lot of them!
    • Started working on figure 2, and realized (after about an hour of Illustrator work) that the figure is correct. I need to verify each comment before fixing it!
  • Researched NN anomaly detection. That work seems to have had its heyday in the ’90s, with more conventional (but computationally intensive) methods being preferred these days.
  • I also thought that Dr. Li’s model had a time-orthogonal component for prediction, but I don’t think that’s true. The NN is finding the frequency and bounds on its own.
  • Wrote up a paragraph expressing my concerns and sent to Aaron.

Phil 6.6.18

7:00 – 4:30 ASRC MKT

  • Finished the white paper
  • Peer review of Dr. Li’s AIMS work
  • Computational Propaganda in the United States of America: Manufacturing Consensus Online
    • Do bots have the capacity to influence the flow of political information over social media? This working paper answers this question through two methodological avenues: A) a qualitative analysis of how political bots were used to support United States presidential candidates and campaigns during the 2016 election, and B) a network analysis of bot influence on Twitter during the same event. Political bots are automated software programs that operate on social media, written to mimic real people in order to manipulate public opinion. The qualitative findings are based upon nine months of fieldwork on the campaign trail, including interviews with bot makers, digital campaign strategists, security consultants, campaign staff, and party officials. During the 2016 campaign, a bipartisan range of domestic and international political actors made use of political bots. The Republican Party, including both self-proclaimed members of the “alt-right” and mainstream members, made particular use of these digital political tools throughout the election. Meanwhile, public conversation from campaigners and government representatives is inconsistent about the political influence of bots. This working paper provides ethnographic evidence that bots affect information flows in two key ways: 1) by “manufacturing consensus,” or giving the illusion of significant online popularity in order to build real political support, and 2) by democratizing propaganda through enabling nearly anyone to amplify online interactions for partisan ends. We supplement these findings with a quantitative network analysis of the influence bots achieved within retweet networks of over 17 million tweets, collected during the 2016 US election. The results of this analysis confirm that bots reached positions of measurable influence during the 2016 US election. Ultimately, therefore, we find that bots did affect the flow of information during this particular event. This mixed methods approach shows that bots are not only emerging as a widely-accepted tool of computational propaganda used by campaigners and citizens, but also that bots can influence political processes of global significance.

Phil 6.4.18

7:00 – 4:00 ASRC MKT

  • Got accepted to SASO!
  • Listening to a show about energy use in bitcoin mining. There are ramifications for AI, since that also involves expensive processing.
  • Thinking about the ramifications of ‘defect always’ emerging in a society.
  • More Bit by Bit
  • Quad chart
  • Fika

Phil 6.1.18

7:00 – 6:00 ASRC MKT

  • Bot stampede reaction to “evolution” in a thread about UNIX. In this case, the bots are posting sentiment against the wrong thing. There are layers here though. It can also be advertising. Sort of the dark side of diversity injection.
  • Seems like an explore/exploit morning
  • Autism on “The Leap”: Neurotypical and Neurodivergent (Neurodiversity)
  • From a BBC Business Daily show on Elon Musk
    • Thomas Astebro (Decision Science): The return to independent invention: evidence of unrealistic optimism, risk seeking or skewness loving? 
      • Examining a sample of 1,091 inventions I investigate the magnitude and distribution of the pre‐tax internal rate of return (IRR) to inventive activity. The average IRR on a portfolio investment in these inventions is 11.4%. This is higher than the risk‐free rate but lower than the long‐run return on high‐risk securities and the long‐run return on early‐stage venture capital funds. The portfolio IRR is significantly higher for some ex ante identifiable classes of inventions. The distribution of return is skew: only between 7‐9% reach the market. Of the 75 inventions that did, six realised returns above 1400%, 60% obtained negative returns and the median was negative.
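The IRR figures above can be reproduced mechanically: the IRR is the discount rate at which a cash-flow stream's net present value reaches zero. A minimal sketch with made-up invention cash flows (the numbers are purely illustrative, not from the paper):

```python
def npv(rate, cashflows):
    """Net present value of a cash-flow sequence (cashflows[0] at t=0)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cashflows))

def irr(cashflows, lo=-0.99, hi=10.0, tol=1e-9):
    """Internal rate of return by bisection: find the rate where NPV = 0.
    Assumes exactly one sign change in NPV over [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if npv(lo, cashflows) * npv(mid, cashflows) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Hypothetical invention: $10k up-front cost, then five years of returns.
rate = irr([-10_000, 1_500, 2_500, 3_500, 4_000, 4_500])
```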
  • Myth of first mover advantage
    • Conventional wisdom would have us believe that it is always beneficial to be first – first in, first to market, first in class. The popular business literature is full of support for being first and legions of would-be business leaders, steeped in the Jack Welch school of business strategy, will argue this to be the case. The advantages accorded to those who are first to market defines the concept of First Mover Advantage (FMA). We outline why this is not the case, and in fact, that there are conditions of applicability in order for FMA to hold (and these conditions often do not hold). We also show that while there can be advantages to being first, from an economic perspective, the costs generally exceed the benefits, and the full economics of FMA are usually a losing proposition. Finally, we show that increasingly, we live in a world where FMA is eclipsed by innovation and format change, rendering the FMA concept obsolete (i.e. strategic obsolescence).
  • More Bit by Bit
  • Investigating the Effects of Google’s Search Engine Result Page in Evaluating the Credibility of Online News Sources
    • Recent research has suggested that young users are not particularly skilled in assessing the credibility of online content. A follow up study comparing students to fact checkers noticed that students spend too much time on the page itself, while fact checkers performed “lateral reading”, searching other sources. We have taken this line of research one step further and designed a study in which participants were instructed to do lateral reading for credibility assessment by inspecting Google’s search engine result page (SERP) of unfamiliar news sources. In this paper, we summarize findings from interviews with 30 participants. A component of the SERP noticed regularly by the participants is the so-called Knowledge Panel, which provides contextual information about the news source being searched. While this is expected, there are other parts of the SERP that participants use to assess the credibility of the source, for example, the freshness of top stories, the panel of recent tweets, or a verified Twitter account. Given the importance attached to the presence of the Knowledge Panel, we discuss how variability in its content affected participants’ opinions. Additionally, we perform data collection of the SERP page for a large number of online news sources and compare them. Our results indicate that there are widespread inconsistencies in the coverage and quality of information included in Knowledge Panels.
  • White paper
    • Add something about geospatial mapping of belief.
    • Note that belief maps are cultural artifacts, so comparing someone from one belief space to others in a shared physical belief environment can be roughly equivalent to taking the dot product of the belief space vectors that you need to compare. This could produce a global “alignment map” that can suggest how aligned, opposed, or indifferent a population might be with respect to an intervention, ranging from medical (Ebola teams) to military (special forces operations).
      • Similar maps related to wealth in Rwanda based on phone metadata: Blumenstock, Joshua E., Gabriel Cadamuro, and Robert On. 2015. “Predicting Poverty and Wealth from Mobile Phone Metadata.” Science 350 (6264): 1073–6. https://doi.org/10.1126/science.aac4420
    • Added a section about how mapping belief maps would afford prediction about local belief, since overall state, orientation and velocity could be found for some individuals who are geolocated to that area and then extrapolated over the region.
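The dot-product idea above can be sketched directly: treat each belief vector as a direction in belief space, and use cosine alignment to place individuals on a spectrum from aligned (+1) through indifferent (~0) to opposed (-1) relative to an intervention. The vectors and population below are made up for illustration.

```python
import math

def alignment(belief_a, belief_b):
    """Cosine alignment of two belief-space vectors:
    +1 aligned, ~0 indifferent, -1 opposed."""
    dot = sum(a * b for a, b in zip(belief_a, belief_b))
    norm_a = math.sqrt(sum(a * a for a in belief_a))
    norm_b = math.sqrt(sum(b * b for b in belief_b))
    return dot / (norm_a * norm_b)

def alignment_map(population, intervention):
    """Score every individual's belief vector against an intervention's
    vector, giving a rough population-level 'alignment map'."""
    return [alignment(person, intervention) for person in population]

# Toy population: one aligned, one opposed, one roughly indifferent.
population = [[1.0, 0.5, 0.0], [-1.0, -0.5, 0.0], [0.0, 0.1, 1.0]]
scores = alignment_map(population, [1.0, 0.5, 0.0])
```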

Phil 5.31.18

7:00 – ASRC MKT

  • Via BBC Business Daily, found this interesting post on diversity injection through lunch table size:
  • KQED is playing America Abroad – today on Russian disinfo ops:
    • Sowing Chaos: Russia’s Disinformation Wars 
      • Revelations of Russian meddling in the 2016 US presidential election were a shock to Americans. But it wasn’t quite as surprising to people in former Soviet states and the EU. For years they’ve been exposed to Russian disinformation and slanted state media; before that Soviet propaganda filtered into the mainstream. We don’t know how effective Russian information warfare was in swaying the US election. But we do know these tactics have roots going back decades and will most likely be used for years to come. This hour, we’ll hear stories of Russian disinformation and attempts to sow chaos in Europe and the United States. We’ll learn how Russia uses its state-run media to give a platform to conspiracy theorists and how it invites viewers to doubt the accuracy of other news outlets. And we’ll look at the evolution of internet trolling from individuals to large troll farms. And — finally — what can be done to counter all this?
  • Some interesting papers on the “Naming Game”, a form of coordination where individuals have to agree on a name for something. This means that there is some kind of dimension reduction involved from all the naming possibilities to the agreed-on name.
    • The Grounded Colour Naming Game
      • Colour naming games are idealised communicative interactions within a population of artificial agents in which a speaker uses a single colour term to draw the attention of a hearer to a particular object in a shared context. Through a series of such games, a colour lexicon can be developed that is sufficiently shared to allow for successful communication, even when the agents start out without any predefined categories. In previous models of colour naming games, the shared context was typically artificially generated from a set of colour stimuli and both agents in the interaction perceive this environment in an identical way. In this paper, we investigate the dynamics of the colour naming game in a robotic setup in which humanoid robots perceive a set of colourful objects from their own perspective. We compare the resulting colour ontologies to those found in human languages and show how these ontologies reflect the environment in which they were developed.
    • Group-size Regulation in Self-Organised Aggregation through the Naming Game
      • In this paper, we study the interaction effect between the naming game and one of the simplest, yet most important collective behaviour studied in swarm robotics: self-organised aggregation. This collective behaviour can be seen as the building blocks for many others, as it is required in order to gather robots, unable to sense their global position, at a single location. Achieving this collective behaviour is particularly challenging, especially in environments without landmarks. Here, we augment a classical aggregation algorithm with a naming game model. Experiments reveal that this combination extends the capabilities of the naming game as well as of aggregation: It allows the emergence of more than one word, and allows aggregation to form a controllable number of groups. These results are very promising in the context of collective exploration, as it allows robots to divide the environment in different portions and at the same time give a name to each portion, which can be used for more advanced subsequent collective behaviours.
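A minimal version of the classical (non-grounded) naming game makes the dimension-reduction point concrete: agents start with no names, invent them as needed, and a success rule collapses each pair's vocabularies onto the agreed word. This is a generic textbook variant, not the specific models from either paper; agent counts and round counts are arbitrary.

```python
import random

def naming_game(n_agents=20, n_rounds=5000, seed=42):
    """Minimal naming game: agents converge toward a shared name
    for a single common object."""
    rng = random.Random(seed)
    vocabs = [set() for _ in range(n_agents)]
    next_word = 0
    for _ in range(n_rounds):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not vocabs[speaker]:            # speaker invents a new name
            vocabs[speaker].add(next_word)
            next_word += 1
        word = rng.choice(sorted(vocabs[speaker]))
        if word in vocabs[hearer]:         # success: both collapse to it
            vocabs[speaker] = {word}
            vocabs[hearer] = {word}
        else:                              # failure: hearer learns it
            vocabs[hearer].add(word)
    return vocabs

vocabs = naming_game()
```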
  • More Bit by Bit. Could use some worked examples. Also a login so I’m not nagged to buy a book I own.
    • Descriptive and injunctive norms – The transsituational influence of social norms.
      • Three studies examined the behavioral implications of a conceptual distinction between 2 types of social norms: descriptive norms, which specify what is typically done in a given setting, and injunctive norms, which specify what is typically approved in society. Using the social norm against littering, injunctive norm salience procedures were more robust in their behavioral impact across situations than were descriptive norm salience procedures. Focusing Ss on the injunctive norm suppressed littering regardless of whether the environment was clean or littered (Study 1) and regardless of whether the environment in which Ss could litter was the same as or different from that in which the norm was evoked (Studies 2 and 3). The impact of focusing Ss on the descriptive norm was much less general. Conceptual implications for a focus theory of normative conduct are discussed along with practical implications for increasing socially desirable behavior. 
    • Construct validity centers around the match between the data and the theoretical constructs. As discussed in chapter 2, constructs are abstract concepts that social scientists reason about. Unfortunately, these abstract concepts don’t always have clear definitions and measurements.
      • Simulation is a way of implementing theoretical constructs that are measurable and testable.
  • Hyperparameter Optimization with Keras
  • Recognizing images from parts Kaggle winner
  • White paper
  • Storyboard meeting
  • The advanced analytics division(?) needs a modeling and simulation department that builds models that feed ML systems.
  • Meeting with Steve Specht – adding geospatial to white paper

Phil 5.30.18

7:15 – 6:00 ASRC MKT

  • More Bit by Bit
  • An interesting tweet about the dichotomy between individual and herd behaviors.
  • More white paper. Add something about awareness horizon, and how maps change that from a personal to a shared reality (cite understanding ignorance?)
  • Great discussion with Aaron about incorporating adversarial herding. I think that there will be three areas
    • Thunderdome – affords adversarial herding. Users have to state their intent before joining a discussion group. Bots and sock puppets allowed
    • Clubhouse – affords discussion with chosen individuals. This is what I thought JuryRoom was
    • JuryRoom – fully randomized members and topics, based on activity in the Clubhouse and Thunderdome

Phil 5.29.18

Insane, catastrophic rain this weekend. That’s the top of a guardrail in the middle of the scene below:

7:00 – 4:30 ASRC MKT

  • The Neural Representation of Social Networks
    • The computational demands associated with navigating large, complexly bonded social groups are thought to have significantly shaped human brain evolution. Yet, research on social network representation and cognitive neuroscience have progressed largely independently. Thus, little is known about how the human brain encodes the structure of the social networks in which it is embedded. This review highlights recent work seeking to bridge this gap in understanding. While the majority of research linking social network analysis and neuroimaging has focused on relating neuroanatomy to social network size, researchers have begun to define the neural architecture that encodes social network structure, cognitive and behavioral consequences of encoding this information, and individual differences in how people represent the structure of their social world.
  • This website is amazing: linear algebra with interactive examples. Vectors, matrices, dot products, etc. A cool resource for learning.
  • Web Literacy for Student Fact-Checkers: …and other people who care about facts.
    • Author: Mike Caulfield
    • We Should Put Fact-Checking Tools In the Core Browser
      • Years ago when the web was young, Netscape (Google it, noobs!) decided on its metaphor for the browser: it was a “navigator”. <—— this!!!!
        • Navigator: a person who directs the route or course of a ship, aircraft, or other form of transportation, especially by using instruments and maps.
        • Browser: a person who looks casually through books or magazines or at things for sale.
  • Deep Learning Hunts for Signals Among the Noise
    • Interesting article that indicates that deep learning generalizes through some form of compression. If that’s true, then the neurons and layers are learning how to coordinate (who recognizes what), which means dimension reduction and localized alignment (what are the features that make a person vs. a ship). Hmmm.
  • More Bit by Bit
  • Really enjoying Casualties of Cool, btw. Lovely sound layering. Reminds me of Pink Floyd’s Dark Side of the Moon / Wish You Were Here.
  • Why you need to improve your training data, and how to do it
  • No scrum today
  • Travel briefing – charge to conference code
  • Complexity Explorables
    • Ride my Kuramotocycle!
      • This explorable illustrates the Kuramoto model for phase-coupled oscillators. This model is used to describe synchronization phenomena in natural systems, e.g. the flash synchronization of fireflies or wall-mounted clocks. The model is defined as a system of N oscillators. Each oscillator has a phase variable θn(t) (illustrated by the angular position on a circle below), and an angular frequency ωn that captures how fast the oscillator moves around the circle.
    • Into the Dark
      • This explorable illustrates how a school of fish can collectively find an optimal location, e.g. a dark, unexposed region in their environment simply by light-dependent speed control. The explorable is based on the model discussed in Flock’n Roll, which you may want to explore first. This is how it works: The swarm here consists of 100 individuals. Each individual moves around at a constant speed and changes direction according to three rules
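The Kuramoto dynamics described above, dθn/dt = ωn + (K/N) Σm sin(θm − θn), fit in a few lines. This is a generic Euler-integration sketch, not the explorable's own code; oscillator count, coupling strength, and step size are arbitrary choices.

```python
import math
import random

def kuramoto_step(thetas, omegas, k, dt=0.05):
    """One Euler step of the Kuramoto model:
    dθ_n/dt = ω_n + (K/N) Σ_m sin(θ_m − θ_n)."""
    n = len(thetas)
    new_thetas = []
    for theta, omega in zip(thetas, omegas):
        coupling = (k / n) * sum(math.sin(t - theta) for t in thetas)
        new_thetas.append(theta + dt * (omega + coupling))
    return new_thetas

def order_parameter(thetas):
    """Synchrony measure r in [0, 1]: 1 means fully phase-locked."""
    n = len(thetas)
    re = sum(math.cos(t) for t in thetas) / n
    im = sum(math.sin(t) for t in thetas) / n
    return math.hypot(re, im)

# Strong coupling relative to the frequency spread should synchronize.
rng = random.Random(0)
thetas = [rng.uniform(0, 2 * math.pi) for _ in range(50)]
omegas = [rng.gauss(0.0, 0.1) for _ in range(50)]
for _ in range(2000):
    thetas = kuramoto_step(thetas, omegas, k=2.0)
r = order_parameter(thetas)
```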
  • More cool software: Kepler.gl is a powerful open source geospatial analysis tool for large-scale data sets.
  • White paper. Good progress! I like the conclusions

Phil 5.15.18

7:00 – 4:00 ASRC MKT

Phil 5.7.18

7:00 – 5:00 ASRC MKT

  • Content Sharing within the Alternative Media Echo-System: The Case of the White Helmets
    • Kate Starbird
    • In June 2017 our lab began a research project looking at online conversations about the Syria Civil Defence (aka the “White Helmets”). Over the last 8–9 months, we have spent hundreds of hours conducting analysis on the tweets, accounts, articles, and websites involved in that discourse. Our first peer-reviewed paper was recently accepted to an upcoming conference (ICWSM-18). That paper focuses on a small piece of the structure and dynamics of this conversation, specifically looking at content sharing across websites. Here, I describe that research and highlight a few of the findings.
  • Matt Salganik on Open Review
  • Spent a lot of time getting each work to draw differently in the scatterplot. That took some digging into the gensim API to get vectors from the corpora. I then tried to plot the list of arrays, but matplotlib only likes ndarrays (apparently?). I’m now working on placing the words from each text into their own ndarray.
  • Also added a filter for short stop words and switched to a hash map for words to avoid redundant points in the plot.
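The dedup-and-plot step above can be sketched like so: a dict keyed on the word guarantees one point per distinct word, and the values get packed into the ndarray that matplotlib's scatter wants. The word-to-vector lookup here is a hypothetical stand-in for projected gensim vectors, and the actual plotting call is omitted.

```python
import numpy as np

STOP_WORDS = {"a", "an", "the", "of", "to", "and", "in", "is", "it"}

def unique_word_points(words, vec_lookup, min_len=3):
    """Map each distinct non-stop word to a 2D point exactly once.
    Keying on the word avoids plotting redundant points."""
    points = {}
    for w in words:
        w = w.lower()
        if len(w) < min_len or w in STOP_WORDS or w not in vec_lookup:
            continue
        points.setdefault(w, vec_lookup[w])
    # matplotlib's scatter wants an ndarray, not a list of arrays
    return np.array(list(points.values())) if points else np.empty((0, 2))

# Hypothetical 2D vectors standing in for projected word embeddings.
lookup = {"wolf": np.array([0.1, 0.9]), "fire": np.array([0.8, 0.2])}
pts = unique_word_points(["The", "wolf", "wolf", "and", "fire"], lookup)
```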
  • Fika
    • Bryce Peake
    • ICA has a computational methods study area. How media flows through different spaces, etc. Python and [R]
    • Anne Balsamo – designing culture
    • what about language as an anti-colonial interaction
    • Human social scraping of data. There can be emergent themes that become important.
    • The ability of the user to delete all primary, secondary and tertiary data.
    • The third eye project (chyron crawls)

Phil 5.1.18

7:00 – 4:30 ASRC MKT

  • Applications of big social media data analysis: An overview
    • Over the last few years, online communication has moved toward user-driven technologies, such as online social networks (OSNs), blogs, online virtual communities, and online sharing platforms. These social technologies have ushered in a revolution in user-generated data, online global communities, and rich human behavior-related content. Human-generated data and human mobility patterns have become important steps toward developing smart applications in many areas. Understanding human preferences is important to the development of smart applications and services to enable such applications to understand the thoughts and emotions of humans, and then act smartly based on learning from social media data. This paper discusses the role of social media data in comprehending online human data and, consequently, how different real applications of social media data for smart services are executed.
  • Explainable, Interactive Deep Learning
    • Recently, deep learning has been advancing the state of the art in artificial intelligence to yet another level, and humans are relying more and more on the outputs generated by artificial intelligence techniques than ever before. However, even with such unprecedented advancements, the lack of interpretability on the decisions made by deep learning models and no control over their internal processes act as a major drawback when utilizing them to critical decision-making processes such as precision medicine and law enforcement. In response, efforts are being made to make deep learning interpretable and controllable by humans. In this paper, we review recent studies relevant to this direction and discuss potential challenges and future research directions.
  • Building successful online communities: Evidence-based social design (book review)
    • In Building Successful Online Communities (2012), Robert Kraut, Paul Resnick, and their collaborators set out to draw links between the design of socio-technical systems with findings from social psychology and economics. Along the way, they set out a vision for the role of social sciences in the design of systems like mailing lists, discussion forums, wikis, and social networks, offering a way that behavior on those platforms might inform our understanding of human behavior.
  • Since I’ve forgotten my Angular stuff, reviewing UltimateAngular, Angular Fundamentals course. Finished the ‘Getting Started’ section
  • Strip out Gutenberg text from corpora – done!

Phil 4.30.18

7:00 – 4:30 ASRC MKT

  • Some new papers from ICLR 2018
  • Need to write up a quick post for communicating between Angular and a (PHP) server, with an optional IntelliJ configuration section
  • JuryRoom this morning and then GANs + Agents this afternoon?
  • Next steps for JuryRoom
    • Start up the AngularPro course
    • Set up PHP access to DB, returning JSON objects
  • Starting Agent/GAN project
    • Need to set up an ACM paper to start dumping things into – done.
    • Looking for a good source for Jack London. Gutenberg looks nice, but there is a no-scraping rule, so I guess we’ll do this by hand…
    • We will need to check for redundant short stories
    • We will need to strip the front and back matter that pertains to Project Gutenberg
      • *** START OF THIS PROJECT GUTENBERG EBOOK BROWN WOLF AND OTHER JACK ***
      • *** END OF THIS PROJECT GUTENBERG EBOOK BROWN WOLF AND OTHER JACK ***
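Stripping the front and back matter can key off those `*** START OF` / `*** END OF` marker lines. A minimal sketch (the sample text below is made up; real Gutenberg files vary slightly in marker wording, so the prefixes may need adjusting):

```python
def strip_gutenberg(text):
    """Keep only the body between Project Gutenberg's '*** START OF ...'
    and '*** END OF ...' marker lines (markers themselves dropped).
    Falls back to the full text if the markers are missing."""
    lines = text.splitlines()
    start, end = 0, len(lines)
    for i, line in enumerate(lines):
        if line.startswith("*** START OF"):
            start = i + 1
        elif line.startswith("*** END OF"):
            end = i
            break
    return "\n".join(lines[start:end]).strip()

sample = ("front matter\n"
          "*** START OF THIS PROJECT GUTENBERG EBOOK X ***\n"
          "story text\n"
          "*** END OF THIS PROJECT GUTENBERG EBOOK X ***\n"
          "back matter")
body = strip_gutenberg(sample)
```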
  • Fika: Accessibility at the Intersection of Users and Data
    • Nice talk and followup discussion with Dr. Hernisa Kacorri, who’s combining machine learning and HCC
      • My research goal is to build technologies that address real-world problems by integrating data-driven methods and human-computer interaction. I am interested in investigating human needs and challenges that may benefit from advancements in artificial intelligence. My focus is both in building new models to address these challenges and in designing evaluation methodologies that assess their impact. Typically my research involves application of machine learning and analytics research to benefit people with disabilities, especially assistive technologies that model human communication and behavior such as sign language avatars and independent mobility for the blind.

Phil 4.27.18

7:00 – 4:00 ASRC MKT

  • Call Charlestown about getting last two years of payments – done. Left a message
  • Get parking from StubHub
  • I saw James Burnham’s interview on the Daily Show last night. Roughly, I think his view that humans haven’t changed, but our norms and practices have, is true. Based on listening to him talk, though, I think he’s more focused on the symptoms than the cause. The questions he doesn’t seem to be asking are “why did civilization emerge when it did?” and “why does it seem to be breaking now?”
    Personally, I think it’s tied up with communication technology. Slow communication systems like writing, mail, and the printing press lead to civilization. Rapid, frictionless forms of communication, from radio to social media, disrupt this process by changing how we define, perceive, and trust our neighbors. The nice thing is that if technology is the critical element, then technology can be adjusted. Not that it’s easy, but it’s probably easier than changing humans.
  • Continuing From I to We: Group Formation and Linguistic Adaption in an Online Xenophobic Forum. Done and posted in Phlog
  • Tweaking the Angular and PHP code.
  • I got the IntelliJ debugger to connect to the Apache PHP server! Here are the final steps:
    • File->Settings->Languages & Frameworks->PHP->Debug
    • Validate the debugger configuration
  • Objects are now coming back in the same way, so no parsing on the Angular side
  • Sprint planning

Phil 4.26.18

Too much stuff posted yesterday, so I’m putting Kate Starbird’s new paper here:

  • Ecosystem or Echo-System? Exploring Content Sharing across Alternative Media Domains
    • This research examines the competing narratives about the role and function of Syria Civil Defence, a volunteer humanitarian organization popularly known as the White Helmets, working in war-torn Syria. Using a mixed-method approach based on seed data collected from Twitter, and then extending out to the websites cited in that data, we examine content sharing practices across distinct media domains that functioned to construct, shape, and propagate these narratives. We articulate a predominantly alternative media “echo-system” of websites that repeatedly share content about the White Helmets. Among other findings, our work reveals a small set of websites and authors generating content that is spread across diverse sites, drawing audiences from distinct communities into a shared narrative. This analysis also reveals the integration of government funded media and geopolitical think tanks as source content for anti-White Helmets narratives. More broadly, the analysis demonstrates the role of alternative newswire-like services in providing content for alternative media websites. Though additional work is needed to understand these patterns over time and across topics, this paper provides insight into the dynamics of this multi-layered media ecosystem.

7:00 – 5:00 ASRC MKT

  • Referencing for Aaron at 5:00
  • Call Charlestown about getting last two years of payments
  • Benjamin D. Horne, Sara Khedr, and Sibel Adali. “Sampling the News Producers: A Large News and Feature Data Set for the Study of the Complex Media Landscape” ICWSM 2018
  • Continuing From I to We: Group Formation and Linguistic Adaption in an Online Xenophobic Forum
  • Anchor-Free Correlated Topic Modeling
    • In topic modeling, identifiability of the topics is an essential issue. Many topic modeling approaches have been developed under the premise that each topic has an anchor word, which may be fragile in practice, because words and terms have multiple uses; yet it is commonly adopted because it enables identifiability guarantees. Remedies in the literature include using three- or higher-order word co-occurence statistics to come up with tensor factorization models, but identifiability still hinges on additional assumptions. In this work, we propose a new topic identification criterion using second order statistics of the words. The criterion is theoretically guaranteed to identify the underlying topics even when the anchor-word assumption is grossly violated. An algorithm based on alternating optimization, and an efficient primal-dual algorithm are proposed to handle the resulting identification problem. The former exhibits high performance and is completely parameter-free; the latter affords up to 200 times speedup relative to the former, but requires step-size tuning and a slight sacrifice in accuracy. A variety of real text copora are employed to showcase the effectiveness of the approach, where the proposed anchor-free method demonstrates substantial improvements compared to a number of anchor-word based approaches under various evaluation metrics.
  • Cleaning up the Angular/PHP example. Put on GitHub?

Phil 4.25.18

7:00 – 3:30 ASRC MKT

  • Google’s Workshop on AI/ML Research and Practice in India:
    Ganesh Ramakrishnan (IIT Bombay) presented research on human assisted machine learning.
  • From I to We: Group Formation and Linguistic Adaption in an Online Xenophobic Forum
    • Much of identity formation processes nowadays takes place online, indicating that intergroup differentiation may be found in online communities. This paper focuses on identity formation processes in an open online xenophobic, anti-immigrant, discussion forum. Open discussion forums provide an excellent opportunity to investigate open interactions that may reveal how identity is formed and how individual users are influenced by other users. Using computational text analysis and Linguistic Inquiry Word Count (LIWC), our results show that new users change from an individual identification to a group identification over time as indicated by a decrease in the use of “I” and increase in the use of “we”. The analyses also show increased use of “they” indicating intergroup differentiation. Moreover, the linguistic style of new users became more similar to that of the overall forum over time. Further, the emotional content decreased over time. The results indicate that new users on a forum create a collective identity with the other users and adapt to them linguistically.
    • Social influence is broadly defined as any change – emotional, behavioral, or attitudinal – that has its roots in others’ real or imagined presence (Allport, 1954). (pg 77)
    • Regardless of why an individual displays an observable behavioral change that is in line with group norms, social identification with a group is the basis for the change. (pg 77)
    • In social psychological terms, a group is defined as more than two people that share certain goals (Cartwright & Zander, 1968). (pg 77)
    • Processes of social identification, intergroup differentiation and social influence have to date not been studied in online forums. The aim of the present research is to fill this gap and provide information on how such processes can be studied through language used on the forum. (pg 78)
    • The popularity of social networking sites has increased immensely during the last decade. At the same time, offline socializing has shown a decline (Duggan & Smith, 2013). Now, much of the socializing actually takes place online (Ganda, 2014). In order to be part of an online community, the individual must socialize with other users. Through such socializing, individuals create self-representations (Enli & Thumim, 2012). Hence, the processes of identity formation, may to a large extent take place on the Internet in various online forums. (pg 78)
    • For instance, linguistic analyses of American Nazis have shown that use of third person plural pronouns (they, them, their) is the single best predictor of extreme attitudes (Pennebaker & Chung, 2008). (pg 79)
    • Because language can be seen as behavior (Fiedler, 2008), it may be possible to study processes of social influence through linguistic analysis. Thus, our second hypothesis is that the linguistic style of new users will become increasingly similar to the linguistic style of the overall forum over time (H2). (pg 79)
    • This indicates that the content of the posts in an online forum may also change over time as arguments become more fine-tuned and input from both supporting and contradicting members are integrated into an individual’s own beliefs. This is likely to result (linguistically) in an increase in indicators of cognitive complexity. Hence, we hypothesize that the content of the posts will change over time, such that indicators of complex thinking will increase (H3a). (pg 80)
      • I’m not sure what to think about this. I expect from dimension reduction, that as the group becomes more aligned, the overall complex thinking will reduce, and the outliers will leave, at least in the extreme of a stampede condition.
    • This result indicates that after having expressed negativity in the forum, the need for such expressions should decrease. Hence, we expect that the content of the posts will change such that indicators of negative emotions will decrease, over time (H3b). (pg 80)
    • the forum is presented as a “very liberal forum”, where people are able to express their opinions, whatever they may be. This “extreme liberal” idea implies that there is very little censorship, which has resulted in that the forum is highly xenophobic. Nonetheless, due to its liberal self-presentation, the xenophobic discussions are not unchallenged. For example, also anti-racist people join this forum in order to challenge individuals with xenophobic attitudes. This means that the forum is not likely to function as a pure echo chamber, because contradicting arguments must be met with own arguments. Hence, individuals will learn from more experienced users how to counter contradicting arguments in a convincing way. Hence, they are likely to incorporate new knowledge, embrace input and contribute to evolving ideas and arguments. (pg 81)
      • Open debate can lead to the highest level of polarization (M&D)
      • There isn’t diverse opinion. The conversation is polarized, with opponents pushing towards the opposite pole. The question I’d like to see answered is has extremism increased in the forum?
    • Natural language analyses of anonymous social media forums also circumvent social desirability biases that may be present in traditional self-rating research, which is a particularly important concern in relation to issues related to outgroups (Maass, Salvi, Arcuri, & Semin, 1989; von Hippel, Sekaquaptewa, & Vargas, 1997, 2008). The to-be analyzed media uses “aliases”, yielding anonymity of the users and at the same time allow us to track individuals over time and analyze changes in communication patterns. (pg 81)
      • After seeing “Ready Player One”, I also wonder if the aliases themselves could be looked at using an embedding space built from the terms used by the users? Then you get distance measurements, t-sne projections, etc.
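      • A minimal sketch of that distance idea: give each alias a vector (say, the average of term embeddings over that user’s posts) and compare aliases by cosine similarity. The vectors below are hypothetical placeholders for whatever embedding is actually used:

```typescript
// Cosine similarity between two alias vectors (e.g., averaged term
// embeddings of each user's posts). 1 = same direction (similar
// vocabularies under this representation), 0 = orthogonal.
function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

        Pairwise distances from this would feed directly into t-SNE projections or clustering.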
    • Linguistic Inquiry Word Count (LIWC; Pennebaker et al., 2007; Chung & Pennebaker, 2007; Pennebaker, 2011b; Pennebaker, Francis, & Booth, 2001) is a computerized text analysis program that computes a LIWC score, i.e., the percentage of various language categories relative to the number of total words (see also www.liwc.net). (pg 81)
      • LIWC2015 ($90) is the gold standard in computerized text analysis. Learn how the words we use in everyday language reveal our thoughts, feelings, personality, and motivations. Based on years of scientific research, LIWC2015 is more accurate, easier to use, and provides a broader range of social and psychological insights compared to earlier LIWC versions
    • Figure 1c shows words overrepresented in later posts, i.e. words where the usage of the words correlates positively with how long the user has been active on the forum. The words here typically lack emotional content and are indicators of higher complexity in language. Again, this analysis provides preliminary support for the idea that time on the forum is related to more complex thinking, and less emotionality.
      • WordCloud
    • The second hypothesis was that the linguistic style of new users would become increasingly similar to other users on the forum over time. This hypothesis is evaluated by first z-transforming each LIWC score, so that each has a mean value of zero and a standard deviation of one. Then we measure how each post differs from the standardized values by summing the absolute z-values over all 62 LIWC categories from 2007. Thus, low values on these deviation scores indicate that posts are more prototypical, or highly similar, to what other users write. These deviation scores are analyzed in the same way as for Hypothesis 1 (i.e., by correlating each user score with the number of days on the forum, and then t-testing whether the correlations are significantly different from zero). In support of the hypothesis, the results show an increase in similarity, as indicated by decreasing deviation scores (Figure 2). The mean correlation coefficient between this measure and time on the forum was -.0086, which is significant, t(11749) = -3.77, p < 0.001. (pg 85)
      • ForumAlignment: I think it is reasonable to consider this a measure of alignment
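      • The paper’s deviation-score procedure can be sketched in code. This is my reconstruction, not the authors’ implementation; the data layout (one array of scores per LIWC category) is assumed:

```typescript
// Deviation score per post: z-transform each LIWC category across all
// posts, then sum the absolute z-values per post. Lower scores mean the
// post is more prototypical of the forum's overall linguistic style.

function zTransform(columns: number[][]): number[][] {
  // columns[k] holds one LIWC category's scores across all posts
  return columns.map(col => {
    const mean = col.reduce((a, b) => a + b, 0) / col.length;
    const sd = Math.sqrt(col.reduce((a, b) => a + (b - mean) ** 2, 0) / col.length);
    return col.map(v => (sd === 0 ? 0 : (v - mean) / sd));
  });
}

function deviationScores(columns: number[][]): number[] {
  const z = zTransform(columns);
  const nPosts = columns[0].length;
  const scores: number[] = [];
  for (let i = 0; i < nPosts; i++) {
    let sum = 0;
    for (const col of z) sum += Math.abs(col[i]);
    scores.push(sum);
  }
  return scores;
}
```

        Correlating these scores with each user’s days-on-forum, then t-testing the correlations across users, would follow the same steps the authors describe for Hypothesis 1.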
    • Because individuals form identities online and because we see this in the use of pronouns, we also expected to see tendencies of social influence and adaption. This effect was also found, such that individuals’ linguistic style became increasingly similar to other users’ linguistic style over time. Past research has shown that accommodation of communication style occurs automatically when people connect to people or groups they like (Giles & Ogay 2007; Ireland et al., 2011), but also that similarity in communicative style functions as cohesive glue within a group (Reid, Giles, & Harwood, 2005). (pg 86)
    • Still, the results could not confirm an increase in cognitive complexity. It is difficult to determine why this was not observed even though a general trend to conform to the linguistic style on the forum was observed. (pg 87)
      • This is what I would expect. As alignment increases, complexity, as expressed by higher dimensional thinking should decrease.
    • This idea would also be in line with previous research that has shown that expressing oneself decreases arousal (Garcia et al., 2016). Moreover, because the forum is not explicitly racist, individuals may have simply adapted to the social norms on the forum prescribing less negative emotional displays. Finally, a possible explanation for the decrease in negative emotional words might be that users who are very angry leave the forum, because of its non-racist focus, and end up in more hostile forums. An interesting finding that was not part of the hypotheses in the present research is that the third person plural category correlated positively with all four negative emotions categories, suggesting that people using for example ‘they’ express more negative emotions (pg 87)
    • In line with social identity theory (Tajfel & Turner, 1986), we also observe linguistic adaption to the group. Hence, our results indicate that processes of identity formation may take place online. (pg 87)
  • Me, My Echo Chamber, and I: Introspection on Social Media Polarization
    • Homophily — our tendency to surround ourselves with others who share our perspectives and opinions about the world — is both a part of human nature and an organizing principle underpinning many of our digital social networks. However, when it comes to politics or culture, homophily can amplify tribal mindsets and produce “echo chambers” that degrade the quality, safety, and diversity of discourse online. While several studies have empirically proven this point, few have explored how making users aware of the extent and nature of their political echo chambers influences their subsequent beliefs and actions. In this paper, we introduce Social Mirror, a social network visualization tool that enables a sample of Twitter users to explore the politically-active parts of their social network. We use Social Mirror to recruit Twitter users with a prior history of political discourse to a randomized experiment where we evaluate the effects of different treatments on participants’ i) beliefs about their network connections, ii) the political diversity of who they choose to follow, and iii) the political alignment of the URLs they choose to share. While we see no effects on average political alignment of shared URLs, we find that recommending accounts of the opposite political ideology to follow reduces participants’ beliefs in the political homogeneity of their network connections but still enhances their connection diversity one week after treatment. Conversely, participants who enhance their belief in the political homogeneity of their Twitter connections have less diverse network connections 2-3 weeks after treatment. We explore the implications of these disconnects between beliefs and actions on future efforts to promote healthier exchanges in our digital public spheres.
  • What We Read, What We Search: Media Attention and Public Attention Among 193 Countries
    • We investigate the alignment of international attention of news media organizations within 193 countries with the expressed international interests of the public within those same countries from March 7, 2016 to April 14, 2017. We collect fourteen months of longitudinal data of online news from Unfiltered News and web search volume data from Google Trends and build a multiplex network of media attention and public attention in order to study its structural and dynamic properties. Structurally, the media attention and the public attention are both similar and different depending on the resolution of the analysis. For example, we find that 63.2% of the country-specific media and the public pay attention to different countries, but local attention flow patterns, which are measured by network motifs, are very similar. We also show that there are strong regional similarities with both media and public attention that is only disrupted by significantly major worldwide incidents (e.g., Brexit). Using Granger causality, we show that there are a substantial number of countries where media attention and public attention are dissimilar by topical interest. Our findings show that the media and public attention toward specific countries are often at odds, indicating that the public within these countries may be ignoring their country-specific news outlets and seeking other online sources to address their media needs and desires.
  • “You are no Jack Kennedy”: On Media Selection of Highlights from Presidential Debates
    • Our findings indicate that there exist signals in the textual information that untrained humans do not find salient. In particular, highlights are locally distinct from the speaker’s previous turn, but are later echoed more by both the speaker and other participants (Conclusions)
      • This sounds like dimension reduction and alignment
  • Algorithms, bots, and political communication in the US 2016 election – The challenge of automated political communication for election law and administration
    • Philip N. Howard (Scholar)
    • Samuel C. Woolley (Scholar)
    • Ryan Calo (Scholar)
    • Political communication is the process of putting information, technology, and media in the service of power. Increasingly, political actors are automating such processes, through algorithms that obscure motives and authors yet reach immense networks of people through personal ties among friends and family. Not all political algorithms are used for manipulation and social control however. So what are the primary ways in which algorithmic political communication—organized by automated scripts on social media—may undermine elections in democracies? In the US context, what specific elements of communication policy or election law might regulate the behavior of such “bots,” or the political actors who employ them? First, we describe computational propaganda and define political bots as automated scripts designed to manipulate public opinion. Second, we illustrate how political bots have been used to manipulate public opinion and explain how algorithms are an important new domain of analysis for scholars of political communication. Finally, we demonstrate how political bots are likely to interfere with political communication in the United States by allowing surreptitious campaign coordination, illegally soliciting either contributions or votes, or violating rules on disclosure.
  • Ok, back to getting HttpClient POSTs to play with PHP cross-domain
  • Maybe I have to make a proxy?
    • Using the proxying support in webpack’s dev server we can hijack certain URLs and send them to a backend server. We do this by passing a file to --proxy-config
    • Well, that fixes the need to have all the server options set, but the post still doesn’t send data. But since this is the Right way to do things, here are the steps:
    • To proxy localhost:4200/uli -> localhost:80/uli
      • Create a proxy.conf.json file in the same directory as package.json
        {
          "/uli": {
            "target": "http://localhost:80",
            "secure": false
          }
        }

        This will cause any explicit request to localhost:4200/uli to be forwarded to localhost:80/uli, so to the backend it appears to be coming from localhost:80

      • Set the npm start command in the package.json file to read as
        "scripts": {
          "start": "ng serve --proxy-config proxy.conf.json",
          ...
        },

        Start with “npm start”, rather than “ng serve”

      • Call from Angular like this:
        this.http.post('http://localhost:4200/uli/script.php', payload, httpOptions)
      • Here’s the PHP code (script.php): it takes POST and GET input and feeds it back with some information about the source:
        function getBrowserInfo(){
            $browserData = array();
            $ip = htmlentities($_SERVER['REMOTE_ADDR']);
            $browser = htmlentities($_SERVER['HTTP_USER_AGENT']);
            $referrer = "No Referrer";
            if(isset($_SERVER['HTTP_REFERER'])) {
                //do what you need to do here if it's set
                $referrer = htmlentities($_SERVER['HTTP_REFERER']);
                if($referrer == ""){
                    $referrer = "No Referrer";
                }
            }
            $browserData["ipAddress"] = $ip;
            $browserData["browser"] = $browser;
            $browserData["referrer"] = $referrer;
            return $browserData;
        }

        function getPostInfo(){
            $postInfo = array();
            foreach($_POST as $key => $value) {
                if(strlen($value) < 10000) {
                    $postInfo[$key] = $value;
                }else{
                    $postInfo[$key] = "string too long";
                }
            }
            return $postInfo;
        }

        function getGetInfo(){
            $getInfo = array();
            foreach($_GET as $key => $value) {
                if(strlen($value) < 10000) {
                    $getInfo[$key] = $value;
                }else{
                    $getInfo[$key] = "string too long";
                }
            }
            return $getInfo;
        }
        
        /**************************** MAIN ********************/
        $toReturn = array();
        $toReturn['getPostInfo'] = getPostInfo();
        $toReturn['getGetInfo'] = getGetInfo();
        $toReturn['browserInfo'] = getBrowserInfo();
        $toReturn['time'] = date("h:i:sa");
        $jstr =  json_encode($toReturn);
        echo($jstr);
      • And it arrives at localhost:80/uli/script.php. The following is the JavaScript console output of the Angular CLI code running on localhost:4200
        {getPostInfo: Array(0), getGetInfo: {…}, browserInfo: {…}, time: "05:17:16pm"}
        browserInfo:
        	browser:"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.181 Safari/537.36"
        	ipAddress:"127.0.0.1"
        	referrer:"http://localhost:4200/"
        getGetInfo:
        	message:"{"title":"foo","body":"bar","userId":1}"
        getPostInfo:[]
        time:"05:17:16pm"
        
      • Got the pieces parsing in @Component and displaying, so the round trip is done. Wasn’t expecting to wind up using GET, but until I can figure out what the deal is with POST, that’s what it’s going to be. Here are the two methods that send and then parse the message:
        doGet(event) {
          let payload = {
            title: 'foo',
            body: 'bar',
            userId: 1
          };
          let message = 'message='+encodeURIComponent(JSON.stringify(payload));
          let target = 'http://localhost:4200/uli/script.php?';
        
          //this.http.get(target+'title=\'my title\'&body=\'the body\'&userId=1')
          this.http.get(target+message)
            .subscribe((data) => {
              console.log('Got some data from backend ', data);
              this.extractMessage(data, "getGetInfo");
            }, (error) => {
              console.log('Error! ', error);
            });
        }
        
        extractMessage(obj, name: string){
          let item = obj[name];
          try {
            if (item) {
              let mstr = item.message;
              this.mobj = JSON.parse(mstr);
            }
          }catch(err){
            this.mobj = {};
            this.mobj["message"] = "Error extracting 'message' from ["+name+"]";
          }
          this.mkeys = Object.keys(this.mobj);
        }
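      • A likely explanation for the empty POST, worth trying: Angular’s HttpClient serializes an object payload as JSON (Content-Type: application/json), and PHP only populates $_POST for form-encoded or multipart bodies. A sketch of a form-encoded alternative (toFormBody() is a hypothetical helper, untested against this setup):

```typescript
// PHP fills $_POST only for application/x-www-form-urlencoded (or
// multipart) request bodies; HttpClient sends plain objects as JSON by
// default. Encoding the payload as a form body should let getPostInfo()
// see it on the PHP side.
function toFormBody(obj: Record<string, unknown>): string {
  return Object.entries(obj)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(String(v))}`)
    .join('&');
}

// Intended use (assumes the same proxied endpoint as above):
// this.http.post('http://localhost:4200/uli/script.php',
//   toFormBody({ title: 'foo', body: 'bar', userId: 1 }),
//   { headers: { 'Content-Type': 'application/x-www-form-urlencoded' } });
```

        Alternatively, the PHP side could keep accepting JSON by reading the raw body: json_decode(file_get_contents('php://input'), true).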
      • And here’s the HTML code:
      • Here’s a screenshot of everything working: PostGetTest