Monthly Archives: April 2017

Phil 4.28.17

7:00 – 8:00 Research

Submitted the paper. Need to start on the HCIC poster next
Continuing Examining the Alternative Media Ecosystem through the Production of Alternative Narratives of Mass Shooting Events on Twitter. Notes are here. Finished!
The menace of unreality: How the Kremlin weaponizes information, culture and money

8:30 – 4:30 BRC

Working on finding an exit condition for the subdivision surface
I’m currently calculating the corners of a constricting rectangle that contracts towards the best point. Each iteration is saved, and I’m working on visualizing that surface, but my brain has shut down, and I can’t do simple math anymore.
Had a thought for Aaron about how to visualize his dimension reduction. Turns out to do well.

Aaron 4.27.17

Cycling
- Got a late start in the office today, so as soon as I got in I put my gear on for a brain-cleaning ride. Pushed really hard today, and combined with some nice weather and low traffic hit my first 16+ average MPH door-to-door. Landed a 16.4 mph average, and felt really proud of it.

Focus today was on learning some more about Manifold learning and its applications for reduction of high dimensional data for unsupervised learning.
- SciKit includes some great documentation and resources including a working sample comparing various Manifold learning techniques against test data sets.
- My goal now is to take the sorted data_generator.py code from yesterday and compare the manifold learning examples against the clustered output of the unreduced data. Once I have a benchmark set up I can do the same for the sample live data.
- The output of the SciKit examples in MatPlotLib is really attractive as well.

Phil 4.27.17

7:00 – 9:00 Research

Some more echo chamber flocking: Iran Deal Is More Popular Than Ever, Poll Shows. Republicans registered the biggest uptick in support for the deal, which has been heavily criticized by GOP lawmakers since its inception in July 2015: 53 percent of Republican voters said they supported it, compared with 37 percent who backed it last summer and just 10 percent who supported it shortly after it was announced. Democratic support for the deal has been largely unchanged since August, and a larger share of independents are getting on board, from 41 percent in August to 48 percent now.
Finishing corrections to paper
This really is my phase I research question: If ‘laws of motion’ can indeed be ascribed to behavior, we should be able to model the effects of those laws. The question then becomes what form do these models take? Also, how do we detect these behaviors with domain independence and at scale?
Submitted!
The Relevance of Hannah Arendt’s Reflections on Evil:Globalization and Rightlessness

BRC 9:30 – 5:00

Continuing Subdivision surfacing
Didn’t like the documentation on sortedcollections. Going to try a pandas Series

Allowable options in an arg:

parser.add_argument("--algorithm", type=str, choices=['naive', 'subdivision', 'genetic'], default='naive', help="hill climbing algorithm")

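Note that range() should also work for numeric choices. In Python 3 it returns a range object rather than a list, but argparse only needs a container that supports membership tests.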
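A minimal sketch of both kinds of choices, in case it helps later. The --algorithm argument is the one from these notes; the --iterations argument is a made-up example added only to show range() as a choices container, and the argument list passed to parse_args() is for demonstration.

```python
import argparse

parser = argparse.ArgumentParser(description="hill climbing demo")

# String choices, as in the notes above
parser.add_argument("--algorithm", type=str,
                    choices=['naive', 'subdivision', 'genetic'],
                    default='naive', help="hill climbing algorithm")

# Hypothetical numeric option: choices accepts any container that supports `in`,
# so a range object works once type=int converts the raw string
parser.add_argument("--iterations", type=int, choices=range(1, 11),
                    default=5, help="number of iterations (1 to 10)")

# Parse an explicit list instead of sys.argv so the sketch runs anywhere
args = parser.parse_args(["--algorithm", "subdivision", "--iterations", "3"])
print(args.algorithm, args.iterations)
```

A value outside the allowed set (for example --iterations 42) makes argparse print a usage error and exit, which is usually what you want at the command line.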

And here’s how you get the key/values from a pandas Series:

print("calc_subdivision_fitness_landsape(): key = {0}, val = {1}".format(fitness.index[0], fitness.values[0]))
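For reference, a small self-contained version of the same idea, using the four fitness values that appear in this entry's log output. The mapping from corner index to fitness is an assumption about how the Series was built; sorting in descending order puts the best corner first.

```python
import pandas as pd

# Corner index -> fitness (values taken from the log output in this entry;
# how the real Series is constructed is an assumption)
fitness = pd.Series({0: 7.0, 1: 10.0, 2: 6.0, 3: 6.0}).sort_values(ascending=False)

best_key = fitness.index[0]    # label of the best corner
best_val = fitness.values[0]   # fitness of the best corner
mean_fitness = fitness.mean()  # candidate quantity for an exit condition

print("key = {0}, val = {1}, mean = {2}".format(best_key, best_val, mean_fitness))
```

The order of tied entries (the two corners at 6.0) depends on the sort algorithm, so it is safer not to rely on it.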

Looks like it’s working. I think I should be using the average of the 4 fitnesses to decide if I’m done

calc_subdivision_fitness_landsape(): fitness = 
1    10.0
0     7.0
3     6.0
2     6.0
dtype: float64
calc_subdivision_fitness_landsape(): fitness = 
1    10.0
0     7.0
3     6.0
2     6.0
dtype: float64
calc_subdivision_fitness_landsape(): fitness = 
1    10.0
0     7.0
3     6.0
2     6.0
dtype: float64
done calc_subdivision_fitness_landsape

Phil 4.26.17

7:00 – 8:30 Research

Proofreading and tweaking the CSCW paper.
Finished the paper edit. Started to roll in the changes
Made a 10D chart of the explorer probability distribution. I think it tells the story better:
Had to install a dictionary in TexStudio. This helped a lot.
Started rolling in the changes to the tex file

BRC 9:00 – 4:30

Looks like the sort changes to the data_generator.py code haven’t been pushed yet

Starting on subdivision surfacing

def calc_subdivision_fitness_landsape(self, eps_step: float, min_cluster: int) -> pandas.DataFrame:
    # create the four extreme corners. These will work their way in
    # calculate halfway points
    # keep the square with the greatest (single? average?) value
    # repeat until an epsilon, max value, or max iterations are reached
    # construct a sparse matrix with spacing equal to the smallest spacing
    # fill in the values that have been calculated
    # build a dataframe and return it for visualization
    raise NotImplementedError("outline only; the steps above are not implemented yet")
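The outline above can be sketched on a toy problem. This is not the BRC code: the fitness function, the choice of the average corner fitness as the selection rule, and the rectangle-size exit condition are all assumptions made for illustration, and the names (subdivision_search, mean_corner_fitness, toy_fitness) are hypothetical.

```python
from typing import Callable, List, Tuple

Rect = Tuple[float, float, float, float]  # (x_min, x_max, y_min, y_max)


def mean_corner_fitness(fitness: Callable[[float, float], float], rect: Rect) -> float:
    """Average the fitness evaluated at the four corners of a rectangle."""
    x0, x1, y0, y1 = rect
    corners = [(x0, y0), (x0, y1), (x1, y0), (x1, y1)]
    return sum(fitness(x, y) for x, y in corners) / 4.0


def subdivision_search(fitness: Callable[[float, float], float], rect: Rect,
                       eps: float = 1e-3, max_iter: int = 100):
    """Repeatedly split the rectangle into quadrants and keep the best one."""
    history: List[Rect] = [rect]  # every rectangle visited, for later plotting
    for _ in range(max_iter):
        x0, x1, y0, y1 = rect
        # Exit condition (assumed): stop once the rectangle is smaller than eps
        if max(x1 - x0, y1 - y0) < eps:
            break
        xm, ym = (x0 + x1) / 2.0, (y0 + y1) / 2.0  # halfway points
        quadrants = [(x0, xm, y0, ym), (xm, x1, y0, ym),
                     (x0, xm, ym, y1), (xm, x1, ym, y1)]
        # Keep the quadrant with the greatest average corner fitness
        rect = max(quadrants, key=lambda q: mean_corner_fitness(fitness, q))
        history.append(rect)
    x0, x1, y0, y1 = rect
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0), history


def toy_fitness(x: float, y: float) -> float:
    """Made-up single-peak landscape with its maximum at (1, -2)."""
    return -((x - 1.0) ** 2) - ((y + 2.0) ** 2)


best_point, history = subdivision_search(toy_fitness, (-5.0, 5.0, -5.0, 5.0))
print(best_point, len(history))
```

On a single smooth peak like this the rectangle contracts onto the maximum. A landscape with several peaks could easily trap this greedy rule in the wrong quadrant, which is one reason the "single? average?" question in the outline matters.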

I need to sort a dict, so I’m trying SortedContainers.

Then things went off the rails a bit, and I wrote a haiku program as a haiku that prints itself:

def haiku(sequence):
    this_is_not_needed = ""
    return "".join(sequence)

if __name__ == "__main__":
    f = open('haiku.py')
    print(haiku(f.readlines()[:3]))

Aaron 4.25.17

Wasted a ton of time today tracking down progress of integration of additional teams into our program.
Spent a couple of hours tackling a poster presentation to be delivered at a technical leadership summit next week. I’ll be presenting the “Advanced Analytics” presentation and discussing all of our tools, and capabilities. Phil helped a lot, and I ended up quite pleased with the results. One of the nice things is we were able to include screenshots of actual tools and graphs of the data we’re using. I think this will be a nice difference from the rest of the presenters.
Did some good pair programming with Phil on the Pandas DataFrame.sort issue, moved to the non-deprecated version of DataFrame.sort_values and got it working correctly at all matrix sizes.

Phil 4.25.17

7:00 – 8:30 Research

Wikipedia founder Jimmy Wales launches Wikitribune, a large-scale attempt to combat fake news
Listening to the BBC Business Daily on Machine Learning. They had an interview with Joanna J Bryson (Scholar). She has an approach for explaining the behavior of AI that seems to involve simulation? Here are some papers that look interesting:
- Behavior Oriented Design (MIT Dissertation: Intelligence by Design: Principles of Modularity and Coordination for Engineering Complex Adaptive Agents)
- Learning from Play: Facilitating character design through genetic programming and human mimicry
  - Mimicry and play are fundamental learning processes by which individuals can acquire behaviours, skills and norms. In this paper we utilise these two processes to create new game characters by mimicking and learning from actual human players. We present our approach towards aiding the design process of game characters through the use of genetic programming. The current state of the art in game character design relies heavily on human designers to manually create and edit scripts and rules for game characters. Computational creativity approaches this issue with fully autonomous character generators, replacing most of the design process using black box solutions such as neural networks. Our GP approach to this problem not only mimics actual human play but creates character controllers which can be further authored and developed by a designer. This keeps the designer in the loop while reducing repetitive labour. Our system also provides insights into how players express themselves in games and into deriving appropriate models for representing those insights. We present our framework and preliminary results supporting our claim.
- Replicators, Lineages and Interactors: One page note on cultural evolution
  - If we adopt the other option and refer to culture itself is the lineage, then the culture itself can evolve since the replicators are the ideas and practices that exist within that culture. However, if it is the culture that is the lineage, we cannot say that it evolves when it takes more territory, in the same way that a species does not evolve with more individuals. Adaptation is presently understood to be about changes in the frequency of replicators, not about absolute numbers of interactors. In sum, cultural evolution (changes of practices within a group) is necessarily a separate process from cultural group selection (changes of the frequency of group-types at a specific location).
- The behavior-oriented design of modular agent intelligence
- Should probably cite some of these and a reference to Behavior-Oriented Design in the conclusions section of the paper

Continuing Examining the Alternative Media Ecosystem through the Production of Alternative Narratives of Mass Shooting Events on Twitter

We collected data using the Twitter Streaming API, tracking on the following terms (shooter, shooting, gunman, gunmen, gunshot, gunshots, shooters, gun shot, gun shots, shootings) for a ten-month period between January 1 and October 5, 2016. This collection resulted in 58M total tweets. We then scoped that data to include only tweets related to alternative narratives of the event—false flag, falseflag, crisis actor, crisisactor, staged, hoax and “1488”.
- These keywords specify a ‘primary information space’. Bag-of-words of text correlated with each term could make this a linear axis
Of 15,150 users who sent at least one tweet with a link, only 1372 sent (over the course of the collection period) tweets citing more than one domain.
- This is the difference between implicit behaviors (clicking, reading, navigating) and explicit actions. Twitter monitors what people are willing to write
Interestingly, the two most highly tweeted domains were both associated with significant automated account or “bot” activity. The Real Strategy, an alternative news site with a conspiracy theory orientation, is the most tweeted domain in our dataset (by far). The temporal signature of tweets citing this domain reveals a consistent pattern of coordinated bursts of activity at regular intervals generated by 200 accounts that appear to be connected to each other (via following relationships) and coordinated through an external tool.
- There is clearly a desire to have a greater effect through the use of bots. Two questions: 1) How does this work? 2) How did this emerge?

The InfoWars domain, an alternative news website that focuses on Alt-Right and conspiracy theory themes, was the second-most tweeted domain, but as (Figure 1) shows it was only tenuously connected to one other node.

Why? Is InfoWars more polarized? Is it using something other than Twitter?

Infowars Inbound links

Domain score	Domain trust score	Domain	Backlinks	IP Address	Country	First seen	Last seen
0	0	breakingnewsfeed.com	1857029	174.129.22.101	us	2015-09-28	2017-03-26
4	4	e-graviton.com	1335835	67.210.126.35	us	2014-01-19	2017-03-21
33	39	prisonplanet.com	648958	69.16.175.42	us	2013-06-07	2017-03-25
1	0	nwostop.com	346153	104.28.28.16	us	2014-01-19	2017-03-21
13	31	nwoo.org	182060	81.0.208.215	cz	2013-06-07	2017-03-26
12	30	conservative-headlines.com	151778	104.18.50.72	us	2016-06-27	2017-03-22
1	0	america2fear.com	92766	69.64.46.138	us	2014-11-14	2017-03-23
4	29	subbmitt.com	49288	64.251.23.173	us	2015-02-04	2017-03-26
14	30	anotherdotcom.com	47195	174.129.236.72	us	2014-10-02	2017-03-20
1	0	exzacktamountas.com	43748	208.100.60.13	us	2016-06-08	2017-03-24

9:00 – 5:30 BRC

John is having trouble getting Linux running on the laptop
- No luck. Re-submitting for an Alienware deskside
Back to getting the temporal coherence. Last try to finish up, then switching to fitness landscape optimization, which I dreamed about last night
Finished coherence! Had to include a state check for a timeline to see if a DIRTY state had been touched with an update. If not, then the timeline is set to CLOSED. If a new cluster appears that would have had some overlap, a new timeline is created anyway. This could be an optional behavior.
- Still need to test rigorously across multiple data sets
Long scrum, then ML meeting.
- Hard tasks
  - TF server set up to work in our environment
  - Pre-calculated models to speed up training from research browser
  - T-SNE or other mapping of returned CSE text to support exploration
  - Fast, on-the-fly classification and entity extraction within the research browser framework. Plus interactive training
  - NMF (or other) topic extraction tied to human labeling and curation, plus cross-user validation of topics
Poster with Aaron later? Yep. Couple of hours. Done?

Oh, just why? Spent an hour on this before going brute force:

def get_last_cluster(self) -> ClusterSample:
    # return self._cluster_dict[self._cluster_dict.keys()[-1] TODO: This should work
    toReturn = None
    for key in self._cluster_dict:
        toReturn = self._cluster_dict[key]
    return toReturn
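A likely reason the commented-out line fails: in Python 3, keys() returns a view object that cannot be indexed, so keys()[-1] raises a TypeError. A minimal sketch of two alternatives follows. The OrderedDict and its contents are stand-ins, since the actual type of _cluster_dict is not shown in these notes.

```python
from collections import OrderedDict

# Hypothetical stand-in for self._cluster_dict: timestamp -> cluster sample
cluster_dict = OrderedDict()
cluster_dict[10.07] = "sample_a"
cluster_dict[10.18] = "sample_b"

# Option 1: copy the keys into a list, then index. Works for any dict, O(n) copy.
last_key = list(cluster_dict.keys())[-1]
last_value = cluster_dict[last_key]

# Option 2: OrderedDict supports reversed(), so no copy is needed
last_key_fast = next(reversed(cluster_dict))
last_value_fast = cluster_dict[last_key_fast]

print(last_value, last_value_fast)
```

Both options raise on an empty dict (IndexError and StopIteration respectively), whereas the brute-force loop returns None, so the empty case needs its own check if that behavior matters.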

Walked through some gradient descent regression code with Bob. More tomorrow?
Got the new sort working with Aaron. Much faster progress as a pair

Phil 4.24.17

7:00 – 8:00, 3:00 – 4:00 Research

Continuing to tweak paper
Starting Examining the Alternative Media Ecosystem through the Production of Alternative Narratives of Mass Shooting Events on Twitter
- From the introduction. Do I need something like this? Our contributions include an increased understanding of the underlying nature of this subsection of alternative media — which hosts conspiratorial content and conducts various anti-globalist political agendas. Noting thematic convergence across domains, we theorize about how alternative media may contribute to conspiratorial thinking by creating a false perception of information diversity.
Conspiracy Theories
- Cass R. Sunstein
- Adrian Vermeule
A cool thing on explorers and veloviewer: Here’s an overview of the project
Brownbag
- Teaching abstract concepts to children and tweens (STEM)
- Cohesive understanding of science over time
- Wearable technology as the gateway for elementary-school-aged kids? Research shows that they find them valuable
- How are these attributes measured? <——-!!!!!!!
- Live visualization sensing and visualization
- Zephyr bioharness
- Gender/age differences? Augmented reality? Through-phone?
- leylanorooz.com/research/

8:30 – 2:30 BRC

Expense report! Done? Had to get a charge number, and re-enter. Took forever.
Found out that I’m getting a laptop rather than what I asked for
- Having John install Ubuntu and verify that multiple monitors work in Linux
Helped Bob set up Git repo
Still working on temporal coherence. Think I’ve figured out the logic. Now I need to set clusters in ClusterTimelines
Learned how to do Enums in Python
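For future reference, a minimal Enum sketch. The class name and the OPEN member are hypothetical; DIRTY and CLOSED are borrowed from the timeline states mentioned elsewhere in these notes, and the real ClusterTimelines code may define them differently.

```python
from enum import Enum, auto


class TimelineState(Enum):
    """Hypothetical timeline states; auto() assigns 1, 2, 3, ..."""
    OPEN = auto()
    DIRTY = auto()
    CLOSED = auto()


state = TimelineState.DIRTY

# Members are singletons, so identity comparison is the idiomatic check
if state is TimelineState.DIRTY:
    print("timeline needs a check:", state.name, state.value)

# Lookup by name and iteration over all members
same_state = TimelineState["DIRTY"]
all_names = [s.name for s in TimelineState]
print(same_state, all_names)
```

Note that auto() requires Python 3.6 or later; on older versions the values have to be written out by hand.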

Phil 4.21.17

6:00 – 7:00 Research

Continuing A Parsimonious Language Model of Social Media Credibility Across Disparate Events – Finished
- All in all, my thoughts about credibility are that it is important to understand when credibility differs from trustworthiness. When is the story more compelling than the facts? This paper shows how to get one side of that, at least through the lens of textual analysis. More than anything, it makes me want to have time to read Judgement under uncertainty, which has been languishing on my Kindle. Here’s a letter the authors published in Science that might provide some worthwhile insights in fewer pages
- Event Factuality – This looks much closer. How to determine factuality in language. But also how to relate the relevant facts to the pertinent facts
- Roser Sauri’s Dissertation: A Factuality Profiler for Eventualities in Text
  - Are You Sure That This Happened? Assessing the Factuality Degree of Events in Text. This looks very interesting, and may have some semantic space concepts. Need to read. TODO
- Did it happen? The pragmatic complexity of veridicality assessment
  - Christopher Potts
    - Mapping the social genome
- Social psychologists argue that when faced with complex, difficult to explain phenomenon, individuals often take the “cognitive shortcut” of believing the phenomenon instead of assessing and analyzing it. This could be important, as it implies that there are shortcuts (wormholes?) through belief space that take advantage of psychological principles.

8:30 – 5:00 BRC

Need to think about handling time, so we can see if people are getting better
All hands meeting
- Transforming healthcare WRT identifying risks and anomalies for the purpose of reducing variance in care. From what to what?
4 things:
- Technology: get to and stay at the leading edge of what we’re marketing. Investment commitment (CCRi These guys? Alias, Commonwealth university)
- Sales development
- Partnering
- Capital (direct raise from investors) Alignment at a capital level?
V2 & V3 timelines and capabilities
Sales and capital story
Discussion (2 hours)

Phil 4.20.17

7:00 – 8:00 Research

Working on submitting CSCW abstract. Slow response from the PCS website. Actually wound up using my private email rather than the university’s because the response was so slow. Anyway, the abstract is uploaded!
Reading the following papers and adding to the background for the paper:
- From CSCW 2017
  - http://comp.social.gatech.edu/papers/cscw17-cred-mitra.pdf
  - http://faculty.washington.edu/kstarbi/Arif_Starbird_CorrectiveBehavior_CSCW2017.pdf
- The other Kate Starbird paper to be presented at ICWSM17
  - http://faculty.washington.edu/kstarbi/Alt_Narratives_ICWSM17-CameraReady.pdf
Starting A Parsimonious Language Model of Social Media Credibility Across Disparate Events
- “CREDBANK: A Large-scale Social Media Corpus With Associated Credibility Annotations” (CSCW paper)

8:30 – 6:00, 7:00 – 10:00 BRC

Drove up to NJ
Still working on temporal coherence of clusters. Talked through it with Aaron, and we both believe it’s close
Another good discussion with Bob
BRC dinner meet-n-greet

Phil 4.19.17

7:00 – 8:00 Research

Heard about these folks this morning: Time Well Spent. It talks about civilizational design, although it seems like a somewhat naive appeal?
- Livable Media Research Areas
- Explorable Explanations (2011)
- The Humane Representation of Thought (2014 Talk)
CSCW 2018:

Abstract due this Thursday, full paper due next Thursday.

INFO: https://cscw.acm.org/2018/submit.html

The Actual submission site: https://new.precisionconference.com/user/login
- Rewrite the abstract to foreground the numerical model/theory aspect – done.
- Add recent CSCW to background:
  Two CSCW 2017 papers to cite:
  
  http://comp.social.gatech.edu/papers/cscw17-cred-mitra.pdf
  
  http://faculty.washington.edu/kstarbi/Arif_Starbird_CorrectiveBehavior_CSCW2017.pdf
  
  The other Kate Starbird paper to be presented at ICWSM17
  
  http://faculty.washington.edu/kstarbi/Alt_Narratives_ICWSM17-CameraReady.pdf

8:30 – 5:00 BRC

Have Aaron read abstract

Finishing up temporal coherence in clustering. Getting differences, now I have to figure out how to sort, and when to make a new cluster.

timestamp = 10.07
	t=10.07, id=0, members = ['ExploitSh_54', 'ExploitSh_65', 'ExploitSh_94', 'ExploreSh_0', 'ExploreSh_1', 'ExploreSh_17', 'ExploreSh_2', 'ExploreSh_21', 'ExploreSh_24', 'ExploreSh_29', 'ExploreSh_3', 'ExploreSh_35', 'ExploreSh_38', 'ExploreSh_4', 'ExploreSh_40', 'ExploreSh_43', 'ExploreSh_48', 'ExploreSh_49', 'ExploreSh_8']
	t=10.07, id=1, members = ['ExploitSh_50', 'ExploitSh_51', 'ExploitSh_52', 'ExploitSh_53', 'ExploitSh_55', 'ExploitSh_56', 'ExploitSh_57', 'ExploitSh_58', 'ExploitSh_59', 'ExploitSh_60', 'ExploitSh_61', 'ExploitSh_62', 'ExploitSh_64', 'ExploitSh_66', 'ExploitSh_67', 'ExploitSh_69', 'ExploitSh_70', 'ExploitSh_71', 'ExploitSh_72', 'ExploitSh_73', 'ExploitSh_74', 'ExploitSh_75', 'ExploitSh_76', 'ExploitSh_77', 'ExploitSh_78', 'ExploitSh_79', 'ExploitSh_80', 'ExploitSh_81', 'ExploitSh_82', 'ExploitSh_83', 'ExploitSh_84', 'ExploitSh_85', 'ExploitSh_87', 'ExploitSh_88', 'ExploitSh_89', 'ExploitSh_90', 'ExploitSh_91', 'ExploitSh_92', 'ExploitSh_93', 'ExploitSh_95', 'ExploitSh_96', 'ExploitSh_97', 'ExploitSh_99', 'ExploreSh_10', 'ExploreSh_11', 'ExploreSh_13', 'ExploreSh_14', 'ExploreSh_15', 'ExploreSh_16', 'ExploreSh_18', 'ExploreSh_19', 'ExploreSh_20', 'ExploreSh_23', 'ExploreSh_25', 'ExploreSh_26', 'ExploreSh_27', 'ExploreSh_28', 'ExploreSh_30', 'ExploreSh_31', 'ExploreSh_32', 'ExploreSh_33', 'ExploreSh_34', 'ExploreSh_36', 'ExploreSh_37', 'ExploreSh_41', 'ExploreSh_42', 'ExploreSh_45', 'ExploreSh_46', 'ExploreSh_47', 'ExploreSh_5', 'ExploreSh_7', 'ExploreSh_9']
	t=10.07, id=-1, members = ['ExploitSh_63', 'ExploitSh_68', 'ExploitSh_86', 'ExploitSh_98', 'ExploreSh_12', 'ExploreSh_22', 'ExploreSh_39', 'ExploreSh_44', 'ExploreSh_6']

timestamp = 10.18
	t=10.18, id=0, members = ['ExploitSh_50', 'ExploitSh_51', 'ExploitSh_52', 'ExploitSh_53', 'ExploitSh_55', 'ExploitSh_56', 'ExploitSh_57', 'ExploitSh_58', 'ExploitSh_59', 'ExploitSh_60', 'ExploitSh_61', 'ExploitSh_62', 'ExploitSh_63', 'ExploitSh_64', 'ExploitSh_65', 'ExploitSh_66', 'ExploitSh_67', 'ExploitSh_69', 'ExploitSh_70', 'ExploitSh_71', 'ExploitSh_72', 'ExploitSh_73', 'ExploitSh_74', 'ExploitSh_75', 'ExploitSh_76', 'ExploitSh_77', 'ExploitSh_78', 'ExploitSh_79', 'ExploitSh_80', 'ExploitSh_81', 'ExploitSh_82', 'ExploitSh_83', 'ExploitSh_84', 'ExploitSh_85', 'ExploitSh_86', 'ExploitSh_87', 'ExploitSh_88', 'ExploitSh_89', 'ExploitSh_90', 'ExploitSh_91', 'ExploitSh_92', 'ExploitSh_93', 'ExploitSh_94', 'ExploitSh_95', 'ExploitSh_96', 'ExploitSh_97', 'ExploitSh_99', 'ExploreSh_0', 'ExploreSh_1', 'ExploreSh_10', 'ExploreSh_11', 'ExploreSh_13', 'ExploreSh_14', 'ExploreSh_15', 'ExploreSh_16', 'ExploreSh_17', 'ExploreSh_18', 'ExploreSh_19', 'ExploreSh_2', 'ExploreSh_20', 'ExploreSh_21', 'ExploreSh_23', 'ExploreSh_24', 'ExploreSh_25', 'ExploreSh_26', 'ExploreSh_27', 'ExploreSh_28', 'ExploreSh_29', 'ExploreSh_3', 'ExploreSh_30', 'ExploreSh_31', 'ExploreSh_32', 'ExploreSh_33', 'ExploreSh_34', 'ExploreSh_35', 'ExploreSh_36', 'ExploreSh_37', 'ExploreSh_38', 'ExploreSh_4', 'ExploreSh_40', 'ExploreSh_41', 'ExploreSh_42', 'ExploreSh_43', 'ExploreSh_45', 'ExploreSh_46', 'ExploreSh_47', 'ExploreSh_48', 'ExploreSh_49', 'ExploreSh_5', 'ExploreSh_7', 'ExploreSh_8', 'ExploreSh_9']
	t=10.18, id=-1, members = ['ExploitSh_54', 'ExploitSh_68', 'ExploitSh_98', 'ExploreSh_12', 'ExploreSh_22', 'ExploreSh_39', 'ExploreSh_44', 'ExploreSh_6']
current[0] 32.43% similar to previous[0]
current[0] 87.80% similar to previous[1]
current[0] 3.96% similar to previous[-1]
current[-1] 7.41% similar to previous[0]
current[-1] 82.35% similar to previous[-1]

In the above example, we originally have 3 clusters and then 2. The two that map are pretty straightforward: current[0] 87.80% similar to previous[1], and current[-1] 82.35% similar to previous[-1]. Not sure what to do about the group that fell away. I think there should be an increasing ID number for clusters, with the exception of [-1], which is unclustered. Once a cluster goes away, it can’t come back.
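The similarity figures above appear to be consistent with a Dice-style overlap score, 2·|A ∩ B| / (|A| + |B|), computed on cluster membership. That is an inference from the numbers, not a statement about the actual code. A minimal sketch using the two unclustered groups from the log (the function name is hypothetical):

```python
def dice_similarity(a, b) -> float:
    """Overlap between two membership lists: 2 * |A & B| / (|A| + |B|)."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # two empty clusters are treated as identical
    return 2.0 * len(set_a & set_b) / (len(set_a) + len(set_b))


# Members of cluster id=-1 at t=10.07 and t=10.18, copied from the log above
previous_unclustered = ['ExploitSh_63', 'ExploitSh_68', 'ExploitSh_86', 'ExploitSh_98',
                        'ExploreSh_12', 'ExploreSh_22', 'ExploreSh_39', 'ExploreSh_44',
                        'ExploreSh_6']
current_unclustered = ['ExploitSh_54', 'ExploitSh_68', 'ExploitSh_98', 'ExploreSh_12',
                       'ExploreSh_22', 'ExploreSh_39', 'ExploreSh_44', 'ExploreSh_6']

score = dice_similarity(current_unclustered, previous_unclustered)
print("current[-1] {0:.2%} similar to previous[-1]".format(score))
```

Seven shared members out of 8 + 9 gives 14/17, which matches the 82.35% in the log. Assigning each current cluster the ID of its best-scoring previous cluster, above some threshold, would be one way to carry IDs forward.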

Long discussion with Bob and Aaron, basically coordinating and giving Bob a sense of where we are. That wound up being most of the day.

Phil 4.18.17

7:00 – 8:00, 4:00 – 5:00 Research

Redid the paper in the ACM format. I have to say, LaTeX did make that pretty easy…
Got the gridded population data. Now I have to find a Python reader for arcGIS .asc files. This looks like it might work (ASCII to Raster)
Chat with Helena
- Now in CSCW
- “In the field of CSCW there are emergent trends”. Starbird & etc. Check mail
- Abstract due on Thursday!

8:30 – 3:30 BRC

Discussion with Aaron about spatial organization and dimension reduction.
- Went looking for reasons about the ordering of ICD10 codes and got here: List ICD-10 codes in correct order to preserve medical necessity
- Found this: ICD-10-CM Official Guidelines for Coding and Reporting
Working on cluster similarity
Built a dictionary of all clusters in a sample. Need to compare them next

Phil 4.17.17

7:00 – 8:00, 3:00 – 4:00 Research

This looks good: Bayesian data analysis for newcomers
Also this: Seeing Theory
I want to do a map that is based on population vs geography:
- Gridded Population of the world (Matplotlib to generate image?). Can’t get the data directly. Need to see if available via UMBC (or maybe GLOBE?)
- Wilbur terrain generation (installed. Will accept an image as the heightmap source)
Tried using QT designer, but it can’t find the web plugin?
Installing Python 3.6 on my home dev box
Downloaded all the python code, and my simulation data. I want to be able to merge tables to produce networks that can then be plotted, so I think it’s mostly going to be installing things this morning
NOTE: When installing Python, the only way to install for all users is to go through the advanced setup.
Installing packages. CMD needs to run as admin, which blows.
After some brief issues with the IDE not being set in structure, got all the pieces that use numpy, pandas and matplotlib running. That should be enough for table parsing (although there will be the excel reading and writing installs), though I still need to get started with graph-tool
Paper was rejected – time to try ACM? LaTeX format. Downloaded and compiled! Now I just have to move the text over? Wrap the existing text? That’s something for tomorrow.

8:30 – 2:30, BRC

Working on table joins. That was pretty straightforward. Note that for column collision you have to provide a suffix. Makes me think that I want to compare across DataFrames instead

eu.read_dataframe_excel(args.excelfile, None)
cluster_df = eu.read_dataframe_sheet("Cluster ID")
print("cluster_df")
#  print(cluster_df)
dist_df = eu.read_dataframe_sheet("Distance from mean center")
print("dist_df")
#  print(dist_df)
merged_df = cluster_df.join(other=dist_df, lsuffix='_c', rsuffix='_d')
print("merged_df")
print(merged_df)
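The snippet above depends on the eu helper, so here is a self-contained sketch of the same join on two tiny made-up frames. The column names follow the agent naming in these notes, but the distance values are invented and the frame layout (time slices as index, agents as columns) is an assumption.

```python
import pandas as pd

# Made-up stand-ins for the "Cluster ID" and "Distance from mean center" sheets
cluster_df = pd.DataFrame({"ExploreSh_0": [0, 0], "ExploitSh_50": [1, 0]},
                          index=[10.07, 10.18])
dist_df = pd.DataFrame({"ExploreSh_0": [0.4, 0.3], "ExploitSh_50": [1.2, 0.9]},
                       index=[10.07, 10.18])

# Both frames share column names, so join() needs suffixes to disambiguate;
# without them pandas raises a ValueError about overlapping columns
merged_df = cluster_df.join(other=dist_df, lsuffix='_c', rsuffix='_d')

print(merged_df.columns.tolist())
print(merged_df)
```

The join aligns rows on the index, so the two sheets need the same time-slice index for the merged rows to line up.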

So now that I can read in and analyze sheets, what am I trying to do? I think that for each time slice, and by cluster, produce a sorted list from most to least common membership.

Phil 4.15.17

Thoughts about CSCW ‘Fake News’ class

Home to Roost
Fake it to make it game
The Law of Group Polarization
Online clustering, fear and uncertainty in Egypt’s transition
Study: Breitbart-led right-wing media ecosystem altered broader media agenda
On the Milo Bus With the Lost Boys of America’s New Right
Facebook’s guide to handling Fake News
Junk News and Bots during the U.S. Election: What Were Michigan Voters Sharing Over Twitter? And here’s the full paper.
Principles that motivate citizen behaviour according to Montesquieu
- Driving each classification of political system, according to Montesquieu, must be what he calls a “principle”. This principle acts as a spring or motor to motivate behavior on the part of the citizens in ways that will tend to support that regime and make it function smoothly.
  - For democratic republics (and to a somewhat lesser extent for aristocratic republics), this spring is the love of virtue—the willingness to put the interests of the community ahead of private interests.
  - For monarchies, the spring is the love of honor—the desire to attain greater rank and privilege.
  - Finally, for despotisms, the spring is the fear of the ruler.

A political system cannot last long if its appropriate principle is lacking. Montesquieu claims, for example, that the English failed to establish a republic after the Civil War (1642–1651) because the society lacked the requisite love of virtue.

My thoughts on Arendt’s Origin of Totalitarianism as it relates to information
FiveThirtyEight’s take on r/TheDonald
Information Wars: A Window into the Alternative Media Ecosystem
- By Kate Starbird, (Scholar)
  Professor, Human Centered Design & Engineering, University of Washington
  
  Crisis Informatics, Crowdsourcing, Crowdwork, CSCW, Human Computer Interaction
On Building a “Fake News” Classification Model
Helena sent a nice link to Announcing New Research: “A Field Guide to Fake News” it’s from First Draft News, which purports to be Essential resources for reporting and sharing information that emerges online
Defense Against the Dark Arts: Networked Propaganda and Counter-Propaganda
- Jonathan Stray
Filter bubbles, echo chambers, and online news consumption. Post is here
Thom Lieb found this for me today: Reuters Tracer – A Large Scale System of Detecting Verifying Real-Time News Events from Twitter. Downloaded. A must read.

Phil 4.14.17

7:00 – 8:00 Research

This Is Why Trump’s Conspiracy Theories Work, Say Experts
Good advice on writing papers
Setting up to get clustering data.
- Explore/Exploit ratio 50/50
- Explore will be in the flocking range: 0.1 – 1.6
- Exploit will be in the echo chamber stage: 3.2 – 10.0
- Cluster EPS of 0.25 gives good diversity
- Create a DataFrame from the “Cluster ID” sheet
- Ran the sims, but the default destination is wrong. Re-running
- Discriminate between explorer and exploiter cluster membership over time
  - Clusters the agent belonged to
  - Will need to post-process to fix cluster switching. Probably taking all the cluster average positions and applying a common ID if the difference between the samples is less than a given delta

8:30 – 3:30 BRC

Generating data for clustering
Will try to read in and merge the DataFrames so that I have position (angle or origin distance) to calculate group persistence
Human Motion Recognition Using Isomap and Dynamic Time Warping
Fixing DataFrames. Mostly this is a case of bad data handling. This is in the data:
```
359000.00,836,2,2,4,0,1,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,#<Geocoder::Result::Bing:0x007fe9a1b39718>,#<Geocoder::Result::Bing:0x007fe9a1b39150>
```
Which means that we have to handle #<Geocoder::Result::Bing:0x007fe9a1b39718>,#<Geocoder::Result::Bing:0x007fe9a1b39150>. I am very disappointed in this book
Giving up. Wrote a review. Monday I’ll try doing joins on DataFrames. That being said, I learned a lot on how to check for errors.

Phil 4.13.17

7:00 – 8:00 Research

Reading the HCIC Boaster Poster description. Downloaded to HCIC 2017 folder
- A “boaster-poster” is a poster that describes your most current research endeavor and/or interest. The idea is to foster dialogue about your topic of interest/research so you can meet like-minded HCIC 2017 attendees. Format for a “boaster-poster” is as follows: a short description of your perspective and interest in this area, plus a description of your work in form of a single page (8.3 × 11.7 inches) poster. Boaster-posters offer an opportunity to showcase the work of new and experienced authors alike. You can use images and text to frame and illustrate your ideas. A list with boaster-poster titles, authors & abstracts will be distributed at the conference, and the posters will be available for view at the HCIC conference. We strongly encourage all student attendees to submit a boaster to HCIC, as boaster authors will have opportunities across the conference to discuss their work with other attendees through a new interactive format for 2017.
  Boaster-poster deadline: June 2nd, 2017
  A pdf that includes:
  - A cover page with
    - Title, author(s) (indicate those available to chat at meeting)
    - At least three keywords
    - A 150 word abstract
  - A draft of your poster
- So something like “sociophysics-informed design“? I’m thinking that if I can take agent cluster membership and use that to construct a social network graph, I could show something that looks like this:
- Maybe use graph-tool Python library?
- Need to look at Zappos and McMaster websites as examples of explorational interfaces
- Facebook’s guide to handling Fake News. High effort. I wonder what kind of feedback mechanisms there are?

8:30 – 6:00 BRC

Sprint planning
Doctor visit, 10:15 – 11:00
Discussion with Aaron about visualizing high-dimensional clusters in low-dimensional space for intuitive understanding
Working through Thoughtful Machine Learning. Very disappointed. The code in GitHub doesn’t match the book, doesn’t even have an entry point, and blows up in the init. Sad! Here’s the offending line (df is the read-in DataFrame):
```
df = (df - df.mean()) / (df.max() - df.min())
```
Learning more about the pandas DataFrame here so maybe I can fix the above.
Actually, Skillport has useful stuff, but all the videos crash before the end
The problem is that the floating point values in the file are being read in as string values, and crashing the calculation. I’ve tried doing an apply function that changes the value but it doesn’t result in the type change. Going to try changing everything to float tomorrow.
Helped Aaron break down the tasking for this sprint’s efforts.
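One possible approach to the string-typed columns described above, sketched on made-up data. pd.to_numeric with errors="coerce" turns anything unparseable into NaN, and the result of apply() has to be assigned back, since it returns a new frame rather than changing the original in place. Whether dropping the bad rows is appropriate for the book's data set is an assumption.

```python
import pandas as pd

# Made-up frame where numbers were read in as strings, with one bad cell
df = pd.DataFrame({"a": ["1.0", "2.0", "3.0"],
                   "b": ["10", "20", "oops"]})

# Convert every column to numeric; unparseable cells become NaN
df = df.apply(pd.to_numeric, errors="coerce")

# Drop rows that contained unparseable values (assumed acceptable here)
df = df.dropna()

# The normalization line from the notes now works on float columns
normalized = (df - df.mean()) / (df.max() - df.min())
print(df.dtypes)
print(normalized)
```

A column whose values are all identical would make (max - min) zero and produce NaN or inf after the division, so that case needs a separate check.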
