Author Archives: pgfeldman

Phil 6.6.17

Research about one hour

Broke apart the assets for the CI 2017 poster V2. What about the relationships from the P&C considered harmful outline? That’s not in the abstract…

phil 6.5.17

Research (2 hours)

I am thinking about flocking as a form of group decision making under uncertainty., kind of like the Multi-armed bandit problem parallelized. So in this context:
Explore: Assumption of unknown information
Flocking: Assumption of incomplete information
Exploit: Assumption of complete information

Phil 6.1.17

7:00 – 8:00 Research

Converted poster to PDF and test printed. And then noticed some things. And did it again. And then noticed some things. And did it again. And then noticed some things. And did it again. I think I’m done now. Need to send off this evening.

9:00 – 4:00 BRI

Sprint planning

Phil 5.31.17

7:00 – 8:00 Research

DeepWalk: Online Learning of Social Representations Bryan Perozzi, Steven Skiena
Saw the class notes on my CSCW guest lecture. The points got across, which feels nice.
Sent copies of the boaster to Wayne
Need to get started on the revision of the CI poster. Here’s some inspiration from Nicolai Marquardt:
First pass: fixed DTW charts to show separate populations:

8:30 – 4:00 BRI

The Meaning of Underscores in Python
Tried to add research code to timesheet. No luck. Let T know.
Tried to access new Jira and Confluence pages, They are visible thought the OpenVPN tunnel. but the login/password does not work
Reading the Ketos User guide and annotating. Finished – sending to Aaron
TEM meeting at 2:00
Meeting with CCRi. Lead dev: Vivek Dhand
- String matching, BOW, LSI competitors
- Based on word2vec, combined with a TF-IDF scoring
  Trained on wikipedia
- Trained on seperate training server?
- Apps on the training server? Train one classifier for each field
Things we did in 2016
- StanfordNLP+jsoup tool to categorize and tag web pages for statistical analysis
- Statistical analysis of said pages, include backlink and other meta data analysis
- Google CSE interface, plus cleaning tools
- Document centrality analysis tool (JavaFX! Woohoo!) (LSI, TF-IDF, PageRank, adjacency, etc calculations at interactive rates)(outputs for WEKA)
- Use of above tool to create CSE search terms that improved craw precision by 500% (https://viztales.com/wp-content/uploads/2016/05/extracting-better-search-terms.docx)
- Tagged hundreds of web pages because someone had to.
- Proposal writing
- Group polarization modeling using flocking agent-based simulation
- Microservices
- Classifiers in WEKA and the WEKA api
- Research Browser prototype
- NMF tool for topic extraction based on UTOPIAN paper

Phil 5.30.17

7:00 – 8:30 Research

Really tempted to call the HCIC poster ‘Precision and Recall Considered Harmful’. Maybe the CHIIR paper insted?
Got a good deal of work done over the weekend. Here’s my latest abstract:
Also made good progress on the poster. Will need to re-run the text for LMN -done:
Sent both mockups off to Wayne
Anatomy of news consumption on Facebook
- In this paper, we explore the anatomy of the information space on Facebook by characterizing on a global scale the news consumption patterns of 376 million users over a time span of 6 y (January 2010 to December 2015). We find that users tend to focus on a limited set of pages, producing a sharp community structure among news outlets. We also find that the preferences of users and news providers differ. By tracking how Facebook pages “like” each other and examining their geolocation, we find that news providers are more geographically confined than users. We devise a simple model of selective exposure that reproduces the observed connectivity patterns.

9:00 – 4:00 BRI

Send T my schedule – done
Clustering has been kicked under the bus. Need to respond, but also look for new work? Working on that with Aaron.
I’m getting a ’99’ charge number for research.
Helping Aaron put his accounts back together
Meeting with Chris Y, T, Aaron and me
- How do we segment off the Aaron/Phil work
- Figure out BRC and tell them they’re wrong? How to make this about success?
Fixed a bug with CorpusManager where the code broke it the kept percent is 100%

Phil 5.26.17

7:30 – 12:00 Research

Finished the abstract page, I think. Nope – need to extend the text a bit:
Starting on the Boaster part
Read the words from the Boaster V2 into LMN, clicked search on a vector of “network model social create twitter sunstein validate ” and got this: Towards the Modeling of Behavioral Trajectories of Users in Online Social Media by Alessandro Bessi. Looking at his publications, he’s another twin.
Looking for ‘flocking’ images. I found this work tracking users of Twitter in Manhattan. And this, too: iographica
Graphing the history of Philosophy using DBpedia

12:30 – 4:00 BRC

Wrote up stories for upcoming 2-week(!) sprints
Long discussion with Bob and Aaron about SOMs

Phil 5.25.17

Sand Spring Bank

Ohio RV

7:00 – 8:00 Research

Computer-mediated telepathy using NN to classify FMRI data tagged by human activity at the time (e.g. looking at an image)
- Dr. Mary Lou Jepsen (wikipedia)
- opnwatr.io: What if you could see what was going on in your brain or body with the detail of a high resolution camera or MRI machine in a simple wearable? Openwater is creating a device that can enable us to see inside our brains or bodies in great detail. With this comes the promise of new abilities to diagnose and treat disease and well beyond – communicating with thought alone.
- More work on abstract cover. Pretty much done, though a rewording is in order. Something about fake news rather than Trump v Clinton, I think

8:30 – 5:00 BRC

Updated confluence page to have Ellis email and the documents he referenced
Sprint grooming
Thinking hard about self-organizing maps
- Wikipedia article
- Self-Organizing Maps with Google’s TensorFlow
- This is a demonstration of how a self-organizing map (SOM), also known as a Kohonen network, can be used to map high-dimensional data into a two-dimensional representation. For the sake of an easy visualization ‘high-dimensional’ in this case is 3D.
- A Self Organizing Map (SOM) Package in Python: (SOMPY)
- Kohonen’s Self Organizing Feature Maps (tutorial)
- Scholar search for Teuvo Kohonen’s work
- Teuvo Kohonen’s home page at Helsinki Institute
  - Self-Organization and Associative Memory (book)
- Has Kohonen-style SOM fallen out of favor? (cross-validated post. Interesting links and compare with t-SNE)
- Scikit-learn developer discussion on whether to support SOMs
- Growing Self-Organizing Maps:
- Generative Topographic Mapping (GTM)
  - Relational generative topographic mapping (2011)

Phil 5.24.17

7:00 – 8:30 Research

Working on the new version of the HCIC boaster
Abstract: Add a picture of a network next to each item that reflects the polarization level. Maybe gray out the non-abstract bits.

9:00 – 5:30 BRC

Discovered that it’s the wired network that’s slow. Switched over to WiFi and finished setting up machines.
Checking out Facebook’s deep text
- The FB paper Text Understanding from Scratch, based on Natural language processing (almost) from scratch
- Connected to Jira!!
- Ran the papers through the LMN tool and searched on Scholar. Found some interesting papers
- This also led to an interesting post from the OpenAI people about Generative Models, which is kind of what I’m doing, though mediated through agents so that they can be understood by humans/me
Reading Natural Language Processing (Almost) from Scratch. Finished section 3.2
Got Ellis brain dump. Building a confluence page for the analytics research

Phil 5.23.18

7:00 – 8:00 Research

Reworking the poster using the research browser and found this:
- Emergence of metapopulations and echo chambers in mobile agents. Just from reading the abstract, their model is more complex, but similar results? Multi-agent models often describe populations segregated either in the physical space, i.e. subdivided in metapopulations, or in the ecology of opinions, i.e. partitioned in echo chambers. Here we show how both kinds of segregation can emerge from the interplay between homophily and social influence in a simple model of mobile agents endowed with a continuous opinion variable. In the model, physical proximity determines a progressive convergence of opinions but differing opinions result in agents moving away from each others. This feedback between mobility and social dynamics determines the onset of a stable dynamical metapopulation scenario where physically separated groups of like-minded individuals interact with each other through the exchange of agents. The further introduction of confirmation bias in social interactions, defined as the tendency of an individual to favor opinions that match his own, leads to the emergence of echo chambers where different opinions coexist also within the same group. We believe that the model may be of interest to researchers investigating the origin of segregation in the offline and online world.

8:30 – 5:00 BRC

Spent most of the day getting the development environment up and running on all the code.
Built a ‘dev file’ that fits on a thumb drive and contains versions of all the dev tools, SDKs and libraries. Was hampered by very slow internet. The workstation should go quicker as a result, though
Meeting with Theresa, Bob, Shawn, Aaron and Ellis. I think we laid out the dev process and it’s associated constraints. Shawn will go get us data to characterize, and Eli will get some ICU(?) data containing ICD codes.

Phil 5.22.17

Call acorn inn!

bike brake lights!

Google’s New AI Is Better at Creating AI Than the Company’s Engineers

7:00 – 8:00 Research

Working on HCIC poster. Yay! First pass done! Now I think I need to do an abstract?

8:30 – 2:30 BRC

Setting up the laptop
Getting intellij set up
Discussions with Aaron about mapping ICD codes

Phil 5.9.17

7:30 – 4:00 BRC

Bike to work day. Must’ve missed Mike
Tried to program the phone, but no pause for teleconf.
Still no Jira.
Working on documentation
Clustering meeting with Shawn, Aaron and Ellis?
PubMed search for ICD code Clustering. Not that much???

Research

Locally noisy autonomous agents improve global human coordination in network experiments

Phil 5.18.17

7:00 – 8:00 Research

Updated CV.
We tracked the Trump scandals on right-wing news sites. Here’s how they covered it. This is an article by Vox that might be good to look at a belief/information flow over time.
A Tale of Two Moralities, Part One: Regional Inequality and Moral Polarization
Working on HCIC poster
Social identity theory? No – Self-categorization theory
- Polarized Norms and Social Frames of Reference: A Test of the Self-Categorization Theory of Group Polarization

8:30 – 3:45 BRC

Still getting pushback on workstation. Ordered?
Still no Jira connectivity
Got a phone!
Cleaning up and documenting code

Phil 5.17.17

7:00 – 8:00 Research

Media Manipulation and Disinformation Online (Alice Marwick and Rebecca Lewis)
Call for Papers: Special Issue on Computational Propaganda and Political Big Data
Authoritarianism and Post-truth Politics
HCIC poster (June 2)
PhD Review (May 26) – done!
- Update CV!

8:30 – 4:30 BRC

Meetings. All. Day. Long.
Finished transferring Aaron’s ppt to LaTex. Looks pretty good, and an excellent first draft
- We need to publish as a baseline to reference for our methods and choices. If it makes it through the review process, then we can point to it when we’re called out like in this morning’s review

Phil 5.16.17

7:00 – 8:00 Research

Never made it to Fika yesterday. Worked a solid 13.5 hours. Artificial deadlines are pretty dumb.
Continuing with registration paperwork and other loose ends
- HCIC travel (May 19), poster (June 2) – Attempting to register. Account created
  - Registered!
  - Travel email sent. Sat June 24 – Sat Jul 1
- PhD Review (May 26) – started. Sure hope It incrementally saves…
  - Nope – starting over, offline. Making progress…

9:00 – 5:00 BRC

Sprint review. Went well, I think
Working on turning the PPT into a document

5.15.17

Well, the weekend ended on a sad, down note. Having problems getting motivated.

7:00 – 8:00 Research

Filled out CI 2017 form
Started HCIC registration
Started PhD review

8:30 – 8:00PM BRC

Run clustering on t-SNE again with Bob’s settings. It’s…. OK. We think that MDS and LLE are better for now, but there are almost certainly hyper parameter tweaking that we can do.
Here’s n example of actual data with lots of error between runs: but by adjusting the hyperparameter ‘neighbors’ from 5 (above) to 10 (below), we get a completely different result: Here, you can see that no cluster shared its nodes with any other cluster. That’s what we want. Stable, but with good granularity.
We can play some games on the clustering by seeing what happens when we remove some columns from our data. Here’s the above data with gender included and excluded: It’s possible to see that several items that were in cluster (0) distribute out when gender don’t override associated clusters.
Had a weird issue where LLE clustering on our test data that worked with neighbors = 10, now needs neighbors > 12 to work. Not sure why that’s happening.
Need to write up a report generator that does the following: For each cluster in the set that we are comparing:

size
stable/total
list of stable