Phil 12.21.16

7:00 – 8:00 Research

8:30 – 4:00 ASRC

Phil 12.20.16

7:00 – 8:00 School

  • Looks like I need to keep track of my hours better for next year. Getting started now
  • Continuing with Sociophysics

8:30 – 4:30 ASRC

  • Sorting TableColumn(s) – done. The trick is to change the name and the cellValueFactory:
    // tc is the TableColumn being re-pointed; colName is the key it reads from each row's Map
    tc.setText(colName);
    tc.setCellValueFactory(new MapValueFactory<>(colName));
  • Sorting rows is more straightforward. Just got a list of the sorted row sums and reordered based on that.
  • Need to update the value in the ‘selected field’ textarea.
  • Verify that the cells are tracking
  • Add a tab so that it’s possible to switch between the original and the product matrices
  • Add an Edit/recalculate capability
    • Tweak original matrix
    • Adjust k
    • Column clustering/renaming
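The row reordering mentioned above (get the row sums, then reorder on them) can be sketched in plain Java. This is a minimal illustration on a bare double[][], not the actual table code. The class name RowSumSort and the descending order are assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;

final class RowSumSort {
    /** Returns row indices ordered by descending row sum (the direction is an assumption). */
    static Integer[] orderByRowSum(double[][] mat) {
        double[] sums = new double[mat.length];
        for (int r = 0; r < mat.length; r++) {
            sums[r] = Arrays.stream(mat[r]).sum();
        }
        Integer[] order = new Integer[mat.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        // Negate the sum so that the largest row sum comes first.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -sums[i]));
        return order;
    }

    /** Builds a new matrix whose rows follow the given index order. */
    static double[][] reorderRows(double[][] mat, Integer[] order) {
        double[][] out = new double[mat.length][];
        for (int i = 0; i < order.length; i++) {
            out[i] = mat[order[i]].clone();
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 1}, {5, 2}, {0, 3}};
        Integer[] order = orderByRowSum(m);
        System.out.println(Arrays.toString(order));
        System.out.println(Arrays.deepToString(reorderRows(m, order)));
    }
}
```

The same index order could be applied to the row labels, and transposing first would give a column-sum sort.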

Phil 12.19.16

7:15 – 4:15 ASRC

  • Continuing with Sociophysics
    • Chapter 7: of flocks, flows and transports [page 179]
    • Boids (Flocks, herds and schools: A distributed behavioral model, Craig Reynolds):
      • Try to avoid collisions with other boids (repulsion)
      • Attempt to match velocity with neighboring boids
      • Attempt to stay close to nearby boids
    • If the collision avoidance is taken out and the number of dimensions increased, then this could be the model. Rather than the flock converging around a position, look at the distances between the individuals using DBSCAN and cluster.
    • Density and noise need to be independent variables and saved on runs. This would also be true in information space. You can have high organization in high density, low noise states. Thinking about that, this also implies one of the emergent properties of an information bubble is the low noise. Even though the environment may be very noisy, the bubble isn’t.
    • As with the other social models, individuals can have weight. That way the flock can have leaders and followers. (See Misinformed leaders lose influence over pigeon flocks to inform the model)
    • Also, I like the idea of a social network being built from belief proximity, which raises the cost for switching to another flock, even if they are nearby. It could be that once a social network forms, anti-belief repulsion starts to play a role.
  • BRC
    • Updating intellij and Java.
    • Intellij failed to patch. Odd. Tried again and it worked.
    • Working on getting tables to update
      • Clear() – done
      • Load – done
      • Select row and modify – done
      • Working on columns and cells
      • Need to sort by row and column. Do this as part of the update() process
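Going back to the flocking notes above: a minimal sketch of one update step for a boids-style model with collision avoidance removed and an arbitrary number of dimensions. The class name FlockStep, the weights and the neighborhood radius are all hypothetical. Agent weights (leaders and followers), noise and the DBSCAN clustering step are left out.

```java
final class FlockStep {
    /**
     * One synchronous update. Each agent steers toward the mean velocity (alignment)
     * and the mean position (cohesion) of the other agents within radius.
     * There is deliberately no collision-avoidance term.
     */
    static void step(double[][] pos, double[][] vel, double radius,
                     double alignW, double cohesionW, double dt) {
        int n = pos.length;
        int d = pos[0].length;
        double[][] newVel = new double[n][d];
        for (int i = 0; i < n; i++) {
            double[] meanP = new double[d];
            double[] meanV = new double[d];
            int count = 0;
            for (int j = 0; j < n; j++) {
                if (j == i || distance(pos[i], pos[j]) > radius) {
                    continue;
                }
                for (int k = 0; k < d; k++) {
                    meanP[k] += pos[j][k];
                    meanV[k] += vel[j][k];
                }
                count++;
            }
            for (int k = 0; k < d; k++) {
                newVel[i][k] = vel[i][k];
                if (count > 0) {
                    newVel[i][k] += alignW * (meanV[k] / count - vel[i][k])
                            + cohesionW * (meanP[k] / count - pos[i][k]);
                }
            }
        }
        // Apply all velocity changes at once so every agent sees the same old state.
        for (int i = 0; i < n; i++) {
            for (int k = 0; k < d; k++) {
                vel[i][k] = newVel[i][k];
                pos[i][k] += dt * vel[i][k];
            }
        }
    }

    /** Euclidean distance in any number of dimensions. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int k = 0; k < a.length; k++) {
            sum += (a[k] - b[k]) * (a[k] - b[k]);
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[][] pos = {{0, 0, 0, 0}, {1, 1, 1, 1}};
        double[][] vel = {{0, 0, 0, 0}, {0, 0, 0, 0}};
        for (int t = 0; t < 5; t++) {
            step(pos, vel, 10.0, 0.5, 0.1, 1.0);
        }
        System.out.println("distance after 5 steps: " + distance(pos[0], pos[1]));
    }
}
```

The pairwise distances from distance() are what a clustering pass would consume, with density and noise saved as run parameters.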

Phil 12.18.16

1:30 – 4:00 School. Rain, rain, rain.

  • Continuing with Sociophysics
    • 6.5 Is it really a small world? Searching post Milgram
      • 6.5.8 Funneling properties.
          • The funneling capability of a node can be defined as the fraction of successful dynamic paths through it when the target is fixed and the source is varied. Two thoughts: First, this seems to be a measurement of centrality. Second, large, vague nodes are needed for ‘laundering’ information into misinformation or conspiracy theory.
          • Consider four agents whose characteristics can vary between (0, 1).
            • Agent 1 has two color intensities: R=0.1, G= 0.7
            • Agent 2 has one color and two note volumes R=0.3, A=0.2, F=0.6
            • Agent 3 also has one color and two note volumes B=0.4, D=1, E=0.2
            • Agent 4 has three notes A=0.3, D=0.4, E=0.5
          • Let’s assume that funneling is not required if agents share a color or note. This means that A4 can get to A1 through A2, but A3 has to get to A1 via A4 and then A2. In a matrix this looks like
                  R     G     B     A     D     E     F
        Agent1   0.1   0.7    -     -     -     -     -
        Agent2   0.3    -     -    0.2    -     -    0.6
        Agent3    -     -    0.4    -    1.0   0.2    -
        Agent4    -     -     -    0.3   0.4   0.5    -
          • But if we add the hypernyms Color and Notes, we can get funneling. I am summing the color and notes to give a sense of the agent’s ‘projection’ into the larger, more general space. I think the ‘size’ of a funnel is the number of items that go into it times the range of each item. So Color would have a range of (0, 3) and Notes would have a range of (0, 4), since I’m not including the notes B, C, and G here:
                  R     G     B     A     D     E     F    Color  Notes
        Agent1   0.1   0.7    -     -     -     -     -     0.8     -
        Agent2   0.3    -     -    0.2    -     -    0.6    0.3    0.8
        Agent3    -     -    0.4    -    1.0   0.2    -     0.4    1.2
        Agent4    -     -     -    0.3   0.4   0.5    -      -     1.2
          • Now agents 2 and 3 can get to each other through either Color or Notes in two hops, and Agents 1 and 4 can reach each other by going through each of the funnels.
          • There should be a cost in using a funnel though. You lose the information about which color or which note. Intuitively, a series of steps with non-funnel links should be somehow more specific than the same number of steps through a funnel.
          • Practical uses would be a way to detect poorly reasoned conclusions, as long as the beginning and end of the train of thought could be identified.
    • Knowing a network by walking on it: emergence of scaling (Alexei Vázquez) Looks like an interesting guy with a wide range of publications.
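The four-agent funneling example above can be checked with a small sketch. It computes each agent's projection into the Color and Notes hypernyms and counts agent-to-agent steps with and without the funnels. The class and method names are hypothetical. One assumption to flag: a step here is one agent-to-agent link, so going agent to funnel to agent counts as a single step rather than the two hops used in the notes.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

final class FunnelSketch {
    static final Set<String> COLORS = Set.of("R", "G", "B");
    static final Set<String> NOTES = Set.of("A", "D", "E", "F");

    /** The four agents from the notes, as feature -> strength maps. */
    static Map<String, Map<String, Double>> agents() {
        Map<String, Map<String, Double>> m = new LinkedHashMap<>();
        m.put("Agent1", Map.of("R", 0.1, "G", 0.7));
        m.put("Agent2", Map.of("R", 0.3, "A", 0.2, "F", 0.6));
        m.put("Agent3", Map.of("B", 0.4, "D", 1.0, "E", 0.2));
        m.put("Agent4", Map.of("A", 0.3, "D", 0.4, "E", 0.5));
        return m;
    }

    /** Sum of an agent's strengths over the members of one hypernym. */
    static double projection(Map<String, Double> features, Set<String> hypernym) {
        double sum = 0;
        for (Map.Entry<String, Double> e : features.entrySet()) {
            if (hypernym.contains(e.getKey())) {
                sum += e.getValue();
            }
        }
        return sum;
    }

    /** Names an agent can link through; optionally includes the hypernym funnels. */
    static Set<String> links(Map<String, Double> features, boolean useFunnels) {
        Set<String> s = new HashSet<>(features.keySet());
        if (useFunnels) {
            if (projection(features, COLORS) > 0) s.add("Color");
            if (projection(features, NOTES) > 0) s.add("Notes");
        }
        return s;
    }

    /** Breadth-first count of agent-to-agent steps; -1 if the target is unreachable. */
    static int hops(String from, String to, boolean useFunnels) {
        Map<String, Map<String, Double>> all = agents();
        Map<String, Integer> dist = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        dist.put(from, 0);
        queue.add(from);
        while (!queue.isEmpty()) {
            String cur = queue.poll();
            if (cur.equals(to)) {
                return dist.get(cur);
            }
            for (String other : all.keySet()) {
                if (dist.containsKey(other)) continue;
                boolean shared = !Collections.disjoint(
                        links(all.get(cur), useFunnels), links(all.get(other), useFunnels));
                if (shared) {
                    dist.put(other, dist.get(cur) + 1);
                    queue.add(other);
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("Agent3 -> Agent1 without funnels: " + hops("Agent3", "Agent1", false));
        System.out.println("Agent3 -> Agent1 with funnels:    " + hops("Agent3", "Agent1", true));
    }
}
```

A cost for using a funnel could be added by weighting funnel links more heavily than specific links, which would turn the breadth-first search into a shortest-path search.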

Phil 12.16.16

7:00 – 4:00 ASRC

  • Continuing with Sociophysics
    • Social Phenomena on complex networks
      • Opinion and community formation in coevolving networks (Gerardo Iñiguez González)
        • Abstract: In human societies opinion formation is mediated by social interactions, consequently taking place on a network of relationships and at the same time influencing the structure of the network and its evolution. To investigate this coevolution of opinions and social interaction structure we develop a dynamic agent-based network model, by taking into account short range interactions like discussions between individuals, long range interactions like a sense for overall mood modulated by the attitudes of individuals, and external field corresponding to outside influence. Moreover, individual biases can be naturally taken into account. In addition the model includes the opinion dependent link-rewiring scheme to describe network topology coevolution with a slower time scale than that of the opinion formation. With this model comprehensive numerical simulations and mean field calculations have been carried out and they show the importance of the separation between fast and slow time scales resulting in the network to organize as well-connected small communities of agents with the same opinion.
      • Citing paper: Effects of deception in social networks (Gerardo Iñiguez González)<— Important???
        • Abstract: Honesty plays a crucial role in any situation where organisms exchange information or resources. Dishonesty can thus be expected to have damaging effects on social coherence if agents cannot trust the information or goods they receive. However, a distinction is often drawn between prosocial lies (‘white’ lies) and antisocial lying (i.e. deception for personal gain), with the former being considered much less destructive than the latter. We use an agent-based model to show that antisocial lying causes social networks to become increasingly fragmented. Antisocial dishonesty thus places strong constraints on the size and cohesion of social communities, providing a major hurdle that organisms have to overcome (e.g. by evolving counter-deception strategies) in order to evolve large, socially cohesive communities. In contrast, white lies can prove to be beneficial in smoothing the flow of interactions and facilitating a larger, more integrated network. Our results demonstrate that these group-level effects can arise as emergent properties of interactions at the dyadic level. The balance between prosocial and antisocial lies may set constraints on the structure of social networks, and hence the shape of society as a whole.
    • 6.5 Is it really a small world? Searching post Milgram
      • In the introduction to this section [page 168], the authors say a very interesting thing: “Although the network may have the small world property, searches are usually done locally: the individual may not know the global structure of the network that would help them find the shortest path to the target node”. I think that they are talking about social networks explicitly here, but the same concept applies to an information network. This is a network description of the information horizon problem. You can’t find what you can’t see, at least in a broad outline.
      • Also this: “Searching can be regarded as a learning process; repeating the search several times can avoid infinite loops and lead to better solutions”
    • Sprint Planning Meeting
      • Clustering is my only task for the sprint
    • Writing out factored matrix
    • Put together test case term/doc spreadsheet. There are several test cases with and without randomly generated zeros. The goal is to determine the best way to cluster docs. Do we use the product mat? The factor mats? Only one way to find out.
    • Working on sorting the L2mat by row sum or column sum

Phil 12.15.16

7:00 – 5:30 ASRC

  • This came across my Twitter feed this morning: Indivisible: A Practical Guide for Resisting the Trump Agenda. It’s written by Hill staffers (apparently), and says something interesting about the Tea Party:
    • They were locally focused. The Tea Party started as an organic movement built on small local groups of dedicated conservatives. Yes, they received some support/coordination from above, but fundamentally all the hubbub was caused by a relatively small number of conservatives working together. To summarize:
      • Groups started as disaffected conservatives talking to each other online. In response to the 2008 bank bailouts and Obama’s election, groups began forming to discuss their anger and what could be done. They eventually realized that the locally-based discussion groups themselves could be a powerful tool.
      • Groups were small, local, and dedicated. Local Tea Party groups could be fewer than 10 people, but they were highly localized and dedicated significant personal time and resources. Members communicated with each other regularly, tracked developments in Washington, and coordinated advocacy efforts together.
      • Groups were relatively small in number. The Tea Party was not hundreds of thousands of people spending every waking hour focused on advocacy. Rather, the efforts were somewhat modest. Only 1 in 5 self-identified Tea Partiers contributed money or attended events. On any given day in 2009 or 2010, only twenty local events–meetings, trainings, townhalls, etc–were scheduled nationwide. In short, a relatively small number of groups were having a big impact on the national debate.
    • In reading this, I hear several things
      • Group polarization can start or be furthered by small groups. Indeed, it’s the embodiment of the Margaret Mead quote: Never doubt that a small group of thoughtful, committed citizens can change the world; indeed, it’s the only thing that ever has. This also jibes with Sunstein’s statement that small, polarized groups can act like incubators.
      • Though polarized, the members “tracked developments in Washington”, so they were informed. I’d really like to know their sources of information.
      • Scattered groups that are loosely coupled may have a bigger impact than a large single group.
  • Continuing with Sociophysics
    • Social Phenomena on complex networks
    • Dynamical Processes on Complex Networks. Got the Kindle edition so now I can search! Interesting section: 10.6 Coevolution of opinions and network
    • Similar chapter in this book – Social Phenomena on coevolutionary networks [pg 166]. One of the interesting things here is the use of the iterated prisoner’s dilemma. On a network, the agents typically calculate and aggregate payoff and imitate the strategy of the neighbor with the best payoff. In the coevolutionary model, an agent can cut off the link to a defector with a probability. This seems a bit like polarization, where the group severs ties with entities with sufficiently divergent views (and individuals leave when the group becomes too extreme)
    • Coevolution of agents and networks: Opinion spreading and community disconnection Abstract: We study a stochastic model for the coevolution of a process of opinion formation in a population of agents and the network which underlies their interaction. Interaction links can break when agents fail to reach an opinion agreement. The structure of the network and the distribution of opinions over the population evolve towards a state where the population is divided into disconnected communities whose agents share the same opinion. The statistical properties of this final state vary considerably as the model parameters are changed. Community sizes and their internal connectivity are the quantities used to characterize such variations.
  • Adam G. Dunn: clinical epidemiology and medical informatics. Looking for PhDs working on misinformation
  • Sprint grooming
  • Got the data for the npi_raw_integrity_ci_tbl_eligibility_chiroacup_polyrx table. Turning into pivot tables.
  • Can now read a single matrix into NmFModelGui, factor, and build the product matrix.
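For reference, the factor-then-multiply step can be sketched without any GUI. This is not the NmFModelGui code. It swaps in the standard Lee and Seung multiplicative update rules instead of a gradient descent, and the class name NmfSketch, the random initialization and the small constant added to the denominators are assumptions.

```java
import java.util.Random;

final class NmfSketch {
    static double[][] multiply(double[][] a, double[][] b) {
        double[][] out = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < b[0].length; j++)
                for (int x = 0; x < b.length; x++)
                    out[i][j] += a[i][x] * b[x][j];
        return out;
    }

    static double[][] transpose(double[][] a) {
        double[][] out = new double[a[0].length][a.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < a[0].length; j++)
                out[j][i] = a[i][j];
        return out;
    }

    /** Strictly positive random start values in [0.1, 1.1). */
    static double[][] randomMatrix(int rows, int cols, Random rnd) {
        double[][] out = new double[rows][cols];
        for (double[] row : out)
            for (int j = 0; j < cols; j++)
                row[j] = 0.1 + rnd.nextDouble();
        return out;
    }

    /** target <- target .* num ./ den, with a tiny constant to avoid division by zero. */
    static void multiplicativeUpdate(double[][] target, double[][] num, double[][] den) {
        for (int i = 0; i < target.length; i++)
            for (int j = 0; j < target[0].length; j++)
                target[i][j] *= num[i][j] / (den[i][j] + 1e-9);
    }

    /** Factors a non-negative matrix v into {w, h} with inner dimension k. */
    static double[][][] factor(double[][] v, int k, int iterations, long seed) {
        Random rnd = new Random(seed);
        double[][] w = randomMatrix(v.length, k, rnd);
        double[][] h = randomMatrix(k, v[0].length, rnd);
        for (int it = 0; it < iterations; it++) {
            double[][] wt = transpose(w);
            multiplicativeUpdate(h, multiply(wt, v), multiply(multiply(wt, w), h));
            double[][] ht = transpose(h);
            multiplicativeUpdate(w, multiply(v, ht), multiply(w, multiply(h, ht)));
        }
        return new double[][][] {w, h};
    }

    /** Squared Frobenius distance between v and the product matrix w * h. */
    static double squaredError(double[][] v, double[][] w, double[][] h) {
        double[][] p = multiply(w, h);
        double sum = 0;
        for (int i = 0; i < v.length; i++)
            for (int j = 0; j < v[0].length; j++)
                sum += (v[i][j] - p[i][j]) * (v[i][j] - p[i][j]);
        return sum;
    }

    public static void main(String[] args) {
        double[][] v = {{5, 3, 0}, {4, 0, 1}, {1, 1, 5}};
        double[][][] wh = factor(v, 2, 200, 42L);
        System.out.println("squared error: " + squaredError(v, wh[0], wh[1]));
    }
}
```

The product matrix is just multiply(w, h). Comparing it with the original, as squaredError does, is one way to judge a choice of k.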

Phil 12.14.16

7:00 – 6:00 ASRC

  • Continuing with Sociophysics
    • Social Phenomena on complex networks
    • Loops of nodes behave differently from trees. What to do about that? I think loops drive the echo chamber process? It is, after all, feedback.
    • There is also a ‘freezing’ issue, where a stable state is reached where two cliques containing different states are lightly connected, but not enough that the neighbors in one clique can be convinced to change their opinion [Fig. 6.2, pg 135]
    • Residual Energy: The difference between the actual energy and the known energy of the perfectly-ordered ground state (full consensus).
  • BRC
    • Need to not split quoted columns
    • Generate a matrix where flags that are Yes -> 1 and empty/null -> 0
    • Retrospective
    • Had a thought that NMF might work in tensors as well. I need to rewrite the gradient descent so that it takes an arbitrary number of dimensions.
    • Meeting with Nir. Sold him on clustering.
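The two BRC parsing items above (don't split quoted columns, and map Yes to 1 and empty/null to 0) can be sketched as below. The comma delimiter, the doubled-quote escape and the literal flag text "Yes" are assumptions about the export format, and FlagRows is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

final class FlagRows {
    /** Splits one line on commas, keeping commas that sit inside double quotes. */
    static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                if (inQuotes && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    cur.append('"'); // a doubled quote inside a quoted field is a literal quote
                    i++;
                } else {
                    inQuotes = !inQuotes;
                }
            } else if (c == ',' && !inQuotes) {
                fields.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString());
        return fields;
    }

    /** Yes (any case) maps to 1; empty, null or anything else maps to 0. */
    static int flagToInt(String cell) {
        return cell != null && cell.trim().equalsIgnoreCase("yes") ? 1 : 0;
    }

    public static void main(String[] args) {
        List<String> fields = splitCsvLine("1234,\"Smith, John\",Yes,");
        System.out.println(fields);
        System.out.println(flagToInt(fields.get(2)) + " " + flagToInt(fields.get(3)));
    }
}
```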

Phil 12.13.16

7:00 – 5:30 ASRC

  • Added a page for my model notes
  • Continuing with Sociophysics
  • Integrity meeting
    • Bellrock needs a summary of why no HCAHPS display. We could put something really simple that does not get rolled into the scoring of the impactors. Like claim numbers. Katy suggests a ‘ticker’ (Sparkline?)  of claim volume/amounts
    • Need Gregg’s suggestion of what the ‘hot button’ indicators should be (action item), based on the claims data.
    • Small number of items that we can be tracking the actual values of (no calculation) that we can roll up and display.
    • NPPS as an input that gets added to stand in for self-reported flags??
    • Goal of the system is to decide whether the users should spend coordination time on expensive patients.
    • NDC – National Drug Code. Counts of drugs by claim period. Where does this come from? Counts of denied?
    • Claim period is monthly?
  • Having issues with getting lines read cleanly. For the time being, I’m going to throw away the bad lines, but later, I want to make persistent objects and get the data from postgres directly.
  • Tensor spectral clustering for partitioning higher-order network structures
  • Multilinear PageRank Abstract: In this paper, we first extend the celebrated PageRank modification to a higher-order Markov chain. Although this system has attractive theoretical properties, it is computationally intractable for many interesting problems. We next study a computationally tractable approximation to the higher-order PageRank vector that involves a system of polynomial equations called multilinear PageRank. This is motivated by a novel “spacey random surfer” model, where the surfer remembers bits and pieces of history and is influenced by this information. The underlying stochastic process is an instance of a vertex-reinforced random walk. We develop convergence theory for a simple fixed-point method, a shifted fixed-point method, and a Newton iteration in a particular parameter regime. In marked contrast to the case of the PageRank vector of a Markov chain where the solution is always unique and easy to compute, there are parameter regimes of multilinear PageRank where solutions are not unique and simple algorithms do not converge. We provide a repository of these non-convergent cases that we encountered through exhaustive enumeration and randomly sampling that we believe is useful for future study of the problem.

Phil 12.12.16

7:00 – 3:00 ASRC

  • Register for Spring 2017
  • Continuing with Sociophysics
  • Integrity data
    • Pulling all empty columns, and columns that contain encrypted data. Stupid slow and error prone. Going to write a quick app. It should be able to store and output data about all the useful columns in the tables. I should also be able to incorporate Gregg’s data dictionary (And look for toLower(“need definition”))
    • Possibly output persistent queries for Java?
    • Sparsity score?
    • Column matches across tables?
    • Once master matrix is built, then use NMF to cluster columns for predictive capability
  • Sprint review
    • Ticket to fix my vpn access?

Phil 12.9.16

7:00 – 5:00 ASRC

  • Clickbait? How to Make an Amazing Tensorflow Chatbot Easily
  • Good article on information bubbles (Clinton and Trump). Visualizations clearly show bubble and star. Parallel Narratives
  • Continuing with Sociophysics
    • Opinion Formation
    • On [page 62], the authors discuss Phase transitions in a two-parameter model of opinion dynamics with random kinetic exchanges, which they say shows that agent behavior is unrealistic when there is only positive influence. This could support the anti-belief element in my model.
    • On [page 66] the authors briefly discuss Opinion dynamics with confidence threshold: an alternative to the Axelrod model, where voters have continuous opinions. Clusters happen when the confidence c satisfies 0 < c < 1, i.e. the interval (0, 1).
    • An important point seems to be the number of opinions. For low numbers of discretized opinions and many agents, clustering happens. For the reverse, pretty much every agent has their own opinion. Runs at different numbers of opinions can show the thresholds at which these transitions happen (called precipitation?).
    • On [page 69] there is a brief mention of a model with a vector of opinions, which sounds a lot like my ‘belief’ being a set of statements. The title looks good too: Different topologies for a herding model of opinion (abstract below)
    • Back to NMF
      • Have sliders adjust scalar matrices
      • Chain the raw, scalar and scaled matrices together.
      • Make sure that the changed matrices are visible.
      • Add a load option that if there is only one matrix, that factorization can be run on that matrix, rather than having to read in the factored spreadsheets. Will need a (k) size select and a ‘calculate factors’ button
    • BRC
      • It’s impossible to get the VPN to work. Asked Stan for a better machine.
      • Matt is getting the full DB downloaded so I can work on it locally. Done
      • Can’t create the DB. Messages into Matt and Gregg.
        • Wound up doing the following to create the postgres db in pgAdminIII
          1. Created user postgres
          2. Created new db npi_raw as a UTF8 db, with postgres as owner.
          3. Created dev_ci and integrity_ci schemas with postgres as owner
          4. Right-click on the dev_ci schema and select restore. Navigate to the npi_raw.backup file and click ‘Restore’
          5. Right-click on the integrity_ci schema and select restore. Navigate to the integrity_ci folder and then restore each of the backup files.

Phil 12.8.16

7:00 – 4:00 ASRC

  • Continuing with Sociophysics
    • Found while reading about opinion dynamics modelling [page 56]: Heterogeneous bounds of confidence: meet, discuss and find consensus! – In this paper, heterogeneous bounds of confidence are studied. The surprising result is that a society of agents with two different bounds of confidence (open-minded and closed minded agents) can find consensus even when both bounds of confidence are significantly below the critical bound of confidence of a homogeneous society. I think that this may represent exploiters and explorers. Need to read.
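To get a feel for heterogeneous bounds before reading the paper, here is a minimal sketch of a generic pairwise bounded-confidence update in which each agent has its own bound. It is not necessarily the model the paper uses. The class name, the convergence rate mu and the random pairing are assumptions.

```java
import java.util.Arrays;
import java.util.Random;

final class BoundedConfidence {
    /**
     * One pairwise meeting. An agent moves toward the other only when the opinion gap
     * is inside its own bound, so an open-minded agent can move while a closed-minded
     * one stays put.
     */
    static void meet(double[] opinion, double[] bound, int i, int j, double mu) {
        double oi = opinion[i];
        double oj = opinion[j];
        double gap = Math.abs(oi - oj);
        if (gap < bound[i]) opinion[i] = oi + mu * (oj - oi);
        if (gap < bound[j]) opinion[j] = oj + mu * (oi - oj);
    }

    /** Runs a number of random pairwise meetings with a fixed seed for repeatability. */
    static void run(double[] opinion, double[] bound, double mu, int meetings, long seed) {
        Random rnd = new Random(seed);
        for (int t = 0; t < meetings; t++) {
            int i = rnd.nextInt(opinion.length);
            int j = rnd.nextInt(opinion.length);
            if (i != j) meet(opinion, bound, i, j, mu);
        }
    }

    public static void main(String[] args) {
        double[] opinion = {0.05, 0.2, 0.4, 0.6, 0.8, 0.95};
        double[] bound = {0.3, 0.1, 0.3, 0.1, 0.3, 0.1}; // alternating open and closed minded
        run(opinion, bound, 0.5, 1000, 1L);
        System.out.println(Arrays.toString(opinion));
    }
}
```

Sweeping the two bounds and counting the surviving opinion clusters would be the experiment to compare against the paper's consensus claim.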

  • Back to NMF
    • setting up sliders to manipulate row, column or cell for a matrix.
      • Changes to the row or col mat should result in a recalculation of the product matrix. Product matrix should just recalculate based on scalars
      • Got the event handlers set up, working on putting sliders in place
  • Getting the VPN set up so I can access the DB – no luck. Need to take it up with Heath tomorrow
  • 1:00 – 4:00 meeting with Aaron, Theresa, Katy, Greg & Jeremy
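A sketch of the slider-driven recalculation described in the NMF items above, under one reading of the design: each factor matrix has a same-shaped scalar matrix (the slider values), the scaled matrix is their element-wise product, and the product matrix is rebuilt from the two scaled factors. The names and this chaining are assumptions, not the actual GUI code.

```java
import java.util.Arrays;

final class ScaledProduct {
    /** Element-wise raw[i][j] * scalar[i][j]. */
    static double[][] scale(double[][] raw, double[][] scalar) {
        double[][] out = new double[raw.length][raw[0].length];
        for (int i = 0; i < raw.length; i++)
            for (int j = 0; j < raw[0].length; j++)
                out[i][j] = raw[i][j] * scalar[i][j];
        return out;
    }

    static double[][] multiply(double[][] a, double[][] b) {
        double[][] out = new double[a.length][b[0].length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < b[0].length; j++)
                for (int x = 0; x < b.length; x++)
                    out[i][j] += a[i][x] * b[x][j];
        return out;
    }

    /** Recomputes the product matrix from the raw factors and their current scalars. */
    static double[][] product(double[][] rawW, double[][] scalarW,
                              double[][] rawH, double[][] scalarH) {
        return multiply(scale(rawW, scalarW), scale(rawH, scalarH));
    }

    public static void main(String[] args) {
        double[][] w = {{1, 2}, {3, 4}};
        double[][] h = {{1, 0}, {0, 1}};
        double[][] ones = {{1, 1}, {1, 1}};
        double[][] boostFirstRow = {{2, 2}, {1, 1}}; // slider doubles row 0 of w
        System.out.println(Arrays.deepToString(product(w, boostFirstRow, h, ones)));
    }
}
```

Scaling a whole row, a whole column or a single cell then only differs in which entries of the scalar matrix the slider writes to.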

Phil 12.7.16

7:00 – 5:30 ASRC

  • Continuing with Sociophysics
  • Opinion Formation
  • NMF WMN
    • Got the product matrix calculated and showing
    • Added RowSum
    • Adding ColumnSum – done
    • Thinking about how to manipulate all this.
      • Adding scalar and scaled matrices. I still need to wire them up, but I’m going to work on modifying the weight matrix first.
      • Got the row, column and value of the selected cell. Next is to set up sliders to scale same on the selected matrix/cell.
  • Progress for today nmftestbed_12_7_16
  • More discussions on data modeling.
  • Meeting with Aaron and Katy.

Phil 12.6.16

7:00 – 4:00 ASRC

  • Getting a server
    • Campus cluster (underutilized) Free Student access (Damian Doyle) HPCF.umbc.edu. Write a note for Wayne to send. Done
  • Note to Wayne on getting a Google grant. Done.
  • Using Java Persistence API (JPA) with Cloud SQL
  • Google datasets
  • Continuing with Sociophysics – nope, maybe this afternoon.
  • Working on incorporating the factor matrices into the NmfModelGui. Looks like I use Maps as discussed at the bottom of this page. Building a L2DMat2Table class that should take care of doing this for each matrix.
  • Need to be able to output a Labled2DMatrix as a map. Working on that. Done. Oh, I need column maps. No, not really, just a complete header list.
  • And it’s working! Here’s the two factored matrices: factoredtables
  • Worked with Aaron on the data document. The tables are poorly put together and let’s just say, not self-documenting.

Phil 12.5.16

7:00 – 5:00 ASRC

  • Here’s Where Donald Trump Gets His News

    • It would be interesting to do a crawl on those sources (weighted by the amount visited?) and do a word model analysis.
  • Continuing with Sociophysics
  • Fixed the header problem in Labled2dMatrix.fromExcelSheet()
  • Here’s the term extraction using NMF with a k = 2:
    chapter-3-the-spouter-inn, chapter-1-loomings, chapter-2-the-carpet-bag
    harpooneer
    water
    landlord
    about
    light
    stand
    thousand
    other
    night
    passenger
    
    chapter-2-the-carpet-bag, chapter-4-the-counterpane, chapter-1-loomings
    queequeg
    landlord
    harpooneer
    sailor
    money
    nantucket
    passenger
    sight
    where
    dream
  • And here are the document clusters. Just looking at this, I see that each term-triple (the top three from the above list) could easily cluster into three groups each:
    queequeg-landlord-harpooneer
    chapter-2-the-carpet-bag 4.158255493
    chapter-4-the-counterpane 2.889135473
    chapter-1-loomings 2.370002854
    chapter-3-the-spouter-inn 0.651989511
    
    harpooneer-water-landlord
    chapter-3-the-spouter-inn 7.111998945
    chapter-1-loomings 1.72918472
    chapter-2-the-carpet-bag 1.157125575
    chapter-4-the-counterpane 0.000284084
  • Working on incorporating the factor matrices into the NmfModelGui. Looks like I use Maps as discussed at the bottom of this page. Building a L2DMat2Table class that should take care of doing this for each matrix.
  • Some thoughts about jobs, work and the Amish. Are they economically competitive? Value-add? Why does this work?
  • Fika talk Dr. Quincey Brown
    • Educational Games – Microsoft Imagine Cup
    • Mobile intelligent tutoring systems. Math on iPads
    • Children using touch and gesture. Different than adult usage?
    • Broadening participation in computing, Grace Hopper, etc.
    • Science and Technology fellowships at the AAAS (during healthcare.gov launch)
    • White House nation of makers
    • No real effort to cultivate press relationships and new media ways to get the word out. Leverage YouTube?
  • Meeting with Wayne
    • Latent social hacking
  • Getting a server
    • Contact Ron for an abandoned Blade
    • Campus cluster (underutilized) Free Student access (Damian Doyle) HPCF.umbc.edu. Write a note for Wayne to send.
    • UMBC agreement with AWS
  • Note to Wayne on getting a Google grant