Monthly Archives: August 2016

Phil 8.31.16

7:00 – 5:00 ASRC

Put Wayne’s schedule for October and early November
Ping the rest of the Committee
- Aaron – done
- Don – done
- Shimei – done
- Thom – done
Onward with incorporating comments – added ‘fourth estate’ paragraph.
- I trust my favorite knife because I’ve used it before and I can feel it’s sharpness.
Working on building a corpus config file from my GoogleCSE results.
Need to add a To Arff menu selection and query.
- Query is running.
- Need a binary variable as to whether this is something we want to train on. Probably match plus high quality.

Phil 8.30.16

7:00 – 3:30 ASRC

Adding in Wayne’s comments.
Got the Corpus generating arff files for BagOfWords and TF-IDF.
Here’s the result for NaiveBayes on the first four chapters of Mobey Dick

Correctly Classified Instances 3 75 %
Incorrectly Classified Instances 1 25 %
Kappa statistic 0.6667
Mean absolute error 0.125 
Root mean squared error 0.3536
Relative absolute error 29.1667 %
Root relative squared error 71.4435 %
Total Number of Instances 4 

=== Detailed Accuracy By Class ===

 TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 1_-_Loomings
 0.000 0.000 0.000 0.000 0.000 0.000 0.500 0.250 3_-_The_Spouter_Inn
 1.000 0.333 0.500 1.000 0.667 0.577 0.833 0.500 2_-_The_Carpet_Bag
 1.000 0.000 1.000 1.000 1.000 1.000 1.000 1.000 4_-_The_Counterpane
Weighted Avg. 0.750 0.083 0.625 0.750 0.667 0.644 0.833 0.688 

=== Confusion Matrix ===

 a b c d <-- classified as
 1 0 0 0 | a = 1_-_Loomings
 0 0 1 0 | b = 3_-_The_Spouter_Inn
 0 0 1 0 | c = 2_-_The_Carpet_Bag
 0 0 0 1 | d = 4_-_The_Counterpane

This worked really well: weka.classifiers.functions.Logistic -R 1.0E-8 -M -1 -num-decimal-places 4
And comparing Jack London stories to Edgar Allen Poe stories works with a corpus of six stories each and not so much with 3 stories each.

Phil 8.29.16

7:00 – 6:00 ASRC

Selective Use of News Cues: A Multiple-Motive Perspective on Information Selection in Social Media Environments – Quite close to the Explorer/Confirmer/Avoider study but using a custom(?) browsing interface that tracked the marking of news stories to read later. Subjects were primed for a task with motivations – accuracy, defense and impression. Added this to paragraph 2.9, where explorers are introduced.
Looked through Visual Complexity – Mapping Patterns of Information, and it doesn’t even mention navigation. Most information mapping efforts are actually graphing efforts. Added a paragraph in section 2.7
Added a TODO for groupthink/confirmation bias, etc.
Chat with Heath about AI.He’s looking to build a MUD agent and will probably wind up learning WEKA, etc. so a win, I think.
Working on getting the configurator to add string values.
Added to DocumentStatistics. Need to switch over to getSourceInfo() from getAddressStrings in the Configurator.
Meeting with Wayne about the proposal. One of the branches of conversation went into some research he did on library architecture. That’s been rattling around in my head.

We tend to talk about interface design where the scale is implicitly for the individual. The environment where these systems function is often thought of as an ecosystem, with the Darwinian perspective that goes along with that. But I think that such a perspective leads to ‘Survival of the Frictionlesss’, where the easiest thing to use wins and damn the larger consequences.

Reflecting on how the architecture and layout of libraries affected the information interactions of the patrons, I wonder whether we should be thinking about Information Space Architecture. Such a perspective means that the relationships between design at differing scales needs to be considered. In the real world, architecture can encompass everything from the chairs in a room to the landscaping around the building and how that building fits into the skyline.

I think that regarding information spaces as a designed continuum from the very small to very large is what my dissertation is about at its core. I want a park designed for people, not a wilderness, red in tooth and claw.

Phil 8.26.16

7:00 – 4:00 ASRC

Adding more model feedback
Something more to think about WRT Group Polarization models? Collective Memory and Spatial Sorting in Animal Groups
Need to be able to associate an @attribute key/value map with Labeled2Dmatrix rows so that we can compare different nominal values across a shared set of numeric columns. This may wind up being a derived class?
- Working on adding an array of key/value maps;
- Forgot to add the name to the @data section – oops!
- text is added to ARFF out. Should I add it to the xlsx outputs as well?
Here’s the initial run against the random test data within the class (L2D.arff).

=== Run information ===

Scheme: weka.classifiers.bayes.NaiveBayes
Relation: testdata
Instances: 8
Attributes: 12
name
sv1
sv2
sv3
p1
p2
p3
p4
s1
s2
s3
s4
Test mode: split 66.0% train, remainder test

=== Classifier model (full training set) ===

Naive Bayes Classifier

Class
Attribute p1 p2 p3 p4 s1 s2 s3 s4
(0.13) (0.13) (0.13) (0.13) (0.13) (0.13) (0.13) (0.13)
=======================================================================
sv1
p4-sv1 1.0 1.0 1.0 2.0 1.0 1.0 1.0 1.0
s2-sv1 1.0 1.0 1.0 1.0 1.0 2.0 1.0 1.0
p2-sv1 1.0 2.0 1.0 1.0 1.0 1.0 1.0 1.0
s1-sv1 1.0 1.0 1.0 1.0 2.0 1.0 1.0 1.0
[total] 4.0 5.0 4.0 5.0 5.0 5.0 4.0 4.0

sv2
p2-sv2 1.0 2.0 1.0 1.0 1.0 1.0 1.0 1.0
s4-sv2 1.0 1.0 1.0 1.0 1.0 1.0 1.0 2.0
p1-sv2 2.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
s1-sv2 1.0 1.0 1.0 1.0 2.0 1.0 1.0 1.0
[total] 5.0 5.0 4.0 4.0 5.0 4.0 4.0 5.0

sv3
p2-sv3 1.0 2.0 1.0 1.0 1.0 1.0 1.0 1.0
p1-sv3 2.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
s4-sv3 1.0 1.0 1.0 1.0 1.0 1.0 1.0 2.0
p3-sv3 1.0 1.0 2.0 1.0 1.0 1.0 1.0 1.0
p4-sv3 1.0 1.0 1.0 2.0 1.0 1.0 1.0 1.0
s2-sv3 1.0 1.0 1.0 1.0 1.0 2.0 1.0 1.0
s1-sv3 1.0 1.0 1.0 1.0 2.0 1.0 1.0 1.0
[total] 8.0 8.0 8.0 8.0 8.0 8.0 7.0 8.0

p1
mean 1 0 0 0 1 1 0 0
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1

p2
mean 0 1 0 0 1 0 1 0
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1

p3
mean 0 0 1 0 1 0 0 1
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1

p4
mean 0 0 0 1 1 0 0 1
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1

s1
mean 1 1 1 1 1 0 0 0
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1

s2
mean 1 0 0 0 0 1 0 0
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1

s3
mean 0 1 0 0 0 0 1 0
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1

s4
mean 0 0 1 1 0 0 0 1
std. dev. 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667 0.1667
weight sum 1 1 1 1 1 1 1 1
precision 1 1 1 1 1 1 1 1



Time taken to build model: 0 seconds

=== Evaluation on test split ===

Time taken to test model on training split: 0 seconds

=== Summary ===

Correctly Classified Instances 0 0 %
Incorrectly Classified Instances 3 100 %
Kappa statistic 0
Mean absolute error 0.2499
Root mean squared error 0.4675
Relative absolute error 108.2972 %
Root relative squared error 133.419 %
Total Number of Instances 3

=== Detailed Accuracy By Class ===

TP Rate FP Rate Precision Recall F-Measure MCC ROC Area PRC Area Class
0.000 0.333 0.000 0.000 0.000 0.000 ? ? p1
0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.333 p2
0.000 0.333 0.000 0.000 0.000 0.000 ? ? p3
0.000 0.000 0.000 0.000 0.000 0.000 ? ? p4
0.000 0.000 0.000 0.000 0.000 0.000 0.500 0.500 s1
0.000 0.000 0.000 0.000 0.000 0.000 1.000 1.000 s2
0.000 0.333 0.000 0.000 0.000 0.000 ? ? s3
0.000 0.000 0.000 0.000 0.000 0.000 ? ? s4
Weighted Avg. 0.000 0.000 0.000 0.000 0.000 0.000 0.500 0.611

=== Confusion Matrix ===

a b c d e f g h <-- classified as
0 0 0 0 0 0 0 0 | a = p1
0 0 0 0 0 0 1 0 | b = p2
0 0 0 0 0 0 0 0 | c = p3
0 0 0 0 0 0 0 0 | d = p4
0 0 1 0 0 0 0 0 | e = s1
1 0 0 0 0 0 0 0 | f = s2
0 0 0 0 0 0 0 0 | g = s3
0 0 0 0 0 0 0 0 | h = s4

Need to add text data from xml or from other(wrapper info? structured data? UI selections?) sources

Phil 8.25.16

7:00 – 3:30 ASRC

Paper
- More on contributions. Realized that I need a figure showing the relationships between individual and group behaviors.
- Found Driving a Wedge Between Evidence and Beliefs – How Online Ideological News Exposure Promotes Political Misperceptions. Nice look about awareness vs belief. The question is how this manifests. It could be that a person who believes false information may spend a lot of time looking at opposition information and as such creates an explorer pattern? Added to the paper archive.
- Added model feedback to section 5.4.6.2
Code
- Build class(s) that uses some of the CorpusBuilder (or just add to output?) codebase to
- Access webpages based on xml config file
- Read in, lemmatize , and build bag-of-words per page (configurable max). Done. Took out DF-ITF code and replaced it with BagOfWords in DocumentStatistics.
- Write out .arff file that includes the following elements
  - @method (TF-IDF, LSI, BOW)
  - @source (loomings, the carpet bag, the spouter inn, the counterpane)
  - @title (Moby-dick, Tarzan)
  - @author (Herman Melville, Edgar Rice Burroughs)
  - @words (nantucket,harpooneer,queequeg,landlord,euroclydon,bedford,lazarus,passenger,circumstance,civilized,water,thousand,about,awful,slowly,supernatural,reality,sensation,sixteen,awake,explain,savage,strand,curbstone,spouter,summer,northern,blackness,embark,tempestuous,expensive,sailor,purse,ocean,tomahawk,black,night,dream,order,follow,education,broad,stand,after,finish,world,money,where,possible,morning,light)
- So a line should look something like
  - LSI, chapter-1-loomings, Moby-dick, Herman Melville, 0,0,0,0,0,0,0,5,0,0,7,4,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,3,3,0,0,0,0,2,0,0,0,5,0,0,3,4,2,0,0,0
  - Updated LabledMatrix2D to generate arff files.

Phil 8.24.16

7:00 – 3:00 ASRC

Finished fact checking session
Starting contributions. Finished the first pass. That was quick!
Ordered my map of the internet

Weka Tutorials
- 1 – ARFF 101
- 2 – Data PreProcessing 101
- 3 – Classification 101
- 4 – Systematic Oversampling
Just stumbled across Deeplearning4j. Looks very nice.
- Attempting to set up

Phil 8.23.16

7:00 – 4:00 ASRC

Continuing to read The Sovereign Map. While thinking about the Twitter expert paper, I thought that maybe there were mapping projects of the Wikipedia, Schema.org or dmoz.org. I found this for Wikipedia.
xkcd maps
- Online Communities
- Online Communities 2 – and a really nice version
- IPV4 2006
Paper – continued work on fact-checking/crowdsourced data
Code
- Enable slider in fitnessTest – done
- Enable reading xml config files – done. Also setting the sliders from load
- Added Dom4j utils to JavaUtils2
- get started on WEKA – Starting with Emily’s intro. So far so good! Also ran a Naive Bayes classifier on the weather data set for Aaron to compare.

Phil 8.22.16

7:00 – 2:30 ASRC

Get the Porsche towed! Scheduled for 3:00
For tracking bike cases: http://www.phreakmonkey.com/2016/08/towl-telemetry-over-opportunistic-wifi.html
Paper
- Finishing up credibility. Skimming n the Wisdom of Experts vs. Crowds – Discovering Trustworthy Topical News in Microblogs. It’s really good. Need to drill down into this one. Page rank of user lists is used to determine credible experts.
- Starting on Crowdsourcing.
Code
- Changed fitnessTest() to work with adjusting statements based on similarity/dissimilarity between two agents.
- Added slider for similarity.
- Fixed some reloading bugs
- Found Emily’s WEKA blog

Phil 8.19.16

7:00 – 3:30 ASRC

Wrote up the action items from the discussion with Thom last night. Now that I have the committee’s initial input, I need to write up an email and re-distribute. Done.
Had a thought about the initial GP model. In the fitness test, look for beliefs that are more than ATTRACTION_THRESHOLD similar and be more like them. Possibly look for beliefs that are less than REPULSION_THRESHOLD similar and make the anti-belief more like them. If a statement exists in both belief and antibelief, delete the lowest ranked item, or choose randomly.
- Working through the logic in beliefMain. I’m just *slow* today.
- Think I got it. Had to write a method ‘rectifyBeliefs’ that goes in BaseBeliefCA and makes sure that beliefs and antibeliefs don’t overlap. And it’s late enough in the day that I don’t want to try it in the full sim.
Working through the fact-checking section
Submitted ACM, ICA and UMBC reimbursement requests.

Phil 8.18.16

7:00 – 4:30 ASRC

Wrote up my recollection of the meetings with Don, Aaron and Shimei
Realized that there should be explicit affordances for confirming and avoiding as well as exploring. Added to the strawman mockup.
Adding a ‘model feedback’ section after the evaluation criteria for RQs 2, 3 and 4
Back to a simple model of GP
Starting on livePrint(String) needs a store and clear method. Done. Also added ctrl-click.
Need to make a special test case where one belief chases another. Reworking BeliefMain to support this. Done. The algorithm is good (hacked right now to transform quickly)
Meeting with Thom – he wants to see the contributions fleshed out. We discussed the larger frame of the contribution of a self-organizing trustworthy journalism, based on the idea that many people doing small implicit fact checking as a function of their information browsing patterns can be (as?) effective as a few people doing deliberate in-depth fact checking, and with the ability to scale.

Phil 8.17.16

7:00 – 7:00 ASRC

Working on the fact checking section
Trying out different approaches to the polarization algorithm – still not quite right. I think I need onscreen printing to debug
Meeting with Don and Aaron. They’d like a more sophisticated belief calculation that ages off old statements as a way of avoiding regression to the mean. Also, triangulation on group polarization in different contexts. I’m suggesting sexual selection, which I’m using already, and stock bubbles. It does look like these systems have been explored extensively. The following cover a pretty good range of years (1989 – 2016)
Meeting with Shimei. She’d like a better description of how the results of experiments are used to adjust the model.
- I think this will mean that there may be several models in the end.
  - Models of information browsing based on term collections generated in/between search sessions
  - Models of GUI use using the tool. Since the idea behind the GUI is to provide affordances that support explorers, confirmers and avoiders, using those sections that support those behaviors should provide good insight and an addition to the model that is language independent. I think this may also address Aaron’s IDE question. Though, to have the a group polarization effect, there needs to be a way of supporting a non-verbal conversation. Maybe the way the page gets set up is dynamically configured by the group use?
    - Which makes me think of MMOGs. Not much published, but there is this: Virtual Warlords An Ethnomethodological View of Group Identity and Leadership in EVE Online

Phil 8.16.16

7:00 – 5:00 VTX

Added Models of speciation by sexual selection on polygenic traits to the corpus. Really good stuff. I think I’m going to try to work out the math today.
Folded an explicit mention of the paper in to the proposal (the end of section 5.4.1)
Fixing the TODO on section 3.9 – working
Vistronix got sold to the Arctic Slope Regional Corporation
Meetings on ML and SNA. I was thinking that badges might be needed for in depth sna, but we can just track your phone…

Phil 8.15.16

7:00 – 5:00 VTX

Extending the group polarization section based on the lovely introduction in Modelling Group Opinion Shift to Extreme : the Smooth Bounded Confidence Model
Still thinking about genetic co-evolution, particularly as demonstrated in sexual selection. In S.S., we have a group that can develop extreme, ‘useless’ and even objectively dangerous traits as genetic conversation with in a group. Birds provide several examples, ranging from peacocks to the Resplendent quetzal. And I think I may have found a paper that proposes some useful models: Models of speciation by sexual selection on polygenic traits. So Group Polarization == Speciation???
From the above paper: “The evolution of mating preferences may be self-reinforcing because, once started, females are selecting not only for more extreme males but also indirectly, through the genetic correlation, for a higher intensity of mating preferences. Fisher (3) stated that the result of this positive feedback could be a “runaway process,” in which a male trait and female preferences for it both increase geometrically or exponentially with time until finally checked by severe counterselection. “
And more: “In a finite population, random genetic drift in female mating preferences produces random selective forces on males, which in turn affect mating preferences through the genetic correlation between the traits. When the line of equilibria created by genetic variance in mating preferences is unstable, random genetic drift could trigger a runaway process of sexual selection. Even when the line of equilibria is stable, evolution along it can occur rapidly through the interaction of random genetic drift with natural and sexual selection because populations starting from the same point may drift to different sides of the line of equilibria and be selected in opposing directions“

Phil 8.12.16

7:00 – 3:30 VTX

Working on diagram -done! Incorporated in proposal

Added strawman webpage design
Thinking about using some kind of GA approach to adopting a belief. Every pass you look select a belief from another agent, with probability based on nearness+behavior. You then randomly mutate your belief based on the other belief plus some randomness.
- I’m starting to really like this idea. Evolution, particularly co-evolution is very much about bubbles until the environment constrains. So what’s the breeding algorithm?
Found the jenetics.io GA library. Checking out the programmers guide

Phil 8.11.16

7:00 – 4:30 VTX

Scheduling Meetings with Aaron, Don, and Shimei.
Continuing with diagram.
Wrote up evaluation criteria for RQ3. Probably needs some fleshing out.
Integrating cluster tests! Done! Working!
Added a slider to manipulate where in the sorted list of distances the eps is taken from
In the BeliefAgentShape.getOpinions() set the belief and antibelief mechanisms the same. We get our antibeliefs from our neighbors.
Need to add an adjustable drag. Done
Still thinking about how to calculate opinions. Clearly, closer opinions should be more powerful. What about antiBeliefs? Is that a two-pass process?
4:00 Acceptance meeting.

viztales

Dimension reduction, State, Orientation, and Speed

Monthly Archives: August 2016

Phil 8.31.16

Phil 8.30.16

Phil 8.29.16

Phil 8.26.16

Phil 8.25.16

Phil 8.24.16

Phil 8.23.16

Phil 8.22.16

Phil 8.19.16

Phil 8.18.16

Phil 8.17.16

Phil 8.16.16

Phil 8.15.16

Phil 8.12.16

Phil 8.11.16