Author Archives: pgfeldman

Phil 4.6.16

7:00 – 3:30 VTX

  • Continuing Technology, Humanness, and Trust: Rethinking Trust in Technology.
    • Really nice layout of hypothesis
    • Really nice layout of methods
      • They even have the questionnaire!
    • At section 4.2. Measurement Items, page 893.
  • ———————-
  • Mercer Marketplace wants more documentation….
  • Conference call with Andy and Margarita about flags and rating. Theresa joined in at the end.
  • Rediscovering all my postgres notes
  • added a role for a non-super-user who can create databases
  • created a new googlecse2 database
  • added postgres jdbc driver
  • Aaaaand JPA works! Db created and users added. Password checking behaves!
  • Set up my postgres to accept external access by following these directions
  • Waiting for Gregg on DB access
  • Chatted with Jeremy about a RESTful interface to extract flag data. More tomorrow

Phil 4.5.16

7:00 – 4:30 VTX

  • Had a good discussion with Patrick yesterday. He’s approaching his wheelchair work from a Heideggerian framework, where the controls may be present-at-hand or ready-to-hand. I think those might be frameworks that apply to non-social systems (Hammers, Excel, Search), while social systems more align with being-with. The evaluation of trustworthiness is different. True in a non-social sense is a property of exactness; a straightedge may be true or out-of-true. In a social sense, true is associated with a statement that is in accordance with reality.
  • While reading Search Engine Agendas in Communications of the ACM, I came upon a mention of Frank Pasquale, who wrote an article on the regulation of Search, given its impact (Federal Search Commission? Access, Fairness, and Accountability in the Law of Search). The point of Search Engine Agendas is that the ranking of political candidates affects people’s perception of them (higher is better). This ties into my thoughts from March 29th: there are situations where the idea of ordering among pertinent documents may be problematic, and how users might interact with the ordering process might be instructive.
  • Continuing Technology, Humanness, and Trust: Rethinking Trust in Technology.
  • ————————
  • Added the sites Andy and Margarita found to the blacklist and updated the repo
  • Theresa has some sites too – in process.
  • Finished my refactoring party – more debugging than I was expecting
  • Converted the Excella spreadsheet to JSON and read the whole thing in. Need to do that just for a subsample now.
  • Added a request from Andy about creating a JSON object for the comments in the flag dismissal field.
  • Worked with Gregg about setting up the postgres db.

Phil 4.4.16

7:00 – 2:30 VTX

  • Happy perfect square day.
  • Continuing Technology, Humanness, and Trust: Rethinking Trust in Technology.
    • Page 833: Ability/competence is the belief that a person has the skills, competencies, and characteristics that enable them to have influence in some specific domain. Benevolence is the belief that a person will want to do good to the trustor aside from an egocentric profit motive. Integrity is the belief that a person adheres to an acceptable set of principles.
    • Page 833: It is not as clear, however, whether technologies have volition or can make ethical decisions without being pre-programmed to do so. Because of this issue, some researchers have developed alternative trust belief constructs that do not assume technologies have volition or ethical decision making capability. For example, Lippert and Swiercz (2005) use utility, reliability, and predictiveness, and Söellner, Hoffman, Hoffman, Wacker, and Leimester (2012) use performance, process, and purpose to represent technology-trusting beliefs.
    • Page 833: We adopt McKnight et al.’s (2011) conceptualization of system-like trust in a technology’s reliability, functionality, and helpfulness to measure trust in technology because these three attributes were directly derived from, and are corollaries to, the human-like trust attributes of integrity, competence, and benevolence
  • The discussion on affordances started me thinking about SERPs again. This is kind of related but almost more basic – how users search within documents using find: The Myth of Find: User Behaviour and Attitudes Towards the Basic Search Feature. The documents that cite it (WRT document triage, etc) are also pretty interesting looking.
  • ———————————
  • Starting up the computers after the weekend at work today, and Skype For Business doesn’t let me log in. Says my email address is bad. And it’s not.
  • Got the PoiOptionalStrings object integrated and running.
  • Realized that I need to have a generalized ‘OptionalContent’ class. generalizing from above.
  • Need to see how JPQL works with all this new stuff now.
    • Fancy JPQL query of the day:
      @NamedQuery(name = "PoiObject.getFromOptionalStrings", query = "SELECT p from poi_object p, IN (p.optStringSet) os WHERE os.name = :name AND os.value = :value"),
    • Should I be doing this as a template? If so, what does the table get named?

Phil 4.1.16

7:15 – 4:15 VTX

  • Had a bunch of paperwork to do for my folks. All handled now?
  • Continuing What is Trust? A Conceptual Analysis and An Interdisciplinary Model. Done
    • Disposition to Trust. This construct means the extent to which one displays a consistent tendency to be willing to depend on general others across a broad spectrum of situations and persons
      • a general propensity to be willing to depend on others.
      • does not necessarily imply that one believes others to be trustworthy
      • only has a major effect on one’s trust-related behavior when novel situations arise, in which the person and situation are unfamiliar
      • Disposition to Trust has two subconstructs, Faith in Humanity and Trusting Stance
        • Faith in Humanity means one assumes others are usually upright, well-meaning, and dependable.
        • Trusting Stance means that, regardless of what one assumes about other people generally, one assumes that one will achieve better outcomes by dealing with people as though they are well-meaning and reliable
      • Because Faith in Humanity relates to assumptions about peoples’ attributes, it is more likely to be an antecedent to Trusting Beliefs (in people) than is Trusting Stance. Trusting Stance may relate more to Trusting Intention, which, depending on the situation, is probably not based wholly on beliefs about the other person.
    • Institution-based Trust means one believes the needed conditions are in place to enable one to anticipate a successful outcome in an endeavor or aspect of one’s life
      • This construct comes from the sociology tradition that people can rely on others because of structures, situations, or roles  that provide assurances (Affordances???) that things will go well
      • Institution-based Trust has two subconstructs, Structural Assurance and Situational Normality.
        • Structural Assurance means one believes that success is likely because guarantees, contracts, regulations, promises, legal recourse, processes, or procedures are in place that assure success
        • Situational Normality means one believes that success is likely because the situation is normal or favorable. (I think that this comes from very primitive parts of our brains. It can be observed in many animals and may be one of those things that separates infant and adult behavior. If you trust too much, you are likely to get eaten..?)
          • Situational Normality means that a properly ordered setting is likely to facilitate a successful venture. When one believes one’s role and others’ roles in the situation are appropriate and conducive to success, then one has a basis for trusting the people in the situation.
          • likely related to Trusting Beliefs and Trusting Intention. A system developer who feels good about the roles and setting in which they work is likely to have Trusting Beliefs about the people in that setting.
    • Trusting Beliefs means one believes (and feels confident in believing) that the other person has one or more traits desirable to one in a situation in which negative consequences are possible.
      • We distinguish four main trusting belief subconstructs, while recognizing that others exist.
        • Trusting Belief-Competence means one believes the other person has the ability or power to do for one what one needs done.
        • Trusting Belief-Benevolence means one believes the other person cares about one and is motivated to act in one’s interest.  A benevolent person does not act opportunistically.
        • Trusting Belief-Integrity means one believes the other person makes good faith agreements, tells the truth, and fulfills promises
        • Trusting Belief-Predictability means one believes the other person’s actions (good or bad) are consistent enough that one can forecast them in a given situation
    • Trusting Intention means one is willing to depend on, or intends to depend on, the other person in a given task or situation  with a feeling of relative security, even though negative consequences are possible
      • Trusting intention subconstructs include Willingness to Depend and Subjective Probability of Depending.
        • Willingness to Depend means one is volitionally prepared to make oneself vulnerable to the other person in a situation by relying on them.
        • Subjective Probability of Depending means the extent to which one forecasts or predicts that one will depend on the other person.
      • Trusting Intention definitions embody five elements synthesized from the trust literature.
        1. The possibility of negative consequences or risk is what makes trust important but problematic.
        2. A readiness to depend or rely on another is central to trusting intention.
        3. A feeling of security means one feels safe, assured, and comfortable (not anxious or fearful) about the prospect of depending on another. Feelings of security reflect the affective side of trusting intention.
        4. Trusting intention is situation-specific.(???? why? Examples?)
        5. Trusting intention involves willingness that is not based on having control or power over the other party. Note that Trusting Intention relates well to the system development power literature because we define it in terms of dependence and control.
    • Another limitation relates to Whetten’s (1989) recommendation that Who and Where conditions should be placed around models.  Whereas we have assumed that the model applies to any kind of relationship between two people (Who) in any situation (Where), this may not be the case. Empirical research is needed to better define the boundary conditions of the model.
  • Starting Technology, Humanness, and Trust: Rethinking Trust in Technology, also by D. Harrison McKnight
    • Page 881 (Basic?) Social Trust: human-like trust constructs of integrity, ability/competence, and benevolence that researchers have traditionally used to measure interpersonal trust.
    • Page 881 (Basic?) System Trust: system-like trust constructs such as reliability, functionality, and helpfulness
    • Page 881. First, we hypothesize that technologies can differ in humanness. Second, we predict that users will develop trust in the technology differently depending on whether they perceive it as more or less human-like, which will result in human-like trust having a stronger influence on outcomes for more human-like technologies and system-like trust having a stronger influence on outcomes for more system-like technologies. (Cite Kate Bush Deeper Understanding 1989)
    • Here’s the beginning of a thought: What is self-trust? Just thinking about it, it seems to be a sense of the reliability of my future self to do what my present self desires. That’s different from Social Trust, which in the literature is more about integrity, competence and benevolence. It seems closer to system trust in that reliability and functionality are more significant. There are things that I trust that I will do tomorrow: Get up, go to work, exercise if the weather is good enough. But there are also things that I can’t trust myself to do. My future self will almost certainly eat more calories than my current self desires. My grocery shopping behaviors are based around this lack of trust. There are items that I do not bring into my house because I know that they will get eaten (I was going to write that I know that my will is weak around chocolate, but that’s not really it. Or at least, that’s not all of it, or maybe even most of it..). Because (interactive?) information technology is more like a self-amplifier, I wonder if what we think of as system trust can be thought of as the trust in ourselves, but the part of ourselves that is more reliable and trustworthy. A search tomorrow will work as well as a search today. Maybe better. And the effectiveness of that search somehow reflects my ability to interact effectively with the external world? This is starting to sound a lot like my point of view that living a life in prolonged contact with a compiler changes you in profound ways.
    • So what would that mean? I think it’s a reasonable hypothesis to change search results from focusing on pertinence to revelation. This does not mean that the ‘Ten Blue Links’ need to go away. But it does imply that peripheral information could be just as important, so that a less casually polarized worldview might be developed.
  • Finishing up the CSE version control setup – need to write up the process for confluence – done.
  • Since I now need to be able to read in the Excella data, I was going to look to Gregg’s ontology as a way to determine the table structure. But it’s way too big and nested. A Person’s description includes a reference to a complete organization, activities, charges, arrests, and it doesn’t even have room for nice things yet (will we have co-authors?). Anyway, to avoid this, I’m going to have basic person characteristics with associated StringMaps, NumMaps and DateMaps. Anything that’s not recognized as a column gets added to those. Need to see how persistence will work with that in some testing first.
  • Got the code working. JPA 2 says you should be able to build a map entirely without annotations, but I couldn’t get it to work. Modified JsonLoadable so that it goes through the Json Object and anything that is not a member of the current class is added to HashMaps of PoiOptionalStrings. It should be very straightforward to extend to number and date types. Probably worth doing?
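
A minimal sketch of the "anything that is not a member of the current class goes into a map" idea, in plain Java with no JPA or JSON library. The names PersonSketch, OptionalFieldLoader and optionalStrings are hypothetical stand-ins, not the real PoiObject / JsonLoadable / PoiOptionalStrings code, and the JSON object is assumed to have already been parsed into a Map:

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a POI entity; field names are illustrative only.
class PersonSketch {
    String firstName;
    String lastName;
    // Anything in the input that is not a declared String field lands here.
    final Map<String, String> optionalStrings = new HashMap<>();
}

class OptionalFieldLoader {
    // Walks an already-parsed JSON object. Keys that match a declared String
    // field are set by reflection; every other key is kept as an optional string.
    static void load(PersonSketch target, Map<String, Object> json) {
        for (Map.Entry<String, Object> e : json.entrySet()) {
            String value = String.valueOf(e.getValue());
            try {
                Field f = PersonSketch.class.getDeclaredField(e.getKey());
                if (f.getType() == String.class) {
                    f.setAccessible(true);
                    f.set(target, value);
                    continue; // handled as a regular member
                }
            } catch (ReflectiveOperationException ignored) {
                // Not a usable member of the class: fall through to the map.
            }
            target.optionalStrings.put(e.getKey(), value);
        }
    }
}
```

Extending this to number and date types would probably mean one more map per type and a type check on the incoming value, which is an assumption about the design rather than something tested here.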

Phil 3.31.16

7:00 – 4:00 VTX

  • Starting on What is Trust? A Conceptual Analysis and An Interdisciplinary Model.
  • Starting to set up the key and sitelist repo
  • It turns out that you can export xml configuration of the CSE and the annotations for that CSE. From webapps.stackexchange.com:
  • We can only have a total of 5k annotations. That’s not a problem – yet.
  • All the files are set up and transferred. New search engines are
    ONLY_COM = "cx=006834724223295726872:k0pebqyqa8m"
    ONLY_EDU = "cx=006834724223295726872:gded1dvdt94"
    ONLY_GOV = "cx=006834724223295726872:ydjrxqpedqq"
    ONLY_ORG = "cx=006834724223295726872:lsgxnigrfme"
    ONLY_US = "cx=006834724223295726872:dw0n0_hai6s"
  • Found a more credible source than boardactions.com (possibly just for New York state? But it has VA records..). Anyway, not only does it have a nice listing, it also has a pdf of the relevant board order. Which means we can build a good legal language model. Very nice: http://w3.nyhealth.gov/opmc/factions.nsf/physiciansearch?openform
  • Need to rethink the PoiObject class to be more general.

Phil 3.30.16

7:00 – 3:30 VTX

  • So I was starting The spreading of misinformation online, but it was discussing more of the same. This feels a lot like saturation. My thoughts are coalescing around the idea of the difference between trusted and trustworthy interactions in computer-mediated systems. The anonymous citizen journalism concept becomes a unifying thought experiment that can be used to show the potential strengths and weaknesses of particular concepts.
  • The last piece I think I need is what is trust from a developmental perspective. The initial google scholar search of “trust development” didn’t bring up exactly what I want (object permanence maybe?), but it did provide this: Effects of four computer-mediated communications channels on trust development. The citations provided this: The mechanics of trust: A framework for research and design In International Journal of Human – Computer Studies 2005 62(3):381-422. This one seems different enough to look through carefully.
  • Ok, I think I found what I’m looking for: The ‘like me’ framework for recognizing and becoming an intentional agent. I think I’ll read The Mechanics of Trust first, then ‘like me’ second.
  • Starting The mechanics of trust: A framework for research and design.
    • It does seem to be focused on how effectively a system transmits(?) cues that support well-placed trust. I think that we tend to confuse the trust we place in the channel vs the trust we place in the entity at the other end of the channel. And these lines are not clearly drawn:
      • In IR, we trust that the search engine is providing us with the relevant documents we seek. People trust Google more than Bing because the results are more pertinent. Does this trust carry over into the documents retrieved? Probably, though I can’t find a study that does this. (It would be pretty easy to do with the Google Custom Search Engine API + noise)
      • In GPS the trust in the system is very high, even though it is synthesizing information from retrieved and processed sources (maps, DTED, etc) that could in turn be wrong. Here though, the entity we are interacting with is clearly the GPS, not the mapmakers.
      • Skype, on the other hand, is essentially transparent when it’s working right. And that ‘working right’ is a kind of conditional trust in the system that has no effect on our evaluation of the trustworthiness of the person that we are interacting with at the other end of the channel.
      • So what does that mean in the context of our imaginary citizen journalists?
        • They are anonymized. We have no names. We probably don’t even have the exact words as written. These are the same issues that newspapers face when dealing with anonymous sources. And in this case, it’s reasonable to assume that the newspaper is the entity that is attempting to get us to place our trust in it.
          • Reporters as proxies
          • Additional perspectives – images, videos etc.
          • Stories that match reader’s experiences, so that trust can be evaluated.
          • What else?
    • One of the cited papers is What is Trust? A Conceptual Analysis and An Interdisciplinary Model. Quickly scanning through it, I found this on page 830-831: Garfinkel found in natural experiments that people don’t trust others when things “go weird,” that is, when they face inexplicable, abnormal situations. For example, one subject told the experimenter he had a flat tire on the way to work. The experimenter responded, “What do you mean, you had a flat tire?” The subject replied, in a hostile way, “What do you mean? What do you mean? A flat tire is a flat tire. That is what I meant. Nothing special. What a crazy question!” At this point, trust between them broke down because the illogical question produced an abnormal situation. 
      • I think that this is core. Trust is tied to normalcy, and probably builds out from there.
  • Prepping for the sprint planning session.
  • As far as the OMG work, I think the following
    • Set up version controlled system for Google CSE keys and url exclude lists, including a way to submit an url for inclusion in an exclusion list.
    • Add PDF parsing and storing to Crawl Service
    • Add MSWord parsing and storing to Crawl Service
    • Add MSExcel parsing and storing to CrawlService
    • Add backlink calculation and storing to CrawlService – this is looking like a good way to increase pertinence within a return, particularly with respect to the matched-name wrong-person condition.
    For the machine learning work
    • Get DB up, accessible and on a backup schedule
    • Set up deployment infrastructure for Rating App.
    • Small scale test of Rating App, with refinement and development of manual
    • Accumulate corpus
    • Test corpus in WEKA
      • Translator from DB to WEKA format
      • Construction of training data sets
      • Tests and evaluations
      • Report
    As far as my research, it’s more vague, so I’m just going to free-associate a bit here.
    First, I just need to write up the proposal, and since that’s where my head is at right now, it’s hard to come up with specifics. One of the overall goals is to build a search result interface that ‘nudges’ users from bubble patterns into star patterns.
    Secondly, my current belief is that this interface could be along the lines of the word cloud plus slider display interface I’ve discussed with you before. On the back end, there’s a topic extraction/document classification system that builds a graph database that is used for:
    • In my case, placing the search results in a context of discussion vs information (DvI) along the axes defined by the topics in the search results. The user can select a topic (which then shows the DvI graphs and where the current search falls on those spectrums). Once a topic has been selected, the user can adjust the weights on subsequent topics, causing the result list to reorder and the position on the DvI graph to move.
    • In EIT’s case (1) predictions and alerts and (2) for the user interface [and I think this can be pitched as the gamified display]. For example, I think there are many cases where conditions for making a judgment (medical best practices or behavior related) may be ambiguous. Using such an interface could allow a user to explore and resolve such ambiguity. The nice thing is that in the EIT case, the data is (potentially) more structured and granular, allowing a more fluid analysis (e.g. a bad manager indirectly affecting performance or combined conditions such as opiate addiction + newborn).
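
To make the "adjust the weights, reorder the list" part concrete, here is a rough sketch. It assumes each result already carries a per-topic strength between 0 and 1 from the (not yet built) topic extraction step, and that a result's score is just the weighted sum of those strengths. TopicReorderSketch and both method names are made up for illustration; the real scoring could be very different:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

class TopicReorderSketch {
    // Score = sum over topics of (slider weight * the document's strength on that topic).
    // Topics missing from the document count as zero.
    static double score(Map<String, Double> docTopics, Map<String, Double> weights) {
        double s = 0.0;
        for (Map.Entry<String, Double> w : weights.entrySet()) {
            s += w.getValue() * docTopics.getOrDefault(w.getKey(), 0.0);
        }
        return s;
    }

    // Returns result ids ordered from highest to lowest weighted score.
    static List<String> reorder(Map<String, Map<String, Double>> results,
                                Map<String, Double> weights) {
        List<String> ids = new ArrayList<>(results.keySet());
        ids.sort(Comparator.comparingDouble(
                (String id) -> score(results.get(id), weights)).reversed());
        return ids;
    }
}
```

Moving a slider would then amount to calling reorder again with the new weights map and redrawing the list.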

Phil 3.29.16

7:00 – 4:00 VTX

  • Continuing The Law of Group Polarization – done!
    • Group polarization: A critical review and meta-analysis. Looks like a more rigorous version of TLoGP. It’s available in the library as a PDF if needed.
    • Page 194: In short, the external materials and expert panels shift the argument pool available to the deliberators and are also likely to have effects on social influence. 
      • The way I read this, external trusted sources can shift the poles if they are incorporated into the discussion. Think about how a GPS affects wayfinding arguments. If search interfaces are modified such that they show the range of opinion and the position of the ‘Ten Blue Links’ within that range then, given its high system trust, we might expect individuals to adjust their belief trajectories based on their understanding of the pole’s position given the larger information landscape.
    • Page 195: There are large lessons here about appropriate institutional design for deliberating bodies. Group polarization can be heightened, diminished, and possibly even eliminated with seemingly small alterations in institutional arrangements.

      • Now substitute system for institutional. Although I would contend that search is an institution, given its reach. Also, presenting a better mechanism for placing the returned information in a context allows for ‘nudging‘ cues, which seem to work better than more ‘authoritarian’ systems.
  • Starting The spreading of misinformation online
  • Before continuing on backlinks, I spent some time looking at the Microsoft Oxford system. LUIS is interesting, though I’m not sure exactly how to take advantage of it yet. I think this can be a chatbot construction kit? The WebLM system looks more immediately useful, kinda like AlchemyNLP. Maybe cheaper? You need a key, which you get here. And this is different from the Academic Knowledge API, which is also an Oxford project, but not listed on the Oxford site.
  • Got the SrBacklinkObject persisting
  • Adding backlinks to the ResultItemObject2 class. Whoops! Forgot that you have to set both relationships in a many-to-one:

    curResult.addBacklink(bo);
    bo.setResultObj(curResult);
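
One way to avoid forgetting the second call is to have the add method set the back-reference itself. This is only a plain-Java sketch of that variation: ResultSketch and BacklinkSketch are stand-ins for ResultItemObject2 and SrBacklinkObject, with the JPA annotations left out, and it is not how the real addBacklink currently behaves:

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for the "many" side; the real class would carry the JPA mapping.
class BacklinkSketch {
    ResultSketch resultObj;

    void setResultObj(ResultSketch r) {
        this.resultObj = r;
    }
}

// Stand-in for the owning result object.
class ResultSketch {
    final Set<BacklinkSketch> backlinks = new HashSet<>();

    // Keeps both sides of the relationship in sync in one call.
    void addBacklink(BacklinkSketch bo) {
        backlinks.add(bo);
        bo.setResultObj(this);
    }
}
```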

  • needed to split off the protocol from the curResult.link and add it back to the curResult.displayLink to get backlinks
  • Done and working. Kinda like the fallback strategy.
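
The protocol shuffle above can be sketched like this. It assumes link is the full result URL and displayLink is just the host, which is how the values look in my results; BacklinkTargetSketch and rootTarget are made-up names, and the "http" default for a link with no scheme is an assumption:

```java
import java.net.URI;

class BacklinkTargetSketch {
    // Takes the protocol (scheme) from the full link and puts it in front of
    // the display link, giving a root-level target for the backlink query.
    static String rootTarget(String link, String displayLink) {
        String scheme = URI.create(link).getScheme();
        if (scheme == null) {
            scheme = "http"; // assumption: default when the link has no scheme
        }
        return scheme + "://" + displayLink;
    }
}
```

For example, a link of https://krugman.blogs.nytimes.com/2016/01/23/how-to-make-donald-trump-president/ with a displayLink of krugman.blogs.nytimes.com gives https://krugman.blogs.nytimes.com.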

Phil 3.28.16

7:00 – 2:30 VTX

  • Took some notes on the MS Tay fiasco yesterday. Need to ping Peter Lee and see if I can get anywhere talking about Group Polarization Theory. Done
  • Microsoft Research Open source for academics
  • Microsoft Language Understanding Intelligent Service (beta) LUIS
  • Veracity Roadmap:Is Big Data Objective, Truthful and Credible?
  • Continuing The Law of Group Polarization
    • Page 193: The constraints of time and attention call for limits to heterogeneity; and - a separate point - for good deliberation to take place, some views are properly placed off the table, simply because time is limited and they are so invidious, implausible, or both. This point might seem to create a final conundrum: To know what points of view should be represented in any group deliberation, it is important to have a good sense of the substantive issues involved, indeed a sufficiently good sense as to generate judgments about what points of view must be included and excluded. But if we already know that, why should we not proceed directly to the merits? If we already know that, before deliberation occurs, does deliberation have any point at all?
    • The answer is that we often do know enough to know which views count as reasonable, without knowing which view counts as right, and this point is sufficient to allow people to construct deliberative processes that should correct for the most serious problems potentially created by group deliberation. What is necessary is not to allow every view to be heard, but to ensure that no single view is so widely heard, and reinforced, that people are unable to engage in critical evaluation of the reasonable competitors.
    • At E. THE DELIBERATIVE OPINION POLL: A CONTRAST
  • Now that I’ve gotten the queries behaving, working on the SemRushIO and BacklinkObject
    • Added configuration file
    • Nice to know. If SemRush finds nothing, it returns
      ERROR 50 :: NOTHING FOUND so, we can do two passes; if the specific result returns nothing, we can go to the root.
    • Built up the SemRush base class based on the JsonLoadable
    • Built the SrBacklinkObject
    • Loading the object successfully.
  • Fika
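
A sketch of the two-pass idea for the SemRush "nothing found" case: try the specific page first, then fall back to the root. The fetch argument is a hypothetical function from a target URL to the raw response text (the real call lives in SemRushIO), and TwoPassBacklinkSketch is a made-up name. Only the error string comes from the actual service output:

```java
import java.util.function.Function;

class TwoPassBacklinkSketch {
    // The literal text the service returns when it has no backlinks for a target.
    static final String NOTHING_FOUND = "ERROR 50 :: NOTHING FOUND";

    // First pass: the specific page. Second pass (only if the first reports
    // nothing found): the root of the site.
    static String fetchWithFallback(String pageUrl, String rootUrl,
                                    Function<String, String> fetch) {
        String response = fetch.apply(pageUrl);
        if (response != null && response.trim().startsWith(NOTHING_FOUND)) {
            response = fetch.apply(rootUrl);
        }
        return response;
    }
}
```

If the root also returns the error, the caller still gets the error text back and has to check for it.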

Phil 3.25.16

7:30 – 3:30 VTX

  • Saw The Who last night and got into bed after 1:00am. Sleeeeeeeeeepy.
  • Still browsing the team sensemaking paper over breakfast. There are some very similar goals. In group polarization, awareness of where the boundaries of the discussion lie helps to determine how the average viewpoint moves. Current search returns no context on where the results lie on those axes. Translucency in search could allow users to see ‘meta information’ about the search results that they have and where the results lie in that information space, while also providing a means to adjust the position in that space in a way that is not intrusive. Or something like that.
  • Group polarization works on chatbots. There is something really interesting here about measuring polarization. Not quite sure what exactly yet.
  • Continuing The Law of Group Polarization
    • Phrase of the day: ‘Skewed Argument Pools’
    • Page 187: And shifts toward more in the way of enclave deliberation will increase society’s aggregate “argument pool,” and hence enrich the marketplace of ideas, while also increasing extremism, fragmentation, hostility, and even violence.
    • First, it’s a neat thought to think of an interwoven pattern of Bubbles and Stars. Second, I think the continuum to be most interested in is the one from most bubble-ish to most star-ish for a given topic. Now that, in and of itself, is a big document classification/topic extraction problem, but I would submit that being able to visualize what that search result could look like could help to produce useful work in that direction. And there are proxies that can be used intact, such as papers. Bubbles are papers and topics that point at each other a lot, for example.
    • Page 187: It is important to ensure social spaces for deliberation by like minded persons, but it is equally important to ensure that members of the relevant groups are not isolated from conversation with people having quite different views.
    • ^^^Translucency^^^
    • Page 187: The most important point here is that those who emphasize the ideals associated with deliberative democracy tend to emphasize its preconditions, which include political equality, an absence of strategic behavior, full information, and the goal of “reaching understanding (pp. 52-94).”
    • At Page 189: B. THE VIRTUES OF HETEROGENEITY
  • Scrum – some big changes coming?
  • 11:00 all hands
  • Working on backlink object
    • Started query generator, SemRushIO.java. After some hiccups in getting the format of the query results right, the generation part is working. The reader should be pretty straightforward, though a little more complex/brittle than reading JSON. Here’s an example return:
      page_score;source_title;source_url;target_url;anchor;external_num;internal_num;first_seen;last_seen
      1;Visit AZ – Vacation Information for Arizona, the Grand Canyon State | Arizona Office of Tourism;http://visitarizona.com/places-to-visit/northern-arizona/monument-valley;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;29;116;1452309348;1452309348
      1;"""New Concertina Wire"" Fencing Around Closed Nevada Prison And Guard In Tower - Are Closed Prisons Going To Be Used As ""Fema Camps""? - Veteran Who Took Photos Followed By White Van";http://allnewspipeline.com/Veteran_Notes_New_Wire_Fencing_Nevada.php;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;14;23;1443887704;1452368321
      1;Evening Meeting. - Kirkham & Rural Fylde;http://rotary-ribi.org/clubs/page.php?ClubID=1161&PgID=514041;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;170;59;1444332718;1457694435
      1;Visit AZ – Vacation Information for Arizona, the Grand Canyon State | Arizona Office of Tourism;http://arizodiac.com/places-to-visit/northern-arizona/monument-valley;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;29;115;1447807529;1457622986
      1;Visit AZ – Vacation Information for Arizona, the Grand Canyon State | Arizona Office of Tourism;http://www.arizodiac.com/places-to-visit/northern-arizona/monument-valley;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;29;113;1454861531;1457744933
      1;UNIST - Sajun.org;http://sajun.org/index.php?diff=prev&oldid=2901321&printable=yes&title=UNIST;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;60;48;1448012851;1454851002
      1;1 عدد تمبر جان مون نت - دیپلمات - جمهوری فدرال آلمان 1977;http://tambrestan.com/-/5689-7-1983.html;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;27;1308;1452387237;1452387237
      1;About Puslinch Lake - Calmwaters Cottage & Fly Fishing;http://calmwaterscottage.ca/1337-2/;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;42;12;1454519315;1457440491
      1;PEABODY 100 - demetrioskritikos.com;http://demetrioskritikos.com/peabody100/;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;32;16;1454518621;1454518621
      1;PEABODY SPORTS - demetrioskritikos.com;http://demetrioskritikos.com/peabody-sports/;https://en.wikipedia.org/wiki/Geographic_coordinate_system;Coordinates;21;16;1454518773;1454518773
    • SemRushIO will create the backlink object
      • Calls the service, using a default or read-in key
      • Is fired at the same time the page source is loaded (in GuiVars.loadNextPage)
      • Creates a BackLinkObject. The data from SEMRush includes:
        • page_score
        • source_title
        • source_url
        • target_url
        • anchor (the text in the source)
        • external_num
        • internal_num
        • first_seen
        • last_seen
    • ResultItemObject2 changes
      • Set of BackLinkObjects
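A minimal sketch of what SemRushIO has to do, written in Python rather than the app's own language just to keep it short. The request parameters and the column names are the ones shown in the 3.24.16 entry; the names build_backlinks_url, BackLink and parse_backlinks are hypothetical, and treating first_seen/last_seen as Unix epoch seconds is an assumption based on the values that come back.

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlencode

# Endpoint as used in the 3.24.16 entry.
API_BASE = "http://api.semrush.com/analytics/v1/"


def build_backlinks_url(key, target, limit=10):
    """Assemble the backlinks request with the parameters tried so far."""
    params = {
        "key": key,
        "target": target,
        "type": "backlinks",
        "target_type": "root_domain",
        "display_sort": "page_score_desc",
        "display_limit": limit,
    }
    # safe="/" keeps the trailing slash of the target readable in the URL.
    return API_BASE + "?" + urlencode(params, safe="/")


@dataclass(frozen=True)  # frozen makes instances hashable, so they can live in a set
class BackLink:
    page_score: int
    source_title: str
    source_url: str
    target_url: str
    anchor: str
    external_num: int
    internal_num: int
    first_seen: int  # assumed to be Unix epoch seconds
    last_seen: int   # assumed to be Unix epoch seconds

    def first_seen_utc(self):
        return datetime.fromtimestamp(self.first_seen, tz=timezone.utc)


def parse_backlinks(text):
    """Parse the semicolon-delimited response (header row first) into a set."""
    # Strip per-line padding so an indented paste still parses.
    cleaned = "\n".join(line.strip() for line in text.splitlines() if line.strip())
    # The csv module copes with quoted titles that contain doubled quotes.
    reader = csv.DictReader(io.StringIO(cleaned), delimiter=";")
    links = set()
    for row in reader:
        links.add(BackLink(
            page_score=int(row["page_score"]),
            source_title=row["source_title"],
            source_url=row["source_url"],
            target_url=row["target_url"],
            anchor=row["anchor"],
            external_num=int(row["external_num"]),
            internal_num=int(row["internal_num"]),
            first_seen=int(row["first_seen"]),
            last_seen=int(row["last_seen"]),
        ))
    return links
```

Returning a set mirrors the "Set of BackLinkObjects" planned for ResultItemObject2: rows that are identical in every field collapse into one entry.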

Phil 3.24.16

7:00 – 10:00, 11:00 – 3:00 VTX

  • Was going to continue The Law of Group Polarization, but got sucked into the following. On a related note, I peeked at the group sensemaking paper from CSCW and realized that they are dealing with group polarization issues.
  • Soooooooooo, I went back to check the links that the google search “link:http://dotearth.blogs.nytimes.com” brings up. In looking at the pages (mostly other blog-like sites), the link to dotearth is almost always in the blogroll list that’s off to the side on many of these sites. For example look at the lower right on climatecentral.org, and you’ll see the link.
  • I think this makes sense. These are the generic pages that point to other generic pages. So I went back to Google and searched for ‘Paul Krugman blog‘ and then looked for the oldest post that I could find in the result, which was this one from January 16. Top ratings means that it has to be linked to a lot, so I tried “link:krugman.blogs.nytimes.com/2016/01/23/how-to-make-donald-trump-president/“. Alas, that doesn’t return anything, though “link:krugman.blogs.nytimes.com” does.
  • So I went to the Wikipedia most referenced pages page. Top ranked was Geographic coordinate system, which has over 600k inbound links. But –
  • Apparently, this is Google being coy. Searching for backlinks can be expensive. Moz has plans that start at $500/month. Bing also seems to have something with an API. Starting to check that out.
    • Added philfeldman.com to my bing webmaster profile. Had to add BingSiteAuth.xml to the site.
    • Nope, looks like it’s just the verified pages
  • Looking at SEMrush. Pretty straightforward and $15 buys you 7,500 lines of results.
    • Here’s the REST-ish API
    • Here’s the first format I’ve tried:
      http://api.semrush.com/analytics/v1/?key=xxxxxxxxxxxxxxxxxxxxxx&target=boardsanctions.com/&type=backlinks&target_type=root_domain&display_sort=page_score_desc&display_limit=10
    • The first thing I tried out was on my angular blog entry, and this is what comes back:
      page_score;source_title;source_url;target_url;anchor;external_num;internal_num;first_seen;last_seen
      1;Philip Feldman;http://philfeldman.com/resume.html;https://phifel.wordpress.com/;blog;7;2;1435698192;1452178691
      1;Phil Feldman Resume (WebGL);http://philfeldman.com/;https://phifel.wordpress.com/;My Primary Blog;15;4;1424207638;1452178080
      1;Phil Feldman Resume (WebGL);http://www.philfeldman.com/;https://phifel.wordpress.com/;My Primary Blog;15;4;1435689880;1452178091
    • Pretty good! Very clean. Then I tried boardsanctions.com:
      page_score;source_title;source_url;target_url;anchor;external_num;internal_num;first_seen;last_seen
      0;Plastic Surgery - Avoiding The Nightmare Case - Social Gaming Wiki FR;http://fr.socialgamingwiki.com/index.php/Plastic_Surgery_-_Avoiding_The_Nightmare_Case;http://boardsanctions.com/;Georgia Medical Board Actions;4;32;1454582397;1454582397
      0;Plastic Surgeon - Advice To Allow You Choose – TFC;http://www.tvfc.de/index.php?printable=yes&title=Plastic_Surgeon_-_Advice_To_Allow_You_Choose;http://boardsanctions.com/;Doctors to avoid;2;28;1452634501;1452634501
      0;Finding A Plastic Surgeon In Your Area – TheorieWiki;http://theoriewiki.org/index.php?oldid=8721&title=Finding_A_Plastic_Surgeon_In_Your_Area;http://boardsanctions.com/;Ohio Medical Board Actions;4;40;1451297137;1451297137
      0;How To Prepare For Your Breast Augmentation – TheorieWiki;http://theoriewiki.org/index.php?title=How_To_Prepare_For_Your_Breast_Augmentation;http://boardsanctions.com/;Doctor Complaints;4;33;1444916428;1453210146
      0;Finding A Plastic Surgeon In Your Area: Unterschied zwischen den Versionen – TheorieWiki;http://theoriewiki.org/index.php?diff=8723&oldid=8721&title=Finding_A_Plastic_Surgeon_In_Your_Area;http://boardsanctions.com/;Florida Medical Board Sanctions;4;39;1457400844;1457400844
      0;Benutzer:FelicaAngelo06 – TheorieWiki;http://theoriewiki.org/index.php?title=Benutzer%3AFelicaAngelo06;http://boardsanctions.com/;NC Medical Board Actions;5;35;1448297485;1458043290
      0;Benutzer:FelicaAngelo06 – TheorieWiki;http://theoriewiki.org/index.php?title=Benutzer%3AFelicaAngelo06;http://boardsanctions.com/;http://boardsanctions.com/;5;35;1448297485;1458043290
      0;Benutzer:FelicaAngelo06 – TheorieWiki;http://theoriewiki.org/index.php?printable=yes&title=Benutzer%3AFelicaAngelo06;http://boardsanctions.com/;NC Medical Board Actions;5;30;1456257160;1457931212
      0;Benutzer:FelicaAngelo06 – TheorieWiki;http://theoriewiki.org/index.php?printable=yes&title=Benutzer%3AFelicaAngelo06;http://boardsanctions.com/;http://boardsanctions.com/;5;30;1456257160;1457931212
      0;Finding A Plastic Surgeon In Your Area – TheorieWiki;http://theoriewiki.org/index.php?title=Finding_A_Plastic_Surgeon_In_Your_Area;http://boardsanctions.com/;Florida Medical Board Sanctions;4;33;1443858328;1457622408
    • Note that it’s a good thing I’m limiting the results to 10! The second thing to notice is every one of these links is SEO garbage. This one is my favorite. Now, this is ordered according to rank (however that’s calculated) and maybe there are better ways to order the results, but this does make me nervous about using backlinks without some checking. Maybe cosine similarity?
    • So the last option, if we want to spend some money, is to use the Common Crawl for backlinks. Not sure if it would make any difference, but there would be more insight. As an example, there’s wikireverse, which did exactly that.
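The cosine similarity idea for vetting backlinks could be sketched roughly as follows (in Python, for brevity). This assumes the text of the linking page and of the target page has already been extracted; which texts to compare and what threshold counts as "related" are open questions, and the function names are hypothetical. It uses plain term counts, not TF-IDF.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two term-count vectors (0.0 to 1.0)."""
    va, vb = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)
```

A linking page that shares no vocabulary with the target scores 0.0, so a low score could be one signal that a backlink is SEO filler rather than a real reference.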

Phil 3.23.16

7:00 – 4:00 VTX

  • Continuing The Law of Group Polarization. Slow going. Mostly because there is so much good stuff.
    • Overall, I’m arguing that, viewing Group Polarization through the lens of Connectivism, we can see how networked communities are often driven into bubbles, and that this property can be used to evaluate the trustworthiness of an information source. This has implications for design at different levels of abstraction:
      • At the UI level, it implies that giving a user more interactive control over the makeup of their news feed can inform them about the range of diversity in views about a particular topic and where their feed falls on that spectrum. Because this implies the presence of a larger group, it is possible to provide the user with the means (through direct manipulation) to interactively adjust the makeup of their news feed and expose them to more trustworthy sources.
      • At the document level, it implies that a mix of lexical and link analysis should be sufficient to allow for indexing a document on a trustworthiness scale.
      • At the network level, it implies that the relationships of documents within a network should be sufficient to place documents on a trustworthiness scale.
    • Page 182 – And when one or more people in a group know the right answer to a factual question, the group is likely to shift in the direction of accuracy.
      • This is the effect of the Star Pattern. So how does someone find the right answer?
    • Looking around for automated ways of doing Delphi Method
    • Page 184: Group polarization has particular implications for insulated “outgroups” and (in the extreme case) for the treatment of conspiracies. Recall that polarization increases when group members identify themselves along some salient dimension, and especially when the group is able to define itself by contrast to another group. Outgroups are in this position, of self-contrast to others, by definition. Excluded by choice or coercion from discussion with others, such groups may become polarized in quite extreme directions, often in part because of group polarization. It is for this reason that outgroup members can sometimes be led, or lead themselves, to violent acts.
    • Stopped at pg 186 – III. DELIBERATIVE TROUBLE.
  • Looking at IBM Bluemix briefly in case we have to go down that route
    • Registered.
    • Chrome (or at least the way I set up Chrome) and Bluemix do not get along. Trying Firefox. Still not great, but better.
    • Since it looks like we’re not going to do wacky mash-ups, back to work on the rating app.
  • Hit the MySQL max_allowed_packet limit. Changed to 4M. Other follow-on changes:
    ## of RAM but beware of setting memory usage too high
    innodb_buffer_pool_size = 64M
    innodb_additional_mem_pool_size = 8M
    ## Set .._log_file_size to 25 % of buffer pool size
    innodb_log_file_size = 20M
    innodb_log_buffer_size = 8M
    innodb_flush_log_at_trx_commit = 1
    innodb_lock_wait_timeout = 50

Phil 3.22.16

7:00 – 7:30

  • I think I want to install this??? https://github.com/dthree/cash
  • Still thinking about social trust and system trust. Today, Brussels was attacked by ISIS or ISIS sympathisers. An official when interviewed said that Belgium had been ‘prepared’ and was ready. No one was surprised that one group of people would try to kill another group of people. In other news, the iPhone from another set of killers was unflaggingly resisting attempts to unlock it. In many ways, every day (ironically because of the news) we are informed how horrible and untrustworthy people can be. And at the same time, every day, our machines generally do what they are supposed to do, and when looked at over time, get better at it. Is it any wonder that we have high system trust and low social trust (or high cynicism?).
  • This isn’t really new. Music can be pure. Musicians can be awful.
  • Continuing The Law of Group Polarization.
    • Page 181: Thus when the context emphasizes each person’s membership in the social group engaging in deliberation, polarization increases. This finding is in line with more general evidence that social ties among deliberating group members tend to suppress dissent and in that way to lead to inferior decisions.
      • So a website with a strong point of view (Breitbart or Moveon or PETA for example) should have less variance among commenters, while more balanced sites should have more variance? Data may be here: http://www.journalism.org/2014/10/21/political-polarization-media-habits/. I would think that these could be compared against edit histories on Wikipedia for a more Star-like pattern?
    • Persuasive Arguments Theory (PAT)?
    • Interaction with others increases decision confidence but not decision quality: evidence against information collection views of interactive decision making.
      • So in this case, the paper was scanned and protected, so I couldn’t do OCR on it. The workaround was to export as jpg, then open the first jpg in Acrobat DC, select Tools->organize pages then Insert->from file, shift-click all the pages, select ‘insert after’ and read them in. Once that’s done, go to ‘Enhance scans’ and run OCR on the file.
      • Anyway, the paper looks interesting, with quantitative support. I wonder why all this research seems to be focussed in the 1990s through early 2000s? The Wikipedia page on Group Polarization has a wider date range.
  • Working on the rating app. Worried that jsoup doesn’t seem to be pulling down pages that well
    • Got a 403 on https://stackoverflow.com/questions/10716828/joptionpane-showconfirmdialog using URL.openStream, but it works on Google.
    • Going to try a more web-scrapey pattern. Checking out Jaunt.
  • Changing the selection lists
  • Adding a check to see what ratings have changed as a user check – Done
  • Need to start on the backlinks.
  • Meeting with Aaron about next steps based on the
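On the 403 from URL.openStream noted above: one common cause is that some sites reject requests that arrive with a default client User-Agent. That is only a guess here, since nothing in these notes confirms it for this site. In Java the same idea would mean opening a URLConnection and calling setRequestProperty before reading. A small Python sketch of the pattern, with a hypothetical User-Agent string and helper names:

```python
from urllib.request import Request, urlopen

# Hypothetical value; any browser-like string would do for the experiment.
USER_AGENT = "Mozilla/5.0 (compatible; rating-app-sketch)"


def build_request(url):
    """Wrap the URL in a request that carries an explicit User-Agent header."""
    return Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url, timeout=10):
    """Download the page body as text; raises HTTPError on 4xx/5xx responses."""
    with urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

If the 403 persists with the header set, the block is coming from something else (rate limiting, cookies, or bot detection), and a scraping library is the better route.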

Phil 3.21.16

7:30 – 4:30 VTX

  • Class today
    • Two things – First, I wonder if we as researchers need to use the GSA standards for storing PII:
      • Encryption. Encrypt, using only NIST certified cryptographic modules, all data on mobile computers/devices carrying agency data unless the data is determined not to be sensitive, in writing, by your Deputy Secretary or a senior-level individual he/she may designate in writing;
      • Control Remote Access. Allow remote access only with two-factor authentication where one of the factors is provided by a device separate from the computer gaining access;
      • Time-Out Function. Use a “time-out” function for remote access and mobile devices requiring user re-authentication after thirty minutes of inactivity;
      • Log and Verify. Log all computer-readable data extracts from databases holding sensitive information and verify each extract, including whether sensitive data has been erased within 90 days or its use is still required; and
      • Ensure Understanding of Responsibilities. Ensure all individuals with authorized access to personally identifiable information and their supervisors sign at least annually a document clearly describing their responsibilities.
    • Second, basically every security measure we take in a closed network provides a value judgement to the owner of the network. But our high system trust prevents us from seeing that when we untag a picture of us doing something embarrassing, we’re essentially saying to Facebook ‘this is a guilty pleasure‘.
  • Taxes this evening
  • In Emergencies, Should You Trust a Robot?
  • Starting The Law of Group Polarization. And in a semi-related thought, I wonder if flocking behavior can be used to describe this kind of behavior along dimensions of belief???
    • Cass R. Sunstein
    • Wacky. The text was unrecognizable so the quotation manager wouldn’t work. Wound up exporting the PDF to jpg, then using the ‘combine files’ tool to import all the pages, combining them into one document again then running OCR on that. And this was the official file from the Journal of Political Philosophy, so go figure.
  • Did some shepherding of the Crawl configuration. Gregg was sending 4 CSEs.
  • Finished up the CSEkiller. Wrote up documentation and added it to the CommonComponents.
  • Back to getting the rating app working.
  • Changing Provider to PersonOfInterest
  • Need to add ‘Personal’, ‘Educational’ and ‘Other’ to sources

Phil 3.18.16

7:30 – 4:00 VTX

  • Continuing Presenting Diverse Political Opinions: How and How Much – Finished. Wow.
    • Some subjects wrote that they specifically did not want a list of solely supportive items and that they want opinion aggregators to represent a fuller spectrum of items, even if that includes challenge.
      • So here I’m wondering if interactivity in presenting the contents of the stories could be used as a proxy for these kinds of answers. Consistently setting values one way could mean more bubbly, while more change could imply star.
    • BubbleVsStarBehaviorMaybe
    • BubbleVsStarBehaviorMaybe2
    • In a plot of the percent agreeable items and satisfaction (Figure 5, top), the slope of the fit lines for the two list lengths follow each other quite closely, suggesting that count does not matter. When we plot the number of agreeable items (Figure 5, bottom), we can see a clear divergence. Furthermore, 2 agreeable items out of a total of 8 is superior to 2 agreeable items out of a total of 16 (t(7.373) = 3.3471, p<0.05). Clearly, the presence of challenging items, not just the count of agreeable items, drives satisfaction. We conclude that the remaining subjects as a group are challenge-averse, though a few individuals may be support-seeking.
  • News aggregator API list: http://www.programmableweb.com/category/News%20Services/apis?category=20250. I’m wondering if a study of slider ranking hooked up to a news aggregator feed might be useful.
  • Still working on the test harness to exercise the GoogleCSE.
  • Added command line args
  • Fixed stupid threading errors.
  • Checked in.