Monthly Archives: June 2016

Phil 6.24.16

7:00 – 3:30 VTX

  • Thinking about PredictWise’s ostensible failure to get Brexit right. (Brexit – PredictWise-6.24.2016). The comments are also hugely interesting. Added it to the corpus and coded it lightly. Don’t want to lose it. From the PredictWise blog:
    • Second, market did not pick up on enough idiosyncratic data in the field. Maybe this is because traders do not have the pulse of working masses? Possible. But, I go back to the first point as probably more important. This morning I have a new line of research that I am obsessed with: political impact on financial markets is under-explored and fascinating. I suspect that we underestimated the effect that the volatility of the underlying currency had on the prediction and financial markets.
  • Back to creating friction and refining that part of the contribution. I need to find that article that does fact-checking by looking at Wikipedia hops. Found it: Computational Fact Checking from Knowledge Networks
  • Writing report. Done!

Phil 6.22.16

6:45 – 4:45 VTX

  • Running analytics on the CSCW corpora
  • My codebase at home was out of date, and I was having my missing bouncycastle jar file issue, so I updated the development folders. I also started updating my IntelliJ, which has taken 10 minutes so far…
  • First pass of running the CSCW data through the tool.
    There are three categories:
    • CSCW17 – these are the submitted papers
    • MostCited – These are (generally) the most cited paper by the author where they are first or last author. It took me a while to start doing this, so the set isn’t consistent.
    • MostRecent – These are the most recent papers that I could get copies of. Same constraints and caveats as above.
    I also deleted the term ‘participants’, as it overwhelmed the rest of the relationships and is a pretty standard methods element that I don’t think contributes to the story the data tell.
    Here are the top ten items, ranked by influence of terms inside of the top 52 items in the LSI ranking. It’s kind of interesting…
    CSCW2017 terms | Most Cited terms | Most Recent terms | Most Cited papers    | Most Recent papers
    older          | social           | media             | Sean P. Goggins.pdf  | Donald McMillan.pdf
    ageism         | student          | privacy           | Chinmay Kulkarni.pdf | Mark Rouncefield.pdf
    adult          | photo            | twitter           | Airi Lampinen.pdf    | Sarah Vieweg.pdf
    blogger        | awareness        | behavior          | Cliff Lampe.pdf      | Jeffrey T. Hancock.pdf
    ageist         | object           | device            | Anne Marie Piper.pdf | David Randall.pdf
    platform       | class            | interview         | Frank Bentley.pdf    | Cliff Lampe.pdf
    workplace      | facebook         | notification      | Mor Naaman.pdf       | Sean P. Goggins.pdf
    woman          | friend           | deception         | Morgan G. Ames.pdf   | Airi Lampinen.pdf
    gender         | flickr           | phone             | Gabriela Avram.pdf   | Wayne Lutters.pdf
    snapchat       | software         | facebook          | Lior Zalmanson.pdf   | Vivek K. Singh.pdf
  • Finished rating! 530 pages. Now I need to get the outputs to Excel. I think the view_ratings should be enough…?
  • I don’t have just names alone, but I’m going to assume that the initial set of queries (‘board actions’, ‘criminal’, ‘malpractice’ and ‘sanctions’) may modestly improve the search. So with a proxy for the current system, with my small data set, I have the following results:
    • Hits or near misses – 46 pages or 16.7% of the total pages evaluated
    • Misses – 230 or 83.3% of the total pages evaluated

    With the new CSE configuration (exactTerms=<name permutation>, query=<full state name>, orTerms=<TF-IDF string>), we get much better results:

    • Hits or near misses – 252 pages or 78% of the total pages evaluated
    • Misses – 71 or 22% of the total pages evaluated

    So the hit rate rises from 16.7% to 78%, roughly 4.7 times the baseline. (The raw hit count rises by about 450%, from 46 to 252, but the two runs evaluated different numbers of pages.)

  • Good presentation on document similarity
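
The hit-rate comparison above can be rechecked from the raw counts. This is a minimal sketch, not part of the rating tool: the class and method names are made up for illustration, and only the four page counts come from the log. It reports both the ratio of the two hit rates and the increase in raw hit counts, since the two runs evaluated different numbers of pages.

```java
// Hypothetical helper for checking the arithmetic; only the counts come from the log.
final class HitRateComparison {

    /** Fraction of evaluated pages that were hits or near misses. */
    static double hitRate(int hits, int misses) {
        int total = hits + misses;
        if (total == 0) {
            throw new IllegalArgumentException("no pages evaluated");
        }
        return (double) hits / total;
    }

    /** How many times larger the new hit rate is than the baseline rate. */
    static double rateRatio(double baselineRate, double newRate) {
        return newRate / baselineRate;
    }

    /** Percentage increase in the raw number of hits, ignoring sample size. */
    static double rawHitIncreasePercent(int baselineHits, int newHits) {
        return 100.0 * (newHits - baselineHits) / baselineHits;
    }

    public static void main(String[] args) {
        double baseline = hitRate(46, 230);  // proxy for the current system
        double improved = hitRate(252, 71);  // new CSE configuration
        System.out.printf("baseline=%.1f%% new=%.1f%% rate ratio=%.2fx raw hit increase=%.0f%%%n",
                baseline * 100, improved * 100,
                rateRatio(baseline, improved),
                rawHitIncreasePercent(46, 252));
    }
}
```

Comparing rates rather than raw counts keeps the result independent of how many pages each run happened to evaluate.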

Phil 6.21.16

7:00 – 5:00 VTX

  • Finished MostRecent.
  • Checked Data directory into SVN
  • Testing rating algorithms. Seems to be working pretty well 🙂
  • Rated all day. Should finish tomorrow.
  • Worked through paragon and fallen angel patterns with Aaron. Pulled out my Bayesian spreadsheets and realized I no longer understood them…

Phil 6.20.16

7:00 – 7:00 VTX

  • Building chair corpus = Current and Cited
  • Filled MostCited.
  • Rating a few more pages. Still not getting any name hits.
  • Going to advanced search and entering items into each field, I get a different looking query:
    • These seem to be the important differences
    • as_q=New+York — This is a ‘normal’ query
    • as_epq=Nader+Golian — This must be in the results
    • as_oq=+license+board+practice+patient+physician+order+health+practitioner+medicine+medical — at least one of these must be in the result
  • Going to add a test to look for the name in the query (and the state?) and at least check the NA box and throw up a dialog. Could also list the number of occurrences by default in the notes
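
The three advanced-search parameters noted above can be assembled into a query string along these lines. This is a sketch under assumptions: the base URL, class name, and method names are illustrative and are not taken from the actual tool. Only the parameter names (as_q, as_epq, as_oq) and the example values come from the log.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical builder; the endpoint below is an assumption, not the tool's configuration.
final class AdvancedQueryBuilder {

    static final String BASE_URL = "https://www.google.com/search";

    /** Form-encodes a value, so spaces become '+'. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /**
     * @param state    the 'normal' query (as_q), e.g. the full state name
     * @param fullName exact phrase that must appear in the results (as_epq)
     * @param orTerms  space-separated terms, at least one of which must appear (as_oq)
     */
    static String buildUrl(String state, String fullName, String orTerms) {
        Map<String, String> params = new LinkedHashMap<>();  // keeps parameter order stable
        params.put("as_q", state);
        params.put("as_epq", fullName);
        params.put("as_oq", orTerms);

        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            query.add(entry.getKey() + "=" + encode(entry.getValue()));
        }
        return BASE_URL + "?" + query;
    }
}
```

For example, `AdvancedQueryBuilder.buildUrl("New York", "Nader Golian", "license board practice")` yields a URL whose query string is `as_q=New+York&as_epq=Nader+Golian&as_oq=license+board+practice`.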

1:00 – Patrick’s proposal

  • Framing of problem and researcher
  • Overview of the problem space
    • Ready to Hand
    • Extension of self
  • Assistive technology abandonment
    • Ease of Acquisition
    • Device Performance
    • Cost and Maintenance
    • Stigma
    • Alignment with lifestyles
  • Prior Work
    • Technology Use
    • Methods Overview
      • Formative User Needs
      • Design Focus Groups
      • Design Evaluation and Configuration Interviews
    • Summary of Findings
    • Priorities
      • Maintain form factor
      • Different controls for different regions
      • Familiarity
      • Robustness to environmental changes
    • Potential of the wheelchair
      • Nice diagram. Shows the mapping from a chair to a smartphone
    • Inputs to wheelchair-mounted devices
    • Force sensitive device, new gestures and insights
    • Summary (This looks like research through design. Why no mention?)
      • Prototypes
      • Gestures
      • Demonstration
  • Proposed Work
    • Passive Haptic Rehabilitation
      • Can it be done
      • How effective
      • User perception
      • Study design!!!
    • Physical Activity and Athletic Performance
      • Completed: Accessibility of fitness trackers. (None of this actually tracks to papers in the presentation)
      • Body location and sensing
      • Misperception
        • Semi-structured interviews
        • Low experience / High interest (Lack of system trust!)
    • Chairable Computing for Basketball
      • Research Methods
        • Observations
        • Semi-structured interviews
        • Prototyping
        • Data presentation – how does one decide what they want from what is available?
  • What is the problem – Helena
    • Assistive technologies are not being designed right. We need to improve the design process.
    • That’s too general – give me a citation that says wheelchair-related technology has high abandonment
    • Patrick responds with a bad design
    • Helena – Isn’t the principle user-centered design? How has the HCI community done this before WRT areas other than wheelchairs for interacting with computing systems?
    • Helena – Embodied interaction is not a new thing, this is just a new area. Why didn’t you group your work? Is the prior analysis not embodied? Is your prior work not aligned with this perspective?
  • How were the design principles used to develop and refine the pressure sensors?

More Reading

  • Creating Friction: Infrastructuring Civic Engagement in Everyday Life
    • This is the confirming information bubble of the ‘ten blue links’: Because infrastructures reflect the standardization of practices, the social work they do is also political: “a number of significant political, ethical and social choices have without doubt been folded into its development” ([67]: 233). The further one is removed from the institutions of standardization, the more drastically one experiences the values embedded into infrastructure—a concept Bowker and Star term ‘torque’ [9]. More powerful actors are not as likely to experience torque as their values more often align with those embodied in the infrastructure. Infrastructures of civic engagement that are designed and maintained by those in power, then, tend to reflect the values and biases held by those in power.
  • Meeting with Wayne. My hypothesis and research questions are backwards but otherwise good.

Phil 6.17.16

8:00 – 3:00 VTX

  • Finishing up the chairs. And then I have some more coding to do on the new papers.
  • Anatomy of the Unsought Finding. Serendipity: Origin, history, domains, traditions, appearances, patterns and programmability (stored here)
  • Supporting serendipity: Using ambient intelligence to augment user exploration for data mining and web browsing (stored here)
  • Starting to create new json file of practitioners
    {
    	"first_name": "Nader",
    	"last_name": "Golian",
    	"pp_state": "New York",
    	"gender": "male"
    }, {
    	"first_name": "Ata",
    	"middle_name": "Ollah",
    	"last_name": "Mehrtash",
    	"pp_state": "New York",
    	"gender": "male"
    }, {
    	"first_name": "Souheil",
    	"last_name": "Saba",
    	"pp_state": "New Jersey",
    	"gender": "male"
    }, {
    	"first_name": "Kamal",
    	"last_name": "Patel",
    	"pp_state": "Illinois",
    	"gender": "male"
    }
  • Built new queries
    sst = new SmartSearchTerm("schedule guideline license substance board sentence increase other prescription commission", null, null);
    sst = new SmartSearchTerm("schedule criminal license sentence prescription doctor defendant board practice research county", null, null);
    sst = new SmartSearchTerm("license board practice patient physician order health practitioner medicine medical", null, null);
    sst = new SmartSearchTerm("physician license professional return number effective", null, null);
    sst = new SmartSearchTerm("respondent consent committee probation agreement pursuant", null, null);
  • Hmm. Getting good pages that are like this, but no matches on the names yet. Tempted to have an option that rejects pages that don’t have an entity with the right name in it and let it cook.
  • Helped write up ML text for SOW.
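
The option mentioned above, rejecting pages that do not contain the right name, could start out as simple as the sketch below. Note the substitution: the note talks about matching an entity, while this uses plain whole-word, case-insensitive text matching, with no entity extraction. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical filter; plain text matching stands in for real entity extraction.
final class NameFilter {

    /** True when the word occurs in the text as a whole word, ignoring case. */
    static boolean containsWord(String text, String word) {
        Pattern pattern = Pattern.compile(
                "\\b" + Pattern.quote(word) + "\\b",
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        return pattern.matcher(text).find();
    }

    /** A page is kept only if both the first and the last name appear somewhere in it. */
    static boolean mentions(String pageText, String firstName, String lastName) {
        return containsWord(pageText, firstName) && containsWord(pageText, lastName);
    }
}
```

Requiring both names anywhere on the page is deliberately loose. It will accept pages where the two names belong to different people, so it only works as a cheap first pass before rating.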

Phil 6.16.16

7:30 – 5:30 VTX

  • Still working through chairs. It’s been a really good exercise. I’m finding some very good stuff.
  • Worked through equations that would help find paragons, oscillators, cyphers and criminals.
  • Committed all the changes that I forgot about yesterday.
  • “name” <state> license board practice patient physician order health practitioner medicine medical seems to work well.
  • Building auto-trimmed matrix based on 50% of rank value. Done!

Phil 6.15.16

7:00 – 10:00, 12:00 – 4:00 VTX

  • Got the official word that I should be charging the project for research. Saved the email this time.
  • Continuing to work on the papers list
  • And in the process of looking at Daniele Quercia’s work, I found Auralist: introducing serendipity into music recommendation, which was cited by An investigation on the serendipity problem in recommender systems, which has the following introduction:

    • In the book ‘‘The Filter Bubble: What the Internet Is Hiding from You’’, Eli Pariser argues that Internet is limiting our horizons (Parisier, 2011). He worries that personalized filters, such as Google search or Facebook delivery of news from our friends, create individual universes of information for each of us, in which we are fed only with information we are familiar with and that confirms our beliefs. These filters are opaque, that is to say, we do not know what is being hidden from us, and may be dangerous because they threaten to deprive us from serendipitous encounters that spark creativity, innovation, and the democratic exchange of ideas. Similar observations have been previously made by Gori and Witten (2005) and extensively developed in their book ‘‘Web Dragons, Inside the Myths of Search Engine Technology’’ (Witten, Gori, & Numerico, 2006), where the metaphor of search engines as modern dragons or gatekeepers of a treasure is justified by the fact that ‘‘the immense treasure they guard is society’s repository of knowledge’’ and all of us accept dragons as mediators when having access to that treasure. But most of us do not know how those dragons work, and all of us (probably the search engines’ creators, either) are not able to explain the reason why a specific web page ranked first when we issued a query. This gives rise to the so called bubble of Web visibility, where people who want to promote visibility of a Web site fight against heuristics adopted by most popular search engines, whose details and biases are closely guarded trade secrets.
    • Added both papers to the corpus. Need to read and code. What I’m doing is different in that I want to add a level of interactivity to the serendipity display that looks for user patterns in how they react to the presented serendipity and incorporate that pattern into a trustworthiness evaluation of the web content. I’m also doing it in Journalism, which is a bit different in its constraints. And I’m trying to tie it back to Group Polarization and opinion drift.
  • Also, Raz Schwartz at Facebook: Editorial Algorithms: Using Social Media to Discover and Report Local News
  • Working on getting all html and pdf files in one matrix
  • Spent the day chasing down a bug: if the string being annotated is too long (I’ve set the number of words to 60), we skip it, which leads to a divide-by-zero issue. Fixed now.
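
The divide-by-zero bug described above has a familiar shape: items get skipped, the count of processed items ends up at zero, and a later average divides by it. The sketch below shows one way to guard against that. It is only an illustration: the quantity being averaged, the class, and the method names are all assumptions, and only the 60-word limit comes from the log.

```java
import java.util.List;

// Hypothetical illustration of the guard; not the annotation code itself.
final class AnnotationAverage {

    static final int MAX_WORDS = 60;  // word limit mentioned in the log

    static int wordCount(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /**
     * Mean word count over the strings short enough to annotate.
     * Returns 0.0 when every string was skipped, instead of dividing by zero.
     */
    static double meanWordsOfAnnotated(List<String> strings) {
        int annotated = 0;
        int totalWords = 0;
        for (String s : strings) {
            int words = wordCount(s);
            if (words > MAX_WORDS) {
                continue;  // too long: skipped, so it must not count toward the divisor
            }
            annotated++;
            totalWords += words;
        }
        if (annotated == 0) {
            return 0.0;  // guard: nothing was annotated
        }
        return (double) totalWords / annotated;
    }
}
```

Returning 0.0 is one possible choice. Throwing or returning an empty optional would make the "nothing annotated" case harder to overlook downstream.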

Phil 6.14.16

7:00 – 3:00 VTX

  • Working on finishing the papers of the CSCW chairs
  • Built a new version of RatingApp and sent over to Andy to deploy.
  • More rating. Once done, run through the domains and see what comes up.
    • Finished!
    • Ok, I have a mix of html, pdf and msword docs.
      • Change corpusManager so the config file can handle multiple types
      • Convert the docs to pdf. Done
      • Parsing. Ran into a java.lang.NoClassDefFoundError: org/bouncycastle/jce/provider/BouncyCastleProvider error. Added the org.bouncycastle:bcprov-jdk16:1.46 and now all is fine…?
  • Need to trap a reset connection and resubmit. Done
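
Trapping a reset connection and resubmitting, as noted above, can be sketched as a small retry wrapper. This is an assumption about the general approach rather than the code that went into the tool. The `Request` interface, class name, and attempt limit are hypothetical, and there is no backoff between attempts.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.SocketException;

// Hypothetical retry wrapper; the real request code is not shown in the log.
final class RetryOnReset {

    /** A request that may fail with an I/O error, such as a connection reset. */
    interface Request<T> {
        T send() throws IOException;
    }

    /**
     * Sends the request, resubmitting when a SocketException (for example
     * "Connection reset") is raised, up to maxAttempts attempts in total.
     * Other I/O errors are not retried.
     */
    static <T> T submitWithRetry(Request<T> request, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SocketException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return request.send();
            } catch (SocketException e) {
                last = e;  // remember the failure and try again
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        // Every attempt failed with a socket error; surface the last one.
        throw new UncheckedIOException(last);
    }
}
```

A usage example would be `RetryOnReset.submitWithRetry(() -> fetchPage(url), 3)`, where `fetchPage` is a placeholder for whatever call actually submits the query.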

Phil 6.13.16

6:30 – 2:30 VTX

Phil 6.9.16

6:00 – 12:00 Writing

  • Going to go through the RQs and describe how to address them
  • Start with the back end and my local cohort, which I can assume to be diversity-seeking because of where they are.
  • Iteratively develop tool so that it gets used for diversity-related activities
  • Logs and questionnaires.
  • Scraping for Google Scholar and CaseLaw? Java code is here.
  • Looks like Google Scholar has also started to add the concept of pertinence in?
  • Finished the Research Plan. Do need a timeline.
  • Finished discussion/conclusion. Done(ish)!

Phil 6.8.16

6:30 – 4:30 Writing

  • Wondering if I should add a section on trust and credibility
  • Huh – just saw this on Google image search. You get a bar of context words that allow for drilling down into a result
  • Reworked a lot of the paper since the whole anonymous part has been shelved
  • Started on the hypotheses and research questions in the plan section