Monthly Archives: October 2016

Phil 10.31.16

7:00 – 5:00 ASRC

  • Working on survey
  • Reading Vector space models of semantics. Fifty pages. Ow!
  • Useful neural net blog
  • Meeting with Wayne
    • Add an open comments section “is there anything that you’d like to add”
    • UI+AI to determine online trustworthiness
    • Ran my proposal (with equalized docs – better!) through LMN. More words in the query mean more specificity, so I went up to 9.
      • The Scholar result is here, set for no patents.
      • Scholar, with ‘single counts’ is here.
      • The standard result, which is also interesting, is here.
      • Standard, with single counts (better?) is here.
    • Need to start on a spreadsheet of venues. (Get these off the laptop)
      • ICWSM 2017 – Abstracts Jan 6, full papers Jan 13
      • ICPSR is a site containing lots of data. Looking for qualitative corpora. Using ‘interview’ seems to bring up a lot, but you need to log in to use.

Phil 10.28.16

7:00 – 5:00 ASRC

  • Listening to BBC Business Daily this morning on trusting algorithms.
    • This site: Computational Legal Studies. Seems very relevant for all kinds of reasons. Does Aaron Massey know about them?
      • Daniel Martin Katz: “Research Interests include legal futurism, legal informatics,  law & entrepreneurship, quantitative modeling of litigation and jurisprudence, quantitative finance, computational legal studies, big data and the law, economics of the legal profession, positive legal theory, technology aided access to justice, legal complexity and the overall impact of information technology, analytics and automation on the future of the legal profession.”
      • Jon Zelner: “My research is focused on using spatial and social network analysis to prevent infectious diseases, with a focus on tuberculosis and diarrheal disease, and to understand social and epidemiological systems characterized by complex spatiotemporal dynamics.”
    • Cathy O’Neil, author of Weapons of Math Destruction. MathBabe.
  • Interview with Judea Pearl on Bayesian computation: “We are losing the transparency now with deep learning. I am talking to users who say “it works well” but they don’t know why. Once you unleash it, it has its own dynamics, it does its own repair and its own optimization, and it gives you the right results most of the time. But when it doesn’t, you don’t have a clue as to what went wrong and what should be fixed.”
  • Working on survey questions. Reading Internet, Mail, and Mixed Mode Surveys, chapter 6.
  • Need to add a section on ‘Transparent’ Cognitive Computing.
  • Took a look at Theresa’s slides. Need to call out some ML/AI word salad
  • Working on adding a key. Slow UI. Done, I think…
  • Add credit card to semrush
  • Meeting with Shimei. We started talking about recommender systems, but wound up talking about neural word/sentence/paragraph/topic embedding. Trained against something like Wikipedia and its outbound links, this might be a good way of getting a knowledge coordinate frame in which velocity and position can be determined.
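
A rough sketch of the knowledge coordinate frame idea from the meeting with Shimei: if each page a user reads is mapped to an embedding vector, that vector is a position, and the difference between consecutive vectors is a velocity. The 3-D vectors and the `velocities` helper are hypothetical. A real embedding would come from a trained model and have many more dimensions.

```python
import math


def subtract(a, b):
    """Component-wise difference of two equal-length vectors."""
    return [x - y for x, y in zip(a, b)]


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def velocities(positions):
    """Return (displacement, speed) pairs between consecutive positions."""
    result = []
    for prev, curr in zip(positions, positions[1:]):
        step = subtract(curr, prev)
        result.append((step, norm(step)))
    return result


# Hypothetical embeddings of three pages, in the order they were read.
trail = [
    [0.0, 0.0, 0.0],
    [3.0, 4.0, 0.0],
    [3.0, 4.0, 12.0],
]
steps = velocities(trail)
for displacement, speed in steps:
    print(displacement, speed)
```

A large speed would mean a jump to a distant topic, while a small one would mean the reader is staying in the same neighborhood.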

Phil 10.27.16

7:00 – ASRC

  • Get a new copy of SPSS? My license has expired.
  • Some thoughts about librarians in the internet age from PBS. This is more of the guide instead of recommender thing for Friday.
  • Start writing up NIH proposal
  • Working on Shimei’s evaluation. Oops – getting some kind of parse exception. Ah. I was calling the PDF parser. Cut and paste error.
    • Took terms from ratemyprofessor.com
      • Top three comments that were longer than a sentence for top three professors rated 5.0 and below, 4.0 and below, 3.0 and below, giving a corpus of 27 entries
      • Broken into three texts, sorted by LSI, with top 25% terms used
      • Centrality calculated using PageRank algorithm with terms as links
      • Culled out terms such as ‘there’ and ‘about’
      • Reweighted documents based on overall counts
      • Remaining sorted terms were:
        project
        write
        group
        stupid
        slide
        curve
        clearly
        member
        helpful
        workload
        unrealistic
        worst
        experience
        midterm
        school
        homework
        great
        exactly
        knowledgable
        science
        flexible
      • Build likert scale questionnaire grounded in the top x terms?
      • I can use the terms pointing into the text to ground the questions (e.g. “Three easy exams and two easy projects. It would be hard not to get an A.”)
    • Wrote up response to RFI with Aaron
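
The centrality step in the evaluation notes above (PageRank with terms as links) can be sketched roughly as follows. This is not the LMT code. It assumes two terms are linked whenever they appear in the same text, skips the LSI sorting and reweighting steps, and uses made-up sample texts built from a few of the terms in the list.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(documents):
    """Link every pair of distinct terms that appear in the same document."""
    graph = defaultdict(set)
    for terms in documents:
        for a, b in combinations(sorted(set(terms)), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected term graph."""
    nodes = sorted(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {}
        for node in nodes:
            # Each neighbor shares its rank equally among its own links.
            incoming = sum(rank[nbr] / len(graph[nbr]) for nbr in graph[node])
            new_rank[node] = (1.0 - damping) / n + damping * incoming
        rank = new_rank
    return rank


# Hypothetical term lists, one per text.
texts = [
    ["project", "group", "workload"],
    ["project", "midterm"],
    ["project", "homework", "workload"],
]
ranks = pagerank(cooccurrence_graph(texts))
for term in sorted(ranks, key=ranks.get, reverse=True):
    print(term, round(ranks[term], 3))
```

Terms that co-occur with many other terms float to the top, which is the property that makes the sorted list a candidate for grounding survey questions.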

Phil 10.25.16

PhD proposal! Passed! Here are the slides.

  • Shimei brought up a point about recommender systems. I think that as we currently have them, recommenders are like advisors. Maybe we need a guide more than a recommender? Could be an interesting study.
  • Aaron M. wants me to be clear on the differences between True and Antibubble
  • Aaron D. brought up the thought that all bubbles are bad. Need to point him at the section in the Law of Group Polarization that shows how small bubbles can allow nascent ideas to mature. I think that this is an important part of the agent model. I also think that agents should be able to ‘see’ how another agent feels about them (third belief/antibelief model)
  • Need to follow up with everyone to get their thoughts before everything fades.
    • Wayne – Monday
    • Don
    • Aaron
    • Shimei
    • Thom

Phil 10.24.16

7:00 – 4:00 ASRC

  • More slide prep.
  • Got the centrality calculations working in the research browser. Gotta get multithreading working
  • I’m now working on IRAD, not BRC
  • Meeting with Wayne

Phil 10.21.16

7:00 – 4:00 ASRC

  • Fixing slides
    • Add breadcrumbs in the lower border throughout – done
    • Add timeline that starts back at the initial politifact paper – done
    • Future
      • app dev takes 4-6 months
      • Running experiments/data gathering takes about a month
      • Analysis tools take about two months to build/learn/produce results
      • Writing papers takes about two months
      • Total time for each step = 9-11 months
      • RQ1 – August 2016 – May 2017
      • RQ2 – March 2017 – Dec 2017
      • RQ3 – Oct 2017 – Jul 2018
      • RQ4 – May 2018 – Feb 2019
      • RQ5 – Dec 2018 – Sept 2019
  • RateMyProfessor UMBC IS: https://www.ratemyprofessors.com/campusRatings.jsp?sid=1244
    • Build a corpus (by rating?)
    • Central terms for each rating
    • Survey questions built from those

Phil 10.20.16

ASRC 7:00 – 6:00

  • Fixing slides
    • Add breadcrumbs in the lower border throughout
    • Add timeline that starts back at the initial politifact paper
  • Timeline
    • December 2013
      • Register Dr. Lutters as advisor and committee chair, began building committee
      • Begin research into Trustworthy Anonymous Citizen Journalism
      • Lit review on writing stylometry identification
      • Begin JavaScript development of anonymous posting system
    • February 2014
      • Lost my ‘20% research time’; things slow down
    • April 2014
      • Comprehensive Exam
    • April 2014
      • Fact-checking research
      • Google Similarity Distance
    • June 2014
    • July 2014
    • August 2014
    • September 2014
      • Transition from YUI to Angular
    • October 2014
      • IRB approval
    • December 2014
    • January 2015
      • Mechanical Turk data gathering
      • TEI 2015
    • February 2015
      • Trustworthy Anonymous Journalism talk: http://philfeldman.com/iRevTalk/
      • MS in HCC awarded
    • March 2015
      • Start Bayesian and cluster analysis of Politifact data
      • PageRank and Eigentrust presentation for IS800
    • April 2015
      • Continued Bayesian and cluster analysis of Politifact data
    • May 2015
      • Finished Bayesian analysis and wrote first draft of the paper
    • June 2015
      • Submission of Trustworthy, Distrustworthy, and Newsworthy: Fact Checking by Inferred Reputation to CSCW 2016
    • July 2015
      • Began first stab at the full-stack news reader. Depends on the Google News RSS feed and Alchemy’s parser. Lots of work on links, authors, text, etc.
      • driverf1
    • October 2015
      • Initial version of the newsreader running – begin looking for data and testing. Stalled out.
    • November 2015
      • Starting Lit review
      • Started looking at Saracevic’s work and thinking about leveraging a user’s search for pertinent information inside a relevant SERP to understand the user’s information needs and browsing patterns.
      • Added dictionaries and centrality calculations to the webapp. Started to think about bubbles and antibubbles. Can’t seem to get good data, and the display is confusing. Need to rethink things.
    • December 2015
    • January 2016
      • Started using AtlasTi for LitReview
      • Discovered StanfordNLP
    • March 2016
      • CHIIR 2016
      • Discovered Group Polarization – started thinking about flocking models
    • April 2016
      • First successful calculations of AtlasTi output.
      • Started developing centrality app (Language Model Testbed, or LMT)
    • May 2016
      • LMT development
        • Added graph compare using bootstrapping
        • Term extraction using TF-IDF and LSI
        • PDF and html ingestion
        • Term filtering
        • Excel .xlsx output
      • Began centrality for qualitative research paper
    • June 2016
      • Finished paper, but incorporated into proposal
      • Began proposal
      • Began testing LMN on papers (CSCW 2016 corpora)
    • July 2016
      • Began agent-based modelling of information bubbles with spreadsheets
    • August 2016
      • Started simulation framework for Group Polarization agents
      • GP2
      • Added ARFF output to LMT
    • September 2016
      • Started Research Browser
    • October 2016
      • Submitted Proposal
      • Proposed
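
For reference, the TF-IDF term extraction listed under the May 2016 LMT work can be sketched like this. It is a minimal version using the plain tf * log(N / df) weighting, which is an assumption, since the notes do not say which variant LMT uses. The sample term lists are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    n_docs = len(documents)
    doc_freq = Counter()
    for terms in documents:
        doc_freq.update(set(terms))  # count each term once per document
    scores = []
    for terms in documents:
        counts = Counter(terms)
        total = len(terms)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical tokenized documents.
docs = [
    ["trust", "news", "trust"],
    ["news", "bubble"],
    ["news", "flock", "bubble"],
]
scores = tf_idf(docs)
for doc_scores in scores:
    print(sorted(doc_scores.items(), key=lambda kv: kv[1], reverse=True))
```

A term that appears in every document ("news" here) scores zero, so only terms that distinguish a document survive the cut.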

Phil 10.19.16

7:00 – 3:00 ASRC

  • Fixing slides
    • Add breadcrumbs in the lower border throughout
    • Move LMN slide to bridge lit review and current work
    • Overview Lit review slide that sets section titles with codes
    • Add timeline that starts back at the initial politifact paper
  • Moving content for Vinny
  • Starting on the lazy loading for the research browser
  • Need to move the settings out of the search page to the setting page and get them
  • TensorFlow seminar
    • @martin_wicke
    • TF is tooling to manage complexity
    • TF can run any algorithm: addition -> Bayes -> deep neural nets -> etc
    • Multiple chained Neural Networks (NN)
    • Data center scale systems
    • Flow graph is written in Python and then compiled
    • State is maintained in nodes
    • First build the graph, then run it.
    • TensorBoard (visualization)
    • relu
    • TensorFlow uses CUDA8, so NVidia
    • TensorFlow runs on Linux (Ubuntu) and MacOS. A Windows version that will support GPUs is coming soon (a week or two)
    • Tensorflow installed with pip?
    • Placeholders are like variables that require data from the user or TF doesn’t work.
    • Pre-trained models exist. There is a tutorial that gives quick results by using a pre-trained model
    • Performance tracing is available by use of flags. Run times and communication times
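
To make the "first build the graph, then run it" and placeholder points from the seminar concrete, here is a toy deferred-execution graph in plain Python. It is not TensorFlow and does not use its API. `Node`, `placeholder`, `constant`, `add`, `mul`, `relu` and `run` are all hypothetical names for illustration, and the toy ignores state held in nodes.

```python
class Node:
    """A node in a toy dataflow graph. Nothing is computed until run()."""

    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op
        self.inputs = inputs
        self.value = value
        self.name = name


def placeholder(name):
    return Node("placeholder", name=name)


def constant(value):
    return Node("constant", value=value)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


def relu(a):
    return Node("relu", (a,))


def run(node, feed):
    """Evaluate a node recursively, taking placeholder values from feed."""
    if node.op == "placeholder":
        if node.name not in feed:
            raise KeyError("placeholder '%s' needs a value" % node.name)
        return feed[node.name]
    if node.op == "constant":
        return node.value
    args = [run(i, feed) for i in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    if node.op == "relu":
        return max(0.0, args[0])
    raise ValueError("unknown op: %s" % node.op)


# Step 1: build the graph for relu(2 * x - 3). No arithmetic happens here.
x = placeholder("x")
y = relu(add(mul(constant(2.0), x), constant(-3.0)))

# Step 2: run it, supplying data for the placeholder.
result_pos = run(y, {"x": 4.0})
result_neg = run(y, {"x": 1.0})
print(result_pos, result_neg)
```

Running the graph without a value for `x` raises an error, which mirrors the note that placeholders require data from the user.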

Phil 10.18.16

6:30 – 6:00 ASRC

  • IS Distinguished Speaker Dr. Margaret Burnett
    • Information foraging theory – Peter Pirolli biologically inspired mathematical models. Predator model?
    • Ignorance leads to unwitting barriers
    • Barrier level, not system level for design
    • GenderMag: The goal is problem solving, not browsing. This is another behavioral marker? Depth-first vs. breadth first vs all first
      • Personas
        • Motivations
        • Information Processing Style
        • Computer self-efficacy
        • Risk averseness (confirm/avoid???)
        • Technology learning style (tinkering vs. tinkering pausfully – behavioral cue??)
  • TensorFlow lecture at noon. Video link. Nope – tomorrow.
  • Adding some code that will hide common prefixes in documents – done
  • Extremely Fast Text Feature Extraction for Classification and Indexing
  • Machine learning meeting. We’re going to start with a deployable, scalable simple junk filter.
  • Long discussion with Aaron about priorities and search spaces.
  • Fixing slides
    • Add breadcrumbs in the lower border throughout
    • Add overview slide
    • Tie back to diversity of thought vs monolithic thought
    • Move LMN slide to bridge lit review and current work
    • Overview Lit review slide that sets section titles with codes
      • Motivations – setting the stage
    • Rework domain independence slide. Maybe we’re not proving, but disproving the negative?
    • Add timeline that starts back at the initial politifact paper
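
A small sketch of the prefix-hiding item from this entry. It assumes the prefixes in question are leading characters shared by every document name in a listing, which is a guess, since the notes do not say. `hide_common_prefix` and the file names are hypothetical.

```python
import os


def hide_common_prefix(names):
    """Strip the leading characters that every name in the list shares."""
    if len(names) < 2:
        return list(names)  # nothing to compare against
    prefix = os.path.commonprefix(list(names))
    return [name[len(prefix):] for name in names]


# Hypothetical document names.
documents = [
    "proposal_ch1_intro.pdf",
    "proposal_ch2_litreview.pdf",
    "proposal_ch3_methods.pdf",
]
short_names = hide_common_prefix(documents)
print(short_names)
```

`os.path.commonprefix` compares character by character, so the cut can land mid-word. A real display might back up to the last separator.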

Phil 10.17.16

7:00 – 6:00 ASRC

  • More slides. Gonna have to edit…
  • Finish slides today and walkthrough
  • Extract and ingest the chapters of the proposal.
  • Presented! 31 minutes
    • Add breadcrumbs in the lower border throughout
    • Add overview slide
    • Move LMN slide to bridge lit review and current work
    • Overview Lit review slide that sets section titles with codes
      • Motivations – setting the stage
    • Rework domain independence slide. Maybe we’re not proving, but disproving the negative?
    • Add timeline that starts back at the initial politifact paper

Phil 10.13.16

7:00 – 4:00 ASRC

Phil 10.12.16

7:00 – 6:00 ASRC

  • Word Vector analysis on 18th century lit. Very interesting: http://ryanheuser.org/word-vectors-4/
  • More slides
  • Starting on switching out the combobox to a custom component. Using this tutorial
  • Components work
  • Adding tabs. It works, but the second
  • Meeting with Wayne. I should time slides for a 30 minute talk
    • Follow up with Barbara and get a room nailed down
    • Also met with Shimei – I’ll send out a note to the rest of the committee asking if there are any items that they want me to clarify in the talk from their reading of the proposal
    • Created a doodle for the walkthrough
  • More reading The Last Place on Earth. Amundsen was literally raised into arctic exploration techniques. His explorer framing is different from Scott’s? We’ll find out.

Phil 10.11.16

7:00 – 5:00 ASRC

  • Heard about LoRa on the radio this morning. This might be the bike box tracker technology I’m looking for. Not sure if the networks are active at airports though.
  • More slides. Putting questions that don’t seem to fit anywhere else at the back
  • Finished first pass through motivations. Started on Lit Review and a bit of Current Work.
  • Work on Google CSE integration in the Research Browser today? Then lazy loading of the pages behind the listing.
  • Getting the search results using the ONLY_COM engine for now (TODO: Add more engines later)
  • Working on taking the items and making them data/graphics objects that can be included in a list. Going to start with a combobox to get the mechanics working right and then walk back from that.
  • Combobox list works! There is only one combobox that is assigned to the selected cell, so the list has to be repopulated on selection. Otherwise it’s all straightforward.
  • Started reading The Last Place on Earth.
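
For the Google CSE integration mentioned in this entry, a request URL for the Custom Search JSON API can be assembled roughly as below. This sketch only builds the URL and makes no network call. The key and engine id strings are placeholders, and `build_cse_url` is a hypothetical helper, not the Research Browser's code.

```python
from urllib.parse import urlencode

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_url(query, api_key, engine_id, start=1, num=10):
    """Build a Custom Search request URL for one page of results."""
    params = {
        "key": api_key,    # API key (placeholder below)
        "cx": engine_id,   # id of the custom search engine to query
        "q": query,
        "start": start,    # 1-based index of the first result
        "num": num,        # results per page
    }
    return CSE_ENDPOINT + "?" + urlencode(params)


# Placeholder credentials; the engine id stands in for the ONLY_COM engine.
url = build_cse_url("group polarization", "YOUR_API_KEY", "ONLY_COM_ENGINE_ID")
print(url)
```

Adding more engines later would only mean passing a different `engine_id`, and paging through results means stepping `start` by `num`.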