Phil 11.10.16

7:00 – 4:30 ASRC

  • Had some thoughts last night about how flocking at different scales in Hilbert space might work. Flocks built upon flocks. There is some equivalent of mass and velocity, where mass might be influence (positive and negative attraction). Velocity is related to how fast beliefs change.
  • Also thought about maps some more, weather maps in particular. A weather map maintains a coordinate frame, even though nothing in that frame is stable. Something like this, with a sense of history (playback of the last X years) could provide an interesting framework for visualization.
  • Continuing Novelty Learning via Collaborative Proximity Filtering review. Done! Need to submit both now.
  • Adding StrVec to the ARFF outputs – done
  • Starting this tutorial on Nonnegative Matrix Factorization
  • Working on building JSON files for loading CI
  • Meeting about Healthdatapalooza

Phil 11.9.16

7:00 – 5:00 ASRC

  • President-elect Trump. Wow. Just wow.
  • Starting Novelty Learning via Collaborative Proximity Filtering review
  • Working with Aaron to get the java version of the classifier working
  • LibRec (http://www.librec.net) is a Java library for recommender systems (Java version 1.7 or higher required). It implements a suite of state-of-the-art recommendation algorithms and consists of three major components: Generic Interfaces, Data Structures, and Recommendation Algorithms. This should save a *lot* of work. Remember to thank and cite.
  • The forces that drove this election’s media failure are likely to get worse – Lots of stuff on echo chambers and social media
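To make concrete the kind of algorithm a library like LibRec packages, here is a minimal user-based collaborative-filtering sketch in plain Python. This is not LibRec's API, just an illustration of the core idea, and the toy rating matrix is made up:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def predict(ratings, user, item):
    """Predict ratings[user][item] from users who rated `item`,
    weighted by cosine similarity of their rating vectors (0 = unrated)."""
    num = den = 0.0
    for other, row in ratings.items():
        if other == user or row[item] == 0:
            continue
        sim = cosine(ratings[user], row)
        num += sim * row[item]
        den += abs(sim)
    return num / den if den else 0.0

# toy 3-user x 3-item rating matrix (0 = unrated)
ratings = {"u1": [5, 3, 0], "u2": [4, 3, 4], "u3": [1, 1, 5]}
print(predict(ratings, "u1", 2))  # estimate u1's rating of item 2
```

u1 is much more similar to u2 than to u3, so the prediction lands near u2's rating of the item.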

Phil 11.8.16

7:00 – 6:30 ASRC

Phil 11.7.16

6:30 – 3:00 ASRC

  • Notes from Aaron to discuss today:
    • http://karpathy.github.io/2015/05/21/rnn-effectiveness/ Great article on RNNs. Sample code available too.

    • Slider-based decisions for clustering topic models, where we weight similarity contributions individually, including entities (who the document is about, via NLP extraction), BOW comparison, TF-IDF LS comparison, etc. The clusters change based on the combined contribution of each vector of attractors.
  • Starting review of Novelty Learning via Collaborative Proximity Filtering
  • LingPipe is a toolkit for processing text using computational linguistics. LingPipe is used for tasks like:
    • Find the names of people, organizations or locations in news
    • Automatically classify Twitter search results into categories
    • Suggest correct spellings of queries
  • GATE is open source software capable of solving almost any text processing problem
  • Semantic Vectors creates semantic WordSpace models from free natural language text. Such models are designed to represent words and documents in terms of underlying concepts. They can be used for many semantic (concept-aware) matching tasks such as automatic thesaurus generation, knowledge representation, and concept matching.
  • LSA-based essay grading – could be good for document classification/spam detection
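The slider-weighted clustering idea from the notes above can be sketched as a convex combination of per-feature similarities: each slider scales one feature's contribution, and the weights are normalized so the result stays a similarity. The feature names and vectors here are hypothetical placeholders:

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def combined_similarity(doc_a, doc_b, weights):
    """Weighted blend of per-feature similarities. `doc_*` map feature
    names (e.g. 'entities', 'bow', 'tfidf') to vectors; `weights` holds
    the slider value for each feature, normalized to sum to 1."""
    total = sum(weights.values())
    return sum((w / total) * cosine(doc_a[f], doc_b[f])
               for f, w in weights.items())

# hypothetical feature vectors for two documents
a = {"entities": [1, 0, 1], "bow": [2, 1, 0], "tfidf": [0.5, 0.1, 0.0]}
b = {"entities": [1, 1, 0], "bow": [1, 1, 0], "tfidf": [0.4, 0.2, 0.0]}
print(combined_similarity(a, b, {"entities": 1.0, "bow": 2.0, "tfidf": 0.5}))
```

Moving a slider changes one weight, which shifts the combined similarity and, downstream, the clustering.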

Phil 11.4.16

6:45 – 3:00 ASRC

  • Nervous enough about the election to move 1/3 of my retirement into long term treasuries.
  • Writing up review of Topic-Relevance Map – Visualization for Improving Search Result Comprehension for IUI 2017. Done!
  • Got similarity distance working on retrieved documents using a config file


Phil 11.3.16

7:00 – 3:00 ASRC

Phil 11.1.16

7:00 – 5:00 ASRC

  • Playing around with using my dissertation to search from. Interesting and different results for non-equalized and equalized docs and single counts.
    • baseline: information model behavior agent result pattern search system between
    • equalized docs: information search behavior design system result source document provide
    • single counts: information result provide system between search process example behavior
    • equalized + single: information result provide between search system process approach design
  • Finishing survey and sending out – done!
  • Tested the new CSE
  • Back to Vector space models of semantics
  • Worked on proposal with Aaron.
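The log doesn't record exactly what "equalized docs" and "single counts" do, so this sketch is an assumption: single counts tallies a term at most once per document, and equalizing gives every document the same total weight regardless of length. The toy documents echo the query terms above:

```python
from collections import Counter

def top_terms(docs, equalize=False, single_count=False, k=5):
    """Rank terms across `docs` (lists of tokens). `single_count`
    counts a term at most once per doc; `equalize` gives every doc
    the same total weight regardless of its length."""
    totals = Counter()
    for doc in docs:
        counts = Counter(set(doc) if single_count else doc)
        weight = 1.0 / sum(counts.values()) if equalize else 1.0
        for term, c in counts.items():
            totals[term] += c * weight
    return [t for t, _ in totals.most_common(k)]

docs = [["information", "model", "agent", "information"],
        ["information", "search", "behavior"],
        ["search", "system", "information", "search"]]
print(top_terms(docs, equalize=True, single_count=True, k=3))
```

Toggling the two flags reshuffles the ranking, which matches the four different term lists in the entry above.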

Phil 10.31.16

7:00 – 5:00 ASRC

  • Working on survey
  • Reading Vector space models of semantics. Fifty pages. Ow!
  • Useful neural net blog
  • Meeting with Wayne
    • Add an open comments section “is there anything that you’d like to add”
    • UI+AI to determine online trustworthiness
    • Ran my proposal (with equalized docs – better!) through LMN. More words in the query mean more specificity, so I went up to 9.
      • The Scholar result is here, set for no patents.
      • Scholar, with ‘single counts’ is here.
      • The standard result, which is also interesting is here.
      • Standard, with single counts (better?) is here.
    • Need to start on a spreadsheet of venues. (Get these off the laptop)
      • ICWSM 2017 – Abstracts Jan 6, full papers Jan 13
      • ICPSR is a site containing lots of data. Looking for qualitative corpora. Using ‘interview’ seems to bring up a lot, but you need to log in to use.

Phil 10.28.16

7:00 – 5:00 ASRC

  • Listening to BBC Business Daily this morning on trusting algorithms.
    • This site: Computational Legal Studies. Seems very relevant for all kinds of reasons. Does Aaron Massey know about them?
      • Daniel Martin Katz: “Research Interests include legal futurism, legal informatics,  law & entrepreneurship, quantitative modeling of litigation and jurisprudence, quantitative finance, computational legal studies, big data and the law, economics of the legal profession, positive legal theory, technology aided access to justice, legal complexity and the overall impact of information technology, analytics and automation on the future of the legal profession.”
      • Jon Zelner: “My research is focused on using spatial and social network analysis to prevent infectious diseases, with a focus on tuberculosis and diarrheal disease, and to understand social and epidemiological systems characterized by complex spatiotemporal dynamics.”
    • Cathy O’Neil, author of Weapons of Math Destruction. MathBabe.
  • Interview with Judea Pearl on Bayesian computation: “We are losing the transparency now with deep learning. I am talking to users who say “it works well” but they don’t know why. Once you unleash it, it has its own dynamics, it does its own repair and its own optimization, and it gives you the right results most of the time. But when it doesn’t, you don’t have a clue as to what went wrong and what should be fixed.”
  • Working on survey questions. Reading Internet, Mail, and Mixed Mode Surveys, chapter 6,
  • Need to add a section on ‘Transparent’ Cognitive Computing.
  • Took a look at Theresa’s slides. Need to call out some ML/AI word salad
  • Working on adding a key. Slow UI. Done, I think…
  • Add credit card to semrush
  • Meeting with Shimei. We started talking about recommender systems, but wound up talking about neural word/sentence/paragraph/topic embedding. Trained against something like the Wikipedia and its outbound links, this might be a good way of having a knowledge coordinate frame that velocity and position can be determined.
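One direct reading of the "knowledge coordinate frame" note above: position is a document's (or user's) embedding, and velocity is the displacement between successive embeddings. A toy sketch, where the 3-d vectors are stand-ins for real embeddings:

```python
def velocity(positions):
    """Given successive embedding vectors (positions in a fixed
    semantic coordinate frame), return the per-step displacement
    vectors, a crude 'belief velocity'."""
    return [[b - a for a, b in zip(p0, p1)]
            for p0, p1 in zip(positions, positions[1:])]

# hypothetical 3-d embeddings of one user's posts over time
track = [[0.0, 1.0, 0.0], [0.1, 0.9, 0.2], [0.3, 0.8, 0.5]]
print(velocity(track))
```

The frame only has to stay fixed (e.g. trained once on Wikipedia) for these displacements to be comparable over time, which is the point of the weather-map analogy.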

Phil 10.27.16

7:00 – ASRC

  • Get a new copy of SPSS? My license has expired.
  • Some thoughts about librarians in the internet age from PBS. This is more of the guide instead of recommender thing for Friday.
  • Start writing up NIH proposal
  • Working on Shimei’s evaluation. Oops – getting some kind of parse exception. Ah. I was calling the PDF parser. Cut and paste error.
    • Took terms from ratemyprofessor.com
      • Top three comments that were longer than a sentence for top three professors rated 5.0 and below, 4.0 and below, 3.0 and below, giving a corpus of 27 entries
      • Broken into three texts, sorted by LSI, with top 25% terms used
      • Centrality calculated using PageRank algorithm with terms as links
      • Culled out terms such as ‘there’ and ‘about’
      • Reweighted documents based on overall counts
      • Remaining sorted terms were:
        project
        write
        group
        stupid
        slide
        curve
        clearly
        member
        helpful
        workload
        unrealistic
        worst
        experience
        midterm
        school
        homework
        great
        exactly
        knowledgable
        science
        flexible
      • Build likert scale questionnaire grounded in the top x terms?
      • I can use the terms pointing into the text to ground the questions (e.g. “Three easy exams and two easy projects. It would be hard not to get an A.”)
    • Wrote up response to RFI with Aaron
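The centrality step above (PageRank with terms as links) can be sketched with a small power-iteration PageRank. How the term "links" were actually built is not recorded, so the co-occurrence graph here is an assumption, using a few of the terms from the list:

```python
def pagerank(graph, damping=0.85, iters=50):
    """Power-iteration PageRank over an adjacency dict
    {node: [neighbors]}. Nodes are terms; an edge means two terms
    co-occurred in the same comment (an assumption)."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {}
        for v in nodes:
            # mass flowing into v from each node that links to it
            incoming = sum(rank[u] / len(graph[u])
                           for u in nodes if v in graph[u])
            new[v] = (1 - damping) / n + damping * incoming
        rank = new
    return rank

# toy co-occurrence graph over a few of the terms above
g = {
    "project": ["group", "write"],
    "group":   ["project", "member"],
    "write":   ["project"],
    "member":  ["group"],
}
ranks = pagerank(g)
top = sorted(ranks, key=ranks.get, reverse=True)
print(top)
```

Terms with more (and better-connected) co-occurrences float to the top, which is the ranking the Likert questions would then be grounded in.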

Phil 10.25.16


PhD proposal! Passed! Here are the slides.

  • Shimei brought up a point about recommender systems. I think that as we currently have them, recommenders are like advisors. Maybe we need a guide more than a recommender? Could be an interesting study.
  • Aaron M. wants me to be clear on the differences between True and Antibubble
  • Aaron D. brought up the thought that all bubbles are bad. Need to point him at the section in the Law of Group Polarization that shows how small bubbles can allow nascent ideas to mature. I think that this is an important part of the agent model. I also think that agents should be able to ‘see’ how another agent feels about them (third belief/antibelief model)
  • Need to follow up with everyone to get their thoughts before everything fades.
    • Wayne – Monday
    • Don
    • Aaron
    • Shimei
    • Thom

Phil 10.24.16

7:00 – 4:00 ASRC

  • More slide prep.
  • Got the centrality calculations working in the research browser. Gotta get multithreading working
  • I’m now working on IRAD, not BRC
  • Meeting with Wayne
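The multithreading note above is about keeping the research browser responsive while centrality runs. A minimal version of the pattern in Python (the Java version would use an ExecutorService instead); `centrality` here is a hypothetical stand-in for the real computation:

```python
from concurrent.futures import ThreadPoolExecutor

def centrality(doc):
    """Stand-in for the expensive per-document centrality pass:
    here it just sums the word lengths."""
    return sum(len(w) for w in doc.split())

docs = ["agents in hilbert space", "weighted term graphs"]

# run the heavy pass on worker threads so the UI thread stays free
with ThreadPoolExecutor(max_workers=2) as pool:
    scores = list(pool.map(centrality, docs))
print(scores)
```

`pool.map` preserves input order, so the scores line up with the documents even when the workers finish out of order.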

Phil 10.21.16

7:00 – 4:00 ASRC

  • Fixing slides
    • Add breadcrumbs in the lower border throughout – done
    • Add timeline that starts back at the initial politifact paper – done
    • Future
      • app dev takes 4-6 months
      • Running experiments/data gathering takes about a month
      • Analysis tools take about two months to build/learn/produce results
      • Writing papers takes about two months
      • Total time for each step = 9-11 months
      • RQ1 – August 2016 – May 2017
      • RQ2 – March 2017 – Dec 2017
      • RQ3 – Oct 2017 – Jul 2018
      • RQ4 – May 2018 – Feb 2019
      • RQ5 – Dec 2018 – Sept 2019
  • RateMyProfessor UMBC IS: https://www.ratemyprofessors.com/campusRatings.jsp?sid=1244
    • Build a corpora (by rating?)
    • Central terms for each rating
    • Survey questions built from those