Phil 5.31.17

7:00 – 8:00 Research

8:30 – 4:00 BRI

  • The Meaning of Underscores in Python
  • Tried to add research code to timesheet. No luck. Let T know.
  • Tried to access the new Jira and Confluence pages. They are visible through the OpenVPN tunnel, but the login/password does not work
  • Reading the Ketos User guide and annotating. Finished – sending to Aaron
  • TEM meeting at 2:00
  • Meeting with CCRi. Lead dev: Vivek Dhand
    • String matching, BOW, LSI competitors
    • Based on word2vec, combined with TF-IDF scoring
      Trained on Wikipedia
    • Trained on separate training server?
    • Apps on the training server? Train one classifier for each field
  • Things we did in 2016
    • StanfordNLP+jsoup tool to categorize and tag web pages for statistical analysis
    • Statistical analysis of said pages, including backlink and other metadata analysis
    • Google CSE interface, plus cleaning tools
    • Document centrality analysis tool (JavaFX! Woohoo!) (LSI, TF-IDF, PageRank, adjacency, etc. calculations at interactive rates) (outputs for WEKA)
    • Use of above tool to create CSE search terms that improved crawl precision by 500%
    • Tagged hundreds of web pages because someone had to.
    • Proposal writing
    • Group polarization modeling using flocking agent-based simulation
    • Microservices
    • Classifiers in WEKA and the WEKA API
    • Research Browser prototype
    • NMF tool for topic extraction based on UTOPIAN paper
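
Note to self on the CCRi approach (word2vec combined with TF-IDF scoring): one common way to combine the two is to average a document's word vectors, weighting each word by its TF-IDF score. This is only a rough sketch of that general idea, not necessarily what CCRi does. The toy vectors, the smoothed IDF formula, and the names idf_table, doc_vector, and cosine are all my own assumptions for illustration; real vectors would come from a model trained on Wikipedia.

```python
import math
from collections import Counter

# Hypothetical 2-d embeddings standing in for real word2vec vectors.
TOY_VECTORS = {
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "stock": [0.0, 1.0],
    "market": [0.1, 0.9],
    "the": [0.5, 0.5],
}


def idf_table(docs):
    """Smoothed inverse document frequency for every token in the corpus."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}


def doc_vector(doc, idf, vectors=TOY_VECTORS):
    """TF-IDF weighted average of the word vectors in one tokenized document."""
    tf = Counter(tok for tok in doc if tok in vectors)
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, count in tf.items():
        weight = count * idf.get(tok, 1.0)  # term frequency times IDF
        weight_sum += weight
        for i, x in enumerate(vectors[tok]):
            total[i] += weight * x
    # Fall back to the zero vector when no token has an embedding.
    return [x / weight_sum for x in total] if weight_sum else total


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


DOCS = [
    ["the", "cat"],
    ["the", "dog"],
    ["the", "stock", "market"],
]
IDF = idf_table(DOCS)
VECS = [doc_vector(doc, IDF) for doc in DOCS]

if __name__ == "__main__":
    print("cat vs dog doc:   ", round(cosine(VECS[0], VECS[1]), 3))
    print("cat vs market doc:", round(cosine(VECS[0], VECS[2]), 3))
```

The weighting pushes common words like "the" toward the background, so the two animal documents should score as more similar to each other than to the finance one. Training one classifier per field could then take these document vectors as input features.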