VTX 7:00 – 5:00
- Continuing A Survey on Assessment and Ranking Methodologies for User-Generated Content on the Web
- Adding N. Diakopoulos and M. Naaman. Topicality, Time, and Sentiment in Online News Comments. Conference on Human Factors in Computing Systems (CHI) Works in Progress. May 2011. Short! Yay!
- Added Adaptive Faceted Ranking for Social Media Comments. I think it may touch on my idea of Pertinence ranking using Markov Chains.
- Scanned Exploiting Social Context for Review Quality Prediction and realized that it’s got some very good hints for markers that can be used for machine learning on the doctor records:
| Feature Name | Type | Feature Description |
|---|---|---|
| NumToken | Text-Stat | Total number of tokens. |
| NumSent | Text-Stat | Total number of sentences. |
| UniqWordRatio | Text-Stat | Ratio of unique words. |
| SentLen | Text-Stat | Average sentence length. |
| CapRatio | Text-Stat | Ratio of capitalized sentences. |
| POS:NN | Syntactic | Ratio of nouns. |
| POS:ADJ | Syntactic | Ratio of adjectives. |
| POS:COMP | Syntactic | Ratio of comparatives. |
| POS:V | Syntactic | Ratio of verbs. |
| POS:RB | Syntactic | Ratio of adverbs. |
| POS:FW | Syntactic | Ratio of foreign words. |
| POS:SYM | Syntactic | Ratio of symbols. |
| POS:CD | Syntactic | Ratio of numbers. |
| POS:PP | Syntactic | Ratio of punctuation symbols. |
| KLall | Conformity | KL divergence D_KL(T_r ‖ T_i). |
| PosSEN | Sentiment | Ratio of positive sentiment words. |
| NegSEN | Sentiment | Ratio of negative sentiment words. |
- This means I need to store the whole page in the rating app so that I can evaluate machine ratings after getting human ratings.
- Finished the UI part of the display, now to change the DB back end. I’m going to start the DB over again since there is so much new stuff.
- Cleaning up classes. Moved LoginDialog and CheckboxGroup to utils.
- Meeting about the relative merits of StanfordNLP and Rosette. We’ll stick with Stanford for now. I have some questions about how Webhose.io will be handled, but Aaron thinks that it can be filtered in the TAS, with a query string preprocessor.
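
Going back to the feature table from the review-quality paper: here is a minimal sketch of how the Text-Stat rows and the KLall row might be computed for a single record. This is my own rough reading of the feature descriptions, not the paper's implementation. The regex tokenizer, the sentence splitter, the function names, and the add-alpha smoothing are all assumptions, and I'm assuming T_r is the word distribution of one record and T_i the distribution of a reference collection. The POS and sentiment rows are left out because they would need a tagger and a sentiment lexicon.

```python
import math
import re
from collections import Counter

# Hypothetical helpers: a crude tokenizer and sentence splitter (assumptions,
# a real pipeline would probably use an NLP library instead).
TOKEN_RE = re.compile(r"[A-Za-z0-9']+")
SENT_SPLIT_RE = re.compile(r"(?<=[.!?])\s+")


def tokenize(text):
    """Lowercased word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def split_sentences(text):
    """Split on whitespace that follows '.', '!' or '?'."""
    return [s for s in SENT_SPLIT_RE.split(text.strip()) if s]


def text_stat_features(text):
    """Compute the five Text-Stat features from the table."""
    tokens = tokenize(text)
    sentences = split_sentences(text)
    num_token = len(tokens)
    num_sent = len(sentences)
    capitalized = sum(1 for s in sentences if s[0].isupper())
    return {
        "NumToken": num_token,
        "NumSent": num_sent,
        "UniqWordRatio": len(set(tokens)) / num_token if num_token else 0.0,
        "SentLen": num_token / num_sent if num_sent else 0.0,
        "CapRatio": capitalized / num_sent if num_sent else 0.0,
    }


def kl_divergence(text_r, text_i, alpha=1.0):
    """D_KL(T_r || T_i) over unigram distributions.

    Add-alpha smoothing over the shared vocabulary is an assumption made
    here so that no probability is zero.
    """
    counts_r = Counter(tokenize(text_r))
    counts_i = Counter(tokenize(text_i))
    vocab = set(counts_r) | set(counts_i)
    if not vocab:
        return 0.0
    total_r = sum(counts_r.values()) + alpha * len(vocab)
    total_i = sum(counts_i.values()) + alpha * len(vocab)
    kl = 0.0
    for word in vocab:
        p = (counts_r[word] + alpha) / total_r
        q = (counts_i[word] + alpha) / total_i
        kl += p * math.log(p / q)
    return kl
```

Calling `text_stat_features(page_text)` on the stored page text gives one row of numeric features per record, which is the shape needed to compare machine ratings against the human ratings later.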
