7:00 – 3:30 ASRC
- Paper
- More on contributions. Realized that I need a figure showing the relationships between individual and group behaviors.
- Found "Driving a Wedge Between Evidence and Beliefs – How Online Ideological News Exposure Promotes Political Misperceptions". Nice look at awareness vs. belief. The question is how this manifests. Could it be that a person who believes false information spends a lot of time looking at opposition information and so produces an explorer pattern? Added to the paper archive.
- Added model feedback to section 5.4.6.2
- Code
- Build class(es) that use some of the CorpusBuilder codebase (or just add to its output?) to
- Access webpages based on xml config file
- Read in, lemmatize, and build bag-of-words per page (configurable max). Done. Took out TF-IDF code and replaced it with BagOfWords in DocumentStatistics.
- Write out .arff file that includes the following elements
- @method (TF-IDF, LSI, BOW)
- @source (loomings, the carpet bag, the spouter inn, the counterpane)
- @title (Moby-dick, Tarzan)
- @author (Herman Melville, Edgar Rice Burroughs)
- @words (nantucket,harpooneer,queequeg,landlord,euroclydon,bedford,lazarus,passenger,circumstance,civilized,water,thousand,about,awful,slowly,supernatural,reality,sensation,sixteen,awake,explain,savage,strand,curbstone,spouter,summer,northern,blackness,embark,tempestuous,expensive,sailor,purse,ocean,tomahawk,black,night,dream,order,follow,education,broad,stand,after,finish,world,money,where,possible,morning,light)
- So a line should look something like
- LSI, chapter-1-loomings, Moby-dick, Herman Melville, 0,0,0,0,0,0,0,5,0,0,7,4,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,5,3,3,0,0,0,0,2,0,0,0,5,0,0,3,4,2,0,0,0
- Updated LabledMatrix2D to generate arff files.
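Sketch of the .arff layout described above, for reference. This is not the LabledMatrix2D code; it is a minimal Python illustration, assuming the Weka-style header (@relation, @attribute, @data) rather than the shorthand @method/@source tags in the notes. The names `write_arff`, `quote`, the `w_` prefix on word attributes (to avoid clashing with the metadata columns), and the sample row are all hypothetical. Tokens are assumed to be lemmatized already.

```python
import io
from collections import Counter

META_FIELDS = ("method", "source", "title", "author")

def quote(value):
    """Wrap a value in single quotes if it contains characters ARFF treats specially."""
    if any(ch in value for ch in " ,'\"{}%"):
        return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
    return value

def write_arff(out, relation, rows, vocabulary):
    """Write one data line per row: four metadata cells, then one count per vocabulary word."""
    out.write("@relation " + quote(relation) + "\n\n")
    # Metadata columns become nominal attributes listing every value seen.
    for field in META_FIELDS:
        seen = sorted({row[field] for row in rows})
        out.write("@attribute " + field + " {" + ",".join(quote(v) for v in seen) + "}\n")
    # Each vocabulary word becomes a numeric attribute holding its count.
    for word in vocabulary:
        out.write("@attribute " + quote("w_" + word) + " numeric\n")
    out.write("\n@data\n")
    for row in rows:
        counts = Counter(row["tokens"])
        cells = [quote(row[f]) for f in META_FIELDS]
        cells += [str(counts[w]) for w in vocabulary]
        out.write(",".join(cells) + "\n")

# Hypothetical sample row, not real corpus output.
rows = [{
    "method": "BOW",
    "source": "loomings",
    "title": "Moby-dick",
    "author": "Herman Melville",
    "tokens": ["water", "ocean", "water", "money"],
}]
buf = io.StringIO()
write_arff(buf, "chapters", rows, ["nantucket", "water", "ocean"])
arff_text = buf.getvalue()
print(arff_text)
```

Values with spaces (e.g. the author) get single-quoted, so the data line for the sample row comes out as BOW,loomings,Moby-dick,'Herman Melville',0,2,1.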
