Category Archives: research

Phil7.25.16

7:00 – 4:00 VTX

Rollers
Reworking the lit review. Meeting set up with Wayne for tomorrow at 4:00.
Still thinking about modelling. I could use sets of strings that would define a CAs worldview and then compare individuals by edit distance.
- Not sure how to handle weights, a number, or repetitions of the character?
- Comparing a set of CAs using centrality could see what the most important items are in that (overall and sub) population. how close the individual CA conforms to that distribution is a measure of the ‘belonging’?
- CAs could adjust their internal model. Big changes should be hard, little changes should be easy. Would the dropping of a low ranked individual item result in a big change in edit distance with a group that doesn’t have the item?
- Working on infrastructure that builds, collects and maintains Factoids

Phil 7.22.16

7:00 – 1:00 VTX

More bubble modelling. Found a nice paper from a financial perspective that looks like a good source for similar models.
Split out the calculation and spreadsheet functions to support snapshots and debugging.
- Set up the base class to be the control. Explorers only look outside their SD, while confirmers and avoiders stay within. Not sure how to tease out the difference between those. I think it will have something to do with the way they look for information, which is beyond the scope of this model for now. Also switched to a random distribution. Here’s an initial result. Much more work to follow

I was riding and thinking about something I read on fivethirtyeight.com “This isn’t the most artful way to say it, but it’s like, where do you go when the only people who seem to agree with you on taxes hate black people?” It’s by Ben Howe, a redstate commentator. And it makes me think that rather than basing the sim on only one value, there should be a cluster. Confirmed could look for a match in the cluster while avoiders would clusters if they hit somethings that doesn’t match. And the distance from the value should matter. Adopting a very different concept should take more energy than a similar one. And this makes me think that the CAs have to have a bit more alife in them. They need to budget their energy with reference to their internal and external states.
And then mom died. Here’s the OPM web page that matters: https://www.opm.gov/retirement-services/my-annuity-and-benefits/life-events/death/report-of-death/

Phil 6.21.16

7:00 – 5:00 VTX

Finished MostRecent.
Checked Data directory into SVN
Testing rating algorithms. Seems to be working pretty well 🙂
Rated all day. Should finish tomorrow.
Worked through paragon and fallen angel patterns with Aaron. Pulled out by bayesian spreadsheets and realized I no longer understood them…

Phil 6.20.16

7:00 – 7:00 VTX

Building chair corpus = Current and Cited
Filled MostCited.
Rating a few more pages. Still not getting any name hits.
Going to advanced search and entering items into each field, I get a different looking query:
```
https://www.google.ca/search?as_q=New+York&as_epq=Nader+Golian&as_oq=+license+board+practice+patient+physician+order+health+practitioner+medicine+medical
```
- These seem to be the important differences
- as_q=New+York — This is a ‘normal’ query
- as_epq=Nader+Golian — This must be in the results
- as_oq=+license+board+practice+patient+physician+order+health+practitioner+medicine+medical — at least one of these must be in the result
Going to add a test to look for the name in the query (and the state?) and at least check the NA box and throw up a dialog. Could also list the number of occurrences by default in the notes

1:00 – Patrick’s proposal

Framing of problem and researcher
Overview of the problem space
- Ready to Hand
- Extension of self
Assistive technology abandonment
- Ease of Acquisition
- Device Performance
- Cost and Maintenance
- Stigma
- Alignment with lifestyles
Prior Work
- Technology Use
- Methods Overview
  - Formative User Needs
  - Design Focus Groups
  - Design Evaluation and Configuration Interviews
- Summary of Findings
- Priorities
  - Maintain form factor
  - Different controls for different regions
  - Familiarity
  - Robustness to environmental changes
- Potential of the wheelchair
  - Nice diagram. Shows the mapping from a chair to a smartphone
- Inputs to wheelchair-mounted devices
- Force sensitive device, new gestures and insights
- Summary (This looks like research through design. Why no mention?)
  - Prototypes
  - Gestures
  - Demonstration
Proposed Work
- Passive Haptic Rehabilitation
  - Can it be done
  - How effective
  - User perception
  - Study design!!!
- Physical Activity and Athletic Performance
  - Completed: Accessibility of fitness trackers. (None of this actually tracks to papers in the presentation)
  - Body location and sensing
  - Misperception
    - Semi-structured interviews
    - Low experience / High interest (Lack of system trust!)
- Chairable Computing for Basketball
  - Research Methods
    - Observations
    - Semi-structured interviews
    - Prototyping
    - Data presentation – how does one decide what they want from what is available?
What is the problem – Helena
- Assistive technologies are not being designed right. We need to improve the design process.
- That’s too general – give me a citation that says that technology abandonment WRT wheelchair use has high abandonment
- Patrick responds with a bad design
- Helena – isn’t the principal user-centered design. How has the HCI community done this before WRT other areas than wheelchairs to interact with computing systems
- Helena – Embodied interaction is not a new thing, this is just a new area.Why didn’t you group your work. Is the prior analysis not embodied? Is your prior work not aligned with this perspective
How were the design principles used o develop an refine the pressure sensors?

Phil 6.15.16

7:00 – 10:00, 12:00 – 4:00 VTX

Got the official word that I should be charging the project for research. Saved the email this time.
Continuing to work on the papers list
And in the process of looking at Daniele Quercia‘s work, I found Auralist: introducing serendipity into music recommendation which was cited by
An investigation on the serendipity problem in recommender systems. Which has the following introduction:
- In the book ‘‘The Filter Bubble: What the Internet Is Hiding from You’’, Eli Pariser argues that Internet is limiting our horizons (Parisier, 2011). He worries that personalized filters, such as Google search or Facebook delivery of news from our friends, create individual universes of information for each of us, in which we are fed only with information we are familiar with and that confirms our beliefs. These filters are opaque, that is to say, we do not know what is being hidden from us, and may be dangerous because they threaten to deprive us from serendipitous encounters that spark creativity, innovation, and the democratic exchange of ideas. Similar observations have been previously made by Gori and Witten (2005) and extensively developed in their book ‘‘Web Dragons, Inside the Myths of Search Engine Technology’’ (Witten, Gori, & Numerico, 2006), where the metaphor of search engines as modern dragons or gatekeepers of a treasure is justified by the fact that ‘‘the immense treasure they guard is society’s repository of knowledge’’ and all of us accept dragons as mediators when having access to that treasure. But most of us do not know how those dragons work, and all of us (probably the search engines’ creators, either) are not able to explain the reason why a specific web page ranked first when we issued a query. This gives rise to the so called bubble of Web visibility, where people who want to promote visibility of a Web site fight against heuristics adopted by most popular search engines, whose details and biases are closely guarded trade secrets.
- Added both papers to the corpus. Need to read and code. What I’m doing is different in that I want to add a level of interactivity to the serendipity display that looks for user patterns in how they react to the presented serendipity and incorporate that pattern into a trustworthiness evaluation of the web content. I’m also doing it in Journalism, which is a bit different in its constraints. And I’m trying to tie it back to Group Polarization and opinion drift.
Also, Raz Schwartx at Facebook: , Editorial Algorithms: Using Social Media to Discover and Report Local News
Working on getting all html and pdf files in one matrix
Spent the day chasing down a bug where if the string being annotated is too long (I’ve set the number of wordes to 60), then we skip. THis leads to a divide by zero issue. Fixed now

Phil 6.13.16

6:30 – 2:30 VTX

Building CSCW matrix – got the papers open with more terms than I’ve used before. It opened and ran fine but saved as a zero byte file. Need to look into that…
- Louise Barkhuus
  - Location-Based Services for Mobile Telephony: a Study of Users’ Privacy Concerns.
  - Acting with technology: rehearsing for mixed-media live performances
- Marcos Borges
  - A framework for awareness support in groupware systems
  - aDApTA: Adaptive approach to information integration in dynamic environments
- Wendy Kellogg, IBM T.J. Watson Research Center
- Social translucence: an approach to designing systems that support social processes
- Designing Information for Remediating Cognitive Biases in Decision-Making
- Mor Naaman, Cornell Tech
- HT06, tagging paper, taxonomy, Flickr, academic article, to read
- You Can’t Always Get What You Want: Challenges in P2P Resource Sharing
- Morgan Ames, UC Berkeley
- Why we tag: motivations for annotation in mobile and online media
- Learning consumption: Media, literacy, and the legacy of One Laptop per Child
- Gabriela Avram, University of Limerick
- At the crossroads of knowledge management and social software
- At the crossroads of knowledge management and social software
- Frank Bentley, Yahoo
- Ambient social tv: drawing people into a shared experience
- Ambient social tv: drawing people into a shared experience
- Michael Bernstein, Stanford University
- Imagenet large scale visual recognition challenge
- Generating Personalized Spatial Analogies for Distances and Areas
- Jed Brubaker, University of Colorado
- Developing a life story: Constructing relations between self and experience in autobiographical narratives
- Legacy Contact: Designing and Implementing Post-mortem Stewardship at Facebook
- Licia Capra, University College London
- Carisma: Context-aware reflective middleware system for mobile applications
- Sens-Us: Designing Innovative Civic Technology for the Public Good
- Matthew Chalmers, University of Glasgow
- Tourism and mobile technology
- Personal Tracking of Screen Time on Digital Devices
- Mauro Cherubini, Google
- Let’s go to the whiteboard: how and why software developers use drawings
- To Call or to Recall&quest; That’s the Research Question
- Luigina Ciolfi, Sheffield Hallam University
- Methodological challenges and innovations in mobilities research
- Collocated Interaction: New Challenges in’Same Time, Same Place’Research
- Xianghua Ding, Fudan University
- In the eye of the beholder: a visualization-based approach to information system security
- Today’s Life Style and Yesterday’s Life Experience: A Study of Financial Practices of Retirees In China
- Venessa Evers, Twente University
- The role of culture in interface acceptance
- Crowd-designed motivation: Motivational messages for exercise adherence based on behavior change theory
- Casey Fiesler, University of Colorado
- Redistributing leadership in online creative collaboration
- An Archive of Their Own: A Case Study of Feminist HCI and Values in Design
- Joel Fischer, University of Nottingham
- Automics: souvenir generating photoware for theme parks
- Innovations in autonomous systems: Challenges and opportunities for human-agent collaboration
- Susan Fussell, Cornell University
- Coordination of communication: Effects of shared visual context on collaborative work
- The Future of Robotic Telepresence: Visions, Opportunities and Challenges
- Ana Cristina Bicharra Garcia, Universidade Federal Fluminense
- A quality inspection method to evaluate e-government sites
- Collaboration and Decision Making in Crisis Situations
- Marco Gerosa, Universidade de São Paulo
- The development and application of distance learning courses on the internet
- Overcoming open source project entry barriers with a portal for newcomers
- Sean Goggins, University of Missouri
- Building a Model Explaining the Social Nature of Online Learning.
- One-Sided Conversations: The 2012 Presidential Election on Twitter
- Sukeshini Grandhi, Eastern Connecticut State University
- P3 systems: Putting the place back into social networks
- To Reply or To Reply All: Understanding Replying Behavior in Group Email Communication
- Irene Greif, IBM emeritus
- Christine Halverson, IBM
- Jeff Hancock, Stanford University
- Experimental evidence of massive-scale emotional contagion through social networks
- Lies in the Eye of the Beholder Asymmetric Beliefs about One’s Own and Others’ Deceptiveness in Mediated and Face-to-Face Communication
- Bjorn Hartmann, UC Berkeley
- Soylent: a word processor with a crowd inside
- Engaging Amateurs in the Design, Fabrication, and Assembly of Electronic Devices
- Brent Hecht, University of Minnesota
- Falling asleep with Angry Birds, Facebook and Kindle: a large scale study on mobile application usage
- ScrollingHome: Bringing Image-based Indoor Navigation to Smartwatches
- Valeria Hegrskovic, Pontificia Universidad Católica de Chile
- Pamela Hinds, Stanford University
- Jina Huh, University of California, San Diego
- BlogCentral: the role of internal blogs at work
- BeUpright: Posture Correction Using Relational Norm Intervention
- Tomoo Inoue, University of Tsukuba
- Steve Jackson, Cornell University
- Understanding infrastructure: dynamics, tensions, and design, NSF Report of a Workshop: History and theory of infrastructure: lessons for new scientific cyberinfrastructures
- Values in repair
- Alex Jaimes, AiCure
- Quentin Jones, New Jersey Institute of Technology
- Karrie Karahalios, University of Illinois, Urbana-Champaign
- Predicting tie strength with social media
- EnGaze: Designing Behavior Visualizations with and for Behavioral Scientists
- Brian Keegan, Harvard Business School
- As real as real? Macroeconomic behavior in a large-scale virtual world
- Ping to Win?: Non-Verbal Communication and Team Performance in Competitive Online Multiplayer Games
- Juho Kim, Korea Advanced Institute of Science and Technology (KAIST)
- How video production affects student engagement: An empirical study of mooc videos
- Revising Learner Misconceptions Without Feedback: Prompting for Reflection on Anomalies
- Yong Ming Kow, City University of Hong Kong
- Chinmay Kulkarni, Carnegie Mellon University
- Peer and self assessment in massive online classes
- Connecting Stories and Pedagogy Increases Participant Engagement in Discussions
- Cliff Lampe, University of Michigan
- The benefits of Facebook “friends:” Social capital and college students’ use of online social network sites
- It’s Creepy, But it Doesn’t Bother Me
- Airi Lampinen, Mobile Life Centre
- All my people right here, right now: management of group co-presence on a social networking site
- Smartwatch in vivo
- Uichin Lee, Korea Advanced Institute of Science and Technology (KAIST)
- Automatic identification of user goals in web search
- FamiLync: facilitating participatory parental mediation of adolescents’ smartphone use
- Gilly Leshed, Cornell University
- CoScripter: automating & sharing how-to knowledge in the enterprise
- Taking a HIT: Designing around Rejection, Mistrust, Risk, and Workers’ Experiences in Amazon Mechanical Turk
- Myriam Lewkowicz, Troyes University of Technology
- Project memory in design
- Counting on the Group: Reconciling Online and Offline Social Support among Older Informal Caregivers
- Wayne Lutters, University of Maryland, Baltimore County
- Preserving the big picture: Visual network traffic analysis with tnv
- Interruptive notifications in support of task management
- Sabrina Marczak, PucRS
- Collaboration patterns and the impact of distance on awareness in requirements-centred social networks
- A systematic literature review on agile requirements engineering practices and challenges
- Jennifer Marlow, FXPAL
- Flickr: a first look at user behaviour in the context of photography as serious leisure
- Supporting Multitasking in Video Conferencing using Gaze Tracking and On-Screen Activity Detection
- Winter Mason, Facebook
- Conducting behavioral research on Amazon’s Mechanical Turk
- The social ties of immigrant communities in the United States
- Donald McMillan, Mobile Life Center, Stockholm University
- Further into the wild: Running worldwide trials of mobile systems
- Pick up and play: understanding tangibility for cloud media
- Giorgio de Michelis, University of Milano – Bicocca
- Cooperative information systems: a manifesto
- Why knowledge is linked to space
- Andrew Miller, University of Washington
- Give and take: a study of consumer photo-sharing culture and practice
- This Digital Life: A Neighborhood-Based Study of Adolescents’ Lives Online
- Arcan Misra, Singapore Management University
- Andres Monroy-Hernandez, Microsoft Research
- Scratch: programming for all
- Surviving an” Eternal September”-How an Online Community Managed a Surge of Newcomers
- Michael Muller, IBM Research
- Participatory design: the third space in HCI
- What Can You Do?: Studying Social-Agent Orientation and Agent Proactive Interactions with an Agent for Employees
- Jeffrey Nichols, Google
- Generating remote control interfaces for complex appliances
- Understanding the Challenges of Designing and Developing Multi-Device Experiences
- Oded Nov, New York University
- Sergio Ochoa, University of Chile
- Mobile computing in urban emergency situations: Improving the support to firefighters in the field
- Identifying Opportunities to Support Family Caregiving in Chile
- Jacki O’Neill, Microsoft Research India
- Text chat in action
- ICT-Enabled Grievance Redressal in Central India: A Comparative Analysis
- Taiwoo Park, Michigan State University
- Seemon: scalable and energy-efficient context monitoring framework for sensor-rich mobile environments
- BeUpright: Posture Correction Using Relational Norm Intervention
- Andrea Parker, Northeastern University
- Celebratory technology: new directions for food research in HCI
- Youth Advocacy in SNAs: Challenges for Addressing Health Disparities
- Anne Marie Piper, Northwestern University
- SIDES: a cooperative tabletop computer game for social skills development
- Designing for the Third Hand: Empowering Older Adults with Cognitive Impairment through Creating and Sharing
- Erica Poole, Penn State University
- Values as lived experience: evolving value sensitive design in support of value discovery
- It Matters If My Friends Stop Smoking: Social Support for Behavior Change in Social Media
- Raquel Prates, Universidade Federal de Minas Gerais
- Methods and tools: a method for evaluating the communicability of user interfaces
- The Rise of Curation on GitHub
- Daniele Quercia, Bell Labs Cambridge
- Our Twitter Profiles, Our Selves: Predicting Personality with Twitter
- The Emotional and Chromatic Layers of Urban Smells
- Dave Randall, University of Siegen
- Faltering from ethnography to design
- ICT Use by Prominent Activists in Republika Srpska
- David Ribes, University of Washington
- Toward information infrastructure studies: Ways of knowing in a networked environment
- Organizing for ontological change: The kernel of an AIDS research infrastructure
- Chiara Rossitto, Stockholm University
- Understanding agency in interaction design materials
- Mark Rouncefield, Lancaster University
- At home with the technology: an ethnographic study of a set-top-box trial
- Microblog Analysis as a Programme of Work
- Raz Schwartz, Facebook
- The livehoods project: Utilizing social media to understand the dynamics of a city
- Editorial Algorithms: Using Social Media to Discover and Report Local News
- Stacey Scott, University of Waterloo
- Territoriality in collaborative tabletop workspaces
- Collocated Interaction: New Challenges in’Same Time, Same Place’Research
- Bryan Semaan, Syracuse University
- Resilience in collaboration: Technology as a resource for new patterns of action
- Transition Resilience with ICTs:’Identity Awareness’ in Veteran Re-Integration
- Orit Shaer, Wellesley College
- Tangible user interfaces: past, present, and future directions
- Beyond the Lab: Using Technology Toys to Engage South African Youth in Computational Thinking
- Chirag Shah, Rutgers University
- Evaluating and predicting answer quality in community QA
- Coagmento 2.0: A System for Capturing Individual and Group Information Seeking Behavior
- David A. Shamma Director of Research Yahoo Labs
- Characterizing debate performance via aggregated twitter sentiment
- The Design, Perception, and Practice of Tablet Photography
- Irina Shklovski, The IT University of Copenhagen
- Backchannels on the front lines: Emergent uses of social media in the 2007 southern California wildfires
- Sharing Steps in the Workplace: Changing Privacy Concerns Over Time
- Carla Simonee, University de Milano-Bicocca
- Chaos as coordination technology
- Moving Western Neighborliness to East?: A study on Local Exchange in Bangladesh
- Vivek Singh, Rutgers University
- Social pixels: genesis and evaluation
- Predicting Privacy Attitudes Using Phone Metadata
- Victoria Sosik, Google
- See friendship, sort of: how conversation and digital traces might support reflection on friendships
- Leveraging social media content to support engagement in positive interventions
- Cleidson de Souza, Vale Institute of Technology & Federal University of Pará
- Seeking the source: software source code as a social and technical artifact
- Doing CSCW Research in Latin America: Differences, Opportunities, Challenges, and Lessons Learned.
- Chengzheng Sun, Nanyang Technical University
- Achieving convergence, causality preservation, and intention preservation in real-time cooperative editing systems
- Conditions and Patterns for Achieving Convergence in OT-based Co-Editors
- Andrea Tapia, Penn State University
- Seeking the trustworthy tweet: Can microblogged data fit the information needs of disaster response and humanitarian relief organizations
- Collaborative Analytics and Brokering in Digital Humanitarian Response
- Zachary Toups, New Mexico State University
- Implicit coordination in firefighting practice: design implications for teaching fire emergency responders
- Collaborative Planning Gameplay from Disaster Response Practice
- Vaninha Vieira, Universidade Federal da Bahia
- Designing context-sensitive systems: An integrated approach
- A context simulator as testing support for mobile apps
- Sarah Vieweg, Qatar Computing Research Institute
- Microblogging during two natural hazards events: what twitter may contribute to situational awareness
- Privacy & Social Media in the Context of the Arab Gulf
- Adriana Vivacqua, Universidade Federal do Rio de Janeiro
- Agents to assist in finding help
- An Environment to Foster Scientific Collaborations
- Amy Voida, University of Colorado
- Listening in: practices surrounding iTunes music sharing
- Creating Friction: Infrastructuring Civic Engagement in Everyday Life
- Donghee Yvette Wohn, New Jersey Institute of Technology
- Facebook as a toolkit: A uses and gratification approach to unbundling feature use
- From Ambient to Adaptation: Interpersonal Attention Management Among Young Adults
- Jude Yew, National University of Singapore
- From shared databases to communities of practice: A taxonomy of collaboratories
- Breaking Down and Building Up: Accelerating Sociotech Scholarship in the iSchool Community
- Lior Zalmanson, New York University
- Content or community? A digital business strategy for content providers in the social age
- ‘Your Action is Needed’: The Effect of Website-Initiated Participation on User Contributions to Content Websites
- Shengdong Zhao, National University of Singapore
- Earpod: eyes-free menu selection using touch input and reactive audio feedback
- 5-Step Approach to Designing Controlled Experiments
- Haiyi Zhu, University of Minnesota
- Effectiveness of shared leadership in online communities
- Market in Your Social Network: The Effects of Extrinsic Rewards on Friendsourcing and Relationships
Need to look into a newspaper corpus that covers the red scare. Looks like Proquest should do it. UMBC doesn’t have an account? Here’s the full list from Wikipedia
Start on 10 page proposal
Schedule?
Rating webpages
This looks like a good site? http://www.leagle.com/

Phil 6.2.16

7:00 – 5:00 VTX

Writing
- Loaded extracted terms into LMT.
  - removed method and datum
- Working through current IR and group polarization
- A document retrieval system for man-machine interaction. From 1964, folks. Also, the ACM is mapping concepts to Wikipedia entries now.
Write up sprint story – done
- Develop a ‘training’ corpus known bad actors (KBA) for each domain.
  - KBAs will be pulled from http://w3.nyhealth.gov/opmc/factions.nsf, which provides a large list.
  - List of KBAs will be added to the content rating DB for human curation
  - HTML and PDF data will be used to populate a list of documents that will then be scanned and analyzed to prepare TF-IDF and LSI term-document tables.
  - The resulting table will in turn be analyzed using term centrality, with the output being an ordered list of terms to be evaluated for each domain.

Building view to get person, rating and link from the db – done, or at least V1

CREATE VIEW view_ratings AS
  select io.link, qo.search_type, po.first_name, po.last_name, po.pp_state, ro.person_characterization from item_object io
    INNER JOIN query_object qo ON io.query_id = qo.id
    INNER JOIN rating_object ro on io.id = ro.result_id
    INNER JOIN poi_object po on qo.provider_id = po.id;

Took results from w3.nyhealth.gov and ran them through the whole system. The full results are in the Corpus file under w3.nyhealth.gov-PDF-centrality_06_02_16-13_12_09.xlsx and w3.nyhealth.gov-WEB-centrality_06_02_16-13_12_09.xlsx. The results seem to make incredibly specific searches. Here are the two first examples. Note that there are very few .com sites.:
- PDF-derived: respondent consent committee probation agreement pursuant
  - This is very medical for some reason, though not very region specific
- Web-derived: physician license professional return number effective
  - This is very medical, but also biased towards NY state.
- Prepending physician names seems (so far) to lead to consistent hits in the search results on standard Google

Phil 5.31.16

7:00 – 4:30 VTX

Writing. Working on describing how maintaining many codes in a network contains more (and more subtle) information than grouping similar codes.
Working on the UrlChecker
- In the process, I discovered that the annotation.xml file is unique only for the account and not for the CSE. All CSEs for one account are contained in one annotation file
- Created a new annotation called ALL_annotations.xml
- fixed a few things in Andy’s file
- Reading in everything. Now to produce the new sets of lists.
- I think it’s just easier to delete all the lists and start over.
- Done and verified. You run UrlChecker from the command line, with the input file being a list of domains (one per line) and the ALL_annotations.xml file.
https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2
Need to add a Delete or Hide button to reduce down a large corpus to a more effective size.
Added. Tomorrow I’ll wire up the deletion of a row or cilumn and the recreation of the initialMatrix

Phil 5.30.16

7:00 – 10:00 Thesis/VTX

Built a new matrix for the coded lit review. I had coded a couple of more papers
Working on copying over the read papers into a new folder that I can run text analytics over
After carefully reading through the doc manager list and copying over each paper, I just discovered I could have exported selected.
Ooops: Exception in thread “JavaFX Application Thread” java.lang.IllegalArgumentException: Invalid column index (16384). Allowable column range for EXCEL2007 is (0..16383) or (‘A’..’XFD’)
- Going to add a limit of
```
SpreadsheetVersion.EXCEL2007.getMaxColumns()-8
```
  columns for now. Clearly that can be cut down.
- Figuring out where to cut the terms. I’m summing the columns of the LSI calculation, starting at the highest value and then dividing that by the sum of all values. The top 20% of rank weights gives 280 columns. Going to try that first
- Success! Some initial thoughts
  - The coded version is much more ‘crisp’
  - There are interesting hints in the LSI version
  - Clicking on a term or paper to see the associated items is really nice.
  - I think that document subgroups might be good/better, and it might be possible to use the tool to help build those subgroups. This goes back to the ‘hiding’ concept. (hide item / hide item and associated)

Phil 5.9.16

7:00 – 4:00 VTX

Started the paper describing the slider interface
TF-IDF today!
- Read docs from web and PDF
- Calculate the rank
- Create matrix of terms and documents, weighted by occurrence.
Hmm. What I’m actually looking for is the lowest-occurring terms within a document that occur over the largest number of documents. I’ve used this page as a starting point. After flailing for many hours in java, I wound up walking through the algorithm in Excel and I think I’ve got it. This is the spreadsheet that embodies my delusional thinking ATM.

Phil 5.6.16

7:00 – 4:00 VTX

Today’s shower thought is to compare the variance of the difference of two (unitized) rank matrices. The maximum difference would be (matrix size), so we do have a scale. If we assume a binomial distribution (there are many ways to be slightly different, only two ways to be completely different), then we can use a binomial (one tailed?) distribution centered on zero and ending at (matrix size). That should mean that I can see how far one item is from the other? But it will be withing the context of a larger distribution (all zeros vs all ones)…
Before going down that rabbit hole, I decided to use the bootstrap method just to see if the concept works. It looks mostly good.
- Verified that scaling a low-ranked item (ACLED) by 10 has less impact than scaling the highest ranking item (P61) by 1.28.
- Set the stats text to red if it’s outside 1 SD and green if it’s within.
- I think the terms can be played around with more because the top one (Pertinence) gets ranked at .436, while P61 has a rank of 1.
- There are some weird issues with the way the matrix recalculates. Some states are statistically similar to others. I think I can do something with the thoughts above, but later.
There seems to be a bug calculating the current mean when compared to the unit mean. It may be that the values are so small? It’s occasional….
Got the ‘top’ button working.
And that’s it for the week…

Oh yeah – Everything You Ever Wanted To Know About Motorcycle Safety Gear

Phil 5.5.16

7:00 – 5:30 VTX

Continuing An Introduction to the Bootstrap.
This helped a lot. I hope it’s right…
Had a thought about how to build the Bootstrap class. Build it using RealVector and then use Interface RealVectorPreservingVisitor to do whatever calculation is desired. Default methods for Mean, Median, Variance and StdDev. It will probably need arguments for max iteration and epsilon.
Didn’t do that at all. Wound up using ArrayRealVector for the population and Percentile to hold the mean and variance values. I can add something else later
I think to capture how the centrality affects the makeup of the data in a matrix. I think it makes sense to use the normalized eigenvector to multiply the counts in the initial matrix and submit that population (the whole matrix) to the Bootstrap
Meeting with Wayne? Need to finish tool updates though.
Got bogged down in understanding the Percentile class and how binomial distributions work.
Built and then fixed a copy ctor for Labled2DMatrix.
Testing. It looks ok, but I want to try multiplying the counts by the eigenVec. Tomorrow.

Phil 5.3.16

7:00 – 3:30 VTX

Out riding, I realized that I could have a column called ‘counts’ that would add up the total number of ‘terms per document’ and ‘documents per terms ‘. Unitizing the values would then show the number of unique terms per document. That’s useful, I think.
Helena pointed to an interesting CHI 2016 site. This is sort of the other side of extracting pertinence from relevant data. I wonder where they got their data from?
- Found it!. It’s in a public set of Google docs, in XML and JSON formats. I found it by looking at the GitHub home page. In the example code there was this structure:
```
source: {
    gdocId: '0Ai6LdDWgaqgNdG1WX29BanYzRHU4VHpDUTNPX3JLaUE',
    tables: "Presidents"
  }
```
  That gave me a hint of what to look for in the document source of the demo, where I found this:
```
var urlBase = 'https://ca480fa8cd553f048c65766cc0d0f07f93f6fe2f.googledrive.com/host/0By6LdDWgaqgNfmpDajZMdHMtU3FWTEkzZW9LTndWdFg0Qk9MNzd0ZW9mcjA4aUJlV0p1Zk0/CHI2016/';
```
  And that’s the link from above.
- There appear to be other useful data sets as well. For example, there is an extensive CHI paper database sitting behind this demo.
- So this makes generalizing the PageRank approach much more simple since it looks like I can pull the data down pretty simply. In my case I think the best thing would be to write small apps that pull down the data and build Excel spreadsheets that are read in by the tool for now.
Exporting a new data set from Atlas. Done and committed. I need to do runs before meeting with Wayne.
Added Counts in and refactored a bit.
I think I want a list of what a doc or term is directly linked to and the number of references. Addid the basics. Wiring up next. Done! But now I want to click on an item in the counts list and have it be selected? Or at least highlighted?
Stored the new version on dropbox: https://www.dropbox.com/s/92err4z2posuaa1/LMN.zip?dl=0
Meeting with Wayne
- There’s some bug with counts. Add it to the WeightedItem.toString() and test.
- Add a ‘move to top’ button near the weight slider that adds just enough weight to move the item to the top of the list. This could be iterative?
- Add code that compares the population of ranks with the population of scaled ranks. Maybe bootstrapping? Apache Commons Math has KolmogorovSmirnovTest, which has public double kolmogorovSmirnovTest(double[] x, double[] y, boolean strict), which looks promising.
Added ability to log out of the rating app.

Phil 4.29.16

7:00 – 5:00 VTX

Expense reports and timesheets! Done.
Continuing Informed Citizenship in a Media-Centric Way of Life
- The pertinence interface may be an example of a UI affording the concept of monitorial citizenship.
  - Page 219: The monitorial citizen, in Schudson’s (1998) view, does environmental surveillance rather than gathering in-depth information. By implication, citizens have social awareness that spans vast territory without having in-depth understanding of specific topics. Related to the idea of monitorial instead of informed citizenship, Pew Center (2008) data identified an emerging group of young (18–34) mobile media users called news grazers. These grazers ind what they need by switching across media platforms rather than waiting for content to be served.
- Page 222: Risk as Feelings. The abstract is below. There is an emotional hacking aspect here that traditional journalism has used (heuristically?) for most(?) of its history.
  - Virtually all current theories of choice under risk or uncertainty are cognitive and consequentialist. They assume that people assess the desirability and likelihood of possible outcomes of choice alternatives and integrate this information through some type of expectation-based calculus to arrive at a decision. The authors propose an alternative theoretical perspective, the risk-as-feelings hypothesis, that highlights the role of affect experienced at the moment of decision making. Drawing on research from clinical, physiological, and other subfields of psychology, they show that emotional reactions to risky situations often diverge from cognitive assessments of those risks. When such divergence occurs, emotional reactions often drive behavior. The risk-as-feelings hypothesis is shown to explain a wide range of phenomena that have resisted interpretation in cognitive–consequentialist terms.
- At page 223 – Elections as the canon of participation
Working on getting tables to sort – Done
Loading excel file -done
Calculating – done
Using weights -done
Reset weights – done
Saving (don’t forget to add sheet with variables!) – done
Wrapped in executable – done
Uploading to dropbox. Wow – the files with JavaFX are *much* bigger than Swing.

Phil 4.25.16

5:30 – 4:00 VTX

Saw this on Twitter about visualizing networks with D3
Working my way through the JavaFX tutorial. It is a lot like a blend of Flex and a rethought Swing. Nice, actually…
Here is the list of stock components
Starting with the ope file dialog – done.
Yep, there’s a spinner. And here’s dials and knobs
And here’s how to do a word cloud.
Here’s a TF-IDF implementation in JAVA. Need to build some code that reads in from our ‘negative match’ ‘positive match’ results and start to get some data driven terms
Tregex is a utility for matching patterns in trees, based on tree relationships and regular expression matches on nodes (the name is short for “tree regular expressions”). Tregex comes with Tsurgeon, a tree transformation language. Also included from version 2.0 on is a similar package which operates on dependency graphs (class SemanticGraph, calledsemgrex).
Semgrex
Sprint review
- Google CSEs
  - Switched over from my personal CSEs to Vistronix CSEs
  - Added VCS rep for CSEs
  - Figured out how to save out and load CSE from XML
  - Added a few more CSEs ONLY_NET, MOBY_DICK
  - Wrote up care and feeding document for Confluence
  - Added blacklists
- Rating App
  - Re-rigged the JPA classes to be Ontology-agnostic Version 2 of nearly everything)
  - Upped my JQL game to handle SELECT IN WHERE precompiled queries
  - Reading in VA and PA data now
  - Added the creation of a text JSON object that formalizes the rating of a flag
  - Got hooked up to the Talend DB!!!
  - Deployed initial version(s)
  - Added backlink logging using SemRush
- Future work
  - Developed Excel ingest
  - Still working on PDF and Word ingest

viztales

Dimension reduction, State, Orientation, and Speed

Category Archives: research

Phil7.25.16

Phil 7.22.16

Phil 6.21.16

Phil 6.20.16

Phil 6.15.16

Phil 6.13.16

Phil 6.2.16

Phil 5.31.16

Phil 5.30.16

Phil 5.9.16

Phil 5.6.16

Phil 5.5.16

Phil 5.3.16

Phil 4.29.16

Phil 4.25.16