Phil 3.13.16

9:00 – 5:00

  • Data journalism is IR with better affordances?

Still thinking about getting lost. In low-information environments, credibility cues and entertainment value can lead to habituation. Habituation can help sustain this process past the point where someone unfamiliar with the situation would draw the line. Which means that the sense of betrayal is higher?

ACM CHIIR Conference Day 1 – Tutorials: User modelling on information retrieval

  • Quantifying performance
  • Practically significant?
  • Statistical significance?
  • User-centered evaluation
    • Measure Users in the wild
      • A/B Testing, etc.
    • User in the lab
  • User performance prediction
    • Record user
    • Create model
    • Calibrate
    • Validate
    • Use model to predict performance
  • Cranfield Paradigm – Cyril Cleverdon
    • TREC – paid assessors
    • User satisfaction for retrieval evaluation metrics
    • Discounted Cumulative Gain (gain discounted by the probability that a document is visited, given its rank); can also be normalized against an optimal ranking
    • Expected Reciprocal Rank – pertinence calculation??? Based on the idea that there is one perfect document whose utility depends on its position in the ranking
    • Average precision <– search for this
  • Diversity, novelty, tractability
  • Underspecified vs. ambiguous queries
  • Specifications have aspects
  • Ambiguities have interpretations
  • Inferring query intent from reformulations and clicks
  • Ian Soboroff – Mr. TREC
  • Randomization – check animation in slides
  • Bootstrap –
  • Sign test – only looks at which side of a value each result falls on. Follows a binomial distribution
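
To make the metrics in the notes above concrete for myself, here is a minimal sketch of DCG, nDCG, average precision, Expected Reciprocal Rank, and a two-sided sign test. This is not from the tutorial slides. The function names are my own, the log2 rank discount and the (2^g - 1) / 2^g_max stopping probability are the common textbook forms, and nDCG assumes the list passed in holds every judged document for the query.

```python
import math

def dcg(gains, k=None):
    """Discounted cumulative gain: gain at rank r is divided by log2(r + 1)."""
    gains = gains[:k] if k is not None else gains
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))

def ndcg(gains, k=None):
    """DCG normalised by the DCG of the ideal ordering.

    Assumption: `gains` contains all judged documents for the query,
    so sorting it in descending order gives the optimal return.
    """
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal > 0 else 0.0

def average_precision(rels, num_relevant=None):
    """Mean of precision@r over the ranks r that hold a relevant document."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank
    # If the total number of relevant documents is unknown, fall back to
    # the number retrieved (an assumption that inflates the score).
    denom = num_relevant if num_relevant is not None else hits
    return total / denom if denom else 0.0

def expected_reciprocal_rank(grades, max_grade):
    """ERR under a cascade model: the user scans down the list and stops
    at a document with probability (2^grade - 1) / 2^max_grade."""
    err, p_reach = 0.0, 1.0
    for rank, grade in enumerate(grades, start=1):
        p_stop = (2 ** grade - 1) / (2 ** max_grade)
        err += p_reach * p_stop / rank
        p_reach *= 1 - p_stop
    return err

def sign_test_p(scores_a, scores_b):
    """Two-sided sign test on paired per-topic scores (ties are dropped).

    Under the null hypothesis the number of wins is Binomial(n, 0.5).
    """
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    losses = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

if __name__ == "__main__":
    # Hypothetical graded judgments (0-3) for one ranked list.
    ranked = [3, 2, 3, 0, 1, 2]
    print("nDCG:", ndcg(ranked))
    print("AP:  ", average_precision([g > 0 for g in ranked]))
    print("ERR: ", expected_reciprocal_rank(ranked, max_grade=3))
```

The sign test only needs to know which system won on each topic, so it throws away the size of the difference. That makes it cheap but less sensitive than the randomization or bootstrap tests mentioned above.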

Afternoon session

  • Evaluating whole systems
  • Metrics Derived from Query Logs
    • Use the logs to understand user behavior, then…
    • Learn the parameters of the user model from the query logs
  • Incorporating UI
  • User Variance
  • Time
    • Costs in time spent searching
    • Benefits in time well spent
    • Initial Assessment – quickly scan the document first. So what if we could make that step more amenable to measuring the effort involved?
      • Findability
      • Readability
      • Understandability
      • If the judge has to use tools to find the relevant part of the document and mark it, those biometrics might be usable…
    • Utility Extraction
    • A real user goes through both stages; an assessor only does step 1, Initial Assessment. But could learning be a third step? It’s certainly the step that would take the most time and require interdocument relationships
    • What about learning how to disambiguate your query?
    • Conceptual leaps???? Is that an information distance issue???
  • Session
    • Time spent on the last clicked document.
    • A session is just based on time (e.g. 30 minutes). TREC is leaving Session and moving to Task-Based
  • Task
    • What is a Gold-Standard task??
    • Which metrics to use??
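
A minimal sketch of the time-based session idea from the notes above: cut a user's query log into sessions wherever the gap between consecutive events exceeds a timeout. Reading "based on time" as an inactivity gap is my assumption, as are the function name `split_sessions` and the sample timestamps. The 30-minute default comes from the note.

```python
from datetime import datetime, timedelta

def split_sessions(timestamps, gap=timedelta(minutes=30)):
    """Group event timestamps into sessions.

    A new session starts whenever the time since the previous event
    is longer than `gap` (assumed inactivity-timeout definition).
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)   # still inside the current session
        else:
            sessions.append([ts])     # gap exceeded: open a new session
    return sessions

if __name__ == "__main__":
    # Hypothetical log events for one user.
    day = datetime(2016, 3, 13)
    events = [day.replace(hour=9, minute=0),
              day.replace(hour=9, minute=10),
              day.replace(hour=9, minute=50),
              day.replace(hour=10, minute=0)]
    for i, session in enumerate(split_sessions(events), start=1):
        print(i, [t.strftime("%H:%M") for t in session])
```

A pure timeout like this says nothing about what the user was trying to do, which is presumably why task-based evaluation is attractive: one task can span several sessions, and one session can contain several tasks.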
