Through mysterious means, I was able to attend the last half-day of KDD. So here’s some notes on the talks I went to.
(As an aside, I’m also going to try taking notes directly in markdown rather than on paper like some sort of neanderthal, wish me luck.)
Dynamic Embeddings for User Profiling in Twitter
- Xiangliang Zhang, paper
- the task: given user and tweets, predict keywords over time
- should be relevant, diverse, dynamic
- related work:
- expert finding (based on documents) - e.g. Rybak 2014
- Balog 2007 - generative language models (static)
- dynamic topic modeling
- dynamic word embedding (Kim 2014, Hamilton 2016, Bamler & Mandt 2017)
- want words & users embedded in same semantic space, modeled dynamically
- Dynamic User and Word Embedding - graphical model
- model user & word representations as diffusion processes, parametrized by how much a user's documents (tweets) change over time (rough sketch after these notes)
- use skipgram filtering (Bamler & Mandt 2017) for inference
- Streaming Keyword Diversification Model - not really described
- (but this has given me some ideas….. )
- evaluation - not really described, but naively this seems tricky to do right
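To make the diffusion-plus-skipgram idea concrete for myself, here's a minimal sketch of what I think the generative story looks like: a Gaussian random-walk prior over per-time-step user and word embeddings, plus a skip-gram-style likelihood tying users to the words in their tweets. This is my own reconstruction, not the authors' model or code; the function names, the drift scale `sigma`, and the negative-sampling likelihood are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def diffusion_log_prior(embeddings, sigma):
    """Gaussian random-walk prior: e_t ~ N(e_{t-1}, sigma^2 I).
    embeddings has shape (T, n_entities, dim), one slice per time step."""
    diffs = np.diff(embeddings, axis=0)
    return -0.5 * np.sum(diffs ** 2) / sigma ** 2

def skipgram_log_likelihood(user_emb, word_emb, pos_pairs, neg_pairs):
    """Skip-gram-with-negative-sampling style likelihood at one time step.
    pos_pairs: (user_idx, word_idx) pairs observed in that user's tweets;
    neg_pairs: randomly sampled negative pairs."""
    pos = np.sum(user_emb[pos_pairs[:, 0]] * word_emb[pos_pairs[:, 1]], axis=1)
    neg = np.sum(user_emb[neg_pairs[:, 0]] * word_emb[neg_pairs[:, 1]], axis=1)
    return np.log(sigmoid(pos)).sum() + np.log(sigmoid(-neg)).sum()

# Toy joint log-density: 3 time steps, 5 users, 20 words, 8-dim embeddings.
T, n_users, n_words, d = 3, 5, 20, 8
U = rng.normal(size=(T, n_users, d))
W = rng.normal(size=(T, n_words, d))
pos = rng.integers(0, [n_users, n_words], size=(10, 2))
neg = rng.integers(0, [n_users, n_words], size=(10, 2))
print(diffusion_log_prior(U, 0.1) + diffusion_log_prior(W, 0.1)
      + skipgram_log_likelihood(U[0], W[0], pos, neg))
```

Inference in the actual paper is done with skip-gram filtering (Bamler & Mandt 2017) rather than by optimizing a log-density like this directly.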
TINET: Learning Invariant Networks via Knowledge Transfer
- Chen Luo, paper
- analysing system behaviour of large-scale online services (e.g. AWS) - automate that shit
- invariant network model - capture normal system behaviour, can do anomaly detection etc.
- standard methods of learning (time series etc.) are slow slow slow
- difficult to directly transfer an invariant network to a new environment
- given complete old IN and incomplete new IN - get complete new IN
- domain-specific knowledge from target
- common knowledge from source
- deal with heterogeneity
- TINET = Entity Estimation Model + Dependency Construction Model
- transfer relevant entities and construct missing dependencies (crude two-step sketch after these notes)
- compare with random walk & collective matrix factorization (other ways to compute node importance & link prediction)
- effectiveness demonstrated in synthetic & real experiments
- converges within 20 iterations, no parameters need to be tuned
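The talk didn't leave me with the exact form of the Entity Estimation Model or the Dependency Construction Model, so the sketch below is only a crude stand-in for the two-step shape of the method: pick source entities that look relevant to the target, then fill in missing dependencies among them. The cosine-similarity relevance score, the shared-neighbour edge rule, and all the names are my assumptions, not the paper's.

```python
import numpy as np
import networkx as nx

def transfer_entities(source_feats, target_feats, k):
    """Entity-estimation stand-in: keep the k source entities whose feature
    vectors are most similar (cosine) to anything already seen in the target."""
    s = source_feats / np.linalg.norm(source_feats, axis=1, keepdims=True)
    t = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)
    relevance = (s @ t.T).max(axis=1)
    return {int(i) for i in np.argsort(-relevance)[:k]}

def construct_dependencies(source_net, target_net, transferred, min_support):
    """Dependency-construction stand-in: copy a source edge into the target
    when both endpoints were transferred and the edge has enough shared
    neighbours that already exist in the target network."""
    net = target_net.copy()
    for u, v in source_net.edges():
        if u in transferred and v in transferred and not net.has_edge(u, v):
            support = len(set(source_net[u]) & set(source_net[v]) & set(net.nodes()))
            if support >= min_support:
                net.add_edge(u, v)
    return net

# Toy usage: 6 source entities with 4-dim features, a sparse target network.
rng = np.random.default_rng(4)
src_feats, tgt_feats = rng.normal(size=(6, 4)), rng.normal(size=(3, 4))
src_net = nx.gnp_random_graph(6, 0.5, seed=0)
tgt_net = nx.Graph([(0, 1)])
kept = transfer_entities(src_feats, tgt_feats, k=4)
print(construct_dependencies(src_net, tgt_net, kept, min_support=1).edges())
```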
Can Who‑Edits‑What Predict Edit Survival?
- Victor Kristof, paper
- distributed peer production systems (open source, crowdsourcing…)
- want to predict quality of contributions
- e.g. user reputation, specialised classifiers
- want something both general and accurate
- INTERANK - simple, general, accurate
- model the probability that an edit is successful as a game between the user and the item (toy sketch after these notes)
- “skill” of user and “difficulty” of item
- also model closeness of user and item in shared “skill” latent embedding space
- Wikipedia experiments
- difficulty parameter correlates strongly with articles manually identified as controversial
- topical clustering in latent space
- Linux experiments
- core components most difficult (low acceptance rates)
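As I understood it, the core model is just a logistic "match": an edit's acceptance probability goes up with the user's skill, down with the item's difficulty, and up again when the user's and item's latent vectors are close. The exact parameterization and names below are my guess from the talk, not lifted from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def edit_acceptance_prob(skill, difficulty, user_vec, item_vec):
    """P(edit accepted) as a 'game' between user and item: user skill vs.
    item difficulty, plus a bonus when the user's latent vector is close
    to the item's (topical match)."""
    return sigmoid(skill - difficulty + user_vec @ item_vec)

# Toy example: a skilled user editing a fairly easy article.
rng = np.random.default_rng(1)
u, v = rng.normal(size=8), rng.normal(size=8)
print(edit_acceptance_prob(skill=1.5, difficulty=0.3, user_vec=u, item_vec=v))
```

Everything is fit from observed accept/reject outcomes, which is presumably what makes the same model work for both Wikipedia and Linux.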
An Efficient Two‑Layer Mechanism for Privacy‑Preserving Truth Discovery
- Yaliang Li, paper
- truth discovery - weighted aggregation (e.g. for crowdsourcing)
- estimate accuracy of answers and reliability of users
- but may require sharing info that you want to keep private
- randomized response - return true response with some probability, otherwise some default
- two-layer system
- sample private probability from some distribution
- use this probability for randomized response (toy sketch after these notes)
- analyse privacy with epsilon-local differential privacy
- analyse utility by bounding change in error rate
- I kinda stopped paying attention to this one partway through, whoops
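For what I did catch: the "two layers" seem to be (1) sampling the truth-telling probability itself from a distribution, and (2) running ordinary randomized response with that sampled probability. Here's a toy sketch under that reading; the Beta distribution and the uniform-random fallback answer are my assumptions (the talk just said "some distribution" and "some default").

```python
import numpy as np

rng = np.random.default_rng(2)

def two_layer_randomized_response(true_answer, candidate_answers, a, b):
    """Layer 1: sample a per-report truth-telling probability p ~ Beta(a, b).
    Layer 2: classic randomized response -- report the true answer with
    probability p, otherwise a uniformly random candidate answer."""
    p = rng.beta(a, b)
    if rng.random() < p:
        return true_answer
    return rng.choice(candidate_answers)

# Toy usage: a yes/no crowdsourcing question.
print(two_layer_randomized_response("yes", ["yes", "no"], a=5.0, b=1.0))
```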
Generalized Score Functions for Causal Discovery
- Biwei Huang, paper
- causal discovery from observational data
- existing score-based methods (e.g. BIC) make strong assumptions on data distribution, etc.
- want something general and asymptotically correct
- mixed data types
- nonlinear causal relations
- arbitrary data distributions
- multi-dimensional variables
- cross-validated likelihood and marginal likelihood in RKHS (rough stand-in sketch after these notes)
- locally statistically consistent
- I found this presentation really hard to follow, but it seems like a simple and theoretically well-founded idea
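My rough mental model of the cross-validated-likelihood score, using kernel ridge regression as a cheap stand-in for the RKHS regression in the talk: score a candidate parent set for a variable by held-out Gaussian log-likelihood of the regression residuals, and prefer the structure that scores higher. The concrete choices below (RBF kernel, the alpha value, the Gaussian residual model) are my assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import KFold

def cv_log_likelihood_score(parents, child, n_splits=5):
    """Score a candidate parent set for one variable by cross-validated
    predictive (Gaussian) log-likelihood of a kernel regression."""
    if parents.shape[1] == 0:
        # No parents: score the marginal Gaussian fit instead.
        mu, sd = child.mean(), child.std() + 1e-6
        return np.sum(-0.5 * ((child - mu) / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi)))
    score = 0.0
    for train, test in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(parents):
        model = KernelRidge(kernel="rbf", alpha=1.0).fit(parents[train], child[train])
        resid = child[test] - model.predict(parents[test])
        sd = resid.std() + 1e-6
        score += np.sum(-0.5 * (resid / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi)))
    return score

# Toy check: for X -> Y (nonlinear), Y given X should outscore Y given nothing.
rng = np.random.default_rng(3)
x = rng.normal(size=(200, 1))
y = np.sin(2 * x[:, 0]) + 0.1 * rng.normal(size=200)
print(cv_log_likelihood_score(x, y), ">", cv_log_likelihood_score(x[:, :0], y))
```

The actual method plugs a score like this into a structure search (e.g. greedy equivalence search) rather than comparing parent sets one variable at a time.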
R‑VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering
- Pan Lu, paper
- image + question => predict answer
- past work
- image & question feature extraction, fusion, multiclass classification into answers from training data
- models tend to extract entities
- relation facts (triples) have larger semantic capacity - extract these instead
- existing datasets don't have aligned relation facts (e.g. COCO-QA, VQA) or aligned question-answer pairs (Visual Genome)
- => Relation-VQA dataset! (to be released soon)
- architecture tries to extract relation facts from image and question
- then semantic attention over relations to answer the question (generic attention sketch after these notes)
- also context-aware visual attention conditioned on image and question
- gets SOTA on R-VQA (a dataset they just came up with, so of course)
- … and also COCO-QA (though they trained on strictly more information I guess since there’s also relations? maybe other work does as well)
- I still don’t really understand what the point of this task is, though I guess it’s mildly AGI-ish and probably learns some useful semantic representations
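The "semantic attention over relations" part is, as far as I could tell, attention with the question as the query and the encoded relation facts as keys/values. A generic sketch, not the paper's architecture; the names and dimensions are made up.

```python
import torch
import torch.nn.functional as F

def semantic_attention(question_vec, relation_vecs):
    """Attend over candidate relation-fact embeddings (subject, relation,
    object triples already encoded as vectors) using the question as the
    query, and return a question-aware summary of the facts."""
    scores = relation_vecs @ question_vec      # one score per fact
    weights = F.softmax(scores, dim=0)         # attention weights over facts
    return weights @ relation_vecs             # weighted fact summary

# Toy usage: 6 candidate facts in a 32-dim shared embedding space.
q = torch.randn(32)
facts = torch.randn(6, 32)
print(semantic_attention(q, facts).shape)
```

Presumably the context-aware visual attention is wired up similarly, with image region features in place of the relation facts.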
And that’s a wrap!