My favourite papers from day one of ICML 2015

07 July 2015

Aargh! How can I possibly keep all the amazing things I learnt at ICML today in my head?! Clearly I can’t. This is a list of pointers to my favourite papers from today, and why I think they are cool. This is mainly for my benefit, but you might like them too!

Neural Nets / Deep Learning

BilBOWA: Fast Bilingual Distributed Representations without Word Alignments

Stephan Gouws, Yoshua Bengio, Greg Corrado

Why this paper is cool: It simultaneously learns word vectors for words in two languages without having to learn a mapping between them.

Compressing Neural Networks with the Hashing Trick

Wenlin Chen, James Wilson, Stephen Tyree, Kilian Weinberger, Yixin Chen

Why this paper is cool: Gives a huge reduction (32x) in the amount of memory needed to store a neural network. This means you can potentially use it on low memory devices like mobile phones!

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe, Christian Szegedy

Why this paper is cool: Makes deep neural network training super fast, giving a new state of the art for some datasets.

Deep Learning with Limited Numerical Precision

Suyog Gupta, Ankur Agrawal, Kailash Gopalakrishnan, Pritish Narayanan

Why this paper is cool: Train neural networks with very limited fixed precision arithmetic instead of floating points. The key insight is to use randomness to do the rounding. The goal is to eventually build custom hardware to make learning much faster.

Recommendations etc.

Fixed-point algorithms for learning determinantal point processes

Zelda Mariet, Suvrit Sra

Why this paper is cool If you want to recommend a set of things, rather than just an individual thing, how do you choose the best set? This will tell you.

Surrogate Functions for Maximizing Precision at the Top

Why this paper is cool: If you only care about the top n things you recommend, this technique works faster and better than other approaches.

Purushottam Kar, Harikrishna Narasimhan, Prateek Jain

And Finally…

Learning to Search Better than Your Teacher

Kai-Wei Chang, Akshay Krishnamurthy, Alekh Agarwal, Hal Daume, John Langford

Why this paper is cool: A new, general way to do structured prediction (tasks like dependency parsing or semantic parsing) which works well even when there are errors in the training set. Thanks to the authors for talking me through this one!

Machine Learning in Practice 6