Primer: Why is deep learning so deep?
University of Pennsylvania
Deep neural networks are more powerful than shallow ones, but can be harder to train. In this primer, we will show mathematically why both of these statements are true. Specifically, we will see that depth leads to an exponentially greater ability to express even simple polynomial functions. We will identify why some initializations and architectures impede learning in deeper networks, and demonstrate (both theoretically and empirically) several principles to bear in mind when designing a deep neural network that will learn effectively.
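As a small illustration of the expressivity claim, the sketch below stacks "squaring layers" on the polynomial 1 + x and tracks the degree and number of terms of the result. This is only a toy under our own assumptions, not necessarily the construction used later in this primer: the helper names (`poly_mul`, `stack_squaring_layers`, `degree`) are hypothetical, and each layer is idealized as a single exact multiplication rather than a trained neuron.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = power of x)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def stack_squaring_layers(p, depth):
    """Apply `depth` layers in sequence; each layer squares its input polynomial."""
    for _ in range(depth):
        p = poly_mul(p, p)  # one multiplication per layer (idealized assumption)
    return p


def degree(p):
    """Highest power of x with a nonzero coefficient."""
    return max(i for i, c in enumerate(p) if c != 0)


if __name__ == "__main__":
    base = [1, 1]  # the polynomial 1 + x
    for depth in range(1, 6):
        result = stack_squaring_layers(base, depth)
        terms = sum(1 for c in result if c != 0)
        print(f"depth={depth}  multiplications={depth}  degree={degree(result)}  terms={terms}")
```

In this toy setting the number of multiplications grows linearly with depth, while the degree of the computed polynomial, (1 + x)^(2^depth), doubles with every layer.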