Minutes of Roundtable Discussion 1

Our first roundtable discussion took place after Yong-Jae Lee’s MADDD seminar talk on “Learning to Understand Visual Data with Minimal Human Supervision”. His talk slides can be found here.

The following note summarizes this roundtable discussion.

I. Future of Deep Learning

The first part of this roundtable discussion was naturally devoted to the future of deep learning, continuing from the talk. The moderator brought up the point that implementation is moving quickly, with dedicated chips already integrated into the new iPhone 11 series, yet we do not have a solid handle on why deep learning works, or in which cases it can be expected to have difficulty.

The speaker roughly divided the field of people working on Deep Nets into two camps: those who view explainability as important, and those who do not. Their primary disagreement is over whether it is worth taking a performance hit to be able to explain why a model works. The latter camp draws parallels with human cognition, as described in Kahneman’s Thinking, Fast and Slow: often, we cannot even explain how a human knows what they know. On the other hand, there are good legal and debugging reasons to have a method that is explainable, especially in certain fields, such as medicine. An attendee contrasted security with network control; in the former, understanding why certain code is classified as malware is quite important. With a hat-tip to David Donoho (Stanford), the moderator argued that methods that work well most likely have some theory behind them, so we should expect this to apply to Deep Nets as well. The speaker drew a contrast between visualization techniques, which are best thought of as a method of debugging a neural net, and actual explanations; they may serve some of the same purposes, but do not dig into the details as thoroughly.

As a way of getting around the trade-off between performance and explainability, an attendee suggested training explainable and non-explainable methods in parallel and checking where they agree and disagree, as in the sketch below. The speaker and other attendees were somewhat skeptical: just because the two models give the same answer does not mean they do so for the same reason, so the utility of such an approach is somewhat less clear.
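To make the attendee’s suggestion concrete, here is a minimal sketch of the idea: fit an interpretable model and a black-box model on the same data and flag the inputs on which they disagree. The specific model and dataset choices (a shallow decision tree, a small MLP, scikit-learn’s breast cancer dataset) are illustrative assumptions, not details from the discussion.

```python
# Sketch: train an interpretable and a black-box model in parallel,
# then measure where their predictions agree and disagree.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Interpretable model: a shallow tree whose decision rules can be read off.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# Black-box model: a small neural network.
mlp = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000,
                    random_state=0).fit(X_train, y_train)

agree = tree.predict(X_test) == mlp.predict(X_test)
print(f"Models agree on {agree.mean():.1%} of test inputs")
# Disagreements are candidates for manual review; as noted in the
# discussion, agreement alone does not imply the models share a reason.
```

The skepticism voiced in the room applies directly to the last line: high agreement tells us nothing about whether the tree’s rules reflect what the network has actually learned.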

There are many cases where Deep Nets prove to be somewhat problematic. The most problematic is the subfield of adversarial examples, in which inputs are generated that a human would place in one class but that the network assigns to a different class. Using the lovely example of giraffe detection, the speaker described Deep Nets’ problems with domain adaptation, where it is difficult for them to generalize from a giraffe seen in a children’s book to a giraffe in the wild, as well as few-shot learning, where the problem is to go from a single example of a giraffe to recognizing giraffes in general. In some sense, Deep Nets do not yet capture the Platonic ideal of a giraffe that we humans have, from which we can recognize an illustration of a giraffe, a photograph of a giraffe, and a real view of a giraffe alike. We discussed some ways of getting around these problems: since images give us only 2D shots of 3D objects, one attendee suggested it would be more effective to have access to point-cloud representations of objects, while another suggested video data. The speaker noted that while both approaches have had some traction, neither has become a major area of Deep Net research, despite being the right direction in the long term; the obstacles are a relative dearth of data in the case of point clouds and the engineering headaches that video data cause. Ideally, both will be addressed with time.
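For readers unfamiliar with how adversarial examples are constructed, the following is a minimal sketch of the fast gradient sign method (FGSM) of Goodfellow et al., one standard construction; the talk did not prescribe a specific method, and the `model` and `epsilon` here are illustrative assumptions.

```python
# Sketch of FGSM: perturb an image imperceptibly so that the network's
# loss on the true label increases, often flipping the prediction.
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, epsilon=0.03):
    """Return an adversarially perturbed copy of input batch x."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to the
    # valid pixel range; to a human the image looks nearly unchanged.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()
```

The perturbation is bounded by `epsilon` per pixel, which is why the adversarial image remains visually in the original class even as the network’s answer changes.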

Finally, we discussed both multi-modal data and interactive data. While both are desirable, they are difficult to collect, and the speaker concluded that the most promising route forward is robotics, together with better methods for unsupervised learning.

II. TRIPODS Activities This Year

Naoki Saito presented a summary of planned activities of the UCD4IDS project. His presentation slides can be found here.

[Scribe: David Weber (GGAM)]
