Posts

We recently transferred from Rice University to GaTech. After a fairly long period of settling in and exploring, life has returned to normal, though in a different style. My new apartment is 4 miles from campus, so I have to drive there every day, even without a Klaus parking pass.

Efficient DNN Training. Summary: Model compression has been extensively studied for lightweight inference; popular methods include network pruning, weight factorization, network quantization, and neural architecture search, among many others. The literature on efficient training, on the other hand, is much sparser: DNN training still requires fully training the over-parameterized neural network.

Accepted as a NeurIPS 2020 regular paper! Abstract: Multiplication (e.g., convolution) is arguably a cornerstone of modern deep neural networks (DNNs). However, intensive multiplications cause expensive resource costs that challenge DNN deployment on resource-constrained edge devices, driving several attempts for multiplication-less deep networks.

Accepted as a NeurIPS 2020 regular paper! Abstract: Recent breakthroughs in deep neural networks (DNNs) have fueled a tremendous demand for intelligent edge devices featuring on-site learning, while the practical realization of such systems remains a challenge due to the limited resources available at the edge and the required massive training costs for state-of-the-art (SOTA) DNNs.

Accepted as an ECCV 2020 regular paper! Abstract: There has been an explosive demand for bringing machine learning (ML) powered intelligence into numerous Internet-of-Things (IoT) devices. However, the effectiveness of such intelligent functionality requires in-situ continuous model adaptation for adapting to new data and environments, while the on-device computing and energy resources are usually extremely constrained.

Recent work shows that DNN training goes through distinct stages. Each stage responds differently to a given hyper-parameter setting and therefore deserves a detailed explanation. Below I aim to analyze and share a deeper understanding of DNN training, especially from the following three perspectives:

Accepted as an ISCA 2020 regular paper! Abstract: We present SmartExchange, an algorithm-hardware co-design framework to trade higher-cost memory storage/access for lower-cost computation, for energy-efficient inference of deep neural networks (DNNs).

Lottery Ticket Hypothesis: A randomly-initialized, dense neural network contains a subnetwork that is initialized such that, when trained in isolation, it can match the test accuracy of the original network after training for at most the same number of iterations.
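To make the idea of "a subnetwork with its original initialization" a bit more concrete, here is a minimal sketch in plain Python. It assumes the subnetwork is found by one-shot magnitude pruning of the trained weights, which is one common choice and is not stated in the excerpt above; the function names `magnitude_mask` and `rewind_to_init` and the toy weight values are hypothetical, made up for illustration only.

```python
def magnitude_mask(trained_weights, sparsity):
    """Return a 0/1 mask that drops the `sparsity` fraction of
    smallest-magnitude trained weights (assumed pruning criterion)."""
    n_prune = int(len(trained_weights) * sparsity)
    # Indices sorted by ascending magnitude; the first n_prune are pruned.
    order = sorted(range(len(trained_weights)),
                   key=lambda i: abs(trained_weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(trained_weights))]


def rewind_to_init(initial_weights, mask):
    """Keep surviving weights at their ORIGINAL initial values and
    zero out the pruned ones; this is the candidate 'winning ticket'."""
    return [w if m else 0.0 for w, m in zip(initial_weights, mask)]


# Toy flattened weight vectors (hypothetical values).
initial = [0.5, -0.1, 0.3, -0.7, 0.05, 0.2]
trained = [0.9, -0.02, 0.4, -1.1, 0.01, 0.15]

mask = magnitude_mask(trained, sparsity=0.5)
ticket = rewind_to_init(initial, mask)
print(mask)
print(ticket)
```

The hypothesis is then a claim about what happens when `ticket` is trained in isolation with the mask held fixed; the training loop itself is omitted here.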

This post is meant to be my reflection on cultivating good research taste as an individual researcher, and it should be maintained and reviewed regularly! Updates 01/02/2020: Today I found that a paper I had criticized was accepted as an oral presentation. I was dismissive at first glance, since one can easily understand how it is supposed to work technically, and I took it for granted.

Before applying for a Ph.D., I heard that Rice University has the highest happiness index. I believed (my mind changed later) that this was the case for undergraduates but did not apply to Ph.

Posts

Hiking at Kennesaw Mountain

Efficient DNN Training

[NeurIPS 2020] ShiftAddNet: A Hardware-Inspired Deep Network

[NeurIPS 2020] FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

[ECCV 2020] HALO: Hardware-Aware Learning to Optimize

DNN Training Stages Understanding

[ISCA 2020] SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation

Lottery Ticket Hypothesis

Cultivate Good Research Taste

Hermann Park