Optimizing Machine Learning
Machine learning models have grown increasingly large in recent years, while at the same time they are being deployed more often to embedded devices, which typically have limited resources. Several methods have therefore been devised to reduce model size, decrease the computing power required for inference, and lower energy consumption, while keeping the models’ accuracy acceptable. One of these methods is neural network pruning, in which either individual weights (unstructured pruning) or complete structures (structured pruning) are removed from a neural network.
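To make the difference between the two families concrete, here is a minimal sketch in plain Python. It assumes a simple magnitude-based criterion (remove what has the smallest absolute value or L1 norm), which is only one of many possible strategies and not necessarily one of the algorithms examined in the research. The function names and the toy weight matrix are hypothetical, and each row stands in for one structure such as a neuron or filter.

```python
# Illustrative sketch only: magnitude-based pruning on a toy weight matrix.
# The criterion, the function names, and the data are assumptions for this example.

def prune_unstructured(weights, fraction):
    """Zero out the given fraction of individual weights with the smallest magnitude."""
    ranked = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    k = int(len(ranked) * fraction)  # number of weights to prune
    pruned = [row[:] for row in weights]
    for _, r, c in ranked[:k]:
        pruned[r][c] = 0.0  # the matrix keeps its shape, it only becomes sparse
    return pruned


def prune_structured(weights, fraction):
    """Remove the given fraction of whole rows with the smallest L1 norm."""
    ranked = sorted((sum(abs(w) for w in row), r) for r, row in enumerate(weights))
    k = int(len(weights) * fraction)  # number of rows to remove
    dropped = {r for _, r in ranked[:k]}
    # the matrix itself shrinks, because complete rows disappear
    return [row[:] for r, row in enumerate(weights) if r not in dropped]


weights = [
    [0.9, -0.1, 0.4],
    [0.05, 0.02, -0.03],
    [-0.7, 0.6, 0.2],
    [0.3, -0.8, 0.01],
]

sparse = prune_unstructured(weights, 0.5)   # same 4x3 shape, half the weights zeroed
smaller = prune_structured(weights, 0.25)   # one complete row removed, 3x3 remains
```

Unstructured pruning leaves the shape of the matrix intact, so any saving depends on the runtime being able to exploit the zeros. Structured pruning produces a genuinely smaller matrix, at the cost of removing weights in coarser units.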
Reducing energy consumption in neural networks
Many algorithms that apply these pruning methods have been described in the literature, each with its own strategy and its own effect on the size, accuracy, and energy consumption of the pruned neural network. In his research, Sebastiaan analyzed a subset of the most popular (most referenced in the literature) pruning algorithms from two pruning method ‘families’: unstructured and structured pruning. To the best of his knowledge, this is one of the first real-world analyses of the energy consumption, model size, and accuracy of these pruning algorithms, applied to one of the most widely used compact network architectures (MobileNetV2) and deployed to one of the most popular development boards (based on ARM). Where possible, the research generalizes its results to make predictions about pruning methods and model architectures that were not examined, based on, for example, the properties of the algorithms and architectures that were. Engineers may be able to use this generalization as a reference when deciding which pruning method to apply to a machine learning model destined for an embedded system, depending on the accuracy, energy consumption, and model size requirements.
Last but not least
Sebastiaan is proud to announce that his paper has been accepted for the IFIP IoT Conference 2022 and will be published in Springer’s “IFIP Advances in Information and Communication Technology” series as part of the conference proceedings. He will present his research on October 27 and 28 of this year in Amsterdam. The DOI of the paper is available via this link.
Do you have an assignment or concept you would like to explore around AI, machine learning, or neural networks? Sebastiaan may be able to help you out, so feel free to get in touch with us.