Neural Architecture Transfer
Machine Learning
Neural network topology describes how neurons are connected to form a network. This architecture is infinitely adaptable, and novel topologies are often hailed as breakthroughs in neural network research. From the advent of the Perceptron in 1958 to feed-forward neural networks to Long Short-Term Memory models to, more recently, Generative Adversarial Networks developed by Ian Goodfellow, innovative architectures represent great advancements in machine learning.
But how does one find novel, effective architectures for specific problem sets? Solely through human ingenuity, until recently. This brings us to Neural Architecture Search (NAS), an algorithmic approach to discovering optimal network topologies through raw computational power. The approach is basically a massive grid search: test many combinations of hyperparameters, such as the number of hidden layers, the number of neurons in each layer, and the activation function, to find the best-performing architecture. If this sounds incredibly resource intensive, it is.
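As a rough sketch of the grid-search idea, the code below enumerates every combination of a few architectural hyperparameters; the `evaluate` function is a hypothetical stand-in for actually training and validating each candidate network:

```python
from itertools import product

# Hypothetical NAS-as-grid-search: enumerate every combination of a few
# architectural hyperparameters and keep the best-scoring candidate.
search_space = {
    "hidden_layers": [1, 2, 3],
    "neurons_per_layer": [32, 64, 128],
    "activation": ["relu", "tanh"],
}

def evaluate(arch):
    # Stand-in for training a network and returning validation accuracy.
    # A real evaluation would train each candidate, which is why NAS is
    # so resource intensive: this small grid already has 3 * 3 * 2 = 18 cells.
    return arch["hidden_layers"] * 0.1 + arch["neurons_per_layer"] / 1000

keys = list(search_space)
candidates = [dict(zip(keys, combo)) for combo in product(*search_space.values())]
best = max(candidates, key=evaluate)
print(len(candidates))  # 18 architectures to train and compare
print(best)
```

Note how quickly the grid grows: every new hyperparameter multiplies the number of candidates, and each cell costs a full training run.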
The NAS can be broken down into three components that capture the various optimization problems it must solve.
- Search Space: the parameters you want to check. As Kyriakides and Margaritis warn [2], reducing the search space will conserve resources, but without applicable domain knowledge and intuition about possibly effective architectures one can easily miss the optimal solution. Indeed, it is very easy to introduce bias into the search by only adapting architectures already known to be successful; this merely solidifies what we already know to work.
- Optimization Method: how to explore the search space. One must balance the value of further exploration toward the global optimum against the cost of that exploration. There is a host of methods to utilize, including evolutionary algorithms (which will appear again later), reinforcement learning, and Bayesian optimization.
- Candidate Evaluation: how to choose the best model. This is more straightforward but, again, computationally expensive. The cost can be reduced to some degree by using fewer training epochs or by sharing weights between models as the process goes on.
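The fewer-epochs idea mentioned above can be sketched as a two-stage evaluation: a cheap proxy score prunes the candidate pool before any full training is spent. Both `proxy_score` and `full_train` here are hypothetical stand-ins for real training runs:

```python
# Hypothetical low-fidelity candidate evaluation: score every candidate
# with a cheap proxy (few training epochs), then spend the full training
# budget only on the top performers.

def proxy_score(candidate, epochs=2):
    # Stand-in for a short training run; real code would train `candidate`
    # for a handful of epochs and return validation accuracy.
    return candidate["size"] * epochs

def full_train(candidate, epochs=100):
    # Stand-in for the expensive full training run.
    return candidate["size"] * epochs

candidates = [{"id": i, "size": s} for i, s in enumerate([3, 9, 1, 7])]
# Keep only the two most promising candidates from the cheap pass...
shortlist = sorted(candidates, key=proxy_score, reverse=True)[:2]
# ...and pay the full training cost only for those.
winner = max(shortlist, key=full_train)
print([c["id"] for c in shortlist])  # ids 1 and 3 survive the cheap pass
print(winner["id"])
```

The trade-off is that a low-fidelity proxy can misrank candidates whose relative performance only emerges late in training.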
I recommend reading more about evolutionary algorithms in [3].
As the authors note, it may seem that we have simply traded out one difficult decision process of model fine-tuning for another equally difficult one of architecture search tuning. So where do we go from here?
Neural Architecture Transfer: transfer learning reimagined
Transfer learning has traditionally meant using pre-trained neural networks with frozen neuron weights to drastically improve specific model outcomes. These models, like VGG, Inception, Xception, and MobileNet, are sophisticated, deep networks trained on the ImageNet dataset containing over 1.2 million images. This allows individuals to leverage powerful models without the need for resource intensive training.
Transfer learning can have stunning results; however, one caveat with the adoption of pre-trained models is that their architectures cannot be altered by the end user to better suit a given problem. Once the architecture has been changed, the model must be re-trained, voiding the advantages of pre-training.
Enter neural architecture transfer (NAT), a new approach to transfer learning. Described by researchers at Michigan State University (Zhichao Lu, Gautam Sreekumar, Erik Goodman, Wolfgang Banzhaf, Kalyanmoy Deb, and Vishnu Naresh Boddeti), NAT allows for customized architectures optimized for the user's problem space. The process automates NAS while circumventing some of the complications that NAS creates.
The steep computational cost of exploring and training many models during NAS is offset to a degree via supernets, which are composed of multiple component subnets that can be trained simultaneously with weight sharing. The supernet then returns optimal subnets on the Pareto front that can be leveraged for novel problems, just like traditional transfer learning.
The Pareto front, or frontier, is useful for visualizing the trade-offs within a multi-objective search, like NAS. An appealing aspect of NAT is that it only returns subnets that lie on the Pareto front, ensuring models with an optimal topology.
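As a minimal sketch of that non-dominated filter (with made-up accuracy and FLOPs numbers, not values from the paper), a candidate stays on the Pareto front only if no other candidate beats it on both objectives at once:

```python
# Minimal Pareto-front filter over (accuracy, FLOPs) pairs, assuming we
# want to maximize accuracy while minimizing FLOPs.

def dominates(a, b):
    # a dominates b: a is no worse on both objectives and strictly
    # better on at least one of them.
    return (a["acc"] >= b["acc"] and a["flops"] <= b["flops"]
            and (a["acc"] > b["acc"] or a["flops"] < b["flops"]))

def pareto_front(models):
    # Keep every model that no other model dominates.
    return [m for m in models
            if not any(dominates(other, m) for other in models if other is not m)]

models = [
    {"name": "A", "acc": 0.76, "flops": 600},
    {"name": "B", "acc": 0.74, "flops": 300},  # cheapest reasonable model
    {"name": "C", "acc": 0.72, "flops": 500},  # dominated by B on both objectives
    {"name": "D", "acc": 0.78, "flops": 900},  # most accurate, most expensive
]
front = pareto_front(models)
print(sorted(m["name"] for m in front))  # ['A', 'B', 'D']
```

Each surviving model represents a different accuracy/cost trade-off, which is exactly what NAT hands back to the user.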
What initially spawned this post was the method utilized to conduct the multi-objective search itself. Researchers used an evolutionary algorithm (EA) to search for the optimal topologies for subnets. EAs leverage “survival of the fittest” over many generations to identify the optimal hyperparameters.
The general process is as follows:
- Select the initial population from previously explored architectures
- Create “offspring” through mutation and cross-over of the “parent” architectures
- “Survival of the fittest” — only keep the best k architectures to be the next generation of parents
- Repeat until the desired objective thresholds are met
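The steps above can be sketched as a toy evolutionary loop. The architecture encoding (a list of layer widths), the mutation and cross-over operators, and the fitness function here are all illustrative stand-ins, not the NAT authors' implementation:

```python
import random

random.seed(0)

def fitness(arch):
    # Stand-in for validation accuracy: pretend the ideal
    # architecture has 64 neurons in every layer (0 is a perfect score).
    return -sum(abs(n - 64) for n in arch)

def mutate(arch):
    # Nudge one randomly chosen layer width up or down.
    i = random.randrange(len(arch))
    child = list(arch)
    child[i] = max(1, child[i] + random.choice([-16, 16]))
    return child

def crossover(a, b):
    # Splice two parents at a random cut point.
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

# Step 1: initial population of previously explored architectures.
population = [[random.choice([16, 32, 128]) for _ in range(3)] for _ in range(8)]
k = 4
for generation in range(30):
    # Step 2: create offspring via mutation and cross-over.
    offspring = [mutate(random.choice(population)) for _ in range(4)]
    offspring += [crossover(*random.sample(population, 2)) for _ in range(4)]
    # Step 3: survival of the fittest — keep only the best k.
    population = sorted(population + offspring, key=fitness, reverse=True)[:k]
    # Step 4: stop once the objective threshold is met.
    if fitness(population[0]) == 0:
        break

print(population[0], fitness(population[0]))
```

Because parents and offspring compete in the same pool, the best fitness never decreases from one generation to the next.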
The reported results from the researchers are quite astounding. With the supernet trained on ImageNet and subnets evaluated on ten image classification datasets, including ImageNet, CIFAR-10, and CIFAR-100, NAT models consistently outperformed state-of-the-art models, including Inception, ResNet, and MobileNet, while requiring an order of magnitude fewer FLOPs.
As the researchers note, NATnets were significantly more effective than conventional fine-tuning. Additionally, because the same supernet was used for all datasets and the datasets were incredibly varied, it appears that NAT was indeed able to produce customized, optimal subnets for each novel problem space.
Neural architecture transfer could very well be an innovation in league with those advancements mentioned above. It is able to robustly generate high performing, low cost subnets for novel problems, and it has already outperformed many of the current vanguard models on a selection of classification cases.
If NAT or any of its methods are of interest, please read [1] for an in-depth explanation! The researchers also maintain a GitHub page for NAT.
Feel free to explore my other articles, and connect with me on LinkedIn.
Sources
[1] Z. Lu, G. Sreekumar, E. Goodman, et al., Neural Architecture Transfer (2020), arXiv:2005.05859.
[2] G. Kyriakides and K. Margaritis, An Introduction to Neural Architecture Search for Convolutional Networks (2020), arXiv:2005.11074.
[3] A.N. Sloss and S. Gustafson, 2019 Evolutionary Algorithms Review (2019), arXiv:1906.08870.
Original article: https://towardsdatascience.com/neural-architecture-transfer-54226b2306e3