A graph placement methodology for fast chip design

Published: 2021 (Nature)
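Key points: This paper applies reinforcement learning to chip design, shortening design time while matching or even exceeding the state of the art. The core task here (chip floorplanning) is arranging blocks such as the memory subsystem, compute units or control logic system on the chip while satisfying constraints such as density and routing congestion, so it is essentially a combinatorial optimization problem.
The method first casts chip design as a sequential decision problem: blocks are placed one at a time, and once the last one is placed the design is complete. The state is encoded with an edge-based graph convolutional neural network, and the policy network is trained with PPO. To speed up training, some data is collected first and used to train the state encoder in a supervised fashion. The data comes from a setup in which the authors "trained a vanilla policy network with various congestion weights"; the input of each sample is the raw feature representation of the chip, and the label is the reward. Note that in this problem a reward is given only at the terminal step; at every other step it is 0.
After supervised training on this data, the layer that predicts the reward is removed, the encoder is connected directly to the policy network, and the whole network is trained with PPO. The graph network in front refreshes its node and edge representations by its own iterative rules, while the layers after it are updated by PPO. The graph network's update rules are: (1) each edge updates its representation by applying a fully connected network to an aggregated representation of intermediate node embeddings, and (2) each node updates its representation by taking the mean of adjacent edge embeddings. As formulas: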
[Equation image not preserved in this copy: the edge and node update rules stated above]
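The two update rules can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `edge_gnn_step`, the choice of concatenation as the "aggregated representation", the single ReLU layer, and the handling of isolated nodes are all assumptions made for illustration.

```python
import numpy as np

def edge_gnn_step(node_emb, edges, edge_weights, W, b):
    """One round of the edge/node updates (illustrative sketch only).

    node_emb:     (N, d) float array of node embeddings
    edges:        list of (i, j) node-index pairs, assumed to have no self-loops
    edge_weights: (E,) float array, one scalar weight per edge
    W, b:         weights of one fully connected layer, shapes (2d + 1, d) and (d,)
    """
    src = np.array([i for i, _ in edges])
    dst = np.array([j for _, j in edges])

    # Rule (1): each edge applies a fully connected layer to an aggregate of
    # its endpoint embeddings. Concatenation plus ReLU is an assumption here.
    agg = np.concatenate(
        [node_emb[src], node_emb[dst], edge_weights[:, None]], axis=1
    )
    edge_emb = np.maximum(agg @ W + b, 0.0)

    # Rule (2): each node takes the mean of its adjacent edge embeddings.
    new_nodes = np.zeros_like(node_emb)
    counts = np.zeros(len(node_emb))
    for e, (i, j) in enumerate(edges):
        new_nodes[i] += edge_emb[e]
        new_nodes[j] += edge_emb[e]
        counts[i] += 1
        counts[j] += 1
    has_edges = counts > 0
    new_nodes[has_edges] /= counts[has_edges, None]
    # Assumption: a node with no edges keeps its previous embedding.
    new_nodes[~has_edges] = node_emb[~has_edges]
    return new_nodes, edge_emb
```

Calling this function repeatedly on its own output gives the iterative refinement of node and edge embeddings; in a real model the layer weights would be learned rather than passed in by hand.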
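According to the authors, this representation generalizes and transfers well ("rich and transferable representations of the chip"). Even when used directly, without fine-tuning on the specific chip design, it gives decent results. Finally, the overall network architecture is as follows: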
[Figure: overall network architecture; image not preserved in this copy]
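The sequential formulation described earlier, where blocks are placed one at a time and the reward is 0 until the terminal step, can be sketched as a toy episode loop. Everything named here (`run_episode`, `make_random_policy`, `toy_reward`, the square grid) is hypothetical, and the toy reward is only a stand-in, not the paper's actual reward function.

```python
import random

def run_episode(num_blocks, grid_size, policy, final_reward):
    """Place blocks one by one on a grid; only the last step is rewarded."""
    if not 1 <= num_blocks <= grid_size * grid_size:
        raise ValueError("need between 1 and grid_size**2 blocks")
    occupied = set()
    placements, rewards = [], []
    for block in range(num_blocks):
        # Only cells that are still free are legal actions.
        free = [
            (r, c)
            for r in range(grid_size)
            for c in range(grid_size)
            if (r, c) not in occupied
        ]
        cell = policy(block, free)
        occupied.add(cell)
        placements.append(cell)
        rewards.append(0.0)  # intermediate steps carry no reward
    # The terminal step receives the reward for the whole placement.
    rewards[-1] = final_reward(placements)
    return placements, rewards

def make_random_policy(rng):
    """Hypothetical stand-in for a trained policy: pick a free cell at random."""
    return lambda block, free: rng.choice(free)

def toy_reward(placements):
    """Toy stand-in reward: negative Manhattan distance between consecutive blocks."""
    return -float(sum(
        abs(a[0] - b[0]) + abs(a[1] - b[1])
        for a, b in zip(placements, placements[1:])
    ))

placements, rewards = run_episode(
    4, 3, make_random_policy(random.Random(0)), toy_reward
)
```

Because every reward except the last is 0, the learning signal for all earlier placement decisions has to come from that single terminal value, which is one reason a pre-trained state encoder that already predicts reward is useful.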
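Summary: A very impressive application of reinforcement learning; it has already been used to design Google's TPU. Not much more to say, it is simply awesome.
Open questions: I don't understand graph convolution, nor the chip-related background knowledge and algorithms.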