Let’s talk about Dynamic Movement Primitives (DMPs) for robot learning from demonstration. This article assumes that readers have a background in control theory and robotics. (Updating…)

The Basics of DMP

Dynamic movement primitives (DMPs) are a method of trajectory control and planning. They were motivated by the desire to represent complex motor actions in a form that can be flexibly adjusted without manual parameter tuning and without having to worry about instability.

To begin with, consider the following second-order dynamical system:

$$\tau\dot{y} = z$$

$$\tau\dot{z} = \alpha(\beta(g-y)-z)$$

For simplicity, take $\tau = 1$. The equations above then reduce to a linear time-invariant system, which is well studied in linear control theory. Its only equilibrium is $z=0$, $y=g$. Here $g$ is the goal position, and $\alpha$, $\beta$ are positive constants chosen so that the system converges without oscillation: $\alpha = 4\beta$ gives critical damping, and $\alpha > 4\beta > 0$ gives an overdamped response. This is why the DMP is always stable at the goal position: its dynamics are dominated by a stable linear system.
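The convergence of this linear system can be checked numerically. The following is a minimal sketch, not part of the original article: the function name `simulate_point_attractor`, the forward Euler integrator, the step size and the gains (alpha = 25, beta = alpha / 4, the critically damped case) are all assumptions chosen for illustration.

```python
def simulate_point_attractor(y0, g, alpha=25.0, beta=6.25, tau=1.0,
                             dt=0.001, duration=2.0):
    """Integrate tau*dy/dt = z, tau*dz/dt = alpha*(beta*(g - y) - z).

    Forward Euler is used for simplicity (an assumption of this sketch).
    The default gains satisfy alpha = 4*beta, i.e. critical damping.
    """
    y, z = y0, 0.0
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        dy = z / tau
        dz = alpha * (beta * (g - y) - z) / tau
        y += dy * dt
        z += dz * dt
        trajectory.append(y)
    return trajectory


# Start at y = 0 and move towards the goal g = 1.
trajectory = simulate_point_attractor(y0=0.0, g=1.0)
print(f"final position: {trajectory[-1]:.6f}")
print(f"largest position reached: {max(trajectory):.6f}")
```

With these gains the position should approach the goal monotonically, without overshooting it.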

To make the DMP fit an arbitrary trajectory, we add a term to the equations above in a way that does not affect stability:

$$\tau\dot{y} = z$$

$$\tau\dot{z} = \alpha(\beta(g-y)-z) + f(x)$$

where

$$\tau\dot{x} = -\alpha_{x}x$$

Here $\alpha_x > 0$ is a constant parameter of the dynamics of $x$, which is called the canonical system. As we can see, $x$ inevitably decays to zero as time elapses. If $f(x)$ is smooth and satisfies $f(0) = 0$, the added term does not compromise the stability of the DMP. We can define the nonlinear function $f$ (also called the ‘forcing function’) as:
$$f(x) = \frac{\sum_{i=1}^{N}\psi_{i}w_{i}}{\sum_{i=1}^{N}\psi_{i}}\,x\,(g-y(0))$$

where

$$\psi_{i} = \exp(-h_{i}(x-c_{i})^{2})$$

$w_{i}$ is the weight of the basis function $\psi_{i}$. You may recognize that the $\psi_i$ equation above defines a Gaussian centered at $c_i$, where $h_i$ sets the width: a larger $h_i$ gives a narrower Gaussian, with variance $1/(2h_i)$. So our forcing function is a set of Gaussians that are ‘activated’ in turn as the canonical system $x$ converges to its target.
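The forcing function can be evaluated in a few lines. The sketch below is illustrative only: the helper names `make_basis` and `forcing`, the placement of the centers along the decay of x, and the width heuristic h_i = N / c_i^2 are assumptions that do not come from the article.

```python
import math


def make_basis(n_basis, alpha_x=1.0):
    """Hypothetical helper: place centers along the decay of x.

    Centers are x-values at equally spaced times in [0, 1]. The width
    rule h_i = n_basis / c_i**2 is an ad hoc choice for this sketch.
    """
    centers = [math.exp(-alpha_x * i / (n_basis - 1)) for i in range(n_basis)]
    widths = [n_basis / c ** 2 for c in centers]
    return centers, widths


def forcing(x, weights, centers, widths, g, y0):
    """Normalized weighted sum of Gaussians, scaled by x * (g - y0)."""
    psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
    total = sum(psi)
    if total < 1e-12:  # guard against division by (almost) zero
        return 0.0
    weighted = sum(p * w for p, w in zip(psi, weights))
    return weighted / total * x * (g - y0)


centers, widths = make_basis(n_basis=5)
# With identical weights the normalized sum equals that weight,
# so the result is just weight * x * (g - y0).
value = forcing(0.5, [2.0] * 5, centers, widths, g=1.0, y0=0.0)
at_rest = forcing(0.0, [2.0] * 5, centers, widths, g=1.0, y0=0.0)
print(value, at_rest)
```

Because of the factor x, the forcing term vanishes once the canonical system has decayed, which is the property f(0) = 0 that the stability argument relies on.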

The parameter $\tau$ can be used as a temporal scaling term. To slow the system down, set $\tau$ greater than 1; to speed the dynamics up, set it between 0 and 1.
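One way to see why the time constant acts as a pure time scale is the short derivation below. It is a sketch under two assumptions not stated in the article: the initial phase is written as $x(0)$, and $s = t/\tau$ is a rescaled time introduced only for this argument.

```latex
% Closed-form solution of the canonical system \tau \dot{x} = -\alpha_x x:
x(t) = x(0)\, e^{-\alpha_x t / \tau}

% Substituting t = \tau s (so that \tau \frac{d}{dt} = \frac{d}{ds})
% removes \tau from all three equations:
\frac{dy}{ds} = z, \qquad
\frac{dz}{ds} = \alpha(\beta(g - y) - z) + f(x), \qquad
\frac{dx}{ds} = -\alpha_x x

% Hence, with \hat{y} the solution for \tau = 1 and the same values of
% y, x and \tau\dot{y} at t = 0:
y(t) = \hat{y}(t / \tau)
```

Doubling the time constant therefore stretches the same path over twice the duration, without changing its shape.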

To illustrate the basic DMP, the figure below shows the response when all weights are zero ($w = 0$), in which case the forcing term vanishes and the system reduces to the linear system above:
[Figure: DMP response with $w = 0$]

Learning a DMP: Locally Weighted Regression (LWR)

Now we have a forcing term that can make the system follow an arbitrarily shaped trajectory as it converges to a target point, along with temporal and spatial scalability. How do we set up the system to follow a trajectory that we specify? Suppose we are given a demonstrated trajectory $\{y_d,\dot{y}_d,\ddot{y}_d\}$. Substituting it into the DMP equations and solving for the forcing term gives the desired forcing function:

$$f_d = \tau^2\ddot{y}_d-\alpha(\beta(g-y_d)-\tau\dot{y}_d)$$

Since the forcing term is a weighted sum of basis functions that are activated over time, we can use an optimization technique such as locally weighted regression (LWR) to choose the weights so that the forcing function matches $f_{d}$. For each basis function, LWR minimizes:

$$J_i=\sum_{t=0}^{T}\psi_i(t)(f_d(t)-w_{i}\xi(t))^2$$

where

$$\xi(t) = x(t)(g-y_d(0))$$

Let

$$S = (\xi(0),\xi(1),\dots,\xi(T))^{\top}, \quad \mathrm{T}_i = \mathrm{diag}(\psi_{i}(0),\psi_{i}(1),\dots,\psi_{i}(T)), \quad F_d = (f_d(0),f_d(1),\dots,f_d(T))^{\top}$$

Then, in compact form,

$$J_{i} = (F_d-w_{i}S)^{\top}\mathrm{T}_i(F_d-w_iS)$$

which is minimized by

$$w_i = (S^{\top}\mathrm{T}_iS)^{-1}S^{\top}\mathrm{T}_iF_d$$
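Because S is a vector, the matrix expression for each weight collapses to a ratio of two weighted sums. The sketch below computes it directly; the function names `desired_forcing` and `fit_weights`, the basis placement and the synthetic target (a forcing signal equal to exactly 3 times xi) are assumptions made for illustration, not the article's own setup.

```python
import math


def desired_forcing(y, dy, ddy, g, alpha=25.0, beta=6.25, tau=1.0):
    """f_d = tau^2 * ddy - alpha * (beta * (g - y) - tau * dy), per sample."""
    return [tau ** 2 * a - alpha * (beta * (g - p) - tau * v)
            for p, v, a in zip(y, dy, ddy)]


def fit_weights(f_d, xs, centers, widths, g, y0):
    """Solve each scalar weighted least-squares problem in closed form."""
    xi = [x * (g - y0) for x in xs]
    weights = []
    for c, h in zip(centers, widths):
        psi = [math.exp(-h * (x - c) ** 2) for x in xs]
        num = sum(p * s * f for p, s, f in zip(psi, xi, f_d))  # S^T T_i F_d
        den = sum(p * s * s for p, s in zip(psi, xi))          # S^T T_i S
        weights.append(num / den if den > 1e-12 else 0.0)
    return weights


# Hypothetical setup: 5 basis functions, phase sampled over one second.
n_basis, alpha_x, g, y0 = 5, 1.0, 1.0, 0.0
centers = [math.exp(-alpha_x * i / (n_basis - 1)) for i in range(n_basis)]
widths = [n_basis / c ** 2 for c in centers]  # ad hoc width heuristic
xs = [math.exp(-alpha_x * k / 100) for k in range(101)]

# Synthetic target: a forcing signal that equals 3 * xi exactly,
# so every fitted weight should come out as 3.
f_target = [3.0 * x * (g - y0) for x in xs]
weights = fit_weights(f_target, xs, centers, widths, g, y0)
print(weights)

# A demonstration that sits at the goal with zero velocity needs no forcing.
f_rest = desired_forcing([1.0, 1.0], [0.0, 0.0], [0.0, 0.0], g=1.0)
```

In practice `f_d` would come from `desired_forcing` applied to a recorded demonstration, with `xs` sampled from the canonical system at the same time steps.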
The performance is illustrated by the following figures, where $t$ is the desired trajectory and $y$ is the trajectory generated by the DMP:
[Figures: desired trajectory $t$ versus DMP output $y$]
That covers following a given trajectory exactly, which is often not what is needed in practice. The strength of the DMP framework is that the trajectory is generated by a dynamical system. This lets us do simple things to get really neat performance, such as scaling the trajectory spatially on the fly simply by changing the goal, rather than rescaling the entire trajectory.
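Spatial scaling by changing the goal can be demonstrated with a small rollout. This is a sketch under stated assumptions: the `rollout` function, the forward Euler integrator, the gains, the basis placement and the weight values are all hypothetical and were picked only to produce a non-trivial shape.

```python
import math


def rollout(weights, centers, widths, y0, g, alpha=25.0, beta=6.25,
            alpha_x=4.0, tau=1.0, dt=0.001, duration=3.0):
    """Integrate the full DMP (y, z and the phase x) with forward Euler."""
    y, z, x = y0, 0.0, 1.0
    trajectory = [y]
    for _ in range(int(round(duration / dt))):
        psi = [math.exp(-h * (x - c) ** 2) for h, c in zip(widths, centers)]
        f = sum(p * w for p, w in zip(psi, weights)) / sum(psi) * x * (g - y0)
        dy = z / tau
        dz = (alpha * (beta * (g - y) - z) + f) / tau
        dx = -alpha_x * x / tau
        y += dy * dt
        z += dz * dt
        x += dx * dt
        trajectory.append(y)
    return trajectory


# Hypothetical basis and weights (not learned from any demonstration).
n_basis, alpha_x = 5, 4.0
centers = [math.exp(-alpha_x * i / (n_basis - 1)) for i in range(n_basis)]
widths = [n_basis / c ** 2 for c in centers]  # ad hoc width heuristic
weights = [50.0, -80.0, 120.0, -40.0, 30.0]

# Same weights, two different goals, both starting from y0 = 0.
path_g1 = rollout(weights, centers, widths, y0=0.0, g=1.0)
path_g2 = rollout(weights, centers, widths, y0=0.0, g=2.0)

# Largest deviation between the second path and twice the first one.
max_gap = max(abs(b - 2.0 * a) for a, b in zip(path_g1, path_g2))
print(path_g1[-1], path_g2[-1], max_gap)
```

Starting from zero, both the spring term and the forcing term are proportional to the goal, so doubling the goal should double the whole path while keeping its shape. Only the goal argument changes between the two calls; the weights are reused as they are.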


Thanks
