【Question title】: How to change the OS on an existing Service Fabric cluster?
【Posted】: 2019-10-12 10:18:33
【Question】:

I'm trying to change my VMSS from:

    "imageReference": {
      "publisher": "MicrosoftWindowsServer",
      "offer": "WindowsServer",
      "sku": "2016-Datacenter-with-Containers",
      "version": "latest"
    }

to:

    "imageReference": {
      "publisher": "MicrosoftWindowsServer",
      "offer": "WindowsServerSemiAnnual",
      "sku": "Datacenter-Core-1803-with-Containers-smalldisk",
      "version": "latest"
    }

The first thing I tried was:

    Update-AzureRmVmss -ResourceGroupName "DevServiceFabric" -VMScaleSetName "HTTP" -ImageReferenceSku Datacenter-Core-1803-with-Containers-smalldisk -ImageReferenceOffer WindowsServerSemiAnnual

This gave me the error:

    Update-AzureRmVmss : Changing property 'imageReference.offer' is not allowed. ErrorCode: PropertyChangeNotAllowed

This is confirmed in the documentation: you can only set the offer when the scale set is created.
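The rule behind that error can be checked before calling the update cmdlet. The sketch below compares two `imageReference` blocks and reports whether the change stays within fields that can be updated in place. It assumes, based on the error above and the answers in this thread, that `publisher` and `offer` are fixed at creation time while `sku` and `version` are not. `Test-ImageReferenceInPlaceChange` is a hypothetical helper, not part of AzureRM or Az.

```powershell
# Hypothetical helper: not part of the AzureRM or Az modules.
# Assumption: publisher and offer are fixed when the scale set is created,
# while sku and version may be updated in place.
function Test-ImageReferenceInPlaceChange {
    param(
        [Parameter(Mandatory)] [hashtable] $Current,
        [Parameter(Mandatory)] [hashtable] $Desired
    )
    foreach ($field in 'publisher', 'offer') {
        # -ne on strings is case-insensitive in PowerShell
        if ($Current[$field] -ne $Desired[$field]) {
            return $false
        }
    }
    return $true
}

# The two image references from the question.
$current = @{
    publisher = 'MicrosoftWindowsServer'
    offer     = 'WindowsServer'
    sku       = '2016-Datacenter-with-Containers'
    version   = 'latest'
}
$desired = @{
    publisher = 'MicrosoftWindowsServer'
    offer     = 'WindowsServerSemiAnnual'
    sku       = 'Datacenter-Core-1803-with-Containers-smalldisk'
    version   = 'latest'
}

# Returns $false here because the offer differs.
Test-ImageReferenceInPlaceChange -Current $current -Desired $desired
```

A `$false` result means the scale set would have to be replaced rather than updated.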

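Next I tried Add-AzureRmServiceFabricNodeType to add a new node type, thinking I could then delete the old one. However, this command doesn't appear to let you set the OS image. You can only set the VM SKU (in other words, all VMs in the cluster must have the same OS).

Is there a way to make this change without deleting the whole cluster and starting from scratch?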

【Comments】:

Tags: azure azure-service-fabric azure-vm-scale-set


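【Answer 1】:

Edit: if you can stay within the current publisher and offer, you can switch the OS quite easily by changing only the SKU. See the answer by Mike.

If you really do need to change the offer, it can be done:

Upgrade the size and operating system of the primary node type VMs.

Note that there are many things to take into account, such as the availability level. The cluster will also be unreachable from the outside for a while.

In short:

• Add a second scale set with the desired OS to the primary node type
• Disable the nodes in the old scale set, then remove it
• Switch over the load balancer
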
    # Variables.
    $groupname = "sfupgradetestgroup"
    $clusterloc="southcentralus"  
    $subscriptionID="<your subscription ID>"
    
    # sign in to your Azure account and select your subscription
    Login-AzAccount -SubscriptionId $subscriptionID 
    
    # Create a new resource group for your deployment and give it a name and a location.
    New-AzResourceGroup -Name $groupname -Location $clusterloc
    
    # Deploy the two node type cluster.
    New-AzResourceGroupDeployment -ResourceGroupName $groupname -TemplateParameterFile "C:\temp\cluster\Deploy-2NodeTypes-2ScaleSets.parameters.json" `
        -TemplateFile "C:\temp\cluster\Deploy-2NodeTypes-2ScaleSets.json" -Verbose
    
    # Connect to the cluster and check the cluster health.
    $ClusterName= "sfupgradetest.southcentralus.cloudapp.azure.com:19000"
    $thumb="F361720F4BD5449F6F083DDE99DC51A86985B25B"
    
    Connect-ServiceFabricCluster -ConnectionEndpoint $ClusterName -KeepAliveIntervalInSec 10 `
        -X509Credential `
        -ServerCertThumbprint $thumb  `
        -FindType FindByThumbprint `
        -FindValue $thumb `
        -StoreLocation CurrentUser `
        -StoreName My 
    
    Get-ServiceFabricClusterHealth
    
    # Deploy a new scale set into the primary node type.  Create a new load balancer and public IP address for the new scale set.
    New-AzResourceGroupDeployment -ResourceGroupName $groupname -TemplateParameterFile "C:\temp\cluster\Deploy-2NodeTypes-3ScaleSets.parameters.json" `
        -TemplateFile "C:\temp\cluster\Deploy-2NodeTypes-3ScaleSets.json" -Verbose
    
    # Check the cluster health again. All 15 nodes should be healthy.
    Get-ServiceFabricClusterHealth
    
    # Disable the nodes in the original scale set.
    $nodeNames = @("_NTvm1_0","_NTvm1_1","_NTvm1_2","_NTvm1_3","_NTvm1_4")
    
    Write-Host "Disabling nodes..."
    foreach($name in $nodeNames){
        Disable-ServiceFabricNode -NodeName $name -Intent RemoveNode -Force
    }
    
    Write-Host "Checking node status..."
    foreach($name in $nodeNames){
    
        $state = Get-ServiceFabricNode -NodeName $name 
    
        $loopTimeout = 50
    
        do{
            Start-Sleep 5
            $loopTimeout -= 1
            $state = Get-ServiceFabricNode -NodeName $name
            Write-Host "$name state: " $state.NodeDeactivationInfo.Status
        }
    
        while (($state.NodeDeactivationInfo.Status -ne "Completed") -and ($loopTimeout -ne 0))
    
    
        if ($state.NodeStatus -ne [System.Fabric.Query.NodeStatus]::Disabled)
        {
            # Write-Error takes a single message string, so interpolate the status into it.
            Write-Error "$name node deactivation failed with state $($state.NodeStatus)"
            exit
        }
    }
    
    # Remove the scale set
    $scaleSetName="NTvm1"
    Remove-AzVmss -ResourceGroupName $groupname -VMScaleSetName $scaleSetName -Force
    Write-Host "Removed scale set $scaleSetName"
    
    $lbname="LB-sfupgradetest-NTvm1"
    $oldPublicIpName="PublicIP-LB-FE-0"
    $newPublicIpName="PublicIP-LB-FE-2"
    
    # Store DNS settings of public IP address related to old Primary NodeType into variable 
    $oldprimaryPublicIP = Get-AzPublicIpAddress -Name $oldPublicIpName  -ResourceGroupName $groupname
    
    $primaryDNSName = $oldprimaryPublicIP.DnsSettings.DomainNameLabel
    
    $primaryDNSFqdn = $oldprimaryPublicIP.DnsSettings.Fqdn
    
    # Remove Load Balancer related to old Primary NodeType. This will cause a brief period of downtime for the cluster
    Remove-AzLoadBalancer -Name $lbname -ResourceGroupName $groupname -Force
    
    # Remove the old public IP
    Remove-AzPublicIpAddress -Name $oldPublicIpName -ResourceGroupName $groupname -Force
    
    # Replace DNS settings of Public IP address related to new Primary Node Type with DNS settings of Public IP address related to old Primary Node Type
    $PublicIP = Get-AzPublicIpAddress -Name $newPublicIpName  -ResourceGroupName $groupname
    $PublicIP.DnsSettings.DomainNameLabel = $primaryDNSName
    $PublicIP.DnsSettings.Fqdn = $primaryDNSFqdn
    Set-AzPublicIpAddress -PublicIpAddress $PublicIP
    
    # Check the cluster health
    Get-ServiceFabricClusterHealth
    
    # Remove node state for the deleted nodes.
    foreach($name in $nodeNames){
        # Remove the node from the cluster
        Remove-ServiceFabricNodeState -NodeName $name -TimeoutSec 300 -Force
        Write-Host "Removed node state for node $name"
    }
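
The hard-coded `$nodeNames` list in the script above can also be generated. This is a minimal sketch. It assumes node names follow the `_<scale set name>_<instance index>` pattern seen in the script (for example `_NTvm1_0`) and that the indexes are contiguous from 0, which may not hold after scale-in operations. `Get-NodeNamesForScaleSet` is a hypothetical helper, not a Service Fabric cmdlet.

```powershell
# Hypothetical helper: builds node names as _<ScaleSetName>_<index>.
# Assumption: instance indexes run from 0 to InstanceCount - 1 with no gaps.
function Get-NodeNamesForScaleSet {
    param(
        [Parameter(Mandatory)] [string] $ScaleSetName,
        [Parameter(Mandatory)] [ValidateRange(1, 1000)] [int] $InstanceCount
    )
    # Braces around ScaleSetName keep the following underscore out of the variable name.
    0..($InstanceCount - 1) | ForEach-Object { "_${ScaleSetName}_$_" }
}

# Wrap in @() so the result is an array even for a single node.
$nodeNames = @(Get-NodeNamesForScaleSet -ScaleSetName 'NTvm1' -InstanceCount 5)
$nodeNames -join ', '
```

When a cluster connection is available, the names reported by `Get-ServiceFabricNode` are the more reliable source. The generated list is only a convenience.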
    
    

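【Comments】:

• Cool! I also found that you can use the Update-AzureRmVmss command to change the OS on an existing scale set, provided you don't change the offer, and that each offer contains many more OS SKUs than the portal UI shows. So I think I can find an OS that works for me (basically I just need to upgrade to the 1803 kernel).
• Command to list the SKUs within an offer: Get-AzureRmVMImageSku -Location 'westus2' -PublisherName MicrosoftWindowsServer -Offer WindowsServer

【Answer 2】:

Here is another (simpler) answer for those who want to switch to a different OS and can settle for an OS image within the same publisher and offer. You can get the list of available OS SKUs with the following command: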

    Get-AzureRmVMImageSku -Location 'westus2' -PublisherName MicrosoftWindowsServer -Offer WindowsServer
    

Then you can upgrade the cluster to use that image:

    Update-AzureRmVmss -ResourceGroupName "DevServiceFabric" -VMScaleSetName "HTTP" -ImageReferenceSku 2019-Datacenter-Core-with-Containers-smalldisk
    

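The command takes an hour or more to run.

I also ran into some SKUs that produced an "image not found" error even though they appear in the list. I'm not sure what causes that. The SKU shown above, however, worked for me.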
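
Because the SKU list for an offer is long, it can help to narrow it before picking one. The sketch below assumes the SKU names have already been collected into a string array, for instance from the `Skus` property of the objects returned by `Get-AzureRmVMImageSku` (that property name is an assumption here). `Select-ContainerSku` is a hypothetical helper, and the sample names are illustrative.

```powershell
# Hypothetical helper: keeps only SKU names matching a wildcard pattern.
function Select-ContainerSku {
    param(
        [Parameter(Mandatory)] [string[]] $SkuName,
        [string] $Pattern = '*with-Containers*'
    )
    # -like performs a case-insensitive wildcard match.
    $SkuName | Where-Object { $_ -like $Pattern }
}

# Sample input; in practice this would come from the SKU listing command.
$skus = @(
    '2016-Datacenter',
    '2016-Datacenter-with-Containers',
    '2019-Datacenter-Core-with-Containers-smalldisk'
)

$containerSkus = @(Select-ContainerSku -SkuName $skus)
$containerSkus
```

Filtering by name only shortens the list. It does not tell you whether a given SKU will actually deploy, so the "image not found" case still has to be handled by trying another SKU.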

【Comments】:

• This is cool too, and much easier if it applies to your scenario! I'll reference it in my answer above.