【Posted】: 2021-03-26 07:05:07
【Question】:
I configured an AKS cluster to use a system-assigned managed identity to access other Azure resources:
resource "azurerm_subnet" "aks" {
  name                 = var.aks_subnet_name
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = module.network.vnet_name
  address_prefix       = var.aks_subnet
  service_endpoints    = ["Microsoft.KeyVault"]
}

resource "azurerm_kubernetes_cluster" "aks_main" {
  name                = module.aks_name.result
  depends_on          = [azurerm_subnet.aks]
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  dns_prefix          = "aks-${local.name}"
  kubernetes_version  = var.k8s_version

  addon_profile {
    oms_agent {
      # For monitoring containers
      enabled                    = var.addons.oms_agent
      log_analytics_workspace_id = azurerm_log_analytics_workspace.example.id
    }
    kube_dashboard {
      enabled = true
    }
    azure_policy {
      # If we want to enforce policy definitions in the future
      # Check requirements https://docs.microsoft.com/en-ie/azure/governance/policy/concepts/policy-for-kubernetes
      enabled = var.addons.azure_policy
    }
  }

  default_node_pool {
    name                 = "default"
    orchestrator_version = var.k8s_version
    node_count           = var.default_node_pool.node_count
    vm_size              = var.default_node_pool.vm_size
    type                 = "VirtualMachineScaleSets"
    availability_zones   = var.default_node_pool.zones
    # availability_zones = ["1", "2", "3"]
    max_pods              = 250
    os_disk_size_gb       = 128
    vnet_subnet_id        = azurerm_subnet.aks.id
    node_labels           = var.default_node_pool.labels
    enable_auto_scaling   = var.default_node_pool.cluster_auto_scaling
    min_count             = var.default_node_pool.cluster_auto_scaling_min_count
    max_count             = var.default_node_pool.cluster_auto_scaling_max_count
    enable_node_public_ip = false
  }

  # Configure AKS to use a system-assigned managed identity to access
  # other Azure resources
  identity {
    type = "SystemAssigned"
  }

  network_profile {
    load_balancer_sku = "standard"
    outbound_type     = "loadBalancer"
    network_plugin    = "azure"
    # if non-azure network policies
    # https://azure.microsoft.com/nl-nl/blog/integrating-azure-cni-and-calico-a-technical-deep-dive/
    network_policy     = "calico"
    dns_service_ip     = "10.0.0.10"
    docker_bridge_cidr = "172.17.0.1/16"
    service_cidr       = "10.0.0.0/16"
  }

  lifecycle {
    ignore_changes = [
      default_node_pool,
      windows_profile,
    ]
  }
}
I want to use that managed identity (the service principal created by the AKS cluster code above) and give it a role such as Network Contributor on the subnet:
resource "azurerm_role_assignment" "aks_subnet" {
  # Give the identity created for AKS access to the AKS subnet by
  # assigning it the Network Contributor role
  scope                = azurerm_subnet.aks.id
  role_definition_name = "Network Contributor"
  principal_id         = azurerm_kubernetes_cluster.aks_main.identity[0].principal_id
  # principal_id = azurerm_kubernetes_cluster.aks_main.kubelet_identity[0].object_id
  # principal_id = data.azurerm_user_assigned_identity.test.principal_id
  # skip_service_principal_aad_check = true
}
But the output I get after terraform apply is:
Error: authorization.RoleAssignmentsClient#Create: Failure responding
to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error.
Status=403 Code="AuthorizationFailed"
Message="The client 'afd5bd09-c294-4597-9c90-e1ee293e5f3a' with object id
'afd5bd09-c294-4597-9c90-e1ee293e5f3a' does not have authorization
to perform action 'Microsoft.Authorization/roleAssignments/write'
over scope '/subscriptions/77dfff95-fbd3-4a15-b97a-b7182939e61a/resourceGroups/rhd-spec-prod-main-6loe4lpkr0hd8/providers/Microsoft.Network/virtualNetworks/rhd-spec-prod-main-wdaht6cn7s3s8/subnets/aks-subnet/providers/Microsoft.Authorization/roleAssignments/8733864c-a5f7-a6a9-a61d-6393989f0ad1'
or the scope is invalid. If access was recently granted, please refresh your credentials."
on aks.tf line 23, in resource "azurerm_role_assignment" "aks_subnet":
23: resource "azurerm_role_assignment" "aks_subnet" {
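It seems that the service principal being created does not have sufficient permissions to perform a role assignment on the subnet, or my scope attribute may be wrong. I am passing the AKS subnet ID there.
What am I doing wrong?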
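UPDATE
Looking at how roles are assigned to managed identities, it looks like we can only assign roles related to subscriptions, resource groups, storage services, SQL services, and Key Vault.
Reading here: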
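Before you can use the managed identity, it has to be configured. There are two steps:
1. Assign a role for the identity, to associate it with the subscription that will be used to run Terraform. This step gives the identity permission to access Azure Resource Manager (ARM) resources.
2. Configure access control for one or more Azure resources. For example, if you use a key vault and a storage account, you will need to configure the vault and container separately.
Before you can create a resource with a managed identity and then assign an RBAC role, your account needs sufficient permissions. You need to be a member of the account Owner role, or have Contributor plus User Access Administrator roles.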
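As a rough illustration of the permission requirement quoted above, the sketch below grants the User Access Administrator role at subscription scope to whichever principal runs Terraform. It is only a sketch under assumptions: the variable `terraform_principal_object_id` and the resource name `terraform_uaa` are hypothetical, and it would have to be applied by an identity that already holds Owner (or an equivalent role) on the subscription.

```hcl
# Hypothetical sketch: apply this with an identity that is already allowed
# to write role assignments on the subscription (for example, an Owner).
data "azurerm_subscription" "current" {}

# Assumed input: object ID of the principal that runs `terraform apply`.
variable "terraform_principal_object_id" {
  description = "Object ID of the principal that runs terraform apply"
  type        = string
}

# Lets that principal create role assignments (roleAssignments/write)
# anywhere under the subscription.
resource "azurerm_role_assignment" "terraform_uaa" {
  scope                = data.azurerm_subscription.current.id
  role_definition_name = "User Access Administrator"
  principal_id         = var.terraform_principal_object_id
}
```

Subscription scope mirrors the quoted text; a narrower scope, such as the resource group that contains the virtual network, may be enough if the only assignment needed is the one on the subnet.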
Trying to proceed accordingly, I defined this section of code:
resource "null_resource" "wait_for_resource_to_be_ready" {
  provisioner "local-exec" {
    command = "sleep 60"
  }
  depends_on = [
    azurerm_kubernetes_cluster.aks_main
  ]
}

data "azurerm_subscription" "current" {}

# FETCHING THE IDENTITY CREATED ON AKS CLUSTER
data "azurerm_user_assigned_identity" "test" {
  name                = "${azurerm_kubernetes_cluster.aks_main.name}-agentpool"
  resource_group_name = azurerm_kubernetes_cluster.aks_main.node_resource_group
}

data "azurerm_role_definition" "contributor" {
  name = "Network Contributor"
}

resource "azurerm_role_assignment" "aks_subnet" {
  # Give the identity created for AKS access to the AKS subnet by
  # assigning it the Network Contributor role
  # name = azurerm_kubernetes_cluster.aks_main.name
  # scope = var.aks_subnet_name # azurerm_subnet.aks.id var.aks_subnet
  scope = data.azurerm_subscription.current.id
  # role_definition_name = "Network Contributor"
  role_definition_id = "${data.azurerm_subscription.current.id}${data.azurerm_role_definition.contributor.id}"
  # principal_id = azurerm_kubernetes_cluster.aks_main.identity[0].principal_id
  # principal_id = azurerm_kubernetes_cluster.aks_main.kubelet_identity[0].object_id
  principal_id                     = data.azurerm_user_assigned_identity.test.principal_id
  skip_service_principal_aad_check = true
  depends_on = [
    null_resource.wait_for_resource_to_be_ready
  ]
}
The Terraform workflow tries to create the role...
> terraform_0.12.29 apply "prod_Infrastructure.plan"
null_resource.wait_for_resource_to_be_ready: Creating...
null_resource.wait_for_resource_to_be_ready: Provisioning with 'local-exec'...
null_resource.wait_for_resource_to_be_ready (local-exec): Executing: ["/bin/sh" "-c" "sleep 60"]
null_resource.wait_for_resource_to_be_ready: Still creating... [10s elapsed]
null_resource.wait_for_resource_to_be_ready: Still creating... [20s elapsed]
null_resource.wait_for_resource_to_be_ready: Still creating... [30s elapsed]
null_resource.wait_for_resource_to_be_ready: Still creating... [40s elapsed]
null_resource.wait_for_resource_to_be_ready: Still creating... [50s elapsed]
null_resource.wait_for_resource_to_be_ready: Still creating... [1m0s elapsed]
null_resource.wait_for_resource_to_be_ready: Creation complete after 1m0s [id=8505830187297683728]
azurerm_role_assignment.aks_subnet: Creating...
This time the subscription is passed as the scope, but I end up with the same AuthorizationFailed error.
Error: authorization.RoleAssignmentsClient#Create: Failure responding to request: StatusCode=403 -- Original Error: autorest/azure: Service returned an error. Status=403 Code="AuthorizationFailed" Message="The client 'afd5bd09-c294-4597-9c90-e1ee293e5f3a' with object id 'afd5bd09-c294-4597-9c90-e1ee293e5f3a' does not have authorization to perform action 'Microsoft.Authorization/roleAssignments/write' over scope '/subscriptions/77dfff95-fbd3-4a15-b97a-b7182939e61a' or the scope is invalid. If access was recently granted, please refresh your credentials."
on aks.tf line 145, in resource "azurerm_role_assignment" "aks_subnet":
145: resource "azurerm_role_assignment" "aks_subnet" {
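I am not at all sure how to verify this statement:
Before you can create a resource with a managed identity and then assign an RBAC role, your account needs sufficient permissions. You need to be a member of the account Owner role, or have Contributor plus User Access Administrator roles.
By the way, I have the Owner role on the subscription I am working with.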
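One way to check which account that statement applies to is to print the principal the azurerm provider is actually authenticated as and compare it with the ID in the error messages above. A minimal sketch, assuming made-up output names (`terraform_client_id`, `terraform_object_id`):

```hcl
# Exposes details of the identity the azurerm provider is currently using.
data "azurerm_client_config" "current" {}

# Application (client) ID of the authenticated principal.
output "terraform_client_id" {
  value = data.azurerm_client_config.current.client_id
}

# Object ID of the authenticated principal; this is the value to compare
# with the object id reported in the AuthorizationFailed message.
output "terraform_object_id" {
  value = data.azurerm_client_config.current.object_id
}
```

If `terraform_object_id` matches the ID in the error, the role assignment is being attempted by that principal, so the roles that matter are the ones it holds rather than the ones held by a personal user account.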
UPDATE 2
The object ID referenced in both error messages above belongs to a service principal within my tenant. Here it is:
az ad sp show --id afd5bd09-c294-4597-9c90-e1ee293e5f3a
{
"accountEnabled": "True",
"addIns": [],
"alternativeNames": [],
"appDisplayName": "Product-xxxx-ServicePrincipal-Production",
"appId": "ff9c642c-06b9-47e2-9565-e3f6e782e14f",
"appOwnerTenantId": "xxxxxxxx",
"appRoleAssignmentRequired": false,
"appRoles": [],
"applicationTemplateId": null,
"deletionTimestamp": null,
"displayName": "Product-xxxx-ServicePrincipal-Production",
"errorUrl": null,
"homepage": null,
"informationalUrls": {
"marketing": null,
"privacy": null,
"support": null,
"termsOfService": null
},
"keyCredentials": [],
"logoutUrl": null,
"notificationEmailAddresses": [],
"oauth2Permissions": [],
# THIS IS THE OBJECT ID
"objectId": "afd5bd09-c294-4597-9c90-e1ee293e5f3a",
"objectType": "ServicePrincipal",
"odata.metadata": "https://graph.windows.net/15f996bf-aad1-451c-8d17-9b95d025eafc/$metadata#directoryObjects/@Element",
"odata.type": "Microsoft.DirectoryServices.ServicePrincipal",
"passwordCredentials": [],
"preferredSingleSignOnMode": null,
"preferredTokenSigningKeyEndDateTime": null,
"preferredTokenSigningKeyThumbprint": null,
"publisherName": "xxxxxxx",
"replyUrls": [],
"samlMetadataUrl": null,
"samlSingleSignOnSettings": null,
"servicePrincipalNames": [
"ff9c642c-06b9-47e2-9565-e3f6e782e14f"
],
"servicePrincipalType": "Application",
"signInAudience": "AzureADMyOrg",
"tags": [
"WindowsAzureActiveDirectoryIntegratedApp"
],
"tokenEncryptionKeyId": null
}
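Regarding its permissions, I am not sure whether they are sufficient. I would say yes, because it is used for several things in the subscription.
What about the Users Consent permissions? I have nothing there.
On the other hand, why is the process trying to assign the role using this service principal? I mean, the point of using a managed identity is to get rid of service principals, but perhaps the workflow uses this SP only to assign the role to the managed identity, and from then on access is handled by the managed identity (?)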
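To keep the identities apart, it may help to print the ones attached to the cluster next to the ID in the error. The sketch below only reuses attributes that already appear in the code above; the output names are hypothetical. It assumes, as the commented-out lines above suggest, that `identity` is the cluster's system-assigned identity and `kubelet_identity` is the one the node pool uses.

```hcl
# Principal ID of the cluster's system-assigned managed identity.
# This is the identity that would *receive* the Network Contributor role.
output "aks_identity_principal_id" {
  value = azurerm_kubernetes_cluster.aks_main.identity[0].principal_id
}

# Object ID of the kubelet (node pool) identity, assumed here to be the
# "<cluster name>-agentpool" identity looked up in the data source above.
output "aks_kubelet_identity_object_id" {
  value = azurerm_kubernetes_cluster.aks_main.kubelet_identity[0].object_id
}
```

If neither value equals the client ID in the AuthorizationFailed message, the failing caller is not one of the cluster's identities but the principal that Terraform itself runs as.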
【Comments】:
-
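A managed identity is ultimately a service principal. In this case the service principal (called a managed identity) is managed for you by Microsoft Azure AD. The purpose is for Azure to manage secrets and identities for developers, so they do not have to worry about tokens, secrets, and so on. docs.microsoft.com/en-us/azure/active-directory/…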