[Question title]: ASP.NET Core gets "Back-off restarting failed container" on AKS
[Posted]: 2021-01-20 03:57:32
[Question]:

I'm trying to deploy my first simple ASP.NET Core Web API on AKS (following this article).

Here is my YAML file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aexp
  labels:
    app: aexp
spec:
  replicas: 1
  selector:
    matchLabels:
      service: aexp
  template:
    metadata:
      labels:
        app: aexp
        service: aexp
    spec:
      containers:
        - name: aexp
          image: f2021.azurecr.io/aexp:v1
          imagePullPolicy: Always
          ports:
            - containerPort: 80
              protocol: TCP
          env:
            - name: ASPNETCORE_URLS
              value: http://+:80
---
apiVersion: v1
kind: Service
metadata:
  name: aexp
  labels:
    app: aexp
    service: aexp
spec:
  ports:
    - port: 80
      targetPort: 80
      protocol: TCP
  selector:
    service: aexp

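It looks simple and straightforward, but I don't understand why my pod gets "Back-off restarting failed container". Any suggestions for preventing this error? Thanks in advance.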

Name:         aexp-5b5b7b6464-5lfz4
Namespace:    default
Priority:     0
Node:         aks-nodepool1-38572550-vmss000000/10.240.0.4
Start Time:   Wed, 20 Jan 2021 10:01:52 +0700
Labels:       app=aexp
              pod-template-hash=5b5b7b6464
              service=aexp
Annotations:  <none>
Status:       Running
IP:           10.244.0.14
IPs:
  IP:           10.244.0.14
Controlled By:  ReplicaSet/aexp-5b5b7b6464
Containers:
  aexp:
    Container ID:   docker://25ffdb3ce92eeda465e1971daa363d6f532ac73ff82df2e9b3694a8949f50615
    Image:          f2021.azurecr.io/aexp:v1
    Image ID:       docker-pullable://f2021.azurecr.io/aexp@sha256:bf6aa2a47f5f857878280f5987192f1892e91e365b9e66df83538109b9e57c46
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 20 Jan 2021 10:33:47 +0700
      Finished:     Wed, 20 Jan 2021 10:33:47 +0700
    Ready:          False
    Restart Count:  11
    Environment:
      ASPNETCORE_URLS:  http://+:80
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-g4ks9 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  default-token-g4ks9:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-g4ks9
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  36m                  default-scheduler  Successfully assigned default/aexp-5b5b7b6464-5lfz4 to aks-nodepool1-38572550-vmss000000
  Normal   Pulled     35m (x4 over 36m)    kubelet            Successfully pulled image "f2021.azurecr.io/aexp:v1"
  Normal   Created    35m (x4 over 36m)    kubelet            Created container aexp
  Normal   Started    35m (x4 over 36m)    kubelet            Started container aexp
  Normal   Pulling    34m (x5 over 36m)    kubelet            Pulling image "f2021.azurecr.io/aexp:v1"
  Warning  BackOff    62s (x166 over 36m)  kubelet            Back-off restarting failed container

Here is the az snippet I used to create the AKS cluster:

az aks create \
  --location $REGION \
  --resource-group $AKS_RG \
  --name $AKS_NAME \
  --ssh-key-value ./.ssh/id_rsa.pub \
  --service-principal "xxxxxxxx-b8d1-4206-8a8a-xxxxx66c086c" \
  --client-secret "xxxx.xxxxeNzq25iJeuRjWTh~xxxxxUGxu" \
  --network-plugin kubenet \
  --load-balancer-sku basic \
  --outbound-type loadBalancer \
  --node-vm-size Standard_B2s \
  --node-count 1 \
  --tags 'ENV=DEV' 'SRV=EXAMPLE'  \
  --generate-ssh-keys

Update 1: I tried starting a debug session from VS2019 with "Bridge to Kubernetes", and it works with the same Docker image, the same Deployment, and the same Service.

Update 2: Added the Dockerfile.

#See https://aka.ms/containerfastmode to understand how Visual Studio uses this Dockerfile to build your images for faster debugging.

FROM mcr.microsoft.com/dotnet/core/aspnet:3.1-buster-slim AS base
WORKDIR /app
EXPOSE 80
EXPOSE 443

FROM mcr.microsoft.com/dotnet/core/sdk:3.1-buster AS build
WORKDIR /src
COPY ["Aexp/Aexp.csproj", "Aexp/"]
RUN dotnet restore "Aexp/Aexp.csproj"
COPY . .
WORKDIR "/src/Aexp"
RUN dotnet build "Aexp.csproj" -c Release -o /app/build

FROM build AS publish
RUN dotnet publish "Aexp.csproj" -c Release -o /app/publish

FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "Aexp.dll"]

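Update 3 [Jan 27]: I found that this problem has nothing to do with my code or my YAML. I have two Azure subscriptions: one hits the problem, while the other works fine with the same code, the same deployment.yaml, and the same configuration.

[Question comments]:

    Tags: asp.net-core azure-aks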


    [Solution 1]:

    A pod can crash for many reasons. The best first step is to check the pod's logs to see whether the crash comes from your application.

    kubectl logs aexp-5b5b7b6464-5lfz4 --previous

    --previous shows the logs of the previous, crashed instance of the container rather than the one currently starting.
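The log check above is usually done together with a look at the pod's events. A minimal sketch of that first-look routine follows; the function name `collect_crash_info` and the `KUBECTL` override variable are hypothetical names introduced here for illustration, while `kubectl logs --previous` and `kubectl describe pod` are the standard commands.

```shell
#!/bin/sh
# Hypothetical helper: gather first-look diagnostics for a crashing pod.
# Setting KUBECTL=echo turns this into a dry run that only prints the
# kubectl arguments (an assumption of this sketch, not a kubectl feature).
collect_crash_info() {
  pod="$1"
  ns="${2:-default}"                 # namespace, "default" if not given
  kubectl_bin="${KUBECTL:-kubectl}"

  # Logs from the previous (crashed) container instance.
  "$kubectl_bin" logs "$pod" --namespace "$ns" --previous
  # Logs from the current instance, in case it has written anything yet.
  "$kubectl_bin" logs "$pod" --namespace "$ns"
  # Container state, last exit code, and recent events.
  "$kubectl_bin" describe pod "$pod" --namespace "$ns"
}
```

For the pod in the question, this would be called as `collect_crash_info aexp-5b5b7b6464-5lfz4`.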

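    If the logs are empty, check your Dockerfile. The container does not appear to have any long-running process, because it finished with a "success" exit code: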

    Last State:   Terminated
    Reason:       Completed
    Exit Code:    0
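To make the distinction between a clean exit and a crash explicit, here is a rough sketch that reads `kubectl describe pod` output on stdin and summarizes the last termination. The function name `classify_last_exit` and its messages are hypothetical, and the sketch assumes a single-container pod whose output contains a "Last State:" block like the one in the question.

```shell
#!/bin/sh
# Hypothetical helper: summarize how the container last terminated,
# based on the "Last State:" block of `kubectl describe pod` output.
# Assumes a single-container pod; with several containers only the
# last "Last State:" block is reported.
classify_last_exit() {
  awk '
    /Last State:/ { in_last = 1; next }          # start of the block
    in_last && /Exit Code:/ { code = $3; in_last = 0 }
    END {
      if (code == "")      print "unknown: no terminated state found"
      else if (code == 0)  print "clean exit (code 0): main process returned on its own"
      else                 print "failure (code " code "): check application logs"
    }
  '
}
```

Usage: `kubectl describe pod aexp-5b5b7b6464-5lfz4 | classify_last_exit`. For the output shown in the question this reports a clean exit, which points at the container's main process ending rather than at an application error.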
    

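    [Comments]:

    • I've edited the answer with more information for the case where the logs are empty.
    • Thanks. I guess there is nothing wrong with my pod, but I don't understand why I can run it locally with Docker Desktop, and even debug it running in AKS with "Bridge to Kubernetes", yet it doesn't work with "kubectl apply". I've added my Dockerfile.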