Environment

Rancher Versions:

  • <= 2.12.3
  • 2.11.x
  • 2.10.x
  • 2.9.2 and later

Rancher installed by Helm chart
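
As a rough aid, the version list above can be turned into a range check. The sketch below reads the list as "2.9.2 through 2.12.3 inclusive", which is an interpretation rather than something the article states in those words. The function name `is_affected` is hypothetical, and `sort -V` assumes GNU coreutils or a compatible `sort`.

```shell
# Sketch: succeed if a Rancher version falls in the affected range.
# Assumption: the affected range is 2.9.2 <= version <= 2.12.3.
is_affected() {
  v="${1#v}"          # accept "v2.11.3" as well as "2.11.3"
  lo="2.9.2"
  hi="2.12.3"
  # v >= lo: the smaller of (lo, v) under version sort must be lo
  [ "$(printf '%s\n%s\n' "$lo" "$v" | sort -V | head -n1)" = "$lo" ] &&
  # v <= hi: the smaller of (v, hi) under version sort must be v
  [ "$(printf '%s\n%s\n' "$v" "$hi" | sort -V | head -n1)" = "$v" ]
}

# Usage: pass the Rancher version you have installed
if is_affected "2.11.3"; then
  echo "version is in the affected range"
fi
```

Pre-release suffixes such as `-rc1` are not handled by this sketch.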

Situation

A known bug can occur when uninstalling the Rancher Helm chart. The rancher-post-delete job fails with the following error:

<span style="color:#000000"><span style="background-color:#ffffff"><span style="background-color:#efefef"><code># helm uninstall rancher -n cattle-system
Error: uninstallation completed with 1 error(s): 1 error occurred:
        * job rancher-post-delete failed: BackoffLimitExceeded </code></span></span></span>

This indicates that a resource was not removed during the uninstall. In this case, the logs of the rancher-post-delete job pod confirm that the Fleet app failed to uninstall. The job logs contain:


     
<span style="color:#000000"><span style="background-color:#ffffff"><span style="background-color:#efefef"><code>Uninstalling Rancher resources in the following namespaces: cattle-fleet-system cattle-system rancher-operator-system
--- Deleting the app [fleet] in the namespace [cattle-fleet-system]
Error: failed to delete release: fleet
--- Skip the app [fleet-crd] in the namespace [cattle-fleet-system]
Removing Rancher bootstrap secret in the following namespace: cattle-system
------ Summary ------
Failed to uninstall the following apps: fleet</code></span></span></span>
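
The job logs shown above can be retrieved with `kubectl logs`. This is a minimal sketch: the helper name `post_delete_logs` is hypothetical, and it assumes the job is named rancher-post-delete, still exists, and runs in the cattle-system namespace used by the uninstall command.

```shell
# Sketch: print recent log lines from the rancher-post-delete job.
# Assumption: the job lives in cattle-system unless another namespace is given.
post_delete_logs() {
  ns="${1:-cattle-system}"
  kubectl logs -n "$ns" job/rancher-post-delete --tail=50
}
```

Call it as `post_delete_logs`, or pass a different namespace as the first argument.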

The Fleet resource preventing the app from being uninstalled is a CronJob:

<span style="color:#000000"><span style="background-color:#ffffff"><span style="background-color:#efefef"><code># kubectl get cronjobs -n cattle-fleet-system
NAMESPACE             NAME                                 SCHEDULE    TIMEZONE   SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cattle-fleet-system   fleet-cleanup-gitrepo-jobs           @daily      <none>     False     0        <none>          6h48m</code></span></span></span>
Resolution

Workaround

As a workaround, open two terminals, or two panes in a multiplexer (tmux/screen), each with helm and kubectl access to the Rancher cluster. In the first, run the helm uninstall command. In the second, run the kubectl patch command, which adds RBAC permission for the cronjobs resource so that the rancher-post-delete job can delete the Fleet CronJob during the uninstall.

1. Run the helm uninstall command first.

<span style="color:#000000"><span style="background-color:#ffffff"><span style="background-color:#efefef"><code># Terminal/Pane 1

helm uninstall rancher -n cattle-system</code></span></span></span>

2. Immediately after starting step 1, while the uninstall is still running, run the patch command.

<span style="color:#000000"><span style="background-color:#ffffff"><span style="background-color:#efefef"><code># Terminal/Pane 2

kubectl patch clusterrole rancher-post-delete --type='json' -p='[{"op": "add", "path": "/rules/1/resources/-", "value": "cronjobs"}]'</code></span></span></span>

3. The helm uninstall then completes without errors.

<span style="color:#000000"><span style="background-color:#ffffff"><span style="background-color:#efefef"><code># helm uninstall rancher -n cattle-system
release "rancher" uninstalled</code></span></span></span>
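
The three steps above can also be sketched as a single script instead of two terminals. This is not part of the original workaround, only one way to automate its timing. The function name `uninstall_with_patch`, the `POLL_INTERVAL` variable, and the 60-attempt limit are all hypothetical, and the sketch assumes the rancher-post-delete ClusterRole becomes visible shortly after the uninstall starts.

```shell
# Sketch: run the uninstall in the background and patch the ClusterRole
# as soon as it can be seen. Names and limits here are assumptions.
uninstall_with_patch() {
  role="rancher-post-delete"

  # Step 1: start the uninstall without blocking this shell
  helm uninstall rancher -n cattle-system &
  helm_pid=$!

  # Step 2: poll until the ClusterRole exists, then add "cronjobs" to rule 1
  i=0
  while [ "$i" -lt 60 ]; do
    if kubectl get clusterrole "$role" >/dev/null 2>&1; then
      kubectl patch clusterrole "$role" --type='json' \
        -p='[{"op": "add", "path": "/rules/1/resources/-", "value": "cronjobs"}]'
      break
    fi
    sleep "${POLL_INTERVAL:-1}"
    i=$((i + 1))
  done

  # Step 3: wait for helm and return its exit status
  wait "$helm_pid"
}
```

Running `uninstall_with_patch` returns the exit status of the helm uninstall. If the role never appears within the polling window, the patch is skipped and the uninstall is expected to fail as described above.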

Resolution

A GitHub pull request has been submitted to fix this issue permanently in Rancher 2.13.0 and 2.12.4. Pull Request: https://github.com/rancher/rancher/pull/52277

Cause

A new Fleet CronJob that maintains GitRepo resources must be cleaned up when Rancher is uninstalled. The RBAC for the rancher-post-delete job does not account for this new CronJob resource, so the job fails because of missing permissions. Repo code: https://github.com/rancher/rancher/blob/main/chart/templates/post-delete-hook-cluster-role.yaml#L15-L17
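
To see the missing permission directly, the rule that the workaround patches can be printed while the ClusterRole exists. This is a sketch under assumptions: the helper name `hook_rule_resources` is hypothetical, and rule index 1 is taken from the patch path `/rules/1/resources/-` used in the workaround, not verified against every chart version.

```shell
# Sketch: print the resources listed in rule 1 of the hook's ClusterRole.
# Assumption: rule index 1 is the rule the workaround appends "cronjobs" to.
hook_rule_resources() {
  kubectl get clusterrole rancher-post-delete \
    -o jsonpath='{.rules[1].resources}'
}
```

If `cronjobs` is absent from the printed list, the job lacks permission to delete the Fleet CronJob.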

Visit the Rancher-K8S solutions blogger:
https://blog.csdn.net/lidw2009
