Situation

Rancher v2.11 ships with a new API extension, `v1.ext.cattle.io`, which internal Rancher cluster management components such as the capi-controller-manager require.
When Rancher is installed with a custom Helm release name, services that rely on resources in this new API group cannot discover the API extension, which disrupts their function.
Logs for affected components may show synchronization errors that indicate the API discovery failure:


failed to sync schemas: unable to retrieve the complete list of server APIs: [ext.cattle.io/v1]: stale GroupVersion discovery: [ext.cattle.io/v1]

While standard Rancher Helm charts are designed to dynamically template labels and resource names based on the chosen release name, this specific extension service does not adhere to dynamic naming conventions.

Additionally, this backing service may persist in the `cattle-system` namespace if a rollback from Rancher 2.11 to a previous version is attempted.
 

Resolution

### Workaround: Manually Patch the Service Selector

The recommended immediate resolution is to manually edit the `imperative-api-extension` service selector to match the actual application label applied to the Rancher deployment pods.

1. Identify the Correct Application Label

Determine the actual application label used by your Rancher pods (this is usually your Helm release name, e.g., `rancher-stable`):

# Replace <RANCHER_RELEASE_NAME> with your actual Helm release name
kubectl get deployment <RANCHER_RELEASE_NAME> -n cattle-system -o yaml | grep 'app:'
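
As an alternative to grepping the YAML, the label can be read directly with a JSONPath query. The sketch below assumes the `app` label sits under `.spec.template.metadata.labels` of the Deployment and that the Deployment is named after the Helm release, as in the command above. The function name `rancher_app_label` is illustrative, not part of Rancher.

```shell
# Print the `app` label that the Rancher Deployment applies to its pods.
# Usage: rancher_app_label <RANCHER_RELEASE_NAME>
rancher_app_label() {
  kubectl get deployment "$1" -n cattle-system \
    -o jsonpath='{.spec.template.metadata.labels.app}'
}
```

For example, `ACTUAL_APP_LABEL=$(rancher_app_label rancher-stable)` stores the value for use in the next step.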

2. Patch the Imperative API Extension Service

Patch the Service in the `cattle-system` namespace, replacing `<ACTUAL_APP_LABEL>` with the value found in the previous step:

kubectl patch svc imperative-api-extension -n cattle-system \
-p '{"spec":{"selector":{"app":"<ACTUAL_APP_LABEL>"}}}'
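
To avoid hand-editing the JSON, the patch body can be generated from the label found in step 1 and the resulting selector printed afterwards. This is a minimal sketch: `selector_patch` and `patch_extension_selector` are hypothetical helper names, and the label is assumed to need no JSON escaping (valid Kubernetes label values contain only alphanumerics, `-`, `_`, and `.`).

```shell
# Build the JSON patch body for a given app label.
selector_patch() {
  printf '{"spec":{"selector":{"app":"%s"}}}' "$1"
}

# Apply the patch, then print the selector the Service now carries.
patch_extension_selector() {
  kubectl patch svc imperative-api-extension -n cattle-system \
    -p "$(selector_patch "$1")" &&
  kubectl get svc imperative-api-extension -n cattle-system \
    -o jsonpath='{.spec.selector.app}'
}
```

For example, `patch_extension_selector rancher-stable`. Once the selector matches the pods, `kubectl get apiservice v1.ext.cattle.io` should eventually report the APIService as available.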

### Cleanup after Rollback (If applicable)

If this issue occurs after rolling back from Rancher v2.11 to a prior version, the `imperative-api-extension` service may persist.
If the service is orphaned and the older Rancher version does not require it, delete it manually:

kubectl delete svc imperative-api-extension -n cattle-system

Deleting the service is a cleanup step specific to this issue. It is separate from the general Rancher rollback process, which usually restores state with the Rancher backup operator.
As a general best practice, take backups (snapshots) before any major operation such as an upgrade or rollback.

Cause

The root cause is a software bug related to the hardcoded label selector configured for the API extension's backing service.

1.  API Extension Service: The API extension is backed by the service `cattle-system/imperative-api-extension`.
2.  Hardcoded Selector: This service contains a hardcoded label selector of `app: rancher`.
3.  Label Mismatch: When a custom Helm release name (e.g., `rancher-stable`) is used, the Rancher deployment pods correctly receive an `app` label matching that name (e.g., `app: rancher-stable`).
4.  Service Failure: Because of the label mismatch, the `imperative-api-extension` service cannot select the intended Rancher pods. This prevents the proper registration of the `v1.ext.cattle.io` `APIService` via the Kubernetes API Aggregation Layer, resulting in the reported stale GroupVersion discovery errors.
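
The mismatch described in this list can be confirmed on a live cluster by comparing the Service selector with the label on the Rancher pods. The following is a sketch under the assumptions stated in this article (Service `imperative-api-extension` in `cattle-system`, Deployment named after the Helm release); `check_extension_selector` is an illustrative name.

```shell
# Compare the app label on the Rancher pods with the Service selector.
# Usage: check_extension_selector <RANCHER_RELEASE_NAME>
check_extension_selector() {
  release="$1"
  # Label that the Deployment's pod template applies to the Rancher pods.
  pod_label=$(kubectl get deployment "$release" -n cattle-system \
    -o jsonpath='{.spec.template.metadata.labels.app}')
  # Label that the extension Service uses to select its backends.
  svc_selector=$(kubectl get svc imperative-api-extension -n cattle-system \
    -o jsonpath='{.spec.selector.app}')
  if [ "$pod_label" = "$svc_selector" ]; then
    echo "OK: service selects app=$svc_selector"
  else
    echo "MISMATCH: service selects app=$svc_selector, pods carry app=$pod_label"
    return 1
  fi
}
```

On an affected installation with release name `rancher-stable`, the call `check_extension_selector rancher-stable` would be expected to report a mismatch between `rancher` and `rancher-stable`.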

This behavior confirms a bug in the new `imperative-api-extension` feature: it expects the static `app: rancher` label instead of resolving the name dynamically from the Helm release. The issue is being tracked internally for a long-term resolution.

 

Changing Release Name Risk:

If you consider renaming your existing Helm chart release (e.g., from `rancher-stable` back to `rancher`), be advised that this process is highly disruptive: it forces a complete removal of the previous Rancher deployment and the creation of a fresh instance.

This action will cause the Rancher management plane UI/API to be unavailable during the entire process, although downstream cluster workloads are designed to remain functional.

Additional Information

This issue has been fixed in SUSE Rancher 2.12.0.
 

Environment
• Rancher v2.11.x
• Rancher deployed using a Helm release name other than the default `rancher` (e.g., `rancher-stable`).
