CRD (CustomResourceDefinition)

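Kubernetes ships with built-in resources such as Pod, Deployment, and Service. But what if you want to define a "database cluster" resource of your own?

A CRD lets you extend the Kubernetes API with your own resource types.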

What is a CRD?

A CustomResourceDefinition (CRD) is a way to extend the Kubernetes API. Once you define a new resource type with a CRD, the API server automatically serves a RESTful API for it.

crd-database.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  names:
    kind: Database
    plural: databases
    shortNames:
    - db
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              engine:
                type: string
              version:
                type: string
              replicas:
                type: integer
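
To make the "automatic RESTful API" concrete: for a namespaced resource, the API server serves objects under `/apis/<group>/<version>/namespaces/<namespace>/<plural>`. The Go sketch below only assembles those paths for the `Database` CRD above. The helper name `resourcePath` is an illustrative assumption and not part of any Kubernetes library.

```go
package main

import "strings"

// resourcePath builds the REST path the API server exposes for a namespaced
// custom resource. An empty name yields the collection (list) path.
// resourcePath is a hypothetical helper for illustration only.
func resourcePath(group, version, namespace, plural, name string) string {
	parts := []string{"/apis", group, version, "namespaces", namespace, plural}
	if name != "" {
		parts = append(parts, name)
	}
	return strings.Join(parts, "/")
}
```

For example, `resourcePath("example.com", "v1", "default", "databases", "my-db")` returns `/apis/example.com/v1/namespaces/default/databases/my-db`, which is the kind of URL kubectl requests behind the scenes.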

Creating a CRD

A basic CRD

crd-book.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: books.example.com
spec:
  group: example.com
  names:
    kind: Book
    plural: books
    singular: book
    shortNames:
    - bk
  scope: Namespaced
  versions:
  - name: v1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        required:
        - spec
        properties:
          apiVersion:
            type: string
          kind:
            type: string
          metadata:
            type: object
            properties:
              name:
                type: string
          spec:
            type: object
            required:
            - title
            - author
            properties:
              title:
                type: string
              author:
                type: string
              pages:
                type: integer
                minimum: 1
              published:
                type: boolean
# Create the CRD
kubectl apply -f crd-book.yaml

# List CRDs
kubectl get crd
# NAME                 CREATED AT
# books.example.com    2024-01-15T10:00:00Z

# Show CRD details
kubectl describe crd books.example.com

Creating a custom resource

book.yaml
apiVersion: example.com/v1
kind: Book
metadata:
  name: k8s-guide
spec:
  title: "Kubernetes in Action"
  author: "Marko Lukša"
  pages: 624
  published: true
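# Create the resource
kubectl apply -f book.yaml

# List resources. The CRD above declares no additionalPrinterColumns,
# so only NAME and AGE are shown; add them to display author or pages.
kubectl get books
# NAME        AGE
# k8s-guide   10s

# Show the full object
kubectl get book k8s-guide -o yaml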
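
The schema in crd-book.yaml decides whether book.yaml is accepted. As a rough sketch of the rules it encodes (title and author required, pages at least 1), the Go code below runs the same checks on an in-memory struct. The type `BookSpec`, the function `validateBookSpec`, and the error wording are assumptions made for illustration; the API server performs the real validation from the OpenAPI schema.

```go
package main

import "fmt"

// BookSpec mirrors the spec fields declared in crd-book.yaml.
// Pointers distinguish "field omitted" from a zero value.
type BookSpec struct {
	Title     *string
	Author    *string
	Pages     *int
	Published *bool
}

// validateBookSpec approximates the checks the schema asks for:
// title and author are required, and pages (if set) must be >= 1.
func validateBookSpec(s BookSpec) []string {
	var errs []string
	if s.Title == nil {
		errs = append(errs, "spec.title: Required value")
	}
	if s.Author == nil {
		errs = append(errs, "spec.author: Required value")
	}
	if s.Pages != nil && *s.Pages < 1 {
		errs = append(errs, fmt.Sprintf("spec.pages: Invalid value: %d: must be >= 1", *s.Pages))
	}
	return errs
}
```

A spec with a title, an author, and 624 pages passes with no errors. A spec that omits the author and sets pages to 0 yields two errors, one per violated rule.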

CRD versioning

Multi-version support

crd-multi-version.yaml
spec:
  versions:
  - name: v1
    served: true
    storage: false  # v1beta1 is the storage version
  - name: v1beta1
    served: true
    storage: true
  conversion:
    strategy: None  # use Webhook instead when the versions' schemas differ
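
One rule governs the `storage` flags: exactly one version in `spec.versions` may be marked as the storage version. The Go sketch below checks that rule on a simplified list. The `Version` type and the `storageVersion` function are illustrative assumptions, not Kubernetes API types.

```go
package main

import "fmt"

// Version mirrors the served/storage flags of one entry in spec.versions.
type Version struct {
	Name    string
	Served  bool
	Storage bool
}

// storageVersion returns the name of the single version with Storage set,
// or an error if zero or several versions claim that role.
func storageVersion(versions []Version) (string, error) {
	var found []string
	for _, v := range versions {
		if v.Storage {
			found = append(found, v.Name)
		}
	}
	if len(found) != 1 {
		return "", fmt.Errorf("want exactly one storage version, got %d", len(found))
	}
	return found[0], nil
}
```

For the fragment above (v1 served but not stored, v1beta1 served and stored), the function returns `v1beta1`. Marking both versions with `storage: true` produces an error.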

Version conversion

versions:
- name: v1
  served: true
  storage: true
  subresources:
    status: {}
- name: v1beta1
  served: true
  storage: false
conversion.go
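// Conversion sketch using controller-runtime's hub-and-spoke interfaces
// (package sigs.k8s.io/controller-runtime/pkg/conversion). It assumes v1 (Book)
// stores Pages as int and v1beta1 (BookV1Beta1) stores it as intstr.IntOrString.

// Hub marks v1 as the version that every other version converts through.
func (*Book) Hub() {}

// ConvertTo converts this v1beta1 object into the v1 hub.
func (src *BookV1Beta1) ConvertTo(dstRaw conversion.Hub) error {
    dst := dstRaw.(*Book)
    dst.Spec.Pages = src.Spec.Pages.IntValue()
    return nil
}

// ConvertFrom fills this v1beta1 object from the v1 hub.
func (dst *BookV1Beta1) ConvertFrom(srcRaw conversion.Hub) error {
    src := srcRaw.(*Book)
    dst.Spec.Pages = intstr.FromInt(src.Spec.Pages)
    return nil
}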

Validation and defaults

OpenAPI schema

schema:
  openAPIV3Schema:
    type: object
    required:
    - spec
    properties:
      spec:
        type: object
        required:
        - name
        - capacity
        properties:
          name:
            type: string
            pattern: "^[a-z][a-z0-9-]*$"
          capacity:
            type: object
            properties:
              storage:
                type: string
                pattern: "^[0-9]+Gi$"
              memory:
                type: string
                pattern: "^[0-9]+Mi$"
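
The `pattern` fields above are ordinary regular expressions. The Go sketch below applies the same three patterns to sample values to show what they accept and reject. The function name `validSpec` is an assumption made for illustration.

```go
package main

import "regexp"

// The same patterns as in the schema above, compiled with Go's regexp
// package (RE2 syntax).
var (
	namePattern    = regexp.MustCompile(`^[a-z][a-z0-9-]*$`)
	storagePattern = regexp.MustCompile(`^[0-9]+Gi$`)
	memoryPattern  = regexp.MustCompile(`^[0-9]+Mi$`)
)

// validSpec reports whether name, storage, and memory all match their patterns.
func validSpec(name, storage, memory string) bool {
	return namePattern.MatchString(name) &&
		storagePattern.MatchString(storage) &&
		memoryPattern.MatchString(memory)
}
```

`validSpec("my-db", "10Gi", "512Mi")` is true. A name starting with an uppercase letter, a fractional storage value such as `1.5Gi`, or a memory value given in `Gi` is rejected. These patterns are therefore stricter than the general Kubernetes quantity syntax, which is a deliberate trade-off in this schema.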

Defaults

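schema:
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        default: {}        # applied when spec is omitted entirely
        properties:
          replicas:
            type: integer
            default: 1     # applied when spec.replicas is omitted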
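
Defaulting can happen at two levels: a default on an object applies only when the whole object is absent, while a default on a field applies when the parent exists but the field is missing. The Go sketch below imitates both steps on a decoded JSON object, assuming a schema that declares `default: {}` on `spec` and `default: 1` on `spec.replicas`. The function name `applyDefaults` is hypothetical.

```go
package main

// applyDefaults imitates schema defaulting on a decoded JSON object.
// It is a simplified illustration, not the API server's implementation.
func applyDefaults(obj map[string]any) {
	// Object-level default: used only when spec is absent altogether.
	spec, ok := obj["spec"].(map[string]any)
	if !ok {
		spec = map[string]any{}
		obj["spec"] = spec
	}
	// Field-level default: used when spec exists but replicas is absent.
	if _, set := spec["replicas"]; !set {
		spec["replicas"] = 1
	}
}
```

An object without `spec` ends up with `spec.replicas` set to 1, and an object that already sets `spec.replicas: 3` keeps its value.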

Webhook validation

webhook.go
func (in *Database) ValidateCreate() error {
    var allErrs field.ErrorList
    
    if in.Spec.Replicas < 1 {
        allErrs = append(allErrs, field.Invalid(
            field.NewPath("spec").Child("replicas"),
            in.Spec.Replicas,
            "must be at least 1"))
    }
    
    return allErrs.ToAggregate()
}

Subresources

Status subresource

subresources:
  status: {}
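# Update the status (requires kubectl v1.24+; custom resources need a merge patch,
# because strategic merge patch is not supported for them)
kubectl patch book k8s-guide --subresource=status --type=merge \
  -p '{"status":{"conditions":[{"type":"Available","status":"True"}]}}'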

Scale subresource

subresources:
  scale:
    specReplicasPath: .spec.replicas
    statusReplicasPath: .status.replicas
    labelSelectorPath: .status.labelSelector
# Scale through the same interface the HPA uses
kubectl scale book k8s-guide --replicas=5
# or
kubectl patch book k8s-guide --subresource=scale \
  --type='merge' -p '{"spec":{"replicas":5}}'
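
The three `...Path` settings are simple dotted JSON paths (no array notation) that tell the API server where the replica count lives inside the custom object. As a sketch of how such a path is resolved, the Go function below walks a decoded object field by field. The name `lookupPath` is an assumption made for illustration.

```go
package main

import "strings"

// lookupPath resolves a simple dotted path such as ".spec.replicas" in a
// decoded JSON object. Only plain field names are handled, which is the
// form the scale subresource paths use.
func lookupPath(obj map[string]any, path string) (any, bool) {
	var cur any = obj
	for _, key := range strings.Split(strings.TrimPrefix(path, "."), ".") {
		m, ok := cur.(map[string]any)
		if !ok {
			return nil, false
		}
		cur, ok = m[key]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}
```

For an object with `spec.replicas: 5`, looking up `.spec.replicas` finds 5, while `.status.replicas` reports "not found" until a controller writes that field.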

Finalizers and garbage collection

Finalizer

metadata:
  finalizers:
  - database.example.com/finalizer  # domain-qualified name, as Kubernetes recommends
// Clean up external resources before the object is deleted.
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    db := &examplev1.Database{}
    if err := r.Get(ctx, req.NamespacedName, db); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    if db.DeletionTimestamp.IsZero() {
        // Not being deleted: make sure the finalizer is registered.
        if !containsString(db.Finalizers, finalizerName) {
            db.Finalizers = append(db.Finalizers, finalizerName)
            if err := r.Update(ctx, db); err != nil {
                return ctrl.Result{}, err
            }
        }
        return ctrl.Result{}, nil
    }

    // Being deleted: run the cleanup, then release the finalizer.
    if containsString(db.Finalizers, finalizerName) {
        if err := r.cleanup(db); err != nil {
            // Returning the error requeues the request for a retry.
            return ctrl.Result{}, err
        }
        db.Finalizers = removeString(db.Finalizers, finalizerName)
        if err := r.Update(ctx, db); err != nil {
            return ctrl.Result{}, err
        }
    }
    return ctrl.Result{}, nil
}
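
The reconciler above calls `containsString` and `removeString` without defining them. A minimal sketch of the two helpers, assuming nothing beyond the standard library, could look like this. In a real controller, controller-runtime's `controllerutil.ContainsFinalizer`, `AddFinalizer`, and `RemoveFinalizer` serve the same purpose.

```go
package main

// containsString reports whether list contains s.
func containsString(list []string, s string) bool {
	for _, item := range list {
		if item == s {
			return true
		}
	}
	return false
}

// removeString returns a new slice without any occurrence of s.
// The input slice is left unmodified.
func removeString(list []string, s string) []string {
	out := make([]string, 0, len(list))
	for _, item := range list {
		if item != s {
			out = append(out, item)
		}
	}
	return out
}
```

Returning a fresh slice from `removeString` avoids mutating the finalizer list of an object that may still be shared with a cache.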

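Common CRD examples

1. IngressRoute (Traefik)

# Traefik v3 serves this resource under traefik.io; the older traefik.containo.us group was removed
apiVersion: traefik.io/v1alpha1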
kind: IngressRoute
metadata:
  name: my-app
spec:
  entryPoints:
  - web
  routes:
  - match: Host(`app.example.com`)
    kind: Rule
    services:
    - name: my-app
      port: 80

2. PrometheusRule

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: my-app-rules
spec:
  groups:
  - name: my-app
    rules:
    - alert: HighErrorRate
      expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "High error rate detected"

3. Certificate (cert-manager)

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: my-app-tls
spec:
  secretName: my-app-tls
  issuerRef:
    name: letsencrypt-prod
    kind: ClusterIssuer
  dnsNames:
  - my-app.example.com

CRD vs. aggregated API

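Feature                     CRD                       Aggregated API
Implementation complexity   Low                       High
Functionality               Basic CRUD + validation   Fully customizable
Performance                 Average                   Can be optimized
Typical use                 Simple resources          Complex resources

flowchart TB
    subgraph CRDFlow["CRD"]
        K["kube-apiserver"] --> C["CRD Controller"]
    end

    subgraph Aggregated["Aggregated API"]
        K2["kube-apiserver"] --> A["API Server"]
        A --> Ext["Extension\nAPI Server"]
    end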

Best practices

1. Use an explicit API group

spec:
  group: databases.example.com

2. Provide schema validation

schema:
  openAPIV3Schema:
    type: object
    properties:
      spec:
        type: object
        required:
        - name

3. Add status conditions

status:
  conditions:
  - type: Ready
    status: "True"
    lastTransitionTime: "2024-01-15T10:00:00Z"
    reason: "Created"
    message: "Database created successfully"

4. Implement graceful deletion

metadata:
  finalizers:
  - custom.example.com/cleanup

Common problems

CRD creation fails

# Show CRD details
kubectl describe crd <name>

# Check that the schema is valid:
# in apiextensions.k8s.io/v1 a CRD schema must be a structural OpenAPI v3 schema

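Resource creation fails

# Surface validation errors from the API server without persisting anything
kubectl apply -f resource.yaml --dry-run=server

# Inspect the resource and its events
kubectl describe <kind> <name>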

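Multi-version resource problems

# Read the resource at a specific API version
kubectl get <plural>.<version>.<group> <name> -o yaml

# List the versions at which objects have been persisted
kubectl get crd <plural>.<group> -o jsonpath='{.status.storedVersions}'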

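Further thoughts

CRDs are one of the most powerful extension mechanisms in Kubernetes:

  1. Declarative API: users manage custom resources with the kubectl they already know
  2. Ecosystem: a large number of Operators are built on top of CRDs
  3. Consistency: custom resources follow the same patterns as built-in resources

CRDs also have limitations:

  1. Performance: a large number of CRDs puts load on the API server
  2. Limited functionality: they cannot replace every capability of a native API
  3. Operator burden: business logic requires a separate controller

A CRD on its own only stores and validates data. For scenarios that need complex business logic, an Operator (a CRD paired with a controller) is the better choice.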

Further reading