Cards (17)

  • Building a Culture:
    • Organizations should focus on educating technical teams about Kubernetes best practices to optimize costs effectively.
    • Transparency and awareness among developers and operators regarding application costs are essential.
    • Implementing guardrails and policies can prevent unexpected cost escalations and enforce resource quotas.
  • Best Practices - Technical Side:
    • Review small development clusters:
      • Consider using multi-tenant clusters for development to save on per-cluster overhead.
      • Use namespaces and policies to cap and isolate resources for experimentation.
      • In small development clusters, disable or limit add-ons that are not needed, such as Cloud Logging, Cloud Monitoring, Horizontal Pod Autoscaling, Kubernetes Dashboard, and kube-dns, to reduce costs.
  • Add Pod Disruption Budgets:
    • Set thresholds for the number or percentage of pods that can be disrupted during voluntary actions like upgrades or autoscaling.
    • Determine the minimum number of pods needed to run without disrupting application users, considering each application independently.
    • Pod disruption budgets help maintain application availability during autoscaling without overprovisioning.
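As a minimal sketch of the points above, the following Python builds a PodDisruptionBudget manifest as a plain dictionary and estimates how many pods could be evicted voluntarily. The names `checkout-pdb` and `checkout`, and the helper functions themselves, are hypothetical and chosen only for illustration; the percentage rounding follows the round-up behavior described in the Kubernetes documentation.

```python
import json
import math


def build_pdb(name, app_label, min_available):
    """Return a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Either an absolute pod count (3) or a percentage string ("50%")
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def allowed_disruptions(expected_pods, healthy_pods, min_available):
    """Estimate how many healthy pods may be evicted voluntarily right now."""
    if isinstance(min_available, str) and min_available.endswith("%"):
        # A percentage is taken of the expected pod count and rounded up
        required = math.ceil(expected_pods * int(min_available[:-1]) / 100)
    else:
        required = int(min_available)
    return max(0, healthy_pods - required)


# Hypothetical application: 7 replicas, at least half must stay available
pdb = build_pdb("checkout-pdb", "checkout", "50%")
print(json.dumps(pdb, indent=2))
print(allowed_disruptions(expected_pods=7, healthy_pods=7, min_available="50%"))  # 3
```

The JSON output can be saved to a file and applied like any other manifest. Because each application tolerates a different amount of disruption, the `min_available` value would normally be chosen per application rather than shared.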
  • Observing GKE Clusters:
    • Organizations often have separate teams managing clusters and developing applications, necessitating collaboration to optimize costs effectively.
    • The monitoring dashboard provides crucial insights into resource utilization and performance metrics.
    • Cloud Operations, including Monitoring, Logging, and Alerting, offers detailed views into cluster metrics, aiding in identifying issues and optimizing resource usage.
    • The Infrastructure, Services, and Workloads tabs each give a different perspective for analyzing cluster behavior.
    • Custom charts created using the Metrics Explorer offer deeper insights into specific metrics such as CPU usage time by container.
  • Logging and Monitoring:
    • Cloud Logging and Cloud Monitoring offer observability into applications and infrastructure, with costs incurred based on usage and log retention duration.
    • Increased logging and custom metrics lead to higher costs, necessitating careful management.
    • Implementing multi-tenant logging with exclusion rules helps filter out irrelevant logs to minimize costs.
    • Further details and tips on cost optimization for logs and monitoring are available for reference.
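To make the effect of exclusion rules concrete, here is a small Python sketch that models an exclusion rule as a predicate over log entries and compares the bytes kept with and without it. It does not use the Cloud Logging API or its filter syntax; the entry fields (`namespace`, `severity`, `size_bytes`), the namespace names, and the rule itself are assumptions made for illustration.

```python
SEVERITY_RANK = {"DEBUG": 0, "INFO": 1, "WARNING": 2, "ERROR": 3}


def is_excluded(entry, noisy_namespaces, min_severity="WARNING"):
    """Hypothetical rule: drop low-severity logs from the listed namespaces."""
    return (
        entry["namespace"] in noisy_namespaces
        and SEVERITY_RANK[entry["severity"]] < SEVERITY_RANK[min_severity]
    )


def ingested_bytes(entries, noisy_namespaces, min_severity="WARNING"):
    """Total size of the entries that survive the exclusion rule."""
    return sum(
        e["size_bytes"]
        for e in entries
        if not is_excluded(e, noisy_namespaces, min_severity)
    )


# Made-up sample entries from two tenants sharing one cluster
sample_entries = [
    {"namespace": "dev-team-a", "severity": "DEBUG", "size_bytes": 400},
    {"namespace": "dev-team-a", "severity": "ERROR", "size_bytes": 250},
    {"namespace": "prod", "severity": "INFO", "size_bytes": 300},
]

total = sum(e["size_bytes"] for e in sample_entries)
kept = ingested_bytes(sample_entries, noisy_namespaces={"dev-team-a"})
print(f"kept {kept} of {total} bytes")
```

In this sample only the DEBUG entry from the development namespace is dropped, so errors from every tenant are still retained.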
  • Enable GKE Usage Metering:
    • Enabling GKE usage metering automatically collects granular usage metrics and exports them to BigQuery for detailed analysis.
    • Comparing actual CPU, memory, storage, and network usage against allocated resources identifies over-allocated resources.
    • Breaking usage down by namespace and filtering by label improves visibility into resource consumption and cost attribution.
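The comparison of used against allocated resources could look roughly like the sketch below. It works on in-memory rows rather than querying BigQuery, and the field names (`requested_cpu_seconds`, `used_cpu_seconds`), the namespaces, and the 50% threshold are assumptions for illustration, not the actual export schema.

```python
from collections import defaultdict


def utilization_by_namespace(rows):
    """Return used/requested CPU ratio per namespace."""
    totals = defaultdict(lambda: {"requested": 0.0, "used": 0.0})
    for row in rows:
        bucket = totals[row["namespace"]]
        bucket["requested"] += row["requested_cpu_seconds"]
        bucket["used"] += row["used_cpu_seconds"]
    return {
        ns: (t["used"] / t["requested"] if t["requested"] else 0.0)
        for ns, t in totals.items()
    }


def over_allocated(rows, threshold=0.5):
    """Namespaces whose usage falls below the given share of their requests."""
    ratios = utilization_by_namespace(rows)
    return sorted(ns for ns, ratio in ratios.items() if ratio < threshold)


# Made-up rows standing in for exported usage data
sample_rows = [
    {"namespace": "team-a", "requested_cpu_seconds": 1000, "used_cpu_seconds": 200},
    {"namespace": "team-a", "requested_cpu_seconds": 1000, "used_cpu_seconds": 300},
    {"namespace": "team-b", "requested_cpu_seconds": 500, "used_cpu_seconds": 450},
]

print(utilization_by_namespace(sample_rows))
print(over_allocated(sample_rows))
```

Grouping by namespace mirrors the namespace-based view described above; grouping by a label instead would only require changing the key used in `totals`.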
  • Kubernetes Resource Quotas:
    • Kubernetes Resource Quotas enable capping resource usage at the namespace level, ideal for multi-tenant clusters.
    • Limit ranges enforce default, minimum, and maximum resource values at the pod and container level within a namespace, complementing Resource Quotas.
    • Detailed metrics inform the setting of appropriate quotas, preventing resource over-consumption and potential cost spikes.
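A minimal sketch of how the two objects fit together, written as Python that emits JSON manifests. The namespace `dev-team-a` and all of the numeric values are placeholders; in practice they would be derived from the observed metrics mentioned above.

```python
import json


def build_resource_quota(namespace, cpu, memory, pods):
    """Cap the total requests and pod count for a whole namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {
            "hard": {
                "requests.cpu": cpu,
                "requests.memory": memory,
                "pods": str(pods),
            }
        },
    }


def build_limit_range(namespace, default_cpu, default_memory, max_cpu, max_memory):
    """Set per-container defaults and ceilings inside the same namespace."""
    return {
        "apiVersion": "v1",
        "kind": "LimitRange",
        "metadata": {"name": f"{namespace}-limits", "namespace": namespace},
        "spec": {
            "limits": [
                {
                    "type": "Container",
                    # Applied when a container declares no request of its own
                    "defaultRequest": {"cpu": default_cpu, "memory": default_memory},
                    # Upper bound any single container may ask for
                    "max": {"cpu": max_cpu, "memory": max_memory},
                }
            ]
        },
    }


# Placeholder values for a hypothetical development tenant
quota = build_resource_quota("dev-team-a", cpu="4", memory="8Gi", pods=20)
limits = build_limit_range("dev-team-a", "100m", "128Mi", "1", "1Gi")
print(json.dumps([quota, limits], indent=2))
```

The quota bounds what the namespace can consume in total, while the limit range keeps any single container from claiming most of that budget.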
  • Metrics Server:
    • The Metrics Server collects and exposes metrics to the Kubernetes metrics API, aiding in autoscaling decisions.
    • Keeping the Metrics Server stable is crucial to avoid disrupting autoscaling, which calls for careful monitoring and version compatibility checks.
  • CI/CD for Cost Optimization:
    • Reviewing configuration changes before deployment helps prevent unintended cost escalations, with tools like Anthos Policy Controller automating policy enforcement.
    • Integrating policy validation into CI/CD pipelines with tools like kpt validates configurations continuously throughout the development cycle.
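The kind of check such a pipeline step performs can be illustrated with a small stand-in written in plain Python. This is not the Policy Controller or kpt API; the function `find_violations`, the rule it enforces (every container must declare resource requests and limits), and the sample Deployment are all hypothetical.

```python
def find_violations(manifest):
    """List containers in a Deployment-like manifest lacking requests or limits."""
    violations = []
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        for field in ("requests", "limits"):
            if not resources.get(field):
                name = container.get("name", "<unnamed>")
                violations.append(f"{name}: missing resources.{field}")
    return violations


# Hypothetical Deployment: the sidecar forgot to set limits
sample_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "checkout"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "app",
                        "resources": {
                            "requests": {"cpu": "250m"},
                            "limits": {"cpu": "500m"},
                        },
                    },
                    {"name": "sidecar", "resources": {"requests": {"cpu": "50m"}}},
                ]
            }
        }
    },
}

violations = find_violations(sample_deployment)
for line in violations:
    print(line)
```

A pipeline step would typically fail the build when the returned list is non-empty, so that an unbounded workload is caught in review rather than after deployment.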
  • Recommendation Hub:
    • The Recommendation Hub provides actionable recommendations based on usage patterns, covering categories such as security and cost.
    • Recommendations tailored to specific usage patterns offer insights into potential cost-saving opportunities, such as optimizing compute costs through long-term commitments.