Blogs – Programming

Blogs - Programming

RSS Kubernetes Blog

  • Navigating Failures in Pods With Devices 2025-07-03
    Kubernetes is the de facto standard for container orchestration, but when it comes to handling specialized hardware like GPUs and other accelerators, things get a bit complicated. This blog post dives into the challenges of managing failure modes when operating pods with devices in Kubernetes, based on insights from Sergey Kanzhelev and Mrunal Patel's talk […]
  • Image Compatibility In Cloud Native Environments 2025-06-25
    In industries where systems must run very reliably and meet strict performance criteria such as telecommunication, high-performance or AI computing, containerized applications often need specific operating system configuration or hardware presence. It is common practice to require the use of specific versions of the kernel, its configuration, device drivers, or system components. Despite the existence […]
  • Changes to Kubernetes Slack 2025-06-16
    UPDATE: We’ve received notice from Salesforce that our Slack workspace WILL NOT BE DOWNGRADED on June 20th. Stand by for more details, but for now, there is no urgency to back up private channels or direct messages. Kubernetes Slack will lose its special status and will be changing into a standard free Slack on June […]
  • Enhancing Kubernetes Event Management with Custom Aggregation 2025-06-10
    Kubernetes Events provide crucial insights into cluster operations, but as clusters grow, managing and analyzing these events becomes increasingly challenging. This blog post explores how to build custom event aggregation systems that help engineering teams better understand cluster behavior and troubleshoot issues more effectively. The challenge with Kubernetes events In a Kubernetes cluster, events are […]
  • Introducing Gateway API Inference Extension 2025-06-05
    Modern generative AI and large language model (LLM) services create unique traffic-routing challenges on Kubernetes. Unlike typical short-lived, stateless web requests, LLM inference sessions are often long-running, resource-intensive, and partially stateful. For example, a single GPU-backed model server may keep multiple inference sessions active and maintain in-memory token caches. Traditional load balancers focused on HTTP […]
  • Start Sidecar First: How To Avoid Snags 2025-06-03
    From the Kubernetes Multicontainer Pods: An Overview blog post you know what their job is, what are the main architectural patterns, and how they are implemented in Kubernetes. The main thing I’ll cover in this article is how to ensure that your sidecar containers start before the main app. It’s more complicated than you might […]
  • Gateway API v1.3.0: Advancements in Request Mirroring, CORS, Gateway Merging, and Retry Budgets 2025-06-02
    Join us in the Kubernetes SIG Network community in celebrating the general availability of Gateway API v1.3.0! We are also pleased to announce that there are already a number of conformant implementations to try, made possible by postponing this blog announcement. Version 1.3.0 of the API was released about a month ago on April 24, […]
  • Kubernetes v1.33: In-Place Pod Resize Graduated to Beta 2025-05-16
    On behalf of the Kubernetes project, I am excited to announce that the in-place Pod resize feature (also known as In-Place Pod Vertical Scaling), first introduced as alpha in Kubernetes v1.27, has graduated to Beta and will be enabled by default in the Kubernetes v1.33 release! This marks a significant milestone in making resource management […]
  • Announcing etcd v3.6.0 2025-05-16
    This announcement originally appeared on the etcd blog. Today, we are releasing etcd v3.6.0, the first minor release since etcd v3.5.0 on June 15, 2021. This release introduces several new features, makes significant progress on long-standing efforts like downgrade support and migration to v3store, and addresses numerous critical & major issues. It also includes major […]
  • Kubernetes 1.33: Job's SuccessPolicy Goes GA 2025-05-15
    On behalf of the Kubernetes project, I'm pleased to announce that Job success policy has graduated to General Availability (GA) as part of the v1.33 release. About Job's Success Policy In batch workloads, you might want to use leader-follower patterns like MPI, in which the leader controls the execution, including the followers' lifecycle. In this […]
  • Kubernetes v1.33: Updates to Container Lifecycle 2025-05-14
    Kubernetes v1.33 introduces a few updates to the lifecycle of containers. The Sleep action for container lifecycle hooks now supports a zero sleep duration (feature enabled by default). There is also alpha support for customizing the stop signal sent to containers when they are being terminated. This blog post goes into the details of these […]
  • Kubernetes v1.33: Job's Backoff Limit Per Index Goes GA 2025-05-13
    In Kubernetes v1.33, the Backoff Limit Per Index feature reaches general availability (GA). This blog describes the Backoff Limit Per Index feature and its benefits. About backoff limit per index When you run workloads on Kubernetes, you must consider scenarios where Pod failures can affect the completion of your workloads. Ideally, your workload should tolerate […]
  • Kubernetes v1.33: Image Pull Policy the way you always thought it worked! 2025-05-12
    Image Pull Policy the way you always thought it worked! Some things in Kubernetes are surprising, and the way imagePullPolicy behaves might be one of them. Given Kubernetes is all about running pods, it may be peculiar to learn that there has been a caveat to restricting pod access to authenticated images for over 10 […]
  • Kubernetes v1.33: Streaming List responses 2025-05-09
    Managing Kubernetes cluster stability becomes increasingly critical as your infrastructure grows. One of the most challenging aspects of operating large-scale clusters has been handling List requests that fetch substantial datasets - a common operation that could unexpectedly impact your cluster's stability. Today, the Kubernetes community is excited to announce a significant architectural improvement: streaming encoding […]
  • Kubernetes 1.33: Volume Populators Graduate to GA 2025-05-08
    Kubernetes volume populators are now generally available (GA)! The AnyVolumeDataSource feature gate is treated as always enabled for Kubernetes v1.33, which means that users can specify any appropriate custom resource as the data source of a PersistentVolumeClaim (PVC). An example of how to use dataSourceRef in PVC: apiVersion: v1 kind: PersistentVolumeClaim metadata: name: pvc1 spec: […]
  • Kubernetes v1.33: From Secrets to Service Accounts: Kubernetes Image Pulls Evolved 2025-05-07
    Kubernetes has steadily evolved to reduce reliance on long-lived credentials stored in the API. A prime example of this shift is the transition of Kubernetes Service Account (KSA) tokens from long-lived, static tokens to ephemeral, automatically rotated tokens with OpenID Connect (OIDC)-compliant semantics. This advancement enables workloads to securely authenticate with external services without needing […]
  • Kubernetes v1.33: Fine-grained SupplementalGroups Control Graduates to Beta 2025-05-06
    The new field, supplementalGroupsPolicy, was introduced as an opt-in alpha feature for Kubernetes v1.31 and has graduated to beta in v1.33; the corresponding feature gate (SupplementalGroupsPolicy) is now enabled by default. This feature enables to implement more precise control over supplemental groups in containers that can strengthen the security posture, particularly in accessing volumes. Moreover, […]
  • Kubernetes v1.33: Prevent PersistentVolume Leaks When Deleting out of Order graduates to GA 2025-05-05
    I am thrilled to announce that the feature to prevent PersistentVolume (or PVs for short) leaks when deleting out of order has graduated to General Availability (GA) in Kubernetes v1.33! This improvement, initially introduced as a beta feature in Kubernetes v1.31, ensures that your storage resources are properly reclaimed, preventing unwanted leaks. How did reclaim […]
  • Kubernetes v1.33: Mutable CSI Node Allocatable Count 2025-05-02
    Scheduling stateful applications reliably depends heavily on accurate information about resource availability on nodes. Kubernetes v1.33 introduces an alpha feature called mutable CSI node allocatable count, allowing Container Storage Interface (CSI) drivers to dynamically update the reported maximum number of volumes that a node can handle. This capability significantly enhances the accuracy of pod scheduling […]
  • Kubernetes v1.33: New features in DRA 2025-05-01
    Kubernetes Dynamic Resource Allocation (DRA) was originally introduced as an alpha feature in the v1.26 release, and then went through a significant redesign for Kubernetes v1.31. The main DRA feature went to beta in v1.32, and the project hopes it will be generally available in Kubernetes v1.34. The basic feature set of DRA provides a […]