Kubecon Europe 2020 - Day 3

Third day of the KubeCon CloudNativeCon Europe 2020 virtual event.

Again, lots of interesting sessions!


Check out other days:

Kubecon Europe 2020 - Day 1

Kubecon Europe 2020 - Day 2

Kubecon Europe 2020 - Day 4




Keynotes



Keynote: Kubernetes Project Update - Vicki Cheung, KubeCon + CloudNativeCon Europe 2020 Co-Chair & Engineering Manager, Lyft



Vicki Cheung presented the highlights of the Kubernetes 1.18 release.

  • 38 enhancements

  • Release team: 34 persons

  • 40,000 individual contributors to date!

Key features:

  • Storage Enhancements:

  • Raw block device support graduates to Stable

  • alpha version of CSI Proxy for Windows (to perform privileged storage operations in Windows)

  • Scheduling Enhancements:

  • Run multiple scheduling profiles

  • Taint Based Eviction graduates to Stable

  • PodTopologySpread graduates to Beta

  • HPA (Horizontal Pod Autoscaler)

  • Feature in Alpha

  • Finer-grained control over autoscaling rates

  • Avoid flapping of replicas

  • Adjust scaling behavior based on the application profile (a hedged example follows this feature list)

  • Kubectl alpha debug

  • Ephemeral containers were added in 1.16

  • When “kubectl exec” isn’t enough

  • kubectl alpha debug -it demo --image=debian --target=demo

  • Priority and Fairness for API Server Requests

  • Protect API servers from being overloaded while ensuring critical requests go through

  • Prevent loss of user access should anything run amok

  • Node Topology Manager graduates to Beta:

  • Useful for high performance computing and machine learning workloads

  • Other Notable features:

  • IPv6 Beta

  • Certificate API

  • APIServer DryRun and “kubectl diff”
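
To make the HPA item above more concrete, here is a minimal sketch of the configurable scaling behavior introduced in 1.18; the resource names and thresholds are illustrative assumptions, not taken from the talk:

    apiVersion: autoscaling/v2beta2
    kind: HorizontalPodAutoscaler
    metadata:
      name: demo                            # hypothetical application
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: demo
      minReplicas: 2
      maxReplicas: 10
      metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
      behavior:                             # new in 1.18: finer-grained control of scaling rates
        scaleDown:
          stabilizationWindowSeconds: 300   # avoid flapping of replicas
          policies:
            - type: Pods
              value: 1
              periodSeconds: 60             # remove at most one pod per minute
        scaleUp:
          policies:
            - type: Percent
              value: 100
              periodSeconds: 60             # at most double the replicas per minute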

In the next Kubernetes 1.19:

  • Generic ephemeral volumes

  • Kube-proxy IPV6DualStack support on Windows

  • Initial support for cgroup v2 (YES!!!)


Sponsored Keynote: The Kubernetes Effect - Igniting Transformation in Your Team - Briana Frank, Director of Product, IBM Cloud


Briana Frank explained how to bring your IT projects into the Cloud Native world.

New projects allow teams to re-evaluate the current path of transformation using Design Thinking workshops. Take advantage of the start of new projects to catalyze innovation and enforce a culture of automation to gain efficiency. Humans and communication are key; ChatOps is one example of how to operate and improve communication.

Automate the deployment of thousands of clusters with ease: for example with Razee.io.


Keynote: How to Love K8s and Not Wreck the Planet - Holly Cummins, Worldwide IBM Garage Developer Lead, IBM

This talk could have been renamed "Zombie land"!

Excellent keynote by Holly Cummins explaining the impact of wasteful resource usage on the planet.

The resource consumption of data centers is not that far from aviation's (1 to 2% vs. 2.5%)!


The Kubesprawl trend (a pattern of "many clusters" rather than one big shared cluster) is not uncommon when multiple teams belong to the same organization. Clusters are less elastic than applications and add overhead.

There are several reasons to get your own cluster: isolation, security, perf, name collisions, …

Consolidation with multi-tenancy could be a good approach.


In all cases, don't forget your resources: zombie workloads! You know, the things you forget in your cluster that stay "alive" (but not really) and keep consuming resources for a while!

  • Manual solutions exist: meetings, tags on manifest objects, …

  • The solution is to do the right thing

  • GitOps helps a lot with infrastructure-as-code: "disposable infrastructure" you can deploy, redeploy and delete in a single operation.

To avoid zombie workloads, see the talk "Sharing Clusters: Learnings From Building a Namespace On-Demand Platform", especially the part on "Monitor cost and identify idle namespaces".


Sponsored Keynote: Keep It Simple - A Human Approach to Coping with Complexity - Hannah Foxwell, Director – Platform Services, VMware Pivotal Labs


In this presentation, Hannah Foxwell explains the path of least resistance to a successful Cloud Native project:

  • Start by identifying Early Adopters

  • Build a Minimum Viable Product

  • Don’t build the platform as a silo; think of it more as a service

  • Scale the success

  • Communication and relationships are key

  • Small steps are easier

  • Adopt the KISS principle: Keep It Short and Simple

And remember, we are human, not superhuman!


Containerd Deep Dive


By Akihiro Suda (Software Engineer, NTT) and Wei Fu (Software Engineer, Alibaba)

This talk was about containerd, a container runtime that implements the CRI runtime specification.

Containerd is not only used by Docker but also by many distributions: K3s, Kubespray, MicroK8s, Charmed Kubernetes, kind, minikube, Alibaba ACK, Amazon EKS, Azure AKS, Google GKE, … and also by libs/frameworks like BuildKit, LinuxKit, faasd, VMware Fusion Nautilus.


A lot of nice features were introduced with version 1.4:

  • Lazy pulling of images: run containers before the image download completes -> improves start-up speed. It is based on stargz & eStargz: with a plain tar format you cannot seek to a specific offset until the whole file has been downloaded

  • Support for cgroup v2 and improved support for rootless mode (resource limitation support)

  • Windows CRI support

  • and much more (see release notes: https://github.com/containerd/containerd/releases/tag/v1.4.0)

Containerd is highly customizable through its V2 runtime:

  • Runtime v2 provides a shim API that allows, for instance, integration with low-level runtimes like gVisor, Kata Containers or Firecracker (a hedged example follows this list)

  • Shim binaries follow a specific naming convention (e.g. containerd-shim-runc-v2)

  • Support for pluggable logging via STDIO URIs
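
As a hedged illustration (not shown in the talk) of how a shim-v2 based runtime is consumed from Kubernetes: a RuntimeClass maps pods to the runtime, and containerd resolves the handler to the corresponding shim binary via its configuration. The handler name "runsc" (gVisor) and the pod below are assumptions:

    apiVersion: node.k8s.io/v1beta1   # RuntimeClass API as of Kubernetes 1.18
    kind: RuntimeClass
    metadata:
      name: gvisor
    handler: runsc                    # containerd maps this handler to a shim-v2 binary (e.g. containerd-shim-runsc-v1)
    ---
    apiVersion: v1
    kind: Pod
    metadata:
      name: sandboxed-demo            # hypothetical pod
    spec:
      runtimeClassName: gvisor        # run this pod with the gVisor shim instead of runc
      containers:
        - name: app
          image: nginx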

Containerd is definitely a good CRI runtime with lots of nice features in v1.4. We especially appreciate the lazy pulling of images and expect a lot from cgroup V2.


What You Didn’t Know About Ingress Controllers’ Performance


By Mikko Ylinen (Senior Software Engineer, Intel) and Ismo Puustinen (Cloud Software Engineer, Intel)


Basically, an Ingress exposes HTTP(S) routes to services within the cluster. The Ingress controller watches Kubernetes objects and creates the route configuration; the Ingress proxy reads that configuration and handles the actual traffic routing.

There are several implementations like Nginx, HAProxy, Envoy…

Performance bottlenecks are mostly bandwidth and latency.

Several areas can be tuned. The talk focused on the TLS handshake, because most of a request's time is spent there.

How to improve it:

  • TLS 1.3 or a faster cipher…

  • Sync vs. async TLS

  • Async TLS support is currently being added to Ingress controllers

  • Async TLS offloading

When to offload depends on whether the controller is CPU-bound, whether there are lots of new HTTPS connections, …
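
As a hedged example (not from the talk), with the NGINX Ingress Controller the TLS protocol and cipher selection can be tuned through the controller's ConfigMap. The key names follow the ingress-nginx documentation, while the values and object names below are illustrative assumptions:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: ingress-nginx-controller      # name/namespace depend on your installation
      namespace: ingress-nginx
    data:
      ssl-protocols: "TLSv1.3 TLSv1.2"    # prefer TLS 1.3 where clients support it
      ssl-ciphers: "ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256"
      ssl-session-cache-size: "10m"       # session reuse avoids repeating full handshakes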


Example with HAProxy Hardware Acceleration, HAProxy RSA Multibuffer.

Call to action:

  • Check where your bottlenecks are

  • Ingress controllers: check the config of crypto offload, non-native resources, node affinity labels

  • Ingress Proxy devs: switch to async TLS and allow custom TLS handshake

A good and very technical presentation on how to tune your Ingress if you need the best performance.


Managing Multi-Cluster/Multi-Tenant Kubernetes with GitOps


By Chris Carty (Customer Engineer, Independent)

The speaker presented what GitOps is (you will also find a presentation of GitOps on the SoKube blog).


2 main GitOps projects: Flux and Argo CD.

The project structure contains your deployment YAML files.

Some useful tools in the GitOps context :

  • OPA (Open Policy Agent)

  • Conftest (helps you write tests against structured configuration data)

  • Kubeval (tool for validating a Kubernetes YAML or JSON configuration file)

  • Kind (a tool for running local Kubernetes clusters using Docker container "nodes", like k3d/k3s)

Single vs. multiple Git repositories in the case of multi-tenancy


Single Git repo:

  • Contains the CI, cluster-admin resources (monitoring, networking, security), team-1 (app resources), team-2 (app resources), … (an illustrative layout follows this list)

  • Each cluster (Dev, QA, Prod, …) syncs on a specific branch or tag

Multi-repo:

  • Several Git repos can target the same cluster but in different namespaces

  • Branch/tag = grouping/environment of clusters
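
As an illustration of the single-repo approach described above (directory and branch names are purely hypothetical), the layout could look like this, with each cluster syncing its own branch or tag:

    ci/                  # pipeline definitions
    cluster-admins/      # monitoring, networking, security
    team-1/              # application resources for team 1
    team-2/              # application resources for team 2
    # branches/tags: dev, qa, prod -> each cluster (Dev, QA, Prod) syncs a specific one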


A nice presentation of this very important topic. The speaker did not have time to go into the details and present the pros and cons of each approach, but he gave important pointers and a way to achieve multi-tenancy in Kubernetes using Flux.


Sharing Clusters: Learnings From Building a Namespace On-Demand Platform


By Lukas Gentele (CEO, DevSpace Technologies Inc.)

2 Approaches

  • Single-Tenant k8s: 1 Team/App per Cluster => too expensive

  • Multi-tenant k8s: Sharing large Cluster => less expensive but more complex

Several learnings:

  • Centralize user management and authentication

  • SSO for k8s via Dex for instance

  • Restrict users but use smart defaults (UX matters)

  • Pod Security Policy

  • Resource Quotas: set defaults via LimitRange (Mutating Admission Controller)

  • Network Policies: default to deny all, while allowing traffic from inside the namespace (a hedged sketch follows this list)

  • Automate as much as possible

  • Templates for RBAC, Quotas, Network Policies

  • OPA for dynamic admission control (e.g. hostname validation for Ingress or certificate resources, or blocking certain storage and network configurations…)

  • Store everything in Kubernetes + Git

  • Use annotations, labels, secrets and config maps to store info about owners, tenants, …

  • GitOps for history, audit, rollback and an approval process via PRs and code owners

  • CRDs for even more control & automation

  • Kiosk:

  • Do not hide Kubernetes but make it easier to use

  • Engineers need direct access to Kubernetes (to verify, debug, ….)

  • Kubectl is an API not a dev tool

  • Simplifying Local Development in Kubernetes with Telepresence

  • File-Sync-based Dev Experience: Skaffold, DevSpace, Tilt

  • Monitor cost and identify idle namespaces

  • Automate the shutdown of idle namespaces with:

  • Cluster-turndown by Kubecost

  • Idling on OpenShift

  • Sleep mode in Loft

  • Sometimes users need more than just namespaces

  • Namespace-based multi-tenancy has limitations: CRDs needed by users, or specific versions of k8s

  • Virtual clusters can solve this problem: vCluster
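
Here is a minimal sketch of the "smart defaults" mentioned above: a default-deny NetworkPolicy that still allows traffic from inside the namespace, and a LimitRange providing default requests/limits. The namespace name and values are assumptions, not taken from the talk:

    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-ingress
      namespace: team-a          # hypothetical tenant namespace
    spec:
      podSelector: {}            # applies to all pods in the namespace
      policyTypes:
        - Ingress
      ingress:
        - from:
            - podSelector: {}    # ...but still allow traffic from inside the namespace
    ---
    apiVersion: v1
    kind: LimitRange
    metadata:
      name: default-limits
      namespace: team-a
    spec:
      limits:
        - type: Container
          defaultRequest:        # defaults injected by the mutating admission controller
            cpu: 100m
            memory: 128Mi
          default:
            cpu: 500m
            memory: 512Mi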

A very interesting talk on how to address multi-tenancy, providing best practices and tools like Telepresence, Cluster-turndown, Sleep mode, vCluster and Kiosk…


Multi-Tenant Clusters with Hierarchical Namespaces


By Adrian Ludwin (Senior Software Engineer, GKE, Google)

The concept of hierarchical namespaces is new in Kubernetes.

Multi-tenancy matters for both cost and velocity.

One tenant per cluster was fine for small teams.

Kubesprawl: one cluster per tenant or team does not scale very well.

A Kubernetes multi-tenancy working group exists to address these issues.

Namespaces are the primary unit of tenancy in Kubernetes, and most security features require namespaces:

  • RBAC works best at the namespace level

  • Also applies to most other policies : resource Quota, NetworkPolicy, …

Policies across namespaces:

  • Need a tool and a source of truth outside k8s: Flux, ArgoCD, …

  • Alternatively, some in-cluster solutions add accounts or tenants: Kiosk or a Tenant CRD

Hierarchical Namespace Controller (HNC):

Hierarchical namespaces make it easier to share your cluster by making namespaces more powerful. For example, you can create additional namespaces under your team's namespace, even if you don't have cluster-level permission to create namespaces, and easily apply policies like RBAC and Network Policies across all namespaces in your team (e.g. a set of related microservices).

  • Entirely Kube native

  • Builds on regular kube namespaces

  • Delegate subnamespace creation without cluster privileges! (a hedged sketch follows this list)

  • Policy propagation

  • Subnamespace hierarchy

  • A subnamespace cannot be moved

  • Trusted label with a "tree" suffix in the child namespace

  • Easy to extend
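
A hedged sketch of delegated subnamespace creation with HNC (the API version varies across HNC releases and the namespace names are hypothetical); the kubectl-hns plugin offers the equivalent "kubectl hns create team-a-dev -n team-a":

    apiVersion: hnc.x-k8s.io/v1alpha2   # may be v1alpha1 depending on the HNC release
    kind: SubnamespaceAnchor
    metadata:
      name: team-a-dev                  # child namespace to create
      namespace: team-a                 # parent namespace: only namespace-level permissions are needed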


Other features:

  • Authorization checks before modifying the hierarchy

  • Cascading deletion of subnamespaces

  • Monitoring options (Metrics via OpenCensus)

  • Uninstallation support (to avoid data deletion)


Hierarchical namespaces are in alpha but can simply be added as an add-on on Kubernetes 1.15+ (well done!)


It is a really cool feature that will simplify the multi-tenancy approach in a K8S cluster (especially with a GitOps approach) and give teams control over their own namespace hierarchy.


Automating Load Balancing and Fault Tolerance via Predictive Analysis - Steven Rosenberg, Red Hat


By Steven Rosenberg (Software Engineer, Red Hat)


Load Balancing → type of solution; Fault Tolerance → Live Migration; Scheduling → Predictive Analysis

Load Balancing

  • Priority based upon urgency

  • Even Distribution within categories :

  • Urgent priority - Mission Critical - Real Time Processing

  • High Priority - High Importance - near Real Time Processing

  • Neutral Priority - Medium importance - Normal Processing

  • Low Priority - Low importance - Not Time Critical Processing

  • No Priority - Unimportant - Unimportant Processes


Fault Tolerance Redundancy Example

Scheduling

  • Ability to launch processes based upon needed resources

  • Monitor the amount of resource each process utilizes

  • Type of Launching/Migration Scenarios :

  • Initial Launch

  • Migration for maintenance

  • Re-balancing - Migration to another host

  • Fault recovery - Migrating to mitigate system/process failure


Policy units - Attributes of scheduling Migrations

  • Filters

  • Weights/Scoring

  • Balancers :

  • Even distribution

  • Power saving

  • Prioritizing

  • Affinity

  • CPU/Non-Uniform Memory Access (NUMA) pinning for optimal performance

Live migration process from the source host to the destination host:

  • Network connectivity

  • Remote disk availability

  • Migration data on local disk(s)

  • Copying memory state in phases

  • All of the current memory contents

  • Current differences before VM pausing

  • Minimal difference during VM pausing

  • Copy CPU State

  • The goal is to limit pausing of the VM

  • Restarting the VM on the destination host

  • Clean up on the source host

Predictive Analysis Topics

  • Predicting future occurrences via analysis of past performance

  • Techniques for predictive analysis

  • Process for developing a prediction model

Predictive analytics methodology :

Get historical data → Create a training Set → Create an algorithm and a model → Get result → Restart process
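
As one simple illustration of such a model (not from the talk), an exponentially weighted moving average can predict the next resource-usage sample from past observations:

    \hat{x}_{t+1} = \alpha \, x_t + (1 - \alpha) \, \hat{x}_t, \qquad 0 < \alpha \le 1

where x_t is the usage observed at time t, \hat{x}_t is the previous prediction, and \alpha weights recent observations more heavily; the training set is used to choose \alpha, and the process restarts as new historical data arrives.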


Process for developing a prediction model:

Applying Predictive Analytics to Schedulers

  • Criteria for Data

  • Processing time / Iteration - Adjusted for resource capacity and priority

  • Percentage of resources used - Adjusted for capacity and priority

  • Adjust for anomalies when calculating averages


  • Ideas - Selective techniques applied to other scheduling applications:

  • Combining regression-like modeling and functional approximation using a sum of exponential functions to produce probability estimates

  • Machine Learning and advanced mathematical models


Predictive analysis architecture :


Tracking historical data :

  • The time each process starts and terminates

  • The resources used by each process

  • The time each process uses to migrate

  • The time/iteration that memory/disk transfer occurs per size

Considerations based upon the analysis:

  • Whether early migration can proceed

  • When early migration shall start

  • Error correction/anomaly detection for accurate results


This topic was interesting but complex. It demonstrates how to be proactive about load balancing and fault tolerance, based on collected historical data and mathematical models that anticipate future faults. There are interesting models here to improve infrastructure reliability.


Simplify Your Cloud Native Application Packaging and Deployments - Chris Crone, Docker


By Chris Crone (Engineering Manager, Docker)

What is a cloud native application ?

“A program or piece of software designed to fulfill a particular purpose” - Oxford English Dictionary

A cloud native application is made of:

  • Compute :

  • Containers

  • Function (AWS Lambda, Azure Functions…)

  • Virtual Machines

  • Storage :

  • Databases

  • Object storage

  • Volumes

  • Networking


The CNCF Cloud Native Landscape map represents a view of the many applications, tools, runtimes and more recommended by the CNCF. On this interactive map you can visit a product's website directly by clicking on its logo.


Deploying application

Are you encountering the following problems? Probably yes!

  • Often need more than one tool to deploy an application

  • Is the README up to date?

  • Which version of the tools?

  • What if I'm using Windows or Mac and not Linux?

  • Difficult coordination problems between team members, CI, users…

Ideal deployment tooling?

  • Defined as code: tools, versions, options → what's the solution?

  • Same deployment environment everywhere → what's the solution?


Packaging application

Different parts, different places :


Ideal application packaging?

  • Immutable application artifact → what's the solution?

  • Store the whole application in a registry → what's the solution?

  • Ability to store application artifacts online → what's the solution?

There are many questions without answers!


Cloud Native Application Bundles

→ Here the Cloud Native Application Bundles (CNAB) enter the picture.

CNAB is a package format specification that describes a technology for building, installing and managing distributed applications that are, by design, cloud agnostic.

CNAB specification

  • The target audience is tooling developers

  • Packaging specification (bundle)

  • Bundle runtime (action)

  • Install, upgrade, uninstall

  • Optionally

  • Lifecycle tracking

  • Registry storage

  • Security

  • Dependency


Bundle structure :

CNAB runtime

  • Standard actions : install, upgrade, uninstall

  • Custom actions :

  • status, logs …

  • Stateful, stateless

  • Application lifecycle tracked by claims

  • Keep track of state of installation

  • Keep a record of parameters, outputs…

  • Claims are the only data structure defined in the specification


Finally, CNAB answers the previous questions:

→ Ideal deployment tooling ?

  • Defined as code: tools, versions, options → solution:

  • The Porter tool: package everything you need to do a deployment (command-line tools, configuration files, secrets and bash scripts to glue it together) into a versioned bundle distributed over standard Docker registries or tgz files (a hedged porter.yaml sketch follows this list).

  • porter.yaml

  • Stored in CNAB invocation image

  • Same deployment environment everywhere → solution :

  • Containers
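
A hedged sketch of what such a porter.yaml can look like: the mixin structure follows the Porter documentation, while the application name, paths and some field names are assumptions and may differ between Porter versions:

    name: my-app                   # hypothetical bundle name
    version: 0.1.0
    description: "Example CNAB bundle built with Porter"
    tag: myorg/my-app:v0.1.0       # bundle reference in a Docker registry (newer Porter releases use 'registry:')

    mixins:
      - exec
      - kubernetes

    install:
      - kubernetes:
          description: "Deploy the application manifests"
          manifests:
            - ./manifests
          wait: true

    uninstall:
      - kubernetes:
          description: "Remove the application manifests"
          manifests:
            - ./manifests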

→ Ideal application packaging ?

  • Immutable application artifact → solution :

  • Hashes for components

  • Leverage OCI image specification

  • Store the whole application in a registry → solution :

  • Any OCI compliant container registry

  • Ability to store application artifact online → solution :

  • OCI image layout


CNAB in registries :