Kubecon Europe 2020 - Day 4

The fourth and last day of the KubeCon CloudNativeCon Europe 2020 virtual event. An amazing conference: congratulations to the CNCF team, and thanks to our SoKube team for following the sessions and writing these blog posts!


Check out other days:

Kubecon Europe 2020 - Day 1

Kubecon Europe 2020 - Day 2

Kubecon Europe 2020 - Day 3




Going Beyond CI/CD with Prow


By Leonardo Di Donato, Open Source Software Engineer, Sysdig


In the early days of Falco, CI was done through Travis CI.

The pain points were mostly about the non-interactive workflow between a classical CI system and GitHub (the CI does not react to statuses and events from the GitHub repo):

  • no clear ownership

  • PRs merged even when the GitHub status is KO

  • Some policies existed, but they were not easily discoverable or auditable

  • No automation

  • No enforcement for approvals

In the Falco context, the team didn’t want to spend time to:

  • build a custom CI/CD solution

  • create an automatic policy enforcer

The Falco team wants to focus solely on the development of their product.

As Kubernetes used Prow, Falco chose to follow the same path.


Prow capabilities:

  • GitHub ChatOps

  • Manage and enforce policies

  • Auto-merge bot, with considerations for GitHub status

  • Prow is OSS, so you can add some plugins and extensions if needed

  • Built for Kubernetes, on Kubernetes

=> With these capabilities (and the last one in particular), Prow is by nature very scalable.
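To make the ChatOps and policy-enforcement capabilities concrete, here is a minimal sketch of an OWNERS file, the mechanism Prow uses in Kubernetes repositories to encode ownership (the usernames are illustrative):

```yaml
# OWNERS file placed at the root of a repository or directory.
# Prow's lgtm and approve plugins read it: reviewers can comment /lgtm on a PR,
# approvers can comment /approve, and the Tide component merges automatically
# once the required labels are present and the GitHub status is green.
approvers:
  - alice
reviewers:
  - bob
  - carol
labels:
  - area/build
```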


It seems that Falco now uses Prow as its CI/CD solution, and it fits their needs perfectly.

We think it is very interesting to have a Kubernetes-native solution for CI/CD such as Prow, but as of now it is limited to GitHub repositories, and that is a pain point.

A huge proportion of organizations have other SCMs in place (Bitbucket, GitLab, SourceForge, …) and don’t want to migrate to GitHub. It seems that GitLab is considering helping the Prow project by providing an integration with their system, but for now it is only at the ideation stage.


The Past, Present, and Future of Cloud Native API Gateways


By Daniel Bryant, Product Architect, Datawire


The boundaries between apps and users have evolved over the last 30 years. Following the speaker’s vocabulary, we will refer to those boundaries of an application (networking, etc.) as the “edge”.

1990: Hardware load balancers

2000: software load balancers appear (nginx/haproxy,…)

2010: APIs appear and, with them, API gateways

2015: Microservices => independent, and so: different protocols, languages, locations, authentication systems, …

An API gateway needs to handle all of this: authentication, load balancing, discovery of new services.

Since the advent of microservices, the workflow has changed: app teams are now fully responsible for the delivery of their services.


The two biggest challenges:

  • Scaling edge management (who does what), because more and more resources, like routes, live in the API gateway (retries, authentication, caching, tracing and rate limiting are the main features of an API gateway solution)

  • Supporting all these requirements in different ways, since every service will choose the solution that best fits its own needs.

Three strategies:

  • Deploy an additional Kubernetes API gateway:

      • dev teams are responsible for it

      • OR existing Ops teams can manage it

  • Extend the existing API gateway:

      • augment an existing API gateway solution

      • custom ingress controller or load balancer

      • enables sync between the API endpoints and the location of k8s services

      • hard to maintain (custom scripts must avoid conflicts between routes inside the cluster)

  • Deploy an in-cluster edge stack (see the sketch after this list):

      • deploy a Kubernetes-native API gateway

      • install it in each of your Kubernetes clusters

      • the Ops team owns it and provides defaults

      • dev teams are responsible for configuring the network boundaries of their services as part of their normal workflow

      • simple to maintain, but learning about new proxy technologies can be hard at the beginning for the Ops team
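As an illustration of the third strategy, here is a minimal sketch of how a dev team could declare its own route with a Kubernetes-native API gateway. We use Ambassador (the speaker’s product) as an example; the route and service names are illustrative:

```yaml
# An Ambassador Mapping: the dev team owns this resource and ships it
# alongside the service, so edge configuration is part of the normal workflow.
apiVersion: getambassador.io/v2
kind: Mapping
metadata:
  name: orders-mapping            # illustrative name
spec:
  prefix: /orders/                # public path exposed at the edge
  service: orders:8080            # in-cluster Kubernetes service and port
```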


A nice session on the different evolutions of the API gateway during the last decade. The key points to keep in mind:

  • Edge and API gateways have gone through several evolutions driven by architecture (hardware vs software, networking from L4 to L7, and changes in workflow and responsibilities since the arrival of microservices)

  • Adopting microservices, with the changes it brings to your workflow, will lead you to pick a strategy for implementing an API gateway solution, choosing the option that best fits your requirements.


Kubeflow 1.0 Update by a Kubeflow Community Product Manager


By Josh Bottum, Vice President, Arrikto

The Kubeflow project is dedicated to making deployments of machine learning (ML) workflows on Kubernetes simple, portable and scalable.



The main building blocks of Kubeflow:

  • Jupyter notebooks: source code

  • Training operators: the Machine Learning layer

  • Workflow building: tools that simplify the building of Kubeflow pipelines

  • Pipelines: a way to schedule, run and monitor the workflow that runs your ML model

  • Data management: provides versioning, sharing and reproducibility of your models

  • Tools: TensorBoard, Prometheus, etc.; dashboards for visualization around Kubeflow

  • Metadata: the metadata of your models

  • Serving: serving tools let you expose your models efficiently in the system

Interesting points from surveys presented by the speaker:

  • Kubeflow is mostly used by software engineers and data scientists

  • Only 16% of Kubeflow users currently use it in production (~25% use it just for learning)

  • Users develop Machine Learning models faster with Kubeflow

  • Some tools, like CUJs (Critical User Journeys), help organizations:



The demonstration, using MiniKF (a small Kubeflow, available on GCP), showed how well integrated and visual the platform is:

  • code

  • deployments (with colors that help to see what is used or not)

  • pipelines:

      • before a pipeline runs, a snapshot is taken

      • you can see the pipeline status in real time while it is running

      • taking a snapshot of a stage in the pipeline captures its context too, giving you the opportunity to reproduce the exact issue you had (you can rerun a single step thanks to the serving components)


The demonstration was the best moment: it showed Kubeflow as a great platform for developing in the ML domain. It aggregates all the tools needed to develop a data model and is very user friendly (getting started with such a platform should help people curious to learn and experiment with ML).


Design Choices Behind Making gRPC Available on Web Platforms


By Wenbo Zhu, Senior Staff Engineer, Google


A presentation about a new protocol for supporting gRPC at the “web” level: gRPC-Web.



Requirements:


This new protocol (gRPC-Web):

  • must be compatible with gRPC

  • introduces minimal changes to the original protocol (only web-specific behavior, such as CORS handling, is added)

Limit the streaming support:

  • avoid the complexity of supporting protocols that require fallbacks, such as WebSockets

  • don’t invent anything we may regret in the future

  • don’t make an underlying streaming technology look more reliable than it is


It needs to work anywhere, supporting old platforms like IE10 as well as new ones, and both browser and non-browser clients.


Developer joy: prioritize features that improve the development experience (code generation and build, TypeScript, Node, …).


Be compatible with REST, keeping JSON support. There is no need to reimplement protocol-agnostic features such as security; gRPC-Web just integrates with them.


Roadmap:

  • Bidi streaming

  • Security features (XSRF, XSS, CSP)

  • Gateway with more languages - very limited for the moment

  • Protobuf improvements and performance


Using Kubernetes Secrets in GitOps Workflows Securely


By Seth Vargo (Engineer, Google) and Alexandr Tcherniakhovski (Engineer, Google)


Several problems arise when you kubectl create secret:

  • who created the secret, when and why?

  • is it tested? can we roll it back? is it the truth?

The focus is on what the source of truth for Kubernetes secrets is, and how we can protect it.

What about Git and GitOps? They bring history, rollbacks and reviews, and a source of truth.


First approach:

Use Git, but never with plaintext secrets (the pattern also works if you don’t use Git):

Use asymmetric cryptography: JSON Web Encryption (JWE).

Using an envelope is recommended because secrets in most KMS cannot exceed 64 KiB, so the envelope gives us the flexibility to encrypt larger payloads.

Workflow with personas:

  • Key admin (management of KMS)

  • Secret admin (manage sensitive data)

  • Cluster admin (deploy, manage and configure the kube cluster)

The key admin creates a key in the key management system, and pushes the public key to the Git repo.

The secret admin uses this key to create the JWE and the secret manifest in Git.

The cluster admin retrieves the secret file from the repo and pushes it to the Kubernetes cluster.



Storing the secret in etcd is the problem, since it can be retrieved from there afterwards…


Walkthrough

Demonstration using Google Cloud KMS and GCE (but it can work with other solutions, of course):

  • Key admin:

      • generates the asymmetric key in the KMS and exports the public key to the Git repository

      • grants the decrypt privileges to the dedicated service account in Kubernetes

  • Secret admin:

      • encrypts the credentials with the same crypto algorithm the key admin used

      • creates the JWE and the secret manifest, and pushes them to the Git repo

  • Cluster admin:

      • uses the dedicated service account to retrieve the secret from the KMS

      • configures the webhook to access the KMS
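To make the pattern concrete, here is a sketch of what the secret admin could commit to Git. The webhook annotation and all names are hypothetical; the important point is that only the JWE ciphertext, never the plaintext, reaches the repository (and later etcd):

```yaml
# Hypothetical manifest committed to Git: the value is a JWE encrypted
# with the key admin's public key, so the repository never sees plaintext.
apiVersion: v1
kind: Secret
metadata:
  name: database-credentials
  annotations:
    # Illustrative annotation telling the decryption webhook which KMS key
    # to use; the real annotation depends on the webhook implementation.
    kms.example.com/key: projects/my-project/locations/global/keyRings/my-ring/cryptoKeys/my-key
type: Opaque
stringData:
  password: eyJhbGciOiJSU0EtT0FFUCIsImVuYyI6IkEyNTZHQ00ifQ..c2FtcGxl  # JWE, illustrative
```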



A great session about how to really secure your secrets. Instead of storing the secret directly in Kubernetes, you can use a webhook, triggered only by the dedicated service account, to retrieve the secret from the KMS; the KMS secret can be read only by that service account, which keeps it secure.

And since what is stored in the cluster is just an encrypted payload plus access to the webhook, it does not matter if it can be seen in etcd: Google KMS will deliver the key only when called by the appropriate service account.

No matter what solution you choose, the key is to use a security solution on top of Kubernetes and not store your secrets as “plaintext” in etcd.


Threat Modelling: Securing Kubernetes Infrastructure & Deployments


By Jonathan Meadows (Head of Cloud Cyber Security Engineering, Citibank) and Rowan Baker (Head of Security, ControlPlane)

Summary


What is Threat Modeling?


Threat modeling prevents you from finding out about security issues when it is too late. It should be done as early as possible, once a shared understanding is established, and again whenever features are designed for every subsequent release. Everybody brings their own unique perspective: architects know how things should work, DevOps know how things actually work, and other roles like product owners, business analysts or internal users bring informative and necessary input to the modeling. To put it in practice, answer the four questions: what are you building? what can go wrong once it’s built? what should you do about what can go wrong? did you do a decent job with the analysis?


What does Threat Modeling look like for Kubernetes?


Kubernetes cluster threat models:

  • Provisioning and scaling

  • Runtime and cluster configuration

  • CI/CD and application deployment




What can go wrong after you have deployed your pod or run your CI/CD pipeline?


You can have the most secure system in the world at runtime, but if it is exploited because you forgot about supply-chain security and deploying securely into the system, then all that effort is wasted.


The first step is to define an end-to-end pipeline diagram like this:


Diagrams are really important for breaking down what is built into processes, data flows, trust boundaries and stores within the system.

After the diagram is established, we can use different techniques to find what can go wrong in the system. The most common is STRIDE.

  • STRIDE (Spoofing, Tampering, Repudiation, Information disclosure, Denial of service, Elevation of privilege): characterizes and identifies the kinds of threats that affect the processes, data flows and stores within the system

Existing runtime models - CNCF attack trees

Here is a GitHub repository with the threat model for the Kubernetes system: https://github.com/cncf/financial-user-group/tree/master/projects/k8s-threat-model

Attack trees:

“Attack trees provide a formal, methodical way of describing the security of systems, based on varying attacks. Basically, you represent attacks against a system in a tree structure, with the goal as the root node and different ways of achieving that goal as leaf nodes.” - Bruce Schneier (1999)

What are we going to do if one of the threats materializes?

There should be security controls. Here are a few examples:

  • Use dedicated devices and networks for management

  • Harden EC2 instances

  • Restrict EC2 instance IAM roles

  • Container-based IDS/IPS

  • mTLS for the control plane and etcd

You can implement these controls in several complementary areas:

  • Networking (VPC, ACLs, Security Groups, Subnets, …)

  • Runtime (security context for pods and containers: run as a non-root user, unprivileged, drop all Linux capabilities, … - see the sketch after this list)

  • RBAC and policy (Kubernetes RBAC, admission controllers, Open Policy Agent, …)

  • Supply Chain Security
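As a sketch of the runtime controls above, here is what a hardened pod security context can look like (the pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app
spec:
  containers:
    - name: app
      image: registry.example.com/app:1.0     # illustrative image
      securityContext:
        runAsNonRoot: true                    # refuse to start as root
        allowPrivilegeEscalation: false       # block privilege escalation
        readOnlyRootFilesystem: true          # immutable root filesystem
        capabilities:
          drop: ["ALL"]                       # drop all Linux capabilities
```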

Determining control sets:

You can start simple, but a more complex control set requires automation and testing → risk is the determining factor.

Defense in depth with attack trees:



Integrating Kubernetes with a global SOC:

  • Threat model

  • Reproduce the attack against a test cluster repeatedly

  • Gather the signals generated

  • Work with the Security Operations Center (SOC)

  • Re-run the test cases

  • Make sure Docker starts correctly


This topic deals with a very important aspect of Kubernetes: security. It was very informative and educational, explaining what a threat model is and how to create one, using diagrams and focusing on the right aspects like runtime, networking, supply chain and many more.


Autoscaling and Cost Optimization on Kubernetes: From 0 to 100


By Guy Templeton (Senior Software Engineer, Skyscanner) and Jiaxin Shan (Software Development Engineer, Amazon)


Autoscaling project reviews:


The Horizontal Pod Autoscaler

  • The core logic lives in the kube-controller-manager and is responsible for comparing the current state of metrics against the desired state and adjusting as necessary

Three different metric types can be used:

  • Resource (metrics.k8s.io)

  • Custom (custom.metrics.k8s.io)

  • External (external.metrics.k8s.io)

Resource metrics:

  • Resource metrics are the simplest of the 3 metrics - CPU and Memory based autoscaling.

  • Provided by the API metrics.k8s.io - the same metrics you can see when running kubectl top

  • Now usually provided by the Metrics Server - this scrapes the resource metrics from kubelet APIs and serves them via API aggregation

  • Currently based on the usage of the entire pod - this can be an issue if only one container in your pod is the bottleneck

Custom metrics:

  • Served under the API custom.metrics.k8s.io

  • No “official” implementation - though the most widely adopted is the Prometheus Adapter

  • Say you have a service where you know how many requests a given pod can handle at any time but the memory or CPU usage isn’t a good indicator of this - i.e. a fixed number of uWSGI processes

  • Scaling on CPU or memory is either going to waste money or result in decreased performance

External metrics:

  • Served under the external.metrics.k8s.io API path

  • A number of implementations exist for this - Azure, GCP and AWS provide ones for their metrics systems so that you can scale your k8s services based on metrics from them as well as some of the previously mentioned custom metrics implementations

  • Intended for metrics entirely external to kubernetes objects (e.g. kafka queue length, Azure servicebus queue length, AWS ALB active requests)

The HPA’s Algorithm
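The slide itself is not reproduced here, but for reference the HPA’s core formula, as documented upstream, is:

desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue)

For example, with 4 replicas at 90% CPU utilization and a target of 60%, the HPA scales to ceil(4 × 90 / 60) = 6 replicas.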


What if I want to scale on multiple metrics?

As of k8s 1.15 the HPA handles this well: you can scale on multiple metrics and the HPA will make the safest (i.e. highest) choice, even if one or more of the metrics is unavailable.
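A minimal sketch of such an HPA, assuming the autoscaling/v2beta2 API available in that Kubernetes version; the Deployment name and the custom metric are illustrative:

```yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: my-service
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource                    # resource metric: CPU utilization
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Pods                        # custom per-pod metric
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
```

The HPA computes a desired replica count for each metric independently and applies the highest result.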


What about scaling down to zero?

It’s possible, but you have to set your HPA up in the right way - it requires both enabling an alpha feature gate (HPAScaleToZero) and setting up the associated HPA with at least one object or external metric.


Vertical Pod Autoscaling

Applications change over time, so the initial resource request settings may no longer be suitable later on:

  • Daily/Weekly traffic patterns

  • User base growing over time

  • App lifecycle phases with different resource needs

The Vertical Pod Autoscaler (VPA) aims to solve these problems - scaling the resource requests and limits for monitored pods up and down to match demand and reduce waste.

  • Three components:

      • Recommender: responsible for calculating recommendations based on historical data

      • Updater: responsible for evicting pods whose resources are to be modified

      • Admission plugin: a mutating admission webhook, parsing all pod creation requests and modifying those with a matching VPA to match the recommendations

  • Currently provides 4 modes: Auto, Recreate, Initial, Off
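A minimal sketch of a VPA object, assuming the autoscaling.k8s.io/v1 CRD from the kubernetes/autoscaler project; the target Deployment is illustrative:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-service-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-service
  updatePolicy:
    updateMode: "Auto"    # one of Auto, Recreate, Initial, Off
```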

Benefits:

  • Useful for singletons

  • Services used by internal teams

  • No use giving them peak resource usage and burning money during the quiet periods

Limitations:

  • Shouldn’t be used in conjunction with resource-based HPAs, as the two will conflict

  • Modifying the resource requests requires recreating the pod - meaning a pod restart

  • Can be tricky to use with JVM based workloads on the memory side

The Cluster Autoscaler (CA)

Scale-ups are triggered by pending pods: the CA evaluates which of the node groups it monitors would be able to fit the pending pods if they were scaled up. Scale-down is evaluated for nodes using resources below a certain threshold.



Cluster Autoscaler Expanders

The different methods supported by the Cluster Autoscaler for deciding which node group to scale up when needed

  • Random (the default): picks a random candidate node group which can fit the pending pods

  • Priority (available from 1.14 onwards): can be used in conjunction with custom logic

  • Price (currently GKE/GCP only): automatically picks the cheapest candidate node group for you

  • Least waste: picks the candidate node group with the least wasted CPU after scale-up

There are a number of things to consider when enabling Cluster Autoscaling like which pods can tolerate interruptions, whether pods being scaled down need to do any clean up, pod priorities and more …


Cost optimisation with the Cluster Autoscaler


If you have batch jobs or jobs which don’t need to run immediately, you can use “--expendable-pods-priority-cutoff” to avoid the CA scaling up purely for ultra-low-priority jobs.

If you want to fall back to on-demand instances when Spot/Preemptible instances are out of capacity, you can create on-demand node groups with a lower expansion priority and spot instance node groups with a higher priority, as sketched below.
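A minimal sketch of this setup, assuming the CA runs with --expander=priority; the node-group name patterns are illustrative:

```yaml
# The priority expander reads this ConfigMap: the higher the number, the
# higher the priority. Spot groups are tried first; on-demand groups are
# the fallback.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-autoscaler-priority-expander
  namespace: kube-system
data:
  priorities: |-
    50:
      - .*spot.*
    10:
      - .*on-demand.*
```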



You can also use the “--max-node-provision-time” flag to reduce the total fallback time if you have multiple spot node groups and each fallback otherwise takes 15 minutes.


The best practice when using the CA is to map each node group to a single ASG, because accurate simulation requires that all instances in a group have the same resources.



Gotchas with the Cluster Autoscaler


How do I protect my critical workloads and ensure they don’t get interrupted by the CA?

The annotation “cluster-autoscaler.kubernetes.io/safe-to-evict=false” prevents the CA from terminating the node running your critical job, even if the node utilization is lower than the default threshold.
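For example (the pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: critical-batch-job
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"  # CA won't drain this node
spec:
  containers:
    - name: job
      image: registry.example.com/batch:1.0   # illustrative image
```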


How to overprovision Kubernetes with the Cluster Autoscaler?

The overprovisioning feature runs dummy pods with a low priority to reserve space. The k8s scheduler evicts them to make room for unschedulable pods with a higher priority, so critical pods don’t have to wait for new nodes to be provisioned. These pods don’t even have to be dummy pods if you have a suitable non-critical workload that can tolerate interruption.
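A minimal sketch of this pattern, following the approach described in the Cluster Autoscaler documentation; names and sizes are illustrative:

```yaml
# A negative-priority class for the placeholder pods...
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10                            # lower than any real workload
globalDefault: false
description: "Placeholder pods reserving headroom for the Cluster Autoscaler"
---
# ...and a Deployment of pause pods that hold the headroom.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
spec:
  replicas: 2
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: reserve
          image: k8s.gcr.io/pause     # does nothing, just reserves resources
          resources:
            requests:
              cpu: "1"
              memory: 1Gi
```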


What if all of my services start scaling and don’t stop scaling?

  • ResourceQuotas are invaluable here: figure out the maximum resources a given namespace should use at peak load (allowing for failovers), and set the ResourceQuota for that namespace to guard against runaway scaling (see the sketch after this list)

  • In addition, set the maximum size of the node groups to limit the scale of the cluster on the Cluster Autoscaler’s side
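A minimal sketch of such a guard, with illustrative names and limits:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a                 # illustrative namespace
spec:
  hard:
    requests.cpu: "100"             # max total CPU requested at peak load
    requests.memory: 200Gi
    pods: "300"                     # hard cap against runaway replica counts
```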


The Cluster Autoscaler doesn’t yet support all cloud providers, but most of the big ones are covered. The roadmap includes decoupling the cloud providers and supporting pluggable cloud providers over gRPC.


This session was very interesting, demonstrating how autoscaling at different levels is possible. As with anything, cost saving in Kubernetes is about analyzing the trade-offs you can make: which pods can afford to be interrupted, how quickly you need services to scale up and down, and what scaling behaviour you want in the cluster. Finally, the best cost-saving strategies vary depending on your workloads, environment and cloud provider.


Next Generation of CI/CD: Analytics-driven Traffic Management on Kubernetes


By Fabio Oliveira (Research Scientist, IBM Research)


The goals of this presentation are:

  • Raise awareness of a fundamental yet largely ignored problem at the core of cloud native canary releases, performance tests, and A/B & A/B/n testing

  • Offer an open solution to that problem and engage the community

It is about agility:




But it is also about learning:

  • What if you could safely:

      • learn how your code behaves in production or test?

  • What if you could continuously and safely:

      • learn what resonates with your users?

      • find ways to increase your company’s revenue?

      • maximize your company’s revenue as you learn?



Analytics is crucial: continuous experimentation is an analytics problem, and a comparative analytics problem! For that, enter iter8.


Overview of iter8



  • Version assessment:

      • with confidence

  • Traffic control strategies:

      • progressive

      • top-2

      • uniform

  • Traffic control safety filters:

      • cutoff on failure

      • maximum increment

      • match clause

      • experiment traffic percentage

iter8 experiment types




  • Performance test:

      • assess a version against criteria

      • typically done in a test/dev environment

      • can be done in production



  • Canary release:

      • 2 versions: baseline and candidate

      • assess the canary:

          • make sure no SLOs are violated

          • relative criteria make sense

      • apply the traffic control strategy:

          • if the canary passes → roll forward

          • if the canary fails → roll back


  • A/B and A/B/n testing:

      • “n” versions:

          • a baseline

          • 1 or more candidates

      • compare versions to declare a winner:

          • maximize a reward metric

          • make sure no SLOs are violated

      • apply the traffic control strategy:

          • traffic will shift towards the winner

This conference was really informative, showing how analytics-driven traffic management can drive a CI/CD pipeline. For this, the iter8 tool is introduced to increase the power of continuous experimentation based on Machine Learning. It is particularly interesting to see how, today, different domains can work together to drastically improve business outcomes; in this case, a mix of DevOps and Machine Learning.

For more information about iter8:

Stateless Fluentd with Kafka


By Steven McDonald (Site Reliability & Infra Engineer, Usabilla)


Steven McDonald presents several iterations of his Fluentd logging stack:

First iteration:

  • Two fluentd aggregators, with fluent-bit on every host configured to forward local logs to fluentd.

  • Both fluentd and fluent-bit were configured for disk-backed buffering for reliability.

  • Fluentd then forwarded logs on to CloudWatch and Elasticsearch.



He explains that there have been cascading failures, initially caused by the Elasticsearch cluster (data duplication, exploding volumes, performance problems, etc.).

He highlights the lessons learned from each of the iterations tested.


Second iteration:


A significant new iteration: he brings in Kafka as a “logging buffer” to get a stateless architecture.





Despite many contributions to the ruby-kafka plugin, the team still encountered a lot of problems (poor handling of large batches, performance problems, …).


Third iteration:




The third iteration uses a KStream to filter one or more Kafka topics and then adds two Kafka Connectors (sinks), one for S3 storage (https://www.confluent.io/hub/confluentinc/kafka-connect-s3) and the other for Elasticsearch (https://www.confluent.io/hub/confluentinc/kafka-connect-elasticsearch).
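As a sketch, the S3 sink side of such a setup could be configured roughly like this (shown as key-value pairs; in practice the equivalent JSON is POSTed to the Kafka Connect REST API, and the bucket, region and topic names are illustrative):

```yaml
name: logs-s3-sink
connector.class: io.confluent.connect.s3.S3SinkConnector
tasks.max: "2"
topics: logs                       # topic produced by the KStream filter
storage.class: io.confluent.connect.s3.storage.S3Storage
format.class: io.confluent.connect.s3.format.json.JsonFormat
s3.bucket.name: my-log-archive     # illustrative bucket
s3.region: eu-west-1
flush.size: "1000"                 # records per S3 object
```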


Given the "Stateless" approach, it is very likely that only data on S3 and Elastic will be persisted.


In this type of configuration, it would be interesting to add a Schema Registry (https://docs.confluent.io/current/schema-registry/index.html) and ksqlDB (https://ksqldb.io/).



A very interesting conference, especially the third iteration, which we have implemented at SoKube - it validates our architecture choice. It remains to be seen whether the Fluent Bit connector (planned for v1.6) will remove the need for a dedicated KStream for processing.


Deep Dive into Helm


By Paul Czarkowski (Developer Advocate, VMware) and Scott Rigby (Human, Home)


The session was mostly about the new version of Helm (v3):

  • Bye bye Tiller

  • Easy upgrade to Helm v3 (helm-2to3 plugin)

  • CNCF graduated project

  • Helm 2 deprecation (November 30, 2020)

  • Chart submission process for the community

  • Schema validation (JSON file)

  • Tests (with OCI registries)

  • CLI enhancements (Go SDK)

  • Post-rendering (--post-renderer | https://helm.sh/docs/topics/advanced/)

  • Helm tools (helm-diff, helmfile, helm-controller, helm-conftest)

  • Security: Helm inherits your RBAC (bye bye Tiller and its dedicated RBAC) & chart provenance

The end of the session was interesting, but Helm v3 has been out since November 13, 2019; focusing so much on the features introduced since v2 was a bit disappointing, as we were expecting more focus on advanced concepts.


©2020 - SOKUBE SA - GENEVA - SWITZERLAND
