Accessing the Sparkfabrik K8s Cluster
Since late 2016, Sparkfabrik's internal services (Gitlab, CI/CD pipelines, SparkBoard, etc.) have been running in a Kubernetes cluster hosted on GKE/GCP.
This means that all intermediate environments other than local and production (so integrations, branch builds, epic builds, etc.) run in pods on an elastic Google Kubernetes Engine cluster. The following guide will help you configure your local environment so that you will be able to access services inside pods, open shells into them, read relevant logs and - ultimately - devops all the things! :)
Step 1: Authentication to Google Cloud
As said, the K8s cluster runs on Google Cloud infrastructure, so to access it we first need to authenticate on GCP.
Your sparkfabrik.com account is enough to perform the authentication, but you'll need to open a terminal and install the gcloud CLI tool. Follow the link to get gcloud running on your OS.
Once done, you can authenticate by running:
$ gcloud auth login
Now configure the gcloud Docker integration by running:
$ gcloud auth configure-docker
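To double-check that the login went through, you can ask gcloud to list the accounts it knows about; the active one is marked with an asterisk:
$ gcloud auth list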
Step 2: Accessing the K8s cluster
Access to the cluster and the pods therein happens through kubectl, the K8s CLI tool. The gcloud command has all that we need to get it installed:
$ gcloud components install kubectl
Ubuntu users, meanwhile, can enjoy:
$ sudo apt install kubectl
Once kubectl is installed, the gcloud command will allow us to access the GKE cluster.
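To make sure the installation worked, ask kubectl for its version; the --client flag queries only the local binary, so no cluster access is needed yet:
$ kubectl version --client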
The gcloud CLI manages so many GCP services and areas that there are commands specific to each one. To tame the complexity, all commands are grouped and subgrouped.
Right now, the container group is what we need: it contains the groups of commands by which we can manage GKE aspects, like clusters, node-pools, Container Registry images, and so on.
We are going to use a command in the clusters subgroup of the container group to gain access to the cluster. That command is get-credentials, which fetches credentials for already running clusters.
The get-credentials command takes a single parameter, which is the cluster name; in our case it is spark-op-services. In addition, there is a mandatory --zone flag that specifies the region and the datacenter zone inside the region (namely, where is the cluster physically running?).
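As a side note, if you ever forget a cluster's exact name or zone, gcloud can list all the clusters of a project, zones included (the --project flag is explained right below):
$ gcloud container clusters list --project spark-int-cloud-services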
Last but not least, there is a global flag (not specific to the get-credentials command), which is --project. Projects in GCP are similar to realms and are not to be confused with K8s namespaces. Quoting the GCP docs:
[...] projects form the basis for creating, enabling, and using all GCP services including managing APIs, enabling billing, adding and removing collaborators, and managing permissions for GCP resources.
So let's specify the correct project: spark-int-cloud-services, which is the project that holds all the production services in Sparkfabrik.
Beware: the CI environments for customers' projects are not customers' assets; they are Sparkfabrik assets, paid for and managed by us. That's why accessing these environments involves our production project!
After this long explanation, the following command should be clear:
$ gcloud container clusters get-credentials spark-op-services --zone europe-west1-b --project spark-int-cloud-services
A laconic message should inform you that kubeconfig generated an entry for spark-op-services. No frills, but you can pat yourself on the back. You're done.
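For extra reassurance, note that get-credentials also switches your current kubectl context to the new cluster, so you can verify everything with:
$ kubectl config current-context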
Step 3: Fetching info from clusters
OK, we gained access to the cluster. Mind that the access is read only, but you have execution permissions (namely, you can run kubectl exec), so you can enter running pods.
Let's test whether our access is actually working. Run:
$ kubectl cluster-info
and you should get a response along the lines of:
Kubernetes master is running at https://<IP address>
GLBCDefaultBackend is running at https://<IP address>/api/v1/namespaces/kube-system/services/default-http-backend:http/proxy
Heapster is running at https://<IP address>/api/v1/namespaces/kube-system/services/heapster/proxy
KubeDNS is running at https://<IP address>/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://<IP address>/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
If not, check that you followed all the previous steps correctly.
Mind though that, depending on the permissions of your account, the output of this command may differ: you may see only a subset of the information and/or a specific error message. Keep this in mind before banging your head against the wall.
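If you are unsure about what your account can do, kubectl can answer directly: the auth can-i subcommand checks a single permission and replies with a plain yes or no. For instance (using the spark namespace that appears later in this guide):
$ kubectl auth can-i list pods --namespace spark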
Step 4: Namespaces
We mentioned projects, which are the GCP realms addressing accountability, ACLs and other "administrative" aspects of GCP services.
Projects are never to be confused with namespaces. The concept of namespace here is the typical Kubernetes one: K8s namespaces allow you to segment the same "physical" cluster into reserved spaces, as if they were separate clusters.
This makes sure that, at the cluster level, critical ops won't compete for resources or hinder each other in case of malfunction.
We use this feature to make sure that each Gitlab project (again, not to be confused with GCP projects: we mean each customer or internal product) that needs build environments in Gitlab lives in its own namespace.
Let's take a look at all namespaces available in the cluster:
$ kubectl get ns
Here is a dummy response (since this is a public playbook):
NAME                        STATUS   AGE
bunnies                     Active   293d
bunnies-demo                Active   49d
default                     Active   1y
gizmo-website-d6            Active   99d
gizmo-website-d8            Active   4d
gitlab                      Active   345d
gitlab-test-envs-342        Active   23d
ingress-nginx               Active   5d
kube-lego                   Active   345d
kube-public                 Active   1y
kube-system                 Active   1y
...
spark                       Active   345d
sparkfabrik-website-292     Active   245d
...
acme-website-304            Active   126d
acme-website-master-stage   Active   36d
acme-website-subsid-stage   Active   37d
acme-website-master-dev     Active   121d
Some of the preceding namespaces are real. As you can see, names are pretty self-explanatory (at least the ones related to projects). But if you are in doubt, you can check Gitlab to see which namespace is in use by a specific Gitlab project: go to Settings -> Integrations -> Kubernetes -> Namespace in the project page to make sure (proper permissions may be necessary; ask your team leader if you can't access that section).
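Since the namespace list can get long, a quick grep helps when hunting for a specific customer's environments (acme being the dummy customer name from the listing above):
$ kubectl get ns | grep acme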
Step 5: Pods
OK, so far we have this hierarchy:
GCP Project foo
└── Cluster bar
    ├── Namespace foo-bar-alpha
    ├── Namespace foo-bar-bravo
    └── Namespace foo-bar-charlie
Now, each namespace can contain pods. For simplicity, think of pods as Docker containers with superpowers.
Let's list all pods in a specific namespace, say spark:
$ kubectl -n spark get pod
Here is the result:
NAME                                        READY   STATUS    RESTARTS   AGE
artifacts-ssh-server-7d9b9db67b-wg4hh       1/1     Running   0          5d
cron-3028794900-znhs8                       1/1     Running   0          5d
dashboard-develop-499waf-849b7c95f9-4qxmr   1/1     Running   0          5d
playbook-locke-2261095262-8x8p2             1/1     Running   0          5d
The components of this command are:
kubectl: the client - duh
-n spark: use the spark namespace
get pod: list all pods in that namespace
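If you need more detail about each pod, such as the node it is scheduled on and its internal IP, the wide output format has you covered:
$ kubectl -n spark get pod -o wide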
If we want to view the logs of a specific pod (like issuing docker logs -f on a normal container), try:
$ kubectl -n spark logs -f <pod-name>
$ kubectl -n spark logs -f playbook-locke-2261095262-8x8p2
npm info it worked if it ends with ok
npm info using email@example.com
npm info using firstname.lastname@example.org
npm info lifecycle email@example.com~prestart: firstname.lastname@example.org
npm info lifecycle email@example.com~start: firstname.lastname@example.org

> email@example.com start /srv/locke
> node server.js

Express HTTP server listening on port 80
GET /robots.txt 404 58.932 ms - 2387
GET /FAQ/who-to-talk-to-for 200 68.338 ms - 11073
GET /guides/an-introduction-to-docker 200 24.580 ms
Again, let's see what the command does:
kubectl: ok, ok...
-n spark: use the spark namespace
logs -f: spit out the logs and follow the output (as if f stands for forever)
playbook-locke-2261095262-8x8p2: the pod name
So, to sum things up: since each pod can be seen as a container and each container usually runs a single service (as per best practice), with this swiss-army-knife command template:
kubectl -n <namespace name> logs [-f] <pod name>
you can see the logs of a specific service, for a specific project.
As an (almost) real-life example, "see the Apache logs for the ACME Drupal 8 website, develop environment" can translate to:
kubectl -n acme-dev logs [-f] drupal
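Two more standard log flags are worth knowing, shown here on the same (hypothetical) ACME pod: --tail limits the output to the last N lines, while --previous prints the logs of the previous container instance, precious when a pod has just crashed and restarted:
$ kubectl -n acme-dev logs --tail=100 drupal
$ kubectl -n acme-dev logs --previous drupal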
Step 6: Accessing pods command line
Now that we have logs we can debug 99% of the problems like a boss. Right?
Not really... shell access can be a real boon, for example to run live tests and assess the problem (or a solution) quickly.
To gain access to the shell we'll make use of the aforementioned exec command of the kubectl client. Let's try:
$ kubectl -n spark exec -it playbook-locke-2261095262-8x8p2 -- /bin/bash
Ta-daaan. You should be logged into the pod's terminal as root, as simple as that.
Dissecting the command we find:
kubectl: enough of this, right?
-n spark: again, use the spark namespace
exec: this works much like in Docker
-it: the same flags as in Docker, meaning interactive (keep stdin open) and allocate a tty
--: marks what follows as positional parameters (shell stuff actually, not pertaining to kubectl)
/bin/bash: the shell to be executed (see below)
Gotcha: please remember that not all containers have bash. Some (many, actually) are based on Alpine Linux or other distros, so the available shell may vary.
Alpine, for example, sports ash, so you may have to issue:
$ kubectl -n acme exec -it acme-ash-test -- /bin/ash
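When in doubt about which shell an image ships, /bin/sh is a safe first guess, since virtually every image provides it. Also remember that exec is not limited to shells: any one-off command will do (pod names below are the ones from the previous examples):
$ kubectl -n acme exec -it acme-ash-test -- /bin/sh
$ kubectl -n spark exec playbook-locke-2261095262-8x8p2 -- env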
This is a small recipe to get you started with our production K8s environments. From here on it's a matter of experience, reading the docs, and a bit of work on your part to increase your devops skills.
Roll up your sleeves and enjoy!