# Lab Architecture

:::{note}
The lab hosts most of the services deployed in the Tatsu landscape: [home automation](../smart-home/automation.md) services, the [observability](./observability.md) stack, even these docs.
:::

## Summary

The lab architecture is based on Kubernetes, provided by RKE2, with storage within Kubernetes provided by Rook Ceph. The lab machines run a minimal install of Debian (see [OS install notes](../misc/os-install-notes.md)), within which RKE2 runs as a service.

## RKE2

[RKE2][rke2] (Rancher Kubernetes Engine 2) is an open-source distribution of Kubernetes that runs from a single binary. This makes it easy to manage and deploy, which we do via [Ansible][ansible].

RKE2 installations are either "servers" or "agents" - both run normal workloads, but the server nodes also run the control plane that manages Kubernetes itself. We run three server nodes for redundancy.

## Rook Ceph

[Ceph][ceph] is a widely-used and extremely reliable storage provider that is designed to avoid any single point of failure. It is famously complicated to set up and configure, so we use the [Rook][rook] operator to abstract away that complexity. Rook is a Kubernetes-native orchestrator for Ceph that allows us to declare _what_ the Ceph cluster should look like and leave it to handle _how_ that should be done.

Ceph can provide three different types of storage:

- Block devices (RADOS Block Devices, or RBD) - these are presented as raw devices, comparable to a brand new hard drive with no file system applied.
- A file system (CephFS) - this is a fully-featured file system that implements the POSIX standard (i.e. the same as most Unix systems).
- Object storage (Ceph Object Gateway, RADOS Gateway, or RGW) - this is an S3-compatible API for "simple" object storage.

:::{note}
RADOS stands for Reliable Autonomic Distributed Object Store. It is one of the core technologies behind Ceph, which is why it appears in some of the names above.

The other acronym you'll see is CRUSH, or Controlled Replication Under Scalable Hashing, which determines how data is replicated and assigned to physical nodes.
:::

Only CephFS is used at the moment; it backs every persistent volume in the Kubernetes cluster. If we use either of the other two storage types in the future, they will all share the same underlying storage.

Ceph includes a dashboard, which is available [here](https://ceph-dashboard.tatsu.casa) - the credentials are in the password manager.

## Workload Deployment

Workloads are not deployed manually to the cluster; they are defined in a Git repo ([this one](https://gitea.tatsu.casa/tatsu-deploy/k8s)) and continuously applied to the cluster by [ArgoCD][argo] (this practice is known as [GitOps][gitops]). This helps to prevent config drift within the cluster, and means that service upgrades can be rolled out automatically just by merging PRs in the repo.

ArgoCD itself is deployed in Kubernetes, so what if you need to start from scratch? The Git repo linked above contains a readme that explains how to bootstrap the cluster manually up to the point that ArgoCD can run, after which it will take over.

[ansible]: https://gitea.tatsu.casa/tatsu-deploy/ansible
[argo]: https://argo-cd.readthedocs.io/en/stable
[ceph]: https://ceph.io/en
[gitops]: https://about.gitlab.com/topics/gitops
[rke2]: https://docs.rke2.io
[rook]: https://rook.io
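
As a rough illustration of the three-server choice described in the RKE2 section, the sketch below works through the majority arithmetic. It assumes the control plane uses RKE2's default embedded etcd datastore, which needs a majority of server nodes to stay available; the function names are hypothetical and exist only for this example.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting server nodes."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many server nodes can be lost while a majority still remains."""
    return members - quorum(members)


# Compare a few cluster sizes (assumption: majority-based quorum, as in etcd).
for n in (1, 2, 3, 4, 5):
    print(f"{n} server(s): quorum={quorum(n)}, tolerated failures={tolerated_failures(n)}")
```

Under that assumption, three servers can lose one and keep a working control plane, while a fourth server raises the quorum without tolerating any additional failure, which is why odd server counts are the usual choice.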