Introducing My Self-Hosted Setup

Introduction

After years of self-hosting, I have settled on a stable and reliable deployment. This article covers the cloud deployment; my on-premises environment is used mainly for backups and for learning and testing. The architecture follows these core principles:

Simplicity First: Avoid over-engineering, use simple solutions when possible
Reliability Priority: Stable operation is more important than fancy features
Regular Maintenance: Update all services to the latest version every Friday
Single-node Deployment: Focus on reliability and fast recovery rather than high availability
Daily Backups: Automated backups with multi-location redundancy
Fast Recovery: Ensure services can be restored from backups in a short time

Infrastructure

System Choice

The underlying system uses Ubuntu, with all services running on k3s. The reasons for choosing k3s are simple:

Lightweight: Compared to full Kubernetes, k3s is more suitable for single-node deployment
Simple Configuration: A single config file can quickly replicate the same environment on a new server
Mature and Stable: Fully validated and suitable for production environments

K3s Configuration

cluster-init: true
docker: false
data-dir: /data/k3s
disable:
  - traefik
  - servicelb
  - metrics-server
token: xxxxxxxxxxxxxxxxxx
service-node-port-range: 79-30124
kubelet-arg:
  - cgroup-driver=systemd
kube-proxy-arg:
  - proxy-mode=ipvs
  - ipvs-strict-arp=true
disable-cloud-controller: true
tls-san: xxxxxxxxxxxxxxx
default-local-storage-path: /data/storage
etcd-snapshot-schedule-cron: 0 */5 * * *
etcd-snapshot-retention: 20
etcd-snapshot-dir: /data/storage/etcd

This configuration has been running stably for years and rarely needs modification. Several key points:

Unified Data Directory Management
All important data is stored in the /data directory, which has several benefits:

Easy to see what data the system has at a glance
Only need to focus on this one directory when backing up
More convenient when migrating servers

Storage Solution
I use the local-path StorageClass built into k3s and set the PV data path with default-local-storage-path: /data/storage. This is the simplest storage option and a good fit for single-node deployments.
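As a rough illustration of how a service would request storage from this StorageClass, the helper below prints a PersistentVolumeClaim manifest. The helper name pvc_manifest, the claim name, and the size are made up for the example; local-path is the name of the StorageClass that k3s ships, and app is the namespace used throughout this article.

```shell
# Print a PersistentVolumeClaim manifest that uses the local-path StorageClass.
# $1: claim name, $2: requested size (both are hypothetical example values).
pvc_manifest() {
    local name=$1
    local size=$2
    cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
  namespace: app
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: ${size}
EOF
}

# Example (not run here): pvc_manifest data-demo 1Gi | k3s kubectl apply -f -
```

Once a pod uses the claim, the provisioner creates a directory under /data/storage named after the volume, namespace, and claim, which is the pattern visible in the pvc-..._app_data-gitea-0 path in the backup script.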

Final Directory Structure
After setup, there are only two core directories under /data to focus on:

/data/k3s - k3s cluster data
/data/storage - All PV persistent data
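Replicating this layout on a new server could look roughly like the sketch below. It assumes the config shown above was saved as config.yaml; /etc/rancher/k3s/config.yaml is the default location k3s reads its configuration from. The function name prepare_k3s_host and its optional second argument (a root prefix for dry runs) are my own additions for illustration.

```shell
# Create the data directories and put the k3s config where k3s looks for it.
# $1: path to the saved config.yaml
# $2: optional root prefix (empty on a real server, a scratch directory for a dry run)
prepare_k3s_host() {
    local config_src=$1
    local root=${2:-}
    mkdir -p "$root/data/k3s" "$root/data/storage" "$root/etc/rancher/k3s"
    cp "$config_src" "$root/etc/rancher/k3s/config.yaml"
    # The config contains the cluster token, so keep it readable by the owner only
    chmod 600 "$root/etc/rancher/k3s/config.yaml"
}

# On a real server, as root, followed by the official installer (not run here):
#   prepare_k3s_host ./config.yaml
#   curl -sfL https://get.k3s.io | sh -
```

Passing a scratch directory as the second argument lets you check the resulting layout without touching the real /data and /etc.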

Deployment and Operations

GitOps Automated Deployment

All services are deployed and managed through ArgoCD, including ArgoCD itself (except for initial installation). All Kubernetes YAML configurations are stored in a self-hosted Gitea repository, fully following the GitOps philosophy.

Advantages:

Configuration as code: every change is recorded
Auto-sync: push to Git and the change deploys automatically
Easy rollback: revert the Git commit when something goes wrong
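To make the workflow concrete, here is a minimal sketch that prints an ArgoCD Application manifest with automated sync. It is not my actual configuration: the helper name argocd_app_manifest, the argocd namespace, the repository URL, and the one-directory-per-service layout are all assumptions for the example.

```shell
# Print an ArgoCD Application that syncs one directory of a Git repository
# into the app namespace. All argument values are hypothetical.
# $1: application name, $2: Git repository URL, $3: path inside the repository
argocd_app_manifest() {
    local app=$1
    local repo_url=$2
    local path=$3
    cat <<EOF
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ${app}
  namespace: argocd
spec:
  project: default
  source:
    repoURL: ${repo_url}
    targetRevision: HEAD
    path: ${path}
  destination:
    server: https://kubernetes.default.svc
    namespace: app
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
EOF
}

# Example (not run here):
#   argocd_app_manifest memos https://gitea.example.com/me/deploy.git apps/memos \
#     | k3s kubectl apply -f -
```

With automated sync enabled, a rollback is an ordinary Git operation: revert the offending commit, push, and ArgoCD syncs the cluster back to the previous state.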

Traffic Ingress and Reverse Proxy

Technology Stack Evolution
I initially used Ingress-Nginx and later switched to Traefik. In my setup, Traefik performs noticeably better and its configuration is more concise.

Network Architecture

External Traffic → Cloudflare CDN → Traefik (hostPort) → Backend Services

Traefik: Runs in hostPort mode, which is simple and efficient enough for a single node
Cloudflare: Acts as the first layer of protection. It filters malicious traffic and attacks automatically, and malicious IPs can be blocked directly at the Cloudflare level

Database Selection

Migrating from MySQL to PostgreSQL
The database is the core of self-hosted services. I initially used MySQL, but its performance wasn’t ideal for my workloads. I later migrated all services to PostgreSQL, which required almost no code changes and brought a visible performance improvement.

Version Evolution
Over time I have upgraded from PostgreSQL 14 to 18. For my use case the performance gain isn’t obvious, but staying up to date helps with security and stability.

Data Storage Solution

PostgreSQL: All services requiring a relational database
Redis: Caching and queues

Deployment Principle: If a project doesn’t support PostgreSQL, I’d rather not deploy it. Maintaining multiple database systems greatly increases operational complexity.
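With one shared PostgreSQL instance, onboarding a new service mostly comes down to creating a role and a database for it. The sketch below generates the SQL for that. The helper name service_db_sql and the one-database-per-service convention are assumptions for the example, and the quoting is deliberately naive.

```shell
# Print SQL that creates a login role and a database owned by it.
# $1: service name (used for both role and database), $2: password.
# Naive quoting: the name must not contain double quotes and the password
# must not contain single quotes.
service_db_sql() {
    local svc=$1
    local password=$2
    cat <<EOF
CREATE ROLE "${svc}" LOGIN PASSWORD '${password}';
CREATE DATABASE "${svc}" OWNER "${svc}";
EOF
}

# Example (not run here):
#   service_db_sql memos 'change-me' | psql -h 127.0.0.1 -p 5432 -U postgres
```

Because pg_dumpall dumps roles as well as every database, anything created this way is picked up by the nightly backup without extra configuration.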

Service List

Namespace Management Strategy: All services are deployed in the app namespace, because most are single-pod services, so there’s no need to create multiple namespaces and increase management complexity.

Development and Code Management

Gitea - Self-hosted Git service, managing all code and configurations, including deployment project pipelines
Wakapi - Coding time statistics, understanding work habits
Registry - Private Docker image registry

Blog and Content

Umami - Blog analytics
Umami API - Self-developed project, periodically fetches page views and caches to Redis, improving blog loading speed and reducing Umami pressure
Waline - Blog comment system

Productivity and Tools

Atuin - Shell command history sync, cross-device usage
Memos - Lightweight notes, recording temporary thoughts, though I don’t take many notes usually
Linkwarden - Read-it-later management, bookmarking valuable links
FreshRSS - RSS feed aggregation

AI and Automation

Open WebUI - ChatGPT-style chat client, a unified interface for AI interaction
One-Hub - API aggregation gateway, providing multi-source API support for Open WebUI
MCPO - Lets Open WebUI use MCP (Model Context Protocol) servers

Monitoring and Operations

Telemonitor https://github.com/bboysoulcn/telemonitor - Self-written Telegram bot for real-time server monitoring, more lightweight than Prometheus + Grafana. For self-hosting scenarios, CPU, memory, and disk info is sufficient
AdGuard Home - DNS server, providing DoH encrypted queries
TGPush - Automatically pushes news to a Telegram channel. By the way, you can follow my channel at https://t.me/bboyapp

Security and Authentication

Bitwarden - Password manager
PocketID - Unified identity authentication center that provides single sign-on. Most of my services are already integrated with it, which saves a lot of password typing. Highly recommended

Infrastructure

PostgreSQL - Primary database
Redis - Caching and queues
PGBackWeb - PostgreSQL backup management. It doesn’t do a full database backup, but it is sufficient for my needs

Other Services

HubProxy - GitHub and Docker Hub proxy acceleration
GeoIP - aka realip.cc, IP geolocation query
Wallos - Subscription service management, tracking various paid subscriptions
Paperless-ngx - Document management system, though not frequently used, very convenient when needed

Backup Strategy

Backup is the most important aspect of self-hosting. My backup solution pursues simplicity and reliability, ensuring data safety and fast recovery.

Scheduled Tasks

The backup runs automatically every day at 1 AM:

0 1 * * * cd /backup && bash main.sh
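One thing a plain cron entry does not guard against is a slow run overlapping with the next one. A minimal way to prevent that, assuming a lock directory of your choosing, is sketched below. The helper name run_locked and the lock path are hypothetical and not part of my actual setup.

```shell
# Run a command while holding a lock directory; skip the run if the lock exists.
# mkdir either creates the directory or fails, so only one caller can win.
# $1: lock directory, remaining arguments: the command to run
run_locked() {
    local lock_dir=$1
    local status
    shift
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "another run holds $lock_dir, skipping" >&2
        return 1
    fi
    "$@"
    status=$?
    rmdir "$lock_dir"
    return "$status"
}

# Example (not run here): run_locked /tmp/backup.lock bash /backup/main.sh
```

If the machine crashes mid-run, the lock directory is left behind and has to be removed by hand, which is an acceptable trade-off for a nightly job.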

Backup Script

#!/bin/bash
set -e
time=$(date "+%Y-%m-%d")-$RANDOM
pg_filename=$time-pg.sql

echo "Starting Gitea backup"
# No -it here: cron provides no TTY, so an interactive exec only produces a warning
k3s kubectl exec gitea-0 -n app -- su - git -c '/app/gitea/gitea dump -c /data/gitea/conf/app.ini --skip-log --skip-package-data'

echo "Starting PostgreSQL backup"
# -w never prompts for a password; it is read from PGPASSWORD instead
cd /data/storage/pgbackup && export PGPASSWORD="xxxxxxxx" && pg_dumpall -h 127.0.0.1 -p 5432 -U postgres -w > "$pg_filename"

path=/data/storage
echo "Starting backup $path"
dirname=$(dirname "$path")
basename=$(basename "$path")
start=$(date +%s)

# Create backup directory
backupdir=/backup/$basename/$time
mkdir -p $backupdir
cd $dirname

# Use zstd high compression ratio for packaging
tar -cvf - $basename | zstd -15 -T16 > $backupdir/$basename-$time.tar.zst

# Split into 1GB file chunks for easy transfer and storage
split -d -b 1024m $backupdir/$basename-$time.tar.zst $backupdir/$basename-$time.tar.zst.
rm -rf $backupdir/$basename-$time.tar.zst

end=$(date +%s)
take=$(( end - start ))
echo "$path backup completed, took $take seconds"

# Clean up expired backups
cd /backup/ && bash delete.sh
cd /data/storage/pvc-c9d030dd-5657-45f9-89f3-c1be2e6bdae4_app_data-gitea-0/git && bash delete.sh
cd /data/storage/pgbackup && bash delete.sh

# Sync to remote servers
rsync -pgoav --progress --delete /backup [email protected]:/backup
rsync -pgoav --progress --delete /backup [email protected]:/backup

Backup Process Explanation

1. Individual Backups for Critical Services

Gitea: Export complete data using built-in dump command
PostgreSQL: Full database backup using pg_dumpall (PGBackWeb also maintains a separate backup)

2. Overall Data Packaging

Use zstd -15 -T16 for high compression ratio packaging (multi-threaded acceleration)
Split large files into 1GB chunks for easier network transfer and storage management

3. Clean Up Expired Backups
Keep the last 3 days of backups, automatically clean up old files

4. Multi-location Remote Backups
Sync to two remote servers via rsync, achieving geographic redundancy
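Recovery is the reverse of the packaging step: join the chunks, decompress, and unpack. The sketch below shows one way to do it, assuming the chunk naming produced by the backup script (archive name plus a numeric suffix such as .00, .01). The function names join_chunks and restore_backup are my own for this example.

```shell
# Concatenate the numbered chunks written by `split -d` back into one stream.
# The shell expands the glob in sorted order, which matches the numeric suffixes.
# $1: archive path without the numeric suffix, e.g. .../storage-<time>.tar.zst
join_chunks() {
    local archive=$1
    cat "$archive".[0-9]*
}

# Decompress and unpack a chunked archive.
# $1: archive path without the numeric suffix
# $2: directory to extract into; the backup was taken from /data, so extracting
#     into /data recreates /data/storage
restore_backup() {
    local archive=$1
    local target=$2
    mkdir -p "$target"
    join_chunks "$archive" | zstd -dc | tar -xf - -C "$target"
}

# Example (not run here): restore_backup /backup/storage/<time>/storage-<time>.tar.zst /data
```

Extracting into a scratch directory first is a cheap way to rehearse a restore without touching live data.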

Cleanup Script

This script deletes backups older than 3 days:

#!/bin/bash

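directory="/backup/storage"
current_time=$(date +%s)
# Cutoff timestamp: anything last modified before this is older than 3 days
THREE_DAYS=$((current_time - 60*60*24*3))

for file in "$directory"/*; do
    # Skip the unexpanded pattern when the directory is empty
    [ -e "$file" ] || continue
    file_time=$(stat -c %Y "$file")
    if [ "$file_time" -lt "$THREE_DAYS" ]; then
        rm -rf "$file"
        echo "Deleted: $file"
    fi
done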
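Before letting a cleanup script delete anything, it can be useful to list what it would remove. The sketch below does the same age check with find instead of a loop. The function name list_expired is hypothetical, and this is an alternative formulation rather than the script I run.

```shell
# List top-level entries in a directory that are older than N days (default 3).
# find's -mtime +K matches entries whose age in whole 24-hour periods is greater
# than K, so "at least N days old" is -mtime +(N-1).
# $1: directory to inspect, $2: age threshold in days
list_expired() {
    local dir=$1
    local days=${2:-3}
    find "$dir" -mindepth 1 -maxdepth 1 -mtime +"$((days - 1))"
}

# Example (not run here): list_expired /backup/storage
```

Once the listing looks right, appending -exec rm -rf {} + to the find command turns the dry run into the actual cleanup.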

Backup Storage Strategy

I follow a variant of the 3-2-1 backup principle:

Production Server: Real-time data + local backup
Remote Server A: Backup file sync (independent server)
Remote Server B: Backup file sync (independent server)
Home NAS: Periodically pulls backups from the cloud

That makes 4 copies in total, distributed across different geographic locations. Even if any single location fails, fast recovery is still possible. The two remote servers are dedicated to storing backups and don’t run other services, which keeps the backup data as safe as possible.
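Several copies only help if each copy is intact. One lightweight check, which is not part of my backup script but fits it naturally, is to write a checksum manifest next to the chunks before syncing and verify it on each destination. The function names and the SHA256SUMS file name below are assumptions for the example.

```shell
# Write a SHA-256 manifest covering every file under a backup directory.
# $1: backup directory, e.g. /backup/storage/<time>
write_manifest() {
    (cd "$1" && find . -type f ! -name SHA256SUMS -exec sha256sum {} + > SHA256SUMS)
}

# Verify a directory against its manifest; returns non-zero on any mismatch.
# $1: backup directory containing SHA256SUMS
verify_manifest() {
    (cd "$1" && sha256sum --quiet -c SHA256SUMS)
}

# Example (not run here):
#   write_manifest /backup/storage/<time>      # on the production server, before rsync
#   verify_manifest /backup/storage/<time>     # on a remote server or the NAS
```

Since the manifest travels with the chunks, the remote servers and the NAS can each confirm their copy without contacting the production server.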

Self-Hosting Insights

After years of practice, I’ve summarized some core concepts of self-hosting:

Reliability > High Availability

For personal or small team self-hosted services, you don’t need to pursue high availability, but rather high reliability.

Brief service downtime is acceptable (users are mainly yourself and family)
But data must be complete, and the system must be able to recover quickly
Rather than spending effort building complex HA architectures, better to spend time perfecting backup and recovery procedures

Project Selection Principles

Evaluate before deployment:

Community Activity: GitHub star count, issue handling speed
Long-term Maintainability: Whether the author continuously updates, whether the project is mature
Tech Stack Compatibility: Whether it supports PostgreSQL (personal principle)
Avoid Toy Projects: Those flash-in-the-pan projects aren’t worth the effort

The Importance of Regular Updates

I update all services together every Friday, mainly for:

Security: Timely fixes for known vulnerabilities
Stability: Getting bug fixes
New features are actually secondary

Tech Stack Recommendations

If you’re not familiar with Kubernetes:

Directly use Docker + Docker Compose
Don’t force yourself to use K8s just for learning
Docker Compose is completely sufficient and simpler
Manage the configuration in docker-compose.yml; don’t start containers by hand with docker run

If you’re familiar with Kubernetes:

K3s is a great choice, lightweight yet feature-complete
Combined with GitOps, you can achieve automated operations

Summary

The essence of self-hosting is building a controllable and reliable digital infrastructure for yourself. Keep it simple, focus on reliability, make good backups, and you can run stably for the long term. Don’t be dazzled by flashy tech stacks; what suits you is the best.

Feel free to follow my blog at www.bboy.app

Have Fun

Bboysoul's Blog
