OpenWrt, Networking, Mobile Work, Homelab, Antigravity CLI
Beryl AX OpenWrt Mobile Office Router with Antigravity CLI
Pixel 8 USB tethering, Mobile_Net Wi-Fi, SQM, encrypted DNS and adblock on a GL.iNet Beryl AX, set up with Antigravity CLI.
· 7 min read
AI, Security, Agents, Homelab
Sandboxing AI Agents: Kernel-Level Isolation with nono and Landlock
AI agents have terminal access, network calls, and file operations. Here's how I lock them down with OS user isolation, Landlock kernel sandboxing, and nono, and where the gaps still are.
· 11 min read · Beyond the Chat Window · Part 5
AI, Homelab, Agents, Local-First, MCP
From Vibe Coding to AI Agent: My Local Qwen 3.6 Now Runs 24/7
Built a local AI agent with Hermes on Qwen 3.6 MTP at 125 t/s. From benchmarking to vibe coding to a 24/7 autonomous agent, with no API costs.
· 9 min read · Beyond the Chat Window · Part 4
AI, Homelab, llama.cpp, Benchmarking
Gemma 4 MTP vs Qwen 3.6: Same GPU, Different Speedups
Gemma 4 MTP hits 133 t/s (1.32x) vs Qwen's 144 t/s (1.47x) on an RTX 5060 Ti. The 441 MB drafter looks light but compute buffers eat the savings.
· 4 min read · Fast AI, Real Risks · Part 11
AI, Homelab, llama.cpp, Benchmarking
MTP Speculative Decoding Actually Works on MoE: 144 t/s on a 16GB GPU
MTP landed for Qwen 3.6 in llama.cpp. MoE jumps from 98 to 144 t/s, dense gets 42% slower. Benchmark data, server configs, and why MTP needs bandwidth headroom.
· 5 min read · Fast AI, Real Risks · Part 10
AI, Homelab, llama.cpp, Benchmarking
Gemma 4 on a 5060 Ti: 256K Context on 16GB — but Only if You Know the Architecture Trick
Gemma 4 26B MoE hits 99 t/s with 256K context on an RTX 5060 Ti 16 GB. The 31B dense tops out at 65K. One flag and one architecture trick make the difference.
· 9 min read · Fast AI, Real Risks · Part 8
AI, Homelab, llama.cpp, Benchmarking
Code Generation Showdown: Gemma 4 vs Qwen 3.6 on a Consumer GPU — Same Prompt, Four Models, One Shot
Same creative coding prompts, four models (Gemma 4 + Qwen 3.6, MoE + dense) on a 16 GB GPU. One shot, no feedback. Here's what rendered and why MoE wins.
· 12 min read · Fast AI, Real Risks · Part 9
AI, Go, Open Source, Productivity
I Use 6 AI Coding Tools. Finding Last Week's Session Was Harder Than the Bug It Fixed.
VibeCockpit scans Claude Code, Copilot, Codex, Gemini, and OpenCode sessions into one searchable dashboard. Go + Svelte 5, single binary, local or SSH.
· 7 min read · Beyond the Chat Window · Part 3
AI, Homelab, llama.cpp, Benchmarking
Qwen 3.6 27B Dense on a 5060 Ti: Speculative Decoding, Ngram-Mod, and Why the MoE Still Wins
31 t/s dense vs 98 t/s MoE on an RTX 5060 Ti 16 GB. Draft models, ngram-mod, two forks tested. What works, server configs, and why MoE wins.
· 13 min read · Fast AI, Real Risks · Part 7
AI, Homelab, llama.cpp, Benchmarking
Qwen 3.6 35B on RTX 5060 Ti: Full 262K Context, TurboQuant to 400K, and What Actually Matters
Running Qwen 3.6-35B-A3B on the same RTX 5060 Ti. The 35B now reaches its full 262K native context. TurboQuant pushes it to 400K but there's a catch.
· 9 min read · Fast AI, Real Risks · Part 6
AI, Homelab, Claude Code, IoT
I Vibecoded a Blog Server on a $4 ESP32 with MicroPython and Microdot
ESP32 serving a styled blog with live metrics at 5 req/s. 37 lines of Python, load tested to 10 connections. One Claude Code session.
· 5 min read · ESP32 Vibing · Part 1
AI, Homelab, Claude Code, IoT, Security
From Blog Server to WiFi Radar: Vibecoding a Network Scanner on ESP32
A $4 ESP32 turned WiFi/BLE scanner with probe sniffing, spectrum analysis, and deauth detection. Claude Code, custom C module, MicroPython.
· 5 min read · ESP32 Vibing · Part 2
AI, Productivity, Claude Code, Agents, Homelab
From Chat to Agent: The Real AI Adoption Gap
Most people use AI like a search box. The jump to autonomous agents isn't a skill gap — it's a mental model shift.
· 6 min read · Beyond the Chat Window · Part 2
AI, Claude, Productivity, Tutorial
I Let Claude Desktop Control My Browser, Schedule Daily Tasks, and Build a Website
Claude Desktop controlling Chrome for hotel deals, scheduling daily price checks, and building a website in Code mode. Screenshots and security notes.
· 6 min read · Beyond the Chat Window · Part 1
3D Printing, Klipper, Mainsail, Claude Code, Ender 3
Reviving a 4-Year-Old Ender 3 V2 with Klipper, Mainsail and Claude Code
Ender 3 V2 collecting dust since 2021 — revived with MainsailOS, Klipper firmware flash, and tuned slicer profiles, all via Claude Code over SSH.
· 5 min read
Security, DevSecOps, eBPF, CI/CD, Supply Chain
Monitoring CI Secrets With eBPF (DNS + getenv)
CI/CD code can read your secrets. I built pipeline-monitor, an eBPF auditor that captures DNS queries and getenv calls to show what your pipeline accesses.
· 4 min read · Pipeline Security · Part 1
AI, Homelab, llama.cpp, Benchmarking, Security
I Tried to Run Qwen at 160K Context on a 16GB GPU. The 35B Worked — but the 9B Won.
85 t/s was fine — then I pushed it. Qwen 35B-A3B at 160K context on a 16 GB RTX 5060 Ti, benchmarked against the 9B that quietly became my daily driver.
· 13 min read · Fast AI, Real Risks · Part 5
AI, Security, Architecture
Picking the Right Brain for Your AI Firewall
I benchmarked 5+ models as prompt injection inspectors. Bigger is not better — the failure modes matter more than the scores.
· 9 min read · Fast AI, Real Risks · Part 4
AI, Security, Architecture
AI Security at Machine Speed
Prompt injection scales with inference speed. How do you defend against attacks in 100 languages at 17,000 tokens per second?
· 5 min read · Fast AI, Real Risks · Part 3
AI, Homelab, Local-First, MCP
Own Your Context: The Case for Local-First AI
If inference becomes commodity, the scarce resource is context. Your data, your history, your preferences. What if all of it was already local?
· 4 min read · Fast AI, Real Risks · Part 2
AI, Hardware, Architecture
What If Inference Was Free?
Taalas hardcoded Llama 3.1 8B into silicon and hit 17,000 tokens per second. That number broke something in my mental model of how we build software.
· 4 min read · Fast AI, Real Risks · Part 1
AI, Homelab, llama.cpp, Benchmarking
MiniMax-M2.5 on 128GB RAM + OCuLink GPU: Can a Mini PC Run a 230B Model?
Can a mini PC with 128GB RAM and an OCuLink GPU run MiniMax-M2.5, a 230B model? Yes, but the bottleneck isn't where you'd expect.
· 2 min read
AI, Homelab, Proxmox, Ollama
Running a Local AI Homelab: Mini PC, OCuLink, and a 5060 Ti
How I turned a Proxmox mini PC with an OCuLink-connected NVIDIA RTX 5060 Ti into a private, local AI inference server running GLM-4.7-Flash at 85 tokens/second.
· 4 min read
Astro, Svelte, Tailwind CSS, Cloudflare
Rebuilding My Portfolio with Astro, Svelte & Tailwind CSS
Why I migrated my personal site from Gatsby v2 to Astro 5, how the new stack works, and what I learned along the way.
· 4 min read
eBPF, Security, Go, AI, Observability
Reading HTTPS Traffic with eBPF Uprobes: How I Monitor AI Agents Through the Kernel
eBPF hooks SSL_write/SSL_read to read HTTPS before encryption. I built a monitor that traces AI agent behavior without touching the app. SSL uprobes explained.
· 8 min read
Home Assistant, IoT, Smart Home
Smartify your Washing Machine
How I used a Tasmota-flashed smart power socket and Home Assistant to get notified when laundry is done — no more forgotten wet clothes.
· 1 min read
Python, GitLab, REST API, Automation
Bulk Removing GitLab Webhooks with Python
We were retiring an old integration and needed to remove its webhooks from hundreds of GitLab projects. Doing it manually wasn't an option.
· 2 min read
Python, REST API, CLI
Prototyping Shareable CLIs Fast with Python and Typer
I used to write quick Python scripts that only I could run. Typer fixed that — function signatures become the CLI interface, and --help just works.
· 2 min read
CI/CD, GitHub Actions, DevOps
Continuous Integration and Continuous Deployment: Cornerstones of Modern Development
An introduction to CI/CD practices with practical GitHub Actions examples for running tests and building Docker images.
· 2 min read · Git & Dev Workflow · Part 3
Agile, Scrum, Project Management
What Scrum Actually Looks Like at Enterprise Scale
Scrum works well for small teams. At Bosch, we ran SAFe across hundreds of teams. Here's what that coordination machinery actually looks like from the inside.
· 3 min read
Git, Software Development, Branching
Mastering Git Branching Models for a Smooth Development Workflow
Exploring Git branching strategies, branch protection rules, and how tags fit into a streamlined development, staging, and release workflow.
· 2 min read · Git & Dev Workflow · Part 1
Git, Semantic Versioning, Software Development
Understanding Semantic Versioning and Its Importance in Git Branching Models
An introduction to Semantic Versioning (SemVer) and how the MAJOR.MINOR.PATCH scheme integrates with Git branching models to create informative release tags.
· 2 min read · Git & Dev Workflow · Part 2