docs: add "Route to a Kubernetes service with HA" how-to by SunsetDrifter · Pull Request #810 · netbirdio/docs

SunsetDrifter · 2026-06-23T13:41:38Z

Summary

Adds a full how-to guide, Route to a Kubernetes service with high availability, under Kubernetes > Use Cases. It walks the whole journey of reaching a private in-cluster ClusterIP Service from a NetBird client through a redundant pool of routing-peer pods, so access survives a pod or node failure.

The guide covers the NetBird-side pieces the operator does not create for you, then the operator CRDs, then verification:

Create a custom DNS zone — created empty; the operator fills in the A record (<service>.<namespace>.<zone>) automatically when you expose a Service.
Create groups and an access policy — NetBird is deny-by-default; the operator writes no groups or policies.
Deploy the routing peers (HA) — NetworkRouter with workloadOverride.replicas, the group-backed router at one metric for equal-metric failover, and the auto PodDisruptionBudget. Node spread is woven in: Kubernetes spreads replicas across nodes by default, with a topologySpreadConstraints (ScheduleAnyway) example and a note on the DoNotSchedule rolling-update deadlock.
Expose the Service — NetworkResource (ClusterIP-only) placed in the destination group.
Verify and test failover — pods spread across nodes, PDB, resolve + curl, drop a pod and watch traffic continue.
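The equal-metric failover described in the steps above can be pictured with a small model. This is only a sketch under stated assumptions, not NetBird's actual route-selection code: the `RoutingPeer` type, the peer names, and the metric value `100` are hypothetical, and the tie-break by name exists only to keep the example deterministic.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RoutingPeer:
    name: str       # hypothetical pod name
    metric: int     # route metric; lower is preferred
    healthy: bool   # whether the client can currently reach this peer


def pick_routing_peer(peers):
    """Return a healthy peer with the lowest metric, or None if none is healthy."""
    healthy = [p for p in peers if p.healthy]
    if not healthy:
        return None
    best_metric = min(p.metric for p in healthy)
    candidates = [p for p in healthy if p.metric == best_metric]
    # Tie-break by name only so the sketch is deterministic (an assumption).
    return min(candidates, key=lambda p: p.name)


# Three replicas behind one group-backed router, all at the same metric.
peers = [
    RoutingPeer("router-a", 100, True),
    RoutingPeer("router-b", 100, True),
    RoutingPeer("router-c", 100, True),
]

# Simulate losing the pod that was carrying traffic.
after_failure = [replace(p, healthy=False) if p.name == "router-a" else p for p in peers]
```

Because every replica shares one metric, no peer is a designated primary: when one drops out, any remaining healthy peer is an equally valid next hop.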

Also includes an appendix on friendly DNS names (CNAME alias to the operator-managed record) and a custom dark-mode topology diagram.

Changes

New page: manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx
Nav: new Use Cases group under Kubernetes
Assets: custom dark-mode topology.svg + dashboard/terminal screenshots

Summary by CodeRabbit

Documentation
- Added a new Kubernetes integration guide for routing NetBird clients to in-cluster ClusterIP Services via an operator-managed DNS zone, including configuration steps and reachability/failover verification.
- Updated the navigation to include the new Kubernetes “Use Cases” section and the “Route to a Kubernetes Service” documentation page.

…erator)

Add a standalone use-case page under a new Use Cases group in the Kubernetes nav, covering how to run the operator's routing peers in HA: NetworkRouter workloadOverride.replicas (default 3), the auto-created PodDisruptionBudget (maxUnavailable: 1), equal-metric automatic failover, and spreading replicas across failure domains via workloadOverride.podTemplate. Models least-privilege (named destination group + access policy) rather than the All group.
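To make the effect of the PodDisruptionBudget concrete, here is a simplified sketch of the arithmetic behind `maxUnavailable: 1` with three replicas. It assumes an integer `maxUnavailable` and ignores percentage values and unhealthy-pod eviction policies; the function name is illustrative and not part of any API.

```python
def disruptions_allowed(desired_replicas, healthy_pods, max_unavailable=1):
    """How many voluntary evictions a maxUnavailable-style budget permits right now."""
    # The budget requires at least this many healthy pods at all times.
    desired_healthy = max(desired_replicas - max_unavailable, 0)
    # Anything above that floor may be evicted; never report a negative number.
    return max(healthy_pods - desired_healthy, 0)


# With 3 replicas and maxUnavailable: 1, a node drain can evict one pod at a time.
all_healthy = disruptions_allowed(desired_replicas=3, healthy_pods=3)
# Once one pod is already down, further voluntary evictions are blocked.
one_down = disruptions_allowed(desired_replicas=3, healthy_pods=2)
```

The budget only governs voluntary disruptions such as drains; it does not prevent a node from failing outright, which is why the replicas still need to sit on different nodes.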

Two SVG topology diagrams: replicas on a single node (single point of failure) and replicas spread one-per-node via topologySpreadConstraints. Embedded in Step 1 and the failure-domains section.

kube-scheduler spreads a Deployment's replicas across nodes by default (best-effort, via built-in PodTopologySpread defaults). The earlier text and diagram wrongly implied that replicas co-locate by default. Reframe: multi-node spread is the default; topologySpreadConstraints turns it into a guarantee (or spans zones). Remove the single-node diagram (non-HA case, out of scope).

Document exposing a service under a cleaner name via a CNAME in a custom zone pointing at the operator's <service>.<namespace>.<zone> record (verified end-to-end). Placed as an appendix for now; can move to a shared location later.

…t deadlock

Multi-node verification: default scheduling already spreads replicas one-per-node; the operator merges workloadOverride.podTemplate.topologySpreadConstraints into the Deployment. DoNotSchedule with replicas == schedulable nodes deadlocks rolling updates (surge pod can't place). Switch the example to ScheduleAnyway (verified clean rollout) and document DoNotSchedule + the node-count/maxSurge caveat for a hard guarantee.

…wing)

Verified on the lab: a NetBird custom zone serves only the records you add; other names under the domain fall through to upstream DNS. Reusing a real internal domain for friendly names is safe except for exact-name collisions.
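The fall-through behaviour can be modelled in a few lines. This is a toy lookup, not a DNS implementation: the zone contents and the `app.company.internal` and `wiki.company.internal` names are hypothetical examples chosen to match the guide's naming.

```python
# Records explicitly added to a custom zone (hypothetical contents).
zone_records = {
    "app.company.internal": ("CNAME", "nginx.default.k8s.company.internal"),
}


def resolve(name, records):
    """Answer from the custom zone on an exact match, otherwise defer to upstream DNS."""
    key = name.rstrip(".").lower()
    if key in records:
        return ("custom-zone", records[key])
    # Names that were never added are not shadowed; upstream still answers them.
    return ("upstream", None)


alias_answer = resolve("app.company.internal", zone_records)
other_answer = resolve("wiki.company.internal", zone_records)
```

Only an exact-name collision changes the answer a client receives, which is the one case the commit message warns about.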

Restructure the HA use-case page into an end-to-end guide covering the whole journey: create the custom DNS zone, groups, and access policy (dashboard) -> deploy HA routing peers (NetworkRouter, replicas:3) -> expose a Service (NetworkResource) -> verify + failover. Generic, human-readable example names (k8s.company.internal, kubernetes-clients/-services, network 'kubernetes', nginx). Keeps the failure-domains diagram + ScheduleAnyway/DoNotSchedule note and the friendly-DNS appendix. Adds <img> slots for 5 dashboard/terminal screenshots (to be supplied). Renames the page + nav entry to route-to-a-kubernetes-service; old slug removed.

Four screenshots (DNS zone, access policy, the kubernetes network with HA + 3 routing peers, kubectl pods-across-nodes). Drop the groups screenshot and renumber the <img> refs to match.

Node-spread is the point of an HA guide, not a tail-end section. Move the topology diagram up to 'What you'll achieve', fold the node-spread story into Step 3 (deploy HA routing peers) - leading with the verified fact that the scheduler spreads replicas across nodes by default (HA out of the box), with topologySpreadConstraints as optional hardening - and drop the orphaned 'Spread across failure domains' section.

…cord)

Step 1 showed the auto-created A record without saying you don't enter it. Note that you create only the zone (no hostname/IP/TTL by hand) and the operator adds <service>.<namespace>.<zone> -> ClusterIP (5-min TTL) in Step 4.
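As a quick illustration of the naming pattern, the record name can be derived from the Service name, its namespace, and the zone. The helper below is a sketch for reasoning about the expected hostname, not operator code; the function name is hypothetical and the values come from the guide's own example.

```python
RECORD_TTL_SECONDS = 5 * 60  # the 5-minute TTL mentioned in the commit message


def operator_record_name(service, namespace, zone):
    """Build the <service>.<namespace>.<zone> hostname for an exposed Service."""
    return f"{service}.{namespace}.{zone.strip('.')}"


# Example names used in the guide: Service 'nginx' in namespace 'default'.
hostname = operator_record_name("nginx", "default", "k8s.company.internal")
```

Knowing the expected hostname up front makes the verification step simpler: it is the name to resolve and curl from a NetBird client once the Service is exposed.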

Hand-authored dark-background topology diagram (NetBird overlay -> routing peers one-per-node -> Service) that matches the dark docs theme, replacing the light Excalidraw-derived SVG. Removes the orphaned ha-routing-peers-spread-nodes.svg.

Show the Add DNS Record dialog (CNAME 'app' -> nginx.default.k8s.company.internal) and align the example hostname to 'app' to match.

coderabbitai · 2026-06-23T13:42:18Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 74a51218-3df2-4489-8a11-3102db1ba662

📥 Commits

Reviewing files that changed from the base of the PR and between cce9e07 and 1b28b06.

📒 Files selected for processing (1)

src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx

📝 Walkthrough

A new “Use Cases” navigation group is added under the Kubernetes integration, and a new MDX guide documents routing NetBird clients to a Kubernetes ClusterIP Service through DNS, access control, a high-availability NetworkRouter, NetworkResource exposure, verification, and aliasing.

Changes

Kubernetes: Route to a ClusterIP Service

| Layer / File(s) | Summary |
| --- | --- |
| Navigation entry for Kubernetes Use Cases `src/components/NavigationDocs.jsx` | Adds a new nested “Use Cases” group with a single child link under the Kubernetes integration in `docsNavigation`. |
| Guide introduction, prerequisites, and DNS zone setup `src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx` | Introduces the guide’s goal and flow, lists prerequisites and example object names, and documents creating an empty custom DNS zone. |
| Access control policy and HA NetworkRouter deployment `src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx` | Creates source and destination groups with an allow policy, then deploys a `NetworkRouter` with `workloadOverride.replicas` and describes failover and `PodDisruptionBudget` behavior. |
| NetworkResource exposure, verification, failover, and DNS appendix `src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx` | Exposes a `ClusterIP` Service via `NetworkResource`, verifies reachability and failover, and adds DNS naming plus CNAME alias guidance. |

Estimated code review effort: 2 (Simple) | ~10 minutes

Possibly related PRs

netbirdio/docs#796: Also adds a new “Use Cases” navigation group entry to docsNavigation in NavigationDocs.jsx.
netbirdio/docs#786: Modifies docsNavigation group structure and open-state behavior in NavigationDocs.jsx.

Suggested reviewers: mlsmaycon

Poem

🐇 I hopped through zones and routes today,
A ClusterIP now knows the way.
With peers in pairs and DNS in tune,
The docs now sing a Kubernetes rune.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the main change: adding a new Kubernetes high-availability how-to doc. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx

Oops! Something went wrong! :(

ESLint: 9.39.4

TypeError: Converting circular structure to JSON
--> starting at object with constructor 'Object'
| property 'configs' -> object with constructor 'Object'
| property 'flat' -> object with constructor 'Object'
| ...
| property 'plugins' -> object with constructor 'Object'
--- property 'react' closes the circle
Referenced from:
at JSON.stringify ()
at file:///node_modules/@eslint/eslintrc/lib/shared/config-validator.js:308:45
at Array.map ()
at ConfigValidator.formatErrors (file:///node_modules/@eslint/eslintrc/lib/shared/config-validator.js:299:23)
at ConfigValidator.validateConfigSchema (file:///node_modules/@eslint/eslintrc/lib/shared/config-validator.js:330:84)
at ConfigArrayFactory._normalizeConfigData (file:///node_modules/@eslint/eslintrc/lib/config-array-factory.js:676:19)
at ConfigArrayFactory._loadConfigData (file:///node_modules/@eslint/eslintrc/lib/config-array-factory.js:641:21)
at ConfigArrayFactory._loadExtendedShareableConfig (file:///node_modules/@eslint/eslintrc/lib/config-array-factory.js:946:21)
at ConfigArrayFactory._loadExtends (file:///node_modules/@eslint/eslintrc/lib/config-array-factory.js:814:25)
at ConfigArrayFactory._normalizeObjectConfigDataBody (file:///node_modules/@eslint/eslintrc/lib/config-array-factory.js:752:25)

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx`:
- Around line 115-117: The Note mentions setting `maxSurge: 0` as a solution for
rolling-update constraints with `whenUnsatisfiable: DoNotSchedule`, but the
documented example only shows pod-spec configuration through
`spec.workloadOverride.podTemplate`. Since `maxSurge` is a Deployment-level
`strategy.rollingUpdate` field rather than a pod-spec field, either add
clarification with an example showing how to configure `maxSurge: 0` through the
operator if it is supported, or remove the mention of `maxSurge: 0` from the
Note and keep only the option about maintaining more schedulable nodes than
replicas.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ed468dec-b3a9-4cf8-8844-a40a66e36645

📥 Commits

Reviewing files that changed from the base of the PR and between 6b4aa80 and cce9e07.

⛔ Files ignored due to path filters (6)

public/docs-static/img/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service/01-dns-zone.png is excluded by !**/*.png
public/docs-static/img/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service/02-access-policy.png is excluded by !**/*.png
public/docs-static/img/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service/03-network.png is excluded by !**/*.png
public/docs-static/img/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service/04-pods-across-nodes.png is excluded by !**/*.png
public/docs-static/img/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service/friendly-dns-cname.png is excluded by !**/*.png
public/docs-static/img/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service/topology.svg is excluded by !**/*.svg

📒 Files selected for processing (2)

src/components/NavigationDocs.jsx
src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx

The operator's workloadOverride only exposes annotations, labels, podTemplate, and replicas — there is no hook for the Deployment's strategy.rollingUpdate.maxSurge. Keep the achievable workaround (more schedulable nodes than replicas).
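The constraint can be expressed as a simple check. This sketch takes the four field names from the commit message above as its only source; the helper and the sample override dictionaries are hypothetical and do not reflect the operator's real validation logic.

```python
# Fields the commit message says workloadOverride exposes.
SUPPORTED_OVERRIDE_FIELDS = frozenset({"annotations", "labels", "podTemplate", "replicas"})


def unsupported_override_fields(workload_override):
    """Return the top-level keys that workloadOverride has no hook for."""
    return sorted(set(workload_override) - SUPPORTED_OVERRIDE_FIELDS)


# A Deployment-level rollout strategy is not a pod-spec field, so it has no place here.
attempted = {
    "replicas": 3,
    "strategy": {"rollingUpdate": {"maxSurge": 0}},  # hypothetical, unsupported
}
rejected = unsupported_override_fields(attempted)
```

Since `strategy.rollingUpdate.maxSurge` cannot be reached through these fields, keeping more schedulable nodes than replicas is the workaround that remains available.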

…fault)

SunsetDrifter added 13 commits June 23, 2026 09:39

docs: add topology diagrams to HA routing peers page

29283d8

Two SVG topology diagrams: replicas on a single node (single point of failure) and replicas spread one-per-node via topologySpreadConstraints. Embedded in Step 1 and the failure-domains section.

docs: add dashboard/terminal screenshots to the K8s how-to

ba2dd4d

Four screenshots (DNS zone, access policy, the kubernetes network with HA + 3 routing peers, kubectl pods-across-nodes). Drop the groups screenshot and renumber the <img> refs to match.

docs: swap in cleaner pods-across-nodes screenshot for Step 5

18e1958

docs: add CNAME dialog screenshot to the friendly-DNS appendix

cce9e07

Show the Add DNS Record dialog (CNAME 'app' -> nginx.default.k8s.company.internal) and align the example hostname to 'app' to match.

coderabbitai Bot reviewed Jun 23, 2026

Comment thread src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx Outdated

phillebaba reviewed Jun 29, 2026

Comment thread src/pages/manage/integrations/kubernetes/use-cases/route-to-a-kubernetes-service.mdx Outdated

docs: drop manual topology spread guidance (operator handles it by de…

1b28b06

…fault)

SunsetDrifter wants to merge 15 commits into main from cc/k8s-ha-routing-peers

2 participants
