Proxmox 2-Node Cluster: A Step-by-Step Guide

We once helped a small IT team in Manila who lost half their infrastructure overnight — and learned a simple truth: design choices matter more than luck. That team wanted high availability yet feared complex setups. We promised a clear path and delivered a plan that matched their budget and skills.

In this guide we explain why a strict majority—51%+ of votes—keeps a cluster healthy and what happens when a two-node design drops to 50% of the vote. We show how adding a third vote (qdevice) or choosing the right storage approach prevents downtime.

We compare practical options — ZFS with 15-minute replication, Ceph as shared-nothing storage, and an external SAN or fast 100G NAS — and discuss enterprise servers like the Dell PowerEdge R750 for long-term support.

Our goal: a future-ready design for predictable availability and clean operations in the Philippines. Book a free demo or architecture review on WhatsApp +639171043993 to shorten your deployment time.

Key Takeaways

  • Two-node designs need a third vote (qdevice) to avoid quorum loss.
  • ZFS, Ceph, and SAN/NAS each trade cost, performance, and supportability.
  • Plan network segmentation, qdevice placement, and HA policies before build.
  • Expect RPO/RTO implications — 15-minute replication defines recovery scope.
  • Philippine deployments must consider ISP stability and site latency.

Overview and Goals for a Future-Ready Setup in the Philippines

A practical infrastructure must prioritize uptime while fitting procurement and support realities in the Philippines. We design compact systems that meet strict availability targets and stay within budget.

We often pick a two-server arrangement when space or cost limits expansion. To avoid quorum loss if one host fails, we add a qdevice for a clear majority vote. This pattern keeps services running without complex hardware.

Why choose this design and what to expect

Expect two continuity modes: synchronous high-availability on shared storage, or near-HA with periodic replication and small RPO windows. We evaluate ZFS with 15-minute replication, Ceph as a shared-nothing option, and external SAN/100G NAS as alternate shared storage.

Key outcomes: availability, replication, and operational continuity

Our goals are simple — preserve critical services, limit downtime, and bound acceptable data loss. We map performance targets to hardware and network choices, focusing on database and file workloads.

  • Define measurable success: VM restart time, allowable data lag, and maintenance windows.
  • Prioritize low-latency fiber links with LTE/5G as backup for site resilience.
  • Plan hardware lifecycle with enterprise servers and known NICs to ease RMAs.

Ready to align targets and budget? Chat with us on WhatsApp +639171043993 before procurement to weigh your deployment options and confirm the best fit for your data and operations.

Understanding Quorum, Votes, and the Role of a QDevice

Maintaining a true majority is the single most important rule that keeps distributed services safe and predictable. In practice that means a system must hold at least 51% of votes to keep running. When the majority is lost, services stop to protect data integrity.

Quorum basics: why 51%+ votes are mandatory

Quorum is the safety mechanism that prevents split-brain—only the majority partition can serve workloads. A healthy quorum avoids conflicting writes and unintended data loss.

Two nodes, three votes: how a qdevice prevents split-brain

With only two hosts, losing one leaves you at 50% and the system halts. Adding an external qdevice gives a third, independent vote so the healthy side keeps serving. In this setup, a failed host leaves 2/3 votes—about 66%—and operations continue.
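
A quick way to sanity-check this vote math is pvecm, the Proxmox cluster CLI; the field labels below are approximate and vary slightly between releases.

  # Show quorum state, vote counts, and whether the external qdevice is registered
  pvecm status

  # With two nodes plus a qdevice, expect figures along these lines:
  #   Expected votes: 3
  #   Total votes:    3
  #   Quorum:         2
  #   Flags:          Quorate Qdevice
  # If one host drops, total votes fall to 2 but the cluster should stay quorate.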

“Designing quorum is not optional — it is the single control that prevents split-brain and data corruption.”

Where to host the qdevice matters: pick a neutral, stable site with low jitter and reliable reachability. We recommend a third network location—preferably outside the two production sites—so link failures don’t take the tie-breaker offline.

  • Monitor quorum status, corosync link health, and fencing events for early alerts.
  • Document runbooks: how to verify majority, steps when one host is isolated, and when to intervene manually.
  • Secure the qdevice with access controls and certificates so votes cannot be spoofed.

Need help choosing a neutral site for the qdevice? Message WhatsApp +639171043993 and we will guide placement, tests, and failure drills.

Requirements and Planning: Hardware, Network, and Time

Good planning starts with clear hardware choices that match workload priorities and site realities. We size CPU and RAM to fit database, application, and analytics mixes, then map those needs to two comparable nodes.

Server specs, CPU/RAM sizing, and NIC planning

We recommend enterprise servers such as the Dell PowerEdge R750 for stronger vendor support. Design disk tiers with RAID10 for SSD performance, HDD for capacity, and mirrored boot devices for resilience.

Use redundant NICs—dual-port 25/100G for storage and separate 1/10G ports for management and VM uplinks. Validate PSU redundancy and out-of-band management. Get a sizing consult on WhatsApp +639171043993 before ordering servers and NICs.

Network latency and link redundancy considerations

Keep corosync round-trip times low to avoid false quorum events. Plan fiber as the primary link and an LTE/5G path as a control-plane failover. Document IP addressing, VLANs, and MTU from day one.
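
Before go-live, a minimal latency check between the two hosts is worth running; the 10.10.10.12 address below is a placeholder for the peer's corosync interface.

  # Measure round-trip time and jitter on the corosync link (200 quick probes)
  ping -c 200 -i 0.2 10.10.10.12
  # Review the rtt min/avg/max/mdev summary: aim for low single-digit millisecond
  # averages and a small mdev (jitter) figure before trusting the link for quorum.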

Philippines context: ISP stability and backup links

“Clusters depend on stable networking—low-latency links and a reliable third-vote location.”

  • Schedule implementation windows, soak tests, and rollback plans to reduce business impact.
  • Stock spare parts—fans, NICs, and SSDs—to cut mean time to repair.

Choosing Storage: ZFS Replication, Ceph, or SAN/NAS

Picking the right storage path sets expectations for downtime, cost, and daily operations. We evaluate three practical routes and match them to business needs in the Philippines.

When ZFS with periodic replication makes sense

ZFS on each server with 15-minute replication is cost-effective for small sites. It gives clear RPOs and keeps hardware simple.

Trade-off: VMs restart on the surviving host after a failure. Expect a short outage and possible data loss within the last replication window. This option requires a separate qdevice for safe failover.

Ceph for shared-nothing storage: trade-offs in two-node designs

Ceph delivers true shared-nothing semantics and continuous replication across nodes.

However, two-server deployments are tricky and raise operational complexity. Operators often need extra monitoring and a tie-breaker monitor to keep health checks reliable.

External SAN or fast NAS for shared storage and simpler HA

A fast SAN or 25/100G NAS provides true shared storage. That simplifies HA and enables live migration and minimal downtime for business-critical VMs.

Performance vs cost: SSD, HDD tiers, and RAID levels

Pick RAID10 for database write predictability and RAIDZ for capacity efficiency. Use SSD tiers for hot workloads and HDD for cold data to balance cost and performance.

  • Align storage choice with recovery expectations—critical VMs usually justify shared storage.
  • Assess team skills: Ceph needs more operational expertise than a managed SAN/NAS.
  • Request a storage workshop on WhatsApp +639171043993 to review options and tests.

Option | Strength | Weakness | Best for
ZFS + replication | Low cost, clear RPOs | Short outage on failover; needs qdevice | Budget sites with planned RTO
Ceph (shared-nothing) | Continuous replication, scalable | Operational complexity for small teams | Scale-out environments with skilled ops
External SAN / 25–100G NAS | Seamless HA, live migration | Higher CAPEX and network demands | Business-critical VMs needing minimal downtime

Backup Strategy Before You Start

Before any build begins, treat backups as the non-negotiable safety net for every migration or upgrade.

We require full VM backups before changes—this protects against migration mistakes and config errors. In practice, teams often start with an offsite NAS when local servers run risky filesystems like ext4 on RAID0 without redundancy.

Full VM backups, offsite NAS, and restore testing

We mandate full backups for all VMs and system images prior to work. Follow the 3-2-1 rule: three copies, two media types, one offsite. An offsite NAS is a practical baseline in the Philippines—affordable and easy to manage.
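
As a sketch of that rule with the built-in tooling, assuming a backup storage named backup-nas (an NFS export on the offsite NAS) has already been added to the datacenter:

  # Full, compressed, snapshot-mode backup of VM 100 to the offsite NAS
  vzdump 100 --storage backup-nas --mode snapshot --compress zstd

  # Back up every guest on the host in one pass before a risky change window
  vzdump --all --storage backup-nas --mode snapshot --compress zstd

  # Prove the RTO by restoring into a scratch VMID (archive path is illustrative)
  # qmrestore /mnt/pve/backup-nas/dump/vzdump-qemu-100-....vma.zst 9100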

  • Schedule restore testing to verify RTO by actually restoring representative VMs and validating application health.
  • Define retention: daily incrementals and weekly fulls, balanced to budget and compliance.
  • Encrypt backup data at rest and in transit to protect sensitive information.
  • Map backup windows to avoid production impact—throttle bandwidth and IOPS as needed to meet time constraints.
  • Verify application-consistent snapshots for databases with guest agents or scripts.
  • Document runbooks: who initiates restores, where to restore, and how to reroute dependencies.

Secure your environment—ask for our backup checklist on WhatsApp +639171043993.

We also recommend periodic validation of replication paths and a short drill plan for restores. If you want to go deeper on policies or need a fast validation run, message us and we will guide the process.

Network Topology and Segmentation for Cluster Traffic

Network design defines whether your high-availability plan survives routine outages or becomes a crisis. We separate traffic planes so a single fault cannot take down management, storage, or VM services. Low latency and reliable links are essential—especially in a two-node plus qdevice setup where votes must travel fast.

Management, storage, and VM networks separation

Keep planes distinct: management, corosync/cluster, storage, and VM traffic each use dedicated VLANs or physical NICs. This containment reduces blast radius and simplifies troubleshooting.

Plan IPAM and deterministic routing from day one. Use ACLs to limit access and speed root-cause analysis when problems appear.

Jumbo frames, bonding, and VLAN best practices

Use bonding (LACP) for resilience and throughput on storage and VM uplinks. Bonding smooths failover and increases aggregated bandwidth.

Enable jumbo frames end-to-end on storage paths and validate MTU consistency to avoid hidden fragmentation. Document switch trunks, MTU, VLAN IDs, and storm control so hardware swaps are predictable.
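
For illustration, a trimmed /etc/network/interfaces fragment in the ifupdown2 style Proxmox uses; interface names, VLAN ID, and addresses are placeholders, and the switch ports must be configured to match the LACP bond.

  auto bond0
  iface bond0 inet manual
      bond-slaves enp65s0f0 enp65s0f1    # dual-port storage NICs (names are examples)
      bond-mode 802.3ad                  # LACP for resilience and aggregate bandwidth
      bond-xmit-hash-policy layer3+4
      mtu 9000                           # jumbo frames on the storage path

  auto bond0.20
  iface bond0.20 inet static             # storage VLAN 20 (example ID)
      address 10.20.0.11/24
      mtu 9000

  auto vmbr1
  iface vmbr1 inet manual                # VM traffic bridge on its own uplink
      bridge-ports enp193s0f0
      bridge-stp off
      bridge-fd 0

Validate that jumbo frames actually pass end-to-end, for example with ping -M do -s 8972 against the peer's storage address, before trusting the storage path under load.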

  • Isolate corosync on a low-jitter path and apply QoS so vote traffic stays timely during bursts.
  • Validate cross-site bandwidth and latency budgets — keep replication and HA traffic within defined performance envelopes.
  • Instrument telemetry for link errors, drops, and latency percentiles to catch degradations before they impact nodes.

Get a network design review—message WhatsApp +639171043993.

Step-by-Step: Build the Proxmox Cluster

Begin with a short checklist: updates applied, system time synced, and hostnames set to their canonical FQDNs. Small, consistent steps reduce surprises during join and later operations.

Prepare both hosts: patch packages, enable NTP or chrony for precise time, and use canonical hostnames. Configure key-based SSH and consistent user accounts so joins are fast and repeatable.
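
A minimal preparation sketch for each host; the node names, domain, and 10.10.10.0/24 management addresses are placeholders.

  # Patch the host and set its canonical FQDN
  apt update && apt -y dist-upgrade
  hostnamectl set-hostname node1.example.local

  # Make sure both hosts resolve each other consistently
  echo "10.10.10.11 node1.example.local node1" >> /etc/hosts
  echo "10.10.10.12 node2.example.local node2" >> /etc/hosts

  # Verify time sync (current Proxmox releases ship chrony by default)
  chronyc tracking
  timedatectl status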

Create the cluster on the first host

Define a clear cluster name and bind corosync to the low-jitter management network. Verify ring addresses and confirm no port or firewall conflicts exist before proceeding.

Join the second host and verify health

Initiate the join from the second system and validate the key exchange. Run pvecm status or equivalent to check quorum and member lists. Watch logs for DNS or time drift warnings.
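
In command form, the flow looks roughly like this; the cluster name and corosync addresses are placeholders, and the join prompts for the first node's root credentials.

  # On node1: create the cluster and bind corosync link 0 to the low-jitter network
  pvecm create ph-cluster --link0 10.10.10.11

  # On node2: join using node1's address, declaring node2's own link 0 address
  pvecm add 10.10.10.11 --link0 10.10.10.12

  # On either node: confirm membership, ring state, and quorum
  pvecm status
  pvecm nodes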

  1. Define initial storage backends — local ZFS pools or a shared target — so VMs have placement options and replication policies are clear.
  2. Test a maintenance event: migrate a non-critical VM and observe behavior under a controlled host drain to validate HA and performance.
  3. Prepare for qdevice integration: confirm routes, firewall rules, and reachability to the third-vote host to preserve quorum—without it, if one host fails the remaining system will stop at 50% votes.

Confirm health regularly: check membership, ring stability, and replication status after each change. We can guide your build live — book a free session via WhatsApp +639171043993 and we will walk through any step with you.

“Preserve quorum from day one — it is the single control that keeps data safe during failures.”

Integrating a QDevice for Stable Quorum

A third, independent vote prevents the 50/50 deadlock that halts small deployments. With two hosts and a qdevice the system holds three votes. If one node fails, 2/3 votes remain and services keep running. Without the qdevice, losing one host drops you to 50% and the system halts to protect data.

Where to place the qdevice and latency targets

Place the qdevice at a reliable third site with separate power and network. Choose a location outside the two production sites to avoid a shared outage.

Target low, stable latency from each host to the qdevice. Avoid asymmetric routing and packet loss—these cause false failures and slow recovery.

Configure and test vote behavior during failures

  • Secure credentials: configure auth and membership so the qdevice participates only when validated (see the setup sketch after this list).
  • Validate failures: disconnect one host at a time and confirm the remaining hosts plus the qdevice keep quorum and the cluster stays healthy.
  • Simulate flaps: test intermittent link loss, watch corosync, fencing, and service continuity.
  • Monitor and alert: track vote tallies and set thresholds so NOC teams act early.
  • Change controls: require testing for any routing or firewall change that affects the qdevice path.
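
Under those constraints, the standard QDevice flow is short; the third-site address 10.10.30.5 below is illustrative, and the tie-breaker can be any small, stable Debian-based host or VM.

  # On the third-site host: run the external vote daemon
  apt install corosync-qnetd

  # On both cluster nodes: install the qdevice client
  apt install corosync-qdevice

  # From one cluster node: register the external vote (setup connects over SSH)
  pvecm qdevice setup 10.10.30.5

  # Verify the qdevice is counted, then pull one host's link and re-check
  pvecm status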

We’ll help you deploy and test the qdevice—WhatsApp +639171043993. Want a guided setup? Ask for our configuration steps and runbooks, or book a validation call.

Storage Setup and Data Replication Options

Storage choices determine how quickly services recover and how much data you may lose. We design topologies for predictable RPOs and clear operational steps. Need a storage topology walkthrough? Ping WhatsApp +639171043993.

ZFS pools, datasets, and 15-minute replication cadence

Build ZFS pools with mirrored vdevs or RAID10-like layouts for write-heavy workloads. Create datasets per VM for granular replication and fast restores.

Schedule a 15-minute replication cadence for critical datasets. This balances RPO against network use and snapshot churn.
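
A sketch of that layout for one VM, assuming four SSDs, a pool named tank, and VM ID 100; device names, the storage ID, and the bandwidth cap are placeholders, and the same ZFS storage must exist on the peer node for replication to work.

  # Mirrored-pair (RAID10-style) pool for write-heavy VM disks
  zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb mirror /dev/sdc /dev/sdd

  # Register it as VM storage on this node
  pvesm add zfspool vm-ssd --pool tank --content images,rootdir

  # Replicate VM 100 to node2 every 15 minutes, capped at 50 MB/s
  pvesr create-local-job 100-0 node2 --schedule "*/15" --rate 50

  # Check job state and last successful sync times
  pvesr status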

Setting up Ceph or GlusterFS in two-node contexts

For shared-nothing systems, plan an arbiter or monitor to avoid split decisions. Without a third witness, consistency and performance suffer.

Connecting to a SAN or high-speed NAS over 25/100G

Use dual-controller SANs or a fast NAS with multipath. Present shared LUNs or NFS/iSCSI so VMs can live-migrate with minimal disruption.
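
One hedged example of presenting that storage to both nodes, with the NAS address and export path as placeholders (iSCSI setups add LUN and multipath configuration not shown here):

  # Shared NFS export from the fast NAS, registered once at datacenter level
  pvesm add nfs shared-nas --server 10.20.0.50 --export /export/pve --content images,iso

  # With shared storage in place, a live migration needs no bulk disk copy
  qm migrate 100 node2 --online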

“Replication is not shared storage — after a host loss, VMs restart on the peer using the latest replicated data.”

  • Size WAL/journals and networks to match expected IO and performance.
  • Benchmark before go-live with fio or VDI-style tests.
  • Validate migration paths; use shared storage for seamless live moves.

Option | When to use | Key requirement
ZFS + 15-min replication | Cost-sensitive sites | Bandwidth for regular snapshots
Ceph / GlusterFS | Scale-out, team skilled in ops | Third witness and higher monitoring
External SAN / 25–100G NAS | Business-critical VMs | Redundant controllers and multipath

High Availability, Fencing, and Failover Policies

Failover rules are your safety net: clear policies cut guesswork when systems lose contact. We define HA groups, set priorities, and map migration behavior so critical services restart predictably.

On shared storage, VMs can live-migrate with minimal disruption. With periodic replication, VMs restart on the surviving server using the last replicated state. Both options require explicit failover paths documented in runbooks.

HA groups, priorities, and migration behavior

Group mission-critical VMs and pin them to preferred hosts while allowing controlled failover. Use priorities to order restarts and avoid resource contention during recovery.

Test planned migrations and emergency evacuations. Validate that high-priority VMs come up first and less-critical workloads wait in the HA queue.
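
As a rough illustration with ha-manager, using a hypothetical group name and VM ID; a higher number after a node name means that node is preferred.

  # Group that prefers node1 but may fail over to node2
  ha-manager groupadd critical --nodes "node1:2,node2:1" --nofailback 1

  # Put the database VM under HA control inside that group
  ha-manager add vm:100 --group critical --state started

  # Review current placement and any queued HA actions
  ha-manager status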

Fencing configuration to avoid data corruption

Fencing is non-negotiable. Configure power fencing to hard-off an isolated server and prevent split writes. Where a qdevice provides quorum, automated fencing plus safe timeouts avoid risky manual interventions.

“Without quorum, services stop even if hardware is up; with a qdevice, the majority partition keeps running.”

  • Define fencing agents and verify power control paths for each option.
  • Simulate one host isolation and confirm the fenced system cannot remount shared disks.
  • Set maintenance modes and graceful draining for planned upgrades to avoid forced restarts.
  • Align SLAs with recovery expectations — document max failover time and acceptable interruptions.
  • Monitor HA queues and tasks to catch stuck migrations or failed fences quickly.
  • Keep runbooks current — name approvers for manual actions when automation fails.

Scenario | Behavior | Recommended Action
Shared storage failover | Live-migrate or restart with minimal downtime | Prioritize VMs and test live migration
Replicated storage failover | Restart on surviving server using last replica | Validate replication RPO and restart order
Network isolation | Quorum checks trigger fencing if needed | Confirm fencing works and document manual override

We can define HA policies that match your SLAs—WhatsApp +639171043993. If you want to dig into operational runbooks or test plans, we will guide the exercises and validate results.

Performance Tuning and Reliability Checks

Performance tuning is as much about reliability as it is about raw speed. Small, deliberate adjustments in ZFS and system settings yield steady, predictable results for business workloads.

ZFS ARC, SLOG/L2ARC, and RAID10 considerations

Reserve RAM for the ZFS ARC, but leave headroom for VMs and hypervisor processes on each node. Too-large ARC sizes starve guests; too-small ones waste fast memory.
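
For example, capping the ARC at 16 GiB on a RAM-rich host (the figure is a placeholder; size it to your own VM mix and verify with arcstat afterwards):

  # /etc/modprobe.d/zfs.conf -- cap the ARC so guests keep their memory
  options zfs zfs_arc_max=17179869184    # 16 GiB in bytes

  # Refresh the initramfs so the module option applies early on the next boot
  update-initramfs -u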

Add a power-protected SLOG for sync-heavy writes—databases and NFS mounts benefit from a low-latency commit path. Use L2ARC only after measuring read patterns; it adds metadata overhead and consumes RAM.

Favor mirrored vdevs / RAID10 for transactional workloads—consistent latency beats peak throughput for many business apps. This choice also helps predictable behavior during failures.

Benchmarking storage and validating VM performance

Benchmark with fio using random read/write, mixed workloads, and realistic queue depths that mirror your VMs. Synthetic numbers help, but application KPIs matter most.
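
A hedged starting point for such a run, aimed at a scratch file rather than live VM disks; the path, size, and queue depth are placeholders to adjust toward your real workload profile.

  # 70/30 random read/write at 4k, four jobs, queue depth 16, five-minute run
  fio --name=vmmix --filename=/tank/bench/testfile --size=20G \
      --rw=randrw --rwmixread=70 --bs=4k --iodepth=16 --numjobs=4 \
      --ioengine=libaio --runtime=300 --time_based --group_reporting
  # Add --direct=1 only where the filesystem supports O_DIRECT; judge results
  # against application KPIs, not just the raw IOPS figure.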

  • Measure application-level KPIs—transaction latency, query time, and end-user response—not just IOPS.
  • Monitor thermal and power headroom; sustained performance needs stable cooling and redundant PSUs.
  • Automate checks: SMART, ZFS scrubs, and link error counters to catch reliability issues on both nodes early.

“Tune for predictable behavior under load—then verify with real workloads.”

Ask for our tuning checklist on WhatsApp +639171043993 for configuration examples and measured profiles.

Troubleshooting, Maintenance, and “Last Edited” Operational Notes

Operational notes and clear runbooks cut mean time to repair. Keep concise steps for common faults so teams work fast and consistently.

Common errors and quick checks

No quorum symptoms include blocked management actions and paused HA. Inspect corosync links, vote counts, and qdevice reachability. If a two-node setup loses one host without a third vote, quorum drops to 50% and services halt—treat this as urgent.

Split-brain risks disappear when fencing is validated. Fix time drift by pointing hosts to a reliable NTP authority and set clock skew alarms. Time issues break authentication and can prevent services from joining.
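
A few quick commands that cover those checks, assuming a standard install where corosync runs as a systemd unit:

  # Quorum, vote counts, and qdevice reachability
  pvecm status
  corosync-quorumtool -s

  # Recent corosync link and membership events
  journalctl -u corosync --since "1 hour ago"

  # Time sync health on each host
  chronyc tracking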

Maintenance, backups, and recovery drills

Apply rolling updates by draining one node, patching, then verifying before the next. Test backup and restore procedures quarterly to prove RPO/RTO and catch config drift. Teams often begin with offsite NAS backups when local arrays are risky.
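
A rolling-update pass on one node might look like the sketch below; VM IDs and node names are placeholders, HA-managed guests can also be relocated by the HA stack itself, and replicated (non-shared) disks need the migration's local-disk option rather than a plain online move.

  # Drain node1: live-migrate guests to node2 before patching
  qm migrate 100 node2 --online
  qm migrate 101 node2 --online

  # Patch and reboot node1, then confirm it rejoins with quorum intact
  apt update && apt -y dist-upgrade
  reboot
  pvecm status        # run once the node is back up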

  • Test DR drills: simulate site loss and measure recovery of critical VMs and dependencies.
  • Review replication logs: confirm schedules, retention, and bandwidth use.
  • Track last edited notes—record config changes, firmware updates, and owners with timestamps.

“Keep your runbooks current—schedule a quarterly review via WhatsApp +639171043993.”

Issue | Symptom | Immediate Action
No quorum | Management blocked, HA paused | Check votes, corosync, and qdevice reachability
Split-brain | Conflicting writes possible | Ensure fencing; isolate minority partition
Time drift | Auth failures, join errors | Sync to NTP authority; alert on skew

Conclusion

We recommend a clear, repeatable plan: reliable votes, tested storage, and practiced restores.

Experience shows three practical paths for small two‑server cases. Use ZFS with 15‑minute replication plus a qdevice for a defined RPO. Choose Ceph/GlusterFS only with an extra witness or monitor. Or pick external SAN / fast NAS (25/100G) for true shared storage and seamless failover.

Always include a qdevice — without it, one host down drops votes and services stop. Validate fencing, network segmentation, and backups. Test restores and record “last edited” runbooks so machines recover fast.

Ready to move forward? Book a free demo and deployment plan on WhatsApp +639171043993 and get your execution blueprint.

FAQ

What is the recommended topology for a two‑server high‑availability setup?

Use two primary servers with a third voting entity (a qdevice) located on a lightweight, reliable host — for example, a small VPS or a dedicated management appliance. This avoids split‑brain by giving you three votes so the environment can reach quorum even if one server fails. Ensure low latency between the qdevice and the servers.

Why is quorum important and how does the voting system work?

Quorum prevents split‑brain and data corruption by requiring a majority of votes to make cluster decisions. With three total votes, at least two are needed — this lets the remaining server keep services running if its peer goes down. Plan vote distribution and test failure scenarios to confirm expected behavior.

Can we achieve true shared storage with only two machines?

True shared‑nothing storage like Ceph is suboptimal with just two physical hosts — its monitors need an odd-numbered quorum and data redundancy generally wants three or more hosts. Better options are ZFS replication between servers, or using an external SAN/NAS appliance on a fast network. That gives simpler HA and predictable failover.

When should we choose ZFS replication versus a SAN?

Choose ZFS replication if you want per‑VM snapshots, easy rollback, and cost‑effective durability across the two machines. Pick a SAN/NAS when you need block‑level shared storage, simpler live migration, and centralized capacity — provided you have a reliable, high‑speed connection.

How often should replication run for production VMs?

A 15‑minute replication cadence is a practical balance for many businesses — it reduces potential data loss while keeping transfer costs manageable. Critical workloads may need more frequent snapshots or synchronous storage solutions; noncritical VMs can use hourly or daily schedules.

What hardware specs and networking should we plan for?

Size CPU and RAM to match peak VM loads and plan NICs for separation of management, storage, and VM traffic. Use bonding for redundancy and at least 10GbE for storage links — 25/100GbE where heavy I/O or many VMs demand it. Choose enterprise SSDs for performance tiers and HDDs for capacity.

How do we handle latency and link redundancy across multiple sites in the Philippines?

Aim for low latency between sites — under 10–20 ms is ideal for storage replication. Use diverse fiber paths where possible and add LTE or secondary ISP links as failover. Monitor ISP stability and design replication schedules to tolerate occasional higher latency.

Where should the qdevice be located and what latency is acceptable?

Place the qdevice in a third location with stable connectivity — a public cloud region or remote office. Keep round‑trip latency under 100 ms to avoid prolonged fencing or split‑brain risks. Test failover to ensure the qdevice remains reachable during real outages.

What backup strategy should we implement before making changes?

Run full VM backups to an offsite NAS or cloud bucket and verify restores with test restores. Keep at least one consistent, recent backup before major changes. Automate backup validation and document recovery steps to reduce error during emergency restore.

What network segmentation is recommended for cluster traffic?

Separate management, storage replication, and tenant VM traffic on distinct VLANs or physical NICs. Use jumbo frames on storage VLANs if supported, and apply bonding and multipath to eliminate single points of failure. This improves performance and limits blast radius.

How do we set up and verify the two‑server environment step by step?

Prepare both hosts with updates, correct hostnames, and time sync. Create the cluster on the first server, add the second, then configure the qdevice and storage replication. Validate cluster health, run failover tests, and check backups. Document results and revise configurations as needed.

How should fencing and failover policies be configured?

Implement fencing to power off unresponsive hosts and prevent data corruption. Define HA groups and priorities so critical VMs restart first. Test fencing paths and ensure administrative access to remote power controllers or out‑of‑band management during recovery.

What performance tuning is essential for ZFS and VM workloads?

Tune ZFS ARC sizes, consider SLOG for sync workloads, and use L2ARC only when you have spare fast storage. Prefer RAID10 for mixed workloads when performance matters. Benchmark with realistic VM loads and iterate based on observed bottlenecks.

How do we avoid split‑brain and resolve it if it happens?

Prevent split‑brain by using a qdevice, reliable networking, and fencing. If split‑brain occurs, stop services, evaluate which replica is authoritative, and follow documented recovery steps to reconcile data. Regular DR drills reduce response time and risk.

What maintenance practices reduce downtime and risk?

Perform rolling updates, validate backups before upgrades, and schedule maintenance windows. Keep time synchronization strict, monitor disks and network health, and run periodic disaster recovery exercises. Track “last edited” notes in change logs for traceability.

How do we measure success after deployment?

Verify that HA failover meets RTO goals, replication meets RPO targets, and performance stays within service levels. Monitor alerts, run scheduled DR tests, and review capacity trends. Use those metrics to refine hardware, replication cadence, and backup policies.
