You’ve decided to build an ARM cluster. Maybe you want to run k3s. Maybe you’re consolidating a handful of Raspberry Pis under your desk into something more intentional. You’ve done the reading, watched the YouTube videos, compared benchmark charts, and somehow ended up more confused than when you started.
The Raspberry Pi CM5 is the familiar choice. It has broad software support, a mature ecosystem, and enough performance for a wide range of homelab workloads. The Turing Pi RK1 brings more memory capacity, PCIe Gen 3 ×4 connectivity, and the compute density needed for serious cluster deployments. The Jetson Orin Nano delivers impressive AI performance and access to NVIDIA’s CUDA ecosystem, while the Orange Pi 5 offers strong ARM performance at a budget-friendly price point.
On paper, all four look compelling. In practice, they solve very different problems.
This guide cuts through the marketing, spec sheets, and benchmark cherry-picking. We’ll look at what each module does well, where it falls short, and which one makes the most sense for different homelab workloads. If you’re trying to decide between the CM5, RK1, Jetson Orin Nano, and Orange Pi 5 in 2026, this is the comparison that matters.
What This Guide Covers
This guide compares four of the most discussed ARM platforms available to homelab builders in 2026: the Raspberry Pi CM5, the Turing Pi RK1, the NVIDIA Jetson Orin Nano, and the Orange Pi 5.
You’ll find a side-by-side specification breakdown, honest assessments of each platform’s real-world strengths and limitations, benchmark context drawn from actual cluster deployments, and direct recommendations for specific workloads, ranging from k3s clusters and Longhorn persistent storage to local AI inference and budget ARM experimentation.
This guide is written for homelab builders, k3s users, and hardware experimenters who want a clear answer before committing to a purchase.
Quick Comparison
| | CM5 | RK1 | Jetson Orin Nano | Orange Pi 5 (RK3588S) |
| --- | --- | --- | --- | --- |
| CPU | 4× Cortex-A76 @ 2.4 GHz | 4× Cortex-A76 @ 2.4 GHz + 4× Cortex-A55 @ 1.8 GHz | 6× Cortex-A78AE | 4× Cortex-A76 @ 2.4 GHz + 4× Cortex-A55 @ 1.8 GHz |
| Max RAM | 16 GB | 32 GB | 8 GB | 16 GB |
| PCIe | Gen 2 ×1 | Gen 3 ×4 | Gen 3 ×4 (via carrier board) | Gen 2 ×1 |
| AI Accelerator | None | 6 TOPS NPU | 1024 CUDA cores + 32 Tensor Cores; up to 40 TOPS | 6 TOPS NPU |
| Entry Price | From $45 | From $249 | $249 (Developer Kit) | From ~$65 |
| Core Strength | Ecosystem, documentation, community support | Memory capacity, PCIe bandwidth, cluster-native form factor | GPU acceleration, CUDA ecosystem, AI inference performance | Low cost, strong CPU performance |
| Core Weakness | Limited PCIe bandwidth, 16 GB RAM ceiling | Higher entry cost than CM5 | Cost premium, higher power draw, 8 GB RAM ceiling | Limited PCIe, smaller ecosystem, not cluster-oriented |
Specifications from official vendor documentation. Pricing reflects entry-level configurations available during mid-2026 and may vary by region and distributor.
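If you already know your hard requirements, the table can be treated as a small dataset and filtered. The sketch below is only an illustration: the `Module` class, the `shortlist` helper, and the field names are made up for this example, and the values are copied from the table above rather than read from any vendor API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One column of the comparison table (values copied by hand)."""
    name: str
    max_ram_gb: int
    pcie_gen: int
    pcie_lanes: int
    accelerator: str


# Hypothetical dataset mirroring the Quick Comparison table.
MODULES = [
    Module("CM5", 16, 2, 1, "none"),
    Module("RK1", 32, 3, 4, "6 TOPS NPU"),
    Module("Jetson Orin Nano", 8, 3, 4, "CUDA GPU, up to 40 TOPS"),
    Module("Orange Pi 5", 16, 2, 1, "6 TOPS NPU"),
]


def shortlist(min_ram_gb=0, min_pcie_gen=1, min_pcie_lanes=1):
    """Return the names of modules that meet every minimum requirement."""
    return [
        m.name
        for m in MODULES
        if m.max_ram_gb >= min_ram_gb
        and m.pcie_gen >= min_pcie_gen
        and m.pcie_lanes >= min_pcie_lanes
    ]


if __name__ == "__main__":
    print("At least 16 GB RAM:", shortlist(min_ram_gb=16))
    print("PCIe Gen 3 x4:", shortlist(min_pcie_gen=3, min_pcie_lanes=4))
    print("Both:", shortlist(min_ram_gb=16, min_pcie_gen=3, min_pcie_lanes=4))
```

A filter like this only captures the hard limits. Ecosystem maturity, form factor, and software stack still need the qualitative discussion in the sections that follow.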
Raspberry Pi CM5: The Safe Choice
The CM5 is Raspberry Pi’s latest compute module. At its core is Broadcom’s BCM2712 SoC: four Cortex-A76 cores running at up to 2.4 GHz. Compared to the CM4, it’s a substantial jump in CPU performance while maintaining compatibility with the broader Raspberry Pi ecosystem.
Memory options range from 2 GB to 16 GB of LPDDR4X-4267 RAM, while storage configurations include Lite variants with no onboard storage alongside 16 GB, 32 GB, and 64 GB eMMC models. Wireless versions add dual-band 802.11ac Wi-Fi and Bluetooth 5.0/BLE connectivity.
For most homelab workloads, the CM5 is comfortably powerful enough. Home Assistant, Pi-hole, Gitea, Nextcloud, media services, development environments, and lightweight Kubernetes clusters all run well on the platform. The jump from the CM4’s Cortex-A72 cores to Cortex-A76 cores makes the system feel noticeably more responsive under load.
The biggest limitation for cluster builders is storage expansion. The CM5 exposes a single PCIe Gen 2 ×1 lane, which signals at 5 GT/s and delivers roughly 500 MB/s of usable bandwidth to external devices after encoding overhead. NVMe storage is absolutely viable, but sustained storage-heavy workloads will eventually run into bandwidth limits that simply don’t exist on platforms offering PCIe Gen 3 ×4 connectivity. Persistent Kubernetes storage, database-heavy applications, and CI/CD workloads tend to expose this difference fastest.
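The gap between the two PCIe configurations is easy to estimate from the published signaling rates. The sketch below is a back-of-the-envelope calculation, not a benchmark: it uses the standard per-lane rates and line encodings (5 GT/s with 8b/10b for Gen 2, 8 GT/s with 128b/130b for Gen 3) and ignores packet and protocol overhead, so real NVMe throughput will be lower. The function name is just an illustrative choice.

```python
# Per-lane signaling rate (GT/s) and line-encoding efficiency by PCIe generation.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),      # Gen 2: 5 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),   # Gen 3: 8 GT/s, 128b/130b encoding
}


def link_bandwidth_mb_s(gen, lanes):
    """Theoretical one-direction link bandwidth in MB/s, before protocol overhead."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[gen]
    # GT/s * efficiency = usable Gbit/s per lane; / 8 = GB/s; * 1000 = MB/s.
    return rate_gt_s * efficiency / 8 * 1000 * lanes


if __name__ == "__main__":
    gen2_x1 = link_bandwidth_mb_s(2, 1)   # CM5, Orange Pi 5
    gen3_x4 = link_bandwidth_mb_s(3, 4)   # RK1, Orange Pi 5 Plus
    print(f"Gen 2 x1: {gen2_x1:.0f} MB/s")
    print(f"Gen 3 x4: {gen3_x4:.0f} MB/s")
    print(f"Ratio:    {gen3_x4 / gen2_x1:.1f}x")
```

By this estimate a Gen 2 ×1 link tops out at about 500 MB/s, while a Gen 3 ×4 link offers a little under 4 GB/s, close to an eightfold difference in headroom for NVMe storage.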
The reason many builders continue to choose Raspberry Pi hardware isn’t raw benchmark performance. It’s ecosystem maturity. Operating system support is excellent, container compatibility is rarely an issue, carrier board options are abundant, and troubleshooting resources are everywhere. When something breaks, chances are someone else has already encountered and solved the same problem.
Buy the CM5 if you want the easiest ARM platform to deploy, maintain, and troubleshoot. It’s an excellent choice for general-purpose homelab workloads, smaller Kubernetes clusters, and builders who value software compatibility and community support above all else.
Look elsewhere if your workloads are storage-intensive, memory-hungry, or designed to scale across multiple heavily utilized nodes. In those scenarios, PCIe bandwidth and memory capacity become more important than ecosystem maturity, and other platforms begin to pull ahead.
Turing Pi RK1: The Cluster-Native Option
The RK1 runs Rockchip’s RK3588, built on an 8 nm process. It pairs four Cortex-A76 cores at up to 2.4 GHz with four Cortex-A55 cores at up to 1.8 GHz and a shared 3 MB L3 cache, and it uses a 260-pin SO-DIMM connector that is physically compatible with the Jetson pin layout. RAM options are 8 GB, 16 GB, or 32 GB LPDDR4X, making the RK1 one of the few ARM compute modules available with 32 GB of memory.
Where the RK1 stands apart is the combination of memory capacity, storage bandwidth, and cluster density. The RK3588’s PCIe Gen 3 ×4 interface unlocks NVMe performance that simply isn’t possible on modules limited to a single PCIe Gen 2 lane. For Kubernetes clusters, Longhorn persistent storage, CI/CD runners, databases, and container-heavy workloads, that additional bandwidth translates directly into better real-world performance.
The Turing Pi benchmark series provides the details that matter for cluster planning. Under sysbench at 8 threads, the RK1 sustains 13,600-13,900 events per second over a 10-minute continuous run without thermal throttling, provided adequate cooling is in place. A full Linux 6.1 kernel compile via make -j8 on the onboard eMMC completes in 28-32 minutes. Moving the build to NVMe on the PCIe Gen 3 ×4 bus shortens that time, because storage I/O stops being the bottleneck. That bus delivers sequential throughput well beyond what a single PCIe Gen 2 lane can sustain, a meaningful real-world advantage for storage-heavy workloads.
Memory bandwidth measured via STREAM tops out around 21-22 GB/s on sequential access. General-purpose MEMCPY operations land around 8-9 GB/s. Running multiple memory-intensive workloads simultaneously does reduce available bandwidth per workload, which is worth considering when planning pod placement and resource allocation in dense clusters.
Thermal behavior is predictable: idle at 38-42°C, steady-state under full CPU load at 66-74°C with a passive heatsink, and throttle onset above approximately 80°C. Power draw runs 4-5 W at idle per node and 10-12 W under sustained CPU load. For a fully populated four-node Turing Pi 2.5 cluster, that works out to roughly 16-20 W at idle and 40-48 W under sustained load for the modules alone, which is remarkably power efficient compared to an equivalent x86 setup.
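To turn the per-node figures into a cluster power budget, the arithmetic can be sketched as follows. This is a rough estimate under stated assumptions: it covers the compute modules only and leaves out the carrier board, NVMe drives, fans, and power supply losses. The function names are illustrative, and the always-on energy figure assumes a constant draw all year, which real workloads will not match.

```python
# Per-node power ranges in watts, taken from the figures quoted above.
IDLE_W = (4, 5)
LOAD_W = (10, 12)

HOURS_PER_YEAR = 24 * 365


def cluster_power_w(nodes, per_node_range_w):
    """Scale a (low, high) per-node power range to a whole cluster."""
    low, high = per_node_range_w
    return nodes * low, nodes * high


def annual_kwh(watts):
    """Energy used in one year at a constant draw (an upper-bound style estimate)."""
    return watts * HOURS_PER_YEAR / 1000


if __name__ == "__main__":
    nodes = 4  # a fully populated Turing Pi 2.5
    idle_low, idle_high = cluster_power_w(nodes, IDLE_W)
    load_low, load_high = cluster_power_w(nodes, LOAD_W)
    print(f"Idle, modules only: {idle_low}-{idle_high} W")
    print(f"Load, modules only: {load_low}-{load_high} W")
    print(f"Worst case per year: {annual_kwh(load_high):.0f} kWh")
```

Even with all four nodes pinned at the top of the load range around the clock, the modules stay under about 420 kWh per year, which is the number to multiply by your local electricity rate.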
The RK3588 also includes a 6 TOPS neural processing unit (NPU) accessible through Rockchip’s RKNN-Toolkit2. It accelerates supported computer vision and inference workloads efficiently while consuming very little power. For local LLM deployments, however, the NPU is not a drop-in accelerator for llama.cpp. Most LLM workloads continue to rely primarily on CPU execution, so choosing models small enough to run acceptably on the CPU remains important.
Software support has matured significantly over the last few years. Ubuntu, Debian, Docker, k3s, Longhorn, and most modern ARM64 workloads run well on the platform, making the RK1 a practical choice for self-hosting, Kubernetes, and homelab infrastructure.
Buy the RK1 if you’re building a multi-node cluster where memory capacity, PCIe storage throughput, and workload density matter. This is the natural fit for k3s, Longhorn persistent storage, CI/CD pipelines, databases, self-hosted platforms, and any cluster expected to grow beyond a handful of lightweight services.
Look elsewhere if your highest priority is maximum ecosystem familiarity and community resources, or if your workloads are lightweight enough that you won’t benefit from the additional memory and storage bandwidth the RK1 provides.
NVIDIA Jetson Orin Nano: When GPU Inference Is the Actual Requirement
The Jetson Orin Nano 8GB combines six Cortex-A78AE CPU cores with an NVIDIA Ampere GPU featuring 1024 CUDA cores and 32 Tensor Cores. The module includes 8 GB of LPDDR5 memory delivering up to 68 GB/s of memory bandwidth and provides up to 40 TOPS of AI performance. NVIDIA also offers a higher-performance Super Mode through newer JetPack releases, which raises the performance ceiling and AI throughput at the cost of a higher power envelope. For the purposes of this comparison, we focus on the standard Jetson Orin Nano 8GB configuration, which represents the most common entry point for homelab and edge deployments.
The Jetson Orin Nano Super Developer Kit, which includes both the module and reference carrier board, is priced at $249. Standalone modules are also available through distributors for integration into custom carrier boards and embedded systems.
Where Jetson separates itself from every other module in this comparison is software. JetPack 6.x provides a mature AI stack that includes Ubuntu 22.04, CUDA, cuDNN, TensorRT, DeepStream, Isaac ROS, and optimized builds of popular machine learning frameworks. For developers building computer vision pipelines, robotics platforms, AI appliances, and edge inference systems, the software ecosystem is often just as valuable as the hardware itself.
The hardware is particularly well suited to TensorRT-optimized vision workloads, object detection pipelines, image classification, speech processing, and GPU-accelerated inference. This is one of the few ARM platforms where deploying AI models feels comparable to working on a small workstation rather than a single-board computer.
The challenge is that many homelab workloads simply don’t need CUDA. Running Gitea, Nextcloud, Home Assistant, Pi-hole, Kubernetes control planes, databases, or containerized services rarely benefits from the GPU acceleration you’re paying for. In those scenarios, the platform’s higher cost, higher power draw, and 8 GB memory ceiling become harder to justify compared to alternatives focused on general-purpose compute and storage performance.
Buy the Jetson Orin Nano if GPU-accelerated inference is the primary goal. Computer vision pipelines, TensorRT-optimized deployments, robotics projects, edge AI appliances, and multi-model inference workloads are exactly what this platform was built for.
Look elsewhere if your primary goal is self-hosting, Kubernetes, storage-heavy workloads, or general homelab infrastructure. Without a meaningful CUDA workload, much of the hardware’s value goes unused.
Orange Pi 5 and Orange Pi 5 Plus: The Budget Path
The Orange Pi 5 uses Rockchip’s RK3588S, and the “S” suffix matters. Compared to the full RK3588 found in the RK1 and Orange Pi 5 Plus, the RK3588S removes the PCIe Gen 3 lanes that make high-performance NVMe storage possible. As a result, the Orange Pi 5’s M.2 slot operates over PCIe Gen 2 ×1, limiting storage throughput to roughly the same range as a CM5 with NVMe attached. RAM configurations are available up to 16 GB.
The Orange Pi 5 Plus uses the full RK3588 and restores PCIe Gen 3 ×4 connectivity, making it the significantly more capable board for storage-intensive workloads. It also adds dual 2.5G Ethernet, additional I/O, and greater expansion flexibility. If you’re choosing between the two solely on technical merits, the Orange Pi 5 Plus is generally the better board.
Software support is available through official Ubuntu, Debian, and Android images, while community distributions such as Armbian expand the list further. The platform is mature enough for self-hosting, containers, development environments, and general homelab use, though the surrounding ecosystem remains smaller than Raspberry Pi’s. Documentation quality and community resources vary more between releases and distributions than most CM5 users are accustomed to.
The biggest limitation in the context of this guide is form factor. Neither Orange Pi 5 variant is a compute module, and neither is compatible with the Jetson-style 260-pin SO-DIMM connector used by the Turing Pi 2.5. These are standalone single-board computers rather than interchangeable cluster nodes.
Buy the Orange Pi 5 if you’re experimenting with ARM hardware on a budget and want strong performance for the money. For most buyers, the Orange Pi 5 Plus is worth the additional cost because it restores the PCIe Gen 3 ×4 connectivity that makes the RK3588 platform so compelling.
Look elsewhere if you’re building a modular cluster, need interchangeable compute modules, or want the deepest ecosystem and community support available.
Use-Case Recommendations
Building a multi-node k3s cluster with mixed workloads: RK1. The combination of PCIe Gen 3 ×4, up to 32 GB RAM per node, and a cluster-native form factor makes it the strongest option for running Kubernetes at scale on ARM hardware. Start with the complete Turing Pi 2.5 setup guide.
Running Kubernetes with Longhorn persistent storage: RK1. Longhorn benefits significantly from fast NVMe storage, and the RK1’s PCIe Gen 3 ×4 connectivity provides substantially more storage bandwidth than modules limited to a single PCIe Gen 2 lane. The k3s and Longhorn deployment guide covers the complete setup.
Largest ecosystem and community support: CM5. Raspberry Pi’s ecosystem remains one of the largest in the ARM world, with extensive documentation, tutorials, accessories, and community resources. While all of the platforms in this comparison have mature software support, the CM5 still benefits from the deepest pool of community knowledge and third-party hardware.
Local AI inference with GPU acceleration as the primary use case: Jetson Orin Nano 8GB. CUDA, TensorRT, and NVIDIA’s software ecosystem place it in a different category from the other modules in this comparison. If GPU-accelerated AI workloads are the goal, this is the platform to buy.
Local AI inference without a dedicated GPU: RK1. The integrated 6 TOPS NPU won’t replace CUDA, but it can accelerate supported inference workloads while maintaining the power efficiency expected from an ARM homelab. The local LLM setup guide covers practical model selection and deployment on the RK3588.
Budget ARM experimentation outside a cluster: Orange Pi 5 Plus. It delivers excellent performance for the price, restores the full RK3588 feature set, and offers PCIe Gen 3 ×4 connectivity without requiring a cluster-oriented platform.
The Honest Bottom Line
All four platforms are good. The real question is whether you’re optimizing for cluster infrastructure, AI acceleration, ecosystem depth, or cost.
If your goal is GPU-accelerated inference, the Jetson Orin Nano stands alone. CUDA, TensorRT, and NVIDIA’s software ecosystem make it the obvious choice for computer vision, robotics, and AI workloads that genuinely benefit from GPU acceleration.
If you’re looking for the largest ecosystem, the broadest selection of accessories, and the deepest pool of community knowledge, the CM5 remains an excellent option. It’s a capable compute module backed by one of the strongest communities in the ARM space.
If you’re building on a budget outside a cluster environment, the Orange Pi 5 Plus offers impressive hardware for the money. The full RK3588, PCIe Gen 3 ×4 connectivity, and dual 2.5G networking make it one of the most capable ARM single-board computers available in its price range.
But if you’re building a serious ARM cluster in 2026, the RK1 is the strongest overall choice.
The combination of up to 32 GB RAM per node, PCIe Gen 3 ×4 storage, low power consumption, and a form factor designed specifically for dense cluster deployments gives it advantages that become increasingly important as workloads grow. Kubernetes, Longhorn, databases, CI/CD pipelines, self-hosted platforms, and AI services all benefit from the additional memory and storage bandwidth available on the platform.
No platform wins every category. The RK1 simply wins the categories that matter most for cluster builders.
Conclusion
Choosing the right ARM platform isn’t about finding the board with the longest spec sheet. It’s about matching the hardware to the workloads you actually plan to run.
The CM5 remains an excellent choice for builders who value ecosystem depth and community support. The Jetson Orin Nano is the clear winner when GPU-accelerated AI workloads are the primary goal. The Orange Pi 5 Plus continues to offer impressive hardware value for budget-conscious builders looking for a powerful standalone ARM system.
But for homelab clusters, the RK1 stands out as the most complete package. The combination of up to 32 GB RAM, PCIe Gen 3 ×4 connectivity, low power consumption, and a compute-module form factor designed specifically for dense ARM deployments gives it room to grow alongside your workloads. Whether you’re running Kubernetes, persistent storage, databases, CI/CD pipelines, self-hosted applications, or local AI services, it’s a platform that scales well beyond the first few containers.
The best module is ultimately the one that fits your workload. For most ARM cluster builders in 2026, that module is the RK1.