Skip to content

Backend Mode Selection Guide

🎯 Overview

This guide helps you choose the right backend mode for your use case. We offer three modes optimized for different deployment scenarios:

  1. Portable Mode (PyArrow) - Client-side compute, serverless-friendly
  2. Standalone Mode (PyO3) - Native Rust performance, zero network
  3. Distributed Mode (gRPC) - Multi-client memory sharing, streaming

📊 Quick Comparison Matrix

Feature Portable Standalone Distributed
Performance ⭐⭐ (30-50% of Polars) ⭐⭐⭐⭐⭐ (100% Polars) ⭐⭐⭐⭐ (95-98% Polars)
Memory Usage ⭐⭐⭐ (per-client duplication) ⭐⭐⭐ (per-client duplication) ⭐⭐⭐⭐⭐ (shared across clients)
Setup Complexity ⭐⭐⭐⭐⭐ (pip install) ⭐⭐⭐⭐ (pip install + PyO3) ⭐⭐ (requires server)
Network Latency ⭐⭐⭐⭐⭐ (zero - local) ⭐⭐⭐⭐⭐ (zero - embedded) ⭐⭐⭐ (1-10ms gRPC)
Serverless Support ✅ Yes (WASM, Lambda) ⚠️ Limited (cold start) ❌ No (needs persistent server)
Multi-Client ❌ No (each loads data) ❌ No (each loads data) ✅ Yes (shared memory)
Best For Demos, WASM, cost savings Production, single-client Multi-tenant, streaming

🚀 Mode 1: Portable (PyArrow)

What Is It?

Pure Python implementation using PyArrow for columnar operations. No Rust compilation required.

When to Use

  • Demos and Trials: Share with users without Rust toolchain
  • WASM Deployment: Run in browser via Pyodide/PyScript
  • Serverless Functions: AWS Lambda, Azure Functions (no cold start penalty)
  • Cost Optimization: Move compute to client-side (80-95% cost reduction)
  • Cross-Platform: Works everywhere Python runs

When NOT to Use

  • ❌ Performance-critical applications (30-50% slower than Standalone)
  • ❌ Large-scale production workloads
  • ❌ Real-time / low-latency requirements

Example

import os
os.environ["BACKEND_MODE"] = "portable"

from python.backends import get_router
router = get_router()

# Uses PyArrow engine (pure Python)
df = router.load_parquet("data.parquet")
result = df.select(["price", "volume"]).sum()
print(f"✅ Portable mode (PyArrow): {result}")

Performance Characteristics

Dataset Size: 1GB Parquet
Operation: select + filter + groupby

Portable (PyArrow):   3.2s
Standalone (PyO3):    1.1s  ← 2.9x faster
Distributed (gRPC):   1.3s  (1.1s + 0.2s network)

Verdict: Portable is slower BUT enables serverless deployment

Cost Analysis

Scenario: 1000 requests/day, 1GB dataset

Standalone (PyO3) in Container:
  - Azure Container Instance: $50/month
  - 24/7 running: $50/month
  - Total: $50/month

Portable (PyArrow) in Azure Functions:
  - Execution time: 3.2s × 1000 = 3200s/day
  - Compute cost: 3200s × $0.000016/s = $1.54/month
  - Total: $1.54/month ← 97% cost savings ✅

Verdict: 30% slower BUT 97% cheaper

🔥 Mode 2: Standalone (PyO3)

What Is It?

Native Rust PyO3 bindings to Polars. Zero network overhead, same performance as Polars.

When to Use

  • Production Applications: Best performance for single-process workloads
  • HFT / Low-Latency: Sub-millisecond query times required
  • Single-User Tools: Personal research, backtesting, analysis
  • Embedded Applications: Desktop tools, CLI utilities
  • < 10 Concurrent Users: Memory duplication is acceptable

When NOT to Use

  • ❌ 10+ concurrent users (memory duplication becomes expensive)
  • ❌ Serverless deployments (requires compiled PyO3 module)
  • ❌ Cross-platform distribution without Rust toolchain

Example

import os
os.environ["BACKEND_MODE"] = "standalone"

from python.backends import get_router
router = get_router()

# Uses PyO3 Polars bindings (native Rust)
df = router.load_parquet("data.parquet")
result = df.select(["price", "volume"]).sum()
print(f"✅ Standalone mode (PyO3): {result}")

Performance Characteristics

Dataset Size: 1GB Parquet
Operation: select + filter + groupby

Standalone (PyO3):   1.1s  ← FASTEST ✅
Distributed (gRPC):  1.3s  (1.1s compute + 0.2s network)
Portable (PyArrow):  3.2s

Verdict: Standalone is fastest for single-client workloads

Memory Analysis

Scenario: 10GB dataset, 5 concurrent users

Standalone (PyO3):
  - Each user loads data: 10GB × 5 = 50GB RAM
  - Cost: $200/month (large instance)

Distributed (gRPC):
  - Server loads once: 10GB RAM
  - Clients hold handles: 5MB × 5 = 25MB
  - Cost: $50/month (small instance) ← 75% savings ✅

Verdict: Standalone is cheaper for < 10 users

🌐 Mode 3: Distributed (gRPC)

What Is It?

Client-server architecture with gRPC streaming. DataFrames live on server, clients hold handles.

When to Use

  • Multi-Tenant Platforms: 10+ concurrent users querying same data
  • Memory Efficiency: Datasets too large to duplicate per-client
  • Streaming Pipelines: Real-time data processing (WebSocket, Kafka)
  • Time-Series Operations: Rolling windows, OHLCV resampling
  • Language-Agnostic: Clients in Python, Rust, Go, TypeScript

When NOT to Use

  • ❌ Single-user applications (unnecessary complexity)
  • ❌ < 10 concurrent users (operational overhead not justified)
  • ❌ Serverless deployments (requires persistent server)
  • ❌ Ultra-low-latency (<1ms) requirements

Example

import os
os.environ["BACKEND_MODE"] = "distributed"
os.environ["GRPC_SERVER"] = "localhost:50051"

from python.backends import get_router
router = get_router()

# Uses gRPC client (handle-based)
df = router.load_parquet("data.parquet")  # Returns handle, not data
result = df.select(["price", "volume"]).sum().collect()  # Collects over gRPC
print(f"✅ Distributed mode (gRPC): {result}")

Performance Characteristics

Dataset Size: 1GB Parquet
Operation: select + filter + groupby

Single Client:
  Standalone: 1.1s  ← Faster (no network)
  Distributed: 1.3s (1.1s + 0.2s network)

10 Concurrent Clients:
  Standalone: 11s (sequential due to memory contention)
  Distributed: 1.5s (parallel, shared memory) ← 7.3x faster ✅

Verdict: Distributed wins at scale (10+ users)

Cost at Scale

Scenario: 10GB dataset, 50 concurrent users

Standalone (PyO3):
  - Each user: 10GB RAM
  - Total: 500GB RAM required
  - Cost: $2000/month (massive instance) ❌

Distributed (gRPC):
  - Server: 10GB RAM (loaded once)
  - Clients: 50 × 5MB = 250MB handles
  - Total: 10.25GB RAM required
  - Cost: $100/month ← 95% savings ✅

Verdict: Distributed is only viable option at scale

🎯 Decision Tree

graph TD
    A[Start] --> B{Serverless deployment?}
    B -->|Yes| C[Use Portable Mode ✅]
    B -->|No| D{How many users?}

    D -->|1 user| E[Use Standalone Mode ✅]
    D -->|2-9 users| F{Dataset fits in RAM?}
    D -->|10+ users| G[Use Distributed Mode ✅]

    F -->|Yes| E
    F -->|No| G

📋 Step-by-Step Selection

Step 1: Deployment Environment

Environment Recommended Mode
AWS Lambda, Azure Functions Portable (PyArrow)
Docker Container (single instance) Standalone (PyO3)
Kubernetes (multi-pod) Distributed (gRPC)
Desktop Application Standalone (PyO3)
Browser (WASM) Portable (PyArrow)

Step 2: Number of Concurrent Users

Users Recommended Mode
1 Standalone (PyO3)
2-9 Standalone (PyO3)
10-99 Distributed (gRPC)
100+ Distributed (gRPC)

Step 3: Dataset Size vs Available RAM

Dataset RAM Recommended Mode
< 1GB Any Standalone (PyO3)
1-10GB < Dataset Distributed (streaming)
1-10GB > Dataset Standalone (PyO3)
> 10GB Any Distributed (streaming)

Step 4: Performance Requirements

Latency Target Recommended Mode
< 1ms Standalone (PyO3)
1-10ms Standalone or Distributed
> 10ms Portable or Distributed

🔧 Configuration Examples

from python.backends import get_router

# Router auto-detects best mode based on:
# 1. Environment variables (BACKEND_MODE)
# 2. Available imports (polarway, polars, grpc)
# 3. Data size (small → portable, large → distributed)

router = get_router()
df = router.load_parquet("data.parquet")

Explicit Mode Selection

import os

# Force Portable mode
os.environ["BACKEND_MODE"] = "portable"

# Force Standalone mode
os.environ["BACKEND_MODE"] = "standalone"

# Force Distributed mode
os.environ["BACKEND_MODE"] = "distributed"
os.environ["GRPC_SERVER"] = "polarway-server:50051"

from python.backends import get_router
router = get_router()

Fallback Strategy

from python.backends import BackendRouter

router = BackendRouter(
    prefer_mode="standalone",  # Try Standalone first
    fallback_chain=["distributed", "portable"],  # Fallback order
    data_size_threshold_mb=1000  # Switch to distributed if > 1GB
)

# Example flow:
# 1. Try Standalone (PyO3)
# 2. If not available, try Distributed (gRPC)
# 3. If not available, fall back to Portable (PyArrow)

📊 Real-World Use Cases

Use Case 1: Personal Research (Standalone)

Scenario: PhD student analyzing 2GB of historical stock data on laptop.

# Standalone Mode
os.environ["BACKEND_MODE"] = "standalone"

router = get_router()
df = router.load_parquet("stocks_2020_2024.parquet")

# Fast local analysis
signals = df.filter(pl.col("volume") > 1e6).group_by("symbol").agg(...)

Why: Single user, data fits in RAM, needs fast iteration.

Use Case 2: Company Analytics Dashboard (Distributed)

Scenario: 50 employees querying 50GB customer database simultaneously.

# Distributed Mode
os.environ["BACKEND_MODE"] = "distributed"
os.environ["GRPC_SERVER"] = "analytics-server:50051"

router = get_router()
df = router.load_parquet("customers/*.parquet")

# Shared memory across all users
revenue_by_region = df.group_by("region").agg(pl.col("revenue").sum())

Why: Multi-user, large dataset, memory sharing critical.

Use Case 3: Serverless Data API (Portable)

Scenario: Azure Function that processes user-uploaded CSVs.

# Portable Mode (Azure Functions)
@azure_function
def process_csv(request):
    os.environ["BACKEND_MODE"] = "portable"

    router = get_router()
    df = router.load_csv(request.file)

    # Client-side compute (no server needed)
    stats = df.describe()
    return stats.to_json()

Why: Serverless, cost-sensitive, no persistent server.

🎓 Migration Guide

From Standalone to Distributed

# Before (Standalone)
import polars as pl
df = pl.read_parquet("data.parquet")
result = df.select(["price"]).sum()

# After (Distributed)
import os
os.environ["BACKEND_MODE"] = "distributed"
os.environ["GRPC_SERVER"] = "localhost:50051"

from python.backends import get_router
router = get_router()

df = router.load_parquet("data.parquet")  # Returns handle
result = df.select(["price"]).sum().collect()  # Add .collect()

Key Changes: 1. Set environment variables 2. Use get_router() instead of direct polars import 3. Add .collect() to materialize results

From Distributed to Portable

# Before (Distributed)
os.environ["BACKEND_MODE"] = "distributed"
router = get_router()
df = router.load_parquet("data.parquet").collect()

# After (Portable)
os.environ["BACKEND_MODE"] = "portable"
router = get_router()
df = router.load_parquet("data.parquet")  # No .collect() needed

Key Changes: 1. Change BACKEND_MODE to "portable" 2. Remove .collect() calls (portable returns data directly) 3. Expect 2-3x slower performance

📈 Performance Benchmarks

Benchmark Setup

  • Dataset: 1GB Parquet (10M rows)
  • Operation: select + filter + groupby + agg
  • Hardware: 8-core CPU, 16GB RAM

Results

Mode Single User 10 Users 100 Users
Portable 3.2s N/A (each loads) N/A
Standalone 1.1s 11s (sequential) OOM
Distributed 1.3s 1.5s (parallel) 2.1s (parallel)

Key Takeaways: - Standalone is fastest for single user - Distributed is only viable option for 10+ users - Portable trades 2-3x speed for 97% cost savings

🎯 Summary

Choose Portable If...

  • 💰 Cost savings are priority (serverless)
  • 🌐 Need to run in browser (WASM)
  • 📦 Simple distribution without Rust toolchain

Choose Standalone If...

  • ⚡ Performance is critical (sub-millisecond queries)
  • 👤 Single user or < 10 users
  • 🖥️ Data fits in memory

Choose Distributed If...

  • 👥 10+ concurrent users
  • 💾 Memory sharing saves costs
  • 🌊 Streaming / time-series workloads

Rule of Thumb: Start with Standalone, upgrade to Distributed when you hit 10+ users or run out of memory. 🚀