What FrameX is, where it fits, and who it is for.

FrameX Overview

FrameX is a Python dataframe and array library designed for single-machine parallel analytics with familiar Pandas and NumPy ergonomics.

It combines:

Arrow-backed columnar storage (pyarrow.Table, pyarrow.RecordBatch)
Pandas-like tabular APIs (DataFrame, Series, GroupBy)
NumPy dispatch support for array workflows (NDArray, __array_ufunc__, __array_function__)
Optional lazy execution (.lazy().collect()) for multi-step query planning
Runtime backend selection (threads, processes, optional ray, optional dask)
Micro-batch streaming pipeline support via StreamProcessor
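
To make the idea of runtime backend selection concrete, here is a minimal sketch that uses only the Python standard library. It is not FrameX's implementation or API: the helper names make_executor, split, and parallel_sum are hypothetical, and only the "threads" and "processes" backends are modeled. It shows the general pattern of partitioning data, running the same work on a chosen executor, and combining the partial results.

```python
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def make_executor(backend: str, workers: int = 4) -> Executor:
    """Map a backend name to a standard-library executor (hypothetical helper)."""
    if backend == "threads":
        return ThreadPoolExecutor(max_workers=workers)
    if backend == "processes":
        return ProcessPoolExecutor(max_workers=workers)
    raise ValueError(f"unsupported backend: {backend!r}")


def split(values: list, parts: int) -> list:
    """Split values into at most `parts` contiguous chunks."""
    size = max(1, -(-len(values) // parts))  # ceiling division, never zero
    return [values[i:i + size] for i in range(0, len(values), size)]


def parallel_sum(values: list, backend: str = "threads", workers: int = 4):
    """Sum each partition on a worker, then combine the partial sums."""
    with make_executor(backend, workers) as pool:
        return sum(pool.map(sum, split(values, workers)))
```

Calling parallel_sum(data, backend="processes") runs the same partitioned work in separate processes. On platforms that spawn worker processes, that call should sit under an `if __name__ == "__main__":` guard.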

Why FrameX

FrameX is built for workloads that are too large to run comfortably in single-threaded Pandas but that do not yet require a distributed cluster.

Typical range:

FrameX is pre-1.0 and evolving quickly. The core interfaces exist today and are already useful for local experimentation and pipeline prototyping.

If you need strict 1:1 Pandas behavior everywhere, use to_pandas() at boundaries and validate behavior for critical paths.
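
One way to validate a critical path is to compare the converted result against a plain Pandas reference. The sketch below uses only Pandas. The check_against_pandas helper is hypothetical, and the candidate frame is built by hand as a stand-in for whatever to_pandas() would return in a real pipeline.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def check_against_pandas(candidate: pd.DataFrame, reference: pd.DataFrame) -> None:
    """Raise AssertionError if the two frames differ, ignoring row order and dtypes."""
    assert_frame_equal(
        candidate.sort_values(list(candidate.columns)).reset_index(drop=True),
        reference.sort_values(list(reference.columns)).reset_index(drop=True),
        check_dtype=False,
    )


source = pd.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 3]})

# Reference result computed with plain Pandas.
reference = source.groupby("key", as_index=False)["value"].sum()

# Stand-in for the converted result; in real use this would be
# the output of to_pandas() on the FrameX side (assumption).
candidate = pd.DataFrame({"key": ["b", "a"], "value": [2, 4]})

check_against_pandas(candidate, reference)  # passes silently when results agree
```

Row order and dtypes are relaxed here on the assumption that a parallel engine may return groups in a different order. Tighten those checks if the critical path depends on them.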

FrameX supports explicit interchange paths:

Continue with Getting Started for first-run setup and a complete walkthrough.