ORC RowSelection API Design #58

@suxiaogang223

Overview

This document describes the proposed RowSelection API for orc-rust, modeled after the RowSelection implementation in the arrow-rs Parquet reader. The API is intended to enable fine-grained row filtering at the decoder level, which should reduce I/O and improve query performance when only a subset of rows is needed.

Motivation

Currently, ORC readers must decode all rows in a stripe, even if only a subset is needed. RowSelection allows readers to:

  • Skip rows before decoding: Avoid reading and decompressing data for rows that are filtered out
  • Reduce I/O: Skip byte ranges whose row groups contain only filtered-out rows
  • Improve performance: Lower CPU and memory usage by decoding only selected rows
  • Row-group pruning: Leverage ORC row indexes to skip entire row groups

API Design

Core Types

/// A single selector specifying rows to select or skip
#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub struct RowSelector {
    /// The number of rows
    pub row_count: usize,
    
    /// If true, skip `row_count` rows
    pub skip: bool,
}

impl RowSelector {
    /// Select `row_count` rows
    pub fn select(row_count: usize) -> Self {
        Self {
            row_count,
            skip: false,
        }
    }
    
    /// Skip `row_count` rows
    pub fn skip(row_count: usize) -> Self {
        Self {
            row_count,
            skip: true,
        }
    }
}

/// A collection of [`RowSelector`] used to skip rows when scanning an ORC file
///
/// Invariants:
/// * Contains no [`RowSelector`] of 0 rows
/// * Consecutive [`RowSelector`]s alternate between skipping and selecting
#[derive(Debug, Clone, Default, Eq, PartialEq)]
pub struct RowSelection {
    selectors: Vec<RowSelector>,
}
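
The two invariants listed on RowSelection can be enforced at construction time. The following is a minimal sketch, assuming a `From<Vec<RowSelector>>` conversion like the one used in the usage examples; the `row_count()` and `selectors()` helpers are hypothetical and only there to make the result easy to inspect.

```rust
#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub struct RowSelector {
    pub row_count: usize,
    pub skip: bool,
}

#[derive(Debug, Clone, Default, Eq, PartialEq)]
pub struct RowSelection {
    selectors: Vec<RowSelector>,
}

impl From<Vec<RowSelector>> for RowSelection {
    fn from(input: Vec<RowSelector>) -> Self {
        let mut selectors: Vec<RowSelector> = Vec::with_capacity(input.len());
        for s in input {
            // Invariant 1: drop selectors that cover zero rows.
            if s.row_count == 0 {
                continue;
            }
            // Invariant 2: merge neighbours of the same kind so that
            // consecutive selectors always alternate skip/select.
            if let Some(last) = selectors.last_mut() {
                if last.skip == s.skip {
                    last.row_count += s.row_count;
                    continue;
                }
            }
            selectors.push(s);
        }
        Self { selectors }
    }
}

impl RowSelection {
    /// Hypothetical helper: number of rows that would be returned.
    pub fn row_count(&self) -> usize {
        self.selectors
            .iter()
            .filter(|s| !s.skip)
            .map(|s| s.row_count)
            .sum()
    }

    /// Hypothetical helper: read-only view of the normalized selectors.
    pub fn selectors(&self) -> &[RowSelector] {
        &self.selectors
    }
}
```

Normalizing once at construction means later operations can rely on the alternating pattern instead of re-checking it.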

Usage Examples

Example 1: Skip First N Rows

use orc_rust::{ArrowReaderBuilder, RowSelection, RowSelector};

let selection = RowSelection::from(vec![
    RowSelector::skip(10000),      // Skip first 10k rows
    RowSelector::select(1000000),  // Read next 1M rows
]);

let reader = ArrowReaderBuilder::try_new(file_data)?
    .with_row_selection(selection)
    .build();

for batch in reader {
    let batch = batch?;
    // `batch` holds only rows 10,000 through 1,009,999 (zero-based)
}

Example 2: From Filter Results

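use arrow::array::BooleanArray;
use orc_rust::{ArrowReaderBuilder, RowSelection};

// Each BooleanArray is the filter result for one consecutive batch of rows.
// As in arrow-rs, `from_filters` concatenates the arrays in order (it does not
// AND them together), so the two arrays below cover rows 0-4 and rows 5-9.
let filter1 = BooleanArray::from(vec![true, false, true, false, true]);
let filter2 = BooleanArray::from(vec![false, true, true, false, true]);

let selection = RowSelection::from_filters(&[filter1, filter2]);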

let reader = ArrowReaderBuilder::try_new(file_data)?
    .with_row_selection(selection)
    .build();
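
To make the conversion from filter results concrete, here is a rough sketch of how consecutive boolean batches could be collapsed into alternating selectors. It uses plain `Vec<bool>` in place of `BooleanArray` so it stays dependency-free, and `selectors_from_filters` is a hypothetical free function, not the proposed method itself.

```rust
#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub struct RowSelector {
    pub row_count: usize,
    pub skip: bool,
}

/// Hypothetical stand-in for `RowSelection::from_filters`.
/// Each inner Vec is the filter result for one consecutive batch of rows;
/// `true` means "keep this row".
pub fn selectors_from_filters(batches: &[Vec<bool>]) -> Vec<RowSelector> {
    let mut out: Vec<RowSelector> = Vec::new();
    // Treat the batches as one long run of rows, in order.
    for keep in batches.iter().flatten().copied() {
        let skip = !keep;
        // Extend the current run if this row is of the same kind.
        if let Some(last) = out.last_mut() {
            if last.skip == skip {
                last.row_count += 1;
                continue;
            }
        }
        // Otherwise start a new run of length 1.
        out.push(RowSelector { row_count: 1, skip });
    }
    out
}
```

With the two filters from Example 2, the ten rows collapse into nine runs, because only rows 6 and 7 form a run longer than one row.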

Example 3: Combining with Stripe Filtering

use datafusion_orc::stripe_filter::StripeAccessPlanFilter;
use orc_rust::RowSelection;

// 1. First, filter stripes by statistics
let mut stripe_filter = StripeAccessPlanFilter::new(access_plan);
stripe_filter.prune_by_statistics(&schema, metadata, &predicate, &metrics);
let stripe_plan = stripe_filter.build();

// 2. For selected stripes, create row-level selection
let mut row_selections = Vec::new();
for stripe_idx in stripe_plan.stripe_indexes() {
    // Use ORC row index to create fine-grained selection
    let row_index = metadata.stripe_metadatas()[stripe_idx].row_index()?;
    let selection = evaluate_predicate_on_row_index(&row_index, &predicate)?;
    row_selections.push(selection);
}

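// 3. Combine all selections. Note: `union` is only meaningful if every selection
//    is expressed in the same (file-level) row coordinates; stripe-relative
//    selections would need to be concatenated instead.
let combined = row_selections.into_iter()
    .fold(RowSelection::default(), |acc, sel| acc.union(&sel));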

// 4. Read with both stripe and row filtering
let reader = ArrowReaderBuilder::try_new(file_data)?
    .with_row_groups(stripe_plan.stripe_indexes())  // Stripe filtering
    .with_row_selection(combined)                    // Row filtering
    .build();

Example 4: Two-Stage Filtering

// Stage 1: Filter based on column A
let filter_a = evaluate_predicate_a(&row_index)?;  // BooleanArray
let selection_a = RowSelection::from_filters(&[filter_a]);

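// Stage 2: Among the rows selected in stage 1, filter based on column B.
// As in arrow-rs, `and_then` expects `selection_b` to be expressed relative to
// the rows selected by `selection_a`, not to the full row range.
let filter_b = evaluate_predicate_b(&row_index)?;  // BooleanArray
let selection_b = RowSelection::from_filters(&[filter_b]);

// Combine: only rows where both A and B match
let final_selection = selection_a.and_then(&selection_b);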

let reader = ArrowReaderBuilder::try_new(file_data)?
    .with_row_selection(final_selection)
    .build();
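
The difference between `and_then` and `intersection` is easy to get wrong, so here is a small sketch of the intended semantics, assuming they match arrow-rs. For readability it works on per-row boolean masks rather than run-length selectors; both functions are illustrative stand-ins, not the proposed methods.

```rust
/// `inner` has one entry per row that `outer` selects (i.e. it is expressed
/// relative to the output of the first filtering stage).
pub fn and_then(outer: &[bool], inner: &[bool]) -> Vec<bool> {
    let mut inner_iter = inner.iter().copied();
    outer
        .iter()
        // `&&` short-circuits, so `inner_iter` only advances on rows that
        // the first stage kept.
        .map(|&keep| {
            keep && inner_iter
                .next()
                .expect("inner mask shorter than rows selected by outer")
        })
        .collect()
}

/// Both masks cover the same full row range; a row survives only if
/// both masks keep it.
pub fn intersection(a: &[bool], b: &[bool]) -> Vec<bool> {
    assert_eq!(a.len(), b.len(), "masks must cover the same rows");
    a.iter().zip(b).map(|(&x, &y)| x && y).collect()
}
```

If the second-stage predicate is evaluated over all rows rather than only the survivors of stage 1, `intersection` is likely the operation that matches the intent.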

Implementation Phases

Phase 1: Basic API (MVP)

  • Define RowSelector and RowSelection types
  • Implement FromIterator and basic conversions
  • Add with_row_selection() to ArrowReaderBuilder
  • Pass selection through to stripe decoder
  • Implement basic skip in decoders (decode and discard)
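
The "decode and discard" fallback in the last bullet could look roughly like this: decode the whole column as today, then keep only the selected runs. `apply_selection` is a hypothetical helper, and treating rows beyond the last selector as skipped is an assumption of this sketch.

```rust
#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub struct RowSelector {
    pub row_count: usize,
    pub skip: bool,
}

/// Hypothetical Phase 1 helper: filter already-decoded values.
/// Rows not covered by any selector are dropped.
pub fn apply_selection<T: Clone>(decoded: &[T], selectors: &[RowSelector]) -> Vec<T> {
    let mut out = Vec::new();
    let mut pos = 0usize;
    for s in selectors {
        // Clamp so a selection longer than the data cannot index out of bounds.
        let end = (pos + s.row_count).min(decoded.len());
        if !s.skip {
            out.extend_from_slice(&decoded[pos..end]);
        }
        pos = end;
    }
    out
}
```

This saves no decoding work, but it lets the API and its tests land before any decoder-specific skipping exists.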

Phase 2: Efficient Skipping

  • Implement efficient skip_rows() for each decoder type
  • Skip without decoding for:
    • Integer types (RLE skip)
    • Boolean (skip bit runs)
    • Strings (skip dictionary entries)
    • Timestamps, decimals, etc.
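
As an illustration of skipping without decoding, the sketch below advances a cursor over run-length encoded integers by adjusting counters only. The `(value, count)` run format is a deliberate simplification and is not ORC's actual RLE v1/v2 encoding; `RunCursor` and its methods are hypothetical names.

```rust
/// Cursor over simplified runs of `(value, repeat_count)`.
pub struct RunCursor<'a> {
    runs: &'a [(i64, usize)],
    run: usize,  // index of the current run
    used: usize, // rows already consumed from the current run
}

impl<'a> RunCursor<'a> {
    pub fn new(runs: &'a [(i64, usize)]) -> Self {
        Self { runs, run: 0, used: 0 }
    }

    /// Advance past up to `n` rows without materializing any values.
    /// Returns the number of rows actually skipped.
    pub fn skip_rows(&mut self, mut n: usize) -> usize {
        let requested = n;
        while n > 0 && self.run < self.runs.len() {
            let left = self.runs[self.run].1 - self.used;
            if n < left {
                // The skip ends inside the current run.
                self.used += n;
                n = 0;
            } else {
                // Drop the rest of this run in one step.
                n -= left;
                self.run += 1;
                self.used = 0;
            }
        }
        requested - n
    }

    /// Decode up to `n` rows into `out`. Returns the number of rows read.
    pub fn read_rows(&mut self, mut n: usize, out: &mut Vec<i64>) -> usize {
        let requested = n;
        while n > 0 && self.run < self.runs.len() {
            let (value, count) = self.runs[self.run];
            let take = n.min(count - self.used);
            out.extend(std::iter::repeat(value).take(take));
            self.used += take;
            n -= take;
            if self.used == count {
                self.run += 1;
                self.used = 0;
            }
        }
        requested - n
    }
}
```

The cost of `skip_rows` is proportional to the number of runs crossed rather than the number of rows skipped, which is the property Phase 2 is after.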

Phase 3: Row Index Integration

  • Parse ORC row indexes
  • Implement row_selection_from_row_index()
  • Integrate with stripe filtering
  • Add benchmarks
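
A possible shape for row_selection_from_row_index(), sketched under the assumption that predicate evaluation against the row index has already produced one keep/skip flag per row group. The function name and signature below are hypothetical; `stride` corresponds to the ORC row index stride (10,000 rows by default), and the last row group in a stripe may be shorter.

```rust
#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub struct RowSelector {
    pub row_count: usize,
    pub skip: bool,
}

/// Hypothetical helper: turn per-row-group keep flags into selectors
/// covering one stripe of `stripe_rows` rows.
pub fn selectors_from_row_groups(
    keep: &[bool],
    stride: usize,
    stripe_rows: usize,
) -> Vec<RowSelector> {
    let mut out: Vec<RowSelector> = Vec::new();
    let mut remaining = stripe_rows;
    for &k in keep {
        // The final row group may hold fewer than `stride` rows.
        let rows = stride.min(remaining);
        if rows == 0 {
            break;
        }
        remaining -= rows;
        let skip = !k;
        // Merge adjacent row groups with the same decision into one run.
        if let Some(last) = out.last_mut() {
            if last.skip == skip {
                last.row_count += rows;
                continue;
            }
        }
        out.push(RowSelector { row_count: rows, skip });
    }
    out
}
```

For a 35,000-row stripe with flags keep, keep, skip, keep, this yields select 20,000, skip 10,000, select 5,000.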

Phase 4: Advanced Features

  • scan_ranges() for row-group-level I/O planning
  • and_then() for multi-stage filtering
  • intersection() and union() operations
  • Expand to batch boundaries for caching
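
One way scan_ranges() could plan I/O is sketched below: given the selection and a description of each row group, return the byte ranges that must be fetched. The `RowGroup` struct is hypothetical, and giving each row group a single contiguous byte range is a simplification; real ORC row index entries hold per-stream positions.

```rust
use std::ops::Range;

#[derive(Debug, Clone, Copy, Eq, PartialEq)]
pub struct RowSelector {
    pub row_count: usize,
    pub skip: bool,
}

/// Hypothetical description of one row group: its row count and the
/// (simplified) byte range holding its data.
pub struct RowGroup {
    pub rows: usize,
    pub bytes: Range<u64>,
}

/// Byte ranges of every row group that contains at least one selected row.
pub fn scan_ranges(selectors: &[RowSelector], groups: &[RowGroup]) -> Vec<Range<u64>> {
    // Convert the run-length selection into absolute selected row intervals.
    let mut selected: Vec<Range<usize>> = Vec::new();
    let mut row = 0usize;
    for s in selectors {
        if !s.skip && s.row_count > 0 {
            selected.push(row..row + s.row_count);
        }
        row += s.row_count;
    }

    let mut out: Vec<Range<u64>> = Vec::new();
    let mut start = 0usize;
    for g in groups {
        let end = start + g.rows;
        // A row group is needed if it overlaps any selected interval.
        let needed = g.rows > 0 && selected.iter().any(|r| r.start < end && start < r.end);
        start = end;
        if !needed {
            continue;
        }
        // Coalesce with the previous range when the bytes are contiguous.
        if let Some(last) = out.last_mut() {
            if last.end == g.bytes.start {
                last.end = g.bytes.end;
                continue;
            }
        }
        out.push(g.bytes.clone());
    }
    out
}
```

The overlap check is quadratic in the worst case; a production version would likely walk selectors and row groups in lockstep instead.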
