Performance
Performance characteristics and optimization tips.
Complexity
Indexing Operations
| Operation | SpatialIndex | TemporalIndex | SpatiotemporalIndex |
|---|---|---|---|
| Insert | O(log n) | O(log n) | O(log n) |
| Build (bulk) | O(n log n) | O(n log n) | O(n log n) |
| Point query | O(log n) | O(log n) | O(log n) |
| Range query | O(log n + k) | O(log n + k) | O(log n + k) |
| K-nearest | O(k log n) | N/A | N/A |
Where n = number of items, k = number of results.
Graph Operations
| Operation | Complexity |
|---|---|
| Add node | O(1) |
| Add edge | O(1) |
| Get neighbors | O(degree) |
| Has path (BFS) | O(V + E) |
| Shortest path | O((V + E) log V) |
| connect_temporal() | O(n log n) |
| connect_spatial(d) | O(n²) |
| connect_thematic() | O(n² × tags) |
Where V = vertices, E = edges.
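To put the table in context, a typical build-then-query sequence looks roughly like this (the construction calls match the table above; the exact signature of shortest_path is an assumption here):

```rust
// Build once: O(1) per node, then O(n log n) for temporal edges
let mut graph = NarrativeGraph::from_events(events);
graph.connect_temporal();

// Query repeatedly: each shortest_path call is O((V + E) log V)
if let Some(path) = graph.shortest_path(start_id, end_id) {
    println!("connected via {} events", path.len());
}
```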
Memory Usage
Per-Event Overhead
| Component | Approximate Size |
|---|---|
| Event base | ~200 bytes |
| Location | ~40 bytes |
| Timestamp | ~32 bytes |
| EventId (UUID) | 16 bytes |
| Text (per char) | 1 byte |
| Tag (each) | ~24 bytes + string |
Index Overhead
| Index | Overhead per item |
|---|---|
| SpatialIndex | ~64 bytes |
| TemporalIndex | ~48 bytes |
| SpatiotemporalIndex | ~112 bytes |
Approximate Memory Formula
Memory (MB) ≈ events × (200 + avg_text_len + tags × 30) / 1_000_000 + index_overhead

Here tags is the average number of tags per event, and index_overhead is the per-item figure from the table above times the number of indexed items, also expressed in MB.
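As a worked example using the figures above (actual usage varies with allocator and platform):

```rust
// 100,000 events, ~80 characters of text and 3 tags each,
// indexed in a SpatialIndex (~64 bytes of overhead per item).
let events = 100_000_f64;
let event_mb = events * (200.0 + 80.0 + 3.0 * 30.0) / 1_000_000.0; // ≈ 37 MB
let index_mb = events * 64.0 / 1_000_000.0;                        // ≈ 6.4 MB
let total_mb = event_mb + index_mb;                                // ≈ 43 MB
```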
Optimization Tips
Bulk Loading
Use from_iter instead of repeated insert:
```rust
// ✅ Fast: single O(n log n) bulk load
let index = SpatialIndex::from_iter(events, |e| &e.location);

// ❌ Slower: one O(log n) insert at a time, with per-call overhead
let mut index = SpatialIndex::new();
for event in events {
    index.insert(event, &event.location);
}
```
Query Then Filter
Use indexes for initial filtering, then apply additional criteria:
```rust
// Get spatial candidates (fast)
let candidates = spatial_index.query_bounds(&bounds);

// Apply additional filters (linear)
let filtered: Vec<_> = candidates.into_iter()
    .filter(|e| e.has_tag("important"))
    .filter(|e| time_range.contains(&e.timestamp))
    .collect();
```
Avoid O(n²) Operations
The connect_spatial and connect_thematic methods compare every pair of events, so they scale as O(n²). For large inputs, consider the strategies below:

```rust
use rand::seq::SliceRandom; // for choose_multiple

// 1. Sample events before building the graph
let sample: Vec<_> = events.choose_multiple(&mut rng, 1000).cloned().collect();
let mut graph = NarrativeGraph::from_events(sample);
graph.connect_spatial(10.0); // Now O(sample²) instead of O(n²)

// 2. Use a spatial index to find candidate neighbors instead of comparing all pairs
let index = SpatialIndex::from_iter(events.clone(), |e| &e.location);
for event in &events {
    let nearby = index.query_radius_meters(event.location.lat, event.location.lon, 1000.0);
    // Only connect to nearby events
}

// 3. Skip the quadratic step entirely when the input is too large
if events.len() > 10_000 {
    println!("Skipping spatial connections (too large)");
} else {
    graph.connect_spatial(10.0);
}
```
Lazy Evaluation
Build indexes only when needed:
```rust
struct LazyNarrative {
    events: Vec<Event>,
    spatial_index: Option<SpatialIndex<Event>>,
}

impl LazyNarrative {
    fn get_spatial_index(&mut self) -> &SpatialIndex<Event> {
        if self.spatial_index.is_none() {
            self.spatial_index = Some(SpatialIndex::from_iter(
                self.events.iter().cloned(),
                |e| &e.location,
            ));
        }
        self.spatial_index.as_ref().unwrap()
    }
}
```
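Callers that never run a spatial query pay nothing; the first call to get_spatial_index builds the index and later calls reuse it:

```rust
let mut narrative = LazyNarrative { events, spatial_index: None };
// Built on the first call, cached for every call after that
let candidates = narrative.get_spatial_index().query_bounds(&bounds);
```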
Parallel Processing
Use rayon for parallel operations:
```rust
use rayon::prelude::*;

// Parallel filtering
let filtered: Vec<_> = events.par_iter()
    .filter(|e| expensive_check(e))
    .collect();

// Parallel analysis
let metrics: Vec<_> = narratives.par_iter()
    .map(|n| SpatialMetrics::from_events(&n.events))
    .collect();
```
Benchmarks
Index Query Performance
Tested on 100,000 events:
| Query Type | Time |
|---|---|
| Spatial bbox (1% area) | ~50 μs |
| Spatial radius (1km) | ~80 μs |
| Temporal range (1 month) | ~40 μs |
| Combined (1% × 1 month) | ~100 μs |
| K-nearest (k=10) | ~120 μs |
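These figures depend heavily on hardware and data distribution. To get comparable numbers for your own dataset, timing a batch of queries with std::time::Instant is usually enough (a minimal sketch, assuming query_radius_meters returns a Vec of matches as in the earlier proximity example):

```rust
use std::time::Instant;

let start = Instant::now();
let mut hits = 0usize;
for event in events.iter().take(1_000) {
    hits += index
        .query_radius_meters(event.location.lat, event.location.lon, 1_000.0)
        .len();
}
println!("avg radius query: {:?} ({hits} hits)", start.elapsed() / 1_000);
```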
Graph Operations
| Operation | 1K events | 10K events | 100K events |
|---|---|---|---|
| from_events | 1 ms | 10 ms | 100 ms |
| connect_temporal | 2 ms | 25 ms | 300 ms |
| connect_spatial(10km) | 50 ms | 5 s | too slow |
| shortest_path | < 1 ms | 5 ms | 50 ms |
| to_dot | 5 ms | 50 ms | 500 ms |
I/O Performance
| Format | Export 10K events | Import 10K events |
|---|---|---|
| JSON | 15 ms | 20 ms |
| GeoJSON | 20 ms | 25 ms |
| CSV | 10 ms | 15 ms |
Memory Optimization
String Interning
For repeated tags, consider interning:
```rust
use std::collections::HashSet;
use std::sync::Arc;

// Pool of shared tag strings
let mut tag_pool: HashSet<Arc<str>> = HashSet::new();

fn intern_tag(pool: &mut HashSet<Arc<str>>, tag: &str) -> Arc<str> {
    if let Some(existing) = pool.get(tag) {
        existing.clone()
    } else {
        let new: Arc<str> = Arc::from(tag);
        pool.insert(new.clone());
        new
    }
}
```
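Applied to a batch of tags (raw_tags stands in for wherever the strings come from), each distinct tag is allocated once and every further occurrence costs only a 16-byte Arc<str> handle plus a reference-count bump:

```rust
let interned: Vec<Arc<str>> = raw_tags
    .iter()
    .map(|tag| intern_tag(&mut tag_pool, tag))
    .collect();
```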
Streaming Processing
For very large datasets, process in chunks:
```rust
use itertools::Itertools; // for chunks() on iterators
use std::fs::File;
use std::io::{BufRead, BufReader};

let file = File::open("huge_file.csv")?;
let reader = BufReader::new(file);

// Parse 10,000 lines at a time instead of loading the whole file
let line_chunks = reader.lines().chunks(10_000);
for chunk in &line_chunks {
    let events: Vec<Event> = chunk
        .filter_map(|line| parse_event(&line.ok()?).ok())
        .collect();
    // Process this chunk before reading the next one
    process_events(&events);
}
```