Benchmarking

Feluda includes a benchmarking system that measures the execution time, CPU time, and memory usage of all operators, so developers can find bottlenecks and keep performance consistent between changes.

Quick Start

Run All Benchmarks

cd benchmark
python main.py

This writes timestamped reports to benchmark/results/.

Run Specific Operator

from benchmark.operators import cluster_embeddings
from benchmark.report import BenchmarkReport

results = cluster_embeddings.benchmark()
report = BenchmarkReport()
report.extend(results)
report.save_json("results.json")

System Components

1. Profiler Engine

Measures execution time, memory usage, and CPU time:

from benchmark.profiler import Profiler

# Profile a function
@Profiler.profile
def my_function():
    # Your code here
    pass

# Benchmark an operator (MyOperator is a placeholder for any operator class)
result = Profiler.benchmark_operator(
    operator_class=MyOperator,
    operator_name="my_operator",
    runtime_kwargs={"input": "test_data"}
)

Features:

  • System information capture (CPU, memory, platform)

  • Memory usage tracking with peak detection

  • Wall clock and CPU time measurement
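The measurements listed above can be illustrated with the standard library alone. The following is a minimal sketch, not Feluda's actual Profiler: the helper name `measure` and the result keys are hypothetical, and `tracemalloc` only tracks allocations made through Python's allocator, whereas the real profiler may rely on psutil or memory_profiler instead.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once and return wall time, CPU time, and peak traced memory.

    Hypothetical helper for illustration; not part of the Feluda API.
    """
    tracemalloc.start()
    wall_start = time.perf_counter()  # wall clock
    cpu_start = time.process_time()   # CPU time used by this process
    try:
        value = func(*args, **kwargs)
    finally:
        wall = time.perf_counter() - wall_start
        cpu = time.process_time() - cpu_start
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return {
        "return_value": value,
        "execution_time_s": wall,
        "cpu_time_s": cpu,
        "peak_memory_mb": peak_bytes / (1024 * 1024),
    }


stats = measure(lambda: sum(i * i for i in range(100_000)))
print(f"{stats['execution_time_s']:.4f}s wall, {stats['cpu_time_s']:.4f}s CPU")
```

Wall clock time and CPU time are kept separate because a gap between them usually points to waiting (I/O, sleeping, or contention) rather than computation.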

2. Data Generator

Creates synthetic datasets for testing:

from benchmark.data_generator import DataGenerator

# Generate embeddings
embeddings = DataGenerator.generate_embeddings(1000, 512)

# Generate clustered data
embeddings, labels = DataGenerator.generate_embeddings_with_clusters(
    num_clusters=5, samples_per_cluster=200, dim=512
)

# Generate test images
images = DataGenerator.generate_test_images("test_images/")

Dataset Types:

  • Embeddings: 100 to 100,000 samples, 256 to 2048 dimensions

  • Images: 128x128 to 4K resolutions, RGB/grayscale modes

  • Clustered Data: Known cluster structures for validation
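To show what "known cluster structures" means in practice, here is a dependency-free sketch that builds one Gaussian blob per cluster and returns the ground-truth labels alongside the points. The function `make_clustered_embeddings` and its `spread` and `seed` parameters are hypothetical; Feluda's DataGenerator may build its data differently (for example with NumPy).

```python
import random


def make_clustered_embeddings(num_clusters, samples_per_cluster, dim, spread=0.1, seed=0):
    """Return (embeddings, labels) with one Gaussian blob per cluster.

    Hypothetical stand-in for illustration; not Feluda's DataGenerator.
    """
    rng = random.Random(seed)  # seeded so benchmark inputs are reproducible
    embeddings, labels = [], []
    for cluster_id in range(num_clusters):
        # Each cluster gets its own random center in [-1, 1]^dim.
        center = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(samples_per_cluster):
            # A sample is the center plus small Gaussian noise.
            embeddings.append([c + rng.gauss(0.0, spread) for c in center])
            labels.append(cluster_id)
    return embeddings, labels


embeddings, labels = make_clustered_embeddings(num_clusters=3, samples_per_cluster=50, dim=16)
print(len(embeddings), len(embeddings[0]), sorted(set(labels)))
```

Because the labels are known in advance, the output of a clustering operator can be compared against them to check correctness as well as speed.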

3. Operator Benchmarks

Each operator has a dedicated benchmark module that exposes a benchmark() function returning a list of result dictionaries:

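from benchmark.data_generator import DataGenerator
from benchmark.profiler import Profiler
from feluda.operators import ClusterEmbeddings  # import path assumed

def benchmark() -> list[dict]:
    """Benchmark the ClusterEmbeddings operator."""
    results = []

    # Vary the number of clusters; the dataset grows by 200 samples per cluster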
    for n_clusters in [3, 5, 10]:
        embeddings, labels = DataGenerator.generate_embeddings_with_clusters(
            num_clusters=n_clusters, samples_per_cluster=200
        )

        result = Profiler.benchmark_operator(
            operator_class=ClusterEmbeddings,
            operator_name=f"cluster_embeddings_{n_clusters}clusters",
            runtime_kwargs={"input_data": embeddings}
        )

        result["data_description"] = f"{n_clusters} clusters, 200 samples each"
        results.append(result)

    return results

4. Report Generation

Collects benchmark results and writes them as JSON and Markdown reports:

from benchmark.report import BenchmarkReport

report = BenchmarkReport()
report.extend(benchmark_results)
report.save_json("results.json")
report.save_markdown("results.md")

Example: Creating a New Benchmark

1. Create Benchmark File

# benchmark/operators/my_operator.py
from benchmark.data_generator import DataGenerator
from benchmark.profiler import Profiler
from feluda.operators import MyOperator

def benchmark() -> list[dict]:
    """Benchmark MyOperator."""
    results = []

    # Test configurations
    configs = [
        {"param1": "value1", "param2": 10},
        {"param1": "value2", "param2": 20}
    ]

    for config in configs:
        result = Profiler.benchmark_operator(
            operator_class=MyOperator,
            operator_name="my_operator",
            runtime_kwargs={"input": "test_data", **config}
        )

        result["data_description"] = f"config: {config}"
        results.append(result)

    return results

2. Register the Operator

Add to benchmark/operators/__init__.py:

from . import my_operator

all_operators = [
    # ... existing operators
    my_operator,
]
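Registration matters because a runner can then loop over all_operators and call each module's benchmark() function. The sketch below shows one way such a loop might look; it is an assumption about how main.py could work, not its actual code. The `run_all` function, the stand-in modules, and the `success` and `error` keys are all hypothetical.

```python
from types import SimpleNamespace


def run_all(operators):
    """Call benchmark() on each registered module and collect the results."""
    collected = []
    for module in operators:
        try:
            collected.extend(module.benchmark())
        except Exception as exc:  # one failing benchmark should not stop the rest
            collected.append(
                {"operator_name": module.__name__, "success": False, "error": str(exc)}
            )
    return collected


# Stand-ins for real benchmark modules (hypothetical).
def _failing_benchmark():
    raise RuntimeError("model not found")


fake_ok = SimpleNamespace(
    __name__="fake_ok",
    benchmark=lambda: [{"operator_name": "fake_ok", "success": True}],
)
fake_bad = SimpleNamespace(__name__="fake_bad", benchmark=_failing_benchmark)

results = run_all([fake_ok, fake_bad])
print(results)
```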

Performance Metrics

Execution Metrics

  • Execution Time: Total wall clock time

  • CPU Time: Actual CPU processing time

  • Memory Change: Net memory allocation/deallocation

  • Peak Memory: Maximum memory usage

Quality Metrics

  • Success Rate: Percentage of successful runs

  • Error Details: Specific failure reasons

  • Consistency: Standard deviation of the measurements across repeated runs
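The quality metrics above can be derived from a list of per-run records. The following sketch assumes each record carries a `success` flag plus either an `execution_time` or an `error` message; these field names and the `summarize` helper are hypothetical and may not match what Feluda's profiler emits.

```python
import statistics


def summarize(runs):
    """Aggregate per-run records into success rate, mean, and standard deviation."""
    ok = [r for r in runs if r["success"]]
    times = [r["execution_time"] for r in ok]
    return {
        "success_rate": len(ok) / len(runs) if runs else 0.0,
        "avg_execution_time": statistics.fmean(times) if times else None,
        # The sample standard deviation needs at least two values.
        "std_execution_time": statistics.stdev(times) if len(times) > 1 else 0.0,
        "errors": [r["error"] for r in runs if not r["success"]],
    }


runs = [
    {"success": True, "execution_time": 0.040},
    {"success": True, "execution_time": 0.050},
    {"success": False, "error": "timeout"},
    {"success": True, "execution_time": 0.045},
]
summary = summarize(runs)
print(summary)
```

Failed runs are excluded from the timing statistics here so that an early crash does not make an operator look faster than it is.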

Understanding Results

JSON Report Structure

{
  "system_info": {
    "platform": "macOS-14.6.0-x86_64",
    "cpu_count": 8,
    "total_memory_gb": 16.0
  },
  "statistics": {
    "cluster_embeddings_kmeans": {
      "avg_execution_time": 0.045,
      "avg_memory_change": 2.45,
      "success_rate": 1.0
    }
  }
}
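A report with this structure is easy to check automatically, for example in CI. The sketch below parses the sample report shown above and flags operators that exceed a time budget or fall below a required success rate. The `find_regressions` helper and its thresholds are hypothetical, and `max_time` is assumed to be in whatever unit the report uses.

```python
import json

report_text = """
{
  "system_info": {"platform": "macOS-14.6.0-x86_64", "cpu_count": 8, "total_memory_gb": 16.0},
  "statistics": {
    "cluster_embeddings_kmeans": {
      "avg_execution_time": 0.045,
      "avg_memory_change": 2.45,
      "success_rate": 1.0
    }
  }
}
"""


def find_regressions(report, max_time, min_success=1.0):
    """Return names of operators that are too slow or fail too often."""
    flagged = []
    for name, stats in report["statistics"].items():
        if stats["avg_execution_time"] > max_time or stats["success_rate"] < min_success:
            flagged.append(name)
    return flagged


report = json.loads(report_text)
print(find_regressions(report, max_time=0.1))   # within budget, nothing flagged
print(find_regressions(report, max_time=0.01))  # tighter budget flags the operator
```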

Markdown Report

The Markdown report provides a human-readable summary with:

  • System configuration details

  • Per-operator performance statistics

  • Success rates and error information
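As a rough idea of how such a summary could be produced, the sketch below renders a report dictionary shaped like the JSON example into a Markdown table. The `render_markdown` function and the table layout are hypothetical; the output of report.save_markdown() may look different.

```python
def render_markdown(report):
    """Render a report dict (shaped like the JSON example) as Markdown text."""
    info = report["system_info"]
    lines = [
        "# Benchmark Report",
        "",
        f"Platform: {info['platform']}, CPUs: {info['cpu_count']}, "
        f"Memory: {info['total_memory_gb']} GB",
        "",
        "| Operator | Avg time | Avg memory change | Success rate |",
        "| --- | --- | --- | --- |",
    ]
    # One table row per operator, sorted for stable output.
    for name, stats in sorted(report["statistics"].items()):
        lines.append(
            f"| {name} | {stats['avg_execution_time']:.3f} | "
            f"{stats['avg_memory_change']:.2f} | {stats['success_rate']:.0%} |"
        )
    return "\n".join(lines)


report = {
    "system_info": {"platform": "macOS-14.6.0-x86_64", "cpu_count": 8, "total_memory_gb": 16.0},
    "statistics": {
        "cluster_embeddings_kmeans": {
            "avg_execution_time": 0.045,
            "avg_memory_change": 2.45,
            "success_rate": 1.0,
        }
    },
}
markdown = render_markdown(report)
print(markdown)
```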

Dependencies

pip install psutil memory_profiler numpy pillow