# Benchmarking

Feluda includes a benchmarking system that measures the performance characteristics of all operators, helping developers optimize operators and maintain consistent performance standards.

## Quick Start

### Run All Benchmarks

```bash
cd benchmark
python main.py
```

This generates timestamped reports in `benchmark/results/`.

### Run Specific Operator

```python
from benchmark.operators import cluster_embeddings
from benchmark.report import BenchmarkReport

results = cluster_embeddings.benchmark()
report = BenchmarkReport()
report.extend(results)
report.save_json("results.json")
```

## System Components

### 1. Profiler Engine

Measures execution time, memory usage, and CPU time:

```python
from benchmark.profiler import Profiler

# Profile a function
@Profiler.profile
def my_function():
    # Your code here
    pass

# Benchmark an operator
result = Profiler.benchmark_operator(
    operator_class=MyOperator,
    operator_name="my_operator",
    runtime_kwargs={"input": "test_data"}
)
```

**Features:**

- System information capture (CPU, memory, platform)
- Memory usage tracking with peak detection
- Wall clock and CPU time measurement

### 2. Data Generator

Creates synthetic datasets for testing:

```python
from benchmark.data_generator import DataGenerator

# Generate embeddings
embeddings = DataGenerator.generate_embeddings(1000, 512)

# Generate clustered data
embeddings, labels = DataGenerator.generate_embeddings_with_clusters(
    num_clusters=5,
    samples_per_cluster=200,
    dim=512
)

# Generate test images
images = DataGenerator.generate_test_images("test_images/")
```

**Dataset Types:**

- **Embeddings**: 100 to 100,000 samples, 256 to 2048 dimensions
- **Images**: 128x128 to 4K resolutions, RGB/grayscale modes
- **Clustered Data**: Known cluster structures for validation

### 3. Operator Benchmarks

Each operator has a dedicated benchmark:

```python
from benchmark.data_generator import DataGenerator
from benchmark.profiler import Profiler
from feluda.operators import ClusterEmbeddings  # operator import path follows the example below


def benchmark() -> list[dict]:
    """Benchmark the ClusterEmbeddings operator."""
    results = []

    # Test with different numbers of clusters
    for n_clusters in [3, 5, 10]:
        embeddings, labels = DataGenerator.generate_embeddings_with_clusters(
            num_clusters=n_clusters,
            samples_per_cluster=200
        )

        result = Profiler.benchmark_operator(
            operator_class=ClusterEmbeddings,
            operator_name=f"cluster_embeddings_{n_clusters}clusters",
            runtime_kwargs={"input_data": embeddings}
        )
        result["data_description"] = f"{n_clusters} clusters, 200 samples each"
        results.append(result)

    return results
```

### 4. Report Generation

Collects results and writes JSON and Markdown reports:

```python
from benchmark.report import BenchmarkReport

report = BenchmarkReport()
report.extend(benchmark_results)
report.save_json("results.json")
report.save_markdown("results.md")
```

## Example: Creating a New Benchmark

### 1. Create Benchmark File

```python
# benchmark/operators/my_operator.py
from benchmark.data_generator import DataGenerator
from benchmark.profiler import Profiler
from feluda.operators import MyOperator


def benchmark() -> list[dict]:
    """Benchmark MyOperator."""
    results = []

    # Test configurations
    configs = [
        {"param1": "value1", "param2": 10},
        {"param1": "value2", "param2": 20}
    ]

    for config in configs:
        result = Profiler.benchmark_operator(
            operator_class=MyOperator,
            operator_name="my_operator",
            runtime_kwargs={"input": "test_data", **config}
        )
        result["data_description"] = f"config: {config}"
        results.append(result)

    return results
```

### 2. Register the Operator

Add to `benchmark/operators/__init__.py`:

```python
from . import my_operator

all_operators = [
    # ... existing operators
    my_operator,
]
```
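### 3. Run the New Benchmark

Once registered, the benchmark should be picked up by `python main.py` along with the rest of the suite. To run it in isolation, the same `benchmark()` entry point and `BenchmarkReport` API from the Quick Start can be reused. The sketch below assumes the hypothetical `my_operator` module defined above.

```python
# Minimal sketch: run only the newly registered benchmark and write reports.
# my_operator is the hypothetical module created in step 1.
from benchmark.operators import my_operator
from benchmark.report import BenchmarkReport

results = my_operator.benchmark()

report = BenchmarkReport()
report.extend(results)
report.save_json("my_operator_results.json")
report.save_markdown("my_operator_results.md")
```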
## Performance Metrics

### Execution Metrics

- **Execution Time**: Total wall clock time
- **CPU Time**: Actual CPU processing time
- **Memory Change**: Net memory allocation/deallocation
- **Peak Memory**: Maximum memory usage

### Quality Metrics

- **Success Rate**: Percentage of successful runs
- **Error Details**: Specific failure reasons
- **Consistency**: Standard deviation of performance

## Understanding Results

### JSON Report Structure

```json
{
  "system_info": {
    "platform": "macOS-14.6.0-x86_64",
    "cpu_count": 8,
    "total_memory_gb": 16.0
  },
  "statistics": {
    "cluster_embeddings_kmeans": {
      "avg_execution_time": 0.045,
      "avg_memory_change": 2.45,
      "success_rate": 1.0
    }
  }
}
```

### Markdown Report

Provides human-readable summaries with:

- System configuration details
- Per-operator performance statistics
- Success rates and error information

## Dependencies

```bash
pip install psutil memory_profiler numpy pillow
```
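As a closing illustration of the execution metrics listed above, the following minimal sketch (not Feluda's actual Profiler implementation) uses `psutil` from the dependencies and the standard `time` module to capture wall-clock time, CPU time, and net memory change for a single call; the `measure` helper is hypothetical.

```python
# Illustrative sketch only: not the Profiler's actual implementation.
import time

import psutil


def measure(func, *args, **kwargs):
    """Run func once and report wall-clock time, CPU time, and memory change."""
    proc = psutil.Process()
    mem_before_mb = proc.memory_info().rss / (1024 * 1024)
    cpu_before = time.process_time()
    wall_before = time.perf_counter()

    result = func(*args, **kwargs)

    metrics = {
        "execution_time": time.perf_counter() - wall_before,  # wall clock, seconds
        "cpu_time": time.process_time() - cpu_before,          # CPU seconds
        "memory_change_mb": proc.memory_info().rss / (1024 * 1024) - mem_before_mb,
    }
    return result, metrics


_, metrics = measure(sum, range(1_000_000))
print(metrics)
```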