OpenETL Benchmark Report – June 2025

This report covers two runs, presented in this order: Long Job – Local Spark Cluster, then Short Job – No Spark Cluster.

Benchmark Purpose: Evaluate long-running pipeline stability and measure resource utilization with a Spark master running.

Pipeline Duration: 2025-06-11 22:08:46 to 2025-06-13 15:23:17

Total Records Processed: 2,020,000

Batch Size: 10,000 records

API Page Size: 30 records

ETL Host Instance: EC2 t3.large (7.56 GB usable RAM)

RAM Peak Usage: ~6 GB (Consistent)

Dockerized Infrastructure: Yes

Total Docker Containers: 10

Total Duration: 1 day, 17 hours, 14 minutes, 31 seconds

Data Source: Public EC2-hosted API (Ubuntu) - Free Tier

Data Target: PostgreSQL 15 (Aiven)
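The batch size and API page size above imply how many batches and how many API pages the long job had to work through. The following is a minimal sketch of that arithmetic, not part of the benchmark itself. It assumes each record is fetched exactly once through the paged API, with no retries or partial pages other than the last one. The function name `batch_and_page_counts` is hypothetical.

```python
def batch_and_page_counts(total_records: int, batch_size: int, page_size: int) -> tuple[int, int]:
    """Return (batches, api_pages) needed to cover total_records.

    Assumption: every record is read once, so the counts are simple
    ceiling divisions. Retries or overlapping pages would raise them.
    """
    # Ceiling division using integers only, to avoid float rounding.
    batches = -(-total_records // batch_size)
    api_pages = -(-total_records // page_size)
    return batches, api_pages


# Figures reported for the long job: 2,020,000 records,
# 10,000 records per batch, 30 records per API page.
batches, api_pages = batch_and_page_counts(2_020_000, 10_000, 30)
print(f"batches: {batches}, API pages: {api_pages}")
```

Under these assumptions the long job works out to 202 batches and 67,334 API pages, so the small page size, rather than the batch size, determines the number of requests sent to the source API.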

Benchmark Purpose: Test short-job ETL performance without a Spark master.

Pipeline Duration: 2025-06-16 19:50:34 to 2025-06-17 00:46:27

Total Records Processed: 900,000

Batch Size: 100,000 records

API Page Size: 30 records

ETL Host Instance: EC2 t3.large (7.56 GB usable RAM)

RAM Peak Usage: ~5 GB (Consistent)

Dockerized Infrastructure: Yes

Total Docker Containers: 6

Total Duration: 4 hours, 55 minutes, 53 seconds

Data Source: Public EC2-hosted API (Ubuntu) - Free Tier

Data Target: PostgreSQL 15 (Aiven)
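The reported start and end timestamps and record counts are enough to recompute each run's elapsed time and average throughput. The sketch below does that for both runs. It assumes the timestamps within each run share one time zone, and the helper name `throughput` is hypothetical.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def throughput(start: str, end: str, records: int) -> tuple[float, float]:
    """Return (elapsed_seconds, average_records_per_second).

    Assumption: start and end are naive timestamps in the same time zone.
    """
    elapsed = (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()
    return elapsed, records / elapsed


# Long job: local Spark cluster, 2,020,000 records.
long_secs, long_rps = throughput("2025-06-11 22:08:46", "2025-06-13 15:23:17", 2_020_000)

# Short job: no Spark cluster, 900,000 records.
short_secs, short_rps = throughput("2025-06-16 19:50:34", "2025-06-17 00:46:27", 900_000)

print(f"long job:  {long_secs:.0f} s, {long_rps:.1f} records/s")
print(f"short job: {short_secs:.0f} s, {short_rps:.1f} records/s")
```

The elapsed times come out to 148,471 seconds and 17,753 seconds, matching the reported total durations, which gives roughly 13.6 and 50.7 records per second on average. The two runs differ in batch size, container count, and Spark usage at the same time, so the gap cannot be attributed to any single one of those factors.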