Performance Comparison of Python-Based Complex Event Processing Engines for IoT Intrusion Detection: Faust Versus Streamz

Abbasi, Maryam; Cardoso, Filipe; ANTUNES VAZ, PAULO JOAQUIM; Silva, José; Sá, Filipe; Martins, Pedro

Publication

2026-03-23 Text

dc.contributor.author	Abbasi, Maryam
dc.contributor.author	Cardoso, Filipe
dc.contributor.author	ANTUNES VAZ, PAULO JOAQUIM
dc.contributor.author	Silva, José
dc.contributor.author	Sá, Filipe
dc.contributor.author	Martins, Pedro
dc.date.accessioned	2026-04-08T13:33:44Z
dc.date.available	2026-04-08T13:33:44Z
dc.date.issued	2026-03-23
dc.description.abstract	The proliferation of Internet of Things (IoT) devices has intensified the need for efficient real-time anomaly and intrusion detection, making the selection of an appropriate Complex Event Processing (CEP) engine a critical architectural decision for security-aware data pipelines. Python-based CEP frameworks offer compelling advantages through the seamless integration with data science and machine learning ecosystems; however, rigorous comparative evaluations of such frameworks under realistic IoT security workloads remain absent from the literature. This study presents the first systematic comparative evaluation of Faust and Streamz—two Python-native CEP engines representing fundamentally different architectural philosophies—specifically in the context of IoT network intrusion detection. Faust was selected for its actor-based stateful processing model with native Kafka integration and distributed table support, while Streamz was selected for its reactive, lightweight pipeline design targeting high-throughput stateless processing, making them representative of the two dominant paradigms in Python stream processing. Although both engines target different application niches, their performance characteristics under realistic CEP workloads have never been rigorously compared, leaving practitioners without empirical guidance. The primary evaluation employs an IoT network intrusion dataset comprising 583,485 events from 83 heterogeneous devices. To assess whether the observed performance characteristics are specific to this single dataset or generalize across different workload profiles, a secondary IoT-adjacent benchmark is included: the PaySim financial transaction dataset (6.4 million records), selected because its event schema, fraud-pattern temporal structure, and volume differ substantially from the intrusion dataset, providing a stress test for cross-workload robustness rather than a claim of domain equivalence. 
We acknowledge the reviewer’s valid point that a second IoT-specific intrusion dataset (such as TON_IoT or Bot-IoT) would constitute a more directly comparable validation; this is identified as a priority for future work. The load levels used in scalability experiments (up to 5000 events per second) intentionally exceed the dataset’s natural rate to stress-test each engine’s architectural ceiling and identify saturation thresholds relevant to large-scale or multi-sensor IoT deployments. We conducted controlled experiments with comprehensive statistical analysis. Our results demonstrate that Streamz achieves superior throughput at 4450 events per second with 89% efficiency and minimal resource consumption (40 MB memory, 12 ms median latency), while Faust provides robust intrusion pattern detection with 93–98% accuracy and stable, predictable resource utilization (1.4% CPU standard deviation). A multi-framework comparison including Apache Kafka Streams and offline scikit-learn baselines confirms that Faust achieves detection quality competitive with JVM-based alternatives (Faust: 96.2%; Kafka Streams: 96.8%; absolute difference of 0.6 percentage points, not statistically significant at p = 0.318) while retaining the Python ecosystem advantages. Statistical analysis confirms significant performance differences across all metrics (p < 0.001, Cohen’s d > 0.8). Critical scalability thresholds are identified: Streamz maintains efficiency above 95% up to 3500 events per second, while Faust degrades beyond 2500 events per second. These findings provide IoT security engineers and system architects with actionable, empirically grounded guidance for CEP engine selection, establish a reproducible benchmarking methodology applicable to future Python-based stream processing evaluations, and advance theoretical understanding of the accuracy–throughput trade-off in stateful versus stateless Python CEP architectures.	eng
dc.identifier.citation	Abbasi, M., Cardoso, F., Váz, P., Silva, J., Sá, F., & Martins, P. (2026). Performance Comparison of Python-Based Complex Event Processing Engines for IoT Intrusion Detection: Faust Versus Streamz. Computers, 15(3), 200. https://doi.org/10.3390/computers15030200
dc.identifier.doi	https://doi.org/10.3390/computers15030200
dc.identifier.eissn	2073-431X
dc.identifier.uri	http://hdl.handle.net/10400.19/9760
dc.language.iso	eng
dc.peerreviewed	yes
dc.publisher	MDPI
dc.relation.hasversion	https://www.mdpi.com/2073-431X/15/3/200
dc.rights.uri	http://creativecommons.org/licenses/by/4.0/
dc.subject	complex event processing
dc.subject	IoT intrusion detection
dc.subject	stream processing
dc.subject	performance evaluation
dc.subject	Python
dc.subject	Faust
dc.subject	Streamz
dc.subject	benchmarking
dc.subject	real-time systems
dc.subject	anomaly detection
dc.subject	scalability
dc.subject	Kafka
dc.title	Performance Comparison of Python-Based Complex Event Processing Engines for IoT Intrusion Detection: Faust Versus Streamz	por
dc.type	text
dspace.entity.type	Publication
oaire.citation.issue	3
oaire.citation.startPage	200
oaire.citation.title	Computers
oaire.citation.volume	15
oaire.version	http://purl.org/coar/version/c_970fb48d4fbd8a85
person.familyName	ANTUNES VAZ
person.familyName	Silva
person.familyName	Sá
person.givenName	PAULO JOAQUIM
person.givenName	José
person.givenName	Filipe
person.identifier	R-00H-E4X
person.identifier.ciencia-id	351C-9899-0EE7
person.identifier.ciencia-id	4A14-D3E7-5B32
person.identifier.ciencia-id	791E-0243-634F
person.identifier.orcid	0000-0002-1745-8937
person.identifier.orcid	0000-0001-7285-8282
person.identifier.orcid	0000-0002-7846-8397
person.identifier.scopus-author-id	55447844100
person.identifier.scopus-author-id	8447524700
relation.isAuthorOfPublication	702e79ee-5b0b-47ff-989d-12e6d8ea1e89
relation.isAuthorOfPublication	e9d8719e-af47-4008-b854-817801bb3964
relation.isAuthorOfPublication	9fb8350d-65a7-4170-b28f-cc60c70c0bb2
relation.isAuthorOfPublication.latestForDiscovery	702e79ee-5b0b-47ff-989d-12e6d8ea1e89
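
The abstract reports a throughput efficiency (89% at 4450 events per second) and an effect size (Cohen's d > 0.8) without stating the formulas in this record. The sketch below shows one common way such figures are computed. It is an illustration under stated assumptions, not the authors' method: efficiency is assumed to be achieved rate divided by offered rate, the 5000 events per second offered rate is taken from the maximum load level mentioned in the abstract, Cohen's d uses the pooled-standard-deviation form, and the function names and toy samples are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def throughput_efficiency(achieved_eps: float, offered_eps: float) -> float:
    """Fraction of the offered event rate that the engine actually processed.

    Assumption: efficiency = achieved / offered. The record does not define it.
    """
    if offered_eps <= 0:
        raise ValueError("offered_eps must be positive")
    return achieved_eps / offered_eps


def cohens_d(sample_a: list[float], sample_b: list[float]) -> float:
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    pooled_var = (
        (n_a - 1) * stdev(sample_a) ** 2 + (n_b - 1) * stdev(sample_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(sample_a) - mean(sample_b)) / sqrt(pooled_var)


if __name__ == "__main__":
    # 4450 and 5000 come from the abstract; pairing them is an assumption.
    print(f"efficiency: {throughput_efficiency(4450, 5000):.2%}")

    # Toy latency samples in milliseconds, invented for illustration only.
    toy_a = [1.0, 2.0, 3.0]
    toy_b = [2.0, 3.0, 4.0]
    print(f"Cohen's d: {cohens_d(toy_a, toy_b):.2f}")
```

By the usual convention, an absolute d of 0.8 or more is read as a large effect, which is the threshold the abstract cites.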

Files

Main

Name: Performance Comparison of Python-Based Complex Event.pdf
Size: 322.34 KB
Format: Adobe Portable Document Format

License

Name: license.txt
Size: 1.79 KB
Format: Item-specific license agreed upon to submission

Collections

ESTGV - DEMGI - Article in a scientific journal indexed in WoS/Scopus
CISeD - Article in a scientific journal indexed in WoS/Scopus
ESTGV - DI - Article in a scientific journal indexed in WoS/Scopus