ESTGV - DEMGI - Article in a scientific journal indexed in WoS/Scopus

Permanent URI for this collection

http://hdl.handle.net/10400.19/2475

A Practical Performance Benchmark of Post-Quantum Cryptography Across Heterogeneous Computing Environments
Publication . Abbasi, Maryam; Cardoso, Filipe; Vaz, Paulo; Silva, José; Martins, Pedro
The emergence of large-scale quantum computing presents an imminent threat to contemporary public-key cryptosystems, with quantum algorithms such as Shor’s algorithm capable of efficiently breaking RSA and elliptic curve cryptography (ECC). This vulnerability has catalyzed accelerated standardization efforts for post-quantum cryptography (PQC) by the U.S. National Institute of Standards and Technology (NIST) and global security stakeholders. While theoretical security analysis of these quantum-resistant algorithms has advanced considerably, comprehensive real-world performance benchmarks spanning diverse computing environments, from high-performance cloud infrastructure to severely resource-constrained IoT devices, remain insufficient for informed deployment planning. This paper presents the most extensive cross-platform empirical evaluation to date of NIST-selected PQC algorithms, including CRYSTALS-Kyber and NTRU for key encapsulation mechanisms (KEMs), alongside BIKE as a code-based alternative, and CRYSTALS-Dilithium and Falcon for digital signatures. Our systematic benchmarking framework measures computational latency, memory utilization, key sizes, and protocol overhead across multiple security levels (NIST Levels 1, 3, and 5) in three distinct hardware environments and various network conditions. Results demonstrate that contemporary server architectures can implement these algorithms with negligible performance impact (<5% additional latency), making immediate adoption feasible for cloud services. In contrast, resource-constrained devices experience more significant overhead, with computational demands varying by up to 12× between algorithms at equivalent security levels, highlighting the importance of algorithm selection for edge deployments.
Beyond standalone algorithm performance, we analyze integration challenges within existing security protocols, revealing that naive implementation of PQC in TLS 1.3 can increase handshake size by up to 7× compared to classical approaches. To address this, we propose and evaluate three optimization strategies that reduce bandwidth requirements by 40–60% without compromising security guarantees. Our investigation further encompasses memory-constrained implementation techniques, side-channel resistance measures, and hybrid classical-quantum approaches for transitional deployments. Based on these comprehensive findings, we present a risk-based migration framework and algorithm selection guidelines tailored to specific use cases, including financial transactions, secure firmware updates, vehicle-to-infrastructure communications, and IoT fleet management. This practical roadmap enables organizations to strategically prioritize systems for quantum-resistant upgrades based on data sensitivity, resource constraints, and technical feasibility. Our results conclusively demonstrate that PQC is deployment-ready for most applications, provided that implementations are carefully optimized for the specific performance characteristics and security requirements of target environments. We also identify several remaining research challenges for the community, including further optimization for ultra-constrained devices, standardization of hybrid schemes, and hardware acceleration opportunities.
2025-05-21 . Text . Open access
Performance and Scalability of Data Cleaning and Preprocessing Tools: A Benchmark on Large Real-World Datasets
Publication . Martins, Pedro; Cardoso, Filipe; Vaz, Paulo; Silva, José; Abbasi, Maryam
Data cleaning remains one of the most time-consuming and critical steps in modern data science, directly influencing the reliability and accuracy of downstream analytics. In this paper, we present a comprehensive evaluation of five widely used data cleaning tools (OpenRefine, Dedupe, Great Expectations, TidyData (PyJanitor), and a baseline Pandas pipeline) applied to large-scale, messy datasets spanning three domains (healthcare, finance, and industrial telemetry). We benchmark each tool on dataset sizes ranging from 1 million to 100 million records, measuring execution time, memory usage, error detection accuracy, and scalability under increasing data volumes. Additionally, we assess qualitative aspects such as usability and ease of integration, reflecting real-world adoption concerns. We incorporate recent findings on parallelized data cleaning and highlight how domain-specific anomalies (e.g., negative amounts in finance, sensor corruption in industrial telemetry) can significantly impact tool choice. Our findings reveal that no single solution excels across all metrics; while Dedupe provides robust duplicate detection and Great Expectations offers in-depth rule-based validation, tools like TidyData and baseline Pandas pipelines demonstrate strong scalability and flexibility under chunk-based ingestion. The choice of tool ultimately depends on domain-specific requirements (e.g., approximate matching in finance and strict auditing in healthcare) and the magnitude of available computational resources. By highlighting each framework’s strengths and limitations, this study offers data practitioners clear, evidence-driven guidance for selecting and combining tools to tackle large-scale data cleaning challenges.
2025-05-18 . Text . Open access
Adaptive and Scalable Database Management with Machine Learning Integration: A PostgreSQL Case Study
Publication . Abbasi, Maryam; Bernardo, Marco V.; Vaz, Paulo; Silva, José; Martins, Pedro; ANTUNES VAZ, PAULO JOAQUIM
The increasing complexity of managing modern database systems, particularly in terms of optimizing query performance for large datasets, presents significant challenges that traditional methods often fail to address. This paper proposes a comprehensive framework for integrating advanced machine learning (ML) models within the architecture of a database management system (DBMS), with a specific focus on PostgreSQL. Our approach leverages a combination of supervised and unsupervised learning techniques to predict query execution times, optimize performance, and dynamically manage workloads. Unlike existing solutions that address specific optimization tasks in isolation, our framework provides a unified platform that supports real-time model inference and automatic database configuration adjustments based on workload patterns. A key contribution of our work is the integration of ML capabilities directly into the DBMS engine, enabling seamless interaction between the ML models and the query optimization process. This integration allows for the automatic retraining of models and dynamic workload management, resulting in substantial improvements in both query response times and overall system throughput. Our evaluations using the Transaction Processing Performance Council Decision Support (TPC-DS) benchmark dataset at scale factors of 100 GB, 1 TB, and 10 TB demonstrate a reduction of up to 42% in query execution times and a 74% improvement in throughput compared with traditional approaches. Additionally, we address challenges such as potential conflicts in tuning recommendations and the performance overhead associated with ML integration, providing insights for future research directions. 
This study is motivated by the need for autonomous tuning mechanisms to manage large-scale, heterogeneous workloads while answering key research questions, such as the following: (1) How can machine learning models be integrated into a DBMS to improve query optimization and workload management? (2) What performance improvements can be achieved through dynamic configuration tuning based on real-time workload patterns? Our results suggest that the proposed framework significantly reduces the need for manual database administration while effectively adapting to evolving workloads, offering a robust solution for modern large-scale data environments.
2024-09-18 . Text . Open access
Machine Learning Approaches for Predicting Maize Biomass Yield: Leveraging Feature Engineering and Comprehensive Data Integration
Publication . Abbasi, Maryam; Vaz, Paulo; Silva, José; Martins, Pedro; ANTUNES VAZ, PAULO JOAQUIM
The efficient prediction of corn biomass yield is critical for optimizing crop production and addressing global challenges in sustainable agriculture and renewable energy. This study employs advanced machine learning techniques, including Gradient Boosting Machines (GBMs), Random Forests (RFs), Support Vector Machines (SVMs), and Artificial Neural Networks (ANNs), integrated with comprehensive environmental, soil, and crop management data from key agricultural regions in the United States. A novel framework combines feature engineering, such as the creation of a Soil Fertility Index (SFI) and Growing Degree Days (GDDs), and the incorporation of interaction terms to address complex non-linear relationships between input variables and biomass yield. We conduct extensive sensitivity analysis and employ SHAP (SHapley Additive exPlanations) values to enhance model interpretability, identifying SFI, GDDs, and cumulative rainfall as the most influential features driving yield outcomes. Our findings highlight significant synergies among these variables, emphasizing their critical role in rural environmental governance and precision agriculture. Furthermore, an ensemble approach combining GBMs, RFs, and ANNs outperformed individual models, achieving an RMSE of 0.80 t/ha and R² of 0.89. These results underscore the potential of hybrid modeling for real-world applications in sustainable farming practices. Addressing the concerns of passive farmer participation, we propose targeted incentives, education, and institutional support mechanisms to enhance stakeholder collaboration in rural environmental governance. While the models assume rational decision-making, the inclusion of cultural and political factors warrants further investigation to improve the robustness of the framework. Additionally, a map of the study region and improved visualizations of feature importance enhance the clarity and relevance of our findings.
This research contributes to the growing body of knowledge on predictive modeling in agriculture, combining theoretical rigor with practical insights to support policymakers and stakeholders in optimizing resource use and addressing environmental challenges. By improving the interpretability and applicability of machine learning models, this study provides actionable strategies for enhancing crop yield predictions and advancing rural environmental governance.
2025-01-02 . Text . Open access
Data Privacy and Ethical Considerations in Database Management
Publication . Pina, Eduardo; Ramos, José; Jorge, Henrique; ANTUNES VAZ, PAULO JOAQUIM; Vaz, Paulo; Silva, José; Wanzeller, Cristina; Abbasi, Maryam; Martins, Pedro; Wanzeller Guedes de Lacerda, Ana Cristina
Data privacy and ethical considerations ensure the security of databases by respecting individual rights while upholding ethical standards when collecting, managing, and using information. Nowadays, despite regulations that help to protect citizens and organizations, there have been thousands of instances of data breaches, unauthorized access, and misuse of data related to such individuals and organizations. In this paper, we propose ethical considerations and best practices associated with critical data and the role of the database administrator who helps protect data. First, we suggest best practices for database administrators regarding data minimization, anonymization, pseudonymization and encryption, access controls, data retention guidelines, and stakeholder communication. Then, we present a case study that illustrates the application of these ethical implementations and best practices in a real-world scenario, showing the approach in action and the benefits of privacy. Finally, the study highlights the importance of a comprehensive approach to deal with data protection challenges and provides valuable insights for future research and developments in this field.
2024-07-26 . Text . Open access
Enhancing Visual Perception in Immersive VR and AR Environments: AI-Driven Color and Clarity Adjustments Under Dynamic Lighting Conditions
Publication . Abbasi, Maryam; Silva, José; Martins, Pedro; ANTUNES VAZ, PAULO JOAQUIM
The visual fidelity of virtual reality (VR) and augmented reality (AR) environments is essential for user immersion and comfort. Dynamic lighting often leads to chromatic distortions and reduced clarity, causing discomfort and disrupting user experience. This paper introduces an AI-driven chromatic adjustment system based on a modified U-Net architecture, optimized for real-time applications in VR/AR. This system adapts to dynamic lighting conditions, addressing the shortcomings of traditional methods like histogram equalization and gamma correction, which struggle with rapid lighting changes and real-time user interactions. We compared our approach with state-of-the-art color constancy algorithms, including Barron’s Convolutional Color Constancy and STAR, demonstrating superior performance. Experimental results from 60 participants show significant improvements, with up to 41% better color accuracy and 39% enhanced clarity under dynamic lighting conditions. The study also included eye-tracking data, which confirmed increased user engagement with AI-enhanced images. Our system provides a practical solution for developers aiming to improve image quality, reduce visual discomfort, and enhance overall user satisfaction in immersive environments. Future work will focus on extending the model’s capability to handle more complex lighting scenarios.
2024-11-03 . Text . Open access
Real-Time Gesture-Based Hand Landmark Detection for Optimized Mobile Photo Capture and Synchronization
Publication . Marques, Pedro; ANTUNES VAZ, PAULO JOAQUIM; Silva, José; Martins, Pedro; Abbasi, Maryam
Gesture recognition technology has emerged as a transformative solution for natural and intuitive human–computer interaction (HCI), offering touch-free operation across diverse fields such as healthcare, gaming, and smart home systems. In mobile contexts, where hygiene, convenience, and the ability to operate under resource constraints are critical, hand gesture recognition provides a compelling alternative to traditional touch-based interfaces. However, implementing effective gesture recognition in real-world mobile settings involves challenges such as limited computational power, varying environmental conditions, and the requirement for robust offline–online data management. In this study, we introduce ThumbsUp, a gesture-driven system, and employ a partially systematic literature review approach (inspired by core PRISMA guidelines) to identify the key research gaps in mobile gesture recognition. By incorporating insights from deep learning–based methods (e.g., CNNs and Transformers) while focusing on low resource consumption, we leverage Google’s MediaPipe in our framework for real-time detection of 21 hand landmarks and adaptive lighting pre-processing, enabling accurate recognition of a “thumbs-up” gesture. The system features a secure queue-based offline–cloud synchronization model, which ensures that the captured images and metadata (encrypted with AES-GCM) remain consistent and accessible even with intermittent connectivity. Experimental results under dynamic lighting, distance variations, and partially cluttered environments confirm the system’s superior low-light performance and decreased resource consumption compared to baseline camera applications. Additionally, we highlight the feasibility of extending ThumbsUp to incorporate AI-driven enhancements for abrupt lighting changes and, in the future, electromyographic (EMG) signals for users with motor impairments.
Our comprehensive evaluation demonstrates that ThumbsUp maintains robust performance on typical mobile hardware, showing resilience to unstable network conditions and minimal reliance on high-end GPUs. These findings offer new perspectives for deploying gesture-based interfaces in the broader IoT ecosystem, thus paving the way toward secure, efficient, and inclusive mobile HCI solutions.
2025-02-12 . Text . Open access
Head-to-Head Evaluation of FDM and SLA in Additive Manufacturing: Performance, Cost, and Environmental Perspectives
Publication . Abbasi, Maryam; ANTUNES VAZ, PAULO JOAQUIM; Martins, Pedro; Silva, José
This paper conducts a comprehensive experimental comparison of two widely used additive manufacturing (AM) processes, Fused Deposition Modeling (FDM) and Stereolithography (SLA), under standardized conditions using the same test geometries and protocols. FDM parts were printed with both Polylactic Acid (PLA) and Acrylonitrile Butadiene Styrene (ABS) filaments, while SLA used a general-purpose photopolymer resin. Quantitative evaluations included surface roughness, dimensional accuracy, tensile properties, production cost, and energy consumption. Additionally, environmental considerations and process reliability were assessed by examining waste streams, recyclability, and failure rates. The results indicate that SLA achieves superior surface quality (Ra ≈ 2 µm vs. 12–13 µm) and dimensional tolerances (±0.05 mm vs. ±0.15–0.20 mm), along with higher tensile strength (up to 70 MPa). However, FDM provides notable advantages in cost (approximately 60% lower on a per-part basis), production speed, and energy efficiency. Moreover, from an environmental perspective, FDM is more favorable when using biodegradable PLA or recyclable ABS, whereas SLA resin waste is hazardous. Overall, the study highlights that no single process is universally superior. FDM offers a rapid, cost-effective solution for prototyping, while SLA excels in precision and surface finish. By presenting a detailed, data-driven comparison, this work guides engineers, product designers, and researchers in choosing the most suitable AM technology for their specific needs.
2025-02-19 . Text . Open access
Comprehensive Evaluation of Deepfake Detection Models: Accuracy, Generalization, and Resilience to Adversarial Attacks
Publication . Abbasi, Maryam; ANTUNES VAZ, PAULO JOAQUIM; Silva, José; Martins, Pedro
The rise of deepfakes (synthetic media generated using artificial intelligence) threatens digital content authenticity, facilitating misinformation and manipulation. However, deepfakes can also depict real or entirely fictitious individuals, leveraging state-of-the-art techniques such as generative adversarial networks (GANs) and emerging diffusion-based models. Existing detection methods face challenges with generalization across datasets and vulnerability to adversarial attacks. This study focuses on subsets of frames extracted from the DeepFake Detection Challenge (DFDC) and FaceForensics++ videos to evaluate three convolutional neural network architectures (XCeption, ResNet, and VGG16) for deepfake detection. Performance metrics include accuracy, precision, F1-score, AUC-ROC, and Matthews Correlation Coefficient (MCC), combined with an assessment of resilience to adversarial perturbations via the Fast Gradient Sign Method (FGSM). Among the tested models, XCeption achieves the highest accuracy (89.2% on DFDC), strong generalization, and real-time suitability, while VGG16 excels in precision and ResNet provides faster inference. However, all models exhibit reduced performance under adversarial conditions, underscoring the need for enhanced resilience. These findings indicate that robust detection systems must consider advanced generative approaches, adversarial defenses, and cross-dataset adaptation to effectively counter evolving deepfake threats.
2025-01-25 . Text . Open access
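Of the metrics named in the abstract above, the Matthews Correlation Coefficient is the one least often spelled out. As a rough illustration only (this is not code from the paper), it can be computed from the four confusion-matrix counts. The function name and the example counts below are hypothetical and do not come from the study.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts.

    Returns a value in [-1, 1]; 0.0 is returned by convention when any
    marginal total is zero and the coefficient is otherwise undefined.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for a real-vs-fake frame classifier (illustrative only).
print(matthews_corrcoef(tp=90, tn=85, fp=15, fn=10))
```

Unlike accuracy, the coefficient uses all four cells of the confusion matrix, so it stays informative when the real and fake classes are imbalanced.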
A Simulation of Data Censored Right Type I with Weibull Distribution
Publication . Gaspar, Daniel; Andrande Ferreira, Luis
In the maintenance and reliability field, analyses frequently involve censored data. Many reliability articles use simulation, but few explain how it is done. The loss of information caused by unavailable exact failure times negatively affects the efficiency of reliability analysis. This paper presents four different algorithms to generate random data with different numbers of censored values. The four algorithms are compared, and three parameters are used to select the best one. The Weibull distribution is used to generate the random numbers because it is one of the most widely used in reliability studies. For the chosen algorithm, with a sample size of n = 50 and m = 1000 simulation cycles, the standard deviation is highest when the Weibull shape factor is beta = 0.5 and slowly decreases until the shape factor equals 5. The percentage error (PE), one of the selected indicators, is much higher when the percentage of censored data is c = 5%, and decreases as the shape factor increases. The behaviour differs when the percentage of censored data is c = 20%: the percentage error is highest when the shape factor is beta = 1.5. The article identifies the algorithm it considers best for simulating Type I right-censored data; this algorithm offers excellent accuracy, i.i.d. random data, and excellent computational performance.
2022-11 . Journal article . Open access
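The abstract above does not describe the four algorithms themselves. As a minimal sketch of one common way to simulate Type I right-censored Weibull data (an assumption, not necessarily any of the paper's algorithms), the censoring time can be fixed where the Weibull survival function equals the target censored fraction c, so that on average a fraction c of the sample is censored. The function names and the scale value of 1.0 are hypothetical.

```python
import math
import random


def censoring_time(shape, scale, censored_fraction):
    """Fixed time T at which the Weibull survival function equals censored_fraction.

    Solves exp(-(T / scale) ** shape) = censored_fraction for T.
    """
    return scale * (-math.log(censored_fraction)) ** (1.0 / shape)


def simulate_type1_censored(n, shape, scale, censored_fraction, rng):
    """Return n pairs (time, is_censored) under Type I right censoring."""
    t_c = censoring_time(shape, scale, censored_fraction)
    sample = []
    for _ in range(n):
        # random.weibullvariate takes the scale first, then the shape.
        t = rng.weibullvariate(scale, shape)
        if t > t_c:
            sample.append((t_c, True))   # failure not observed before T
        else:
            sample.append((t, False))    # exact failure time observed
    return sample


rng = random.Random(42)  # seeded so the run is repeatable
data = simulate_type1_censored(n=50, shape=1.5, scale=1.0,
                               censored_fraction=0.20, rng=rng)
n_censored = sum(1 for _, censored in data if censored)
print(f"{n_censored} of {len(data)} observations censored")
```

With this approach the number of censored values in any single sample of n = 50 varies around the target; only its expectation equals c times n. An algorithm that needs an exact censored count would have to use a different construction.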
