ESTGV - DI - Artigo em revista científica, indexada ao WoS/Scopus
Browsing ESTGV - DI - Artigo em revista científica, indexada ao WoS/Scopus by Title
Now showing 1 - 10 of 31
- Adaptive and Scalable Database Management with Machine Learning Integration: A PostgreSQL Case Study
  Abbasi, Maryam; Bernardo, Marco V.; Vaz, Paulo; Silva, José; Martins, Pedro
  The increasing complexity of managing modern database systems, particularly in terms of optimizing query performance for large datasets, presents significant challenges that traditional methods often fail to address. This paper proposes a comprehensive framework for integrating advanced machine learning (ML) models within the architecture of a database management system (DBMS), with a specific focus on PostgreSQL. Our approach leverages a combination of supervised and unsupervised learning techniques to predict query execution times, optimize performance, and dynamically manage workloads. Unlike existing solutions that address specific optimization tasks in isolation, our framework provides a unified platform that supports real-time model inference and automatic database configuration adjustments based on workload patterns. A key contribution of our work is the integration of ML capabilities directly into the DBMS engine, enabling seamless interaction between the ML models and the query optimization process. This integration allows for the automatic retraining of models and dynamic workload management, resulting in substantial improvements in both query response times and overall system throughput. Our evaluations using the Transaction Processing Performance Council Decision Support (TPC-DS) benchmark dataset at scale factors of 100 GB, 1 TB, and 10 TB demonstrate a reduction of up to 42% in query execution times and a 74% improvement in throughput compared with traditional approaches. Additionally, we address challenges such as potential conflicts in tuning recommendations and the performance overhead associated with ML integration, providing insights for future research directions. This study is motivated by the need for autonomous tuning mechanisms to manage large-scale, heterogeneous workloads while answering key research questions, such as the following: (1) How can machine learning models be integrated into a DBMS to improve query optimization and workload management? (2) What performance improvements can be achieved through dynamic configuration tuning based on real-time workload patterns? Our results suggest that the proposed framework significantly reduces the need for manual database administration while effectively adapting to evolving workloads, offering a robust solution for modern large-scale data environments.
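As a rough illustration of the supervised component described above, the sketch below trains a regressor to predict query execution time from plan-level features. The feature set, the synthetic data, and the choice of scikit-learn's GradientBoostingRegressor are illustrative assumptions, not the paper's actual pipeline.

```python
# Minimal sketch: predict query execution time from hypothetical plan features.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
n = 2000
# Hypothetical plan features: estimated rows, join count, index scans, work_mem (MB)
X = np.column_stack([
    rng.lognormal(10, 2, n),     # estimated rows
    rng.integers(0, 8, n),       # number of joins
    rng.integers(0, 5, n),       # index scans used
    rng.choice([4, 64, 256], n)  # work_mem setting
])
# Synthetic target standing in for measured execution time (ms)
y = 0.001 * X[:, 0] * (1 + X[:, 1]) / (1 + X[:, 2]) + rng.normal(0, 5, n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor().fit(X_tr, y_tr)
print("MAE (ms):", mean_absolute_error(y_te, model.predict(X_te)))
```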
- Agile-based Requirements Engineering for Machine Learning: A Case Study on Personalized Nutrition
  Cunha, Carlos; Oliveira, Rafael; Duarte, Rui
  Requirements engineering is crucial in developing machine learning systems, as it establishes the foundation for successful project execution. Nevertheless, incorporating requirements engineering approaches from traditional software engineering into machine learning projects presents new challenges. These challenges arise from replacing the software logic derived from static software specifications with dynamic software logic derived from data. This paper presents a case study exploring an agile requirements engineering approach, popular in traditional software projects, to specify requirements in machine learning software. These requirements allow reasoning about the correctness of software and designing tests for validation. The absence of a software specification in machine learning software is offset by employing data quality metrics, which are assessed using cutting-edge methods for model interpretability. A case study on personalized nutrition and physical activity demonstrated the adequacy of the user stories and acceptance criteria format, popular in agile projects, for specifying requirements in the machine learning domain.
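A minimal sketch of how an acceptance criterion from a user story might be expressed as an executable data quality check, in the spirit of the approach above; the column name, thresholds, and sample data are invented for illustration.

```python
# Hypothetical acceptance test: a user-story criterion as a data quality check.
import pandas as pd

def check_nutrition_dataset(df: pd.DataFrame) -> None:
    """Acceptance criterion: <2% missing calorie values and plausible ranges."""
    missing_ratio = df["calories_kcal"].isna().mean()
    assert missing_ratio < 0.02, f"too many missing values: {missing_ratio:.1%}"
    valid = df["calories_kcal"].dropna().between(0, 5000)
    assert valid.all(), "calorie values outside the plausible 0-5000 kcal range"

if __name__ == "__main__":
    sample = pd.DataFrame({"calories_kcal": [250.0, 610.5, 1200.0, 90.0]})
    check_nutrition_dataset(sample)
    print("acceptance criterion satisfied")
```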
- An automated closed-loop framework to enforce security policies from anomaly detection
  Henriques, João; Caldeira, Filipe; Cruz, Tiago; Simões, Paulo
  Due to the growing complexity and scale of IT systems, there is an increasing need to automate and streamline routine maintenance and security management procedures, to reduce costs and improve productivity. In the case of security incidents, the implementation and application of response actions require significant effort from operators and developers in translating policies to code. Even if Machine Learning (ML) models are used to find anomalies, they need to be regularly trained and updated to avoid becoming outdated. In an evolving environment, an ML model with outdated training might put at risk the organization it was supposed to defend. To overcome those issues, in this paper we propose an automated closed-loop process with three stages. The first stage focuses on obtaining the Decision Trees (DTs) that classify anomalies. In the second stage, DTs are translated into security Policies as Code based on languages recognized by the Policy Engine (PE). In the last stage, the translated security policies feed the Policy Engines that enforce them by converting them into specific instruction sets. We also demonstrate the feasibility of the proposed framework by presenting an example that encompasses the three stages of the closed-loop process. The proposed framework may integrate a broad spectrum of domains and use cases, being able, for instance, to support the decide and act stages of the ETSI Zero-touch Network & Service Management (ZSM) framework.
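The sketch below gestures at the first two stages: fitting a decision tree on anomaly-labeled data and exporting its branches as human-readable rules, the raw material for a policy-as-code translation. The two features, the synthetic labels, and the use of scikit-learn's export_text are assumptions, not the paper's implementation.

```python
# Sketch: decision tree trained on synthetic anomaly data, then dumped as rules.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(1)
X = rng.random((500, 2))                              # [requests_per_sec, error_ratio]
y = ((X[:, 0] > 0.8) & (X[:, 1] > 0.5)).astype(int)  # synthetic anomaly label

dt = DecisionTreeClassifier(max_depth=3).fit(X, y)
rules = export_text(dt, feature_names=["requests_per_sec", "error_ratio"])
print(rules)  # human-readable branches, ready to be mapped to policy syntax
```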
- Automated Reusable Tests for Mitigating Secure Pattern Interpretation Errors
  Cunha, Carlos; Pombo, Nuno
  The importance of software security has increased along with the number and severity of incidents in recent years. Security is a multidisciplinary aspect of the software development lifecycle, operation, and user utilization. Being a complex and specialized area of software engineering, it is often sidestepped in software development methodologies and processes. We address software security at the design level by adopting design patterns that encapsulate reusable solutions for recurring security problems. Design patterns can help development teams implement the best-proven solutions for a specialized problem domain. However, from the analysis of three secure pattern implementations by 70 junior programmers, we detected several structural errors resulting from their interpretation. We propose reusable unit-testing test cases based on annotations to avoid secure pattern interpretation errors, and provide an example for one popular secure pattern. When the same group of programmers was given these test cases, they implemented the pattern without errors. The reason is that the annotations build a framework that disciplines programmers to incorporate secure patterns into their applications and ensures automatic testing.
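The paper's annotation-driven tests are not reproduced here; the following is a loose Python analogue of the idea, using a decorator as the "annotation" that registers classes for a reusable structural test. The Protection Proxy example, the registry, and all names are invented for illustration.

```python
# Python analogue of annotation-driven reusable tests for a secure pattern.
SECURE_PATTERN_REGISTRY = {}

def secure_proxy(cls):
    """'Annotation' registering a class as the proxy in a Protection Proxy pattern."""
    SECURE_PATTERN_REGISTRY[cls.__name__] = cls
    return cls

@secure_proxy
class DocumentProxy:
    def __init__(self, user_role: str):
        self.user_role = user_role

    def read(self) -> str:
        if self.user_role != "admin":  # the access check the pattern requires
            raise PermissionError("access denied")
        return "sensitive content"

def test_registered_proxies_enforce_access_control():
    """Reusable test: every registered proxy must reject unauthorized callers."""
    for cls in SECURE_PATTERN_REGISTRY.values():
        try:
            cls("guest").read()
        except PermissionError:
            continue
        raise AssertionError(f"{cls.__name__} does not enforce access control")

test_registered_proxies_enforce_access_control()
print("all registered secure proxies passed")
```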
- Automatic Camera Calibration Using a Single Image to Extract Intrinsic and Extrinsic Parameters
  P. Duarte, Rui; Cunha, Carlos; Pereira Cardoso, José Carlos
  This article presents a methodology for accurately locating vanishing points in undistorted images, enabling the determination of a camera's intrinsic and extrinsic parameters as well as facilitating measurements within the image. Additionally, the development of a vanishing point filtering algorithm is introduced. The algorithm's effectiveness is validated by extracting real-world coordinates using only three points and their corresponding distances. Finally, the obtained vanishing points are compared with extrinsic parameters derived from multiple objects and with intrinsic parameters obtained from various shapes and images sourced from different test sites. Results show that the intrinsic parameters are extracted accurately from a single image. Moreover, using three points to determine the extrinsic parameters is an excellent alternative to the checkerboard, making the method more practical since it does not require manually positioning a checkerboard to perform the camera calibration.
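One classical step consistent with this kind of calibration is recovering the focal length from two orthogonal vanishing points, assuming the principal point sits at the image center and pixels are square. The sketch below applies the standard relation f² = −(v1 − p) · (v2 − p) with made-up point coordinates; it is not the article's algorithm.

```python
# Focal length from two orthogonal vanishing points (standard relation).
import numpy as np

def focal_from_vanishing_points(v1, v2, principal_point):
    """For orthogonal scene directions, f^2 = -(v1 - p) . (v2 - p)."""
    d1 = np.asarray(v1, float) - principal_point
    d2 = np.asarray(v2, float) - principal_point
    f_sq = -np.dot(d1, d2)
    if f_sq <= 0:
        raise ValueError("vanishing points inconsistent with orthogonality")
    return np.sqrt(f_sq)

p = np.array([960.0, 540.0])  # assumed principal point: center of a 1920x1080 frame
f = focal_from_vanishing_points([2400.0, 500.0], [-300.0, 600.0], p)
print(f"estimated focal length: {f:.1f} px")
```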
- Combining K-Means and XGBoost Models for Anomaly Detection Using Log Datasets
  Henriques, João; Caldeira, Filipe; Cruz, Tiago; Simões, Paulo
  Computing and networking systems traditionally record their activity in log files, which have been used for multiple purposes, such as troubleshooting, accounting, post-incident analysis of security breaches, capacity planning, and anomaly detection. In earlier systems those log files were processed manually by system administrators, or with the support of basic applications for filtering, compiling, and pre-processing the logs for specific purposes. However, as the volume of these log files continues to grow (more logs per system, more systems per domain), it is becoming increasingly difficult to process those logs using traditional tools, especially for less straightforward purposes such as anomaly detection. On the other hand, as systems continue to become more complex, the potential of using large datasets built of logs from heterogeneous sources for detecting anomalies without prior domain knowledge becomes higher. Anomaly detection tools for such scenarios face two challenges. First, devising appropriate data analysis solutions for effectively detecting anomalies from large data sources, possibly without prior domain knowledge. Second, adopting data processing platforms able to cope with the large datasets and complex data analysis algorithms required for such purposes. In this paper we address those challenges by proposing an integrated scalable framework that aims at efficiently detecting anomalous events in large amounts of unlabeled data logs. Detection is supported by clustering and classification methods that take advantage of parallel computing environments. We validate our approach using the well-known NASA Hypertext Transfer Protocol (HTTP) log datasets. Fourteen features were extracted in order to train a k-means model for separating anomalous and normal events into highly coherent clusters. A second model, making use of the XGBoost system implementing a gradient tree boosting algorithm, uses the previous binary clustered data to produce a set of simple interpretable rules. These rules represent the rationale for generalizing its application over a massive number of unseen events in a distributed computing environment. The classified anomaly events produced by our framework can be used, for instance, as candidates for further forensic and compliance auditing analysis in security management.
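A condensed sketch of the two-model idea: k-means partitions events into two clusters, and an XGBoost classifier then learns from the cluster labels so it can generalize to unseen events. The four synthetic features stand in for the fourteen log-derived features used in the paper, and the hyperparameters are arbitrary.

```python
# Two-model pipeline: unsupervised clustering, then supervised rule learning.
import numpy as np
from sklearn.cluster import KMeans
from xgboost import XGBClassifier

rng = np.random.default_rng(42)
normal = rng.normal(0, 1, (950, 4))     # bulk of ordinary log events
anomalous = rng.normal(4, 1, (50, 4))   # small cluster of outlying events
X = np.vstack([normal, anomalous])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
clf = XGBClassifier(max_depth=3, n_estimators=50).fit(X, labels)
print("cluster assigned to unseen events:", clf.predict(rng.normal(4, 1, (3, 4))))
```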
- Comprehensive Evaluation of Deepfake Detection Models: Accuracy, Generalization, and Resilience to Adversarial Attacks
  Abbasi, Maryam; Antunes Vaz, Paulo Joaquim; Silva, José; Martins, Pedro
  The rise of deepfakes (synthetic media generated using artificial intelligence) threatens digital content authenticity, facilitating misinformation and manipulation. Deepfakes can depict real or entirely fictitious individuals, leveraging state-of-the-art techniques such as generative adversarial networks (GANs) and emerging diffusion-based models. Existing detection methods face challenges with generalization across datasets and vulnerability to adversarial attacks. This study focuses on subsets of frames extracted from the DeepFake Detection Challenge (DFDC) and FaceForensics++ videos to evaluate three convolutional neural network architectures (XCeption, ResNet, and VGG16) for deepfake detection. Performance metrics include accuracy, precision, F1-score, AUC-ROC, and Matthews Correlation Coefficient (MCC), combined with an assessment of resilience to adversarial perturbations via the Fast Gradient Sign Method (FGSM). Among the tested models, XCeption achieves the highest accuracy (89.2% on DFDC), strong generalization, and real-time suitability, while VGG16 excels in precision and ResNet provides faster inference. However, all models exhibit reduced performance under adversarial conditions, underscoring the need for enhanced resilience. These findings indicate that robust detection systems must consider advanced generative approaches, adversarial defenses, and cross-dataset adaptation to effectively counter evolving deepfake threats.
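The FGSM evaluation can be summarized in a few lines: perturb the input in the direction of the sign of the loss gradient under a small budget. The tiny CNN below is a placeholder, not one of the evaluated architectures, and epsilon is an arbitrary choice.

```python
# Minimal FGSM sketch: one gradient-sign perturbation step on a stand-in frame.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))
loss_fn = nn.CrossEntropyLoss()

x = torch.rand(1, 3, 64, 64, requires_grad=True)   # stand-in video frame
y = torch.tensor([1])                              # label: 1 = fake

loss = loss_fn(model(x), y)
loss.backward()
epsilon = 0.03                                     # perturbation budget (assumed)
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1)  # FGSM step
print("prediction on adversarial frame:", model(x_adv).argmax(dim=1).item())
```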
- Data Privacy and Ethical Considerations in Database Management
  Pina, Eduardo; Ramos, José; Jorge, Henrique; Vaz, Paulo; Silva, José; Wanzeller, Cristina; Abbasi, Maryam; Martins, Pedro
  Data privacy and ethical considerations ensure the security of databases by respecting individual rights while upholding ethical standards when collecting, managing, and using information. Nowadays, despite regulations that help to protect citizens and organizations, thousands of instances of data breaches, unauthorized access, and misuse of data affecting those individuals and organizations have been reported. In this paper, we propose ethical considerations and best practices associated with critical data and the role of the database administrator in protecting data. First, we suggest best practices for database administrators regarding data minimization, anonymization, pseudonymization and encryption, access controls, data retention guidelines, and stakeholder communication. Then, we present a case study that illustrates the application of these ethical implementations and best practices in a real-world scenario, showing the approach in action and the benefits of privacy. Finally, the study highlights the importance of a comprehensive approach to dealing with data protection challenges and provides valuable insights for future research and developments in this field.
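As an illustration of one listed practice, pseudonymization, the sketch below replaces a direct identifier with a salted hash so records remain linkable without exposing the identity; the salt handling and field names are deliberately simplified assumptions.

```python
# Pseudonymization sketch: salted hash in place of a direct identifier.
import hashlib
import secrets

SALT = secrets.token_bytes(16)  # in practice, stored and managed separately

def pseudonymize(identifier: str) -> str:
    return hashlib.sha256(SALT + identifier.encode("utf-8")).hexdigest()[:16]

record = {"email": "patient@example.com", "diagnosis": "B34.9"}
record["email"] = pseudonymize(record["email"])
print(record)  # same input always maps to the same pseudonym under this salt
```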
- Enhancing quality of life: Human-centered design of mobile and smartwatch applications for assisted ambient living
  Augusto, Gonçalo F.; P. Duarte, Rui; Cunha, Carlos
  Background: Assisted ambient living interfaces are technologies designed to improve the quality of life for people who require assistance with daily activities. They are crucial for individuals to maintain their independence for as long as possible. To this end, these interfaces have to be user-friendly, intuitive, and accessible, even for those who are not tech-savvy. Research in recent years indicates that people find it uncomfortable to wear invasive or large intrusive devices to monitor health status, and poor user interface design leads to a lack of user engagement. Methods: This paper presents the design and implementation of non-intrusive mobile and smartwatch applications for detecting when older adults execute their routines. The solution uses an intuitive mobile application to set up beacons and incorporates biometric data acquired from the smartwatch to measure bio-signals correlated with the user's location. User testing and interface evaluation were carried out using the User Experience Questionnaire (UEQ). Results: Six older adults participated in the evaluation of the interfaces. Users rated the interaction as excellent on all UEQ parameters for the mobile interface. For the smartwatch application, results vary from above average to excellent. Conclusions: The applications are intuitive and easy to use, and the data obtained from integrating the systems are essential to link information and provide feedback to the user.
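A toy sketch of the data linkage the abstract describes: taking the strongest beacon signal as a proxy for room-level location and pairing it with a smartwatch reading. The beacon identifiers, rooms, and values are invented; the real applications are far richer.

```python
# Toy linkage: strongest beacon RSSI -> room, joined with a smartwatch reading.
rssi_readings = {"beacon-kitchen": -62, "beacon-bedroom": -80, "beacon-bath": -91}
beacon_to_room = {"beacon-kitchen": "kitchen", "beacon-bedroom": "bedroom",
                  "beacon-bath": "bathroom"}

nearest = max(rssi_readings, key=rssi_readings.get)  # strongest signal wins
sample = {"room": beacon_to_room[nearest], "heart_rate_bpm": 74}
print(sample)  # location-tagged bio-signal, ready for routine detection
```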
- Enhancing Visual Perception in Immersive VR and AR Environments: AI-Driven Color and Clarity Adjustments Under Dynamic Lighting Conditions
  Abbasi, Maryam; Silva, José; Martins, Pedro; Antunes Vaz, Paulo Joaquim
  The visual fidelity of virtual reality (VR) and augmented reality (AR) environments is essential for user immersion and comfort. Dynamic lighting often leads to chromatic distortions and reduced clarity, causing discomfort and disrupting the user experience. This paper introduces an AI-driven chromatic adjustment system based on a modified U-Net architecture, optimized for real-time applications in VR/AR. The system adapts to dynamic lighting conditions, addressing the shortcomings of traditional methods like histogram equalization and gamma correction, which struggle with rapid lighting changes and real-time user interactions. We compared our approach with state-of-the-art color constancy algorithms, including Barron's Convolutional Color Constancy and STAR, demonstrating superior performance. Experimental results from 60 participants show significant improvements, with up to 41% better color accuracy and 39% enhanced clarity under dynamic lighting conditions. The study also included eye-tracking data, which confirmed increased user engagement with AI-enhanced images. Our system provides a practical solution for developers aiming to improve image quality, reduce visual discomfort, and enhance overall user satisfaction in immersive environments. Future work will focus on extending the model's capability to handle more complex lighting scenarios.
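For orientation only, the skeleton below shows a small encoder-decoder with one skip connection, the family of architectures a modified U-Net belongs to; the depth, channel sizes, and output activation are assumptions, not the published model.

```python
# Tiny encoder-decoder with a skip connection, gesturing at a U-Net variant.
import torch
import torch.nn as nn

class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.MaxPool2d(2),
                                  nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)
        self.out = nn.Conv2d(32, 3, 3, padding=1)  # 32 = 16 upsampled + 16 skip

    def forward(self, x):
        e = self.enc(x)
        d = self.up(self.down(e))
        return torch.sigmoid(self.out(torch.cat([d, e], dim=1)))  # adjusted RGB

frame = torch.rand(1, 3, 128, 128)  # stand-in VR frame under harsh lighting
print(TinyUNet()(frame).shape)      # torch.Size([1, 3, 128, 128])
```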