ESTGV - DI - Articles in scientific journals, indexed in WoS/Scopus

Permanent URI for this collection:

Browse

Recent submissions

Showing 1 - 10 of 43
  • The use of Machine Learning in diabetes prevention
    Publication . Lopes, M.; Fialho, Joana; Wanzeller, Cristina; Wanzeller Guedes de Lacerda, Ana Cristina; Corresponding author: Fialho, Joana
    Introduction: Diabetes Mellitus is one of the fastest-growing chronic diseases in the world. In this context, Machine Learning (ML) techniques offer potential for identifying patterns relevant to disease control. Objective: To analyze the impact of ML techniques and the use of feature selection techniques on diabetes prediction, using the "Diabetes Health Indicators" dataset. Methods: The CRISP-DM methodology was applied. The data were balanced with the NearMiss undersampling technique. Recursive Feature Elimination (RFE) and Principal Component Analysis (PCA) were used for feature selection. Six models were tested: Random Forest, Gradient Boosting, KNN, Logistic Regression, Multilayer Perceptron (MLP), and Recurrent Neural Networks (RNN). Results: The RNN stood out with an accuracy of 86.8% and an F1-score of 0.868 on balanced data. The combination of RFE with MLP also showed robust performance. Class balancing significantly improved the results. Conclusion: ML and DL techniques are promising for clinical screening and public policy. It is necessary to increase data representativeness, incorporate explainable AI, and calibrate decision thresholds to reduce false negatives, which is essential for practical applications.
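The pipeline described in this abstract (class balancing, then recursive feature elimination, then model evaluation) can be sketched as follows. This is a minimal illustration, not the authors' code: synthetic data stands in for the "Diabetes Health Indicators" dataset, and simple random undersampling stands in for NearMiss (which lives in the separate imbalanced-learn package).

```python
# Sketch: balance classes, select features with RFE, evaluate a classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.85, 0.15], random_state=0)

# Balance classes by undersampling the majority class to the minority size
# (a stand-in for NearMiss, which picks majority samples by distance instead).
minority = np.flatnonzero(y == 1)
majority = rng.choice(np.flatnonzero(y == 0), size=minority.size, replace=False)
idx = np.concatenate([minority, majority])
Xb, yb = X[idx], y[idx]

# Keep the 10 most informative features via Recursive Feature Elimination.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)
Xb = rfe.fit_transform(Xb, yb)

X_tr, X_te, y_tr, y_te = train_test_split(Xb, yb, test_size=0.3, random_state=0)
pred = RandomForestClassifier(random_state=0).fit(X_tr, y_tr).predict(X_te)
print(f"accuracy={accuracy_score(y_te, pred):.3f}  F1={f1_score(y_te, pred):.3f}")
```

The same skeleton accepts any of the six tested models in place of the Random Forest; only the final estimator changes.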
  • Olive Tree (Olea europaea) Pruning Autohydrolysis: FTIR Analysis, and Energy Potential
    Publication . Domingos, Idalina; Domingos Ferreira, Miguel; Ferreira, José; Esteves, Bruno
    Olive trees cultivated in the Viseu region (Portugal) were used in the present work. This study investigates the compositional characteristics and hydrothermal behavior of olive branches (OB) and olive leaves (OL) under autohydrolysis, aiming to assess their potential for biorefinery applications. Chemical analysis revealed that during autohydrolysis (140–180 °C, 15–30 min), OL exhibited greater solubilization than OB, consistent with their higher extractive content. Increasing the temperature promoted selective hemicellulose removal and partial cellulose degradation, leading to a relative enrichment of lignin in the solid residues. Nevertheless, the cellulose content of olive branches increased for hydrolysis at 180 °C for 30 min. Fourier transform infrared spectroscopy confirmed progressive structural rearrangements, including enhanced hydroxyl exposure, carbonyl formation, and lignin condensation, indicating the transformation of the solid phase toward more aromatic and thermally stable structures. Autohydrolysis slightly increased the higher heating value of the solid residues, while acid-catalyzed liquefaction increased it markedly, exceeding the values of both native and technical lignins. These results suggest extensive carbon enrichment and oxygen removal during liquefaction. Overall, autohydrolysis proved effective for hemicellulose solubilization and sugar recovery, while liquefaction favored energy densification and lignin condensation. The distinct behaviors of OB and OL highlight the importance of tailoring processing conditions to each feedstock type. Both materials show strong potential as renewable resources for bioenergy and value-added carbon-based products within an integrated olive biomass biorefinery framework.
  • SSENPV's Integrated Management Platform
    Publication . Pinto, Bruno; Matos, Cristina Peixoto; Abrantes, Steven; Lourenço, Carolina; Fialho, Joana; Cravo, Ivone; Antunes, Maria José; Nascimento, Márcio
    At the Polytechnic of Viseu (PV) [1], the number of students has been increasing [2], and it is hoped that this trend continues, both for the literacy of the population it serves in its coverage area and for the reduction of the desertification of the interior region in which it is located. In the case of Students with Specific Educational Needs (SSEN) who attend the PV (SSENPV), it is extremely important to develop procedures that minimize the anxiety brought about by change for these students, as well as to facilitate their adaptation, so as to make the period spent in Higher Education (HE) an inclusive one, generating well-being, promoting academic success, and facilitating the transition to active life. To combat this phenomenon, and since the PV is a Higher Education Institution (HEI) guided by equity in its community, in particular the student community, the SSENPV census is a crucial measure insofar as it is necessary to implement procedures that respect and obey individual specificities. It is also intended that access to information on the platform will reduce asymmetries between students, as well as in access to services. To respond to this reality, within the scope of the Inova & Includes project IPV I2 [3], a group of researchers, in partnership with the degree course in Computer Engineering at the Superior School of Technology and Management of Viseu (ESTGV) – curricular unit of “Project”, developed an integrated management platform for SSENPV. This platform, which is intended to be a contribution to true equity in education in the PV, supports social impact in different dimensions, which translates into the implementation of the following profiles:
    • Informative Profile: dissemination of legislation and other relevant information on Specific Educational Needs (SEN), ensuring centralized and accessible information management and streamlining procedures and support measures.
    • Academic Profile: registration and updating of data on the SSENPV.
    • Technical Evaluation and Follow-up Profile: registration of the SSENPV procedural evaluation, with automatic sending of technical evaluations to authorized users.
  • Performance Comparison of Python-Based Complex Event Processing Engines for IoT Intrusion Detection: Faust Versus Streamz
    Publication . Abbasi, Maryam; Cardoso, Filipe; Antunes Vaz, Paulo Joaquim; Silva, José; Sá, Filipe; Martins, Pedro
    The proliferation of Internet of Things (IoT) devices has intensified the need for efficient real-time anomaly and intrusion detection, making the selection of an appropriate Complex Event Processing (CEP) engine a critical architectural decision for security-aware data pipelines. Python-based CEP frameworks offer compelling advantages through the seamless integration with data science and machine learning ecosystems; however, rigorous comparative evaluations of such frameworks under realistic IoT security workloads remain absent from the literature. This study presents the first systematic comparative evaluation of Faust and Streamz—two Python-native CEP engines representing fundamentally different architectural philosophies—specifically in the context of IoT network intrusion detection. Faust was selected for its actor-based stateful processing model with native Kafka integration and distributed table support, while Streamz was selected for its reactive, lightweight pipeline design targeting high-throughput stateless processing, making them representative of the two dominant paradigms in Python stream processing. Although both engines target different application niches, their performance characteristics under realistic CEP workloads have never been rigorously compared, leaving practitioners without empirical guidance. The primary evaluation employs an IoT network intrusion dataset comprising 583,485 events from 83 heterogeneous devices. To assess whether the observed performance characteristics are specific to this single dataset or generalize across different workload profiles, a secondary IoT-adjacent benchmark is included: the PaySim financial transaction dataset (6.4 million records), selected because its event schema, fraud-pattern temporal structure, and volume differ substantially from the intrusion dataset, providing a stress test for cross-workload robustness rather than a claim of domain equivalence. 
A second IoT-specific intrusion dataset (such as TON_IoT or Bot-IoT) would provide a more directly comparable validation and is identified as a priority for future work. The load levels used in scalability experiments (up to 5000 events per second) intentionally exceed the dataset’s natural rate to stress-test each engine’s architectural ceiling and identify saturation thresholds relevant to large-scale or multi-sensor IoT deployments. We conducted controlled experiments with comprehensive statistical analysis. Our results demonstrate that Streamz achieves superior throughput at 4450 events per second with 89% efficiency and minimal resource consumption (40 MB memory, 12 ms median latency), while Faust provides robust intrusion pattern detection with 93–98% accuracy and stable, predictable resource utilization (1.4% CPU standard deviation). A multi-framework comparison including Apache Kafka Streams and offline scikit-learn baselines confirms that Faust achieves detection quality competitive with JVM-based alternatives (Faust: 96.2%; Kafka Streams: 96.8%; absolute difference of 0.6 percentage points, not statistically significant at p = 0.318) while retaining the Python ecosystem advantages. Statistical analysis confirms significant performance differences across all metrics (p < 0.001, Cohen’s d > 0.8). Critical scalability thresholds are identified: Streamz maintains efficiency above 95% up to 3500 events per second, while Faust degrades beyond 2500 events per second. These findings provide IoT security engineers and system architects with actionable, empirically grounded guidance for CEP engine selection, establish reproducible benchmarking methodology applicable to future Python-based stream processing evaluations, and advance theoretical understanding of the accuracy–throughput trade-off in stateful versus stateless Python CEP architectures.
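The stateful-versus-stateless contrast at the heart of this comparison can be illustrated without either framework. The sketch below is dependency-free and purely didactic: the event fields, device names, and thresholds are hypothetical, not taken from the study, and plain Python stands in for the Streamz-style (stateless filter) and Faust-style (stateful per-device table) pipelines.

```python
# Stateless vs. stateful event processing over a toy IoT event stream.
from collections import defaultdict

events = [
    {"device": "cam-1", "bytes": 120},
    {"device": "cam-1", "bytes": 9000},   # traffic burst
    {"device": "lock-2", "bytes": 80},
    {"device": "cam-1", "bytes": 9500},   # traffic burst
]

# Stateless (Streamz-style): each event is judged on its own, no memory.
stateless_alerts = [e for e in events if e["bytes"] > 5000]

# Stateful (Faust-style): a per-device table accumulates bursts, so the
# engine can detect *repeated* anomalous behavior from the same device.
bursts = defaultdict(int)
stateful_alerts = []
for e in events:
    if e["bytes"] > 5000:
        bursts[e["device"]] += 1
        if bursts[e["device"]] >= 2:      # second burst from the same device
            stateful_alerts.append(e["device"])

print(len(stateless_alerts), stateful_alerts)  # 2 ['cam-1']
```

The stateless pass is cheaper per event (no table lookups or state storage), which mirrors the throughput advantage reported for Streamz, while the stateful table enables the cross-event pattern detection that underlies Faust's higher detection quality.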
  • Unified Data Governance in Heterogeneous Database Environments: An API-Driven Architecture for Multi-Platform Policy Enforcement
    Publication . Abbasi, Maryam; Antunes Vaz, Paulo Joaquim; Silva, José; Cardoso, Filipe; Sá, Filipe; Martins, Pedro
    Modern organizations increasingly rely on heterogeneous database environments that combine relational, document-oriented, and key-value storage systems to optimize performance for diverse application requirements. However, this technological diversity creates significant challenges for implementing consistent data governance policies, regulatory compliance, and access control across disparate systems. Traditional governance approaches that operate within individual database silos fail to provide unified policy enforcement and create compliance gaps that expose organizations to regulatory and operational risks. This paper presents a novel API-driven architecture that enables unified data governance across heterogeneous database environments without requiring database-specific modifications or vendor lock-in. The proposed framework implements a centralized governance layer that coordinates policy enforcement across PostgreSQL, MongoDB, and Amazon DynamoDB systems through RESTful API interfaces. Key architectural components include differentiated access control through hierarchical API key management, automated compliance workflows for regulatory requirements such as GDPR, real-time audit trail generation, and comprehensive data quality monitoring with automated improvement mechanisms. Comprehensive experimental evaluation demonstrates the framework’s effectiveness across multiple operational dimensions. The system achieved 95.2% accuracy in access control enforcement across different data classification levels, while automated GDPR compliance workflows demonstrated 98.6% success rates with average processing times of 2.9 h. Performance evaluation reveals acceptable overhead characteristics with linear scaling patterns for PostgreSQL operations (R² = 0.89), consistent sub-20 ms response times for MongoDB logging operations, and sustained throughput rates ranging from 38.9 to 142.7 requests per second across the integrated system. 
Data quality improvements ranged from 16.1% to 34.3% across accuracy, completeness, consistency, and timeliness dimensions over a 12-week monitoring period, with accuracy improving by 17.8 percentage points, completeness by 13.2 percentage points, consistency by 19.7 percentage points, and timeliness by 24.5 percentage points. The duplicate detection system achieved 94.6% precision and 95.6% recall across various duplicate types, including cross-database redundancy identification. The results demonstrate that API-driven governance architectures can effectively address the persistent challenges of policy fragmentation in multi-database environments while maintaining operational performance and enabling measurable improvements in data quality and regulatory compliance. The framework provides a practical migration path for organizations seeking to implement comprehensive governance capabilities without replacing existing database infrastructure investments.
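The "differentiated access control through hierarchical API key management" mentioned in the abstract can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: the tier names, classification levels, and key table below are all hypothetical.

```python
# Toy hierarchical API-key access check: a key's tier must clear the
# classification level of the data it wants to read.
TIERS = {"admin": 3, "analyst": 2, "service": 1}      # hypothetical tiers
API_KEYS = {                                           # hypothetical key table
    "key-admin-001": "admin",
    "key-svc-042": "service",
}

def can_read(api_key: str, classification: int) -> bool:
    """Allow access only if the key's tier is at least the data's level."""
    tier = API_KEYS.get(api_key)
    return tier is not None and TIERS[tier] >= classification

print(can_read("key-admin-001", 3))  # True  (admin tier 3 >= level 3)
print(can_read("key-svc-042", 2))    # False (service tier 1 < level 2)
```

In the architecture described above, a check of this kind would sit in the centralized governance layer, so the same decision applies uniformly whether the request ultimately hits PostgreSQL, MongoDB, or DynamoDB.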
  • Revisiting Database Indexing for Parallel and Accelerated Computing: A Comprehensive Study and Novel Approaches
    Publication . Abbasi, Maryam; Bernardo, Marco V.; Antunes Vaz, Paulo Joaquim; Silva, José; Martins, Pedro
    While the importance of indexing strategies for optimizing query performance in database systems is widely acknowledged, the impact of rapidly evolving hardware architectures on indexing techniques has been an underexplored area. As modern computing systems increasingly leverage parallel processing capabilities, multi-core CPUs, and specialized hardware accelerators, traditional indexing approaches may not fully capitalize on these advancements. This comprehensive experimental study investigates the effects of hardware-conscious indexing strategies tailored for contemporary and emerging hardware platforms. Through rigorous experimentation on a real-world database environment using the industry-standard TPC-H benchmark, this research evaluates the performance implications of indexing techniques specifically designed to exploit parallelism, vectorization, and hardware-accelerated operations. By examining approaches such as cache-conscious B-Tree variants, SIMD-optimized hash indexes, and GPU-accelerated spatial indexing, the study provides valuable insights into the potential performance gains and trade-offs associated with these hardware-aware indexing methods. The findings reveal that hardware-conscious indexing strategies can significantly outperform their traditional counterparts, particularly in data-intensive workloads and large-scale database deployments. Our experiments show improvements ranging from 32.4% to 48.6% in query execution time, depending on the specific technique and hardware configuration. However, the study also highlights the complexity of implementing and tuning these techniques, as they often require intricate code optimizations and a deep understanding of the underlying hardware architecture. Additionally, this research explores the potential of machine learning-based indexing approaches, including reinforcement learning for index selection and neural network-based index advisors. 
While these techniques show promise, with performance improvements of up to 48.6% in certain scenarios, their effectiveness varies across different query types and data distributions. By offering a comprehensive analysis and practical recommendations, this research contributes to the ongoing pursuit of database performance optimization in the era of heterogeneous computing. The findings inform database administrators, developers, and system architects on effective indexing practices tailored for modern hardware, while also paving the way for future research into adaptive indexing techniques that can dynamically leverage hardware capabilities based on workload characteristics and resource availability.
  • Optimizing Database Performance in Complex Event Processing through Indexing Strategies
    Publication . Abbasi, Maryam; Bernardo, Marco V.; Antunes Vaz, Paulo Joaquim; Silva, José; Martins, Pedro
    Complex event processing (CEP) systems have gained significant importance in various domains, such as finance, logistics, and security, where the real-time analysis of event streams is crucial. However, as the volume and complexity of event data continue to grow, optimizing the performance of CEP systems becomes a critical challenge. This paper investigates the impact of indexing strategies on the performance of databases handling complex event processing. We propose a novel indexing technique, called Hierarchical Temporal Indexing (HTI), specifically designed for the efficient processing of complex event queries. HTI leverages the temporal nature of event data and employs a multi-level indexing approach to optimize query execution. By combining temporal indexing with spatial- and attribute-based indexing, HTI aims to accelerate the retrieval and processing of relevant events, thereby improving overall query performance. In this study, we evaluate the effectiveness of HTI by implementing complex event queries on various CEP systems with different indexing strategies. We conduct a comprehensive performance analysis, measuring the query execution times and resource utilization (CPU, memory, etc.), and analyzing the execution plans and query optimization techniques employed by each system. Our experimental results demonstrate that the proposed HTI indexing strategy outperforms traditional indexing approaches, particularly for complex event queries involving temporal constraints and multi-dimensional event attributes. We provide insights into the strengths and weaknesses of each indexing strategy, identifying the factors that influence performance, such as data volume, query complexity, and event characteristics. Furthermore, we discuss the implications of our findings for the design and optimization of CEP systems, offering recommendations for indexing strategy selection based on the specific requirements and workload characteristics. 
Finally, we outline the potential limitations of our study and suggest future research directions in this domain.
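The core idea behind the proposed Hierarchical Temporal Indexing (a coarse temporal level that narrows the search before finer attribute-level lookup) can be illustrated with a dependency-free sketch. The bucket size, event fields, and two-level layout below are a simplified stand-in for HTI, not the paper's actual data structure.

```python
# Two-level index sketch: coarse time buckets first, then attribute lookup.
from collections import defaultdict

BUCKET = 60  # seconds per top-level time bucket (illustrative)

# index[time_bucket][event_type] -> list of events in that bucket
index = defaultdict(lambda: defaultdict(list))

def insert(event):
    """File the event under its coarse time bucket and its type."""
    index[event["ts"] // BUCKET][event["type"]].append(event)

def query(t_start, t_end, etype):
    """Scan only buckets overlapping [t_start, t_end), then filter exactly."""
    hits = []
    for b in range(t_start // BUCKET, (t_end - 1) // BUCKET + 1):
        for e in index[b][etype]:
            if t_start <= e["ts"] < t_end:
                hits.append(e)
    return hits

for ts, etype in [(5, "login"), (65, "login"), (70, "trade"), (130, "login")]:
    insert({"ts": ts, "type": etype})

print(len(query(0, 120, "login")))  # 2 (events at ts=5 and ts=65)
```

The benefit mirrors the one reported above for temporally constrained queries: events outside the requested window are skipped at the bucket level and never inspected individually.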
  • Environmental and Economic Assessment of Desktop vs. Laptop Computers: A Life Cycle Approach
    Publication . Domingos Ferreira, Miguel; Domingos, Idalina; Leite dos Santos, Lenise Maria; Barreto, Ana; Ferreira, José
    This study evaluates and compares the environmental and economic implications of desktop and laptop computer systems throughout their life cycles using screening life cycle assessment (LCA) and life cycle costing (LCC) methodologies. The functional unit was defined as the use of one computer system for fundamental home and small-business productivity tasks over a four-year period. The analysis considered the production, use, and end-of-life phases. The results showed the desktop system had a higher overall carbon footprint (679.1 kg CO2eq) compared to the laptop (286.1 kg CO2eq). For both systems, manufacturing contributed the largest share of the emissions, followed by use. Desktops exhibited significantly higher use-phase emissions, due to greater energy consumption. Life cycle cost analysis revealed that laptops had slightly lower total costs (EUR 593.88) than desktops (EUR 608.40) over the four-year period, despite higher initial investment costs. Sensitivity analysis examining different geographical scenarios highlighted the importance of considering regional factors in the LCA. Manufacturer-provided data generally showed lower carbon footprint values than the modeled scenarios. This study emphasizes the need for updated life cycle inventory data and energy efficiency improvements to reduce the environmental impacts of computer systems. Overall, laptops demonstrated environmental and economic advantages over desktops in the defined usage cases.
  • Olive Tree (Olea europaea) Pruning: Chemical Composition and Valorization of Wastes Through Liquefaction
    Publication . Domingos, Idalina; Domingos Ferreira, Miguel; Ferreira, José; Esteves, Bruno
    Olive tree branches (OB) and leaves (OL) from the Viseu region (Portugal) were studied for their chemical composition and liquefaction behavior using polyalcohols. Chemical analysis revealed that OL contained higher ash content (4.08%) and extractives, indicating more bioactive compounds, while OB had greater α-cellulose (30.47%) and hemicellulose (27.88%). Lignin content was higher in OL (21.64%) than OB (16.40%). Liquefaction experiments showed that increasing the temperature from 140 °C to 180 °C improved conversion, with OB showing a larger increase (52.5% to 80.9%) compared to OL (66% to 72%). OB reached peak conversion faster, and the optimal particle size for OB was 40–60 mesh, while OL performed better at finer sizes. OL benefited more from higher solvent ratios, whereas OB achieved high conversion with less solvent. FTIR analysis confirmed that acid-catalyzed liquefaction breaks down lignocellulosic structures, depolymerizes cellulose and hemicellulose, and modifies lignin, forming hydroxyl, aliphatic, and carbonyl groups. These changes reflect progressive biomass degradation and the incorporation of polyalcohol components, converting solid biomass into a reactive, polyol-rich liquid. The study highlights the distinct chemical and processing characteristics of olive branches and leaves, informing their potential industrial applications.
  • Life Cycle Assessment of Pig Production in Central Portugal: Environmental Impacts and Sustainability Challenges
    Publication . Leite dos Santos, Lenise Maria; Domingos Ferreira, Miguel; Domingos, Idalina; Oliveira, Verónica; Rodrigues, Carla; Ferreira, António; Ferreira, José
    Pig farming plays a crucial socioeconomic role in the European Union, which is one of the largest pork exporters in the world. In Portugal, pig farming plays a key role in regional development and the national economy. To ensure future sustainability and minimize environmental impacts, it is essential to identify the most deleterious pig production activities. This study carried out a life cycle assessment (LCA) of pig production using a conventional system in central Portugal to identify the unit processes with the greatest environmental impact. LCA followed the ISO 14040/14044 standards, covering the entire production cycle, from feed manufacturing to waste management, using 1 kg of live pig weight as the functional unit. The slurry produced is used as fertilizer in agriculture, replacing synthetic chemical fertilizers. Results show that feed production, raising piglets, and fattening pigs are the most impactful phases of the pig production cycle. Fodder production is the stage with the greatest impact, accounting for approximately 60% to 70% of the impact in the categories analyzed in most cases. The environmental categories with the highest impacts were freshwater ecotoxicity, human carcinogenic toxicity, and marine ecotoxicity; the most significant impacts were observed for human health, with an estimated effect of around 0.00045 inhabitant equivalents (Hab.eq) after normalization. The use of more sustainable ingredients and the optimization of feed efficiency are effective strategies for promoting sustainability in the pig farming sector.