Browse by author "Sá, Filipe"
Showing 1 - 10 of 10
- An Evaluation of How Big-Data and Data Warehouses Improve Business Intelligence Decision Making
  Publication. Martins, Anthony; Martins, Pedro; Caldeira, Filipe; Sá, Filipe; Rocha, Á.
  Understanding how to combine a data warehouse with business intelligence tools, together with other tools for visualizing KPIs, is a critical factor in raising the competencies and business results of an organization. This article reviews data warehouse concepts and their appropriate use in business intelligence projects, with a focus on large amounts of information. Nowadays, data volume is larger and more critical, and proper data analysis is essential for a successful project. From importing data to displaying results, there are crucial tasks such as extracting information, transforming and analyzing it, and storing data for later querying. This work contributes a proposed Big Data Business Intelligence architecture for an efficient BI platform, along with an explanation of each step in creating a Data Warehouse and of how data transformation is designed to provide useful and valuable information. To make valuable information useful, Business Intelligence tools are presented and evaluated, contributing to the continuous improvement of business results.
- An overview on how to develop a low-code application using OutSystems
  Publication. Martins, Ricardo; Caldeira, Filipe; Sá, Filipe; Abbasi, Maryam; Martins, Pedro
  The motivation for developing a self-service platform for employees arises precisely from the idea that every organization has tasks that could be automated in order to redirect work resources to more important ones. The proposed application is a self-service platform for personal information and scheduling tasks aimed at employees, unlike the solutions on the market, which target their platforms at Human Resources. We focus on the employees, giving them more responsibility for their own personal management, such as changing their personal information and booking their vacations, while Human Resources manages the actions taken by the employees. At the end of the work, the final solution is expected to serve as an example of success in business automation and innovation, using the low-code platform OutSystems to perform the full proposed application development.
- CityAction: a smart-city platform architecture
  Publication. Martins, Pedro; Albuquerque, Daniel; Wanzeller, Cristina; Caldeira, Filipe; Tomé, P.; Sá, Filipe
  Fast population growth in cities and surrounding regions forces cities to become smarter in order to sustain their economy, social quality, and environmental well-being. Smart-cities will be the ones using information and communication technologies to make city services more efficient (in performance and cost), interactive, and aware of events. For a city to become smarter, it needs to use emerging technologies related to the Internet-of-Things (IoT), not only to collect information and interact (actuate, command, control) but also to provide services for analytics and other applications. This paper researches the concept of smart-city in the context of the CityAction project, tested in the city of Castelo Branco, Portugal. The project focuses on the relationship between IoT, monitoring, actuating, and displaying data. Based on data collected from sensors spread across the city, the proposed project aims to make "smart" decisions that optimize resources, cost, quality of life, and environmental impact. Results introduce an architecture that integrates multiple heterogeneous sensors, a dashboard capable of displaying data in a user-friendly way, and a mobile app that makes this information available to the population. These mechanisms make it possible to take better decisions on city management and to put in place the mechanisms needed to improve response time, safety, and quality of life.
- Distributed data warehouse resource monitoring
  Publication. Martins, Pedro; Sá, Filipe; Caldeira, Filipe; Abbasi, Maryam
  In this paper, we investigate the problem of providing scalability (out and in) to the Extraction, Transformation, Load (ETL) and Querying (Q) (ETL+Q) process of data warehouses. In general, data loading, transformation, and integration are heavy tasks performed only periodically, rather than row by row. Parallel architectures and mechanisms can optimize the ETL process by speeding up each part of the pipeline as more performance is needed. We propose parallelization solutions for each part of the ETL+Q, integrated into a framework: an approach that enables the automatic scalability and freshness of any data warehouse and its ETL+Q process. Our results show that the proposed algorithms can scale to provide the desired processing speed in both big-data and small-data scenarios.
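  The core idea of parallelizing the transform stage of an ETL pipeline can be sketched minimally in Python. This is an illustrative sketch only: the `transform` function, the worker count, and the data shape are assumptions, not the paper's actual framework or partitioning strategy.

  ```python
  from multiprocessing import Pool

  def transform(row):
      # Hypothetical per-row transform: clean/enrich one extracted record.
      return {"id": row["id"], "value": row["value"] * 2}

  def etl(rows, workers=4):
      # Extract is assumed already done; parallelize the Transform stage
      # across worker processes, then hand the result to the Load step
      # (e.g., a bulk insert into the warehouse). pool.map preserves order.
      with Pool(workers) as pool:
          transformed = pool.map(transform, rows)
      return transformed  # stand-in for the Load step

  if __name__ == "__main__":
      data = [{"id": i, "value": i} for i in range(8)]
      print(etl(data))
  ```

  Scaling "out and in" in this spirit amounts to raising or lowering `workers` (or the number of nodes) as the required processing speed changes.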
- Improving bluetooth beacon-based indoor location and fingerprinting
  Publication. Martins, Pedro; Abbasi, Maryam; Sá, Filipe; Cecílio, José; Morgado, Francisco; Caldeira, Filipe
  The complex way radio waves propagate indoors leads to deriving location with fingerprinting techniques, in which location is computed from a map of WiFi signal strengths. Recent Bluetooth Low Energy (BLE) technology provides new opportunities to explore positioning. This work studies how BLE beacon radio signals can be used in indoor location scenarios, as well as their precision. It also introduces a method for beacon-based positioning based on signal-strength measurements at key distances for each beacon. This method allows the use of different beacon types, brands, and location conditions/constraints. Depending on the situation (i.e., hardware and location), the distance-measuring curve can be adapted to minimize errors and support larger distances while keeping good precision. The paper also presents a comparison with a traditional positioning method that uses formulas for distance estimation and position triangulation. The study was performed on the campus of the Viseu Polytechnic Institute and tested, as a proof of concept, with a group of students, each with their own smartphone. Experimental results show that BLE yields an error below 1.5 m approximately 90% of the time, and that the proposed positioning technique has 13.2% better precision than triangulation for distances up to 10 m.
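  The formula-based distance estimation such methods are compared against is commonly the log-distance path-loss model, which maps an RSSI reading to an estimated distance. A minimal sketch follows; the reference power and path-loss exponent are illustrative assumptions, not the paper's calibrated values.

  ```python
  def rssi_to_distance(rssi, tx_power=-59.0, n=2.0):
      """Estimate distance (meters) from a BLE RSSI reading using the
      log-distance path-loss model.

      tx_power: RSSI expected at 1 m from the beacon (assumed calibration
                constant; varies per beacon brand and hardware)
      n:        path-loss exponent (~2.0 in free space, higher indoors)
      """
      return 10 ** ((tx_power - rssi) / (10.0 * n))

  # A reading equal to the 1 m reference power maps back to ~1 m.
  print(round(rssi_to_distance(-59.0), 2))
  ```

  Calibrating `tx_power` and `n` per beacon and per site is what lets a measured-curve approach, like the one described above, outperform a single generic formula.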
- NoSQL Scalability Performance Evaluation over Cassandra
  Publication. Abbasi, Maryam; Sá, Filipe; Albuquerque, Daniel; Wanzeller, Cristina; Caldeira, Filipe; Tomé, P.; Furtado, Pedro; Martins, Pedro
  The implementation of smart-cities is growing all over the world. From big cities to small villages, information that enables better and more efficient urban management is collected from multiple sources (sensors). Such information has to be stored, queried, analyzed, and displayed, aiming to contribute to a better quality of life for citizens and a more sustainable environment. In this context, it is important to choose the right database engine for this scenario. NoSQL databases are now generally accepted by the database community to support application niches; they are known for their scalability, simplicity, and key-indexed data storage, which allow easy data distribution and balancing over several nodes. In this paper, Cassandra, one of the most scalable NoSQL engines and therefore a candidate for our application scenario, is tested. The paper focuses on horizontal scalability: by adding more nodes, it should be possible to respond to more requests with the same or better performance, i.e., more nodes mean reduced execution time. However, adding more computational resources does not always result in better performance. This work assesses how each workload characteristic (e.g., data volume, simultaneous users) influences scalability. An overview of the Cassandra database engine is presented, after which it is tested and evaluated using the Yahoo! Cloud Serving Benchmark (YCSB).
- NoSQL: A Real Use Case
  Publication. Martins, Pedro; Sá, Filipe; Caldeira, Filipe; Abbasi, Maryam
  As the amount of information flowing through businesses increases, companies feel the need to improve their storage systems. To tackle this growing need, paradigms such as NoSQL emerge to meet the requirement of unbounded data growth. However, NoSQL solutions have few field-proven results to support their claims. Benchmarks can test and compare the performance of different solutions by executing queries over a toy dataset (synthetically generated). The problem with benchmarking results is how to extend their conclusions to a real system operating within a real business scenario. In this paper, an actual corporate case study with real-world data is used to evaluate how NoSQL databases perform. Write-intensive tests over big data are implemented and evaluated using Cassandra, MongoDB, and Couchbase, and compared with the relational database in place, which is within its throughput limit. Results show a throughput comparison for each tested approach.
- Performance Comparison of Python-Based Complex Event Processing Engines for IoT Intrusion Detection: Faust Versus Streamz
  Publication. Abbasi, Maryam; Cardoso, Filipe; ANTUNES VAZ, PAULO JOAQUIM; Silva, José; Sá, Filipe; Martins, Pedro
  The proliferation of Internet of Things (IoT) devices has intensified the need for efficient real-time anomaly and intrusion detection, making the selection of an appropriate Complex Event Processing (CEP) engine a critical architectural decision for security-aware data pipelines. Python-based CEP frameworks offer compelling advantages through seamless integration with data science and machine learning ecosystems; however, rigorous comparative evaluations of such frameworks under realistic IoT security workloads remain absent from the literature. This study presents the first systematic comparative evaluation of Faust and Streamz, two Python-native CEP engines representing fundamentally different architectural philosophies, specifically in the context of IoT network intrusion detection. Faust was selected for its actor-based stateful processing model with native Kafka integration and distributed table support, while Streamz was selected for its reactive, lightweight pipeline design targeting high-throughput stateless processing, making them representative of the two dominant paradigms in Python stream processing. Although both engines target different application niches, their performance characteristics under realistic CEP workloads have never been rigorously compared, leaving practitioners without empirical guidance. The primary evaluation employs an IoT network intrusion dataset comprising 583,485 events from 83 heterogeneous devices.
To assess whether the observed performance characteristics are specific to this single dataset or generalize across different workload profiles, a secondary IoT-adjacent benchmark is included: the PaySim financial transaction dataset (6.4 million records), selected because its event schema, fraud-pattern temporal structure, and volume differ substantially from the intrusion dataset, providing a stress test for cross-workload robustness rather than a claim of domain equivalence. A second IoT-specific intrusion dataset (such as TON_IoT or Bot-IoT) would constitute a more directly comparable validation; this is identified as a priority for future work. The load levels used in scalability experiments (up to 5000 events per second) intentionally exceed the dataset's natural rate to stress-test each engine's architectural ceiling and identify saturation thresholds relevant to large-scale or multi-sensor IoT deployments. We conducted controlled experiments with comprehensive statistical analysis. Our results demonstrate that Streamz achieves superior throughput at 4450 events per second with 89% efficiency and minimal resource consumption (40 MB memory, 12 ms median latency), while Faust provides robust intrusion pattern detection with 93–98% accuracy and stable, predictable resource utilization (1.4% CPU standard deviation). A multi-framework comparison including Apache Kafka Streams and offline scikit-learn baselines confirms that Faust achieves detection quality competitive with JVM-based alternatives (Faust: 96.2%; Kafka Streams: 96.8%; absolute difference of 0.6 percentage points, not statistically significant at p = 0.318) while retaining the Python ecosystem advantages. Statistical analysis confirms significant performance differences across all metrics (p < 0.001, Cohen's d > 0.8).
Critical scalability thresholds are identified: Streamz maintains efficiency above 95% up to 3500 events per second, while Faust degrades beyond 2500 events per second. These findings provide IoT security engineers and system architects with actionable, empirically grounded guidance for CEP engine selection, establish a reproducible benchmarking methodology applicable to future Python-based stream processing evaluations, and advance theoretical understanding of the accuracy–throughput trade-off in stateful versus stateless Python CEP architectures.
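  Effect sizes like the Cohen's d threshold reported above are computed from per-run samples of each engine's metric. A minimal sketch with a pooled standard deviation follows; the sample values are made-up illustrations, not the paper's measurements.

  ```python
  from statistics import mean, stdev

  def cohens_d(a, b):
      """Cohen's d effect size between two samples, using the pooled
      standard deviation (standard two-sample formulation)."""
      na, nb = len(a), len(b)
      pooled = (((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                / (na + nb - 2)) ** 0.5
      return (mean(a) - mean(b)) / pooled

  # Hypothetical per-run throughput samples (events/s) for two engines.
  engine_a = [4400, 4450, 4500, 4420, 4480]
  engine_b = [2400, 2500, 2550, 2450, 2480]
  print(round(cohens_d(engine_a, engine_b), 1))
  ```

  Values of d above 0.8 are conventionally read as a large effect, which is the threshold the comparison above uses.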
- Torrent Poisoning Protection with a Reverse Proxy Server
  Publication. Godinho, António Augusto Nunes; Rosado, José; Sá, Filipe; Caldeira, Filipe; Cardoso, Filipe Gonçalves
  A Distributed Denial-of-Service (DDoS) attack uses multiple sources operating in concert to attack a network or site. A typical DDoS flood attack on a website targets its web server with many valid requests, exhausting the server's resources. The participants in such an attack are usually compromised/infected computers controlled by the attackers. There are several variations of this kind of attack, and torrent index poisoning is one of them: a DDoS attack using torrent poisoning, more specifically index poisoning, is among the most effective and disruptive types of attack. These web-flooding attacks originate in BitTorrent-based file-sharing communities, where participants using BitTorrent applications cannot detect their involvement, and antivirus and other tools cannot detect the altered torrent file, so the BitTorrent clients end up targeting the web server. A reverse proxy server can block such requests from reaching the web server, reducing the severity and impact of the DDoS on the service. In this paper, we analyze a torrent index poisoning DDoS against a higher education institution, its impact on network systems and servers, and the mitigation measures implemented.
- Unified Data Governance in Heterogeneous Database Environments: An API-Driven Architecture for Multi-Platform Policy Enforcement
  Publication. Abbasi, Maryam; ANTUNES VAZ, PAULO JOAQUIM; Silva, José; Cardoso, Filipe; Sá, Filipe; Martins, Pedro
  Modern organizations increasingly rely on heterogeneous database environments that combine relational, document-oriented, and key-value storage systems to optimize performance for diverse application requirements. However, this technological diversity creates significant challenges for implementing consistent data governance policies, regulatory compliance, and access control across disparate systems. Traditional governance approaches that operate within individual database silos fail to provide unified policy enforcement and create compliance gaps that expose organizations to regulatory and operational risks. This paper presents a novel API-driven architecture that enables unified data governance across heterogeneous database environments without requiring database-specific modifications or vendor lock-in. The proposed framework implements a centralized governance layer that coordinates policy enforcement across PostgreSQL, MongoDB, and Amazon DynamoDB systems through RESTful API interfaces. Key architectural components include differentiated access control through hierarchical API key management, automated compliance workflows for regulatory requirements such as GDPR, real-time audit trail generation, and comprehensive data quality monitoring with automated improvement mechanisms. Comprehensive experimental evaluation demonstrates the framework's effectiveness across multiple operational dimensions. The system achieved 95.2% accuracy in access control enforcement across different data classification levels, while automated GDPR compliance workflows demonstrated 98.6% success rates with average processing times of 2.9 h.
Performance evaluation reveals acceptable overhead characteristics with linear scaling patterns for PostgreSQL operations (R² = 0.89), consistent sub-20 ms response times for MongoDB logging operations, and sustained throughput rates ranging from 38.9 to 142.7 requests per second across the integrated system. Data quality improvements ranged from 16.1% to 34.3% across accuracy, completeness, consistency, and timeliness dimensions over a 12-week monitoring period, with accuracy improving by 17.8 percentage points, completeness by 13.2 percentage points, consistency by 19.7 percentage points, and timeliness by 24.5 percentage points. The duplicate detection system achieved 94.6% precision and 95.6% recall across various duplicate types, including cross-database redundancy identification. The results demonstrate that API-driven governance architectures can effectively address the persistent challenges of policy fragmentation in multi-database environments while maintaining operational performance and enabling measurable improvements in data quality and regulatory compliance. The framework provides a practical migration path for organizations seeking to implement comprehensive governance capabilities without replacing existing database infrastructure investments.
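  The precision and recall figures quoted for duplicate detection follow the standard definitions over true positives, false positives, and false negatives. A quick sketch with hypothetical counts (chosen only to land near the reported 94.6% / 95.6%, not taken from the paper):

  ```python
  def precision_recall(tp, fp, fn):
      """Standard precision and recall from detection counts.

      tp: duplicates correctly flagged
      fp: non-duplicates incorrectly flagged
      fn: duplicates missed by the detector
      """
      precision = tp / (tp + fp)
      recall = tp / (tp + fn)
      return precision, recall

  # Hypothetical counts for illustration.
  p, r = precision_recall(tp=946, fp=54, fn=44)
  print(round(p, 3), round(r, 3))
  ```

  Cross-database redundancy identification changes only how candidate pairs are generated; the quality metrics themselves are computed the same way.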
