As distributed and multi-organization Machine Learning emerges, new challenges must be solved, such as diverse and low-quality data or real-time delivery. In this paper, we use a distributed learning environment to analyze the relationship between block size, parallelism, and predictor quality. Specifically, the goal is to find the optimum block size and the best heuristic to create distributed Ensembles. We ev...
Distributed Machine Learning, in which data and learning tasks are scattered across a cluster of computers, is one of the field's answers to the challenges posed by Big Data. Yet even in an era in which data abounds, decisions must still be made regarding which specific data to use for training the model, either because the amount of available data is simply too large, or because the training time or c...
In recent years, the number of machine learning algorithms and their parameters has increased significantly. On the one hand, this increases the chances of finding better models. On the other hand, it makes training a model more complex, as the search space expands accordingly. As the size of datasets also grows, traditional approaches based on extensive search start to become prohibiti...
Given the new requirements of Machine Learning problems in recent years, especially regarding the volume, diversity and speed of data, new approaches are needed to deal with the associated challenges. In this paper we describe CEDEs, a distributed learning system that runs on top of a Hadoop cluster and takes advantage of blocks, replication and balancing. CEDEs trains models in a distributed manner ...
Artificial intelligence and machine learning have been widely applied in several areas with the twofold goal of improving people’s well-being and accelerating computational processes. Applications range from medical assistance (e.g., automatic verification of MRI images) and personal assistants that adapt content to the user's preferences, to optimizing query response times in relational databas...
Machine Learning has been evolving rapidly over the past few years, with new algorithms and approaches being devised to solve the challenges posed by the new properties of data. Specifically, algorithms must now learn continuously and in real time, from very large and possibly distributed sets of data. In this paper we describe a learning system that tackles some of these novel challenges. It learns and adapts in ...
Machine Learning has emerged in recent years as the main solution to many of today's data-based decision problems. However, while new and more powerful algorithms and the increasing availability of computational resources have contributed to the widespread use of Machine Learning, significant challenges remain. Two of the most pressing are the need to explain a model's predictions, and the signif...
Traditional explicit authentication mechanisms, in which the device remains unlocked after some kind of password is entered, are slowly being complemented by so-called implicit or continuous authentication mechanisms. In the latter, the user is constantly monitored in one or more ways, in search of signs of unauthorized access, which may happen if a third party has access to the phone after it h...
The European Union has been making efforts to increase energy efficiency within its member states, in line with most of the industrialized countries. In these efforts, the energy consumed by public lighting networks is a key target as it represents approximately 50% of the electricity consumption of European cities. In this paper we propose an approach for the autonomous management of public lighting networks i...
Machine Learning is a field in which significant steps forward have been taken in recent years, resulting in a wide variety of available algorithms for many different problems. Nonetheless, most of these algorithms focus on the training of static models, in the sense that the model stops evolving after the training phase. This is increasingly becoming a limitation, especially in an era in which datasets are ...