Dissertations / Theses on the topic 'Machine Learning as a Service'

To see the other types of publications on this topic, follow the link: Machine Learning as a Service.

Create a spot-on reference in APA, MLA, Chicago, Harvard, and other styles


Consult the top 50 dissertations / theses for your research on the topic 'Machine Learning as a Service.'

Next to every source in the list of references, there is an 'Add to bibliography' button. Click it, and we will automatically generate the bibliographic reference to the chosen work in the citation style you need: APA, MLA, Harvard, Chicago, Vancouver, etc.

You can also download the full text of the academic publication as a PDF and read its abstract online, whenever these are available in the metadata.

Browse dissertations / theses in a wide variety of disciplines and organise your bibliography correctly.

1

Hesamifard, Ehsan. "Privacy Preserving Machine Learning as a Service." Thesis, University of North Texas, 2020. https://digital.library.unt.edu/ark:/67531/metadc1703277/.

Full text
Abstract:
Machine learning algorithms based on neural networks have achieved remarkable results and are being extensively used in different domains. However, these algorithms require access to raw data, which is often privacy sensitive. To address this issue, we develop new techniques for running deep neural networks over encrypted data. In this work, we adapt deep neural networks to the practical limitations of current homomorphic encryption schemes. We focus on the training and classification of well-known neural networks and convolutional neural networks (CNNs). First, we design methods for approximating the activation functions commonly used in CNNs (i.e., ReLU, Sigmoid, and Tanh) with low-degree polynomials, which is essential for efficient homomorphic encryption schemes. Then, we train neural networks with the approximating polynomials instead of the original activation functions and analyze the performance of the models. Finally, we implement neural networks and convolutional neural networks over encrypted data and measure the performance of the models.
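The activation-substitution step this abstract describes can be illustrated with a minimal sketch: fitting a low-degree polynomial to the Sigmoid activation by least squares. The degree (3) and the fitting interval ([-6, 6]) below are illustrative assumptions, not the thesis' actual choices.

```python
# Hedged sketch of approximating Sigmoid with a low-degree polynomial,
# the kind of substitution used to make a network evaluable under
# homomorphic encryption. Degree and interval are assumptions.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

xs = np.linspace(-6.0, 6.0, 1000)
coeffs = np.polyfit(xs, sigmoid(xs), deg=3)   # least-squares cubic fit
poly = np.poly1d(coeffs)

# Maximum approximation error on the fitting interval.
max_err = float(np.max(np.abs(poly(xs) - sigmoid(xs))))
```

A network trained with `poly` in place of Sigmoid can then, in principle, be evaluated over encrypted inputs, since only additions and multiplications remain.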
APA, Harvard, Vancouver, ISO, and other styles
2

Altalabani, Osama. "An automatic machine-learning framework for testing service-oriented architecture." Thesis, Kingston University, 2014. http://eprints.kingston.ac.uk/32198/.

Full text
Abstract:
Today, Service Oriented Architecture (SOA) systems such as web services have the advantage of offering defined protocol and standard requirement specifications by means of a formal contract between the service requestor and the service provider, for example, WSDL (Web Services Description Language), BPEL (Business Process Execution Language), and BPMN (Business Process Model and Notation). This gives a high degree of flexibility to design, development, and Information Technology (IT) infrastructure implementation, and promises a world where computing resources work transparently and efficiently. Furthermore, the rich interface standards and specifications of SOA web services (collectively referred to as the WS-* Architecture) enable service providers and consumers to solve important problems, as these interfaces enable the development of interoperable computing environments that incorporate end-to-end security, reliability, and transaction support, thus promoting existing IT infrastructure investments. However, many of the benefits of SOA become challenges for testing approaches and frameworks due to their specific design and implementation characteristics, which cause many testability problems. A number of testing approaches and frameworks have therefore been proposed in the literature to address various aspects of SOA testability. However, most of these approaches and frameworks are based on intuition and are not carried out in a systematic manner grounded in the standards and specifications of SOA. Generally, they lack sophisticated and automated testing that provides data mining and knowledge discovery in accordance with SOA-based system requirements, which would consequently provide better testability, deeper intelligence, and prudence.
Thus, this thesis proposes an automated and systematic testing framework based on user requirements, both functional and non-functional, with support of machine-learning techniques for intelligent reliability, real-time monitoring, SOA protocols and standard requirements coverage analysis to improve the testability of SOA-based systems. This thesis addresses the development, implementation, and evaluation of the proposed framework, by means of a proof-of-concept prototype for testing SOA systems based on the web services protocol stack specifications. The framework extends to intelligent analysis of SOA web service specifications and the generation of test cases based on static test analysis using machine-learning support.
APA, Harvard, Vancouver, ISO, and other styles
3

MUSCI, MARIA ANGELA. "Service robotics and machine learning for close-range remote sensing." Doctoral thesis, Politecnico di Torino, 2021. http://hdl.handle.net/11583/2903488.

Full text
APA, Harvard, Vancouver, ISO, and other styles
4

MAZZIA, VITTORIO. "Machine Learning Algorithms and their Embedded Implementation for Service Robotics Applications." Doctoral thesis, Politecnico di Torino, 2022. http://hdl.handle.net/11583/2968456.

Full text
APA, Harvard, Vancouver, ISO, and other styles
5

Ashfaq, Awais. "Predicting clinical outcomes via machine learning on electronic health records." Licentiate thesis, Högskolan i Halmstad, CAISR Centrum för tillämpade intelligenta system (IS-lab), 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:hh:diva-39309.

Full text
Abstract:
The rising complexity in healthcare, exacerbated by an ageing population, results in ineffective decision-making, detrimental effects on care quality, and escalating care costs. Consequently, there is a need for smart decision support systems that can empower clinicians to make better-informed care decisions: decisions that are not only based on general clinical knowledge and personal experience, but also rest on personalised and precise insights about future patient outcomes. A promising approach is to leverage the ongoing digitization of healthcare, which generates unprecedented amounts of clinical data stored in Electronic Health Records (EHRs), and couple it with the modern Machine Learning (ML) toolset for clinical decision support, simultaneously expanding the evidence base of medicine. As promising as it sounds, assimilating complete clinical data that provides a rich perspective of the patient's health state comes with a multitude of data-science challenges that impede efficient learning of ML models. This thesis primarily focuses on learning comprehensive patient representations from EHRs. The key challenges of heterogeneity and temporality in EHR data are addressed using human-derived features appended to contextual embeddings of clinical concepts, and Long Short-Term Memory networks, respectively. The developed models are empirically evaluated in the context of predicting adverse clinical outcomes such as mortality or hospital readmissions. We also present evidence that, surprisingly, different ML models primarily designed for non-EHR analysis (like language processing and time-series prediction) can be combined and adapted into a single framework to efficiently represent EHR data and predict patient outcomes.
APA, Harvard, Vancouver, ISO, and other styles
6

Dhekne, Rucha P. "Machine Learning Techniques to Provide Quality of Service in Cognitive Radio Technology." University of Cincinnati / OhioLINK, 2009. http://rave.ohiolink.edu/etdc/view?acc_num=ucin1258579803.

Full text
APA, Harvard, Vancouver, ISO, and other styles
7

Blank, Clas, and Tomas Hermansson. "A Machine Learning approach to churn prediction in a subscription-based service." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-240397.

Full text
Abstract:
In today's world, subscription-based online services are becoming increasingly popular. One of the keys to success in a subscription-based business model is to minimize churn, i.e. customers canceling their subscriptions. Due to the digitalization of the world, data is easier to collect than ever before. At the same time, machine learning is growing and becoming more accessible, which opens up new possibilities for solving problems. This paper tests and evaluates a machine learning approach to churn prediction, based on user data from a company with an online subscription service that lets the user attend live shows for a fixed monthly price. To perform the tests, different machine learning models were used, both individually and combined: Random Forests, Support Vector Machines, Logistic Regression, and Neural Networks. To train them, a data set containing either active or churned users was provided. The models returned accuracy results ranging from 73.7% to 76.7% when classifying churners based on their activity data. Furthermore, the models turned out to have higher scores for precision and recall when classifying churners than non-churners. In addition, the features that had the most impact on the classification were Tickets Used and Length of Subscription. Finally, this paper discusses how churn prediction can be used from a business perspective.
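As a minimal sketch of the kind of churn classifier this abstract evaluates, here is a plain logistic regression on two synthetic features mirroring the thesis' most influential ones ("Tickets Used" and "Length of Subscription"). The data, the churn rule, and the hyperparameters are all invented for illustration.

```python
# Hedged sketch: logistic-regression churn classifier on invented data.
# Ground-truth rule (few tickets + short subscription -> churn) is an assumption.
import numpy as np

rng = np.random.default_rng(0)
n = 400
tickets = rng.poisson(5, n)           # tickets used per customer
months = rng.integers(1, 36, n)       # subscription length in months
churn = (tickets + 0.2 * months + rng.normal(0, 1.5, n) < 6).astype(float)

X = np.column_stack([np.ones(n), tickets, months])   # bias + two features
w = np.zeros(3)
for _ in range(2000):                 # plain batch gradient descent
    p = 1.0 / (1.0 + np.exp(-X @ w))
    w -= 0.01 * (X.T @ (p - churn)) / n

pred = (1.0 / (1.0 + np.exp(-X @ w)) > 0.5).astype(float)
accuracy = float((pred == churn).mean())
```

On this toy data the learned weight on tickets comes out negative, matching the intuition that active customers churn less.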
APA, Harvard, Vancouver, ISO, and other styles
8

Hill, Jerry L., and Randall P. Mora. "An Autonomous Machine Learning Approach for Global Terrorist Recognition." International Foundation for Telemetering, 2012. http://hdl.handle.net/10150/581675.

Full text
Abstract:
ITC/USA 2012 Conference Proceedings / The Forty-Eighth Annual International Telemetering Conference and Technical Exhibition / October 22-25, 2012 / Town and Country Resort & Convention Center, San Diego, California
A major intelligence challenge we face in today's national security environment is the threat of terrorist attack against our national assets, especially our citizens. This paper addresses global reconnaissance, incorporating an autonomous Intelligent Agent / Data Fusion solution that recognizes potential risk of terrorist attack by identifying and reporting imminent persona-oriented terrorist threats, based on data reduction/compression of a large volume of low-latency data, possibly from hundreds or even thousands of data points.
APA, Harvard, Vancouver, ISO, and other styles
9

Darborg, Alex. "Real-time face recognition using one-shot learning : A deep learning and machine learning project." Thesis, Mittuniversitetet, Institutionen för informationssystem och –teknologi, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:miun:diva-40069.

Full text
Abstract:
Face recognition is often described as the process of identifying and verifying people in a photograph by their face. Researchers have recently given this field increased attention, continuously improving the underlying models. The objective of this study is to implement a real-time face recognition system using one-shot learning. “One shot” means learning from one or a few training samples. This paper evaluates different methods to solve this problem. Convolutional neural networks are known to require large datasets to reach an acceptable accuracy. This project proposes a method to solve this problem by reducing the number of training instances to one while still achieving an accuracy close to 100%, utilizing the concept of transfer learning.
APA, Harvard, Vancouver, ISO, and other styles
10

Tataru, Augustin. "Metrics for Evaluating Machine Learning Cloud Services." Thesis, Tekniska Högskolan, Högskolan i Jönköping, JTH, Datateknik och informatik, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:hj:diva-37882.

Full text
Abstract:
Machine Learning (ML) is nowadays being offered as a service by several cloud providers. Consumers require metrics to be able to evaluate and compare multiple ML cloud services. There aren't many established metrics that can be used specifically for these types of services. In this paper, the Goal-Question-Metric paradigm is used to define a set of metrics applicable to ML cloud services. The metrics are created based on goals expressed by professionals who use or are interested in using these services. Finally, a questionnaire is used to evaluate the metrics based on two criteria: relevance and ease of use.
APA, Harvard, Vancouver, ISO, and other styles
11

Alsterman, Marcus, and Maximilian Karlström. "Evaluation of Machine Learning Methods for Predicting Client Metrics for a Telecom Service." Thesis, KTH, Skolan för teknikvetenskap (SCI), 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-214733.

Full text
Abstract:
A video streaming service faces several difficulties operating. Hardware is expensive, and it is crucial to prioritize customers in a way that will make them content with the service provided; that is, deliver a sufficient frame rate and never allocate too many resources, essentially wasting them, on a client. This allocation has to be done several times per second, so reading data from the client is out of the question, because the system would adapt too slowly. This raises the question whether it is possible to predict the frame rate of a client using only variables measured on the server, and whether it can be done efficiently; it can [1]. To further build on the work of Yanggratoke et al. [1], we evaluated several different machine learning methods on a data set in terms of performance, training time, and dependence on the size of the data set. Neural networks, having the best adapting capabilities, resulted in the best performance, but training is more time-consuming than for the linear model. Using neural networks is a good idea when the relationship between input and output is not linear.
APA, Harvard, Vancouver, ISO, and other styles
12

Karg, Philipp. "Evaluation and Implementation of Machine Learning Methods for an Optimized Web Service Selection in a Future Service Market." Thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2014. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-38096.

Full text
Abstract:
In future service markets a selection of functionally equal services is omnipresent. The evolving challenge, finding the best-fit service, requires a distinction between the non-functional service characteristics (e.g., response time, price, availability). Service providers commonly capture those quality characteristics in so-called Service Level Agreements (SLAs). However, a service selection based on SLAs is inadequate, because static SLAs generally do not consider the dynamic service behaviors and quality changes in a service-oriented environment. Furthermore, profit-oriented service providers tend to embellish their SLAs by flexibly handling their correctness. Within the SOC (Service Oriented Computing) research project of the Karlsruhe University of Applied Sciences and the Linnaeus University of Sweden, a service broker framework for an optimized web service selection is introduced. Instead of relying on the providers' quality assertions, distributed knowledge is developed by automatically monitoring and measuring the service quality during each service consumption. The broker aims at optimizing the service selection based on past real service performances and the defined quality preferences of a unique consumer. This thesis work concerns the design, implementation, and evaluation of appropriate machine learning methods with focus on the broker's best-fit web service selection. Within the time-critical service optimization, the performance and scalability of the broker's machine learning play an important role. Therefore, high-performance algorithms for predicting the future non-functional service characteristics within a continuous machine learning process were implemented. The introduced foreground-/background-model makes it possible to separate the real-time request for a best-fit service selection from the time-consuming machine learning.
The best-fit services for certain consumer call contexts (e.g., call location and time, quality preferences) are continuously pre-determined within the asynchronous background-model. Through this, any performance issues within the critical path from the service request up to the best-fit service recommendation are eliminated. For evaluating the implemented best-fit service selection, a sophisticated test data scenario with real-world characteristics was created, containing services with different volatile performances, cyclic performance behaviors, and performance changes over the course of time. Besides the significantly improved performance, the new implementation achieved an overall high selection accuracy: the actual best-fit service was determined in 70% of all service optimizations, and one of the two actual best-fit services in 94%.
APA, Harvard, Vancouver, ISO, and other styles
13

Wallin, Jonatan. "Optimization and personalization of a web service based on temporal information." Thesis, Umeå universitet, Institutionen för fysik, 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-149712.

Full text
Abstract:
Development in information and communication technology has increased the attention given to personalization in the 21st century, and the benefits to both marketers and customers are claimed to be many. The need to efficiently deliver personalized content in different web applications has increased interest in the field of machine learning. In this thesis project, the aim is to develop a decision model that autonomously optimizes a commercial web service to increase the click-through rate. The model should be based on previously collected data about usage of the web service. Different requirements for efficiency and storage must be fulfilled while the model produces valuable results. An algorithm for a binary decision tree is presented in this report. The evolution of the binary tree is controlled by an entropy-minimizing heuristic together with three specified stopping criteria. Tests on both synthetic and real data sets were performed to evaluate the accuracy and efficiency of the algorithm. The results showed that the running time is dominated by different parameters depending on the sizes of the test sets. The model is capable of capturing inherent patterns in the available data.
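The entropy-minimizing split heuristic this abstract mentions can be sketched in a few lines: for one numeric feature, pick the threshold that minimizes the weighted entropy of the two child nodes. The feature ("hour of day") and the toy data are invented for illustration.

```python
# Hedged sketch of an entropy-minimizing binary split; data is invented.
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    total = len(labels)
    ent = 0.0
    for c in set(labels):
        p = labels.count(c) / total
        ent -= p * math.log2(p)
    return ent

def best_threshold(values, labels):
    """Threshold on one numeric feature minimizing weighted child entropy."""
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        if not left or not right:
            continue
        w = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if w < best[1]:
            best = (t, w)
    return best

# Toy example: hour of day vs. whether a user clicked.
hours = [1, 2, 3, 9, 10, 11]
clicked = [0, 0, 0, 1, 1, 1]
threshold, impurity = best_threshold(hours, clicked)
```

Growing a tree then amounts to applying this split recursively until the stopping criteria fire.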
APA, Harvard, Vancouver, ISO, and other styles
14

Tang, Chen. "Forecasting Service Metrics for Network Services." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-284505.

Full text
Abstract:
As the size and complexity of the internet have increased dramatically in recent years, the burden of network service management has also become heavier, and the need for an intelligent way of performing data analysis and forecasting has become urgent. The wide adoption of machine learning and data analysis methods provides a new way to analyze large amounts of data. In this project, I study and evaluate data forecasting methods using machine learning techniques and time series analysis methods on data collected from the KTH testbed. Comparing the methods with respect to accuracy and computing overhead, I propose the best method for data forecasting in different scenarios. The results show that machine learning techniques using regression achieve better performance, with higher accuracy and smaller computing overhead. Time series analysis methods have relatively lower accuracy, and their computing overhead is much higher than that of the machine learning techniques on the datasets evaluated in this project.
APA, Harvard, Vancouver, ISO, and other styles
15

Drolia, Utsav. "Adaptive Distributed Caching for Scalable Machine Learning Services." Research Showcase @ CMU, 2017. http://repository.cmu.edu/dissertations/1004.

Full text
Abstract:
Applications for Internet-enabled devices use machine learning to process captured data to make intelligent decisions or provide information to users. Typically, the computation to process the data is executed in cloud-based backends. The devices are used for sensing data, offloading it to the cloud, receiving responses and acting upon them. However, this approach leads to high end-to-end latency due to communication over the Internet. This dissertation proposes reducing this response time by minimizing offloading, and pushing computation close to the source of the data, i.e. to edge servers and devices themselves. To adapt to the resource-constrained environment at the edge, it presents an approach that leverages spatiotemporal locality to push subparts of the model to the edge. This approach is embodied in a distributed caching framework, Cachier. Cachier is built upon a novel caching model for recognition, and is distributed across edge servers and devices. The analytical caching model for recognition provides a formulation for expected latency for recognition requests in Cachier. The formulation incorporates the effects of compute time and accuracy. It also incorporates network conditions, thus providing a method to compute expected response times under various conditions. This is utilized as a cost function by Cachier, at edge servers and devices. By analyzing requests at the edge server, Cachier caches relevant parts of the trained model at edge servers, which are used to respond to requests, minimizing the number of requests that go to the cloud. Then, Cachier uses context-aware prediction to prefetch parts of the trained model onto devices. The requests can then be processed on the devices, thus minimizing the number of offloaded requests. Finally, Cachier enables cooperation between nearby devices to allow exchanging prefetched data, reducing the dependence on remote servers even further.
The efficacy of Cachier is evaluated by using it with an art recognition application. The application is driven using real world traces gathered at museums. By conducting a large-scale study with different control variables, we show that Cachier can lower latency, increase scalability and decrease infrastructure resource usage, while maintaining high accuracy.
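The expected-latency cost function this abstract describes can be caricatured as a two-term expectation over cache hits and misses. This is a hedged sketch, not Cachier's actual formulation, and every timing figure below is invented.

```python
# Hedged sketch of an expected-latency cost for an edge recognition cache:
# a hit is served at the edge, a miss pays the extra round trip to the cloud.
def expected_latency(hit_rate, edge_compute_ms, cloud_compute_ms, network_rtt_ms):
    hit_cost = edge_compute_ms
    miss_cost = edge_compute_ms + network_rtt_ms + cloud_compute_ms
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost

# Caching more of the model at the edge raises the hit rate and lowers latency.
low = expected_latency(0.9, 30.0, 50.0, 120.0)
high = expected_latency(0.2, 30.0, 50.0, 120.0)
```

A framework like the one described can minimize such a cost by choosing which parts of the model to cache, since the hit rate is the only term it controls.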
APA, Harvard, Vancouver, ISO, and other styles
16

Essaidi, Moez. "Model-Driven Data Warehouse and its Automation Using Machine Learning Techniques." Paris 13, 2013. http://scbd-sto.univ-paris13.fr/secure/edgalilee_th_2013_essaidi.pdf.

Full text
Abstract:
This thesis aims at proposing an end-to-end approach that allows the automation of the process of model transformations for the development of data warehousing components. The main idea is to reduce as much as possible the intervention of human experts by reusing the traces of transformations produced on similar projects. The goal is to use supervised learning techniques to handle concept definitions at the same expressive level as the manipulated data. The nature of the manipulated data leads us to choose relational languages for the description of examples and hypotheses. These languages have the advantage of being expressive, giving the possibility to express relationships between the manipulated objects, but they have the major disadvantage of lacking algorithms that scale to industrial applications. To solve this problem, we have proposed an architecture that makes the best use of the knowledge obtained from transformation invariants between models and metamodels. This way of proceeding has highlighted dependencies between the concepts to learn and has led us to propose a learning paradigm called dependent-concept learning. Finally, this thesis presents various aspects that may influence the next generation of data warehousing platforms. In particular, it proposes a deployment architecture for business intelligence as a service based on the most recent and most promising industrial standards and technologies.
APA, Harvard, Vancouver, ISO, and other styles
17

Giommi, Luca. "Prototype of machine learning “as a service” for CMS physics in signal vs background discrimination." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2018. http://amslaurea.unibo.it/15803/.

Full text
Abstract:
Big volumes of data are collected and analysed by LHC experiments at CERN. The success of these scientific challenges is ensured by a great amount of computing power and storage capacity, operated over high performance networks, in very complex LHC computing models on the LHC Computing Grid infrastructure. Now in Run-2 data taking, LHC has an ambitious and broad experimental programme for the coming decades: it includes large investments in detector hardware, and similarly it requires commensurate investment in R&D in software and computing to acquire, manage, process, and analyse the sheer amounts of data to be recorded in the High-Luminosity LHC (HL-LHC) era. The new rise of Artificial Intelligence - related to the current Big Data era, to technological progress and to a bump in resources democratization and efficient allocation at affordable costs through cloud solutions - is posing new challenges but also offering extremely promising techniques, not only for the commercial world but also for scientific enterprises such as HEP experiments. Machine Learning and Deep Learning are rapidly evolving approaches to characterising and describing data, with the potential to radically change how data is reduced and analysed, also at LHC. This thesis aims at contributing to the construction of a Machine Learning "as a service" solution for CMS Physics needs, namely an end-to-end data service to serve trained Machine Learning models to the CMS software framework. Toward this ambitious goal, this thesis work contributes firstly with a proof of concept of a first prototype of such an infrastructure, and secondly with a specific physics use case: Signal versus Background discrimination in the study of CMS all-hadronic top quark decays, done with scalable Machine Learning techniques.
APA, Harvard, Vancouver, ISO, and other styles
18

Osman, Yasin, and Benjamin Ghaffari. "Customer churn prediction using machine learning : A study in the B2B subscription based service context." Thesis, Blekinge Tekniska Högskola, 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:bth-21872.

Full text
Abstract:
The rapid growth of technological infrastructure has changed the way companies do business. Subscription-based services are one of the outcomes of the ongoing digitalization, and with more and more products and services to choose from, customer churn has become a major problem and a threat to all firms. We propose a machine learning based churn prediction model for a subscription-based service provider within the domain of financial administration in the business-to-business (B2B) context. The aim of our study is to contribute knowledge within the field of churn prediction. For the proposed model, we compare two ensemble learners, XGBoost and Random Forest, with a single base learner, Naïve Bayes. The study follows the guidelines of the design science methodology, where we used the machine learning process to iteratively build and evaluate the generated model, using the metrics accuracy, precision, recall, and F1-score. The data has been collected from a subscription-based service provider within the financial administration sector. Since the dataset is imbalanced, with a majority of non-churners, we evaluated three different sampling methods, namely SMOTE, SMOTEENN and RandomUnderSampler, in order to balance the dataset. From the results of our study, we conclude that machine learning is a useful approach for the prediction of customer churn. In addition, our results show that ensemble learners perform better than single base learners and that a balanced training dataset is expected to improve the performance of the classifiers.
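Of the three balancing strategies this abstract compares, random undersampling is the simplest to sketch: drop majority-class samples at random until the classes are equal. The implementation and the 90/10 toy split below are illustrative, not the thesis' code.

```python
# Hedged sketch of random undersampling for an imbalanced churn dataset.
import random

def undersample(samples, labels, seed=0):
    """Keep an equal random subset of each class (majority is trimmed)."""
    rng = random.Random(seed)
    by_class = {}
    for s, y in zip(samples, labels):
        by_class.setdefault(y, []).append(s)
    n_min = min(len(group) for group in by_class.values())
    out_s, out_y = [], []
    for y, group in by_class.items():
        for s in rng.sample(group, n_min):   # random subset without replacement
            out_s.append(s)
            out_y.append(y)
    return out_s, out_y

# 90 non-churners vs. 10 churners -> 10 of each after undersampling.
X = list(range(100))
y = [0] * 90 + [1] * 10
Xb, yb = undersample(X, y)
```

SMOTE, by contrast, balances in the other direction, synthesizing new minority-class samples instead of discarding majority ones.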
APA, Harvard, Vancouver, ISO, and other styles
19

Jiang, Zuoying. "Predicting Service Metrics from Device Statistics in a Container-Based Environment." Thesis, KTH, Kommunikationsnät, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-175889.

Full text
Abstract:
Service assurance is critical for high-demand services running on telecom clouds, but service performance metrics are not always available in real time to telecom operators or service providers. Service performance prediction therefore becomes an important building block for such a system; however, it is generally hard to achieve. In this master thesis, we propose a machine-learning based method that enables performance prediction for services running in virtualized environments with Docker containers. This method is service agnostic, and the prediction models built by this method use only device statistics collected from the server machine and from the containers hosted on it to predict the values of the service-level metrics experienced on the client side. The evaluation results from the testbed, which runs a Video-on-Demand service using containerized servers, show that such a method can accurately predict different service-level metrics under various scenarios and that, by applying suitable preprocessing techniques, the performance of the prediction models can be further improved. In this thesis, we also present the design of a proof-of-concept Real-Time Analytics Engine that uses online learning methods to predict the service-level metrics in real time in a container-based environment.
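The core idea above is a regression from device statistics to client-side service metrics. As a much-simplified sketch of that idea (one feature, ordinary least squares; the thesis uses many kernel statistics and richer models, and all numbers here are hypothetical):

```python
def fit_ols(x, y):
    """Ordinary least squares for one device statistic x (e.g. CPU load)
    against one service-level metric y. Returns (intercept, slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return my - slope * mx, slope

def predict(model, xi):
    """Predict the service metric for a new device-statistic sample."""
    b0, b1 = model
    return b0 + b1 * xi

# Hypothetical samples: CPU utilisation (%) vs. client-side response time (ms)
cpu = [10, 20, 30, 40, 50]
rt  = [21, 41, 61, 81, 101]          # exactly rt = 1 + 2 * cpu here
model = fit_ols(cpu, rt)
```

A real deployment would feed hundreds of kernel and container counters into a multivariate or tree-based regressor, but the train-then-predict structure is the same.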
APA, Harvard, Vancouver, ISO, and other styles
20

Modi, Navikkumar. "Machine Learning and Statistical Decision Making for Green Radio." Thesis, CentraleSupélec, 2017. http://www.theses.fr/2017SUPL0002/document.

Full text
Abstract:
Cette thèse étudie les techniques de gestion intelligente du spectre et de topologie des réseaux via une approche radio intelligente dans le but d’améliorer leur capacité, leur qualité de service (QoS – Quality of Service) et leur consommation énergétique. Les techniques d’apprentissage par renforcement y sont utilisées dans le but d’améliorer les performances d’un système radio intelligent. Dans ce manuscrit, nous traitons du problème d’accès opportuniste au spectre dans le cas de réseaux intelligents sans infrastructure. Nous nous plaçons dans le cas où aucune information n’est échangée entre les utilisateurs secondaires (pour éviter les surcoûts en transmissions). Ce problème particulier est modélisé par une approche dite de bandits manchots « restless » markoviens multi-utilisateurs (multi-user restless Markov MAB – multi-armed bandit). La contribution principale de cette thèse propose une stratégie d’apprentissage multi-joueurs qui prend en compte non seulement le critère de disponibilité des canaux (comme déjà étudié dans la littérature et une thèse précédente au laboratoire), mais aussi une métrique de qualité, comme par exemple le niveau d’interférence mesuré (sensing) dans un canal (perturbations issues des canaux adjacents ou de signaux distants). Nous prouvons que notre stratégie, RQoS-UCB distribuée (distributed restless QoS-UCB – Upper Confidence Bound), est quasi optimale car on obtient des performances au moins d’ordre logarithmique sur son regret. En outre, nous montrons par des simulations que les performances du système intelligent proposé sont améliorées significativement par l’utilisation de la solution d’apprentissage proposée permettant à l’utilisateur secondaire d’identifier plus efficacement les ressources fréquentielles les plus disponibles et de meilleure qualité.
Cette thèse propose également un nouveau modèle d’apprentissage par renforcement combiné à un transfert de connaissance afin d’améliorer l’efficacité énergétique (EE) des réseaux cellulaires hétérogènes. Nous formulons et résolvons un problème de maximisation de l’EE pour le cas de stations de base (BS – Base Stations) dynamiquement éteintes et allumées (ON-OFF). Ce problème d’optimisation combinatoire peut aussi être modélisé par des bandits manchots « restless » markoviens. Par ailleurs, une gestion dynamique de la topologie des réseaux hétérogènes, utilisant l’algorithme RQoS-UCB, est proposée pour contrôler intelligemment le mode de fonctionnement ON-OFF des BS, dans un contexte de trafic et d’étude de capacité multi-cellulaires. Enfin une méthode incluant le transfert de connaissance « transfer RQoS-UCB » est proposée et validée par des simulations, pour pallier les pertes de récompense initiales et accélérer le processus d’apprentissage, grâce à la connaissance acquise à d’autres périodes temporelles correspondantes à la période courante (même heure de la journée la veille, ou même jour de la semaine par exemple). La solution proposée de gestion dynamique du mode ON-OFF des BS permet de diminuer le nombre de BS actives tout en garantissant une QoS adéquate en atténuant les fluctuations de la QoS lors des variations du trafic et en améliorant les conditions au démarrage de l’apprentissage. Ainsi, l’efficacité énergétique est grandement améliorée. Enfin des démonstrateurs en conditions radio réelles ont été développés pour valider les solutions d’apprentissage étudiées. Les algorithmes ont également été confrontés à des bases de données de mesures effectuées par un partenaire dans la gamme de fréquence HF, pour des liaisons transhorizon. Les résultats confirment la pertinence des solutions d’apprentissage proposées, aussi bien en termes d’optimisation de l’utilisation du spectre fréquentiel, qu’en termes d’efficacité énergétique
Future cellular network technologies are targeted at delivering self-organizable and ultra-high capacity networks, while reducing their energy consumption. This thesis studies intelligent spectrum and topology management through cognitive radio techniques to improve the capacity density and Quality of Service (QoS) as well as to reduce the cooperation overhead and energy consumption. This thesis investigates how reinforcement learning can be used to improve the performance of a cognitive radio system. In this dissertation, we deal with the problem of opportunistic spectrum access in infrastructureless cognitive networks. We assume that there is no information exchange between users, and that they have no knowledge of channel statistics or other users' actions. This particular problem is modelled as a multi-user restless Markov multi-armed bandit framework, in which multiple users collect a priori unknown rewards by selecting a channel. The main contribution of the dissertation is to propose a learning policy for distributed users that takes into account not only the availability criterion of a band but also a quality metric linked to the interference power from the neighboring cells experienced on the sensed band. We also prove that the policy, named distributed restless QoS-UCB (RQoS-UCB), achieves at most logarithmic order regret. Moreover, numerical studies show that the performance of the cognitive radio system can be significantly enhanced by utilizing the proposed learning policies, since the cognitive devices are able to identify the appropriate resources more efficiently. This dissertation also introduces reinforcement learning and transfer learning frameworks to improve the energy efficiency (EE) of heterogeneous cellular networks.
Specifically, we formulate and solve an energy efficiency maximization problem pertaining to dynamic base station (BS) switching operation, which is identified as a combinatorial learning problem, within the restless Markov multi-armed bandit framework. Furthermore, a dynamic topology management using the previously defined algorithm, RQoS-UCB, is introduced to intelligently control the working modes of BSs, based on traffic load and capacity in multiple cells. Moreover, to cope with the initial reward loss and to speed up the learning process, a transfer RQoS-UCB policy, which benefits from the transferred knowledge observed in historical periods, is proposed and provably converges. The proposed dynamic BS switching operation is then demonstrated to reduce the number of activated BSs while maintaining an adequate QoS. Extensive numerical simulations demonstrate that the transfer learning significantly reduces the QoS fluctuation during traffic variation, contributes to a performance jump-start, and presents significant EE improvement under various practical traffic load profiles. Finally, a proof-of-concept is developed to verify the performance of the proposed learning policies in a real radio environment and on a real measurement database of the HF band. Results show that the proposed multi-armed bandit learning policies, using dual-criterion (e.g. availability and quality) optimization for opportunistic spectrum access, are not only superior in terms of spectrum utilization but also energy efficient.
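RQoS-UCB belongs to the family of upper-confidence-bound index policies. As a minimal sketch of the classic single-user UCB1 index on which such policies build (stationary arms, availability criterion only — much simpler than the thesis's restless, multi-user, quality-aware setting; all numbers hypothetical):

```python
import math

def ucb1_index(mean_reward, pulls, t, alpha=2.0):
    """Classic UCB1 index: empirical mean plus an exploration bonus
    that shrinks as a channel (arm) is sensed more often."""
    return mean_reward + math.sqrt(alpha * math.log(t) / pulls)

def select_channel(means, pulls, t):
    """Pick the channel maximising the UCB index at round t."""
    scores = [ucb1_index(m, n, t) for m, n in zip(means, pulls)]
    return max(range(len(scores)), key=scores.__getitem__)

# Three channels: a well-explored good one, an under-explored one, a middling one
means = [0.8, 0.5, 0.6]   # empirical availability so far
pulls = [50, 2, 10]       # how often each channel was sensed
choice = select_channel(means, pulls, t=62)
```

At this round the barely-explored channel 1 wins through its exploration bonus, which is the mechanism behind the logarithmic-regret guarantee; RQoS-UCB extends the index with a second, quality term and handles restless Markovian rewards.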
APA, Harvard, Vancouver, ISO, and other styles
21

Choudrey, Sajaval. "Video Recommendation through Machine Learning in Amazon Web Services." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-303010.

Full text
Abstract:
Machine learning is a field within Computer Science that is still growing. Finding innovative ways to utilise the potential of machine learning is important for the future of many companies and service providers. Amazon Web Services (AWS) provides machine learning capabilities that don’t require expert knowledge or the implementation of advanced machine learning algorithms. This thesis looks at how one of the AWS machine learning services, ”AWS Rekognition”, can be used to visually analyse movie trailers and make relevant movie-trailer recommendations based on the analysis. This is realised by an experiment where a prototype of a recommender system, based on ”AWS Rekognition”, is created. The recommendations of movie trailers and the evaluations of the recommender are in this thesis both based on calculations of vector similarity using various formulas. The experiments conducted in the thesis showed that the quality of the recommendations varied, due to limitations in the movie-trailer corpus and due to the fact that categorizing movies solely based on visual data can in many cases be misleading.
Maskininlärning är ett område inom datalogi som fortfarande växer. Att hitta innovativa sätt att utnyttja maskininlärningens potential är viktigt för många företags och tjänsteleverantörers framtid. Amazon Web Services (AWS) möjliggör maskininlärningsförmågor utan krav på expertis inom området, eller implementationer av avancerade maskininlärningsalgoritmer. Denna uppsats undersöker hur en av AWS maskininlärningstjänster, ”AWS Rekognition”, kan användas för att visuellt analysera film-trailers och göra relevanta trailer-rekommendationer baserat på den visuella analysen. Detta är realiserat av ett experiment där en prototyp av ett rekommendationssystem, baserat på ”AWS Rekognition”, utvecklats. Rekommendationerna och evalueringen av rekommendationssystemet är båda baserade på beräkningar av vektorlikheten med diverse formler. Experimenten som utfördes i denna uppsats visar att rekommendationernas kvalitet var varierande på grund av begränsningar i filmtrailerbiblioteket och även på grund av att kategoriseringen av filmer i många fall är väldigt missledande om endast visuell data används för analys.
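The recommender described above ranks trailers by vector similarity over visual labels. A minimal sketch of that idea with cosine similarity over hypothetical label-count vectors (the label axes, titles and numbers are invented for illustration; the thesis's exact feature vectors and formulas may differ):

```python
import math

def cosine(u, v):
    """Cosine similarity between two label-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(query, catalogue):
    """Return titles ranked by similarity to the query trailer's vector."""
    return sorted(catalogue, key=lambda t: cosine(query, catalogue[t]),
                  reverse=True)

# Hypothetical label counts over the axes (car, explosion, person):
catalogue = {
    "action_movie": [9, 7, 3],
    "drama_movie":  [0, 0, 8],
    "road_movie":   [6, 1, 4],
}
query = [8, 6, 2]   # labels detected in the trailer being watched
ranking = recommend(query, catalogue)
```

In the actual prototype the vectors would be built from the labels and confidence scores returned by the Rekognition analysis rather than hand-written counts.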
APA, Harvard, Vancouver, ISO, and other styles
22

Kirchner, Jens. "Context-Aware Optimized Service Selection with Focus on Consumer Preferences." Doctoral thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-54320.

Full text
Abstract:
Cloud computing, mobile computing, Service-Oriented Computing (SOC), and Software as a Service (SaaS) indicate that the Internet is emerging into an anonymous service market where service functionality can be dynamically and ubiquitously consumed. Among functionally similar services, service consumers are interested in the consumption of the services which perform best towards their optimization preferences. The experienced performance of a service at consumer side is expressed in its non-functional properties (NFPs). Selecting the best-fit service is an individual challenge as the preferences of consumers vary. Furthermore, service markets such as the Internet are characterized by perpetual change and complexity. The complex collaboration of system environments and networks as well as expected and unexpected incidents may result in various performance experiences of a specific service at consumer side. The consideration of certain call-side aspects that may distinguish such differences in the experience of NFPs is reflected in various call contexts. Service optimization based on a collaborative knowledge base of previous experiences of other, similar consumers with similar preferences is a desirable foundation. The research work described in this dissertation aims at an individually optimized selection of services, considering the individual call contexts that have an impact on the performance, or NFPs in general, of a service, as well as the various consumer preferences. The presented approach exploits shared measurement information about the NFP behavior of a service gained from former service calls of previous consumptions. Gaining selection/recommendation knowledge from shared experience benefits existing as well as new consumers of a service before its (initial) consumption. Our approach solely focuses on the optimization and collaborative information exchange among service consumers.
It does not require the contribution of service providers or other non-consuming entities. As a result, the contribution among the participating entities also serves their own overall optimization benefit. With the initial focus on a single-tier optimization, we additionally provide a conceptual solution for a multi-tier optimization approach, for which our recommendation framework is prepared in general. For a consumer-sided optimization, we conducted a literature study of conference papers of the last decade in order to find out which NFPs are relevant for the selection and consumption of services. The ranked results of this study represent what a broad scientific community determined to be relevant NFPs for service selection. We analyzed two general approaches for the employment of machine learning methods within our recommendation framework as part of the preparation of the actual recommendation knowledge. Because the approach addresses a future service market that has not yet fully developed, and because it seems impossible to obtain the actual NFP data of different Web services at identical call contexts, real-world validation is a challenge. In order to conduct an evaluation and validation that closely approximate reality, with the flexibility to challenge the machine learning approaches and methods as well as the overall recommendation approach, we used generated NFP data whose characteristics are influenced by measurement data gained from real-world Web services. For the general approach with the better balance of evaluation results and benefits, we furthermore analyzed, implemented, and validated machine learning methods that can be employed for service recommendation.
Within the validation, we could achieve up to 95% of the overall achievable performance (utility) gain with a machine learning method that is focused on drift detection, which in turn, tackles the change characteristic of the Internet being an anonymous service market.
APA, Harvard, Vancouver, ISO, and other styles
23

Geisler, Markus. "A Machine Learning Component for an Optimized Context-Aware Web Service Selection based on Decision Trees for a Future Service Market." Thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2013. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-31339.

Full text
APA, Harvard, Vancouver, ISO, and other styles
24

Forte, Paolo. "Predicting Service Metrics from Device and Network Statistics." Thesis, KTH, Kommunikationsnät, 2015. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-175892.

Full text
Abstract:
For an IT company that provides a service over the Internet, like Facebook or Spotify, it is very important to provide a high quality of service; however, predicting the quality of service is generally a hard task. The goal of this thesis is to investigate whether an approach that makes use of statistical learning to predict the quality of service can obtain accurate predictions for a Voldemort key-value store [1] in the presence of dynamic load patterns and network statistics. The approach follows the idea that the service-level metrics associated with the quality of service can be estimated from server-side statistical observations, like device and network statistics. The advantage of the approach analysed in this thesis is that it can work with virtually any kind of service, since it is based only on device and network statistics, which are unaware of the type of service provided. The approach is structured as follows. During the service operations, a large amount of device statistics from the Linux kernel of the operating system (e.g. cpu usage level, disk activity, interrupts rate) and some basic end-to-end network statistics (e.g. average round-trip-time, packet loss rate) are periodically collected on the service platform. At the same time, some service-level metrics (e.g. average reading time, average writing time, etc.) are collected on the client machine as indicators of the store’s quality of service. To emulate network statistics, such as dynamic delay and packet loss, all the traffic is redirected to flow through a network emulator. Then, different types of statistical learning methods, based on linear and tree-based regression algorithms, are applied to the data collections to obtain a learning model able to accurately predict the service-level metrics from the device and network statistics.
The results, obtained for different traffic scenarios and configurations, show that the thesis’ approach can find learning models that can accurately predict the service-level metrics for a single-node store with error rates lower than 20% (NMAE), even in presence of network impairments.
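The error rate quoted above is the Normalised Mean Absolute Error. One common definition divides the MAE by the mean observed value, so that errors are comparable across service metrics with different scales (the thesis's exact normalisation may differ; the latency numbers below are hypothetical):

```python
def nmae(y_true, y_pred):
    """Normalised Mean Absolute Error: MAE divided by the mean of the
    observed values (one common definition; variants exist)."""
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
    return mae / (sum(y_true) / len(y_true))

# Hypothetical read latencies (ms) from the store vs. model predictions:
observed  = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
error = nmae(observed, predicted)   # MAE = 2.5, mean = 25 -> 0.1 (10%)
```

An NMAE of 0.1 as in this toy example would sit comfortably under the 20% threshold reported for the single-node store.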
APA, Harvard, Vancouver, ISO, and other styles
25

Wang, Yu. "Toward Better Health Care Service: Statistical and Machine Learning Based Analysis of Swedish Patient Satisfaction Survey." Thesis, KTH, Teknisk informationsvetenskap, 2017. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-212984.

Full text
Abstract:
Patients, as customers of health care services, have the right to evaluate the service they received, and health care providers and professionals may take advantage of these evaluations to improve the health care service. To investigate the relationship between patients' overall satisfaction and satisfaction with specific aspects, this study uses classical statistical and machine learning based methods to analyze Swedish national patient satisfaction survey data. Statistical methods including cross tabulation, chi-square test, correlation matrix and linear regression identify the relationships between features. It is found that patients' demographics have a significant association with overall satisfaction, and patients' responses in each dimension show similar trends which contribute to patients' overall satisfaction. Machine learning classification approaches including the Naïve Bayes classifier, logistic regression, tree-based models (decision tree, random forest, adaptive boosting decision tree), support vector machines and artificial neural networks are used to build models to classify patients' overall satisfaction (positive or negative) based on survey responses in each dimension and patients' demographic information. These models all have relatively high accuracy (87.41%–89.85%) and could help to find the important features of health care service and hence improve the quality of health care service in Sweden.
Patienter som kund av hälsovårdstjänsten har rätt att utvärdera den tjänst de fått, och vårdgivare och yrkesverksamma kan utnyttja dessa utvärderingar för att förbättra vården. För att undersöka förhållandet mellan patientens övergripande tillfredsställelse och tillfredsställelse av specifika aspekter använder den här studien klassisk statistisk och maskinbaserad metod för att analysera svenska nationella patientundersökningsdata. Statistisk metod, inklusive tvär tabulering, chi-square-test, korrelationsmatris och linjär regression, identifierar förhållandet mellan funktioner. Det är konstaterat att patienternas demografi har en betydande koppling till övergripande tillfredsställelse, och patientens svar i varje dimension visar en liknande trend som kommer att bidra till patientens övergripande tillfredsställelse. Klassificeringsmetoder för maskininlärning, inklusive Naïve Bayes-klassificeraren, logistisk regression, trädbaserade modeller (beslutsträd, slumpmässig skog, adaptivt förstärkt beslutsträd), stödvektormaskiner och konstgjorda neurala nätverk används för att bygga modeller för att klassificera patientens övergripande tillfredsställelse (positiv eller negativ) baserat på undersökningsresponser i dimensioner och patienters demografiinformation. Dessa modeller har alla relativt hög noggrannhet (87.41%–89.85%) och kan hjälpa till att hitta de viktigaste egenskaperna hos vården och därmed förbättra kvaliteten på vården i Sverige.
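One of the classifiers compared above is the Naïve Bayes classifier. A minimal Bernoulli Naïve Bayes sketch on binary survey answers (1 = satisfied with a dimension) with Laplace smoothing — an illustration only, with invented dimensions and data, not the thesis's implementation:

```python
def train_nb(X, y):
    """Bernoulli Naive Bayes with Laplace smoothing.
    X: rows of 0/1 answers per survey dimension; y: overall satisfaction."""
    classes = sorted(set(y))
    prior = {c: y.count(c) / len(y) for c in classes}
    cond = {c: [] for c in classes}          # P(dimension j = 1 | class c)
    for c in classes:
        rows = [x for x, yc in zip(X, y) if yc == c]
        for j in range(len(X[0])):
            ones = sum(r[j] for r in rows)
            cond[c].append((ones + 1) / (len(rows) + 2))  # Laplace smoothing
    return prior, cond

def classify_nb(model, x):
    """Return the class with the highest posterior for answer vector x."""
    prior, cond = model
    def score(c):
        p = prior[c]
        for j, xj in enumerate(x):
            p *= cond[c][j] if xj == 1 else 1 - cond[c][j]
        return p
    return max(prior, key=score)

# Toy survey: dimensions = (listened to, informed, respected); label 1 = satisfied
X = [[1, 1, 1], [1, 1, 0], [0, 1, 1], [0, 0, 0], [0, 0, 1], [1, 0, 0]]
y = [1, 1, 1, 0, 0, 0]
model = train_nb(X, y)
```

The "naïve" conditional-independence assumption is what makes the per-dimension probabilities multiply; the thesis compares this simple model against trees, SVMs and neural networks on the real survey.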
APA, Harvard, Vancouver, ISO, and other styles
26

Prokopp, Christian Werner. "Semantic service discovery in the service ecosystem." Thesis, Queensland University of Technology, 2011. https://eprints.qut.edu.au/50872/1/Christian_Prokopp_Thesis.pdf.

Full text
Abstract:
Electronic services are a leitmotif in ‘hot’ topics like Software as a Service, Service Oriented Architecture (SOA), Service oriented Computing, Cloud Computing, application markets and smart devices. We propose to consider these in what has been termed the Service Ecosystem (SES). The SES encompasses all levels of electronic services and their interaction, with human consumption and initiation on its periphery, in much the same way the ‘Web’ describes a plethora of technologies that eventuate to connect information and expose it to humans. Presently, the SES is heterogeneous, fragmented and confined to semi-closed systems. A key issue hampering the emergence of an integrated SES is Service Discovery (SD). An SES will be dynamic, with areas of structured and unstructured information within which service providers and ‘lay’ human consumers interact; until now the two are disjointed, e.g., SOA-enabled organisations, industries and domains are choreographed by domain experts or ‘hard-wired’ to smart device application markets and web applications. In an SES, services are accessible, comparable and exchangeable to human consumers, closing the gap to the providers. This requires a new SD with which humans can discover services transparently and effectively without special knowledge or training. We propose two modes of discovery: directed search, which follows an agenda, and explorative search, which speculatively expands knowledge of an area of interest by means of categories. Inspired by conceptual space theory from cognitive science, we propose to implement the modes of discovery using concepts to map a lay consumer’s service need to terminologically sophisticated descriptions of services. To this end, we reframe SD as an information retrieval task on the information attached to services, such as descriptions, reviews, documentation and web sites - the Service Information Shadow.
The Semantic Space model transforms the shadow's unstructured semantic information into a geometric, concept-like representation. We introduce an improved and extended Semantic Space including categorization calling it the Semantic Service Discovery model. We evaluate our model with a highly relevant, service related corpus simulating a Service Information Shadow including manually constructed complex service agendas, as well as manual groupings of services. We compare our model against state-of-the-art information retrieval systems and clustering algorithms. By means of an extensive series of empirical evaluations, we establish optimal parameter settings for the semantic space model. The evaluations demonstrate the model’s effectiveness for SD in terms of retrieval precision over state-of-the-art information retrieval models (directed search) and the meaningful, automatic categorization of service related information, which shows potential to form the basis of a useful, cognitively motivated map of the SES for exploratory search.
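Directed search over a Service Information Shadow is, at its simplest, ranked retrieval of services by similarity between a query (the consumer's agenda) and each service's attached text. A much-simplified bag-of-words sketch with TF-IDF weighting and cosine ranking (the thesis's Semantic Space model is a richer, concept-like geometric representation; the service names and token lists below are invented):

```python
import math
from collections import Counter

def idf_weights(docs):
    """Inverse document frequency over all service information shadows."""
    n = len(docs)
    df = Counter(tok for doc in docs.values() for tok in set(doc))
    return {t: math.log(n / df[t]) for t in df}

def vectorise(tokens, idf):
    """TF-IDF weighted bag-of-words vector for a token list."""
    tf = Counter(tokens)
    return {t: tf[t] * idf.get(t, 0.0) for t in tf}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def directed_search(query, docs):
    """Rank services by similarity between the consumer's agenda and
    each service's information shadow."""
    idf = idf_weights(docs)
    q = vectorise(query, idf)
    vecs = {name: vectorise(doc, idf) for name, doc in docs.items()}
    return sorted(vecs, key=lambda n: cosine(q, vecs[n]), reverse=True)

# Hypothetical shadows (tokenised descriptions/reviews of three services):
docs = {
    "invoice_scan": "scan invoice pdf extract total".split(),
    "photo_share":  "share photo friends album".split(),
    "invoice_pay":  "pay invoice bank transfer total".split(),
}
ranking = directed_search("invoice total".split(), docs)
```

The semantic space approach goes beyond such literal term matching by mapping lay vocabulary onto concept dimensions, which is what lets it outperform plain term-vector retrieval in the reported evaluations.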
APA, Harvard, Vancouver, ISO, and other styles
27

Hugo, Linsey Sledge. "A Comparison of Machine Learning Models Predicting Student Employment." Ohio University / OhioLINK, 2018. http://rave.ohiolink.edu/etdc/view?acc_num=ohiou1544127100472053.

Full text
APA, Harvard, Vancouver, ISO, and other styles
28

Flöjs, Amanda, and Alexandra Hägg. "Churn Prediction : Predicting User Churn for a Subscription-based Service using Statistical Analysis and Machine Learning Models." Thesis, Umeå universitet, Institutionen för matematik och matematisk statistik, 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-171678.

Full text
Abstract:
Subscription-based services are becoming more popular in today’s society. Therefore, any company that engages in the subscription-based business needs to understand the user behavior and minimize the number of users canceling their subscription, i.e. minimize churn. According to marketing metrics, the probability of selling to an existing user is markedly higher than selling to a brand new user. Consequently, it is of great importance that more focus is directed towards preventing users from leaving the service, in other words preventing user churn. To be able to prevent user churn, the company needs to identify the users in the risk zone of churning. Therefore, this thesis project treats this as a classification problem. The objective of the thesis project was to develop a statistical model to predict churn for a subscription-based service. Various statistical methods were used in order to identify patterns in user behavior using activity and engagement data, including variables describing recency, frequency, and volume. The best performing statistical model for predicting churn was achieved by the Random Forest algorithm. The selected model is able to separate the two classes of churning users and non-churning users with 73% probability and has a fairly low misclassification rate of 35%. The results show that it is possible to predict user churn using statistical models, although there are indications that it is difficult for the model to generalize a specific behavioral pattern for user churn. This is understandable, since human behavior is hard to predict. The results show that variables describing how frequently the user interacts with the service explain the most about whether a user is likely to churn or not.
Prenumerationstjänster blir alltmer populära i dagens samhälle. Därför är det viktigt för ett företag med en prenumerationsbaserad verksamhet att ha en god förståelse för sina användares beteendemönster på tjänsten, samt att de minskar antalet användare som avslutar sin prenumeration. Enligt marknadsföringsstatistik är sannolikheten att sälja till en redan existerande användare betydligt högre än att sälja till en helt ny. Av den anledningen är det viktigt att ett stort fokus riktas mot att förebygga att användare lämnar tjänsten. För att förebygga att användare lämnar tjänsten måste företaget identifiera vilka användare som är i riskzonen att lämna. Därför har detta examensarbete behandlats som ett klassifikationsproblem. Syftet med arbetet var att utveckla en statistisk modell för att förutspå vilka användare som sannolikt kommer att lämna prenumerationstjänsten inom nästa månad. Olika statistiska metoder har prövats för att identifiera användares beteendemönster i aktivitet- och engagemangsdata, data som inkluderar variabler som beskriver senaste interaktion, frekvens och volym. Bäst prestanda för att förutspå om en användare kommer att lämna tjänsten gavs av Random Forest algoritmen. Den valda modellen kan separera de två klasserna av användare som lämnar tjänsten och de användare som stannar med 73% sannolikhet och har en relativt låg missfrekvens på 35%. Resultatet av arbetet visar att det går att förutspå vilka användare som befinner sig i riskzonen för att lämna tjänsten med hjälp av statistiska modeller, även om det är svårt för modellen att generalisera ett specifikt beteendemönster för de olika grupperna. Detta är dock förståeligt då det är mänskligt beteende som modellen försöker att förutspå. Resultatet av arbetet pekar mot att variabler som beskriver frekvensen av användandet av tjänsten beskriver mer om en användare är på väg att lämna tjänsten än variabler som beskriver användarens aktivitet i volym.
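The churn features named above — recency, frequency and volume — are typically derived from a raw activity log before any classifier sees the data. A minimal sketch of that feature-extraction step (the event-log shape, user names and numbers are invented for illustration):

```python
def rfm_features(events, now):
    """Per-user recency/frequency/volume features from an activity log.
    events: list of (user, day, units_consumed); now: current day index."""
    feats = {}
    for user, day, units in events:
        f = feats.setdefault(user, {"last": None, "freq": 0, "vol": 0})
        f["last"] = day if f["last"] is None else max(f["last"], day)
        f["freq"] += 1          # how often the user interacted
        f["vol"] += units       # how much the user consumed
    return {u: {"recency": now - f["last"],
                "frequency": f["freq"],
                "volume": f["vol"]}
            for u, f in feats.items()}

# Hypothetical usage log: (user, day of observation window, units used)
events = [("anna", 1, 5), ("anna", 28, 2), ("bert", 3, 1)]
features = rfm_features(events, now=30)
```

A long recency (like bert's 27 idle days here) is exactly the kind of signal the Random Forest model can pick up as a churn indicator.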
APA, Harvard, Vancouver, ISO, and other styles
29

Malyutin, Oleksandr. "A System of Automated Web Service Selection." Thesis, Linnéuniversitetet, Institutionen för datavetenskap (DV), 2016. http://urn.kb.se/resolve?urn=urn:nbn:se:lnu:diva-52733.

Full text
Abstract:
In the modern world, service oriented applications are becoming more and more popular from year to year. To remain competitive, these Web services must provide a high level of quality. From another perspective, the end user is interested in getting the service which fits the user's requirements best: for limited resources, get the service with the best available quality. In this work, a model for automated service selection is presented to solve this problem. The main focus of this work was to provide high accuracy of this model during the prediction of a Web service’s response time. Therefore, several machine learning algorithms were selected and used in the model, and several experiments were conducted whose results were evaluated and analysed in order to select the one machine learning algorithm which coped best with the defined task. This machine learning algorithm was used in the final version of the model. As a result, the selection model was implemented, whose accuracy was around 80% when selecting a single Web service as the best from the list of available ones. Moreover, a strategy for measuring accuracy has also been developed, the main idea of which is the following: not one but several Web services, whose difference in response time does not exceed a boundary value, can be considered optimal. According to this strategy, the maximum accuracy of selecting the best Web service was about 89%. In addition, a strategy for selecting the best Web service from the end-user side was developed to evaluate the performance of the implemented model. Finally, it should also be mentioned that the input data for the experiments was generated with the help of a dedicated tool, which allowed not only generating different input datasets without huge time consumption but also using input data of different types (linear, periodic) for the experiments.
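The relaxed accuracy strategy described above counts a selection as correct when the chosen service's true response time is within a boundary value of the fastest service. A minimal sketch of that evaluation (service names, times and the tolerance are hypothetical; the thesis's exact procedure may differ):

```python
def within_tolerance_accuracy(selections, true_times, epsilon):
    """Fraction of selection rounds in which the chosen Web service's
    true response time is within `epsilon` of the fastest that round."""
    correct = 0
    for chosen, times in zip(selections, true_times):
        best = min(times.values())
        if times[chosen] - best <= epsilon:
            correct += 1
    return correct / len(selections)

# Three rounds; true response times (ms) per candidate service:
rounds = [
    {"svcA": 100, "svcB": 104, "svcC": 150},
    {"svcA": 90,  "svcB": 80,  "svcC": 85},
    {"svcA": 70,  "svcB": 95,  "svcC": 60},
]
picked = ["svcB", "svcC", "svcB"]   # what the selection model chose
acc = within_tolerance_accuracy(picked, rounds, epsilon=5)
```

Under strict best-only scoring only an exact winner counts, which is why the relaxed strategy yields a higher reported accuracy (89% vs. 80% in the thesis).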
APA, Harvard, Vancouver, ISO, and other styles
30

Kiourktsidis, Ilias. "Flexible cross layer design for improved quality of service in MANETs." Thesis, Brunel University, 2011. http://bura.brunel.ac.uk/handle/2438/7464.

Full text
Abstract:
Mobile Ad hoc Networks (MANETs) are becoming increasingly important because of their unique characteristics of connectivity. Several delay-sensitive applications are starting to appear in these kinds of networks. Therefore, a key concern is to guarantee Quality of Service (QoS) in such a constantly changing communication environment. The classical QoS-aware solutions that have been used till now in wired and infrastructure wireless networks are unable to achieve the necessary performance in MANETs. The specialized protocols designed for multihop ad hoc networks offer basic connectivity with limited delay awareness, and the mobility factor in MANETs makes them even more unsuitable for use. Several protocols and solutions have been emerging in almost every layer of the protocol stack. The majority of the research efforts agree on the fact that, in such a dynamic environment, additional information about the status of the network needs to be available in order to optimize the performance of the protocols. Hence, many cross layer design approaches appeared on the scene. Cross layer design has major advantages and the necessity to utilize such a design is definite. However, cross layer design conceals risks like architecture instability and design inflexibility. The aggressive use of cross layer design results in an excessive increase in the cost of deployment and complicates both maintenance and upgrade of the network. The use of autonomous protocols, like bio-inspired mechanisms and algorithms that are resilient to cross layer information unavailability, is able to reduce the dependence on cross layer design. In addition, properties like the prediction of the dynamic conditions and the adaptation to them are quite important characteristics. The design of a routing decision algorithm based on Bayesian Inference for the prediction of the path quality is proposed here.
The accurate prediction capabilities and the efficient use of the plethora of cross layer information are presented. Furthermore, an adaptive mechanism based on the Genetic Algorithm (GA) is used to control the flow of the data in the transport layer. The aforementioned flow control mechanism inherits the GA’s optimization capabilities without the need to know any details about the network conditions, thus reducing the cross layer information dependence. Finally, it is illustrated how Bayesian Inference can be used to suggest configuration parameter values to the other protocols in different layers in order to improve their performance.
APA, Harvard, Vancouver, ISO, and other styles
31

Lawlor, Mary Ann C. "Predictors of Health Service Use in Persons with Heart Failure." Case Western Reserve University School of Graduate Studies / OhioLINK, 2021. http://rave.ohiolink.edu/etdc/view?acc_num=case1619702345236178.

Full text
APA, Harvard, Vancouver, ISO, and other styles
32

Chen, Guoyu. "PAILAC: Power and Inference Latency Adaptive Control for Machine Learning Services." The Ohio State University, 2020. http://rave.ohiolink.edu/etdc/view?acc_num=osu160608666572472.

Full text
APA, Harvard, Vancouver, ISO, and other styles
33

PULIGHEDDU, CORRADO. "Machine Learning-Powered Management Architectures for Edge Services in 5G Networks." Doctoral thesis, Politecnico di Torino, 2022. https://hdl.handle.net/11583/2973797.

Full text
APA, Harvard, Vancouver, ISO, and other styles
34

Adi, Erwin. "Denial-of-service attack modelling and detection for HTTP/2 services." Thesis, Edith Cowan University, Research Online, Perth, Western Australia, 2017. https://ro.ecu.edu.au/theses/1953.

Full text
Abstract:
Businesses and society alike are heavily dependent on Internet-based services, albeit with constant and annoying disruptions caused by adversaries. A malicious attack that can prevent the establishment of Internet connections to web servers, initiated from legitimate client machines, is termed a Denial of Service (DoS) attack; the volume and intensity of such attacks are rapidly growing thanks to readily available attack tools and ever-increasing network bandwidths. A majority of contemporary web servers are built on the HTTP/1.1 communication protocol. As a consequence, all literature found on DoS attack modelling and the appertaining detection techniques addresses only HTTP/1.x network traffic. This thesis presents a model of DoS attack traffic against servers employing the new communication protocol, namely HTTP/2. The HTTP/2 protocol significantly differs from its predecessor and introduces new messaging formats and data exchange mechanisms. This creates an urgent need to understand how malicious attacks, including Denial of Service, can be launched against HTTP/2 services. Moreover, the ability of attackers to vary their traffic models to stealthily affect web services requires extensive research and modelling. This research work not only provides a novel model for DoS attacks against HTTP/2 services, but also provides a model of stealthy variants of such attacks that can disrupt routine web services. Specifically, HTTP/2 traffic patterns that consume computing resources of a server, such as CPU utilisation and memory consumption, were thoroughly explored and examined. The study presents four HTTP/2 attack models: the first is a flooding-based attack model, the second a distributed model, and the third and fourth are variant DoS attack models. The attack traffic analysis conducted in this study employed four machine learning techniques, namely Naïve Bayes, Decision Tree, JRip and Support Vector Machines.
The HTTP/2 normal traffic model portrays the online activities of human users. The model thus formulated was also employed to generate flash-crowd traffic, i.e. a large volume of normal traffic that incapacitates a web server, similar in fashion to a DoS attack, albeit with non-malicious intent. Flash-crowd traffic generated based on the defined model was used to populate the dataset of legitimate network traffic, to fuzz the machine learning-based attack detection process. The two variants of DoS attack traffic differed in terms of traffic intensities and the inter-packet arrival delays introduced, to better analyse the type and quality of DoS attacks that can be launched against HTTP/2 services. A detailed analysis of HTTP/2 features is also presented to rank relevant network traffic features for all four traffic models. These features were ranked based on the legitimate as well as attack traffic observations conducted in this study. The study shows that machine learning-based analysis yields better classification performance, i.e. a lower percentage of incorrectly classified instances, when the proposed HTTP/2 features are employed than when HTTP/1.1 features alone are used. The study shows how an HTTP/2 DoS attack can be modelled, and how future work can extend the proposed model to create variant attack traffic models that can bypass intrusion-detection systems. Likewise, as Internet traffic and the heterogeneity of Internet-connected devices are projected to increase significantly, legitimate traffic can yield varying traffic patterns, demanding further analysis. The significance of having current legitimate traffic datasets, together with the scope to extend the DoS attack models presented herewith, suggests that research in the DoS attack analysis and detection area will benefit from the work presented in this thesis.
APA, Harvard, Vancouver, ISO, and other styles
35

Panchapakesan, Ashwin. "Optimizing Shipping Container Damage Prediction and Maritime Vessel Service Time in Commercial Maritime Ports Through High Level Information Fusion." Thesis, Université d'Ottawa / University of Ottawa, 2019. http://hdl.handle.net/10393/39593.

Full text
Abstract:
The overwhelming majority of global trade is executed over maritime infrastructure, and port-side optimization problems are significant given that commercial maritime ports are hubs at which sea trade routes and land/rail trade routes converge. Therefore, optimizing maritime operations brings the promise of improvements with global impact. Major performance bottlenecks in the maritime trade process include the handling of insurance claims on shipping containers and vessel service time at port. The former has high input dimensionality and includes data pertaining to environmental and human attributes, as well as operational attributes such as the weight balance of a shipping container; it therefore lends itself to multiple classification methodologies, many of which are explored in this work. In order to compare their performance, a first-of-its-kind dataset was developed with carefully curated attributes. The performance of these methodologies was improved by exploring metalearning techniques to improve the collective performance of a subset of these classifiers. The latter problem is formulated as a schedule optimization, solved with a fuzzy system that controls port-side resource deployment, whose parameters are optimized by a multi-objective evolutionary algorithm which outperforms current industry practice (as mined from real-world data). This methodology has been applied to multiple ports across the globe to demonstrate its generalizability, and it improves upon current industry practice even with synthetically increased vessel traffic.
APA, Harvard, Vancouver, ISO, and other styles
36

Barr, Kajsa, and Hampus Pettersson. "Predicting and Explaining Customer Churn for an Audio/e-book Subscription Service using Statistical Analysis and Machine Learning." Thesis, KTH, Matematisk statistik, 2019. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-252723.

Full text
Abstract:
The current technology shift has contributed to increased consumption of media and entertainment through various mobile devices, and especially through subscription-based services. Storytel is a company offering a subscription-based streaming service for audiobooks and e-books, and it has grown rapidly in the last couple of years. However, when operating in a competitive market, it is of great importance to understand the behavior and demands of the customer base. It has been shown that it is more profitable to retain existing customers than to acquire new ones, which is why a large focus should be directed towards preventing customers from leaving the service, that is, preventing customer churn. One way to cope with this problem is to apply statistical analysis and machine learning in order to identify patterns and customer behavior in data. In this thesis, logistic regression and random forest models are used with the aim of both predicting and explaining churn in the early stages of a customer's subscription. The models are tested together with the feature selection methods Elastic Net, RFE and PCA, as well as with the oversampling method SMOTE. One main finding is that the best predictive model is obtained by using random forest together with RFE, producing a precision of 0.2427 and a recall of 0.7699. The other main finding is that the explanatory model is given by logistic regression together with Elastic Net, where significant regression coefficient estimates can be used to explain patterns associated with churn and give useful findings from a business perspective.
APA, Harvard, Vancouver, ISO, and other styles
37

Hasan, Irfan. "Machine learning techniques for automated knowledge acquisition in intelligent knowledge-based systems." Kutztown University, 1991. Access available to Kutztown University faculty, staff, and students only: http://www.kutztown.edu/library/services/remote_access.asp.

Full text
Abstract:
Thesis (M.S.)--Kutztown University of Pennsylvania, 1991.
Source: Masters Abstracts International, Volume: 45-06, page: 3187. Abstract precedes thesis as [2] preliminary leaves. Typescript. Includes bibliographical references (leaves 102-104).
APA, Harvard, Vancouver, ISO, and other styles
38

Shaham, Sina. "Location Privacy in the Era of Big Data and Machine Learning." Thesis, The University of Sydney, 2019. https://hdl.handle.net/2123/21689.

Full text
Abstract:
Location data of individuals is one of the most sensitive sources of information; once revealed to ill-intended individuals or service providers, it can cause severe privacy concerns. In this thesis, we aim at preserving the privacy of users in telecommunication networks against untrusted service providers, as well as improving their privacy in the publication of location datasets. For improving the location privacy of users in telecommunication networks, we consider the movement of users along trajectories and investigate the threats that the query history may pose to location privacy. We develop an attack model based on the Viterbi algorithm, termed the Viterbi attack, which represents a realistic privacy threat in trajectories. Next, we propose a metric called transition entropy that helps evaluate the performance of dummy generation algorithms, followed by developing a robust dummy generation algorithm that can defend users against the Viterbi attack. We compare and evaluate our proposed algorithm and metric on a publicly available dataset published by Microsoft, i.e., the Geolife dataset. For privacy-preserving data publishing, an enhanced framework for the anonymization of spatio-temporal trajectory datasets, termed machine learning based anonymization (MLA), is proposed. The framework consists of a robust alignment technique and a machine learning approach for clustering datasets. The framework and all the proposed algorithms are applied to the Geolife dataset, which includes GPS logs of over 180 users in Beijing, China.
APA, Harvard, Vancouver, ISO, and other styles
39

Carkacioglu, Levent. "Automated Biological Data Acquisition And Integration Using Machine Learning Techniques." Phd thesis, METU, 2009. http://etd.lib.metu.edu.tr/upload/12610396/index.pdf.

Full text
Abstract:
Since the initial genome sequencing projects, and with recent advances in technology, molecular biology and large-scale transcriptome analysis have resulted in data accumulation at a large scale. These data are provided on different platforms and come from different laboratories; therefore, there is a need for compilation and comprehensive analysis. In this thesis, we addressed the automation of biological data acquisition and integration from these non-uniform data using machine learning techniques. We focused on two different mining studies within the scope of this thesis. In the first study, we worked on characterizing the expression patterns of housekeeping genes. We described methodologies to compare measures of housekeeping genes with non-housekeeping genes. In the second study, we proposed a novel framework, bi-k-bi clustering, for finding association rules of gene pairs that can easily operate on large-scale and multiple heterogeneous data sets. Results in both studies showed consistency and relatedness with the available literature. Furthermore, our results provided some novel insights awaiting experimental validation by biologists.
APA, Harvard, Vancouver, ISO, and other styles
40

Wikström, Johan. "Employment forecasting using data from the Swedish Public Employment Service." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-239174.

Full text
Abstract:
The objective of this thesis is to forecast the number of people registered at the Swedish Public Employment Service (Arbetsförmedlingen) who will manage to get employment each month, and to examine how accurate the forecasts are. The Swedish Public Employment Service is a government-funded agency in Sweden working to keep the unemployment rate low. When someone is unemployed or looking for a new job, he or she can register at the Swedish Public Employment Service. Being able to forecast how many people are expected to get employment could be useful when planning and making decisions. It could also be used as an indicator of how well the Swedish Public Employment Service manages to perform, and thus how well the tax money is used. The models employed for forecasting were the seasonal autoregressive integrated moving average (SARIMA) and the long short-term memory (LSTM). A persistence model is also used as a baseline. The persistence model is a very simple model, and the other models are therefore expected to outperform it. For the LSTM model, both univariate and multivariate approaches are explored in order to examine whether the model can be improved with more data. Results from the experiments showed that a multivariate LSTM achieved the lowest root mean squared error (RMSE) and is therefore considered the best model. However, the robustness of the model over time needs further research.
APA, Harvard, Vancouver, ISO, and other styles
41

Hellberg, Johan, and Kasper Johansson. "Building Models for Prediction and Forecasting of Service Quality." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-295617.

Full text
Abstract:
In networked systems engineering, operational data gathered from sensors or logs can be used to build data-driven functions for performance prediction, anomaly detection, and other operational tasks [1]. Future telecom services will share a common communication and processing infrastructure in order to achieve cost-efficient and robust operation. A critical issue will be to ensure service quality, whereby different services have very different requirements. Thanks to recent advances in computing and networking technologies, we are able to collect and process measurements from networking and computing devices in order to predict and forecast certain service qualities, such as video streaming or data stores. In this paper we examine these techniques, which are based on statistical learning methods. In particular, we analyze traces from testbed measurements and build predictive models. A detailed description of the testbed, which is located at KTH, is given in Section II, as well as in [2].
Bachelor's thesis in electrical engineering, 2020, KTH, Stockholm
APA, Harvard, Vancouver, ISO, and other styles
42

Nääs, Starberg Filip, and Axel Rooth. "Predicting a business application's cloud server CPU utilization using the machine learning model LSTM." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-301247.

Full text
Abstract:
Cloud computing sees increased adoption as companies seek to increase flexibility and reduce cost. Although the large cloud service providers employ a pay-as-you-go pricing model and enable customers to scale up and down quickly, there is still room for improvement. Workload in the form of CPU utilization often fluctuates, which leads to unnecessary cost and environmental impact for companies. To help mitigate this issue, the aim of this paper is to predict future CPU utilization using a long short-term memory (LSTM) machine learning model. By predicting utilization up to 30 minutes into the future, companies are able to scale their capacity just in time and avoid unnecessary cost and damage to the environment. The study is divided into two parts. The first part analyses how well the LSTM model performs when predicting one step at a time compared with a state-of-the-art model. The second part analyses the accuracy of the LSTM when making predictions up to 30 minutes into the future. To allow for an objective analysis of results, the LSTM is compared with a standard recurrent neural network (RNN), which is similar to the LSTM in its inherent algorithmic structure. The results show that the LSTM outperformed the RNN, both for one-step-ahead predictions and for predictions several time steps into the future, and that it was able to predict CPU utilization 30 minutes ahead with largely retained accuracy. To conclude, the results suggest that LSTM may be a useful tool for reducing cost and unnecessary environmental impact for business applications hosted on a public cloud.
APA, Harvard, Vancouver, ISO, and other styles
43

Algarni, Abdullah Fayez H. "A machine learning framework for optimising file distribution across multiple cloud storage services." Thesis, University of York, 2017. http://etheses.whiterose.ac.uk/17981/.

Full text
Abstract:
Storing data using a single cloud storage service may lead to several potential problems for the data owner. Such issues include service continuity, availability, performance, security, and the risk of vendor lock-in. A promising solution is to distribute the data across multiple cloud storage services, similarly to the manner in which data are distributed across multiple physical disk drives to achieve fault tolerance and to improve performance. However, the distinguishing characteristics of different cloud providers, in terms of pricing schemes and service performance, make optimising cost and performance across many cloud storage services at once a challenge. This research proposes a framework for automatically tuning the data distribution policies across multiple cloud storage services from the client side, based on file access patterns. The aim of this work is to explore the optimisation of both the average cost per gigabyte and the average service performance (mainly latency) across multiple cloud storage services. To achieve these aims, two machine learning algorithms were used: 1. supervised learning to predict file access patterns; 2. reinforcement learning to learn the ideal parameters for file distribution over several cloud storage services. The framework was tested in a cloud storage services emulator, which emulated a real multiple-cloud storage setting (such as Google Cloud Storage, Amazon S3, Microsoft Azure Storage, and Rackspace) in terms of service performance and cost. In addition, the framework was tested in various configurations of several cloud storage services. The results of testing the framework showed that the multiple-cloud approach achieved an improvement of about 42% for cost and 76% for performance. These findings indicate that storing data in multiple clouds is a superior approach, compared with the commonly used uniform file distribution and with a heuristic distribution method.
APA, Harvard, Vancouver, ISO, and other styles
44

Pérennou, Loïc. "Virtual machine experience design : a predictive resource allocation approach for cloud infrastructures." Thesis, Paris, CNAM, 2019. http://www.theses.fr/2019CNAM1246/document.

Full text
Abstract:
One of the main challenges for cloud computing providers remains to offer trustable performance for all users, while maintaining an efficient use of hardware and energy resources. In the context of this CIFRE thesis led with Outscale, a public cloud provider, we perform an in-depth study aimed at making management algorithms use new sources of information. We characterize Outscale's workload to understand the resulting stress for the orchestrator, and the contention for hardware resources. We propose models to predict the runtime of VMs based on features which are available when they start. We evaluate the sensitivity, with respect to prediction error, of a VM placement algorithm from the literature that requires such predictions. We do not find any advantage in coupling our prediction model with the selected algorithm, but we propose alternative ways to use predictions to optimize the placement of VMs.
APA, Harvard, Vancouver, ISO, and other styles
45

Mpawenimana, Innocent. "Modélisation et conception d’objets connectés au service des maisons intelligentes : Évaluation et optimisation de leur autonomie et de leur QoS." Thesis, Université Côte d'Azur, 2020. http://www.theses.fr/2020COAZ4107.

Full text
Abstract:
This PhD thesis is in the field of smart homes, and more specifically in the energy consumption optimization process for a home equipped with an ambient energy harvesting and storage system. The objective is to propose services to manage household energy consumption and to promote self-consumption. To do so, relevant data must first be collected (current, active and reactive power consumption, temperature, and so on). In this PhD, data were first sensed using an intrusive load approach. Despite our efforts to build our own database, we decided to use an online available dataset for the rest of this study. Different supervised machine learning algorithms were evaluated on this dataset to identify home appliances accurately. The obtained results showed that active and reactive power alone can be used for that purpose. To further improve the accuracy, we proposed to use a moving average function to reduce the random variations in the observations. A non-intrusive load approach was finally adopted to instead determine the global household active energy consumption. Using an existing online dataset, a machine learning algorithm based on Long Short-Term Memory (LSTM) was then proposed to predict, over different time scales, the global household consumed energy. Long Short-Term Memory was also used to predict, for different weather profiles, the power that can be harvested from solar cells. Those predictions of consumed and harvested energy are finally exploited by a home energy management policy optimizing self-consumption. Simulation results show that the sizes of the solar cells and of the battery impact the self-consumption rate and must therefore be chosen carefully.
APA, Harvard, Vancouver, ISO, and other styles
46

Khokhar, Muhammad Jawad. "Modélisation de la qualité d'expérience de la vidéo streaming dans l'internet par expérimentation contrôlée et apprentissage machine." Thesis, Université Côte d'Azur (ComUE), 2019. http://www.theses.fr/2019AZUR4067.

Full text
Abstract:
Video streaming is the dominant contributor to today's Internet traffic. Consequently, estimating Quality of Experience (QoE) for video streaming is of paramount importance for network operators. The QoE of video streaming is directly dependent on the network conditions (e.g., bandwidth, delay, packet loss rate), referred to as the network Quality of Service (QoS). This inherent relationship between the QoS and the QoE motivates the use of supervised Machine Learning (ML) to build models that map the network QoS to the video QoE. In most ML works on QoE modeling, the training data is usually gathered in the wild by crowdsourcing or generated inside the service provider networks. However, such data is not easily accessible to the general research community. Consequently, the training data, if not available beforehand, needs to be built up by controlled experimentation. Here, the target application is run under emulated network environments to build models that predict video QoE from network QoS. The network QoS can be actively measured outside the data plane of the application (outband), or measured passively from the video traffic (inband). These two distinct types of QoS correspond to the use cases of QoE forecasting (from end user devices) and QoE monitoring (from within the networks). In this thesis, we consider the challenges associated with network QoS-QoE modeling, which are 1) the large training cost of QoE modeling by controlled experimentation, and 2) the accurate prediction of QoE considering the large diversity of video contents and the encryption deployed by today's content providers. Firstly, QoE modeling by controlled experimentation is challenging due to the high training cost involved, as each experiment usually takes a non-negligible time to complete and the experimental space to cover is large (it grows exponentially with the number of QoS features). The conventional approach is to experiment with QoS samples drawn uniformly from the entire experimental space.
However, uniform sampling can result in significant similarity in the output labels, which increases the training cost while not providing much gain in the model accuracy. To tackle this problem, we advocate the use of active learning to reduce the number of experiments while not impacting accuracy. We consider the case of YouTube QoE modeling and show that active sampling provides a significant gain over uniform sampling in terms of achieving higher modeling accuracy with fewer experiments. We further evaluate our approach with synthetic datasets and show that the gain is dependent on the complexity of the experimental space. Overall, we present a sampling approach that is general and can be used in any QoS-QoE modeling scenario provided that the input QoS features are fully controllable. Secondly, accurate prediction of the QoE of video streaming can be challenging, as videos offered by today's content providers vary significantly, from fast-motion sports videos to static lectures. On top of that, today's video traffic is encrypted, which means that network operators have little visibility into the video traffic, making QoE monitoring difficult. Considering these challenges, we devise models that aim at accurate forecasting and monitoring of video QoE. For the scenario of QoE forecasting, we build a QoE indicator called YouScore that quantifies the percentage of videos in the catalog of a content provider that may play out smoothly (without interruptions) for a given outband network QoS. For the QoE monitoring scenario, we estimate the QoE using the inband QoS features obtained from the encrypted video traffic. Overall, for both scenarios (forecasting and monitoring), we highlight the importance of using features that characterize the video content to improve the accuracy of QoE modeling.
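The active-sampling idea described above can be sketched as a generic loop; this is an illustration only, and the `train`, `run_experiment`, and `uncertainty` callables are placeholders rather than the thesis's actual components. Instead of covering the experimental space uniformly, the next experiment is always the candidate QoS configuration the current model is least certain about:

```python
def active_sampling(candidates, run_experiment, train, uncertainty, budget):
    """Label only the QoS configurations where the current model is least
    confident, instead of sampling the experimental space uniformly."""
    x0 = candidates.pop(0)                 # seed experiment
    labeled = [(x0, run_experiment(x0))]
    for _ in range(budget - 1):
        model = train(labeled)
        # query the candidate with the highest model uncertainty
        x = max(candidates, key=lambda c: uncertainty(model, c))
        candidates.remove(x)
        labeled.append((x, run_experiment(x)))
    return train(labeled), labeled
```

With an experiment budget far below the size of the space, the loop concentrates the costly controlled experiments where they shrink model uncertainty the most.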
47

De, Castri Andrea. "Sistemi di supporto alle decisioni in ambito clinico: predizione del rischio "as a service"." Master's thesis, Alma Mater Studiorum - Università di Bologna, 2017. http://amslaurea.unibo.it/14733/.

Full text
Abstract:
Technological evolution and the expansion of communication channels are playing a fundamental role in companies that intend to evolve their IT infrastructure into an as-a-service model. The cloud is playing a leading role in corporate business, driving the market to a growth of between 18% and 21% in 2017 alone and pushing thousands of companies to integrate their legacy IT with the new. One must also take into account the enormous volume of data produced by business processes over the last twenty years: the demand to use these data for business purposes has made it necessary to create advanced techniques for analyzing them; clinical research and some medical practices are an example. These are undergoing a radical change through the introduction of learning algorithms that facilitate the analysis of an enormous amount of patient information. One of the techniques used to build algorithms able to learn from past events and predict unknown ones is machine learning. The challenge addressed in this document is the creation of an as-a-service decision support system that uses a knowledge base built with machine learning techniques. In healthcare, these technologies find applications both in formulating diagnoses and in predicting the risk of a patient suffering from certain diseases. Risk prediction can be understood as: the probability of a patient being admitted to hospital within the next 30 days, the probability of worsening, or the possibility of dying. The goal of this work is to provide a first working prototype of a Clinical Decision Support System, focusing on the construction of a system architecture that guides the analysis of the best technologies to use.
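As a toy illustration of the risk-prediction idea, a service endpoint could reduce a patient's features to a 30-day readmission probability. The feature set, weights, and logistic form here are assumptions for the sketch, not the thesis's actual model:

```python
import math

def readmission_risk(features, weights, bias):
    """Logistic risk score: maps patient features (e.g. age, number of
    prior admissions) to a probability of readmission within 30 days."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid turns the score into a probability
```

A Clinical Decision Support System exposed "as a service" would wrap such a score behind an API, with the weights produced offline by the machine learning pipeline.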
48

Järkeborn, Sandra, and Vera Werner. "Automatisering av kundtjänst med maskininlärning : Hur maskininlärning kan användas inom kundtjänst samt hur detta påverkar företagskultur och kundnöjdhet." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2018. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-241110.

Full text
Abstract:
Artificial intelligence and machine learning have been on the radar for a while now. What started as a groundbreaking phenomenon is now incorporated into people's everyday lives. With all the benefits of personalization and "smart" features, there is also a concern that the world is becoming too inhumane, impersonal, and dependent on machines. This is an investigation touching upon these issues, applied to Parks & Resorts, a highly value- and culture-driven company. The purpose was to analyze the pros and cons of automating customer support, and whether it is possible to do so without cutting corners, taking the company's values into account. The results show that there are benefits to implementing a chatbot, especially when it comes to efficiency, but also that implementing one good enough to attain current standards would require considerable resources.
49

Chevallier, Marc. "L’Apprentissage artificiel au service du profilage des données." Electronic Thesis or Diss., Paris 13, 2022. http://www.theses.fr/2022PA131060.

Full text
Abstract:
The digital transformation that has taken place rapidly within companies over the last few decades has led to a massive production of data. Once the problems related to storing those data were solved, their use within Business Intelligence (BI) or Machine Learning (ML) became a major objective for companies seeking to make their data profitable. But exploiting the data is complex because it is poorly documented and often contains many errors. It is in this context that the fields of data profiling and data quality (DQ) have become increasingly important: profiling aims at extracting informative metadata from the data, and data quality aims at quantifying the errors in the data. Profiling being a prerequisite to data quality, we focused our work on this subject through the use of metadata vectors resulting from simple profiling actions. These simple information vectors allowed us to perform advanced profiling tasks, in particular the prediction of complex semantic types using machine learning. The metadata vectors we used are large and are therefore affected by the curse of dimensionality, a term that refers to a set of performance problems that occur in machine learning when the number of dimensions of the problem increases. One method to solve these problems is to use genetic algorithms to select a subset of dimensions with good properties. In this framework we proposed two improvements: on the one hand, a non-random initialization of the individuals composing the initial population of the genetic algorithm; on the other hand, modifications to the genetic algorithm with aggressive mutations (GAAM) in order to improve its performance.
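Genetic feature selection of the kind discussed above can be sketched as follows; this is a generic illustration (the population size, the single bit-flip mutation, and the elitist survivor rule are assumptions, not the thesis's GAAM variant), showing where a non-random seed individual, as proposed in the thesis, plugs into the initial population:

```python
import random

def ga_select(n_features, fitness, pop_size=10, generations=20, seed_mask=None):
    """Genetic feature selection: bit i of an individual marks whether
    metadata dimension i is kept. One individual of the initial population
    can be seeded non-randomly, e.g. from a profiling heuristic."""
    rng = random.Random(0)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    if seed_mask is not None:
        pop[0] = list(seed_mask)          # non-random initial individual
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]  # elitism: best half survives
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_features)
            child = a[:cut] + b[cut:]      # one-point crossover
            i = rng.randrange(n_features)  # single bit-flip mutation
            child[i] = 1 - child[i]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)
```

A good seed individual survives every elitist selection round, so the search can only improve on it, which is the intuition behind the non-random initialization.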
50

Berggren, Oliver, and Zina Matti. "A Framework for Defining, Measuring, and Predicting Service Procurement Savings." Thesis, KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021. http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-299348.

Full text
Abstract:
Recent technical advances have paved the way for transformations such as Industry 4.0, Supply Chain 4.0, and new ways for organizations to utilize services to meet the needs of people. In the midst of this shift, a focus has been put on service procurement to meet the demand for everything from cloud computing and information technology to software solutions that support operations or add value to the end customer. Procurement is an integral part of organizations and typically accounts for a substantial part of their costs. Analyzing savings is one of the primary ways of measuring cost reduction and performance. This paper examines how savings can be defined and measured in a unifying way, and determines whether machine learning can be used to predict service purchase costs. Semi-structured interviews were utilized to find definitions and measurements. Three decision-tree ensemble machine learning models, XGBoost, LightGBM, and CatBoost, were evaluated to study cost prediction. The results indicate that cost reduction and cost avoidance should be seen as a financial and a performance measure, respectively. Spend and capital binding can be controlled by a budget reallocation system and could be improved further with machine learning cost prediction. The best performing model was XGBoost, with a MAPE of 14.17%, compared to the base model's MAPE of 40.24%. This suggests that budget setting and negotiation can be aided by more accurately predicting cost through machine learning, which in turn can have a positive impact on an organization's resource allocation and profitability.
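The accuracy figures above are Mean Absolute Percentage Errors. As a reminder of what a MAPE of 14.17% means, the metric can be computed as follows (a minimal sketch, assuming nonzero actual costs):

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error: average relative deviation of
    predicted costs from actual costs, expressed in percent."""
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)
```

A lower MAPE means predictions deviate less, in relative terms, from realized purchase costs, which is why the drop from 40.24% to 14.17% matters for budget setting.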
