2019-01647 - PhD Position F/M An availability-aware NFVs placement using Deep Reinforcement Learning
The job description below is in English

Contract type: Public service fixed-term contract

Required degree level: Master's degree (Bac + 5) or equivalent

Position: PhD student

About the research centre or functional department

Inria, the French national research institute for the digital sciences, promotes scientific excellence and technology transfer to maximise its impact.
It employs 2,400 people. Its 200 agile project teams, generally with academic partners, involve more than 3,000 scientists in meeting the challenges of computer science and mathematics, often at the interface of other disciplines.
Inria works with many companies and has assisted in the creation of over 160 startups.
It strives to meet the challenges of the digital transformation of science, society and the economy.

Context and assets of the position

This PhD will take place within a CIFRE collaboration between EXFO and the Dionysos team (Inria Rennes).

EXFO develops smarter network test, monitoring and analytics solutions for the world’s leading telecommunications service providers, network equipment manufacturers and webscale companies. With nearly 1,900 employees in more than 25 countries, EXFO is no. 1 worldwide in fiber optic test solutions and has the largest active assurance deployment. Its broad portfolio of intelligent hardware and software solutions enables its customers’ network transformations related to fiber, 5G, virtualization and big data analytics.

Inria is a leading French research institute in computer science. Research in the Dionysos team focuses on identifying, designing and selecting the most appropriate network architectures for a communication service, and on developing the computing and mathematical tools needed for these tasks. These objectives lead to two complementary research areas: the qualitative aspects of systems (e.g. protocol testing) and the quantitative aspects, which are essential to the correct dimensioning of these architectures and their associated services (performance, dependability, Quality of Service, Quality of Experience).

Assignment

Summary of the project

Building high-performance, highly available and highly scalable virtualized infrastructure is one of the most important challenges facing infrastructure providers (InPs) [1-2]. The dynamic deployment of services, in the form of virtual network functions (VNFs), requires the on-demand deployment of monitoring services to ensure that these functions operate properly [3]. This is a requirement of the MANO architectural framework proposed by ETSI [4] and of its various implementations, such as ONAP [5].

The deployment of on-demand monitoring services raises new challenges. Determining the number of instances to deploy in order to ensure high availability of the monitoring infrastructure remains an open problem. This infrastructure, which may itself become heavily loaded, should scale automatically to absorb incoming traffic. In addition, the overhead introduced by monitoring calls for strategic placement of the monitoring instances within the infrastructure. The problem is particularly hard when failures occur, because the infrastructure must then adapt by setting up optimal routing and load balancing so that the instances still in operation take over the affected traffic.
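As a rough illustration of the instance-count question, the sketch below computes the smallest number of monitoring instances that meets an availability target. It rests on strong simplifying assumptions that are not part of the project description: instances fail independently, each is up with the same probability `a`, and the service counts as available when at least `k` instances are up. The function names `service_availability` and `min_instances` are hypothetical.

```python
from math import comb


def service_availability(n: int, k: int, a: float) -> float:
    """Probability that at least k of n instances are up.

    Assumes independent instances, each up with probability a
    (a simplifying assumption made for this illustration only).
    """
    return sum(comb(n, i) * a**i * (1 - a) ** (n - i) for i in range(k, n + 1))


def min_instances(k: int, a: float, target: float, n_max: int = 100) -> int:
    """Smallest n such that at least k of n instances are up with probability >= target."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 < a <= 1.0:
        raise ValueError("a must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    for n in range(k, n_max + 1):
        if service_availability(n, k, a) >= target:
            return n
    raise ValueError("target not reachable within n_max instances")


if __name__ == "__main__":
    # Two instances are needed to carry the load, each is up 95% of the time,
    # and the service should be available with probability at least 0.999.
    print(min_instances(k=2, a=0.95, target=0.999))
```

In practice failures are often correlated (shared hosts, shared links), which is one reason placement matters and why a learned policy may do better than this closed-form estimate.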

In this thesis, we propose to exploit the potential of Deep Learning, and more particularly of Deep Reinforcement Learning, to guarantee the high availability of the placed VNFs.

References

[1] I. Afolabi, T. Taleb, K. Samdanis, A. Ksentini, and H. Flinck, “Network slicing and softwarization: A survey on principles, enabling technologies, and solutions,” IEEE Communications Surveys & Tutorials, vol. 20, pp. 2429–2453, third quarter 2018.
[2] B. Yi, X. Wang, K. Li, S. K. Das, and M. Huang, “A comprehensive survey of network function virtualization,” Computer Networks, vol. 133, pp. 212–262, 2018.
[3] G. Gardikis, I. Koutras, G. Mavroudis, S. Costicoglou, G. Xilouris, C. Sakkas, and A. Kourtis, “An integrating framework for efficient NFV monitoring,” in 2016 IEEE NetSoft Conference and Workshops (NetSoft), pp. 1–5, June 2016.
[4] ETSI, “Network functions virtualisation (NFV); architectural framework,” ETSI GS NFV 002 v1.2.1, ETSI, 2014.
[5] ONAP, “Open network automation platform.” https://www.onap.org/, 2019. [Online].
[6] S. Basterrech, G. Rubino, and V. Snášel, “Sensitivity analysis of echo state networks for forecasting pseudo-periodic time series,” in 2015 7th International Conference of Soft Computing and Pattern Recognition (SoCPaR), pp. 328–333, Nov. 2015.
[7] I. Alawe, Y. Hadjadj-Aoul, A. Ksentini, P. Bertin, C. Viho, and D. Darche, “An efficient and lightweight load forecasting for proactive scaling in 5G mobile networks,” in CSCN 2018 - IEEE Conference on Standards for Communications and Networking, (Paris, France), pp. 1–6, IEEE, Oct. 2018.

Main activities

Scientific challenges

The scientific challenges that we would like to address in this thesis relate to the high availability and zero-touch deployment of monitoring VNFs.

  • Investigating appropriate metrics to quantify the availability of a virtual network function (VNF). This could be based on a single metric such as the incoming traffic, but could also be multi-dimensional, combining CPU, RAM and other metrics.
  • Proposing an availability-aware strategy for VNF scaling using machine learning techniques. In this phase, time series prediction techniques could be used, as previously studied by the researchers involved in this thesis [6-7].
  • Extending the proposed approaches to deal with failure events. In the event of a failure, the monitoring network often has to be rearranged, which involves re-routing the traffic affected by the failure in order to preserve the high availability of the monitoring network.
  • Evaluating the proposed solutions by simulation or experimentation in relevant scenarios (resources are available in EXFO's IaaS).
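To make the idea of prediction-driven scaling concrete, here is a minimal sketch that forecasts the next load sample and converts it into an instance count. It uses Holt's linear exponential smoothing purely as a simple stand-in for whatever learned predictor the thesis would adopt; it is not the method of [6-7]. The function names, the smoothing parameters, the per-instance capacity and the headroom factor are all assumptions introduced for this illustration.

```python
from math import ceil
from typing import Sequence


def holt_forecast(series: Sequence[float], alpha: float = 0.5, beta: float = 0.3) -> float:
    """One-step-ahead forecast using Holt's linear (double) exponential smoothing."""
    if len(series) < 2:
        raise ValueError("need at least two observations")
    # Initialise the level with the first sample and the trend with the first difference.
    level, trend = series[0], series[1] - series[0]
    for x in series[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + trend


def instances_needed(
    forecast: float,
    capacity_per_instance: float,
    headroom: float = 0.2,
    minimum: int = 1,
) -> int:
    """Instances required to serve the forecast load plus a safety margin."""
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    load = max(forecast, 0.0) * (1 + headroom)
    return max(minimum, ceil(load / capacity_per_instance))


if __name__ == "__main__":
    # Hypothetical traffic samples (arbitrary units) with a steady upward trend.
    traffic = [10.0, 20.0, 30.0, 40.0]
    predicted = holt_forecast(traffic)
    print(predicted, instances_needed(predicted, capacity_per_instance=25.0))
```

Scaling on the forecast rather than on the current load gives new instances time to boot before the traffic arrives; the headroom factor absorbs part of the forecast error.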

The validation of the proposed solutions will require:

  • Building, maintaining and evolving the current architectural design of the virtual monitoring solution to meet DevOps and automation best practices.
  • Performing tests on integrated IT/network test-bed platforms that involve coordination across multiple open source projects (OpenStack, Docker, Cloudify, etc.).

Depending on progress, we will propose to extend this last concept to other failure cases, which require more complex recognition/detection strategies. In the thesis, we will take care to properly quantify the metrics relevant to the objectives, chief among them availability in its appropriate forms (steady-state availability, point availability, interval availability, etc.).
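For reference, the availability measures mentioned above are usually defined as follows. The notation is introduced here for illustration: X(t) equals 1 when the system is operational at time t and 0 otherwise, and MTTF and MTTR denote the mean time to failure and the mean time to repair. The closed form for the steady-state availability holds under standard assumptions (for example, alternating up and down periods with finite means) and is given as a common textbook case, not as the model the thesis will necessarily use.

```latex
% Point availability: probability of being operational at time t
A(t) = \Pr\{X(t) = 1\}

% Interval availability: expected fraction of [0, T] spent operational
A_I(T) = \frac{1}{T}\,\mathbb{E}\!\left[\int_0^T X(t)\,dt\right]
       = \frac{1}{T}\int_0^T A(t)\,dt

% Steady-state (equilibrium) availability, when the limit exists
A_\infty = \lim_{t \to \infty} A(t)
         = \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}
```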

Skills

Requirements:

  • Master’s degree in computer science (or in a highly related area) by the starting date of the PhD
  • Self-starter with strong analytical and problem-solving skills
  • Ability to adapt quickly to an existing, complex environment
  • Teamwork and good communication skills in English, both spoken and written

Technical skills:

  • Virtualization & Automation:
    • Good virtualization knowledge (Linux/KVM, VMware vSphere, etc.)
    • Strong background in cloud computing management platforms; experience with OpenStack, Docker or Kubernetes would be a real plus
    • Experience with cloud automation tools and technologies such as Ansible, Heat, NETCONF and YANG would be a plus
  • Networking:
    • DPDK, SR-IOV, Direct-IO, OVS
    • Tunneling and encapsulation: MPLS, VXLAN, GRE, UDP
    • Knowledge of Software Defined Networking; proven hands-on experience with SDN controllers (OpenDaylight, ONOS, Juniper Contrail, etc.) would be appreciated
  • Software:
    • Proficient with Linux environments (Ubuntu, RedHat/CentOS, etc.)
    • Solid scripting skills (Shell, Python, etc.)
    • Strong programming skills; experience with Python or C/C++ applied to machine learning is a real plus
    • Microservices and building REST APIs

Benefits

  • Subsidized meals
  • Partial reimbursement of public transport costs
  • Professional equipment available (videoconferencing, loan of computer equipment, etc.)
  • Social, cultural and sports events and activities
  • Access to vocational training
  • Social security coverage