PhD Position F/M [Campagne CORDI-S] - Theory and practice of modular implicits: Design, formalization, implementation, and applications

Contract type : Fixed-term contract

Level of qualifications required : Graduate degree or equivalent

Fonction : PhD Position

Assignment

Context

A standard programming pattern in typed languages is to build types de- scribing domain-specific data by composing simpler (parametrized) types de- scribing separate, simpler aspects of the domain. For example, an (student * grade list) sorted list can be used to describe the grades of a group of students. Functions that operate on this complex type can then be written by composing functions on the simpler types: a simple printing function for group grades can be obtained as:

Sorted_list.print
(Pair.print Student.print (List.print Grade.print))

An obvious limitation of this approach, however, is that it involves repeat- edly writing function compositions that follow the structure of the types, in a way that is verbose, mechanical and tedious for programmers, for any sort of type-directed service that we want to be able to extend with user-implemented functions.

Various typed languages (Haskell, Scala, Coq, Agda, etc.) introduce mecha- nisms where such type-directed code is inferred from the static type information, by teaching the language how to implement a given interface for each type and letting an elaboration process (a form of Prolog-like proof search) generate com- plex applications at larger types. Most of those languages use type classes (which combine inferred operations in classes), and Scala uses implicits (which infer a single value of arbitrary type). Most ML-family languages (SML, OCaml) lack this feature, which is perceived as a serious limitation by some users.

The proposed topic is modular implicits, a serious proposal to implement such an implicit mechanism in the OCaml programming language.

Scientific Context. An early proposal to extend ML-family programming lan- guages with a type-class-like mechanism was introduced by Dreyer, Harper, Chakravarty, and Keller (2007), but without any attempt to elaborate module arguments. In parallel, emulation of type classes with implicit arguments was first proposed by Lewis, Launchbury, Meijer, and Shields (2000), but became more popular after their introduction in the Scala language (Oliveira, Moors, and Odersky, 2010). Inference of implicit arguments has also been formalized in an explicit bidirectional type system by Oliveira, Schrijvers, Choi, Lee, and Yi (2012).

Modular implicits are a proof of concept proposal for OCaml inspired by these approaches, but using the module system instead of the core language to emulate type classes and therefore using implicit module arguments to automate the construction of dictionaries. They were described by White, Bour, and Yallop (2014), along with a prototype branch of the OCaml compiler of the time. The proposal is quite promising, as it can emulate overloading as permitted by Haskell type classes, but at the level of modules, hence in a way that is modular by construction and well-integrated in a language such as OCaml.

Unfortunately, this proposal has remained at the prototype level for almost a decade without being integrated, and is currently stalled. The problem is that many aspects need to be further explored, and formalized, in order to obtain a robust and efficient implementation of modular implicits.
Making progress on this question requires strong expertise, new results, and would have a major impact on the OCaml programming language.

An overview of modular implicits

A first mandatory step to modular implicits is to extend the module language with modular explicits by adding new constructs to ease the interaction between the module and core languages; in particular, viewing first-class functors that return a single value as core-language functions, called dependent functions, and their application to module arguments as core-language applications.

Then, modular implicits amount to optionally leaving arguments to depen- dent functions implicit and reconstruct them from context, as simple module paths. This requires two separate mechanisms that however closely interact. On the one hand, both the positions and signatures of omitted arguments in applications of dependent functions must be inferred. On the other hand, mod- ule paths must be reconstructed from their signatures. Simple (module) paths are a subset of module paths recursively composed of initial paths, composed of projections from module variables, which must have previously been declared for implicit use, and applications of simple paths to simple paths. Their recon- struction from module signatures requires instantiation of core-language types, which in turn may change the signatures of other dependent functions, hence the interaction.

Challenges

A robust and efficient design is quite challenging, as it raises many issues.

Coherence is one of the main issue when inferring code (here module paths) from types. It means that the solutions should be unique up to pro- gram equivalence—or otherwise rejected. This is indeed necessary to have a well-defined, deterministic semantics. An easy solution to ensure coherence, followed by (Oliveira, Schrijvers, Choi, Lee, and Yi, 2012), is to have unique solutions by design, typically by giving an algorithmic specification that searches for solutions in order and only returns the first well-typed solution found. This is unsatisfactory as it can be understood as a way of hiding incoherence rather than ensuring coherence.
Checking program equivalence is undecidable in general and could be quite expensive to check exactly in practice even on decidable subsets. In the case of modular explicits, an efficient solution is to rely on typing to track (a static approximation of) module aliasing to ensure program equivalence. This amounts to approximate module identities at the level of types by type identities. To be effective this requires applicative functors (already in OCaml) and the use of transparent ascription (to be added to OCaml), which altogether allow to track type identities across functor applications.

Completeness of type inference with respect to its declarative specifica- tion, which is a desired property that is however sometimes compromised in real- world rich languages, becomes a requirement with modular implicits. Indeed, since coherence relies on uniqueness of well-typed solutions, an implementation that would miss well-typed solutions could also miss sources of incoherence and therefore accept programs whose semantics would be ambiguous.

Predictability and robustness of typechecking becomes a key issue, as the semantics now depends on typechecking. Since typing is a static ap- proximation of values, a type system is by design incomplete with respect to the dynamic semantics. Instead, we often evaluate a type system by its ex- pressiveness. Predictability, which means the ability for the user to guess when typechecking should succeed or fail is also an important criteria. It becomes a key property to understand the semantics of programs when their semantics is driven by types.
One way to ensure predictability is the robustness of typechecking, i.e., the preservation of well-typedness by some semantic preserving program transfor- mations. Subject reduction, which means that typing is preserved by reduction, is just one such property. Stability by eta-expansion (of functions), commu- tation of arguments, etc. are also good criteria that helps the user predict well-typedness and contribute to the robustness of the type system. Robustness is not binary and may still have to be balanced with expressiveness. Hence, several strategies may be explored and evaluated on real-world examples.

Efficiency of typechecking will require specific optimizations to ensure that modular implicits do not hinder the user’s experience.

An environment of modules eligible as implicit arguments is built during type-checking from which modules are retrieved by types up to subtyping. Since this environment may be populated with many standard libraries and grow large, it becomes important to explore different techniques (hashing, caching, discriminator trees, etc.) and other algorithmic optimizations to reduce the cost of searching for implicit module arguments.
Implicit proof search can suffer from bad complexity (a naive implementation can easily be exponential) or even non-termination; this has been found to be a problem in practice in Coq, for example. Even when the complexity is kept in check, the sort of type-checking problem it creates is different from problems found in user programs, and may require specific optimizations. Indeed, most user programs are well-typed almost everywhere, with a few type errors at most, so typical type checkers are designed to be fast when they succeed, but can be slower when they fail to produce good errors, etc. On the contrary, elaboration requires discarding quickly a large number of potential choices that are ill-typed.

Recent explorations of such optimizations in an implementation prototype by Samuel Vivien have shown to be quite effective.

Further exploration

Applications of modular implicits are still to be explored. While origi- nally motivated as an alternative to Haskell type classes in a language already equipped with modules, modular implicits are just implicit arguments to depen- dent functions. That is, a less structured, more lightweight mechanism, which could be used for other purposes. In fact, examples written in the original pro- totype already show different programming styles for overloading. In addition to the many examples taken from type classes in Haskell or implicits arguments in Scala, it would be worth exploring their use in other programming patterns.

While modules have originally been designed to program in the large, mod- ular implicits will be used—quite extensively—to program in the small. This will induce both a heavy use of modules and a stronger interaction between the core and module levels, which in turn may require additional support from the programming language to improve the users’ experience and from the compiler to avoid (or just reduce) the overhead when programming at a more abstract level.

Stratification between core-level and module-level expressions will be re- duced by modular implicits. It could perhaps be reconsidered taking inspira- tion either from 1ML (Rossberg and Dreyer, 2013) or from the Dot calculus by (Amin, Rompf, and Odersky, 2014) introduced to model the core of Scala. Alternatively, the stratification could be preserved and reinforced by keeping a clear separation of expressions into two levels that smoothly interact.

Mechanizing the meta-theory of modular implicits, or at least of the un- derlying modular explicits would be another useful and timely extension of the work, made possible by recent advances in the formalization of real-world lan- guages.

References

Nada Amin, Tiark Rompf, and Martin Odersky. Foundations of path-dependent types. In
Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, OOPSLA ’14, pages 233–249, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2585-1. doi: 10.1145/2660193.2660216. URL https://doi.org/10.1145/2660193.2660216.
Derek Dreyer, Robert Harper, Manuel M. T. Chakravarty, and Gabriele Keller. Modular type classes. In Martin Hofmann and Matthias Felleisen, editors, Proceedings of the 34th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2007, Nice, France, January 17-19, 2007, pages 63–70. ACM, 2007. ISBN 1-59593-575-4. doi: 10.1145/1190216.1190229. URL https://doi.org/10.1145/1190216.1190229.
Jeffrey R. Lewis, John Launchbury, Erik Meijer, and Mark B. Shields. Implicit Parameters: Dynamic Scoping with Static Types. In Proceedings of the 27th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL ’00, pages 108–118, New York, NY, USA, 2000. ACM. ISBN 1-58113-125-9. doi: 10.1145/325694.325708. URL http://doi.acm.org/10.1145/325694.325708. event-place: Boston, MA, USA.
Bruno C. d S. Oliveira, Adriaan Moors, and Martin Odersky. Type classes as objects and im- plicits. In Proceedings of the 25th Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2010, October 17-21, 2010, Reno/Tahoe, Nevada, USA, pages 341–360, 2010. doi: 10.1145/1869459.1869489. URL https://doi.org/10.1145/1869459.1869489.
Bruno C. d S. Oliveira, Tom Schrijvers, Wontae Choi, Wonchan Lee, and Kwangkeun Yi. The implicit calculus: a new foundation for generic programming. In Jan Vitek, Haibo Lin, and Frank Tip, editors, ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI ’12, Beijing, China - June 11 - 16, 2012, page 35–44. ACM, 2012. doi: 10.1145/2254064.2254070. URL https://doi.org/10.1145/2254064.2254070.
Andreas Rossberg and Derek Dreyer. Mixin’Up the ML Module System. ACM Trans. Pro- gram. Lang. Syst., 35(1):2:1–2:84, April 2013. ISSN 0164-0925. doi: 10.1145/2450136. 2450137. URL http://doi.acm.org/10.1145/2450136.2450137.
Leo White, Fr ́ed ́eric Bour, and Jeremy Yallop. Modular implicits. In Oleg Kiselyov and Jacques Garrigue, editors, Proceedings ML Family/OCaml Users and Developers work- shops, ML/OCaml 2014, Gothenburg, Sweden, September 4-5, 2014., volume 198 of EPTCS, pages 22–63, 2014. doi: 10.4204/EPTCS.198.2. URL https://doi.org/10.4204/EPTCS.198.2.

Main activities

Working on bibliography, writing papers, programming prototypes, writing machine-checked proofs, giving talks, writing and submitting a thesis manuscript.

Benefits package

Subsidized meals
Partial reimbursement of public transport costs
Leave: 7 weeks of annual leave + 10 extra days off due to RTT (statutory reduction in working hours) + possibility of exceptional leave (sick children, moving home, etc.)
Possibility of teleworking
Flexible organization of working hours (after 12 months of employment)
Professional equipment available (videoconferencing, loan of computer equipment, etc.)
Social, cultural and sports events and activities
Access to vocational training

Remuneration

Monthly gross salary : 2100 € during the first and second years. 2190 € the last year.

Apply for this position

General Information

Theme/Domain : Proofs and Verification
Town/city : Paris
Inria Center : Centre Inria de Paris
Starting date : 2024-10-01
Duration of contract : 3 years
Deadline to apply : 2024-05-19

Warning : you must enter your e-mail address in order to save your application to Inria. Applications must be submitted online on the Inria website. Processing of applications sent from other channels is not guaranteed.

Instruction to apply

The application must contain:

CV
Cover letter
Master's notes
Letters of recommendation

Defence Security :
This position is likely to be situated in a restricted area (ZRR), as defined in Decree No. 2011-1425 relating to the protection of national scientific and technical potential (PPST).Authorisation to enter an area is granted by the director of the unit, following a favourable Ministerial decision, as defined in the decree of 3 July 2012 relating to the PPST. An unfavourable Ministerial decision in respect of a position situated in a ZRR would result in the cancellation of the appointment.

Recruitment Policy :
As part of its diversity policy, all Inria positions are accessible to people with disabilities.

Contacts

Inria Team : CAMBIUM
PhD Supervisor :
Remy Didier / Didier.Remy@inria.fr

About Inria

Inria is the French national research institute dedicated to digital science and technology. It employs 2,600 people. Its 200 agile project teams, generally run jointly with academic partners, include more than 3,500 scientists and engineers working to meet the challenges of digital technology, often at the interface with other disciplines. The Institute also employs numerous talents in over forty different professions. 900 research support staff contribute to the preparation and development of scientific and entrepreneurial projects that have a worldwide impact.