I am a PhD student at the Computer Science and Systems Laboratory, working on a thesis funded under a CIFRE contract with Parsec. My subject is consistency in distributed systems, and since September 2023 I have been exploring the topic "Weak coherence for cloud zero-trust". In collaboration with Parsec, my aim is to address the challenges of security, confidentiality and high availability in real-time collaborative applications. This research takes place in a context marked by the widespread use of software as a service (SaaS), which makes it important to rethink traditional client-server architectures.
Introduction
Real-time collaborative applications have gained in popularity with the spread of remote working. However, most of these applications rely on centralized client-server architectures, which raises major security and confidentiality concerns. Having to trust a third party to manage data and being exposed to denial-of-service attacks are both challenges to be overcome.
The issues
We are a long way from the early days of Leslie Lamport (often called the father of distributed computing), when the problem of several actors working on a common task only concerned multi-core processors. Today, this problem extends to much wider contexts and applications, forcing us to take into account additional parameters such as latency and security.
Today's systems are asynchronous (each participant proceeds at its own pace) and subject to crashes, malicious attacks, long and irregular communication delays, and message loss. This makes real-time collaborative tasks very hard to implement and forces significant trade-offs between the resilience, consistency and availability of the system (as captured by Brewer's CAP theorem)[Brewer99].
Towards Innovative Solutions
To meet these challenges, we suggest exploring solutions that do not rely on a trusted third party, through zero-trust and/or peer-to-peer approaches. These approaches aim to guarantee a high level of security while ensuring system resilience. However, maintaining robust performance, particularly in terms of high availability, requires innovative solutions and careful thought about which trade-offs are achievable.
This leads us to reconsider the very definition of consistency in our system, and to consider weakening it in order to gain resilience and availability. Researchers in distributed algorithms refer to these solutions as weak consistency (or weak coherence) criteria [Dubois86], and define different families of them depending on the intended use cases.
Weak Coherence at the Heart of Solutions
The landscape of weak coherence criteria can be divided into groups, each satisfying a specific property[Perrin17]. These groups are not exclusive and may overlap. The more properties a coherence criterion satisfies, the more costly it is to implement, and the more it constrains the system's resilience and availability.
The aim is therefore to strike a balance between these properties, guaranteeing a sufficient level of consistency while maintaining robust performance. This requires an in-depth understanding of the trade-offs between the different consistency properties and of the algorithms that can guarantee them.
Algorithmic results
Recent work has explored algorithmic solutions to guarantee weak consistency in cloud environments, taking into account possible crashes, system openness or even malicious behavior[Frey23, Kleppmann22, Nicolaescu16]. This research shows promising advances, but challenges persist: the solutions presented place strong constraints on the final applications, which limits the features they can offer.
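To make the idea of weak consistency more concrete, here is a minimal sketch of a state-based grow-only counter, a textbook example of a conflict-free replicated data type (CRDT). It is not the algorithm of any of the works cited above, and it tolerates neither crashes with data loss nor malicious replicas. The class name `GCounter` and the replica names are illustrative assumptions. The sketch only shows how replicas can accept updates without coordination and still converge once they have exchanged their states.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class GCounter:
    """Illustrative state-based grow-only counter: one slot per replica."""

    replica_id: str
    counts: Dict[str, int] = field(default_factory=dict)

    def increment(self, amount: int = 1) -> None:
        # A replica only ever writes to its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Element-wise maximum is commutative, associative and idempotent,
        # so states can arrive in any order and more than once.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self) -> int:
        return sum(self.counts.values())


# Two replicas accept updates locally, without coordinating.
alice = GCounter("alice")
bob = GCounter("bob")
alice.increment(2)
bob.increment(3)

# States are exchanged later; a duplicated message is harmless.
alice.merge(bob)
bob.merge(alice)
bob.merge(alice)

converged = alice.value() == bob.value() == 5
```

Between the local increments and the merges, the two replicas disagree on the value of the counter. That temporary divergence is the price paid for availability, and it also illustrates the constraint mentioned above: the data type has to be designed so that concurrent updates always merge cleanly, which restricts what the application can express.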
Towards a Zero-Trust Cloud
Weakening the coherence of our system therefore helps to increase resilience and availability, but does not in itself answer all the challenges raised by real-time collaboration. Questions relating to data security and confidentiality remain open. A highly resilient and available system makes little sense if the data it handles is not secure. This is where the notion of cloud zero-trust (which we explain in more detail in this article) can play a key role in reducing the attack surface. In a system with no trusted intermediaries, the only remaining potential attackers are the end users who work with the data.
Unfortunately, this security choice is not without consequences for the system architecture, and it restricts the possible solutions for achieving weak consistency. The zero-trust approach reduces the central server(s) to the role of a simple message relay, which makes our system very similar to a peer-to-peer system. Solutions in which a central server takes on a significant algorithmic load to resolve consistency issues are no longer available, which further narrows the design space.
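The role of the server as a simple relay can be sketched as follows. This is a minimal illustration under stated assumptions, not Parsec's architecture: the `Relay` class, the `seal` and `open_packet` helpers and the pre-shared key are all hypothetical. The sketch covers integrity only, using an HMAC from the Python standard library. A real system would also encrypt the payload for confidentiality, which is left out here because the standard library provides no authenticated cipher.

```python
import hashlib
import hmac
from collections import defaultdict
from typing import Dict, List, Optional

TAG_LEN = hashlib.sha256().digest_size  # length of an HMAC-SHA256 tag


class Relay:
    """Hypothetical untrusted server: stores and forwards opaque bytes."""

    def __init__(self) -> None:
        self.mailboxes: Dict[str, List[bytes]] = defaultdict(list)

    def publish(self, recipients: List[str], packet: bytes) -> None:
        # The relay never parses or merges the payload.
        for recipient in recipients:
            self.mailboxes[recipient].append(packet)

    def fetch(self, recipient: str) -> List[bytes]:
        # Hand over pending packets and empty the mailbox.
        return self.mailboxes.pop(recipient, [])


def seal(key: bytes, payload: bytes) -> bytes:
    # Client side: prepend an authentication tag to the payload.
    return hmac.new(key, payload, hashlib.sha256).digest() + payload


def open_packet(key: bytes, packet: bytes) -> Optional[bytes]:
    # Client side: return the payload only if the tag is valid.
    tag, payload = packet[:TAG_LEN], packet[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None


# Assumption: the key was exchanged between clients out of band.
key = b"shared-secret-known-to-clients-only"
relay = Relay()
relay.publish(["bob"], seal(key, b"insert 'x' at 3"))

honest = relay.fetch("bob")[0]
tampered = honest[:-1] + b"!"  # what a misbehaving relay might deliver

accepted = open_packet(key, honest)    # the original payload
rejected = open_packet(key, tampered)  # None: tampering is detected
```

Since the relay cannot read or combine updates, any logic that reconciles concurrent edits has to run on the clients, which is exactly why server-assisted consistency protocols are ruled out in this setting.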
Data-centric security
Even though weak consistency helps to increase resilience and availability, and the zero-trust approach reduces the attack surface, the data handled by the system remains vulnerable to the actions of end users. This risk can nevertheless be controlled by adopting a data-centric approach to security. This is the subject of this article.
Conclusion
Balancing security, confidentiality and performance in real-time collaborative applications is a major challenge. The study of weak coherence criteria and the exploration of innovative solutions pave the way for promising developments, but questions remain about how to apply them in practice in a zero-trust context. Future research should focus on formalizing these concepts so that they can be implemented effectively.
Sources
[Frey23] Process-Commutative Distributed Objects: From Cryptocurrencies to Byzantine-Fault-Tolerant CRDTs
[Kleppmann22] Making CRDTs Byzantine fault tolerant
[Perrin17] Concurrence and coherence in distributed systems
[Nicolaescu16] Near Real-Time Peer-to-Peer Shared Editing on Extensible Data Types
[Dubois86] Memory access buffering in multiprocessors
[Brewer99] Harvest, yield, and scalable tolerant systems
Amaury JOLY - Doctoral student CIFRE LIS - R&D Engineer Parsec