diff --git a/proposals/1763-configurable-retention-periods.md b/proposals/1763-configurable-retention-periods.md new file mode 100644 index 00000000000..804e25f556d --- /dev/null +++ b/proposals/1763-configurable-retention-periods.md @@ -0,0 +1,388 @@ +# Proposal for specifying configurable per-room message retention periods. + +A major shortcoming of Matrix has been the inability to specify how long events +should stored by the servers and clients which participate in a given room. + +This proposal aims to specify a simple yet flexible set of rules which allow +users, room admins and server admins to determine how long data should be stored +for a room, from the perspective of respecting the privacy requirements of that +room (which may range from a "burn after reading" ephemeral conversation, +through to FOIA-style public record keeping requirements). + +As well as enforcing privacy requirements, these rules provide a way for server +administrators to better manage disk space (e.g. to enforce rules such as "don't +store remote events for public rooms for more than a month"). + +This proposal originally tried to also define semantics for per-message +retention as well as per-room; this has been split out into +[MSC2228](https://github.com/matrix-org/matrix-doc/pull/2228) in order to get +the easier per-room semantics landed. + + +## Problem + +Matrix is inherently a protocol for storing and synchronising conversation +history, and various parties may wish to control how long that history is stored +for. + +Room administrators, for instance, may wish to control how long a message can be +stored (e.g. to comply with corporate/legal requirements to store message +history for at least a specific amount of time), or how early a message can be +deleted (e.g. to address privacy concerns of the room's members, to avoid +messages staying in the public record forever, or to comply with corporate/legal +requirements to only store specific kinds of information for a limited amount of +time). + +Additionally, server administrators may also wish to control how long message +history is kept in order to better manage their server's disk space, or to +enforce corporate/legal requirements for the organisation managing the server. + +We would like to provide this behaviour whilst also ensuring that users +generally see a consistent view of message history, without lots of gaps and +one-sided conversations where messages have been automatically removed. + +We would also like to set the expectation that rooms typically have a long +message retention - allowing those who wish to use Matrix to act as an archive +of their conversations to do so. If everyone starts defaulting their rooms to +finite retention periods, then the value of Matrix as a knowledge repository is +broken. + +This proposal does not try to solve the problems of: + * GDPR erasure (as this involves retrospectively changing the lifetime of + messages) + * Bulk redaction (e.g. to remove all messages from an abusive user in a room, + as again this is retrospectively changing message lifetime) + * Specifying history retention based on the number of messages (as opposed to + their age) in a room. This is descoped because it is effectively a disk space + management problem for a given server or client, rather than a policy + problem of the room. It can be solved as an implementation specific manner, or + a new MSC can be proposed to standardise letting clients specify disk quotas + per room. + * Per-message retention (as having a mix of message lifetime within a room + complicates implementation considerably - for instance, you cannot just + purge arbitrary events from the DB without fracturing the DAG of the room, + and so a different approach is required) + + +## Proposal + +### Per-room retention + +We introduce a `m.room.retention` state event, which room admins or moderators +can set to mandate the history retention behaviour for a given room. It follows +the default PL semantics for a state event (requiring PL of 50 by default to be +set). Its state key is an empty string (`""`). + +The following fields are defined in the `m.room.retention` contents: + +* `max_lifetime`: the maximum duration in milliseconds for which a server must + store events in this room. Must be null or an integer in range [0, + 253-1]. If absent or null, should be interpreted as not setting an + upper bound to the room's retention policy. + +* `min_lifetime`: the minimum duration in milliseconds for which a server should + store events in this room. Must be null or an integer in range [0, + 253-1]. If absent or null, should be interpreted as not setting a + lower bound to the room's retention policy. + +In the instance of both `max_lifetime` and `min_lifetime` being provided, +`max_lifetime` must always be higher or equal to `min_lifetime`. + + +For instance: + +```json +{ + "max_lifetime": 86400000 +} +``` + +The above example means that servers receiving messages in this room should +store the event for only 86400000 milliseconds (1 day), as measured from that +event's `origin_server_ts`, after which they MUST purge all references to that +event (e.g. from their db and any in-memory queues). + +We consciously do not redact the event, as we are trying to eliminate metadata +and save disk space at the cost of deliberately discarding older messages from +the DAG. + +```json +{ + "min_lifetime": 2419200000 +} +``` + +The above example means that servers receiving this message SHOULD store the +event forever, but can choose to purge their copy after 28 days (or longer) in +order to reclaim diskspace. + +```json +{ + "min_lifetime": 2419200000, + "max_lifetime": 15778800000 +} +``` + +The above example means that servers SHOULD store their copy of the event for at least 28 +days after it has been sent, and MUST delete it at the latest after 6 months. + + +## Server-defined retention + +Server administrators can benefit from a few capabilities to control how long +history is stored: + +* the ability to set a default retention policy for rooms that don't have a + retention policy defined in their state +* the ability to override the retention policy for a room +* the ability to cap the effective `max_lifetime` and `min_lifetime` of the rooms the + server is in + +The implementation of these capabilities in the server is left as an +implementation detail. + +We introduce the following authenticated endpoint to allow clients to enquire +about how the server implements this policy: + + +``` +GET /_matrix/client/v3/retention/configuration +``` + +200 response properties: + +* `policies` (required): An object mapping room IDs to a retention policy. If + the room ID is `*`, the associated policy is the default policy. Each policy + follows the format for the content of an `m.room.retention` state event. +* `limits` (required): An object defining the limits to apply to policies + defined by `m.room.retention` state events. This object has two optional + properties, `min_lifetime` and `max_lifetime`, which each define a limit to + the equivalent property of the state events' content. Each limit defines an + optional `min` (the minimum value, in milliseconds) and an optional `max` (the + maximum value, in milliseconds). + +If both `policies` and `limits` are included in the response, the policies +specified in `policies` __must__ comply with the limits defined in `limits`. + +Example response: + +```json +{ + "policies": { + "*": { + "max_lifetime": 15778800000 + }, + "!someroom:test": { + "min_lifetime": 2419200000, + "max_lifetime": 15778800000 + } + }, + "limits": { + "min_lifetime": { + "min": 86400000, + "max": 172800000 + }, + "max_lifetime": { + "min": 7889400000, + "max": 15778800000 + } + } +} +``` + +In this example, the server is configured with: + +* a default policy with a `max_lifetime` of 6 months and no `min_lifetime` (i.e. messages + can only be kept up to 6 months after they have been sent) +* an override for the retention policy in room `!someroom:test` +* limits on `min_lifetime` that + +Example response with no policy or limit set: + +```json +{ + "policies": {}, + "limits": {} +} +``` + +Example response with only a default policy and an upper limit on `max_lifetime`: + +```json +{ + "policies": { + "*": { + "min_lifetime": 86400000, + "max_lifetime": 15778800000 + } + }, + "limits": { + "max_lifetime": { + "max": 15778800000 + } + } +} +``` + +### Defining the effective retention policy of a room + +In this section, as well as in the rest of this document, we define the +"effective retention policy" of a room as the retention policy that is used to +determine whether an event should be deleted or not. This may be the policy +determined by the `m.room.retention` event in the state of the room, but it +might not be depending on limits set by the homeserver. + +The algorithm implementation must implement to determine the effective retention +policy of a room is + + +* if the homeserver defines a specific retention policy for this room, then use + this policy as the effective retention policy of the room. +* otherwise, if the state of the room does not include a `m.room.retention` + event with an empty state key: + * if the homeserver defines a default retention policy, then use this policy + as the effective retention policy of the room. + * if the homeserver does not define a default retention policy, then don't + apply a retention policy in this room. +* otherwise, if the state of the room includes a `m.room.retention` event with + an empty state key: + * if no limit is set by the homeserver use the policy in the state of the + room as the effective retention policy of the room. + * for `min_lifetime` and `max_lifetime`: + * if there is no limit for the property, use the value specified in the + room's state for the effective retention policy of the room (if any). + * if there is a limit for the property: + * if the value specified in the room's state complies with the + limit, use this value for the effective retention policy of the + room. + * if the value specified in the room's state is lower than the + limit's `min` value, use the `min` value for the effective + retention policy of the room. + * if the value specified in the room's state is greater than the + limit's `max` value, use the `max` value for the effective + retention policy of the room. + * if there is no value specified in the room's state, use the + limit's `min` value for the effective retention policy of the + room (which can be null or absent). +* otherwise, don't apply a retention policy in this room. + +So, for example, if a homeserver defines a lower limit on `max_lifetime` of +`86400000` (a day) and no limit on `min_lifetime`, and a room's retention policy +is the following: + +```json +{ + "max_lifetime": 43200000, + "min_lifetime": 21600000 +} +``` + +Then the effective retention policy of the room is: + +```json +{ + "max_lifetime": 86400000, + "min_lifetime": 21600000 +} +``` + + +## Enforcing a retention policy + +Retention is only considered for non-state events. Retention is also not +considered for the most recent event in a room, in order to allow a new event +sent to that room to reference it in its `prev_events`. + +When purging events in a room, only the latest retention policy state event in +that room is considered. This means that in a room where the history looks like +the following (oldest event first): + +1. Retention policy A +2. Event 1 +3. Event 2 +4. Retention policy B + +Then the retention policy B is used to determine the effective retention that +defines whether events 1 and 2 should be purged, even though they were sent when +the retention policy A was in effect. This is to avoid creating wholes in the +room's DAG caused by events in the middle of the timeline being subject to a +lower `max_lifetime` than other events being sent before and after them. Such +holes would make it more difficult for homeservers to calculate room timelines +when showing them to clients. They would also force clients to display +potentially incomplete or one-sided conversations without being able to easily +tell which parts of the conversation is missing. + +Servers decide whether an event should or should not be purged by calculating +how much time has passed since the event's `origin_server_ts` property, and +comparing this duration with the room's effective retention policy. + +Note that, for performance reasons, a server might decide to not purge an event +the second it hits the end of its lifetime (e.g. so it can batch several events +together). In this case, the server must make sure to omit the expired events +from reponses to client requests. Similarly, if the server is sent an expired +event over federation, it must omit it from responses to client requests (and +ensure it is eventually purged). + +## Tradeoffs + +This proposal specifies that the lifetime of an event is defined by the latest +retention policy in the room, rather than the one in effect when the event was +sent. This might be controversial as, in Matrix, the state that an event is +subject to is usually the state of the room at the time it was sent. However, +there are a few issues with using the retention that was in effect at the time +the event was sent: + +* it would create holes in the DAG of a room which would complexify the + server-side handling of the room's history +* malicious servers could potentially make an event evade retention policies by + selecting their event's `prev_events` and `auth_events` so that the event is + on a portion of the DAG where the policy does not exist +* it would be difficult to translate the configuration of retention policies + into a clear and easy to use UX (especially considering server-side + configuration applies to the whole history of the room) +* it would not allow room administrators to retroactively update the lifetime of + events that have already been sent (e.g. if the context of a room administered + by an organisation which requirements for data retention change over time) + +This proposal does not cover per-message retention (i.e. the ability to set +different lifetimes to different messages). This has been split out into +[MSC2228](https://github.com/matrix-org/matrix-spec-proposals/pull/2228) to +simplify this proposal. + +This proposal does also not cover the case where a room's administrator wishes +to only restrict the lifetime of a specific section of the room's history. This +is left to be covered by a separate MSC, possibly built on top of MSC2228. + +## Security considerations + +In a context of open federation, it is worth keeping in mind the possibility +that not all servers in a room will enforce its retention policy. Similarly, +different servers will likely enforce different server-side configuration, and +as a result calculate different lifetimes for a given event. This proposal aims +at trying to compromise between finding an absolute consensus on an event's +lifetime and working within the constraints of a server's operator in terms of +data retention. + +In a kind of contradictory way with the previous paragraph, a server may keep an +expired event in its database for some time after its expiration, while not +sharing it with clients and federating servers. This is in order to prevent +abusers from using low lifetime values in a room's retention policy in order to +erase any proof of such abuse and avoid being investigated. + +Basing the expiration time of an event on its `origin_server_ts` is not ideal as +this field can be falsified by the sending server. However, there currently +isn't a more reliable way to certify the send time of an event. + +As mentioned previously in this proposal, servers might store expired events for +longer than their lifetime allows, either for performance reason or to mitigate +abuse. This is considered acceptable as long as: + +* an expired event is not kept permanently +* an expired event is not shared with clients and federated servers + +## Unstable prefixes + +While this proposal is under review, the `m.room.retention` event type should be +replaced by the `org.matrix.msc1763.retention` type. + +Similarly, the `/_matrix/client/v3/retention/configuration` path should be replaced with `/_matrix/client/unstable/org.matrix.msc1763/retention/configuration`.