MSC1763: Proposal for specifying configurable message retention periods #1763
I’m sensing an innate conflict within this MSC’s interests: it wants to reduce server history in rooms, yet it simultaneously expects to be able to fetch that history from thin air at any convenient time. I have a feeling it’s written with the underlying idea that large servers will carry all the events in the federation, with other servers being able to fetch from those at any time.
…however, this is mentioned nowhere in the MSC. It skirts around these problems by putting these assumptions between the lines, without thinking critically about what this means for the larger federation: more dependency on large servers.
As a result, it does not bring a lucid solution to the problem of dealing with history retention, where any server eventually has to face that it cannot fetch events it knows exist(ed), but is still expected to respond with them to a client’s query.
The semantic equivalent of HTTP 410 (“Gone”) has to exist somewhere here, so that a server can tell clients it’s unable to fetch a historical event due to history retention, along with all the sad and happy paths that spring from that. The current stance against this is “you’re SOL, have a 404 with no context”.
I don’t see this MSC deal with the reality that it is deleting events. Nor do I see a coherent solution that allows some servers to “archive” history and makes that explicit (also in the rooms, for privacy reasons, for people who want to know which servers are ignoring retention rules and archiving anyway).
Servers ignoring retention rules does have a basis, namely actually archiving historic conversations, in a similar philosophy to The Internet Archive. If this MSC were to go through as-is, then we’d have a situation similar to the general internet, where history is lost to time due to individual retention strategies.
While reliance on large servers isn’t what a federation would want, an explicit way of making people aware which servers are backing up, and which ones aren’t, would help this MSC greatly in the long run.
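To make the 410-versus-404 distinction concrete, here is a minimal sketch, assuming a hypothetical server-side lookup that remembers the IDs of events it purged under a retention policy. The names `lookup_event`, `stored` and `purged_ids`, and the error bodies, are illustrative only; none of them come from the MSC or from any homeserver's API.

```python
from http import HTTPStatus


def lookup_event(event_id, stored, purged_ids):
    """Return (status, body) for a client's request for one event.

    `stored` maps event IDs to the event bodies the server still holds.
    `purged_ids` is a hypothetical set of IDs removed by retention.
    """
    if event_id in stored:
        return HTTPStatus.OK, stored[event_id]
    if event_id in purged_ids:
        # The event existed but was deleted on purpose: say so explicitly.
        return HTTPStatus.GONE, {"error": "Event removed by retention policy"}
    # The server has no record of this event at all.
    return HTTPStatus.NOT_FOUND, {"error": "Event not found"}
```

The cost of this approach is that the server has to keep at least the IDs of purged events around, which is itself a retention decision.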
I question the feasibility of this, which I essentially see as a Matrix-specced version of Synapse's History Purge functionality. What exactly would qualify as "after read"? Shouldn't this be removed and left for MSC2228 to specify or address?
If this is `true`, does that mean that the retention rules apply to both servers and clients? (Reading below, it seems that this is the case, but it seems unclear to me here.)
Stupid question as general audience: what does this imply for the room topic, membership, etc. state data? Is the full history of e.g. who was a member of a room and when retained or purged? If retained, should the summary on the top mention this limitation?
Those are state events, and according to the current MSC they must be retained. The issue here is that some state events are used to authorise new events (e.g. you can't send messages into a room that you haven't joined, i.e. a room whose state has no join event from you), so purging state events could potentially break the room. We could theoretically avoid that by carefully selecting which state events can be purged and which ones cannot (and I'm not even sure about that), but then it becomes a ticking time bomb: one day we're bound to forget about it, change some state events without updating the retention policies spec, and break everything.
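As a rough illustration of the rule described above, the following sketch selects only non-state events for purging. It assumes events are plain dicts, that a state event is recognised by the presence of a `state_key` field, and that age is measured from `origin_server_ts`; the function name `purgeable` is hypothetical and this is not how any particular homeserver implements it.

```python
def purgeable(events, now_ms, max_lifetime_ms):
    """Return the events a retention job could delete.

    State events (those carrying a `state_key`) are always kept, since
    they may be needed to authorise later events.
    """
    return [
        ev
        for ev in events
        if "state_key" not in ev
        and now_ms - ev["origin_server_ts"] > max_lifetime_ms
    ]
```

With this rule, an old `m.room.member` event survives no matter how short `max_lifetime` is, while an equally old message does not.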
I think this limitation is mentioned at the right place here, but ymmv.
The summary contains the statement "... set of rules which allow users, room admins and server admins to determine how long data should be stored for a room, from the perspective of respecting the privacy requirements of that room", which seems to incorrectly imply that the retention rules apply to all data. This was also my initial understanding when reading the configuration file in the current Synapse implementation. Just a suggestion from a user's perspective, but I think it would be important to be clear about what it does and doesn't do, so that people can make an informed decision.
Oh right, that makes sense, fair point.
IMHO, it makes more sense to clamp `min_lifetime` to be `max_lifetime`, rather than the other way around, because currently, it makes sense to set `min_lifetime` and leave `max_lifetime` unset (and the result is as expected, as the `min_lifetime` takes effect, and the `max_lifetime` remains at its default), but if you set `max_lifetime` and leave `min_lifetime` unset, then it will unexpectedly ignore the value for `max_lifetime`.
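The difference between the two clamping directions can be sketched as follows. This assumes that an unset field falls back to a default and that the current rule raises `max_lifetime` up to `min_lifetime`, which is how the comment above reads. The default values and both function names are hypothetical, chosen only to reproduce the case described.

```python
ONE_DAY_MS = 24 * 60 * 60 * 1000

# Hypothetical defaults, chosen only to show the effect; the actual
# defaults are not stated in this thread.
DEFAULT_MIN_MS = 30 * ONE_DAY_MS
DEFAULT_MAX_MS = 365 * ONE_DAY_MS


def clamp_max_to_min(min_ms=None, max_ms=None):
    """Raise max_lifetime up to min_lifetime when the two conflict."""
    lo = DEFAULT_MIN_MS if min_ms is None else min_ms
    hi = DEFAULT_MAX_MS if max_ms is None else max_ms
    return lo, max(hi, lo)


def clamp_min_to_max(min_ms=None, max_ms=None):
    """Lower min_lifetime down to max_lifetime when the two conflict."""
    lo = DEFAULT_MIN_MS if min_ms is None else min_ms
    hi = DEFAULT_MAX_MS if max_ms is None else max_ms
    return min(lo, hi), hi
```

Under these assumed defaults, setting only `max_lifetime` to one day gives `(DEFAULT_MIN_MS, DEFAULT_MIN_MS)` with the first function, so the configured value is silently overridden, and `(ONE_DAY_MS, ONE_DAY_MS)` with the second.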
agreed
Could this also be added as a fallback when the `max_lifetime >= min_lifetime` invariant is broken?
I read somewhere else that the spec no longer mandates how clients expose UI elements to users. Maybe a more abstract description should be used as to when the client is warned, such as:
`max_lifetime` is in milliseconds.
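Since the unit is easy to get wrong, here is a small sketch of building a retention value in milliseconds from a human-readable duration. The helper `to_lifetime_ms` and the seven-day figure are illustrative assumptions, not values taken from the proposal.

```python
from datetime import timedelta


def to_lifetime_ms(delta):
    """Convert a timedelta to a whole number of milliseconds."""
    return delta // timedelta(milliseconds=1)


# Example content for an m.room.retention state event: keep messages
# for at most seven days (an arbitrary figure for illustration).
retention_content = {"max_lifetime": to_lifetime_ms(timedelta(days=7))}
```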
Do we need something here to encourage clients to delete/discard the megolm keys for pruned e2e convos?
Isn't that going to create holes in the DAG of rooms and make pagination (and maybe also federation) potentially faffy? Also, in the event of the first retention policy of a room being set in the middle of the history of the room, won't that make it difficult/impossible to reach the messages that were sent before the policy was set?
We could think of solutions like retrieving the most recent expired event and purging everything before (though we'd need to take `min_lifetime` into account and figure out what to do if the retention policy is lacking, which seems to be left as an implementation detail), or redacting events upon expiry and only purging them if there's no event before, or also calculating the expiration date of an event using the current retention policy in the room rather than as it was when `E` was sent (i.e. making it an implementation detail whether the policy used is the current one in the room or the one as of `E`).
fwiw from my own experience since the release of the support for this feature in Synapse people seem to expect retention policies to apply retroactively, so perhaps we should just use the latest `m.room.retention` state event in the room (even though I can see how it creates a different behaviour than most state events).
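The two options discussed in this thread, using the policy as of the event versus the latest policy in the room, can be compared with a minimal sketch. It assumes the room's retention history is available as a time-ordered list of `(set_at_ts, max_lifetime_ms)` pairs; that representation and the names `policy_for` and `is_expired` are hypothetical.

```python
def policy_for(event_ts, policy_history, retroactive):
    """Pick the max_lifetime (ms) that governs an event, or None.

    `policy_history` is a list of (set_at_ts, max_lifetime_ms) pairs in
    ascending time order, one per m.room.retention change.
    """
    if not policy_history:
        return None
    if retroactive:
        # The latest policy applies to every event, old or new.
        return policy_history[-1][1]
    # Otherwise use the policy that was in force when the event was sent.
    in_force = [ml for set_at, ml in policy_history if set_at <= event_ts]
    return in_force[-1] if in_force else None


def is_expired(event_ts, now_ms, policy_history, retroactive):
    """True if the event has outlived the policy that governs it."""
    max_lifetime = policy_for(event_ts, policy_history, retroactive)
    return max_lifetime is not None and now_ms - event_ts > max_lifetime
```

In the non-retroactive mode, an event sent before the room's first policy has no governing policy and never expires, which is the gap raised earlier in the thread. In the retroactive mode the same event expires as soon as the latest policy says so.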