Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add MediaRetentionPolicy to the EventCacheStore, take 2 #4571

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,314 @@
// Copyright 2025 Kévin Commaille
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The commit message (fd7f88e) is a good basis for this module documentation :-].

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

use ruma::time::{Duration, SystemTime};
use serde::{Deserialize, Serialize};

/// The retention policy for media content used by the `EventCacheStore`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we have an intra-link to EventCacheStore please?

/// … by the [`EventCacheStore`].
///
/// [`EventCacheStore`]: crate::event_cache::store::EventCacheStore

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied

#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[non_exhaustive]
pub struct MediaRetentionPolicy {
/// The maximum authorized size of the overall media cache, in bytes.
///
/// The cache size is defined as the sum of the sizes of all the (possibly
/// encrypted) media contents in the cache, excluding any metadata
/// associated with them.
///
/// If this is set and the cache size is bigger than this value, the oldest
/// media contents in the cache will be removed during a cleanup until the
/// cache size is below this threshold.
///
/// Note that it is possible for the cache size to temporarily exceed this
/// value between two cleanups.
///
/// Defaults to 400 MiB.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub max_cache_size: Option<usize>,

/// The maximum authorized size of a single media content, in bytes.
///
/// The size of a media content is the size taken by the content in the
/// database, after it was possibly encrypted, so it might differ from the
/// initial size of the content.
///
/// The maximum authorized size of a single media content is actually the
/// lowest value between `max_cache_size` and `max_file_size`.
///
/// If it is set, media content bigger than the maximum size will not be
/// cached. If the maximum size changed after media content that exceeds the
/// new value was cached, the corresponding content will be removed
/// during a cleanup.
///
/// Defaults to 20 MiB.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub max_file_size: Option<usize>,

/// The duration after which unaccessed media content is considered
/// expired.
///
/// If this is set, media content whose last access is older than this
/// duration will be removed from the media cache during a cleanup.
///
/// Defaults to 60 days.
#[serde(default, skip_serializing_if = "Option::is_none")]
pub last_access_expiry: Option<Duration>,
}

impl MediaRetentionPolicy {
/// Create a `MediaRetentionPolicy` with the default values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intra-link:

- /// Create a `MediaRetentionPolicy` with
+ /// Create a [`MediaRetentionPolicy`] with

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied

pub fn new() -> Self {
Self::default()
}

/// Create an empty `MediaRetentionPolicy`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intra-link for the win 🤘

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

///
/// This means that all media will be cached and cleanups have no effect.
pub fn empty() -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I propose to rename empty to infinite or unlimited, I guess it conveys the intent better, thoughts?

- pub fn empty() -> Self {
+ pub fn unlimited() -> Self {

Copy link
Collaborator Author

@zecakeh zecakeh Jan 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a future PR, I am planning to allow to launch periodic cleanup jobs. It will add a field that is not a limitation, but a setting of how often the cleanup jobs should run. If this struct does not contain only limitations, infinite and unlimited do not seem to fit.

Self { max_cache_size: None, max_file_size: None, last_access_expiry: None }
}

/// Set the maximum authorized size of the overall media cache, in bytes.
pub fn with_max_cache_size(mut self, size: Option<usize>) -> Self {
self.max_cache_size = size;
self
}

/// Set the maximum authorized size of a single media content, in bytes.
pub fn with_max_file_size(mut self, size: Option<usize>) -> Self {
self.max_file_size = size;
self
}

/// Set the duration before which unaccessed media content is considered
/// expired.
pub fn with_last_access_expiry(mut self, duration: Option<Duration>) -> Self {
self.last_access_expiry = duration;
self
}

/// Whether this policy has limitations.
///
/// If this policy has no limitations, a cleanup job would have no effect.
///
/// Returns `true` if at least one limitation is set.
pub fn has_limitations(&self) -> bool {
self.max_cache_size.is_some()
|| self.max_file_size.is_some()
|| self.last_access_expiry.is_some()
}

/// Whether the given size exceeds the maximum authorized size of the media
/// cache.
///
/// # Arguments
///
/// * `size` - The overall size of the media cache to check, in bytes.
pub fn exceeds_max_cache_size(&self, size: usize) -> bool {
self.max_cache_size.is_some_and(|max_size| size > max_size)
}

/// The computed maximum authorized size of a single media content, in
/// bytes.
///
/// This is the lowest value between `max_cache_size` and `max_file_size`.
pub fn computed_max_file_size(&self) -> Option<usize> {
match (self.max_cache_size, self.max_file_size) {
(None, None) => None,
(None, Some(size)) => Some(size),
(Some(size), None) => Some(size),
(Some(max_cache_size), Some(max_file_size)) => Some(max_cache_size.min(max_file_size)),
}
}

/// Whether the given size, in bytes, exceeds the computed maximum
/// authorized size of a single media content.
///
/// # Arguments
///
/// * `size` - The size of the media content to check, in bytes.
pub fn exceeds_max_file_size(&self, size: usize) -> bool {
self.computed_max_file_size().is_some_and(|max_size| size > max_size)
}

/// Whether a content whose last access was at the given time has expired.
///
/// # Arguments
///
/// * `current_time` - The current time.
///
/// * `last_access_time` - The time when the media content to check was last
/// accessed.
pub fn has_content_expired(
&self,
current_time: SystemTime,
last_access_time: SystemTime,
) -> bool {
self.last_access_expiry.is_some_and(|max_duration| {
current_time
.duration_since(last_access_time)
// If this returns an error, the last access time is newer than the current time.
// This shouldn't happen but in this case the content cannot be expired.
.is_ok_and(|elapsed| elapsed >= max_duration)
})
}
}

impl Default for MediaRetentionPolicy {
fn default() -> Self {
Self {
// 400 MiB.
max_cache_size: Some(400 * 1024 * 1024),
// 20 MiB.
max_file_size: Some(20 * 1024 * 1024),
// 60 days.
last_access_expiry: Some(Duration::from_secs(60 * 24 * 60 * 60)),
zecakeh marked this conversation as resolved.
Show resolved Hide resolved
}
}
}

#[cfg(test)]
mod tests {
use ruma::time::{Duration, SystemTime};

use super::MediaRetentionPolicy;

#[test]
fn test_media_retention_policy_has_limitations() {
let mut policy = MediaRetentionPolicy::empty();
assert!(!policy.has_limitations());

policy = policy.with_last_access_expiry(Some(Duration::from_secs(60)));
assert!(policy.has_limitations());

policy = policy.with_last_access_expiry(None);
assert!(!policy.has_limitations());

policy = policy.with_max_cache_size(Some(1_024));
assert!(policy.has_limitations());

policy = policy.with_max_cache_size(None);
assert!(!policy.has_limitations());

policy = policy.with_max_file_size(Some(1_024));
assert!(policy.has_limitations());

policy = policy.with_max_file_size(None);
assert!(!policy.has_limitations());

// With default values.
assert!(MediaRetentionPolicy::new().has_limitations());
}

#[test]
fn test_media_retention_policy_max_cache_size() {
let file_size = 2_048;

let mut policy = MediaRetentionPolicy::empty();
assert!(!policy.exceeds_max_cache_size(file_size));
assert_eq!(policy.computed_max_file_size(), None);
assert!(!policy.exceeds_max_file_size(file_size));

policy = policy.with_max_cache_size(Some(4_096));
assert!(!policy.exceeds_max_cache_size(file_size));
assert_eq!(policy.computed_max_file_size(), Some(4_096));
assert!(!policy.exceeds_max_file_size(file_size));

policy = policy.with_max_cache_size(Some(2_048));
assert!(!policy.exceeds_max_cache_size(file_size));
assert_eq!(policy.computed_max_file_size(), Some(2_048));
assert!(!policy.exceeds_max_file_size(file_size));

policy = policy.with_max_cache_size(Some(1_024));
assert!(policy.exceeds_max_cache_size(file_size));
assert_eq!(policy.computed_max_file_size(), Some(1_024));
assert!(policy.exceeds_max_file_size(file_size));
}

#[test]
fn test_media_retention_policy_max_file_size() {
let file_size = 2_048;

let mut policy = MediaRetentionPolicy::empty();
assert_eq!(policy.computed_max_file_size(), None);
assert!(!policy.exceeds_max_file_size(file_size));

// With max_file_size only.
policy = policy.with_max_file_size(Some(4_096));
assert_eq!(policy.computed_max_file_size(), Some(4_096));
assert!(!policy.exceeds_max_file_size(file_size));

policy = policy.with_max_file_size(Some(2_048));
assert_eq!(policy.computed_max_file_size(), Some(2_048));
assert!(!policy.exceeds_max_file_size(file_size));

policy = policy.with_max_file_size(Some(1_024));
assert_eq!(policy.computed_max_file_size(), Some(1_024));
assert!(policy.exceeds_max_file_size(file_size));

// With max_cache_size as well.
policy = policy.with_max_cache_size(Some(2_048));
assert_eq!(policy.computed_max_file_size(), Some(1_024));
assert!(policy.exceeds_max_file_size(file_size));

policy = policy.with_max_file_size(Some(2_048));
assert_eq!(policy.computed_max_file_size(), Some(2_048));
assert!(!policy.exceeds_max_file_size(file_size));

policy = policy.with_max_file_size(Some(4_096));
assert_eq!(policy.computed_max_file_size(), Some(2_048));
assert!(!policy.exceeds_max_file_size(file_size));

policy = policy.with_max_cache_size(Some(1_024));
assert_eq!(policy.computed_max_file_size(), Some(1_024));
assert!(policy.exceeds_max_file_size(file_size));
}

#[test]
fn test_media_retention_policy_has_content_expired() {
let epoch = SystemTime::UNIX_EPOCH;
let last_access_time = epoch + Duration::from_secs(30);
let epoch_plus_60 = epoch + Duration::from_secs(60);
let epoch_plus_120 = epoch + Duration::from_secs(120);

let mut policy = MediaRetentionPolicy::empty();
assert!(!policy.has_content_expired(epoch, last_access_time));
assert!(!policy.has_content_expired(last_access_time, last_access_time));
assert!(!policy.has_content_expired(epoch_plus_60, last_access_time));
assert!(!policy.has_content_expired(epoch_plus_120, last_access_time));

policy = policy.with_last_access_expiry(Some(Duration::from_secs(120)));
assert!(!policy.has_content_expired(epoch, last_access_time));
assert!(!policy.has_content_expired(last_access_time, last_access_time));
assert!(!policy.has_content_expired(epoch_plus_60, last_access_time));
assert!(!policy.has_content_expired(epoch_plus_120, last_access_time));

policy = policy.with_last_access_expiry(Some(Duration::from_secs(60)));
assert!(!policy.has_content_expired(epoch, last_access_time));
assert!(!policy.has_content_expired(last_access_time, last_access_time));
assert!(!policy.has_content_expired(epoch_plus_60, last_access_time));
assert!(policy.has_content_expired(epoch_plus_120, last_access_time));

policy = policy.with_last_access_expiry(Some(Duration::from_secs(30)));
assert!(!policy.has_content_expired(epoch, last_access_time));
assert!(!policy.has_content_expired(last_access_time, last_access_time));
assert!(policy.has_content_expired(epoch_plus_60, last_access_time));
assert!(policy.has_content_expired(epoch_plus_120, last_access_time));

policy = policy.with_last_access_expiry(Some(Duration::from_secs(0)));
assert!(!policy.has_content_expired(epoch, last_access_time));
assert!(policy.has_content_expired(last_access_time, last_access_time));
assert!(policy.has_content_expired(epoch_plus_60, last_access_time));
assert!(policy.has_content_expired(epoch_plus_120, last_access_time));
}
}
19 changes: 19 additions & 0 deletions crates/matrix-sdk-base/src/event_cache/store/media/mod.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
// Copyright 2025 Kévin Commaille
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

//! Types and traits regarding media caching of the event cache store.

mod media_retention_policy;

pub use self::media_retention_policy::MediaRetentionPolicy;
1 change: 1 addition & 0 deletions crates/matrix-sdk-base/src/event_cache/store/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ use std::{fmt, ops::Deref, str::Utf8Error, sync::Arc};
#[cfg(any(test, feature = "testing"))]
#[macro_use]
pub mod integration_tests;
pub mod media;
mod memory_store;
mod traits;

Expand Down