Sync the huge amount of data which should not sync #240
Comments
@anujjpandey11 if I understand this issue correctly, this is the expected outcome. Today, Realm's sync works off syncing the operation log and we do not compact the log. Thus if you add an operation, including writing a large binary blob, this is appended to the log and will be synced to all devices, even if ultimately later operations deleted it. In the coming months we will be working on log compaction to improve this situation. |
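A minimal sketch of why an uncompacted operation log still carries deleted data. This is a toy model, not Realm's actual sync format: the `Op` type, the `"create"`/`"delete"` kinds, and both helper functions are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One entry in a toy append-only operation log (hypothetical shape)."""
    kind: str          # "create" or "delete"
    object_id: str
    payload: bytes = b""


def live_state_size(log):
    """Bytes held by objects that still exist after replaying the whole log."""
    live = {}
    for op in log:
        if op.kind == "create":
            live[op.object_id] = op.payload
        elif op.kind == "delete":
            live.pop(op.object_id, None)
    return sum(len(p) for p in live.values())


def sync_download_size(log):
    """Bytes a fresh client downloads when every logged operation is sent."""
    return sum(len(op.payload) for op in log)


# Write a 1 MB blob, then delete it.
log = [
    Op("create", "media-1", b"x" * 1_000_000),
    Op("delete", "media-1"),
]

print(live_state_size(log))     # 0: nothing is left in the final state
print(sync_download_size(log))  # 1000000: the blob is still in the log
```

The final state is empty, yet a new device replaying the log still receives the full blob, which matches the behavior reported in this issue.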
What is the timeline of log compaction to be implemented? |
@eXeDK is your main concern the initial download? |
Yes it is. Having to sync the entire history of the Realm file at startup doesn't seem optimal. Can you reveal some of the timeline for this specific feature? |
@eXeDK we are starting work on it, and it will go into an RMP 2.0 effort. The target for that is September, though we might be able to offer early previews sooner. |
Okay that sounds awesome @bigfish24. We'll start our internal testing of ROS next week hopefully and then we'll start evaluating in internal builds before we decide whether or not we want to push towards production. |
We are seeing a similar issue. Currently seeing download sizes of > 300 MB for a 1.7 MB compacted realm. Granted, the realm does have quite a bit of history in its logs, but am I incorrect in thinking there should be a "delta logs" solution to this? On initial download don't look at the logs at all, but save a timestamp. On subsequent downloads use the timestamp to fetch only a "delta log" of changes since the last time, etc. The RMP isn't really viable for us until something like this is implemented, as our data is managed by clients and changes frequently (thus many history logs). We thought about periodically compacting on the server and always having a compact version for the mobile clients to consume, but I'm assuming this wouldn't work as they likely need the log in order to update properly. Without this we're back to using the plain old Realm Mobile Database and writing our own syncing engine... |
@zachwhelchel thanks for this info. Quite confident we will be solving this for you. There is a PR internally to compact the history, which will be able to run on both the client and server. We are targeting this PR for the 2.0 release in September, but there will be previews leading up to that. Stay tuned! |
@bigfish24 awesome. That's really encouraging to hear. Do you mind explaining a bit more about what you mean by "compacting the history"? Will this be a delta-like approach or will it still be an ever-growing file size for the logs? Can we expect the download size of a 1.7 MB realm file to be closer to 1.7 MB? Or roughly 2x, 4x, 10x? |
The current PR can scan the history and compact The current plan is to have this compaction run on the server, meaning clients will upload their operations and then the server will compact the operations periodically so other clients don't have to download as much. We are also exploring performing the compaction on the client when it is offline, so that once the network connection is reestablished it uploads less. There is a balance between uploading immediately vs. spending time compacting. Secondarily, we are also working on functionality to download the Realm file on first connect vs. the history. This will only be available via the We think the combination of both of these will mostly eliminate the issue. |
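The thread does not say which operations the internal PR compacts. As a rough sketch of what server-side compaction could look like, the toy function below assumes one simple rule: if an object is first created and finally deleted within the log, every operation on it can be dropped. The `Op` type and `compact` function are hypothetical and are not Realm's implementation.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """One entry in a toy operation log (hypothetical shape)."""
    kind: str          # "create" or "delete"
    object_id: str
    payload: bytes = b""


def compact(log):
    """Drop all operations on objects that are created and later deleted.

    Assumed rule (illustration only): an object whose first logged operation
    is a create and whose last logged operation is a delete leaves no trace
    in the final state, so none of its operations need to be sent.
    """
    first, last = {}, {}
    for op in log:
        first.setdefault(op.object_id, op.kind)
        last[op.object_id] = op.kind
    dead = {
        oid for oid in last
        if first[oid] == "create" and last[oid] == "delete"
    }
    return [op for op in log if op.object_id not in dead]


log = [
    Op("create", "media-1", b"x" * 500),
    Op("create", "note-1", b"hello"),
    Op("delete", "media-1"),
    Op("delete", "older-object"),  # created before this log began: keep
]

for op in compact(log):
    print(op.kind, op.object_id)
```

Under this rule the delete for `older-object` is kept, because clients that already hold that object still need to learn it was removed.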
@bigfish24 best news I've heard all day. Thanks for the insight! |
What if I use a lot of lists? Is there any hope then for a much smaller data size, closer to the compacted file size? :) For example, I have a list of areas, each area has a list of streets, each street has a list of addresses, and each address has a list of contacts. Streets with all their addresses and contacts will be added or deleted often, and the status of a contact will also change often. So a lot of changes in the data! I hope that version 2 will also solve the problem with everything being memory mapped. That is not really feasible when the database becomes large due to many users, like 500GB-1TB. |
There might be ways to optimize lists, but we aren't working on that at the moment. One thing that has been proposed separately that would be easier to compact would be to offer a |
I should qualify that the operations related to |
Thanks for your clarification. What do you mean by "inherit ordering"? If I have object A, which has a list of B objects, and I remove object A from the realm (using a custom delete function that also calls delete from realm on the list, so all B objects are removed too), will it store a delete operation for each object B in the list, or just the deletion of object A, which in turn would mean all members, lists and objects in the list are deleted? |
On a related note, if you don’t care about the order of addresses, streets and so on, you can use an inverse relationship to represent the same model, i.e. Street has a property Area. This will not preserve ordering but will make the file size smaller. |
Is there any update on the feature to sync the latest changes or compact the history? We are finding this to be a real problem now. We launched an app in June and the sync of the initial data has slowly taken longer and longer (due to more and more transactions). We are now at the point where the sync on a clean install takes around 1 hour to complete. |
Apologies that this isn’t documented well. We have released log compaction, but it needs to be enabled when you start the server. Follow the instructions here: #127 (comment) We will update our public docs to call this out. |
I recently updated our self-hosted Realm server to v3.19.0 after running v3.4.5 since my last comments here. My reason for upgrading and buying a licence was the hope that the historyTtl setting would resolve the issues described here. So far I am unable to notice any difference. Our users are experiencing issues again because the amount of data syncing from the realm is in excess of 2GB, so their initial sync falls over at about 85% complete with the "mmap() failed: Cannot allocate memory size:" error. I have set the historyTtl setting to 30 days. Is my understanding wrong that setting it to 30 should mean new clients will only take the transaction history of the last 30 days and should therefore sync a much smaller set of data? Currently, with historyTtl set to 30 days or disabled, I always see the sync attempting to sync over 2GB of data. The size reporting in the new Realm Studio is really handy, but it shows me a realm size of 6.98GB and a data size of 160MB. If this means what I think, then the transaction history is massively bloating our realm (which is understandable because we perform a lot of transactions), but our client devices are connected daily so we really don't need all that history. |
@Jonsapps please open a ticket at support.realm.io - you will also need to set |
Goals
We have a MediaFile object which contains a byte array and other properties associated with the byte array. There are a few steps we follow to process this media file.
Up to this point everything looks fine, and the deletion of the file from ROS is also reflected in the ROS browser, as you can see in image 1.
Expected Results
There is no media file visible in the ROS browser. There is only the schema and some metadata for the media file, which should produce only 2-3 MB of downloadable data.
Actual Results
But what I see in the sync log is far beyond that expectation:
07-23 20:31:35.613 3340-3382/com.automotive.tracker D/REALM_SYNC: Using already open Realm file: /storage/sdcard0/KentTracker/Sync/9d8b5a513622a769155609c4a1f44d23/9d8b5a513622a769155609c4a1f44d23/2056695815/private.realm
07-23 20:31:35.613 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: Progress handler called, downloaded = 795650, downloadable = 68717705, uploaded = 0, uploadable = 0, progress version = 1, snapshot version = 4
07-23 20:31:35.623 3340-3382/com.automotive.tracker D/REALM_SYNC: message_type = download
07-23 20:31:35.643 3340-3382/com.automotive.tracker D/REALM_SYNC: Download message compression: is_body_compressed = 1, compressed_body_size=333927, uncompressed_body_size=390100
07-23 20:31:35.643 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: Received: DOWNLOAD(scan_server_version=4, scan_client_version=0, latest_server_version=351, latest_server_session_ident=5729703240262368978, latest_client_version=0, downloadable_bytes=68717705, number_of_changesets=1)
07-23 20:31:35.643 3340-3382/com.automotive.tracker D/REALM_SYNC: Using already open Realm file: /storage/sdcard0/KentTracker/Sync/9d8b5a513622a769155609c4a1f44d23/9d8b5a513622a769155609c4a1f44d23/2056695815/private.realm
07-23 20:31:37.153 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: 1 remote changeset integrated, producing client version 5
07-23 20:31:37.153 3340-3382/com.automotive.tracker D/REALM_SYNC: Using already open Realm file: /storage/sdcard0/KentTracker/Sync/9d8b5a513622a769155609c4a1f44d23/9d8b5a513622a769155609c4a1f44d23/2056695815/private.realm
07-23 20:31:37.153 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: Progress handler called, downloaded = 1185725, downloadable = 68717705, uploaded = 0, uploadable = 0, progress version = 1, snapshot version = 5
07-23 20:31:37.163 3340-3382/com.automotive.tracker D/REALM_SYNC: message_type = download
07-23 20:31:37.173 3340-3382/com.automotive.tracker D/REALM_SYNC: Download message compression: is_body_compressed = 1, compressed_body_size=326705, uncompressed_body_size=389846
07-23 20:31:37.173 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: Received: DOWNLOAD(scan_server_version=5, scan_client_version=0, latest_server_version=351, latest_server_session_ident=5729703240262368978, latest_client_version=0, downloadable_bytes=68717705, number_of_changesets=1)
07-23 20:31:37.173 3340-3382/com.automotive.tracker D/REALM_SYNC: Using already open Realm file: /storage/sdcard0/KentTracker/Sync/9d8b5a513622a769155609c4a1f44d23/9d8b5a513622a769155609c4a1f44d23/2056695815/private.realm
07-23 20:31:38.633 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: 1 remote changeset integrated, producing client version 6
07-23 20:31:38.633 3340-3382/com.automotive.tracker D/REALM_SYNC: Using already open Realm file: /storage/sdcard0/KentTracker/Sync/9d8b5a513622a769155609c4a1f44d23/9d8b5a513622a769155609c4a1f44d23/2056695815/private.realm
07-23 20:31:38.633 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: Progress handler called, downloaded = 1575546, downloadable = 68717705, uploaded = 0, uploadable = 0, progress version = 1, snapshot version = 6
07-23 20:31:38.643 3340-3382/com.automotive.tracker D/REALM_SYNC: message_type = download
07-23 20:31:38.653 3340-3382/com.automotive.tracker D/REALM_SYNC: Download message compression: is_body_compressed = 1, compressed_body_size=313076, uncompressed_body_size=375174
07-23 20:31:38.653 3340-3382/com.automotive.tracker D/REALM_SYNC: Connection[1]: Session[1]: Received: DOWNLOAD(scan_server_version=6, scan_client_version=0, latest_server_version=351, latest_server_session_ident=5729703240262368978, latest_client_version=0, downloadable_bytes=68717705, number_of_changesets=1)
This continues, with the snapshot version increasing each time, until downloaded == downloadable.
And this much data is actually downloaded from the Realm Object Server and written to the SD card.
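To quantify the log above, here is a small sketch that extracts the `downloaded` and `downloadable` counters from a "Progress handler called" line and reports how far the download has progressed. The regular expression is written against the line format shown in this issue only, and `download_progress` is a helper name made up for this example.

```python
import re

# Matches the counters as they appear in the REALM_SYNC lines quoted above.
PROGRESS = re.compile(r"downloaded = (\d+), downloadable = (\d+)")


def download_progress(line):
    """Return (downloaded, downloadable, percent) or None if no match."""
    match = PROGRESS.search(line)
    if match is None:
        return None
    downloaded, downloadable = (int(g) for g in match.groups())
    percent = 100.0 * downloaded / downloadable if downloadable else 0.0
    return downloaded, downloadable, percent


sample = (
    "Connection[1]: Session[1]: Progress handler called, "
    "downloaded = 795650, downloadable = 68717705, uploaded = 0, "
    "uploadable = 0, progress version = 1, snapshot version = 4"
)

downloaded, downloadable, percent = download_progress(sample)
print(f"{downloaded} of {downloadable} bytes ({percent:.2f}%)")
```

For the first progress line this gives a little over 1% of 68,717,705 bytes (roughly 69 MB), far more than the 2-3 MB expected from the schema and metadata alone.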
Steps to Reproduce
Commit media files to the ROS and then delete them.
Now take a fresh mobile device and enable sync for the same path.
Code Sample
Version of Realm and Tooling
- Realm Object Server v1.8.2
Please let me know why it is downloading this huge amount of data if it has already been deleted from the ROS,
or let me know if you need anything else from my side.