Wishlist feature: offline-first #101
+1
+1
A PR does infinitely more good than a +1 😉
I understand the sentiment, but I'd argue a +1 demonstrates interest in the feature.
yeah... that's what's broken in a bunch of OSS. There's no white knight going around making PRs on issues that have the most "+1"s, awesome as that sounds. As a positive example, look at Linux or Node.js: someone needs a feature, their company sponsors them to write the PR, others validate it, and you've got a healthy feature added. I've got no problems if someone files a feature request as an issue, but saying "+1" tells me the feature is not desirable enough to spend your own time & brain juice on, which ultimately tells me it's not that important. Even if this project were venture backed & looking to prioritize a next sprint, I'd personally select the next sprint based on the issue with the most thoughtful comments. We're all in this thing together. That's what makes this a community 😄
What's missing is a "star" or a more integrated +1 feature, then tooling to surface the most-requested issues.
This probably also needs service worker invalidation mechanics. Getting rid of a borked service worker can be really painful. A well-tested service worker wrapper would go a long way toward making sure we can always bypass/remove the service worker if necessary.
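(As a reference point, a minimal sketch of the kind of escape hatch that makes a borked worker recoverable: it just unregisters every service worker registered for the origin and reloads. Nothing here is meatier-specific.)

```js
// Unregister every service worker for this origin and reload,
// so a broken worker can no longer intercept requests.
if ('serviceWorker' in navigator) {
  navigator.serviceWorker.getRegistrations()
    .then((registrations) => Promise.all(registrations.map((reg) => reg.unregister())))
    .then(() => window.location.reload());
}
```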
offline-plugin seems to work okay for react-boilerplate. Would a PR with something like that work?
@wtgtybhertgeghgtwtg I've seen that, but haven't played with it much. I think it would work, but ultimately it'd be nice to arrive at a solution like what's described here: www.pocketjavascript.com/blog/2015/11/23/introducing-pokedex-org where it sends a toast when you're offline, etc. Not sure if that's possible with the webpack plugin or if more work is required on the sw.js. I've still got a stack of offline-first books I need to read 😄
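(For the toast-when-offline part, a minimal sketch of what that could look like in the browser; `showToast` is a hypothetical stand-in for whatever notification component the app actually uses.)

```js
// Hypothetical toast helper -- stands in for the app's real notification UI.
function showToast(message) {
  console.log(message);
}

window.addEventListener('offline', () => {
  showToast('You are offline; changes will be queued until you reconnect');
});
window.addEventListener('online', () => {
  showToast('Back online');
});
```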
On that note, have you given any consideration to taking a progressive web app approach?
@mattkrick The Pokedex.org example is awesome, but it's really showcasing the multi-master replication of CouchDB, using PouchDB to synchronize a local database (the better adapter is selected depending on the browser) from a remote CouchDB, handling possible conflicts based on the documents' revision history.

I agree that this "progressive" webapp is basically the better user experience, and it would be great to have something similar with RethinkDB. However, how would the offline-plugin or sw.js solve the data-querying problem? As I understand it, they wouldn't, right? I guess the offline cache should be something aware of how GraphQL queries work, so would it be something to solve in the cashay project?
You'd probably have to sync Rethink with a local IndexedDB or something. I was under the impression that the Service Worker would just handle the shell or static content.
Resumable changefeeds are being tracked in rethinkdb/rethinkdb#3471. A graphql+crdt example could be neat; if there's interest I can contribute one. I haven't actually looked at the cache-control headers on the webpack chunks. If they're not being fingerprinted and set to a long-term expiry, we should do that first.
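(A rough sketch of what fingerprinting plus a long-term expiry could look like; paths and filename patterns are illustrative, not meatier's actual config.)

```js
// Two excerpts, shown together for brevity.

// webpack.config.js -- fingerprint chunk filenames with their content hash
const path = require('path');
module.exports = {
  output: {
    path: path.resolve(__dirname, 'build'),
    publicPath: '/static/',
    filename: '[name]_[chunkhash].js',
    chunkFilename: '[name]_[chunkhash].js'
  }
};

// server.js -- fingerprinted files never change, so serve them with a far-future expiry
// const express = require('express');
// app.use('/static', express.static('build', { maxAge: '365d' }));
```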
I've had some good chats about data expiration recently (gah, that sounds nerdy). There is some data that can be long lived, such as a list of countries. Typically, long-lived stuff won't come in through a changefeed. Other stuff, say a list of Kanban lanes, will probably be invalidated on every visit. I think it's the job of the cache to provide a timestamp (eg a last-fetched time on each document) so the app can decide when something is stale.

On the cache side, GraphQL actually keeps us from being cache-efficient. That's because if I request overlapping queries that each ask for slightly different fields, the cache can't reuse one result to satisfy the other.

@wenzowski I'm really curious about your use of graphql + crdt. What's it look like?? The only time I've used CRDT is with swarm.js, and it's not document based & frankly I'm not sure how to make it document based. That said, I'd love to build a client cache that supports infrequent queries, frequent document updates (subscriptions), and frequent collaborative changes (CmRDT). I still dunno what that'd look like...
If you want to sync arbitrary documents over a high-latency/interruptible network connection, I highly recommend ShareJS. In another (experimental) app, I have a few Riak buckets providing special fields for a few object types, where each field has its own set of mutation methods.

That doesn't sound like GraphQL preventing us from being cache-efficient, but rather an application concern that prevents it. If you were to, say, …

Particularly long-lived data (like a country list) could easily be compiled into webpack chunks. Speaking of which, PR for …
@wenzowski wow, it's like you're in my head... so here's the thing with …
Query-level TTLs came up in facebook/relay#720. Possibly elsewhere? If we were to define reasonable field-level cache expectations with a TTL, then perhaps these could go directly in the schema, allowing the GraphQL HTTP server to correctly set maxAge to the lowest field value and enabling the websocket server to provide an equivalent mechanism. This would allow a key-value cache like the one that appears in the Relay docs to be as consistent as the schema specifies.
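(A minimal sketch of the idea above: field-level TTLs declared alongside the schema, with maxAge derived from the minimum TTL of the requested fields. Type/field names and TTL values are made up.)

```js
// Field-level TTLs in seconds, declared alongside the schema (illustrative only).
const fieldTTLs = {
  Country: { name: 86400, code: 86400 }, // reference data, long-lived
  Lane: { title: 300, cards: 30 }        // kanban data, short-lived
};

// maxAge for a query = the minimum TTL among the fields it actually requests
function maxAgeForQuery(requestedFields) {
  const ttls = requestedFields.map(([type, field]) => fieldTTLs[type][field]);
  return Math.min(...ttls);
}

// e.g. res.set('Cache-Control', `max-age=${maxAgeForQuery(fields)}`)
console.log(maxAgeForQuery([['Country', 'name'], ['Lane', 'cards']])); // 30
```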
ehhh, I think that's getting too apollo-ish. The meteor folks are solving this (I think) by setting up an invalidation server. Personally, I think we should keep as much logic on the client as possible & detached from the data & data-transport layer. It'd be amazing if firefox had an equivalent to chrome's …

The same logic holds true for a TTL. After 5 mins, roll through each document, make a queue of queries to invalidate, and then do a refresh. I'd keep it at the document level instead of the field level, but the logic is dead simple & pretty performant.
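(A rough sketch of that document-level sweep; the cache shape and the `refetchQuery` callback are assumptions, not cashay's actual API.)

```js
// Assumed cache shape: { documents: { [id]: { fetchedAt, dependentQueries } } }
const TTL_MS = 5 * 60 * 1000;

function invalidateStaleDocs(cache, refetchQuery) {
  const now = Date.now();
  const staleQueries = new Set();
  Object.keys(cache.documents).forEach((id) => {
    const doc = cache.documents[id];
    if (now - doc.fetchedAt > TTL_MS) {
      doc.dependentQueries.forEach((queryId) => staleQueries.add(queryId));
    }
  });
  staleQueries.forEach((queryId) => refetchQuery(queryId));
}

// setInterval(() => invalidateStaleDocs(clientCache, refetch), TTL_MS);
```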
Skipping over two hard problems, your comment reminds me that there are a few gems mixed in with The State of Meteor Part 1 and Part 2, should you wish to explore the rabbit hole. I firmly believe mapping GraphQL queries to TTLs is the right way to go, and I hope you'll permit me a brief time warp to explain why I suspect this is the case. The ANSI-SPARC three-level architecture has had a lasting impact on both database design and, by extension, data-driven document generation.
If we were to describe GraphQL/Relay in these terms, its role is both to define a set of composable, (mostly) immutable conceptual schemata and to handle the mappings between the external and internal views, decoupling both. The magic of meteor is in seamlessly synchronizing document state between users and observers, correctly piping changes that occur internally (mongo documents) to external observers (loaded html documents) by way of DDP. With GraphQL, each client only ever sees an external representation (its fetched/subscribed document), yet cache invalidation happens on the client based on remote server-side changes to the internal representation (rethinkdb documents in this case).

If we are able to rewind each client's graphql subscription to the point where that client lost connectivity by going offline and replay all remote changes, resolving conflicts or generating siblings for future resolution, then we can cache all subscribed queries indefinitely and throw out the concept of a validity threshold. If we're talking about subscribing to absolutely everything like, say, derby does, and are talking about caching GraphQL fetches, then I think perhaps the TTL route is necessary.

Going with a bubble-up TTL approach would give developers some knowledge of the propagation delays that modifications to the internal model will inherently be subject to: beyond the TTL threshold, an offline client will have purged the stale data and will be forced to reconnect before taking any action that relies on it. Without this foreknowledge, I think we open ourselves up to production-only heisenbugs. Given the nested nature of GraphQL schema definitions, I suspect that operating at the field level is necessary even if only one TTL is allowed per …
I think I have a need for the TTL mechanics in an app I'm working on. If this turns out to be the case, I'll extract.
I like where this is headed, but one thing bothers me: we don't know exactly when they go offline. For example, meatier has a heartbeat every 15 seconds; DDP is similar. Lost connectivity that is < the heartbeat means we can't be guaranteed that the document made it to the client (unless we use durable messaging, but there goes our scaling). If you put a TTL on every rootObject (stripping away the non-nulls & the Lists) and invalidate a single one, you still don't have a way to refetch that particular doc unless the client provides you with a function to do so. Basically, for every query the client would have to hand over a refetch function up front.
I think we should split this issue in two: full offline support is different from applying progressive web application techniques to combat latency. We're never going to know exactly when any event happens. I'd like to estimate clock offset as part of socket initiation, but haven't opened an issue for that yet. If we have a static object order then the server doesn't need to know which messages have reached the client: the client can advertise the last object received through an additional parameter when it reconnects.
The Relay way leverages globally unique object IDs, something this app already has, and a … Thinking about …
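(A minimal sketch of the reconnect idea above: the client keeps a cursor of the last object it received and advertises it when the socket reopens. Event names and the wire format are made up, not meatier's actual protocol.)

```js
// Assumed event names and wire format; meatier's real protocol differs.
let lastReceivedId = null;

function applyChange(change) {
  // merge the change into the client-side cache (left as a stub here)
  console.log('applying change', change.id);
}

function connect() {
  const socket = new WebSocket('wss://example.com/changefeed');

  socket.onopen = () => {
    // advertise the last object we saw so the server only replays what we missed
    socket.send(JSON.stringify({ type: 'resume', after: lastReceivedId }));
  };

  socket.onmessage = (event) => {
    const change = JSON.parse(event.data);
    lastReceivedId = change.id; // a static object order makes this a resumable cursor
    applyChange(change);
  };

  socket.onclose = () => setTimeout(connect, 1000); // naive reconnect with the saved cursor
}

connect();
```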
@wenzowski I considered adding a clock offset to the socket handshake protocol, but we decided against it. At the end of the day, it's still a heuristic & that latency is going to change, especially on a mobile connection.

WRT the relay way, that GUID contains an opaque type. I think for 95% of use cases, last write wins is perfectly fine. For those other 5%, it's best not to rely on a single source of truth (ie use CmRDT). I imagine carving out a separate piece of state for CmRDT and letting something like swarm.js do the heavy lifting. That way, we have document-level changes for regular stuff, and... I dunno what to call it, field-location-level changes? for collaboration pieces.
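(For reference, Relay-style global IDs are a base64 encoding of `Type:id`, which is what makes the type portion opaque to the client; graphql-relay ships toGlobalId/fromGlobalId helpers for this. A tiny sketch in plain Node:)

```js
// Minimal reimplementation of Relay-style global IDs, for illustration only.
function toGlobalId(type, id) {
  return Buffer.from(`${type}:${id}`).toString('base64');
}

function fromGlobalId(globalId) {
  const [type, id] = Buffer.from(globalId, 'base64').toString('utf8').split(':');
  return { type, id };
}

console.log(toGlobalId('Lane', '42'));     // "TGFuZTo0Mg=="
console.log(fromGlobalId('TGFuZTo0Mg==')); // { type: 'Lane', id: '42' }
```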
Regarding clock offset and mobile jitter causing significant variance: an offset prevents client timestamps from being off by hours (client timezone set incorrectly or never NTP-synced). Cold start over Edge vs an established socket puts RTT variance in the range of ~4s. Yes, network instability could make that worse, though I think it could be useful to have at least minute-level consistency in client-generated timestamps.

I really strongly dislike the "last write wins" term: in a carefully designed distributed system you can record causality yet still not necessarily be able to determine operation order. If you cannot determine operation order, then you cannot determine "last", and your system is now very clearly a "some arbitrary writes lost" consistency mechanic. So, I will agree that "some writes lost" is perfectly fine if your system is designed to accommodate this; if all your objects are immutable and your collections append-only, for instance, consistency can eventually be obtained by detecting and re-requesting the lost writes. If inconsistency is tolerable (streaming logs, for instance) then data loss is tolerable to your application-defined threshold.

Going the CmRDT route can be useful for the particular field types that CmRDT operations can be defined upon, provided you need automatic merging. Another approach is to store conflicting writes as siblings. The idea of sets of field-level mutators is exactly what I was expecting would be needed to support something akin to swarm.js, and I agree that some mutations simply need to be emitted while online: they need to wait on acceptance by whatever the single source of truth is, and need to block certain additional operations until they're accepted.

The nice thing about building an example app is that we don't have to predict the future and guess the proportion of use cases that require one thing or the other; we can focus instead on the specific use cases in the example app. Yes, in order to provide …
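(A toy sketch of the "siblings" approach mentioned above: keep causally concurrent writes side by side instead of picking a winner. This is an in-memory illustration, not how RethinkDB or Riak actually store data.)

```js
// Toy in-memory model of sibling storage with vector clocks.
const store = new Map(); // key -> { siblings: [{ value, vclock }] }

// a dominates b when every counter in a is >= the corresponding counter in b
function dominates(a, b) {
  return Object.keys(b).every((node) => (a[node] || 0) >= b[node]);
}

function write(key, value, vclock) {
  const entry = store.get(key) || { siblings: [] };
  // drop siblings the incoming write causally supersedes; keep concurrent ones
  entry.siblings = entry.siblings.filter((s) => !dominates(vclock, s.vclock));
  entry.siblings.push({ value, vclock });
  store.set(key, entry);
}

write('card:1', { title: 'Draft' }, { alice: 1 });
write('card:1', { title: 'Final' }, { alice: 2 }); // supersedes -> replaces
write('card:1', { title: 'Other' }, { bob: 1 });   // concurrent -> kept as a sibling
console.log(store.get('card:1').siblings.length);  // 2
```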
@wenzowski yeah, I think this convo is getting a little too abstract for me to add much value without a concrete example. Let's touch base Thursday.
Great! In terms of actually taking a stab at an offline-first architecture, shall we start with toast messages to indicate offline state?
Guys, I enjoyed your discussion, but I'm afraid I missed quite a bit. My simple question is: is it possible to bundle a Meatier app in Cordova/Electron and use a Service Worker to cache files client-side and update them when files change on the server? Is this implemented and working out of the box, or is it only possible in theory? Personally I'm not looking to cache any real data, just JS/CSS files.
I don't think service workers are meant for native implementations, you'd …
|
I have a question about the plan for the offline feature: will it be built into meatier or ride on relay/cashay? How soon can we start to see an alpha version of this?
Yeah, this absolutely will exist in the application layer (ie meatier). When I rehydrate from localStorage/localForage, I only include certain reducers (a rough sketch of that follows below).

Service workers will also live in the application layer; I'll get to them after we get an MVP out the door (and seed $$ from investors 😄)
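(A rough sketch of that whitelist-style rehydration; the reducer names and storage key are made up, not meatier's actual ones.)

```js
// Only these reducers come back from storage; everything else starts fresh.
const REHYDRATE_WHITELIST = ['auth', 'cashay']; // made-up names

function saveState(state) {
  localStorage.setItem('appState', JSON.stringify(state));
}

function loadInitialState() {
  const raw = localStorage.getItem('appState');
  if (!raw) return undefined; // let reducers fall back to their defaults
  const persisted = JSON.parse(raw);
  const initialState = {};
  REHYDRATE_WHITELIST.forEach((key) => {
    if (key in persisted) initialState[key] = persisted[key];
  });
  return initialState;
}

// const store = createStore(rootReducer, loadInitialState());
// store.subscribe(() => saveState(store.getState()));
```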
See React.js Conf 2016 - Aditya Punjani - Building a Progressive Web App.
This requires some careful scaffolding that can easily be reused between apps, making it a prime candidate for inclusion in meatier IMHO.
Basically it requires:

- Service workers: see https://changelog.com/essential-reading-list-for-getting-started-with-service-workers/.
- Encryption: there is already #80.
The big advantage of offline-first is native-like loading speed.
("SW + App Shells" is Service Worker + pre-rendered-and-cached page views that look the same for all visitors. Basically, don't put user-specific data in pre-rendered views. Perhaps a result cache based on url?)