Switch to Web Crypto API #28
Another thing we should do is minimize metadata. E.g. currently attachments and comments are encrypted in separate parts of the JSON. This allows the web server admin to see whether an attachment is present and how large it is. As for comments, we will probably not be able to change this, due to the nature of comments (anyone can add another one at any time and we cannot re-encrypt the whole paste in this case). But for everything else, we should first add all to-be-encrypted data into a plain-text JSON and then encrypt the whole JSON. Additionally, all other data which is needed on the server (e.g. the expiration time) can currently be manipulated by the server. SJCL, however, already has a mechanism which can prevent this, so we should make use of it. So two things:
|
Paste Encryption (just a suggestion)

Data passed in: The following data is what we pass in.

Process data: Start with generating a one-time password. If we want compression: compress the data. Else: use it as is.

Key generation for encryption (PBKDF2)

Encryption

URL: url = "<paste_url>/?<paste_id>#<base64(paste_otp)>" |
I think it should go into the wiki, so we can work together on this. I think information about the KDF and the hash used for the KDF should be passed around, so it can be changed in the future. I've added a compressed flag so you can turn compression off. Maybe we should send around information about compression. However, we should move as much information as possible into the cipher_text. So the cipher_text should probably be a JSON structure and hold the plaintext, information about the compression algorithm and whether it is compressed. |
|
Indeed I'd also argue for the cipher text to be a JSON. This is also easily parsable in JS. |
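As a sketch of that idea, the inner JSON could look like this before encryption. The field names here are hypothetical illustrations, not the actual format:

```javascript
// Hypothetical shape of the to-be-encrypted inner JSON: the plaintext and
// all compression details travel inside the cipher text, so the server
// learns nothing about them.
const innerPayload = JSON.stringify({
  paste: 'the secret text',
  attachment: null,       // included even when empty, hiding its presence and size
  compression: 'zlib'     // or 'none'; the flag itself is encrypted too
});
```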
FYI: Started work on this one in a branch. First I'll replace as many SJCL (and Base64.js, while I'm at it) calls as possible, to prove that at least the deciphering stays compatible with the existing paste formats (that's why I added unit tests for both the legacy and current paste format). Then we can work on the new format, replacing the flawed deflate library in the process. At that point we will have to be able to write both the legacy and the new format and read all three of these formats. The legacy format will still require loading Base64.js 1.7 and the rawdeflate/inflate libraries instead of pako, but ideally we would not need SJCL at all at that point (assuming that, using the WebCrypto API, we are able to produce cipher encoding identical to SJCL's) and can instead rely on browser makers doing their job. |
Sounds great! |
On starting to learn about WebCrypto's "subtle" API, I did find both the required PBKDF2 and AES-GCM interfaces. Already introduced the UTF-16/UTF-8 conversion (was done using Base64.js before). Most newer browsers have TextEncoder for that, but neither IE nor Edge supports it, and it is relatively easy to do ourselves. Maybe we will do a polyfill for the M$ browsers instead, but that would make the code more complicated to maintain.

Currently hitting my head against the following: The unit tests, which I need to ensure backwards compatibility with the old formats and for faster development cycles, run on nodejs. Node has a crypto module (which interfaces with OpenSSL), but none of the WebCrypto interfaces. There is a nodejs module that provides a WebCrypto API facade in front of the crypto module, but it still lacks PBKDF2 support. The crypto module seems to offer PBKDF2, so it may just need to get exposed in the facade. I may have to help that project get PBKDF2 support first, so that I can then use that library to continue the development using node.

I did consider ignoring the node unit tests for now and instead developing purely in the browser, which is made easier thanks to the Janitor container. The problem is, I then have to build a lot of testing tools into privatebin.js to be able to feed it the old paste formats, or I need to generate old pastes on other servers and import them into the development instance. It would be much easier if I could just keep the test code separate from the application and use node, so I may just have to bite the bullet and help build the necessary tool first. |
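The "relatively easy to do ourselves" UTF-8 conversion could be sketched like this; a hypothetical fallback for browsers without TextEncoder, not the actual PrivateBin code:

```javascript
// Minimal UTF-8 encoder fallback: encodeURIComponent escapes every
// non-ASCII byte as %XX, so decoding those escapes yields the UTF-8 bytes.
function utf8Encode(str) {
  const escaped = encodeURIComponent(str);
  const bytes = [];
  for (let i = 0; i < escaped.length; i++) {
    if (escaped[i] === '%') {
      // two hex digits following '%' form one UTF-8 byte
      bytes.push(parseInt(escaped.slice(i + 1, i + 3), 16));
      i += 2;
    } else {
      // unescaped characters are plain ASCII
      bytes.push(escaped.charCodeAt(i));
    }
  }
  return new Uint8Array(bytes);
}
```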
A polyfill would likely be better. Also better to use the real TextEncoder in newer JS versions. Actually there is already a polyfill for the whole API. |
Another update: I found another nodejs module that implements all required WebCrypto API calls, but it needed a newer nodejs. Fortunately I hadn't spent too much time on the other one, and an upgrade to Ubuntu 18.04 on my development environment promised to solve the nodejs dependency. Unfortunately it turned out that the Debian/Ubuntu maintainers messed up and had compiled nodejs 8 against openssl 1.1, while modules assume that nodejs 8 implies openssl 1.0 and fail to compile against Ubuntu's Franken-node. I found a temporary workaround for that in the ticket and also switched to a newer node version in travis and our unit testing docker container (the latter resulted in a significant speedup: running mocha with the Ubuntu node takes around 2 minutes, while the new container using Alpine's node takes 17 seconds, on the same machine). Other issues were caused by the upgrade to gpg2, which broke my git commits, as I sign them by default. Turns out you have to manually import the old gpg1 private keys.

After spending a couple of days on the actual API migration, I today finally managed to get it working, both for the old and current pastes. It ended up not being quite as complicated as expected, but I spent a lot of time figuring out the exact formats of all the parameters, how to parse them correctly and how to convert them to the ArrayBuffer syntax and back. It is not yet finished: There are still some remaining SJCL calls in there that need to be removed. I also used the ugly async/await workaround to emulate synchronous behaviour for the existing functions. It therefore has to be regarded as a proof of concept using the existing API. @rugk, if you have time, how would you suggest we change the logic to be able to use a callback or a promise with the cipher/decipher functions? This would allow us to make proper use of the WebCrypto API's asynchronous nature. |
Async and await are no workaround but perfectly fine features. AFAIK you only use them for the crypto API, so that's okay. As for your question: theoretically I would prefer promises, but look at browser support. Switching to them would mean another round of refactoring the whole code, if we want to do it everywhere and get rid of most callbacks. Also, await AFAIK has notably less support in older browsers. |
Don't you still use sjcl? If you need to support it for backwards compatibility and a browser doesn't have the WebCrypto API you need, then just use sjcl? |
No, as the current work proves, we can get rid of SJCL completely and still be able to decrypt even the old ZeroBin pastes (the ones generated using the incompatible Base64.js encoding). Reg. async: OK, so async/await and promises are out of the window then, due to poor browser support. So we would have to use traditional callbacks. @rugk, I see that we use these methods in five locations and four methods:
In What I am not sure how to handle is the |
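For the callback route, one could wrap the Promise that `subtle.decrypt` returns behind a node-style callback. In this sketch `decryptPromise` is a dummy stand-in for the real WebCrypto call, not PrivateBin code:

```javascript
// Stand-in for the real WebCrypto decryption, which returns a Promise.
const decryptPromise = (key, data) => Promise.resolve(data.toUpperCase());

// Node-style callback facade over the Promise-based WebCrypto call:
// callers keep their traditional callback(err, result) interface.
function decipher(key, data, callback) {
  decryptPromise(key, data)
    .then(plaintext => callback(null, plaintext))
    .catch(err => callback(err, null));
}
```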
Well… I would love to use it. Actually the only problem is IE 11. Can't we just drop IE support? Or fallback to sjcl for it? |
Looking at your implementation, you use 10000 rounds of PBKDF2-SHA256. This seems to be the minimum, see borgbackup/borg#77. Of course, as for the random key, I guess it does not matter. But as for the password, we should pass that through a larger number of key stretching rounds. Also, I do not feel fine with just appending the password to the key. I think some more elaborate solution would be needed. (But this is just about decrypting SJCL-encrypted pastes, so as for that, we cannot change the format.) Also, this here contradicts itself (likely just a wrong comment): Line 752 in 0dbbb61
|
FYI here is Mozilla's implementation in Send, which we may take as some kind of reference. |
Quick reminder to get back on track: At the moment the goal is to implement the current logic, so that we can replace SJCL entirely. Of course, once we start implementing the new paste format, see above, we can massively increase the standards we apply. Before we get there, though, I want to pin down the API, so that we can then focus on the implementation without having to touch the unit tests, so that we can rely on them bringing to light if we mess up the legacy paste support.

So, given that we never cared much about IE (since the days of ZeroBin there was a special warning, just for the IE users) and that Edge, the current MS browser stack, supports both WebCrypto and promises, we should probably go with that. You may have noticed that the travis builds started failing frequently for the webcrypto branch since yesterday. The reason was that the promises can take a while to get processed in the background, especially if you launch 100 of them simultaneously to test the crypto, and they messed with the DOM states of subsequent tests that ran in parallel to them. I "fixed" this in the morning by severely reducing the number of launched crypto tests and it is stable again, but it shows that the async/await pattern, while seemingly making your code more linear, simply hides the complexity of the promises behind it.

So yes, let's go with promises then. What's your suggestion on how the Uploader class should get extended/rewritten/refactored to support this? Could Uploader.run() be triggered from the Promise's then() method? Should we show a status of "encrypting" or similar until it gets triggered and changes the status to "sending"? |
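One possible answer to the Uploader question, sketched with hypothetical stand-ins (`showStatus`, `encrypt` and `upload` are not the real PrivateBin functions): trigger the upload from the encryption promise's then() and flip the status message in between.

```javascript
// Chain the upload off the encryption promise; the status display is
// updated to "encrypting" first and to "sending" once the cipher text
// is ready. All three collaborators are injected stand-ins.
function submitPaste(plaintext, encrypt, upload, showStatus) {
  showStatus('encrypting');
  return encrypt(plaintext)
    .then(cipherText => {
      showStatus('sending');
      return upload(cipherText);
    });
}
```

The returned promise lets the caller attach success/error handling without Uploader having to know about the UI.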
Reg. nonce: We already use a random initialization vector (128 bits), so no two plaintexts should get the same cipher text anyway. Also, there is a random salt (64 bits) used in the key derivation from the randomly generated key (256 bit). Of course, IV and PBKDF2 salt are known to anyone trying to brute force the key to an offline stored paste, and increasing the iterations and raising the HMAC hash to SHA-512 will slow this down. -> New paste format - focus on adjusting the API first, please. |
No, I am sorry, but you do not seem to understand the async/await feature. So just let me explain two small things:
As for the other stuff, I need to look into it. Not now, however… As such, we of course go with async/await; it often makes using Promises easier. And if we drop IE, we can use it, so all is fine here. As I said, this may just involve a whole round of refactoring…
Yep, a nonce has nothing to do with the IV. I guess it is here mainly just to hide the size of the encrypted content. |
Okay, I've adjusted the Uploader object and from what I see, I guess it now encrypts correctly. However, the result is a JSON object and when it is accessed later it is shown as |
Your implementation was a very helpful example and I managed to implement solutions for the other listed cases. There was a point where it "clicked" in my head and the promise concept started to make sense for separating the scope of the various involved components. I can use the decryption promise, attach UI related actions to it and pass it on, to trigger further actions when multiple related promises get fulfilled in yet another layer. This is much cleaner than having to build callback interfaces into many methods and it limits the scope that I have to consider in each of them. Anyway, we now again have a working UI and no dependencies on SJCL at all, even for the current format. Next comes the interesting part of implementing the new format we discussed so long ago. I should document this in the wiki pages as I go, both for the old format (that we still need to read) and the new one (read & write). |
Exactly! It totally avoids this "callback hell". |
Also, this may speed up the whole encryption/decryption, as we can start Promises one after another and let browsers potentially use multithreading, or whatever they do, to encrypt/decrypt multiple things at the same time. |
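Starting all decryptions at once and collecting the results could look like this; `decrypt` is a stand-in for the real WebCrypto-based call, so this is a sketch of the idea rather than the project's code:

```javascript
// Kick off the paste and all comment decryptions concurrently and let
// Promise.all gather the results; the browser may schedule the
// underlying crypto work in parallel.
function decryptAll(decrypt, paste, comments) {
  return Promise.all([decrypt(paste), ...comments.map(decrypt)])
    .then(([pasteText, ...commentTexts]) => ({ pasteText, commentTexts }));
}
```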
I have reviewed the original proposal for the version 2 paste format and came up with a streamlined version, which I have put into the encryption format wiki page for review. The key changes are:
Let me know what you think. |
What do you mean by comment? I guess these have to be encrypted separately, as they need to be added by different users/anybody other than the paste creator. But generally yes, all paste content goes into one big JSON there. Also, I'd suggest to add a JSON entry
Yeah okay, I see no big risk with that, but it should then at least be authenticated? I mean, it should be possible to put it in that part, shouldn't it?
Can we then name this section something like "serverMetaData" or "unauthenticatedMeta" or so?
Hmm, I don't think this is really needed. If we really implemented that in the future, we would maybe rather need "parents" (as the parent does not know it gets a child when being encrypted), or we may internally implement it with a "hidden" comment or something like this.
Yeah, generally good I think, but:
What do you mean by that? Can you actually decrypt pastes, where fewer rounds were used, with a much higher round count now? This sounds very strange to me… BTW, as this is a big change, why not name the next release "PrivateBin v2"? |
|
comments: I take this as a "no, it's not contained in the paste".
So what? Even if we have not, this whole issue is about discussing the new format. I do feel a bit attacked by that sentence, as if I were not allowed to discuss a new feature. As said, and as linked on crypto.stackexchange.com, it may not really be needed, but it could be useful.

compression: Yeah, exactly. In any case, it won't hurt to do that. 😃

children: Yeah, IMHO don't add this now. As said, thanks to the versioning of the encryption format we can now easily add a new property (for that) later.
Yes, it was this (highlighting by me):
So that's the absolute minimum. Actually, we can aim for a 1 second delay. As the delay, however, totally depends on the device, we might have to test mobile devices or so and check out what number we may use.

combining: As said, I'll look it up…
Totally off-topic, but thinking about this: if we continue like this, we will always have "legacy" pastes in PrivateBin. Even the old format should be quite okay (i.e. "secure"), but it may not be, especially not in the long term. As such, what we can do is detect the "legacy" format and show a suggestion/info message to the user to "press clone" and continue using the paste in a new format. But hmm… you lose the comments then, of course. 😢

version number: Yeah, okay. 😄 |
comments: The JSON-LD clearly documents that comments are an array in the paste? See below...

nonce: Sorry, that was not my intention. Of course we can discuss it, but please in a dedicated ticket assigned to the 1.3 milestone, and then we can add it to the format in here. The comment you refer to is about the IV, which always was and needs to remain part of the format, as long as we use AES, which requires an initialization vector.

compression: I'll create a JSON-LD below incorporating this; maybe that makes my proposition clearer.

children: It's just in there to document where it would go, in case someone actually implements that ticket. Not my priority, but it would clearly belong with the encrypted payload and not the clear text part.

iterations: Just suggest a number. It can still be changed at any point up to the release. Let's assume the 10000 is a placeholder until then.

combinations: Sorry, I thought it was obvious that we already do that and will continue doing so. I have seen pastes with comments in three different old formats: the original paste in CCM 128 bit, some comments the same, later comments in CCM 256 bit, the latest in GCM 256 bit. All on one page, and all could be decrypted properly. It's like with some of my old Word documents or bitmaps from 1992 that I still want to read occasionally. Sure, at some point conversion may be an option, but in our model that would require overwriting the message, so the clone would be an alternative for that?

Anyhow, here is the updated suggestion, this time expressed as JSON-LD in the hope of making the intention clearer. This is the response a client would get upon requesting a paste: {
"@context": {
"so": "https://schema.org/",
"pb": "https://privatebin.org/",
"status": {"@id": "so:Integer"},
"id": {"@id": "so:name"},
"deletetoken": {"@id": "so:Text"},
"url": {
"@type": "@id",
"@id": "so:url"
},
"v": {
"@id": "so:Integer",
"@value": 2
},
"ct": {
"@type": "so:Text",
"@id": "pb:CipherText"
},
"adata": {
"@id": "pb:AuthenticatedData",
"@container": "@list",
"@value": [
{
"@type": "so:Text",
"@id": "pb:InitializationVector"
},
{
"@type": "so:Integer",
"@id": "pb:Iterations"
},
{
"@type": "so:Integer",
"@id": "pb:KeySize"
},
{
"@type": "so:Integer",
"@id": "pb:TagSize"
},
{
"@type": "so:Text",
"@id": "pb:Mode",
"@value": "gcm"
},
{
"@type": "so:Text",
"@id": "pb:Algorithm",
"@value": "aes"
},
{
"@type": "so:Text",
"@id": "pb:Compression",
"@value": "zlib"
},
{
"@type": "so:Text",
"@id": "pb:Salt"
}
]
},
"meta": {
"@id": "?jsonld=pastemeta"
},
"comments": {
"@id": "?jsonld=comment",
"@container": "@list"
},
"comment_count": {"@id": "so:Integer"},
"comment_offset": {"@id": "so:Integer"}
}
} Note:
pastemeta would be reduced to: {
"@context": {
"so": "https://schema.org/",
"postdate": {"@id": "so:Integer"},
"remaining_time": {"@id": "so:Integer"}
}
} The comments would be changed similarly: a single ct section, static settings moved into the adata and a minimal meta section. To avoid any confusion: The above is a schema describing the message structure, not what the message actually looks like. The properties starting with "@" are descriptors of the syntax; "so" and "pb" contain vocabularies for the syntax. |
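For illustration, a concrete response matching the schema above might look as follows. All values are made up, and the adata order simply follows the @list in the schema; this is an assumption for readability, not a committed wire format:

```javascript
// Hypothetical instance of a v2 paste response; the adata list is
// ordered per the schema: iv, iterations, keysize, tagsize, mode,
// algorithm, compression, salt.
const response = {
  status: 0,
  id: '1234567890abcdef',
  url: '/?1234567890abcdef',
  v: 2,
  ct: 'base64-encoded-ciphertext',
  adata: ['base64-iv', 10000, 256, 128, 'gcm', 'aes', 'zlib', 'base64-salt'],
  meta: { postdate: 1533081312, remaining_time: 86400 },
  comments: [],
  comment_count: 0,
  comment_offset: 0
};
```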
Ah, okay, thanks. IMHO, that LGTM. What just came to my mind was to duplicate some of the non-authenticated data in the adata part, so it can be authenticated after decryption. This could prevent some awkward attacks like a version downgrade (if the new hypothetical decryption v3 could be downgraded to v2) or the server serving a different paste for an old paste ID (include the paste ID in the duplicated data). |
Took forever and several attempts to get the JSON-LD done. I have to say that JSON-LD as a DSL is helpful, as it forces you to think about how you will use the described data structure, and so I discovered some flaws in the above proposal before I even attempted to implement it in JS or PHP. Anyhow, here are the files describing the format, incl. key differences to the above: types.jsonld - introduced to document PB-specific data types used in the pastes and comments, including crypto primitives. This is partially there as documentation for humans (defaults in '@value', min & max or options in '@enum'), but we can and should also use it for feature detection. Examples:
paste.jsonld & pastemeta.jsonld - mostly as described in the proposal. Things in 'adata' are created by the client and can't be changed on the server (or the decryption fails, which I have to test in browsers), while the

comment.jsonld & commentmeta.jsonld - similar to the paste in structure. Note that the

Other changes: I tried to make the formatting more consistent for readability and removed the incorrect use of '@id'. I think next I'll work on the JS implementation, then the PHP side (support validation of the new format, changes to DB tables, if necessary). |
(1) But taking the "Expire", I wonder whether it is useful to put it into the adata part anyway? I mean, the server can likely use it, but it cannot verify the authenticity anyway, as it does not have the key? (2) (3) BTW, generally said, these JSON-LDs look great and seem to be a great tool for this "spec work". 😄 |
|
I have implemented the changes for point 3 in 1de57c8. A correction on 2, though: That time stamp is actually used in the comments for displaying the comment's date (in localized formatting). We could still keep the paste's creation date internal (it currently gets exposed). |
Okay, that's a "metadata leak" that is currently there. So IMHO, we should keep it secret. It has no value on the client, and for a paste creator it is not obvious that it is exposed. |
BTW the MDN explanations of the WebCrypto API have been mostly rewritten and they already look quite good, IMHO. |
BTW another resource that explains how we should do it very well, e.g.: https://timtaubert.de/talks/keeping-secrets-with-javascript/ |
Thanks for sharing. I implemented that bit a few months back. Since you familiarized yourself with the topic, it would be helpful if you could compare those recommendations with the current state of the implementation in the webcrypto branch and let me know if anything sticks out to you. I'm a bit in the middle of things: The |
I know this has been done for two years now, but I am building a very simple client right now, and parsing the heterogeneously typed list for

Most programming languages help with setting up JSON <-> data type deserialization, but those lists really make simple parsing difficult, especially if said languages have static typing. It would have been nicer (for other languages) to just explicitly order the parts for the authentication and use normal objects otherwise. Please consider changing that for a possible v3 API :)
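To illustrate the complaint: the mixed-type positional list is awkward to map onto a struct, whereas a keyed object (hypothetical, not the actual v2 format) would deserialize cleanly in statically typed languages:

```javascript
// v2 style: a heterogeneous positional list mixing strings and integers...
const adataList = ['base64-iv', 10000, 256, 128, 'gcm', 'aes', 'zlib', 'base64-salt'];

// ...versus a hypothetical keyed form carrying the same information,
// which maps directly onto a plain struct/record type.
const adataObject = {
  iv: 'base64-iv', iterations: 10000, keysize: 256, tagsize: 128,
  mode: 'gcm', algorithm: 'aes', compression: 'zlib', salt: 'base64-salt'
};
```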
Hi @mqus, and great to hear you are building a PrivateBin client. Don't forget to add it to https://github.com/PrivateBin/PrivateBin/wiki/Third-party-clients. 😉 😃 As for your feedback, I'm afraid I don't exactly get what you suggest. Yes, we use JSON and store our adata there.
Switch to Web Crypto API and provide sjcl only as a fallback.
Browser support is good.