Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #1399 - Add a grammar corresponding to the parsing algorithm #1759

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

recvfrom
Copy link
Contributor

@recvfrom recvfrom commented Oct 19, 2021

This PR add a grammar that corresponds to the algorithm parsing section.

A grammar currently exists in section 4.4.1 but it's much stricter than what user agents actually enforce, and there have been several instances where that grammar was assumed to correspond with the cookie parsing algorithm defined in section 5. By having a separate grammar in section 5, hopefully we can reduce confusion while also preserving the high-level view that a grammar provides.

I'm not sure whether it's worth keeping around the existing grammar in section 4.4.1, but in this initial work I have not attempted to remove it.

CC @sbingler and @miketaylr

draft-ietf-httpbis-rfc6265bis.md Outdated Show resolved Hide resolved
draft-ietf-httpbis-rfc6265bis.md Outdated Show resolved Hide resolved
draft-ietf-httpbis-rfc6265bis.md Outdated Show resolved Hide resolved
draft-ietf-httpbis-rfc6265bis.md Outdated Show resolved Hide resolved
draft-ietf-httpbis-rfc6265bis.md Outdated Show resolved Hide resolved
@sbingler
Copy link
Collaborator

I'm not sure whether it's worth keeping around the existing grammar in section 4.4.1, but in this initial work I have not attempted to remove it.

4.4.1 defines the behavior for a well behaved server, while your addition defines what UAs must accept to be compatible with the real world (and misbehaving servers). I think 4.4.1 still serves a useful purpose and I think we should keep it.

@recvfrom
Copy link
Contributor Author

I believe I've address all feedback with the most recent commit, but please let me know if I missed anything.


cookie-av = expires-av / max-age-av / domain-av /
path-av / secure-av / httponly-av /
samesite-av / extension-av
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pattern of ABNF definition we're not using in http-core anymore. Note that by defintion, any valid cookie-av will match extension-av (right?).

It would be better to just define the common syntax of any cookie-av, and then define seperate ABNFs for the individual values.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point - I moved the attribute grammars into separate ABNFs. Thanks for the feedback!

recvfrom and others added 5 commits November 1, 2021 12:54
This commit addresses review feedback and improves the correctness
of the grammar for various edge cases (Ex: `Domain=;` and `Path;`).
It also fixes some issues with how '/' was used in the previous
commit.
@recvfrom recvfrom force-pushed the issue-1399-update-abnf branch from 6ad2421 to 38fc7af Compare November 1, 2021 17:04
@recvfrom recvfrom requested a review from reschke November 1, 2021 17:09
~~~ abnf
expires-av = "Expires" BWS "=" BWS cookie-date BWS
; cookie-date is defined in the "Dates" section.
~~~
Copy link
Contributor

@reschke reschke Nov 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time of parsing when "Expires" is detected, the parsing into name and value has already happened.

Thus, I would just define the syntax for the value, as in:

expires-v = cookie-date

Note this is how multi-level grammars are defined in the core HTTP specs as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see - thank you for the example @reschke.

It seems like we'd lose a bit of completeness if we only include the attribute value in the grammar. For instance, the grammar wouldn't explicitly tie the "Expires" keyword to the Expires attribute, and wouldn't convey whether "ExPiReS" is treated the same way as "Expires". Having the grammar format used here align with the convention in the core HTTP specs seems desirable as well, though.

Does anyone else have thoughts on this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

. For instance, the grammar wouldn't explicitly tie the "Expires" keyword to the Expires attribute

It by definition can't do that anyway, as long as the grammar allows for extensibilty - an ABNF-driven parser would simply switch to the code branch for new attributes, and then the value syntax would remain unconstrained.

You really can't do this in ABNF; it needs to be done in prose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If our goal is to have an ABNF-driven parser function correctly, it seems like we'd need to go back to the using the previous definition of cookie-av. To address the original concern of "any valid cookie-av will match extension-av", maybe we could do something like the following:

cookie-av          = expires-av / max-age-av / domain-av /
                     path-av / secure-av / httponly-av /
                     samesite-av / extension-av

extension-av       = 1*cookie-name-octet BWS extension-eq-value BWS
extension-eq-value = "" / ("=" BWS optional-value)
    ; attributes solely matching extension-av should be treated as unrecognized by
    ; cookie parsers strictly conforming to this RFC

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you can solve this by adding prose (comments) to the ABNF.

The ABNF should be something that can translate to how a parser processes the input. Parsers in general parse first generic structures, then look at specific micro syntaxes of values.

That's what we're doing in the base spec, and that's what I'll keep suggesting to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

Successfully merging this pull request may close these issues.

6 participants