Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BSV struct syntax can mistakenly match Classic data constructors #353

Closed
quark17 opened this issue Apr 21, 2021 · 5 comments
Closed

BSV struct syntax can mistakenly match Classic data constructors #353

quark17 opened this issue Apr 21, 2021 · 5 comments
Labels
bug Something isn't working

Comments

@quark17
Copy link
Collaborator

quark17 commented Apr 21, 2021

Code in Bluespec's Flute CPU defines a Meta struct type and this code now fails to compile since a Meta data constructor (and type) was introduced into the Prelude as part of the Generics feature. The issue was reported as bluespec/Flute#34. The following small example illustrates the issues raised there:

import Vector::*;


typedef struct {
  Bool f1;
  Bool f2;
} Meta deriving (Bits);


module mkTest();

  // -----
  // Trigger error T0007

  function Meta fn();
    let res = Meta {f1: True, f2: False};
    return res;
  endfunction

  // -----
  // Trigger error T0080

  Reg #(Vector #(4, Meta)) rg <- mkRegU;

  rule r;
    let vec = rg;
    vec[0] = Meta {f1: True, f2: False};
  endrule

endmodule

The example leads to the following errors:

Error: "Test.bsv", line 17, column 15: (T0007)
  Unbound type constructor `Meta_$Meta'
Error: "Test.bsv", line 28, column 14: (T0080)
  Type error at the use of the following function:
    Meta

  The expected return type of the function:
    Test::Meta

  The return type according to the use:
    Meta#(b__, a__)

In bsc/src/comp/TCheck.hs, I see the following code:

tiExpr as td exp@(CStruct c ies) = do
    --trace ("CStruct " ++ ppReadable exp) $ return ()                                                                        
    mti <- (findCons td c >>= \ (_, ti) -> return (Just ti)) `handle` \ _ -> return Nothing
    case mti of
     Just ti -> tiExpr as td (CApply (CCon0 Nothing c) [CStruct (mkTCId ti c) ies])
     Nothing -> do
      find_res <- findTyCon c
      case find_res of
       --- XXX: assuming kind will be there because this is a struct                                                          
       tyc@(TyCon c' (Just k) ti) ->
        case ti of
         TIstruct ss qfs -> do
           ... [typecheck struct expression] ...
         _ -> err (getPosition c, ENotStructId (pfpString c))
       _ -> internalError ("tiExpr: unexpected findTyCon result")

The first analysis that this code does is to lookup the ID and see if it's a constructor -- if not, then normal typecheck of a struct proceeds; but, if a constructor is found by that name, then BSC assumes that this is actually a call of that constructor and that the fields are its values. Recall that BSC internally represents all data constructors as having a single argument; when the constructor has multiple arguments (which I believe can only occur in BH/Classic?), BSC creates a new struct type to be the single argument (holding the multiple values). The name Meta_$Meta that we see in the error message is the name of that internally-created struct.

In the BSV syntax, it is clear that a BH/Classic constructor is not intended; so possibly the parser can better signal this to the typechecker. However, maybe the problem is just in the typechecker. Can we write a similar example in BH/Classic that also triggers the issue?

Could it be that the findCons check in the typecheck of CStruct should be removed? Removing it would certainly fix the behavior on the BSV example above. Would it break anything in BH/Classic?

I recall that in 2018 (pre-open BSC), @nanavati addressed an issue that these internal structs were being exposed to the user in BH/Classic. Prior to his fix, the following two declarations were interchangeable:

data List a = Nil | Cons a (List a)
data List a = Nil | Cons { head :: a; tail :: (List a) }

I would think that the findCons check is still needed for constructors with named fields? But the code should at least confirm that the constructor takes a struct with named fields (which it is not currently doing).

If we need to keep this check, we should at least make it a little smarter. We should simultaneously check whether the ID is a constructor and a struct, and if it's both then we should look at the fields. If the constructor doesn't have named fields, then it's not a candidate; if the field names match the field names of one, then use that one; if the field names don't exactly match either then ... error message?

This all came up because new names were added to the Prelude, that (in the case of Meta) are common enough to conflict with user names. I don't know that there's a need to rename the Generics names, but I'll tag @krame505 in case he wants to think about it.

@quark17 quark17 added the bug Something isn't working label Apr 21, 2021
@quark17
Copy link
Collaborator Author

quark17 commented Apr 21, 2021

It's also worth noting, in the T0080 message, that the expected type is qualified (Test::Meta) but the use type is not (Meta). This is because the qualifier is Prelude (Prelude::Meta) but a decision was made (ages ago) to strip Prelude qualifiers, to make the error messages easier to read and friendlier for hardware designers. In this case, though, it's too easy to think that something has gone wrong within this one file, and overlook that this is a conflict with a type in the Prelude (which would be salient if the Prelude qualifier were shown). It may be worth reconsidering how much Prelude is hidden in messages.

@krame505
Copy link
Contributor

Hmm, that is rather unfortunate. How hard would be to add Piccolo, Flute, etc. to the github actions build integration on bsc? Would help catch this sort of unexpected breakage before bsc changes are merged.

I would prefer not to rename Meta to something else, a short name here is desirable since this constructor shows up in quite a few places in the deriving code, as well as in uses of generics.

@quark17
Copy link
Collaborator Author

quark17 commented Apr 21, 2021

The Toooba CPU is already being tested. It's worth adding Piccolo and Flute. (Some amount of code is shared among them, although Toooba probably shares the least.) One issue, though, is that each CPU has many options that can be configured, and different code is included depending on the chosen options -- so we'd have to add many different builds, in order to guarantee that we're covering all the code. For example, if I had added Flute in the same way that I'd added Toooba, it would not have failed! That's because Flute's main build is not triggering this issue; it's only a special build in a subdirectory that is failing.

I am looking into whether that first check for CStruct can be removed. I believe that it's only being used by this code, in the typechecking of CCon:

tiExpr as td exp@(CCon c es) = do
    (c' :>: sc, ti) <- findCons td c
    ...
    if numGiven > numConExpected
    ...
      else do ...
              let res =
                    case es ++ map CVar extraArgs of
                      []  -> let unit = setIdPosition (getPosition c) idPrimUnit
                             in CApply (CCon0 Nothing c) [CStruct unit []]
                      [e] -> CApply (CCon0 Nothing c) [e]
                      es  -> CStruct c (zipWith (\ i e -> (setIdPosition (getPosition e) i, e)) tupleIds es)
              tiExpr as td $ foldr CLam res $ map Right extraArgs

There is a case-expression on the arguments that, when there are multiple arguments, replaces the whole constructor with a CStruct, that is then converted back to CCon0 of a CStruct in the code we saw before. Why not just create the CCon0 application here (as the other case arms are doing)? I am investigating that change.

@quark17
Copy link
Collaborator Author

quark17 commented Apr 21, 2021

It appears that the parsing of BH/Classic uses CStruct to represent construction of both structs and types with constructors with named fields:

data Foo =
    C1 { f1::Bool; f2::(Bit 16); } 
  | C2

struct UInstr =
        pc :: IAddress
        instr :: IValue

// These use the same syntax
Foo x = C1 { f1 = True; f2 = 0 }
UInstr y = UInstr { pc = 0; instr = 0; }

Both of the constructions use CStruct, but in one case the name is a type name and in one case the name is a constructor name. This causes a conflict in name spaces (in BH/Classic)! At the moment, BSC resolves it by giving precedence to constructor names. As I said above, when there's both a type and a constructor by that name, BSC will need to do a little more work to decide which is intended.

It does appear that BSV is also taking advantage of this double meaning for CStruct, because it uses the same parser function (pConstructorPrimaryWith) for parsing the named fields of tagged unions as for the fields of a bare constructor name (which is the struct syntax?). If the parser has more information about whether it's a struct vs a constructor, maybe that info could be included as a new field in CStruct.

quark17 added a commit to quark17/bsc that referenced this issue May 2, 2021
In BH/Classic, the same syntax is used for expressing structs and
constructors with named fields.  The BH/Classic parser represents both
with CStruct and leaves it to the typechecker to disambiguate.  In BSV,
the syntax indicates which is intended, but that information was being
lost.  This has been fixed by adding a field to CStruct to allow the
BSV parser to record whether a CStruct is known to be a struct or a
constructor.  There are also CStruct expressions created inside BSC
where the meaning is known, and that can now also be recorded.
This reduces the work that the typechecker needs to do and avoids
mistakenly interpreting a CStruct in the wrong way.

This change uncovered that a number of BSV files in the testsuite were
using the wrong syntax.  These have been fixed by adding or removing
the keyword 'tagged' as needed.  New tests have been added, to test the
current state of the parsers and typechecker, testing this change and
other issues raised in bug B-Lang-org#353.
quark17 added a commit to quark17/bsc that referenced this issue May 2, 2021
The previous commit removed the need to disambiguate CStruct value
expressions created by the BSV parser.  CStruct values created by the
BH/Classic parser still need to be disambiguated.  This commit
improves the heuristic for that.  The BH/Classic test for bug B-Lang-org#353 now
compiles.

This commit also disallows using CStruct for constructors without
named fields.  A new error message is given for this situation.  Tests
for this situation are updated.  The tests reveal that non-named
fields can still be mentioned in pattern matching, which will need to
be fixed.

It turns out that typechecking of fields in patterns (CPstruct) is
done by different code in BSC, so the improvements so far have not
addressed patterns.  New test examples are added, that show that the
original disambiguation issue still exists for CPstruct in pattern
matches.
quark17 added a commit to quark17/bsc that referenced this issue May 2, 2021
In the parsing of patterns, allow a struct syntax (similar to
constructor syntax but without 'tagged').  Fix the tests in the
testsuite where a 'match' statement used 'tagged' syntax -- these
tests broke when the typecheck of 'match' was made stricter, to only
allow struct syntax.

Update the Bug B-Lang-org#353 tests that needed struct patterns.  And have the
testing try both struct and constructor syntax, to confirm that one
compiles and the other does not.  And add a pattern-match version of
the test that the syntaxes aren't interchangeable.

Add tests in the syntax testing directory.  And add tests showing
that the parsing of pattern matching is using 'try' in ways that
lead to poor error messages -- something to be addressed.
quark17 added a commit to quark17/bsc that referenced this issue May 6, 2021
In the parsing of patterns, allow a struct syntax (similar to
constructor syntax but without 'tagged').  Fix the tests in the
testsuite where a 'match' statement used 'tagged' syntax -- these
tests broke when the typecheck of 'match' was made stricter, to only
allow struct syntax.

Update the Bug B-Lang-org#353 tests that needed struct patterns.  And have the
testing try both struct and constructor syntax, to confirm that one
compiles and the other does not.  And add a pattern-match version of
the test that the syntaxes aren't interchangeable.

Add tests in the syntax testing directory.  And add tests showing
that the parsing of pattern matching is using 'try' in ways that
lead to poor error messages -- something to be addressed.
quark17 added a commit that referenced this issue May 6, 2021
In BH/Classic, the same syntax is used for expressing structs and
constructors with named fields.  The BH/Classic parser represents both
with CStruct and leaves it to the typechecker to disambiguate.  In BSV,
the syntax indicates which is intended, but that information was being
lost.  This has been fixed by adding a field to CStruct to allow the
BSV parser to record whether a CStruct is known to be a struct or a
constructor.  There are also CStruct expressions created inside BSC
where the meaning is known, and that can now also be recorded.
This reduces the work that the typechecker needs to do and avoids
mistakenly interpreting a CStruct in the wrong way.

This change uncovered that a number of BSV files in the testsuite were
using the wrong syntax.  These have been fixed by adding or removing
the keyword 'tagged' as needed.  New tests have been added, to test the
current state of the parsers and typechecker, testing this change and
other issues raised in bug #353.
quark17 added a commit that referenced this issue May 6, 2021
The previous commit removed the need to disambiguate CStruct value
expressions created by the BSV parser.  CStruct values created by the
BH/Classic parser still need to be disambiguated.  This commit
improves the heuristic for that.  The BH/Classic test for bug #353 now
compiles.

This commit also disallows using CStruct for constructors without
named fields.  A new error message is given for this situation.  Tests
for this situation are updated.  The tests reveal that non-named
fields can still be mentioned in pattern matching, which will need to
be fixed.

It turns out that typechecking of fields in patterns (CPstruct) is
done by different code in BSC, so the improvements so far have not
addressed patterns.  New test examples are added, that show that the
original disambiguation issue still exists for CPstruct in pattern
matches.
quark17 added a commit that referenced this issue May 6, 2021
In the parsing of patterns, allow a struct syntax (similar to
constructor syntax but without 'tagged').  Fix the tests in the
testsuite where a 'match' statement used 'tagged' syntax -- these
tests broke when the typecheck of 'match' was made stricter, to only
allow struct syntax.

Update the Bug #353 tests that needed struct patterns.  And have the
testing try both struct and constructor syntax, to confirm that one
compiles and the other does not.  And add a pattern-match version of
the test that the syntaxes aren't interchangeable.

Add tests in the syntax testing directory.  And add tests showing
that the parsing of pattern matching is using 'try' in ways that
lead to poor error messages -- something to be addressed.
@quark17 quark17 closed this as completed May 6, 2021
@quark17
Copy link
Collaborator Author

quark17 commented Nov 2, 2021

FYI, this was also bug 1711 in Bluespec Inc's pre-GitHub bug database. There, the example was this code:

import StmtFSM::*;
typedef struct {t x;} Update#(type t) deriving (Bits,Eq);
module mkUpdateBug();
   Update#(int) u = Update{x: 55};
endmodule

which gave the response:

Error: "UpdateBug.bsv", line 6, column 21: (T0080)
  Type error at the use of the following function:
    Update
  The expected return type of the function:
    UpdateBug::Update#(Int#(32))
  The return type according to the use:
    StmtFSM::ActionType

because StmtFSM exports this datatype:

data ActionType
        = Default
        | Update Freedom
        | Jump String
        | Wait
        | NoME

Now that BSC is keeping track of whether the BSV syntax was for a struct or tagged union (and is doing better at disambiguating by type, even for BH which doesn't distinguish between the two), this example compiles with no error.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants