Fix how we handle the Names attribute #11
Comments
Handling different node types is something that I really want to avoid, because it is the road to hellish complexity and big hacks. But anyway, let's ignore this problem for now because we have very little time. Let's use what we have in the current DB. It will not be as precise, but it is 100% enough for the library embeddings task.
I agree, although doing so only for imports may be useful in the future. Anyway, I just reported the bug because I saw it; as we won't be rerunning this any time soon, I'd rather record these issues here and fix them later.
Found another related issue: the sorting of nodes gets messed up because of this. For instance, if we have the Java import:
Then we get a |
Found another related issue. Basically, all imports from Ruby get removed due to this. That is because the
Found another related issue: part of the JavaScript imports are getting purged, as with Python. It is not impeding progress, but I thought I'd mention it, as there was some strange stuff as well.
Currently, when a node is being traversed, we check multiple attributes to find its value. To do so, we rely on the assumption that there will be at most one value, and check, successively:

- the `Name`/`name` field
- the `Text`/`text` field
- the `Value`/`value` field
- the `Names` attribute. We handle this one differently: if the field is an array of nodes, which it always is, then we join the value of each of these nodes' `Name` attribute, if they have one.

In order to avoid duplication, we then ignore the `Names` attribute in the `goDeeper` function. While this provides utility, it has 2 risks, one of which I am certain exists:

1. [...] `Names` attribute, we will not traverse the nodes in it. I have not (yet) found an example for this, but it is a possibility.
2. `Names` may contain proper nodes that are nested, thus not having a `Name` attribute. This is the case in the following example:

Here, the `uast:RuntimeImport` node has the following structure:

- a `Path` attribute with a single `uast:Identifier` node, with value `foo`
- a `uast:Alias` node in the `Names` attribute, with a `Node` attribute containing the `uast:Identifier` node with value `bar`, and a `Name` attribute containing the `uast:Identifier` node with value `baz`
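To make the failure mode concrete, here is a minimal sketch of the lookup and traversal described above. It uses plain Python dicts as a hypothetical stand-in for UAST nodes (this is not the Babelfish client API), and `node_value`, `go_deeper`, the `"."` join separator and the `@type` key are all assumptions made for illustration. The example node mirrors the `uast:RuntimeImport` structure listed above, which would correspond to something like `from foo import bar as baz` in Python.

```python
# Hypothetical dict-based stand-in for UAST nodes; not the real traversal code.
VALUE_KEYS = ("Name", "Text", "Value")  # each is also tried lower-cased


def node_value(node):
    """Return the first value found, checking attributes in the order above."""
    for key in VALUE_KEYS:
        for candidate in (key, key.lower()):
            value = node.get(candidate)
            if isinstance(value, str):
                return value
    names = node.get("Names")
    if isinstance(names, list):
        # Join the Name of each child that has a plain string Name.
        # The "." separator is an assumption.
        parts = [n["Name"] for n in names
                 if isinstance(n, dict) and isinstance(n.get("Name"), str)]
        return ".".join(parts) if parts else None
    return None


def go_deeper(node, out):
    """Collect values depth first, skipping the Names attribute."""
    value = node_value(node)
    if value is not None:
        out.append(value)
    for key, child in node.items():
        if key == "Names":  # ignored to avoid duplicating the joined value
            continue
        children = child if isinstance(child, list) else [child]
        for c in children:
            if isinstance(c, dict):
                go_deeper(c, out)
    return out


runtime_import = {
    "@type": "uast:RuntimeImport",
    "Path": {"@type": "uast:Identifier", "Name": "foo"},
    "Names": [{
        "@type": "uast:Alias",
        "Name": {"@type": "uast:Identifier", "Name": "baz"},
        "Node": {"@type": "uast:Identifier", "Name": "bar"},
    }],
}

collected = go_deeper(runtime_import, [])
print(collected)  # ['foo']
```

In this sketch the alias's `Name` is itself a node rather than a string, so the join over `Names` yields nothing, and since `Names` is never traversed, `bar` and `baz` are dropped.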
Now, unfortunately this is not a Babelfish bug, as this structure for aliases is always the same: we replace a `uast:Identifier` node with a `uast:Alias` node that has a `Name` and a `Node` attribute. Also, this makes sense, as in the following snippet the `Names` attribute would have 2 alias nodes instead of one:
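As a rough illustration of that replacement, the sketch below builds a `Names` list with two aliases, using plain Python dicts as a hypothetical stand-in for UAST nodes. The `identifier` and `alias` helpers and the imported names (`bar`, `baz`, `qux`, `quux`) are made up for the example; they are not part of Babelfish.

```python
def identifier(name):
    """Hypothetical stand-in for a uast:Identifier node."""
    return {"@type": "uast:Identifier", "Name": name}


def alias(node_name, alias_name):
    """Hypothetical stand-in for a uast:Alias wrapping an identifier."""
    return {
        "@type": "uast:Alias",
        "Name": identifier(alias_name),  # the alias, itself a node
        "Node": identifier(node_name),   # the aliased identifier
    }


# Roughly what two aliased imports from the same path would produce.
names = [alias("bar", "baz"), alias("qux", "quux")]

# None of the entries exposes a plain string Name at the top level,
# so a lookup that only reads n["Name"] as a string finds nothing.
flat_names = [n["Name"] for n in names if isinstance(n["Name"], str)]
print(len(names), flat_names)  # 2 []
```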
Anyway, this is a problem, because currently we lose this information. In the first snippet, only the `foo` identifier is kept. I am going to check out what we can do and will push a PR when a proper solution is found. I think we should start dealing with import nodes in a specific way, and just go deeper on the `Names` attribute in all other cases, but I've got to look more into this first.
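One possible shape for that idea, purely as a sketch and not the eventual fix: treat import nodes as a special case that records the path and every name under `Names`, and otherwise traverse `Names` like any other attribute. Nodes are again plain Python dicts used as a hypothetical stand-in for UAST nodes; `IMPORT_TYPES`, `identifier_names` and `collect` are assumed names, and only `uast:RuntimeImport` is listed because it is the only import type mentioned here.

```python
IMPORT_TYPES = {"uast:RuntimeImport"}  # hypothetical; extend as needed


def identifier_names(node):
    """Yield every plain string Name found under node, depth first."""
    if isinstance(node, dict):
        if isinstance(node.get("Name"), str):
            yield node["Name"]
        for child in node.values():
            yield from identifier_names(child)
    elif isinstance(node, list):
        for item in node:
            yield from identifier_names(item)


def collect(node, out):
    """Collect values, with import-specific handling."""
    if node.get("@type") in IMPORT_TYPES:
        # Keep the path and the full content of each entry in Names.
        out.append({
            "path": list(identifier_names(node.get("Path"))),
            "names": [list(identifier_names(n))
                      for n in node.get("Names", [])],
        })
        return out
    name = node.get("Name")
    if isinstance(name, str):
        out.append(name)
    for child in node.values():  # Names is no longer skipped
        children = child if isinstance(child, list) else [child]
        for c in children:
            if isinstance(c, dict):
                collect(c, out)
    return out


runtime_import = {
    "@type": "uast:RuntimeImport",
    "Path": {"@type": "uast:Identifier", "Name": "foo"},
    "Names": [{
        "@type": "uast:Alias",
        "Name": {"@type": "uast:Identifier", "Name": "baz"},
        "Node": {"@type": "uast:Identifier", "Name": "bar"},
    }],
}

import_result = collect(runtime_import, [])
print(import_result)
# [{'path': ['foo'], 'names': [['baz', 'bar']]}]
```

With this approach `foo`, `bar` and `baz` all survive. The trade-off is the one raised in the comments: every node type that gets its own branch adds complexity, so the sketch limits the special case to imports.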