-
-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor the code generator #75
base: master
Are you sure you want to change the base?
Conversation
- Make it prettier - Make it faster - Make the interface to the pretty printer better
astor/node_util.py
Outdated
@@ -132,6 +132,7 @@ class ExplicitNodeVisitor(ast.NodeVisitor): | |||
|
|||
""" | |||
|
|||
@staticmethod | |||
def abort_visit(node): # XXX: self? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I guess the addition of staticmethod
decorator would make the # XXX: self?
comment obsolete.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhhh. Yeah. (Mumbles to himself about why he did something based on a comment then didn't remove the comment.)
astor/code_gen.py
Outdated
|
||
def to_source(node, indent_with=' ' * 4, add_line_information=False, | ||
pretty_string=pretty_string, pretty_source=pretty_source): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would potentially break the public API. We need a at least a short deprecation period (maybe remove them in 0.7?) to remove them from the to_source
signature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting case. These weren't there in 0.5 (pretty printer added right after that), and I unthinkingly already changed the call signature to pretty_source a couple of commits ago, and the internals are now way different.
Having said that, I think we can actually do a reasonable job of supporting that interim API with the current code, so I'll go back and look at that. I'm at (real) work today, so that will have to wait until Thursday.
Also, I think that it might be useful to make a ToSource
class to make things more easily configurable, and make to_source a convenience method of that class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, good catch! You're right that these weren't in 0.5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, unless you know of a major user using them, we should probably not support them, since anybody using them was pulling from git rather than pypi anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only major user I know of is Hylang and I doubt they are doing anything with these parameters :) So let's get rid of them!
astor/code_gen.py
Outdated
|
||
def else_body(node, | ||
# Constants | ||
If=ast.If, set_precedence=set_precedence, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, are these optimizations still produce big speedups in a modern Python? We don't use this technique in the Python stdlib anymore because some core developers said that the performance gain is no longer huge, but I don't remember seeing any benchmark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The gains do drop off. I tend to apply them to functions that are called often, and/or reference the attributes in loops. I'd have to go back and look, but I think I'm getting about an 8% speedup in to_unparse
(around 5% overall in rtrip) this way.
There were additional gains to be had by adding a proliferation of dispatchers, e.g. adding a write
method that handled a single item with no for loop, but that was too much stuff to remember, and made the API too big.
I actually went the other way, decomposing conditional_write
into condition
and write
, and coalescing the write
and visit
methods into a single write
, so in one very real sense, these parameter constant micro-optimizations (which I don't mind looking at and thinking about or ignoring, as the case may be) offset the speed losses from making the API more regular and orthogonal and reducing the number of dispatchers.
I also reverted some changes there was a big discussion about last year about whether we should pass a string or a boolean for async
to node handlers that can handle two things. Two things decided it for me: (1) since async is now a keyword, my editor screams at me everywhere it appears, and I think the editor is right -- there's no real reason to use it as a variable name even where you can; and (2) to me this is still more natural (and less code). Having said that, if anybody still really feels strongly the other way, we could change it back, but they have to pick a different variable name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But yeah, overall, I'd say that the gains from doing this are much smaller. Attribute lookups must be much faster than they used to be, because one of the things that really surprised me was that the gains from using closures are much smaller now, and non-existent or negative in a few corner, presumably for the same reason. You have to have a lot of attribute accesses to offset the cost of building the closure.
- Make interface to code_gen better - Support old 0.5 API better - Shorten more lines by splitting strings
OK, @berkerpeksag I took your hint and removed the micro-optimizations. The code looks a lot better, and now I think it performs a lot better, too. Stdlib roundtrips aren't nearly so ugly, because long strings are wrapped. Anyway, this is all I have time for for now. I believe the code is in pretty good shape:
Still needs more doc and whatever new things we want to do to rtrip, but I don't know that we actually have anything gating a 0.6 release, given that the doc is better than it used to be. |
- Add a test for subclassing that shows how to add custom nodes for comments, as requested in issue #50.
In my local copy, I had made Then I remembered the use-case that was in the back of my mind -- issue #50 doesn't really need changes in astor itself, if the code is structured well enough, so I went ahead and checked in changes to support that. Maybe later, after we have some experience and a lot of demand, we could make this a standard library feature, but I think the new test file showing how to add comment nodes should be more than good enough for anybody who wants to do that for now. |
What's the state of this Pull Request? Is anything blocking us from merging it? @pmaupin mentions writing more docs - is this still needed? As this PR sits open, it will become harder and harder to rebase it because we're making changes to Since this PR was opened, we've merged 26 commits $ git log --oneline 28b081a..master | wc -l
26 and changed these files: $ git diff --stat 28b081a..master
.travis.yml | 5 +++
AUTHORS | 2 ++
Makefile | 6 ++--
README.rst | 6 ++--
astor/__init__.py | 11 ++----
astor/code_gen.py | 18 +++++-----
astor/codegen.py | 11 ++++++
astor/rtrip.py | 67 ++----------------------------------
astor/tree_walk.py | 2 +-
docs/changelog.rst | 159 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------
docs/index.rst | 240 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------
setup.cfg | 42 +++++++++++++++++++++++
setup.py | 56 ++++++++----------------------
tests/support.py | 44 ++++++++++++++++++++++++
tests/test_code_gen.py | 18 +++++++---
tests/test_misc.py | 50 +++++++++++++++------------
tox.ini | 4 +--
17 files changed, 453 insertions(+), 288 deletions(-) However, the $ git diff 28b081a..master -- astor/code_gen.py | xsel -b
diff --git a/astor/code_gen.py b/astor/code_gen.py
index 7c27f70..47d6acc 100644
--- a/astor/code_gen.py
+++ b/astor/code_gen.py
@@ -308,8 +308,8 @@ class SourceGenerator(ExplicitNodeVisitor):
self.statement(node)
self.generic_visit(node)
- def visit_FunctionDef(self, node, async=False):
- prefix = 'async ' if async else ''
+ def visit_FunctionDef(self, node, is_async=False):
+ prefix = 'async ' if is_async else ''
self.decorators(node, 1 if self.indentation else 2)
self.statement(node, '%sdef %s' % (prefix, node.name), '(')
self.visit_arguments(node.args)
@@ -322,7 +322,7 @@ class SourceGenerator(ExplicitNodeVisitor):
# introduced in Python 3.5
def visit_AsyncFunctionDef(self, node):
- self.visit_FunctionDef(node, async=True)
+ self.visit_FunctionDef(node, is_async=True)
def visit_ClassDef(self, node):
have_args = []
@@ -364,24 +364,24 @@ class SourceGenerator(ExplicitNodeVisitor):
self.else_body(else_)
break
- def visit_For(self, node, async=False):
+ def visit_For(self, node, is_async=False):
set_precedence(node, node.target)
- prefix = 'async ' if async else ''
+ prefix = 'async ' if is_async else ''
self.statement(node, '%sfor ' % prefix,
node.target, ' in ', node.iter, ':')
self.body_or_else(node)
# introduced in Python 3.5
def visit_AsyncFor(self, node):
- self.visit_For(node, async=True)
+ self.visit_For(node, is_async=True)
def visit_While(self, node):
set_precedence(node, node.test)
self.statement(node, 'while ', node.test, ':')
self.body_or_else(node)
- def visit_With(self, node, async=False):
- prefix = 'async ' if async else ''
+ def visit_With(self, node, is_async=False):
+ prefix = 'async ' if is_async else ''
self.statement(node, '%swith ' % prefix)
if hasattr(node, "context_expr"): # Python < 3.3
self.visit_withitem(node)
@@ -392,7 +392,7 @@ class SourceGenerator(ExplicitNodeVisitor):
# new for Python 3.5
def visit_AsyncWith(self, node):
- self.visit_With(node, async=True)
+ self.visit_With(node, is_async=True)
# new for Python 3.3
def visit_withitem(self, node): |
I think it was mergeable when made. Didn't want to do it because I thought
a new release was imminent, and since it was a big change, wanted to do it
after the big release.
I think it's a lot better than the old one; if it passes tests, should
probably merge it.
…On Fri, May 4, 2018 at 4:35 AM, Radomír Bosák ***@***.***> wrote:
What's the state of this Pull Request? Is anything blocking us from
merging it?
@pmaupin <https://github.com/pmaupin> mentions writing more docs - is
this still needed?
As this PR sits open, it will become harder and harder to rebase it
because we're making changes to astor.code_gen.py.
Since this PR was opened, we've merged 26 commits
$ git log --oneline 28b081a..master | wc -l26
and changed these files:
$ git diff --stat 28b081a..master .travis.yml | 5 +++ AUTHORS | 2 ++ Makefile | 6 ++-- README.rst | 6 ++-- astor/__init__.py | 11 ++---- astor/code_gen.py | 18 +++++----- astor/codegen.py | 11 ++++++ astor/rtrip.py | 67 ++---------------------------------- astor/tree_walk.py | 2 +- docs/changelog.rst | 159 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------- docs/index.rst | 240 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------- setup.cfg | 42 +++++++++++++++++++++++ setup.py | 56 ++++++++---------------------- tests/support.py | 44 ++++++++++++++++++++++++ tests/test_code_gen.py | 18 +++++++--- tests/test_misc.py | 50 +++++++++++++++------------ tox.ini | 4 +-- 17 files changed, 453 insertions(+), 288 deletions(-)
However, the code_gen.py file changed minimally, so there shouldn't be
any problem rebasing this PR.
$ git diff 28b081a..master -- astor/code_gen.py | xsel -bdiff --git a/astor/code_gen.py b/astor/code_gen.py
index 7c27f70..47d6acc 100644--- a/astor/code_gen.py+++ b/astor/code_gen.py@@ -308,8 +308,8 @@ class SourceGenerator(ExplicitNodeVisitor):
self.statement(node)
self.generic_visit(node)
- def visit_FunctionDef(self, node, async=False):- prefix = 'async ' if async else ''+ def visit_FunctionDef(self, node, is_async=False):+ prefix = 'async ' if is_async else ''
self.decorators(node, 1 if self.indentation else 2)
self.statement(node, '%sdef %s' % (prefix, node.name), '(')
self.visit_arguments(node.args)@@ -322,7 +322,7 @@ class SourceGenerator(ExplicitNodeVisitor):
# introduced in Python 3.5
def visit_AsyncFunctionDef(self, node):- self.visit_FunctionDef(node, async=True)+ self.visit_FunctionDef(node, is_async=True)
def visit_ClassDef(self, node):
have_args = []@@ -364,24 +364,24 @@ class SourceGenerator(ExplicitNodeVisitor):
self.else_body(else_)
break
- def visit_For(self, node, async=False):+ def visit_For(self, node, is_async=False):
set_precedence(node, node.target)- prefix = 'async ' if async else ''+ prefix = 'async ' if is_async else ''
self.statement(node, '%sfor ' % prefix,
node.target, ' in ', node.iter, ':')
self.body_or_else(node)
# introduced in Python 3.5
def visit_AsyncFor(self, node):- self.visit_For(node, async=True)+ self.visit_For(node, is_async=True)
def visit_While(self, node):
set_precedence(node, node.test)
self.statement(node, 'while ', node.test, ':')
self.body_or_else(node)
- def visit_With(self, node, async=False):- prefix = 'async ' if async else ''+ def visit_With(self, node, is_async=False):+ prefix = 'async ' if is_async else ''
self.statement(node, '%swith ' % prefix)
if hasattr(node, "context_expr"): # Python < 3.3
self.visit_withitem(node)@@ -392,7 +392,7 @@ class SourceGenerator(ExplicitNodeVisitor):
# new for Python 3.5
def visit_AsyncWith(self, node):- self.visit_With(node, async=True)+ self.visit_With(node, is_async=True)
# new for Python 3.3
def visit_withitem(self, node):
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#75 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ACRNNmLDuRhU1AKEGHbvAAy1Q_sR-TXoks5tvCDGgaJpZM4NIZqT>
.
|
I think I'm the blocker here :) I was going to take a look Patrick's PRs (this one and #78) but unfortunately never had a chance to create some time for a review. |
Currently, I really only care about to_source and rtrip (and tests for those). It's been years since I touched any of the other stuff.
So, I've been looking at refactoring all the kludgey pretty printing stuff I put in.
I think we should do this before the release, so that we don't release a kludgey pretty-printing API and then a better one.
I have code_gen refactored really well (I think) and the API over to the pretty printer is solid.
I would at least like to have it printing as well as it does now, preferably more PEP8 like. Right now, it doesn't do as many string changes as it used to, but it will be much easier to make the pretty printer better (and/or to replace it from external code), because all the string munging happens after the code generator is finished. (Before, the literal string fixups happened during the middle of code generation.)
Also, the code generator creates a list of lists, so the pretty printer doesn't have to separate the lines out -- they are already separated.
I'm tossing this out there to start the conversation. It needs PEP8 cosmetic changes as well, but I need to go to bed.