Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pipe ka-clone's logging to stdout and stderr depending on level so other scripts can parse non-fatal errors #21

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from
Draft
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 33 additions & 2 deletions bin/ka-clone
Original file line number Diff line number Diff line change
Expand Up @@ -168,6 +168,8 @@ def _default_email():
).decode('utf-8')
return kac_email.rstrip()
except subprocess.CalledProcessError:
# Send to stderr
logging.error('** No git email set')
return os.environ['USER'] + "@" + DEFAULT_EMAIL_DOMAIN


Expand Down Expand Up @@ -295,6 +297,8 @@ def run_make_hooks():
cwd=_top_level_dir())
_cli_log_step_indented_info("Ran make hooks successfully")
except subprocess.CalledProcessError as e:
# Send to stderr
logging.error('** Failed to run make hooks')
_cli_log_step_indented_info("Failed to run make hooks")
print("\nError: {}".format(e.output.decode('utf-8')))

Expand Down Expand Up @@ -328,6 +332,8 @@ def _existing_commit_template(cwd):
).decode('utf-8')
return template.strip()
except subprocess.CalledProcessError:
# Send to stderr
logging.error('** Failed to get git config --local commit.template')
return None


Expand Down Expand Up @@ -381,6 +387,10 @@ def link_commit_template():
"The commit template {}".format(tmpl),
"is not installed, so we can't link it, skipping..."
)
# Send to stderr
logging.error(
'** The commit template {} is not installed'.format(tmpl)
)
return

all_dirs = _get_submodule_paths()
Expand All @@ -404,6 +414,8 @@ def _existing_configs(cwd):
[line.strip() for line in incl.splitlines()] if incl else []
)
except subprocess.CalledProcessError:
# Send to stderr
logging.error('** Failed to get-all git config --local include.path')
return []


Expand Down Expand Up @@ -446,6 +458,10 @@ def link_gitconfig_khan():

tmpl = _expanded_home_path('.gitconfig.khan')
if not os.path.isfile(tmpl):
# Send to stderr
logging.error(
'** The git config file {} is not installed'.format(tmpl)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this PR. Won't it print "The git config file is not installed" twice, once due to this line and once due to line 466?

Copy link
Member Author

@lillialexis lillialexis Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yeah, you're right. 🤦‍♀️

It doesn't print twice if you call ka-clone through the OLC module, as the module uses inherit for stdout/stdin, when calling ka-clone, and pipe for stderr. That means that it is able to dump the stdout-stream to the console as it runs, but capture the stderr separately to be able to display to the user after the command finishes (when stderr not empty). Here is the PR that complements this one and does that: https://github.com/Khan/our-lovely-cli/pull/893

There's actually an even worse issue with this PR and PR 893, that hit me as I was drifting off to sleep last night. As it currently stands, there is a sort-of race condition, where a user could be on post-893-OLC and pre-21-ka-clone. In that rare-but-possible case, because the default behavior of Python's logging is to output to stderr, then it will appear to the user that ka-clone hit issues no matter what. It looks like this:

✔ Let's go! … <press enter>

🐱 Whoops! Looks like something wasn't quite right with ka-clone! The script ran
   but sent something to stderr:
   -> Set user.email to [email protected]
   -> Linking commit message template ~/.git_template/commit_template...
       *** The commit template is already linked in the 
           current repo, skipping...
   
   -> Including git config ~/.gitconfig.khan...
       *** The git config is already included in current repo, skipping...
   
   -> Backing up existing hooks...
       *** Backed up existing hooks to .git/hooks.backup_2025-01-09T09:18:47
       *** More info: https://khanacademy.org/r/gitfaq#id-3c30
   
   -> Created new hooks directory at .git/hooks
   -> Added pre-push linting hook
   -> Added hooks to protect master branch
   -> Added commit-msg branch-name hook
   
   You can also scroll up and read the output to see what went wrong.
   
   Please report this issue in the #github-prs channel so we can fix it for
   everyone!

That's the normal output of ka-clone.

When I originally used a version flag to make sure ka-clone was on the correct version for OLC, I could have just bumped that from 1 to 2 in this PR and PR 893 to close this gap. But then we switched to using presence-of-command-line-args instead.

But, I think that actually ends up working out better, given this feedback! Maybe I can just add a new arg that OLC has and checks for (closing the race-condition-gap) that only outputs to stderr when true (fixing the double-output-bug).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If your goal is to have ka-clone emit output that can be parsed by another script, you should just support this directly. Have a --json flag or something, that emits the results of ka-clone as a json blob, to stdout, and have OLC parse that blob and do what it needs to do. And when --json is set, you can change the way that _cli_log_step_indented_info behaves, maybe it saves its data to a buffer somewhere, and you can format and emit that buffer at program exit.

Copy link
Member Author

@lillialexis lillialexis Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that that's the most "right" way of doing things, I think that it isn't worth the time it will take to implement. I'm trying to limit the amount of new code, as I need to wrap this project up. But I also want a quick and easy way to flag to the user that something isn't quite right, as the warnings from the ka-clone output get lost. And the only thing I'm doing with the output is piping it to the user; I'm not parsing it any more than just seeing if something is in stderr versus stdout. (The OLC tools do this with the git output too.)

)
_cli_log_step_indented_info(
"The git config file {} is not installed, ".format(tmpl) +
"so we can't include it, skipping..."
Expand Down Expand Up @@ -561,8 +577,23 @@ if __name__ == '__main__':
parser = _cli_parser()
args = parser.parse_args()

logging.basicConfig(format="%(message)s")
logging.getLogger().setLevel(logging.ERROR if args.quiet else logging.INFO)
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)
format = '%(message)s'
formatter = logging.Formatter(format)

h1 = logging.NullHandler() if args.quiet \
else logging.StreamHandler(sys.stdout)

h1.setLevel(logging.INFO)
h1.addFilter(lambda record: record.levelno <= logging.INFO)
h1.setFormatter(formatter)

h2 = logging.StreamHandler()
h2.setLevel(logging.WARNING)
h2.setFormatter(formatter)
logger.addHandler(h1)
logger.addHandler(h2)

if args.repair:
if args.src or args.dst:
Expand Down