Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Placeholder Docstrings To Reduce PyLint Messages #2750

Closed
medined opened this issue Jul 2, 2024 · 8 comments
Closed

Add Placeholder Docstrings To Reduce PyLint Messages #2750

medined opened this issue Jul 2, 2024 · 8 comments
Labels
enhancement New feature or request

Comments

@medined
Copy link

medined commented Jul 2, 2024

What problem or use case are you trying to solve?

Running PyLint on the OpenDevin codebase produces 713 messages. 234 of these messages are related to missing docstrings. It is possible, even easy, to configure PyLint to ignore these messages. However, doing so won't nudge the code base towards better documentation. Once all placeholder docstrings are in place, future code (pull requests) can be rejected if docstring are not present.

Having a large number of PyLint messages make it hard to find important messages. For example, this message might be important enough to fix since when check is True a CalledProcessError exception will be raised when the sub-process returns a non-zero exit code.

opendevin/runtime/docker/local_box.py:31:32: W1510: Using subprocess.run without explicitly set `check` is not recommended. (subprocess-run-check)

If the community is willing, I'd be happy to work on resolving the mundane PyLint messages.

Does this seem valuable?

Describe the UX of the solution you'd like

Run the attached python script. Spot check the changes. Push to GitHub. This set of changes should be isolated from other changes so that formatting and logic commits are separated.

Please consider running "isort ." to organize the python import into the suggested order.

Please consider running "black --target-version py310 ." afterwards so that the code conforms to a coding style.

After the above steps, there should be 471 PyLint messages, down from 713.

Do you have thoughts on the technical implementation?

I wrote a python script that uses the Abstract Syntax Tree of Python to add placeholder docstrings where they are missing.

"""Placeholder Module Docstring"""
import os
import ast
import astor
error_count = 0
errors = []


def find_classes_without_docstrings(tree):
    """Placeholder Function Docstring"""
    classes_without_docstrings = []
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            if not node.body or not isinstance(node.body[0], ast.Expr
                ) or not isinstance(node.body[0].value, ast.Constant):
                classes_without_docstrings.append(node)
    return classes_without_docstrings


def find_functions_without_docstrings(tree):
    """Placeholder Function Docstring"""
    functions_without_docstrings = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not node.body or not isinstance(node.body[0], ast.Expr
                ) or not isinstance(node.body[0].value, ast.Constant):
                functions_without_docstrings.append(node)
    return functions_without_docstrings


def docstring(s):
    """Placeholder Class Docstring"""
    return ast.Expr(value=ast.Constant(s))


def add_placeholder_docstrings(file_path):
    """Placeholder Function Docstring"""
    global error_count
    needs_writing = False
    try:
        with open(file_path, 'r', encoding='utf-8') as file:
            content = file.read()
            if not content.strip():
                with open(file_path, 'w', encoding='utf-8') as empty_file:
                    empty_file.write('"""\nPlaceholder Module Docstring\n"""\n'
                        )
                return True
            tree = ast.parse(content, filename=file_path)
        if not tree.body or not isinstance(tree.body[0], ast.Expr
            ) or not isinstance(tree.body[0].value, ast.Constant):
            tree.body.insert(0, docstring('Placeholder Module Docstring'))
            needs_writing = True
        for func in find_classes_without_docstrings(tree):
            func.body.insert(0, docstring('Placeholder Class Docstring'))
            needs_writing = True
        for func in find_functions_without_docstrings(tree):
            func.body.insert(0, docstring('Placeholder Function Docstring'))
            needs_writing = True
        if needs_writing:
            with open(file_path, 'w', encoding='utf-8') as file:
                file.write(astor.to_source(tree))
    except Exception as e:
        errors.append(f'Error processing {file_path}: {e}')
        print(f'Error processing {file_path}: {e}')
        error_count += 1
        pass
    return needs_writing


def check_directory_for_missing_docstrings(root_dir):
    """Placeholder Function Docstring"""
    for subdir, _, files in os.walk(root_dir):
        for file in files:
            if file.endswith('.py'):
                file_path = os.path.join(subdir, file)
                if add_placeholder_docstrings(file_path):
                    print(f'Added placeholder docstrings in {file_path}')


check_directory_for_missing_docstrings('.')
print('---------------------------')
for error in errors:
    print(error)
print(f'Finished with {error_count} errors.')

Describe alternatives you've considered

Additional context

@medined medined added the enhancement New feature or request label Jul 2, 2024
@medined
Copy link
Author

medined commented Jul 2, 2024

I found a bug in the script. Will resolve and add another comment.

@Umpire2018
Copy link
Contributor

I personally do not agree with the practice of adding meaningless docstrings to avoid errors. It might be more effective to focus on adding actually useful docstrings by using agent framework.

I do agree with the following measures to improve code quality:

Please consider running "isort ." to organize the python import into the suggested order.
Please consider running "black --target-version py310 ." afterwards so that the code conforms to a coding style.

@medined
Copy link
Author

medined commented Jul 3, 2024

I'm using ast to read the python files. And astor to write the python code with the placeholder docstrings. I've found a bug in astor. It can't correctly parse the following docstring.

        """
            \x00
        """

Now I'm curious. Is there some C code somewhere reading the \x00 as an end-of-string?

astor prints that comment as

'\n            \x00\n        '

@tobitege
Copy link
Collaborator

tobitege commented Jul 3, 2024

With the help of this bash command I tried to find any .py file with that character not being the end-of-file character:
find . -name '*.py' -type f -print0 | xargs -0 grep -Pzl '\x00(?!.*\x00\z)'
Could you run that in your OpenDevin folder, please, and report if any files were found?

@medined
Copy link
Author

medined commented Jul 3, 2024

That command finds no files. The file (or one of them) causing issues with astor is opendevin/runtime/docker/process.py. It's the existing docstring for parser_docker_exec_output.

P.S. Thanks for indulging my eccentricity on this topic.

@tobitege
Copy link
Collaborator

tobitege commented Jul 3, 2024

No worries 😀

Edit: nvm, just read the comment fully lol

@tobitege
Copy link
Collaborator

tobitege commented Jul 3, 2024

Ohhhh lol...
that docstring has: >> b'\x00\x00\x00\x00\x00\x00\x00\x13Hello OpenDevin!'
But that is plain ASCII, not binary... 🤣

I'm sorry, I should've read your above post more closely, I was reading it as if it found the actual hex value 0 somehow.
So it does seem an issue with astor, I'd agree. @medined

@medined
Copy link
Author

medined commented Jul 3, 2024

Sadly, I have verified that changing the existing docstring to \x00 avoids the issue. And, of course, it makes no sense to alter the original comment to use \x00 - that looks stupid. If anyone wants to follow the excitement over on the astor project, the issue is at berkerpeksag/astor#225.

@medined medined closed this as completed Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants