deploy: d2f41ed
enryH committed Sep 20, 2024
1 parent 9ced8fc commit 6562953
Showing 18 changed files with 649 additions and 7 deletions.
Binary file modified .doctrees/environment.pickle
Binary file not shown.
Binary file modified .doctrees/index.doctree
Binary file not shown.
Binary file added .doctrees/python/best_practices.doctree
Binary file not shown.
Binary file added _images/better_comments_todo_tree.png
Binary file added _images/lint_pylance_in_vscode.png
1 change: 1 addition & 0 deletions _sources/index.md
@@ -30,6 +30,7 @@ azure/running_nextflow_on_azure
:caption: Python
python/percent_notebooks
python/best_practices
```

```{toctree}
86 changes: 86 additions & 0 deletions _sources/python/best_practices.md
@@ -0,0 +1,86 @@
# Coding (best) practices for Data Science

> Author: Henry Webel
> Reviewers: Pasquale Colaianni, Jakob Berg Jespersen

Asked to show some coding best practices at an internal retreat, I assembled
some low-hanging fruit within everyone's reach and some practices I have learned to appreciate.

## Use a formatter

When you write code, I encourage you to use a formatter. `black` is a common choice
because it formats code consistently to a user-defined line length. It can even
split overly long strings into parts, leaving only long comments and docstrings for you
to adapt by hand.

`black` and `autopep8` are also available as VSCode extensions, alongside `isort` for
sorting imports, so your files can be formatted every time you save them
([link](https://code.visualstudio.com/docs/python/formatting)).

You can configure `black` and `isort`, e.g. in your `pyproject.toml` file, to process long strings
and enable the preview features planned for the next release:

```toml
[tool.black]
line-length = 90
preview = true
enable-unstable-feature = ["string_processing",]

[tool.isort]
profile = "black"
```

## Use a linter

A linter like `flake8` or `ruff` can identify overly long lines, missing arguments
or mutable objects used as default function parameters, depending on the rules and plugins enabled. Tools like
[`Pylint` in VSCode](https://code.visualstudio.com/docs/python/linting)
give you in-editor highlighting of code issues, along with links to hints on how to fix them.

![screenshot with typehints](assets/lint_pylance_in_vscode.png)

With a linter you can, for example, find that you did not pass an argument to a function,
as was fixed in commit [18b675](https://github.com/Multiomics-Analytics-Group/acore/pull/2/commits/18b67516b25de30cf6fd4bb640422aa8e0642b08) in `run_umap` (you will need to unfold the first file to see the full picture).
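
To illustrate why linters flag mutable objects as default parameters, here is a minimal sketch. The function names `append_bad` and `append_good` are hypothetical and only serve as an illustration; they are not taken from the linked commit.

```python
def append_bad(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # and is shared between all calls that do not pass `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as a sentinel and create a fresh list on every call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)
# Both names point to the same shared default list.
print(first is second)

print(append_good(1), append_good(2))
```

The `None` sentinel is the usual way to avoid the shared state; a linter rule for mutable defaults would point at the first function only.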


## Better Comments and ToDo Trees

[Better Comments](https://marketplace.visualstudio.com/items?itemName=aaron-bond.better-comments)
allows you to highlight comments in code using different colors, e.g.
`# ! warning`, `# ? question` or `# TODO`. If you add `?` and `!` to the list of tags tracked
by
[ToDo Tree](https://marketplace.visualstudio.com/items?itemName=Gruntfuggly.todo-tree),
you can easily keep a list of todos in your code, allowing you to go
through them from time to time and prioritize.

![Highlighted Comments and ToDo Tree example](assets/better_comments_todo_tree.png)
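
As a small sketch of what such tagged comments could look like in a Python file (the function `normalize` and the comment texts are hypothetical examples):

```python
def normalize(values):
    # ! warning: fails with ZeroDivisionError if the values sum to zero
    # ? question: should negative values be allowed here?
    # TODO: add support for other sequence types
    total = sum(values)
    return [value / total for value in values]


print(normalize([1, 1, 2]))
```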

## Text based Notebook (percent format) with jupytext and papermill

[`jupytext`](https://jupytext.readthedocs.io/) is a lightweight tool to keep scripts either as notebooks (`.ipynb`) or in simpler text-based file formats, such as markdown files (`.md`), which render easily on GitHub, or python files (`.py`), which can be executed in VSCode’s interactive shell and are better suited for version control. Some tools, e.g. `papermill`, still need `ipynb` files to work. It is therefore handy to keep different versions of a script in sync. Alternatively, you can use only python files and render these as notebooks in e.g.
[jupyter lab](https://jupytext.readthedocs.io/en/latest/text-notebooks.html). This is especially useful if only the code is kept under version control, while executed versions are kept in a project folder using a workflow environment (such as `snakemake` or `nextflow`).

You can see an example of the percent notebook in the [percent_notebooks](project:percent_notebooks.py) section.
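
For orientation, a minimal percent-format script could look like the sketch below. The variable names `n_samples` and `seed` are hypothetical; the `parameters` tag marks the cell whose values `papermill` can override when executing the notebook.

```python
# %% [markdown]
# # Example analysis
# Markdown cells are written as comments below a `# %% [markdown]` marker.

# %% tags=["parameters"]
# Cell tagged "parameters": papermill can inject other values after this cell.
n_samples = 10  # hypothetical parameter
seed = 42  # hypothetical parameter

# %%
import random

random.seed(seed)
samples = [random.random() for _ in range(n_samples)]

# %%
mean = sum(samples) / len(samples)
print(f"mean of {n_samples} samples: {mean:.3f}")
```

Since the cell markers are plain comments, the file also runs as an ordinary Python script.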

I showed how to convert a text-based percent notebook to `ipynb` and execute it using `papermill`
(without specifying arguments) on the command line:

```bash
jupytext --to ipynb -k - -o - example_nb.py | papermill - path/to/executed_example.ipynb
```
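
If you do want to pass parameters, `papermill` accepts `-p name value` pairs on the command line. The sketch below only assembles the command string for the pipeline above and does not run it; the helper `build_commands` and the parameter `n_samples` are hypothetical.

```python
import shlex


def build_commands(script, executed, parameters=None):
    """Return the jupytext and papermill argument lists for one percent script."""
    # Convert the script to ipynb, set the kernel, and write to stdout.
    jupytext_cmd = ["jupytext", "--to", "ipynb", "-k", "-", "-o", "-", script]
    # Read the notebook from stdin and write the executed copy to `executed`.
    papermill_cmd = ["papermill", "-", executed]
    for name, value in (parameters or {}).items():
        papermill_cmd += ["-p", name, str(value)]
    return jupytext_cmd, papermill_cmd


jupytext_cmd, papermill_cmd = build_commands(
    "example_nb.py",
    "path/to/executed_example.ipynb",
    {"n_samples": 100},  # hypothetical parameter
)
pipeline = f"{shlex.join(jupytext_cmd)} | {shlex.join(papermill_cmd)}"
print(pipeline)
```

Building the argument lists first and quoting them with `shlex.join` keeps paths with spaces safe if you later paste the command into a shell or a workflow rule.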

If you want to keep some formats in sync, you can specify that and push only one type to git,
listing the types you want to keep only locally in e.g. a `.gitignore` file.
Each folder can have a `.jupytext.toml` file to specify the formats you want to keep in sync
in that folder, e.g.:

```toml
# percent format and ipynb format in sync
formats = "ipynb,py:percent"
```

## Copilot in VSCode

Ghost text, chat and inline chat are great ways to get suggestions on the code you are
writing. You can apply for a free version as a [(PhD) student](https://github.com/education/students)
or [instructor](https://github.com/education/teachers). Currently, alternatives with a free tier such as [Codeium](https://codeium.com/) are also available.

1 change: 1 addition & 0 deletions about.html
@@ -177,6 +177,7 @@
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Python</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="python/percent_notebooks.html">Percent Notebooks in VSCode (and other editior)</a></li>
<li class="toctree-l1"><a class="reference internal" href="python/best_practices.html">Coding (best) practices for Data Science</a></li>
</ul>
<p aria-level="2" class="caption" role="heading"><span class="caption-text">HPC</span></p>
<ul class="nav bd-sidenav">
1 change: 1 addition & 0 deletions azure/creating_ressources.html
@@ -177,6 +177,7 @@
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Python</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="../python/percent_notebooks.html">Percent Notebooks in VSCode (and other editior)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../python/best_practices.html">Coding (best) practices for Data Science</a></li>
</ul>
<p aria-level="2" class="caption" role="heading"><span class="caption-text">HPC</span></p>
<ul class="nav bd-sidenav">
1 change: 1 addition & 0 deletions azure/running_nextflow_on_azure.html
@@ -177,6 +177,7 @@
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Python</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="../python/percent_notebooks.html">Percent Notebooks in VSCode (and other editior)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../python/best_practices.html">Coding (best) practices for Data Science</a></li>
</ul>
<p aria-level="2" class="caption" role="heading"><span class="caption-text">HPC</span></p>
<ul class="nav bd-sidenav">
1 change: 1 addition & 0 deletions genindex.html
@@ -176,6 +176,7 @@
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Python</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="python/percent_notebooks.html">Percent Notebooks in VSCode (and other editior)</a></li>
<li class="toctree-l1"><a class="reference internal" href="python/best_practices.html">Coding (best) practices for Data Science</a></li>
</ul>
<p aria-level="2" class="caption" role="heading"><span class="caption-text">HPC</span></p>
<ul class="nav bd-sidenav">
7 changes: 4 additions & 3 deletions hpc_dtu/setup_user_env.html
@@ -46,7 +46,7 @@
<link rel="author" title="About these documents" href="../about.html" />
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="prev" title="Percent Notebooks in VSCode (and other editior)" href="../python/percent_notebooks.html" />
<link rel="prev" title="Coding (best) practices for Data Science" href="../python/best_practices.html" />
<meta name="viewport" content="width=device-width, initial-scale=1"/>
<meta name="docsearch:language" content="en"/>
</head>
@@ -176,6 +176,7 @@
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Python</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="../python/percent_notebooks.html">Percent Notebooks in VSCode (and other editior)</a></li>
<li class="toctree-l1"><a class="reference internal" href="../python/best_practices.html">Coding (best) practices for Data Science</a></li>
</ul>
<p aria-level="2" class="caption" role="heading"><span class="caption-text">HPC</span></p>
<ul class="current nav bd-sidenav">
@@ -380,12 +381,12 @@ <h2>Access<a class="headerlink" href="#access" title="Link to this heading">#</a

<div class="prev-next-area">
<a class="left-prev"
href="../python/percent_notebooks.html"
href="../python/best_practices.html"
title="previous page">
<i class="fa-solid fa-angle-left"></i>
<div class="prev-next-info">
<p class="prev-next-subtitle">previous</p>
<p class="prev-next-title">Percent Notebooks in VSCode (and other editior)</p>
<p class="prev-next-title">Coding (best) practices for Data Science</p>
</div>
</a>
</div>
1 change: 1 addition & 0 deletions index.html
@@ -178,6 +178,7 @@
<p aria-level="2" class="caption" role="heading"><span class="caption-text">Python</span></p>
<ul class="nav bd-sidenav">
<li class="toctree-l1"><a class="reference internal" href="python/percent_notebooks.html">Percent Notebooks in VSCode (and other editior)</a></li>
<li class="toctree-l1"><a class="reference internal" href="python/best_practices.html">Coding (best) practices for Data Science</a></li>
</ul>
<p aria-level="2" class="caption" role="heading"><span class="caption-text">HPC</span></p>
<ul class="nav bd-sidenav">
Binary file modified objects.inv
Binary file not shown.