docs(ingest): refactor docgen process (#12300)
hsheth2 authored Jan 10, 2025
1 parent cf35dcc commit a6cd995
Showing 10 changed files with 748 additions and 673 deletions.
24 changes: 20 additions & 4 deletions docs-website/README.md
@@ -130,7 +130,6 @@ The purpose of this section is to provide developers & technical users with conc

This section aims to provide plain-language feature overviews for technical and non-technical readers alike.


## Docs Generation Features

**Includes all markdown files**
@@ -145,16 +144,33 @@ You can suppress this check by adding the path to the file in a comment in `side

Use an "inline" directive to include code snippets from other files. The `show_path_as_comment` option will include the path to the file as a comment at the top of the snippet.

```python
{{ inline /metadata-ingestion/examples/library/data_quality_mcpw_rest.py show_path_as_comment }}
```

**Command Output**

Use the `{{ command-output cmd }}` directive to run subprocesses and inject the outputs into the final markdown.

    {{ command-output python -c 'print("Hello world")' }}

This also works for multi-line scripts.

    {{ command-output
    source metadata-ingestion/venv/bin/activate
    python -m <something>
    }}

Regardless of the location of the markdown file, the subcommands are executed with the working directory set to the repo root.

Only the stdout of the subprocess is injected into the markdown. If the command fails, its stderr (or the error message, when stderr is empty) is appended as an HTML comment.
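
To illustrate how the single-line and multi-line forms are matched, here is a minimal sketch that applies the regular expression from this commit's `generateDocsDir.ts` change, with the subprocess call swapped for a stub. The names `expandCommandOutput`, `fakeRun`, and `seen` are hypothetical and exist only for this illustration.

```typescript
// Sketch only: `expandCommandOutput`, `fakeRun`, and `seen` are hypothetical names.
// The regex is copied from the generateDocsDir.ts change in this commit.
const COMMAND_OUTPUT_RE = /^{{\s*command-output\s*([\s\S]*?)\s*}}$/gm;

// Replace every directive with the trimmed result of `run(command)`.
function expandCommandOutput(
  markdown: string,
  run: (command: string) => string
): string {
  return markdown.replace(
    COMMAND_OUTPUT_RE,
    (_match: string, command: string) => run(command).trim()
  );
}

// Stub runner: records each extracted command instead of spawning a process.
const seen: string[] = [];
const fakeRun = (command: string): string => {
  seen.push(command);
  return `<output ${seen.length}>\n`;
};

const doc = [
  "Intro",
  "",
  "{{ command-output python -c 'print(\"Hello world\")' }}",
  "",
  "{{ command-output",
  "source metadata-ingestion/venv/bin/activate",
  "python -m <something>",
  "}}",
  "",
  "Outro",
].join("\n");

const expanded = expandCommandOutput(doc, fakeRun);
console.log(seen);
console.log(expanded);
```

Because the capture group is lazy and the pattern is anchored with `^` and `$` under the `m` flag, each directive must start at the beginning of a line and its closing `}}` must end a line; a multi-line script is captured as one string with embedded newlines.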

## Docs site generation process

This process is orchestrated by a combination of Gradle and Yarn tasks. The main entrypoint is via the `docs-website:yarnGenerate` task, which in turn eventually runs `yarn run generate`.

Steps:

1. Generate the GraphQL combined schema using Gradle's `docs-website:generateGraphQLSchema` task. This generates `./graphql/combined.graphql`.
2. Generate docs for ingestion sources using the `:metadata-ingestion:docGen` Gradle task.
3. Generate docs for our metadata model using the `:metadata-ingestion:modelDocGen` Gradle task.
37 changes: 37 additions & 0 deletions docs-website/generateDocsDir.ts
@@ -439,6 +439,42 @@ function markdown_process_inline_directives(
  contents.content = new_content;
}

function markdown_process_command_output(
  contents: matter.GrayMatterFile<string>,
  filepath: string
): void {
  const new_content = contents.content.replace(
    /^{{\s*command-output\s*([\s\S]*?)\s*}}$/gm,
    (_, command: string) => {
      try {
        // Change to repo root directory before executing command
        const repoRoot = path.resolve(__dirname, "..");

        console.log(`Executing command: ${command}`);

        // Execute the command and capture output
        const output = execSync(command, {
          cwd: repoRoot,
          encoding: "utf8",
          stdio: ["pipe", "pipe", "pipe"],
        });

        // Return the command output
        return output.trim();
      } catch (error: any) {
        // If there's an error, include it as a comment
        const errorMessage = error.stderr
          ? error.stderr.toString()
          : error.message;
        return `${
          error.stdout ? error.stdout.toString().trim() : ""
        }\n<!-- Error: ${errorMessage.trim()} -->`;
      }
    }
  );
  contents.content = new_content;
}
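
The failure path above can be exercised without spawning a process. The sketch below mirrors the catch block's formatting logic in a standalone helper; `formatFailure` and the `FailedCommand` shape are hypothetical names, modeled on the `stdout`, `stderr`, and `message` fields that the catch block reads (assumed here to already be strings).

```typescript
// Hypothetical shape, modeled on the fields read in the catch block above.
interface FailedCommand {
  stdout?: string;
  stderr?: string;
  message: string;
}

// Mirrors the catch block: keep any partial stdout, then append the error
// as an HTML comment so it does not render in the final page.
function formatFailure(error: FailedCommand): string {
  // Prefer stderr; fall back to the error message when stderr is empty or missing.
  const errorMessage = error.stderr ? error.stderr : error.message;
  const partialOutput = error.stdout ? error.stdout.trim() : "";
  return `${partialOutput}\n<!-- Error: ${errorMessage.trim()} -->`;
}

const withStderr = formatFailure({
  stdout: "partial output\n",
  stderr: "boom\n",
  message: "Command failed",
});

const withoutStderr = formatFailure({
  message: "Command failed: missing-binary",
});

console.log(JSON.stringify(withStderr));
console.log(JSON.stringify(withoutStderr));
```

Note that when the command produces no stdout, the replacement begins with an empty line followed by the comment, so a failed directive still leaves a trace in the generated markdown source.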

function markdown_sanitize_and_linkify(content: string): string {
  // MDX escaping
  content = content.replace(/</g, "&lt;");
@@ -602,6 +638,7 @@ function copy_python_wheels(): void {
markdown_rewrite_urls(contents, filepath);
markdown_enable_specials(contents, filepath);
markdown_process_inline_directives(contents, filepath);
markdown_process_command_output(contents, filepath);
//copy_platform_logos();
// console.log(contents);

@@ -0,0 +1,16 @@
### Configuration Notes

To configure access:

1. Grant the user access to the Report Server by following Microsoft's [Grant user access to a Report Server](https://docs.microsoft.com/en-us/sql/reporting-services/security/grant-user-access-to-a-report-server?view=sql-server-ver16) doc.
2. Use that user's credentials from the previous step in the YAML file.

### Concept mapping

| Power BI Report Server | DataHub     |
| ---------------------- | ----------- |
| `Paginated Report` | `Dashboard` |
| `Power BI Report` | `Dashboard` |
| `Mobile Report` | `Dashboard` |
| `Linked Report` | `Dashboard` |
| `Dataset, Datasource` | `N/A` |

This file was deleted.
