Replace current docs by mkdocs (microsoft#1263)
* Replace docs by mkdocs-material

* Fix markdown

* Fix versions in gh-pages workflow

* remove whitespaces

* add semver

* Add build docs check on python-ci

* Fix command in index cli

* Spellcheck

* Spellcheck

* remove docsite paths

* clear outputs from notebook

* remove dependabot npm for docsite

* remove more docsite left overs

* execute notebooks

* Update notebooks

* update poetry lock

* Remove notebook build from ci

* Revert dep update

* Navigation tabs

* Fix stylesheet

* add kwds to dictionary

* Turn on notebook execution

* Update gitignore

* Add MSR Blog posts

* spellcheck

* Accessibility Changes

---------

Co-authored-by: Alonso Guevara <[email protected]>
andresmor-ms and AlonsoGuevara authored Oct 11, 2024
1 parent d9a005c commit fc9895f
Showing 64 changed files with 620 additions and 4,410 deletions.
4 changes: 0 additions & 4 deletions .github/dependabot.yml
@@ -4,10 +4,6 @@
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
version: 2
updates:
- package-ecosystem: "npm" # See documentation for possible values
directory: "docsite/" # Location of package manifests
schedule:
interval: "weekly"
- package-ecosystem: "pip" # See documentation for possible values
directory: "/" # Location of package manifests
schedule:
42 changes: 15 additions & 27 deletions .github/workflows/gh-pages.yml
@@ -2,21 +2,22 @@ name: gh-pages
on:
push:
branches: [main]

permissions:
contents: write

env:
POETRY_VERSION: 1.8.3
PYTHON_VERSION: "3.11"
NODE_VERSION: 18.x
POETRY_VERSION: '1.8.3'
PYTHON_VERSION: '3.11'

jobs:
build:
runs-on: ubuntu-latest
env:
GH_PAGES: 1
DEBUG: 1
GRAPHRAG_API_KEY: ${{ secrets.OPENAI_NOTEBOOK_KEY }}
GRAPHRAG_LLM_MODEL: ${{ secrets.GRAPHRAG_LLM_MODEL }}
GRAPHRAG_EMBEDDING_MODEL: ${{ secrets.GRAPHRAG_EMBEDDING_MODEL }}

steps:
- uses: actions/checkout@v4
@@ -33,33 +34,20 @@ jobs:
with:
poetry-version: ${{ env.POETRY_VERSION }}

- name: Use Node ${{ env.NODE_VERSION }}
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}

- name: Install Yarn dependencies
run: yarn install
working-directory: docsite

- name: Install Poetry dependencies
- name: poetry install
shell: bash
run: poetry install

- name: mkdocs build
shell: bash
run: poetry run poe build_docs

- name: Build Jupyter Notebooks
run: poetry run poe convert_docsite_notebooks

- name: Build docsite
run: yarn build
working-directory: docsite
env:
DOCSITE_BASE_URL: "graphrag"

- name: List docsite files
run: find docsite/_site
- name: List Docsite Contents
run: find site

- name: Deploy to GitHub Pages
uses: JamesIves/[email protected]
with:
branch: gh-pages
folder: docsite/_site
clean: true
folder: site
clean: true
39 changes: 11 additions & 28 deletions .gitignore
@@ -1,32 +1,6 @@
# Node Artifacts
*/node_modules/
docsite/*/src/**/*.js
docsite/*/lib/
docsite/*/storybook-static/
docsite/*/docsTemp/
docsite/*/build/
.swc/
dist/
.idea
# https://yarnpkg.com/advanced/qa#which-files-should-be-gitignored
docsite/.yarn/*
!docsite/.yarn/patches
!docsite/.yarn/releases
!docsite/.yarn/plugins
!docsite/.yarn/sdks
!docsite/.yarn/versions
docsite/.pnp.*

.yarn/*
!.yarn/patches
!.yarn/releases
!.yarn/plugins
!.yarn/sdks
!.yarn/versions
.pnp.*

# Python Artifacts
python/*/lib/
dist/
# Test Output
.coverage
coverage/
@@ -66,4 +40,13 @@ __blobstorage__/
ragtest/
.ragtest/
.pipelines
.pipeline
.pipeline


# mkdocs
site/

# Docs migration
docsite/
.yarn/
.pnp*
4 changes: 4 additions & 0 deletions .semversioner/next-release/patch-20241009233835780962.json
@@ -0,0 +1,4 @@
{
"type": "patch",
"description": "Use mkdocs for documentation"
}
2 changes: 1 addition & 1 deletion .vscode/settings.json
@@ -24,7 +24,7 @@
".gitattributes": ".gitignore",
".yarnrc.yml": "yarn.lock, .pnp.*",
"jest.config.js": "jest.setup.mjs",
"pyproject.toml": "poetry.lock, poetry.toml",
"pyproject.toml": "poetry.lock, poetry.toml, mkdocs.yaml",
"cspell.config.yaml": "dictionary.txt"
},
"azureFunctions.postDeployTask": "npm install (functions)",
3 changes: 0 additions & 3 deletions cspell.config.yaml
@@ -22,9 +22,6 @@ ignorePaths:
- entity_extraction.txt
- package.json
- tests/fixtures/
- docsite/data/
- docsite/nbdocsite_template/
- docsite/posts/query/notebooks/inputs/
- examples_notebooks/inputs/
- "*.csv"
- "*.parquet"
19 changes: 18 additions & 1 deletion dictionary.txt
@@ -1,5 +1,8 @@
# Team
Alonso
Truitt
Trinh
Fernández

# Pythonisms
PYTHONPATH
@@ -63,6 +66,7 @@ numpy
pypi
nbformat
semversioner
mkdocs

# Library Methods
iterrows
@@ -85,6 +89,13 @@ isin
nocache
nbconvert

# HTML
nbsp
onclick
pymdownx
linenums
twemoji

# Verbs
binarize
prechunked
@@ -159,10 +170,16 @@ Tiruzia's
Verdantis
Verdantis's


# English
skippable
upvote

# Misc
Arxiv
kwds

# Dulce
astrotechnician
epitheg
unspooled
unnavigated
27 changes: 27 additions & 0 deletions docs/blog_posts.md
@@ -0,0 +1,27 @@

<div class="grid cards" markdown>

- [:octicons-arrow-right-24: __GraphRAG: Unlocking LLM discovery on narrative private data__](https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/)

---
<h6>Published February 13, 2024

By [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager</h6>


- [:octicons-arrow-right-24: __GraphRAG: New tool for complex data discovery now on GitHub__](https://www.microsoft.com/en-us/research/blog/graphrag-new-tool-for-complex-data-discovery-now-on-github/)

---
<h6>Published July 2, 2024

By [Darren Edge](https://www.microsoft.com/en-us/research/people/daedge/), Senior Director; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect</h6>


- [:octicons-arrow-right-24: __GraphRAG auto-tuning provides rapid adaptation to new domains__](https://www.microsoft.com/en-us/research/blog/graphrag-auto-tuning-provides-rapid-adaptation-to-new-domains/)

---
<h6>Published September 9, 2024

By [Alonso Guevara Fernández](https://www.microsoft.com/en-us/research/people/alonsog/), Sr. Software Engineer; Katy Smith, Data Scientist II; [Joshua Bradley](https://www.microsoft.com/en-us/research/people/joshbradley/), Senior Data Scientist; [Darren Edge](https://www.microsoft.com/en-us/research/people/daedge/), Senior Director; [Ha Trinh](https://www.microsoft.com/en-us/research/people/trinhha/), Senior Data Scientist; [Sarah Smith](https://www.microsoft.com/en-us/research/people/smithsarah/), Senior Program Manager; [Ben Cutler](https://www.microsoft.com/en-us/research/people/bcutler/), Senior Director; [Steven Truitt](https://www.microsoft.com/en-us/research/people/steventruitt/), Principal Program Manager; [Jonathan Larson](https://www.microsoft.com/en-us/research/people/jolarso/), Senior Principal Data Architect</h6>

</div>
10 changes: 2 additions & 8 deletions docsite/posts/config/custom.md → docs/config/custom.md
@@ -1,14 +1,8 @@
---
title: Custom Configuration Mode
navtitle: Fully Custom Config
layout: page
tags: [post]
date: 2023-01-04
---
# Fully Custom Config

The primary configuration sections for Indexing Engine pipelines are described below. Each configuration section can be expressed in Python (for use in Python API mode) as well as YAML, but YAML is shown here for brevity.

Using custom configuration is an advanced use-case. Most users will want to use the [Default Configuration](/posts/config/overview) instead.
Using custom configuration is an advanced use-case. Most users will want to use the [Default Configuration](overview.md) instead.

## Indexing Engine Examples

8 changes: 1 addition & 7 deletions docsite/posts/config/env_vars.md → docs/config/env_vars.md
@@ -1,10 +1,4 @@
---
title: Default Configuration Mode (using Env Vars)
navtitle: Using Env Vars
tags: [post]
layout: page
date: 2023-01-03
---
# Default Configuration Mode (using Env Vars)

## Text-Embeddings Customization

12 changes: 3 additions & 9 deletions docsite/posts/config/init.md → docs/config/init.md
@@ -1,10 +1,4 @@
---
title: Configuring GraphRAG Indexing
navtitle: Init Command
tags: [post]
layout: page
date: 2023-01-03
---
# Configuring GraphRAG Indexing

To start using GraphRAG, you need to configure the system. The `init` command is the easiest way to get started. It will create `.env` and `settings.yaml` files in the specified directory with the necessary configuration settings. It will also output the default LLM prompts used by GraphRAG.

@@ -31,8 +25,8 @@ The `init` command will create the following files in the specified directory:

- `settings.yaml` - The configuration settings file. This file contains the configuration settings for GraphRAG.
- `.env` - The environment variables file. These are referenced in the `settings.yaml` file.
- `prompts/` - The LLM prompts folder. This contains the default prompts used by GraphRAG, you can modify them or run the [Auto Prompt Tuning](/posts/prompt_tuning/auto_prompt_tuning) command to generate new prompts adapted to your data.
- `prompts/` - The LLM prompts folder. This contains the default prompts used by GraphRAG, you can modify them or run the [Auto Prompt Tuning](../prompt_tuning/auto_prompt_tuning.md) command to generate new prompts adapted to your data.

## Next Steps

After initializing your workspace, you can either run the [Prompt Tuning](/posts/prompt_tuning/auto_prompt_tuning) command to adapt the prompts to your data or even start running the [Indexing Pipeline](/posts/index/overview) to index your data. For more information on configuring GraphRAG, see the [Configuration](/posts/config/overview) documentation.
After initializing your workspace, you can either run the [Prompt Tuning](../prompt_tuning/auto_prompt_tuning.md) command to adapt the prompts to your data or even start running the [Indexing Pipeline](../index/overview.md) to index your data. For more information on configuring GraphRAG, see the [Configuration](overview.md) documentation.
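
The link changes in this file follow one pattern: absolute `/posts/...` site paths become paths to `.md` files relative to the current page. The commit does not say how the links were rewritten, so the following is only a minimal sketch of that pattern. The function name `rewrite_links` and the convention that page paths are given relative to `docs/` are assumptions made for illustration; links with `#` anchors and renamed pages (such as `index/2-cli` becoming `index/cli.md`) are not handled.

```python
import posixpath
import re

# Matches Markdown link targets of the old form "](/posts/<path>)".
_OLD_LINK = re.compile(r"\]\(/posts/([^)#\s]+)\)")


def rewrite_links(markdown: str, page_path: str) -> str:
    """Rewrite absolute /posts/ links as relative .md links.

    page_path is the page's own location relative to the docs root,
    for example "config/init.md" (a convention assumed for this sketch).
    """
    page_dir = posixpath.dirname(page_path) or "."

    def to_relative(match: re.Match) -> str:
        target = match.group(1) + ".md"
        return "](" + posixpath.relpath(target, page_dir) + ")"

    return _OLD_LINK.sub(to_relative, markdown)


print(rewrite_links("[Configuration](/posts/config/overview)", "config/init.md"))
```

Relative `.md` targets let mkdocs resolve and validate links at build time, which is presumably why the diff uses them throughout.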
@@ -1,10 +1,4 @@
---
title: Default Configuration Mode (using JSON/YAML)
navtitle: Using JSON or YAML
tags: [post]
layout: page
date: 2023-01-03
---
# Default Configuration Mode (using JSON/YAML)

The default configuration mode may be configured by using a `settings.json` or `settings.yml` file in the data project root. If a `.env` file is present along with this config file, then it will be loaded, and the environment variables defined therein will be available for token replacements in your configuration document using `${ENV_VAR}` syntax.
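
To illustrate the `${ENV_VAR}` token replacement described above, here is a minimal sketch. It is not GraphRAG's implementation: the function name `replace_tokens` and the choice to raise an error on a missing variable are assumptions made for this example.

```python
import os
import re

# Matches tokens of the form ${NAME}.
_TOKEN = re.compile(r"\$\{(\w+)\}")


def replace_tokens(text: str, env=None) -> str:
    """Replace each ${NAME} token with the value of NAME.

    env defaults to os.environ; a plain dict can be passed for testing.
    Raising on an unset variable is an assumption of this sketch.
    """
    values = os.environ if env is None else env

    def lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"environment variable {name} is not set")
        return values[name]

    return _TOKEN.sub(lookup, text)


print(replace_tokens("api_key: ${GRAPHRAG_API_KEY}", {"GRAPHRAG_API_KEY": "example"}))
```

Keeping secrets in `.env` and referencing them by token means the settings file itself can be committed without exposing keys.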

16 changes: 5 additions & 11 deletions docsite/posts/config/overview.md → docs/config/overview.md
@@ -1,21 +1,15 @@
---
title: Configuring GraphRAG Indexing
navtitle: Configuration
tags: [post]
layout: page
date: 2023-01-03
---
# Configuring GraphRAG Indexing

The GraphRAG system is highly configurable. This page provides an overview of the configuration options available for the GraphRAG indexing engine.

## Default Configuration Mode

The default configuration mode is the simplest way to get started with the GraphRAG system. It is designed to work out-of-the-box with minimal configuration. The primary configuration sections for the Indexing Engine pipelines are described below. The main ways to set up GraphRAG in Default Configuration mode are via:

- [Init command](/posts/config/init) (recommended)
- [Purely using environment variables](/posts/config/env_vars)
- [Using JSON or YAML for deeper control](/posts/config/json_yaml)
- [Init command](init.md) (recommended)
- [Purely using environment variables](env_vars.md)
- [Using JSON or YAML for deeper control](json_yaml.md)

## Custom Configuration Mode

Custom configuration mode is an advanced use-case. Most users will want to use the Default Configuration instead. The primary configuration sections for Indexing Engine pipelines are described below. Details about how to use custom configuration are available in the [Custom Configuration Mode](/posts/config/custom) documentation.
Custom configuration mode is an advanced use-case. Most users will want to use the Default Configuration instead. The primary configuration sections for Indexing Engine pipelines are described below. Details about how to use custom configuration are available in the [Custom Configuration Mode](custom.md) documentation.
10 changes: 2 additions & 8 deletions docsite/posts/config/template.md → docs/config/template.md
@@ -1,15 +1,9 @@
---
title: Configuration Template
navtitle: Configuration Template
layout: page
tags: [post]
date: 2024-04-04
---
# Configuration Template

The following template can be used and stored as a `.env` in the directory where you are pointing
the `--root` parameter on your Indexing Pipeline execution.

For details about how to run the Indexing Pipeline, refer to the [Index CLI](../../index/2-cli) documentation.
For details about how to run the Indexing Pipeline, refer to the [Index CLI](../index/cli.md) documentation.

## .env File Template

File renamed without changes.
File renamed without changes.
9 changes: 2 additions & 7 deletions docsite/posts/developing.md → docs/developing.md
@@ -1,9 +1,4 @@
---
title: Developing GraphRAG
navtitle: Developing
layout: page
tags: [post]
---
# Development Guide

# Requirements

@@ -87,4 +82,4 @@ Make sure you have python3.10-dev installed or more generally `python<version>-d
### LLM call constantly exceeds TPM, RPM or time limits

`GRAPHRAG_LLM_THREAD_COUNT` and `GRAPHRAG_EMBEDDING_THREAD_COUNT` are both set to 50 by default. You can modify these values
to reduce concurrency. Please refer to the [Configuration Documents](../config/overview)
to reduce concurrency. Please refer to the [Configuration Documents](config/overview.md)
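
As a rough illustration of how a thread-count setting bounds concurrency, the sketch below reads the variable with a default of 50 and uses it to size a worker pool. How GraphRAG itself parses and applies these variables is not shown in this document; the helper names `thread_count` and `run_all` are hypothetical.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def thread_count(name: str, default: int = 50, env=None) -> int:
    """Read an integer setting such as GRAPHRAG_LLM_THREAD_COUNT.

    Falls back to the default (50, matching the text above) when unset.
    """
    values = os.environ if env is None else env
    raw = values.get(name)
    return int(raw) if raw else default


def run_all(tasks, workers: int):
    """Run zero-argument callables with at most `workers` in flight."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda task: task(), tasks))


workers = thread_count("GRAPHRAG_LLM_THREAD_COUNT")
results = run_all([lambda: "a", lambda: "b"], workers)
```

A lower value means fewer simultaneous requests, which is the lever for staying under TPM and RPM limits.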
@@ -2,20 +2,9 @@
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nCopyright (c) Microsoft Corporation.\\n'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"outputs": [],
"source": [
"# Copyright (c) 2024 Microsoft Corporation.\n",
"# Licensed under the MIT License."
File renamed without changes.