Docs updates #293

Merged
merged 10 commits into from
Oct 31, 2023
2 changes: 1 addition & 1 deletion .git-hooks/pre-commit
@@ -2,7 +2,7 @@
#
files_to_check="docs/src/recipes/*.ipynb"
files_to_check+=" docs/src/user_guide.ipynb"
files_to_check+=" docs/src/tutorials/getting_started.ipynb"
Collaborator

Do we risk breaking existing URLs that we have already shared?

Maybe we can add a redirect from tutorials/getting_started.ipynb to getting_started.ipynb?

Collaborator Author

Good idea, I'll create a symlink and see if it works.

Collaborator Author

FYI: the symlink didn't work, but I was able to configure a redirect through the Read the Docs admin panel.
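
One way to confirm that a previously shared link still lands on the moved page is to follow the redirect and compare where it ends up. The snippet below is only a sketch using the Python standard library: the helper names `final_url` and `redirects_to` are hypothetical, and the old URL in the usage note is an assumption about how the tutorial was published, not something taken from this PR.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen


def final_url(url, opener=urlopen):
    """Return the URL reached after following any HTTP redirects."""
    with opener(url) as response:
        return response.geturl()


def redirects_to(old_url, new_url, opener=urlopen):
    """True if requesting old_url ends on the same host and path as new_url.

    Trailing slashes are ignored so that `/getting_started` and
    `/getting_started/` count as the same page.
    """
    reached = urlsplit(final_url(old_url, opener))
    expected = urlsplit(new_url)
    return (reached.netloc, reached.path.rstrip("/")) == (
        expected.netloc,
        expected.path.rstrip("/"),
    )
```

For example, `redirects_to("https://temporian.readthedocs.io/en/stable/tutorials/getting_started/", "https://temporian.readthedocs.io/en/stable/getting_started/")` would perform a real request; the `opener` parameter exists so the check can be exercised offline with a stub.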

files_to_check+=" docs/src/getting_started.ipynb"


for path in `git diff --name-only --staged $files_to_check`
2 changes: 1 addition & 1 deletion .github/workflows/test_notebooks.yaml
@@ -11,11 +11,11 @@ jobs:
# * can be used and all .ipynb files in that dir will be tested sequentially
path:
- user_guide
- getting_started
- recipes/*
- tutorials/anomaly_detection_supervised
- tutorials/anomaly_detection_unsupervised
- tutorials/bank_fraud_detection_with_tfdf
- tutorials/getting_started
- tutorials/heart_rate_analysis
- tutorials/loan_outcomes_prediction
- tutorials/m5_competition
6 changes: 3 additions & 3 deletions README.md
@@ -91,17 +91,17 @@ Check the [Getting Started tutorial](https://temporian.readthedocs.io/en/stable/

## Next steps

New users should refer to the [3 minutes to Temporian](https://temporian.readthedocs.io/en/stable/3_minutes/) page, which provides a
New users should refer to the [Getting Started](https://temporian.readthedocs.io/en/stable/getting_started/) guide, which provides a
quick overview of the key concepts and operations of Temporian.

After reading the 3 minute guide, visit the [User Guide](https://temporian.readthedocs.io/en/stable/user_guide/) for a deep dive into
After that, visit the [User Guide](https://temporian.readthedocs.io/en/stable/user_guide/) for a deep dive into
the major concepts, operators, conventions, and practices of Temporian. For a
hands-on learning experience, work through the [Tutorials](https://temporian.readthedocs.io/en/stable/tutorials/) or refer to the [API
reference](https://temporian.readthedocs.io/en/stable/reference/).

## Documentation

The documentation 📚 is available at [temporian.readthedocs.io](https://temporian.readthedocs.io/en/stable/). The [3 minutes to Temporian ⏰️](https://temporian.readthedocs.io/en/stable/3_minutes/) is the best way to start.
The documentation 📚 is available at [temporian.readthedocs.io](https://temporian.readthedocs.io/en/stable/). The [Getting Started guide](https://temporian.readthedocs.io/en/stable/getting_started/) is the best way to start.

## Contributing

2 changes: 1 addition & 1 deletion docs/.readthedocs.yaml
@@ -27,9 +27,9 @@ build:
- pip install -r docs/src/tutorials/requirements.txt

pre_build:
- tools/run_notebooks.sh docs/src/getting_started.ipynb
- tools/run_notebooks.sh docs/src/user_guide.ipynb
- tools/run_notebooks.sh $(ls docs/src/recipes/*.ipynb)
- tools/run_notebooks.sh docs/src/tutorials/getting_started.ipynb
# These are too slow
# - tools/run_notebooks.sh docs/src/tutorials/*.ipynb

2 changes: 1 addition & 1 deletion docs/mkdocs.yml
@@ -53,7 +53,7 @@ extra_css:
# Navigation bar
nav:
- Home: index.md
- 3 minutes to Temporian: 3_minutes.md
- Getting Started: getting_started.ipynb
- User Guide: user_guide.ipynb
- Recipes: recipes/
- Tutorials: tutorials/
69 changes: 0 additions & 69 deletions docs/src/3_minutes.md

This file was deleted.

@@ -8,7 +8,17 @@
"source": [
"# Getting Started with Temporian\n",
"\n",
"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/google/temporian/blob/last-release/docs/src/tutorials/getting_started.ipynb)"
"[![Open in Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/google/temporian/blob/last-release/docs/src/tutorials/getting_started.ipynb)\n",
"\n",
"This guide will introduce you to the basics of Temporian, including:\n",
"- What is an **EventSet** and how to create one from scratch.\n",
"- Visualizing input/output data using **EventSet.plot()** and interactive plots.\n",
"- Converting back and forth between EventSets and pandas **DataFrames**.\n",
"- Transforming the EventSets by using **operators**.\n",
"- How operators work when using **indexes**.\n",
"- Commonly used operations like **glue**, **resample**, **lag**, moving windows and arithmetics.\n",
"\n",
"If you're interested in a topic that is not covered here, the final section links to other parts of the documentation where you can continue learning."
]
},
{
@@ -51,13 +61,66 @@
"import numpy as np"
]
},
{
"cell_type": "markdown",
"id": "5e9bf8e3-07fd-4e4d-bdb8-d32a48a366c7",
"metadata": {},
"source": [
"## Part 1: Events and EventSets\n",
"\n",
"The most basic unit of data in Temporian is an **event**. An event consists of a timestamp and a set of feature values.\n",
"\n",
"Events are not handled individually. Instead, events are grouped together into an **`EventSet`**.\n",
"\n",
"`EventSets` are the main data structures in Temporian, and represent **[multivariate and multi-index time sequences](../user_guide/#what-is-temporal-data)**. Let's break that down:\n",
"\n",
"- \"multivariate\" indicates that each event in the time sequence holds several feature values.\n",
"- \"multi-index\" indicates that the events can represent hierarchical data, and can therefore be grouped by one or more of their features' values.\n",
"- \"sequence\" indicates that the events are not necessarily sampled at a uniform rate (in which case we would call it a time \"series\").\n",
"\n",
"You can create an `EventSet` from a pandas DataFrame, NumPy arrays, CSV files, and more. Here is an example containing four events and three features (one of which is used as an `index`):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "46ebad7f-f4b4-4850-bc12-8552e55a3f6b",
"metadata": {},
"outputs": [],
"source": [
"evset = tp.event_set(\n",
" timestamps=[\"2023-02-04\", \"2023-02-06\", \"2023-02-07\", \"2023-02-07\"],\n",
" features={\n",
" \"feature_1\": [0.5, 0.6, np.nan, 0.9],\n",
" \"feature_2\": [\"red\", \"blue\", \"red\", \"blue\"],\n",
" \"feature_3\": [10.0, -1.0, 5.0, 5.0],\n",
" },\n",
" indexes=[\"feature_2\"],\n",
")\n",
"evset"
]
},
{
"cell_type": "markdown",
"id": "effc4483-9a1a-4e21-b376-3ed188ced821",
"metadata": {},
"source": [
"An `EventSet` can hold one or several time sequences, depending on what its `index` is.\n",
"\n",
"If it has no index, it will hold a single multivariate time sequence, which means that all events will be considered part of the same group and will interact with each other when operators are applied.\n",
"\n",
"If it has one (or many) indexes, its events will be grouped by their `indexes` values, so it will hold one multivariate time sequence for each unique value (or unique combination of values) of its indexes, and operators will be applied to each time sequence independently.\n",
"\n",
"The last part of this tutorial shows some examples using `indexes` and operators."
]
},
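
The grouping behaviour described in the cell above can be illustrated without Temporian at all. The following is a plain-Python sketch, not the library's implementation: it reuses the four example events from the `tp.event_set` call (keeping only `feature_2` and `feature_3`), and the `cumulative_sum` helper is hypothetical. With no index, all events form one sequence and accumulate together; with `feature_2` as the index, each colour accumulates independently.

```python
from collections import defaultdict
from itertools import accumulate

# (timestamp, feature_2, feature_3) taken from the example EventSet.
events = [
    ("2023-02-04", "red", 10.0),
    ("2023-02-06", "blue", -1.0),
    ("2023-02-07", "red", 5.0),
    ("2023-02-07", "blue", 5.0),
]


def cumulative_sum(events, use_index):
    """Running sum of feature_3, computed per group of events.

    With use_index=False every event falls in a single group (key None).
    With use_index=True events are grouped by their feature_2 value.
    """
    groups = defaultdict(list)
    # ISO dates sort correctly as strings; the sort is stable for ties.
    for timestamp, color, value in sorted(events, key=lambda e: e[0]):
        key = color if use_index else None
        groups[key].append((timestamp, value))

    result = {}
    for key, sequence in groups.items():
        totals = accumulate(value for _, value in sequence)
        result[key] = [(t, total) for (t, _), total in zip(sequence, totals)]
    return result


no_index = cumulative_sum(events, use_index=False)
by_color = cumulative_sum(events, use_index=True)
```

Here `no_index[None]` ends at a total of 19.0 across all four events, while `by_color["red"]` ends at 15.0 and `by_color["blue"]` at 4.0, because the two sequences never interact.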
{
"attachments": {},
"cell_type": "markdown",
"id": "18cc96f7",
"metadata": {},
"source": [
"## Example Data\n",
"### Example Data\n",
"\n",
"This minimal data consists of just one `signal` with a `timestamp` for each sample.\n",
"\n",
@@ -98,14 +161,11 @@
"id": "3f156949",
"metadata": {},
"source": [
"## Part 1: Loading Data\n",
"\n",
"Any kind of signal is represented in Temporian as a **collection of events**, using the `EventSet` object.\n",
"### Creating an EventSet from a DataFrame\n",
"\n",
"In this case there's no `indexes` because we only have one sequence.\n",
"As mentioned in the previous section, any kind of signal is represented in Temporian as a **collection of events**, using the `EventSet` object.\n",
"\n",
"Indices could be useful if we had multiple signals in parallel.\n",
"For example, imagine that we needed to work with signals from multiple sensor devices, or represent the sales from many stores or products: we could separate them by setting the correct features as indexes for each one."
"In this case there's no `indexes` because we only have one sequence. In the third part we'll learn how to use them and why they can be useful."
]
},
{
@@ -347,7 +407,7 @@
"id": "de46f604-8d28-4d56-a83d-a13f1073b6b8",
"metadata": {},
"source": [
"## Part 3: Using an index\n",
"## Part 3: Using indexes\n",
"This is the final important concept to get from this introduction.\n",
"\n",
"Indexes are useful to handle multiple signals in parallel (as mentioned at the top of this notebook).\n",
@@ -464,25 +524,18 @@
"\n",
"To keep it short and concise, there are interesting concepts that were not mentioned above:\n",
"\n",
"- You might as well use **datetimes** to specify the timestamps. Learn more about it on the [**Time Units** section of the User Guide](https://temporian.readthedocs.io/en/latest/user_guide/#time-units). There are many [**calendar operators**](https://temporian.readthedocs.io/en/stable/reference/temporian/operators/calendar/calendar_day_of_month/) available when working with date timestamps.\n",
"- Temporian can handle **non-uniform samplings** just as easily (non-equal distance between event timestamps). Read more about the data representation on the **[User Guide's introduction](https://temporian.readthedocs.io/en/latest/user_guide/)** or check the [**sampling** section](https://temporian.readthedocs.io/en/latest/user_guide/#sampling).\n",
"- Temporian is **strict on the feature types** when applying operations, to avoid potentially silent errors or memory issues. Check the [User Guide's **casting** section](https://temporian.readthedocs.io/en/latest/user_guide/#casting) section to learn how to tackle those cases.\n",
"- We only used moving average here, but there are a bunch of other [**moving window**](https://temporian.readthedocs.io/en/stable/reference/temporian/operators/window/moving_count/) operators, frequently useful for time sequences manipulation.\n",
"- Check the [**Time Units** section of the User Guide](https://temporian.readthedocs.io/en/latest/user_guide/#time-units). There are many [**calendar operators**](https://temporian.readthedocs.io/en/stable/reference/temporian/operators/calendar/calendar_day_of_month/) available when working with datetimes.\n",
"- To combine or operate with events from different sampling sources (potentially non-uniform samplings) check the [**sampling** section of the User Guide](https://temporian.readthedocs.io/en/stable/user_guide/#sampling).\n",
"- Temporian is **strict on the feature data types** when applying operations, to avoid potentially silent errors or memory issues. Check the [User Guide's **casting** section](https://temporian.readthedocs.io/en/latest/user_guide/#casting) section to learn how to tackle those cases.\n",
"\n",
"### Next Steps\n",
"- The [**Recipes**](https://temporian.readthedocs.io/en/latest/recipes/) are short and self-contained examples showing how to use Temporian in typical use cases.\n",
"- Try the more advanced [**tutorials**](https://temporian.readthedocs.io/en/latest/tutorials/) to continue learning by example about all these topics and more!\n",
"- The [**Recipes**](https://temporian.readthedocs.io/en/stable/recipes/) are short and self-contained examples showing how to use Temporian in typical use cases.\n",
"- Try the more advanced [**tutorials**](https://temporian.readthedocs.io/en/stable/tutorials/) to continue learning by example about all these topics and more!\n",
"- Learn how Temporian is **ready for production**, using [**graph mode**](https://temporian.readthedocs.io/en/stable/user_guide/#eager-mode-vs-graph-mode) or [Apache Beam](https://temporian.readthedocs.io/en/stable/tutorials/temporian_with_beam/).\n",
"\n",
"- We could only cover a small fraction of **[all available operators](https://temporian.readthedocs.io/en/stable/reference/temporian/operators/add_index/)**.\n",
"- We put a lot of ❤️ in the **[User Guide](https://temporian.readthedocs.io/en/latest/user_guide/)**, so make sure to check it out 🙂."
"- We put a lot of ❤️ in the **[User Guide](https://temporian.readthedocs.io/en/stable/user_guide/)**, so make sure to check it out 🙂."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "73b77f3e-dcdf-4be0-9c15-c625cb4e297e",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -501,7 +554,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.10.13"
}
},
"nbformat": 4,
5 changes: 2 additions & 3 deletions docs/src/recipes/aggregate_calendar.ipynb
@@ -65,7 +65,7 @@
"metadata": {},
"source": [
"## Solution\n",
"We want to calculate every month, the accumulated sales from the last 2 months. So this is what we can do:\n",
"We want to calculate for every month, the accumulated sales from the last 2 months. So this is what we can do:\n",
"1. Create a tick on the first day of every month.\n",
"1. Use a `moving_sum` with variable window length, at each tick covering the duration since the last 2 months.\n",
"\n",
@@ -173,8 +173,7 @@
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
2 changes: 1 addition & 1 deletion docs/src/recipes/aggregate_interval.ipynb
@@ -165,7 +165,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
"version": "3.10.13"
}
},
"nbformat": 4,
3 changes: 2 additions & 1 deletion docs/src/tutorials/index.md
@@ -2,7 +2,8 @@

A collection of tutorials that will help you see how Temporian can be used in practice in a wide range of scenarios.

- [Getting Started](getting_started.ipynb): Basic usage of the library, including the concept of index.
Make sure to read the [Getting Started guide](../getting_started.ipynb) first to learn about the basic usage of the library, including the concepts of sampling and indexes.

- [Heart rate analysis](heart_rate_analysis.ipynb): Load, visualize and preprocess medical data to detect heartbeats and compute heart rate and heart rate variability from raw ECG signals.
- [Loan outcomes prediction](loan_outcomes_prediction.ipynb): Use Temporian to prepare data to predict outcomes for finished loans.
- [M5 Competition](m5_competition.ipynb): Feature engineering and model training on the M5 Makridakis Forecasting Competitions.
2 changes: 1 addition & 1 deletion docs/src/tutorials/temporian_with_beam.ipynb
@@ -11,7 +11,7 @@
"\n",
"**WARNING:** Temporian with Apache Beam is experimental. The API might change, some optimizations are not implemented, and some operators are not available.\n",
"\n",
"The reader is assumed to be familiar with Temporian in-process execution. Please read the [3 Minutes to Temporian](../3_minutes), or the [User Guide](../user_guide) before.\n"
"The reader is assumed to be familiar with Temporian in-process execution. Please read the [Getting Started](../getting_started) or the [User Guide](../user_guide) before.\n"
]
},
{