Commit 355a02f

Merge branch 'main' into isaac/backtestingtutorial

baskaryan authored Dec 19, 2024
2 parents 041c29d + f9ec6b7

Showing 33 changed files with 2,579 additions and 788 deletions.
3 changes: 2 additions & 1 deletion .prettierignore
```diff
@@ -1,4 +1,5 @@
 node_modules
 build
 .docusaurus
-docs/api
+docs/api
+docs/evaluation
```

8 changes: 4 additions & 4 deletions docs/administration/pricing.mdx
```diff
@@ -208,13 +208,13 @@ If you’ve consumed the monthly allotment of free traces in your account, you c

 Every user will have a unique personal account on the Developer plan. <b>We cannot upgrade a Developer account to the Plus or Enterprise plans.</b> If you’re interested in working as a team, create a separate LangSmith Organization on the Plus plan. This plan can be upgraded to the Enterprise plan at a later date.

-### How will billing work?
+### How does billing work?

 <b>Seats</b>
 <br />
-Seats are billed monthly on the first of the month in the future will be
-pro-rated if additional seats are purchased in the middle of the month. Seats
-removed mid-month will not be credited.
+Seats are billed monthly on the first of the month. Additional seats purchased
+mid-month are pro-rated and billed within one day of the purchase. Seats removed
+mid-month will not be credited.
 <br />
 <br />
 <b>Traces</b>
```

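As an aside, the corrected pro-ration rule is easy to sanity-check with a little arithmetic. The sketch below is illustrative only and not part of the commit; the $39/month seat price and dates are hypothetical figures.

```python
from calendar import monthrange
from datetime import date

def prorated_seat_charge(purchase_date: date, monthly_price: float) -> float:
    """Charge for a seat added mid-month, pro-rated by days remaining.

    Hypothetical illustration of the billing rule described above;
    the actual billing logic is LangSmith's and is not in this commit.
    """
    days_in_month = monthrange(purchase_date.year, purchase_date.month)[1]
    days_remaining = days_in_month - purchase_date.day + 1  # count the purchase day
    return round(monthly_price * days_remaining / days_in_month, 2)

# A seat added on Dec 16 at a hypothetical $39/month:
# 16 of 31 days remain, so the pro-rated charge is 39 * 16 / 31 ≈ $20.13.
print(prorated_seat_charge(date(2024, 12, 16), 39.0))
```
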
409 changes: 160 additions & 249 deletions docs/evaluation/concepts/index.mdx

Large diffs are not rendered by default.

Binary file not shown.
Binary file not shown.
Binary file added docs/evaluation/concepts/static/offline.png
Binary file added docs/evaluation/concepts/static/online.png
3 changes: 3 additions & 0 deletions docs/evaluation/how_to_guides/annotation_queues.mdx
```diff
@@ -73,6 +73,9 @@ To assign runs to an annotation queue, either:

 3. [Set up an automation rule](../../../observability/how_to_guides/monitoring/rules) that automatically assigns runs which pass a certain filter and sampling condition to an annotation queue.

+4. Select one or multiple experiments from the dataset page and click **Annotate**. From the resulting popup, you may either create a new queue or add the runs to an existing one:
+   ![](./static/annotate_experiment.png)
+
 :::tip

 It is often a very good idea to assign runs that have a certain user feedback score (e.g., thumbs up, thumbs down) from the application to an annotation queue. This way, you can identify and address issues that are causing user dissatisfaction.
```

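The tip in this hunk, routing runs with a given user feedback score into a queue, can also be done programmatically. A minimal sketch, assuming the `langsmith` Python SDK's `create_annotation_queue` and `add_runs_to_annotation_queue` helpers; the project name and the feedback filter expression are hypothetical placeholders:

```python
from langsmith import Client

client = Client()

# Create a queue dedicated to runs that users rated poorly.
queue = client.create_annotation_queue(name="thumbs-down review")

# "my-project" and the filter expression below are hypothetical placeholders.
runs = client.list_runs(
    project_name="my-project",
    filter='and(eq(feedback_key, "user_score"), eq(feedback_score, 0))',
)
client.add_runs_to_annotation_queue(queue.id, run_ids=[run.id for run in runs])
```
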
4 changes: 2 additions & 2 deletions docs/evaluation/how_to_guides/custom_evaluator.mdx
```diff
@@ -71,9 +71,9 @@ Custom evaluators are expected to return one of the following types:

 Python and JS/TS

-- `dict`: dicts of the form `{"score" | "value": ..., "name": ...}` allow you to customize the metric type ("score" for numerical and "value" for categorical) and metric name. This is useful if, for example, you want to log an integer as a categorical metric.
+- `dict`: dicts of the form `{"score" | "value": ..., "key": ...}` allow you to customize the metric type ("score" for numerical and "value" for categorical) and metric name. This is useful if, for example, you want to log an integer as a categorical metric.

-Currently Python only
+Python only

 - `int | float | bool`: this is interpreted as a continuous metric that can be averaged, sorted, etc. The function name is used as the name of the metric.
 - `str`: this is interpreted as a categorical metric. The function name is used as the name of the metric.
```

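To make the corrected `key` field concrete, here is a minimal sketch of a custom evaluator using the dict form this hunk documents; the `formality` evaluator and its hard-coded value are illustrative only:

```python
from langsmith.schemas import Example, Run

def formality(run: Run, example: Example) -> dict:
    # "value" (rather than "score") marks the metric as categorical, so the
    # integer 2 is logged as a category label instead of being averaged;
    # "key" sets the metric name shown in the UI.
    return {"key": "formality_level", "value": 2}
```
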
2 changes: 1 addition & 1 deletion docs/evaluation/how_to_guides/evaluate_pairwise.mdx
```diff
@@ -124,7 +124,7 @@ In the Python example below, we are pulling [this structured prompt](https://smi

     return scores

 evaluate(
-    ["experiment-1", "experiment-2"],  # Replace with the names/IDs of your experiments
+    ("experiment-1", "experiment-2"),  # Replace with the names/IDs of your experiments
     evaluators=[ranked_preference],
     randomize_order=True,
     max_concurrency=4,
```

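For context, the corrected tuple fits into a pairwise setup along these lines. A sketch only: the `ranked_preference` stub and its `(inputs, outputs)` signature follow the shape suggested by the surrounding snippet rather than a verified API, and the experiment names are placeholders.

```python
from langsmith import evaluate

def ranked_preference(inputs: dict, outputs: list[dict]) -> list:
    # Hypothetical stub: always prefer the first experiment's output.
    # A real judge would compare the two candidate outputs, e.g. with an LLM.
    scores = [1, 0]
    return scores

evaluate(
    ("experiment-1", "experiment-2"),  # a 2-tuple: pairwise compares exactly two experiments
    evaluators=[ranked_preference],
    randomize_order=True,  # shuffle output order to reduce position bias
    max_concurrency=4,
)
```

The list-to-tuple change is not cosmetic: a 2-tuple encodes the fixed two-experiment arity that pairwise comparison requires, which a variable-length list does not.
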
Binary file modified docs/evaluation/how_to_guides/static/view_experiment.gif
