
Prediction depth #72

Open
levmckinney opened this issue Apr 18, 2023 · 2 comments
Labels
enhancement New feature or request

Comments

@levmckinney
Collaborator

In the paper there is a nice visualization of prediction depth. Prediction depth is defined in the paper as the first layer at which the most likely token equals the model's final output token.

[figure: prediction depth visualization]

These should be included as part of the PredictionTrajectory class so that we can easily produce them in the future. Note that the code for this should be modular, like TrajectoryStatistic, since we may want to reuse these visualizations for attention in the future.
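As a rough sketch of the statistic itself (not the actual PredictionTrajectory API; the function name and array layout here are assumptions), prediction depth per position can be computed from a stack of per-layer logits, using the final layer's top-1 token as the model output:

```python
# Hypothetical sketch: prediction depth from per-layer logits.
# Assumes `layer_logits` has shape (num_layers, seq_len, vocab_size) and that
# the last layer's argmax is the model's actual output token.
import numpy as np

def prediction_depth(layer_logits: np.ndarray) -> np.ndarray:
    """Per position, the first layer whose top-1 token matches the
    final layer's top-1 token (the model output)."""
    top1 = layer_logits.argmax(axis=-1)   # (num_layers, seq_len)
    matches = top1 == top1[-1]            # agrees with the final output?
    # argmax over layers returns the first True; the final layer always
    # matches itself, so a match is guaranteed to exist.
    return matches.argmax(axis=0)

# Tiny example: 3 layers, 2 positions, vocab of 4.
logits = np.zeros((3, 2, 4))
logits[0, 0, 1] = 1.0  # layer 0, pos 0 predicts token 1
logits[1, 0, 2] = 1.0  # layer 1, pos 0 predicts token 2
logits[2, 0, 2] = 1.0  # final layer, pos 0 outputs token 2
logits[:, 1, 3] = 1.0  # every layer at pos 1 predicts token 3
print(prediction_depth(logits))  # → [1 0]
```

A heatmap of these depths over a sequence would reproduce the paper's visualization.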

@levmckinney levmckinney added the enhancement New feature or request label Apr 19, 2023
@levmckinney
Collaborator Author

I appear to have lost my prototype code for doing this, but I've dug up the reference implementation I based it on in Captum: https://github.com/pytorch/captum/blob/50f7bdd243b0430ef06958bb2dda9b3bdd0c150d/captum/attr/_utils/visualization.py#L755

@levmckinney
Collaborator Author

levmckinney commented Apr 21, 2023

Another reference to look at would be the attention visualizations from Anthropic: https://github.com/anthropics/PySvelte. It's used here: https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html.
