ENH: Add first and last aggregations to Rolling and Expanding #60579

Merged
mroeschke merged 7 commits into pandas-dev:main on Jan 17, 2025

snitish
Contributor

@snitish snitish commented Dec 16, 2024

@snitish snitish requested a review from WillAyd as a code owner December 16, 2024 04:57
Member

@mroeschke mroeschke left a comment

Thanks for starting this @snitish.

Curious whether we even need a Cython function for this. It seems we could replicate it with a `take` on the original data, accounting for `min_periods` and the window size.
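
To make the suggestion concrete, here is a minimal sketch of what a `take`-based version might look like for a fixed-size trailing window. The helper name `rolling_first_take` is hypothetical and not part of pandas or this PR, and the sketch deliberately treats every element as an observation, so it does not skip NaNs.

```python
import numpy as np


def rolling_first_take(values, window, min_periods=None):
    """Hypothetical helper: left-most element of each trailing window.

    Assumption: NaNs are NOT skipped; every position counts as an observation.
    """
    values = np.asarray(values, dtype="float64")
    if min_periods is None:
        min_periods = window

    idx = np.arange(len(values))
    # Start index of the window that ends at each position (clipped at 0).
    start = np.maximum(idx - window + 1, 0)
    # Number of elements actually inside each window.
    counts = idx - start + 1

    # One gather picks the first element of every window.
    out = values.take(start)
    # Windows with too few elements are masked out.
    out[counts < min_periods] = np.nan
    return out


print(rolling_first_take([1, 2, 3, 4, 5], window=3))
# [nan nan  1.  2.  3.]
```

The gather itself is O(N), but because nothing here looks at whether the gathered value is NaN, it only matches a NaN-skipping aggregation on data without missing values.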

@mroeschke mroeschke added Enhancement Window rolling, ewma, expanding labels Dec 16, 2024
@snitish
Contributor Author

snitish commented Dec 16, 2024

@mroeschke Would `take` be able to handle NAs efficiently, though? That is, could it perform the aggregation in O(N) time rather than O(NW), where W is the window size?
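
For illustration, the O(N) versus O(NW) distinction can be sketched in pure Python. This is not the Cython code from the PR; it is one possible linear-time approach for fixed-size trailing windows, and the function name `rolling_first_valid` is hypothetical. It precomputes, for every position, the index of the next non-NaN value, so each window is answered with a single lookup instead of a scan.

```python
import math


def rolling_first_valid(values, window, min_periods=None):
    """Hypothetical helper: first non-NaN value in each trailing window, in O(N)."""
    n = len(values)
    if min_periods is None:
        min_periods = window

    valid = [not math.isnan(v) for v in values]

    # next_valid[j] is the smallest k >= j with a non-NaN value, or n if none.
    next_valid = [n] * (n + 1)
    for j in range(n - 1, -1, -1):
        next_valid[j] = j if valid[j] else next_valid[j + 1]

    # Prefix sums give the number of non-NaN observations in any window.
    csum = [0] * (n + 1)
    for j in range(n):
        csum[j + 1] = csum[j] + valid[j]

    out = []
    for i in range(n):
        s = max(i - window + 1, 0)  # window covers positions s .. i
        e = i + 1
        nobs = csum[e] - csum[s]
        k = next_valid[s]
        # Emit a value only if the window has enough observations
        # and its first valid position lies inside the window.
        out.append(values[k] if nobs >= min_periods and k < e else math.nan)
    return out


print(rolling_first_valid([math.nan, 1.0, math.nan, 3.0, 4.0], window=3, min_periods=1))
# [nan, 1.0, 1.0, 1.0, 3.0]
```

A naive version would rescan up to W elements per window to find the first non-NaN value, which is where the O(NW) cost comes from.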

Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jan 16, 2025
@snitish
Contributor Author

snitish commented Jan 16, 2025

> This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

bumping this for visibility.

@snitish
Contributor Author

snitish commented Jan 16, 2025

Some quick benchmarks I ran:

In [5]: ser = pd.Series(np.random.randn(10000))

In [6]: %timeit ser.rolling(5).apply(lambda x: x[~x.isna()].iloc[0]) # using custom lambda
831 ms ± 4.21 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit ser.rolling(5).first()
150 μs ± 881 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

In [8]: %timeit ser.rolling(5).last()
144 μs ± 1.14 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
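
For readers who want to try the calls being timed, here is a small usage sketch. It assumes a pandas build that includes this PR (the thread places it on the 3.0 milestone). The `hasattr` fallback through `rolling.apply` is an illustrative addition for older versions and only matches when the data contains no NaNs.

```python
import pandas as pd

ser = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
rolling = ser.rolling(3)

if hasattr(rolling, "first"):
    # Methods added by this PR.
    first = rolling.first()
    last = rolling.last()
else:
    # Fallback for older pandas (assumption: no NaNs in the data).
    # raw=True passes each window as a NumPy array, avoiding per-window Series.
    first = rolling.apply(lambda x: x[0], raw=True)
    last = rolling.apply(lambda x: x[-1], raw=True)

print(pd.DataFrame({"value": ser, "first": first, "last": last}))
```

With the default `min_periods` equal to the window size, the first two rows are NaN in both result columns.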

@mroeschke mroeschke added this to the 3.0 milestone Jan 17, 2025
@mroeschke mroeschke merged commit 72fd708 into pandas-dev:main Jan 17, 2025
49 of 51 checks passed
@mroeschke mroeschke removed the Stale label Jan 17, 2025
@mroeschke
Member

Sorry for the delay @snitish. Thanks for sticking with this

@snitish
Contributor Author

snitish commented Jan 17, 2025

> Sorry for the delay @snitish. Thanks for sticking with this

No problem at all. Thanks for approving!

Labels
Enhancement Window rolling, ewma, expanding
Development

Successfully merging this pull request may close these issues.

Feature request: first & last for rolling()
2 participants