[Serve] Support event-based autoscaling #4533

Open
gaocegege opened this issue Jan 5, 2025 · 0 comments

Right now, we only have QPS-based autoscaling, which works well in many situations. It would be great to generalize it to support event-based scaling too. A more flexible autoscaler could integrate with other systems or triggers, for example scaling a service based on the number of files in object storage, or on any custom event. This would be especially handy for tricky scaling scenarios such as going from 1 to 0 or from 0 to 1.

Scenario 1 - Scale down to 0 at night

You could use cron triggers to scale up during the day and scale down to 0 at night.

service:
  replica_policy:
    max_replicas: 10
    target_qps_per_replica: 3
    upscale_delay_seconds: 300
    downscale_delay_seconds: 1200
    autoscaling_rules:
    - type: cron  # Use a cron-based policy to schedule scaling
      metadata:
        timezone: America/New_York  # Set the timezone for the cron schedule
        # Start time for scaling up: 09:00 AM every weekday (Monday to Friday)
        start: "0 9 * * 1-5"
        # The desired number of replicas during the day (work hours)
        min_replicas: "2"  # Keeps at least 2 replicas running during the day

    - type: cron  # Another cron policy for scaling down at night
      metadata:
        timezone: America/New_York  # Same timezone as above
        # Scale down to 0 at 17:00 (05:00 PM) every weekday (Monday to Friday)
        start: "0 17 * * 1-5"
        # The desired number of replicas at night (scale down to 0)
        min_replicas: "0"  # Scales down to 0 replicas after work hours
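
To make the intended semantics of the two cron rules concrete, here is a minimal sketch of how an autoscaler might pick the active `min_replicas`. It assumes that the most recently fired rule wins, which the proposal does not state. `CronRule`, `last_fired`, and `active_min_replicas` are hypothetical names, not existing SkyServe APIs, and the parser only handles the restricted `<minute> <hour> * * <dow>[-<dow>]` form used above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, tzinfo
from typing import Optional, Sequence


@dataclass(frozen=True)
class CronRule:
    """One hypothetical `type: cron` rule from the proposed config."""
    start: str         # restricted form: "<minute> <hour> * * <dow>[-<dow>]"
    min_replicas: int
    tz: tzinfo         # timezone the schedule is evaluated in


def last_fired(rule: CronRule, now: datetime) -> Optional[datetime]:
    """Return the latest firing time of `rule` at or before `now`.

    `now` must be timezone-aware.
    """
    minute, hour, _dom, _month, dow = rule.start.split()
    first, _, last = dow.partition("-")
    days = range(int(first), int(last or first) + 1)
    local_now = now.astimezone(rule.tz)
    for back in range(8):  # a weekly schedule repeats within 7 days
        day = local_now - timedelta(days=back)
        fired = day.replace(hour=int(hour), minute=int(minute),
                            second=0, microsecond=0)
        # cron counts Sunday as 0; isoweekday() counts Sunday as 7.
        if fired <= local_now and fired.isoweekday() % 7 in days:
            return fired
    return None


def active_min_replicas(rules: Sequence[CronRule], now: datetime,
                        default: int = 0) -> int:
    """Pick min_replicas from whichever rule fired most recently (assumed)."""
    fired = [(t, r) for r in rules if (t := last_fired(r, now)) is not None]
    if not fired:
        return default
    return max(fired, key=lambda pair: pair[0])[1].min_replicas
```

With `tz=zoneinfo.ZoneInfo("America/New_York")`, the two rules above would yield 2 on a weekday between 09:00 and 17:00 and 0 otherwise, including weekends, since the last rule to fire before a weekend is the Friday 17:00 one.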

Scenario 2 - Autoscale when new nodes are added

The cluster can change dynamically. Sometimes new nodes are added, and services could be autoscaled based on such node changes:

service:
  replica_policy:
    max_replicas: 10
    target_qps_per_replica: 3
    upscale_delay_seconds: 300
    downscale_delay_seconds: 1200
    autoscaling_rules:
    - type: node_based  # Trigger scaling based on node changes
      metadata:
        # Scale up when new nodes are added to the cluster
        scale_up_on: node_added  # Trigger scaling logic when nodes are added
        scale_down_on: node_removed
        # Optional: Define the minimum number of replicas per node
        replicas_per_node: 1  # Scale 1 replica per new node added
        # Optional: Set a maximum replica limit to avoid over-scaling
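
As a rough illustration of the node-based rule, the sketch below adjusts the replica count by `replicas_per_node` on each matching event and clamps the result to `[0, max_replicas]`. `NodeBasedRule` and `on_node_event` are hypothetical names, and how this rule would combine with QPS-based scaling is left open, since the proposal does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeBasedRule:
    """Hypothetical in-memory form of the `type: node_based` rule."""
    replicas_per_node: int = 1
    max_replicas: int = 10
    scale_up_on: str = "node_added"
    scale_down_on: str = "node_removed"


def on_node_event(rule: NodeBasedRule, current_replicas: int,
                  event: str) -> int:
    """Return the new target replica count after a cluster event."""
    if event == rule.scale_up_on:
        target = current_replicas + rule.replicas_per_node
    elif event == rule.scale_down_on:
        target = current_replicas - rule.replicas_per_node
    else:
        # Events this rule does not subscribe to leave the target unchanged.
        return current_replicas
    # Never go below zero or above the configured cap.
    return max(0, min(target, rule.max_replicas))
```

For example, with the defaults above, a `node_added` event at 3 replicas gives a target of 4, while the same event at 10 replicas stays at 10 because of `max_replicas`.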

Version & Commit info:

  • sky -v: PLEASE_FILL_IN
  • sky -c: PLEASE_FILL_IN