[Serve] Support event-based autoscaling #4533

gaocegege · 2025-01-05T01:47:32Z

Right now, we only have QPS-based autoscaling, which works well for a lot of situations. But it'd be great to generalize it to support event-based scaling too. A more flexible autoscaler could integrate with other systems or triggers: for example, scaling services based on the number of files in object storage, or on any custom event. This would be especially handy for tricky scaling scenarios like going from 1 to 0 or 0 to 1.

Scenario 1 - Scale down to 0 at night

You could use a cron trigger to scale up during the day and scale down to 0 at night.

service:
  replica_policy:
    max_replicas: 10
    target_qps_per_replica: 3
    upscale_delay_seconds: 300
    downscale_delay_seconds: 1200
    autoscaling_rules:
    - type: cron  # Use a cron-based policy to schedule scaling
      metadata:
        timezone: America/New_York  # Set the timezone for the cron schedule
        # Start time for scaling up: 09:00 AM every weekday (Monday to Friday)
        start: 0 9 * * 1-5
        # The desired number of replicas during the day (work hours)
        min_replicas: "2"  # Keeps at least 2 replicas running during the day

    - type: cron  # Another cron policy for scaling down at night
      metadata:
        timezone: America/New_York  # Same timezone as above
        # Scale down to 0 at 17:00 (05:00 PM) every weekday (Monday to Friday)
        start: 0 17 * * 1-5
        # The desired number of replicas at night (scale down to 0)
        min_replicas: "0"  # Scales down to 0 replicas after work hours
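
To make the intended semantics concrete, here is a minimal Python sketch of how two cron rules like the ones above could resolve to an effective min_replicas at a given moment: the rule whose start fired most recently wins. This is only an illustration under assumptions, not SkyPilot's implementation. `CronRule`, `last_fire_time`, and `effective_min_replicas` are hypothetical names, and the parser only handles the restricted "minute hour * * dow-dow" form used in this example. A real implementation would use a proper cron parser and deal with DST transitions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional
from zoneinfo import ZoneInfo


@dataclass
class CronRule:
    """Hypothetical in-memory form of one `type: cron` rule."""
    timezone: str      # e.g. "America/New_York"
    start: str         # restricted form: "<minute> <hour> * * <dow>-<dow>"
    min_replicas: int  # the YAML uses strings; assume they were converted


def last_fire_time(rule: CronRule, now: datetime) -> Optional[datetime]:
    """Return the most recent time at or before `now` when the rule fired."""
    minute, hour, _dom, _month, dow = rule.start.split()
    first_dow, last_dow = (int(d) for d in dow.split('-'))
    # An aware `now` is converted to the rule's timezone. A naive `now` is
    # taken as wall-clock time already in that timezone.
    local_now = now.astimezone(ZoneInfo(rule.timezone)) if now.tzinfo else now
    for days_back in range(8):  # look back at most one week
        day = local_now - timedelta(days=days_back)
        candidate = day.replace(hour=int(hour), minute=int(minute),
                                second=0, microsecond=0)
        cron_dow = candidate.isoweekday() % 7  # cron convention: 0 = Sunday
        if candidate <= local_now and first_dow <= cron_dow <= last_dow:
            return candidate
    return None


def effective_min_replicas(rules: List[CronRule], now: datetime,
                           default: int = 0) -> int:
    """Pick min_replicas from the rule that fired most recently."""
    latest_rule, latest_fire = None, None
    for rule in rules:
        fired = last_fire_time(rule, now)
        if fired is not None and (latest_fire is None or fired > latest_fire):
            latest_rule, latest_fire = rule, fired
    return default if latest_rule is None else latest_rule.min_replicas


rules = [
    CronRule('America/New_York', '0 9 * * 1-5', 2),   # weekdays 09:00 -> 2
    CronRule('America/New_York', '0 17 * * 1-5', 0),  # weekdays 17:00 -> 0
]
# Naive datetimes are read as New York wall-clock time here.
monday_morning = effective_min_replicas(rules, datetime(2025, 1, 6, 10, 0))
monday_evening = effective_min_replicas(rules, datetime(2025, 1, 6, 18, 0))
```

With the two rules from Scenario 1, a Monday at 10:00 resolves to 2 replicas, while Monday at 18:00 (or any time over the weekend) resolves to 0, because the 17:00 weekday rule is the last one that fired.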

Scenario 2 - Autoscale when new nodes are added

The cluster can change dynamically. Sometimes new nodes are added, and services could be autoscaled in response to that change:

service:
  replica_policy:
    max_replicas: 10
    target_qps_per_replica: 3
    upscale_delay_seconds: 300
    downscale_delay_seconds: 1200
    autoscaling_rules:
    - type: node_based  # Trigger scaling based on node changes
      metadata:
        # Scale up when new nodes are added to the cluster
        scale_up_on: node_added  # Trigger scaling logic when nodes are added
        scale_down_on: node_removed
        # Optional: Define the minimum number of replicas per node
        replicas_per_node: 1  # Scale 1 replica per new node added
        # Optional: a maximum replica limit (such as max_replicas above) avoids over-scaling
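
As a rough illustration of what such a rule might do, the sketch below adjusts a target replica count in response to node events and clamps it to the configured limits. It is a guess at the intended behavior, not an existing SkyPilot API. `NodeRule` and `apply_node_event` are hypothetical names, and the event strings simply mirror the values in the config above.

```python
from dataclasses import dataclass


@dataclass
class NodeRule:
    """Hypothetical in-memory form of the `node_based` rule metadata."""
    scale_up_on: str = 'node_added'
    scale_down_on: str = 'node_removed'
    replicas_per_node: int = 1


def apply_node_event(current_target: int, event: str, rule: NodeRule,
                     max_replicas: int, min_replicas: int = 0) -> int:
    """Return the new target replica count after one node event."""
    if event == rule.scale_up_on:
        current_target += rule.replicas_per_node
    elif event == rule.scale_down_on:
        current_target -= rule.replicas_per_node
    # Unknown events leave the target unchanged. The result is always
    # clamped to [min_replicas, max_replicas].
    return max(min_replicas, min(max_replicas, current_target))


rule = NodeRule(replicas_per_node=1)
target = 0
for event in ['node_added', 'node_added', 'node_added', 'node_removed']:
    target = apply_node_event(target, event, rule, max_replicas=10)
```

After three node_added events and one node_removed event, the target ends at 2 replicas. Clamping to max_replicas keeps a burst of new nodes from over-scaling the service.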

Version & Commit info:

sky -v: PLEASE_FILL_IN
sky -c: PLEASE_FILL_IN
