multi-tenancy + sdk client related changes in agents #3432

dhrubo-os · 2025-01-24T19:34:36Z

Description

[multi-tenancy + sdk client related changes in agents]

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Dhrubo Saha <[email protected]>

jngz-es · 2025-01-25T02:08:57Z

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLAgentExecutor.java

+    @Override
+    public void onMultiTenancyEnabledChanged(boolean isEnabled) {
+        this.isMultiTenancyEnabled = isEnabled;


I remember this setting is static, as declared in ml-commons/plugin/src/main/java/org/opensearch/ml/settings/MLCommonsSettings.java (line 305 in 570edaf):

public static final Setting<Boolean> ML_COMMONS_MULTI_TENANCY_ENABLED = Setting

Are we expecting that it can be changed dynamically?

I wrote this piece of code when we were considering making tenancy changeable dynamically. For now, we are keeping it as a static setting. Leaving the code in place does no harm, and if we later decide to turn it into a dynamic setting, we won't need to change anything on our end.

Pretty sure that as a static setting it will never trigger this method at all. And you have to actually register listeners to get these notifications and I don't see that done. So it's half implemented. I agree it's harmless here but I'd rather see none or all.

That registration happened in this PR: https://github.com/opensearch-project/ml-commons/pull/3307/files

Right but you need to add a settings update consumer in the class where you want it to update. See https://github.com/search?q=repo%3Aopensearch-project%2Fml-commons%20addsettingsupdateconsumer&type=code

But I never see us register multitenancy (we did at one time on the feature branch but I think I removed it):

ml-commons/plugin/src/main/java/org/opensearch/ml/settings/MLFeatureEnabledSetting.java

Lines 46 to 73 in 570edaf

public MLFeatureEnabledSetting(ClusterService clusterService, Settings settings) {
    isRemoteInferenceEnabled = ML_COMMONS_REMOTE_INFERENCE_ENABLED.get(settings);
    isAgentFrameworkEnabled = ML_COMMONS_AGENT_FRAMEWORK_ENABLED.get(settings);
    isLocalModelEnabled = ML_COMMONS_LOCAL_MODEL_ENABLED.get(settings);
    isConnectorPrivateIpEnabled = new AtomicBoolean(ML_COMMONS_CONNECTOR_PRIVATE_IP_ENABLED.get(settings));
    isControllerEnabled = ML_COMMONS_CONTROLLER_ENABLED.get(settings);
    isBatchIngestionEnabled = ML_COMMONS_OFFLINE_BATCH_INGESTION_ENABLED.get(settings);
    isBatchInferenceEnabled = ML_COMMONS_OFFLINE_BATCH_INFERENCE_ENABLED.get(settings);
    isMultiTenancyEnabled = ML_COMMONS_MULTI_TENANCY_ENABLED.get(settings);

    clusterService
        .getClusterSettings()
        .addSettingsUpdateConsumer(ML_COMMONS_REMOTE_INFERENCE_ENABLED, it -> isRemoteInferenceEnabled = it);
    clusterService
        .getClusterSettings()
        .addSettingsUpdateConsumer(ML_COMMONS_AGENT_FRAMEWORK_ENABLED, it -> isAgentFrameworkEnabled = it);
    clusterService.getClusterSettings().addSettingsUpdateConsumer(ML_COMMONS_LOCAL_MODEL_ENABLED, it -> isLocalModelEnabled = it);
    clusterService
        .getClusterSettings()
        .addSettingsUpdateConsumer(ML_COMMONS_CONNECTOR_PRIVATE_IP_ENABLED, it -> isConnectorPrivateIpEnabled.set(it));
    clusterService.getClusterSettings().addSettingsUpdateConsumer(ML_COMMONS_CONTROLLER_ENABLED, it -> isControllerEnabled = it);
    clusterService
        .getClusterSettings()
        .addSettingsUpdateConsumer(ML_COMMONS_OFFLINE_BATCH_INGESTION_ENABLED, it -> isBatchIngestionEnabled = it);
    clusterService
        .getClusterSettings()
        .addSettingsUpdateConsumer(ML_COMMONS_OFFLINE_BATCH_INFERENCE_ENABLED, it -> isBatchInferenceEnabled = it);
}

Yeah, we just need to add it back here, that's all. So for now I'll keep the code as it is.
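To make the point of this thread concrete, here is a minimal, self-contained sketch of why a callback such as onMultiTenancyEnabledChanged only fires when a consumer has been registered for a setting that is dynamic. The classes BoolSetting, SettingsRegistry, and AgentExecutor, and the setting keys, are hypothetical stand-ins written for this illustration; they are not the OpenSearch or ml-commons classes, and the choice to reject a consumer for a non-dynamic setting is an assumption of the sketch rather than a statement about the real API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

class SettingsListenerSketch {

    /** Hypothetical stand-in for a boolean setting (not the OpenSearch Setting class). */
    static final class BoolSetting {
        final String key;
        final boolean defaultValue;
        final boolean dynamic;

        BoolSetting(String key, boolean defaultValue, boolean dynamic) {
            this.key = key;
            this.defaultValue = defaultValue;
            this.dynamic = dynamic;
        }
    }

    /** Hypothetical stand-in for a cluster-settings registry. */
    static final class SettingsRegistry {
        private final Map<String, List<Consumer<Boolean>>> consumers = new HashMap<>();

        /** Assumption of this sketch: only dynamic settings may have update consumers. */
        void addSettingsUpdateConsumer(BoolSetting setting, Consumer<Boolean> consumer) {
            if (!setting.dynamic) {
                throw new IllegalStateException("setting [" + setting.key + "] is not dynamic");
            }
            consumers.computeIfAbsent(setting.key, k -> new ArrayList<>()).add(consumer);
        }

        /** Simulates a runtime update and returns how many consumers were notified. */
        int applyUpdate(BoolSetting setting, boolean newValue) {
            List<Consumer<Boolean>> registered = consumers.getOrDefault(setting.key, List.of());
            registered.forEach(c -> c.accept(newValue));
            return registered.size();
        }
    }

    /** Mirrors the shape of the callback discussed in the review. */
    static final class AgentExecutor {
        volatile boolean isMultiTenancyEnabled;

        AgentExecutor(boolean initialValue) {
            this.isMultiTenancyEnabled = initialValue;
        }

        void onMultiTenancyEnabledChanged(boolean isEnabled) {
            this.isMultiTenancyEnabled = isEnabled;
        }
    }

    public static void main(String[] args) {
        SettingsRegistry registry = new SettingsRegistry();
        BoolSetting dynamicFlag = new BoolSetting("example.dynamic_flag", false, true);
        BoolSetting staticFlag = new BoolSetting("example.multi_tenancy_enabled", false, false);
        AgentExecutor executor = new AgentExecutor(dynamicFlag.defaultValue);

        // Case 1: the callback exists but nothing registered it, so it is never invoked.
        int notified = registry.applyUpdate(dynamicFlag, true);
        System.out.println("unregistered: notified=" + notified + ", flag=" + executor.isMultiTenancyEnabled);

        // Case 2: once a consumer is registered, an update reaches the callback.
        registry.addSettingsUpdateConsumer(dynamicFlag, executor::onMultiTenancyEnabledChanged);
        notified = registry.applyUpdate(dynamicFlag, true);
        System.out.println("registered: notified=" + notified + ", flag=" + executor.isMultiTenancyEnabled);

        // Case 3: in this sketch a static setting cannot take a consumer at all.
        try {
            registry.addSettingsUpdateConsumer(staticFlag, executor::onMultiTenancyEnabledChanged);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions, both pieces are needed for the callback to do anything: the setting has to be dynamic, and a consumer has to be registered for it. That is the "none or all" concern raised above.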

jngz-es · 2025-01-25T02:32:16Z

I have a high-level question. Since we are adding a tenant id to MLToolSpec, does that mean we will check permissions by tenant id for every tool? From my point of view, we don't have any tool-related APIs such as create/delete, because we don't treat a tool as a resource. By contrast, we treat the agent as a resource, and it has create/delete APIs. So I am wondering whether resource control on agents per tenant id is enough.

dhrubo-os · 2025-01-25T02:44:26Z

That's a very good question, and I agree with it. The problem is that a tool can run predictions against a model, and that model needs to be tenant-specific. That is why we are forwarding the tenant id everywhere: when a tool wants to perform an operation, we can check whether it has permission to perform it.
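A rough sketch of the idea described in this reply, assuming a tool simply carries the tenant id of the agent that runs it and passes it along on every model call. ToolSpec, ModelStore, and runTool are hypothetical names invented for this illustration; they do not reflect the actual MLToolSpec fields or the permission checks implemented in the PR.

```java
import java.util.HashMap;
import java.util.Map;

class TenantPropagationSketch {

    /** Hypothetical tool description that carries the tenant id forwarded by the agent. */
    static final class ToolSpec {
        final String name;
        final String modelId;
        final String tenantId;

        ToolSpec(String name, String modelId, String tenantId) {
            this.name = name;
            this.modelId = modelId;
            this.tenantId = tenantId;
        }
    }

    /** Hypothetical model store that records which tenant owns each model. */
    static final class ModelStore {
        private final Map<String, String> ownerByModelId = new HashMap<>();

        void register(String modelId, String tenantId) {
            ownerByModelId.put(modelId, tenantId);
        }

        /** Rejects the call unless the caller's tenant id matches the model's owner. */
        String predict(String modelId, String tenantId, String input) {
            String owner = ownerByModelId.get(modelId);
            if (owner == null || !owner.equals(tenantId)) {
                throw new SecurityException("tenant [" + tenantId + "] cannot use model [" + modelId + "]");
            }
            return "prediction for " + input;
        }
    }

    /** The tool forwards its own tenant id, so the check happens where the model is accessed. */
    static String runTool(ToolSpec tool, ModelStore store, String input) {
        return store.predict(tool.modelId, tool.tenantId, input);
    }

    public static void main(String[] args) {
        ModelStore store = new ModelStore();
        store.register("model-1", "tenant-a");

        ToolSpec allowed = new ToolSpec("search-tool", "model-1", "tenant-a");
        ToolSpec denied = new ToolSpec("search-tool", "model-1", "tenant-b");

        System.out.println(runTool(allowed, store, "hello"));
        try {
            runTool(denied, store, "hello");
        } catch (SecurityException e) {
            System.out.println("denied: " + e.getMessage());
        }
    }
}
```

The design point in this sketch is that agent-level access control alone would not cover the second call: the tool reaches a model directly, so the tenant id has to travel with the tool for the check to be possible there.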

dbwiddis

LGTM. A lot more complicated than I thought!

dbwiddis · 2025-01-25T02:47:17Z

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLAgentExecutor.java

+    @Override
+    public void onMultiTenancyEnabledChanged(boolean isEnabled) {
+        this.isMultiTenancyEnabled = isEnabled;


Pretty sure that as a static setting it will never trigger this method at all. And you have to actually register listeners to get these notifications and I don't see that done. So it's half implemented. I agree it's harmless here but I'd rather see none or all.

codecov · 2025-01-25T09:00:42Z

Codecov Report

Attention: Patch coverage is 69.69697% with 110 lines in your changes missing coverage. Please review.

Project coverage is 80.22%. Comparing base (f28bb74) to head (be09e92).
Report is 173 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ch/ml/engine/algorithms/agent/MLAgentExecutor.java | 78.12% | 13 Missing and 8 partials ⚠️ |
| ...h/ml/action/agents/DeleteAgentTransportAction.java | 72.22% | 14 Missing and 6 partials ⚠️ |
| ...orithms/agent/MLConversationalFlowAgentRunner.java | 0.00% | 12 Missing ⚠️ |
| ...arch/ml/action/agents/GetAgentTransportAction.java | 80.48% | 5 Missing and 3 partials ⚠️ |
| ...ava/org/opensearch/ml/common/agent/MLToolSpec.java | 53.33% | 3 Missing and 4 partials ⚠️ |
| .../ml/engine/algorithms/agent/MLFlowAgentRunner.java | 60.00% | 4 Missing and 2 partials ⚠️ |
| ...n/java/org/opensearch/ml/common/agent/MLAgent.java | 61.53% | 3 Missing and 2 partials ⚠️ |
| ...ml/action/agents/TransportRegisterAgentAction.java | 82.60% | 3 Missing and 1 partial ⚠️ |
| ...rg/opensearch/ml/client/MachineLearningClient.java | 0.00% | 3 Missing ⚠️ |
| ...nsearch/ml/common/connector/AbstractConnector.java | 0.00% | 0 Missing and 3 partials ⚠️ |

... and 15 more

Additional details and impacted files

@@             Coverage Diff              @@
##               main    #3432      +/-   ##
============================================
- Coverage     81.31%   80.22%   -1.10%     
- Complexity     6094     6699     +605     
============================================
  Files           573      599      +26     
  Lines         25268    29298    +4030     
  Branches       2666     3258     +592     
============================================
+ Hits          20547    23504    +2957     
- Misses         3601     4377     +776     
- Partials       1120     1417     +297

| Flag | Coverage Δ | |
|---|---|---|
| ml-commons | `80.22% <69.69%> (-1.10%)` | ⬇️ |

dhrubo-os had a problem deploying to ml-commons-cicd-env January 24, 2025 19:36 — with GitHub Actions Failure

dhrubo-os force-pushed the multi_tenancy_agent branch 2 times, most recently from 6b2ef3d to 68ecd9a Compare January 24, 2025 22:04

dhrubo-os had a problem deploying to ml-commons-cicd-env January 24, 2025 22:06 — with GitHub Actions Failure

dhrubo-os force-pushed the multi_tenancy_agent branch from 68ecd9a to b3bd8b4 Compare January 24, 2025 22:09

dhrubo-os had a problem deploying to ml-commons-cicd-env January 24, 2025 22:11 — with GitHub Actions Failure

dhrubo-os force-pushed the multi_tenancy_agent branch 2 times, most recently from 6eb19a3 to 3d9f58d Compare January 24, 2025 22:55

dhrubo-os had a problem deploying to ml-commons-cicd-env January 24, 2025 22:56 — with GitHub Actions Failure

dhrubo-os force-pushed the multi_tenancy_agent branch from 3d9f58d to eade048 Compare January 24, 2025 23:21

dhrubo-os had a problem deploying to ml-commons-cicd-env January 24, 2025 23:22 — with GitHub Actions Failure

dhrubo-os force-pushed the multi_tenancy_agent branch from eade048 to 0feb3b4 Compare January 25, 2025 01:11

multi-tenancy + sdk client related changes in agents

be09e92

Signed-off-by: Dhrubo Saha <[email protected]>

dhrubo-os force-pushed the multi_tenancy_agent branch from 0feb3b4 to be09e92 Compare January 25, 2025 01:13

dhrubo-os had a problem deploying to ml-commons-cicd-env January 25, 2025 01:15 — with GitHub Actions Failure

dhrubo-os temporarily deployed to ml-commons-cicd-env January 25, 2025 01:15 — with GitHub Actions Inactive

dhrubo-os marked this pull request as ready for review January 25, 2025 01:50

dhrubo-os requested review from b4sjoo, mingshl, jngz-es, model-collapse, rbhavna, ylwu-amzn, zane-neo and Zhangxunmt as code owners January 25, 2025 01:50

dhrubo-os requested review from austintlee, HenryL27 and xinyual as code owners January 25, 2025 01:50

jngz-es reviewed Jan 25, 2025

dhrubo-os had a problem deploying to ml-commons-cicd-env January 25, 2025 02:40 — with GitHub Actions Failure

dbwiddis approved these changes Jan 25, 2025

dhrubo-os had a problem deploying to ml-commons-cicd-env January 25, 2025 03:46 — with GitHub Actions Failure

dhrubo-os had a problem deploying to ml-commons-cicd-env January 25, 2025 05:30 — with GitHub Actions Failure

dhrubo-os temporarily deployed to ml-commons-cicd-env January 25, 2025 08:02 — with GitHub Actions Inactive

dhrubo-os deployed to ml-commons-cicd-env January 25, 2025 09:00 — with GitHub Actions Active
