Reserving CPU resource in CPU inference #27321

Merged
Changes from 6 commits
47 commits
dd848c2
add property enable_cpu_reservation
sunxiaoxia2022 Oct 29, 2024
23a6e42
update enable_cpu_pinning when enable_cpu_reservation=true && enable_…
sunxiaoxia2022 Oct 29, 2024
ce804ee
update test case
sunxiaoxia2022 Oct 30, 2024
bf63bf9
support cpu_reservation=true,cpu_pinning=false
sunxiaoxia2022 Nov 12, 2024
dd43e4a
change comments
sunxiaoxia2022 Nov 12, 2024
13a146e
add enable_cpu_reservation to CpuExecNetworkSupportedPropertiesAreAva…
sunxiaoxia2022 Nov 13, 2024
79099f0
initial implementation
wangleis Dec 3, 2024
b6ba3a6
update current_socket_id
wangleis Dec 4, 2024
ae3e1f0
Merge branch 'master' into xiaoxia/cpu_reservation
sunxiaoxia2022 Dec 4, 2024
7c53ce3
Revert "update current_socket_id"
wangleis Dec 4, 2024
e27cc45
Merge branch 'master' into xiaoxia/cpu_reservation
sunxiaoxia2022 Dec 6, 2024
967f2fa
Merge branch 'pr27873' into xiaoxia/cpu_reservation
sunxiaoxia2022 Dec 6, 2024
3ccc8c6
Merge branch 'master' into xiaoxia/cpu_reservation
peterchen-intel Dec 16, 2024
9e7ed1c
initial implementation
wangleis Dec 18, 2024
5cdfc10
refactor streams calculation
wangleis Dec 18, 2024
e684864
fix code style issue
wangleis Dec 18, 2024
d1f50d2
set cpu_pinning yes if user only set cpu_reservation yes
sunxiaoxia2022 Dec 23, 2024
64e8114
Merge branch 'xiaoxia/cpu_reservation' of https://github.com/sunxiaox…
sunxiaoxia2022 Dec 23, 2024
bea7d8e
Merge branch 'master' into update_proc_type_table
wangleis Dec 24, 2024
93fcb21
revert pr27873
sunxiaoxia2022 Dec 24, 2024
6412749
Merge branch 'master' into xiaoxia/cpu_reservation
sunxiaoxia2022 Dec 24, 2024
e171818
Merge branch 'pr28117' into xiaoxia/cpu_reservation
sunxiaoxia2022 Dec 24, 2024
a575038
Merge commit 'refs/pull/28117/head' of https://github.com/openvinotoo…
sunxiaoxia2022 Dec 24, 2024
bd4a132
update for cpu reservation
wangleis Dec 24, 2024
885d837
Merge branch 'master' into update_proc_type_table
wangleis Dec 24, 2024
a6d7ea1
Merge branch 'master' into xiaoxia/cpu_reservation
wangleis Dec 30, 2024
8cc37b1
add enable_cpu_reservation condition in creating executor
sunxiaoxia2022 Jan 3, 2025
daf179b
Merge branch 'xiaoxia/cpu_reservation' of https://github.com/sunxiaox…
sunxiaoxia2022 Jan 3, 2025
f6e5867
Merge branch 'master' into xiaoxia/cpu_reservation
wangleis Jan 6, 2025
34b1aff
update pinning in windows
sunxiaoxia2022 Jan 6, 2025
f3bcc0c
add lock to guarantee thread safity
sunxiaoxia2022 Jan 10, 2025
8c86f52
fix conflict
sunxiaoxia2022 Jan 10, 2025
4e9d6a1
fix ci issue
sunxiaoxia2022 Jan 10, 2025
2e0295f
fix ci test issue
sunxiaoxia2022 Jan 11, 2025
76bb972
add test case of reservation
sunxiaoxia2022 Jan 13, 2025
a0c8793
add parallel running multiple compiled model test case
sunxiaoxia2022 Jan 13, 2025
af4cf60
Merge branch 'master' into xiaoxia/cpu_reservation
sunxiaoxia2022 Jan 13, 2025
9a36e8d
rm invalid log
sunxiaoxia2022 Jan 13, 2025
ab7484d
add test case
sunxiaoxia2022 Jan 14, 2025
aab8519
fix conflict
sunxiaoxia2022 Jan 14, 2025
f46103c
fix ci test issue
sunxiaoxia2022 Jan 14, 2025
85b03f6
change test case
sunxiaoxia2022 Jan 15, 2025
b1143c6
fix conflicts
sunxiaoxia2022 Jan 15, 2025
bb622bb
change test case
sunxiaoxia2022 Jan 16, 2025
d0e19e1
change _cpu_ids_all to Impl()
sunxiaoxia2022 Jan 16, 2025
c0f2e69
fix ci test
sunxiaoxia2022 Jan 16, 2025
e06fbdd
remove smoke_CpuExecNetworkCheckCpuReservation
sunxiaoxia2022 Jan 16, 2025
18 changes: 18 additions & 0 deletions src/inference/include/openvino/runtime/properties.hpp
@@ -479,6 +479,24 @@ static constexpr Property<std::set<ModelDistributionPolicy>> model_distribution_
*/
static constexpr Property<bool> enable_cpu_pinning{"ENABLE_CPU_PINNING"};

/**
* @brief This property allows CPU reservation during inference.
* @ingroup ov_runtime_cpp_prop_api
*
* Cpu Reservation means reserve cpus which will not be used by other plugin. Developer can use this property to
Contributor:

I guess the property limits CPU reuse not only between plugins but also within a single plugin, across different compiled models. That should be stated directly.

Contributor Author:

Ok, done.

* enable or disable CPU reservation during inference on Windows and Linux. MacOS
Contributor:

I would say the property support matrix should be described in the user documentation, as we do for other properties.

Contributor Author:

Ok, added the property to the user documentation.

* does not support CPU reservation, and this property is always disabled.
* This property defaults to false.
*
* The following code is example to use this property.
*
* @code
* ie.set_property(ov::hint::enable_cpu_reservation(true));
* ie.set_property(ov::hint::enable_cpu_reservation(false));
* @endcode
*/
static constexpr Property<bool> enable_cpu_reservation{"ENABLE_CPU_RESERVATION"};

/**
* @brief This property define if using hyper threading during inference.
* @ingroup ov_runtime_cpp_prop_api
6 changes: 1 addition & 5 deletions src/inference/src/dev/threading/istreams_executor.cpp
@@ -163,10 +163,6 @@ void IStreamsExecutor::Config::update_executor_config() {
return;
}

- if (_cpu_reservation && !_cpu_pinning) {
- _cpu_pinning = true;
- }

if (!_streams_info_table.empty()) {
streams_info_available = true;
std::vector<int> threads_proc_type(HYPER_THREADING_PROC + 1, 0);
@@ -265,7 +261,7 @@ void IStreamsExecutor::Config::update_executor_config() {
}
}

- if (_cpu_pinning) {
+ if (_cpu_pinning || _cpu_reservation) {
Contributor:

If the user sets only cpu_reservation, change cpu_pinning to yes internally.
If the user sets both cpu_reservation and cpu_pinning, keep the user's settings.

Contributor Author:

Ok, done.

reserve_available_cpus(_streams_info_table, _stream_processor_ids, _cpu_reservation ? CPU_USED : NOT_USED);
}

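The hunk above passes `CPU_USED` to `reserve_available_cpus` when `_cpu_reservation` is set and `NOT_USED` otherwise. A minimal, self-contained sketch of what that distinction could mean follows. It is a toy model, not the OpenVINO implementation: `CpuPool`, `pick_cpus`, and the numeric status values are hypothetical names and values chosen for illustration, and the sketch is single-threaded (the commit list mentions a lock for thread safety, which is omitted here).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical status values, named after CPU_USED / NOT_USED in the diff.
// The numeric values are arbitrary for this sketch.
enum CpuStatus { NOT_USED = 0, CPU_USED = 1 };

// Toy stand-in for a shared CPU table: one status entry per logical CPU.
struct CpuPool {
    std::vector<int> status;
    explicit CpuPool(std::size_t n) : status(n, NOT_USED) {}
};

// Pick up to `count` free CPUs. With `reserve` set, mark them CPU_USED so that
// later callers skip them; otherwise leave the table untouched, so the same
// CPUs can be handed out again.
std::vector<int> pick_cpus(CpuPool& pool, std::size_t count, bool reserve) {
    std::vector<int> picked;
    for (std::size_t id = 0; id < pool.status.size() && picked.size() < count; ++id) {
        if (pool.status[id] == NOT_USED) {
            picked.push_back(static_cast<int>(id));
            if (reserve) {
                pool.status[id] = CPU_USED;
            }
        }
    }
    return picked;
}
```

In this model, two callers that both reserve receive disjoint CPU sets, which matches the reviewer's reading that reservation limits reuse across compiled models as well as across plugins. Two callers that only pin may receive the same CPUs.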
281 changes: 279 additions & 2 deletions src/inference/tests/unit/executor_config_test.cpp

Large diffs are not rendered by default.

4 changes: 4 additions & 0 deletions src/plugins/intel_cpu/src/compiled_model.cpp
@@ -247,6 +247,7 @@ ov::Any CompiledModel::get_property(const std::string& name) const {
RO_property(ov::hint::execution_mode.name()),
RO_property(ov::hint::num_requests.name()),
RO_property(ov::hint::enable_cpu_pinning.name()),
RO_property(ov::hint::enable_cpu_reservation.name()),
RO_property(ov::hint::scheduling_core_type.name()),
RO_property(ov::hint::model_distribution_policy.name()),
RO_property(ov::hint::enable_hyper_threading.name()),
@@ -307,6 +308,9 @@ ov::Any CompiledModel::get_property(const std::string& name) const {
} else if (name == ov::hint::enable_cpu_pinning.name()) {
const bool use_pin = config.enableCpuPinning;
return decltype(ov::hint::enable_cpu_pinning)::value_type(use_pin);
} else if (name == ov::hint::enable_cpu_reservation.name()) {
const bool use_reserve = config.enableCpuReservation;
return decltype(ov::hint::enable_cpu_reservation)::value_type(use_reserve);
} else if (name == ov::hint::scheduling_core_type) {
const auto stream_mode = config.schedulingCoreType;
return stream_mode;
10 changes: 10 additions & 0 deletions src/plugins/intel_cpu/src/config.cpp
@@ -178,6 +178,16 @@ void Config::readProperties(const ov::AnyMap& prop, const ModelType modelType) {
ov::hint::enable_cpu_pinning.name(),
". Expected only true/false.");
}
} else if (key == ov::hint::enable_cpu_reservation.name()) {
try {
enableCpuReservation = val.as<bool>();
} catch (ov::Exception&) {
OPENVINO_THROW("Wrong value ",
val.as<std::string>(),
"for property key ",
ov::hint::enable_cpu_reservation.name(),
". Expected only true/false.");
}
} else if (key == ov::hint::scheduling_core_type.name()) {
try {
schedulingCoreType = val.as<ov::hint::SchedulingCoreType>();
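The new branch in `Config::readProperties` follows a convert-then-report pattern: try to read the value as a boolean and, on failure, raise an error that names the offending value and the property key. The sketch below shows the same pattern with the standard library only. `parse_bool_property` is a hypothetical helper, not part of OpenVINO, and it assumes the value arrives as a string and that only the literals `true` and `false` are accepted.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: parse a textual boolean property value.
// Accepts exactly "true" or "false"; anything else is reported with the
// offending value and the property key in the message.
bool parse_bool_property(const std::string& key, const std::string& value) {
    if (value == "true") {
        return true;
    }
    if (value == "false") {
        return false;
    }
    throw std::invalid_argument("Wrong value " + value + " for property key " + key +
                                ". Expected only true/false.");
}
```

A caller would wrap the call in `try`/`catch` where it wants to recover, or let the exception propagate so that a misconfigured property fails early.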
1 change: 1 addition & 0 deletions src/plugins/intel_cpu/src/config.h
@@ -75,6 +75,7 @@ struct Config {
uint32_t hintNumRequests = 0;
bool enableCpuPinning = true;
bool changedCpuPinning = false;
bool enableCpuReservation = false;
ov::hint::SchedulingCoreType schedulingCoreType = ov::hint::SchedulingCoreType::ANY_CORE;
std::set<ov::hint::ModelDistributionPolicy> modelDistributionPolicy = {};
int streamsRankLevel = 1;
2 changes: 1 addition & 1 deletion src/plugins/intel_cpu/src/cpu_streams_calculation.cpp
@@ -691,7 +691,7 @@ std::vector<std::vector<int>> generate_stream_info(const int streams,
config.streams,
config.threadsPerStream,
ov::hint::SchedulingCoreType::ANY_CORE,
- false,
+ config.enableCpuReservation,
cpu_pinning,
streams_info_table};

4 changes: 4 additions & 0 deletions src/plugins/intel_cpu/src/plugin.cpp
@@ -346,6 +346,9 @@ ov::Any Plugin::get_property(const std::string& name, const ov::AnyMap& options)
} else if (name == ov::hint::enable_cpu_pinning) {
const bool pin_value = engConfig.enableCpuPinning;
return decltype(ov::hint::enable_cpu_pinning)::value_type(pin_value);
} else if (name == ov::hint::enable_cpu_reservation) {
const bool reserve_value = engConfig.enableCpuReservation;
return decltype(ov::hint::enable_cpu_reservation)::value_type(reserve_value);
} else if (name == ov::hint::scheduling_core_type) {
const auto core_type = engConfig.schedulingCoreType;
return core_type;
@@ -422,6 +425,7 @@ ov::Any Plugin::get_ro_property(const std::string& name, const ov::AnyMap& optio
RW_property(ov::hint::execution_mode.name()),
RW_property(ov::hint::num_requests.name()),
RW_property(ov::hint::enable_cpu_pinning.name()),
RW_property(ov::hint::enable_cpu_reservation.name()),
RW_property(ov::hint::scheduling_core_type.name()),
RW_property(ov::hint::model_distribution_policy.name()),
RW_property(ov::hint::enable_hyper_threading.name()),
@@ -32,6 +32,7 @@ TEST_F(OVClassConfigTestCPU, smoke_CpuExecNetworkSupportedPropertiesAreAvailable
RO_property(ov::hint::execution_mode.name()),
RO_property(ov::hint::num_requests.name()),
RO_property(ov::hint::enable_cpu_pinning.name()),
RO_property(ov::hint::enable_cpu_reservation.name()),
RO_property(ov::hint::scheduling_core_type.name()),
RO_property(ov::hint::model_distribution_policy.name()),
RO_property(ov::hint::enable_hyper_threading.name()),
@@ -47,6 +47,7 @@ TEST_F(OVClassConfigTestCPU, smoke_PluginAllSupportedPropertiesAreAvailable) {
RW_property(ov::hint::execution_mode.name()),
RW_property(ov::hint::num_requests.name()),
RW_property(ov::hint::enable_cpu_pinning.name()),
RW_property(ov::hint::enable_cpu_reservation.name()),
RW_property(ov::hint::scheduling_core_type.name()),
RW_property(ov::hint::model_distribution_policy.name()),
RW_property(ov::hint::enable_hyper_threading.name()),
2 changes: 2 additions & 0 deletions src/plugins/intel_gpu/src/graph/program.cpp
@@ -113,11 +113,13 @@ static ov::threading::IStreamsExecutor::Config make_task_executor_config(const E
default: OPENVINO_ASSERT(false, "[GPU] Can't create task executor: invalid host task priority value: ", priority);
}
bool enable_cpu_pinning = config.get_property(ov::hint::enable_cpu_pinning);
bool enable_cpu_reservation = config.get_property(ov::hint::enable_cpu_reservation);

ov::threading::IStreamsExecutor::Config task_executor_config(tags,
streams,
1,
core_type,
enable_cpu_reservation,
enable_cpu_pinning);

return task_executor_config;
3 changes: 3 additions & 0 deletions src/plugins/intel_gpu/src/plugin/compiled_model.cpp
@@ -25,11 +25,13 @@ std::shared_ptr<ov::threading::ITaskExecutor> create_task_executor(const std::sh
// the CPU behavior
return plugin->get_executor_manager()->get_executor("GPU");
} else if (config.get_property(ov::hint::enable_cpu_pinning)) {
bool enable_cpu_reservation = config.get_property(ov::hint::enable_cpu_reservation);
return std::make_shared<ov::threading::CPUStreamsExecutor>(
ov::threading::IStreamsExecutor::Config{"Intel GPU plugin executor",
config.get_property(ov::num_streams),
1,
ov::hint::SchedulingCoreType::PCORE_ONLY,
enable_cpu_reservation,
true});
} else {
return std::make_shared<ov::threading::CPUStreamsExecutor>(
@@ -242,6 +244,7 @@ ov::Any CompiledModel::get_property(const std::string& name) const {
// Configs
ov::PropertyName{ov::enable_profiling.name(), PropertyMutability::RO},
ov::PropertyName{ov::hint::enable_cpu_pinning.name(), PropertyMutability::RO},
ov::PropertyName{ov::hint::enable_cpu_reservation.name(), PropertyMutability::RO},
ov::PropertyName{ov::hint::model_priority.name(), PropertyMutability::RO},
ov::PropertyName{ov::intel_gpu::hint::host_task_priority.name(), PropertyMutability::RO},
ov::PropertyName{ov::intel_gpu::hint::queue_priority.name(), PropertyMutability::RO},
1 change: 1 addition & 0 deletions src/plugins/intel_gpu/src/plugin/plugin.cpp
@@ -564,6 +564,7 @@ std::vector<ov::PropertyName> Plugin::get_supported_properties() const {
ov::PropertyName{ov::hint::num_requests.name(), PropertyMutability::RW},
ov::PropertyName{ov::hint::inference_precision.name(), PropertyMutability::RW},
ov::PropertyName{ov::hint::enable_cpu_pinning.name(), PropertyMutability::RW},
ov::PropertyName{ov::hint::enable_cpu_reservation.name(), PropertyMutability::RW},
ov::PropertyName{ov::device::id.name(), PropertyMutability::RW},
ov::PropertyName{ov::hint::dynamic_quantization_group_size.name(), PropertyMutability::RW}
};
6 changes: 6 additions & 0 deletions src/plugins/intel_gpu/src/runtime/execution_config.cpp
@@ -46,6 +46,7 @@ void ExecutionConfig::set_default() {
std::make_tuple(ov::hint::execution_mode, ov::hint::ExecutionMode::PERFORMANCE),
std::make_tuple(ov::hint::num_requests, 0),
std::make_tuple(ov::hint::enable_cpu_pinning, false),
std::make_tuple(ov::hint::enable_cpu_reservation, false),

std::make_tuple(ov::intel_gpu::hint::host_task_priority, ov::hint::Priority::MEDIUM),
std::make_tuple(ov::intel_gpu::hint::queue_throttle, ov::intel_gpu::hint::ThrottleLevel::MEDIUM),
@@ -235,6 +236,11 @@ void ExecutionConfig::apply_user_properties(const cldnn::device_info& info) {
if (info.supports_immad) {
set_property(ov::intel_gpu::queue_type(QueueTypes::in_order));
}
if (!is_set_by_user(ov::hint::enable_cpu_reservation)) {
if (get_property(ov::hint::enable_cpu_pinning)) {
set_property(ov::hint::enable_cpu_reservation(true));
}
}

user_properties.clear();
}
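The `apply_user_properties` hunk derives a default for one property from another: when the user has not set `enable_cpu_reservation` explicitly and `enable_cpu_pinning` is on, reservation is switched on as well. The sketch below models that rule with standard containers. `ToyConfig` and its members are hypothetical stand-ins for the GPU plugin's `ExecutionConfig`; only the decision logic mirrors the diff.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-in for a plugin configuration that tracks which keys
// were set explicitly by the user.
struct ToyConfig {
    std::map<std::string, bool> values{{"ENABLE_CPU_PINNING", false},
                                       {"ENABLE_CPU_RESERVATION", false}};
    std::set<std::string> user_keys;

    // Record a value and remember that it came from the user.
    void set_user(const std::string& key, bool value) {
        values[key] = value;
        user_keys.insert(key);
    }

    bool is_set_by_user(const std::string& key) const {
        return user_keys.count(key) > 0;
    }

    bool get(const std::string& key) const {
        return values.at(key);
    }

    // Same rule as the hunk: follow pinning only when the user stayed silent
    // about reservation.
    void apply_user_properties() {
        if (!is_set_by_user("ENABLE_CPU_RESERVATION") && get("ENABLE_CPU_PINNING")) {
            values["ENABLE_CPU_RESERVATION"] = true;
        }
    }
};
```

Tracking user-set keys separately from values lets an explicit `false` survive: a user who enables pinning but disables reservation keeps both settings.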
@@ -33,6 +33,7 @@ const std::vector<ov::AnyMap> gpu_compileModel_properties = {
{ov::hint::performance_mode(ov::hint::PerformanceMode::LATENCY),
ov::hint::num_requests(10),
ov::hint::enable_cpu_pinning(true),
ov::hint::enable_cpu_reservation(false),
ov::enable_profiling(true)}};

INSTANTIATE_TEST_SUITE_P(smoke_gpuCompileModelBehaviorTests,