Add power_envelope & soc_power sensors #112
base: main
Conversation
trindenau commented Dec 23, 2024
lanserv/mellanox-bf/mlx-bf.sdrs
Outdated
endsdr

#ddr_temp
Again, there is an extra space, and the comment should be `#power_envelope`. The comment can also be omitted.
Done
Please add more description to the commit message.
404f4f9 to 77c26c4
77c26c4 to adef23b
@trindenau - please update the commit message.
# Get SOC power info #
###################################
#If the file doesn’t exists try to load the module
SOC_POWER_PATH="/sys/kernel/debug/mlxbf-ptm/monitors/status/total_power"
@trindenau - is this path defined in Yochai's arch doc? If not, we need to make sure he defines the path.
This is the right path according to the arch doc
lanserv/mellanox-bf/set_emu_param.sh
Outdated
#If the file doesn’t exists try to load the module
SOC_POWER_PATH="/sys/kernel/debug/mlxbf-ptm/monitors/status/total_power"
if [ ! -f "$SOC_POWER_PATH" ]; then
modprobe mlxbf-ptm
@trindenau - does the arch doc state that we need to load the driver? What happens in secure Linux, where we can't load it, or when the driver does not exist?
No, it doesn't; I added it.
Do you want me to get rid of it, or find a way to add it to secure Linux?
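One option raised by this thread is to drop the `modprobe` call and treat a missing file as "sensor unavailable". The following is a minimal sketch of that idea, not the change that was merged. `read_power_file` is a hypothetical helper name, and `remove_sensor` is stubbed here because its real implementation in `set_emu_param.sh` is not shown in this review.

```shell
# Stub for illustration only: the real remove_sensor lives in
# set_emu_param.sh and its behavior is not shown in this review.
remove_sensor() {
    echo "removing sensor: $1" >&2
}

# Hypothetical helper: print the contents of a debugfs file, or remove the
# sensor and return 1 when the file is missing or unreadable.
# No modprobe is attempted, so the outcome does not depend on whether
# module loading is permitted on the system.
read_power_file() {
    if [ ! -r "$2" ]; then
        remove_sensor "$1"
        return 1
    fi
    cat "$2"
}
```

A call such as `soc_power=$(read_power_file "soc_power" "$SOC_POWER_PATH")` would then leave `soc_power` empty and return non-zero when the driver is absent, instead of trying to load it.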
soc_power=$(cat "$SOC_POWER_PATH")
# Remove all the number after the decimal point – it can cause issues in the ipmb
if ! [[ "$soc_power" =~ ^-?[0-9]+(\.[0-9]+)?$ ]]; then
echo "Error: soc_power is not a valid number"
@trindenau - indentation
lanserv/mellanox-bf/set_emu_param.sh
Outdated
else
soc_power=$(cat "$SOC_POWER_PATH")
# Remove all the number after the decimal point – it can cause issues in the ipmb
if ! [[ "$soc_power" =~ ^-?[0-9]+(\.[0-9]+)?$ ]]; then
- Can power be negative? Do we have such a requirement?
- Is the power reading in decimal, not hex?
The power is in decimal, and I got rid of the minus sign.
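With the minus sign dropped, the check accepts only non-negative decimal values. A minimal sketch of such a check follows, assuming a single-line value as read from the debugfs file. `is_valid_power` is a hypothetical name, and `grep -E` is used here in place of the script's `[[ ... =~ ... ]]` so the sketch also runs under a plain POSIX shell.

```shell
# Hypothetical helper: succeed only for a non-negative decimal number such
# as "22" or "22.000". A leading minus sign, hex notation, and empty input
# are all rejected. Assumes a single-line value.
is_valid_power() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)?$'
}
```

For example, `is_valid_power "22.000"` succeeds, while `is_valid_power "-5.0"` and `is_valid_power "0x16"` both fail.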
###################################
# Get power envelope info #
###################################
POWER_ENVELOPE_PATH="/sys/kernel/debug/mlxbf-ptm/monitors/status/power_envelope"
@trindenau - same as above, this needs to be defined in the arch doc.
This is the right path according to the arch doc
lanserv/mellanox-bf/set_emu_param.sh
Outdated
POWER_ENVELOPE_PATH="/sys/kernel/debug/mlxbf-ptm/monitors/status/power_envelope"
if [ ! -f "$POWER_ENVELOPE_PATH" ]; then
#the module loaded in the soc_power routine
echo "Error: power_envelope file still not found after loading module"
@trindenau - indentation
lanserv/mellanox-bf/set_emu_param.sh
Outdated
###################################
POWER_ENVELOPE_PATH="/sys/kernel/debug/mlxbf-ptm/monitors/status/power_envelope"
if [ ! -f "$POWER_ENVELOPE_PATH" ]; then
#the module loaded in the soc_power routine
@trindenau - this comment is not clear. What does it mean?
I got rid of that.
lanserv/mellanox-bf/set_emu_param.sh
Outdated
remove_sensor "soc_power"
else
soc_power=$(cat "$SOC_POWER_PATH")
# Remove all the number after the decimal point – it can cause issues in the ipmb
@trindenau - this is only a check; it doesn't remove anything?
updated
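The review does not show whether "updated" means the comment was reworded or the code was changed to actually drop the fractional part. If truncation were the intent, one possible approach is shell parameter expansion, sketched below. `truncate_decimal` is a hypothetical name and is not taken from the PR.

```shell
# Hypothetical helper: drop everything from the first "." onward using
# POSIX parameter expansion. This truncates; it does not round.
# Input without a decimal point is printed unchanged.
truncate_decimal() {
    printf '%s\n' "${1%%.*}"
}
```

Here `truncate_decimal "22.000"` prints `22`. Running the validity check before truncating keeps malformed input from being passed on.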
lanserv/mellanox-bf/set_emu_param.sh
Outdated
remove_sensor "power_envelope"
else
power_envelope=$(cat "$POWER_ENVELOPE_PATH")
# Remove all the number after the decimal point – it can cause issues in the ipmb
@trindenau - same as above
updated
2c3ac58 to b0d700d
Add `power_envelope` and `soc_power` sensors. Both will derive their values from `debugfs`.
Tested by: ipmitool -I ipmb sensors:
soc_power | 22.000 | Watts | ok | na | 5.000 | na | na | na | na
power_envelope | 65.000 | Watts | ok | na | na | 10.000 | 150.000 | na | na
Or by running the TestRedFishSensorSchema test.
Fixes jira https://redmine.mellanox.com/issues/4016386.
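To turn the manual check in the commit message into something scriptable, the reading column could be pulled out of the pipe-separated sensor rows. This is a rough sketch that assumes the row layout shown in the commit message (name in the first column, reading in the second). `sensor_reading` is a hypothetical helper and is not part of the PR.

```shell
# Hypothetical helper: read pipe-separated sensor rows on stdin and print
# the reading (second column) of the sensor whose name (first column)
# matches the first argument. Surrounding whitespace is stripped.
sensor_reading() {
    awk -F'|' -v want="$1" '
        {
            name = $1; value = $2
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == want) { print value; exit }
        }'
}
```

The sensor listing would be piped into `sensor_reading soc_power`, and the result compared against the value read from `debugfs`.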
b0d700d to 297b205