crush root X not known #44
currently just root buckets work as "take" base, support for any bucket is not yet implemented, unfortunately. |
Hi Jonas,
On 6/2/24 17:31, Jonas Jelten wrote:
currently just root buckets work as "take" base, support for any bucket is not yet implemented, unfortunately.
Thank you for your response. However, both mentioned regions are under
the default root. Shouldn't the balancer then take the default root when
I don't want balancing just within one region, but the entire tree? Am I
missing something?
Best regards,
Michal
|
hmm, strange. so there is a bucket "oldservers" of type root? |
Hi Jonas, "oldservers" and "newservers" are of the type region, but both are descendants of a root. I attached the output from "ceph osd tree" to the first post, where it can be seen. Do you need the output from "ceph osd crush dump"? Thx |
yea that's the problem. they need to be of type root, sorry... but you can probably collect the "root" trees anyway, as a dirty hack, by changing "root" to "region" in the check `if bucket["type_name"] == "root":` that guards `bucket_root_ids.append(id)`.
i have to rewrite the logic for root node collection one day so we can support any top node type... |
Hi Jonas,
I tried the dirty hack and replaced "root" with "region" in the code
(line 2254):
if bucket["type_name"] == "region":
    bucket_root_ids.append(id)
I got a different error:
python3 ./placementoptimizer.py -v balance --max-pg-moves 128
[2024-06-04 21:25:28,345] gathering cluster state via ceph api...
[2024-06-04 21:25:37,602] running pg balancer
Traceback (most recent call last):
  File "./placementoptimizer.py", line 5496, in <module>
    exit(main())
  File "./placementoptimizer.py", line 5490, in main
    run()
  File "./placementoptimizer.py", line 5454, in <lambda>
    run = lambda: balance(args, state)
  File "./placementoptimizer.py", line 4618, in balance
    need_simulation=True)
  File "./placementoptimizer.py", line 3265, in __init__
    self.init_analyzer.analyze(self)
  File "./placementoptimizer.py", line 4288, in analyze
    self._update_stats()
  File "./placementoptimizer.py", line 4385, in _update_stats
    avail, limit_osd = self.pg_mappings.get_pool_max_avail_limited(poolid)
  File "./placementoptimizer.py", line 3974, in get_pool_max_avail_limited
    osdid_candidates = self.cluster.candidates_for_pool(poolid).keys()
  File "./placementoptimizer.py", line 2324, in candidates_for_pool
    candidates = self.candidates_for_root(root_name)
  File "./placementoptimizer.py", line 2286, in candidates_for_root
    raise RuntimeError(f"crush root {root_name} not known?")
RuntimeError: crush root default~hdd not known?
Michal
On 6/4/24 17:38, Jonas Jelten wrote:
yea that's the problem. they need to be of type root, sorry... but you can probably collect the "root" trees anyway by changing "root" to "region" here as a dirty hack:
```
if bucket["type_name"] == "root":  # change to region
    bucket_root_ids.append(id)
```
i have to rewrite the logic for root node collection one day so we can support any top node type...
|
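ah, apparently you also have `default` with type `root`. no idea what will break, but please try `if bucket["type_name"] in ("region", "root"):`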
|
Hi Jonas,
it seems that it works.
Thx
Michal
On 6/6/24 08:41, Jonas Jelten wrote:
ah, apparently you also have `default` with type `root`.
no idea what will break, but please try:
```
if bucket["type_name"] in ("region", "root"):
```
|
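A minimal sketch of why the first hack failed and the second one worked. It is not the code from placementoptimizer.py: the helper names `collect_root_names` and `lookup_root` and the sample bucket list are hypothetical, and it assumes that the CRUSH dump lists the device-class shadow tree `default~hdd` as a bucket of type `root`. Under that assumption, accepting only "region" buckets drops `default~hdd` from the known roots, so a pool whose rule takes `default~hdd` hits the "not known" error; accepting both types keeps it.

```python
# Hypothetical, trimmed-down bucket list; the real data comes from the cluster.
# Assumption: the shadow tree "default~hdd" appears as a bucket of type "root".
buckets = [
    {"id": -1, "name": "default", "type_name": "root"},
    {"id": -2, "name": "default~hdd", "type_name": "root"},
    {"id": -10, "name": "oldservers", "type_name": "region"},
    {"id": -20, "name": "newservers", "type_name": "region"},
    {"id": -30, "name": "host1", "type_name": "host"},
]


def collect_root_names(buckets, accepted_types):
    """Return the names of all buckets whose type is treated as a 'take' base."""
    return {b["name"] for b in buckets if b["type_name"] in accepted_types}


def lookup_root(known_roots, root_name):
    """Mimic the failing check: raise if the pool's root was never collected."""
    if root_name not in known_roots:
        raise RuntimeError(f"crush root {root_name} not known?")
    return root_name


# First hack: only "region" is accepted, so "default~hdd" is missing.
region_only = collect_root_names(buckets, ("region",))

# Second hack: both types are accepted, so the lookup succeeds.
region_and_root = collect_root_names(buckets, ("region", "root"))
lookup_root(region_and_root, "default~hdd")
```

With `region_only`, the same `lookup_root` call raises the RuntimeError seen in the traceback above.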
Hi!
We are using two regions in the CRUSH map, where one is for new servers and the other is for old servers. Both regions are under a single root default. Unfortunately, we are unable to start the balancer. It seems that for some reason, it does not consider the root default and fails at that point.
Attached, you will find the output of "ceph osd tree" and below is the error encountered when attempting to start it. Does anyone have any idea what could be wrong?
We are running version 16.2.12 (code name Pacific).
ceph_osd_tree.txt
Thank you very much for any advice and help.
Michal