Feature request/question on EDR support for FS. #154
The only VMs supporting IB are our HPC SKUs, which are not designed for high-density storage. If you want, you can use these for Lustre so that your compute and storage will be on the same IB fabric. What level of I/O performance are you looking for?
Please consult the following blog post for tips on tuning BeeGFS/BeeOND on Azure.
Thanks for the replies. I'm not an I/O expert. I don't have a quantifiable performance target; I only know that even single-node runs are much worse on the BeeGFS filesystem than on local SSD. Using an HPC SKU for Lustre seems to make sense, but I don't know what I would need to do to make it use the IB interface. If I don't go the IB route, what would you expect the best-performing azurehpc filesystem to be? lustre_full with maybe an increased number of instances? I will look at the BeeGFS docs, but, even though ideally I would, I don't really know the apps' I/O pattern. Thank you for your time; suggestions appreciated.
You can actually use the local disks in the HPC VMs, but it will probably be a fairly expensive way to create a parallel filesystem (since the disks are only 700 GiB per VM). You would just need to make sure you use /dev/sdb when calling the lfsmaster and lfsoss scripts in the install section. This would then allow you to use the IB network. But, if you want compute on that same network, you will need to provision more VMs in the scaleset and set the others up as compute (only VMs within a scaleset can communicate with one another on the IB network). You should only be limited by the network throughput using the Lv2 setup. Can you provide more details about what you are running?
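For illustration, the two install-section entries might then look something like this (a sketch only; the lfsmaster argument list here is an assumption based on the lfsoss entry quoted later in the thread, and the substantive change is just swapping /dev/nvme0n1 for /dev/sdb):
{
    "script": "lfsmaster.sh",
    "args": [
        "/dev/sdb"
    ],
    "tag": "lfsmaster",
    "sudo": true
},
{
    "script": "lfsoss.sh",
    "args": [
        "$(<hostlists/tags/lfsmaster)",
        "/dev/sdb"
    ],
    "tag": "lfsoss",
    "sudo": true
},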
I’m not independently measuring disk performance, but trying to run a performance benchmark on a cluster. My reason for thinking I need better performance is simply that a single-node run done on the BeeGFS filesystem I initially set up performed much worse than one run from a local disk (and the multi-node runs were correspondingly bad). So, I will try a Lustre filesystem and want to make it as good as possible without spending a week on doing so.
I did not know the nodes needed to be in the same scaleset to communicate over IB. That is very helpful. Thank you! I’m still not sure I want to do that, as, obviously, the I/O traffic would interfere with internode communication. I don’t know the application’s behavior well enough to judge the relative traffic levels of I/O and message passing. (In an ideal world, yes, I would, but so it goes.)
Would just increasing the number of LFS OSS servers be a second-best way to improve the PFS I/O performance?
My capacity needs for this filesystem are actually pretty small; it’s all about the performance. So maybe, if I do try to use the IB interface, using /dev/sdb on some of the IB-enabled node types should be sufficient. In the config.json, I would guess that, after changing the vm type, I would only need to change the nvme entry to sdb here:
{
    "script": "lfsoss.sh",
    "args": [
        "$(<hostlists/tags/lfsmaster)",
        "/dev/nvme0n1"
    ],
    "tag": "lfsoss",
    "sudo": true
},
But I don’t know how to make it use the IB interface. Would I have to go in and do that after the fact, or can AzureHPC scripting do that, too? Of course, I would still have to figure out how to get both OSS and compute in the same scale set.
Thank you for any suggestions.
Tim
Super simple question: how would I alter this to get two non-OS SSDs per compute node?
Thanks, Tim
Hi Tim, You can just add:
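Something along these lines in the compute resource of your config.json (the SKU name and the 1024 GiB sizes are only an illustration; adjust to what you need):
"storage_sku": "Premium_LRS",
"data_disks": [1024, 1024]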
The "storage_sku" is the type of storage and the "data_disks" are a list of disks to add (if disk size is less than 4096 it will use caching provided the VM type supports it). For the above it is premium SSD. But, please rememeber, these SSDs are not inside the physical VM. There is still a latency. Best regards, |
The beegfs example doesn't use the IB interface. The lustre example doesn't seem to either. How could I get that to work for either? I'd have to use different nodes for the beegfssm, at least. Failing that, what is the best way to get I/O performance? Just bump the number of instances of beegfssm? Sorry for being a little open-ended with the question. Suggestions appreciated.
Thank you