There are several ways to access the servers:
- eduroam and LAN (all servers at TUM are accessible from within the TUM network)
- Via SSH jump host: recommended for SSH access, required for students
  - We have one proxy jump host that holds all SSH keys added to the NixOS configuration, i.e. in modules/users.nix
  - Reproducible example:
    SSH_AUTH_SOCK= ssh -v -F /dev/null -i <path/to/privkey> -oProxyCommand="ssh [email protected] -i <path/to/privkey> -W %h:%p" <yourusername>@graham.dos.cit.tum.de
  - Keys are uploaded via the machine astrid whenever the NixOS configuration is updated.
  - You can generate an SSH config file for all TUM hosts with this script, passing your username as an argument (a minimal sketch of such a config is shown after this list).
- VPN provided by RBG: recommended for admins
  - this option only works for ls1 employees
  - this VPN also gives access to the management network (i.e. for IPMI access)
  - use the dos profile from here
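For orientation, a hand-written ~/.ssh/config along the same lines as the ProxyCommand example above could look roughly like this (a minimal sketch; the script-generated config may differ, and <yourusername>, <jumpuser> and the key path are placeholders to fill in as in the example above):

```
Host *.dos.cit.tum.de !login.dos.cit.tum.de
    User <yourusername>
    IdentityFile ~/.ssh/id_ed25519
    ProxyJump <jumpuser>@login.dos.cit.tum.de

Host login.dos.cit.tum.de
    User <jumpuser>
    IdentityFile ~/.ssh/id_ed25519
```

With such a config in place, ssh graham.dos.cit.tum.de hops transparently via the jump host.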
All servers at TUM have public IPv6/IPv4 addresses and DNS records following the format $hostname.dos.cit.tum.de for the machine itself and $hostname-mgmt.dos.cit.tum.de for the IPMI/BMC interface, e.g. astrid has the addresses astrid.dos.cit.tum.de and astrid-mgmt.dos.cit.tum.de.
On servers where we import xrdp.nix, we have graphical access via xrdp. This is mainly useful for Xilinx development.
Users need to have xrdpAccess set to true in their account entry in ../modules/users.
After that, run the following command:
$ inv generate-password --user <USER>
Send the password in <USER>-password to the student and store <USER>-password-hash in ./modules/users/xrdp-passwords.yml by doing:
$ sops ./modules/users/xrdp-passwords.yml
You may have to restart xrdp-sesman.service for the changes to apply.
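Any RDP client can then connect; as a rough usage sketch with the FreeRDP client (assuming the default xrdp port 3389; the hostname and user are placeholders):

```
# connect to the xrdp session on a server that imports xrdp.nix
xfreerdp /v:<hostname>.dos.cit.tum.de:3389 /u:<USER>
```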
BIOS and the boot flow can be accessed/observed via "Remote Console" on the IPMI web interfaces.
- use the il01 VPN (see Accessing the servers; staff only)
- go to https://$hostname-mgmt.dos.cit.tum.de
- login credentials are encrypted in the doctor cluster repo: sops secrets.yaml
To be able to accept self-signed certificates in Firefox, go to about:config and set network.stricttransportsecurity.preloadlist to false.
- Expansion cards and slots
- Network graph (see also networking notes in "Expansion cards and slots")
Our EPYC servers are shared machines on which many users usually work concurrently.
- single NUMA node (EPYC 7713P):
- single NUMA node (EPYC 9654P)
- dual NUMA node (EPYC 7413, for many expansion cards)
- dual NUMA node (AMD EPYC 9334)
Those servers (or individual devices) are sometimes used exclusively by a single user to conduct benchmarks.
- single socket Xeon Gold 5317
- dual socket Xeon Gold 6326, GPU
- dual socket Xeon Gold 6438Y+, CXL support
- dual socket Xeon Platinum 8562Y+, TDX support
Note: these servers are equipped with Persistent Memory (PM). For information on how to set up the PM in App-Direct mode, please see here
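For orientation, provisioning the PM in App-Direct mode with the standard tooling roughly follows the sketch below (assuming ipmctl and ndctl are available and all modules go into one interleaved region; the linked instructions remain authoritative):

```
# configure all persistent memory as an App-Direct goal (takes effect after a reboot)
ipmctl create -goal PersistentMemoryType=AppDirect
# after rebooting, expose the region as an fsdax namespace (/dev/pmem0 or similar)
ndctl create-namespace --mode=fsdax
# put a DAX-capable filesystem on it and mount it (device and mountpoint are examples)
mkfs.ext4 /dev/pmem0
mount -o dax /dev/pmem0 /mnt/pmem
```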
These serve as GitHub Actions runners for the Systemprogramming and Cloud Systems Lab courses. Astrid also hosts the buildbot master server, with Graham as the buildbot worker.
- yasmin
- We have an M1 Mac Mini in Patric's office with a macOS/Linux dual boot
- ace (Morello)
Each of these machines is equipped with an Alveo U50 FPGA card. These servers are manually managed by @atsushikoshiba. They run Ubuntu, which means that accounts/SSH keys added to this repo won't appear on those machines. These machines are also not backed up.
- RBG VMs:
- monitoring.dos.cit.tum.de (VM), doctor.r (container in VM) doctor.nix: borg backup target, monitoring
- login.dos.cit.tum.de README: ssh jumphost
- dosvm1.cit.tum.de: pxeboot
We have a shared NFS-based /home mounted. The NFS for /home is backed by an NVMe
disk on mickey and is limited to 1.5TB.
Please do not store large amounts of data such as VM images here. VM images of
running VMs will also interfere with the backup software.
Instead, if you need fast local disk access, use /scratch/$YOURUSER - however, unlike /home and /share, this directory is not included in the backup. If you want to share larger datasets between machines, use /share, which is backed by two hard disks (15TB capacity).
Both NFS exports stored on mickey are also replicated to dan every 15 minutes using ZFS replication based on syncoid.
In case there are hardware problems with mickey, dan can take over serving the NFS.
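For context, each replication run amounts to a syncoid invocation roughly like the following (a sketch with hypothetical pool/dataset names; the real dataset names and options may differ):

```
# incrementally send the home export from mickey to the standby server dan
syncoid tank/export/home root@dan.dos.cit.tum.de:tank/export/home
```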
Our NFS servers allow connections from the 2a09:80c0:102::/64 network.
Add the following line to /etc/hosts:
2a09:80c0:102::f000:0 nfs
And the following lines to /etc/fstab to mount the shared /home and /share:
nfs:/export/home /home nfs4 nofail,timeo=14 0 2
nfs:/export/share /share nfs4 nofail,timeo=14 0 2
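After adding these entries, the mounts can be activated without a reboot, for example like this (assuming the mountpoints do not exist yet):

```
# create the mountpoints and mount the new fstab entries
sudo mkdir -p /home /share
sudo mount /home
sudo mount /share
```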
ZFS is used on all machines whenever possible. We enable automatic snapshots of
the filesystem every 15 minutes. The snapshots can be accessed by entering the
.zfs directory of a ZFS dataset mountpoint.
- for NFS-mounted directories, snapshots are on the NFS master node (nardole?, /export/home/.zfs or /export/share/.zfs)
- for local ZFS datasets (zfs list), snapshots are at /.zfs, /home/.zfs, ...
- note that .zfs is not seen by ls
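As a usage sketch, restoring an accidentally deleted file from a snapshot looks roughly like this (snapshot and file names are placeholders; check the snapshot directory for the actual names):

```
# list available snapshots of the home dataset
ls /home/.zfs/snapshot/
# copy a file out of a chosen snapshot back into the live filesystem
cp /home/.zfs/snapshot/<snapshot-name>/<user>/<file> /home/<user>/<file>.restored
```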
Furthermore, /share and /home are backed up daily to RBG storage using
borgbackup. See also the NixOS borg wiki.
[root@nardole:/home/okelmann]# sudo su
[root@nardole:/home/okelmann]# eval $(ssh-agent)
[root@nardole:/home/okelmann]# ssh-add /run/secrets/tum-borgbackup-home-ssh
[root@nardole:/home/okelmann]# borg-job-eva-home list
nardole-eva-home-2022-12-28T00:00:00 Wed, 2022-12-28 00:00:05 [aca815ff996515a1b06c53e6363cff34fbdefaefda54b498fe1b579daeb97cff]
nardole-eva-home-2023-02-14T00:00:00 Tue, 2023-02-14 00:00:07 [5024321057e1da6b6664f88d1ab72340cc8a0d6c41e572cb24023bd73ba9f0d5]
nardole-eva-home-2023-02-15T00:00:00 Wed, 2023-02-15 00:00:07 [85aac6717e3f1835c7e4bb79e5d8dc9d2dde99db32e21851ada29b071e0f3aca]
[root@nardole:/home/okelmann]# borg-job-eva-home mount [email protected]:/mnt/backup/nfs-home::nardole-eva-home-2023-02-15T00:00:00
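Note that borg mount also expects a mountpoint as its final argument; browsing and cleaning up afterwards could look like this (a sketch using a hypothetical /mnt/borg, assuming the borg-job wrapper pre-sets the repository; otherwise spell out the repository as in the mount command above):

```
# mount the archive on a temporary mountpoint, inspect it, then unmount it
mkdir -p /mnt/borg
borg-job-eva-home mount ::nardole-eva-home-2023-02-15T00:00:00 /mnt/borg
ls /mnt/borg
borg-job-eva-home umount /mnt/borg
```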
Our chair currently has the following networks:
- il01: for devices in the office
- ruby: firewall port 22 open (only for the login jumphost)
- il01_16: for the servers
  - open to il01 (and VPN)
  - usually 10Gbit/s SFP+ connectors for fiber
  - ipv4: 131.159.102.0/24
  - ipv6: 2a09:80c0:102::/64
  - open to il01_15
- il01_15: for management
  - open only to the il01 VPN provided by RBG
  - usually 1Gbit/s RJ-45
  - ipv4: 172.24.90.0/24
- il01_14: for internal connections
  - closed, internal network, private IPs
  - ipv4: 172.24.89.0/24
- LRZ's eduVPN:
  - open to the Münchner Wissenschaftsnetz
  - ipv4: 10.0.0.0/8
  - via the eduVPN client (lrz eduvpn guide)
  - or via an OpenVPN client (certificate expires every few months): tum.eduvpn.lrz.de
- TUM-ITO dosvpn:
  - may access:
    - 131.159.0.0/16
    - 192.187.0.0/16
    - 10.0.0.0/8
    - 172.24.0.0/17
  - ipv4: 172.24.238.0 (/24 maybe?)
  - may access:
    - LRZ: 192.187.0.0/16
    - TUM-ITO/RBG: 131.159.0.0/16
    - TUM-ITO/RBG private: 172.24.0.0/16
    - TUM-ITO/RBG vlans: 172.24.0.0/17
- L3 Switch "Adric"
adric-mgmt.dos.cit.tum.de
- see adric.md
- FibreStore N8550-32C switch with Broadcom BCM56870 Trident III chip
- 32x 100G QSFP
- 2x 10G SFP+
- 8 core Intel Xeon D-1518 @ 2.20GHz
- subnet: 10.0.0.0/24
- management via ssh with username admin
- Retired: L3 Switch "Craig"
craig-mgmt.dos.cit.tum.de
(sops encrypted (config)[./craig.sops])- 6x 100Gbit/s QSFP
- many 10Gbit/s SFP+
- ip: 172.24.90.18
- vlan example config (layer2->static vlan config)
- vlan id: 1; untagged ports: Fx0/1-48,Cx0/1-2,Cx0/4; forbidden ports: Cx0/3,Cx0/5;
- vlan id: 2; vlan name: vlan2; untagged ports: Cx0/3,Cx0/5; forbidden ports: Fx0/1-48,Cx0/1-2,Cx0/4;
To add a new machine, send the MAC address of your host interface and your IPMI/management interface to [email protected].
If the RBG group asks which networks to connect your machine to, tell them il01_16 for the machine and il01_15 for IPMI/BMC.
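To find these MAC addresses, something along the following lines usually works (a sketch; interface names and the IPMI channel number differ between machines):

```
# MAC addresses of the host's network interfaces
ip link show
# MAC address of the IPMI/BMC interface (see the "MAC Address" field)
ipmitool lan print 1
```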
A graph of how the servers are connected right now can be found here.