Scripts for setting up an example CernVM-FS system. These scripts can be used to deploy the following servers:
- CernVM-FS Stratum 0
- CernVM-FS Stratum 1 replicas
- CernVM-FS caching proxies
- CernVM-FS clients
The CernVM File System (CernVM-FS) is a distributed file system. Files are stored efficiently using content-addressable storage. Files are transferred efficiently using a series of distributed caches, which also provide redundancy.
- CernVM-FS repositories are deployed as a single Stratum 0 central server, which is the only place where files in the repository are modified.
- The Stratum 1 replicas are full copies of the Stratum 0 repository.
- The caching proxies may contain cached copies of the files that have been accessed through them.
- The clients make the repositories available to their users, mounted under the /cvmfs directory using autofs. The clients access the repositories as a read-only file system.
These scripts are not intended for use in a production system.
The scripts can be run individually on the hosts. But for testing, a quick way to run them is to use the "reset-all" command in the test/cvmfs-test.sh script.
See test/README.md for details.
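For example, assuming the command is run from the top of a checkout of this repository, all the test hosts might be torn down and redeployed with:

[control]$ ./test/cvmfs-test.sh reset-all

Here "[control]" stands for whichever machine is used to drive the tests.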
This example shows the use of these scripts to create and use a CernVM-FS repository called data.example.org.
It is assumed there are four hosts:
- 10.0.0.1 for the Stratum 0
- 10.1.1.1 for a Stratum 1
- 10.2.2.2 for a caching proxy
- 10.3.3.3 for a client (with any other clients being in the range 10.3.3.0/24)
**Important:** the 10.x.x.x addresses are just examples for this document. You must use addresses that are correct for your network.
For simplicity, only one Stratum 1 replica and one caching proxy are used in this example. But multiple ones can be deployed for improved redundancy and performance.
First, copy the four scripts to their respective hosts.
When running the scripts, please be patient: they take several minutes to run, because they need to download and install several software packages.
Note: all the scripts support the --help option, which prints a brief description of the available options.
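For example:

[stratum-0]$ ./cvmfs-stratum-0-setup.sh --help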
Create the Stratum 0 central server by running:
[stratum-0]$ sudo ./cvmfs-stratum-0-setup.sh data.example.org
The fully qualified names of the repositories to create are provided as arguments ("data.example.org" in this case).
Keys for the repository will be generated. Copy the public key for the created repository (from "/etc/cvmfs/keys/data.example.org.pub") to the Stratum 1 host(s) and to the client host(s).
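For example, one way to copy the key is with scp (assuming SSH access to the hosts; "user" is a placeholder account):

[stratum-0]$ scp /etc/cvmfs/keys/data.example.org.pub user@10.1.1.1:
[stratum-0]$ scp /etc/cvmfs/keys/data.example.org.pub user@10.3.3.3: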
The Stratum 1 host(s) must be able to connect to port 80 on the Stratum 0 host.
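A quick way to check that connectivity, before running the Stratum 1 setup script, is to fetch the repository manifest from the Stratum 0 host (this assumes curl is installed; ".cvmfspublished" is the manifest file that CernVM-FS serves over plain HTTP):

[stratum-1]$ curl http://10.0.0.1/cvmfs/data.example.org/.cvmfspublished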
Create a Stratum 1 replica by running:
[stratum-1]$ sudo ./cvmfs-stratum-1-setup.sh \
--stratum-0 10.0.0.1 \
--servername 10.1.1.1 \
--refresh 2 \
data.example.org.pub
The optional --servername option sets the ServerName in the Apache Web Server configuration.
The --refresh value is used as the step value in the minutes field of the cron job that refreshes the replica from the Stratum 0 repository. It is set to 2 minutes here, so changes are propagated more quickly.
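To illustrate, with --refresh 2 the minutes field of the cron job will be "*/2". A sketch of what such an /etc/cron.d entry could look like is below; the exact command and options installed by the script may differ:

*/2 * * * * root cvmfs_server snapshot data.example.org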
The file name of the public key should be the fully qualified repository name followed by a ".pub" extension. The script uses the basename of the file name without the ".pub" extension as the fully qualified repository name. If the file name is different, the argument must be the fully qualified repository name followed by a colon and the file name (e.g. "data.example.org:pubkey.pem").
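For example, if the public key had been saved as "pubkey.pem", the invocation would be:

[stratum-1]$ sudo ./cvmfs-stratum-1-setup.sh \
    --stratum-0 10.0.0.1 \
    data.example.org:pubkey.pem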
The proxy host(s) must be able to connect to ports 80 and 8000 on the Stratum 1 host(s).
Create a proxy by running:
[proxy]$ sudo ./cvmfs-proxy-setup.sh --stratum-1 10.1.1.1 10.3.3.0/24
The proxy only needs to know about the Stratum 1 replicas and which client hosts are allowed to use the proxy. It does not need to know where the Stratum 0 host is. It also does not need to know what the repositories are (since that information will be in the requests from the clients).
The client host(s) must be able to connect to port 3128 on the proxy host(s). That is the conventional port used for CernVM-FS caching proxies, but it can be changed via a command line option on the script.
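Before setting up the client, connectivity through the proxy can be checked by fetching the repository manifest from the Stratum 1 replica via the proxy (assuming curl is installed on the client host):

[client]$ curl --proxy http://10.2.2.2:3128 \
    http://10.1.1.1/cvmfs/data.example.org/.cvmfspublished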
Create a client by running:
[client]$ sudo ./cvmfs-client-setup.sh \
--stratum-1 10.1.1.1 --proxy 10.2.2.2 --no-geo-api data.example.org.pub
The --no-geo-api option is required, because the Stratum 1 server was not configured with a Geo API license key. To use the Geo API, a license key needs to be obtained from MaxMind and used to configure the Stratum 1 replica.
As with the Stratum 1 setup script, the file name of the public key should be the fully qualified repository name followed by a ".pub" extension; otherwise, the argument must be the fully qualified repository name followed by a colon and the file name.
Initially, there are no mount points under the /cvmfs directory, since the mounts are using autofs. The repositories will be automatically mounted when they are accessed (and are automatically unmounted when not used).
[client]$ ls /cvmfs
[client]$ ls /cvmfs/data.example.org
new_repository
[client]$ ls /cvmfs
data.example.org
The file called "new_repository" is automatically created in all new repositories (when they are created on the Stratum 0). It makes testing easier, since there is something to see, but can be deleted from the repository.
Add or modify files in the repository by starting a transaction, making the changes and then publishing the changes.
[stratum-0]$ cvmfs_server transaction data.example.org
[stratum-0]$ echo "Hello world!" > /cvmfs/data.example.org/README.txt
[stratum-0]$ cvmfs_server publish data.example.org
If the above commands are run as the repository user, root privileges are not required.
To close a transaction without publishing (i.e. discarding any changes), run cvmfs_server abort.
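For example, a transaction that is started and then discarded:

[stratum-0]$ cvmfs_server transaction data.example.org
[stratum-0]$ cvmfs_server abort data.example.org

The abort command asks for confirmation; its -f option skips the prompt.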
Wait for about 3 minutes and then check for the changes to appear on the client. The cron job on the Stratum 1 host (to update/snapshot the repository) was set to run every 2 minutes, and a little more time is needed for it and the proxy and client caches to update.
[client]$ ls /cvmfs/data.example.org
new_repository README.txt
[client]$ cvmfs_config stat -v
The monitor-file.sh script can be used to detect when a file changes in the client, instead of manually waiting for it to change.
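The interface of monitor-file.sh is not described here. Assuming it takes the file to watch as an argument (check its --help for the actual interface), a hypothetical invocation would be:

[client]$ ./monitor-file.sh /cvmfs/data.example.org/README.txt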
To use repositories from more than one organisation, configure the proxies and clients to support them.
To support multiple organisations, provide all the Stratum 1 hosts to the cvmfs-proxy-setup.sh script.
The proxy does not make a distinction between which Stratum 1 host is used for which organisation's repositories.
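For example, assuming the --stratum-1 option can be repeated, and that a second organisation's Stratum 1 replica is at the hypothetical address 10.11.11.11:

[proxy]$ sudo ./cvmfs-proxy-setup.sh \
    --stratum-1 10.1.1.1 \
    --stratum-1 10.11.11.11 \
    10.3.3.0/24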
To support multiple organisations, run the cvmfs-client-setup.sh script multiple times: once for each organisation. Configure only one organisation per run. That is, the Stratum 1 hosts and repositories must be for the same organisation.
The client does require the repositories to be associated with their correct Stratum 1 hosts.
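For example, to also use repositories from a hypothetical second organisation whose Stratum 1 replica is at 10.11.11.11 and whose public key was saved as tools.example.com.pub, run the script a second time:

[client]$ sudo ./cvmfs-client-setup.sh \
    --stratum-1 10.11.11.11 --proxy 10.2.2.2 --no-geo-api tools.example.com.pub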
The scripts only work on Linux, since they use the yum or apt-get package managers to install the CernVM-FS software.
The setup scripts have been tested on:
- CentOS 7
- CentOS 8
- CentOS Stream 8
- CentOS Stream 9
- Ubuntu 20.04
- Ubuntu 20.10
- Ubuntu 21.04
The Stratum 0 host must allow access to its port 80 from the Stratum 1 hosts.
The Stratum 1 hosts must allow access to their ports 80 and 8000 from the proxy hosts.
The proxy hosts must allow access to their port 3128 from the clients.
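How the ports are opened depends on the host's firewall. For example, on a CentOS Stratum 1 host using firewalld, they might be opened with:

[stratum-1]$ sudo firewall-cmd --permanent --add-port=80/tcp --add-port=8000/tcp
[stratum-1]$ sudo firewall-cmd --reload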
The scripts normally output a brief description of what they are installing or configuring. If the --quiet option is specified (and --verbose is not specified), no output will be produced unless there is an error.
The output from the cvmfs_server command will be printed out if the scripts are run in very verbose mode: by specifying the -v option twice. This is only useful for the Stratum 0 and Stratum 1 scripts.
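For example, a very verbose run of the Stratum 0 setup script:

[stratum-0]$ sudo ./cvmfs-stratum-0-setup.sh -v -v data.example.org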
Extra help information is displayed when -v is used along with -h. That extra information is not related to the script, but is a reminder of some useful/related CernVM-FS commands.
- Installs the Squid package from the distribution, which may be old and deprecated. A production deployment should use a newer version of Squid.
- Only one organisation is supported. But multiple repositories under that organisation are possible.
If a setup script fails, try running the same setup script again: it seems to work the second time it is run.
See troubleshooting.md for details.
This work is supported by the Australian BioCommons which is enabled by NCRIS via Bioplatforms Australia funding.