All notable changes to this project will be documented in this file.
- The lifetime of auto-generated TLS certificates is now configurable with the role and roleGroup config property `requestedSecretLifetime`. This helps reduce frequent Pod restarts (#501).
- Run a `containerdebug` process in the background of each Spark container to collect debugging information (#508).
- Make spark-env.sh configurable via `configOverrides` (#473).
- The Spark history server can now serve logs from HDFS-compatible systems (#479).
- The operator can now run on Kubernetes clusters using a non-default cluster domain. Use the env var `KUBERNETES_CLUSTER_DOMAIN` or the operator Helm chart property `kubernetesClusterDomain` to set a non-default cluster domain (#480).
- Reduce CRD size from `1.2MB` to `103KB` by accepting arbitrary YAML input instead of the underlying schema for the following fields (#450):
  - `podOverrides`
  - `affinity`
  - `volumes`
  - `volumeMounts`
- Update tests and docs to Spark version 3.5.2 (#459)
- BREAKING: The fields `connection` and `host` on `S3Connection` as well as `bucketName` on `S3Bucket` are now mandatory (#472).
- Fix `envOverrides` for SparkApplication and SparkHistoryServer (#451).
- Ensure SparkApplications can only create a single submit Job. Fix for #457 (#460).
- Invalid `SparkApplication`/`SparkHistoryServer` objects don't cause the operator to stop functioning (#482).
- Support for Spark versions 3.4.2 and 3.4.3 has been dropped (#459).
- BREAKING (behaviour): Specified CPU resources are now applied correctly (instead of being rounded up to the next whole number). This might affect your jobs, as they now request, for example, only 200m CPU instead of the 1000m they had so far, meaning they might slow down significantly (#408).
- Fixed processing of corrupted log events: if errors occur, the error messages are added to the log event (#412).
- Helm: support labels in values.yaml (#344).
- Support version `3.5.1` (#373).
- Support version `3.4.2` (#357).
- `spec.job.config.volumeMounts` property to easily mount volumes on the job pod (#359)
- Various documentation of the CRD (#319).
- [BREAKING] Removed version field. Several attributes have been changed to mandatory. While this change is technically breaking, existing Spark jobs would not have worked before as these attributes were necessary (#319).
- [BREAKING] Remove `userClassPathFirst` properties from `spark-submit`. This is an experimental feature that was introduced to support logging in XML format. The side effect of this removal is that the vector agent cannot aggregate output from the `spark-submit` containers. On the other hand, it enables dynamic provisioning of Java packages (such as Delta Lake) with Stackable stock images, which is much more important (#355).
- Add missing `deletecollection` RBAC permission for Spark drivers. Previously this caused confusing error messages in the spark driver log (`User "system:serviceaccount:default:my-spark-app" cannot deletecollection resource "configmaps" in API group "" in the namespace "default".`) (#313).
- Default stackableVersion to operator version. It is recommended to remove `spec.image.stackableVersion` from your custom resources (#267, #268).
- Configuration overrides for the JVM security properties, such as DNS caching (#272).
- Support PodDisruptionBudgets for HistoryServer (#288).
- Support for versions 3.4.1, 3.5.0 (#291).
- History server now exports metrics via jmx exporter (port 18081) (#291).
- Document graceful shutdown (#306).
- `vector`: `0.26.0` -> `0.33.0` (#269, #291).
- `operator-rs`: `0.44.0` -> `0.55.0` (#267, #275, #288, #291).
- Removed usages of SPARK_DAEMON_JAVA_OPTS since it's not a reliable way to pass extra JVM options (#272).
- [BREAKING] use product image selection instead of version (#275).
- [BREAKING] refactored application roles to use `CommonConfiguration` structures from the operator framework (#277).
- Let secret-operator handle certificate conversion (#286).
- Extended resource-usage documentation (#297).
- Removed support for versions 3.2.1, 3.3.0 (#291).
- Generate OLM bundle for Release 23.4.0 (#238).
- Add support for Spark 3.4.0 (#243).
- Add support for using custom certificates when accessing S3 with TLS (#247).
- Use bitnami charts for testing S3 access with TLS (#247).
- Set explicit resources on all containers (#249).
- Support pod overrides (#256).
- `operator-rs`: `0.38.0` -> `0.44.0` (#235, #259).
- Use 0.0.0-dev product images for testing (#236).
- Use testing-tools 0.2.0 (#236).
- Run as root group (#241).
- Added kuttl test suites (#252).
- Fix quoting issues when spark config values contain spaces (#243).
- Increase the size limit of log volumes (#259).
- Typo in executor cpu limit property (#263).
- [BREAKING] Support specifying Service type for HistoryServer. This enables us to later switch to using `ListenerClasses` for the exposure of Services without another breaking change. This change is breaking because, for security reasons, we default to the `cluster-internal` `ListenerClass`. If you need your cluster to be accessible from outside of Kubernetes, you need to set `clusterConfig.listenerClass` to `external-unstable` or `external-stable` (#228).
- [BREAKING]: Dropped support for old `spec.{driver,executor}.nodeSelector` field. Use `spec.{driver,executor}.affinity.nodeSelector` instead (#217)
- Revert openshift settings (#207)
- BUGFIX: assign service account to history pods (#207)
- Merging and validation of the configuration refactored (#223)
- `operator-rs`: `0.36.0` → `0.38.0` (#223)
- Create and manage history servers (#187)
- Updated stackable image versions (#176)
- `operator-rs`: `0.22.0` → `0.27.1` (#178)
- `operator-rs`: `0.27.1` -> `0.30.2` (#187)
- Don't run init container as root and avoid chmod and chowning (#183)
- [BREAKING] Implement fix for S3 reference inconsistency as described in the issue #162 (#187)
- Bumped image to `3.3.0-stackable0.2.0` in tests and docs (#145)
- BREAKING: use resource limit struct instead of passing spark configuration arguments (#147)
- Fixed resources test (#151)
- Fixed inconsistencies with resources usage (#166)
- Add Getting Started documentation (#114).
- Add missing role to read S3Connection and S3Bucket objects (#112).
- Update annotation due to update to rust version (#114).
- Update RBAC properties for OpenShift compatibility (#126).
- Include chart name when installing with a custom release name (#97)
- Pinned MinIO version for tests (#100)
- `operator-rs`: `0.21.0` → `0.22.0` (#102).
- Added owner-reference to pod templates (#104)
- Added kuttl test for the case when pyspark jobs are provisioned using the `image` property of the `SparkApplication` definition (#107)
- BREAKING: Use current S3 connection/bucket structs (#86)
- Add node selector to top-level job and specify node selection in PVC-relevant tests (#90)
- Update kuttl tests to use Spark 3.3.0 (#91)
- Bugfix for duplicate volume mounts in PySpark jobs (#92)
- Added new fields to govern image pull policy (#75)
- New `nodeSelector` fields for both the driver and the executors (#76)
- Mirror driver pod status to the corresponding spark application (#77)
- Updated examples (#71)