Standard Attributes

.. toctree::
   :caption: Standard Attributes

OpenEXR files store metadata in attributes. The attributes marked below as "required" are present in every .exr file and specify values essential to every image.

The optional attributes store extra information in a conventional form. These tables give the presumed definitions of the most common data associated with .exr image files.

Basic Attributes

attribute name type definition

displayWindow

required

Box2i

The boundaries of an OpenEXR image in pixel space. The display window is defined by the coordinates of the pixels in the upper left and lower right corners. See Overview of the OpenEXR File Format for more details.

dataWindow

required

Box2i

An OpenEXR file may not have pixel data for all the pixels in the display window, or the file may have pixel data beyond the boundaries of the display window. The region for which pixel data are available is defined by a second axis-parallel rectangle in pixel space, the data window. See Overview of the OpenEXR File Format for more details.
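As a minimal sketch of how the two windows relate, the following assumes that a Box2i stores inclusive minimum and maximum pixel coordinates, as the description of the corner pixels above implies. The `Box2i` tuple, `box_size`, and `intersect` are hypothetical helpers written for illustration; they are not part of the OpenEXR API.

```python
from typing import NamedTuple, Optional, Tuple


class Box2i(NamedTuple):
    """Axis-parallel integer rectangle; both corners are inclusive."""
    min_x: int
    min_y: int
    max_x: int
    max_y: int


def box_size(box: Box2i) -> Tuple[int, int]:
    # Both corner pixels belong to the box, hence the "+ 1".
    return (box.max_x - box.min_x + 1, box.max_y - box.min_y + 1)


def intersect(a: Box2i, b: Box2i) -> Optional[Box2i]:
    """Return the overlap of two boxes, or None if they do not overlap."""
    r = Box2i(max(a.min_x, b.min_x), max(a.min_y, b.min_y),
              min(a.max_x, b.max_x), min(a.max_y, b.max_y))
    if r.min_x > r.max_x or r.min_y > r.max_y:
        return None
    return r


# Hypothetical values: an HD display window, and a data window that
# extends beyond it on three sides but stops short on the right.
display_window = Box2i(0, 0, 1919, 1079)
data_window = Box2i(-16, -16, 1000, 1095)

visible = intersect(display_window, data_window)
print(box_size(display_window))  # size of the display window in pixels
print(visible, box_size(visible))
```

The intersection is the part of the display window for which pixel data are actually stored; how an application fills the rest is outside the scope of this sketch.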

originalDataWindow

Box2i

If application software crops an image, then it should save the data window of the original, un-cropped image in the originalDataWindow attribute.

pixelAspectRatio

required

float

Width divided by height of a pixel when the image is displayed with the correct aspect ratio. A pixel's width (height) is the distance between the centers of two horizontally (vertically) adjacent pixels on the display.

screenWindowCenter

required

V2f

The screenWindowCenter and screenWindowWidth describe the perspective projection that produced the image. Programs that deal with images as purely two-dimensional objects may not be able to generate a description of a perspective projection. Those programs should set screenWindowWidth to 1, and screenWindowCenter to (0, 0).

screenWindowWidth

required

float

lineOrder

required

LineOrder

Specifies the order in which the scan lines are stored in the file:

  • INCREASING_Y - first scan line has lowest y coordinate
  • DECREASING_Y - first scan line has highest y coordinate
  • RANDOM_Y - only for tiled files; tiles are written in random order

compression

required

Compression

Specifies the compression method applied to the pixel data of all channels in the file.

  • NO_COMPRESSION - no compression
  • RLE_COMPRESSION - run length encoding
  • ZIPS_COMPRESSION - zlib compression, one scan line at a time
  • ZIP_COMPRESSION - zlib compression, in blocks of 16 scan lines
  • PIZ_COMPRESSION - piz-based wavelet compression
  • PXR24_COMPRESSION - lossy 24-bit float compression
  • B44_COMPRESSION - lossy 4-by-4 pixel block compression, fixed compression rate
  • B44A_COMPRESSION - lossy 4-by-4 pixel block compression, flat fields are compressed more
  • DWAA_COMPRESSION - lossy DCT based compression, in blocks of 32 scanlines. More efficient for partial buffer access.
  • DWAB_COMPRESSION - lossy DCT based compression, in blocks of 256 scanlines. More efficient space wise and faster to decode full frames than DWAA_COMPRESSION.

channels

required

ChannelList

A description of the image channels stored in the file.

Multi-Part and Deep Data

attribute name type definition

name

required for multi-part images

string

The name attribute defines the name of each part. The name of each part must be unique. Names may contain '.' characters to present a tree-like structure of the parts in a file.

type

required for multi-part images

string

Data types are defined by the type attribute. There are four types:

  • Scan line images: indicated by a value of scanlineimage
  • Tiled images: indicated by a value of tiledimage
  • Deep scan line images: indicated by a value of deepscanline
  • Deep tiled images: indicated by a value of deeptile

version

required for multi-part images

int

Version 1 data for all part types is described in the section on the OpenEXR File Layout.

chunkCount

required for multi-part images

int

Indicates the number of chunks in this part. Required if the multipart bit (12) is set. This attribute is created automatically by the library; the user does not need to compute it.
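The library fills in chunkCount, so the following is only an illustration of the arithmetic for a scan-line part. It assumes that each chunk holds one block of scan lines and uses the block heights listed under the compression attribute above. The table and function names are hypothetical, and tiled parts (where chunks correspond to tiles) are not covered.

```python
# Scan lines per chunk, taken from the compression descriptions above.
# Only the methods whose block height is stated there are included.
SCANLINES_PER_CHUNK = {
    "ZIPS_COMPRESSION": 1,
    "ZIP_COMPRESSION": 16,
    "DWAA_COMPRESSION": 32,
    "DWAB_COMPRESSION": 256,
}


def scanline_chunk_count(min_y: int, max_y: int, compression: str) -> int:
    """Number of chunks needed for a scan-line part.

    min_y and max_y are the inclusive vertical bounds of the data window.
    """
    height = max_y - min_y + 1
    lines = SCANLINES_PER_CHUNK[compression]
    # Ceiling division: the last chunk may hold fewer scan lines.
    return (height + lines - 1) // lines


for method in SCANLINES_PER_CHUNK:
    print(method, scanline_chunk_count(0, 1079, method))
```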

tiles

required for multi-part images

TileDescription

This attribute is required only for tiled files. It specifies the size of the tiles, and the file's level mode.

Position and Orientation

These attributes describe the position and orientation of the physical or CG camera at the time of capture. All are optional.

attribute name type definition
worldToCamera M44f

For images generated by 3D computer graphics rendering, a matrix that transforms 3D points from the world to the camera coordinate space of the renderer.

The camera coordinate space is left-handed. Its origin indicates the location of the camera. The positive x and y axes correspond to the "right" and "up" directions in the rendered image. The positive z axis indicates the camera's viewing direction. (Objects in front of the camera have positive z coordinates.)

Camera coordinate space in OpenEXR is the same as in Pixar's RenderMan.

worldToNDC M44f

For images generated by 3D computer graphics rendering, a matrix that transforms 3D points from the world to the Normalized Device Coordinate (NDC) space of the renderer.

NDC is a 2D coordinate space that corresponds to the image plane, with positive x pointing to the right and positive y pointing down. The coordinates (0, 0) and (1, 1) correspond to the upper left and lower right corners of the OpenEXR display window.

To transform a 3D point in world space into a 2D point in NDC space, multiply the 3D point by the worldToNDC matrix and discard the z coordinate.

NDC space in OpenEXR is the same as in Pixar's RenderMan.
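The transformation described above can be sketched as follows. Two assumptions are made that the text does not spell out: the point is treated as a row vector multiplied on the left of the matrix, and the result is divided by its homogeneous w coordinate. The matrix used here is a made-up toy projection, not one taken from a real renderer.

```python
from typing import Sequence, Tuple

Matrix44 = Sequence[Sequence[float]]


def world_to_ndc(point: Tuple[float, float, float],
                 m: Matrix44) -> Tuple[float, float]:
    """Transform a 3D world-space point to 2D NDC.

    Assumes the row-vector convention (point * matrix) followed by a
    homogeneous divide; the z coordinate of the result is discarded.
    """
    src = (point[0], point[1], point[2], 1.0)
    out = [sum(src[i] * m[i][j] for i in range(4)) for j in range(4)]
    w = out[3]
    if w == 0.0:
        raise ValueError("point cannot be projected (w == 0)")
    return (out[0] / w, out[1] / w)


# Hypothetical worldToNDC matrix: the camera sits at the world origin,
# looks along +z, and the y axis is flipped so that NDC y points down.
toy_world_to_ndc = [
    [0.5,  0.0, 0.0, 0.0],
    [0.0, -0.5, 0.0, 0.0],
    [0.5,  0.5, 1.0, 1.0],
    [0.0,  0.0, 0.0, 0.0],
]

print(world_to_ndc((0.0, 0.0, 5.0), toy_world_to_ndc))   # centre of the image
print(world_to_ndc((-2.0, 2.0, 2.0), toy_world_to_ndc))  # upper left corner
print(world_to_ndc((2.0, -2.0, 2.0), toy_world_to_ndc))  # lower right corner
```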

sensorCenterOffset V2f

Horizontal and vertical distances, in microns, of the center of the light-sensitive area of the camera's sensor from a point on that sensor where a sensor surface normal would intersect the center of the lens mount.

When compared to an image captured with a perfectly centered sensor, an image where both horizontal and vertical distances were positive would contain more content holding what was at the right and what was at the bottom of the scene being captured.

sensorOverallDimensions V2f

Dimensions of the light-sensitive area of the sensor, in millimeters, independent of the subset of that region from which image data are obtained.

sensorPhotositePitch float

Distance between centers of sensor photosites, in microns.

sensorAcquisitionRectangle Box2i

The rectangular area of the sensor containing photosites the contents of which are in one-to-one correspondence with the captured sensels, for a monochrome sensor, or with the reconstructed RGB pixels, for a sensor covered with color filter array material in a Bayer or a similar pattern.

Because understanding the above formal definition is critical for many applications, including camera solvers, some short definitions:

  • a photosite is that optoelectronic component on the sensor which, when light hits it, accumulates or otherwise registers electric charge
  • a sensel is the read-out contents of a single photosite
  • color filter array material is material deposited on top of an array of photosites such that each photosite is discretely covered with a material that passes photons of certain wavelengths and that blocks photons of other wavelengths
  • an RGB pixel contains red, green and blue components indicating relative exposure values
  • RGB pixel reconstruction is the process of combining the read-out sensel data from a particular photosite, covered by red, green or blue CFA material, with the sensel data from a neighborhood of surrounding photosites, which are covered by a variety of red, green or blue CFA materials, to produce an RGB pixel.

The Wikipedia article on demosaicing covers the basics of these ideas quite well.

In the case of sensels read from a monochrome sensor, the idea of a one-to-one relationship between sensels read from the photosite array and pixels in the data window can be straightforward. Often there is a conversion of the sensel data read from the photosites to a different representation (e.g. integer to float) along with scaling of the individual values.

A common spatial scaling is from a 2880 x 1620 acquisition format to a 1920 x 1080 HD format. In this case, a camera solver will want to know the original set of photosites that contributed sensel values to the downscaler: their number, and their position. Through a combination of sensorAcquisitionRectangle, sensorPhotositePitch, sensorOverallDimensions and sensorCenterOffset, the application can know exactly the area on the sensor on which the light fell to create the sensel values that produced a monochrome image.

RGB images are more complicated. RGB pixel reconstruction is a form of filtering, and kernels are square, with relatively small span, e.g. 5x5 or 7x7. Edge handling for the kernel is important; the Wikipedia article describing an image processing kernel covers it well.

Elements of the reconstruction kernel that are never at the center of the kernel are not counted as part of the sensorAcquisitionRectangle. Recalling the simple case above of a non-spatially-scaled 2880 x 1620 monochrome image being in 1:1 correspondence with an array of photosites on the sensor, if we are instead reading from a CFA-covered sensor to reconstruct a 2880 x 1620 RGB image, the actual array of all photosites whose sensel values were fed into a 5x5 reconstruction kernel would not be 2880 x 1620, but 2884 x 1624. Nevertheless, the size of the sensorAcquisitionRectangle would be 2880 x 1620.
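The arithmetic in the example above can be sketched as follows. The helper names are hypothetical, the kernel is assumed to be square with an odd size, and the 6.0 micron photosite pitch is an invented value used only to show the unit conversion to millimeters.

```python
from typing import Tuple


def reconstruction_footprint(width: int, height: int,
                             kernel_size: int) -> Tuple[int, int]:
    """Photosites feeding a square, odd-sized reconstruction kernel.

    The kernel extends kernel_size // 2 photosites beyond the acquisition
    rectangle on every side; those extra photosites are never at the
    kernel centre and so are not part of sensorAcquisitionRectangle.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd number")
    margin = kernel_size // 2
    return (width + 2 * margin, height + 2 * margin)


def size_in_mm(width: int, height: int,
               pitch_microns: float) -> Tuple[float, float]:
    """Physical extent of a photosite rectangle, in millimeters."""
    return (width * pitch_microns / 1000.0, height * pitch_microns / 1000.0)


acquisition = (2880, 1620)
print(reconstruction_footprint(*acquisition, kernel_size=5))
print(size_in_mm(*acquisition, pitch_microns=6.0))  # hypothetical pitch
```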

Camera systems differ on how to handle the case where the position of the RGB reconstruction kernel is such that one or more elements of the kernel do not correspond to physical photosites; these are edge cases in every sense of the phrase.

xDensity float

Horizontal output density, in pixels per inch. The image's vertical output density is xDensity * pixelAspectRatio.
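A small sketch of the relationship stated above, with hypothetical helper names. The output size calculation is an added illustration that assumes the densities are applied directly to the pixel counts.

```python
from typing import Tuple


def vertical_density(x_density: float, pixel_aspect_ratio: float) -> float:
    """Vertical output density in pixels per inch."""
    return x_density * pixel_aspect_ratio


def output_size_inches(width_px: int, height_px: int, x_density: float,
                       pixel_aspect_ratio: float) -> Tuple[float, float]:
    """Printed size of the image, in inches (illustrative only)."""
    y_density = vertical_density(x_density, pixel_aspect_ratio)
    return (width_px / x_density, height_px / y_density)


# Square pixels at 300 pixels per inch.
print(output_size_inches(1920, 1080, 300.0, 1.0))
# Pixels twice as wide as they are tall: twice as many rows per inch.
print(vertical_density(300.0, 2.0))
```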

longitude float

For images of real objects, the location where the image was recorded. Longitude and latitude are in degrees east of Greenwich and north of the equator. Altitude is in meters above sea level. For example, Kathmandu, Nepal is at longitude 85.317, latitude 27.717, altitude 1305.

latitude float
altitude float

Camera ID

These attributes identify the camera. All are optional.

attribute name type definition
cameraMake string

Manufacturer or vendor of the camera. If present, the value should be UTF-8-encoded and have a nonzero length.

cameraModel string

Model name or model number of the camera. If present, the value should be UTF-8-encoded and have a nonzero length.

cameraSerialNumber string

Serial number of the camera. If present, the value should be UTF-8-encoded and have a nonzero length. Note that despite the name, the value can include non-digits as well as digits.

cameraFirmwareVersion string

The firmware version of the camera. If present, the value should be UTF-8-encoded and have a nonzero length.

cameraUuid string

Identifies this camera uniquely among all cameras from all vendors.

Uniqueness could be accomplished with, e.g., a MAC address, a concatenation of cameraMake, cameraModel, cameraSerialNumber, etc. The string may have arbitrary format; it doesn't need to follow the UUID 128-bit string format, even though that is implied by the name.

If present, the value should be UTF-8-encoded and have a nonzero length.

cameraLabel string

Text label identifying how the camera was used or assigned, e.g. "Camera 1 Left", "B Camera", "POV", etc.

If present, the value should be UTF-8-encoded and have a nonzero length.

Camera State

These attributes describe the camera settings. All are optional.

attribute name type definition
cameraCCTSetting float

Color temperature, in Kelvin, configured for the physical or virtual camera creating or capturing the image.

The cameraCCTSetting is primarily forensic, and indicates the stated color balance of a film stock, the color temperature setting on a physical digital camera or the nominal color temperature of the scene adopted white as passed to a virtual camera's API.

A professional digital cinema camera is not constrained to map every supplied correlated color temperature to a point on the curve of a Planckian radiator, or to map every supplied color temperature to a chromaticity corresponding to a combination of the three principal components forming a basis for the CIE D series of illuminants.

Often, lower color temperatures are on the Planckian locus, higher color temperatures are on a locus of CIE D series chromaticities, and the camera performs a crossfade (typically a linear crossfade) between the two for intermediate temperatures. That the start and end of the crossfade could differ for every camera vendor -- or even across cameras offered by the same vendor -- means that no universal algorithm can map a camera color temperature setting (combined with a tint setting, see below) into a scene adopted white chromaticity.

The most common use for the cameraCCTSetting attribute is to feed its value into a camera-vendor-provided application or API, along with a cameraTintSetting attribute value, to reproduce the color processing done in-camera on set.

If a cameraCCTSetting attribute is provided, and no cameraTintSetting is provided, then a tint value of zero should be passed to any application or API using the cameraCCTSetting and cameraTintSetting.

cameraTintSetting float

Green/magenta tint configured for the physical or virtual camera creating or capturing the image.

The cameraTintSetting is primarily forensic. There is no vendor-independent mapping from a unit of tint to a distance on a chromaticity diagram. One camera vendor might choose a color space (e.g. the CIE 1960 UCS) and have a unit amount of tint represent some delta uv distance from the point defined by the cameraCCTSetting and a tint value of 0. Another might choose to express the effect of tint by analogy to a traditional unit from a film workflow, e.g. a Kodak or Rosco color correction filter. About the only guaranteed commonality is that in all camera vendor tint schemes, positive values shift the adopted scene white towards green, and negative values shift it towards magenta.

If the camera vendor maps cameraCCTSetting to a point defined by a linear crossfade between a Planckian blackbody locus and loci of CIE D Series illuminants, the slope of the tint isotherm at the exact points where the linear crossfade starts and ends can be indeterminate and an inverse mapping from chromaticity to a pair of CCT and tint can be one-to-many.

The most common use for the cameraTintSetting attribute is to feed its value into a camera-vendor-provided application or API, along with a cameraCCTSetting attribute value, to reproduce the color processing done in-camera on set.

cameraColorBalance V2f

Chromaticity in CIE 1960 UCS coordinates indicating a color the user of the camera would like the camera to treat as neutral, and corresponding to a particular camera configuration of make, model, camera firmware version, CCT setting and tint setting.

Note that this is not necessarily (or even probably) the same chromaticity as that of the scene adopted white stored in an adoptedNeutral attribute (if present).

For example, if a physical digital cinema camera was configured with a CCT of 3200K and a tint of -3 (in some camera vendor dependent unit), and the camera output had been processed such that the image containing this attribute was encoded as per SMPTE ST 2065-4:2023, then the adoptedNeutral attribute would have the value corresponding to the ACES neutral chromaticity, very near that of CIE Illuminant D60, whereas the cameraColorBalance would have a chromaticity much, much warmer than that of the adoptedNeutral attribute.

isoSpeed float

The ISO speed of the film or the ISO setting of the camera that was used to record the image.

expTime float

Exposure time, in seconds.

shutterAngle float

Shutter angle, in degrees.

For a physical film or digital camera, changing the shutter angle inexorably affects both motion blur and exposure. For a CG camera, the parameters to the renderer control whether or not changing the shutter angle affects simulation of either or both of these phenomena.
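For a physical camera with a rotary shutter, shutter angle, frame rate and exposure time are linked by a standard relation that this document does not itself state: the shutter is open for the fraction angle / 360 of each frame interval. The sketch below uses that relation with hypothetical function names; it is not something the OpenEXR library computes or enforces.

```python
from fractions import Fraction


def exposure_time(shutter_angle_degrees, frames_per_second) -> Fraction:
    """Exposure time in seconds for a rotary shutter.

    Assumes the shutter is open for (angle / 360) of each frame interval.
    """
    angle = Fraction(shutter_angle_degrees)
    rate = Fraction(frames_per_second)
    if not 0 < angle <= 360 or rate <= 0:
        raise ValueError("angle must be in (0, 360] and rate positive")
    return angle / 360 / rate


def shutter_angle(exposure_seconds, frames_per_second) -> Fraction:
    """Inverse of exposure_time: shutter angle in degrees."""
    return Fraction(exposure_seconds) * Fraction(frames_per_second) * 360


print(exposure_time(180, 24))                 # 1/48 of a second
print(shutter_angle(Fraction(1, 48), 24))     # back to 180 degrees
```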

captureRate Rational

Capture rate, in frames per second, of the image sequence to which the image belongs, represented as a rational number.

For variable frame rates, time-lapse photography, etc. the capture rate r is calculated as:

r = 1 / (tN - tNm1)

where tN is the time, in seconds, of the center of frame N's exposure interval, and tNm1 is the time, in seconds, of the center of frame N-1's exposure interval.

Both the numerator and denominator of r must be strictly positive.
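A minimal sketch of the formula above, using Python's `fractions.Fraction` to keep the result exact. The function name is hypothetical, and the returned (numerator, denominator) pair merely mirrors the two integers of a Rational; it is not the library's Rational type.

```python
from fractions import Fraction
from typing import Tuple


def capture_rate(t_n, t_nm1) -> Tuple[int, int]:
    """Capture rate r = 1 / (tN - tNm1) as (numerator, denominator).

    t_n and t_nm1 are the exposure-centre times, in seconds, of frame N
    and frame N-1. Frame N must come strictly after frame N-1 so that
    both parts of the result are strictly positive.
    """
    interval = Fraction(t_n) - Fraction(t_nm1)
    if interval <= 0:
        raise ValueError("frame N must be later than frame N-1")
    r = 1 / interval
    return (r.numerator, r.denominator)


# Time-lapse: one frame every 5 seconds.
print(capture_rate(10, 5))
# Frames spaced 1001/24000 s apart.
print(capture_rate(Fraction(1001, 24000), 0))
```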

Lens ID

These attributes identify the lens. All are optional.

attribute name type definition
lensMake string

Manufacturer or vendor of the lens. If present, the value should be UTF-8-encoded and have a nonzero length.

lensModel string

Model name or model number of the lens. If present, the value should be UTF-8-encoded and have a nonzero length.

lensSerialNumber string

Serial number of the lens.

Note that despite the name, the value can include non-digits as well as digits.

If present, the value should be UTF-8-encoded and have a nonzero length.

lensFirmwareVersion string

Firmware version of the lens. If present, the value should be UTF-8-encoded and have a nonzero length.

Lens State

These attributes describe the lens settings. All are optional.

attribute name type definition
nominalFocalLength float

Number printed on the barrel of a prime lens, or number next to the index mark on a zoom lens, in units of millimeters.

Nominal focal length is appropriate for asset tracking of lenses (e.g. a camera rental house catalogs its lens stock by nominal focal length).

pinholeFocalLength float

In the simplest model of image formation, the distance between the pinhole and the image plane, in units of millimeters.

When a CGI application supplies a method for an artist to provide focal length to some calculation, pinhole focal length is almost always the appropriate number to convey to the application.

effectiveFocalLength float

In the thick lens model, the effective focal length is the distance between the front focal point and the front nodal point, or equivalently the back focal point and the back nodal point, in units of millimeters.

The effective focal length is an abstraction used in lens design and, unless a CGI application is sophisticated enough to be using the thick lens model, should not be supplied to the application; for normal CGI applications, pinhole focal length should be used.

Note that the forward and back lens nodal points mentioned above are distinct in meaning and in position from the forward and back lens entrance pupils. A 'no-parallax' rotation is rotation around the forward lens entrance pupil.

entrancePupilOffset float

The axial distance from the image plane to the entrance pupil, in units of millimeters. A larger entrance pupil offset means the entrance pupil is closer to the object.

Note that in some lens configurations, the entrance pupil offset can be negative.

aperture float

The f-number of the lens, computed as the ratio of lens effective focal length to the diameter of lens entrance pupil at the time the image was created or captured.

tStop float

The ratio of lens effective focal length to diameter of entrance pupil divided by the square root of the transmittance the lens presents to a paraxial ray. Note that tStop, like aperture, must be strictly positive; and that tStop will always be a larger number than aperture.
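The relationship between aperture and tStop described above can be sketched as follows. The function name is hypothetical, and transmittance is assumed to be a fraction strictly between 0 and 1, which is what makes tStop larger than aperture.

```python
import math


def t_stop(aperture: float, transmittance: float) -> float:
    """T-stop from the f-number and the lens transmittance.

    aperture is the f-number; transmittance is the fraction of light
    the lens passes to a paraxial ray, assumed to lie in (0, 1).
    """
    if aperture <= 0:
        raise ValueError("aperture must be strictly positive")
    if not 0 < transmittance < 1:
        raise ValueError("transmittance must be between 0 and 1")
    return aperture / math.sqrt(transmittance)


# Hypothetical lens: f/2.0 passing 81% of the light.
print(round(t_stop(2.0, 0.81), 3))
```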

focus float

The camera's focus distance, in meters.

Editorial

These attributes help to document the image. All are optional.

attribute name type definition
owner string

Name of the owner of the image.

comments string

Additional image information in human-readable form, for example a verbal description of the image.

capDate string

The date when the image was created or captured, in local time, and formatted as YYYY:MM:DD hh:mm:ss, where YYYY is the year (4 digits, e.g. 2003), MM is the month (2 digits, 01, 02, ... 12), DD is the day of the month (2 digits, 01, 02, ... 31), hh is the hour (2 digits, 00, 01, ... 23), mm is the minute, and ss is the second (2 digits, 00, 01, ... 59).

utcOffset float

Offset of local time from Coordinated Universal Time (UTC), in seconds: UTC == local time + utcOffset.
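As an illustration of the capDate format and the utcOffset relation, here is a short sketch that uses only Python's standard `datetime` module. The helper names and the sample date are made up for the example.

```python
from datetime import datetime, timedelta

# strftime/strptime pattern for YYYY:MM:DD hh:mm:ss
CAP_DATE_FORMAT = "%Y:%m:%d %H:%M:%S"


def format_cap_date(local_time: datetime) -> str:
    """Format a local capture time as a capDate string."""
    return local_time.strftime(CAP_DATE_FORMAT)


def cap_date_to_utc(cap_date: str, utc_offset: float) -> datetime:
    """Apply UTC == local time + utcOffset to a capDate string."""
    local_time = datetime.strptime(cap_date, CAP_DATE_FORMAT)
    return local_time + timedelta(seconds=utc_offset)


# Hypothetical capture at 14:30 local time, in a zone 8 hours behind UTC,
# so utcOffset is +28800 seconds.
cap_date = format_cap_date(datetime(2003, 10, 5, 14, 30, 0))
print(cap_date)
print(cap_date_to_utc(cap_date, 28800.0))
```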

keyCode KeyCode

For motion picture film frames. Identifies film manufacturer, film type, film roll and frame position within the roll.

framesPerSecond Rational

Defines the nominal playback frame rate for image sequences, in frames per second. Every image in a sequence should have a framesPerSecond attribute, and the attribute value should be the same for all images in the sequence. If an image sequence has no framesPerSecond attribute, playback software should assume that the frame rate for the sequence is 24 frames per second.

In order to allow exact representation of NTSC frame and field rates, framesPerSecond is stored as a rational number. A rational number is a pair of integers, n and d, that represents the value n/d.
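To show why a rational number is used, the sketch below represents the NTSC frame rate 30000/1001 exactly and falls back to 24 frames per second when the attribute is absent. The dictionary of attributes and the function name are hypothetical stand-ins for however an application holds header data.

```python
from fractions import Fraction
from typing import Dict, Tuple

DEFAULT_FRAME_RATE = Fraction(24, 1)


def playback_rate(attributes: Dict[str, Tuple[int, int]]) -> Fraction:
    """Return the playback rate n/d, or 24 fps if none is stored."""
    value = attributes.get("framesPerSecond")
    if value is None:
        return DEFAULT_FRAME_RATE
    n, d = value
    return Fraction(n, d)


ntsc = playback_rate({"framesPerSecond": (30000, 1001)})
print(ntsc, float(ntsc))       # exact ratio and its approximate decimal value
print(playback_rate({}))       # no attribute: assume 24 fps

# Exact duration of 1001 NTSC frames, with no rounding drift.
print(1001 / ntsc)
```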

timeCode TimeCode

Time and control code.

imageCounter int

An image number.

For a sequence of images, the image number increases when the images are accessed in the intended play order. imageCounter can be used to order frames when more standard ordering systems are inapplicable, including but not limited to uniquely identifying frames of high-speed photography that would have identical time codes, ordering sequences of frames where some frames may have been captured and discarded due to real-time constraints, or ordering frames in a sequence that is intermittently accumulated from devices such as security cameras triggered by motion in an environment.

reelName string

Name for a sequence of unique images. If present, the value should be UTF-8-encoded and have a nonzero length.

ascFramingDecisionList string

JSON-encoded description of framing decisions associated with the captured image, in a format termed 'ASC-FDL', designed and documented by the American Society of Cinematographers (ASC).

If present, the value should be UTF-8-encoded and have a nonzero length.

Encoded Image Color Characteristics

These attributes describe the color characteristics of the image. All are optional.

attribute name type definition
chromaticities Chromaticities

For RGB images, specifies the CIE (x,y) chromaticities of the primaries and the white point.

whiteLuminance float

For RGB images, defines the luminance, in Nits (candelas per square meter) of the RGB value (1.0, 1.0, 1.0).

If the chromaticities and the luminance of an RGB image are known, then it is possible to convert the image's pixels from RGB to CIE XYZ tristimulus values.
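The conversion mentioned above can be sketched with the standard construction of an RGB-to-XYZ matrix from primaries and a white point. This is an illustrative pure-Python version, not the library's own routine. It uses the column-vector convention (XYZ = M * RGB), scales Y of RGB (1, 1, 1) to the given white luminance, and uses the Rec. 709 primaries with a D65 white point as sample input.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def xy_to_xyz(x, y):
    """CIE XYZ of a chromaticity (x, y), normalised so that Y == 1."""
    return (x / y, 1.0, (1.0 - x - y) / y)


def rgb_to_xyz_matrix(red, green, blue, white, white_luminance=1.0):
    """3x3 matrix M such that XYZ = M * RGB (column vectors)."""
    cols = [xy_to_xyz(*c) for c in (red, green, blue)]
    p = [[cols[c][r] for c in range(3)] for r in range(3)]
    w = xy_to_xyz(*white)
    d = det3(p)
    # Cramer's rule: find per-primary scales so that RGB (1, 1, 1)
    # lands exactly on the white point.
    scales = []
    for c in range(3):
        replaced = [[w[r] if k == c else p[r][k] for k in range(3)]
                    for r in range(3)]
        scales.append(det3(replaced) / d)
    return [[p[r][c] * scales[c] * white_luminance for c in range(3)]
            for r in range(3)]


def apply(m, rgb):
    return tuple(sum(m[r][c] * rgb[c] for c in range(3)) for r in range(3))


# Sample input: Rec. 709 primaries, D65 white, white at 100 nits.
m = rgb_to_xyz_matrix((0.64, 0.33), (0.30, 0.60), (0.15, 0.06),
                      (0.3127, 0.3290), white_luminance=100.0)
print(apply(m, (1.0, 1.0, 1.0)))  # Y component equals the white luminance
```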

adoptedNeutral V2f

Specifies the CIE (x,y) coordinates that should be considered neutral during color rendering. Pixels in the image file whose (x,y) coordinates match the adoptedNeutral value should be mapped to neutral values on the display.

Anticipated Use in Pipeline

These attributes relate to the application of the image in the motion picture pipeline. All are optional.

attribute name type definition
envmap Envmap

If this attribute is present, the image represents an environment map. The attribute's value defines how 3D directions are mapped to 2D pixel locations.

wrapmodes string

Determines how texture map images are extrapolated. If an OpenEXR file is used as a texture map for 3D rendering, texture coordinates (0.0, 0.0) and (1.0, 1.0) correspond to the upper left and lower right corners of the data window. If the image is mapped onto a surface with texture coordinates outside the zero-to-one range, then the image must be extrapolated. This attribute tells the renderer how to do this extrapolation. The attribute contains either a pair of comma-separated keywords, to specify separate extrapolation modes for the horizontal and vertical directions; or a single keyword, to specify extrapolation in both directions (e.g. "clamp,periodic" or "clamp"). Extra white space surrounding the keywords is allowed, but should be ignored by the renderer ("clamp, black " is equivalent to "clamp,black"). The keywords listed below are predefined; some renderers may support additional extrapolation modes:

  • black - pixels outside the zero-to-one range are black
  • clamp - texture coordinates less than 0.0 and greater than 1.0 are clamped to 0.0 and 1.0 respectively.
  • periodic - the texture image repeats periodically
  • mirror - the texture image repeats periodically, but every other instance is mirrored
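The parsing rules and the predefined keywords above can be sketched as follows. The function names are hypothetical, only the four predefined modes are handled, and returning `None` for the black mode is a convention chosen for this example to mean "use a black pixel".

```python
from typing import Optional, Tuple


def parse_wrapmodes(value: str) -> Tuple[str, str]:
    """Split a wrapmodes string into (horizontal, vertical) keywords."""
    parts = [p.strip() for p in value.split(",")]
    if len(parts) == 1:
        parts = parts * 2  # a single keyword applies to both directions
    if len(parts) != 2 or not all(parts):
        raise ValueError("expected one keyword or a comma-separated pair")
    return (parts[0], parts[1])


def wrap_coordinate(u: float, mode: str) -> Optional[float]:
    """Map a texture coordinate into the zero-to-one range.

    Returns None when the lookup should yield black.
    """
    if mode == "black":
        return u if 0.0 <= u <= 1.0 else None
    if mode == "clamp":
        return min(max(u, 0.0), 1.0)
    if mode == "periodic":
        return u % 1.0
    if mode == "mirror":
        t = u % 2.0
        return 2.0 - t if t > 1.0 else t
    raise ValueError("unsupported wrap mode: " + mode)


print(parse_wrapmodes("clamp, black "))
print(parse_wrapmodes("periodic"))
print(wrap_coordinate(1.25, "mirror"))
```

A renderer that supports additional extrapolation modes would extend `wrap_coordinate` rather than reject unknown keywords outright.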
multiView StringVector

Defines the view names for multi-view image files. A multi-view image contains two or more views of the same scene, as seen from different viewpoints, for example a left-eye and a right-eye view for stereo displays. The multiView attribute lists the names of the views in an image, and a naming convention identifies the channels that belong to each view.

deepImageState DeepImageState

Specifies whether the pixels in a deep image are sorted and non-overlapping.

Note: this attribute can be set by application code that writes a file in order to tell applications that read the file whether the pixel data must be cleaned up prior to image processing operations such as flattening. The OpenEXR library does not verify that the attribute is consistent with the actual state of the pixels. Application software may assume that the attribute is valid, as long as the software will not crash or lock up if any pixels are inconsistent with the deepImageState attribute. See Interpreting OpenEXR Deep Pixels for more details.

idManifest CompressedIDManifest

ID manifest.

Deprecated Attributes

These attributes are now obsolete and are no longer officially supported by the file format. Note that you can still read and write images that contain these attributes.

attribute name definition
dwaCompressionLevel

Sets the quality level for images compressed with the DWAA or DWAB method.

renderingTransform

Specifies the names of the CTL functions that implement the intended color rendering and look modification transforms for this image.

lookModTransform
maxSamplesPerPixel

Stores the maximum number of samples used by any single pixel within a deep image. If this number is small, it may be appropriate to read the deep image into a fixed-size buffer for processing. However, this number may be very large.

Note that the library never actually enforced the correctness of this value, so if it appears in legacy files, it should not be trusted.