.tm 7. Real Numbers ................................................... \n%
.ti 0
7. Real Numbers
Ion supports two types of real numbers: floats and decimals.
Both the text and binary representations of an Ion value stream may be
compressed in one or more GZIP [RFC 1952] members.
.tm _ 7.1. Floats .................................................... \n%
.ti 0
7.1. Floats
Ion supports IEEE-754 binary floating point values using the IEEE-754
32-bit (binary32) and 64-bit (binary64) encodings. In the data model,
all floating point values are treated as though they are binary64 (all
binary32 encoded values can be represented exactly in binary64).
.tm _ 7.1.1. Encoding Considerations ............................. \n%
.ti 0
7.1.1. Encoding Considerations
In text, a binary float is represented using familiar base-10 digits.
While this is convenient for human readers, there is no explicit
notation for expressing a particular floating point value as binary32 or
binary64. Furthermore, many base-10 real numbers are irrational with
respect to base-2 (that is, they have no terminating base-2 expansion)
and cannot be expressed exactly in either binary floating point encoding
(e.g. 1.1e0).

Because of this asymmetry, the following rules for Ion text float
notation MUST be observed when round-tripping to Ion binary:
.in 9
.ti 6
o Any text notation that can be exactly represented as binary32 MAY be
encoded as either binary32 or binary64 in Ion binary.
.ti 6
o Any text notation that can only be exactly represented as binary64
MUST be encoded as binary64 in Ion binary.
.ti 6
o Any text notation that has no exact representation (i.e. irrational
in base-2 or more precision than the binary64 significand) MUST be
encoded as binary64. This ensures that irrational numbers or
truncated values are represented at the highest fidelity of the
float data type.
.in 3
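The three rules above can be sketched in Python as follows. This is an
illustration only, not part of the specification: the function name
float_width_for_text is hypothetical, the literal is assumed to be
finite and written in a form Python also accepts (e.g. 1.5e0), and the
platform float is assumed to be IEEE-754 binary64 with correctly
rounded parsing.
.KS
.nf
```python
import struct
from decimal import Decimal

def float_width_for_text(text: str) -> int:
    """Return 32 or 64: the narrowest width the rules permit (hypothetical helper)."""
    value = float(text)                      # nearest binary64 value
    exact = Decimal(text) == Decimal(value)  # True only if nothing was rounded away
    if not exact:
        return 64                            # inexact literals MUST use binary64
    try:
        narrowed = struct.unpack(">f", struct.pack(">f", value))[0]
    except OverflowError:
        return 64                            # magnitude too large for binary32
    # binary32 is permitted (MAY) only when narrowing loses nothing.
    return 32 if narrowed == value else 64
```
.KE
.in 3
Under these assumptions, 1.5e0 may use binary32, while 2.147483647e9
and 1.2e0 both require binary64.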
When encoding a decimal real number that is irrational in base-2 or has
more precision than can be stored in binary64, the exact binary64 value
is determined by using the IEEE-754 round-to-nearest mode with
round-half-to-even as the tie-break. This mode/tie-break is the common
default in most programming environments and is discussed in detail in
"Correctly Rounded Binary-Decimal and Decimal-Binary Conversions" (see
http://ampl.com/REFS/rounding.pdf). The conversion is also described in a
straightforward way by Clinger's algorithm (see
http://www.cesura17.net/~will/professional/research/papers/howtoread.pdf).

When encoding a binary32 or binary64 value in text notation, an
implementation MAY use the approach described in "Printing
Floating-Point Numbers Quickly and Accurately" (see
http://www.cs.indiana.edu/~dyb/pubs/FP-Printing-PLDI96.pdf).
.tm _ 7.1.2. Special Values ...................................... \n%
.ti 0
7.1.2. Special Values
The IEEE-754 binary floating point encoding supports special non-number
values. These are represented in the binary format as per the encoding
rules of the IEEE-754 specification, and are represented in text by the
following keywords:
.in 9
.ti 6
o nan - denotes the not-a-number (NaN) value.
.ti 6
o +inf - denotes positive infinity.
.ti 6
o -inf - denotes negative infinity.
.in 3
The Ion data model considers all encodings of positive infinity to be
equivalent to one another and all encodings of negative infinity to be
equivalent to one another. Thus, an implementation encoding +inf or -inf
in Ion binary MAY choose to encode it using the binary32 or
binary64 form.
The IEEE-754 specification has many encodings of NaN, but the Ion data
model considers all encodings of NaN (i.e. all forms of signaling or
quiet NaN) to be equivalent. Note that the text keyword nan does not map
to any particular encoding; the only requirement is that an
implementation emit a bit pattern that represents an IEEE-754 NaN value
when converting to binary (e.g. the binary64 bit pattern
0x7FF8000000000000).

An important consideration is that NaN is not treated consistently
across programming environments. For example, Java's
Double.doubleToLongBits collapses every NaN to a single canonical bit
pattern (0x7FF8000000000000, a quiet NaN). In C/C++, on the other hand,
NaN is mostly platform defined, but on platforms that support it, the
NAN macro is a quiet NaN. In general, common programming environments
provide routines that test for NaN, but no consistent way to represent
it.
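Because every NaN encoding is equivalent in the data model, a reader
only needs to classify a bit pattern as NaN or not. A minimal sketch in
Python, assuming the standard binary64 layout (11 exponent bits, 52
fraction bits); the function name and the two sample constants are
illustrative and not defined by Ion:
.KS
.nf
```python
import math
import struct

def is_nan_bits64(bits: int) -> bool:
    """True if a 64-bit pattern is any IEEE-754 NaN, quiet or signaling."""
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    # NaN: exponent all ones and a nonzero fraction (zero fraction is infinity).
    return exponent == 0x7FF and fraction != 0

QUIET_NAN = 0x7FF8000000000000       # the example pattern given above
SIGNALING_NAN = 0x7FF0000000000001   # one of many signaling patterns
POSITIVE_INF = 0x7FF0000000000000

def decode_bits64(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a Python float."""
    return struct.unpack(">d", bits.to_bytes(8, "big"))[0]
```
.KE
.in 3
Both sample NaN patterns satisfy is_nan_bits64, and math.isnan agrees
after decoding; the infinity pattern does not.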
.tm _ 7.1.3. Examples ............................................ \n%
.ti 0
7.1.3. Examples
To illustrate the text/binary round-tripping rules above, consider the
following examples.
The Ion text literal 2.147483647e9 needs 31 bits of significand, which
exceeds the 24 bits of precision (23 stored bits) available in binary32,
so it MUST be encoded in Ion binary as a binary64 value. The
Ion binary encoding for this text literal is as follows:
.KS
.nf
0x48 0x41 0xDF 0xFF 0xFF 0xFF 0xC0 0x00 0x00
.KE
.in 3
The base-2 irrational literal 1.2e0, following the rounding and encoding
rules, MUST be encoded in Ion binary as:
.KS
.nf
0x48 0x3F 0xF3 0x33 0x33 0x33 0x33 0x33 0x33
.KE
.in 3
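The two byte sequences above can be reproduced with a short Python
sketch. It assumes the platform float is IEEE-754 binary64 and takes
the leading 0x48 byte from the examples themselves; the helper name
encode_binary64 is hypothetical.
.KS
.nf
```python
import struct

def encode_binary64(value: float) -> bytes:
    """Prefix the big-endian binary64 body with the 0x48 byte from the examples."""
    return bytes([0x48]) + struct.pack(">d", value)

first = encode_binary64(2.147483647e9)
second = encode_binary64(1.2e0)
print(first.hex(" "))    # 48 41 df ff ff ff c0 00 00
print(second.hex(" "))   # 48 3f f3 33 33 33 33 33 33
```
.KE
.in 3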
Although the textual representation 1.2e0 is itself irrational in
base-2, its canonical form in the data model is not (based on the
rounding rules); thus the following text forms all map to the same
binary64 value:
.KS
.nf
// the most human-friendly representation
1.2e0
// the exact textual representation in base-10 for the binary64 value
// 1.2e0 represents
1.1999999999999999555910790149937383830547332763671875e0
// a shortened, irrational version, but still the same value
1.1999999999999999e0
// a lengthened, irrational version that is still the same value
1.19999999999999999999999999999999999999999999999999999999e0
.KE
.in 3
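The equivalence of these four text forms can be checked with a few
lines of Python, assuming a runtime whose float parsing is correctly
rounded to binary64 (as CPython's is on IEEE-754 platforms). The
variable names are illustrative.
.KS
.nf
```python
from decimal import Decimal

forms = [
    "1.2e0",
    "1.1999999999999999555910790149937383830547332763671875e0",
    "1.1999999999999999e0",
    "1.19999999999999999999999999999999999999999999999999999999e0",
]

# Every form rounds to the same binary64 value.
values = {float(text) for text in forms}

# Decimal(float) expands the binary64 value exactly, with no rounding.
exact_expansion = str(Decimal(float("1.2e0")))
```
.KE
.in 3
Here values holds a single element, and exact_expansion matches the
long literal above without its e0 suffix.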
.tm _ 7.2. Decimals .................................................. \n%
.ti 0
7.2. Decimals
Ion supports a decimal numeric type to allow accurate representation of
base-10 floating point values such as currency amounts. An Ion Decimal
has arbitrary precision and scale. This representation preserves
significant trailing zeros when converting between text and binary forms.
Decimals are supported in addition to the traditional base-2 floating
point type. This avoids the loss of exactness often incurred when storing
a decimal fraction as a binary fraction. Many common decimal numbers with
relatively few digits cannot be represented as a terminating
binary fraction.
.tm _ 7.2.1. Data Model ........................................... \n%
.ti 0
7.2.1. Data Model
Ion decimals follow the IBM Hursley Lab General Decimal Arithmetic
Specification (see: http://speleotrove.com/decimal/decarith.html), which
defines an abstract decimal data model (see:
http://speleotrove.com/decimal/damodel.html) represented by the following
3-tuple:
.KS
.nf
(<sign 0|1>, <coefficient: unsigned integer>, <exponent: integer>)
.KE
.in 3
Decimals should be considered equivalent if and only if their data model
tuples are equivalent, where exponents of +0 and -0 are considered
equivalent. All forms of positive zero are distinguished only by the
exponent. All forms of negative zero, which are distinct from all forms
of positive zero, also are distinguished only by the exponent.
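As an illustration of tuple equivalence, the following Python sketch
uses the standard decimal module, which implements the same General
Decimal Arithmetic data model. Note that Python writes the exponent
marker as E rather than the Ion text marker, and that the helper names
model_tuple and ion_equivalent are hypothetical.
.KS
.nf
```python
from decimal import Decimal

def model_tuple(text: str):
    """Hypothetical helper: (sign, coefficient, exponent) for Python decimal text."""
    sign, digits, exponent = Decimal(text).as_tuple()
    return (sign, int("".join(map(str, digits))), exponent)

def ion_equivalent(a: str, b: str) -> bool:
    """Equivalent if and only if the data model tuples match."""
    return model_tuple(a) == model_tuple(b)

# Numeric equality is weaker than data model equivalence.
numerically_equal = Decimal("42") == Decimal("42.0")   # True
same_in_model = ion_equivalent("42", "42.0")           # False
```
.KE
.in 3
With these helpers, an exponent of -0 is equivalent to +0, while
positive zero, negative zero, and zeros with different exponents are
all distinct.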
.tm _ 7.2.2. Text Format ......................................... \n%
.ti 0
7.2.2. Text Format
The Hursley rules for converting a finite value from textual notation
must be followed. The Hursley rules for describing a special value are
not followed; in particular:
.in 9
.ti 6
o infinity - rule is not applicable for Ion Decimals
.ti 6
o nan - rule is not applicable for Ion Decimals
.in 3
Specifically, the rules for getting the integer coefficient from the
decimal-part (digits preceding the exponent) of the textual
representation are specified as follows.
.in 6
If the decimal-part included a decimal point the exponent is then
reduced by the count of digits following the decimal point (which may
be zero) and the decimal point is removed. The remaining string of
digits has any leading zeros removed (except for the rightmost digit)
and is then converted to form the coefficient which will be zero or
positive.
.in 3
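The coefficient rule quoted above can be sketched in Python as follows.
This is a simplified illustration: parse_ion_decimal is a hypothetical
name, only a lowercase d exponent marker and an optional leading minus
sign are handled, and no validation of malformed input is attempted.
.KS
.nf
```python
def parse_ion_decimal(text: str):
    """Sketch: finite decimal text -> (sign, coefficient, exponent)."""
    sign = 1 if text.startswith("-") else 0
    body = text[1:] if sign else text
    decimal_part, _, exponent_part = body.partition("d")
    # A missing exponent is implicitly zero; "-0" also parses to zero.
    exponent = int(exponent_part) if exponent_part else 0
    integer_digits, _, fraction_digits = decimal_part.partition(".")
    # Reduce the exponent by the count of digits after the decimal point.
    exponent -= len(fraction_digits)
    # int() drops leading zeros but keeps a lone zero digit.
    coefficient = int(integer_digits + fraction_digits)
    return (sign, coefficient, exponent)
```
.KE
.in 3
For example, parse_ion_decimal applied to 0.42d2 yields (0, 42, 0),
while 0.420d2 yields (0, 420, -1).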
Where X is any unsigned integer, all of the following formulae can be
demonstrated to be equivalent using the text conversion rules and the
data model.
.KS
.nf
// Exponent implicitly zero
X.
// Exponent explicitly zero
Xd0
// Exponent explicitly negative zero (equivalent to zero).
Xd-0
.KE
.in 3
Other equivalent representations include the following, where Y is the
number of digits in X.
.KS
.nf
// There are Y digits past the decimal point in the
// decimal-part, making the exponent zero. One leading zero
// is removed.
0.XdY
.KE
.in 3
For example, all of the following text Ion decimal representations are
equivalent to each other.
.KS
.nf
0.
0d0
0d-0
0.0d1
.KE
.in 3
Additionally, all of the following are equivalent to each other (but not
to any forms of positive zero).
.KS
.nf
-0.
-0d0
-0d-0
-0.0d1
.KE
.in 3
Because all forms of zero are distinctly identified by the exponent, the
following are not equivalent to each other.
.KS
.nf
// Exponent implicitly zero.
0.
// Exponent explicitly 5.
0d5
.KE
.in 3
All of the following are equivalent to each other.
.KS
.nf
42.
42d0
42d-0
4.2d1
0.42d2
.KE
.in 3
However, the following are not equivalent to each other.
.KS
.nf
// Text converted to 42.
0.42d2
// Text converted to 42.0
0.420d2
.KE
.in 3
.tm _ 7.2.3. Binary Format ....................................... \n%
.ti 0
7.2.3. Binary Format
The encoding of Ion decimals, which follows the decimal data model
described above, is specified in [Ion Binary Encoding].
The following binary encodings of decimal values are all equivalent
to 0d0.
.KS
.nf
+-----------------+------------+-------------+
| type descriptor | exponent   | coefficient |
|                 | (VarInt)   | (Int)       |
+-----------------+------------+-------------+

Most compact encoding of 0d0
+-----------------+
: 0x50            :
+-----------------+

Explicit encoding of 0d0
+-----------------+------------+-------------+
: 0x52            : 0x80       : 0x00        |
+-----------------+------------+-------------+

Explicit encoding of 0d(negative)0
+-----------------+------------+-------------+
: 0x52            : 0xC0       : 0x00        |
+-----------------+------------+-------------+

0d0 with overpadded coefficient
+-----------------+------------+-------------+
: 0x53            : 0x80       : 0x00 0x00   |
+-----------------+------------+-------------+

0d0 with overpadded exponent and coefficient
+-----------------+------------+-------------+
: 0x54            : 0x00 0x80  : 0x00 0x00   |
+-----------------+------------+-------------+
.KE
.in 3
Note: The last two examples demonstrate overpadded encodings of the
exponent and coefficient subfields. Overpadded encodings such as these
are possible for any decimal and are always equivalent to the
unpadded encoding.
The following binary encodings of decimal values are equivalent to -0d0
(but not to 0d0).
.KS
.nf
+-----------------+------------+-------------+
| type descriptor | exponent   | coefficient |
|                 | (VarInt)   | (Int)       |
+-----------------+------------+-------------+

Explicit encoding of (negative)0d0
+-----------------+------------+-------------+
: 0x52            : 0x80       : 0x80        |
+-----------------+------------+-------------+

Explicit encoding of (negative)0d(negative)0
+-----------------+------------+-------------+
: 0x52            : 0xC0       : 0x80        |
+-----------------+------------+-------------+
.KE
.in 3
Finally, the following binary encodings of decimal values are equivalent
to 42d0.
.KS
.nf
+-----------------+------------+-------------+
| type descriptor | exponent   | coefficient |
|                 | (VarInt)   | (Int)       |
+-----------------+------------+-------------+

Explicit encoding of 42d0
+-----------------+------------+-------------+
: 0x52            : 0x80       : 0x2A        |
+-----------------+------------+-------------+

Explicit encoding of 42d(negative)0
+-----------------+------------+-------------+
: 0x52            : 0xC0       : 0x2A        |
+-----------------+------------+-------------+
.KE
.in 3
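The encodings in this section can be decoded with the following Python
sketch. The VarInt and Int field layouts are assumptions drawn from
[Ion Binary Encoding] rather than from this section: a VarInt carries
an end flag in the high bit of each byte and a sign bit in bit 6 of the
first byte, and an Int is big-endian sign-and-magnitude. Only
descriptors with a length nibble below 14 are handled, and the function
names are hypothetical.
.KS
.nf
```python
def read_varint(data: bytes, pos: int):
    """Decode a signed VarInt; returns (sign, magnitude, next_pos)."""
    first = data[pos]
    sign = (first >> 6) & 1
    magnitude = first & 0x3F
    while not data[pos] & 0x80:          # high bit marks the final byte
        pos += 1
        magnitude = (magnitude << 7) | (data[pos] & 0x7F)
    return sign, magnitude, pos + 1

def decode_decimal(encoded: bytes):
    """Decode one decimal into the (sign, coefficient, exponent) tuple."""
    descriptor = encoded[0]
    assert descriptor >> 4 == 5, "type nibble taken from the 0x5_ examples"
    body = encoded[1:1 + (descriptor & 0x0F)]
    if not body:
        return (0, 0, 0)                 # 0x50: most compact form of 0d0
    exp_sign, exp_magnitude, pos = read_varint(body, 0)
    exponent = -exp_magnitude if exp_sign else exp_magnitude
    coefficient = body[pos:]
    if not coefficient:
        return (0, 0, exponent)
    sign = coefficient[0] >> 7           # Int sign bit
    head = bytes([coefficient[0] & 0x7F])
    return (sign, int.from_bytes(head + coefficient[1:], "big"), exponent)
```
.KE
.in 3
Applied to the examples above, every encoding listed for 0d0 decodes to
(0, 0, 0), both encodings for -0d0 decode to (1, 0, 0), and both
encodings for 42d0 decode to (0, 42, 0), so padding and a negative zero
exponent do not affect equivalence.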