Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Array.fromFormula with docs and tests #112

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jmagaram
Copy link
Contributor

@jmagaram jmagaram commented Mar 17, 2023

This is helpful for initializing an array when you don't know exactly how many items are going to be in it. Also can be used for int/float ranges like Array.fromFormula(10, i => i <= 200 ? Some(i, i + 10) : None). F# exposes this as Array.unfold and from Googling this seems to be a frequently used term. Another possible name is fromSeed. It's like a generator but don't think we should use that name.

This is meant to round out the array creation functions. I know fromInitializer was added recently to fill this gap and I'm trying to make this more complete and useful. I would rename make and fromInitializer so we'd end up with this list below.

  let fromRepeat: (~length: int, 'a) => array<'a>
  let fromIndexMap: (~length: int, int => 'a) => array<'a>
  let fromOneItem: 'a => array<'a>
  let fromOption: option<'a> => array<'a>
  let fromFormula: ('state => option<('a, 'state)>, 'state) => array<'a>
  let fromArrayLike: Js.Array2.array_like<'a> => array<'a>
  let fromArrayLikeWithMap: (Js.Array2.array_like<'a>, 'a => 'b) => array<'b>

Copy link
Contributor

@glennsl glennsl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems a bit too complicated to me, compared to just having a loop with a simple condition that pushes onto an array.

Compare:

Array.fromFormula(1, i => i<100 ? Some(i, i*2) : None)

to

let loop = (arr, i) =>
  if i < 100 {
    arr->Array.push(i)
    loop(arr, i * 2)
  } else {
    arr
  }
loop([], 1)

or

let arr = []
let i = ref(1)

while i.contents < 100 {
  arr->Array.push(i.contents)
  i := i.contents * 2
}

arr

Array.fromFormula is definitely shorter, but I don't think it's clearer, because it muddles several different concerns, the exit condition, the state and the value returned, in ambiguous ways.

@jmagaram
Copy link
Contributor Author

Would having labels help? I'm not a fan but here's what it could look like. I don't think it is any more complicated than the Array.reduce methods. I'm disappointed you don't like it. I like how flexible it is and find it much simpler (and obviously shorter) to use/write than the alternatives. It supports floats, ints, string, dates, etc. and inclusive and exclusive ranges easily.

let fromSeed = (~start, ~step, ~stop) =>
  fromFormula(start, i => stop(i) ? None : Some(start, step(i)))

fromSeed(~start=1, ~step=i => i * 2, ~stop=i => i <= 100)->ignore // 1, 2, 4, 8, 16, 32, 64
fromSeed(~start=5.0, ~step=i => i -. 0.5, ~stop=i => i >= 0.0)->ignore // 5.0, 4.5, ... 0.0

@jmagaram
Copy link
Contributor Author

One other thing. I remember when I saw that unfold feature in the F# Array module years ago and I thought it was a very clever solution. So I didn't invent and probably couldn't have invented this - just stealing someone else's idea.

@jmagaram
Copy link
Contributor Author

I'm experimenting and trying to make it more intuitive. Here is one that substitutes a record for the option and some labeled parameters. But this isn't as easy to read to me.

type loopOptions<'state, 'item> = Break | Continue({push: 'item, next: 'state})
let fromLoop: (~start: 'state, ~do: 'state => loopOptions<'state, 'item>) => array<'item>

let nums = Array.fromLoop(~start=1, ~do=i => i <= 100 ? Continue({push: i, next: i + 3}) : Break)

Here are a few ways to write it using the original proposal. Maybe one of these will look more natural to you.

let nums1 = ArrayEx.fromFormula(1, i => i <= 100 ? Some(i, i * 3) : None)

let nums2 = ArrayEx.fromFormula(1, i =>
  Some(i)->Option.filter(i => i <= 100)->Option.map(i => (i, i * 3)))

let nums3 = ArrayEx.fromFormula(1, i =>
  Some(i)->Option.flatMap(i => i <= 100 ? Some(i, i * 3) : None))

// fibonacci just for fun
let fibs = Array.concat(
  [0, 1],
  ArrayEx.fromFormula((0, 1), ((a, b)) => a + b <= 100 ? Some(a + b, (b, a + b)) : None))

@glennsl
Copy link
Contributor

glennsl commented Mar 22, 2023

I think this is fairly readable, even if using an option instead of a custom variant:

let nums = Array.fromLoop(~start=1, ~do=i => i <= 100 ? Continue({push: i, next: i + 3}) : Break)

With a few tweaks to the naming I think it would be even clearer:

let nums = Array.fromGenerator(~initialState=1, i => i <= 100 ? Some({state: i, element: i + 3}) : None)

It might also be an idea to separate state generation and exit condition from element generation:

let nums = Array.fromGenerator(
  ~initialState=1,
  ~nextState=i => i <= 100 ? Some(i) : None,
  i => i * 3,
)

But this is already getting quite close to being just a slightly more optimized chain of simpler functions that already exist:

Array.range(0, 100)->Array.map(i => i * 3)

And if you really need to optimize, you'll almost always do better with an encapsulated imperative loop anyway.

@jmagaram
Copy link
Contributor Author

I think we're going to have to agree to disagree. I believe my original proposal here is as elegant/simple/flexible as it is going to get for initializing various arrays of ints, floats, dates, and other data types, in increasing or decreasing order, and is barely more complicated than the signature on Array.reduce. I'm not sure you saw my doc comments. I found it helpful while attempting to use this function. I also put some examples in there for the common case of a stepped range.

fromFormula(state, f) creates an array that contains the elements generated by a
computation. The function f is first passed state and then is repeatedly called
to build the array. When f returns a tuple of Some(item, state), item is
pushed onto the array, and state is fed into the next computation. When f
returns None, the computation is complete and the new array is returned.

If people find it unintuitive/confusing they can write an imperative loop. I'm used to this style of programming and like it. Performance-wise the implementation is close to optimal unless you really go crazy. So I'm really hoping we can add this to the library and that you will try it with the doc help and reconsider.

I don't think we can use the name "generator" because it means something specific for Javascript that we don't support. I'd be very happy with fromSeed or fromFormula. I don't see the Array.range function you're talking about. Maybe that's the fromIndexMap I'm talking about? I'm not sure who is in charge here and make the decisions. But again I'd really like to see some naming "improvements" for these array creation functions.

let fromRepeat: (~length: int, 'a) => array<'a>
let fromIndexMap: (~length: int, int => 'a) => array<'a>
let fromOneItem: 'a => array<'a>
let fromOption: option<'a> => array<'a>
let fromFormula: ('state => option<('a, 'state)>, 'state) => array<'a>
let fromArrayLike: Js.Array2.array_like<'a> => array<'a>
let fromArrayLikeWithMap: (Js.Array2.array_like<'a>, 'a => 'b) => array<'b>

@glennsl
Copy link
Contributor

glennsl commented Mar 22, 2023

I believe my original proposal here is as elegant/simple/flexible as it is going to get for initializing various arrays of ints, floats, dates, and other data types, in increasing or decreasing order, and is barely more complicated than the signature on Array.reduce.

For something that tries to fit everything but the ktichen sink, I suppose it could be worse. But when there are already existing functions that do all of these things better, I still see little reason for adding it.

Here's the other example from the docstring, made much clearer by using an existing and more appropriate function:

Array.fromFormula(3.0, i => i >= 0.0 ? Some(i, i -. 0.5) : None)
Float.rangeWithOptions(3.0, 0.0, {step: -0.5, inclusive: true})

I don't see the Array.range function you're talking about.

Sorry, it's Int.range. And its sibling Int.rangeWithOptions for more flexibility.

@zth
Copy link
Collaborator

zth commented Mar 23, 2023

This is too niche to be included in Core, and fits better in user land.

@jmagaram
Copy link
Contributor Author

Some of the complexity of fromFormula is there are different types for 'state and 'a and so when the next item is generated it must be a tuple of ('a, 'state). Below is how it would look without that. Not sure what to name it. With this version I think it is obvious that everything that gets generated will be included so there is no inclusive/exclusive issue. It is less intuitive than Int.range. But it works for all data types and not just addition/subtraction.

let fromRange: ('a, 'a => option<'a>) => array<'a>

let nums1 = Array.fromRange(3.0, i => Some(i -. 0.5)->Option.filter(i => i >= 0.0))
let nums2 = Array.fromRange(3.0, i => i > 0.0 ? Some(i -. 0.5) : None)

I agree the kitchen sink with 'state is niche. But it lets you do useful things with very little code. Here it generates binary digits...

let binary = n => Array.fromSeed(n, i => i > 0 ? Some(mod(i, 2), i / 2) : None)->Array.reverse

This divides an array into chunks of arbitrary size...

let chunk = (xs, size) =>
  ArrayEx.fromUnfold(0, i => {
    switch xs->Array.slice(~start=i, ~end=i + size) {
    | [] => None
    | c => Some(c, i + size)
    }
  })

Unfold in ocaml
Unfold in Swift

Maybe there is a way to have both simple mathematical sequences and this more general unfold mechanism.

@zth
Copy link
Collaborator

zth commented Mar 23, 2023

Oh, I'm not saying the functionality isn't useful, or that you shouldn't use it. Just that it doesn't fit in the scope of Core.

@glennsl
Copy link
Contributor

glennsl commented Mar 23, 2023

Agree that it's useful, and chunk is a good example of it. In a "normal" standard library I think it would be a good addition, though I still don't think this specific design is all that good. But this library ought to have something more similar to generators instead I think.

But it lets you do useful things with very little code.

That's the wrong goal. It is true that less code often makes it easier to understand what's going on because there's less of it to understand. But it also often does the opposite, when useful hints are made ambiguous or eliminated, or multiple concepts are conflated into one, such as the case is here. The measure you want to optimize for is clarity. And more so for reading, not writing, since we tend to spend a lot more time doing the former.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants