Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Semigroup[Function1[I, O]] produces a function that's not stack safe #4089

Closed
mrdziuban opened this issue Dec 20, 2021 · 6 comments · Fixed by #4093
Closed

Semigroup[Function1[I, O]] produces a function that's not stack safe #4089

mrdziuban opened this issue Dec 20, 2021 · 6 comments · Fixed by #4093

Comments

@mrdziuban
Copy link
Contributor

Following up on this conversation in discord. With a stack size of 512k (-Xss512k) I get a StackOverflowError with this code:

val f: Unit => Unit = _ => ()
val g = 1.to(5000).foldLeft(f)((acc, _) => Semigroup[Unit => Unit].combine(acc, f))
// StackOverflowError when I call `g`
g(())

I wasn't able to resolve the error using either Eval or AndThen, which were both suggested in discord, but it's entirely possible I was doing something wrong. Code snippets with my attempts in case they're helpful:

Eval attempt
val f: Unit => Unit = _ => ()
val g = 1.to(5000).foldLeft(Eval.now(f))((acc, _) => acc.map(Semigroup[Unit => Unit].combine(_, f)))
// StackOverflowError when I call `g`
g.value(())
AndThen attempt
// needed to define a `Semigroup[AndThen[I, O]]` as there's not one built in
implicit def andThenSemigroup[I, O](implicit O: Semigroup[O]): Semigroup[AndThen[I, O]] = new Semigroup[AndThen[I, O]] {
  def combine(x: AndThen[I, O], y: AndThen[I, O]): AndThen[I, O] =
    Apply[AndThen[I, *]].map2(x, y)(O.combine)
}

val f: AndThen[Unit, Unit] = AndThen(_ => ())
val g = 1.to(5000).foldLeft(f)((acc, _) => Semigroup[AndThen[Unit, Unit]].combine(acc, f))
// StackOverflowError when I call `g`
g(())
@armanbilge
Copy link
Member

I think if you define your function:

val f: Kleisli[Eval, Unit, Unit] = Kleisli(_ => Eval.now(()))

the semigroup will be stacksafe again.

Otherwise, I don't think this can be fixed 🤔

@mrdziuban
Copy link
Contributor Author

Thanks for continuing to look @armanbilge! Unfortunately that seems to have the same problem

val f: Kleisli[Eval, Unit, Unit] = Kleisli(_ => Eval.now(()))
val g = 1.to(5000).foldLeft(f)((acc, _) => Semigroup[Kleisli[Eval, Unit, Unit]].combine(acc, f))
g(()).value
/*
Exception in thread "main" java.lang.StackOverflowError
  at cats.data.KleisliInstances0_5$$anon$8.FB(Kleisli.scala:353)
  at cats.data.KleisliSemigroup.$anonfun$combine$1(Kleisli.scala:559)
  at cats.data.KleisliSemigroup.$anonfun$combine$1(Kleisli.scala:559)
  ...
*/

@armanbilge
Copy link
Member

Oh dear! Thanks for trying it. That definitely seems like something to be fixed 😅

@armanbilge
Copy link
Member

Related: #3947

@mrdziuban
Copy link
Contributor Author

I was just able to achieve stack safety using Free -- may not be super helpful in fixing the Kleisli issue since cats-free depends on cats-core, not the other way around, but here's the working code:

implicit def freeSemigroup[I, O: Semigroup]: Semigroup[Free[Function1[I, *], O]] =
  Apply.semigroup[Free[Function1[I, *], *], O]

val f: Free[Function1[Unit, *], Unit] = Free.pure(())
val g = 1.to(5000).foldLeft(f)((acc, _) => Semigroup[Free[Function1[Unit, *], Unit]].combine(acc, f))
g.runTailRec.apply(())

@johnynek
Copy link
Contributor

johnynek commented Dec 20, 2021

There are a few issues.

The code you are calling is defined here:

trait Function1Semigroup[A, B] extends Semigroup[A => B] {

it is the naive thing: (i: I) => sg.combine(left(i), right(i))

So, that is clearly going to stack-overflow. It doesn't even define a combineAllOption which could at least solve the problem for foldMap:

def combineAllOption(it: IterableOnce[A => B]): Option[A => B] =
  if (is.isEmpty) None
  else Some { a =>
    // the get is safe because we know the IterableOnce is nonempty
    sg.combineAllOption(it.map(_.apply(a))).get
  }

So, that could be a PR that would marginally improve things.

AndThen is about composition and this semigroup is not about composition.

Lastly, you could imagine making it stack safe this way:

final case class CombineFn[A, B](left: A => B, right: A => B, semiB: Semigroup[B]) extends (A => B) {
  def call(fn: A => B, a: A): Eval[B]
    fn match {
      case CombineFn(l, r, sg) =>
        for {
          lb <- Eval.defer(call(l, a))
          rb <- Eval.defer(call(r, a))
        } yield sg.combine(lb, rb)
      case _ => Eval.now(fn(a))
   }
      
  def apply(a: A): B = call(this, a).value
}

Since Eval isn't in kernel, we could use TailCall from the standard library or you could make a custom stack representation on heap that matches this problem (something like the below, which I wrote without the compiler or tests so it may be wrong):

final case class CombineFn[A, B](left: A => B, right: A => B, semiB: Semigroup[B]) extends (A => B) {
  @tailrec final def call(fn: A => B, a: A, leftB: Option[(B, Semigroup[B])], rightStack: List[(A => B, Semigroup[B])]): B
    fn match {
      case CombineFn(l, r, sg) =>
        call(l, a, leftB, (r, sg) :: rightStack)
      case _ =>
         val b1 =
           leftB match {
              case None => fn(a)
              case Some((bl, sgl)) => sgl.combine(bl, fn(a))
            }
        rightStack match {
          case Nil => b1
          case (r1, sg) :: right1 => call(r1, a, Some((b1, sg)), right1)
        }
   }
      
  def apply(a: A): B = call(left, a, None, (right, semiB) :: Nil)
}

I think the latter solution will be slower in the general case, but will handle stack overflows.

Lastly, you could track the depth of CombineFn when you are building it so you do the expensive apply when the depth gets greater than something like 500 or something, and do the naive thing when depth is < than that (something like AndThen's hack to use a constant amount of stack to do applications).

I'm happy to review a PR and help refine these ideas, but I won't have time to send a PR with this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants