Skip to content
This repository has been archived by the owner on Nov 28, 2017. It is now read-only.

js.Array[JValue] vs Array[JValue] #13

Open
OlivierBlanvillain opened this issue Oct 8, 2015 · 10 comments
Open

js.Array[JValue] vs Array[JValue] #13

OlivierBlanvillain opened this issue Oct 8, 2015 · 10 comments

Comments

@OlivierBlanvillain
Copy link
Contributor

The difference on the type signature on case class JArray(value: js.Array[JValue] = js.Array()) and case class JArray(value: Array[JValue] = Array.empty prevents from writing generic code on org.json4s.ast.fast.

My question is the following: how much do we lose if we cross compile jvm/fast/JValue.scala?

@OlivierBlanvillain
Copy link
Contributor Author

Here my workaround, which does not feel right at all: org.json4s.ast.safe.JArray(Vector(j.toSafe)).toFast

@mdedetrich
Copy link
Contributor

This is something that I am sought on the fence for. org.json4s.ast.fast is designed to be as fast as possible, hence why it uses js.Array on the JS platform, and not Array (and js.Array is quite a bit faster, its your native [] in Javascript)

I also had a look at this in broader context, in the sense that its far more likely that someone will construct org.json4s.ast.fast from a network request, rather than sharing the constructors in the form of common code. I place a higher value on the genericness of working with org.json4s.ast.fast, rather than constructing it (ideally I would like to be fully generic, but there is the proviso from the first paragraph)

I am open for suggestions, but for now, you would probably have to explicitly deal with the construction of a org.json4s.ast.fast JArray for different VM's (JS/JVM). A js.Array should have the same methods as a normal Array (plus a few others like pop)

@mdedetrich
Copy link
Contributor

Also note that there are .toArray and .toJSArray methods, see http:https://www.scala-js.org/doc/js-interoperability.html

@OlivierBlanvillain
Copy link
Contributor Author

IMO you should aim for a single crossProject.crossType(CrossType.Pure) for both safe and fast implementations. If the goal of json4s is "to provide a single AST to be used by other scala json libraries", these details should not be exposed to the library users, if not entirely left to the compiler.

and js.Array is quite a bit faster, its your native [] in Javascript

I think this would be worth a benchmark. Looking at the output of a fastOptJS on a trivial example it looks like scala.Array are compiled down to something very close to native loop over native []:

example.scala:

import scala.scalajs.js

object ScalaJSExample extends js.JSApp {
  def main(): Unit = {
    val scalaArray = Array(1, 2, 3)
    scalaArray foreach println

    val jsArray = js.Array(1, 2, 3)
    jsArray foreach println
  }
}

Relevant example-fastopt.js part:

$c_Lexample_ScalaJSExample$.prototype.main__V = (function() {
  var scalaArray = $m_s_Array$().apply__I__sc_Seq__AI(1, new $c_sjs_js_WrappedArray().init___sjs_js_Array([2, 3]));
  var i = 0;
  var len = scalaArray.u["length"];
  while ((i < len)) {
    var idx = i;
    var arg1 = scalaArray.u[idx];
    var this$5 = $m_s_Console$();
    var this$6 = this$5.outVar$2;
    $as_Ljava_io_PrintStream(this$6.tl$1.get__O()).println__O__V(arg1);
    i = ((1 + i) | 0)
  };
  var jsArray = [1, 2, 3];
  var i$1 = 0;
  var len$1 = $uI(jsArray["length"]);
  while ((i$1 < len$1)) {
    var index = i$1;
    var arg1$1 = jsArray[index];
    var this$9 = $m_s_Console$();
    var this$10 = this$9.outVar$2;
    $as_Ljava_io_PrintStream(this$10.tl$1.get__O()).println__O__V(arg1$1);
    i$1 = ((1 + i$1) | 0)
  }
});

If I where you I would even get ride of the safe/fast separation, and provide a single shared AST with every interface duplicated. For instance, JArray could have two constructors, one with Vector and one with Array, and expose both interfaces, with something like myArray.value and myArray.unsafeValue. You would probably opt for an internal Array storage, and to a toVector at the edge of the interface. But that's maybe another debate :)

@mdedetrich
Copy link
Contributor

IMO you should aim for a single crossProject.crossType(CrossType.Pure) for both safe and fast implementations. If the goal of json4s is "to provide a single AST to be used by other scala json libraries", these details should not be exposed to the library users, if not entirely left to the compiler.

Originally the intention was to provide a single AST, but it turned out that it was impossible to make anyone happy with any resulting design. Some people needed ordering, some people needed speed, some people needed safety, some people needed the library to use minimal memory, etc etc.

You can't make a JSON library that satisfies all of those requirements decently

I think this would be worth a benchmark. Looking at the output of a fastOptJS on a trivial example it looks like scala.Array are compiled down to something very close to native loop over native []:

Agreed, I need to set up a javascript benchmarking suite. Note that the idea of org.json4s.ast.fast is that its typically going to be used for things like high performance parsers or for streaming (i.e. what https://github.com/non/jawn does behind the scenes), hence why it uses js.Array. Performance in other areas, such as gradually building the array, matters a lot in these circumstances

If I where you I would even get ride of the safe/fast separation, and provide a single shared AST with every interface duplicated.

As mentioned before, this was tried before. There wasn't a single datastructure that is both high performance/memory efficient and is safe.

For instance, JArray could have two constructors, one with Vector and one with Array, and expose both interfaces, with something like myArray.value and myArray.unsafeValue. You would probably opt for an internal Array storage, and to a toVector at the edge of the interface. But that's maybe another debate :)

This would use a huge amount of memory, and its not something you can opt out of.

I think the easiest way to think about it, is that org.json4s.ast.fast is analogous to a Array[Byte], where as org.json4s.ast.safe is a String. You need both of them and they both need to be seperate in terms of interface.

For the same reason, we expect some libraries to implement org.json4s.ast.fast and some to implement org.json4s.ast.safe, and regardless of which version they implement, you can always jump between the two

@OlivierBlanvillain
Copy link
Contributor Author

Hi, sorry for the late reply. I shouldn't have included the last paragraph about the safe/fast separation, that was just me thinking out loud about the possibility of a compromise that would be both safe and fast. Please ignore it for this issue.

The original point I wanted to make was about the differences between the jvm and js versions, in particular in their fast vast variants. In my opinion both jvm/fast and js/fast ADT should have identical type signatures, that is using scala.Array in both places instead of scala.Array in one place and js.Array in the other.

Obviously benchmarks would help settle on this, but my intuition is that you can achieve consistency, and maybe even share most of (if not all) the implementations between the js and jvm versions without trading off performances.

I also back my request with one of point made in the "Principles for Modular, Functional, Approachable Libraries" by Erik Osheim, which goes into this direction: https://www.youtube.com/watch?v=iKyIKozv8a8&t=62m11s.

@mdedetrich
Copy link
Contributor

The original point I wanted to make was about the differences between the jvm and js versions, in particular in their fast vast variants. In my opinion both jvm/fast and js/fast ADT should have identical type signatures, that is using scala.Array in both places instead of scala.Array in one place and js.Array in the other.

Ideally yes, there is however a conflicting aim of org.json4s.ast being as fast as possible (which does mean using js.Array). I am trying to set up a microbench framework for Javascript, but I am still in the process of finding one

Obviously benchmarks would help settle on this, but my intuition is that you can achieve consistency, and maybe even share most of (if not all) the implementations between the js and jvm versions without trading off performances.

One thing to node, is that js.Array is source compatible with Array, its just not type compatible (iirc). So if you have source code that uses both org.json4s.ast.fast on both JVM and JS, you just need to recompile for the respective platforms.

I am happy to have my stance on this changed if the performance between Array and js.Array is negligible, I suppose that asks for a bigger push for getting the benchmarking out for the JS platform ;)

@OlivierBlanvillain
Copy link
Contributor Author

Great, looking forward to seeing the benchmarks then!

I didn't notice that Array and js.Array where source compatible. I the context of the library I'm working on I need to have the same types on both sides, I will stick to org.json4s.ast.safe for now.

@mdedetrich
Copy link
Contributor

Note that it is still possible to use Array and js.Array in your library, it just means you have to set up a different build process to make seperate binaries for both Javascript/JVM (note that you have to do this anyways).

There might be a nicer way to fix this, like using some sought of union type?

I have been doing some research into a benchmarking facility for Scala.js, but I am not getting anything promising. There is however work to get scalameter working for Scala.js (scalameter/scalameter#154 (comment))

@OlivierBlanvillain
Copy link
Contributor Author

If you need some tests cases for a POC benchmark you might be interested in the test suite of the library I'm working on : https://github.com/OlivierBlanvillain/validation/tree/scala-js-squashed/validation-json4s/src/test/scala (still wip). It uses ScalaTest to run the tests in both js and jvm, and outputs the execution time for both platform.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

No branches or pull requests

2 participants