forked from arangodb/arangodb
CHANGELOG
devel
-----
* Updated arangosync to v2.13.0-preview-4.
* Delay a MoveShard operation for leader change, until the old leader has
actually assumed its leadership and until the new leader is actually in
sync. This fixes a bug which could block a shard under certain circumstances.
This fixes BTS-1110.
* Fixed issue #17367: FILTER fails when using negation (!) on variable whose
name starts with "in". Add trailing context to NOT IN token.
* Do not query vertex data in K_PATHS queries if vertex data is not needed.
* Removed assertions from cluster rebalance js test that required the rebalance
plan to always contain moves, although there are cases in which there are none.
* Show number of HTTP requests in cluster query profiles.
* Improved the syntax highlighter for AQL queries in the web interface
with support for multi-line strings, multi-line identifiers in forward
and backticks, colorization of escape sequences, separate tokens for
pseudo-keywords and pseudo-variables, an updated regex for numbers, the
addition of the AT LEAST and WITH COUNT INTO constructs, and the
SHA256() function.
* Enable "collect-in-cluster" optimizer rule for SmartGraph edge collections.
* Improve performance and memory usage of IN list lookups for hash, skiplist
and persistent indexes.
* Improve memory usage tracking for IN list lookups and other RocksDB-based
lookups.
* Remove inactive query plan cache code (was only a stub and never enabled
before).
* Fixed BTS-441: Honor read only mode with disabled authentication
* Obsolete startup option `--database.force-sync-properties`. This option
was useful with the MMFiles storage engine, but didn't have any useful
effect when used with the RocksDB engine.
* Updated OpenSSL to 1.1.1s.
* BTS-483: Added restriction for usage of query cache for streaming and JS
transactions when they are not read-only.
* Remove map and map.gz files from repository and add them to gitignore.
These files are only used for debugging and therefore should not be
included in any release. This also reduces the size of release packages.
* Repair "load indexes into memory" function in the web UI.
* Improved help texts for the collection type and satellite collection
options in the web UI.
* APM-517: Add tooltips with values of the displayed properties after
clicking a node or an edge in the graph viewer.
* Deprecate the startup option `--agency.pool-size`. This option was never
properly supported for any values other than the value of `--agency.size`.
Any value set for `--agency.pool-size` other than the value set for
`--agency.size` will now produce a fatal error on startup.
* BTS-1082: Updating properties of a satellite collection breaks
replicationFactor.
* FE-159: When creating a database in cluster mode, several parameters are
required. However, they were invisible (nothing shown) when opening the
database settings after creation. Those settings should be visible in
read-only mode (greyed out).
* BTS-209: Fixed requests to `_admin/execute` treating every payload as plain
text even when it is in JSON or velocypack format. The payload is only
treated as velocypack if this is specified in the `content-type` header.
* Fixed issue #17394: Unnecessary document-lookup instead of Index-Only query.
This change improves projection handling so that more projections can be
served from indexes.
* Updated arangosync to v2.13.0-preview-2.
* BTS-1070: Fixed query explain not dealing with an aggregate function without
arguments and the WINDOW node not being defined as an Ast node type name.
* Solve a case of excessive memory consumption in certain AQL queries with
IN filters with very long lists. Free sub-iterators as soon as they are
exhausted.
* Improved shard distribution during collection creation.
* Change default output format of arangoexport from `json` to `jsonl`.
* Added startup option `--query.log-failed` to optionally log all failed AQL
queries to the server log. The option is turned off by default.
* Added startup option `--query.log-memory-usage-threshold` to optionally log
all AQL queries that have a peak memory usage larger than the configured
value. The default value is 4GB.
* Added startup option `--query.max-artifact-log-length` to control the
maximum length of logged query strings and bind parameter values.
This allows truncating overly long query strings and bind parameter values
to a reasonable length. Previously the cutoff length was hard-coded.
* Fixed GitHub issue #17291: Fixed a server crash which could occur in case an
AQL query using a PRUNE or FILTER statement, combined with UDFs (user defined
functions), got executed.
* Improve cardinality estimate for AQL EnumerateCollectionNode in case a
`SORT RAND() LIMIT 1` is used. Here, the estimated number of items is at
most 1.
* ES-1312: fix handling of reaching the WAL archive capacity limit.
* BTS-941: The HTTP API now delivers the correct list of the collection's
shards in case a collection from an EnterpriseGraph, SmartGraph, Disjoint
EnterpriseGraph, Disjoint SmartGraph or SatelliteGraph is being used.
* Log the document counts on leader and follower shards at the end of each
successful shard synchronization.
* Changed the encoding of revision ids returned by the following REST APIs:
- GET /_api/collection/<collection-name>/revision: the revision id was
previously returned as numeric value, and now it will be returned as
a string value with either numeric encoding or HLC-encoding inside.
- GET /_api/collection/<collection-name>/checksum: the revision id in
the "revision" attribute was previously encoded as a numeric value
in single server, and as a string in cluster. This is now unified so
that the "revision" attribute always contains a string value with
either numeric encoding or HLC-encoding inside.
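Client code that reads these attributes may have to cope with both the old numeric form and the new string form. The following is a minimal sketch of such a normalization step; the helper name `normalize_revision` and the sample response bodies are made up for illustration and are not part of any ArangoDB driver.

```python
import json


def normalize_revision(value):
    """Return a revision id as a string, whichever form the server sent."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("revision id must be a number or a string")
    if isinstance(value, int):
        # Older responses: the revision id arrives as a JSON number.
        return str(value)
    if isinstance(value, str):
        # Newer responses: already a string (numeric or HLC-encoded inside).
        return value
    raise TypeError("revision id must be a number or a string")


# Hypothetical response bodies, for illustration only.
old_body = json.loads('{"revision": 1234567890}')
new_body = json.loads('{"revision": "1234567890"}')
same = normalize_revision(old_body["revision"]) == normalize_revision(new_body["revision"])
print(same)
```

Treating the value as an opaque string avoids guessing whether the content is numeric or HLC-encoded.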
* Fixed handling of empty URL parameters in HTTP request handling.
* Fixed diffing of completely non-overlapping revision trees, which could
lead to out-of-bounds reads at the right end of the first (smaller) tree.
* Fixed aborting the server process if an exception was thrown in C++ code
that was invoked from the llhttp C code dispatcher. That dispatcher code
couldn't handle C++ exceptions properly.
* Fixed BTS-1073: Fix encoding and decoding of revision ids in replication
incremental sync protocol. Previously, the encoding of revision ids could
be ambiguous under some circumstances, which could prevent shards from
getting into sync.
* Log better diagnosis information in case multiple servers in a cluster are
configured to use the same endpoint.
* Fixed BTS-852 (user's saved queries used to disappear after updating user profile).
* MDS-1016: When creating a new collection the fields "Number of Shards" and
"Replication factor" are greyed out now when the field "Distribute shards
like" is not empty.
* MDS-1019: Make user search case-insensitive and allow search by name.
* BTS-465: Added tests for RandomGenerator and warning that other options
for creating random values that are not Mersenne are deprecated.
* BTS-1008: Update react-autocomplete-input to fix single letter collection bug
when creating a link in the views in the WebUI.
* Improved optimization of functions to be covered by Traversals. Now more functions
should be optimized into the traversal, and some that are not valid should not be optimized
anymore. Fixes #16589.
* BTS-908: Fixed WebUI GraphViewer not being able to create a new edge relation
between two nodes in cases where only one edge definition has been defined
inside the graph definition.
* Fixed BTS-850: Fixed the removal of already deleted orphan collections out
of a graph definition. Previously, removing an orphan collection from a
graph definition failed and was rejected if the collection had already been
dropped.
* BTS-1061: ARM was not recognized on Apple M1.
* BTS-977: Added an error message for when an unauthorized user makes an
HTTP GET request to the current database
(`_db/<dbName>/_api/database/current`). The request now produces the same
error message for a database name that exists but that the user can't
access and for a database name that doesn't exist.
* BTS-325: Changed the HTTP status code of the ArangoDB error code
`ERROR_GRAPH_REFERENCED_VERTEX_COLLECTION_NOT_USED` from `400` to `404`, to
handle this error in accordance with our edge errors.
* Added new AQL function SHA256(value).
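To cross-check values outside the database, a SHA-256 digest can be computed with Python's standard library. This sketch assumes that `SHA256()` returns the digest as a hexadecimal string and that string input is hashed as UTF-8; both are assumptions, not statements taken from this changelog.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of a string as lowercase hexadecimal."""
    # Assumption: the input is encoded as UTF-8 before hashing.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


print(sha256_hex("foobar"))
```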
* Adjust permissions for "search-alias" views.
Previously, "search-alias" views were visible to users that didn't have read
permissions on the underlying referenced collections. This was inconsistent,
because "arangosearch" views weren't shown to users that didn't have read
permissions on the underlying links.
Now, the behavior for "search-alias" views is the same as for "arangosearch"
views, i.e. "search-alias" views are not shown and are not accessible for
users that don't have at least read permissions on the underlying collections.
* BTS-969: Added restriction for HTTP request `/cluster/rebalance` not to
consider servers that have failed status as a possible target for rebalancing
shards in its execution plan.
* Added index cleanup in Supervision. If an index was not created successfully
and the coordinator which initiated the creation was rebooted or is dead,
then the agency Supervision will drop the index again. If it was created
successfully, the agency Supervision will finalize it.
* BTS-742: Added a restriction so that SmartGraphs do not accept satellites in
an invalid format (like `{satellites: null}`) when storing a graph.
* Temporary fix for BTS-1006 (hides new view types).
* Updated arangosync to v2.12.0.
* Improve upload and download speed of hotbackup by changing the way we use
rclone. Empty hash files are now uploaded or downloaded by pattern, and
all other files are done in batches without remote directory listing,
which allows rclone to parallelize and avoid a lot of unnecessary network
traffic. The format of hotbackups does not change at all.
* Fixed issue BTS-1018: Improve logging of binary velocypack request data.
* BTS-477: added integration tests for covering log parameters.
* Moved the handling of escaping control and unicode chars in the log to
the Logger instead of LogAppenderFile.
* Updated ArangoDB Starter to 0.15.5.
* Updated arangosync to v2.12.0-preview-14.
* Fixed BTS-1017: Fixed a graph search issue, where subqueries lead to
incorrect results when they have been pushed down fully onto a DBServer
when they are in a Hybrid Disjoint SmartGraph context and
SatelliteCollections were part of it.
* Fixed issue BTS-1023:
Added Linux-specific startup option `--use-splice-syscall` to control
whether the Linux-specific splice() syscall should be used for copying
file contents. While the syscall is generally available since Linux 2.6.x,
it is also required that the underlying filesystem supports the splice
operation. This is not true for some encrypted filesystems, on which
splice() calls thus fail.
By setting the startup option `--use-splice-syscall` to `false`, a less
efficient, but more portable user-space file copying method will be
used instead, which should work on all filesystems.
The startup option is not available on other operating systems than Linux.
* Added authenticate header to the HTTP response when status code is 401
for HTTP/2.
* Best quality spam pushed down to DEBUG.
* Updated arangosync to v2.12.0-preview-13.
* Implement prefetch for revision trees, in case a batch is created with
a distinguished collection as for `SynchronizeShard`. This ensures that
the revision tree for the batch will be available when needed, even though
the revision tree for the collection might already have advanced beyond
the sequence number of the snapshot in the batch. This ensures that
shards can get in sync more reliably and more quickly.
* Fixed log with json format not respecting the value of parameter
`--log.shorten-filenames`.
* Updated arangosync to v2.12.0-preview-12.
* Added "intermediateCommits" statistics return value for AQL queries, to
relay the number of intermediate commits back that a write query performed.
* Updated ArangoDB Starter to 0.15.5-preview-3.
* Fixed a rarely occurring issue where paths inside a DisjointSmart traversal
containing only satellite relevant nodes were not returned properly
(ES-1265).
* Fixed BTS-926: UI showing the "create index" form to non-admin users.
* Updated Views UI with all changes necessary for the 3.10.0 launch.
* Added message on the UI view of Logs when the user has restricted access,
either because they cannot access `_system`, or because they are currently in
another database.
* Fix for the Pregel's HITS algorithm using a fixed value instead of the
passed "threshold" parameter. The same applied to the new HITSKleinberg.
* Do not drop follower shard after too many failed shard synchronization
attempts.
* Added startup option `--arangosearch.skip-recovery` to skip the recovery
of arangosearch view links or inverted indexes.
The startup option can be specified multiple times and is expected to either
contain the string `all` (will skip the recovery for all view links and
inverted indexes) or a collection name + link id/name pair (e.g.
`testCollection/123456`, where `123456` is a link/index id or an index name).
This new startup option is an emergency means to speed up lengthy recovery
procedures when there is a large WAL backlog to replay. The normal recovery
will still take place even with the option set, but recovery data for
links/indexes can be skipped. This can improve the recovery speed and reduce
memory usage during the recovery process.
All links or inverted indexes that are marked as to-be-skipped via the
option, but for which there is recovery data, will be marked as "out of sync"
at the end of the recovery.
The recovery procedure will also print a list of links/indexes which it has
marked as out-of-sync.
Additionally, if committing data for a link/index fails for whatever reason,
the link/index is also marked as being out-of-sync.
Whether an out-of-sync link or index can be used in queries depends on another new
startup option `--arangosearch.fail-queries-on-out-of-sync`. It defaults to
`false`, meaning that out-of-sync links/indexes can still be queried. If the
option is set to `true`, queries on such links/indexes will fail with error
"collection/view is out of sync" (error code 1481).
Links/indexes that are marked out-of-sync will keep the out-of-sync flag
until they are dropped. To get rid of an out-of-sync link/index it is
recommended to manually drop and recreate it. As recreating a link/index may
cause high load, this is not done automatically but requires explicit user
opt-in.
The number of out-of-sync links/indexes is also observable via a new metric
`arangodb_search_num_out_of_sync_links`.
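The two accepted value shapes for `--arangosearch.skip-recovery` can be illustrated with a small parser. This is only a sketch of the format described above, not the server's own option handling; the function name `parse_skip_recovery` is hypothetical.

```python
def parse_skip_recovery(values):
    """Split option values into a skip-everything flag and (collection, link) pairs."""
    skip_all = False
    pairs = []
    for value in values:
        if value == "all":
            # "all" skips recovery for every view link and inverted index.
            skip_all = True
            continue
        # Otherwise expect "<collection>/<link id or index name>".
        collection, sep, link = value.partition("/")
        if not sep or not collection or not link:
            raise ValueError(f"expected 'all' or '<collection>/<link>', got {value!r}")
        pairs.append((collection, link))
    return skip_all, pairs


# The option may be given several times, so the input is a list of values.
print(parse_skip_recovery(["testCollection/123456", "all"]))
```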
* Added startup option `--rocksdb.periodic-compaction-ttl`.
This option controls the TTL (in seconds) for periodic compaction of
.sst files in RocksDB, based on the .sst file age. The default value
from RocksDB is ~30 days. To avoid periodic auto-compaction, the option
can be set to 0.
* Now the Pregel API returns `{... algorithm: "pagerank", ...}` instead of
`{... algorithm: "PageRank", ...}` when the Page Rank algorithm is run
(in accordance with the documentation).
* Updated arangosync to v2.12.0-preview-11.
* Added integration tests for `--log.escape-control-chars` and
`--log.escape-unicode-chars`.
* Fix SEARCH-350: Crash during consolidation.
* SEARCH-357: Added SUBSTRING_BYTES function.
* A new Pregel algorithm: the version of Hypertext-Induced Topic Search (HITS)
as described in the original paper.
* Web UI: Reduce size and initial render height of a modal (fixes BTS-940).
* Disable optimization rule to avoid crash (BTS-951).
* Fix comparison of JSON schemas on DB servers after there was a schema change
via a coordinator: the schema comparison previously did not take into account
that some ArangoDB versions store an internal `{"type":"json"}` attribute in
the schema, and some don't. Thus two identical schemas could compare
differently.
The correct schema version was always applied and used, and validation of
documents against the schema was also not affected. However, because two
schemas could compare unequal, this could have caused unnecessary repeated
work for background maintenance threads.
* Removed transitive node dependencies.
* Web UI: Now correctly handles the server error response when an error occurred
during the modification of a document or an edge (BTS-934).
* Make graph search case-insensitive (fixes BTS-882).
* BTS-428: Added function DATE_ISOWEEKYEAR that retrieves the number of the
week counting from when the year started in ISO calendar and also the year
it's in.
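The ISO week and the ISO year can differ from the calendar year around New Year, which is why both values are needed. Python's `date.isocalendar()` follows the same ISO 8601 rules and can serve as a reference; the dictionary shape returned here is an assumption for illustration, not the documented return format of `DATE_ISOWEEKYEAR`.

```python
from datetime import date


def iso_week_year(d):
    """Return the ISO 8601 week number and the ISO year that week belongs to."""
    iso_year, iso_week, _weekday = d.isocalendar()
    return {"week": iso_week, "year": iso_year}


# 1 January 2023 is a Sunday, so it still belongs to the last ISO week of 2022.
print(iso_week_year(date(2023, 1, 1)))
```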
* Added handling of requests with `Transfer-Encoding: chunked`, which is
not implemented, so the server returns HTTP status code 501.
* Add progress reporting to RocksDB WAL recovery, in case there are many WAL
files to recover.
* Updated ArangoDB Starter to 0.15.5-preview-2.
* Fixed BTS-918 (incorrectly navigating back 1 level in history when a modal-dialog element is present).
* Updated arangosync to v2.12.0-preview-9.
* Disallowed index creation that covers fields in which the field's name starts
or ends with `:` for single server or cluster when the instance is a
coordinator or single server. This validation only happens for index creation,
so already existing indexes that might use such field names will remain as they are.
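A client can apply the same rule before sending an index creation request. The check below is a sketch of the rule as stated above (field names must not start or end with `:`); the helper name `rejected_index_fields` is hypothetical, and the server may apply further validation that is not shown here.

```python
def rejected_index_fields(fields):
    """Return the field names that start or end with a colon."""
    return [name for name in fields if name.startswith(":") or name.endswith(":")]


# A colon in the middle of a name is not covered by the rule described above.
print(rejected_index_fields(["name", ":hidden", "trailing:", "a:b"]))
```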
* Updated arangosync to v2.12.0-preview-6.
* When using `SHORTEST_PATH`, `K_SHORTEST_PATHS`, `ALL_SHORTEST_PATHS`, or
`K_PATHS` in an AQL query and the query itself produced warnings during
execution, the type was reported incorrectly: it was always reported as
`SHORTEST_PATH` and not as the specific type used.
* Updated warning messages raised for non accepted query OPTIONS,
distinguishing between when the OPTIONS attribute is correct, but the value
is in incorrect format, and when the OPTIONS attribute itself is incorrect.
* Since ArangoDB 3.8 there was a loophole for creating duplicate keys in the
same collection. The requirements were:
- cluster deployment
- needs at least two collections (source and target), and the target
collection must have more than one shard and must use a custom shard key.
- inserting documents into the target collection must have happened via an
AQL query like `FOR doc IN source INSERT doc INTO target`.
In this particular combination, the document keys (`_key` attribute) from
the source collection were used as-is for insertion into the target
collection. However, as the target collection is not sharded by `_key` and
uses a custom shard key, it is actually not allowed to specify user-defined
values for `_key`. That check was missing since 3.8 in this particular
combination and has now been added back. AQL queries attempting to insert
documents into a collection like this will now fail with the error "must not
specify _key for this collection", as they used to do before 3.8.
* Updated ArangoDB Starter to 0.15.5-preview-1.
* Updated arangosync to v2.12.0-preview-4.
* Improve error handling for passing wrong transaction ids / cursor ids / pregel
job ids to request forwarding. Also prevent the error "transaction id not
found" in cases when request forwarding was tried to a coordinator that was
recently restarted.
* Added startup option `--rocksdb.verify-sst` to validate sst files already
present in the database directory on startup. Default: false.
* BTS-907: Fixed some rare SortNode related optimizer issues, when at least two
or more SortNodes appeared in the AQL execution plan.
* Updated arangosync to v2.12.0-preview-3.
* Added new AQL function `VALUE` capable of accessing object attribute by a
specified path.
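The idea of path-based attribute access can be sketched in a few lines. This example assumes the path is a list of attribute names and array indexes and that a missing step yields no value; both are assumptions made for illustration, and the helper `value_at` is not the AQL implementation.

```python
def value_at(obj, path):
    """Follow a list of attribute names / array indexes into a nested value."""
    current = obj
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            # Assumed behavior: a step that cannot be resolved yields no value.
            return None
    return current


doc = {"a": {"b": [10, 20]}}
print(value_at(doc, ["a", "b", 1]))
```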
* Added OFFSET_INFO function (Enterprise Edition only) to support search results
highlighting.
* Updated Rclone to v1.59.0.
* Add serverId parameter to _admin/log/level. Allows you to forward the request to
other servers.
* Updated OpenSSL to 1.1.1q and OpenLDAP to 2.6.3.
* Updated arangosync to v2.12.0-preview-2.
* ArangoSearch nested search feature (Enterprise Edition): Added ability to index
and search nested documents with ArangoSearch views.
* Fixed handling of illegal edges in Enterprise Graphs. Adding an edge to a SmartGraph
vertex collection through document API caused incorrect sharding of the edge. Now
this edge is rejected as invalid. (BTS-906)
* Removed unused log topics "CLUSTERCOMM", "COLLECTOR" and "PERFORMANCE" from
the code.
* Added ALL_SHORTEST_PATHS functionality to find all shortest paths between two
given documents.
* Added another test for computedValues attribute keepNull.
* BTS-913: check for proper timezone setup of the system on startup.
This will then log errors that would otherwise only occur in AQL functions at
runtime.
* Changed rocksdb default compression type from snappy to lz4.
* Fixed a potential deadlock in RocksDB compaction.
For details see https://github.com/facebook/rocksdb/pull/10355
* Added more specific process exit codes for arangod and all client tools,
and changed the executables' exit code for the following situations:
- an unknown startup option name is used: previously the exit code was 1.
Now the exit code when using an invalid option is 3 (symbolic exit code
name EXIT_INVALID_OPTION_NAME).
- an invalid value is used for a startup option (e.g. a number that is
outside the allowed range for the option's underlying value type, or a
string value is used for a numeric option): previously the exit code was
1. Now the exit code in this case is 4 (symbolic exit code name
EXIT_INVALID_OPTION_VALUE).
- a config file is specified that does not exist: previously the exit code
was either 1 or 6 (symbolic exit code name EXIT_CONFIG_NOT_FOUND). Now
the exit code in this case is always 6 (EXIT_CONFIG_NOT_FOUND).
- a structurally invalid config file is used, e.g. the config file contains
a line that cannot be parsed: previously the exit code in this situation
was 1. Now it is always 6 (symbolic exit code name EXIT_CONFIG_NOT_FOUND).
Note that this change can affect any custom scripts that check for startup
failures using the specific exit code 1. These scripts should be adjusted so
that they check for a non-zero exit code. They can opt in to more specific
error handling using the additional exit codes mentioned above, in order to
distinguish between different kinds of startup errors.
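A wrapper script could map the exit codes listed above to their symbolic names and treat every other non-zero code as a generic failure. This is a minimal sketch: only the three codes named in this entry are included, and the function name `describe_exit` is hypothetical.

```python
# Exit codes taken from the entry above; other codes are intentionally not listed.
STARTUP_EXIT_CODES = {
    3: "EXIT_INVALID_OPTION_NAME",
    4: "EXIT_INVALID_OPTION_VALUE",
    6: "EXIT_CONFIG_NOT_FOUND",
}


def describe_exit(code):
    """Classify a process exit code without relying on the old catch-all value 1."""
    if code == 0:
        return "ok"
    # Any non-zero code is a failure; the known ones get a more specific label.
    return STARTUP_EXIT_CODES.get(code, "unspecified failure")


for code in (0, 1, 3, 4, 6):
    print(code, describe_exit(code))
```

In a real script the code would typically come from `subprocess.run([...]).returncode`.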
* arangoimport now supports the option --remove-attribute on type JSON as well.
Before it was restricted to TSV and CSV only.
* Added CSP recommended headers to Aardvark app for better security.
* Fixed BTS-851: "Could not fetch the applier state of: undefined".
* Removed internal JavaScript dependencies "expect.js", "media-typer" and
"underscore". We recommend always bundling your own copy of third-party
modules as all previously included third-party modules are now considered
deprecated and may be removed in future versions of ArangoDB.
* APM-84: Added option to spill intermediate AQL query results from RAM to
disk when their size exceeds certain thresholds. Currently the only AQL
operation that can make use of this is the SortExecutor (AQL SORT operation
without using a LIMIT). Further AQL executor types will be supported in
future releases.
Spilling over query results from RAM to disk is off by default and currently
in an experimental stage. In order to opt in to the feature, it is required
to set the following startup option `--temp.intermediate-results-path`.
The directory specified here must not be located underneath the instance's
database directory.
When this startup option is specified, ArangoDB assumes ownership of that
directory and will wipe its contents on startup and shutdown. The directory
can be placed on ephemeral storage, as the data stored inside it is there
only temporarily, while the instance is running. It does not need to be
persisted across instance restarts and does not need to be backed up.
When a directory is specified via the startup option, the following
additional configuration options can be used to control the threshold
values for spilling over data:
- `--temp.intermediate-results-capacity`: maximum on-disk size (in bytes)
for intermediate results. If set to 0, the on-disk size is not
constrained. It can be set to a value other than 0 to restrict the
size of the temporary directory. Once the cumulated on-disk size of
intermediate results reaches the configured maximum capacity, the
query will be aborted with failure "disk capacity limit for intermediate
results exceeded".
- `--temp.intermediate-results-spillover-threshold-num-rows`: number of
result rows after which a spillover from RAM to disk will happen.
- `--temp.intermediate-results-spillover-threshold-memory-usage`: memory
usage (in bytes) after which a spillover from RAM to disk will happen.
- `--temp.intermediate-results-encryption`: whether or not the on-disk
data should be encrypted. This option is only available in the Enterprise
Edition.
- `--temp.intermediate-results-encryption-hardware-acceleration`: whether
or not to use hardware acceleration for the on-disk encryption. This
option is only available in the Enterprise Edition.
Please note that the feature is currently still experimental and may slightly
change in future releases. As mentioned, the only Executor that can make
use of spilling data to disk is the SortExecutor (SORT without LIMIT).
Also note that the query results will still be built up entirely in RAM
on coordinators and single servers for non-streaming queries. In order to
avoid the buildup of the entire query result in RAM, a streaming query
should be used.
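The options above can be assembled into a command line with a small helper that also enforces the rule that the directory must not sit underneath the database directory. This is a sketch under stated assumptions: the helper name `spillover_args`, the example paths and the threshold values are all hypothetical, and rejecting a path equal to the database directory is an extra precaution not spelled out above.

```python
from pathlib import PurePosixPath


def spillover_args(results_path, database_dir, capacity_bytes=0,
                   threshold_rows=None, threshold_bytes=None):
    """Build startup arguments for spilling intermediate AQL results to disk."""
    tmp = PurePosixPath(results_path)
    db = PurePosixPath(database_dir)
    # The directory is wiped on startup and shutdown, so keep it out of the data dir.
    if tmp == db or db in tmp.parents:
        raise ValueError("intermediate results path must not be inside the database directory")
    args = [
        "--temp.intermediate-results-path", str(tmp),
        "--temp.intermediate-results-capacity", str(capacity_bytes),  # 0 = unconstrained
    ]
    if threshold_rows is not None:
        args += ["--temp.intermediate-results-spillover-threshold-num-rows", str(threshold_rows)]
    if threshold_bytes is not None:
        args += ["--temp.intermediate-results-spillover-threshold-memory-usage", str(threshold_bytes)]
    return args


# Hypothetical paths and thresholds, for illustration only.
print(spillover_args("/mnt/scratch/aql-tmp", "/var/lib/arangodb3",
                     threshold_rows=1_000_000, threshold_bytes=128 * 1024 * 1024))
```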
* Enterprise only: Added `MINHASH`, `MINHASH_MATCH`, `MINHASH_ERROR`,
`MINHASH_COUNT` AQL functions.
* Enterprise only: Added `minhash` analyzer.
* BugFix in Pregel's status: When loading the graph into memory,
Pregel's state is now 'loading' instead of 'running'. When loading is finished,
Pregel's state changes to the 'running' state.
* arangoimport now supports an additional option "--overwrite-collection-prefix".
This option will only help while importing edge collections, and if it is used
together with "--to-collection-prefix" or "--from-collection-prefix". If there
are vertex collection prefixes in the file you want to import (e.g. you just
exported an edge collection from ArangoDB) you allow arangoimport to overwrite
those with the commandline prefixes. If the option is false (default value)
only _from and _to values without a prefix will be prefixed by the handed in
values.
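The prefixing rule described above can be modelled as follows. This is an emulation of the described behavior, not arangoimport's code; the function `apply_prefix` and the sample collection names are made up for illustration.

```python
def apply_prefix(value, prefix, overwrite=False):
    """Emulate how a _from/_to value is combined with a command-line prefix."""
    collection, sep, key = value.partition("/")
    if not sep:
        # No collection prefix in the input file: the command-line prefix is added.
        return f"{prefix}/{value}"
    if overwrite:
        # Prefix present in the file and overwriting enabled: replace it.
        return f"{prefix}/{key}"
    # Prefix present in the file, overwriting disabled (default): keep the value.
    return value


for raw in ("123", "oldUsers/123"):
    print(raw, "->", apply_prefix(raw, "users"), "|", apply_prefix(raw, "users", overwrite=True))
```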
* Added startup option `--rocksdb.compaction-style` to configure the compaction
style which is used to pick the next file(s) to be compacted.
* BugFix in Pregel's Label Propagation: the union of three undirected cliques
of size at least three connected by an undirected triangle now returns
three communities (each clique is a community) instead of two.
* Pregel now reports correct and ongoing runtimes for loading, running, and
storing as well as runtimes for the separate global supersteps.
* Fixed parsing of K_SHORTEST_PATHS queries to not allow ranges anymore.
* Updated arangosync to v2.11.0.
* Add log.time-format utc-datestring-micros to make debugging of concurrency
bugs easier.
* Renamed KShortestPathsNode to EnumeratePathsNode; this is visible in
explain outputs for AQL queries.
* Pregel SSSP now supports `resultField` as well as `_resultField` as
parameter name to specify the field into which results are stored.
The name `_resultField` will be deprecated in future.
* Update Windows CI compiler to Visual Studio 2022.
* Web UI: Fixes a GraphViewer issue related to display issues with node
and edge labels. Boolean node or edge values could not be used as label
values (ES-1084).
* Made the SortExecutor receive its input incrementally, instead of receiving
a whole matrix containing all input at once.
* Optimization for index post-filtering (early pruning): in case an index
is used for lookups, and the index covers the IndexNode's post-filter
condition, then loading the full document from the storage engine is
now deferred until the filter condition is evaluated and it is established
that the document matches the filter condition.
* Added a fully functional UI for Views that lets users view, modify mutable
properties and delete views from the web UI.
* Fix thread ids and thread names in log output for threads that are not
started directly by ArangoDB code, but indirectly via library code.
Previously, the ids of these threads were always reported as "1", and
the thread name was "main". Now return proper thread ids and names.
* Changed default Linux CI compiler to gcc-11.
* Updated arangosync to v2.11.0-preview-2.
* Add "AT LEAST" quantifier for array filters in AQL:
`RETURN [1,2,3][? AT LEAST (3) FILTER CURRENT > 42]`
`RETURN [1,2,3] AT LEAST (2) IN [1,2,3,4,5]`
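The semantics of the quantifier can be modeled in a few lines of Python. This
is only a sketch of the counting rule, not the AQL implementation; the helper
name `at_least` is hypothetical.

```python
from typing import Any, Callable, Iterable


def at_least(values: Iterable[Any], n: int, predicate: Callable[[Any], bool]) -> bool:
    """Return True if at least n members of values satisfy predicate."""
    matches = sum(1 for value in values if predicate(value))
    return matches >= n


# Model of: [1,2,3][? AT LEAST (3) FILTER CURRENT > 42]
first = at_least([1, 2, 3], 3, lambda current: current > 42)

# Model of: [1,2,3] AT LEAST (2) IN [1,2,3,4,5]
second = at_least([1, 2, 3], 2, lambda current: current in [1, 2, 3, 4, 5])
```

In this model, `first` is `False` (no member is greater than 42) and `second`
is `True` (all three members are contained in the right-hand array).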
* Changed default macOS CI compiler to LLVM clang-14.
* Added an automatic cluster rebalance API. Use `GET _admin/cluster/rebalance`
to receive an analysis of how imbalanced the cluster is. Calling it with
`POST _admin/cluster/rebalance` computes a plan of move shard operations to
rebalance the cluster. Options are passed via the request body. After
reviewing the plan, one can use `POST _admin/cluster/rebalance/execute` to
put that plan into action.
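The three calls can be sketched with Python's standard library as follows. The
base URL, the helper name `rebalance_request` and the action labels are
assumptions made for illustration. The request body is left empty because the
option names are not listed in this entry, and authentication is omitted.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:8529"  # assumption: a locally reachable coordinator

# Endpoints as named in the changelog entry above.
ROUTES = {
    "analyze": ("GET", "/_admin/cluster/rebalance"),
    "plan": ("POST", "/_admin/cluster/rebalance"),
    "execute": ("POST", "/_admin/cluster/rebalance/execute"),
}


def rebalance_request(action: str, body: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) the HTTP request for one rebalance step."""
    method, path = ROUTES[action]
    data = None
    if method == "POST":
        # Options are passed via the request body; an empty object is used here.
        data = json.dumps(body or {}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
```

A request built this way could be sent with `urllib.request.urlopen(...)`
against a running cluster; reviewing the computed plan before calling the
`execute` step remains a manual decision.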
* Introduce reading from followers in clusters. This works by offering
an additional HTTP header "x-arango-allow-dirty-read" for certain
read-only APIs. This header has already been used for active failover
deployments to allow reading from followers. Using this header allows
coordinators to read from follower shards instead of only from leader
shards. This can help to spread the read load
better across the cluster. Obviously, using this header can result in
"dirty reads", which are read results returning stale data or even
not-yet-officially committed data. Use at your own risk if performance
is more important than correctness or if you know that data does not
change.
The responses which can contain dirty reads will have the HTTP header
"x-arango-potential-dirty-read" set to "true".
There are the following new metrics showing the use of this feature:
- `arangodb_dirty_read_transactions_total`
- `arangodb_potentially_dirty_document_reads_total`
- `arangodb_dirty_read_queries_total`
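A client-side sketch of the header handling, using only Python's standard
library. The helper names are hypothetical, and sending the request header with
the value "true" is an assumption; the entry above only names the header.

```python
import urllib.request
from typing import Mapping

ALLOW_HEADER = "x-arango-allow-dirty-read"
RESULT_HEADER = "x-arango-potential-dirty-read"


def dirty_read_request(url: str) -> urllib.request.Request:
    """Build a GET request that permits reading from follower shards."""
    # Assumption: the value "true" opts in to dirty reads.
    return urllib.request.Request(url, headers={ALLOW_HEADER: "true"}, method="GET")


def may_contain_dirty_reads(response_headers: Mapping[str, str]) -> bool:
    """Return True if the response was flagged as a potential dirty read."""
    for name, value in response_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == RESULT_HEADER:
            return value.strip().lower() == "true"
    return False
```

Callers that cannot tolerate stale data can check `may_contain_dirty_reads()`
on each response and retry without the header if needed.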
* Changed HTTP response code for error number 1521 from 500 to 400.
Error 1521 (query collection lock failed) is nowadays only emitted by
traversals, when a collection is accessed during the traversal that has
not been specified in the WITH statement of the query.
Thus returning HTTP 500 is not a good idea, as it is clearly a user error
that triggered the problem.
* Renamed the `--frontend.*` startup options to `--web-interface.*`:
- `--frontend.proxy-request.check` -> `--web-interface.proxy-request.check`
- `--frontend.trusted-proxy` -> `--web-interface.trusted-proxy`
- `--frontend.version-check` -> `--web-interface.version-check`
The former startup options are still supported.
* Added Enterprise Graph feature to enterprise version of ArangoDB.
The enterprise graph is another graph sharding model. It is less strict
than SmartGraphs, and therefore easier to start with, as it does not require
a smartGraphAttribute and allows free choice of vertex _key values. It still
maintains performance gains as compared to general graphs. For more details,
please check the documentation.
* APM-135: Added multithreading to assigning non-unique indexes to documents,
in foreground or background mode. The number of index creation threads
is hardcoded to 2 for now. Improvements for higher parallelism are expected
for future versions.
* Issue 15592: Permit `MERGE_RECURSIVE()` to be called with a single argument.
* Fixed issue 16337: arangoimport with `--headers-file` and `--merge-attributes`
merges column names instead of row values on the first line of a CSV file.
Additionally, floating-point numbers are now merged using their standard
string representation instead of with a fixed precision of 6 decimal places.
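The difference between the two number formats can be illustrated in Python.
This is an analogy only: Python's formatting stands in for the behavior of
arangoimport, and both helper names are hypothetical.

```python
def fixed_precision(value: float) -> str:
    """Format with a fixed precision of 6 decimal places (old behavior model)."""
    return f"{value:.6f}"


def standard_representation(value: float) -> str:
    """Format using the shortest standard representation (new behavior model)."""
    return repr(value)


old_style = fixed_precision(1.5)          # "1.500000"
new_style = standard_representation(1.5)  # "1.5"
```

Fixed precision both pads short values with trailing zeros and rounds long
ones, e.g. `0.1234567891` becomes `0.123457` in the old-style model.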
* Now supporting projections on traversals. In AQL traversal statements like
`FOR v,e,p IN 1..3 OUTBOUND @start GRAPH @graph RETURN v.name`
we now detect attribute accesses on the data (in the above example `v.name`)
and use them to optimize data loading, e.g. we only extract the `name`
attribute. This optimization helps if you have large documents but only access
small parts of them. By default, we project at most 5 attributes on each
vertex and edge. This limit can be modified by adding
`OPTIONS {maxProjections: 42}`.
To identify whether your query is using projections, the explain output now
contains a hint like `/* vertex (projections: `name`) */`.
For now, only attribute accesses are detected; functions like `KEEP` are not
projected.
* Updated arangosync to v2.11.0-preview-1.
* Change default `format_version` for RocksDB .sst files from 3 to 5.
* Added support for creating autoincrement keys in cluster mode, but only for
single-sharded collections.
* Add support for LZ4 and LZ4HC compression for RocksDB.
* Allow parallel access to the shards of smart edge collections in AQL via
parallel GatherNodes.
* Update RocksDB internal table checksum type to xxHash64.
* Updated arangosync to v2.10.0.
* Added several startup options to configure parallelism for individual Pregel
jobs:
- `--pregel.min-parallelism`: minimum parallelism usable in Pregel jobs.
- `--pregel.max-parallelism`: maximum parallelism usable in Pregel jobs.
- `--pregel.parallelism`: default parallelism to use in Pregel jobs.
These parallelism options can be used by administrators to set concurrency
defaults and bounds for Pregel jobs. Each individual Pregel job can set
its own parallelism value using the job's `parallelism` option, but the
job's parallelism value will be clamped to the bounds defined by
`--pregel.min-parallelism` and `--pregel.max-parallelism`. If a job does
not set its `parallelism` value, it will default to the parallelism value
configured via `--pregel.parallelism`.
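The interplay of the three options can be sketched as a small Python function.
This models the rule as described above and is not the server's code; the
function and parameter names are hypothetical.

```python
from typing import Optional


def effective_parallelism(
    job_value: Optional[int], default: int, minimum: int, maximum: int
) -> int:
    """Model the parallelism a Pregel job ends up with.

    default, minimum and maximum stand for --pregel.parallelism,
    --pregel.min-parallelism and --pregel.max-parallelism.
    """
    if job_value is None:
        # The job did not set `parallelism`: use the configured default.
        return default
    # The job's own value is clamped to the configured bounds.
    return max(minimum, min(job_value, maximum))
```

For example, with bounds 1 and 8 and a default of 4, a job requesting 16 gets 8,
a job requesting 0 gets 1, and a job without a request gets 4.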
* Added startup options to configure the usage of memory-mapped files for
Pregel temporary data:
- `--pregel.memory-mapped-files`: if set to `true`, Pregel jobs will by
default store their temporary data in disk-backed memory-mapped files.
If set to `false`, the temporary data of Pregel jobs will be buffered in
RAM. The default value is `true`, meaning that memory-mapped files will
be used. The option can be overridden for each Pregel job by setting the
`useMemoryMaps` option of the job.
- `--pregel.memory-mapped-files-location-type`: location for memory-mapped
files written by Pregel. This option is only meaningful if memory-mapped
files are actually used. The option can have one of the following values:
- `temp-directory`: store memory-mapped files in the temporary directory,
as configured via `--temp.path`. If `--temp.path` is not set, the
system's temporary directory will be used.
- `database-directory`: store memory-mapped files in a separate directory
underneath the database directory.
- `custom`: use a custom directory location for memory-mapped files. The
exact location must be set via the configuration parameter
`--pregel.memory-mapped-files-custom-path`.
The default value for this option is `temp-directory`.
- `--pregel.memory-mapped-files-custom-path`: custom directory location for
Pregel's memory-mapped files. This setting can only be used if the option
`--pregel.memory-mapped-files-location-type` is set to `custom`.
The default location for Pregel's memory-mapped files is the temporary
directory (`temp-directory`), which may not provide enough capacity for
larger Pregel jobs.
It may be more sensible to configure a custom directory for memory-mapped
files and provide the necessary disk space there (`custom`). Such custom
directory can be mounted on ephemeral storage, as the files are only needed
temporarily.
There is also the option to use a subdirectory of the database directory
as the storage location for the memory-mapped files (`database-directory`).
The database directory often provides a lot of disk space capacity, but
when it is used for both the regular database data and Pregel's memory-mapped
files, it has to provide enough capacity to store both.
* Pregel status now reports whether memory mapped files are used in a job.
* Fixed issue BTS-875.
* Updated arangosync to v2.10.0-preview-1.
* Enterprise only: Restricted behavior of Hybrid Disjoint Smart Graphs. Within
a single traversal or path query, it is now only possible to switch
between Smart and Satellite sharding once. All queries in which more than one
switch is (in theory) possible will be rejected, e.g.:
```
FOR v IN 2 OUTBOUND @start smartToSatEdges, satToSmartEdges
```
will be rejected (we can go smart -> sat -> smart, so two switches)
```
FOR v1 IN 1 OUTBOUND @start smartToSatEdges
FOR v2 IN 1 OUTBOUND v1 satToSmartEdges
```
will still be allowed, as each statement only switches once.
We have decided to introduce this restriction because, especially for
ShortestPath queries, the results are not well-defined. If you have a use case
that is affected by this restriction, please contact us.
* Change default value of `--rocksdb.block-cache-shard-bits` to an automatic
default value that allows data blocks of at least 128MiB to be stored in each
cache shard if the block cache's strict capacity limit is used. The strict
capacity limit for the block cache is enabled by default in 3.10, but can be
turned off by setting the option `--rocksdb.enforce-block-cache-size-limit`
to `false`. Also log a startup warning if the resulting cache shard size
would be smaller than is potentially safe when the strict capacity limit is
set.
Enforcing the block cache's capacity limit has the consequence that data
reads by RocksDB must fit into the block cache or the read operation will
fail with an "Incomplete" error.
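The arithmetic behind such an automatic default can be sketched as follows.
This is one plausible way to derive a shard-bits value that keeps every cache
shard at 128MiB or more; it is an assumption for illustration and not
necessarily the computation used by the server.

```python
MIN_SHARD_BYTES = 128 * 1024 * 1024  # 128MiB per cache shard


def shard_bits_for(cache_bytes: int, min_shard_bytes: int = MIN_SHARD_BYTES) -> int:
    """Return the largest shard-bits value with shards of at least min_shard_bytes.

    A block cache with `bits` shard bits is split into 2**bits shards.
    """
    bits = 0
    # Keep doubling the number of shards while each shard stays large enough.
    while cache_bytes // (2 ** (bits + 1)) >= min_shard_bytes:
        bits += 1
    return bits
```

Under this model, a 1GiB block cache yields 3 shard bits (8 shards of 128MiB
each), while a cache of 128MiB or less yields 0 shard bits (a single shard).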
* The API `/_admin/status` now returns a progress attribute that shows the
server's current state (starting, stopping, etc.), with details about which
feature is currently started, stopped etc. During recovery, the current WAL
recovery sequence number is also reported in a sub-attribute of the
`progress` attribute. Clients can query this attribute to track the
progress of the WAL recovery.
The additional progress attribute returned by `/_admin/status` is most
useful when using the `--server.early-connections true` setting. With that
setting, the server will respond to incoming requests to a limited set of
APIs already during server startup. When the setting is not used, the REST
interface will be opened relatively late during the startup sequence, so
that the progress attribute will likely not be very useful anymore.
* Optionally start up HTTP interface of servers earlier, so that ping probes
from tools can already be responded to when the server is not fully started.
By default, the HTTP interface is opened at the same point during the startup
sequence as before, but it can optionally be opened earlier by setting the
new startup option `--server.early-connections` to `true`. This will
open the HTTP interface early in the startup, so that the server can respond
to a limited set of REST APIs even during recovery. This can be useful
because the recovery procedure can take time proportional to the amount of
data to recover.
When the `--server.early-connections` option is set to `true`, the
server will respond to requests to the following APIs during the startup
already:
- `/_api/version`
- `/_admin/version`
- `/_admin/status`
All other APIs will be responded to with an HTTP response code 503, so that
callers can see that the server is not fully ready.
If authentication is used, then only JWT authentication can be used during
the early startup phase. Incoming requests relying on other authentication
mechanisms that require access to the database data will also be responded to
with HTTP 503 errors, even if correct credentials are used.
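The rules for the early startup phase can be summarized in a small Python
model. It is a sketch of the behavior described above, not server code. The
function name, the `auth` labels and the exact-match comparison of paths are
assumptions.

```python
from typing import Optional

# APIs that are answered during startup with --server.early-connections true.
EARLY_APIS = {"/_api/version", "/_admin/version", "/_admin/status"}


def answered_during_startup(path: str, auth: Optional[str] = "jwt") -> bool:
    """Return True if a request is handled normally before startup completes.

    False models an HTTP 503 response. `auth` is "jwt", another mechanism
    name such as "basic", or None if authentication is turned off.
    """
    if path not in EARLY_APIS:
        # All other APIs respond with HTTP 503 until the server is ready.
        return False
    # Only JWT authentication works without access to the database data.
    return auth is None or auth == "jwt"
```

A monitoring probe could thus poll `/_admin/status` with a JWT and treat HTTP
503 on any other route as "server still starting".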
* Fix behavior when accessing a view instead of a collection by name in a REST
document operation. Now return a proper error.
* Upgraded bundled version of RocksDB to 7.2.
* Fix documentation of collection's `cacheEnabled` property default.
* Added `[?]` array operator to AQL, which works as follows:
- `nonArray[?]`: returns `false`
- `nonArray[? FILTER CURRENT ...]`: returns `false`
- `array[?]`: returns `false` if array is empty, `true` otherwise
- `array[? FILTER CURRENT ...]`: returns `false` if no array member
satisfies the filter condition, returns `true` if at least one member
satisfies it.
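The four cases can be modeled in Python as follows. This is a sketch of the
semantics listed above, with a Python list standing in for an AQL array; the
helper name `question_mark` is hypothetical.

```python
from typing import Any, Callable, Optional


def question_mark(value: Any, predicate: Optional[Callable[[Any], bool]] = None) -> bool:
    """Model the result of `value[?]` and `value[? FILTER CURRENT ...]`."""
    if not isinstance(value, list):
        # Non-array input always yields false, with or without a filter.
        return False
    if predicate is None:
        # array[?]: true only if the array has at least one member.
        return len(value) > 0
    # array[? FILTER ...]: true if at least one member satisfies the filter.
    return any(predicate(current) for current in value)
```

For instance, `question_mark([1, 2, 3], lambda current: current > 2)` is `True`
in this model, while `question_mark([])` and `question_mark("abc")` are `False`.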
* Fixed GitHub issue #16279: assertion failure/crash in AQL query optimizer when
permuting adjacent FOR loops that depended on each other.
* Do not raise a fatal error in agency state when local database entries
lack a local timestamp (legacy). In that situation, epoch begin is now
recorded as the local time.
* Very verbose warning from failing to parse GEO JSON in search. This has led
to billions of log lines on deployed services.
* Put hotbackup requests on the HIGH priority queue to make hotbackups work
under high load (BTS-865).
* Removed separate FlushThread (for views syncing) and merged it with the
RocksDBBackgroundThread.
* Fix some issues with WAL recovery for views. Previously it was possible that
changes to a view/link were already recovered and persisted, but that the
lower bound WAL tick was not moved forward. This could lead to already fully
recovered views/links being recovered again on the next restart.
* Updated OpenSSL to 1.1.1o and OpenLDAP to 2.6.2.
* Upgrade jemalloc to version 5.3.0.
* Fixed BTS-860. Changed ArangoSearch index recovery procedure to
remove the necessity to always fully recreate the index if an IndexCreation
marker is encountered.
* Updated arangosync to v2.9.1.
* Added option `--enable-revision-trees` to arangorestore, which will add the
attributes `syncByRevision` and `usesRevisionsAsDocumentIds` to the collection
structure if they are missing. As a consequence, these collections created by
arangorestore will be able to use revision trees and a faster getting-in-sync
procedure after a restart. The option defaults to `true`, meaning the
attributes will be added if they are missing. If the option is set to `false`,
the attributes will not be added to the collection structure.
If the attributes are already present in the dump data, they will not be
modified by arangorestore irrespective of the setting of this option.
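The "add only if missing" rule can be sketched in Python. This is a model of
the described behavior, not arangorestore's code. The flat dictionary layout,
the helper name and the added value `True` are assumptions.

```python
def add_revision_attributes(structure: dict, enable_revision_trees: bool = True) -> dict:
    """Return a copy of a collection structure with missing attributes added."""
    result = dict(structure)
    if enable_revision_trees:
        # setdefault() leaves attributes untouched that are already present
        # in the dump data, whatever their value is.
        result.setdefault("syncByRevision", True)
        result.setdefault("usesRevisionsAsDocumentIds", True)
    return result
```

A structure that already contains `"syncByRevision": False` keeps that value,
while a structure without the attributes gains both of them.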
* Set "usesRevisionsAsDocumentIds" to true when restoring collection data
via arangorestore in case it is not set in the collection structure input
data. This allows using revision trees for restored collections.
* Fix: Highly unlikely race in cluster maintenance. For every shard, only
one operation (change attribute, change leadership) should be performed
at the same time. However, if two changes were detected in the same heartbeat,
both operations could be executed in parallel. In most cases this is fine,
but it could lead to races on the same attribute. Such a race is sorted out
in the next heartbeat interval.
* Added new optimization rule "arangosearch-constrained-sort" to perform
sorting & limiting inside the ArangoSearch View enumeration node if only
scoring is used for sorting.
* Improve log output for WAL recovery, by providing more information and
making the wording clearer.
* Updated lz4 to version 1.9.3.
* Added option `--custom-query-file` to arangoexport, so that a custom query
string can also be read from an input file.
* Added startup option `--cluster.shard-synchronization-attempt-timeout` to
limit the amount of time to spend in shard synchronization attempts. The
default timeout value is 20 minutes.
Running into the timeout will not lead to a synchronization failure; the
synchronization will be continued shortly after. Setting a timeout can
help to split the synchronization of large shards into smaller chunks and
release snapshots and archived WAL files on the leader earlier.
This change also introduces a new metric `arangodb_sync_timeouts_total`
that counts the number of timed-out shard synchronization attempts.
* Updated arangosync to v2.9.1-preview-1.
* Make sure that newly created TTL indexes do not use index estimates, which
wouldn't be used for TTL indexes anyway.
* Fix: for the Windows build, the new Snappy version, which was introduced in
3.9, generated code that contained BMI2 instructions, which were introduced
with the Intel Haswell architecture. However, our target architecture for 3.9
is actually Sandy Bridge, which predates Haswell. Running the build on these
older CPUs thus resulted in illegal instruction exceptions.
* FE-46: UI improvement on the view UI pages as well as adding tooltips to
options where necessary. The affected pages are mostly the Info and
Consolidation Policy pages.
* FE-44: Moved the Info page to before JSON, making the settings page the
default page in the view web UI.
* Refactor internal code paths responsible for `_key` generation. For
collections with only a single shard, we can now always let the leader
DB server generate the keys locally. For collections with multiple shards,
the coordinators are now always responsible for key generation.
Previously the responsibility was mixed and depended on the type of
operation executed (document insert API vs. AQL query, single operation
vs. batch).
* Make web UI show the following information for collections:
- key generator type
- whether or not the document and primary index cache is enabled
- if cache is enabled, show cache usage and allocation size in figures
The `cacheEnabled` property of collections is now also changeable via the
UI for existing collections.
* FE-45: Added tooltips with helpful information to the options on the View UI
settings page.
* FE-43: Simplify the workflow on the web view UI (Links page): allow for users
to view a single link or field with their properties at a time.
* Improve validation for variables used in the `KEEP` part of AQL COLLECT
operations. Previously, referring from within the KEEP part to a variable that
was introduced by the COLLECT itself triggered an internal error. The
case is detected properly now and handled with a descriptive error message.
* Updated arangosync to v2.9.0.
* Updated arangosync to v2.9.0-preview-6.
* Fixed BTS-811 in which there was an incongruence between data being
checksummed and data being written to `.sst` files, because checksumming
should have been made after the encryption of the data, not before it.
* Increase internal transaction lock timeout on followers during cluster
write operations. Although writes to the same keys on followers should be
serialized by the key locks held on the leader, it is still possible that
the global transaction lock striped mutex is a source of contention and
that concurrent write operations time out while waiting to acquire this
global mutex. The lock timeout on followers is now significantly increased
to make this very unlikely.
* Added startup option `--rocksdb.transaction-lock-stripes` to configure the
number of lock stripes to be used by RocksDB transactions. The option
defaults to the number of available cores, but is bumped to a value of
16 if the number of cores is lower.
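The default described above amounts to a simple maximum, sketched here in
Python. The function name is hypothetical, and `os.cpu_count()` stands in for
whatever core detection the server uses.

```python
import os
from typing import Optional

MIN_LOCK_STRIPES = 16


def default_lock_stripes(cores: Optional[int] = None) -> int:
    """Model the default of --rocksdb.transaction-lock-stripes."""
    if cores is None:
        # Fall back to 1 if the number of cores cannot be determined.
        cores = os.cpu_count() or 1
    # Use the number of cores, but never fewer than 16 stripes.
    return max(MIN_LOCK_STRIPES, cores)
```

In this model, a 4-core machine gets 16 lock stripes and a 32-core machine
gets 32.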
* Make all requests which are needed for shard resync at least medium