
Merge pg master to 2023.12.31 #664

Merged: 1,238 commits, May 17, 2024
Conversation

jiaoshuntian
Collaborator

No description provided.

tglsfdc and others added 30 commits May 14, 2024 14:18
Oversight in 5c4c7efad: gotta adjust the cell height for removal of
an entry.  Per buildfarm.
Reported-by: Josh Kupershmidt

Discussion: https://postgr.es/m/CAK3UJRF=KY_nx_TRQq+t6jOrtS2rry79ktkzPiMDhFx_K=dZAg@mail.gmail.com

Author: Josh Kupershmidt

Backpatch-through: master
Reported-by: Michael Paquier

Discussion: https://postgr.es/m/CAB7nPqT1c9WrUw4+eSGF_-ru7ERBOC50a4r3tS1s-yT4OaYsLg@mail.gmail.com

Author: Michael Paquier

Backpatch-through: master
The brininsert code used to initialize (and destroy) BrinDesc and
BrinRevmap for each tuple, which is not free. This patch initializes
these structures only once, and reuses them for all inserts in the same
command. The data is passed through indexInfo->ii_AmCache.

This also introduces an optional AM callback "aminsertcleanup" that
allows performing custom cleanup in case simply pfree-ing ii_AmCache is
not sufficient (which is the case when the cache contains TupleDesc,
Buffers, and so on).

Author: Soumyadeep Chakraborty
Reviewed-by: Alvaro Herrera, Matthias van de Meent, Tomas Vondra
Discussion: https://postgr.es/m/CAE-ML%2B9r2%3DaO1wwji1sBN9gvPz2xRAtFUGfnffpd0ZqyuzjamA%40mail.gmail.com
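A minimal sketch of the ii_AmCache pattern described here, assuming PostgreSQL's IndexInfo struct (nodes/execnodes.h); MyAmCache, build_my_cache() and free_my_cache() are hypothetical stand-ins for the BrinDesc/BrinRevmap setup and teardown, not the actual brininsert code:

    typedef struct MyAmCache MyAmCache;

    extern MyAmCache *build_my_cache(Relation indexRel);   /* hypothetical */
    extern void free_my_cache(MyAmCache *cache);           /* hypothetical */

    static void
    my_aminsert(Relation indexRel, IndexInfo *indexInfo)
    {
        MyAmCache  *cache = (MyAmCache *) indexInfo->ii_AmCache;

        if (cache == NULL)
        {
            /* first insert of this command: build the cache once */
            cache = build_my_cache(indexRel);
            indexInfo->ii_AmCache = cache;
        }

        /* ... perform the insert using the cached structures ... */
    }

    /*
     * Optional aminsertcleanup callback: needed because the cache holds
     * more than flat memory (TupleDesc, Buffers, ...), so pfree-ing
     * ii_AmCache alone would leak resources.
     */
    static void
    my_aminsertcleanup(IndexInfo *indexInfo)
    {
        if (indexInfo->ii_AmCache != NULL)
        {
            free_my_cache((MyAmCache *) indexInfo->ii_AmCache);
            indexInfo->ii_AmCache = NULL;
        }
    }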
This fixes some md5() calls that snuck in with 0457109344 after we had
removed them all in 208bf364a9.

Reviewed-by: Tomas Vondra <[email protected]>
Discussion: https://www.postgresql.org/message-id/[email protected]
These constructs have precedence, but we forgot to list them.
In HEAD, mention AT LOCAL as well as AT TIME ZONE.

Per gripe from Shay Rojansky.

Discussion: https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=iRJYR6UqgHCrgNQ@mail.gmail.com
Add a reminder that the pg_stats view needs to be modified whenever a new
slot kind is added, to prevent situations like 918eee0, where updating
pg_stats was forgotten.

Also, revise the comment that only non-null, non-empty rows are considered
for the range length histogram.

Discussion: https://postgr.es/m/flat/[email protected]
Author: Egor Rogov, Soumyadeep Chakraborty
Reviewed-by: Tomas Vondra, Justin Pryzby, Jian He
Values corresponding to STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM and
STATISTIC_KIND_BOUNDS_HISTOGRAM were not exposed to pg_stats when these
slot kinds were introduced in 918eee0.

This commit adds the missing fields to pg_stats.

Catversion is bumped.

Discussion: https://postgr.es/m/flat/[email protected]
Author: Egor Rogov, Soumyadeep Chakraborty
Reviewed-by: Tomas Vondra, Justin Pryzby, Jian He
The libpq code in charge of creating per-connection SSL objects was
prone to a race condition when loading the custom BIO methods needed by
my_SSL_set_fd().  As BIO methods are stored as a static variable, the
initialization of a connection could fail because it could be possible
to have one thread refer to my_bio_methods while it is being manipulated
by a second concurrent thread.

This error was introduced by 8bb14cd, which removed
ssl_config_mutex around the call of my_SSL_set_fd(), which itself sets
the custom BIO methods used in libpq.  As before, the BIO method
initialization is now protected by the existing ssl_config_mutex, itself
initialized earlier for WIN32.

While on it, document that my_bio_methods is protected by
ssl_config_mutex, as this can be easy to miss.

Reported-by: Willi Mann
Author: Willi Mann, Michael Paquier
Discussion: https://postgr.es/m/[email protected]
Backpatch-through: 12
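The shape of the fix, as a self-contained sketch in plain C with POSIX threads rather than the actual libpq code: every path that touches the static pointer takes the same mutex, so no thread can observe it half-initialized.

    #include <pthread.h>
    #include <stdlib.h>

    typedef struct BioMethods { int dummy; } BioMethods;

    static BioMethods *my_bio_methods = NULL;   /* protected by config_mutex */
    static pthread_mutex_t config_mutex = PTHREAD_MUTEX_INITIALIZER;

    static BioMethods *
    get_bio_methods(void)
    {
        BioMethods *result;

        pthread_mutex_lock(&config_mutex);
        if (my_bio_methods == NULL)
            my_bio_methods = calloc(1, sizeof(BioMethods));
        result = my_bio_methods;
        pthread_mutex_unlock(&config_mutex);

        return result;
    }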
This is a preliminary patch.  It adds NOT NULL checking for the result of
pg_stat_statements_reset() function. It is needed for upcoming patch
"Track statement entry timestamp" that will change the result type of
this function to the timestamp of a reset performed.

Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
Author: Andrei Zubkov
Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun
Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov
Reviewed-by: Alena Rybakina, Andrei Lepikhov
This patch adds 'stats_since' and 'minmax_stats_since' columns to the
pg_stat_statements view and pg_stat_statements() function.  The new min/max
reset mode for the pg_stat_statements_reset() function is controlled by the
parameter minmax_only.

The 'stats_since' column is populated with the current timestamp when a new
statement is added to the pg_stat_statements hashtable.  It provides clean
information about statistics collection time intervals for each statement.
Besides, it can be used by sampling solutions to detect situations when a
statement was evicted and stored again between samples.

Such a sampling solution could derive any pg_stat_statements statistic values
for an interval between two samples with the exception of all min/max
statistics. To address this issue this patch adds the ability to reset
min/max statistics independently of the statement reset using the new
minmax_only parameter of the pg_stat_statements_reset(userid oid, dbid oid,
queryid bigint, minmax_only boolean) function. The timestamp of such reset
is stored in the minmax_stats_since field for each statement.
pg_stat_statements_reset() function now returns the timestamp of a reset as the
result.

Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
Author: Andrei Zubkov
Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun
Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov
Reviewed-by: Alena Rybakina, Andrei Lepikhov

Conflicts:
	contrib/pg_stat_statements/Makefile
52e4f0c introduced a bug in pgoutput in which missing values in tuples
were incorrectly filled in with NULL. The problem was the use of
CreateTupleDescCopy where CreateTupleDescCopyConstr was required, as the
former drops the constraints in the tuple description (specifically, the
default value constraint) on the floor.

The bug could result in incorrectness when a table replicated via
`REPLICA IDENTITY FULL` underwent a schema change that added a column
with a default value. In such cases, updates filled the missing columns
in old tuples with NULLs instead of the default values. Then on the
subscriber, we failed to find a matching tuple and missed updating the
required row.

Author: Nikhil Benesch
Reviewed-by: Hou Zhijie, Amit Kapila
Backpatch-through: 15
Discussion: https://postgr.es/m/CAPWqQZTEpZQamYsGMn6ZDRvVywwpVPiKH6OY4KSgA+NmeqFNzA@mail.gmail.com
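The essence of the fix, assuming PostgreSQL's tupdesc.h API; an illustrative fragment, not the actual pgoutput patch, with 'desc' and 'relation' as placeholder names:

    /* before: CreateTupleDescCopy() drops constraints, and with them the
     * attribute default values */
    /* desc = CreateTupleDescCopy(RelationGetDescr(relation)); */

    /* after: the copy keeps the constraints, so missing columns in old
     * tuples get their defaults instead of NULL */
    desc = CreateTupleDescCopyConstr(RelationGetDescr(relation));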
XLogSetAsyncXactLSN(), called at asynchronous commit, would wake up
walwriter every time the LSN advances, but walwriter doesn't actually
do anything unless it has at least 'wal_writer_flush_after' full
blocks of WAL to write. Repeatedly waking up walwriter to do nothing
is a waste of CPU cycles in both walwriter and the backends doing the
wakeups. To fix, apply the same logic in XLogSetAsyncXactLSN() to
decide whether to wake up walwriter, as walwriter uses to determine if
it has any work to do.

In passing, rename the misleadingly named 'flushbytes' local variable
to 'flushblocks'.

Author: Andres Freund, Heikki Linnakangas
Discussion: https://www.postgresql.org/message-id/[email protected]
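A self-contained sketch of the wakeup heuristic, not the actual XLogSetAsyncXactLSN() code; the LSN arithmetic is simplified and assumes async_lsn >= flushed_lsn:

    #include <stdbool.h>
    #include <stdint.h>

    #define XLOG_BLCKSZ 8192    /* PostgreSQL's default WAL block size */

    static bool
    should_wake_walwriter(uint64_t async_lsn, uint64_t flushed_lsn,
                          int wal_writer_flush_after)
    {
        /* number of complete WAL blocks pending */
        uint64_t flushblocks =
            async_lsn / XLOG_BLCKSZ - flushed_lsn / XLOG_BLCKSZ;

        /* wake only when walwriter would actually have work to do; with
         * wal_writer_flush_after = 0, walwriter flushes eagerly */
        return wal_writer_flush_after == 0 ||
               flushblocks >= (uint64_t) wal_writer_flush_after;
    }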
Fix a bug introduced by c1ec02be1d79. It may happen that the executor
opens indexes on the result relation, but no rows end up being inserted.
Then the index_insert_cleanup still gets executed, but passes down NULL
to the AM callback. The AM callback may not expect this, as is the case
of brininsertcleanup, leading to a crash.

Fixed by only calling the cleanup callback if (ii_AmCache != NULL). This
way the AM can safely assume it will only see a valid cache.

Reported-by: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-w9qC-o9hQox9UHvdVZAYTp8OrPQOKtwbvzWaRejTT=Q@mail.gmail.com
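The shape of the guard described above, as an illustrative fragment rather than the exact index_insert_cleanup() code:

    /* invoke the AM callback only when a cache was actually built, so
     * callbacks like brininsertcleanup never see NULL */
    if (indexInfo->ii_AmCache != NULL)
        indexRelation->rd_indam->aminsertcleanup(indexInfo);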
The documentation fails to use the CONCURRENTLY keyword where it is necessary, so add
it.  This text was added to pg11 in commit 5efd604; backpatch to pg12.

Author: Nikolay Samokhvalov <[email protected]>
Discussion: https://postgr.es/m/CAM527d9iz6+=_c7EqSKaGzjqWvSeCeRVVvHZ1v3gDgjTtvgsbw@mail.gmail.com
As of commits dd04e95 and 1833f1a, tuplestore_donestoring(),
SPI_push(), SPI_pop(), SPI_push_conditional(),
SPI_pop_conditional(), and SPI_restore_connection() are no-op
macros provided for backwards compatibility.  This commit removes
these macros, so any uses in third-party code will need to be
removed, too.  Since these macros have been no-ops for a while,
such adjustments won't produce any behavior changes for all
currently-supported versions of PostgreSQL.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACVeO58JM5tK2Qa8QC-%3DkC8sdkJOTd4BFU%3DK8zs4gGYpjQ%40mail.gmail.com
00b41463c adjusted Bitmapset so that an empty set is always represented
as NULL.  This makes checking for empty sets far cheaper than it used
to be.

There were various places in the code where we'd call bms_membership()
to handle the 3 possible BMS_Membership values.  For the BMS_SINGLETON
case, we'd also call bms_singleton_member() to find the single set member.
This can now be done in a more optimal way: first check if the set is
NULL, and then, instead of bothering with bms_membership(), simply call
bms_get_singleton_member() to find the single member.  This
function will return false if there are multiple members in the set.

Here we also tidy up some logic in examine_variable() for the single
member case.  There's now no need to call bms_is_member() as we've
already established that we're working with a singleton Bitmapset, so we
can just check if varRelid matches the singleton member.

Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/CAApHDvqW+CxNPcY245GaWiuqkkqgTudtG2ncGvvSjGn2wdTZLA@mail.gmail.com
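A before-and-after sketch of the pattern this commit changes, assuming PostgreSQL's nodes/bitmapset.h; illustrative, not the exact examine_variable() code:

    #include "postgres.h"
    #include "nodes/bitmapset.h"

    static void
    classify_varnos(const Bitmapset *varnos)
    {
        int         relid;

        /* Old pattern: classify the set, then fetch the singleton separately */
        switch (bms_membership(varnos))
        {
            case BMS_EMPTY_SET:
                break;          /* ... */
            case BMS_SINGLETON:
                relid = bms_singleton_member(varnos);
                break;          /* ... use relid ... */
            case BMS_MULTIPLE:
                break;          /* ... */
        }

        /* New pattern: empty sets are now always NULL, so test that first;
         * bms_get_singleton_member() then reports whether there is exactly
         * one member, returning false for multiple members */
        if (varnos == NULL)
        {
            /* empty set */
        }
        else if (bms_get_singleton_member(varnos, &relid))
        {
            /* exactly one member, now in relid */
        }
        else
        {
            /* multiple members */
        }
    }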
scram_SaltedPassword() could take a long time to compute when the number
of iterations used is large enough, and this code uses a tight loop to
compute a salted password.

Note that the same issue exists in libpq when using \password and a
large iteration number, but that code path cannot be interrupted.  A
CHECK_FOR_INTERRUPTS() in the backend is useful for server-side
computations, at least.

Backpatch down to 16, where the user-settable GUC scram_iterations has
been added.

Author: Bowen Shi
Reviewed-by: Aleksander Alekseev, Daniel Gustafsson
Discussion: https://postgr.es/m/CAM_vCueV6xfr08KczfaCEk5J_qeTZtgqN7+orkNLx=g+phE82Q@mail.gmail.com
Backpatch-through: 16
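A sketch of what the commit describes, assuming PostgreSQL's miscadmin.h; the loop body is elided and this is not the exact scram-common.c code:

    #include "postgres.h"
    #include "miscadmin.h"      /* CHECK_FOR_INTERRUPTS() */

    static void
    salted_password_loop(int iterations)
    {
        for (int i = 2; i <= iterations; i++)
        {
            /* give the backend a chance to notice query cancellation
             * during a long computation */
            CHECK_FOR_INTERRUPTS();

            /* ... one HMAC round of the salted-password computation ... */
        }
    }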
This routine is located in heapam_handler.c, not tableamapi.c.  Issue
noted while hacking the area for a different patch.

Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/[email protected]
robertmhaas and others added 27 commits May 14, 2024 14:18
I don't think there's a real problem here, because if we reach
the loop over 'tles' then we will either find at least one
TimeLineHistoryEntry such that oldest_segno != 0, in which case
unsummarized_lsn will be initialized, or else unsummarized_tli
will remain 0 and an error will occur before unsummarized_lsn
is used for anything. But some compilers are complaining, as
reported on list by Nathan Bossart and off-list by Andrew Dunstan.

Discussion: https://postgr.es/m/20231223215147.GA69623@nathanxps13
Recently-introduced code in reconstruct.c was using "unsigned"
to store the result of read(), pg_pread(), or write().  This is
completely bogus: it breaks subsequent tests for the result being
negative, as we're being reminded of by a chorus of buildfarm
warnings.  Switch to "int" as was doubtless intended.  (There are
several other uses of "unsigned" in this file that also look poorly
chosen to me, but for now I'm just trying to clean up the buildfarm.)

A larger problem is that "int" is not necessarily wide enough to hold
the result: per POSIX, all these functions return ssize_t.  In places
where the requested read or write length clearly fits in int, that's
academic.  It may be academic anyway as long as we constrain
individual data files to 1GB, since even a readv or writev-like
operation would then not be responsible for transferring more than
1GB.  Nonetheless it seems like trouble waiting to happen, so I made
a pass over readv and writev calls and fixed the result variables
where that seemed appropriate.  We might want to think about changing
some of the fd.c functions to return ssize_t too, for future-proofing;
but I didn't tackle that here.

Discussion: https://postgr.es/m/[email protected]
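A self-contained POSIX C illustration of the bug class (not the reconstruct.c code itself): storing read()'s ssize_t result in an unsigned variable makes the error check unreachable, which is exactly what the buildfarm warned about.

    #include <stdio.h>
    #include <unistd.h>

    int
    main(void)
    {
        char        buf[16];

        /* BROKEN: read() returns ssize_t; -1 converts to a huge unsigned
         * value, so this "rc < 0" test can never be true (compilers warn
         * that the comparison is always false) */
        unsigned    rc_broken = read(-1, buf, sizeof(buf));

        if (rc_broken < 0)
            perror("read");     /* never taken */

        /* CORRECT: ssize_t (or int, when the length surely fits) keeps
         * the sign of the error return */
        ssize_t     rc = read(-1, buf, sizeof(buf));

        if (rc < 0)
            perror("read");     /* reports EBADF as expected */

        return 0;
    }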
The code was passing a scalar argument to node->restart(), but it was
expecting a hash, which causes a warning from Perl ("Odd number of
elements in hash assignment").

But the node->restart() function doesn't take a mode argument anyway.
This was probably copied from an incorrect comment (see commit
750c59d).  The default restart mode is already "fast", so the test
should still be semantically correct without explicitly specifying the
mode.

Discussion: https://www.postgresql.org/message-id/[email protected]
add_file_to_manifest declared its mtime argument as pg_time_t,
apparently on the principle that copy-and-paste from the backend
is fine.  However, the callers are passing struct stat's st_mtime
field which is plain time_t, and add_file_to_manifest itself is
passing the value to gmtime(3) which expects plain time_t,
so the whole thing would not work at all on any platform where
those types are different.  Fortunately we can just switch this
variable to time_t.

Per warnings from assorted buildfarm members.
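A self-contained illustration of why plain time_t is the right type here: struct stat's st_mtime is time_t, and gmtime(3) expects a pointer to time_t (the path is chosen arbitrarily for the example).

    #include <stdio.h>
    #include <sys/stat.h>
    #include <time.h>

    int
    main(void)
    {
        struct stat st;

        if (stat("/etc/hosts", &st) == 0)
        {
            time_t      mtime = st.st_mtime;    /* time_t end to end */
            struct tm  *tm = gmtime(&mtime);

            printf("%04d-%02d-%02d\n",
                   tm->tm_year + 1900, tm->tm_mon + 1, tm->tm_mday);
        }
        return 0;
    }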
The previous wording here relied solely on an example to explain
aclitem output format.  Add an actual syntax synopsis and
explanation of the elements to make it clearer.

David Johnston and Tom Lane, per gripe from Eugen Konkov.

Discussion: https://postgr.es/m/170326116972.1876499.18357820037829248593@wrigleys.postgresql.org
This function was originally coded with a handmade expansion
of the array subscripts.  We can do it a little faster and far
more legibly today, by using unnest() WITH ORDINALITY.

While at it, let's apply the rowcount estimation support that exists
for the underlying unnest() function: reduce the default ROWS estimate
to 100 and attach array_unnest_support.  I'm not sure that
array_unnest_support can do anything useful today with the call sites
that exist in information_schema, but it can't hurt, and the existing
default rowcount of 1000 is surely much too high for any of these
cases.

The psql.sql regression script is using _pg_expandarray() as a
test case for \sf+.  While we could keep doing so, the new one-line
function body makes a poor test case for \sf+ row-numbering, so
switch it to print another information_schema function.

Discussion: https://postgr.es/m/[email protected]
The documentation has been missing one value in the list of catalog OIDs
that can be given to the validator function of a FDW, as of
AttributeRelationId, when changing the attribute options of a foreign
table.

Author: Ian Lawrence Barwick
Discussion: https://postgr.es/m/CAB8KJ=i16t2yJU_Pq2Z+hnNGWFhagp_bJmzxHZu3ZkOjZm-+rQ@mail.gmail.com
Backpatch-through: 12
Should match the name of the related GUC variable.

Discussion: https://www.postgresql.org/message-id/[email protected]
If the underlying table isn't being dumped, it's useless to dump
an extended statistics object; it'll just cause errors at restore.
We have always applied similar policies to, say, indexes.

(When and if we get cross-table stats objects, it might be profitable
to think a little harder about what to do with them.  But for now
there seems no point in considering a stats object as anything but
an appendage of its table.)

Rian McGuire and Tom Lane, per report from Rian McGuire.
Back-patch to supported branches.

Discussion: https://postgr.es/m/[email protected]

Conflicts:
	src/bin/pg_dump/t/002_pg_dump.pl
There are a lot of Perl scripts in the tree, mostly code generation
and TAP tests.  Occasionally, these scripts produce warnings.  These
are probably always mistakes on the developer side (true positives).
Typical examples are warnings from genbki.pl or related when you make
a mess in the catalog files during development, or warnings from tests
when they massage a config file that looks different on different
hosts, or mistakes during merges (e.g., duplicate subroutine
definitions), or just mistakes that weren't noticed because there is a
lot of output in a verbose build.

This changes all warnings into fatal errors, by replacing

    use warnings;

by

    use warnings FATAL => 'all';

in all Perl files.

Discussion: https://www.postgresql.org/message-id/flat/06f899fd-1826-05ab-42d6-adeb1fd5e200%40eisentraut.org
Do not rely on the OS recognizing a particular locale; find the right
locale by querying the "en_US" collation.

Author: Alexander Lakhin
Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/[email protected]
Mostly, we need to check whether $ENV{PG_TEST_EXTRA} is set before
doing regular expression matches against it.
When enabled (default off), this logs a backtrace anytime elog() or an
equivalent ereport() for internal errors is called.

This is not well covered by the existing backtrace_functions, because
there are many equally-worded low-level errors in many functions.  And
if you find out where the error is, then you need to manually rewrite
the elog() to ereport() to attach the errbacktrace(), which is
annoying.  Having a backtrace automatically on every elog() call could
be very helpful during development for various kinds of common errors
from palloc, syscache, node support, etc.

Discussion: https://www.postgresql.org/message-id/flat/[email protected]
This involves creating more than pg_stat_statements.max entries and
checking that the limit is kept and the least used entries are kicked
out.

Reviewed-by: Michael Paquier <[email protected]>
Reviewed-by: Julien Rouhaud <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/[email protected]

Conflicts:
	contrib/pg_stat_statements/Makefile
This tests that pg_stat_statements contents are successfully kept
across restarts.  (This is similar to
src/test/recovery/t/029_stats_restart.pl for the stats collector.)

Reviewed-by: Michael Paquier <[email protected]>
Reviewed-by: Julien Rouhaud <[email protected]>
Discussion: https://www.postgresql.org/message-id/flat/[email protected]
Commit 16671ba6e7 moved the code that sends "sorry, too many clients
already" and other such messages, but it had the effect that we would
send that error even if the startup packet processing failed, e.g.
because the client sent an invalid startup packet. That was not
intentional.

Spotted while reading the code again.
Commit b437571714 added support for parallel builds for BRIN indexes,
using code similar to BTREE parallel builds, and also a new tuplesort
variant. This commit simplifies the new code in two ways:

* The "spool" grouping the tuplesort and the heap/index is not necessary.
  The heap/index are available as separate arguments, causing confusion.
  So remove the spool, and use the tuplesort directly.

* The new tuplesort variant does not need the heap/index, as it sorts
  simply by the range block number, without accessing the tuple data.
  So simplify that too.

Initial report and patch by Ranier Vilela, further cleanup by me.

Author: Ranier Vilela
Discussion: https://postgr.es/m/CAEudQAqD7f2i4iyEaAz-5o-bf6zXVX-AkNUBm-YjUXEemaEh6A%40mail.gmail.com
The brinbuildCallbackParallel callback used by parallel BRIN builds did
not consider that parallel table scans may be synchronized, starting
from an arbitrary block and then wrapping around.

If this happened and the scan actually did wrap around, tuples from the
beginning of the table were added to the last range produced by the same
worker. The index would be missing a range at the beginning of the table,
while the last range would be too wide. This would not produce incorrect
query results, but it'd be less efficient.

Fixed by checking for both past and future ranges in the callback. The
worker may produce multiple summaries for the same page range, but the
leader will merge them as if the summaries came from different workers.

Discussion: https://postgr.es/m/c2ee7d69-ce17-43f2-d1a0-9811edbda6e6%40enterprisedb.com
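A self-contained sketch (plain C, not the brinbuildCallbackParallel code) of the boundary test the fix describes: a block belongs to the current range only if it falls inside it; anything before the range means the synchronized scan wrapped around, and either case requires starting a new range.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static bool
    needs_new_range(uint32_t block, uint32_t range_start,
                    uint32_t pages_per_range)
    {
        return block < range_start ||                  /* scan wrapped around */
               block >= range_start + pages_per_range; /* moved past the range */
    }

    int
    main(void)
    {
        /* pages_per_range = 128; current range starts at block 1024 */
        printf("%d\n", needs_new_range(1050, 1024, 128)); /* 0: same range */
        printf("%d\n", needs_new_range(1200, 1024, 128)); /* 1: future range */
        printf("%d\n", needs_new_range(3, 1024, 128));    /* 1: past range (wrap) */
        return 0;
    }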
The format of these files becomes arguably worse after being indented,
and, as they are generated, there is no point in applying an indentation
anyway.

Author: Bharath Rupireddy
Discussion: https://postgr.es/m/CALj2ACW2JUocmieuR3n9AXL4iSsHcL1LmNkiukuFRUvKNMoiKg@mail.gmail.com
This reverts commit 742f6b3e6df980d7dafa4a18a165d285483d5f0e.

The new test failed on big-endian platforms.

Discussion: https://www.postgresql.org/message-id/flat/[email protected]

Conflicts:
	contrib/pg_stat_statements/Makefile

coderabbitai bot commented May 14, 2024

Important

Auto Review Skipped

More than 25% of the files skipped due to max files limit. The review is being skipped to prevent a low-quality review.

122 files out of 240 files are above the max files limit of 50. Please upgrade to Pro plan to get higher limits.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.



@gaoxueyu gaoxueyu self-requested a review May 17, 2024 00:47
@gaoxueyu gaoxueyu merged commit 4d9b9f0 into IvorySQL:master May 17, 2024
10 checks passed