forked from greenplum-db/gpdb-archive
Arenadata patchset 57 #1008
Merged
Collaborator
Stolb27
commented
Jul 30, 2024
- ADBDEV-5410 arenadata_toolkit: optimize tests #952
- ADBDEV-5517 Fix stuck query after cancel or termination when segment is not responding #948
- ADBDEV-5755 Replace gpdb-specific dbid detection in the pg_rewind #970
- ADBDEV-5806 Update timezone database in GPDB 6 #974
- ADBDEV-5874 fix yum repos for CentOS 7 #981
- ADBDEV-3539 ADBDEV-4661 Extra tests for CTE with replicated tables
- ADBDEV-5897 Fix ABI tests #986
- ADBDEV-5691 Updated approach to counting the number of rows for replicated tables at DML #969
- ADBDEV-5809 Restrict COPY to intermediate partition tables #979
- ADBDEV-5811 Add collection of test logs of resource groups #982
- ADBDEV-5808 Fix gpcopy's regression test #995
- ADBDEV-5552 Flaky appendonly test #999
- ADBDEV-5943 support passwordless configuration in gpperfmon_install #996
- ADBDEV-5729 Skip already freed tuples in tuplestore_end #977
The test adb_get_relfilenodes_test was moved to the input/output directories. This allows using the @testtablespace@ macro instead of a hard-coded tablespace path; the path is now generated by pg_regress. Queries in some tests were optimized: left join was changed to join, "<table_name>::regclass" is used instead of scanning pg_class, and table names are taken from the "VALUES" section in alphabetical order. This removes one motion node from the query plan (because pg_class is no longer scanned) and removes the need to sort the output by table name. Database preparation was added at the beginning of each test. A small typo was fixed.
…nding (#948)
Problem: The following scenario left a coordinator backend process stuck:
1. At some point one of the segments stopped processing incoming requests (the reason itself is not important: any segment can fail, and the system should be able to recover).
2. A query that dispatches requests to segments was executed (for example, a select from 'gp_toolkit.gp_resgroup_status_per_segment'). Because one of the segments was not responding, the coordinator hung in 'checkDispatchResult' (this is expected and can be handled by the FTS).
3. The stuck query was canceled or terminated (by 'pg_cancel_backend' or 'pg_terminate_backend'). This did not release the query from the stuck state; instead, after this step the query became completely unrecoverable. Even after the FTS had detected the malfunctioning segment and promoted the mirror, the query kept hanging forever.
Expected behavior: after FTS mirror promotion, all stuck queries are unblocked and canceled successfully.
Note: if FTS mirror promotion happened before step 3, the FTS canceled the query successfully.
Root cause: During cancel or termination, the coordinator tried to abort the current transaction and hung in the function 'internal_cancel' (called from PQcancel) on the 'poll' system call. The call has a timeout of 660 seconds and, moreover, after the timeout expired, the function looped forever, calling 'poll' again. So, if the socket was open on the segment side but nobody replied (because the segment process had become unavailable for some reason), 'internal_cancel' had no way to return.
Fix: Once the FTS promotes a mirror, it sends a signal to the coordinator postmaster (with the PMSIGNAL_FTS_PROMOTED_MIRROR reason). On receiving the signal, the coordinator postmaster sends SIGUSR1 (with the PROCSIG_FTS_PROMOTED_MIRROR reason) to all of its regular backends. When a backend receives the signal while it is cancelling or terminating a query, it sets a flag in libpq. 'internal_cancel' checks this flag before calling 'poll'. If the flag is set, 'internal_cancel' returns with an error. Thus:
a. if the FTS promotion happens before cancel/terminate, the query is canceled by the old logic;
b. if the FTS promotion happens after cancel/terminate, but before 'internal_cancel' calls 'poll', 'internal_cancel' returns an error without calling 'poll';
c. if the FTS promotion happens when 'poll' has already been called, 'poll' returns EINTR (because SIGUSR1 was received), and a new 'poll' is not called, because the flag is set.
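The control flow of the fix can be sketched as a small Python model. This is only an illustration of cases b and c above, not the libpq code: `internal_cancel_sketch`, `poll_once`, `mirror_promoted` and `CancelAborted` are hypothetical names, and the real 'poll' call is replaced by a callback.

```python
class CancelAborted(Exception):
    """Raised when the cancel request is abandoned after a mirror promotion."""


def internal_cancel_sketch(poll_once, mirror_promoted):
    """Wait for a segment to accept a cancel request.

    poll_once()       -> True when the socket is ready, False on timeout;
                         raises InterruptedError when a signal arrives (EINTR).
    mirror_promoted() -> True once the backend has set the promotion flag.
    """
    while True:
        # The flag is checked before every poll, so a promotion that happened
        # earlier (case b) or during the previous poll (case c) ends the wait.
        if mirror_promoted():
            raise CancelAborted("segment was replaced by its mirror")
        try:
            if poll_once():
                return "cancel request delivered"
        except InterruptedError:
            # EINTR: go back and re-check the flag instead of polling again.
            continue
        # Timeout: the loop retries, but the flag check above is now a way out.
```

In this model the only exits are a ready socket or a set flag, which mirrors the description: without the flag check, a silent segment keeps the loop polling forever.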
The open-source pgbouncer project was renamed to pgbouncer-archive, and the old URL is no longer available. The URL was updated to match the new name. (cherry picked from commit 28bcd04)
Sync 6.27.1 changes to dev
The pg_rewind utility has gpdb-specific logic to detect the current dbid. It runs `postgres`, which parses `postgresql.conf` and all included `.conf` files. Among others, `postgresql.conf` includes the `internal.auto.conf` file, which contains the gp_dbid parameter. There is no need to parse the whole `postgresql.conf` just to get gp_dbid. Moreover, `postgresql.conf` may be zeroed by an unsuccessful pg_rewind sync or may have been moved elsewhere. The only file needed to run this logic is `internal.auto.conf`, which is not synced, so the solution is to pass it as the config file to `postgres`. A new test file was added to show that the solution works when `postgresql.conf` is zeroed.
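To illustrate why `internal.auto.conf` alone is enough, here is a minimal Python sketch that reads gp_dbid from the text of such a file. It is not what pg_rewind does (the patch passes the file to `postgres` as its config); `read_gp_dbid` is a hypothetical helper, and the `name = value` line format is an assumption.

```python
def read_gp_dbid(conf_text):
    """Return gp_dbid from config text, or None if the parameter is absent."""
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == "gp_dbid":
            return int(value.strip().strip("'"))
    return None


# A zeroed file yields nothing, while the small dedicated file is sufficient.
zeroed_postgresql_conf = ""
internal_auto_conf = "# generated file\ngp_dbid = 3\n"
```

With these inputs, `read_gp_dbid(zeroed_postgresql_conf)` returns `None` and `read_gp_dbid(internal_auto_conf)` returns `3`.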
DST law changes in Morocco and the Canadian Yukon. Historical corrections for Shanghai. The America/Godthab zone is renamed to America/Nuuk to reflect current English usage; however, the old name remains available as a compatibility link. (cherry picked from commit c820692)
This absorbs a leap-second-related bug fix in localtime.c, and teaches zic to handle an expiration marker in the leapseconds file. Neither are of any interest to us (for the foreseeable future anyway), but we need to stay more or less in sync with upstream. Also adjust some over-eager changes in the README from commit 9573384. I have no intention of making changes that require C99 in this code, until such time as all the live back branches require C99. Otherwise back-patching will get too exciting. For the same reason, absorb assorted whitespace and other cosmetic changes from HEAD into the back branches; mostly this reflects use of improved versions of pgindent. All in all then, quite a boring update. But I figured I'd get it done while I was looking at this code. (cherry picked from commit 812a84d)
This changes zic's default output format from "-b fat" to "-b slim". We were already using "slim" in v13/HEAD, so those branches drop the explicit -b switch in the Makefiles. Instead, add an explicit "-b fat" in v12 and before, so that we don't change the output file format in those branches. (This is perhaps excessively conservative, but we decided not to do so in a120791, and I'll stick with that.) Other non-cosmetic changes are to drop support for zic's long-obsolete "-y" switch, and to ensure that strftime() does not change errno unless it fails. As usual with tzcode changes, back-patch to all supported branches. (cherry picked from commit 3d13a83)
DST law changes in Morocco, Canadian Yukon, Fiji, Macquarie Island, Casey Station (Antarctica). Historical corrections for France, Hungary, Monaco. (cherry picked from commit b39c940)
There's no functional change at all here, but I'm curious to see whether this change successfully shuts up Coverity's warning about a useless strcmp(), which appeared with the previous update. Discussion: https://mm.icann.org/pipermail/tz/2020-October/029370.html (cherry picked from commit f56c42e)
DST law changes in Palestine, with a whopping 120 hours' notice. Also some historical corrections for Palestine. (cherry picked from commit 78ccf7f)
DST law changes in Russia (Volgograd zone) and South Sudan. Historical corrections for Australia, Bahamas, Belize, Bermuda, Ghana, Israel, Kenya, Nigeria, Palestine, Seychelles, and Vanuatu. Notably, the Australia/Currie zone has been corrected to the point where it is identical to Australia/Hobart. (cherry picked from commit 5db6ba3) Note: the cherry-picked commit 5db6ba3 missed changes from the original postgres commit c7edf4a in the makefile, resulting in not matching abbrevs.txt and known_abbrevs.txt. So these changes were added manually.
DST law changes in Fiji, Jordan, Palestine, and Samoa. Historical corrections for Barbados, Cook Islands, Guyana, Niue, Portugal, and Tonga. Also, the Pacific/Enderbury zone has been renamed to Pacific/Kanton. The following zones have been merged into nearby, more-populous zones whose clocks have agreed since 1970: Africa/Accra, America/Atikokan, America/Blanc-Sablon, America/Creston, America/Curacao, America/Nassau, America/Port_of_Spain, Antarctica/DumontDUrville, and Antarctica/Syowa. (cherry picked from commit 14b8d25)
DST law changes in Palestine. Historical corrections for Chile and Ukraine. (cherry picked from commit 2bb9f75)
We had two occurrences of "Mitteleuropäische Zeit" in Europe.txt, though the corresponding entries in Default were spelled "Mitteleuropaeische Zeit". Standardize on the latter spelling to avoid questions of which encoding to use. While here, correct a couple of other trivial inconsistencies between the Default file and the supposedly-matching entries in the *.txt files, as exposed by some checking with comm(1). Also, add BDST to the Europe.txt file; it previously was only listed in Default. None of this has any direct functional effect. Per complaint from Christoph Berg. As usual for timezone data patches, apply to all branches. Discussion: https://postgr.es/m/[email protected] (cherry picked from commit a40733d)
DST law changes in Chile, Fiji, Iran, Jordan, Mexico, Palestine, and Syria. Historical corrections for Chile, Crimea, Iran, and Mexico. Also, the Europe/Kiev zone has been renamed to Europe/Kyiv (retaining the old name as a link). The following zones have been merged into nearby, more-populous zones whose clocks have agreed since 1970: Antarctica/Vostok, Asia/Brunei, Asia/Kuala_Lumpur, Atlantic/Reykjavik, Europe/Amsterdam, Europe/Copenhagen, Europe/Luxembourg, Europe/Monaco, Europe/Oslo, Europe/Stockholm, Indian/Christmas, Indian/Cocos, Indian/Kerguelen, Indian/Mahe, Indian/Reunion, Pacific/Chuuk, Pacific/Funafuti, Pacific/Majuro, Pacific/Pohnpei, Pacific/Wake and Pacific/Wallis. (This indirectly affects zones that were already links to one of these: Arctic/Longyearbyen, Atlantic/Jan_Mayen, Iceland, Pacific/Ponape, Pacific/Truk, and Pacific/Yap.) America/Nipigon, America/Rainy_River, America/Thunder_Bay, Europe/Uzhgorod, and Europe/Zaporozhye were also merged into nearby zones after discovering that their claimed post-1970 differences from those zones seem to have been errors. While the IANA crew have been working on merging zones that have no post-1970 differences for some time, this batch of changes affects some zones that are significantly more populous than those merged in the past, notably parts of Europe. The loss of pre-1970 timezone history for those zones may be troublesome for applications expecting consistency of timestamptz display. As an example, the stored value '1944-06-01 12:00 UTC' would previously display as '1944-06-01 13:00:00+01' if the Europe/Stockholm zone is selected, but now it will read out as '1944-06-01 14:00:00+02'. There exists a "packrat" option that will build the timezone data files with this old data preserved, but the problem is that it also resurrects a bunch of other, far less well-attested data; so much so that actually more zones' contents change from 2022a with that option than without it. 
I have chosen not to do that here, for that reason and because it appears that no major OS distributions are using the "packrat" option, so that doing so would cause Postgres' behavior to diverge significantly depending on whether it was built with --with-system-tzdata. However, for anyone for whom these changes pose significant problems, there is a solution: build a set of timezone files with the "packrat" option and use those with Postgres. (cherry picked from commit e7c7605)
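The Europe/Stockholm example above comes down to applying a different UTC offset to the same stored instant. A short Python sketch of that arithmetic, using fixed offsets rather than any installed tzdata (`display` is a hypothetical helper, and Python prints the offset as `+01:00` where Postgres prints `+01`):

```python
from datetime import datetime, timedelta, timezone

# The stored value from the example: '1944-06-01 12:00 UTC'.
stored = datetime(1944, 6, 1, 12, 0, tzinfo=timezone.utc)


def display(ts, offset_hours):
    """Render a timestamp at a fixed UTC offset."""
    zone = timezone(timedelta(hours=offset_hours))
    return ts.astimezone(zone).isoformat(sep=" ")


before_merge = display(stored, 1)  # offset given by the old Europe/Stockholm data
after_merge = display(stored, 2)   # offset given by the data after the merge
```

Here `before_merge` is `'1944-06-01 13:00:00+01:00'` and `after_merge` is `'1944-06-01 14:00:00+02:00'`: the stored instant is unchanged, only its rendering differs.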
DST law changes in Greenland and Mexico. Notably, a new timezone America/Ciudad_Juarez has been split off from America/Ojinaga. Historical corrections for northern Canada, Colombia, and Singapore. (cherry picked from commit 758f44b)
DST law changes in Egypt, Greenland, Morocco, and Palestine. When observing Moscow time, Europe/Kirov and Europe/Volgograd now use the abbreviations MSK/MSD instead of numeric abbreviations, for consistency with other timezones observing Moscow time. Also, America/Yellowknife is no longer distinct from America/Edmonton; this affects some pre-1948 timestamps in that area. (cherry picked from commit 4ddee4d)
DST law changes in Ittoqqortoormiit, Greenland (America/Scoresbysund), Kazakhstan (Asia/Almaty and Asia/Qostanay) and Palestine; as well as updates for the Antarctic stations Casey and Vostok. Historical corrections for Vietnam, Toronto, and Miquelon. (cherry picked from commit 272a7c3034925162deb4395bf925bcf60dc2d061)
Debian recently decided to split out a bunch of "obsolete" timezone names into a new tzdata-legacy package, which isn't installed by default. One of these zone names is Pacific/Enderbury, and that breaks our regression tests (on --with-system-tzdata builds) because our default timezone abbreviations list defines PHOT as Pacific/Enderbury. Pacific/Enderbury got renamed to Pacific/Kanton in tzdata 2021b, so that in distros that still have this entry it's just a symlink to Pacific/Kanton anyway. So one answer would be to redefine PHOT as Pacific/Kanton. However, then things would fail if the installed tzdata predates 2021b, which is recent enough that that seems like a real problem. Instead, let's just remove PHOT from the default list. That seems likely to affect nobody in the real world, because (a) it was an abbreviation that the tzdb crew made up in the first place, with no evidence of real-world usage, and (b) the total human population of the Phoenix Islands is less than two dozen persons, per Wikipedia. If anyone does use this zone abbreviation they can easily put it back via a custom abbreviations file. We'll keep PHOT in the Pacific.txt reference file, but change it to Pacific/Kanton there, as that definition seems more likely to be useful to future readers of that file. Per report from Victor Wagner. Back-patch to all supported branches. Discussion: https://postgr.es/m/[email protected] (cherry picked from commit 5fd3e06f6ad1035da71124a79208242cd915ba2e)
Add an extra test case for planning queries with ORCA that use CTEs with replicated tables. ORCA had a well-known problem with planning distributed queries that use CTEs with replicated tables. It was addressed several times in both adb and upstream (PRs 13833 - cdd532f, PR 14896, PR 13728, ADBDEV-2411), but only the upstream commit 24a54a7, which was included in 6.27, provided a more complete solution. This patch adds one more test case for the aforementioned problem; the case was not covered by the tests but has been fixed. For the query in this test, ORCA could not build a plan because the restrictions applied to CPhysicalSequence through CDistributionSpecNonSingleton were too strict.
CentOS Linux 7 reached end of life (EOL) on June 30, 2024, and the default servers for its yum repos were disabled. This patch changes the URLs in yum's configuration files.
Move CONTRIBUTING.md from .github to the root. Replace the contents of CONTRIBUTING.md in README.md with a link to avoid duplicating the text. Bring CONTRIBUTING.md and README.md up to date: update branch names and links, remove mentions of the mailing list, etc.
Add support for Ubuntu 20.04 and 22.04. List of changes:
- Update the dependency installation steps.
- Create a symbolic link to Python 2.
- Update some common platform steps.
Correctness was checked on clean installations of Ubuntu 20.04 and 22.04.
Commit 24a54a7 fixes many issues, including the correct planning of an inlined CTE with a replicated table. This patch adds an extra test for it.
GitHub actions/checkout@v3 is no longer working, and v4 is incompatible with older Linux versions due to actions/checkout#1590. This patch uses a workaround. The ACTIONS_ALLOW_USE_UNSECURE_NODE_VERSION variable is defined [here](https://github.com/actions/runner/blob/70746ff593636b07ad251a1525a3fabd1a7a36e9/src/Runner.Common/Constants.cs#L257) and used [here](https://github.com/actions/runner/blob/70746ff593636b07ad251a1525a3fabd1a7a36e9/src/Runner.Worker/Handlers/HandlerFactory.cs#L94-L124). A warning is added when the variable is not set.
…at DML (#969) Replicated tables are identical on all segments, so modifying a replicated table performs the same processing on each segment. Every segment calculates the count of modified (updated, deleted, or inserted) rows, and the count is the same on all segments. As a result, the QD received the same count of modified rows from each segment, summed the counts, and then divided the sum by the number of segments. However, this scheme did not work in all cases. One such case is a select from a function that contains DML on a replicated table. Such a query led to the error "consistency check on SPI tuple count failed". Also, the ExecutePlan function allows limiting the number of processed tuples, but for DML operations the number of tuples is obtained as the sum of the numbers reported by the segments. This led to the same error, because the QD processed fewer tuples than the segments reported in total. This patch changes the approach to counting rows for DML operations on replicated tables. If the table is replicated, only one segment calculates the count of modified rows; the other segments send zero. This is done by disabling the canSetTag flag on all segments except one. Dividing by the number of segments is no longer needed. Also, for SPI, calculation of the number of processed tuples was added for the cases where it is used in the _SPI_checktuples function. The behavior of _SPI_pquery was changed to match PostgreSQL. NOTE: the canSetTag flag enables saving the lastTid variable for INSERT queries. However, this is useless, because lastTid is used in only one place, the currtid_byreloid function, which has never been used. That function was removed in PostgreSQL 14: postgres/postgres@7b94e99
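The difference between the two counting schemes can be shown with a toy Python model. It only illustrates the arithmetic described above; `processed_rows_old` and `processed_rows_new` are hypothetical names, and nothing here corresponds to actual executor code.

```python
def processed_rows_old(per_segment_counts):
    """Old scheme: every segment reports its count; the QD sums and divides."""
    return sum(per_segment_counts) // len(per_segment_counts)


def processed_rows_new(rows_modified, num_segments):
    """New scheme: one segment reports the count, the others send zero."""
    per_segment_counts = [rows_modified] + [0] * (num_segments - 1)
    # No division is needed: the plain sum is already the real count.
    return sum(per_segment_counts)


# Five rows inserted into a replicated table on a three-segment cluster.
old_result = processed_rows_old([5, 5, 5])
new_result = processed_rows_new(5, 3)
undivided_old_sum = sum([5, 5, 5])
```

Both schemes give 5 here, but the old one depends on the division being applied on every code path: `undivided_old_sum` is 15, which is the kind of mismatch a tuple-count consistency check would reject.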
COPY to a mid-level partition inserted rows into that partition instead of into a leaf partition. This patch adds a check that disallows COPY to a mid-level partition. A correct COPY targets either the root partition or a leaf partition.
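A minimal Python sketch of such a check on a toy partition tree follows. The `Partition` class, `check_copy_target`, and the error text are hypothetical and only model the rule "root or leaf, nothing in between".

```python
class Partition:
    """A node in a toy partition hierarchy."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_root(self):
        return self.parent is None

    def is_leaf(self):
        return not self.children


def check_copy_target(part):
    """Allow COPY only into the root partition or a leaf partition."""
    if not part.is_root() and not part.is_leaf():
        raise ValueError(f"cannot COPY into mid-level partition {part.name!r}")
    return part.name


# root -> year_2024 (mid-level) -> jan_2024 (leaf)
root = Partition("sales")
mid = Partition("sales_2024", parent=root)
leaf = Partition("sales_2024_jan", parent=mid)
```

With this tree, `check_copy_target(root)` and `check_copy_target(leaf)` succeed, while `check_copy_target(mid)` raises `ValueError`.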
During the testing of resource groups, log files are created that can be useful in analyzing problems. This patch adds saving log files before deleting containers. The log files are archived and saved to the /logs directory inside the container, which is a volume mounted in the logs_cdw and logs_sdw1 directories for the corresponding containers on the host.
The test passed even when a query generated unexpected output. There was a typo on the "--end_ignore" line, which caused the diff after this line to be ignored.
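Why a single mistyped marker hides everything after it can be seen in a small Python model of an ignore-block filter. This is a sketch, not the actual diff tool: `strip_ignored` is a hypothetical function, the marker spellings `-- start_ignore` / `-- end_ignore` are assumed, and the misspelling used below is made up for the illustration.

```python
def strip_ignored(lines, start="-- start_ignore", end="-- end_ignore"):
    """Drop the lines between the start and end markers before comparing."""
    kept = []
    ignoring = False
    for line in lines:
        marker = line.strip()
        if marker == start:
            ignoring = True
        elif marker == end:
            ignoring = False
        elif not ignoring:
            kept.append(line)
    return kept


good = ["select 1;", "-- start_ignore", "noise", "-- end_ignore", "select 2;"]
# Hypothetical typo in the end marker: the ignore block is never closed.
typo = ["select 1;", "-- start_ignore", "noise", "-- end_ignroe", "select 2;"]
```

`strip_ignored(good)` keeps both statements, while `strip_ignored(typo)` keeps only `"select 1;"`, so any difference after the broken marker goes unnoticed and the test passes.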
The test added in commit 3dea58d turned out to be flaky, because it used the table names t1 and t2, which are already used by another test running in the same parallel group. This patch renames tables t1 and t2 to unique names in the appendonly test.
* support passwordless configuration in gpperfmon_install
If there is insufficient memory to execute a query, postgres spills some data to disk. However, if the spill file size limit is hit, the query is cancelled. When spilling tuples to disk, postgres frees them from memory but does not mark them as inaccessible. So, if an error occurs, tuplestore_end cycles through all tuples and tries to free them. Attempting to free an already freed tuple in tuplestore_end causes a segfault. The segfault happens because gpdb has an additional field in the memory chunk header, sharedHeader. Once a tuple has been freed, sharedHeader is set to NULL in AllocSetFree, so the next attempt to free the tuple dereferences sharedHeader and crashes. The segfault sends postgres into the segfault handler, which drops the current stack. This happens in the middle of the working loop in dumptuple, so the loop never finishes. The next attempt to clear the tuplestore tries to free the same tuples once again, which causes another segfault. This patch counts the number of freed tuples so that tuplestore_end skips them in case of an error. Ticket: ADBDEV-5729
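The idea of the fix can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the tuplestore code: `TupleStoreSketch` and its members are hypothetical, tuples are assumed to be freed front to back, and a double free is modelled as a `RuntimeError` instead of a segfault.

```python
class TupleStoreSketch:
    """Toy tuplestore that remembers how many tuples were already freed."""

    def __init__(self, tuples):
        self.tuples = list(tuples)
        self.freed_count = 0   # tuples at the front already freed by a spill
        self.free_calls = []   # record of every tuple that was freed

    def _free(self, index):
        if self.tuples[index] is None:
            raise RuntimeError("double free")  # stands in for the segfault
        self.free_calls.append(self.tuples[index])
        self.tuples[index] = None

    def dump_tuples(self, fail_at=None):
        """Spill tuples to disk, freeing each one; may fail part-way."""
        for i in range(self.freed_count, len(self.tuples)):
            if fail_at is not None and i == fail_at:
                raise MemoryError("spill file size limit reached")
            self._free(i)
            self.freed_count += 1

    def end(self):
        """Release the remaining tuples, skipping those already freed."""
        for i in range(self.freed_count, len(self.tuples)):
            self._free(i)
        self.freed_count = len(self.tuples)


store = TupleStoreSketch(["a", "b", "c", "d"])
try:
    store.dump_tuples(fail_at=2)   # the spill fails after freeing "a" and "b"
except MemoryError:
    pass
store.end()                        # frees only "c" and "d"
```

Because `end` starts at `freed_count`, each tuple is freed exactly once even though the spill was interrupted; starting from index 0 instead would hit the double-free path on "a".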
Allure report https://allure.adsw.io/launch/76385
RekGRpth
approved these changes
Jul 30, 2024