Skip to content

Commit 4656e3d

Browse files
committed
Replace CLOBBER_CACHE_ALWAYS with run-time GUC
Forced cache invalidation (CLOBBER_CACHE_ALWAYS) has been impractical to use for testing in PostgreSQL because it's so slow and because it's toggled on/off only at build time. It is helpful when hunting bugs in any code that uses the sycache/relcache because causes cache invalidations to be injected whenever it would be possible for an invalidation to occur, whether or not one was really pending. Address this by providing run-time control over cache clobber behaviour using the new debug_invalidate_system_caches_always GUC. Support is not compiled in at all unless assertions are enabled or CLOBBER_CACHE_ENABLED is explicitly defined at compile time. It defaults to 0 if compiled in, so it has negligible effect on assert build performance by default. When support is compiled in, test code can now set debug_invalidate_system_caches_always=1 locally to a backend to test specific queries, functions, extensions, etc. Or tests can toggle it globally for a specific test case while retaining normal performance during test setup and teardown. For backwards compatibility with existing test harnesses and scripts, debug_invalidate_system_caches_always defaults to 1 if CLOBBER_CACHE_ALWAYS is defined, and to 3 if CLOBBER_CACHE_RECURSIVE is defined. CLOBBER_CACHE_ENABLED is now visible in pg_config_manual.h, as is the related RECOVER_RELATION_BUILD_MEMORY setting for the relcache. Author: Craig Ringer <[email protected]> Discussion: https://www.postgresql.org/message-id/flat/CAMsr+YF=+ctXBZj3ywmvKNUjWpxmuTuUKuv-rgbHGX5i5pLstQ@mail.gmail.com
1 parent 8900b5a commit 4656e3d

File tree

9 files changed

+169
-54
lines changed

9 files changed

+169
-54
lines changed

doc/src/sgml/config.sgml

+37
Original file line numberDiff line numberDiff line change
@@ -9989,6 +9989,43 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
99899989
</listitem>
99909990
</varlistentry>
99919991

9992+
<varlistentry id="guc-debug-invalidate-system-caches-always" xreflabel="debug_invalidate_system_caches_always">
9993+
<term><varname>debug_invalidate_system_caches_always</varname> (<type>integer</type>)
9994+
<indexterm>
9995+
<primary><varname>debug_invalidate_system_caches_always</varname> configuration parameter</primary>
9996+
</indexterm>
9997+
</term>
9998+
<listitem>
9999+
<para>
10000+
When set to 1, each cache lookup for a system catalog entry is
10001+
invalidated at the first possible opportunity, irrespective of whether
10002+
anything that would render it invalid really occurred. Caching of
10003+
system catalogs is effectively disabled as a result, so the server
10004+
will run extremely slowly. Higher values run the cache invalidation
10005+
recursively, which is even slower and useful only useful for testing
10006+
in very specific scenarios.
10007+
</para>
10008+
10009+
<para>
10010+
This option can be very helpful when trying to trigger
10011+
hard-to-reproduce bugs involving concurrency and catalog changes but
10012+
is otherwise rarely needed. See the source code files
10013+
<filename>inval.c</filename> and
10014+
<filename>pg_config_manual.h</filename> for details.
10015+
</para>
10016+
10017+
<para>
10018+
This setting is supported but off by default (0) when
10019+
<symbol>CLOBBER_CACHE_ENABLED</symbol> is defined at compile time
10020+
(which happens automatically when using the
10021+
<literal>configure</literal> option
10022+
<option>--enable-cassert</option>). In production builds, its value
10023+
will always be <literal>0</literal> and attempts to set it to another
10024+
value will raise an error.
10025+
</para>
10026+
</listitem>
10027+
</varlistentry>
10028+
999210029
<varlistentry id="guc-ignore-system-indexes" xreflabel="ignore_system_indexes">
999310030
<term><varname>ignore_system_indexes</varname> (<type>boolean</type>)
999410031
<indexterm>

doc/src/sgml/regress.sgml

+2-1
Original file line numberDiff line numberDiff line change
@@ -372,7 +372,8 @@ make check EXTRA_REGRESS_OPTS="--temp-config=test_postgresql.conf"
372372

373373
<para>
374374
This can be useful to enable additional logging, adjust resource limits,
375-
or enable extra run-time checks.
375+
or enable extra run-time checks such as <xref
376+
linkend="guc-debug-invalidate-system-caches-always"/>.
376377
</para>
377378
</sect2>
378379

src/backend/utils/adt/lockfuncs.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -629,7 +629,7 @@ pg_isolation_test_session_is_blocked(PG_FUNCTION_ARGS)
629629
* Check if any of these are in the list of interesting PIDs, that being
630630
* the sessions that the isolation tester is running. We don't use
631631
* "arrayoverlaps" here, because it would lead to cache lookups and one of
632-
* our goals is to run quickly under CLOBBER_CACHE_ALWAYS. We expect
632+
* our goals is to run quickly with debug_invalidate_system_caches_always > 0. We expect
633633
* blocking_pids to be usually empty and otherwise a very small number in
634634
* isolation tester cases, so make that the outer loop of a naive search
635635
* for a match.

src/backend/utils/cache/inval.c

+24-24
Original file line numberDiff line numberDiff line change
@@ -109,6 +109,7 @@
109109
#include "storage/sinval.h"
110110
#include "storage/smgr.h"
111111
#include "utils/catcache.h"
112+
#include "utils/guc.h"
112113
#include "utils/inval.h"
113114
#include "utils/memdebug.h"
114115
#include "utils/memutils.h"
@@ -179,6 +180,8 @@ static SharedInvalidationMessage *SharedInvalidMessagesArray;
179180
static int numSharedInvalidMessagesArray;
180181
static int maxSharedInvalidMessagesArray;
181182

183+
/* GUC storage */
184+
int debug_invalidate_system_caches_always = 0;
182185

183186
/*
184187
* Dynamically-registered callback functions. Current implementation
@@ -689,35 +692,32 @@ AcceptInvalidationMessages(void)
689692
/*
690693
* Test code to force cache flushes anytime a flush could happen.
691694
*
692-
* If used with CLOBBER_FREED_MEMORY, CLOBBER_CACHE_ALWAYS provides a
693-
* fairly thorough test that the system contains no cache-flush hazards.
694-
* However, it also makes the system unbelievably slow --- the regression
695-
* tests take about 100 times longer than normal.
695+
* This helps detect intermittent faults caused by code that reads a
696+
* cache entry and then performs an action that could invalidate the entry,
697+
* but rarely actually does so. This can spot issues that would otherwise
698+
* only arise with badly timed concurrent DDL, for example.
696699
*
697-
* If you're a glutton for punishment, try CLOBBER_CACHE_RECURSIVELY. This
698-
* slows things by at least a factor of 10000, so I wouldn't suggest
699-
* trying to run the entire regression tests that way. It's useful to try
700-
* a few simple tests, to make sure that cache reload isn't subject to
701-
* internal cache-flush hazards, but after you've done a few thousand
702-
* recursive reloads it's unlikely you'll learn more.
700+
* The default debug_invalidate_system_caches_always = 0 does no forced cache flushes.
701+
*
702+
* If used with CLOBBER_FREED_MEMORY, debug_invalidate_system_caches_always = 1
703+
* (CLOBBER_CACHE_ALWAYS) provides a fairly thorough test that the system
704+
* contains no cache-flush hazards. However, it also makes the system
705+
* unbelievably slow --- the regression tests take about 100 times longer
706+
* than normal.
707+
*
708+
* If you're a glutton for punishment, try debug_invalidate_system_caches_always = 3
709+
* (CLOBBER_CACHE_RECURSIVELY). This slows things by at least a factor
710+
* of 10000, so I wouldn't suggest trying to run the entire regression
711+
* tests that way. It's useful to try a few simple tests, to make sure
712+
* that cache reload isn't subject to internal cache-flush hazards, but
713+
* after you've done a few thousand recursive reloads it's unlikely
714+
* you'll learn more.
703715
*/
704-
#if defined(CLOBBER_CACHE_ALWAYS)
705-
{
706-
static bool in_recursion = false;
707-
708-
if (!in_recursion)
709-
{
710-
in_recursion = true;
711-
InvalidateSystemCaches();
712-
in_recursion = false;
713-
}
714-
}
715-
#elif defined(CLOBBER_CACHE_RECURSIVELY)
716+
#ifdef CLOBBER_CACHE_ENABLED
716717
{
717718
static int recursion_depth = 0;
718719

719-
/* Maximum depth is arbitrary depending on your threshold of pain */
720-
if (recursion_depth < 3)
720+
if (recursion_depth < debug_invalidate_system_caches_always)
721721
{
722722
recursion_depth++;
723723
InvalidateSystemCaches();

src/backend/utils/cache/plancache.c

+1-1
Original file line numberDiff line numberDiff line change
@@ -897,7 +897,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
897897
* rejected a generic plan, it's possible to reach here with is_valid
898898
* false due to an invalidation while making the generic plan. In theory
899899
* the invalidation must be a false positive, perhaps a consequence of an
900-
* sinval reset event or the CLOBBER_CACHE_ALWAYS debug code. But for
900+
* sinval reset event or the debug_invalidate_system_caches_always code. But for
901901
* safety, let's treat it as real and redo the RevalidateCachedQuery call.
902902
*/
903903
if (!plansource->is_valid)

src/backend/utils/cache/relcache.c

+40-27
Original file line numberDiff line numberDiff line change
@@ -91,15 +91,15 @@
9191
#define RELCACHE_INIT_FILEMAGIC 0x573266 /* version ID value */
9292

9393
/*
94-
* Default policy for whether to apply RECOVER_RELATION_BUILD_MEMORY:
95-
* do so in clobber-cache builds but not otherwise. This choice can be
96-
* overridden at compile time with -DRECOVER_RELATION_BUILD_MEMORY=1 or =0.
94+
* Whether to bother checking if relation cache memory needs to be freed
95+
* eagerly. See also RelationBuildDesc() and pg_config_manual.h.
9796
*/
98-
#ifndef RECOVER_RELATION_BUILD_MEMORY
99-
#if defined(CLOBBER_CACHE_ALWAYS) || defined(CLOBBER_CACHE_RECURSIVELY)
100-
#define RECOVER_RELATION_BUILD_MEMORY 1
97+
#if defined(RECOVER_RELATION_BUILD_MEMORY) && (RECOVER_RELATION_BUILD_MEMORY != 0)
98+
#define MAYBE_RECOVER_RELATION_BUILD_MEMORY 1
10199
#else
102100
#define RECOVER_RELATION_BUILD_MEMORY 0
101+
#ifdef CLOBBER_CACHE_ENABLED
102+
#define MAYBE_RECOVER_RELATION_BUILD_MEMORY 1
103103
#endif
104104
#endif
105105

@@ -1040,19 +1040,25 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
10401040
* scope, and relcache loads shouldn't happen so often that it's essential
10411041
* to recover transient data before end of statement/transaction. However
10421042
* that's definitely not true in clobber-cache test builds, and perhaps
1043-
* it's not true in other cases. If RECOVER_RELATION_BUILD_MEMORY is not
1044-
* zero, arrange to allocate the junk in a temporary context that we'll
1045-
* free before returning. Make it a child of caller's context so that it
1046-
* will get cleaned up appropriately if we error out partway through.
1043+
* it's not true in other cases.
1044+
*
1045+
* When cache clobbering is enabled or when forced to by
1046+
* RECOVER_RELATION_BUILD_MEMORY=1, arrange to allocate the junk in a
1047+
* temporary context that we'll free before returning. Make it a child
1048+
* of caller's context so that it will get cleaned up appropriately if
1049+
* we error out partway through.
10471050
*/
1048-
#if RECOVER_RELATION_BUILD_MEMORY
1049-
MemoryContext tmpcxt;
1050-
MemoryContext oldcxt;
1051+
#ifdef MAYBE_RECOVER_RELATION_BUILD_MEMORY
1052+
MemoryContext tmpcxt = NULL;
1053+
MemoryContext oldcxt = NULL;
10511054

1052-
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
1053-
"RelationBuildDesc workspace",
1054-
ALLOCSET_DEFAULT_SIZES);
1055-
oldcxt = MemoryContextSwitchTo(tmpcxt);
1055+
if (RECOVER_RELATION_BUILD_MEMORY || debug_invalidate_system_caches_always > 0)
1056+
{
1057+
tmpcxt = AllocSetContextCreate(CurrentMemoryContext,
1058+
"RelationBuildDesc workspace",
1059+
ALLOCSET_DEFAULT_SIZES);
1060+
oldcxt = MemoryContextSwitchTo(tmpcxt);
1061+
}
10561062
#endif
10571063

10581064
/*
@@ -1065,10 +1071,13 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
10651071
*/
10661072
if (!HeapTupleIsValid(pg_class_tuple))
10671073
{
1068-
#if RECOVER_RELATION_BUILD_MEMORY
1069-
/* Return to caller's context, and blow away the temporary context */
1070-
MemoryContextSwitchTo(oldcxt);
1071-
MemoryContextDelete(tmpcxt);
1074+
#ifdef MAYBE_RECOVER_RELATION_BUILD_MEMORY
1075+
if (tmpcxt)
1076+
{
1077+
/* Return to caller's context, and blow away the temporary context */
1078+
MemoryContextSwitchTo(oldcxt);
1079+
MemoryContextDelete(tmpcxt);
1080+
}
10721081
#endif
10731082
return NULL;
10741083
}
@@ -1247,10 +1256,13 @@ RelationBuildDesc(Oid targetRelId, bool insertIt)
12471256
/* It's fully valid */
12481257
relation->rd_isvalid = true;
12491258

1250-
#if RECOVER_RELATION_BUILD_MEMORY
1251-
/* Return to caller's context, and blow away the temporary context */
1252-
MemoryContextSwitchTo(oldcxt);
1253-
MemoryContextDelete(tmpcxt);
1259+
#ifdef MAYBE_RECOVER_RELATION_BUILD_MEMORY
1260+
if (tmpcxt)
1261+
{
1262+
/* Return to caller's context, and blow away the temporary context */
1263+
MemoryContextSwitchTo(oldcxt);
1264+
MemoryContextDelete(tmpcxt);
1265+
}
12541266
#endif
12551267

12561268
return relation;
@@ -1646,8 +1658,9 @@ LookupOpclassInfo(Oid operatorClassOid,
16461658
* while we are loading the info, and it's very hard to provoke that if
16471659
* this happens only once per opclass per backend.
16481660
*/
1649-
#if defined(CLOBBER_CACHE_ALWAYS)
1650-
opcentry->valid = false;
1661+
#ifdef CLOBBER_CACHE_ENABLED
1662+
if (debug_invalidate_system_caches_always > 0)
1663+
opcentry->valid = false;
16511664
#endif
16521665

16531666
if (opcentry->valid)

src/backend/utils/misc/guc.c

+24
Original file line numberDiff line numberDiff line change
@@ -99,6 +99,7 @@
9999
#include "utils/rls.h"
100100
#include "utils/snapmgr.h"
101101
#include "utils/tzparser.h"
102+
#include "utils/inval.h"
102103
#include "utils/varlena.h"
103104
#include "utils/xml.h"
104105

@@ -3402,6 +3403,29 @@ static struct config_int ConfigureNamesInt[] =
34023403
check_huge_page_size, NULL, NULL
34033404
},
34043405

3406+
{
3407+
{"debug_invalidate_system_caches_always", PGC_SUSET, DEVELOPER_OPTIONS,
3408+
gettext_noop("Aggressively invalidate system caches for debugging purposes."),
3409+
NULL,
3410+
GUC_NOT_IN_SAMPLE
3411+
},
3412+
&debug_invalidate_system_caches_always,
3413+
#ifdef CLOBBER_CACHE_ENABLED
3414+
/* Set default based on older compile-time-only cache clobber macros */
3415+
#if defined(CLOBBER_CACHE_RECURSIVELY)
3416+
3,
3417+
#elif defined(CLOBBER_CACHE_ALWAYS)
3418+
1,
3419+
#else
3420+
0,
3421+
#endif
3422+
0, 5,
3423+
#else /* not CLOBBER_CACHE_ENABLED */
3424+
0, 0, 0,
3425+
#endif /* not CLOBBER_CACHE_ENABLED */
3426+
NULL, NULL, NULL
3427+
},
3428+
34053429
/* End-of-list marker */
34063430
{
34073431
{NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL, NULL

src/include/pg_config_manual.h

+39
Original file line numberDiff line numberDiff line change
@@ -309,6 +309,45 @@
309309
*/
310310
/* #define RANDOMIZE_ALLOCATED_MEMORY */
311311

312+
/*
313+
* For cache invalidation debugging, define CLOBBER_CACHE_ENABLED to enable
314+
* use of the debug_invalidate_system_caches_always GUC to aggressively flush
315+
* syscache/relcache entries whenever it's possible to deliver invalidations.
316+
* See AcceptInvalidationMessages() in src/backend/utils/cache/inval.c for
317+
* details.
318+
*
319+
* USE_ASSERT_CHECKING builds default to enabling this. It's possible to use
320+
* CLOBBER_CACHE_ENABLED without a cassert build and the implied
321+
* CLOBBER_FREED_MEMORY and MEMORY_CONTEXT_CHECKING options but it's unlikely
322+
* to be as effective at identifying problems.
323+
*/
324+
/* #define CLOBBER_CACHE_ENABLED */
325+
326+
#if defined(USE_ASSERT_CHECKING) && !defined(CLOBBER_CACHE_ENABLED)
327+
#define CLOBBER_CACHE_ENABLED
328+
#endif
329+
330+
/*
331+
* Backwards compatibility for the older compile-time-only cache clobber
332+
* macros.
333+
*/
334+
#if !defined(CLOBBER_CACHE_ENABLED) && (defined(CLOBBER_CACHE_ALWAYS) || defined(CLOBBER_CACHE_RECURSIVELY))
335+
#define CLOBBER_CACHE_ENABLED
336+
#endif
337+
338+
/*
339+
* Recover memory used for relcache entries when invalidated. See
340+
* RelationBuildDescr() in src/backend/utils/cache/relcache.c.
341+
*
342+
* This is active automatically for clobber cache builds when clobbering is
343+
* active, but can be overridden here by explicitly defining
344+
* RECOVER_RELATION_BUILD_MEMORY. Define to 1 to always free relation cache
345+
* memory even when clobber is off, or to 0 to never free relation cache
346+
* memory even when clobbering is on.
347+
*/
348+
/* #define RECOVER_RELATION_BUILD_MEMORY 0 */ /* Force disable */
349+
/* #define RECOVER_RELATION_BUILD_MEMORY 1 */ /* Force enable */
350+
312351
/*
313352
* Define this to force all parse and plan trees to be passed through
314353
* copyObject(), to facilitate catching errors and omissions in

src/include/utils/inval.h

+1
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@
1818
#include "storage/relfilenode.h"
1919
#include "utils/relcache.h"
2020

21+
extern PGDLLIMPORT int debug_invalidate_system_caches_always;
2122

2223
typedef void (*SyscacheCallbackFunction) (Datum arg, int cacheid, uint32 hashvalue);
2324
typedef void (*RelcacheCallbackFunction) (Datum arg, Oid relid);

0 commit comments

Comments
 (0)