Skip to content

HBASE-30180: Can still add data to read-only region after flipping read-only flag multiple times#8280

Open
kgeisz wants to merge 2 commits into
apache:masterfrom
kgeisz:HBASE-30180-can-add-data-to-read-only-table
Open

HBASE-30180: Can still add data to read-only region after flipping read-only flag multiple times#8280
kgeisz wants to merge 2 commits into
apache:masterfrom
kgeisz:HBASE-30180-can-add-data-to-read-only-table

Conversation

@kgeisz
Copy link
Copy Markdown
Contributor

@kgeisz kgeisz commented May 28, 2026

https://issues.apache.org/jira/browse/HBASE-30180

Summary

This pull request fixes an issue with the new Read-Replica feature where if you create a table on the active cluster and then flip the read-only flag multiple times, then sometimes it is still possible to add data to a table despite the cluster being in read-only mode. This issue was happening due to how the configuration for HRegion was being updated after running update_all_config in the HBase shell.

Read-only mode is determined by checking whether the ReadOnly coprocessors are loaded on the cluster. If they are present, then read-only mode is active, and vice-vera if they are not present. The HRegion.onConfigurationChange() method was not properly updating HRegion's configuration when trying to add/remove the ReadOnly coprocessors. It was updating both HRegion's this.conf and the newConf provided in the onConfigurationChangeMethod(). Furthermore, the incorrect configuration was being provided when updating HRegion's RegionCoprocessorHost.

HRegion's this.conf object is a special CompoundConfiguration object, so this needs to have the ReadOnly coprocessors added to or removed from directly. After, this CompoundConfiguration should be used to update HRegion's this.coprocessorHost object since this was originally initialized using a CompoundConfiguration.

More Details

HMaster and HRegionServer Configuration

When a cluster's configuration is dynamically updated, the configuration is reloaded and all observers are notified via the HBaseServerBase.updateConfiguration() method. For HMaster and HRegionServer, their this.conf references the same object as the newConf arg in their respective onConfigurationChange() methods. The following custom log messages showing hash codes support this:

# From HMaster (note hash 1923999715 for both this.conf and newConf)
--------------------------------------------------------------------
2026-05-27T23:51:04,911 INFO  [RpcServer.priority.RWQ.Fifo.read.handler=12,queue=3,port=16000] master.HMaster: kevin: HMaster: System.identityHashCode(newConf) = 1923999715
2026-05-27T23:51:04,911 INFO  [RpcServer.priority.RWQ.Fifo.read.handler=12,queue=3,port=16000] master.HMaster: kevin: HMaster: System.identityHashCode(this.conf) = 1923999715


# From HRegionServer (note hash 351962798 for both this.conf and newConf)
-------------------------------------------------------------------------
2026-05-27T23:51:04,930 INFO  [RpcServer.priority.RWQ.Fifo.read.handler=12,queue=3,port=16020] regionserver.HRegionServer: kevin: HRegionServer: System.identityHashCode(newConf) = 351962798
2026-05-27T23:51:04,930 INFO  [RpcServer.priority.RWQ.Fifo.read.handler=12,queue=3,port=16020] regionserver.HRegionServer: kevin: HRegionServer: System.identityHashCode(this.conf) = 351962798


# From HRegion for table 't1' (note hash 351962798 for newConf and this.baseConf versus 1398174598 hash for this.conf)
----------------------------------------------------------------------------------------------------------------------
2026-05-27T23:51:48,203 INFO  [RpcServer.priority.RWQ.Fifo.read.handler=5,queue=3,port=16020] regionserver.HRegion: kevin: HRegion t1,,1779925880045.ef09d14e54df7af1e951c894a5556acf.: System.identityHashCode(newConf) = 351962798
2026-05-27T23:51:48,203 INFO  [RpcServer.priority.RWQ.Fifo.read.handler=5,queue=3,port=16020] regionserver.HRegion: kevin: HRegion t1,,1779925880045.ef09d14e54df7af1e951c894a5556acf.: System.identityHashCode(this.conf) = 1398174598
2026-05-27T23:51:48,203 INFO  [RpcServer.priority.RWQ.Fifo.read.handler=5,queue=3,port=16020] regionserver.HRegion: kevin: HRegion t1,,1779925880045.ef09d14e54df7af1e951c894a5556acf.: System.identityHashCode(this.baseConf) = 351962798

This means it does not matter whether you update this.conf or the newConf reference in HMaster's or HRegionServer's onConfigurationChange() method since they reference the same Configuration object.

HRegion Configuration

HRegion's configuration is different from HMaster and HRegionServer. HRegion's this.conf is actually a CompoundConfiguration object, which is a shallow merge of multiple Configurations. It is initialized like so in an HRegion constructor:

this.baseConf = confParam;
this.conf = new CompoundConfiguration().add(confParam).addBytesMap(htd.getValues());

Based on this initialization and the log messages and hash codes shared above, we see HRegionServer's configuration is the base conf for HRegion's this.conf. We can also see the newConf arg in HRegion.onConfigurationChange() is not the same object reference as HRegion's this.conf. Since the CompoundConfiguration uses a shallow merge, updating HRegionServer's configuration also updates a layer of the CompoundConfiguration. The original code was incorrectly updating newConf for HRegion, so this leads to a race condition where updating HRegionServer also updates all regions automatically, and eventually a region would not be properly updated (such as the ReadOnly coprocessors not being removed when they should have been).

In addition, HRegion's this.coprocessorHost object was being reinitialized using newConf's Configuration object which it should have been using an updated version of this.conf's CompoundConfiguration object.

…ad-only flag multiple times

Change-Id: I8c5edf6ae4775d36dbbe2b260f77e2168a2a92da
@kgeisz kgeisz force-pushed the HBASE-30180-can-add-data-to-read-only-table branch from 3f2cc32 to 024d94c Compare May 28, 2026 00:53
@taklwu taklwu requested review from Copilot and taklwu May 28, 2026 01:20
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes HBASE-30180 where toggling the read-only flag multiple times could leave a region writable. The root cause is that HRegion.this.conf is a CompoundConfiguration distinct from the newConf passed to onConfigurationChange(). The previous code synced the read-only coprocessor list onto newConf and built a new RegionCoprocessorHost from newConf rather than from this.conf, missing the table-descriptor layer and leaving stale data. For HMaster/HRegionServer, this.conf and newConf are the same object, so the issue is region-specific.

Changes:

  • Overload maybeUpdateCoprocessors to accept a separate confToUpdate, used for syncing read-only coprocessors and passed into the reload lambda.
  • Pass this.conf (the CompoundConfiguration) as confToUpdate in HRegion.onConfigurationChange so the new RegionCoprocessorHost is built from the correct config layer; remove the now-unneeded updateCoprocessorListInConf helper.
  • Simplify the HMaster/HRegionServer call sites since their this.conf == newConf.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
hbase-server/src/main/java/org/apache/hadoop/hbase/util/CoprocessorConfigurationUtil.java Adds confToUpdate parameter to maybeUpdateCoprocessors, removes obsolete updateCoprocessorListInConf helper, gates syncReadOnlyConfigurations on read-only-mode change only.
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java Passes this.conf (CompoundConfiguration) as confToUpdate and uses it for rebuilding RegionCoprocessorHost.
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java Simplifies lambda since this.conf == newConf.
hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java Same simplification as HRegionServer.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

@taklwu taklwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would you be able to come up with a unit test to prevent this error from happening?

CoprocessorConfigurationUtil.syncReadOnlyConfigurations(newConf, coprocessorConfKey);
reloadTask.reload(newConf);
componentName, instance);
if (hasReadOnlyModeChanged) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this reminded me about #7514 were facing the same issue, IMO you're doing the right change.

Change-Id: I11accf45b4e92184f289652bba9a32e7651b5af7
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants