Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove support for delaying state recovery pending master #53845

Merged
merged 17 commits into from
Feb 3, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 22 additions & 0 deletions docs/reference/migration/migrate_8_0/settings.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -144,3 +144,25 @@ may be set to false.
Discontinue use of the removed settings. Specifying these settings in
`elasticsearch.yml` will result in an error on startup.
====

.Settings used to defer cluster recovery pending a certain number of master nodes have been removed.
[%collapsible]
====
*Details* +
The following settings were deprecated in {es} 7.8.0 and have been removed in
{es} 8.0.0:

* `gateway.expected_nodes`
* `gateway.expected_master_nodes`
* `gateway.recover_after_nodes`
* `gateway.recover_after_master_nodes`

It is safe to recover the cluster as soon as a majority of master-eligible
nodes have joined so there is no benefit in waiting for any additional
master-eligible nodes to start.

*Impact* +
Discontinue use of the removed settings. If needed, use
`gateway.expected_data_nodes` or `gateway.recover_after_data_nodes` to defer
cluster recovery pending a certain number of data nodes.
====
37 changes: 11 additions & 26 deletions docs/reference/modules/gateway.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -10,20 +10,6 @@ recover the cluster state and the cluster's data.

NOTE: These settings only take effect on a full cluster restart.

`gateway.expected_nodes`::
(<<static-cluster-setting,Static>>)
deprecated:[7.7.0, This setting will be removed in 8.0. Use `gateway.expected_data_nodes` instead.]
Number of data or master nodes expected in the cluster.
Recovery of local shards begins when the expected number of
nodes join the cluster. Defaults to `0`.

`gateway.expected_master_nodes`::
(<<static-cluster-setting,Static>>)
deprecated:[7.7.0, This setting will be removed in 8.0. Use `gateway.expected_data_nodes` instead.]
Number of master nodes expected in the cluster.
Recovery of local shards begins when the expected number of
master nodes join the cluster. Defaults to `0`.

`gateway.expected_data_nodes`::
(<<static-cluster-setting,Static>>)
Number of data nodes expected in the cluster.
Expand All @@ -34,25 +20,24 @@ data nodes join the cluster. Defaults to `0`.
(<<static-cluster-setting,Static>>)
If the expected number of nodes is not achieved, the recovery process waits
for the configured amount of time before trying to recover.
Defaults to `5m` if one of the `expected_nodes` settings is configured.
Defaults to `5m`.
+
Once the `recover_after_time` duration has timed out, recovery will start
as long as the following conditions are met:

`gateway.recover_after_nodes`::
(<<static-cluster-setting,Static>>)
deprecated:[7.7.0, This setting will be removed in 8.0. Use `gateway.recover_after_data_nodes` instead.]
Recover as long as this many data or master nodes have joined the cluster.

`gateway.recover_after_master_nodes`::
(<<static-cluster-setting,Static>>)
deprecated:[7.7.0, This setting will be removed in 8.0. Use `gateway.recover_after_data_nodes` instead.]
Recover as long as this many master nodes have joined the cluster.
as long as the following condition is met:

`gateway.recover_after_data_nodes`::
(<<static-cluster-setting,Static>>)
Recover as long as this many data nodes have joined the cluster.

These settings can be configured in `elasticsearch.yml` as follows:

[source,yaml]
--------------------------------------------------
gateway.expected_data_nodes: 3
gateway.recover_after_time: 600s
gateway.recover_after_data_nodes: 3
--------------------------------------------------

[[dangling-indices]]
==== Dangling indices

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
import org.elasticsearch.common.Priority;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.util.concurrent.EsExecutors;
import org.elasticsearch.gateway.GatewayService;
import org.elasticsearch.monitor.os.OsStats;
import org.elasticsearch.node.NodeRoleSettings;
import org.elasticsearch.test.ESIntegTestCase;
Expand Down Expand Up @@ -225,16 +226,12 @@ public void testAllocatedProcessors() throws Exception {
assertThat(response.getNodesStats().getOs().getAllocatedProcessors(), equalTo(nodeProcessors));
}

public void testClusterStatusWhenStateNotRecovered() throws Exception {
internalCluster().startMasterOnlyNode(Settings.builder().put("gateway.recover_after_nodes", 2).build());
public void testClusterStatusWhenStateNotRecovered() {
internalCluster().startMasterOnlyNode(Settings.builder().put(GatewayService.RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 1).build());
ClusterStatsResponse response = client().admin().cluster().prepareClusterStats().get();
assertThat(response.getStatus(), equalTo(ClusterHealthStatus.RED));

if (randomBoolean()) {
internalCluster().startMasterOnlyNode();
} else {
internalCluster().startDataOnlyNode();
}
internalCluster().startDataOnlyNode();
// wait for the cluster status to settle
ensureGreen();
response = client().admin().cluster().prepareClusterStats().get();
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -62,12 +62,12 @@ public void testPendingTasksWithClusterNotRecoveredBlock() throws Exception {
}

// restart the cluster but prevent it from performing state recovery
final int nodeCount = client().admin().cluster().prepareNodesInfo("data:true", "master:true").get().getNodes().size();
final int nodeCount = client().admin().cluster().prepareNodesInfo("data:true").get().getNodes().size();
internalCluster().fullRestart(new InternalTestCluster.RestartCallback() {
@Override
public Settings onNodeStopped(String nodeName) {
return Settings.builder()
.put(GatewayService.RECOVER_AFTER_NODES_SETTING.getKey(), nodeCount + 1)
.put(GatewayService.RECOVER_AFTER_DATA_NODES_SETTING.getKey(), nodeCount + 1)
.build();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@

import java.util.Set;

import static org.elasticsearch.gateway.GatewayService.RECOVER_AFTER_DATA_NODES_SETTING;
import static org.elasticsearch.test.NodeRoles.dataOnlyNode;
import static org.elasticsearch.test.NodeRoles.masterOnlyNode;
import static org.hamcrest.Matchers.equalTo;
Expand All @@ -44,79 +45,16 @@ public Client startNode(Settings.Builder settings) {
return internalCluster().client(name);
}

public void testRecoverAfterNodes() throws Exception {
internalCluster().setBootstrapMasterNodeIndex(0);
logger.info("--> start node (1)");
Client clientNode1 = startNode(Settings.builder().put("gateway.recover_after_nodes", 3));
assertThat(clientNode1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start node (2)");
Client clientNode2 = startNode(Settings.builder().put("gateway.recover_after_nodes", 3));
Thread.sleep(BLOCK_WAIT_TIMEOUT.millis());
assertThat(clientNode1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));
assertThat(clientNode2.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start node (3)");
Client clientNode3 = startNode(Settings.builder().put("gateway.recover_after_nodes", 3));

assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, clientNode1).isEmpty(), equalTo(true));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, clientNode2).isEmpty(), equalTo(true));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, clientNode3).isEmpty(), equalTo(true));
}

public void testRecoverAfterMasterNodes() throws Exception {
internalCluster().setBootstrapMasterNodeIndex(0);
logger.info("--> start master_node (1)");
Client master1 = startNode(Settings.builder().put("gateway.recover_after_master_nodes", 2).put(masterOnlyNode()));
assertThat(master1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start data_node (1)");
Client data1 = startNode(Settings.builder().put("gateway.recover_after_master_nodes", 2).put(dataOnlyNode()));
assertThat(master1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));
assertThat(data1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start data_node (2)");
Client data2 = startNode(Settings.builder().put("gateway.recover_after_master_nodes", 2).put(dataOnlyNode()));
assertThat(master1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));
assertThat(data1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));
assertThat(data2.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start master_node (2)");
Client master2 = startNode(Settings.builder().put("gateway.recover_after_master_nodes", 2).put(masterOnlyNode()));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, master1).isEmpty(), equalTo(true));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, master2).isEmpty(), equalTo(true));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, data1).isEmpty(), equalTo(true));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, data2).isEmpty(), equalTo(true));
}

public void testRecoverAfterDataNodes() throws Exception {
public void testRecoverAfterDataNodes() {
internalCluster().setBootstrapMasterNodeIndex(0);
logger.info("--> start master_node (1)");
Client master1 = startNode(Settings.builder().put("gateway.recover_after_data_nodes", 2).put(masterOnlyNode()));
Client master1 = startNode(Settings.builder().put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).put(masterOnlyNode()));
assertThat(master1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start data_node (1)");
Client data1 = startNode(Settings.builder().put("gateway.recover_after_data_nodes", 2).put(dataOnlyNode()));
Client data1 = startNode(Settings.builder().put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).put(dataOnlyNode()));
assertThat(master1.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));
Expand All @@ -125,7 +63,7 @@ public void testRecoverAfterDataNodes() throws Exception {
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start master_node (2)");
Client master2 = startNode(Settings.builder().put("gateway.recover_after_data_nodes", 2).put(masterOnlyNode()));
Client master2 = startNode(Settings.builder().put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).put(masterOnlyNode()));
assertThat(master2.admin().cluster().prepareState().setLocal(true).execute().actionGet()
.getState().blocks().global(ClusterBlockLevel.METADATA_WRITE),
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));
Expand All @@ -137,7 +75,7 @@ public void testRecoverAfterDataNodes() throws Exception {
hasItem(GatewayService.STATE_NOT_RECOVERED_BLOCK));

logger.info("--> start data_node (2)");
Client data2 = startNode(Settings.builder().put("gateway.recover_after_data_nodes", 2).put(dataOnlyNode()));
Client data2 = startNode(Settings.builder().put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).put(dataOnlyNode()));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, master1).isEmpty(), equalTo(true));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, master2).isEmpty(), equalTo(true));
assertThat(waitForNoBlocksOnNode(BLOCK_WAIT_TIMEOUT, data1).isEmpty(), equalTo(true));
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@
import static org.elasticsearch.cluster.metadata.IndexMetadata.SETTING_NUMBER_OF_REPLICAS;
import static org.elasticsearch.cluster.metadata.IndexMetadata.SETTING_NUMBER_OF_SHARDS;
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.gateway.GatewayService.RECOVER_AFTER_NODES_SETTING;
import static org.elasticsearch.gateway.GatewayService.RECOVER_AFTER_DATA_NODES_SETTING;
import static org.elasticsearch.index.query.QueryBuilders.matchAllQuery;
import static org.elasticsearch.index.query.QueryBuilders.termQuery;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
Expand Down Expand Up @@ -304,7 +304,7 @@ public void testTwoNodeFirstNodeCleared() throws Exception {
@Override
public Settings onNodeStopped(String nodeName) {
return Settings.builder()
.put(RECOVER_AFTER_NODES_SETTING.getKey(), 2)
.put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2)
.putList(INITIAL_MASTER_NODES_SETTING.getKey()) // disable bootstrapping
.build();
}
Expand All @@ -329,7 +329,7 @@ public boolean clearData(String nodeName) {

public void testLatestVersionLoaded() throws Exception {
// clean two nodes
List<String> nodes = internalCluster().startNodes(2, Settings.builder().put("gateway.recover_after_nodes", 2).build());
List<String> nodes = internalCluster().startNodes(2, Settings.builder().put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).build());
Settings node1DataPathSettings = internalCluster().dataPathSettings(nodes.get(0));
Settings node2DataPathSettings = internalCluster().dataPathSettings(nodes.get(1));

Expand Down Expand Up @@ -386,8 +386,8 @@ public void testLatestVersionLoaded() throws Exception {
logger.info("--> starting the two nodes back");

internalCluster().startNodes(
Settings.builder().put(node1DataPathSettings).put("gateway.recover_after_nodes", 2).build(),
Settings.builder().put(node2DataPathSettings).put("gateway.recover_after_nodes", 2).build());
Settings.builder().put(node1DataPathSettings).put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).build(),
Settings.builder().put(node2DataPathSettings).put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).build());

logger.info("--> running cluster_health (wait for the shards to startup)");
ensureGreen();
Expand Down Expand Up @@ -542,7 +542,7 @@ public void testStartedShardFoundIfStateNotYetProcessed() throws Exception {
@Override
public Settings onNodeStopped(String nodeName) throws Exception {
// make sure state is not recovered
return Settings.builder().put(RECOVER_AFTER_NODES_SETTING.getKey(), 2).build();
return Settings.builder().put(RECOVER_AFTER_DATA_NODES_SETTING.getKey(), 2).build();
}
});

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -229,11 +229,7 @@ public void apply(Settings value, Settings current, Settings previous) {
DestructiveOperations.REQUIRES_NAME_SETTING,
NoMasterBlockService.NO_MASTER_BLOCK_SETTING,
GatewayService.EXPECTED_DATA_NODES_SETTING,
GatewayService.EXPECTED_MASTER_NODES_SETTING,
GatewayService.EXPECTED_NODES_SETTING,
GatewayService.RECOVER_AFTER_DATA_NODES_SETTING,
GatewayService.RECOVER_AFTER_MASTER_NODES_SETTING,
GatewayService.RECOVER_AFTER_NODES_SETTING,
GatewayService.RECOVER_AFTER_TIME_SETTING,
PersistedClusterStateService.SLOW_WRITE_LOGGING_THRESHOLD,
NetworkModule.HTTP_DEFAULT_TYPE_SETTING,
Expand Down
Loading