KAFKA-2790: doc improvements
Author: Gwen Shapira <cshapi@gmail.com>

Reviewers: Jun Rao, Guozhang Wang

Closes apache#491 from gwenshap/KAFKA-2790
gwenshap committed Nov 11, 2015
1 parent e0098b4 commit a8ccdc6
Showing 7 changed files with 156 additions and 18 deletions.
@@ -384,21 +384,24 @@ else if (!k2.hasDefault() && k1.hasDefault())
}
});
StringBuilder b = new StringBuilder();
b.append("<table>\n");
b.append("<table class=\"data-table\"><tbody>\n");
b.append("<tr>\n");
b.append("<th>Name</th>\n");
b.append("<th>Description</th>\n");
b.append("<th>Type</th>\n");
b.append("<th>Default</th>\n");
b.append("<th>Valid Values</th>\n");
b.append("<th>Importance</th>\n");
b.append("<th>Description</th>\n");
b.append("</tr>\n");
for (ConfigKey def : configs) {
b.append("<tr>\n");
b.append("<td>");
b.append(def.name);
b.append("</td>");
b.append("<td>");
b.append(def.documentation);
b.append("</td>");
b.append("<td>");
b.append(def.type.toString().toLowerCase());
b.append("</td>");
b.append("<td>");
@@ -418,12 +421,9 @@ else if (def.type == Type.STRING && def.defaultValue.toString().isEmpty())
b.append("<td>");
b.append(def.importance.toString().toLowerCase());
b.append("</td>");
b.append("<td>");
b.append(def.documentation);
b.append("</td>");
b.append("</tr>\n");
}
b.append("</table>");
b.append("</tbody></table>");
return b.toString();
}
}
8 changes: 4 additions & 4 deletions docs/api.html
@@ -15,21 +15,21 @@
limitations under the License.
-->

We are in the process of rewriting the JVM clients for Kafka. As of 0.8.2 Kafka includes a newly rewritten Java producer. The next release will include an equivalent Java consumer. These new clients are meant to supplant the existing Scala clients, but for compatibility they will co-exist for some time. These clients are available in a separate jar with minimal dependencies, while the old Scala clients remain packaged with the server.
Apache Kafka includes new Java clients (in the org.apache.kafka.clients package). These are meant to supplant the older Scala clients, but for compatibility they will co-exist for some time. These clients are available in a separate jar with minimal dependencies, while the old Scala clients remain packaged with the server.

<h3><a id="producerapi">2.1 Producer API</a></h3>

As of the 0.8.2 release we encourage all new development to use the new Java producer. This client is production tested and generally both faster and more fully featured than the previous Scala client. You can use this client by adding a dependency on the client jar using the following example Maven coordinates (you can change the version numbers with new releases):
We encourage all new development to use the new Java producer. This client is production tested and generally both faster and more fully featured than the previous Scala client. You can use this client by adding a dependency on the client jar using the following example Maven coordinates (you can change the version numbers with new releases):
<pre>
&lt;dependency&gt;
&lt;groupId&gt;org.apache.kafka&lt;/groupId&gt;
&lt;artifactId&gt;kafka-clients&lt;/artifactId&gt;
&lt;version&gt;0.8.2.0&lt;/version&gt;
&lt;version&gt;0.9.0.0&lt;/version&gt;
&lt;/dependency&gt;
</pre>

Examples showing how to use the producer are given in the
<a href="http://kafka.apache.org/082/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html" title="Kafka 0.8.2 Javadoc">javadocs</a>.
<a href="http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/producer/KafkaProducer.html" title="Kafka 0.9.0 Javadoc">javadocs</a>.

<p>
For those interested in the legacy Scala producer api, information can be found <a href="http://kafka.apache.org/081/documentation.html#producerapi">
6 changes: 6 additions & 0 deletions docs/documentation.html
@@ -114,6 +114,12 @@ <h1>Kafka 0.9.0 Documentation</h1>
<li><a href="#security_ssl">7.2 Encryption and Authentication using SSL</a></li>
<li><a href="#security_sasl">7.3 Authentication using SASL</a></li>
<li><a href="#security_authz">7.4 Authorization and ACLs</a></li>
<li><a href="#zk_authz">7.5 ZooKeeper Authentication</a></li>
<ul>
<li><a href="zk_authz_new"</li>
<li><a href="zk_authz_migration">Migrating Clusters</a></li>
<li><a href="zk_authz_ensemble">Migrating the ZooKeeper Ensemble</a></li>
</ul>
</ul>
</li>
<li><a href="#connect">8. Kafka Connect</a>
6 changes: 3 additions & 3 deletions docs/ops.html
@@ -135,7 +135,7 @@ <h4><a id="basic_ops_cluster_expansion">Expanding your cluster</a></h4>
<p>
The process of migrating data is manually initiated but fully automated. Under the covers, Kafka will add the new server as a follower of the partition it is migrating and allow it to fully replicate the existing data in that partition. When the new server has fully replicated the contents of this partition and joined the in-sync replicas, one of the existing replicas will delete its partition's data.
<p>
The partition reassignment tool can be used to move partitions across brokers. An ideal partition distribution would ensure even data load and partition sizes across all brokers. In 0.8.1, the partition reassignment tool does not have the capability to automatically study the data distribution in a Kafka cluster and move partitions around to attain an even load distribution. As such, the admin has to figure out which topics or partitions should be moved around.
The partition reassignment tool can be used to move partitions across brokers. An ideal partition distribution would ensure even data load and partition sizes across all brokers.
<p>
The partition reassignment tool can run in 3 mutually exclusive modes -
<ul>
@@ -257,7 +257,7 @@ <h5>Custom partition assignment and migration</h5>
</pre>

<h4><a id="basic_ops_decommissioning_brokers">Decommissioning brokers</a></h4>
The partition reassignment tool does not have the ability to automatically generate a reassignment plan for decommissioning brokers yet. As such, the admin has to come up with a reassignment plan to move the replica for all partitions hosted on the broker to be decommissioned, to the rest of the brokers. This can be relatively tedious as the reassignment needs to ensure that all the replicas are not moved from the decommissioned broker to only one other broker. To make this process effortless, we plan to add tooling support for decommissioning brokers in 0.8.2.
The partition reassignment tool does not have the ability to automatically generate a reassignment plan for decommissioning brokers yet. As such, the admin has to come up with a reassignment plan to move the replica for all partitions hosted on the broker to be decommissioned, to the rest of the brokers. This can be relatively tedious as the reassignment needs to ensure that all the replicas are not moved from the decommissioned broker to only one other broker. To make this process effortless, we plan to add tooling support for decommissioning brokers in the future.
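Until such tooling exists, a decommissioning plan is simply a hand-written reassignment json file that lists, for every partition hosted on the broker being retired, a new replica list that excludes it; the file is then applied with the partition reassignment tool's --execute option. A purely illustrative sketch (topic name, partitions, and broker ids are hypothetical; here broker 5 is being retired):
<pre>
&gt; cat decommission-broker-5.json
{"version":1,
 "partitions":[{"topic":"foo","partition":0,"replicas":[1,2]},
               {"topic":"foo","partition":1,"replicas":[2,3]}]}
</pre>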

<h4><a id="basic_ops_increase_replication_factor">Increasing replication factor</a></h4>
Increasing the replication factor of an existing partition is easy. Just specify the extra replicas in the custom reassignment json file and use it with the --execute option to increase the replication factor of the specified partitions.
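For example, the sketch below (topic name, file name, and broker ids are illustrative) grows partition 0 of topic foo to three replicas:
<pre>
&gt; cat increase-replication-factor.json
{"version":1,
 "partitions":[{"topic":"foo","partition":0,"replicas":[5,6,7]}]}
&gt; bin/kafka-reassign-partitions.sh --zookeeper localhost:2181 --reassignment-json-file increase-replication-factor.json --execute
</pre>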
@@ -427,7 +427,7 @@ <h4><a id="os">OS</a></h4>
</ul>

<h4><a id="diskandfs">Disks and Filesystem</a></h4>
We recommend using multiple drives to get good throughput and not sharing the same drives used for Kafka data with application logs or other OS filesystem activity to ensure good latency. As of 0.8 you can either RAID these drives together into a single volume or format and mount each drive as its own directory. Since Kafka has replication the redundancy provided by RAID can also be provided at the application level. This choice has several tradeoffs.
We recommend using multiple drives to get good throughput and not sharing the same drives used for Kafka data with application logs or other OS filesystem activity to ensure good latency. You can either RAID these drives together into a single volume or format and mount each drive as its own directory. Since Kafka has replication, the redundancy provided by RAID can also be provided at the application level. This choice has several tradeoffs.
<p>
If you configure multiple data directories partitions will be assigned round-robin to data directories. Each partition will be entirely in one of the data directories. If data is not well balanced among partitions this can lead to load imbalance between disks.
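For example, multiple data directories are listed in the log.dirs broker property; the paths below are purely illustrative:
<pre>
# server.properties
log.dirs=/mnt/disk1/kafka-logs,/mnt/disk2/kafka-logs,/mnt/disk3/kafka-logs
</pre>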
<p>
6 changes: 3 additions & 3 deletions docs/quickstart.html
@@ -21,11 +21,11 @@ <h3><a id="quickstart">1.3 Quick Start</a></h3>

<h4> Step 1: Download the code </h4>

<a href="https://www.apache.org/dyn/closer.cgi?path=/kafka/0.8.2.0/kafka_2.10-0.8.2.0.tgz" title="Kafka downloads">Download</a> the 0.8.2.0 release and un-tar it.
<a href="https://www.apache.org/dyn/closer.cgi?path=/kafka/0.9.0.0/kafka_2.10-0.9.0.0.tgz" title="Kafka downloads">Download</a> the 0.9.0.0 release and un-tar it.

<pre>
&gt; <b>tar -xzf kafka_2.10-0.8.2.0.tgz</b>
&gt; <b>cd kafka_2.10-0.8.2.0</b>
&gt; <b>tar -xzf kafka_2.10-0.9.0.0.tgz</b>
&gt; <b>cd kafka_2.10-0.9.0.0</b>
</pre>

<h4>Step 2: Start the server</h4>
133 changes: 132 additions & 1 deletion docs/security.html
@@ -261,6 +261,137 @@ <h3><a id="security_sasl">7.3 Authentication using SASL</a></h3>
</ol>

<h3><a id="security_authz">7.4 Authorization and ACLs</a></h3>
Kafka ships with a pluggable Authorizer and an out-of-the-box authorizer implementation that uses ZooKeeper to store all the acls. Kafka acls are defined in the general format of "Principal P is [Allowed/Denied] Operation O From Host H On Resource R". You can read more about the acl structure in KIP-11. In order to add, remove or list acls you can use the Kafka authorizer CLI.
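As a reference sketch, the bundled ZooKeeper-backed authorizer is enabled by naming it in the broker configuration (see the broker configuration reference for the authoritative property list):
<pre>
# server.properties
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer
</pre>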
<h4>Command Line Interface</h4>
The Kafka authorization management CLI can be found under the bin directory with all the other CLIs. The CLI script is called <b>kafka-acls.sh</b>. The following table lists all the options that the script supports:
<p></p>
<table class="data-table">
<tr>
<th>Option</th>
<th>Description</th>
<th>Default</th>
<th>Option type</th>
</tr>
<tr>
<td>--add</td>
<td>Indicates to the script that the user is trying to add an acl.</td>
<td></td>
<td>Action</td>
</tr>
<tr>
<td>--remove</td>
<td>Indicates to the script that the user is trying to remove an acl.</td>
<td></td>
<td>Action</td>
</tr>
<tr>
<td>--list</td>
<td>Indicates to the script that the user is trying to list acls.</td>
<td></td>
<td>Action</td>
</tr>
<tr>
<td>--authorizer</td>
<td>Fully qualified class name of the authorizer.</td>
<td>kafka.security.auth.SimpleAclAuthorizer</td>
<td>Configuration</td>
</tr>
<tr>
<td>--authorizer-properties</td>
<td>Comma-separated key=val pairs that will be passed to the authorizer for initialization.</td>
<td></td>
<td>Configuration</td>
</tr>
<tr>
<td>--cluster</td>
<td>Specifies cluster as resource.</td>
<td></td>
<td>Resource</td>
</tr>
<tr>
<td>--topic [topic-name]</td>
<td>Specifies the topic as resource.</td>
<td></td>
<td>Resource</td>
</tr>
<tr>
<td>--consumer-group [group-name]</td>
<td>Specifies the consumer-group as resource.</td>
<td></td>
<td>Resource</td>
</tr>
<tr>
<td>--allow-principal</td>
<td>Principal in PrincipalType:name format that will be added to the ACL with Allow permission. <br>You can specify multiple --allow-principal options in a single command.</td>
<td></td>
<td>Principal</td>
</tr>
<tr>
<td>--deny-principal</td>
<td>Principal in PrincipalType:name format that will be added to the ACL with Deny permission. <br>You can specify multiple --deny-principal options in a single command.</td>
<td></td>
<td>Principal</td>
</tr>
<tr>
<td>--allow-hosts</td>
<td>Comma-separated list of hosts from which the principals listed in --allow-principal will have access.</td>
<td>If --allow-principal is specified, defaults to *, which translates to "all hosts"</td>
<td>Host</td>
</tr>
<tr>
<td>--deny-hosts</td>
<td>Comma-separated list of hosts from which the principals listed in --deny-principal will be denied access.</td>
<td>If --deny-principal is specified, defaults to *, which translates to "all hosts"</td>
<td>Host</td>
</tr>
<tr>
<td>--operations</td>
<td>Comma-separated list of operations.<br>
Valid values are: Read, Write, Create, Delete, Alter, Describe, ClusterAction, All</td>
<td>All</td>
<td>Operation</td>
</tr>
<tr>
<td>--producer</td>
<td>Convenience option to add/remove acls for the producer role. This will generate acls that allow WRITE,
DESCRIBE on the topic and CREATE on the cluster.</td>
<td></td>
<td>Convenience</td>
</tr>
<tr>
<td>--consumer</td>
<td>Convenience option to add/remove acls for the consumer role. This will generate acls that allow READ,
DESCRIBE on the topic and READ on the consumer-group.</td>
<td></td>
<td>Convenience</td>
</tr>
</tbody></table>

<h4>Examples</h4>
<ul>
<li><b>Adding Acls</b><br>
Suppose you want to add an acl "Principals User:Bob and User:Alice are allowed to perform Operation Read and Write on Topic Test-topic from Host1 and Host2". You can do that by executing the CLI with the following options:
<pre>bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:Bob --allow-principal User:Alice --allow-hosts Host1,Host2 --operations Read,Write --topic Test-topic</pre>
By default, all principals that don't have an explicit acl allowing access to an operation on a resource are denied. In rare cases where an allow acl is defined that grants access to everyone except some principal, we have to use the --deny-principal and --deny-hosts options. For example, if we want to allow all users to Read from Test-topic but deny only User:BadBob from host bad-host, we can do so using the following command:
<pre>bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:* --allow-hosts * --deny-principal User:BadBob --deny-hosts bad-host --operations Read --topic Test-topic</pre>
The above examples add acls to a topic by specifying --topic [topic-name] as the resource option. Similarly, users can add acls to a cluster by specifying --cluster and to a consumer group by specifying --consumer-group [group-name].</li>

<li><b>Removing Acls</b><br>
Removing acls is pretty much the same. The only difference is that instead of the --add option, users will have to specify the --remove option. To remove the acls added by the first example above, we can execute the CLI with the following options:
<pre> bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=localhost:2181 --remove --allow-principal User:Bob --allow-principal User:Alice --allow-hosts Host1,Host2 --operations Read,Write --topic Test-topic </pre></li>

<li><b>List Acls</b><br>
We can list acls for any resource by specifying the --list option with the resource. To list all acls for Test-topic we can execute the CLI with the following options:
<pre>bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=localhost:2181 --list --topic Test-topic</pre></li>

<li><b>Adding or removing a principal as producer or consumer</b><br>
The most common use cases for acl management are adding/removing a principal as a producer or consumer, so we added convenience options to handle these cases. In order to add User:Bob as a producer of Test-topic we can execute the following command:
<pre> bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:Bob --producer --topic Test-topic</pre>
Similarly, to add User:Bob as a consumer of Test-topic with consumer group Group-1 we just have to pass the --consumer option:
<pre> bin/kafka-acls.sh --authorizer kafka.security.auth.SimpleAclAuthorizer --authorizer-properties zookeeper.connect=localhost:2181 --add --allow-principal User:Bob --consumer --topic test-topic --consumer-group Group-1 </pre>
Note that for the --consumer option we must also specify the consumer group.
In order to remove a principal from a producer or consumer role we just need to pass the --remove option.</li>
</ul>

<h3><a id="zk_authz">7.5 ZooKeeper Authentication</a></h3>
<h4><a id="zk_authz_new">7.5.1 New clusters</a></h4>
To enable ZooKeeper authentication on brokers, there are two necessary steps:
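As a rough, non-authoritative sketch of what those two settings look like (file locations are hypothetical; see the broker configuration and SASL sections for the authoritative details):
<pre>
# server.properties: have brokers create znodes with secure ACLs
zookeeper.set.acl=true

# JVM option pointing the broker at a JAAS file containing a Client login section for ZooKeeper
-Djava.security.auth.login.config=/etc/kafka/kafka_server_jaas.conf
</pre>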
@@ -292,7 +423,7 @@ <h4><a id="zk_authz_migration">7.5.2 Migrating clusters</h4>
<pre>
./bin/zookeeper-security-migration --help
</pre>
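For instance, a hypothetical invocation that secures the metadata of such a cluster might look like the following (the connect string is illustrative):
<pre>
./bin/zookeeper-security-migration --zookeeper.acl=secure --zookeeper.connect=localhost:2181
</pre>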
<h4><a id="zk_authz_new">7.5.3 Migrating the ZooKeeper ensemble</a></h4>
<h4><a id="zk_authz_ensemble">7.5.3 Migrating the ZooKeeper ensemble</a></h4>
It is also necessary to enable authentication on the ZooKeeper ensemble. To do it, we need to perform a rolling restart of the server and set a few properties. Please refer to the ZooKeeper documentation for more detail:
<ol>
<li><a href="http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html#sc_ZooKeeperAccessControl">Apache ZooKeeper documentation</a></li>
3 changes: 2 additions & 1 deletion docs/upgrade.html
@@ -35,6 +35,7 @@ <h5>Potential breaking changes in 0.9.0.0</h5>

<ul>
<li> Java 1.6 is no longer supported. </li>
<li> Scala 2.9 is no longer supported. </li>
<li> Tools packaged under <em>org.apache.kafka.clients.tools.*</em> have been moved to <em>org.apache.kafka.tools.*</em>. All included scripts will still function as usual, only custom code directly importing these classes will be affected. </li>
<li> The default Kafka JVM performance options (KAFKA_JVM_PERFORMANCE_OPTS) have been changed in kafka-run-class.sh. </li>
<li> The kafka-topics.sh script (kafka.admin.TopicCommand) now exits with non-zero exit code on failure. </li>
@@ -61,4 +62,4 @@ <h4>Upgrading from 0.8.0 to 0.8.1</h4>

<h4>Upgrading from 0.7</h4>

0.8, the release which added replication, was our first backwards-incompatible release: major changes were made to the API, ZooKeeper data structures, protocol, and configuration. The upgrade from 0.7 to 0.8.x requires a <a href="https://cwiki.apache.org/confluence/display/KAFKA/Migrating+from+0.7+to+0.8">special tool</a> for migration. This migration can be done without downtime.
Release 0.7 is incompatible with newer releases. Major changes were made to the API, ZooKeeper data structures, protocol, and configuration in order to add replication (which was missing in 0.7). The upgrade from 0.7 to later versions requires a <a href="https://cwiki.apache.org/confluence/display/KAFKA/Migrating+from+0.7+to+0.8">special tool</a> for migration. This migration can be done without downtime.
