Second pass over the documentation (grafana#24)
* documentation second pass

* explain that docker-compose example is for load testing
rfratto committed Mar 16, 2020
1 parent c5abdfb commit 25b75da
Showing 9 changed files with 77 additions and 23 deletions.
15 changes: 9 additions & 6 deletions README.md
@@ -4,26 +4,26 @@
Grafana Cloud Agent is an observability data collector optimized for sending
metrics and log data to [Grafana Cloud](https://grafana.com/products/cloud/).

Users of Prometheus cloud storage vendors can struggle sending their data at
scale: Prometheus is sometimes called a single point of failure that generally
requires a giant machine with a lot of resources allocated to it.

The Grafana Cloud Agent tackles these issues by stripping Prometheus down to its
most relevant parts for interaction with hosted metrics:

1. Service Discovery
2. Scraping
3. Write Ahead Log (WAL)
4. Remote Write

On top of these, the Grafana Cloud Agent allows for an optional host filter
mechanism, enabling users to easily shard the Agent across their cluster and
lower the memory requirements per machine.

A typical deployment of the Grafana Cloud Agent for Prometheus metrics can see
up to a 40% reduction in memory usage with comparable scrape loads.

Despite being called the "Grafana Cloud Agent," it can be utilized with any
Prometheus `remote_write` API.

## Trade-offs

@@ -34,6 +34,8 @@
trade-offs have been made:
storage.
- Recording rules aren't supported.
- Alerts aren't supported.
- When sharding the Agent, if a node has problems that interrupt metric
  availability, the metrics tracking that node won't be sent for alerting.

The Agent sets the expectation that recording rules and alerts should be the
responsibility of the remote write system rather than the responsibility of the
@@ -44,6 +46,7 @@
metrics collector.
- [x] Prometheus metrics
- [ ] Promtail for Loki logs
- [ ] `carbon-relay-ng` for Graphite metrics.
- [ ] A second clustering mode to solve sharding monitoring availability problems.

## Getting Started

1 change: 0 additions & 1 deletion docs/configuration-reference.md
@@ -4,7 +4,6 @@
The Grafana Cloud Agent is configured in a YAML file (usually called
`agent.yaml`) which contains information on the Grafana Cloud Agent and its
Prometheus instances.

* [server_config](#server_config)
* [prometheus_config](#prometheus_config)

4 changes: 4 additions & 0 deletions docs/getting-started.md
@@ -75,6 +75,10 @@
prometheus:

## Running

If you've installed the agent with Kubernetes, it's already running! The
following sections describe how to run the agent in environments that need
extra steps.

### Docker Container

Copy the following block below, replacing `/tmp/agent` with the host directory
29 changes: 21 additions & 8 deletions docs/overview.md
@@ -3,7 +3,7 @@
The Grafana Cloud Agent is an observability data collector optimized for sending
metrics and log data to [Grafana Cloud](https://grafana.com/products/cloud).

Currently, it only comes with support for collecting and sending Prometheus
metrics, accomplished through utilizing the same battle-tested code that
Prometheus contains.

Expand All @@ -12,6 +12,12 @@ so some Prometheus features, such as querying, local storage, recording rules,
and alerts aren't present. `remote_write`, service discovery, and relabeling
rules are included.

The Grafana Cloud Agent has a concept of an "instance", each of which acts as
its own mini Prometheus agent with its own `scrape_configs` section and
`remote_write` rules. Most users will only ever need to define one instance.
Multiple instances will be more useful in the future when a clustering mode is
added to the Agent.
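As a concrete illustration, a single-instance `agent.yaml` might look like the
sketch below. The field names and layout are assumptions based on the concepts
described on this page, not a verified schema for any particular Agent version;
ports, paths, and the `remote_write` URL are placeholders.

```
# Hypothetical agent.yaml: one instance with its own scrape_configs and
# remote_write rules.
server:
  http_listen_port: 12345

prometheus:
  wal_directory: /tmp/agent/data
  configs:
    # Each entry here is one "instance": a mini Prometheus agent.
    - name: default
      scrape_configs:
        - job_name: agent
          static_configs:
            - targets: ['127.0.0.1:12345']
      remote_write:
        - url: https://example-remote-write-endpoint/api/prom/push
```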

The Grafana Cloud Agent can be deployed in two modes:

- Prometheus `remote_write` drop-in
@@ -22,12 +28,19 @@
replacement for Prometheus `remote_write`. The Agent will act similarly to a
single-process Prometheus, doing service discovery, scraping, and remote
writing.

The other deployment mode, Host Filtering mode, is achieved by setting a
`host_filter` flag on a specific instance inside the Agent's configuration file.
When this flag is set, the instance will only scrape metrics from targets that
are running on the same machine as the Agent. This is extremely useful for
migrating to sharded Prometheus instances in a Kubernetes cluster, where the
Agent can then be deployed as a DaemonSet and distribute memory requirements
across multiple nodes.

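A minimal sketch of what enabling host filtering on an instance might look like
follows. Only the `host_filter` flag comes from this page; the surrounding
layout and the discovery configuration are assumptions for illustration.

```
# Hypothetical instance config with host filtering turned on. The instance
# discovers all pods but only scrapes the ones on the same node as the Agent.
configs:
  - name: default
    host_filter: true
    scrape_configs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
    remote_write:
      - url: https://example-remote-write-endpoint/api/prom/push
```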
Note that Host Filtering mode and sharding your instances means that if an
Agent's metrics are being sent to an alerting system, alerts for that Agent may
not be generated when the entire node has problems. This changes the semantics
of failure detection, and alerts would have to be configured to catch agents
that stop reporting in.

For more information on installing and running the agent, see
[Getting started](./getting-started.md) or
@@ -57,7 +70,7 @@
using its code.
Alternatives that support Prometheus metrics try to incorporate more than just
Prometheus metrics ingestion, and tend to reimplement the code for doing so.
This leads to missing features or the other agents feeling like a
jack-of-all-trades, master-of-none.

## Roadmap

7 changes: 7 additions & 0 deletions example/README.md
@@ -8,6 +8,10 @@
the following components:
3. Grafana to visualize metrics
4. Avalanche to load test the Agent.

This example shows how a single instance of the Agent performs under moderate
load; the Docker Compose configuration in this directory generates roughly
90,000 metrics.

To get started, run the following from this directory:

```
@@ -31,6 +35,9 @@
The reduced memory requirements are a critical feature of the Agent, and
the example provides a good launching point to test and validate that usage
end to end.

To build the image locally, run `make agent-image` at the root of this
repository.

To get a memory profile, you can use `pprof` against the Agent:

```
4 changes: 4 additions & 0 deletions example/cortex/config/cortex.yaml
@@ -46,3 +46,7 @@
storage:
    directory: /tmp/cortex/index
  filesystem:
    directory: /tmp/cortex/chunks

limits:
  ingestion_rate: 250000
  ingestion_burst_size: 500000
2 changes: 1 addition & 1 deletion example/prometheus/config/prometheus.yml
@@ -2,7 +2,7 @@
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: 'prometheus_scrape'
    static_configs:
      - targets: ['localhost:9090']
        labels:
26 changes: 25 additions & 1 deletion production/README.md
@@ -24,15 +24,39 @@
See the [Kubernetes README](./kubernetes/README.md) for more information.

## Running the Agent with Docker

To run the Agent with Docker, you should have a configuration file on
your local machine ready to bind mount into the container. Then modify
the following command for your environment. Replace `/path/to/config.yaml` with
the full path to your YAML configuration, and replace `/tmp/agent` with the
directory on your host where you want the agent to store its WAL.

```
docker run \
  -v /tmp/agent:/etc/agent \
  -v /path/to/config.yaml:/etc/agent-config/agent.yaml \
  --entrypoint /bin/agent \
  grafana/agent:v0.1.0 \
  -config.file=/etc/agent-config/agent.yaml \
  -prometheus.wal-directory=/etc/agent/data
```

## Running the Agent locally

Currently, you must provide your own system configuration files to run the
Agent as a long-lived process (e.g., write your own systemd unit files).
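For example, a minimal systemd unit for the Agent might look like the sketch
below. The binary location, config path, and WAL directory are assumptions;
adjust them for your system.

```
# Hypothetical /etc/systemd/system/grafana-agent.service
[Unit]
Description=Grafana Cloud Agent
After=network-online.target

[Service]
ExecStart=/usr/local/bin/agent \
  -config.file=/etc/agent/agent.yaml \
  -prometheus.wal-directory=/var/lib/agent/wal
Restart=on-failure

[Install]
WantedBy=multi-user.target
```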

## Use the example Kubernetes configs

The install script replaces variable placeholders in the [example Kubernetes
manifest](./kubernetes/agent.yaml) in the Kubernetes directory. Feel free to
examine that file and modify it for your own needs!

## Build the Agent from source

Go 1.14 is currently needed to build the agent from source. Run `make agent`
from the root of this repository, and the built agent binary will be placed
at `./cmd/agent/agent`.

## Use our production Tanka configs

The Tanka configs we use to deploy the agent ourselves can be found in our
[production Tanka directory](./tanka/grafana-agent). These configs are also used
to generate the Kubernetes configs for the install script.
12 changes: 6 additions & 6 deletions production/kubernetes/README.md
@@ -53,10 +53,10 @@
The YAML file provided is created using Grafana Labs' production
build the YAML file with some custom values, you will need the following pieces
of software installed:

1. [Tanka](https://github.com/grafana/tanka) >= v0.8
2. [`jsonnet-bundler`](https://github.com/jsonnet-bundler/jsonnet-bundler) >= v0.2.1

See the [`template` Tanka environment](./build/template) for the current
settings that initialize the Grafana Agent Tanka configs. To build the YAML
file, execute the `./build/build.sh` script or run `make example-kubernetes`
from the project's root directory.
