Second pass over the documentation (grafana#24)
* documentation second pass

* explain that docker-compose example is for load testing
rfratto committed Mar 16, 2020
1 parent c5abdfb commit 25b75da
Showing 9 changed files with 77 additions and 23 deletions.
15 changes: 9 additions & 6 deletions README.md
@@ -4,26 +4,26 @@
Grafana Cloud Agent is an observability data collector optimized for sending
metrics and log data to [Grafana Cloud](https://grafana.com/products/cloud/).

Users of Prometheus cloud storage vendors can struggle sending their data at
scale: Prometheus is sometimes called a single point of failure that generally
requires a giant machine with a lot of resources allocated to it.

The Grafana Cloud Agent tackles these issues by stripping Prometheus down to its
most relevant parts for interaction with hosted metrics:

1. Service Discovery
2. Scraping
3. Write Ahead Log (WAL)
4. Remote Write

On top of these, the Grafana Cloud Agent allows for an optional host filter
mechanism, enabling users to easily shard the Agent across their cluster and
lower the memory requirements per machine.

A typical deployment of the Grafana Cloud Agent for Prometheus metrics can see
up to a 40% reduction in memory usage with comparable scrape loads.

Despite being called the "Grafana Cloud Agent," it can be utilized with any
Prometheus `remote_write` API.

## Trade-offs

@@ -34,6 +34,8 @@
trade-offs have been made:
storage.
- Recording rules aren't supported.
- Alerts aren't supported.
- When sharding the Agent, if a node has problems that interrupt metric
  availability, the metrics tracking that node won't be sent for alerting.

The Agent sets the expectation that recording rules and alerts should be the
responsibility of the remote write system rather than the responsibility of the
@@ -44,6 +46,7 @@
metrics collector.
- [x] Prometheus metrics
- [ ] Promtail for Loki logs
- [ ] `carbon-relay-ng` for Graphite metrics.
- [ ] A second clustering mode to solve sharding monitoring availability problems.

## Getting Started

1 change: 0 additions & 1 deletion docs/configuration-reference.md
@@ -4,7 +4,6 @@
The Grafana Cloud Agent is configured in a YAML file (usually called
`agent.yaml`) which contains information on the Grafana Cloud Agent and its
Prometheus instances.

* [server_config](#server_config)
* [prometheus_config](#prometheus_config)

4 changes: 4 additions & 0 deletions docs/getting-started.md
@@ -75,6 +75,10 @@
prometheus:

## Running

If you've installed the agent with Kubernetes, it's already running! The
following sections describe how to run the agent in environments that need
extra steps.

### Docker Container

Copy the following block below, replacing `/tmp/agent` with the host directory
29 changes: 21 additions & 8 deletions docs/overview.md
@@ -3,7 +3,7 @@
The Grafana Cloud Agent is an observability data collector optimized for sending
metrics and log data to [Grafana Cloud](https://grafana.com/products/cloud).

Currently, it only comes with support for collecting and sending Prometheus
metrics, accomplished through utilizing the same battle-tested code that
Prometheus contains.

Expand All @@ -12,6 +12,12 @@ so some Prometheus features, such as querying, local storage, recording rules,
and alerts aren't present. `remote_write`, service discovery, and relabeling
rules are included.

The Grafana Cloud Agent has a concept of an "instance", each of which acts as
its own mini Prometheus agent with its own `scrape_configs` section and
`remote_write` rules. Most users will only ever need to define one instance.
Multiple instances will be more useful in the future when a clustering mode is
added to the Agent.
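As a concrete illustration, a single-instance `agent.yaml` might look like the
sketch below. The field names and layout are assumptions based on the concepts
described on this page, not a verified schema for any particular Agent version;
ports, paths, and the `remote_write` URL are placeholders.

```
# Hypothetical agent.yaml: one instance with its own scrape_configs and
# remote_write rules.
server:
  http_listen_port: 12345

prometheus:
  wal_directory: /tmp/agent/data
  configs:
    # Each entry here is one "instance": a mini Prometheus agent.
    - name: default
      scrape_configs:
        - job_name: agent
          static_configs:
            - targets: ['127.0.0.1:12345']
      remote_write:
        - url: https://example-remote-write-endpoint/api/prom/push
```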

The Grafana Cloud Agent can be deployed in two modes:

- Prometheus `remote_write` drop-in
@@ -22,12 +28,19 @@
replacement for Prometheus `remote_write`. The Agent will act similarly to a
single-process Prometheus, doing service discovery, scraping, and remote
writing.

The other deployment mode, Host Filtering mode, is achieved by setting a
`host_filter` flag on a specific instance inside the Agent's configuration file.
When this flag is set, the instance will only scrape metrics from targets that
are running on the same machine as the Agent. This is extremely useful for
migrating to sharded Prometheus instances in a Kubernetes cluster, where the
Agent can then be deployed as a DaemonSet and distribute memory requirements
across multiple nodes.

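A minimal sketch of what enabling host filtering on an instance might look like
follows. Only the `host_filter` flag comes from this page; the surrounding
layout and the discovery configuration are assumptions for illustration.

```
# Hypothetical instance config with host filtering turned on. The instance
# discovers all pods but only scrapes the ones on the same node as the Agent.
configs:
  - name: default
    host_filter: true
    scrape_configs:
      - job_name: kubernetes-pods
        kubernetes_sd_configs:
          - role: pod
    remote_write:
      - url: https://example-remote-write-endpoint/api/prom/push
```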
Note that Host Filtering mode and sharding your instances means that if an
Agent's metrics are being sent to an alerting system, alerts for that Agent may
not be generated when the entire node has problems. This changes the semantics
of failure detection, and alerts would have to be configured to catch agents
that stop reporting in.

For more information on installing and running the agent, see
[Getting started](./getting-started.md) or
@@ -57,7 +70,7 @@
using its code.
Alternatives that support Prometheus metrics try to incorporate more than just
Prometheus metrics ingestion, and tend to reimplement the code for doing so.
This leads to missing features or the other agents feeling like a
jack-of-all-trades, master-of-none.

## Roadmap

7 changes: 7 additions & 0 deletions example/README.md
@@ -8,6 +8,10 @@
the following components:
3. Grafana to visualize metrics
4. Avalanche to load test the Agent.

This example shows how a single instance of the Agent performs under moderate
load; the Docker Compose configuration in this directory generates roughly
90,000 metrics.

To get started, run the following from this directory:

```
@@ -31,6 +35,9 @@
The reduced memory requirements are a critical feature of the Agent, and
the example provides a good launching point to test and validate that usage
end to end.

To build the image locally, run `make agent-image` at the root of this
repository.

To get a memory profile, you can use `pprof` against the Agent:

```
4 changes: 4 additions & 0 deletions example/cortex/config/cortex.yaml
@@ -46,3 +46,7 @@
storage:
    directory: /tmp/cortex/index
  filesystem:
    directory: /tmp/cortex/chunks

limits:
  ingestion_rate: 250000
  ingestion_burst_size: 500000
2 changes: 1 addition & 1 deletion example/prometheus/config/prometheus.yml
@@ -2,7 +2,7 @@
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: 'prometheus_scrape'
    static_configs:
      - targets: ['localhost:9090']
        labels:
26 changes: 25 additions & 1 deletion production/README.md
@@ -24,15 +24,39 @@
See the [Kubernetes README](./kubernetes/README.md) for more information.

## Running the Agent with Docker

To run the Agent with Docker, you should have a configuration file on
your local machine ready to bind mount into the container. Then modify
the following command for your environment. Replace `/path/to/config.yaml` with
the full path to your YAML configuration, and replace `/tmp/agent` with the
directory on your host where you want the agent to store its WAL.

```
docker run \
  -v /tmp/agent:/etc/agent \
  -v /path/to/config.yaml:/etc/agent-config/agent.yaml \
  --entrypoint /bin/agent \
  grafana/agent:v0.1.0 \
  -config.file=/etc/agent-config/agent.yaml \
  -prometheus.wal-directory=/etc/agent/data
```

## Running the Agent locally

Currently, you must provide your own system configuration files to run the
Agent as a long-lived process (e.g., write your own systemd unit files).
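For example, a minimal systemd unit for the Agent might look like the sketch
below. The binary location, config path, and WAL directory are assumptions;
adjust them for your system.

```
# Hypothetical /etc/systemd/system/grafana-agent.service
[Unit]
Description=Grafana Cloud Agent
After=network-online.target

[Service]
ExecStart=/usr/local/bin/agent \
  -config.file=/etc/agent/agent.yaml \
  -prometheus.wal-directory=/var/lib/agent/wal
Restart=on-failure

[Install]
WantedBy=multi-user.target
```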

## Use the example Kubernetes configs

The install script replaces variable placeholders in the [example Kubernetes
manifest](./kubernetes/agent.yaml) in the Kubernetes directory. Feel free to
examine that file and modify it for your own needs!

## Build the Agent from source

Go 1.14 is currently needed to build the agent from source. Run `make agent`
from the root of this repository, and the built agent binary will be placed
at `./cmd/agent/agent`.

## Use our production Tanka configs

The Tanka configs we use to deploy the agent ourselves can be found in our
[production Tanka directory](./tanka/grafana-agent). These configs are also used
to generate the Kubernetes configs for the install script.
12 changes: 6 additions & 6 deletions production/kubernetes/README.md
@@ -53,10 +53,10 @@
The YAML file provided is created using Grafana Labs' production
build the YAML file with some custom values, you will need the following pieces
of software installed:

1. [Tanka](https://github.com/grafana/tanka) >= v0.8
2. [`jsonnet-bundler`](https://github.com/jsonnet-bundler/jsonnet-bundler) >= v0.2.1

See the [`template` Tanka environment](./build/template) for the current
settings that initialize the Grafana Agent Tanka configs. To build the YAML
file, execute the `./build/build.sh` script or run `make example-kubernetes`
from the project's root directory.
