AWS Billing Dashboard

Visualize AWS Cost and Usage data using Glue, Amazon Elasticsearch, and Kibana

AWS launched Cost and Usage Reports (CUR) in late 2015, which provide comprehensive data about your costs, and Amazon provides Cost Explorer to view your costs for up to the last 13 months. But what if you want to analyze billing data from the past few years? What if you want a customized way to present your data for your business needs?

In this post, I'll show you how to import Cost and Usage Report data into Amazon Elasticsearch Service and create your own Kibana billing dashboard.

Solution Overview

Many customers have legitimate privacy and security concerns about sending and receiving data across the Internet. We address these concerns by putting the Glue job and Elasticsearch Service (ES) domain inside of your VPC and using VPC endpoints to securely access billing data from an S3 bucket.
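For illustration, here is a minimal boto3 sketch of creating a gateway VPC endpoint for S3; the region, VPC ID, and route table ID are placeholders you would replace with your own values.

    # Sketch: a gateway VPC endpoint lets resources in the VPC reach S3
    # without traversing the public internet. All IDs below are placeholders.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    endpoint = ec2.create_vpc_endpoint(
        VpcEndpointType="Gateway",
        VpcId="vpc-0123456789abcdef0",
        ServiceName="com.amazonaws.us-east-1.s3",
        RouteTableIds=["rtb-0123456789abcdef0"],
    )
    print(endpoint["VpcEndpoint"]["VpcEndpointId"])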

In this solution, we use a scheduled Glue ETL job. The Glue job transforms the newly uploaded data and loads it into Elasticsearch to build the index, so end users can see the latest charts right away in the Kibana dashboard.

Here is an overview of how the solution works:

billing dashboard overview

Why Glue and Elasticsearch Service

Let's take a step back and see how the cost and usage report works and how we ended up choosing Elasticsearch Service and Glue.

AWS delivers the cost and usage report files to an Amazon S3 bucket that you specify in your account, and updates the report up to three times a day in .csv format. Each upload creates a folder named with a unique assembly ID, and the csv file(s) inside are gzipped. An individual report can grow to more than two gigabytes, and if a report is larger than most applications can handle, AWS splits it into multiple files stored in the same folder.

Here is an example of the folder structure for the report:

<report-prefix>/<report-name>/20180101-20180201/<dd7544e6-b151-4cb6-805d-6400f07c614a>/<report-name>-<1>.csv.gz
<report-prefix>/<report-name>/20180101-20180201/<dd7544e6-b151-4cb6-805d-6400f07c614a>/<report-name>-<2>.csv.gz
<report-prefix>/<report-name>/20180101-20180201/<dd7544e6-b151-4cb6-805d-6400f07c614a>/<report-name>-<3>.csv.gz
<report-prefix>/<report-name>/20180101-20180201/<dd7544e6-b151-4cb6-805d-6400f07c614a>/<report-name>-Manifest.json
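To make this layout concrete, here is a minimal boto3 sketch (the bucket and key are examples matching the structure above) that reads the Manifest.json for a billing period and lists the csv.gz files belonging to that assembly; assemblyId and reportKeys are the manifest fields the report provides.

    # Sketch: read the CUR manifest for one billing period and list the
    # csv.gz parts it points to. Bucket and key values are examples only.
    import json
    import boto3

    s3 = boto3.client("s3")
    bucket = "cur-billing-bucket"
    key = "report-prefix/report-name/20180101-20180201/report-name-Manifest.json"

    manifest = json.loads(s3.get_object(Bucket=bucket, Key=key)["Body"].read())

    print("assembly:", manifest["assemblyId"])
    for report_key in manifest["reportKeys"]:   # one entry per csv.gz part
        print(report_key)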

The first challenge we face is that consolidated billing is enabled for several accounts, which results in a large amount of billing data; AWS splits the data into 16+ csv.gz files. Here is the data we are dealing with for a single upload of the CUR report:

  • Total data size: 25 GB
  • Total billing records: 30 million
  • Average size of .csv.gz: 120 MB (about 1.5GB after unzipping)

The second challenge is that the schema of the csv files may change from time to time, for example when new services are used or new cost allocation tags are added, which introduces new columns in the report.

At first, I tried Athena and QuickSight, because in-place S3 querying and the tight integration between these two services are intriguing. But I soon found that Athena cannot handle the schema change: a partition schema mismatch error appears when loading data from multiple months using the Glue Data Catalog as the metastore. That's why I chose Amazon Elasticsearch Service and Kibana for this solution. Elasticsearch's dynamic indexing makes it easy to index data with different schemas and make it available for users to explore, and the performance is good for the quantity of data we have. Fortunately, AWS offers a fully managed Elasticsearch Service. Glue is a natural choice for ETL since it's a serverless, managed solution.
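As a minimal sketch of that behaviour (the endpoint, index names, and fields below are made up, and an elasticsearch-py 7.x client is assumed), dynamic mapping lets documents with different columns land in monthly indices without a predefined schema:

    # Sketch: Elasticsearch dynamic mapping accepts documents whose fields
    # change from month to month. Endpoint, index names, and fields are
    # illustrative only; an elasticsearch-py 7.x client is assumed.
    from elasticsearch import Elasticsearch

    es = Elasticsearch(["https://your-es-domain.us-east-1.es.amazonaws.com:443"])

    # A January row, and a February row that gained a new tag column.
    es.index(index="cur201801",
             body={"lineItemId": "abc", "unblendedCost": 0.42})
    es.index(index="cur201802",
             body={"lineItemId": "def", "unblendedCost": 0.17,
                   "resourceTags/user:team": "data"})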

Now let's take a closer look at this solution and how to deploy it.

Prerequisites

Here are the prerequisites for this solution:

  1. The Cost and Usage Report must be enabled. If you haven't done this, please follow the instructions here.
  2. Prepare an IAM role for the Glue job with these IAM permissions (a boto3 sketch follows this list):
  • Read access to the S3 bucket that contains the CUR reports
  • Full access to the Amazon Elasticsearch domain; consider the AmazonESFullAccess managed policy
  • The AWSGlueServiceRole managed policy
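If you prefer to script this, here is a hedged boto3 sketch of attaching the pieces above to a Glue role; the role name and the S3 read-only policy ARN are placeholders, while AmazonESFullAccess and AWSGlueServiceRole are AWS managed policies.

    # Sketch: attach the permissions listed above to an existing Glue job role.
    # The role name and the custom S3 read policy ARN are placeholders.
    import boto3

    iam = boto3.client("iam")
    role_name = "cur-glue-es-role"  # assumed to exist with a Glue trust policy

    for policy_arn in [
        "arn:aws:iam::aws:policy/AmazonESFullAccess",
        "arn:aws:iam::aws:policy/service-role/AWSGlueServiceRole",
        "arn:aws:iam::123456789012:policy/cur-bucket-read-only",  # your policy
    ]:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)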

Set up and deploy the solution

  1. Identify a VPC you want to use to deploy the solution.

  2. Choose or create a private subnet in the VPC

  3. Create a security group in the VPC.

  4. Create a VPC endpoint for S3 and associate it with the subnet above; an AWS blog post gives detailed instructions on how to create it through the console.

  5. Set up an ES domain in the VPC and private subnet above; see here for the detailed steps.

  6. Create a JDBC connection following the guidelines. Choose the VPC, subnet, and security group created above.

  7. Create a Python library package from your local terminal and upload it to an S3 bucket. NOTE: please use Python 2.x for this solution

    mkdir esglueaccess
    cd esglueaccess
    pip install elasticsearch -t .
    zip -r ../es.zip ./*
  8. Follow the Glue add job instructions to create a Glue job with these inputs:

    • Name: cur-s3-to-es-domain
    • IAM role: the role you created in prerequisites section
    • This job runs: A new script to be authored by you
    • ETL language: Python
    • Python lib path: the S3 path of the package you uploaded in step 7, e.g. s3://glue-python-package-for-es/es.zip
    • Job parameters: you need to provide five parameters; the values here are for your reference (a minimal parameter-handling sketch follows this list)
        # source bucket name
        # e.g. cur-billing-bucket
        --source_bucket
        # the prefix of the actual report folders, the report folders are like this 20171231-20180101
        # e.g. CostAndUsageReport
        --report_folder_prefix
        # the elasticsearch index name template {} will be replaced by year and month
        # e.g. cur{}{}
        --index_name_prefix_template
        # the elasticsearch index pattern used to generate the dashboard
        # NOTE: it needs to be different from index name template
        # e.g. yourorgname-billing-report
        --index_pattern_prefix
        # your elasticsearch service domain URL
        --es_domain_url
    • Connections: Choose the jdbc connection created in step 6.
    • For other inputs, use default settings
  9. Get the Glue job script from the 1Strategy Cost and Usage Reports GitHub repo, paste it into the script editor, and click [Save]

  10. Create a daily scheduled Glue job trigger following the guideline here. Make sure to set all job parameters properly, especially --es_domain_url.
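As a rough sketch of how the five job parameters are consumed (this is illustrative, not the repo's actual script), a Glue Python job can read them with getResolvedOptions and open the Elasticsearch connection like this:

    # Sketch: read the five job parameters inside the Glue job and connect
    # to the ES domain. Illustrative only; not the repo's actual script.
    import sys
    from awsglue.utils import getResolvedOptions
    from elasticsearch import Elasticsearch

    args = getResolvedOptions(sys.argv, [
        "source_bucket",
        "report_folder_prefix",
        "index_name_prefix_template",
        "index_pattern_prefix",
        "es_domain_url",
    ])

    es = Elasticsearch([args["es_domain_url"]])

    # e.g. "cur{}{}" becomes "cur201801" for the January 2018 billing period
    index_name = args["index_name_prefix_template"].format("2018", "01")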

Note:

  • To put the Glue job in the same VPC as the ES domain, you'll need to create a JDBC connection in the Glue Data Catalog and make sure to choose the right VPC.
  • Using Glue jobs to access ES in the same VPC is not very common; Glue was designed to access JDBC data sources. Just put in a mock JDBC URL like jdbc:mysql://test:9243/test.

Glue job and index speed tuning

You'll need to tune the performance a bit to find the best number of parallel bulk sessions (partitions) for the Elasticsearch Service cluster spec you choose. You can refer to the Elasticsearch guidelines on how to tune indexing performance.

Here are the items we tested that helped improve indexing performance for our solution (a minimal sketch follows the list):

  • Disable refresh and replicas for initial data loads
  • Use auto-generated ids
  • Use bulk requests
  • Use 3 worker nodes (m4.xlarge) to send data to the Elasticsearch domain
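Here is a minimal sketch of the first three items (the index name, settings values, and document generator are examples): temporarily disable refresh and replicas, then bulk-index documents without explicit ids so Elasticsearch auto-generates them.

    # Sketch: apply the tuning items above with the elasticsearch Python client.
    # Index name, settings values, and the docs() generator are illustrative.
    from elasticsearch import Elasticsearch
    from elasticsearch.helpers import bulk

    es = Elasticsearch(["https://your-es-domain.us-east-1.es.amazonaws.com:443"])
    index = "cur201801"

    # Disable refresh and replicas while the initial load runs.
    es.indices.put_settings(index=index, body={
        "index": {"refresh_interval": "-1", "number_of_replicas": 0}})

    def docs():
        # Yield one action per billing row; omitting "_id" lets Elasticsearch
        # auto-generate document ids.
        for row in [{"lineItemId": "abc", "unblendedCost": 0.42}]:
            yield {"_index": index, "_source": row}

    bulk(es, docs(), chunk_size=1000)  # bulk requests instead of per-doc calls

    # Restore the settings once the load completes.
    es.indices.put_settings(index=index, body={
        "index": {"refresh_interval": "1s", "number_of_replicas": 1}})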

Kibana dashboard and metrics

Now that the cost and usage report data is scheduled to be copied to the Elasticsearch Service domain every day, it's time to visualize the data using Kibana.

There is a tradeoff to using a VPC to secure and simplify communication between Glue and the Elasticsearch Service domain: the subnet has to be private. This is by design for Glue jobs that use JDBC connections, for security purposes. As a result, we can't use the default Kibana endpoint that comes with the Elasticsearch domain. Instead, we start a Kibana server on an EC2 instance in the same VPC but in a public subnet, and use a security group to manage access to the Kibana server.

Here are some sample metrics we created: sample kibana dashboard

Conclusion

AWS Cost and Usage Reports contain all the data we need to understand our billing. We can use this data to create customized dashboards based on our business needs, in order to monitor, control, and optimize our costs. Athena and QuickSight are a great, quick solution when you want to analyze billing data for a single month, and with Glue and Elasticsearch Service we are able to create a fully customized billing dashboard over a large amount of data.

License

Licensed under the Apache License, Version 2.0.
