Build data pipeline with ELK stack for visualizing logs #253

Open · 5 tasks
djbowers opened this issue Aug 19, 2020 · 3 comments

djbowers commented Aug 19, 2020

Undebate ELK Data Pipeline

We want to be able to visualize our log data so that we can make better decisions about where to focus our time and effort to most effectively improve this application.

So far @epg323 has made some progress towards this goal with a Node app that reads the log data directly from MongoDB and prints some results to the command line, which already revealed some changes that needed to be made in the application itself.

This ticket is an attempt to create a deployable pipeline using the open-source ELK stack (Elasticsearch, Logstash, and Kibana) that will accomplish the same goal. We will use Logstash to pipe the data from Mongo to Elasticsearch, then visualize the data in Elasticsearch using Kibana.

Our hope is to be able to deploy this pipeline with Docker on AWS. When we tear down the pipeline, we will store the historical log data in S3. Every time we deploy the pipeline, we will first read in historical data from S3, then read in the new data from Mongo. This way we have cheap storage for our historical data and don't have to worry about our short log history on Mongo.
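
A rough sketch of what the Logstash pipeline config could look like, assuming the community logstash-input-mongodb plugin (Logstash has no official MongoDB input, so the input options here may differ by plugin version); the connection string, collection, and index names are placeholders. The key idea is to reuse the Mongo id as the Elasticsearch document id so that merging S3 history with new Mongo data doesn't duplicate logs.

```
input {
  mongodb {
    # Placeholder connection details -- adjust for the real cluster.
    uri                 => "mongodb://localhost:27017/undebate"
    placeholder_db_dir  => "/usr/share/logstash/mongodb"
    placeholder_db_name => "logstash_mongodb.db"
    collection          => "logs"
    batch_size          => 500
  }
}

output {
  elasticsearch {
    hosts       => ["http://elasticsearch:9200"]
    index       => "undebate-logs"
    # Reuse the Mongo id as the document id so re-ingesting historical data
    # from S3 does not create duplicates (the exact field name depends on
    # how the input plugin exposes the Mongo id).
    document_id => "%{[_id]}"
  }
}
```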

Tasks Remaining

  • Use Logstash to pipe log data from Mongo to Elasticsearch
  • Use Kibana to visualize log data in Elasticsearch
  • Offload historical data from Elasticsearch to S3
  • Merge historical data from S3 with new data from Mongo without duplicating logs
  • Make the pipeline easily deployable with Docker on AWS (see the Compose sketch below)
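
For the Docker task, a minimal docker-compose sketch of the stack on a single host might look like the following. Image versions and settings are illustrative, not a tested deployment, and we would likely need a custom Logstash image with the MongoDB input plugin installed; the pipeline config above is assumed to live in a local `./pipeline` directory.

```yaml
version: "3"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.9.2
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"

  logstash:
    image: docker.elastic.co/logstash/logstash:7.9.2
    volumes:
      # Mount the Mongo -> Elasticsearch pipeline config sketched above.
      - ./pipeline:/usr/share/logstash/pipeline
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:7.9.2
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch
```
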
ddfridley (Contributor) commented

Goal for next week is to be able to read in the documents from Mongo and make some visualizations.

djbowers (Collaborator, Author) commented

In case Logstash doesn't work out for moving the logs from MongoDB to Elasticsearch, I found this Stack Overflow question on doing it with Python: https://stackoverflow.com/questions/44155858/load-data-from-mongodb-to-elasticsearch-through-python
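
A rough sketch of that fallback, assuming the pymongo and elasticsearch Python packages; the connection strings, database/collection, and index name are placeholders, and reusing the Mongo _id as the Elasticsearch document id keeps reruns from duplicating logs.

```python
from pymongo import MongoClient
from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

mongo = MongoClient("mongodb://localhost:27017")   # placeholder Mongo URI
es = Elasticsearch("http://localhost:9200")        # placeholder Elasticsearch URL

def actions():
    # Stream every log document out of Mongo and index it under its Mongo
    # _id, so re-running the load (or merging in S3 history later) does not
    # create duplicate documents in Elasticsearch.
    for doc in mongo["undebate"]["logs"].find():
        doc_id = str(doc.pop("_id"))
        yield {"_index": "undebate-logs", "_id": doc_id, "_source": doc}

bulk(es, actions())
```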

ddfridley (Contributor) commented

Goal for next week is to be able to read in all of the keywords.

djbowers removed their assignment Sep 23, 2020