

Throttling ELK log ingestion pipeline

Published:

How to process bulk logs and still have hardware resources left for other tasks
Problem statement

Recently I had to send a sizeable amount of logs into our log pipeline: 23,781,261 log lines, to be exact. Our log pipeline is the standard ELK stack plus Filebeat, a lightweight log shipper from Elastic that forwards logs from the central log server into Logstash. Here's a diagram illustrating the entire flow of logs in our system:

Nodes -> Syslog-NG -> Central log server -> Filebeat -> Logstash -> Elasticsearch

Pushing this many logs with near-default configurations of Filebeat and ELK completely saturated disk I/O on the server running Elasticsearch. That node uses spinning disks, so saturating it is not a difficult feat.
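To get a rough feel for what slowing the pipeline down would cost in wall-clock time, here is a minimal back-of-the-envelope sketch. The line count is the one given above. The rates are purely hypothetical examples, not measurements from this setup, and `drain_time_hours` is a helper name made up for illustration.

```python
# Back-of-the-envelope: how long a backlog takes to drain at a steady rate.
# TOTAL_LINES comes from the post; every rate below is a hypothetical example.

TOTAL_LINES = 23_781_261  # backlog size stated in the problem statement


def drain_time_hours(total_lines: int, lines_per_second: float) -> float:
    """Return the hours needed to ship total_lines at a constant rate."""
    if lines_per_second <= 0:
        raise ValueError("lines_per_second must be positive")
    return total_lines / lines_per_second / 3600


if __name__ == "__main__":
    # Assumed throttle rates in lines per second (illustrative only).
    for rate in (500, 1_000, 5_000):
        hours = drain_time_hours(TOTAL_LINES, rate)
        print(f"{rate:>5} lines/s -> {hours:.1f} h")
```

The estimate assumes a constant rate and ignores retries and backpressure, so treat it as a lower bound on how long a throttled import would run.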