How To Import Prometheus Metrics Into VictoriaMetrics

So you heard about VictoriaMetrics and its claims of higher performance at lower resource usage, and you want to see for yourself. After all, who believes everything the authors say about their code? :)

For a meaningful comparison between VictoriaMetrics and Prometheus, you first need to get the same metrics into VM. Say Prometheus has been in your stack for a while and holds 6 months of metrics. How do you get that data into VM?

Remote Write

The standard way to add VictoriaMetrics to your Prometheus stack is to configure Prometheus remote_write. remote_write works well, and there are several tuning parameters on the Prometheus end to ensure it can keep up with the number of metrics you’re ingesting.
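
If you haven’t set it up yet, a minimal remote_write block in prometheus.yml pointing at a single-node VM instance looks roughly like this; the queue_config values below are illustrative starting points, not recommendations:

remote_write:
  - url: http://localhost:8428/api/v1/write
    queue_config:
      max_samples_per_send: 10000   # samples per remote-write request
      capacity: 20000               # per-shard buffer of pending samples
      max_shards: 30                # upper bound on parallel senders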

However, remote_write only reads from the WAL, which means that once you enable it, VictoriaMetrics will receive the last 2 hours’ worth of metrics and then keep receiving new ones close to real time going forward.

If your retention period is 6 months in Prometheus, do you wait for 6 months to fill VictoriaMetrics and then run your tests? That’s not a feasible option.

Bulk import with vmctl

The VictoriaMetrics authors have written a tool that can import Prometheus (and InfluxDB) data into VictoriaMetrics. To make such imports even faster than standard ingestion via remote_write allows, VictoriaMetrics has a bulk-import API endpoint, and that’s exactly what vmctl uses.
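
For the curious, on a single-node instance that bulk endpoint is /api/v1/import and it accepts newline-delimited JSON; a hand-rolled import of a single made-up series would look something like this (metric name, labels and values are purely illustrative):

curl -X POST http://localhost:8428/api/v1/import -d '{"metric":{"__name__":"test_metric","job":"test"},"values":[10,20],"timestamps":[1585744800000,1585744860000]}'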

Let’s build vmctl and see how it works. The sources are on GitHub; clone the repo and build it with make build. My test nodes run FreeBSD, so I’m cross-compiling a FreeBSD executable with go build directly:

# cross-compile vmctl for FreeBSD from the cloned repo
cd $GOPATH/src/github.com/VictoriaMetrics/vmctl
GOOS=freebsd go build -mod=vendor -o vmctl

All done. Let’s see how to use it:

./vmctl prometheus -h
NAME:
   vmctl prometheus - Migrate timeseries from Prometheus

USAGE:
   vmctl prometheus [command options] [arguments...]

OPTIONS:
   --prom-snapshot value            Path to Prometheus snapshot. Pls see for details https://www.robustperception.io/taking-snapshots-of-prometheus-data
   --prom-concurrency value         Number of concurrently running snapshot readers (default: 1)
   --prom-filter-time-start value   The time filter to select timeseries with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'
   --prom-filter-time-end value     The time filter to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z'
   --prom-filter-label value        Prometheus label name to filter timeseries by. E.g. '__name__' will filter timeseries by name.
   --prom-filter-label-value value  Prometheus regular expression to filter label from "prom-filter-label" flag. (default: ".*")
   --vm-addr value                  VictoriaMetrics address to perform import requests. Should be the same as --httpListenAddr value for single-node version or VMSelect component. (default: "http://localhost:8428")
   --vm-user value                  VictoriaMetrics username for basic auth [$VM_USERNAME]
   --vm-password value              VictoriaMetrics password for basic auth [$VM_PASSWORD]
   --vm-account-id value            Account(tenant) ID - is required for cluster VM. (default: -1)
   --vm-concurrency value           Number of workers concurrently performing import requests to VM (default: 2)
   --vm-compress                    Whether to apply gzip compression to import requests (default: true)
   --vm-batch-size value            How many datapoints importer collects before sending the import request to VM (default: 200000)
   --help, -h                       show help (default: false)

2020/04/03 17:56:52 Total time: 509.024µs

So what we need is a snapshot of the Prometheus TSDB. The help output conveniently includes a link to the Robust Perception blog post about Prometheus snapshots, if you’ve not used them before. In a nutshell, a Prometheus snapshot is simply a consistent copy of the TSDB at a given point in time.
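
One gotcha: the snapshot endpoint is part of Prometheus’s TSDB admin API, which is disabled by default, so Prometheus needs to be running with that API enabled, along these lines (the config path is just an example):

prometheus --config.file=/etc/prometheus/prometheus.yml --web.enable-admin-api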

Let’s take a snapshot:

curl -X POST http://localhost:9090/api/v1/admin/tsdb/snapshot
{"status":"success","data":{"name":"20200404T012938Z-66bb0213b2d7bbe8"}}

This won’t take long, as snapshots use hard links, but it of course depends on the size of your Prometheus TSDB and the speed of the I/O subsystem on the host. The Prometheus instance we’re testing with here has 4-month retention and a TSDB of nearly 35 GB; the snapshot took several seconds to complete.
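
The snapshot lands under the snapshots/ subdirectory of the Prometheus data directory, which on this host is /var/db/prometheus, so it can be listed with:

ls /var/db/prometheus/snapshots/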

Now let’s import this snapshot. Note that the server we’re using here has spinning disks for storage, which is the ultimate bottleneck for the total import duration, so YMMV.

Also, if you use Grafana, you should install the official VictoriaMetrics dashboard so you can get an idea of how well VM is performing during the import.

./vmctl prometheus --prom-snapshot /var/db/prometheus/snapshots/20200404T012938Z-66bb0213b2d7bbe8/ --vm-addr http://localhost:8428
Prometheus import mode
Prometheus snapshot stats:
  blocks found: 1413;
  blocks skipped: 0;
  min time: 1575583200000 (2019-12-05T22:00:00Z);
  max time: 1585963778392 (2020-04-04T01:29:38Z);
  samples: 11001521842;
  series: 15405068.
Filter is not taken into account for series and samples numbers.
Found 1413 blocks to import. Continue? [Y/n] y
1413 / 1413 [---------------------------------------------------------------------------------------------------] 100.00% 0 p/s
2020/04/04 06:41:13 Import finished!
2020/04/04 06:41:13 VictoriaMetrics importer stats:
  time spent while waiting: 8h6m10.020967102s;
  time spent while importing: 2h16m21.181395204s;
  total datapoints: 11001521842;
  datapoints/s: 1344735.11;
  total bytes: 223.4 GB;
  bytes/s: 27.3 MB;
  import requests: 109817;
  import requests retries: 0;
2020/04/04 06:41:13 Total time: 5h11m16.392353968s

Wow, 5 hours is a very long time, but since this is just a test, we don’t particularly care. If you have SSD-based storage, experiment with tuning vmctl’s read and write concurrency settings (--prom-concurrency and --vm-concurrency); you’ll get much higher import throughput.
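
For example, a tuned re-run might look like this (the concurrency values are arbitrary and depend entirely on your hardware):

./vmctl prometheus --prom-snapshot /var/db/prometheus/snapshots/20200404T012938Z-66bb0213b2d7bbe8/ --vm-addr http://localhost:8428 --prom-concurrency 4 --vm-concurrency 8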

Specifically, you want the import to take less than 2 hours: since remote_write replays roughly the last 2 hours of the WAL, enabling it right after a sub-2-hour import means the most recent metrics are backfilled from the WAL and VM stays nearly in sync with Prometheus from then on.

If you’re importing a lot of data and it still takes more than 2 hours, you can simply take another snapshot and import that too. vmctl can be given a timestamp from which to start the import with the --prom-filter-time-start option, which effectively turns it into an incremental import (even though the snapshot itself is complete). The value you pass should match the max time timestamp (in RFC 3339 format) from the previous import output. For example, the samples in our original snapshot ended at 2020-04-04T01:29:38Z, so for the next import we’d run:

./vmctl prometheus --prom-snapshot /var/db/prometheus/snapshots/$snapshot_id/ --vm-addr http://localhost:8428 --prom-filter-time-start 2020-04-04T01:29:38Z

Summary

Setting up VictoriaMetrics is fairly trivial: you can be fully provisioned, with an instance full of metrics, in a few hours’ time. Now it’s time to include VM in your metrics read path and compare the numbers.
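
If Grafana isn’t handy yet, a quick way to exercise the read path is VM’s Prometheus-compatible query API, for example (single-node address assumed):

curl 'http://localhost:8428/api/v1/query?query=up'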