Batch import

When getting started with TempoDB, you'll likely have existing data to batch import. This data is usually in flat files, like CSVs, or in existing databases, like MySQL or MongoDB.

This is very common, and we'll show a few different approaches to batch importing your data. We'll also look at ways to speed up your batch import.

Best practices

Each call to TempoDB is an HTTP request. To reduce HTTP overhead, you'll want to send multiple datapoints in each request. There are two ways to do this:

  • Write multiple (timestamp, value) pairs for a single series using the write id or write key endpoints.
  • Write multiple (series, value) pairs for a single timestamp using the bulk write endpoint.

The optimal size for each request is around 20 items. So for write id and write key calls, each request should have around 20 (timestamp, value) pairs. Similarly, each bulk write should include 20 (series, value) pairs.
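
As a concrete illustration, here is a minimal sketch of both batching styles using the tempodb Python client. The method names (write_key, write_bulk), the DataPoint type, and the bulk payload shape are assumptions about the client's API; check the version you have installed for the exact signatures.

# Sketch of both batching styles. Assumes the tempodb Python client
# exposes write_key, write_bulk, and DataPoint as shown; verify
# against your installed client.
from datetime import datetime, timedelta

from tempodb import Client, DataPoint

client = Client("your-api-key", "your-api-secret")
BATCH_SIZE = 20  # roughly the optimal number of items per request

# Style 1: many timestamps, one series. Send ~20 (timestamp, value)
# pairs in a single write key call.
start = datetime(2013, 1, 1)
points = [DataPoint(start + timedelta(minutes=m), 20.0 + 0.1 * m)
          for m in range(BATCH_SIZE)]
client.write_key("thermostat.1.temperature", points)

# Style 2: one timestamp, many series. Send ~20 (series, value)
# pairs in a single bulk write call. The payload shape here is an
# assumption; consult the bulk write docs for the exact format.
ts = datetime(2013, 1, 1, 12, 0, 0)
data = [{"key": "sensor.%d.temperature" % i, "v": 20.0 + i}
        for i in range(BATCH_SIZE)]
client.write_bulk(ts, data)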

You can increase the speed of your batch import to TempoDB by parallelizing the calls. This can be accomplished by making calls on separate threads or using multiple workers if consuming off a queue. The optimal thread pool size/worker count is 3-7. Test with the particular write endpoints you are using and the size of each request to find the sweet spot for your import.
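
Below is a hedged sketch of a parallel import using a thread pool of 5 workers, which sits inside the suggested 3-7 range. It reuses the write key batching shown above and assumes the client is safe to share across threads; if yours is not, create one client per worker.

# Sketch of a parallelized batch import. Worker count and the
# write_key/DataPoint usage are assumptions; tune and verify against
# your own endpoints and client version.
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta

from tempodb import Client, DataPoint

client = Client("your-api-key", "your-api-secret")
BATCH_SIZE = 20

def write_batch(series_key, batch):
    # batch is a list of (datetime, float) tuples
    client.write_key(series_key, [DataPoint(t, v) for t, v in batch])

def parallel_import(series_key, points, workers=5):
    # Split the points into ~20-item batches, then write the batches
    # concurrently from a small pool of threads.
    batches = [points[i:i + BATCH_SIZE]
               for i in range(0, len(points), BATCH_SIZE)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(write_batch, series_key, b) for b in batches]
        for f in futures:
            f.result()  # surface any write errors

# Example: import a day of one-minute readings with 5 workers.
start = datetime(2013, 1, 1)
readings = [(start + timedelta(minutes=m), 20.0 + 0.01 * m)
            for m in range(1440)]
parallel_import("thermostat.1.temperature", readings)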

Tutorials

CSV files

MySQL database (coming soon)

MongoDB (coming soon)

Consuming off a queue (coming soon)

Python batch import script