ColumnStore Data Ingestion

ColumnStore provides several mechanisms to ingest data:

  • cpimport provides the fastest performance for inserting data and the ability to route data to particular PM nodes. Normally this should be the default choice for loading data.
  • LOAD DATA INFILE provides another means of bulk inserting data.
    • By default, with autocommit on, it internally streams the data to an instance of the cpimport process. This requires some memory overhead on the UM server, so it may be less reliable than cpimport for very large imports.
    • In transactional mode, DML inserts are performed, which is significantly slower and consumes both binlog transaction files and ColumnStore VersionBuffer files.
  • DML, i.e. INSERT, UPDATE, and DELETE, provides row-level changes. ColumnStore is optimized for bulk modifications, so these operations are slower than they would be in, say, InnoDB.
    • Currently ColumnStore does not support operating as a replication slave target.
    • Bulk DML operations will in general perform better than multiple individual statements.
      • INSERT INTO SELECT with autocommit behaves similarly to LOAD DATA INFILE in that internally it is mapped to cpimport for higher performance.
      • Bulk update operations based on a join with a small staging table can be relatively fast, especially if updating a single column.
  • The ColumnStore Bulk Write SDK and the ColumnStore Streaming Data Adapters provide further means of ingesting data.
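
As an illustration of the cpimport route recommended above, the following is a minimal Python sketch that writes rows to a delimited text file and builds (but does not run) a matching cpimport command line. It assumes cpimport's `-s` delimiter option and the `database table file` argument order. The helper names (`format_row`, `write_load_file`, `cpimport_args`) and the database, table, and file names are hypothetical. NULL handling and quoting of values are deliberately left out.

```python
import os
import shlex
import tempfile


def format_row(values, delimiter="|"):
    """Join one record into a single delimited line (no quoting or escaping)."""
    fields = [str(v) for v in values]
    for field in fields:
        # This sketch does not escape values, so reject ones that would break parsing.
        if delimiter in field or "\n" in field:
            raise ValueError(f"field {field!r} contains the delimiter or a newline")
    return delimiter.join(fields)


def write_load_file(rows, path, delimiter="|"):
    """Write all records to `path`, one record per line."""
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        for row in rows:
            fh.write(format_row(row, delimiter) + "\n")


def cpimport_args(database, table, path, delimiter="|"):
    """Build the argument list for cpimport; the caller decides whether to run it."""
    return ["cpimport", "-s", delimiter, database, table, path]


if __name__ == "__main__":
    # Hypothetical data and names, for illustration only.
    rows = [(1, "alpha", 10.5), (2, "beta", 20.0)]
    path = os.path.join(tempfile.mkdtemp(), "orders.tbl")
    write_load_file(rows, path)
    # Print a shell-safe command line instead of executing it.
    print(shlex.join(cpimport_args("mydb", "orders", path)))
```

Keeping file generation separate from command construction makes it possible to inspect the load file before handing it to cpimport on a server where the utility is installed.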

Title                                           Description
ColumnStore Bulk Data Loading                   Using high-speed bulk load utility cpimport
ColumnStore LOAD DATA INFILE                    Using the LOAD DATA INFILE statement for bulk data loading.
ColumnStore Batch Insert Mode                   Batch data insert mode with cpimport
ColumnStore Bulk Write SDK                      Introduction: Starting with MariaDB ColumnStore 1.1 a C++ SDK is available ...
ColumnStore remote bulk data import: mcsimport  Overview: mcsimport is a high-speed bulk load utility that imports data int...
ColumnStore Streaming Data Adapters             The ColumnStore Bulk Data API enables the creation of higher performance a...
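
The note that bulk DML generally performs better than multiple individual statements can be sketched in Python as follows. The example groups rows into batches and builds one multi-row INSERT per batch rather than one statement per row. The helper names (`chunked`, `multi_row_insert`), the table and column names, and the `%s` placeholder style are assumptions. Placeholder syntax varies between database drivers, so it is passed in as a parameter, and nothing here is executed against a server.

```python
def chunked(rows, size):
    """Yield successive slices of `rows`, each holding at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def multi_row_insert(table, columns, batch, placeholder="%s"):
    """Return (sql, params) for a single INSERT covering every row in `batch`.

    Table and column names are interpolated directly, so they must come from
    trusted code, never from user input.
    """
    row_group = "(" + ", ".join([placeholder] * len(columns)) + ")"
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
        + ", ".join([row_group] * len(batch))
    )
    # Flatten the rows so values line up with the placeholders in order.
    params = [value for row in batch for value in row]
    return sql, params


if __name__ == "__main__":
    # Hypothetical rows, for illustration only.
    rows = [(1, "alpha"), (2, "beta"), (3, "gamma")]
    for batch in chunked(rows, 2):
        sql, params = multi_row_insert("orders", ["id", "name"], batch)
        print(sql, params)
```

For large volumes the earlier guidance still applies: cpimport or LOAD DATA INFILE is normally the better choice, and batching of this kind mainly reduces per-statement overhead for moderate row counts.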

© 2021 MariaDB
Licensed under the Creative Commons Attribution 3.0 Unported License and the GNU Free Documentation License.
https://mariadb.com/kb/en/columnstore-data-ingestion/