How to get data into MongoDB using PDI

This demo shows how to import data from a CSV file into MongoDB using the Pentaho Data Integration (PDI) tool, also known as Kettle. It covers the following:

  1. How to map columns from a CSV file to fields in a MongoDB JSON document.
  2. How to handle variable or optional columns.
  3. How to perform basic data scrubbing before adding data to MongoDB.
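
To make the three items above concrete outside of PDI, here is a minimal Python sketch of the same logic. It is not how PDI implements these steps, only an illustration of the idea. The sample data, the column names (`id`, `name`, `email`, `phone`), and the helper names are all assumptions made up for this example.

```python
import csv
import io
import json

# Hypothetical sample input; the column names are assumptions for illustration.
SAMPLE_CSV = """id,name,email,phone
1, Alice Smith ,ALICE@EXAMPLE.COM,555-0100
2,Bob Jones,bob@example.com,
"""

# Columns that may be empty; they are left out of the document when blank.
OPTIONAL_COLUMNS = {"phone"}


def row_to_document(row):
    """Map one CSV row (a dict of strings) to a MongoDB-style document."""
    doc = {}
    for column, value in row.items():
        value = (value or "").strip()        # scrub: trim surrounding whitespace
        if not value and column in OPTIONAL_COLUMNS:
            continue                         # optional column: omit the field
        doc[column] = value
    doc["id"] = int(doc["id"])               # scrub: store the id as a number
    doc["email"] = doc["email"].lower()      # scrub: normalize email case
    return doc


def csv_to_documents(text):
    """Parse CSV text and return one document per data row."""
    reader = csv.DictReader(io.StringIO(text))
    return [row_to_document(row) for row in reader]


if __name__ == "__main__":
    for doc in csv_to_documents(SAMPLE_CSV):
        print(json.dumps(doc))
```

The second row has no phone number, so its document simply has no `phone` field, which is the usual way to represent an optional column in a document store. If you use a driver such as pymongo, the resulting list could then be passed to a collection's `insert_many` method.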

Although this demo uses a CSV file as input, PDI can just as easily import data from many JDBC-compliant databases by using the Table Input step.

Filed under Big Data, MongoDB, PDI, Pentaho