How to get data into MongoDB using PDI

This demo shows how to import data from a CSV file into MongoDB using the Pentaho Data Integration (PDI) tool, also known as Kettle. The following items will be demonstrated:

  1. The basics of mapping columns from a CSV file to fields in a MongoDB JSON document.
  2. How to handle variable/optional columns.
  3. How to perform basic data scrubbing before adding data to MongoDB.

Although this demo uses a CSV file as its input, PDI can just as easily import data from many JDBC-compliant databases by using the Table Input step.
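
To make the three items above concrete outside of PDI, here is a minimal Python sketch of the same ideas: mapping CSV columns to document fields, leaving an optional column out when it is empty, and doing light scrubbing (trimming whitespace, lower-casing an email address). It is not the PDI transformation itself. The column names (id, name, email, phone), the sample rows, and the nested "contact" field are all assumptions made up for illustration.

```python
import csv
import io

# Hypothetical sample data; these column names are assumptions for illustration.
SAMPLE_CSV = """id,name,email,phone
1,  Alice Smith ,ALICE@EXAMPLE.COM,555-0100
2,Bob Jones,bob@example.com,
"""


def row_to_document(row):
    """Map one CSV row (a dict of strings) to a MongoDB-style document."""
    doc = {
        # Column-to-field mapping, with a type conversion for the key.
        "_id": int(row["id"]),
        # Basic scrubbing: trim stray whitespace.
        "name": row["name"].strip(),
        # Flat CSV columns can be mapped into a nested sub-document.
        "contact": {"email": row["email"].strip().lower()},
    }
    # Optional column: only add the field when a value is present,
    # rather than storing an empty string.
    phone = (row.get("phone") or "").strip()
    if phone:
        doc["contact"]["phone"] = phone
    return doc


def load_documents(csv_text):
    """Parse CSV text and return a list of documents ready for insertion."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_document(row) for row in reader]


docs = load_documents(SAMPLE_CSV)
for doc in docs:
    print(doc)
```

With a running MongoDB server and the pymongo driver installed, the resulting list could be written with something like `MongoClient()["demo"]["people"].insert_many(docs)`, where the database and collection names are again hypothetical.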

Filed under Big Data, MongoDB, PDI, Pentaho
