Pentaho Data Integration (PDI, also called Kettle) is the component of Pentaho responsible for Extract, Transform and Load (ETL) processes. Though ETL tools are most frequently used in data warehouse environments, PDI can also be used for other purposes, such as migrating data between applications or databases, exporting data from databases to flat files, bulk-loading data into databases, cleansing data, and integrating applications.

Pentaho Data Integration 4 Cookbook starts off by explaining the details of working with databases, files, and XML structures. It then shows readers different ways of searching data, executing and reusing jobs and transformations, and manipulating streams. It also teaches them to solve common Excel needs, such as reading from a particular cell or generating several sheets at a time.

Using this book, readers will be able to look up information from different sources such as databases, web services, and spreadsheets. They will learn how to work with data flows, performing operations such as joining, merging, or filtering rows, and how to customize the Kettle logs to their needs. The cookbook also teaches how to integrate Kettle with Pentaho Reporting, Pentaho Dashboards, Community Data Access, and the Pentaho BI Platform.

With plenty of well-organized tips, screenshots, tables, and examples to aid quick and easy understanding, this book is ideal for software developers and for anyone involved or interested in developing ETL solutions or, more generally, in doing any kind of data manipulation. A basic understanding of the PDI tool, SQL, and databases is necessary. This book is out now and available from Packt. For more information or to download a sample chapter, please visit:

