OpenIntel - ETL

The openIntel solution seeks to provide an easy way for districts to deploy a data warehouse without having to pay high prices for commercial solutions. To this end, we have developed Extract Transform and Load (ETL) plans using the Pentaho Data Integration tool, formerly known as Kettle. This tool allows for graphical programming of ETL routines. ETL refers to the process of extracting data from source systems, transforming it to fit into the data warehouse model and then loading the data into the data warehouse. This process is the hardest part of a data warehouse and the bulk of the work is done by district resources who know where the data is located.

In our first release, we have developed ETL plans that take data from a flat file and load it into the data warehouse. The flat file is comprised of your district's data and constructed according to the specifications we have defined for you. You can use the ETL tool to source data directly from your source systems. In our next release, we will integrate the openIntel data warehouse directly to Centre, the open source student information system. The ETL plans will then source data directly from Centre eliminating the need for a district to build their own flat files. An example plan is shown in the image below.

If you think this is sounding hard, relax. Because we have done this before, we have given you all of the tools and documentation to get you going. Included in our download section, is a complete set of data mapping spreadhsheets for the openIntel data warehouse. These sheets contain instructions telling you exactly how to map your source data to the data warehouse model and hints and tips for making it easier. Examples of data rules are also included so that you understand any data transformations that may have to occur. An example of a data transformation would be: 0 in my SIS system = Male in the data warehouse model.

One last thing: We haven't built any data validation routines for you yet. Data validation means checking your source data to validate that data going into a numeric field in the data warehouse is in fact numeric. Right now, the ETL plans will stop and roll back if data does not conform to the data warehouse model. There is is a reason for that and it is a very good reason. The people using your reports will be experts in the data and will recognize bad data. When that happens, they will quit using your data and all of your efforts will be wasted. So, we only let you load good data! It is not as hard as you think. Visit the download section and get started!

Support Center
View openSIS Demo
Check Out openSIS Screenshots