2011-11-18 by Stefan Urbanek

Cubes Tutorial 1 - Getting started

In this tutorial you are going to learn how to start with cubes. The example shows:

This tutorial is obsolete, please refer to the actual cubes documentation for more information.

  • how to build a model programatically
  • how to create a model with flat dimensions
  • how to aggregate whole cube
  • how to drill-down and aggregate through a dimension

The example data used are IBRD Balance Sheet taken from The World Bank

Create a tutorial directory and download the file:

curl -O https://raw.github.com/Stiivi/cubes/master/tutorial/data/IBRD_Balance_Sheet__FY2010.csv

Create a tutorial_01.py:

import sqlalchemy
import cubes
import cubes.tutorial.sql as tutorial

Cubes package contains tutorial helper methods. It is advised not to use them in production, they are provided just to simplify learner’s life.

Prepare the data using the tutorial helper methods:


engine = sqlalchemy.create_engine('sqlite:///:memory:')
tutorial.create_table_from_csv(engine, 
                      "IBRD_Balance_Sheet__FY2010.csv", 
                      table_name="irbd_balance", 
                      fields=[
                            ("category", "string"), 
                            ("line_item", "string"),
                            ("year", "integer"), 
                            ("amount", "integer")],
                      create_id=True    

                        )

Now, create a model:

model = cubes.Model()

Add dimensions to the model. Reason for having dimensions in a model is, that they might be shared by multiple cubes.

model.add_dimension(cubes.Dimension("category"))
model.add_dimension(cubes.Dimension("line_item"))
model.add_dimension(cubes.Dimension("year"))

Define a cube and specify already defined dimensions:

cube = cubes.Cube(name="irbd_balance", 
                  model=model,
                  dimensions=["category", "line_item", "year"],
                  measures=["amount"]
                  )

Create a browser and get a cell representing the whole cube (all data):

browser = cubes.backends.sql.SQLBrowser(cube, engine.connect(), view_name = "irbd_balance")

cell = browser.full_cube()

Compute the aggregate. Measure fields of aggregation result have aggregation suffix, currenlty only _sum. Also a total record count within the cell is included as record_count.

result = browser.aggregate(cell)

print "Record count: %d" % result.summary["record_count"]
print "Total amount: %d" % result.summary["amount_sum"]

Now try some drill-down by category:

print "Drill Down by Category"
result = browser.aggregate(cell, drilldown=["category"])

print "%-20s%10s%10s" % ("Category", "Count", "Total")
for record in result.drilldown:
    print "%-20s%10d%10d" % (record["category"], record["record_count"], record["amount_sum"])

Drill-dow by year:

print "Drill Down by Year:"
result = browser.aggregate(cell, drilldown=["year"])
print "%-20s%10s%10s" % ("Year", "Count", "Total")
for record in result.drilldown:
    print "%-20s%10d%10d" % (record["year"], record["record_count"], record["amount_sum"])

All tutorials with example data and models will be stored together with cubes sources under the tutorial/ directory.

Next: Model files and hierarchies.

If you have any questions, comments or suggestions, do not hesitate to ask.