May 2012
4 posts
4 tags
Cubes 0.9.1: Ranges, denormalization and query...
The new minor release of Cubes, the lightweight Python OLAP framework, brings range cuts, denormalization with the slicer tool, and cells in the /report query, together with fixes and important changes.
See the second part of this post for the full list.
Range Cuts
Range cuts were implemented in the SQL Star Browser. They are used as follows:
Python:
cut = RangeCut("date", [2010],...
3 tags
Cubes 0.9 Released
The new version of Cubes, the lightweight Python OLAP framework, brings the new StarBrowser, which we discussed in previous blog posts:
mappings, see also documentation
joins and denormalization
aggregations and new features, see also documentation
The new SQL backend is written from scratch; it is much cleaner, more transparent, configurable and open to future extensions. It also allows direct browsing...
2 tags
Star Browser, Part 3: Aggregations and Cell...
Last time I talked about joins and denormalisation in the Star Browser. This is the last part about the Star Browser, in which I will describe aggregation and what has changed compared to the old browser.
The Star Browser is a new aggregation browser for Cubes, the lightweight Python OLAP framework. The next version, v0.9, will be released next week.
Aggregation
sum is not the only...
2 tags
Star Browser, Part 2: Joins and Denormalization
Last time I talked about how logical attributes are mapped to the physical table columns in the Star Browser. Today I will describe how joins are formed and how denormalization is going to be used.
The Star Browser is a new aggregation browser for Cubes, the lightweight Python OLAP framework.
Star, Snowflake, Master and Detail
Star browser supports a star:
… and snowflake...
April 2012
4 posts
2 tags
Star Browser, Part 1: Mappings
Star Browser is a new aggregation browser for Cubes, the lightweight Python OLAP framework.
I am going to talk briefly about the current state and why a new browser is needed. Then I will describe the new browser in more detail: how mappings work and how tables are joined. At the end I will mention what will be added soon and what is planned for the future.
Originally I wanted to write one blog post...
2 tags
Cubes Backend Progress and Comparison
I’ve been working on a new SQL backend for Cubes, called StarBrowser. Besides new features and fixes, it is going to be more polished and maintainable.
Current Backend Comparison
In the following table you can see a comparison of backends (or rather aggregation browsers). The current backend is sql.browser, which requires a denormalized table as a source. The future preferred backend will be...
1 tag
Data Streaming Basics in Brewery
How to build and run a data analysis stream? Why streams? I am going to talk about how to use Brewery from the command line and from Python scripts.
Brewery is a Python framework and a way of analysing and auditing data. The basic principle is the flow of structured data through processing and analysing nodes. This architecture allows more transparent, understandable and maintainable data streaming...
3 tags
Brewery 0.8 Released
I’m glad to announce a new release of Brewery, the stream-based data auditing and analysis framework for Python.
There are quite a few updates. To mention the notable ones:
new brewery runner with commands run and graph
new nodes: pretty printer node (for your terminal pleasure), generator function node
many CSV updates and fixes
Several simple how-to examples were added, such as:
aggregation...
March 2012
1 post
3 tags
Cubes 0.8 Released
Another minor release of Cubes - Light Weight Python OLAP framework is out. The main change is that the backend is no longer hard-wired in the Slicer server and can be selected through the configuration file.
There were lots of documentation changes; for example, the reference was separated from the rest of the docs and a Hello World! example was added.
The news, changes and fixes are:
New Features
Started...
February 2012
3 posts
3 tags
Cubes goes MIT license with small addition for...
Cubes - The Lightweight Python OLAP Framework is now licensed under the MIT license with a small addition. The full license is as follows:
Copyright (c) 2011-2012 Stefan Urbanek, see AUTHORS for more details
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
associated documentation files (the “Software”), to deal in the Software...
2 tags
Cubes – Python OLAP Framework Architecture
What is inside the Cubes Python OLAP Framework? Here is a brief overview of the core modules, their purpose and functionality.
The lightweight framework Cubes is composed of four public modules:
model - Description of data (metadata): dimensions, hierarchies, attributes, labels, localizations.
browser - Aggregation browsing, slicing-and-dicing, drill-down.
backends - Actual aggregation...
IRC Channel for Brewery and Cubes
I’ve opened an IRC channel on irc.freenode.net for Data Brewery and Cubes: #databrewery. I will be there mostly during the daytime and late evening, CET. Check it out.
December 2011
1 post
4 tags
Cubes 0.7.1 released
I am glad to announce a new minor release of Cubes - Light Weight Python OLAP framework for multidimensional data aggregation and browsing. The news, changes and fixes are:
New Features
New method Dimension.attribute_reference: returns the full reference to an attribute
str(cut) now returns a string representation of the cut in the form used by Slicer
Slicer server:
added /locales...
November 2011
3 posts
4 tags
How-to: hierarchies, levels and drilling-down
In this Cubes OLAP how-to we are going to learn:
how to create a hierarchical dimension
how to do drill-down through a hierarchy
detailed level description
In the previous tutorial we learned how to use model descriptions in a JSON file and how to do physical to logical mappings.
The data used are similar to those in the second tutorial: a manually modified IBRD Balance Sheet taken from The...
2 tags
Cubes Tutorial 2 - Model and Mappings
In the first tutorial we talked about how to construct a model programmatically and how to do basic aggregations.
In this tutorial we are going to learn:
how to use a model description file
why and how to use logical to physical mappings
The data used are the same as in the first tutorial: the IBRD Balance Sheet taken from The World Bank. However, for the purpose of this tutorial, the file was a little bit...
2 tags
Cubes Tutorial 1 - Getting started
In this tutorial you are going to learn how to start with Cubes. The example shows:
how to build a model programmatically
how to create a model with flat dimensions
how to aggregate the whole cube
how to drill down and aggregate through a dimension
The example data used are the IBRD Balance Sheet taken from The World Bank.
Create a tutorial directory and download the file:
curl -O...
October 2011
1 post
3 tags
Book: Star Schema – The Complete Reference →
A well-written book, very understandable even for a beginner, despite being aimed at more advanced specialists. It explains multi-dimensional database design: star schemas, snowflakes, fact tables, dimensions, aggregated data browsing and more.
September 2011
1 post
4 tags
Cubes 0.7 released
I am happy to announce another release of Cubes - Python OLAP framework for multidimensional data aggregation and browsing.
This release, besides some new features, renames Cuboid to the more appropriate Cell. This introduces a backward-incompatible change in the Python API.
The main source repository has moved to GitHub: https://github.com/Stiivi/cubes
Changes
Class ‘Cuboid’ was renamed to more...
June 2011
2 posts
3 tags
Brewery 0.7 Released
A new small release is out with a nice addition of documentation. It does not bring too many new features, but it contains a refactoring towards a better package structure that breaks some compatibility.
Documentation updates
installation instructions with list of optional dependencies
information about fields and metadata
included documentation about data store classes
included text from...
2 tags
Data Cleansing introduction (for BigClean Prague... →
Presentation from BigClean Prague about data cleansing - with Brewery examples.
April 2011
1 post
3 tags
Cubes 0.6 released
A new version of Cubes - Python OLAP framework and server - was released.
Cubes is a framework for:
Online Analytical Processing - OLAP, mostly relational DB based - ROLAP
multidimensional analysis
star and snowflake schema denormalisation
Source: https://bitbucket.org/Stiivi/cubes
Documentation: http://packages.python.org/cubes
Python Package page: http://pypi.python.org/pypi/cubes
Notable...
March 2011
3 posts
3 tags
Forking Forks with Higher Order Messaging
A new way of constructing streams has been implemented, which uses “higher order messaging”. What does that mean? Instead of constructing the stream from nodes and connections, you “call” functions that process your data. In your script you pretend that you work with the data using functions:
...
main.csv_source("data.csv")
main.sample(1000)
main.aggregate(keys =...
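For readers curious how such “calls” can build a stream instead of processing data immediately, here is a minimal sketch in plain Python. It is not Brewery's implementation: the class name StreamFork and its steps attribute are hypothetical, made up for this illustration. The only idea shown is that an unknown method name can be intercepted and recorded as a node to be run later.

```python
class StreamFork:
    """Records processing steps as they are 'called'.

    Hypothetical illustration only, not a Brewery class.
    """

    def __init__(self):
        # Each entry is (node_name, positional_args, keyword_args), in call order.
        self.steps = []

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal attribute lookup fails,
        # so any unknown public name is treated as a node type.
        if name.startswith("_"):
            raise AttributeError(name)

        def add_step(*args, **kwargs):
            # Nothing is processed here; the call is only recorded.
            self.steps.append((name, args, kwargs))
            return self  # returning self allows chained calls

        return add_step


main = StreamFork()
main.csv_source("data.csv")
main.sample(1000)

# The recorded steps could later be turned into nodes and connections.
for name, args, kwargs in main.steps:
    print(name, args, kwargs)
```

The script reads as if functions were applied to the data, while in fact it only describes the stream; running it is a separate step that this sketch leaves out.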
3 tags
Brew data from Scraper Wiki
A new subproject has sprouted in Brewery: Opendata. The new package will contain wrappers for various open data services with APIs for structured data. The first wrapper is for Scraper Wiki. There are two new classes: ScraperWikiDataSource for plain data reading and ScraperWikiSourceNode for stream processing.
Example with ScraperWikiDataSource: Copy data from Scraper Wiki source into a local...
2 tags
Introduction
Freshly brewed clean data with analytical taste – that is what Data Brewery is for. The Python framework will allow you to:
stream structured data from various sources (CSV, XLS, SQL database, Google spreadsheet) to various structured targets
create analytical streams using flow-based programming: connect processing nodes together and let the structured data flow through them
measure data...