May 2012
4 posts
4 tags
Cubes 0.9.1: Ranges, denormalization and query...
The new minor release of Cubes – light-weight Python OLAP framework – brings range cuts, denormalization with the slicer tool and cells in /report query, together with fixes and important changes. See the second part of this post for the full list. Range Cuts Range cuts were implemented in the SQL Star Browser. They are used as follows: Python: cut = RangeCut("date", [2010],...
May 29th
3 tags
Cubes 0.9 Released
The new version of Cubes – light-weight Python OLAP framework – brings the new StarBrowser, which we discussed in previous blog posts: mappings (see also documentation); joins and denormalization; aggregations and new features (see also documentation). The new SQL backend is written from scratch; it is much cleaner, more transparent, configurable and open for future extensions. Also allows direct browsing...
May 14th
2 tags
Star Browser, Part 3: Aggregations and Cell...
Last time I was talking about joins and denormalisation in the Star Browser. This is the last part about the Star Browser, where I will describe aggregation and what has changed compared to the old browser. The Star Browser is a new aggregation browser for Cubes, the lightweight Python OLAP framework. The next version, v0.9, will be released next week. Aggregation sum is not the only...
May 12th
2 tags
Star Browser, Part 2: Joins and Denormalization
Last time I was talking about how logical attributes are mapped to the physical table columns in the Star Browser. Today I will describe how joins are formed and how denormalization is going to be used. The Star Browser is a new aggregation browser for Cubes, the lightweight Python OLAP framework. Star, Snowflake, Master and Detail Star browser supports a star: … and snowflake...
May 1st
1 note
April 2012
4 posts
2 tags
Star Browser, Part 1: Mappings
Star Browser is a new aggregation browser for Cubes, the lightweight Python OLAP framework. I am going to talk briefly about the current state and why a new browser is needed. Then I will describe the new browser in more detail: how mappings work and how tables are joined. At the end I will mention what will be added soon and what is planned for the future. Originally I wanted to write one blog post...
Apr 30th
2 tags
Cubes Backend Progress and Comparison
I’ve been working on a new SQL backend for cubes called StarBrowser. Besides new features and fixes, it is going to be more polished and maintainable. Current Backend Comparison In the following table you can see a comparison of backends (or rather aggregation browsers). The current backend is sql.browser, which requires a denormalized table as a source. Future preferred backend will be...
Apr 29th
1 note
1 tag
Data Streaming Basics in Brewery
How to build and run a data analysis stream? Why streams? I am going to talk about how to use brewery from the command line and from Python scripts. Brewery is a Python framework and a way of analysing and auditing data. The basic principle is a flow of structured data through processing and analysing nodes. This architecture allows more transparent, understandable and maintainable data streaming...
Apr 13th
3 tags
Brewery 0.8 Released
I’m glad to announce a new release of Brewery – stream based data auditing and analysis framework for Python. There are quite a few updates, to mention the notable ones: new brewery runner with commands run and graph; new nodes: pretty printer node (for your terminal pleasure), generator function node; many CSV updates and fixes. Added several simple how-to examples, such as: aggregation...
Apr 4th
March 2012
1 post
3 tags
Cubes 0.8 Released
Another minor release of Cubes - Light Weight Python OLAP framework is out. The main change is that the backend is no longer hard-wired in the Slicer server and can be selected through a configuration file. There were lots of documentation changes; for example, the reference was separated from the rest of the docs. A Hello World! example was added. The news, changes and fixes are: New Features Started...
Mar 9th
February 2012
3 posts
3 tags
Cubes goes MIT license with small addition for...
Cubes - The Lightweight Python OLAP Framework is now licensed under the MIT license with a small addition. The full license is as follows: Copyright (c) 2011-2012 Stefan Urbanek, see AUTHORS for more details Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software...
Feb 23rd
2 tags
Cubes – Python OLAP Framework Architecture
What is inside the Cubes Python OLAP Framework? Here is a brief overview of the core modules, their purpose and functionality. The lightweight framework Cubes is composed of four public modules: model - Description of data (metadata): dimensions, hierarchies, attributes, labels, localizations. browser - Aggregation browsing, slicing-and-dicing, drill-down. backends - Actual aggregation...
Feb 21st
IRC Channel for Brewery and Cubes
I’ve opened an IRC channel on irc.freenode.net for Data Brewery and Cubes: #databrewery. I will be there mostly during CET daytime/late evening time. Check it out.
Feb 14th
December 2011
1 post
4 tags
Cubes 0.7.1 released
I am glad to announce a new minor release of Cubes - Light Weight Python OLAP framework for multidimensional data aggregation and browsing. The news, changes and fixes are: New Features: new method Dimension.attribute_reference, which returns the full reference to an attribute; str(cut) will now return a constructed string representation of a cut as it can be used by Slicer; Slicer server: added /locales...
Dec 5th
15 notes
November 2011
3 posts
4 tags
How-to: hierarchies, levels and drilling-down
In this Cubes OLAP how-to we are going to learn: how to create a hierarchical dimension; how to drill down through a hierarchy; detailed level description. In the previous tutorial we learned how to use model descriptions in a JSON file and how to do physical to logical mappings. The data used are similar to those in the second tutorial: a manually modified IBRD Balance Sheet taken from The...
Nov 28th
9 notes
2 tags
Cubes Tutorial 2 - Model and Mappings
In the first tutorial we talked about how to construct a model programmatically and how to do basic aggregations. In this tutorial we are going to learn: how to use a model description file; why and how to use logical to physical mappings. The data used are the same as in the first tutorial, the IBRD Balance Sheet taken from The World Bank. However, for the purpose of this tutorial, the file was a little bit...
Nov 24th
2 tags
Cubes Tutorial 1 - Getting started
In this tutorial you are going to learn how to start with cubes. The example shows: how to build a model programmatically; how to create a model with flat dimensions; how to aggregate the whole cube; how to drill down and aggregate through a dimension. The example data used are the IBRD Balance Sheet taken from The World Bank. Create a tutorial directory and download the file: curl -O...
Nov 18th
October 2011
1 post
3 tags
Book: Star Schema – The Complete Reference →
A well-written book, very understandable even for a beginner despite being aimed at more advanced specialists. It explains multi-dimensional database design: star schemas, snowflakes, fact tables, dimensions, aggregated data browsing and more.
Oct 3rd
7 notes
September 2011
1 post
4 tags
Cubes 0.7 released
I am happy to announce another release of Cubes - Python OLAP framework for multidimensional data aggregation and browsing. This release, besides some new features, renames Cuboid to the more appropriate Cell. This introduces a backward-incompatible change in the Python API. The main source repository has moved to GitHub: https://github.com/Stiivi/cubes Changes Class ‘Cuboid’ was renamed to more...
Sep 29th
5 notes
June 2011
2 posts
3 tags
Brewery 0.7 Released
A new small release is out with a quite nice addition of documentation. It does not bring too many new features, but contains a refactoring towards a better package structure that breaks some compatibility. Documentation updates: installation instructions with a list of optional dependencies; information about fields and metadata; included documentation about data store classes; included text from...
Jun 25th
5 notes
2 tags
Data Cleansing introduction (for BigClean Prague... →
Presentation from BigClean Prague about data cleansing - with Brewery examples.
Jun 8th
April 2011
1 post
3 tags
Cubes 0.6 released
A new version of Cubes - Python OLAP framework and server - was released. Cubes is a framework for: Online Analytical Processing (OLAP), mostly relational DB based (ROLAP); multidimensional analysis; star and snowflake schema denormalisation. Source: https://bitbucket.org/Stiivi/cubes Documentation: http://packages.python.org/cubes Python Package page: http://pypi.python.org/pypi/cubes Notable...
Apr 25th
3 notes
March 2011
3 posts
3 tags
Forking Forks with Higher Order Messaging
A new way of constructing streams has been implemented which uses “higher order messaging”. What does that mean? Instead of constructing the stream from nodes and connections, you “call” functions that process your data. In your script you pretend that you work with the data directly using functions: ... main.csv_source("data.csv") main.sample(1000) main.aggregate(keys =...
Mar 28th
3 tags
Brew data from Scraper Wiki
A new subproject has sprouted in Brewery: Opendata. The new package will contain wrappers for various open data services with APIs for structured data. The first wrapper is for Scraper Wiki. There are two new classes: ScraperWikiDataSource for plain data reading and ScraperWikiSourceNode for stream processing. Example with ScraperWikiDataSource: Copy data from Scraper Wiki source into a local...
Mar 23rd
2 tags
Introduction
Freshly brewed clean data with analytical taste – that is what Data Brewery is for. The Python framework will allow you to: stream structured data from various sources (CSV, XLS, SQL database, Google spreadsheet) to various structured targets; create analytical streams using flow-based programming: connect processing nodes together and let the structured data flow through them; measure data...
Mar 23rd