<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:dc="http://purl.org/dc/elements/1.1/" version="2.0"><channel><atom:link rel="hub" href="http://tumblr.superfeedr.com/" xmlns:atom="http://www.w3.org/2005/Atom"/><description>analytical data streams &amp; online analytical processing Python frameworks</description><title>Data Brewery</title><generator>Tumblr (3.0; @databrewery)</generator><link>http://blog.databrewery.org/</link><item><title>Cubes 0.9 Released</title><description>&lt;p&gt;The new version of Cubes – light-weight &lt;a href="http://www.python.org/"&gt;Python&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/Online_analytical_processing"&gt;OLAP&lt;/a&gt; framework – brings new StarBrowser, which we discussed in previous blog posts:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/22119118550"&gt;mappings&lt;/a&gt;, see also &lt;a href="http://packages.python.org/cubes/mapping.html"&gt;documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/22214335636"&gt;joins and denormalization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/22904157693"&gt;aggregations and new features&lt;/a&gt;, see also &lt;a href="http://packages.python.org/cubes/aggregate.html"&gt;documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The new &lt;a href="http://packages.python.org/cubes/api/backends.html"&gt;SQL backend&lt;/a&gt; is written from scratch: it is much cleaner, more transparent, configurable and open for future extensions. It also allows direct browsing of a star/snowflake schema without denormalization, so you can use Cubes on top of a read-only database. See &lt;a href="http://packages.python.org/cubes/api/backends.html#cubes.backends.sql.mapper.DenormalizedMapper"&gt;DenormalizedMapper&lt;/a&gt; and &lt;a href="http://packages.python.org/cubes/api/backends.html#cubes.backends.sql.mapper.SnowflakeMapper"&gt;SnowflakeMapper&lt;/a&gt; for more information.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m410f6IT8G1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;Just to name a few new features: &lt;a href="http://packages.python.org/cubes/aggregate.html#aggregations-and-aggregation-result"&gt;multiple aggregated computations&lt;/a&gt; (min, max,&amp;#8230;), &lt;a href="http://packages.python.org/cubes/aggregate.html#cell-details"&gt;cell details&lt;/a&gt;, optional/configurable &lt;a href="http://packages.python.org/cubes/api/backends.html#cubes.backends.sql.star.SQLStarWorkspace.create_denormalized_view"&gt;denormalization&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Important Changes&lt;/h2&gt;

&lt;p&gt;Summary of most important changes that might affect your code:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Slicer&lt;/strong&gt;: Change all your slicer.ini configuration files to have a [workspace]
section instead of the old [db] or [backend]. A deprecation warning is issued, but the old
section names will still work if not changed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model&lt;/strong&gt;: Change &lt;code&gt;dimensions&lt;/code&gt; in &lt;code&gt;model&lt;/code&gt; to be an array instead of a
dictionary. Same with &lt;code&gt;cubes&lt;/code&gt;. Old style: &lt;code&gt;"dimensions": { "date": ... }&lt;/code&gt;,
new style: &lt;code&gt;"dimensions": [ { "name": "date", ... } ]&lt;/code&gt;. The old style will still work if not
changed, just be prepared.&lt;/p&gt;
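
&lt;p&gt;As a rough illustration of the model change, here is a minimal sketch of converting an old dictionary-style &lt;code&gt;dimensions&lt;/code&gt; description into the new list style. The helper &lt;code&gt;dimensions_to_list&lt;/code&gt; is hypothetical and not part of Cubes, and the &lt;code&gt;label&lt;/code&gt; key is only a placeholder for whatever the dimension description contains:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def dimensions_to_list(dimensions):
    """Convert old dict-style dimensions to the new list style.

    Hypothetical helper for illustration only, not part of Cubes.
    """
    if isinstance(dimensions, list):
        # Already in the new style, nothing to do.
        return dimensions

    converted = []
    for name, description in dimensions.items():
        item = dict(description)
        # The name that used to be the dictionary key becomes explicit.
        item.setdefault("name", name)
        converted.append(item)
    return converted


old_style = {"date": {"label": "Date"}}
new_style = dimensions_to_list(old_style)
&lt;/pre&gt;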

&lt;p&gt;&lt;strong&gt;Python&lt;/strong&gt;: Use Dimension.hierarchy() instead of Dimension.default_hierarchy.&lt;/p&gt;

&lt;h2&gt;New Features&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;slicer_context() - new method that holds all relevant information from the
configuration. It can be reused when creating tools that work in a connected
database environment&lt;/li&gt;
&lt;li&gt;added Hierarchy.all_attributes() and .key_attributes()&lt;/li&gt;
&lt;li&gt;Cell.rollup_dim() - rolls up a single dimension to a specified level. This might
later replace the Cell.rollup() method&lt;/li&gt;
&lt;li&gt;Cell.drilldown() - drills down the cell&lt;/li&gt;
&lt;li&gt;create_workspace(backend,model, **options) - new top-level method for creating a workspace by specifying backend name. Easier to create browsers (from
possible browser pool) programmatically. The backend name might be full
module name path or relative to the cubes.backends, for example
&lt;code&gt;sql.star&lt;/code&gt; for new or &lt;code&gt;sql.browser&lt;/code&gt; for old SQL browser.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;get_backend() - get backend by name&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AggregationBrowser.cell_details(): New method returning values of attributes
representing the cell. Preliminary implementation, return value might
change.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AggregationBrowser.cut_details(): New method returning values of attributes
representing a single cut. Preliminary implementation, return value might
change.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Dimension.validate() now checks whether there are duplicate attributes&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;Cube.validate() now checks whether there are duplicate measures or details&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;SQL backend:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;new &lt;a href="http://packages.python.org/cubes/api/backends.html#cubes.backends.sql.star.StarBrowser"&gt;StarBrowser&lt;/a&gt; implemented:

&lt;ul&gt;&lt;li&gt;StarBrowser supports snowflakes or denormalization (optional)&lt;/li&gt;
&lt;li&gt;for snowflake browsing no write permission is required (does not have to
be denormalized)&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;new &lt;a href="http://packages.python.org/cubes/api/backends.html#cubes.backends.sql.mapper.DenormalizedMapper"&gt;DenormalizedMapper&lt;/a&gt; for mapping logical model to denormalized view&lt;/li&gt;
&lt;li&gt;new &lt;a href="http://packages.python.org/cubes/api/backends.html#cubes.backends.sql.mapper.SnowflakeMapper"&gt;SnowflakeMapper&lt;/a&gt; for mapping logical model to a snowflake schema&lt;/li&gt;
&lt;li&gt;ddl_for_model() - get schema DDL as string for model&lt;/li&gt;
&lt;li&gt;join finder and attribute mapper are now just Mapper - class responsible for
finding appropriate joins and doing logical-to-physical mappings&lt;/li&gt;
&lt;li&gt;coalesce_attribute() - new method for coalescing multiple ways of describing
a physical attribute (just attribute or table+schema+attribute)&lt;/li&gt;
&lt;li&gt;dimension argument was removed from all methods working with attributes
(the dimension is now required attribute property)&lt;/li&gt;
&lt;li&gt;added create_denormalized_view() with options: materialize, create_index,
keys_only&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Slicer tool/server:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;slicer ddl - generate schema DDL from model&lt;/li&gt;
&lt;li&gt;slicer test - test configuration and model against database and report list 
of issues, if any&lt;/li&gt;
&lt;li&gt;Backend options are now in [workspace]; configurability of a custom
backend section was removed. Warnings are issued when the old section names [db] and
[backend] are used&lt;/li&gt;
&lt;li&gt;server responds to /details which is a result of
AggregationBrowser.cell_details()&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Examples:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;added simple Flask based web example - dimension aggregation browser&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Changes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;in Model: dimension and cube dictionary specification during model
initialization is deprecated, a list should be used (with explicitly
mentioned attribute &amp;#8220;name&amp;#8221;) - &lt;strong&gt;important&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;important&lt;/strong&gt;: Now all attribute references in the model (dimension
attributes, measures, &amp;#8230;) are required to be instances of Attribute() and
the attribute knows its dimension&lt;/li&gt;
&lt;li&gt;removed &lt;code&gt;hierarchy&lt;/code&gt; argument from &lt;code&gt;Dimension.all_attributes()&lt;/code&gt; and &lt;code&gt;.key_attributes()&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;renamed builder to denormalizer&lt;/li&gt;
&lt;li&gt;Dimension.default_hierarchy is now deprecated in favor of
Dimension.hierarchy(), which now accepts no arguments or the argument None and
returns the default hierarchy in those two cases&lt;/li&gt;
&lt;li&gt;metadata are now reused for each browser within one workspace - speed
improvement.&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Fixes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;Slicer version should be same version as Cubes: Original intention was to
have separate server, therefore it had its own versioning. Now there is no
reason for separate version, moreover it can introduce confusion.&lt;/li&gt;
&lt;li&gt;Proper use of database schema in the Mapper&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Links&lt;/h2&gt;

&lt;p&gt;Sources can be found on &lt;a href="https://github.com/Stiivi/cubes"&gt;github&lt;/a&gt;.
Read the &lt;a href="http://packages.python.org/cubes/"&gt;documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Join the &lt;a href="http://groups.google.com/group/cubes-discuss"&gt;Google Group&lt;/a&gt; for discussion, problem solving and announcements.&lt;/p&gt;

&lt;p&gt;Submit issues and suggestions &lt;a href="https://github.com/Stiivi/cubes/issues"&gt;on github&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;IRC channel &lt;a href="irc://irc.freenode.net/#databrewery"&gt;#databrewery&lt;/a&gt; on irc.freenode.net&lt;/p&gt;

&lt;p&gt;If you have any questions, comments, requests, do not hesitate to ask.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/23049114551</link><guid>http://blog.databrewery.org/post/23049114551</guid><pubDate>Mon, 14 May 2012 20:55:23 +0200</pubDate><category>cubes</category><category>announcement</category><category>release</category></item><item><title>Star Browser, Part 3: Aggregations and Cell Details</title><description>&lt;p&gt;Last time I was talking about &lt;a href="http://blog.databrewery.org/post/22214335636"&gt;joins and
denormalisation&lt;/a&gt; in the Star
Browser.  This is the last part about the star browser where I will describe the aggregation and what has changed, compared to the old browser.&lt;/p&gt;

&lt;p&gt;The Star Browser is a new aggregation browser for Cubes – lightweight
Python OLAP Framework. The next version, v0.9, will be released next week.&lt;/p&gt;

&lt;h1&gt;Aggregation&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;sum&lt;/em&gt; is not the only aggregation. The new browser also supports other
aggregate functions, such as &lt;em&gt;min&lt;/em&gt; and &lt;em&gt;max&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;You can specify the aggregations for each measure separately:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
{
    "name": "amount",
    "aggregations": ["sum", "min", "max"]
}
&lt;/pre&gt;

&lt;p&gt;The resulting aggregated attribute name will be constructed from the measure
name and aggregation suffix, for example the mentioned &lt;em&gt;amount&lt;/em&gt; will have
three aggregates in the result: &lt;code&gt;amount_sum&lt;/code&gt;, &lt;code&gt;amount_min&lt;/code&gt; and &lt;code&gt;amount_max&lt;/code&gt;.&lt;/p&gt;
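
&lt;p&gt;The naming rule can be sketched in a few lines. This is only an illustration, not the code of &lt;code&gt;StarQueryBuilder.aggregations_for_measure&lt;/code&gt;; the function name and the fallback to &lt;code&gt;sum&lt;/code&gt; when no aggregations are listed are assumptions:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def aggregated_names(measure):
    """Return result attribute names for a measure description.

    Illustrative sketch: measure name + "_" + aggregation suffix.
    Falling back to "sum" is an assumption, not confirmed behaviour.
    """
    name = measure["name"]
    aggregations = measure.get("aggregations", ["sum"])
    return [name + "_" + aggregation for aggregation in aggregations]


measure = {"name": "amount", "aggregations": ["sum", "min", "max"]}
names = aggregated_names(measure)
&lt;/pre&gt;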

&lt;p&gt;Source code reference: &lt;em&gt;see StarQueryBuilder.aggregations_for_measure&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Aggregation Result&lt;/h2&gt;

&lt;p&gt;The result of aggregation is a structure containing: &lt;code&gt;summary&lt;/code&gt; - summary for the
aggregated cell, &lt;code&gt;drilldown&lt;/code&gt; - drill-down cells, if requested, and
&lt;code&gt;total_cell_count&lt;/code&gt; - total number of cells in the drill-down, regardless of pagination.&lt;/p&gt;

&lt;h2&gt;Cell Details&lt;/h2&gt;

&lt;p&gt;When we are browsing the cube, the cell provides the current browsing context. For
aggregations and selections to happen, only keys and some other internal
attributes are necessary. Those are not suitable for presenting to the user, though. For
example we have a geography path (&lt;code&gt;country&lt;/code&gt;, &lt;code&gt;region&lt;/code&gt;) as &lt;code&gt;['sk', 'ba']&lt;/code&gt;,
however we want to display to the user &lt;code&gt;Slovakia&lt;/code&gt; for the country and
&lt;code&gt;Bratislava&lt;/code&gt; for the region. We need to fetch those values from the data
store. Cell details are basically a human-readable description of the current
cell.&lt;/p&gt;

&lt;p&gt;For applications where it is possible to store state between aggregation
calls, we can use values from previous aggregations or value listings. The problem
is with web applications: sometimes it is not desirable or possible to store the
whole browsing context with all details. This is exactly the situation where
fetching cell details explicitly might come in handy.&lt;/p&gt;

&lt;p&gt;Note: The original browser added cut information to the summary, which was ok
when only point cuts were used. In other situations the result was undefined
and mostly erroneous.&lt;/p&gt;

&lt;p&gt;The cell details are now provided separately by method
&lt;code&gt;AggregationBrowser.cell_details(cell)&lt;/code&gt; which has Slicer HTTP equivalent
&lt;code&gt;/details&lt;/code&gt; or &lt;code&gt;{"query":"detail", ...}&lt;/code&gt; in &lt;code&gt;/report&lt;/code&gt; request. The result is
a list of&lt;/p&gt;

&lt;p&gt;For point cuts, the detail is a list of dictionaries for each level. For
example our previously mentioned path &lt;code&gt;['sk', 'ba']&lt;/code&gt; would have details
described as:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
[
    {
        "geography.country_code": "sk",
        "geography.country_name": "Slovakia",
        "geography.something_more": "...",
        "_key": "sk",
        "_label": "Slovakia"
    },
    {
        "geography.region_code": "ba",
        "geography.region_name": "Bratislava",
        "geography.something_even_more": "...",
        "_key": "ba",
        "_label": "Bratislava"
    }
]
&lt;/pre&gt;

&lt;p&gt;You might have noticed the two redundant keys: &lt;code&gt;_key&lt;/code&gt; and &lt;code&gt;_label&lt;/code&gt; - those
contain values of a level key attribute and level label attribute
respectively. They are there to simplify the use of the details in the presentation
layer, such as templates. Take for example doing only one-dimensional
browsing and compare presentation of &amp;#8220;breadcrumbs&amp;#8221;:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
labels = [detail["_label"] for detail in cut_details]
&lt;/pre&gt;

&lt;p&gt;Which is equivalent to:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
levels = dimension.hierarchy.levels()
labels = []
for i, detail in enumerate(cut_details):
    labels.append(detail[levels[i].label_attribute.full_name()])
&lt;/pre&gt;

&lt;p&gt;Note that this might change a bit: either full detail will be returned or just
key and label, depending on an option argument (not yet decided).&lt;/p&gt;

&lt;h2&gt;Pre-aggregation&lt;/h2&gt;

&lt;p&gt;The Star Browser is being created with SQL pre-aggregation in mind. This is
not possible in the old browser, as it is not flexible enough. It is planned
to be integrated when all basic features are finished.&lt;/p&gt;

&lt;p&gt;Proposed access from user&amp;#8217;s perspective will be through configuration options:
&lt;code&gt;use_preaggregation&lt;/code&gt;, &lt;code&gt;preaggregation_prefix&lt;/code&gt;, &lt;code&gt;preaggregation_schema&lt;/code&gt; and
a method for cube pre-aggregation will be available through the slicer tool.&lt;/p&gt;

&lt;h1&gt;Summary&lt;/h1&gt;

&lt;p&gt;The new browser has a better internal structure, resulting in increased
flexibility for future extensions. It fixes some poor architectural
decisions of the old browser.&lt;/p&gt;

&lt;p&gt;New and fixed features:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;direct star/snowflake schema browsing&lt;/li&gt;
&lt;li&gt;improved mappings - more transparent and understandable process&lt;/li&gt;
&lt;li&gt;ability to explicitly specify database schemas&lt;/li&gt;
&lt;li&gt;multiple aggregations&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The new backend sources are
&lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/star.py"&gt;here&lt;/a&gt;
and the mapper is
&lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/mapper.py"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;To do&lt;/h2&gt;

&lt;p&gt;To be done in the near future:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;DDL generator for denormalized schema, corresponding logical schema and
physical schema&lt;/li&gt;
&lt;li&gt;explicit list of attributes to be selected (instead of all)&lt;/li&gt;
&lt;li&gt;selection of aggregations per-request (now all specified in model are used)&lt;/li&gt;
&lt;/ul&gt;&lt;h1&gt;Links&lt;/h1&gt;

&lt;p&gt;See also &lt;a href="https://github.com/Stiivi/cubes"&gt;Cubes at github&lt;/a&gt;,
&lt;a href="http://packages.python.org/cubes/"&gt;Cubes Documentation&lt;/a&gt;,
&lt;a href="http://groups.google.com/group/cubes-discuss/"&gt;Mailing List&lt;/a&gt;
and &lt;a href="https://github.com/Stiivi/cubes/issues"&gt;Submit issues&lt;/a&gt;. Also there is an 
IRC channel &lt;a href="irc://irc.freenode.net/#databrewery"&gt;#databrewery&lt;/a&gt; on
irc.freenode.net&lt;/p&gt;</description><link>http://blog.databrewery.org/post/22904157693</link><guid>http://blog.databrewery.org/post/22904157693</guid><pubDate>Sat, 12 May 2012 17:04:00 +0200</pubDate><category>cubes</category><category>olap</category></item><item><title>Star Browser, Part 2: Joins and Denormalization</title><description>&lt;p&gt;Last time I was talking about how &lt;a href="http://blog.databrewery.org/post/22119118550"&gt;logical attributes are mapped to the
physical table columns&lt;/a&gt; in the
Star Browser. Today I will describe how joins are formed and how
denormalization is going to be used.&lt;/p&gt;

&lt;p&gt;The Star Browser is a new aggregation browser for
&lt;a href="https://github.com/Stiivi/cubes"&gt;Cubes&lt;/a&gt; – lightweight Python OLAP Framework.&lt;/p&gt;

&lt;h1&gt;Star, Snowflake, Master and Detail&lt;/h1&gt;

&lt;p&gt;Star browser supports a star:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3ajfbXcHo1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;&amp;#8230; and snowflake database schema:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3ajfn8QYt1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;The browser should know how to construct the star/snowflake and that is why
you have to specify the joins of the schema. The join specification is very
simple:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"joins": [
    { "master": "fact_sales.product_id", "detail": "dim_product.id" }
]
&lt;/pre&gt;

&lt;p&gt;Joins support only single-column keys, therefore you might have to create
surrogate keys for your dimensions.&lt;/p&gt;

&lt;p&gt;As in mappings, if you have specific needs for explicitly mentioning database
schema or any other reason where &lt;code&gt;table.column&lt;/code&gt; reference is not enough, you
might write:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"joins": [
    {
        "master": "fact_sales.product_id",
        "detail": {
            "schema": "sales",
            "table": "dim_products",
            "column": "id"
        }
    }
]
&lt;/pre&gt;

&lt;p&gt;What if you need to join the same table twice? For example, you have a list of
organizations and you want to use it as both supplier and service consumer.
It can be done by specifying an alias in the joins:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"joins": [
    {
        "master": "contracts.supplier_id", 
        "detail": "organisations.id",
        "alias": "suppliers"
    },
    {
        "master": "contracts.consumer_id", 
        "detail": "organisations.id",
        "alias": "consumers"
    }
]
&lt;/pre&gt;

&lt;p&gt;In the mappings you refer to the table by alias specified in the joins, not by
real table name:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"mappings": {
    "supplier.name": "suppliers.org_name",
    "consumer.name": "consumers.org_name"
}
&lt;/pre&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3ajian3sA1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;h2&gt;Relevant Joins and Denormalization&lt;/h2&gt;

&lt;p&gt;The new mapper joins only tables that are relevant for given query. That is,
if you are browsing by only one dimension, say &lt;em&gt;product&lt;/em&gt;, then only product
dimension table is joined.&lt;/p&gt;
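
&lt;p&gt;The idea of joining only the relevant tables can be sketched as follows. This is a simplified illustration and not the actual &lt;em&gt;Mapper&lt;/em&gt; code: the function &lt;code&gt;relevant_joins&lt;/code&gt; and the &lt;code&gt;dim_date&lt;/code&gt; table are hypothetical, only joins with a plain &lt;code&gt;table.column&lt;/code&gt; detail are handled, and snowflake chains (where one detail table depends on another) are not followed:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def relevant_joins(joins, tables):
    """Return only the joins whose detail table (or alias) is needed.

    Hypothetical sketch: `tables` is the set of table names or aliases
    referenced by the attributes used in the query.
    """
    selected = []
    for join in joins:
        # An alias, if present, is the name the mappings refer to.
        detail_table = join.get("alias") or join["detail"].split(".")[0]
        if detail_table in tables:
            selected.append(join)
    return selected


joins = [
    {"master": "fact_sales.product_id", "detail": "dim_product.id"},
    {"master": "fact_sales.date_id", "detail": "dim_date.id"},
]

# Browsing only by product: the date dimension table is not joined.
needed = relevant_joins(joins, {"dim_product"})
&lt;/pre&gt;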

&lt;p&gt;Joins are slow and expensive, so denormalization can be
helpful:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3ajglKwV11qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;The old browser is based purely on the denormalized view. Despite the
performance gain, it has several disadvantages. From the
join/performance perspective the major one is that denormalization is
required, so it is not possible to browse data in a database that is
&amp;#8220;read-only&amp;#8221;. This requirement was also an unnecessary step for beginners,
which can be considered a usability problem.&lt;/p&gt;

&lt;p&gt;The current implementation of the &lt;em&gt;Mapper&lt;/em&gt; and &lt;em&gt;StarBrowser&lt;/em&gt; allows
denormalization to be integrated in such a way that it can be used depending on
needs and situation:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3d4ctMm6K1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;It is not yet there and this is what needs to be done:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;function for denormalization - similar to the old one: will take cube and
view name and will create denormalized view (or a table)&lt;/li&gt;
&lt;li&gt;make mapper accept the view and ignore joins&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Goal is not just to slap denormalization in, but to make it a configurable
alternative to default star browsing. From user&amp;#8217;s perspective, the workflow
will be:&lt;/p&gt;

&lt;ol&gt;&lt;li&gt;browse star/snowflake until need for denormalization arises&lt;/li&gt;
&lt;li&gt;configure denormalization and create denormalized view&lt;/li&gt;
&lt;li&gt;browse the denormalized view&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;The proposed options are: &lt;code&gt;use_denormalization&lt;/code&gt;, &lt;code&gt;denormalized_view_prefix&lt;/code&gt;,
&lt;code&gt;denormalized_view_schema&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The Star Browser is half-ready for denormalization, just a few changes are
needed in the mapper and maybe the query builder. These changes have to be
compatible with another, not-yet-included feature: SQL pre-aggregation.&lt;/p&gt;

&lt;h1&gt;Conclusion&lt;/h1&gt;

&lt;p&gt;The new way of joining is very similar to the old one, but has much
cleaner code and is separated from mappings. It is also more transparent. A new
feature is the ability to specify a database schema. A planned feature is
automatic join detection based on foreign keys.&lt;/p&gt;

&lt;p&gt;In the next post (the last post in this series) about the new &lt;em&gt;StarBrowser&lt;/em&gt;, I am going to
explain &lt;a href="http://blog.databrewery.org/post/22904157693"&gt;aggregation improvements and changes&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;Links&lt;/h1&gt;

&lt;p&gt;Relevant source code is &lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/mapper.py"&gt;this one&lt;/a&gt; (github).&lt;/p&gt;

&lt;p&gt;See also &lt;a href="https://github.com/Stiivi/cubes"&gt;Cubes at github&lt;/a&gt;,
&lt;a href="http://packages.python.org/cubes/"&gt;Cubes Documentation&lt;/a&gt;,
&lt;a href="http://groups.google.com/group/cubes-discuss/"&gt;Mailing List&lt;/a&gt;
and &lt;a href="https://github.com/Stiivi/cubes/issues"&gt;Submit issues&lt;/a&gt;. Also there is an 
IRC channel &lt;a href="irc://irc.freenode.net/#databrewery"&gt;#databrewery&lt;/a&gt; on
irc.freenode.net&lt;/p&gt;</description><link>http://blog.databrewery.org/post/22214335636</link><guid>http://blog.databrewery.org/post/22214335636</guid><pubDate>Tue, 01 May 2012 23:20:00 +0200</pubDate><category>cubes</category><category>olap</category></item><item><title>Star Browser, Part 1: Mappings</title><description>&lt;p&gt;Star Browser is a new aggregation browser for
&lt;a href="https://github.com/Stiivi/cubes"&gt;Cubes&lt;/a&gt; – lightweight Python OLAP Framework.
I am going to talk briefly about the current state and why a new browser is needed.
Then I will describe the new browser in more detail: how mappings work, how
tables are joined. At the end I will mention what will be added soon and what
is planned for the future.&lt;/p&gt;

&lt;p&gt;Originally I wanted to write one blog post about this, but it was too long, so
I am going to split it into three:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;mappings (this one)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/22214335636"&gt;joins and denormalization&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/22904157693"&gt;aggregations and new features&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;h1&gt;Why new browser?&lt;/h1&gt;

&lt;p&gt;Current &lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/browser.py"&gt;denormalized
browser&lt;/a&gt;
is good, but not good enough. Firstly, it has grown into a spaghetti-like
structure inside and adding new features is quite difficult. Secondly, it is
not immediately clear what is going on inside, and it is not only new users who
run into trouble. For example, the mapping of logical to physical attributes is not
obvious; denormalization is forced, which pays off in the end, but
puzzles OLAP newbies.&lt;/p&gt;

&lt;p&gt;The new browser, called
&lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/star.py"&gt;StarBrowser&lt;/a&gt;,
is half-ready and will replace many of the old decisions with better ones.&lt;/p&gt;

&lt;h1&gt;Mapping&lt;/h1&gt;

&lt;p&gt;Cubes provides an analyst&amp;#8217;s view of dimensions and their attributes by hiding
the physical representation of data. One of the most important parts of proper
OLAP on top of the relational database is the mapping of physical attributes
to logical.&lt;/p&gt;

&lt;p&gt;First thing that was implemented in the new browser is proper mapping of
logical attributes to physical table columns. For example, take a reference to
an attribute &lt;em&gt;name&lt;/em&gt; in a dimension &lt;em&gt;product&lt;/em&gt;. What is the column of what table
in which schema that contains the value of this dimension attribute?&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3ajdppDAa1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;There are two ways how the mapping is being done: implicit and explicit. The
simplest, straightforward and most customizable is the explicit way, where the
actual column reference is provided in the model description:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"mappings": {
    "product.name": "dm_products.product_name"
}
&lt;/pre&gt;

&lt;p&gt;If it is in different schema or any part of the reference contains a dot:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"mappings": {
    "product.name": {
            "schema": "sales",
            "table": "dm_products",
            "column": "product_name"
        }
}
&lt;/pre&gt;
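
&lt;p&gt;Both forms of a physical reference can be reduced to the same (schema, table, column) triple. The sketch below illustrates that idea; it is not the Cubes implementation, and the function name and the &lt;code&gt;default_schema&lt;/code&gt; parameter are assumptions made for this example:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def coalesce_reference(reference, default_schema=None):
    """Normalize a physical reference to a (schema, table, column) tuple.

    Hypothetical sketch: accepts either a "table.column" string or
    a dictionary with "table", "column" and optional "schema" keys.
    """
    if isinstance(reference, dict):
        schema = reference.get("schema", default_schema)
        return (schema, reference["table"], reference["column"])

    # The string form cannot express names containing a dot; the
    # unpacking raises ValueError if there are not exactly two parts.
    table, column = reference.split(".")
    return (default_schema, table, column)


simple = coalesce_reference("dm_products.product_name")
full = coalesce_reference({
    "schema": "sales",
    "table": "dm_products",
    "column": "product_name",
})
&lt;/pre&gt;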

&lt;p&gt;The disadvantage of the explicit way is its verbosity and the fact that the developer
has to write more metadata.&lt;/p&gt;

&lt;p&gt;Both explicit and implicit mappings allow you to specify a default database
schema (if you are using Oracle, PostgreSQL or any other DB which supports
schemas).&lt;/p&gt;

&lt;p&gt;The mapping process is like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3akrsmX9b1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;h2&gt;Implicit Mapping&lt;/h2&gt;

&lt;p&gt;With implicit mapping one can match a database schema with logical model and
does not have to specify additional mapping metadata. Expected structure is
star schema with one table per (denormalized) dimension.&lt;/p&gt;

&lt;p&gt;Basic rules:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;fact table should have same name as represented cube&lt;/li&gt;
&lt;li&gt;dimension table should have same name as the represented dimension, for
example: &lt;code&gt;product&lt;/code&gt; (singular)&lt;/li&gt;
&lt;li&gt;references without dimension name in them are expected to be in the fact
table, for example: &lt;code&gt;amount&lt;/code&gt;, &lt;code&gt;discount&lt;/code&gt; (see note below for simple flat
dimensions)&lt;/li&gt;
&lt;li&gt;column name should have same name as dimension attribute: &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;code&lt;/code&gt;,
&lt;code&gt;description&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;if attribute is localized, then there should be one column per localization
and should have locale suffix: &lt;code&gt;description_en&lt;/code&gt;, &lt;code&gt;description_sk&lt;/code&gt;,
&lt;code&gt;description_fr&lt;/code&gt; (see below for more information)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;This means that by default &lt;code&gt;product.name&lt;/code&gt; is mapped to the table &lt;code&gt;product&lt;/code&gt;
and column &lt;code&gt;name&lt;/code&gt;. Measure &lt;code&gt;amount&lt;/code&gt; is mapped to the table &lt;code&gt;sales&lt;/code&gt; and column
&lt;code&gt;amount&lt;/code&gt;.&lt;/p&gt;
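
&lt;p&gt;The basic implicit rules can be sketched like this. It is only an illustration, not the mapper source: the function and its parameter names are hypothetical, the optional table prefixes are included just to show where they would apply, and localization is left out:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def implicit_physical(reference, cube_name,
                      dimension_prefix="", fact_prefix=""):
    """Map a logical reference to a (table, column) pair implicitly.

    Hypothetical sketch of the basic rules only.
    """
    if "." in reference:
        # "product.name": table named after the dimension,
        # column named after the attribute.
        dimension, attribute = reference.split(".", 1)
        return (dimension_prefix + dimension, attribute)

    # No dimension in the reference: look it up in the fact table,
    # which is named after the cube.
    return (fact_prefix + cube_name, reference)


name_column = implicit_physical("product.name", "sales")
amount_column = implicit_physical("amount", "sales")
&lt;/pre&gt;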

&lt;p&gt;What about dimensions that have only one attribute, for example when you do not have a
full date but just a &lt;code&gt;year&lt;/code&gt;? In this case it is kept in the fact table without
the need for a separate dimension table. The attribute is treated by the same rule
as a measure and is referenced simply as &lt;code&gt;year&lt;/code&gt;. This applies to all
dimensions that have only one attribute (representing the key as well). Such a
dimension is referred to as &lt;em&gt;flat and without details&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Note for advanced users: this behavior can be disabled by setting
&lt;code&gt;simplify_dimension_references&lt;/code&gt; to &lt;code&gt;False&lt;/code&gt; in the mapper. In that case you
will have to have separate table for the dimension attribute and you will have
to reference the attribute by full name. This might be useful when you know
that your dimension will be more detailed.&lt;/p&gt;

&lt;h2&gt;Localization&lt;/h2&gt;

&lt;p&gt;Although localization takes place first in the mapping process, we discuss
it last, as it is probably a less commonly used feature. From the physical
point of view, data localization is simple: it requires language
denormalization, which means that each language has to have its own column for
each attribute.&lt;/p&gt;

&lt;p&gt;In the logical model, some of the attributes may contain list of locales that
are provided for the attribute. For example product category can be in
English, Slovak or German. It is specified in the model like this:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"attributes": [{
    "name": "category",
    "locales": ["en", "sk", "de"]
}]
&lt;/pre&gt;

&lt;p&gt;During the mapping process, localized logical reference is created first:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3aksf89Zb1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;In short: if attribute is localizable and locale is requested, then locale
suffix is added. If no such localization exists then default locale is used.
Nothing happens to non-localizable attributes.&lt;/p&gt;
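
&lt;p&gt;A minimal sketch of that rule follows. The function name is hypothetical, and treating the first listed locale as the default one is an assumption made for this example, not something the mapper is confirmed to do:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def localized_reference(reference, locales, locale=None):
    """Return a logical reference with a locale suffix where applicable.

    Hypothetical sketch. Assumption: the first listed locale is
    treated as the default one.
    """
    if not locales:
        # Non-localizable attribute: nothing happens.
        return reference
    if locale not in locales:
        # No locale requested or no such localization: use the default.
        locale = locales[0]
    return reference + "." + locale


locales = ["en", "sk", "de"]
slovak = localized_reference("product.category", locales, "sk")
fallback = localized_reference("product.category", locales, "fr")
plain = localized_reference("product.name", [], "sk")
&lt;/pre&gt;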

&lt;p&gt;For such an attribute, three columns should exist in the physical model. There
are two ways to name the columns. With implicit mapping, they should have the attribute name
with a locale suffix, such as &lt;code&gt;category_sk&lt;/code&gt; and &lt;code&gt;category_en&lt;/code&gt; (&lt;em&gt;underscore&lt;/em&gt;
because it is more common in table column names).
Alternatively, you can name the columns as you like, but then you have to provide an explicit mapping
in the mapping dictionary. The key for the localized logical attribute should
have a &lt;code&gt;.locale&lt;/code&gt; suffix, such as &lt;code&gt;product.category.sk&lt;/code&gt; for the Slovak version of the
category attribute of dimension product. Here the &lt;em&gt;dot&lt;/em&gt; is used because dots
separate logical reference parts.&lt;/p&gt;

&lt;h2&gt;Customization of the Implicit&lt;/h2&gt;

&lt;p&gt;The implicit mapping process can be customized a little as well:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;em&gt;dimension table prefix&lt;/em&gt;: you can specify a prefix to be used for all
dimension tables. For example, if the prefix is &lt;code&gt;dim_&lt;/code&gt; and the attribute is
&lt;code&gt;product.name&lt;/code&gt;, then the table is going to be &lt;code&gt;dim_product&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;fact table prefix&lt;/em&gt;: used for constructing the fact table name from the cube name.
For example, with the prefix &lt;code&gt;ft_&lt;/code&gt; all fact attributes of the cube &lt;code&gt;sales&lt;/code&gt; are going
to be looked up in the table &lt;code&gt;ft_sales&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;fact table name&lt;/em&gt;: you can explicitly specify the fact table name for each cube
separately.&lt;/li&gt;
&lt;/ul&gt;&lt;h1&gt;The Big Picture&lt;/h1&gt;

&lt;p&gt;Here is the whole mapping schema, after localization:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m3akttdCmK1qgmvbu.png" alt=""/&gt;&lt;/p&gt;
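
&lt;p&gt;As a rough companion to the diagram, the implicit table lookup with the
prefix options described earlier might look like the sketch below. The function
&lt;code&gt;implicit_table&lt;/code&gt; and its parameter names are hypothetical, and the sketch
assumes that a reference without a dimension part (such as &lt;code&gt;amount&lt;/code&gt;) is a
fact attribute.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def implicit_table(reference, cube, dimension_prefix="", fact_prefix="",
                   fact_name=None):
    """Return the table in which a logical reference is looked up."""
    parts = reference.split(".")
    if len(parts) == 1:
        # Assumption: no dimension part means a fact attribute.
        # An explicit fact table name wins over the prefix.
        return fact_name or fact_prefix + cube
    # Dimension attribute: the table is named after the dimension
    return dimension_prefix + parts[0]


print(implicit_table("product.name", "sales", dimension_prefix="dim_"))  # dim_product
print(implicit_table("amount", "sales", fact_prefix="ft_"))              # ft_sales
&lt;/pre&gt;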

&lt;h1&gt;Links&lt;/h1&gt;

&lt;p&gt;The commented mapper source is
&lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/mapper.py"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/Stiivi/cubes"&gt;github sources&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://packages.python.org/cubes/"&gt;Documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://groups.google.com/group/cubes-discuss/"&gt;Mailing List&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Stiivi/cubes/issues"&gt;Submit issues&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;IRC channel &lt;a href="irc://irc.freenode.net/#databrewery"&gt;#databrewery&lt;/a&gt; on irc.freenode.net&lt;/li&gt;
&lt;/ul&gt;</description><link>http://blog.databrewery.org/post/22119118550</link><guid>http://blog.databrewery.org/post/22119118550</guid><pubDate>Mon, 30 Apr 2012 14:28:00 +0200</pubDate><category>cubes</category><category>sql</category></item><item><title>Cubes Backend Progress and Comparison</title><description>&lt;p&gt;I&amp;#8217;ve been working on a new SQL backend for cubes called StarBrowser. Besides
new features and fixes, it is going to be more polished and maintainable.&lt;/p&gt;

&lt;h1&gt;Current Backend Comparison&lt;/h1&gt;

&lt;p&gt;The following table compares the backends (or rather the
aggregation browsers). The current backend is &lt;code&gt;sql.browser&lt;/code&gt;, which requires
a denormalized table as a source. The future preferred backend will be &lt;code&gt;sql.star&lt;/code&gt;.&lt;/p&gt;

&lt;iframe width="500" height="300" frameborder="0" src="https://docs.google.com/spreadsheet/pub?key=0AsH8n-1Zd5PadGxpVDAxdDhVNHdrUFZkT0pJR2JZamc&amp;amp;single=true&amp;amp;gid=0&amp;amp;range=a1%3Ag26&amp;amp;output=html&amp;amp;widget=true"&gt;&lt;/iframe&gt;

&lt;p&gt;&lt;a href="https://docs.google.com/spreadsheet/ccc?key=0AsH8n-1Zd5PadGxpVDAxdDhVNHdrUFZkT0pJR2JZamc"&gt;Document link&lt;/a&gt; at Google Docs.&lt;/p&gt;

&lt;h1&gt;Star Browser state&lt;/h1&gt;

&lt;p&gt;A more detailed description, with schemas and an explanation of what is
happening behind the scenes, will be published once the browser is usable for
most of the important features (that is, no sooner than drill-down is
implemented). Here is a peek at the new browser features:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;separate attribute mapper - does the logical-to-physical mapping, or in
other words: knows which column in which table represents which dimension
attribute or measure&lt;/li&gt;
&lt;li&gt;more intelligent join building - uses only joins that are relevant to the
retrieved attributes, does not join the whole star/snowflake if not necessary&lt;/li&gt;
&lt;li&gt;allows tables to be stored in different database schemas (previously
everything had to be in one schema)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;There is still some work to be done, including drill-down and ordering
of results.&lt;/p&gt;
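
&lt;p&gt;The idea of joining only what is needed, mentioned in the feature list above,
can be illustrated with a small sketch. It is not the StarBrowser code: the
function &lt;code&gt;relevant_joins&lt;/code&gt;, the table names and the representation of joins
as &lt;em&gt;(master, detail)&lt;/em&gt; pairs are assumptions made for this example.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def relevant_joins(joins, fact_table, tables):
    """Select only the joins needed to reach the given tables.

    joins  - list of (master, detail) table name pairs (hypothetical format)
    tables - tables that hold the attributes being retrieved
    """
    by_detail = {detail: (master, detail) for master, detail in joins}
    selected = []
    for table in tables:
        # Walk from the needed table back towards the fact table
        while table != fact_table:
            join = by_detail[table]
            if join not in selected:
                selected.append(join)
            table = join[0]
    return selected


joins = [
    ("ft_sales", "dim_date"),
    ("ft_sales", "dim_product"),
    ("dim_product", "dim_category"),  # snowflake: dimension joined to dimension
]

# Only the product branch is joined; dim_date is left out
print(relevant_joins(joins, "ft_sales", ["dim_category"]))
&lt;/pre&gt;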

&lt;p&gt;You can try a limited feature set of the browser by using the &lt;code&gt;sql.star&lt;/code&gt; backend
name. Do not expect much at this time; however, if you find a bug, I would be
glad if you reported it through &lt;a href="https://github.com/Stiivi/cubes/issues"&gt;github
issues&lt;/a&gt;. The source is in
&lt;code&gt;cubes/backends/sql/star.py&lt;/code&gt; and &lt;code&gt;cubes/backends/sql/common.py&lt;/code&gt; (or
&lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/star.py"&gt;here&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;New and improved&lt;/h2&gt;

&lt;p&gt;Here is a list of features you can expect (not yet fully implemented, if
started at all):&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;more SQL aggregation types and a way to specify which aggregations
should be used by default for each measure&lt;/li&gt;
&lt;li&gt;DDL schema generator for: denormalized table, logical model - star schema,
physical model&lt;/li&gt;
&lt;li&gt;model tester - tests whether all attributes and joins are valid in the
physical model&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Also the new implementation of star browser will allow easier integration of
 pre-aggregated store (planned) and various other optimisations.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/22070217246</link><guid>http://blog.databrewery.org/post/22070217246</guid><pubDate>Sun, 29 Apr 2012 22:02:06 +0200</pubDate><category>cubes</category><category>sql</category></item><item><title>Data Streaming Basics in Brewery</title><description>&lt;p&gt;How to build and run a data analysis stream? Why streams? I am going to talk about
how to use brewery from command line and from Python scripts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/Stiivi/brewery"&gt;Brewery&lt;/a&gt; is a Python framework and a way of analysing and auditing data. The basic
principle is a flow of structured data through processing and analysing nodes.
This architecture makes the data streaming process more transparent,
understandable and maintainable.&lt;/p&gt;

&lt;p&gt;You might want to use brewery when you:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;want to learn more about data&lt;/li&gt;
&lt;li&gt;encounter unknown datasets and/or you do not know what you have in your
datasets&lt;/li&gt;
&lt;li&gt;do not know exactly how to process your data and you want to play-around
without getting lost&lt;/li&gt;
&lt;li&gt;want to create alternative analysis paths and compare them&lt;/li&gt;
&lt;li&gt;want to measure data quality and feed data quality results into the data processing
process&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;There are many approaches to data analysis. Brewery brings a certain workflow to the analyst:&lt;/p&gt;

&lt;ol&gt;&lt;li&gt;examine data&lt;/li&gt;
&lt;li&gt;prototype a stream (you can use data sampling so as not to overheat the machine)&lt;/li&gt;
&lt;li&gt;see results and refine the stream, create alternatives (at the same time)&lt;/li&gt;
&lt;li&gt;repeat step 3 until satisfied&lt;/li&gt;
&lt;/ol&gt;&lt;p&gt;Brewery makes steps 2 and 3 easy: quick prototyping, alternative
branching, comparison. It tries to keep the analyst&amp;#8217;s workflow clean and understandable.&lt;/p&gt;

&lt;h1&gt;Building and Running a Stream&lt;/h1&gt;

&lt;p&gt;There are two ways to create a stream: programmatically in Python, and from the
command line without any Python knowledge. Each way has two alternatives: one
is quick and simple but has a limited feature set, the other is full-featured
but more verbose.&lt;/p&gt;

&lt;p&gt;The two programmatic alternatives to create a stream are &lt;em&gt;basic construction&lt;/em&gt;
and &lt;em&gt;&amp;#8220;HOM&amp;#8221;&lt;/em&gt; or &lt;em&gt;forking construction&lt;/em&gt;. The two command line ways to run a
stream are &lt;em&gt;run&lt;/em&gt; and &lt;em&gt;pipe&lt;/em&gt;. We are now going to look at them more closely.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m2f46vi6Po1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;Note regarding the Zen of Python: this does not go against &amp;#8220;There should be one –
and preferably only one – obvious way to do it.&amp;#8221; There is only one way: the
raw (basic) construction. The others are higher-level ways or ways for different
environments.&lt;/p&gt;

&lt;p&gt;In the examples below we are going to demonstrate a simple linear (no branching)
stream that reads a CSV file, performs a very basic audit and &amp;#8220;pretty prints&amp;#8221;
the result. The stream looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m2f49jBpOK1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;h2&gt;Command line&lt;/h2&gt;

&lt;p&gt;Brewery comes with a command line utility &lt;code&gt;brewery&lt;/code&gt; which can run streams
without you having to write a single line of Python code. Again there are two
ways of describing a stream: JSON-based and a plain linear pipe.&lt;/p&gt;

&lt;p&gt;The simple way is the &lt;code&gt;brewery pipe&lt;/code&gt; command:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;brewery pipe csv_source resource=data.csv audit pretty_printer
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The &lt;code&gt;pipe&lt;/code&gt; command expects a list of nodes and &lt;code&gt;attribute=value&lt;/code&gt; pairs for node
configuration. If no source node is specified, CSV on standard input is
used. If no target node is specified, CSV on standard output is assumed:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;cat data.csv | brewery pipe audit
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The actual stream with implicit nodes is:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_m2f47oLuwZ1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;json&lt;/code&gt; way is more verbose but is full-featured: you can create complex
processing streams with many branches. &lt;code&gt;stream.json&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    {
        "nodes": { 
            "source": { "type":"csv_source", "resource": "data.csv" },
            "audit":  { "type":"audit" },
            "target": { "type":"pretty_printer" }
        },
        "connections": [
            ["source", "audit"],
            ["audit", "target"]
        ]
    }
&lt;/pre&gt;

&lt;p&gt;And run:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ brewery run stream.json
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;To list all available nodes do:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ brewery nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;To get more information about a node, run &lt;code&gt;brewery nodes &amp;lt;node_name&amp;gt;&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ brewery nodes string_strip
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note that data streaming from the command line is more limited than the Python
way. You might not get access to nodes and node features that require the Python
language, such as Python storage type nodes or functions.&lt;/p&gt;

&lt;h2&gt;Higher order messaging&lt;/h2&gt;

&lt;p&gt;The preferred programmatic way of creating streams is through &lt;em&gt;higher order
messaging&lt;/em&gt; (HOM), which is, in this case, just a fancy name for pretending to do
something while in fact we are only preparing the stream.&lt;/p&gt;

&lt;p&gt;This way of creating a stream is more readable and maintainable. It is easier
to insert nodes into the stream and create forks without losing the picture of the
stream. It might not be suitable for very complex streams, though. Here is an
example:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    import brewery

    b = brewery.create_builder()
    b.csv_source("data.csv")
    b.audit()
    b.pretty_printer()
&lt;/pre&gt;

&lt;p&gt;When this piece of code is executed, nothing actually happens to the data
stream. The stream is just being prepared and you can run it anytime:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    b.stream.run()
&lt;/pre&gt;

&lt;p&gt;What actually happens? The builder &lt;code&gt;b&lt;/code&gt; is a mostly empty object that accepts
almost any method call and then tries to find a node that corresponds to the method
called. The node is instantiated, added to the stream and connected to the
previous node.&lt;/p&gt;
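
&lt;p&gt;To make the mechanism less mysterious, here is a toy builder written from
scratch. It is not Brewery&amp;#8217;s implementation: the class
&lt;code&gt;SketchBuilder&lt;/code&gt; and its small node catalogue are hypothetical, and instead
of real node objects it only records node type names. It shows how Python&amp;#8217;s
&lt;code&gt;__getattr__&lt;/code&gt; hook can turn an unknown method call into a new, connected
node.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
class SketchBuilder:
    """Toy stand-in for the builder; not Brewery's actual implementation."""

    # Hypothetical node catalogue: names that may be called as methods
    node_types = {"csv_source", "audit", "pretty_printer"}

    def __init__(self):
        self.nodes = []        # (node type, args, kwargs) in order of creation
        self.connections = []  # (source index, target index) pairs
        self.last = None       # index of the most recently added node

    def __getattr__(self, name):
        # Python calls this only for attributes that do not exist, e.g. b.audit
        if name not in self.node_types:
            raise AttributeError("no node type named %s" % name)

        def add_node(*args, **kwargs):
            # "Instantiate" the node and connect it to the previous one
            self.nodes.append((name, args, kwargs))
            index = len(self.nodes) - 1
            if self.last is not None:
                self.connections.append((self.last, index))
            self.last = index
            return self

        return add_node


b = SketchBuilder()
b.csv_source("data.csv")
b.audit()
b.pretty_printer()
print(b.connections)  # [(0, 1), (1, 2)]
&lt;/pre&gt;

&lt;p&gt;Nothing is processed while the methods are called; the object only collects
the description of the stream, which is the point of this construction style.&lt;/p&gt;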

&lt;p&gt;You can also create branched stream:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    import brewery

    b = brewery.create_builder()
    b.csv_source("data.csv")
    b.audit()

    f = b.fork()
    f.csv_target("audit.csv")

    b.pretty_printer()
&lt;/pre&gt;

&lt;h2&gt;Basic Construction&lt;/h2&gt;

&lt;p&gt;This is the lowest-level way of creating a stream and allows full
customisation and control of the stream. In the &lt;em&gt;basic construction&lt;/em&gt; method
the programmer prepares all node instance objects and connects them
explicitly, node by node. It might be too verbose; however, it is meant to be used by
applications that construct streams either through a user interface or
from some stream descriptions. All other methods use this one.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    from brewery import Stream
    from brewery.nodes import CSVSourceNode, AuditNode, PrettyPrinterNode

    stream = Stream()

    # Create pre-configured node instances
    src = CSVSourceNode("data.csv")
    stream.add(src)

    audit = AuditNode()
    stream.add(audit)

    printer = PrettyPrinterNode()
    stream.add(printer)

    # Connect nodes: source -&amp;gt; target
    stream.connect(src, audit)
    stream.connect(audit, printer)

    stream.run()
&lt;/pre&gt;

&lt;p&gt;It is also possible to pass the nodes as a dictionary and the connections as a list of tuples
&lt;em&gt;(source, target)&lt;/em&gt;:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    stream = Stream(nodes, connections)
&lt;/pre&gt;

&lt;h1&gt;Future plans&lt;/h1&gt;

&lt;p&gt;What would be lovely to have in brewery?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Probing and data quality indicators&lt;/strong&gt; – tools for simple data probing and
an easy way of creating data quality indicators. This will allow something like
&amp;#8220;test-driven-development&amp;#8221; but for data. This is the next step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stream optimisation&lt;/strong&gt; – merge multiple nodes into single processing unit
before running the stream. Might be done in near future.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backend-based nodes and related data transfer between backend nodes&lt;/strong&gt; – For
example, two SQL nodes might pass data through a database table instead of
built-in data pipe or two numpy/scipy-based nodes might use numpy/scipy
structure to pass data and avoid unnecessary streaming. Not very soon, but in the
foreseeable future.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stream compilation&lt;/strong&gt; – compile a stream to an optimised script. Not too
soon, but I would like to have that one.&lt;/p&gt;

&lt;p&gt;Last, but not least: currently there is a small performance cost because of the
nature of the brewery implementation. This penalty will be explained in another
blog post; to make a long story short, it has to do with threads, the Python
GIL and a non-optimized stream graph. There is no prediction for when this will
be addressed, as improvements might be included step by step. Also, some Python 3 features look
promising, such as &lt;code&gt;yield from&lt;/code&gt; in Python 3.3 (&lt;a href="http://www.python.org/dev/peps/pep-0380/"&gt;PEP 380&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://github.com/Stiivi/brewery"&gt;Brewery at github&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://packages.python.org/brewery/"&gt;Documentation&lt;/a&gt; and &lt;a href="http://packages.python.org/brewery/node_reference.html"&gt;Node Reference&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Stiivi/brewery/tree/master/examples"&gt;Examples at github&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://groups.google.com/forum/?fromgroups#!forum/databrewery"&gt;Google Group&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><link>http://blog.databrewery.org/post/21021110882</link><guid>http://blog.databrewery.org/post/21021110882</guid><pubDate>Fri, 13 Apr 2012 14:43:26 +0200</pubDate><category>brewery</category></item><item><title>Brewery 0.8 Released</title><description>&lt;p&gt;I&amp;#8217;m glad to announce new release of &lt;a href="https://github.com/Stiivi/brewery"&gt;Brewery&lt;/a&gt; – stream based data auditing and analysis framework for Python.&lt;/p&gt;

&lt;p&gt;There are quite a few updates, to mention the notable ones:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;new &lt;code&gt;brewery&lt;/code&gt; &lt;a href="http://packages.python.org/brewery/tools.html#brewery"&gt;runner&lt;/a&gt; with commands &lt;code&gt;run&lt;/code&gt; and &lt;code&gt;graph&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;new nodes: &lt;em&gt;pretty printer&lt;/em&gt; node (for your terminal pleasure), &lt;em&gt;generator
function&lt;/em&gt; node&lt;/li&gt;
&lt;li&gt;many CSV updates and fixes&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Added several simple &lt;a href="https://github.com/Stiivi/brewery/tree/master/examples"&gt;how-to
examples&lt;/a&gt;, such as:
aggregation of remote CSV, basic audit of a CSV, how to use a generator
function. Feedback and questions are welcome. I&amp;#8217;ll help you.&lt;/p&gt;

&lt;p&gt;Note that there are a couple of changes that break compatibility; however, existing code can
be updated very easily. I apologize for the inconvenience, but until 1.0 such
changes might happen more frequently. On the other hand, I will try to make
them as painless as possible.&lt;/p&gt;

&lt;p&gt;Full listing of news, changes and fixes is below.&lt;/p&gt;

&lt;h1&gt;Version 0.8&lt;/h1&gt;

&lt;h2&gt;News&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;Changed license to MIT&lt;/li&gt;
&lt;li&gt;Created new brewery runner commands: &amp;#8216;run&amp;#8217; and &amp;#8216;graph&amp;#8217;:

&lt;ul&gt;&lt;li&gt;&amp;#8216;brewery run stream.json&amp;#8217; will execute the stream&lt;/li&gt;
&lt;li&gt;&amp;#8216;brewery graph stream.json&amp;#8217; will generate graphviz data&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Nodes: Added pretty printer node - textual output as a formatted table&lt;/li&gt;
&lt;li&gt;Nodes: Added source node for a generator function&lt;/li&gt;
&lt;li&gt;Nodes: added analytical type to derive field node&lt;/li&gt;
&lt;li&gt;Preliminary implementation of data probes (just a concept, the API is not
fully decided yet)&lt;/li&gt;
&lt;li&gt;CSV: added empty_as_null option to read empty strings as Null values&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Nodes can be configured with node.configure(dictionary, protected). If 
&amp;#8216;protected&amp;#8217; is True, then protected attributes (specified in node info) can 
not be set with this method.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;added node identifier to the node reference doc&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;added create_logger&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;added experimental retype feature (works for CSV only at the moment)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;Mongo Backend - better handling of record iteration&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Changes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;CSV: resource is now an explicitly named argument in CSV*Node&lt;/li&gt;
&lt;li&gt;CSV: convert fields according to field storage type (instead of all-strings)&lt;/li&gt;
&lt;li&gt;Removed fields getter/setter (now implementation is totally up to stream
subclass)&lt;/li&gt;
&lt;li&gt;AggregateNode: rename &lt;code&gt;aggregates&lt;/code&gt; to &lt;code&gt;measures&lt;/code&gt;, added &lt;code&gt;measures&lt;/code&gt; as
public node attribute&lt;/li&gt;
&lt;li&gt;moved errors to brewery.common&lt;/li&gt;
&lt;li&gt;removed &lt;code&gt;field_name()&lt;/code&gt;, now str(field) should be used&lt;/li&gt;
&lt;li&gt;use named logger &amp;#8216;brewery&amp;#8217; instead of the global one&lt;/li&gt;
&lt;li&gt;better debug-log labels for nodes (node type identifier + python object ID)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;WARNING:&lt;/strong&gt; Compatibility break:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;deprecate &lt;code&gt;__node_info__&lt;/code&gt; and use plain &lt;code&gt;node_info&lt;/code&gt; instead&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Stream.update()&lt;/code&gt; now takes nodes and connections as two separate arguments&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Fixes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;added SQLSourceNode, added option to keep fields instead of dropping them in
FieldMap and FieldMapNode (patch by laurentvasseur @ bitbucket)&lt;/li&gt;
&lt;li&gt;better traceback handling on node failure (now actually the traceback is
displayed)&lt;/li&gt;
&lt;li&gt;return list of field names as string representation of FieldList&lt;/li&gt;
&lt;li&gt;CSV: fixed output of zero numeric value in CSV (was empty string)&lt;/li&gt;
&lt;/ul&gt;&lt;h1&gt;Links&lt;/h1&gt;

&lt;ul&gt;&lt;li&gt;github  &lt;strong&gt;sources&lt;/strong&gt;: &lt;a href="https://github.com/Stiivi/brewery"&gt;https://github.com/Stiivi/brewery&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Documentation&lt;/strong&gt;: &lt;a href="http://packages.python.org/brewery/"&gt;http://packages.python.org/brewery/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mailing List&lt;/strong&gt;: &lt;a href="http://groups.google.com/group/databrewery/"&gt;http://groups.google.com/group/databrewery/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Submit &lt;strong&gt;issues&lt;/strong&gt; here: &lt;a href="https://github.com/Stiivi/brewery/issues"&gt;https://github.com/Stiivi/brewery/issues&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;IRC channel: &lt;a href="irc://irc.freenode.net/#databrewery"&gt;#databrewery&lt;/a&gt; on irc.freenode.net&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;If you have any questions, comments, requests, do not hesitate to ask.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/20467943307</link><guid>http://blog.databrewery.org/post/20467943307</guid><pubDate>Wed, 04 Apr 2012 16:57:22 +0200</pubDate><category>brewery</category><category>release</category><category>announcement</category></item><item><title>Cubes 0.8 Released</title><description>&lt;p&gt;Another minor release of Cubes - Light Weight Python OLAP framework is out. Main change is that backend is no longer hard-wired in the &lt;a href="http://packages.python.org/cubes/server.html"&gt;Slicer server&lt;/a&gt; and can be selected through configuration file.&lt;/p&gt;

&lt;p&gt;There were lots of documentation changes, for example &lt;a href="http://packages.python.org/cubes/api/index.html"&gt;the reference&lt;/a&gt; was separated from the rest of docs. &lt;a href="https://github.com/Stiivi/cubes/tree/master/examples/hello_world"&gt;Hello World! example&lt;/a&gt; was added.&lt;/p&gt;

&lt;p&gt;The news, changes and fixes are:&lt;/p&gt;

&lt;h2&gt;New Features&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;Started writing &lt;a href="https://github.com/Stiivi/cubes/blob/master/cubes/backends/sql/star_browser.py"&gt;StarBrowser&lt;/a&gt; - another SQL aggregation browser with different 
approach (see code/docs)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;a href="http://packages.python.org/cubes/server.html#configuration"&gt;Slicer Server&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;added configuration option &lt;code&gt;modules&lt;/code&gt; under &lt;code&gt;[server]&lt;/code&gt; to load additional 
modules&lt;/li&gt;
&lt;li&gt;added ability to specify backend module&lt;/li&gt;
&lt;li&gt;backend configuration is in [backend] by default, for SQL it stays in [db]&lt;/li&gt;
&lt;li&gt;added server config option for default &lt;code&gt;prettyprint&lt;/code&gt; value (useful for
demonstration purposes)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;a href="http://packages.python.org/cubes"&gt;Documentation&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/18142294411"&gt;Changed license&lt;/a&gt; to MIT + small addition. Please refer to the LICENSE file.&lt;/li&gt;
&lt;li&gt;Updated documentation - added missing parts, made reference more readable, 
moved class and function reference docs from descriptive part to reference 
(API) part.&lt;/li&gt;
&lt;li&gt;added backend documentation &lt;/li&gt;
&lt;li&gt;Added &amp;#8220;&lt;a href="https://github.com/Stiivi/cubes/tree/master/examples/hello_world"&gt;Hello World!&lt;/a&gt;&amp;#8221; example&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Changed Features&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;removed default SQL backend from the server&lt;/li&gt;
&lt;li&gt;moved workspace creation into the backend module&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Fixes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;Fixed create_view to handle not materialized properly (thanks to deytao)&lt;/li&gt;
&lt;li&gt;Slicer tool header now contains #!/usr/bin/env python&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;github  &lt;strong&gt;sources&lt;/strong&gt;: &lt;a href="https://github.com/Stiivi/cubes"&gt;https://github.com/Stiivi/cubes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Documentation&lt;/strong&gt;: &lt;a href="http://packages.python.org/cubes/"&gt;http://packages.python.org/cubes/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mailing List&lt;/strong&gt;: &lt;a href="http://groups.google.com/group/cubes-discuss"&gt;http://groups.google.com/group/cubes-discuss&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Submit &lt;strong&gt;issues&lt;/strong&gt; here: &lt;a href="https://github.com/Stiivi/cubes/issues"&gt;https://github.com/Stiivi/cubes/issues&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;IRC channel: &lt;a href="irc://irc.freenode.net/#databrewery"&gt;#databrewery&lt;/a&gt; on irc.freenode.net&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;If you have any questions, comments, requests, do not hesitate to ask.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/19000202563</link><guid>http://blog.databrewery.org/post/19000202563</guid><pubDate>Fri, 09 Mar 2012 14:18:02 +0100</pubDate><category>cubes</category><category>announcement</category><category>olap</category></item><item><title>Cubes goes MIT license with small addition for SaaS</title><description>&lt;p&gt;Cubes - The Lightweight Python OLAP Framework is now licensed under the MIT license with small addition. The full license is as follows:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Copyright (c) 2011-2012 Stefan Urbanek, see &lt;a href="https://github.com/Stiivi/cubes/blob/master/AUTHORS"&gt;AUTHORS&lt;/a&gt; for more details&lt;/p&gt;
  
  &lt;p&gt;Permission is hereby granted, free of charge, to any person obtaining a copy of this software and
  associated documentation files (the &amp;#8220;Software&amp;#8221;), to deal in the Software without restriction, including
  without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
  copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to
  the following conditions:&lt;/p&gt;
  
  &lt;ul&gt;&lt;li&gt;&lt;p&gt;The above copyright notice and this permission notice shall be included in all copies or substantial
  portions of the Software.&lt;/p&gt;&lt;/li&gt;
  &lt;li&gt;&lt;p&gt;If your version of the Software supports interaction with it remotely through a computer network, the
  above copyright notice and this permission notice shall be accessible to all users.&lt;/p&gt;&lt;/li&gt;
  &lt;/ul&gt;&lt;p&gt;THE SOFTWARE IS PROVIDED &amp;#8220;AS IS&amp;#8221;, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT
  LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN
  NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
  WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
  SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The addition says that if you use it as part of software as a service (SaaS), you have to provide the copyright notice in an about, legal info, credits or some similar kind of page or info box. That&amp;#8217;s all.&lt;/p&gt;

&lt;p&gt;May it be like that? :-)&lt;/p&gt;

&lt;p&gt;Updated Cubes sources &lt;a href="https://github.com/Stiivi/cubes"&gt;are here&lt;/a&gt;, as usual.&lt;/p&gt;

&lt;p&gt;Enjoy.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/18142294411</link><guid>http://blog.databrewery.org/post/18142294411</guid><pubDate>Thu, 23 Feb 2012 21:04:16 +0100</pubDate><category>cubes</category><category>opensource</category><category>legal</category></item><item><title>Cubes – Python OLAP Framework Architecture</title><description>&lt;p&gt;What is inside the Cubes Python OLAP Framework? Here is a brief overview of the core modules, their purpose and functionality.&lt;/p&gt;

&lt;p&gt;The lightweight framework Cubes is composed of four public modules:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lzr33cGIx41qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;em&gt;model&lt;/em&gt; - Description of data (&lt;em&gt;metadata&lt;/em&gt;): dimensions, hierarchies, attributes, labels, localizations.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;browser&lt;/em&gt; - Aggregation browsing, slicing-and-dicing, drill-down.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;backends&lt;/em&gt; - Actual aggregation implementation and utility functions.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;server&lt;/em&gt; - WSGI HTTP server for Cubes&lt;/li&gt;
&lt;/ul&gt;&lt;h1&gt;Model&lt;/h1&gt;

&lt;p&gt;The logical model describes the data from the user’s or analyst’s perspective: how the data are measured, aggregated and reported. The model is independent of the physical implementation of the data. This physical independence makes it easier to focus on the data instead of on ways to get the data in an understandable form.&lt;/p&gt;

&lt;p&gt;Cubes model is described by:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lzr33wXsWd1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;model object (&lt;a href="http://packages.python.org/cubes/model.html#cubes.model.Model"&gt;doc&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;list of cubes&lt;/li&gt;
&lt;li&gt;dimensions of cubes (they are shared by all cubes within the model) (&lt;a href="http://packages.python.org/cubes/api/cubes.html#cubes.Dimension"&gt;doc&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;hierarchies (&lt;a href="http://packages.python.org/cubes/api/cubes.html#cubes.Hierarchy"&gt;doc&lt;/a&gt;) and hierarchy levels (&lt;a href="http://packages.python.org/cubes/api/cubes.html#cubes.Level"&gt;doc&lt;/a&gt;) of dimensions (such as &lt;em&gt;category-subcategory&lt;/em&gt;, &lt;em&gt;country-region-city&lt;/em&gt;)&lt;/li&gt;
&lt;li&gt;optional mappings from logical model to the physical model (&lt;a href="http://packages.python.org/cubes/model.html#attribute-mappings"&gt;doc&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;optional join specifications for star schemas, used by the SQL denormalizing backend (&lt;a href="http://packages.python.org/cubes/model.html#joins"&gt;doc&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;There is a utility function provided for loading the model from a JSON file: &lt;code&gt;load_model&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The model module objects are capable of being localized (see &lt;a href="http://packages.python.org/cubes/localization.html"&gt;Model Localization&lt;/a&gt; for more information). Cubes provides localization at the metadata level (the model) and functionality to have localization at the data level.&lt;/p&gt;

&lt;p&gt;See also: &lt;a href="http://packages.python.org/cubes/model.html"&gt;Model Documentation&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;Browser&lt;/h1&gt;

&lt;p&gt;The core of the Cubes analytics functionality is the aggregation browser. The &lt;code&gt;browser&lt;/code&gt; module contains utility classes and functions for the browser to work.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lzr34qlXN11qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;The module components are:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Cell&lt;/strong&gt; – specification of the portion of the cube to be explored, sliced or drilled down. Each cell is specified by a set of cuts. A cell without any cuts represents the whole cube.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Cut&lt;/strong&gt; – definition of where the cell is going to be sliced through a single dimension. There are three types of cuts: point, range and set.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The types of cuts:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Point Cut&lt;/strong&gt; – Defines a single point on a dimension where the cube is going to be sliced. The point might be at any level of the hierarchy and is specified by a &amp;#8220;path&amp;#8221;. Examples of point cuts: &lt;code&gt;[2010]&lt;/code&gt; for the &lt;em&gt;year&lt;/em&gt; level of the Date dimension, &lt;code&gt;[2010,1,7]&lt;/code&gt; for a full date point.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Range Cut&lt;/strong&gt; – Defines two points (dimension paths) on a sortable dimension between which the cell is going to be sliced from the cube.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Set Cut&lt;/strong&gt; – Defines a list of multiple points (dimension paths) which are going to be included in the sliced cell.&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Example of the effect of a point cut:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lzr35pNwxo1qgmvbu.png" alt=""/&gt;&lt;/p&gt;
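
&lt;p&gt;The three cut types can also be pictured as predicates over dimension paths. The following is a minimal sketch in plain Python, not the Cubes implementation: the helper names &lt;code&gt;point_cut&lt;/code&gt;, &lt;code&gt;range_cut&lt;/code&gt; and &lt;code&gt;set_cut&lt;/code&gt; and the sample facts are made up for illustration, and paths are assumed to be plain lists.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
# Illustrative sketch only; these helpers are not part of Cubes.

def point_cut(path):
    """Match every path that starts with the given path."""
    return lambda p: p[:len(path)] == path

def range_cut(from_path, to_path):
    """Match paths lying between two points of a sortable dimension."""
    def match(p):
        lower = p[:len(from_path)]
        upper = p[:len(to_path)]
        # min/max express "from_path is not after p" and
        # "p is not after to_path" on the compared prefixes.
        return (min(from_path, lower) == from_path
                and max(upper, to_path) == to_path)
    return match

def set_cut(paths):
    """Match every path that starts with any of the given paths."""
    return lambda p: any(p[:len(path)] == path for path in paths)

# Hypothetical facts with a date path [year, month, day]
facts = [
    {"date": [2009, 12, 31], "amount": 10},
    {"date": [2010, 1, 7], "amount": 20},
    {"date": [2010, 3, 1], "amount": 30},
    {"date": [2011, 1, 1], "amount": 40},
]

def total(cut):
    """Sum the amounts of all facts selected by the cut."""
    return sum(f["amount"] for f in facts if cut(f["date"]))

year_2010 = total(point_cut([2010]))
one_day = total(point_cut([2010, 1, 7]))
first_quarter = total(range_cut([2010, 1], [2010, 3]))
picked_years = total(set_cut([[2009], [2011]]))
&lt;/pre&gt;

&lt;p&gt;A point cut at a higher level (&lt;code&gt;[2010]&lt;/code&gt;) simply selects everything below it, which is why a path prefix comparison is enough in this sketch.&lt;/p&gt;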

&lt;p&gt;The module provides a couple of utility functions:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;code&gt;path_from_string&lt;/code&gt; - construct a dimension path (point) from a string&lt;/li&gt;
&lt;li&gt;&lt;code&gt;string_from_path&lt;/code&gt; - get a string representation of a dimension path (point)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;string_from_cuts&lt;/code&gt; and &lt;code&gt;cuts_from_string&lt;/code&gt; convert between a string and a list of cuts. (Currently only lists of point cuts are supported in the string representation.)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The aggregation browser can:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;aggregate a cell (&lt;code&gt;aggregate(cell)&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;drill-down through multiple dimensions and aggregate (&lt;code&gt;aggregate(cell, drilldown="date")&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;get all detailed facts within the cell (&lt;code&gt;facts(cell)&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;get a single fact (&lt;code&gt;fact(id)&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;There is also a convenience function &lt;code&gt;report(cell, report)&lt;/code&gt;, which a backend can implement in a more efficient way to run multiple aggregation queries in a single call.&lt;/p&gt;
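
&lt;p&gt;To make the browser operations more concrete, here is a toy in-memory version of them. It is only a sketch under simplifying assumptions, not the Cubes browser: the facts are a list of dictionaries, a cell is a dictionary mapping a dimension name to one required value, and the result keys &lt;code&gt;record_count&lt;/code&gt; and &lt;code&gt;amount_sum&lt;/code&gt; as well as the sample data are chosen for illustration. A real backend does this work in the database.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
# Toy stand-in for an aggregation browser; not the Cubes API.
FACTS = [
    {"id": 1, "year": 2009, "item": "Loans", "amount": 100},
    {"id": 2, "year": 2010, "item": "Loans", "amount": 150},
    {"id": 3, "year": 2010, "item": "Investments", "amount": 50},
]

def facts(cell):
    """Return all facts inside the cell; an empty cell is the whole cube."""
    return [f for f in FACTS
            if all(f[dim] == value for dim, value in cell.items())]

def fact(fact_id):
    """Return a single fact by its id, or None if there is no such fact."""
    for f in FACTS:
        if f["id"] == fact_id:
            return f
    return None

def summarize(rows):
    """Compute the aggregated measures for a list of facts."""
    return {"record_count": len(rows),
            "amount_sum": sum(r["amount"] for r in rows)}

def aggregate(cell, drilldown=None):
    """Aggregate the cell, optionally broken down by one dimension."""
    rows = facts(cell)
    result = {"summary": summarize(rows)}
    if drilldown is not None:
        groups = {}
        for row in rows:
            groups.setdefault(row[drilldown], []).append(row)
        result["drilldown"] = dict((key, summarize(group))
                                   for key, group in groups.items())
    return result

whole_cube = aggregate({})
by_item_2010 = aggregate({"year": 2010}, drilldown="item")
&lt;/pre&gt;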

&lt;p&gt;More about aggregated browsing can be found in the &lt;a href="http://packages.python.org/cubes/api/cubes.html#aggregate-browsing"&gt;Cubes documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;Backends&lt;/h1&gt;

&lt;p&gt;The actual aggregation is provided by the backends. A backend should implement the aggregation browser interface.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lzr37ayQWJ1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;Cubes comes with a built-in &lt;a href="http://en.wikipedia.org/wiki/ROLAP"&gt;ROLAP&lt;/a&gt; backend which uses an SQL database through SQLAlchemy. The backend has two major components:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;em&gt;aggregation browser&lt;/em&gt; that works on a single denormalized view or table&lt;/li&gt;
&lt;li&gt;&lt;em&gt;SQL denormalizer&lt;/em&gt; helper class that converts a &lt;a href="http://en.wikipedia.org/wiki/Star_schema"&gt;star schema&lt;/a&gt; into a denormalized view or table (a kind of materialisation).&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;There was an attempt to write a &lt;a href="https://github.com/Stiivi/cubes/tree/master/cubes/backends/mongo"&gt;Mongo DB backend&lt;/a&gt;, but it does not work any more. It is included in the sources only as a reminder that there should be a Mongo backend sometime in the future.&lt;/p&gt;

&lt;p&gt;Anyone can write a backend. If you are interested, drop me a line.&lt;/p&gt;

&lt;h1&gt;Server&lt;/h1&gt;

&lt;p&gt;Cubes comes with Slicer - a WSGI HTTP OLAP server with an API for most of the Cubes framework functionality. The server is based on the Werkzeug framework.&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lzr37v5B6G1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;The intended use of Slicer is basically as follows:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;the application prepares the cell to be aggregated, drilled, listed&amp;#8230; The &lt;em&gt;cell&lt;/em&gt; might be the whole cube.&lt;/li&gt;
&lt;li&gt;an HTTP request is sent to the server&lt;/li&gt;
&lt;li&gt;the server uses the appropriate aggregation browser backend (note that currently there is only one: SQL denormalized) to compute the request&lt;/li&gt;
&lt;li&gt;Slicer returns a JSON reply to the application&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;For more information, please refer to the Cubes &lt;a href="http://packages.python.org/cubes/server.html"&gt;Slicer server documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;One more thing&amp;#8230;&lt;/h1&gt;

&lt;p&gt;There are plenty of things to improve, of course. The current focus is not on performance, but on simple usability.&lt;/p&gt;

&lt;p&gt;The Cubes sources can be found on GitHub: &lt;a href="https://github.com/stiivi/cubes"&gt;https://github.com/stiivi/cubes&lt;/a&gt;. There is also an IRC channel, #databrewery, on irc.freenode.net (I try to be there during late evening CET). Issues can be reported on the &lt;a href="https://github.com/stiivi/cubes/issues?sort=created&amp;amp;direction=desc&amp;amp;state=open"&gt;GitHub project page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you have any questions, suggestions, recommendations, just let me know.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://news.ycombinator.com/item?id=3617672"&gt;&lt;em&gt;HackerNews Thread&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;</description><link>http://blog.databrewery.org/post/18013047462</link><guid>http://blog.databrewery.org/post/18013047462</guid><pubDate>Tue, 21 Feb 2012 17:13:00 +0100</pubDate><category>cubes</category><category>olap</category></item><item><title>IRC Channel for Brewery and Cubes</title><description>&lt;p&gt;I&amp;#8217;ve opened an IRC channel on irc.freenode.net for Data Brewery and Cubes: #databrewery. I will be there mostly during the daytime and late evening (CET). Check it out.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/17601253059</link><guid>http://blog.databrewery.org/post/17601253059</guid><pubDate>Tue, 14 Feb 2012 10:15:59 +0100</pubDate></item><item><title>Cubes 0.7.1 released</title><description>&lt;p&gt;I am glad to announce a new minor release of Cubes, a lightweight Python OLAP framework for multidimensional data aggregation and browsing. The news, changes and fixes are:&lt;/p&gt;

&lt;h2&gt;New Features&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;New method: Dimension.attribute_reference: returns full reference to an attribute&lt;/li&gt;
&lt;li&gt;str(cut) will now return constructed string representation of a cut as it can be used by Slicer&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Slicer server:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;added /locales to slicer&lt;/li&gt;
&lt;li&gt;added locales key in /model request&lt;/li&gt;
&lt;li&gt;added Access-Control-Allow-Origin for JS/jQuery&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Changes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;Allow dimensions in a cube to be a list, not only a dictionary (internally it is an ordered dictionary)&lt;/li&gt;
&lt;li&gt;Allow cubes in a model to be a list, not only a dictionary (internally it is an ordered dictionary)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Slicer server:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;slicer does not require default cube to be specified: if no cube is in the request then try default from
config or get first from model&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Fixes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;Slicer now serves the right localization regardless of which localization was used first after the server was
launched (changed the model localization copy to be a deepcopy, as it should be)&lt;/li&gt;
&lt;li&gt;Fixed some remnants of the old Cell.foo based browsing; they now use Browser.foo(cell, &amp;#8230;) only&lt;/li&gt;
&lt;li&gt;fixed model localization issues; once localized, original locale was not available&lt;/li&gt;
&lt;li&gt;Do not try to add locale if not specified. Fixes #11: &lt;a href="https://github.com/Stiivi/cubes/issues/11"&gt;https://github.com/Stiivi/cubes/issues/11&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Tutorials&lt;/h2&gt;

&lt;p&gt;Added tutorials in tutorials/ with models in tutorials/models/ and data in tutorials/data/:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/12966527920/cubes-tutorial-1-getting-started"&gt;Tutorial 1&lt;/a&gt;: 

&lt;ul&gt;&lt;li&gt;how to build a model programmatically&lt;/li&gt;
&lt;li&gt;how to create a model with flat dimensions&lt;/li&gt;
&lt;li&gt;how to aggregate whole cube&lt;/li&gt;
&lt;li&gt;how to drill-down and aggregate through a dimension&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/13255558153/cubes-tutorial-2-model-and-mappings"&gt;Tutorial 2&lt;/a&gt;: 

&lt;ul&gt;&lt;li&gt;how to create and use a model file&lt;/li&gt;
&lt;li&gt;mappings&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.databrewery.org/post/13457860520/how-to-hierarchies-levels-and-drilling-down"&gt;Tutorial 3&lt;/a&gt;: 

&lt;ul&gt;&lt;li&gt;how hierarchies work&lt;/li&gt;
&lt;li&gt;drill-down through a hierarchy&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;li&gt;Tutorial 4 (not blogged about it yet):

&lt;ul&gt;&lt;li&gt;how to launch slicer server&lt;/li&gt;
&lt;/ul&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;GitHub &lt;strong&gt;sources&lt;/strong&gt;: &lt;a href="https://github.com/Stiivi/cubes"&gt;https://github.com/Stiivi/cubes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Documentation&lt;/strong&gt;: &lt;a href="http://packages.python.org/cubes/"&gt;http://packages.python.org/cubes/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Mailing List&lt;/strong&gt;: &lt;a href="http://groups.google.com/group/cubes-discuss"&gt;http://groups.google.com/group/cubes-discuss&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Submit &lt;strong&gt;issues&lt;/strong&gt; here: &lt;a href="https://github.com/Stiivi/cubes/issues"&gt;https://github.com/Stiivi/cubes/issues&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;If you have any questions, comments, requests, do not hesitate to ask.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/13775805019</link><guid>http://blog.databrewery.org/post/13775805019</guid><pubDate>Mon, 05 Dec 2011 12:30:00 +0100</pubDate><category>cubes</category><category>olap</category><category>announcement</category><category>release</category></item><item><title>How-to: hierarchies, levels and drilling-down</title><description>&lt;p&gt;In this Cubes OLAP how-to we are going to learn:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;how to create a hierarchical dimension&lt;/li&gt;
&lt;li&gt;how to do drill-down through a hierarchy&lt;/li&gt;
&lt;li&gt;detailed level description&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;In the &lt;a href="http://blog.databrewery.org/post/13255558153/cubes-tutorial-2-model-and-mappings"&gt;previous
tutorial&lt;/a&gt; we learned how
to use model descriptions in a JSON file and how to do physical to logical mappings.&lt;/p&gt;

&lt;p&gt;The data used are similar to those in the second tutorial: a manually modified &lt;a href="https://raw.github.com/Stiivi/cubes/master/tutorial/data/IBRD_Balance_Sheet__FY2010-t03.csv"&gt;IBRD Balance
Sheet&lt;/a&gt; taken
from &lt;a href="https://finances.worldbank.org/Accounting-and-Control/IBRD-Balance-Sheet-FY2010/e8yz-96c6"&gt;The World
Bank&lt;/a&gt;.
The difference from the second tutorial is two added columns: category code and sub-category code.
They are simple letter codes for the categories and subcategories.&lt;/p&gt;

&lt;h2&gt;Hierarchy&lt;/h2&gt;

&lt;p&gt;Some dimensions can have multiple levels forming a hierarchy. For example, dates have year, month, day;
geography has country, region, city; a product might have category, subcategory and the product.&lt;/p&gt;

&lt;p&gt;Note: Cubes supports multiple hierarchies, for example for date you might have year-month-day or
year-quarter-month-day. Most dimensions will have one hierarchy, though.&lt;/p&gt;

&lt;p&gt;In our example we have the &lt;code&gt;item&lt;/code&gt; dimension with three levels of hierarchy: &lt;em&gt;category&lt;/em&gt;, &lt;em&gt;subcategory&lt;/em&gt; and
&lt;em&gt;line item&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lvdr0nIHgl1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;The levels are defined in the model:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"levels": [
    {
        "name":"category",
        "label":"Category",
        "attributes": ["category"]
    },
    {
        "name":"subcategory",
        "label":"Sub-category",
        "attributes": ["subcategory"]
    },
    {
        "name":"line_item",
        "label":"Line Item",
        "attributes": ["line_item"]
    }
]
&lt;/pre&gt;

&lt;p&gt;You can see a slight difference between this model description and the previous one: this time we did not
just list the level names and let Cubes fill in the defaults. Here we used an explicit description of each level.
&lt;code&gt;name&lt;/code&gt; is the level identifier, &lt;code&gt;label&lt;/code&gt; is a human-readable label of the level that can be
used in end-user applications and &lt;code&gt;attributes&lt;/code&gt; is the list of attributes that belong to the level.
The first attribute, if not specified otherwise, is the key attribute of the level.&lt;/p&gt;

&lt;p&gt;Other level description attributes are &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;label_attribute&lt;/code&gt;. The
&lt;code&gt;key&lt;/code&gt; specifies the name of the attribute which contains the key for the level. A key is an id number, a code or
anything else that uniquely identifies a member of the dimension level. &lt;code&gt;label_attribute&lt;/code&gt; is the name of an attribute
that contains a human-readable value that can be displayed in user-interface elements such as tables or
charts.&lt;/p&gt;
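
&lt;p&gt;The defaulting rules for a level description can be summarised in a few lines of plain Python. This is a sketch rather than the Cubes code: the helper name &lt;code&gt;resolve_level&lt;/code&gt; is made up, the &lt;code&gt;category_label&lt;/code&gt; attribute is only an example, and the sketch covers just the &lt;code&gt;key&lt;/code&gt; and &lt;code&gt;label_attribute&lt;/code&gt; rules (the label attribute falling back to the key is how the drill-down code in this how-to treats it).&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
# Hypothetical helper, not part of Cubes.
def resolve_level(description):
    """Return a copy of a level description with the defaults filled in."""
    level = dict(description)
    # The key defaults to the first attribute of the level.
    level.setdefault("key", level["attributes"][0])
    # Without a label attribute, the key is used for display as well.
    level.setdefault("label_attribute", level["key"])
    return level

category = resolve_level({
    "name": "category",
    "label": "Category",
    "label_attribute": "category_label",
    "attributes": ["category", "category_label"],
})

line_item = resolve_level({
    "name": "line_item",
    "label": "Line Item",
    "attributes": ["line_item"],
})
&lt;/pre&gt;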

&lt;h2&gt;Preparation&lt;/h2&gt;

&lt;p&gt;In this how-to we are going to skip all off-topic code, such as data initialization. The full example can
be found in the &lt;a href="https://github.com/Stiivi/cubes/tree/master/tutorial"&gt;tutorial sources&lt;/a&gt; with suffix
&lt;code&gt;03&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In short we need:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;data in a database&lt;/li&gt;
&lt;li&gt;logical model (see &lt;code&gt;model_03.json&lt;/code&gt;) prepared with appropriate mappings&lt;/li&gt;
&lt;li&gt;denormalized view for aggregated browsing (for the current simple SQL browser implementation)&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Drill-down&lt;/h2&gt;

&lt;p&gt;Drill-down is an action that provides more details about the data. Drilling down through a dimension
hierarchy expands the next level of the dimension. It can be compared to browsing through your directory
structure.&lt;/p&gt;

&lt;p&gt;We create a function that recursively traverses a dimension hierarchy and prints out aggregations
(the count of records in this example) at the currently browsed location.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Arguments&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;cell - cube cell to drill down into&lt;/li&gt;
&lt;li&gt;dimension - dimension to be traversed through all levels&lt;/li&gt;
&lt;li&gt;path - current path of the &lt;code&gt;dimension&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;A path is a list of dimension points (keys), one for each level. It is like a file-system path.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def drill_down(cell, dimension, path = []):
&lt;/pre&gt;

&lt;p&gt;Get the dimension&amp;#8217;s default hierarchy. Cubes supports multiple hierarchies, for example for date you might
have year-month-day or year-quarter-month-day. Most dimensions will have one hierarchy, though.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    hierarchy = dimension.default_hierarchy
&lt;/pre&gt;

&lt;p&gt;The &lt;em&gt;base path&lt;/em&gt; is the path to the most detailed element, to the leaf of a tree, to the fact. Can we go deeper in
the hierarchy? If the path is already a base path, there is nothing more to expand:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    if hierarchy.path_is_base(path):
        return
&lt;/pre&gt;

&lt;p&gt;Get the next level in the hierarchy. &lt;code&gt;levels_for_path&lt;/code&gt; returns a list of levels according to
the provided path. When &lt;code&gt;drilldown&lt;/code&gt; is set to &lt;code&gt;True&lt;/code&gt;, one more level is returned.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    levels = hierarchy.levels_for_path(path, drilldown=True)
    current_level = levels[-1]
&lt;/pre&gt;

&lt;p&gt;We need to know the name of the level key attribute, which contains a path component. If the model does not
explicitly specify a key attribute for the level, then the first attribute will be used:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    level_key = dimension.attribute_reference(current_level.key)
&lt;/pre&gt;

&lt;p&gt;For prettier display, we get the name of the attribute which contains the label to be displayed
for the current level. If there is no label attribute, then the key attribute is used.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
    level_label = dimension.attribute_reference(current_level.label_attribute)
&lt;/pre&gt;

&lt;p&gt;We do the aggregation of the cell&amp;#8230; Think of the &lt;code&gt;ls $CELL&lt;/code&gt; command on the command line, where
&lt;code&gt;$CELL&lt;/code&gt; is a directory name. In this function we can think of &lt;code&gt;$CELL&lt;/code&gt; as being the same as the
current working directory (&lt;code&gt;pwd&lt;/code&gt;).&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
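    result = browser.aggregate(cell, drilldown=[dimension])

    # Indent each line by the depth of the current path (assumed here:
    # four spaces per level, matching the output shown further down)
    indent = "    " * len(path)

    for record in result.drilldown:
        print("%s%s: %d" % (indent, record[level_label], record["record_count"]))
        ...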
&lt;/pre&gt;

&lt;p&gt;And now the drill-down magic. First, construct a new path by appending the key attribute value to the current
path:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
        drill_path = path[:] + [record[level_key]]
&lt;/pre&gt;

&lt;p&gt;Then get a new cell slice for the new path:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
        drill_down_cell = cell.slice(dimension, drill_path)
&lt;/pre&gt;

&lt;p&gt;And do recursive drill-down:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
        drill_down(drill_down_cell, dimension, drill_path)
&lt;/pre&gt;

&lt;p&gt;The function looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lvdrsbS5VW1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;A working version of the function (example &lt;code&gt;03&lt;/code&gt;) can be found in the &lt;a href="https://github.com/Stiivi/cubes/blob/master/tutorial/tutorial_03.py"&gt;tutorial
sources&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Get the full cube (or any part of the cube you like):&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
cell = browser.full_cube()
&lt;/pre&gt;

&lt;p&gt;And do the drill-down through the item dimension:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
drill_down(cell, cube.dimension("item"))
&lt;/pre&gt;

&lt;p&gt;The output should look like this:&lt;/p&gt;

&lt;pre&gt;
a: 32
    da: 8
        Borrowings: 2
        Client operations: 2
        Investments: 2
        Other: 2
    dfb: 4
        Currencies subject to restriction: 2
        Unrestricted currencies: 2
    i: 2
        Trading: 2
    lo: 2
        Net loans outstanding: 2
    nn: 2
        Nonnegotiable, nonintrest-bearing demand obligations on account of subscribed capital: 2
    oa: 6
        Assets under retirement benefit plans: 2
        Miscellaneous: 2
        Premises and equipment (net): 2
&lt;/pre&gt;

&lt;p&gt;Note that because we have changed our source data, we see level codes instead of level names. We will fix
that later. Now focus on the drill-down.&lt;/p&gt;

&lt;p&gt;See that nice hierarchy tree?&lt;/p&gt;

&lt;p&gt;Now if you slice the cell through year 2010 and do the exact same drill-down:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
cell = cell.slice("year", [2010])
drill_down(cell, cube.dimension("item"))
&lt;/pre&gt;

&lt;p&gt;you will get a similar tree, but only for the year 2010.&lt;/p&gt;
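
&lt;p&gt;For readers who want to experiment without a database, the same recursion can be tried on plain Python data. The sketch below is a stand-in built on assumptions, not the Cubes API: facts are dictionaries, a cell is simply a list of facts, &lt;code&gt;LEVELS&lt;/code&gt; plays the role of the hierarchy, the sample rows are invented, and the output lines are collected in a list instead of being printed.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
# In-memory stand-in for the hierarchy and the fact table (made-up data)
LEVELS = ["category", "subcategory", "line_item"]

FACTS = [
    {"category": "a", "subcategory": "da", "line_item": "Borrowings", "year": 2009},
    {"category": "a", "subcategory": "da", "line_item": "Borrowings", "year": 2010},
    {"category": "a", "subcategory": "i", "line_item": "Trading", "year": 2009},
    {"category": "a", "subcategory": "i", "line_item": "Trading", "year": 2010},
]

def drill_down(cell, path=(), lines=None):
    """Walk the hierarchy depth-first and collect record counts."""
    if lines is None:
        lines = []
    if len(path) == len(LEVELS):
        return lines  # base path reached, nothing deeper to expand
    level = LEVELS[len(path)]
    indent = "    " * len(path)
    for key in sorted(set(fact[level] for fact in cell)):
        # "Slice" the cell: keep only the facts below the current point
        sub_cell = [fact for fact in cell if fact[level] == key]
        lines.append("%s%s: %d" % (indent, key, len(sub_cell)))
        drill_down(sub_cell, path + (key,), lines)
    return lines

tree = drill_down(FACTS)
tree_2010 = drill_down([fact for fact in FACTS if fact["year"] == 2010])
&lt;/pre&gt;

&lt;p&gt;Joining &lt;code&gt;tree&lt;/code&gt; with newlines gives an indented listing in the same shape as the output above; &lt;code&gt;tree_2010&lt;/code&gt; is the same tree restricted to one year.&lt;/p&gt;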

&lt;h2&gt;Level Labels and Details&lt;/h2&gt;

&lt;p&gt;Codes and ids are good for machines and programmers: they are short, might follow some scheme and are easy to
handle in scripts. Report users have not much use for them, as they look cryptic and have no meaning at
first sight.&lt;/p&gt;

&lt;p&gt;Our source data contains two columns each for category and for subcategory: a column with the code and a column with the
label for user interfaces. Both columns belong to the same dimension and to the same level. The key column
is used by the analytical system to refer to the dimension point and the label is just decoration.&lt;/p&gt;

&lt;p&gt;Levels can have any number of detail attributes. The detail attributes have no analytical meaning and are
just ignored during aggregations. If you want to do analysis based on an attribute, make it a separate
dimension instead.&lt;/p&gt;

&lt;p&gt;So now we fix our model by specifying detail attributes for the levels:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lvdr2aHJRJ1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;The model description is:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"levels": [
    {
        "name":"category",
        "label":"Category",
        "label_attribute": "category_label",
        "attributes": ["category", "category_label"]
    },
    {
        "name":"subcategory",
        "label":"Sub-category",
        "label_attribute": "subcategory_label",
        "attributes": ["subcategory", "subcategory_label"]
    },
    {
        "name":"line_item",
        "label":"Line Item",
        "attributes": ["line_item"]
    }
]
&lt;/pre&gt;

&lt;p&gt;Note the &lt;code&gt;label_attribute&lt;/code&gt; keys. They specify which attribute contains the label to be displayed.
The key attribute is by default the first attribute in the list. If you want to use some other attribute, it
can be specified in &lt;code&gt;key&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Because we added two new attributes, we have to add mappings for them:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
"mappings": { "item.line_item": "line_item",
              "item.subcategory": "subcategory",
              "item.subcategory_label": "subcategory_label",
              "item.category": "category",
              "item.category_label": "category_label" 
             }
&lt;/pre&gt;

&lt;p&gt;In the example tutorial, which can be found in the Cubes sources under the &lt;code&gt;tutorial/&lt;/code&gt; directory,
change the model file from &lt;code&gt;model/model_03.json&lt;/code&gt; to &lt;code&gt;model/model_03-labels.json&lt;/code&gt;
and run the code again. Or fix the file as specified above.&lt;/p&gt;

&lt;p&gt;Now the result will be:&lt;/p&gt;

&lt;pre&gt;
Assets: 32
    Derivative Assets: 8
        Borrowings: 2
        Client operations: 2
        Investments: 2
        Other: 2
    Due from Banks: 4
        Currencies subject to restriction: 2
        Unrestricted currencies: 2
    Investments: 2
        Trading: 2
    Loans Outstanding: 2
        Net loans outstanding: 2
    Nonnegotiable: 2
        Nonnegotiable, nonintrest-bearing demand obligations on account of subscribed capital: 2
    Other Assets: 6
        Assets under retirement benefit plans: 2
        Miscellaneous: 2
        Premises and equipment (net): 2
&lt;/pre&gt;

&lt;h2&gt;Implicit hierarchy&lt;/h2&gt;

&lt;p&gt;Try to remove the last level &lt;em&gt;line_item&lt;/em&gt; from the model file and see what happens. The code still works, but
displays only two levels. What does that mean? If the metadata - the logical model - is used properly in an
application, then the application can handle most model changes without any modifications.
That is, if you add a new level or remove a level, there is no need to change your reporting application.&lt;/p&gt;

&lt;h2&gt;Summary&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;hierarchies can have multiple levels&lt;/li&gt;
&lt;li&gt;a hierarchy level is identified by a key attribute&lt;/li&gt;
&lt;li&gt;a hierarchy level can have multiple detail attributes and there is one special detail attribute: label attribute used for display in user interfaces&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Next: slicing and dicing or slicer server, not sure yet.&lt;/p&gt;

&lt;p&gt;If you have any questions, suggestions, comments, let me know.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/13457860520</link><guid>http://blog.databrewery.org/post/13457860520</guid><pubDate>Mon, 28 Nov 2011 18:14:49 +0100</pubDate><category>cubes</category><category>tutorial</category><category>olap</category><category>howto</category></item><item><title>Cubes Tutorial 2 - Model and Mappings</title><description>&lt;p&gt;In the &lt;a href="http://blog.databrewery.org/post/12966527920/cubes-tutorial-1-getting-started"&gt;first tutorial&lt;/a&gt; we talked about how to construct a model programmatically and how to do basic aggregations.&lt;/p&gt;

&lt;p&gt;In this tutorial we are going to learn:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;how to use a model description file&lt;/li&gt;
&lt;li&gt;why and how to use logical to physical mappings&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The data used are the same as in the first tutorial, the &lt;a href="https://raw.github.com/Stiivi/cubes/master/tutorial/data/IBRD_Balance_Sheet__FY2010.csv"&gt;IBRD Balance Sheet&lt;/a&gt; taken from &lt;a href="https://finances.worldbank.org/Accounting-and-Control/IBRD-Balance-Sheet-FY2010/e8yz-96c6"&gt;The World Bank&lt;/a&gt;. However, for the purpose of this tutorial, the file was manually edited a little: the column &amp;#8220;Line Item&amp;#8221; is split into two,
&lt;em&gt;Subcategory&lt;/em&gt; and &lt;em&gt;Line Item&lt;/em&gt;, to give a total of three levels of hierarchy.&lt;/p&gt;

&lt;h2&gt;Logical Model&lt;/h2&gt;

&lt;p&gt;The Cubes framework uses a logical model. The logical model describes the data from the user’s or analyst’s
perspective: how the data are being measured, aggregated and reported. The model creates an abstraction layer,
making reports independent of the physical structure of the data. More information can be found in the
&lt;a href="http://packages.python.org/cubes/model.html"&gt;framework documentation&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The model description file is a JSON file containing a dictionary:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
{
    "dimensions": [  ...  ],
    "cubes": { ... }
}
&lt;/pre&gt;

&lt;p&gt;First we define the dimensions. They might be shared by multiple cubes, therefore they belong to the model
space. There are two dimensions: &lt;em&gt;item&lt;/em&gt; and &lt;em&gt;year&lt;/em&gt; in our dataset. The &lt;em&gt;year&lt;/em&gt; dimension is flat, contains only one
level and has no details. The dimension &lt;em&gt;item&lt;/em&gt; has three levels: &lt;em&gt;category&lt;/em&gt;, &lt;em&gt;subcategory&lt;/em&gt; and &lt;em&gt;line item&lt;/em&gt;.
It looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lv67lezyq31qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;We define them as:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
{
    "dimensions": [
        {"name":"item",
         "levels": ["category", "subcategory", "line_item"]
        },
        {"name":"year"}
    ],
    "cubes": {...}
}
&lt;/pre&gt;

&lt;p&gt;The levels of our tutorial dimensions are simple, with no details. There is a little bit of implicit
construction going on behind the scenes of dimension initialization, but that will be described later. In
short: a default hierarchy is created and for each level a single attribute is created with the same name as the
level.&lt;/p&gt;
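
&lt;p&gt;The implicit construction can be imagined roughly like this. The helpers &lt;code&gt;expand_levels&lt;/code&gt; and &lt;code&gt;default_hierarchy&lt;/code&gt; are hypothetical and only mirror the two rules stated above for a dimension that lists its level names; what Cubes does for a flat dimension such as &lt;em&gt;year&lt;/em&gt; is left out of this sketch.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
# Hypothetical helpers that mimic the implicit construction; not Cubes code.
def expand_levels(level_names):
    """Turn bare level names into level descriptions.

    Each level gets a single attribute with the same name as the level.
    """
    return [{"name": name, "attributes": [name]} for name in level_names]

def default_hierarchy(levels):
    """The default hierarchy is the levels in the order they were listed."""
    return [level["name"] for level in levels]

item = {"name": "item", "levels": ["category", "subcategory", "line_item"]}

levels = expand_levels(item["levels"])
hierarchy = default_hierarchy(levels)
&lt;/pre&gt;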

&lt;p&gt;Next we define the cubes. A cube is in most cases specified by a list of dimensions and measures:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
{
    "dimensions": [...],
    "cubes": {
        "irbd_balance": {
            "dimensions": ["item", "year"],
            "measures": ["amount"]
        }
    }
}
&lt;/pre&gt;

&lt;p&gt;And we are done: we have dimensions and a cube. Well, almost done: we have to tell the framework which
attributes are going to be used.&lt;/p&gt;

&lt;h2&gt;Attribute Naming&lt;/h2&gt;

&lt;p&gt;As mentioned before, Cubes uses a logical model to describe the data used in the reports. To ensure
consistent dimension attribute naming, Cubes uses the scheme &lt;code&gt;dimension.attribute&lt;/code&gt; for non-flat
dimensions. Why? Firstly, it removes doubt about which dimension the attribute belongs to. Secondly,
&lt;code&gt;item.category&lt;/code&gt; will always be &lt;code&gt;item.category&lt;/code&gt; in the report, regardless of how the
field is named in the source and in which table the field exists.&lt;/p&gt;

&lt;p&gt;Imagine a snowflake schema: fact table in the middle with references to multiple tables containing various
dimension data. There might be a dimension spanning through multiple tables, like product category in one
table, product subcategory in another table. We should not care about what table the attribute comes from,
we should care only that the attribute is called &lt;code&gt;category&lt;/code&gt; and belongs to a dimension
&lt;code&gt;product&lt;/code&gt; for example.&lt;/p&gt;

&lt;p&gt;Another reason is that with localized data the analyst will use &lt;code&gt;item.category_label&lt;/code&gt; and
the appropriate localized physical attribute will be used. These are just a few of the reasons.&lt;/p&gt;

&lt;p&gt;Knowing the naming scheme, we have the following cube attribute names:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;code&gt;year&lt;/code&gt; (it is a flat dimension)&lt;/li&gt;
&lt;li&gt;&lt;code&gt;item.category&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;item.subcategory&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;item.line_item&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The problem is that the table does not have columns with these names. That is what the mapping is for: it maps
logical attributes in the model to physical attributes in the table.&lt;/p&gt;
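
&lt;p&gt;The naming scheme itself is easy to express in code. The function &lt;code&gt;logical_name&lt;/code&gt; below is a made-up illustration of the rule, not a Cubes function, and it assumes the caller already knows whether a dimension is flat.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
# Hypothetical illustration of the naming rule; not part of Cubes.
def logical_name(dimension, attribute, flat=False):
    """Return the logical name of a dimension attribute."""
    if flat:
        # A flat dimension is referred to by its bare name.
        return attribute
    return dimension + "." + attribute

names = [logical_name("year", "year", flat=True)]
names += [logical_name("item", level)
          for level in ("category", "subcategory", "line_item")]
&lt;/pre&gt;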

&lt;h2&gt;Mapping&lt;/h2&gt;

&lt;p&gt;The source table looks like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://media.tumblr.com/tumblr_lv67uvnhtJ1qgmvbu.png" alt=""/&gt;&lt;/p&gt;

&lt;p&gt;We have to tell how the dimension attributes are mapped to the table columns. It is a simple dictionary
where keys are dimension attribute names and values are physical table column names.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
{
    ...
    "cubes": {
        "irbd_balance": {
            ...
            "mappings": { "item.line_item": "line_item",
                          "item.subcategory": "subcategory",
                          "item.category": "category" }
        }
    }
}
&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;Note:&lt;/em&gt; The mapping values might be backend specific. They are physical table column names for the current
implementation of the SQL backend.&lt;/p&gt;
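
&lt;p&gt;In code, applying such a mapping is a dictionary lookup. The sketch below is not how Cubes is implemented; the function name &lt;code&gt;physical_column&lt;/code&gt; is hypothetical, and the fallback for unmapped attributes (use the part after the last dot) is an assumption that merely fits this tutorial, where the flat &lt;code&gt;year&lt;/code&gt; attribute has no mapping and lives in a column of the same name.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
MAPPINGS = {
    "item.line_item": "line_item",
    "item.subcategory": "subcategory",
    "item.category": "category",
}

# Hypothetical helper; the fallback rule is an assumption.
def physical_column(logical_name, mappings):
    """Translate a logical attribute name into a table column name."""
    if logical_name in mappings:
        return mappings[logical_name]
    # Assumed fallback: the part after the last dot is the column name.
    return logical_name.split(".")[-1]

columns = [physical_column(name, MAPPINGS)
           for name in ("year", "item.category", "item.line_item")]
&lt;/pre&gt;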

&lt;p&gt;Full model looks like this:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
{
    "dimensions": [
        {"name":"item",
         "levels": ["category", "subcategory", "line_item"]
        },
        {"name":"year"}
    ],
    "cubes": {
        "irbd_balance": {
            "dimensions": ["item", "year"],
            "measures": ["amount"],
            "mappings": { "item.line_item": "line_item",
                          "item.subcategory": "subcategory",
                          "item.category": "category" }
        }
    }
}
&lt;/pre&gt;

&lt;h2&gt;Example&lt;/h2&gt;

&lt;p&gt;Now we have the model, saved for example in &lt;code&gt;models/model_02.json&lt;/code&gt;. Let&amp;#8217;s do some
preparation:&lt;/p&gt;

&lt;p&gt;Define table names and a view name to be used later. The view is going to be used as a logical abstraction.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
FACT_TABLE = "ft_irbd_balance"
FACT_VIEW = "vft_irbd_balance"
&lt;/pre&gt;

&lt;p&gt;Load the data, as in the previous example, using the tutorial helper function (again, do not use that in
production):&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
# Imports assumed from the first tutorial; "tutorial" is the helper
# module shipped with the tutorial sources
import sqlalchemy
import cubes
import tutorial

engine = sqlalchemy.create_engine('sqlite:///:memory:')
tutorial.create_table_from_csv(engine,
                      "data/IBRD_Balance_Sheet__FY2010-t02.csv",
                      table_name=FACT_TABLE,
                      fields=[
                            ("category", "string"),
                            ("subcategory", "string"),
                            ("line_item", "string"),
                            ("year", "integer"),
                            ("amount", "integer")],
                      create_id=True)
connection = engine.connect()
&lt;/pre&gt;

&lt;p&gt;The new data sheet is in the &lt;a href="https://github.com/Stiivi/cubes/raw/master/tutorial/data/IBRD_Balance_Sheet__FY2010-t02.csv"&gt;GitHub
repository&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Load the model, get the cube and specify where the cube&amp;#8217;s source data comes from:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
model = cubes.load_model("models/model_02.json")
cube = model.cube("irbd_balance")
cube.fact = FACT_TABLE
&lt;/pre&gt;

&lt;p&gt;We have to prepare the logical structures used by the browser. Currently, the only tool provided is a simple data
denormalizer, which creates one wide view with logical column names (optionally with localization). The following
code initializes the denormalizer and creates a view for the cube:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
dn = cubes.backends.sql.SQLDenormalizer(cube, connection)

dn.create_view(FACT_VIEW)
&lt;/pre&gt;
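
&lt;p&gt;Conceptually, such a view selects the physical columns of the fact table and exposes them under their logical names. The following stand-alone sketch shows the idea with the standard &lt;code&gt;sqlite3&lt;/code&gt; module. The SQL text, the column list and the sample row are assumptions made for illustration; this is not the statement that &lt;code&gt;SQLDenormalizer&lt;/code&gt; actually generates.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
import sqlite3

FACT_TABLE = "ft_irbd_balance"
FACT_VIEW = "vft_irbd_balance"

# (logical attribute name, physical column), following the model mappings
columns = [
    ("item.category", "category"),
    ("item.subcategory", "subcategory"),
    ("item.line_item", "line_item"),
    ("year", "year"),
    ("amount", "amount"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE %s (id INTEGER PRIMARY KEY, category TEXT, "
    "subcategory TEXT, line_item TEXT, year INTEGER, amount INTEGER)"
    % FACT_TABLE
)
# Made-up sample row, not taken from the real data sheet
conn.execute(
    "INSERT INTO %s (category, subcategory, line_item, year, amount) "
    "VALUES ('Assets', 'Investments', 'Trading', 2010, 100)" % FACT_TABLE
)

# One wide view: every physical column aliased to its logical name
select_list = ", ".join('%s AS "%s"' % (col, name) for name, col in columns)
conn.execute("CREATE VIEW %s AS SELECT %s FROM %s"
             % (FACT_VIEW, select_list, FACT_TABLE))

# The view can now be queried by logical names only
row = conn.execute(
    'SELECT "item.category", "amount" FROM %s' % FACT_VIEW).fetchone()
print(row)
&lt;/pre&gt;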

&lt;p&gt;And from this point on, we can continue as usual:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
browser = cubes.backends.sql.SQLBrowser(cube, connection, view_name = FACT_VIEW)

cell = browser.full_cube()
result = browser.aggregate(cell)

print("Record count: %d" % result.summary["record_count"])
print("Total amount: %d" % result.summary["amount_sum"])
&lt;/pre&gt;

&lt;p&gt;The tutorial sources can be found in the &lt;a href="https://github.com/Stiivi/cubes/tree/master/tutorial"&gt;Cubes GitHub
repository&lt;/a&gt;. They require a current git clone.&lt;/p&gt;

&lt;p&gt;Next: Drill-down through deep hierarchy.&lt;/p&gt;

&lt;p&gt;If you have any questions, suggestions, comments, let me know.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/13255558153</link><guid>http://blog.databrewery.org/post/13255558153</guid><pubDate>Thu, 24 Nov 2011 17:04:54 +0100</pubDate><category>cubes</category><category>tutorial</category></item><item><title>Cubes Tutorial 1 - Getting started</title><description>&lt;p&gt;In this tutorial you are going to learn how to start with cubes. The example shows:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;how to build a model programmatically&lt;/li&gt;
&lt;li&gt;how to create a model with flat dimensions&lt;/li&gt;
&lt;li&gt;how to aggregate the whole cube&lt;/li&gt;
&lt;li&gt;how to drill-down and aggregate through a dimension&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;The example data used is the &lt;a href="https://raw.github.com/Stiivi/cubes/master/tutorial/data/IBRD_Balance_Sheet__FY2010.csv"&gt;IBRD Balance Sheet&lt;/a&gt;, taken from &lt;a href="https://finances.worldbank.org/Accounting-and-Control/IBRD-Balance-Sheet-FY2010/e8yz-96c6"&gt;The World Bank&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Create a tutorial directory and download the file:&lt;/p&gt;

&lt;pre&gt;
curl -O &lt;a href="https://raw.github.com/Stiivi/cubes/master/tutorial/data/IBRD_Balance_Sheet__FY2010.csv"&gt;https://raw.github.com/Stiivi/cubes/master/tutorial/data/IBRD_Balance_Sheet__FY2010.csv&lt;/a&gt;
&lt;/pre&gt;

&lt;p&gt;Create a &lt;code&gt;tutorial_01.py&lt;/code&gt;:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
import sqlalchemy
import cubes
import cubes.tutorial.sql as tutorial
&lt;/pre&gt;

&lt;p&gt;The Cubes package contains tutorial helper methods. They are provided only to make the learner&amp;#8217;s life easier and should not be used in production.&lt;/p&gt;

&lt;p&gt;Prepare the data using the tutorial helper methods:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
engine = sqlalchemy.create_engine('sqlite:///:memory:')
tutorial.create_table_from_csv(engine,
                      "IBRD_Balance_Sheet__FY2010.csv",
                      table_name="irbd_balance",
                      fields=[
                            ("category", "string"),
                            ("line_item", "string"),
                            ("year", "integer"),
                            ("amount", "integer")],
                      create_id=True)
&lt;/pre&gt;

&lt;p&gt;Now, create a model:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
model = cubes.Model()
&lt;/pre&gt;

&lt;p&gt;Add dimensions to the model. Dimensions are defined in the model because they might be shared by multiple cubes.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
model.add_dimension(cubes.Dimension("category"))
model.add_dimension(cubes.Dimension("line_item"))
model.add_dimension(cubes.Dimension("year"))
&lt;/pre&gt;

&lt;p&gt;Define a cube and specify the already defined dimensions:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
cube = cubes.Cube(name="irbd_balance", 
                  model=model,
                  dimensions=["category", "line_item", "year"],
                  measures=["amount"]
                  )
&lt;/pre&gt;

&lt;p&gt;Create a browser and get a cell representing the whole cube (all data):&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
browser = cubes.backends.sql.SQLBrowser(cube, engine.connect(), view_name = "irbd_balance")

cell = browser.full_cube()
&lt;/pre&gt;

&lt;p&gt;Compute the aggregate. Measure fields in the aggregation result carry an aggregation suffix, currently only &lt;code&gt;_sum&lt;/code&gt;. The total record count within the cell is also included as &lt;code&gt;record_count&lt;/code&gt;.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
result = browser.aggregate(cell)

print("Record count: %d" % result.summary["record_count"])
print("Total amount: %d" % result.summary["amount_sum"])
&lt;/pre&gt;

&lt;p&gt;Now try some drill-down by category:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
print("Drill Down by Category")
result = browser.aggregate(cell, drilldown=["category"])

print("%-20s%10s%10s" % ("Category", "Count", "Total"))
for record in result.drilldown:
    print("%-20s%10d%10d" % (record["category"], record["record_count"], record["amount_sum"]))
&lt;/pre&gt;

&lt;p&gt;Drill-down by year:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
print("Drill Down by Year:")
result = browser.aggregate(cell, drilldown=["year"])
print("%-20s%10s%10s" % ("Year", "Count", "Total"))
for record in result.drilldown:
    print("%-20s%10d%10d" % (record["year"], record["record_count"], record["amount_sum"]))
&lt;/pre&gt;
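
&lt;p&gt;The aggregations above correspond roughly to a &lt;code&gt;COUNT&lt;/code&gt; and &lt;code&gt;SUM&lt;/code&gt; over the whole table, and to a &lt;code&gt;GROUP BY&lt;/code&gt; on the drill-down dimension. Here is a stand-alone sketch of that idea using only the standard &lt;code&gt;sqlite3&lt;/code&gt; module. The sample rows are made up and the SQL is an assumption for illustration, not necessarily what the browser issues.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE irbd_balance "
             "(category TEXT, line_item TEXT, year INTEGER, amount INTEGER)")
# Made-up sample rows, not the real balance sheet figures
rows = [
    ("Assets", "Cash", 2009, 10),
    ("Assets", "Cash", 2010, 20),
    ("Liabilities", "Borrowings", 2009, 5),
    ("Liabilities", "Borrowings", 2010, 15),
]
conn.executemany("INSERT INTO irbd_balance VALUES (?, ?, ?, ?)", rows)

# Whole cube: a single summary row
summary = conn.execute(
    "SELECT COUNT(*), SUM(amount) FROM irbd_balance").fetchone()
print("Record count: %d, total amount: %d" % summary)

# Drill-down by category: one row per category
drilldown = conn.execute(
    "SELECT category, COUNT(*), SUM(amount) FROM irbd_balance "
    "GROUP BY category ORDER BY category").fetchall()
for category, count, total in drilldown:
    print("%-20s%10d%10d" % (category, count, total))
&lt;/pre&gt;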

&lt;p&gt;All tutorials with example data and models will be stored together with &lt;a href="https://github.com/Stiivi/cubes"&gt;cubes sources&lt;/a&gt; under the &lt;code&gt;tutorial/&lt;/code&gt; directory.&lt;/p&gt;

&lt;p&gt;Next: Model files and hierarchies.&lt;/p&gt;

&lt;p&gt;If you have any questions, comments or suggestions, do not hesitate to ask.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/12966527920</link><guid>http://blog.databrewery.org/post/12966527920</guid><pubDate>Fri, 18 Nov 2011 14:28:09 +0100</pubDate><category>cubes</category><category>tutorial</category></item><item><title>Book: Star Schema – The Complete Reference</title><description>&lt;a href="http://www.amazon.com/Schema-Complete-Reference-Christopher-Adamson/dp/0071744320"&gt;Book: Star Schema – The Complete Reference&lt;/a&gt;: &lt;p&gt;Well written book - very understandable even for a beginner, despite being focused on more advanced specialists. Explains multi-dimensional database design: star schemas, snowflakes, fact tables, dimensions, aggregated data browsing and more.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/10999619267</link><guid>http://blog.databrewery.org/post/10999619267</guid><pubDate>Tue, 04 Oct 2011 02:06:00 +0200</pubDate><category>cubes</category><category>reading</category><category>olap</category></item><item><title>Cubes 0.7 released</title><description>&lt;p&gt;I am happy to announce another release of Cubes - Python OLAP framework for multidimensional data aggregation and browsing.&lt;/p&gt;

&lt;p&gt;Besides some new features, this release renames Cuboid to the more appropriate Cell, which makes the Python API backward incompatible.&lt;/p&gt;

&lt;p&gt;The main &lt;strong&gt;source repository&lt;/strong&gt; has moved to GitHub: &lt;a href="https://github.com/Stiivi/cubes"&gt;https://github.com/Stiivi/cubes&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Changes&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;Class &amp;#8216;Cuboid&amp;#8217; was renamed to the more correct &amp;#8216;Cell&amp;#8217;. A &amp;#8216;Cuboid&amp;#8217; is a part of a cube with a subset of dimensions.&lt;/li&gt;
&lt;li&gt;all APIs with &amp;#8216;cuboid&amp;#8217; in their name/arguments were renamed to use &amp;#8216;cell&amp;#8217; instead&lt;/li&gt;
&lt;li&gt;Changed initialization of model classes (Model, Cube, Dimension, Hierarchy, Level) to be more &amp;#8220;pythonic&amp;#8221;: instead of an initialization dictionary, each attribute is listed as a parameter and the rest is handled through variable keyword arguments&lt;/li&gt;
&lt;li&gt;Improved handling of flat and detail-less dimensions (dimensions represented just by one attribute which is also a key)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Model Initialization Defaults:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;If no levels are specified during initialization, then the dimension is considered flat, with a single attribute.&lt;/li&gt;
&lt;li&gt;If no hierarchy is specified and levels are specified, then a default hierarchy is created from the order of the levels&lt;/li&gt;
&lt;li&gt;If no levels are specified, then one level named &lt;code&gt;default&lt;/code&gt; is created and the dimension is considered flat&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;: These initialization defaults might be moved into a separate utility function/class that will populate an incomplete model (see &lt;a href="https://github.com/Stiivi/cubes/issues/8"&gt;Issue #8&lt;/a&gt;).&lt;/p&gt;
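
&lt;p&gt;The defaults listed above can be sketched in a few lines of plain Python. The function &lt;code&gt;apply_dimension_defaults&lt;/code&gt; and the dictionary it returns are hypothetical and only illustrate the rules; they are not the code used inside Cubes. Using the single &lt;code&gt;default&lt;/code&gt; level as the hierarchy of a flat dimension is likewise an assumption of this sketch.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def apply_dimension_defaults(name, levels=None, hierarchy=None):
    """Fill in missing levels and hierarchy for a dimension description.

    Hypothetical helper that mirrors the defaults listed above.
    """
    if not levels:
        # No levels given: one level named "default", dimension is flat
        levels = ["default"]
        is_flat = True
    else:
        is_flat = False
    if hierarchy is None:
        # No hierarchy given: use the levels in the order they were listed
        hierarchy = list(levels)
    return {"name": name, "levels": levels,
            "hierarchy": hierarchy, "is_flat": is_flat}

print(apply_dimension_defaults("year"))
print(apply_dimension_defaults("item", ["category", "subcategory", "line_item"]))
&lt;/pre&gt;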

&lt;h2&gt;New features&lt;/h2&gt;

&lt;p&gt;Slicer server:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;changed to handle multiple cubes within a model: you have to specify a cube for /aggregate, /facts,&amp;#8230; in the form &lt;code&gt;/cube/&amp;lt;cube_name&amp;gt;/&amp;lt;browser_action&amp;gt;&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;reflect change in configuration: removed &lt;code&gt;view&lt;/code&gt;, added &lt;code&gt;view_prefix&lt;/code&gt; and &lt;code&gt;view_suffix&lt;/code&gt;, the cube view name will be constructed by concatenating &lt;code&gt;view prefix&lt;/code&gt; + &lt;code&gt;cube name&lt;/code&gt; + &lt;code&gt;view suffix&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;in aggregate drill-down: explicit dimension can be specified with drilldown=dimension:level, such as:
date:month&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;This change is considered final and therefore we can mark it as API version 1.&lt;/p&gt;
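
&lt;p&gt;Two of the rules from the list above, the view name construction and the &lt;code&gt;dimension:level&lt;/code&gt; drill-down form, can be sketched as follows. Both functions are hypothetical helpers written for this illustration, the &lt;code&gt;vft_&lt;/code&gt; prefix is only an example value, and none of this is the Slicer server&amp;#8217;s actual code.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
def cube_view_name(cube_name, view_prefix="", view_suffix=""):
    """Build the view name as view prefix + cube name + view suffix."""
    return view_prefix + cube_name + view_suffix

def parse_drilldown(value):
    """Split 'dimension:level' into a (dimension, level) pair.

    Hypothetical parser: level is None when only a dimension is given.
    """
    dimension, _, level = value.partition(":")
    return dimension, (level or None)

print(cube_view_name("irbd_balance", view_prefix="vft_"))
print(parse_drilldown("date:month"))
print(parse_drilldown("date"))
&lt;/pre&gt;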

&lt;p&gt;Links:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;strong&gt;Issues:&lt;/strong&gt; &lt;a href="https://github.com/Stiivi/cubes/issues"&gt;https://github.com/Stiivi/cubes/issues&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Documentation:&lt;/strong&gt; &lt;a href="http://packages.python.org/cubes/"&gt;http://packages.python.org/cubes/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;If you have any questions, comments, requests, do not hesitate to ask.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/10801328991</link><guid>http://blog.databrewery.org/post/10801328991</guid><pubDate>Thu, 29 Sep 2011 10:42:00 +0200</pubDate><category>announcement</category><category>release</category><category>cubes</category><category>olap</category></item><item><title>Brewery 0.7 Released</title><description>&lt;p&gt;New small release is out with quite nice addition of documentation. It does not bring too many new features, but contains a refactoring towards better package structure, that breaks some compatibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documentation updates&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="http://packages.python.org/brewery/install.html"&gt;installation instructions&lt;/a&gt; with list of optional dependencies&lt;/li&gt;
&lt;li&gt;information about &lt;a href="http://packages.python.org/brewery/metadata.html"&gt;fields and metadata&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;included documentation about &lt;a href="http://packages.python.org/brewery/stores.html"&gt;data store classes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;included text from previous blog post about &lt;a href="http://packages.python.org/brewery/streams.html#forking-forks-with-higher-order-messaging"&gt;Higher Order Messaging&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Framework Changes&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;added soft (optional) &lt;a href="http://packages.python.org/brewery/install.html#requirements"&gt;dependencies&lt;/a&gt; on backend libraries. Exception with useful information will be raised when functionality that depends on missing package is used. Example: &amp;#8220;Exception: Optional package &amp;#8216;sqlalchemy&amp;#8217; is not installed. Please install the package from &lt;a href="http://www.sqlalchemy.org/"&gt;http://www.sqlalchemy.org/&lt;/a&gt; to be able to use: SQL streams. Recommended version is &amp;gt; 0.7&amp;#8221;&lt;/li&gt;
&lt;li&gt;field related classes and functions were moved from &amp;#8216;ds&amp;#8217; module to &lt;a href="http://packages.python.org/brewery/metadata.html"&gt;&amp;#8216;metadata&amp;#8217;&lt;/a&gt; and included in brewery top-level: Field, FieldList, expand_record, collapse_record&lt;/li&gt;
&lt;li&gt;added &lt;a href="http://packages.python.org/brewery/probes.html"&gt;probes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Deprecated functions&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;brewery.ds.field_name() - use str(field) instead&lt;/li&gt;
&lt;li&gt;brewery.ds.fieldlist() - use &lt;a href="http://packages.python.org/brewery/metadata.html#brewery.metadata.FieldList"&gt;brewery.metadata.FieldList()&lt;/a&gt; instead&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Streams&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;new node: &lt;a href="http://packages.python.org/brewery/node_reference.html#derive-node"&gt;DeriveNode&lt;/a&gt; - derive new field with callables or string formula (python expression)&lt;/li&gt;
&lt;li&gt;new &lt;a href="http://packages.python.org/brewery/node_reference.html#select"&gt;SelectNode&lt;/a&gt; implementation: accepts callables or string with python code&lt;/li&gt;
&lt;li&gt;former SelectNode renamed to &lt;a href="http://packages.python.org/brewery/node_reference.html#function-select"&gt;FunctionSelectNode&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Enjoy!&lt;/p&gt;

&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="https://bitbucket.org/Stiivi/brewery/overview"&gt;BitBucket repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/stiivi/brewery"&gt;github repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://groups.google.com/group/databrewery"&gt;Mailing list&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description><link>http://blog.databrewery.org/post/6899049272</link><guid>http://blog.databrewery.org/post/6899049272</guid><pubDate>Sat, 25 Jun 2011 12:32:58 +0200</pubDate><category>announcement</category><category>release</category><category>brewery</category></item><item><title>Data Cleansing introduction (for BigClean Prague 2011) </title><description>&lt;a href="http://www.slideshare.net/Stiivi/data-cleansing-introduction-for-bigclean-prague-2011"&gt;Data Cleansing introduction (for BigClean Prague 2011) &lt;/a&gt;: &lt;p&gt;Presentation from &lt;a href="http://bigclean.org/praha/"&gt;BigClean Prague&lt;/a&gt; about data cleansing - with Brewery examples.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/6326940846</link><guid>http://blog.databrewery.org/post/6326940846</guid><pubDate>Wed, 08 Jun 2011 21:00:00 +0200</pubDate><category>brewery</category><category>slides</category></item><item><title>Cubes 0.6 released</title><description>&lt;p&gt;New version of Cubes - &lt;em&gt;Python OLAP framework and server&lt;/em&gt; - was released.&lt;/p&gt;

&lt;p&gt;Cubes is a framework for:&lt;/p&gt;

&lt;ul&gt;&lt;li&gt;Online Analytical Processing - OLAP, mostly relational DB based - ROLAP&lt;/li&gt;
&lt;li&gt;multidimensional analysis&lt;/li&gt;
&lt;li&gt;&lt;p&gt;star and snowflake schema denormalisation&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Source: &lt;a href="https://bitbucket.org/Stiivi/cubes"&gt;https://bitbucket.org/Stiivi/cubes&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;Documentation: &lt;a href="http://packages.python.org/cubes"&gt;http://packages.python.org/cubes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Python Package page: &lt;a href="http://pypi.python.org/pypi/cubes"&gt;http://pypi.python.org/pypi/cubes&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Notable changes:&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;added &amp;#8216;details&amp;#8217; to &lt;a href="http://packages.python.org/cubes/model.html#cubes"&gt;cube metadata&lt;/a&gt; - attributes that might contain fact details which are not relevant to aggregation, but might be interesting when displaying facts (such as contract name or notes)&lt;/li&gt;
&lt;li&gt;added ordering of facts in aggregation browser&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;SQL&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;&lt;a href="http://packages.python.org/cubes/api/backends.html#cubes.backends.SQLDenormalizer"&gt;SQL denormalizer&lt;/a&gt; can now, by request, automatically add indexes to level key columns&lt;/li&gt;
&lt;li&gt;one detail table can be used more than once in the SQL denormalizer (such as an organisation used as both supplier and requestor); added the key &lt;code&gt;alias&lt;/code&gt; to &lt;code&gt;joins&lt;/code&gt; in the model description, see the &lt;a href="http://packages.python.org/cubes/model.html#joins"&gt;joins documentation&lt;/a&gt; for more information.&lt;/li&gt;
&lt;/ul&gt;&lt;h2&gt;Slicer server&lt;/h2&gt;

&lt;ul&gt;&lt;li&gt;added &lt;code&gt;log&lt;/code&gt; and &lt;code&gt;log_level&lt;/code&gt; &lt;a href="http://packages.python.org/cubes/server.html#configuration"&gt;configuration options&lt;/a&gt; (under &lt;code&gt;[server]&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;added &lt;code&gt;format=&lt;/code&gt; parameter to &lt;code&gt;/facts&lt;/code&gt;, accepts &lt;code&gt;json&lt;/code&gt; and &lt;code&gt;csv&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;added &lt;code&gt;fields=&lt;/code&gt; parameter to &lt;code&gt;/facts&lt;/code&gt; - comma separated list of returned fields in CSV (see &lt;a href="http://packages.python.org/cubes/server.html#api"&gt;API&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;limit number of facts returned in JSON (configurable by &lt;code&gt;json_record_limit&lt;/code&gt; in &lt;code&gt;[server]&lt;/code&gt; section), CSV can return whole dataset and will do it iteratively (we do not want to consume all of our memory, do we?)&lt;/li&gt;
&lt;/ul&gt;&lt;p&gt;Also many bugs were fixed, including localization in fact(s) retrieval and pagination. Sharing of single SQLAlchemy engine and model within server thread was added for performance reasons.&lt;/p&gt;
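
&lt;p&gt;The difference between the limited JSON output and the iterative CSV output of &lt;code&gt;/facts&lt;/code&gt; can be illustrated with a small Python 3 sketch. The function names and the sample facts are made up for this example; only the &lt;code&gt;json_record_limit&lt;/code&gt; option name comes from the configuration described above, and the server itself may be implemented differently.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
import csv
import io
import itertools
import json

def facts_as_json(facts, json_record_limit):
    """Serialize at most json_record_limit facts into one JSON string."""
    return json.dumps(list(itertools.islice(facts, json_record_limit)))

def csv_line(values):
    """Format one list of values as a single CSV line."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(values)
    return buffer.getvalue()

def facts_as_csv(facts, fields):
    """Yield CSV text line by line, never holding the whole dataset."""
    yield csv_line(fields)
    for fact in facts:
        yield csv_line([fact[field] for field in fields])

def sample_facts():
    # Made-up facts, produced lazily like rows from a database cursor
    return ({"id": i, "amount": i * 10} for i in range(1, 6))

print(facts_as_json(sample_facts(), json_record_limit=2))
for line in facts_as_csv(sample_facts(), ["id", "amount"]):
    print(line, end="")
&lt;/pre&gt;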

&lt;p&gt;Enjoy.&lt;/p&gt;</description><link>http://blog.databrewery.org/post/4932134336</link><guid>http://blog.databrewery.org/post/4932134336</guid><pubDate>Mon, 25 Apr 2011 20:25:00 +0200</pubDate><category>announcement</category><category>cubes</category><category>olap</category></item></channel></rss>

