by Stefan Urbanek
-
The new minor release of Cubes – light-weight Python OLAP framework – brings
range cuts, denormalization with the slicer tool and cells in /report query,
together with fixes and important changes.
See the second part of this post for the full list.
Range Cuts
Range cuts were implemented in the SQL Star Browser. They are used as follows:
Python:
cut = RangeCut("date", [2010], [2012,5,10])
cut_hi = RangeCut("date", None, [2012,5,10])
cut_low = RangeCut("date", [2010], None)
To specify a range in slicer server where keys are sortable:
http://localhost:5000/aggregate?cut=date:2004-2005
http://localhost:5000/aggregate?cut=date:2004,2-2005,5,1
Open ranges:
http://localhost:5000/aggregate?cut=date:2010-
http://localhost:5000/aggregate?cut=date:2004,1,1-
http://localhost:5000/aggregate?cut=date:-2005,5,10
http://localhost:5000/aggregate?cut=date:-2012,5
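To make the URL format above a bit more tangible, here is a minimal sketch that builds the cut=... value from two paths. The helper names path_to_string and range_cut_string are made up for this illustration and are not part of Cubes; the sketch only mirrors the string format shown in the URLs.

```python
def path_to_string(path):
    """Join path keys with commas; a missing (open) boundary gives ''."""
    return ",".join(str(key) for key in (path or []))


def range_cut_string(dimension, from_path=None, to_path=None):
    """Build 'dimension:from-to' in the same format as the URLs above.

    Hypothetical helper for illustration only, not a Cubes function.
    """
    if not from_path and not to_path:
        raise ValueError("a range needs at least one boundary")
    return "%s:%s-%s" % (
        dimension,
        path_to_string(from_path),
        path_to_string(to_path),
    )


print(range_cut_string("date", [2004, 2], [2005, 5, 1]))
print(range_cut_string("date", [2010], None))
print(range_cut_string("date", None, [2012, 5]))
```

The resulting string can be appended to the /aggregate URL as the value of the cut parameter.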
Denormalization with slicer Tool
Now it is possible to denormalize your data with the slicer tool. You do not
have to denormalize using a Python script. Data are denormalized the way the
denormalized browser expects them to be. If you do not like the defaults, you
can tune the process using command line switches.
Denormalize all cubes in the model:
$ slicer denormalize slicer.ini
Denormalize only one cube:
$ slicer denormalize -c contracts slicer.ini
Create materialized denormalized view with indexes:
$ slicer denormalize --materialize --index slicer.ini
Example slicer.ini:
[workspace]
denormalized_view_prefix = mft_
denormalized_view_schema = denorm_views
# This switch is used by the browser:
use_denormalization = yes
For more information see Cubes slicer tool documentation
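If you want to inspect these options from your own script, a rough sketch with the standard configparser module might look like the one below. It does not use Cubes at all. The way the view name is put together (schema + prefix + cube name) and the cube name contracts are assumptions made for the illustration.

```python
import configparser

# Same content as the example slicer.ini above
SLICER_INI = """
[workspace]
denormalized_view_prefix = mft_
denormalized_view_schema = denorm_views
# This switch is used by the browser:
use_denormalization = yes
"""

config = configparser.ConfigParser()
config.read_string(SLICER_INI)
workspace = config["workspace"]

prefix = workspace.get("denormalized_view_prefix", "")
schema = workspace.get("denormalized_view_schema")
use_denormalization = workspace.getboolean("use_denormalization", fallback=False)

# Assumption: the view for a cube is named prefix + cube name
view_name = "%s.%s%s" % (schema, prefix, "contracts")
print(view_name, use_denormalization)
```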
Cells in Report
Use cell to specify all cuts (type can be range, point or set):
{
"cell": [
{
"dimension": "date",
"type": "range",
"from": [2010,9],
"to": [2011,9]
}
],
"queries": {
"report": {
"query": "aggregate",
"drilldown": {"date":"year"}
}
}
}
For more information see the slicer server
documentation.
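The request body above can be assembled in Python before it is sent to the server. The following is only a sketch: build_report_request is a hypothetical helper, not a Cubes function, and sending the payload to /report is left out.

```python
import json


def build_report_request(queries, cell=None):
    """Wrap queries in the 'queries' key and attach an optional cell.

    Hypothetical helper for illustration only.
    """
    body = {"queries": queries}
    if cell:
        body["cell"] = cell
    return json.dumps(body)


range_cut = {
    "dimension": "date",
    "type": "range",
    "from": [2010, 9],
    "to": [2011, 9],
}
queries = {"report": {"query": "aggregate", "drilldown": {"date": "year"}}}

payload = build_report_request(queries, cell=[range_cut])
print(payload)
```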
New Features
- cut_from_string(): added parsing of range and set cuts from string;
introduced requirement for key format: Keys should now have format
“alphanumeric character or underscore” if they are going to be converted to
strings (for example when using slicer HTTP server)
- cut_from_dict(): create a cut (of appropriate class) from a dictionary
description
- Dimension.attribute(name): get attribute instance from name
- added exceptions: CubesError, ModelInconsistencyError, NoSuchDimensionError,
NoSuchAttributeError, ArgumentError, MappingError, WorkspaceError and
BrowserError
StarBrowser:
- implemented RangeCut conditions
Slicer Server:
- /report JSON now accepts cell with full cell description as dictionary,
overrides URL parameters
Slicer tool:
- denormalize option for (bulk) denormalization of cubes (see the slicer
documentation for more information)
Changes
- important: all /report JSON requests should now have queries wrapped in the
key queries. This was the originally intended way of use, but it was not
correctly implemented. A descriptive error message is returned from the server
if the key queries is not present. Despite being rather a bug fix, it is listed
here as it may require changes in your code.
- warn when no backend is specified during slicer context creation
Fixes
- Better handling of missing optional packages, also fixes #57 (now works
without sqlalchemy and without werkzeug as expected)
- see change above about /report and queries
- push more errors as JSON responses to the requestor, instead of just failing
with an exception
Links
Sources can be found on github.
Read the documentation.
Join the Google Group for discussion, problem solving and announcements.
Submit issues and suggestions on github
IRC channel #databrewery on irc.freenode.net
If you have any questions, comments, requests, do not hesitate to ask.
Tags:
announcement
release
cubes
olap
by Stefan Urbanek
-
The new version of Cubes – light-weight Python OLAP framework – brings the new StarBrowser, which we discussed in previous blog posts.
The new SQL backend is written from scratch; it is much cleaner, more transparent, configurable and open for future extensions. It also allows direct browsing of a star/snowflake schema without denormalization, therefore you can use Cubes on top of a read-only database. See DenormalizedMapper and SnowflakeMapper for more information.

Just to name a few new features: multiple aggregated computations (min, max,…), cell details, optional/configurable denormalization.
Important Changes
Summary of most important changes that might affect your code:
Slicer: Change all your slicer.ini configuration files to have a [workspace]
section instead of the old [db] or [backend]. A deprecation warning is issued;
the old names will still work if not changed.
Model: Change dimensions in model to be an array instead of a
dictionary. Same with cubes. Old style: "dimensions": { "date": ... }
new style: "dimensions": [ { "name": "date", ... } ]. The old style will still
work if not changed, just be prepared.
Python: Use Dimension.hierarchy() instead of Dimension.default_hierarchy.
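For the model change mentioned above (dimensions as an array instead of a dictionary), a small conversion sketch might help when updating existing model files. The function dimensions_to_list is hypothetical and works on plain dictionaries only; the level names in the sample are made up.

```python
def dimensions_to_list(dimensions):
    """Convert old-style {name: description} into [{"name": name, ...}].

    Hypothetical helper for illustration only, not part of Cubes.
    """
    if isinstance(dimensions, list):
        return dimensions  # already in the new style
    converted = []
    for name, description in dimensions.items():
        item = {"name": name}
        item.update(description)
        converted.append(item)
    return converted


old_style = {
    "date": {"levels": ["year", "month"]},
    "region": {"levels": ["country"]},
}
new_style = dimensions_to_list(old_style)
print(new_style)
```

The same approach should work for the cubes part of the model.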
New Features
- slicer_context() - new method that holds all relevant information from
configuration. Can be reused when creating tools that work in a connected
database environment
- added Hierarchy.all_attributes() and .key_attributes()
- Cell.rollup_dim() - rolls up a single dimension to a specified level. This
might later replace the Cell.rollup() method
- Cell.drilldown() - drills down the cell
- create_workspace(backend, model, **options) - new top-level method for
creating a workspace by specifying backend name. Easier to create browsers (from
possible browser pool) programmatically. The backend name might be full
module name path or relative to the cubes.backends, for example
sql.star for new or sql.browser for old SQL browser.
- get_backend() - get backend by name
- AggregationBrowser.cell_details(): New method returning values of attributes
representing the cell. Preliminary implementation, return value might
change.
- AggregationBrowser.cut_details(): New method returning values of attributes
representing a single cut. Preliminary implementation, return value might
change.
- Dimension.validate() now checks whether there are duplicate attributes
- Cube.validate() now checks whether there are duplicate measures or details
SQL backend:
- new StarBrowser implemented:
- StarBrowser supports snowflakes or denormalization (optional)
- for snowflake browsing no write permission is required (does not have to
be denormalized)
- new DenormalizedMapper for mapping logical model to denormalized view
- new SnowflakeMapper for mapping logical model to a snowflake schema
- ddl_for_model() - get schema DDL as string for model
- join finder and attribute mapper are now just Mapper - class responsible for
finding appropriate joins and doing logical-to-physical mappings
- coalesce_attribute() - new method for coalescing multiple ways of describing
a physical attribute (just attribute or table+schema+attribute)
- dimension argument was removed from all methods working with attributes
(the dimension is now required attribute property)
- added create_denormalized_view() with options: materialize, create_index,
keys_only
Slicer tool/server:
- slicer ddl - generate schema DDL from model
- slicer test - test configuration and model against database and report list
of issues, if any
- Backend options are now in [workspace], removed configurability of custom
backend section. Warnings are issued when old section names [db] and
[backend] are used
- server responds to /details which is a result of
AggregationBrowser.cell_details()
Examples:
- added simple Flask based web example - dimension aggregation browser
Changes
- important: in Model, dimension and cube dictionary specification during
model initialization is deprecated, a list should be used (with explicitly
mentioned attribute “name”)
- important: Now all attribute references in the model (dimension
attributes, measures, …) are required to be instances of Attribute() and
the attribute knows its dimension
- removed hierarchy argument from Dimension.all_attributes() and .key_attributes()
- renamed builder to denormalizer
- Dimension.default_hierarchy is now deprecated in favor of
Dimension.hierarchy() which now accepts no arguments or argument None -
returning default hierarchy in those two cases
- metadata are now reused for each browser within one workspace - speed
improvement.
Fixes
- Slicer version should be same version as Cubes: Original intention was to
have separate server, therefore it had its own versioning. Now there is no
reason for separate version, moreover it can introduce confusion.
- Proper use of database schema in the Mapper
Tags:
cubes
announcement
release
by Stefan Urbanek
-
I’m glad to announce a new release of Brewery – stream based data auditing and analysis framework for Python.
There are quite a few updates, to mention the notable ones:
- new brewery runner with commands run and graph
- new nodes: pretty printer node (for your terminal pleasure), generator
function node
- many CSV updates and fixes
Added several simple how-to examples, such as: aggregation of remote CSV,
basic audit of a CSV, how to use a generator function. Feedback and questions
are welcome. I’ll help you.
Note that there are a couple of changes that break compatibility, however
code can be updated very easily. I apologize for the inconvenience, but until
1.0 such changes might happen more frequently. On the other hand, I will try
to make them as painless as possible.
Full listing of news, changes and fixes is below.
Version 0.8
News
- Changed license to MIT
- Created new brewery runner commands: ‘run’ and ‘graph’:
- ‘brewery run stream.json’ will execute the stream
- ‘brewery graph stream.json’ will generate graphviz data
- Nodes: Added pretty printer node - textual output as a formatted table
- Nodes: Added source node for a generator function
- Nodes: added analytical type to derive field node
- Preliminary implementation of data probes (just a concept, the API is not
fully decided yet)
- CSV: added empty_as_null option to read empty strings as Null values
- Nodes can be configured with node.configure(dictionary, protected). If
‘protected’ is True, then protected attributes (specified in node info) can
not be set with this method.
- added node identifier to the node reference doc
- added create_logger
- added experimental retype feature (works for CSV only at the moment)
- Mongo Backend - better handling of record iteration
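To illustrate the idea behind node.configure(dictionary, protected) from the list above, here is a small stand-alone sketch. SketchNode is a made-up class, not a Brewery node, and raising an error for a protected attribute is an assumption; the real method might handle that case differently.

```python
class SketchNode:
    """Stand-in for a node; NOT a real Brewery class."""

    # Assumption: the set of protected attributes would come from node info
    protected_attributes = {"resource"}

    def __init__(self):
        self.resource = None
        self.delimiter = ","

    def configure(self, config, protected=False):
        """Set attributes from a dictionary.

        If protected is True, protected attributes can not be set.
        """
        for key, value in config.items():
            if protected and key in self.protected_attributes:
                raise ValueError("attribute '%s' is protected" % key)
            setattr(self, key, value)


node = SketchNode()
node.configure({"resource": "data.csv"})              # trusted configuration
node.configure({"delimiter": ";"}, protected=True)    # untrusted configuration
print(node.resource, node.delimiter)
```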
Changes
- CSV: resource is now explicitly named argument in CSV*Node
- CSV: convert fields according to field storage type (instead of all-strings)
- Removed fields getter/setter (now implementation is totally up to stream
subclass)
- AggregateNode: renamed aggregates to measures, added measures as a
public node attribute
- moved errors to brewery.common
- removed field_name(), now str(field) should be used
- use named logger ‘brewery’ instead of the global one
- better debug-log labels for nodes (node type identifier + python object ID)
WARNING: Compatibility break:
- deprecate __node_info__ and use plain node_info instead
- Stream.update() now takes nodes and connections as two separate arguments
Fixes
- added SQLSourceNode, added option to keep fields instead of dropping them in
FieldMap and FieldMapNode (patch by laurentvasseur @ bitbucket)
- better traceback handling on node failure (now actually the traceback is
displayed)
- return list of field names as string representation of FieldList
- CSV: fixed output of zero numeric value in CSV (was empty string)
Links
If you have any questions, comments, requests, do not hesitate to ask.
Tags:
brewery
release
announcement
by Stefan Urbanek
-
Another minor release of Cubes - Light Weight Python OLAP framework is out. Main change is that backend is no longer hard-wired in the Slicer server and can be selected through configuration file.
There were lots of documentation changes, for example the reference was separated from the rest of docs. Hello World! example was added.
The news, changes and fixes are:
New Features
- Started writing StarBrowser - another SQL aggregation browser with different
approach (see code/docs)
Slicer Server:
- added configuration option modules under [server] to load additional
modules
- added ability to specify backend module
- backend configuration is in [backend] by default, for SQL it stays in [db]
- added server config option for default prettyprint value (useful for
demonstration purposes)
Documentation:
- Changed license to MIT + small addition. Please refer to the LICENSE file.
- Updated documentation - added missing parts, made reference more readable,
moved class and function reference docs from descriptive part to reference
(API) part.
- added backend documentation
- Added “Hello World!” example
Changed Features
- removed default SQL backend from the server
- moved workspace creation into the backend module
Fixes
- Fixed create_view to handle not materialized properly (thanks to deytao)
- Slicer tool header now contains #!/usr/bin/env python
Links
If you have any questions, comments, requests, do not hesitate to ask.
Tags:
cubes
announcement
olap
by Stefan Urbanek
-
I am glad to announce new minor release of Cubes - Light Weight Python OLAP framework for multidimensional data aggregation and browsing. The news, changes and fixes are:
New Features
- New method: Dimension.attribute_reference: returns full reference to an attribute
- str(cut) will now return constructed string representation of a cut as it can be used by Slicer
Slicer server:
- added /locales to slicer
- added locales key in /model request
- added Access-Control-Allow-Origin for JS/jQuery
Changes
- Allow dimensions in cube to be a list, not only a dictionary (internally it is ordered dictionary)
- Allow cubes in model to be a list, not only a dictionary (internally it is ordered dictionary)
Slicer server:
- slicer does not require default cube to be specified: if no cube is in the request then try default from
config or get first from model
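The cube lookup order described above (request, then configuration default, then first cube in the model) can be sketched as a small function. resolve_cube_name and its arguments are hypothetical names used only for this illustration, not the slicer implementation.

```python
def resolve_cube_name(requested, config_default, model_cubes):
    """Pick a cube name: request first, then config default, then the model.

    Hypothetical helper for illustration only.
    """
    if requested:
        return requested
    if config_default:
        return config_default
    if model_cubes:
        return model_cubes[0]
    raise ValueError("no cube in request, configuration or model")


print(resolve_cube_name(None, None, ["contracts", "invoices"]))
print(resolve_cube_name(None, "invoices", ["contracts", "invoices"]))
print(resolve_cube_name("contracts", "invoices", ["contracts", "invoices"]))
```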
Fixes
- Slicer now serves the right localization regardless of what localization was used first after the server was
launched (changed model localization copy to be deepcopy (as it should be))
- Fixes some remnants that used old Cell.foo based browsing to Browser.foo(cell, …) only browsing
- fixed model localization issues; once localized, original locale was not available
- Do not try to add locale if not specified. Fixes #11: https://github.com/Stiivi/cubes/issues/11
Tutorials
Added tutorials in tutorials/ with models in tutorials/models/ and data in tutorials/data/:
- Tutorial 1:
- how to build a model programmatically
- how to create a model with flat dimensions
- how to aggregate whole cube
- how to drill-down and aggregate through a dimension
- Tutorial 2:
- how to create and use a model file
- mappings
- Tutorial 3:
- how hierarchies work
- drill-down through a hierarchy
- Tutorial 4 (not blogged about it yet):
- how to launch slicer server
Links
If you have any questions, comments, requests, do not hesitate to ask.
Tags:
cubes
olap
announcement
release
by Stefan Urbanek
-
I am happy to announce another release of Cubes - Python OLAP framework for multidimensional data aggregation and browsing.
This release, besides some new features, renames Cuboid to more appropriate Cell. This introduces backward python API incompatibility.
Main source repository has changed to Github https://github.com/Stiivi/cubes
Changes
- Class ‘Cuboid’ was renamed to more correct ‘Cell’. ‘Cuboid’ is a part of cube with subset of dimensions.
- all APIs with ‘cuboid’ in their name/arguments were renamed to use ‘cell’ instead
- Changed initialization of model classes: Model, Cube, Dimension, Hierarchy, Level to be more “pythony”: instead of using initialization dictionary, each attribute is listed as parameter, rest is handled from variable list of key word arguments
- Improved handling of flat and detail-less dimensions (dimensions represented just by one attribute which is also a key)
Model Initialization Defaults:
- If no levels are specified during initialization, then dimension name is considered flat, with single attribute.
- If no hierarchy is specified and levels are specified, then default hierarchy will be created from order of levels
- If no levels are specified, then one level is created, with name default and dimension will be considered flat
Note: These initialization defaults might be moved into a separate utility function/class that will populate incomplete model (see Issue #8)
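The defaults listed above can be read as the following rough sketch on plain dictionaries. It is not the Cubes code: apply_dimension_defaults is a hypothetical function, and using the dimension name as the single attribute of a flat dimension is an assumption.

```python
def apply_dimension_defaults(name, levels=None, hierarchy=None):
    """Fill in missing levels and hierarchy for a dimension description.

    Hypothetical helper for illustration only.
    """
    if not levels:
        # Flat dimension: one level called "default".
        # Assumption: its single attribute is named after the dimension.
        levels = [{"name": "default", "attributes": [name]}]
    if not hierarchy:
        # Default hierarchy follows the order of levels
        hierarchy = [level["name"] for level in levels]
    return {"name": name, "levels": levels, "hierarchy": hierarchy}


flat = apply_dimension_defaults("flag")
dated = apply_dimension_defaults("date", levels=[{"name": "year"}, {"name": "month"}])
print(flat)
print(dated["hierarchy"])
```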
New features
Slicer server:
- changed to handle multiple cubes within model: you have to specify a cube for /aggregate, /facts,… in form: /cube//
- reflect change in configuration: removed view, added view_prefix and view_suffix, the cube view name will be constructed by concatenating view prefix + cube name + view suffix
- in aggregate drill-down: explicit dimension can be specified with drilldown=dimension:level, such as: date:month
This change is considered final and therefore we can mark it as API version 1.
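Two of the server changes above are simple string rules, so here is a short sketch of both: the view name concatenation and splitting of the drilldown=dimension:level value. The function names and the sample prefix/suffix values are made up for the illustration; this is not the slicer code.

```python
def cube_view_name(cube, view_prefix="", view_suffix=""):
    """View name is view prefix + cube name + view suffix."""
    return view_prefix + cube + view_suffix


def parse_drilldown(value):
    """Split 'dimension:level' into (dimension, level); level may be None."""
    dimension, _, level = value.partition(":")
    return dimension, (level or None)


print(cube_view_name("contracts", view_prefix="ft_", view_suffix="_view"))
print(parse_drilldown("date:month"))
print(parse_drilldown("date"))
```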
Links:
If you have any questions, comments, requests, do not hesitate to ask.
Tags:
announcement
release
cubes
olap
by Stefan Urbanek
-
New small release is out with quite nice addition of documentation. It does not bring too many new features, but contains a refactoring towards better package structure, that breaks some compatibility.
Documentation updates
Framework Changes
- added soft (optional) dependencies on backend libraries. Exception with useful information will be raised when functionality that depends on missing package is used. Example: “Exception: Optional package ‘sqlalchemy’ is not installed. Please install the package from http://www.sqlalchemy.org/ to be able to use: SQL streams. Recommended version is > 0.7”
- field related classes and functions were moved from ‘ds’ module to ‘metadata’ and included in brewery top-level: Field, FieldList, expand_record, collapse_record
- added probes
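The soft dependency idea from the list above can be sketched with importlib as shown below. This is a generic pattern and not the Brewery implementation: require_optional and MissingPackageError are hypothetical names, and the message wording only approximates the example quoted above.

```python
import importlib


class MissingPackageError(Exception):
    """Raised when an optional package is needed but not installed."""


def require_optional(module_name, feature, url=None):
    """Import an optional module or fail with a helpful message.

    Hypothetical helper for illustration only.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        message = (
            "Optional package '%s' is not installed. "
            "Install it to be able to use: %s." % (module_name, feature)
        )
        if url:
            message += " See %s" % url
        raise MissingPackageError(message)


# The module is imported only when the feature is actually used
csv_module = require_optional("csv", "CSV streams")
print(csv_module.__name__)
```

The error is raised only when the functionality is requested, so the rest of the framework can be used without the optional package.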
Deprecated functions
Streams
- new node: DeriveNode - derive new field with callables or string formula (python expression)
- new SelectNode implementation: accepts callables or string with python code
- former SelectNode renamed to FunctionSelectNode
Enjoy!
Tags:
announcement
release
brewery
by Stefan Urbanek
-
New version of Cubes - Python OLAP framework and server - was released.
Cubes is a framework for:
Notable changes:
- added ‘details’ to cube metadata - attributes that might contain fact details which are not relevant to aggregation, but might be interesting when displaying facts (such as contract name or notes)
- added ordering of facts in aggregation browser
SQL
- SQL denormalizer can now, by request, automatically add indexes to level key columns
- one detail table can be used more than once in SQL denormalizer (such as an organisation for both - supplier and requestor), added key alias to joins in model description, see joins documentation for more information.
Slicer server
- added log and log_level configuration options (under [server])
- added format= parameter to /facts, accepts json and csv
- added fields= parameter to /facts - comma separated list of returned fields in CSV (see API)
- limit number of facts returned in JSON (configurable by json_record_limit in [server] section), CSV can return whole dataset and will do it iteratively (we do not want to consume all of our memory, do we?)
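The difference between the limited JSON output and the iterative CSV output mentioned above can be sketched with the standard library only. The functions facts_as_json and facts_as_csv and the sample facts are made up for this illustration; the server code may look quite different.

```python
import csv
import io
import json
from itertools import islice


def facts_as_json(facts, record_limit):
    """Serialize at most record_limit facts, as JSON is built in memory."""
    return json.dumps(list(islice(facts, record_limit)))


def facts_as_csv(facts, fields):
    """Yield CSV text one row at a time, so the dataset is never held whole."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for fact in facts:
        writer.writerow([fact.get(field) for field in fields])
        yield buffer.getvalue()
        # Reuse the buffer for the next row
        buffer.seek(0)
        buffer.truncate(0)


facts = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
    {"id": 3, "amount": 30},
]

print(facts_as_json(iter(facts), record_limit=2))
for chunk in facts_as_csv(iter(facts), ["id", "amount"]):
    print(chunk, end="")
```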
Also many bugs were fixed, including localization in fact(s) retrieval and pagination. Sharing of single SQLAlchemy engine and model within server thread was added for performance reasons.
Enjoy.
Tags:
announcement
cubes
olap
by Stefan Urbanek
-
Freshly brewed clean data with analytical taste – that is what Data Brewery is for. The Python framework will allow you to:
- stream structured data from various sources (CSV, XLS, SQL database, Google spreadsheet) to various structured targets
- create analytical streams using flow-based programming: connect processing nodes together and let the structured data flow through them
- measure data properties, such as data quality or numerical statistics
- do advanced data mining in the future such as clustering or classification
You can use Brewery for analytical automation or just for ad-hoc analytical processing.
Project page is at databrewery.org. Source repository can be found at:
Documentation with examples and node reference can be found here.
Happy brewing!

Tags:
announcement
brewery