2012-04-04 by Stefan Urbanek

Brewery 0.8 Released

I'm glad to announce new release of Brewery – stream based data auditing and analysis framework for Python.

There are quite a few updates, to mention the notable ones:

  • new brewery runner with commands run and graph
  • new nodes: pretty printer node (for your terminal pleasure), generator function node
  • many CSV updates and fixes

Added several simple how-to examples, such as: aggregation of remote CSV, basic audit of a CSV, how to use a generator function. Feedback and questions are welcome. I'll help you.

Note that there are couple changes that break compatibility, however they can be updated very easily. I apologize for the inconvenience, but until 1.0 the changes might happen more frequently. On the other hand, I will try to make them as painless as possible.

Full listing of news, changes and fixes is below.

Version 0.8

News

  • Changed license to MIT
  • Created new brewery runner commands: 'run' and 'graph':
    • 'brewery run stream.json' will execute the stream
    • 'brewery graph stream.json' will generate graphviz data
  • Nodes: Added pretty printer node - textual output as a formatted table
  • Nodes: Added source node for a generator function
  • Nodes: added analytical type to derive field node
  • Preliminary implementation of data probes (just concept, API not decided yet for 100%)
  • CSV: added empty_as_null option to read empty strings as Null values
  • Nodes can be configured with node.configure(dictionary, protected). If 'protected' is True, then protected attributes (specified in node info) can not be set with this method.

  • added node identifier to the node reference doc

  • added create_logger

  • added experimental retype feature (works for CSV only at the moment)

  • Mongo Backend - better handling of record iteration

Changes

  • CSV: resource is now explicitly named argument in CSV*Node
  • CSV: convert fields according to field storage type (instead of all-strings)
  • Removed fields getter/setter (now implementation is totally up to stream subclass)
  • AggregateNode: rename aggregates to measures, added measures as public node attribute
  • moved errors to brewery.common
  • removed field_name(), now str(field) should be used
  • use named blogger 'brewery' instead of the global one
  • better debug-log labels for nodes (node type identifier + python object ID)

WARNING: Compatibility break:

  • depreciate __node_info__ and use plain node_info instead
  • Stream.update() now takes nodes and connections as two separate arguments

Fixes

  • added SQLSourceNode, added option to keep ifelds instead of dropping them in FieldMap and FieldMapNode (patch by laurentvasseur @ bitbucket)
  • better traceback handling on node failure (now actually the traceback is displayed)
  • return list of field names as string representation of FieldList
  • CSV: fixed output of zero numeric value in CSV (was empty string)

Links

  • github sources: https://github.com/Stiivi/brewery
  • Documentation: http://packages.python.org/brewery/
  • Mailing List: http://groups.google.com/group/databrewery/
  • Submit issues here: https://github.com/Stiivi/brewery/issues
  • IRC channel: #databrewery on irc.freenode.net

If you have any questions, comments, requests, do not hesitate to ask.