1. Introduction

1.1. Overview

At the most fundamental level, an “execution” or a “run” of any data-processing can be thought of like this:

      .--------------.     _____________        .-------------.
     ;  DataTree    ;    |             |      ;   DataTree   ;
    ;--------------; ==> |  <cfunc_1>  | ==> ;--------------;
   ; /some/data   ;      |  <cfunc_2>  |    ; /some/data   ;
  ;  /some/other ;       |     ...     |   ;  /some/other ;
 ;   /foo/bar   ;        |_____________|  ;   /foo/bar   ;
'--------------'                         '--------------'

  • The data-tree might come from JSON, HDF5, Excel workbooks, or plain dictionaries and lists. Its values are strings and numbers, NumPy arrays, pandas or xray datasets, etc.

  • The component-functions must abide by the following simple signature:

    cfunc_do_something(pandelone, datatree)
    

    and must not return any value, just read and write into the data-tree.

  • Here is a simple component-function:

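    def cfunc_standardize(pandelone, datatree):
        # pin/pon: relocatable paths marked as input and output, respectively.
        pin, pon = pandelone.paths()
        df = datatree.get(pin.A)
        # Scale column B by its standard deviation and store it under the output path.
        df[pon.A.B_std] = df[pin.A.B] / df[pin.A.B].std()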
    
  • Notice the use of the relocatable-paths marked specifically as input or output.

  • TODO: continue rough example in tutorial…
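
The contract described above (a component-function receives paths and a data-tree, modifies the tree in place, and returns nothing) can be imitated with nothing but the standard library. The sketch below is not the pandalone API: the helpers `get_path` and `set_path`, the `paths` dictionary and the slash-separated path convention are assumptions made purely for illustration, and a plain list stands in for a pandas column.

```python
import statistics


def get_path(tree, path):
    """Walk a slash-separated path such as '/A/B' down nested dicts."""
    node = tree
    for step in path.strip("/").split("/"):
        node = node[step]
    return node


def set_path(tree, path, value):
    """Store `value` at a slash-separated path, creating dicts as needed."""
    *parents, leaf = path.strip("/").split("/")
    node = tree
    for step in parents:
        node = node.setdefault(step, {})
    node[leaf] = value


def cfunc_standardize(paths, datatree):
    """Hypothetical component-function: reads one path, writes another."""
    values = get_path(datatree, paths["in"])
    std = statistics.stdev(values)  # sample standard deviation
    set_path(datatree, paths["out"], [v / std for v in values])
    # No return statement: the data-tree itself carries the result.


datatree = {"A": {"B": [2.0, 4.0, 6.0]}}
cfunc_standardize({"in": "/A/B", "out": "/A/B_std"}, datatree)
print(datatree["A"]["B_std"])  # [1.0, 2.0, 3.0]
```

Because the input and output locations arrive as arguments rather than being hard-coded, the same function can be pointed at a different branch of the tree without being edited, which is the idea behind marking paths as input or output.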

1.2. Quick-start

Note

The program runs on Python-2.7+ and Python-3.3+ (preferred) and requires numpy/scipy, pandas and win32 libraries along with their native backends to be installed. If you do not have such an environment already installed, please read the Install section below for suitable distributions such as Anaconda or WinPython.

Assuming that you have a working python-environment, open a command-shell (on Windows use cmd.exe, but ensure python.exe is in its PATH) and try the following commands:

Tip

The commands beginning with $, below, imply a Unix-like operating system with a POSIX shell (Linux, OS X). Although the commands are simple and easy to translate into their Windows cmd.exe counterparts, it would be worthwhile to install Cygwin to get the same environment on Windows. If you choose to do that, also include the following packages in Cygwin’s installation wizard:

* git, git-completion
* make, zip, unzip, bzip2, dos2unix
* openssh, curl, wget

But do not install or rely on Cygwin’s outdated python environment.

Install:
$ pip install pandalone                 ## Use `--pre` if version-string has a build-suffix.

Or, in case you need the very latest from the master branch:

$ pip install git+https://github.com/pandalone/pandalone.git

See: Install

Run:
$ pandalone --version
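
The same check can be made from inside Python, which may help when the `pandalone` script is not on the PATH. This is a minimal sketch, assuming Python 3.8 or later for the standard-library `importlib.metadata` module; the helper name `installed_version` is hypothetical and not part of pandalone.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if pandalone is installed, otherwise None.
print(installed_version("pandalone"))
```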

1.3. Discussion