Command-Line Syntax

xtab.py should be run at the operating-system command line, i.e., at a shell prompt in Linux or in a command window in Windows. Whether Python must be invoked explicitly, and whether the .py extension must be included, depends on your operating system, its settings, and how xtab is installed.

For Linux users: The xtab.py file contains a shebang line pointing to /usr/bin/python, so there should be no need to invoke the Python interpreter. Depending on how xtab.py was obtained and installed, it may need to be made executable with the chmod command.

For Windows users: If you are unfamiliar with running Python programs at the command prompt, see https://docs.python.org/2/faq/windows.html.

In the following syntax descriptions, angle brackets identify required replaceable elements, and square brackets identify optional replaceable elements.

xtab.py [options] -i <input_file> -o <output_file> -r <row_headers> -c <column_headers> -v <value_headers>
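
To make the syntax line above concrete, here is a minimal Python sketch that assembles an argument list in that order. The helper name build_xtab_command, the file names, and the column names are all hypothetical and chosen only for illustration; the sketch assumes Python 3.8 or later (for shlex.join) and does not run xtab.py.

```python
import shlex


def build_xtab_command(input_file, output_file, row_headers,
                       column_headers, value_headers, options=()):
    """Assemble an xtab.py argument list in the order of the syntax line.

    Hypothetical helper for illustration; it is not part of xtab.
    """
    return [
        "xtab.py", *options,
        "-i", input_file,
        "-o", output_file,
        "-r", *row_headers,      # one or more names, space-separated
        "-c", *column_headers,
        "-v", *value_headers,
    ]


# Hypothetical file and column names.
cmd = build_xtab_command(
    "samples.csv", "samples_xtab.csv",
    row_headers=["location", "sample_date"],
    column_headers=["analyte"],
    value_headers=["concentration", "units"],
    options=["-d2"],
)
print(shlex.join(cmd))
# xtab.py -d2 -i samples.csv -o samples_xtab.csv -r location sample_date -c analyte -v concentration units
```

The resulting list could be passed to subprocess.run, with the Python interpreter prepended if your system requires Python to be invoked explicitly.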

Required Arguments

-i <filename>
    The name of the input file from which to read data.
    This must be a text file, with data in a normalized format.
    The first line of the file must contain column names.
-o <filename>
    The name of the output file to create. The output file
    will be created as a .csv file.
-r <column_name1> [column_name2 [...]]
    One or more column names to use as row headers. Multiple
    column names should be separated by spaces. Unique values
    of these columns will appear at the beginning of every output
    line.
-c <column_name1> [column_name2 [...]]
    One or more column names to use as column headers in the
    output.  Multiple column names should be separated by spaces.
    A crosstab column (or columns) will be created for every
    unique combination of values of these fields in the input.
-v <column_name1> [column_name2 [...]]
    One or more column names with values to be used to fill the
    cells of the cross-table.  If *n* column names are specified,
    then there will be *n* columns in the output table for each
    of the column headers corresponding to values of the -c
    argument. The column names specified with the -v argument
    will be appended to the output column headers created from
    values of the -c argument.  There should be only one value
    of the -v column(s) for each combination of the -r and -c
    columns; if there is more than one, a warning will be printed
    and only the first value will appear in the output. That is,
    values are not combined in any way when there are multiple
    values for an output cell.
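
The roles of the -r, -c, and -v arguments can be illustrated with a small Python sketch of the crosstab operation itself. This is not xtab's implementation (the option descriptions indicate xtab works through SQLite); it is a simplified model assuming Python 3.7 or later, with hypothetical column names and data. The ordering of output rows and columns (first seen in the input) is an assumption of the sketch. It does follow the rule stated above: when a cell has more than one value, a warning is issued and the first value is kept.

```python
import csv
import io
import warnings


def crosstab(rows, row_cols, col_cols, val_cols, sep="_"):
    """Pivot normalized rows (dicts) into a crosstab; the first value wins."""
    col_keys = []   # unique -c combinations, in first-seen order (assumed)
    table = {}      # row key -> {(column key, value column): value}
    for rec in rows:
        rkey = tuple(rec[c] for c in row_cols)
        ckey = tuple(rec[c] for c in col_cols)
        if ckey not in col_keys:
            col_keys.append(ckey)
        cells = table.setdefault(rkey, {})
        if any((ckey, v) in cells for v in val_cols):
            # Values are never combined: warn and keep the first one.
            warnings.warn(f"multiple values for {rkey} x {ckey}; keeping the first")
            continue
        for v in val_cols:
            cells[(ckey, v)] = rec[v]
    # One output column per (-c combination, -v column) pair.
    header = list(row_cols) + [sep.join(ckey + (v,))
                               for ckey in col_keys for v in val_cols]
    body = [list(rkey) + [cells.get((ckey, v), "")
                          for ckey in col_keys for v in val_cols]
            for rkey, cells in table.items()]
    return [header] + body


# Hypothetical normalized input: one row per location and analyte.
normalized = """location,analyte,conc,units
A,lead,1.2,mg/L
A,zinc,3.4,mg/L
B,lead,5.6,mg/L
"""
rows = list(csv.DictReader(io.StringIO(normalized)))
result = crosstab(rows, ["location"], ["analyte"], ["conc", "units"])
for line in result:
    print(",".join(line))
# location,lead_conc,lead_units,zinc_conc,zinc_units
# A,1.2,mg/L,3.4,mg/L
# B,5.6,mg/L,,
```

With two -v columns and two distinct -c values, the output has four crosstab columns, matching the "*n* columns for each column header" rule described for the -v argument.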

Optional Arguments

-d[1|2|3|4]
    Controls the format of column headers. The four alternatives are:
        -d1 or no option specified
            One row of column headers, with elements joined by
            underscores to facilitate parsing by other programs.
        -d or -d2
            Two rows of column headers. The first row contains
            values of the columns specified by the -c argument,
            and the second row contains the column names specified
            by the -v argument.
        -d3
            One header row for each of the values of the columns
            specified by the -c argument, plus one row with the
            column names specified by the -v argument.
        -d4
            Like -d3, but the values of the columns specified by
            the -c argument are labeled with (preceded by) the
            column names.
-f
    Use a temporary (sqlite) file instead of memory for intermediate
    storage.
-n
    Use the specified default string in the output wherever an empty
    or null value would otherwise appear.
-k
    Keep (i.e., do not delete) the sqlite file.  Only useful with the
    "-f" option.  Unless the "-t" option is also used, the table name
    will be "src".
-t <tablename>
    Name to use for the table in the intermediate sqlite database.
    Only useful with the "-f" and "-k" options.
-p <sep_char>
    A character to be used instead of the underscore when multiple input
    column headers are combined to create a single output column header
    when using the "-d1" (default) option.
-e [filename]
    Log all error messages: to the named file if a filename is
    specified, or to the console if it is not.
-q <filename>
    Log, to the named file, the sequence of SQL commands used to
    extract data from the input file and write the output file,
    including the result of each command.
-h
    Print this help and exit.
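
The difference between the -d1 and -d2 header formats, and the effect of -p, can be sketched in Python as follows. This is an illustrative model based only on the descriptions above, not xtab's code: the function name crosstab_headers is hypothetical, it covers only the crosstab columns (not the row-header columns) and a single -c column, and repeating each -c value above every value column in the two-row format is an assumption.

```python
def crosstab_headers(col_values, value_names, style=1, sep="_"):
    """Return header rows for the crosstab columns only.

    style=1 models -d1 (one joined row; sep models the -p character).
    style=2 models -d2 (a row of -c values above a row of -v names).
    """
    pairs = [(c, v) for c in col_values for v in value_names]
    if style == 1:
        return [[sep.join((c, v)) for c, v in pairs]]
    if style == 2:
        return [[c for c, _ in pairs], [v for _, v in pairs]]
    raise ValueError("only styles 1 and 2 are sketched here")


# Hypothetical -c values and -v column names.
print(crosstab_headers(["lead", "zinc"], ["conc", "units"]))
# [['lead_conc', 'lead_units', 'zinc_conc', 'zinc_units']]
print(crosstab_headers(["lead", "zinc"], ["conc", "units"], sep="."))
# [['lead.conc', 'lead.units', 'zinc.conc', 'zinc.units']]
print(crosstab_headers(["lead", "zinc"], ["conc", "units"], style=2))
# [['lead', 'lead', 'zinc', 'zinc'], ['conc', 'units', 'conc', 'units']]
```

The single-row form is the easier one for other programs to parse, which is the reason given above for it being the default.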