Program Examples

The following examples illustrate usage of xtab.py as a command-line program.

Example 1: Single Columns for Row and Column Headers

At least one column of the input file must be specified for each of the three elements of the output file: a row heading, a column heading, and a cell value. Therefore every input file must have at least three columns, as illustrated by the following table.

Station Date Value
WQ-01 2006-05-23 4.5
WQ-02 2006-05-23 3.7
WQ-03 2006-05-23 6.8
WQ-01 2006-06-15 9.7
WQ-02 2006-05-15 5.1
WQ-03 2006-06-15 7.2
WQ-01 2006-07-19 10
WQ-02 2006-07-19 6.1
WQ-03 2006-07-19 8.8

The corresponding CSV file would look something like:

Station,Date,Value
WQ-01,2006-05-23,4.5
WQ-02,2006-05-23,3.7
WQ-03,2006-05-23,6.8
WQ-01,2006-06-15,9.7
WQ-02,2006-05-15,5.1
WQ-03,2006-06-15,7.2
WQ-01,2006-07-19,10
WQ-02,2006-07-19,6.1
WQ-03,2006-07-19,8.8

A command-line command to transform this file (input.csv) might look like this:

xtab.py -i input.csv -o output.csv -r Station -c Date -v Value

And the resulting table (output.csv) would look like:

Station 2006-05-23 2006-06-15 2006-07-19
WQ-01 4.5 3.7 6.8
WQ-02 9.7 5.1 7.2
WQ-03 10 6.1 8.8

Example 2: Multiple Row and Column Headers

The input file may contain multiple columns with values that are to be preserved as separate columns in the output (these are listed after the -r command-line flag). The output file may have columns that represent the combination of several columns in the input file (listed after the -c command-line flag), and the output file may have multiple columns in the crosstabbed result (listed after the -v flag). This example illustrates all three of these features.

For an input data file (input.csv) like the following,

location_id sample_date analyte value units meas_basis quals
RM708B1 2005-04-07 Arsenic 3.9 mg/kg dry U
RM708B1 2005-04-07 Cadmium 1.6 mg/kg dry  
RM708B1 2005-04-07 Copper 14.1 mg/kg dry  
RM708B1 2005-04-07 Mercury 0.078 mg/kg dry J
RM708B1 2005-04-07 Arsenic 4.1 mg/kg dry U
RM708B1 2005-04-07 Cadmium 1.3 mg/kg dry  
RM708B1 2005-04-07 Copper 14.6 mg/kg dry  
RM708B1 2005-04-07 Mercury 0.061 mg/kg dry J
RM744X2 2005-04-08 Arsenic 29.1 mg/kg dry  
RM744X2 2005-04-08 Cadmium 2.9 mg/kg dry  
RM744X2 2005-04-08 Copper 2380.0 mg/kg dry  
RM744X2 2005-04-08 Mercury 0.011 mg/kg dry J
RM744X2 2005-04-08 Arsenic 28.3 mg/kg dry  
RM744X2 2005-04-08 Cadmium 2.6 mg/kg dry  
RM744X2 2005-04-08 Copper 2330.0 mg/kg dry  
RM744X2 2005-04-08 Mercury 0.015 mg/kg dry J
RM742B2 2005-04-09 Cadmium 1.2 mg/kg dry  
RM742B2 2005-04-09 Arsenic 24.6 mg/kg dry  
RM742B2 2005-04-09 Copper 2190.0 mg/kg dry  
RM742B2 2005-04-09 Mercury 0.01 mg/kg dry J
RM742B2 2005-04-09 Arsenic 25.2 mg/kg dry  
RM742B2 2005-04-09 Cadmium 1.2 mg/kg dry  
RM742B2 2005-04-09 Copper 2240.0 mg/kg dry  
RM742B2 2005-04-09 Mercury 0.03 mg/kg dry J
RM704X3 2005-04-08 Arsenic 10.0 mg/kg dry  
RM704X3 2005-04-08 Cadmium 4.7 mg/kg dry  
RM704X3 2005-04-08 Copper 42.4 mg/kg dry  
RM704X3 2005-04-08 Mercury 0.62 mg/kg dry  
RM704X3 2005-04-08 Arsenic 6.8 mg/kg dry  
RM704X3 2005-04-08 Cadmium 4.0 mg/kg dry  
RM704X3 2005-04-08 Copper 35.9 mg/kg dry  
RM704X3 2005-04-08 Mercury 0.92 mg/kg dry  
RM726X1 2005-04-08 Arsenic 10.6 mg/kg dry  
RM726X1 2005-04-08 Cadmium 2.0 mg/kg dry  

the command:

xtab.py -d3 -i input.csv -o output.csv -r location_id sample_date
        -c analyte units meas_basis -v value quals

(the command line is wrapped for readability) will produce the following output (output.csv):

location_id
sample_date
Arsenic
mg/kg
dry
value
quals
Cadmium
mg/kg
dry
value
quals
Copper
mg/kg
dry
value
quals
Mercury
mg/kg
dry
value
quals
RM708B1 2005-04-07 3.9 U 1.6   14.1   0.078 J
RM744X2 2005-04-08 29.1   2.9   2380.0   0.011 J
RM742B2 2005-04-09 24.6   1.2   2190.0   0.01 J
RM704X3 2005-04-08 10.0   4.7   42.4   0.62  
RM726X1 2005-04-08 10.6   2.0