OpenDX - Documentation

ImportSpreadsheet

Category

Import and Export

Function

Import spreadsheet format data

Syntax

field, labellist = ImportSpreadsheet(filename, delimiter,
                                     columnname, format, categorize,
                                     start, end, delta,
                                     headerlines, labelline);

Inputs
Name         Type         Default         Description
filename     string       (none)          name of file to import
delimiter    string       " "             one-character delimiter (what separates the columns)
columnname   string list  (all)           names of columns to import
format       string       1-d             import as 1-d or 2-d field
categorize   string list  ""              list of columns to categorize during import
start        integer      (first record)  record (row) to begin importing
end          integer      (last record)   record (row) to end importing
delta        integer      1               increment of rows to import
headerlines  integer      0               number of lines to skip before start of data/column labels
labelline    integer      (no default)    number of the line that holds the column labels

Outputs
Name       Type         Description
field      field        a field with each of the columns as a component, with the name of the column as the component name
labellist  string list  a list of the imported column names

Functional Details

ImportSpreadsheet imports spreadsheet (i.e. tabular) ASCII data. Each column in the file is imported as a separate component in the resulting output field. The name of the component is taken from a name at the top of the column in the file, if present. If no name is found, a default name of "column#" is used instead.
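The column-to-component mapping described above can be modelled in a few lines of Python. This is only an illustrative sketch, not Data Explorer code: the function name to_components is hypothetical, and numbering the default names from 0 ("column0", "column1", ...) is an assumption, since the text only gives the pattern "column#".

```python
def to_components(rows, labels=None):
    """Turn a list of rows into a dict mapping column name -> list of entries.

    labels=None models a file with no names at the top of the columns.
    """
    ncols = len(rows[0])
    if labels is None:
        # Assumption: the "#" in "column#" is the column index, counted from 0.
        labels = [f"column{i}" for i in range(ncols)]
    return {name: [row[i] for row in rows] for i, name in enumerate(labels)}


rows = [["1", "2"], ["3", "4"]]
print(to_components(rows))              # {'column0': ['1', '3'], 'column1': ['2', '4']}
print(to_components(rows, ["x", "y"]))  # {'x': ['1', '3'], 'y': ['2', '4']}
```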

If any column entry is NULL or consists of just white space, this entry is treated as invalid data in Data Explorer. If it is a column of type

In addition, a "componentname missingvalues" component is created which references those invalid entries. An "invalid positions" component is also created; it is the union of all the "componentname missingvalues" components. For more information on invalid positions, see "Invalid Positions and Invalid Connections Components" in the IBM Visualization Data Explorer User's Guide.
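The relationship between the per-column "missingvalues" components and "invalid positions" can be sketched in Python. This is an illustrative model only, not Data Explorer code: the helper names and the representation of a column as a list of strings are assumptions made for the example.

```python
def missing_rows(column):
    """Return the set of row indices whose entry is NULL or only white space."""
    return {i for i, entry in enumerate(column)
            if entry is None or entry.strip() == ""}


def invalid_positions(columns):
    """Union of the missing-row sets of every column (hypothetical helper)."""
    invalid = set()
    for entries in columns.values():
        invalid |= missing_rows(entries)
    return invalid


columns = {
    "city": ["Boston", "", "Austin", "Denver"],
    "temp": ["12.5", "9.0", "   ", "3.1"],
}
per_column = {name: sorted(missing_rows(col)) for name, col in columns.items()}
print(per_column)                          # {'city': [1], 'temp': [2]}
print(sorted(invalid_positions(columns)))  # [1, 2]
```

Row 1 is invalid because of "city" and row 2 because of "temp", so both rows appear in the union.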

filename

is the file to import.

delimiter

specifies a one-character delimiter which defines the columns. If you do not specify delimiter, white space is assumed to delimit the columns.

Note: The tab delimiter is specified as "\t".
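The difference between the default white-space splitting and an explicit one-character delimiter can be illustrated with a small Python sketch. The function split_columns is hypothetical and models only the splitting rule stated above, not the module's actual parser.

```python
def split_columns(line, delimiter=None):
    """Split one line of a file into column entries.

    delimiter=None models the default, where white space separates columns.
    """
    if delimiter is None:
        return line.split()
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    return line.rstrip("\n").split(delimiter)


print(split_columns("1.0  2.0   3.0"))  # ['1.0', '2.0', '3.0']
print(split_columns("a,b,,d", ","))     # ['a', 'b', '', 'd']
print(split_columns("x\ty\n", "\t"))    # ['x', 'y']
```

In the second call the empty third entry is preserved by the split.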

columnname

is a list of the names of the columns you wish to import.

format

must be either "1-d" or "2-d". If you specify "1-d", the positions of the output field are simply the indices (row numbers) from 0 to number-of-rows. The field has as many components as there are imported columns, with each component named by the column name. If you specify "2-d", the output field is a c x r grid, where r is the number of imported rows and c is the number of imported columns. It has a single data component that contains all the values in the imported rows and columns. With "2-d", the imported columns cannot mix string data with numerical data.
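The two layouts can be contrasted with a short Python model. This is a sketch under stated assumptions, not the module's implementation: the function names and dictionary layout are hypothetical, the sketch assumes one position per row numbered from 0, and the nesting order of the 2-d data (rows outermost) is chosen for readability only.

```python
def as_1d(columns):
    """Model of "1-d": one component per column, positions are row indices."""
    nrows = len(next(iter(columns.values())))
    return {"positions": list(range(nrows)), "components": dict(columns)}


def as_2d(columns):
    """Model of "2-d": a single data component holding every value."""
    kinds = {isinstance(v, str) for col in columns.values() for v in col}
    if len(kinds) > 1:
        raise ValueError('"2-d" cannot mix string and numerical columns')
    names = list(columns)
    nrows = len(columns[names[0]])
    # Assumption: rows are the outer level; the real grid ordering may differ.
    return {"data": [[columns[n][r] for n in names] for r in range(nrows)]}


columns = {"x": [1.0, 2.0, 3.0], "y": [10.0, 20.0, 30.0]}
print(as_1d(columns)["positions"])  # [0, 1, 2]
print(as_2d(columns)["data"])       # [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
```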

categorize

specifies columns to be categorized, using the Categorize module. If "allstring" is specified, all columns with a data type of "string" are categorized. For additional information, see Categorize.
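The selection of columns, including the "allstring" shorthand, can be sketched in Python as follows. Both helpers are hypothetical. In particular, categorize_column is only an assumed model of what categorizing a column means (a sorted table of unique values plus one index per row); the Categorize documentation remains the authority on the actual behavior.

```python
def columns_to_categorize(columns, categorize):
    """Return the names of the columns selected by the categorize list."""
    if categorize == ["allstring"]:
        return [name for name, col in columns.items()
                if all(isinstance(v, str) for v in col)]
    return [name for name in categorize if name in columns]


def categorize_column(values):
    """Assumed model: replace each value by its index in a sorted lookup table."""
    lookup = sorted(set(values))
    index = {value: i for i, value in enumerate(lookup)}
    return [index[value] for value in values], lookup


columns = {"state": ["NY", "CA", "NY"], "pop": [8.4, 3.9, 0.2]}
print(columns_to_categorize(columns, ["allstring"]))  # ['state']
print(categorize_column(columns["state"]))            # ([1, 0, 1], ['CA', 'NY'])
```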

headerlines

specifies the number of lines to skip before the start of the data/column labels, for skipping comments at the top of the file. Note that this would typically be necessary only when the data being imported is all strings, or if you have comments at the top of the file that could be misinterpreted as labels or data.
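The effect of skipping leading lines can be shown with a minimal Python sketch. The function read_table is hypothetical, and it assumes that the first line after the skipped ones holds the column labels and that columns are separated by white space; the sample lines are invented for illustration.

```python
def read_table(lines, headerlines=0):
    """Skip `headerlines` lines, then read labels followed by data rows."""
    body = lines[headerlines:]
    labels = body[0].split()  # assumption: labels are on the first kept line
    rows = [line.split() for line in body[1:]]
    return labels, rows


lines = [
    "sample file for illustration",
    "second comment line",
    "name color",
    "apple red",
    "plum purple",
]
labels, rows = read_table(lines, headerlines=2)
print(labels)  # ['name', 'color']
print(rows)    # [['apple', 'red'], ['plum', 'purple']]
```

Without headerlines=2, the first comment line would be taken for labels in this model, which is the kind of misinterpretation the parameter is meant to avoid.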

labelline

specifies the number of the line that contains the column labels. Note that this would be necessary only when the data being imported is all strings.

start, end, and delta

specify the records (rows) you wish to import.
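Row selection with start, end, and delta can be modelled as a strided slice. The Python sketch below is illustrative only: select_rows is a hypothetical name, and it assumes that records are counted from 0 and that end is inclusive, neither of which is stated in the text above.

```python
def select_rows(rows, start=None, end=None, delta=1):
    """Return every delta-th row from start through end (inclusive)."""
    if delta < 1:
        raise ValueError("delta must be a positive integer")
    first = 0 if start is None else start           # default: first record
    last = len(rows) - 1 if end is None else end    # default: last record
    return rows[first:last + 1:delta]


rows = list(range(10))
print(select_rows(rows))           # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(select_rows(rows, 2, 8, 3))  # [2, 5, 8]
print(select_rows(rows, delta=4))  # [0, 4, 8]
```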

Example Visual Programs

Categorical.net
Duplicates.net
Zipcodes.net

See Also

Categorize, Import

