Basic Data Transformations

Rounding, Truncating and Precision Control

       BASE10            Convert number to #.### and a power of 10.
       CEIL              Round up towards +INF.
       FLOOR             Round down towards -INF.
       PRCSN             Set computational precision for matrix operations.
       ROUND             Round to the nearest integer.
       TRUNC             Truncate decimal portion of number.

All calculations in GAUSS are done in double precision, with the exception of some of the intrinsic functions, which may use extended precision (18-19 digits of accuracy). Use PRCSN to change the internal accuracy used in these cases.

ROUND, TRUNC, CEIL and FLOOR convert floating point numbers into integers. The internal representation for the converted integer is still 64-bit (double precision).
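
The lines below sketch how the four functions differ on one positive and one negative value. The variable names are illustrative only.

    x = { 2.7, -2.7 };
    r = round(x);     /*  3  -3   nearest integer       */
    t = trunc(x);     /*  2  -2   decimal part dropped  */
    c = ceil(x);      /*  3  -2   towards +INF          */
    f = floor(x);     /*  2  -3   towards -INF          */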

Each matrix element in memory requires 8 bytes of workspace. See the function CORELEFT to determine availability of workspace.

Mathematical and Scientific Functions

 Trigonometric Functions              Other Math Functions
 -----------------------              --------------------
 ARCCOS      Inverse cosine.          BESSELJ     Bessel function, 1st kind.
 ARCSIN      Inverse sine.            BESSELY     Bessel function, 2nd kind.
 ARCTAN      Inverse tangent.         EXP         Exponential function.
 ARCTAN2     Angle whose tangent      GAMMA       Gamma function.
             is y/x.                  LN          Natural log.
 COS         Cosine.                  LNFACT      Natural log of factorial
 COSH        Hyperbolic cosine.                   function.
 SIN         Sine.                    LOG         Log base 10.
 SINH        Hyperbolic sine.         PI          Returns the value of PI.
 TAN         Tangent.                 SQRT        Square root.
 TANH        Hyperbolic tangent.

All trigonometric functions take or return values in radian units. All mathematical functions are calculated in double precision, with the exception of the BESSEL functions and GAMMA. These are calculated to roughly single precision.
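
Because arguments are in radians, an angle held in degrees has to be converted first. A minimal sketch, with illustrative variable names:

    deg = 30;
    rad = deg * pi / 180;          /* degrees to radians */
    s = sin(rad);                  /* approximately 0.5  */
    back = arcsin(s) * 180 / pi;   /* approximately 30   */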

Differentiation and Integration Routines

       GRADP             Computes first derivative of a function.
       HESSP             Computes second derivative of a function.
       INTQUAD1          Integrates a 1-dimensional function.
       INTQUAD2          Integrates a 2-dimensional function.
       INTQUAD3          Integrates a 3-dimensional function.
       INTSIMP           Integrates by Simpson's method.
       INTGRAT2          Integrates over a region defined by functions of x.
       INTGRAT3          Integrates over a region defined by functions of
                         x and y.

GRADP and HESSP use a finite difference approximation to the first and second derivatives. Use GRADP to calculate a Jacobian.
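
As a rough sketch, the calls below differentiate a small test function at one point. The procedure name fct and the point x0 are made up for illustration, and the results are finite difference approximations, not exact values.

    proc fct(x);
        retp( x[1]^2 + 3*x[1]*x[2] );
    endp;

    x0 = { 1, 2 };
    g = gradp(&fct,x0);     /* 1x2, approximately  8  3       */
    h = hessp(&fct,x0);     /* 2x2, approximately  2 3 / 3 0  */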

INTQUAD1, INTQUAD2, and INTQUAD3 use Gaussian quadrature to calculate the integral of the user-defined function over a rectangular region.

To calculate an integral over a region whose limits are defined by functions of x, use INTGRAT2. For a region defined by functions of x and y, use INTGRAT3.

To get a greater degree of accuracy than that provided by INTQUAD1, use INTSIMP for one-dimensional integration.
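
A minimal sketch comparing the two one-dimensional routines on the integral of x squared from 0 to 1. The procedure name sq and the tolerance 1e-8 are illustrative assumptions. The limits vector holds the upper limit first and the lower limit second.

    proc sq(x);
        retp( x .* x );            /* must work element by element */
    endp;

    xl = { 1, 0 };                 /* upper limit, then lower limit */
    y1 = intquad1(&sq,xl);         /* approximately 1/3 */
    y2 = intsimp(&sq,xl,1e-8);     /* approximately 1/3 */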

Root Finding, Polynomial Multiplication and Interpolation

       POLYCHAR          Computes characteristic polynomial of a
                         square matrix.
       POLYEVAL          Evaluates polynomial with given coefficients.
       POLYINT           Calculates Nth order polynomial interpolation
                         given known point pairs.
       POLYMULT          Multiplies two polynomials together.
       POLYMAKE          Computes polynomial coefficients from roots.
       POLYMAT           Returns sequence of powers of a matrix.
       POLYROOT          Computes roots of polynomial from coefficients.
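
The following sketch runs a small quadratic through several of these routines. It assumes coefficient vectors are ordered from the highest power down to the constant term. The variable names are illustrative.

    r = { 1, 2 };
    c = polymake(r);        /* 1 -3 2, that is, x^2 - 3x + 2   */
    y = polyeval(3,c);      /* 9 - 9 + 2 = 2                   */
    z = polyroot(c);        /* the roots 1 and 2               */
    c1 = { 1, -1 };         /* x - 1 */
    c2 = { 1, -2 };         /* x - 2 */
    p = polymult(c1,c2);    /* 1 -3 2, same polynomial as c    */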


Frequency Transforms - FFT's

       FFT               Compute 1- or 2-D Fast Fourier Transform (FFT).
       FFTI              Compute inverse FFT.
       FFTN              Compute FFT.
       DFFT              Compute 1-D Discrete Fourier Transform (DFT).
       DFFTI             Compute inverse DFT.
       RFFT              Compute 1- or 2-D real FFT.
       RFFTI             Compute inverse real FFT.
       RFFTIP            Compute inverse real FFT, takes packed format FFT.
       RFFTN             Compute real FFT.
       RFFTNP            Compute real FFT, return packed format FFT.
       RFFTP             Compute real FFT, return packed format FFT.

FFT, RFFT, and RFFTP require the dimensions of the input matrix to be powers of 2. FFTN, FFTI, RFFTN, RFFTNP, RFFTI, and RFFTIP allow them to be products of 2, 3, 5, and 7.

RFFTNP and RFFTP return only the positive frequencies and Nyquist frequencies, as the negative frequencies are often not needed in RFFT applications.

FFT and RFFT are supported for backward compatibility. Use FFTN and RFFTN if you can.
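
A minimal round-trip sketch using a length that is not a power of 2. The variable names are illustrative, and the recovered vector should match the input only up to rounding error.

    x = seqa(1,1,12);       /* 12 = 2*2*3, a product of 2's and 3's */
    y = fftn(x);            /* complex 12x1 transform               */
    z = real(ffti(y));      /* inverse; should reproduce x          */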

Random Number Generators and Seeds

       Random Number Generators
       RNDN        Creates a matrix of normally distributed random numbers.
       RNDU        Creates a matrix of uniformly distributed random numbers.

       Random Number Generator Control
       RNDCON      Changes constant of random number generator.
       RNDMOD      Changes modulus of random number generator.
       RNDMULT     Changes multiplier of random number generator.
       RNDNS       Creates a matrix of normally distributed random
                   numbers using a specified seed.
       RNDSEED     Changes seed of random number generator.
       RNDUS       Creates a matrix of uniformly distributed random
                   numbers using a specified seed.

The random number generator can be seeded. Set the seed using RNDSEED or generate the random numbers using either RNDUS for uniformly distributed numbers or RNDNS for normally distributed numbers. For example:

                seed = 44435667;
                x = rndus(1,1,seed);    
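
The alternative route, setting the seed once with RNDSEED, can be sketched as follows. Resetting the same seed should reproduce the same draws. The variable names are illustrative.

    rndseed 44435667;
    a = rndu(3,1);
    rndseed 44435667;
    b = rndu(3,1);
    same = (a == b);        /* 1 if both draws are identical */
    print same;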

Complex Number Operations

       COMPLEX           Converts 2 real matrices to 1 complex matrix.
       IMAG              Returns imaginary part of complex matrix.
       ISCPLX            Returns 1 (TRUE) if its argument is complex.
       REAL              Returns real part of complex matrix.

Most other operators and functions operate on complex scalars and matrices in the expected way with no extra programming necessary.
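
A short sketch of building and taking apart a complex scalar, with illustrative variable names:

    z = complex(1,2);       /* 1 + 2i                        */
    re = real(z);           /* 1                             */
    im = imag(z);           /* 2                             */
    c = iscplx(z);          /* 1                             */
    w = z * z;              /* -3 + 4i, ordinary operator    */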

Fuzzy Comparison Operators

       Scalar Comparisons
       FEQ       Fuzzy ==
       FGE       Fuzzy >=
       FGT       Fuzzy >
       FLE       Fuzzy <=
       FLT       Fuzzy <
       FNE       Fuzzy /=

       Element-by-Element Comparisons
       DOTFEQ    Fuzzy .==
       DOTFGE    Fuzzy .>=
       DOTFGT    Fuzzy .>
       DOTFLE    Fuzzy .<=
       DOTFLT    Fuzzy .<
       DOTFNE    Fuzzy ./=

The global variable _FCMPTOL controls the tolerance used for comparison. By default, this is 1E-15. The default can be changed by editing the file FCOMPARE.DEC.
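
The sketch below shows the kind of case the fuzzy operators are meant for: two values that are equal on paper but differ by rounding error. It assumes the default tolerance is in effect.

    a = 0.1 + 0.2;
    b = 0.3;
    exact = (a == b);       /* usually 0, the values differ in the last bits */
    fuzzy = feq(a,b);       /* 1, the difference is within the tolerance     */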



GAUSS-386i allows expressions that directly reference variables (columns) of a data set. This is done within the context of a data loop.

    dataloop infile outfile;
        drop wagefac wqlec shordelt foobly;
        csed = ln(sqrt(csed));
        select csed > 0.35 and married $== "y";
        make chfac = hcfac + wcfac;
        keep csed chfac stid recsum voom;
    endata;

Dataloop procedures

    CODE           create variable based on a set of logical expressions
    DELETE         delete rows (observations) based on a logical expression
    DROP           specify variables NOT to be written to data set
    EXTERN         allows access to matrices and strings in memory
    KEEP           specify variables to be written to output data set
    LAG            lag variables a number of periods
    LISTWISE       controls deletion of missing values
    MAKE           create new variable
    OUTTYP         specify output file precision
    RECODE         change variable based on a set of logical expressions
    SELECT         select rows (observations) based on a logical expression
    VECTOR         create new variable from a scalar-returning expression

In any expression inside of a data loop, all symbols not immediately followed by a left parenthesis `(' are assumed to be data set variable (column) names.

All program statements in the main file and not inside of a data loop are passed through without modification. Program statements inside of a data loop that are preceded by a `#' are passed through to the data loop without modification. The adept user who is familiar with the code generated in the temporary file can use this to do out of the ordinary operations inside the data loop.

The translator that processes data loops can be turned on and off with Ctrl-T or from the Alt-C configuration menu. When the translation is on, the main program file will be translated to a temporary file with the data loops expanded to the appropriate GAUSS code to perform the data loop operations.

Lines within a data loop preceded by a `#' are passed through unchanged. This allows the adept user to use any GAUSS commands or functions within the data loop.

When assigning a character variable to a numeric variable or vice versa, the #= and $= operators must be used to force a type change by the translator.

      numvar $= charvar;     /* numvar is now a character variable */
      charvar #= numvar;     /* charvar is now a numeric variable */

The translator writes a temporary file called $XRUN$.TMP which can be loaded into the editor after a run by pressing Ctrl-F1 twice. The first time you press Ctrl-F1, the original source is loaded into the editor. The next time, the original source file will be saved and then swapped out and the translated file will be loaded into the editor. Each subsequent Ctrl-F1 will swap the two files, saving the one currently in memory. Pressing F8 will swap the files without saving any changes to the one in memory.

To keep the translated file permanently, rename it using Alt-O while it is in memory and save it by pressing Alt-W.


Cumulative Distribution Functions

     CDFBETA           Computes integral of beta function.
     CDFBVN            Computes lower tail of bivariate normal cdf.
     CDFCHIC           Computes complement of cdf of chi-square.
     CDFCHINC          Computes integral of noncentral chi-square.
     CDFFC             Computes complement of cdf of F.
     CDFFNC            Computes integral of noncentral F.
     CDFGAM            Computes integral of incomplete gamma function.
     CDFN              Computes integral of normal distribution:  lower tail.
     CDFNC             Computes complement (1-cdf) of normal distribution.
     CDFNI             Computes inverse of cdf of normal distribution.
     CDFTC             Computes complement of cdf of t-distribution.
     CDFTCI            Computes inverse of complement of t-distribution cdf.
     CDFTNC            Computes integral of noncentral t-distribution.
     CDFTVN            Computes lower tail of trivariate normal cdf.
     ERF               Computes Gaussian error function.
     ERFC              Computes complement of Gaussian error function.
     PDFN              Computes standard normal probability density function.
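
A few familiar values, as a quick sketch of how the lower tail, complement and inverse routines relate. The variable names are illustrative and the results are rounded.

    p = cdfn(1.96);          /* approximately 0.975 */
    q = cdfnc(1.96);         /* approximately 0.025 */
    z = cdfni(0.975);        /* approximately 1.96  */
    pv = cdfchic(3.84,1);    /* approximately 0.05, chi-square with 1 d.f. */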

Descriptive Statistics

       CROSSPRD          Computes cross product.
       MOMENT            Computes moment matrix (x'x) with special handling
                         of missing values.
       MOMENTD           Computes moment matrix from data set.
       DSTAT             Computes descriptive statistics of a data matrix.
       CONV              Computes convolution of two vectors.
       CORRM             Computes correlation matrix of a moment matrix.
       CORRVC            Computes correlation matrix from a variance-
                         covariance matrix.
       CORRX             Computes correlation matrix.
       MEANC             Computes mean value of every column of a matrix.
       STDC              Computes standard deviation of every column.
       VCM               Computes a variance-covariance matrix from a moment
                         matrix.
       VCX               Computes a variance-covariance matrix from a data
                         matrix.
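
A minimal sketch on simulated data. The matrix x and the other names are illustrative.

    x = rndn(100,3);
    m = meanc(x);           /* 3x1 column means                      */
    s = stdc(x);            /* 3x1 column standard deviations        */
    vc = vcx(x);            /* 3x3 variance-covariance matrix        */
    c1 = corrx(x);          /* 3x3 correlation matrix from the data  */
    c2 = corrvc(vc);        /* same correlations, computed from vc   */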

See @REG for linear regression routines.

Linear Regression

       OLS                Computes least squares regression of data set.
       OLSQR              Computes OLS coefficients using QR decomposition.
       OLSQR2             Computes OLS coefficients, residuals and predicted
                          values using QR decomposition.

These functions can handle missing data by performing either a listwise or pairwise deletion. Also, they produce extensive printed output.
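
For matrices already in memory, the QR-based routines can be sketched as follows. The data are simulated and all names are illustrative.

    n = 50;
    x = ones(n,1)~rndn(n,1);         /* constant and one regressor      */
    b0 = { 1, 2 };                   /* coefficients used to build y    */
    y = x*b0 + 0.1*rndn(n,1);
    b = olsqr(y,x);                  /* 2x1, close to 1 and 2           */
    { b2, r, p } = olsqr2(y,x);      /* coefficients, residuals, fitted */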


Creating Matrices

       DESIGN            Creates a design matrix of 0's and 1's.
       EDITM             Invokes the matrix editor.
       EYE               Creates an identity matrix.
       LET               Creates a matrix from a list of values.
       ONES              Creates a matrix of ones.
       RECSERAR          Computes auto-regressive recursive series.
       RECSERCP          Computes recursive series involving products.
       RECSERRC          Computes recursive series involving division.
       SEQA              Creates a vector as an additive sequence.
       SEQM              Creates a vector as a multiplicative sequence.
       TOEPLITZ          Computes Toeplitz matrix from column vector.
       ZEROS             Creates a matrix of zeros.
       |                 Vertical concatenation operator.
       ~                 Horizontal concatenation operator.

Use ZEROS or ONES to create a constant vector or matrix.
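
A short sketch combining several of these functions with the concatenation operators. The variable names are illustrative.

    a = seqa(1,1,5);        /* 1 2 3 4 5 as a column          */
    m = seqm(1,2,5);        /* 1 2 4 8 16 as a column         */
    i3 = eye(3);            /* 3x3 identity                   */
    z = zeros(2,3);         /* 2x3 matrix of zeros            */
    x = ones(5,1)~a~m;      /* 5x3, columns joined side by side */
    y = a|m;                /* 10x1, m stacked under a        */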

To get help on loading matrices from ASCII files, type @LSE.

To get help on loading matrices from data sets, type @DATA.

Rank and Size of Matrices

COLS       Returns number of columns in a matrix.
ROWS       Returns number of rows in a matrix.

Minimum and Maximum Elements
MAXC       Returns largest element in each column of a matrix.
MAXINDC    Returns row number of largest element in each column of a matrix.
MINC       Returns smallest element in each column of a matrix.
MININDC    Returns row number of smallest element in each column of a matrix.

Ranges of Elements
COUNTS     Returns number of elements of a vector falling in specified ranges.
COUNTWTS   Returns weighted count of elements of a vector falling in
           specified ranges.
INDEXCAT   Returns indices of elements falling within specified ranges.
RANKINDX   Returns rank index of Nx1 vector.
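
A brief sketch of the size, extreme value and range functions on a small matrix. The names and break points are illustrative.

    x = { 1 9, 4 2, 7 5 };
    r = rows(x);            /* 3                                   */
    c = cols(x);            /* 2                                   */
    mx = maxc(x);           /* 7  9                                */
    ix = maxindc(x);        /* 3  1                                */
    v = { 2, 6 };           /* break points in ascending order     */
    n = counts(x[.,1],v);   /* 1  1, one value <= 2, one in (2,6]  */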

See @SORT for routines which return sorted indices and unique indices.

Submatrix Extraction

       DELIF             Deletes rows from a matrix using a logical
                         expression.
       DIAG              Extracts the diagonal of a matrix.
       DIAGRV            Pushes column vector into diagonal of matrix.
       EXCTSMPL          Creates a random subsample of a data set, with
                         replacement.
       LOWMAT            Returns the main diagonal and lower triangle.
       LOWMAT1           Returns a main diagonal of 1's and lower triangle.
       UPMAT             Returns the main diagonal and upper triangle.
       UPMAT1            Returns a main diagonal of 1's and upper triangle.
       SELIF             Selects rows from a matrix using a logical
                         expression.
       SUBMAT            Extracts a submatrix from matrix.
       TRIMR             Trims rows from top or bottom of matrix.
       x[1 2 3:9,.]      Submatrix containing all columns of the 1st, 2nd, and
                         3rd-9th rows of matrix x.

To delete the rows of a matrix which contain missing values, see PACKR.
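
The following sketch shows index-based and condition-based extraction on the same matrix. The variable names are illustrative.

    x = { 1 10, 2 20, 3 30, 4 40 };
    a = x[1 3,.];                 /* rows 1 and 3                   */
    b = selif(x,x[.,1] .> 2);     /* rows where column 1 exceeds 2  */
    c = delif(x,x[.,1] .> 2);     /* the remaining rows             */
    d = trimr(x,1,1);             /* drops the first and last rows  */
    e = diag(x[1 2,.]);           /* 1  20                          */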

For more help on matrices, type @CREATE, @SIZE and @MAN.

Basic Row, Column and Set Operations

       Basic Row and Column Operations
       CUMPRODC          Computes cumulative products of each column of a
                         matrix.
       CUMSUMC           Computes cumulative sums of each column of a matrix.
       PRODC             Computes the product of each column of a matrix.
       SUMC              Computes the sum of each column of a matrix.

       Set Operations
       UNION             Returns the union of two vectors.
       INTRSECT          Returns the intersection of two vectors.
       SETDIF            Returns elements of one vector that are not in
                         another.
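
A minimal sketch of the column and set operations. The third argument of the set functions (1 for numeric data) follows the usual calling convention and is an assumption here, since it is not described above.

    x = { 1 2, 3 4 };
    s = sumc(x);             /* 4  6         */
    cs = cumsumc(x);         /* 1 2 / 4 6    */
    a = { 1, 2, 3, 4 };
    b = { 3, 4, 5 };
    u = union(a,b,1);        /* 1 2 3 4 5    */
    i = intrsect(a,b,1);     /* 3 4          */
    d = setdif(a,b,1);       /* 1 2          */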

To get a list of all matrix operators, type @OPERS. To get help on matrix comparison and logical operators, type one of the operators (GT or AND, for example).

Matrix Element Manipulation

       REV               Reverses the order of rows of a matrix.
       ROTATER           Rotates the rows of a matrix, wrapping elements.
       SHIFTR            Shifts rows of a matrix, filling in holes with
                         a specified value.
       RESHAPE           Reshapes a matrix to new dimension.
       VEC               Stacks columns of a matrix to form a single column.
       VECR              Stacks rows of a matrix to form a single column.
       X.'               Bookkeeping transpose of matrix X.

The functions RESHAPE, VEC, VECR and the dot transpose operator ( .' ) change the shape of matrices, while REV, ROTATER and SHIFTR move elements in the matrix, but retain the structure of the matrix.

The standard transpose operator ( ' ) returns the complex conjugate transpose of complex matrices. The bookkeeping transpose ( .' ) simply transposes the matrix without changing the sign of the imaginary part.
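
The sketch below illustrates the reshaping functions and the difference between the two transposes on a complex scalar. The variable names are illustrative.

    x = reshape(seqa(1,1,6),2,3);   /* 1 2 3 / 4 5 6            */
    v1 = vec(x);                    /* 1 4 2 5 3 6              */
    v2 = vecr(x);                   /* 1 2 3 4 5 6              */
    r = rev(v2);                    /* 6 5 4 3 2 1              */
    z = complex(1,2);
    t1 = z';                        /* 1 - 2i, conjugated       */
    t2 = z.';                       /* 1 + 2i, sign unchanged   */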

RESHAPE is useful when loading data from an ASCII file. See @LSE.

For more information on matrices, see @CREATE and @SIZE. For a list of operators, see @OPERS.


Working with Data Sets

       Data Set Dimension
       COLSF             Returns number of columns in an open data set.
       ROWSF             Returns number of rows in an open data set.
       ISCPLXF           Returns 1 (TRUE) if data set is complex.
       TYPEF             Returns the element size (2, 4, or 8 bytes).

       Creating, Opening, Loading and Closing Data Sets
       CLOSE             Closes an open data set (.DAT file).
       CLOSEALL          Closes all open data sets.
       CREATE            Creates and opens a data set.
       EOF               Tests for end of file.
       LOADD             Loads a small data set.
       OPEN              Opens an existing data set.
       READR             Reads rows from open data set.
       SAVED             Creates small data sets.
       SEEKR             Moves pointer to specified row in open data set.
       WRITER            Writes matrix to an open data set.

See the System and Graphics Manual for help on the data conversion utility ATOG386.EXE.
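
A typical read loop can be sketched as follows. The data set name mydata and the block size of 100 rows are hypothetical.

    open f1 = mydata;
    if f1 == -1;
        print "could not open data set";
        end;
    endif;
    n = 0;
    do until eof(f1);
        x = readr(f1,100);      /* next block of up to 100 rows */
        n = n + rows(x);        /* process the block here       */
    endo;
    f1 = close(f1);
    print n;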

Working with Variables in Data Sets

       GETNAME     Returns names of all variables in a data set.
       INDCV       Returns indices of selected variables in a data set.
       INDICES     Returns indices and names of selected variables
                   in a data set.
       INDICES2    Similar to INDICES, but returns separate lists
                   for dependent and independent variables.
       MERGEVAR    Concatenates column vectors to create larger matrix.
       MAKEVARS    Decomposes matrix to create column vectors.
       SETVARS     Creates global variables using names from data set.

These functions are written to simplify the task of working with the variables in data sets.

Example: Creating vectors in memory from data set variables:

       open f1 = mydata;
       x = readr(f1,rowsf(f1));
       makevars(x,0,getname("mydata"));
       f1 = close(f1);

Coding Data Matrices and Vectors

       CODE              Code the data in a vector by applying a logical set
                         of rules to assign each data value to a category.
       DUMMY             Creates a dummy matrix, expanding values in a vector
                         to rows with ones in columns corresponding to true
                         categories and zeros elsewhere.
       DUMMYBR           Similar to DUMMY.
       DUMMYDN           Similar to DUMMY.
       RECODE            Similar to CODE, but leaves the original data in
                         place if no condition is met.
       SUBSTUTE          Similar to RECODE, but operates on matrices.
       SUBSCAT           Simpler version of RECODE, but uses ascending bins
                         instead of logical conditions.

CODE, RECODE, and SUBSCAT allow the user to code data variables and operate on vectors in memory. SUBSTUTE operates on matrices, and DUMMY, DUMMYBR, and DUMMYDN create matrices.
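
As a rough sketch, DUMMY with two break points should produce three indicator columns, with the highest category left open ended. The data and break points are illustrative.

    x = { 1, 3, 5, 7 };
    v = { 2, 6 };           /* break points in ascending order */
    d = dummy(x,v);
    /* d is 4x3:   1 0 0      x <= 2
                   0 1 0      2 < x <= 6
                   0 1 0
                   0 0 1      6 < x       */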

See @MISS for help on recoding missing values.

Each of the following statements creates a single missing value:

       x = MISS(0,0);
       x = ERROR(0);
       LET x = { . };

Equality and inequality comparisons may be done on matrices containing missing values by using the $== and $/= operators.
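
A minimal sketch of converting a placeholder code to a missing value and then removing or replacing it. The code -9 and the variable names are illustrative.

    x = { 1, -9, 3 };
    y = miss(x,-9);          /* -9 becomes a missing value            */
    z = packr(y);            /* 1  3, rows with missings removed      */
    w = missrv(y,0);         /* 1  0  3, missings replaced by 0       */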

Sorting Routines

       SORTC         Quick-Sorts rows of matrix based on numeric column.
       SORTCC        Quick-Sorts rows of matrix based on character column.
       SORTD         Sorts rows of data set on basis of a column.
       SORTHC        Heap-Sorts rows of matrix based on numeric column.
       SORTHCC       Heap-Sorts rows of matrix based on character column.
       SORTIND       Returns a sorted index of numeric vector.
       SORTINDC      Returns a sorted index of character vector.
       SORTMC        Sorts rows of matrix on the basis of multiple columns.
       UNIQINDX      Returns a sorted unique index of vector.
       UNIQUE        Removes duplicate elements of a vector.
       INTRLEAV      Produces one large sorted data set from two smaller sets
                     having the same variables and sorted on a key column.
       MERGEBY       Produces one large sorted data set from two smaller
                     sorted sets having a single key column in common.

Sorting routines operate on matrices by sorting the rows on the basis of a key column. Both character and numeric data can be sorted using these functions.

MERGEBY, INTRLEAV, and SORTD operate on data sets. See @SIZE for functions which rank elements of matrices.
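
The in-memory routines can be sketched on a small matrix as follows. The second argument of UNIQUE (1 for numeric data) is assumed from the usual calling convention. The variable names are illustrative.

    x = { 3 10, 1 30, 2 20 };
    s = sortc(x,1);           /* 1 30 / 2 20 / 3 10, sorted on column 1 */
    i = sortind(x[.,1]);      /* 2 3 1, so x[i,.] equals s              */
    k = { 3, 1, 3, 2 };
    u = unique(k,1);          /* 1 2 3                                  */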


Eigenvalue and Eigenvector Routines

                         Computes:                       Matrix type:
       EIG               Eigenvalues                     Complex general
                                                         Real general

       EIGV              Eigenvalues and eigenvectors    Complex general
                                                         Real general

       EIGH              Eigenvalues                     Complex hermitian
                                                         Real symmetric

       EIGHV             Eigenvalues and eigenvectors    Complex hermitian
                                                         Real symmetric
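
A short sketch for a real symmetric matrix, including a check that the eigenpairs satisfy the defining equation. The variable names are illustrative.

    a = { 2 1, 1 2 };
    { va, ve } = eighv(a);    /* eigenvalues approximately 1 and 3  */
    chk = a*ve - ve.*va';     /* approximately a zero matrix        */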

To get help on polynomial functions, type @POLY.

See @DEC for information on matrix decompositions.

See @INV for help on solving linear systems.

Matrix Decompositions

       BALANCE        Balances a matrix.
       CHOL           Computes Cholesky decomposition, X = Y'Y.
       CHOLDN         Performs Cholesky downdate on an upper triangular matrix.
       CHOLUP         Performs Cholesky update on an upper triangular matrix.
       CROUT          Computes Crout decomposition, X = LU.
       CROUTP         Computes Crout decomposition with row pivoting.
       HESS           Computes upper Hessenberg form (real matrices only).
       LU             Computes LU decomposition, X = LU.
       NULL           Computes orthonormal basis for right null space.
       NULL1          Computes orthonormal basis for right null space.
       ORTH           Computes orthonormal basis for column space.
       QQR            QR decomposition: returns Q1 and R.
       QQRE           QR decomp: returns Q1, R and a permutation vector E.
       QQREP          QR decomp. with pivot control:  returns Q1, R and E.
       QR             QR decomposition: returns R.
       QRE            QR decomp: returns R and a permutation vector E.
       QREP           QR decomp. with pivot control:  returns R and E.
       QTYR           QR decomp: returns Q'Y and R.
       QTYRE          QR decomp: returns Q'Y, R and a permutation vector E.
       QTYREP         QR decomp. with pivot control: returns Q'Y, R and E.
       QYR            QR decomp: returns Q*Y and R.
       QYRE           QR decomp: returns Q*Y, R and a permutation vector E.
       QYREP          QR decomp. with pivot control: returns Q*Y, R and E.
       RREF           Computes reduced row echelon form of a matrix.
       SCHUR          Computes Schur decomposition of a matrix (real only).
       SVD            Computes the singular values of a matrix.
       SVD1           Computes singular value decomposition, X = USV'.
       SVD2           Computes SVD1 with compact U.

       Scalar Descriptions
       COND           Computes condition number of a matrix.
       DET            Computes determinant of square matrix.
       DETL           Computes determinant of decomposed matrix.
       RANK           Computes rank of a matrix.
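
A minimal sketch on a small positive definite matrix. The variable names are illustrative.

    a = { 4 2, 2 3 };
    r = chol(a);          /* upper triangular, 2 1 / 0 1.414...  */
    chk = r'r;            /* reproduces a                        */
    d = det(a);           /* 8                                   */
    k = rank(a);          /* 2                                   */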

For help on eigenvalues and eigenvectors, type @EIG. For help on solving linear systems, type @INV.

Matrix Inversions and Linear System Solutions

       INV               Inverts a matrix.
       INVPD             Inverts a positive definite matrix.
       PINV              Generalized pseudo-inverse: Moore-Penrose.
       INVSWP            Generalized sweep inverse.
       b/A               Solves a linear system Ax = b.
       CHOLSOL           Solves a system of equations given the Cholesky
                         factorization of a matrix.
       SOLPD             Solves a system of positive definite linear equations.

INV uses Crout decomposition and INVPD uses Cholesky decomposition.
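
The sketch below solves the same small system four ways. All four results should agree up to rounding error. The variable names are illustrative.

    a = { 4 2, 2 3 };
    b = { 2, 5 };
    x1 = b/a;                   /* -0.5  2                          */
    x2 = solpd(b,a);            /* a is positive definite           */
    x3 = inv(a)*b;              /* explicit inverse                 */
    x4 = cholsol(b,chol(a));    /* from the Cholesky factorization  */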

See @DEC for help on matrix decompositions.

See @EIG for help on eigenvalues and eigenvectors.


Program Execution Control - RUN, STOP, etc.

       END               Terminates a program and closes files.
       PAUSE             Pauses for the specified time.
       RUN               Runs both source code and compiled form programs.
       STOP              Terminates a program and leaves files open.
       SYSTEM            Quits GAUSS and returns to DOS.

These functions start, stop or pause the execution of a program. Neither END nor STOP is required in a program; if neither is found, an implicit STOP is executed upon program termination.

If you have subroutine definitions at the end of a program file, you should place an END or STOP statement before the first subroutine label.
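
The layout described above can be sketched as follows. The label report is hypothetical.

    gosub report;
    end;                  /* keeps execution from falling into the subroutine */

    report:
        print "in subroutine";
        return;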

Unconditional branching is done with GOTO.

/* coin toss... */                     /* file check... */
heads = 0;                             open f1 = mydat for read;
t = 0;                                 if f1 == -1;
toss:                                     goto errout( "File not found", -1 );
    coin = rndu(1,1);                  endif;
    if coin > .49 and coin < .51;         .
        goto edge;                        .
    elseif coin >= .51;                errout:
        heads = heads + 1;                pop rv;
    endif;                                pop msg;
    t = t + 1;                            errorlog msg;
    goto toss;                            _errval = rv;
edge:                                     end;
    print "It's on edge!";
    print "H " heads " T " t-heads;

The target of a GOTO is called a label. Labels must begin with '_' or an alphabetic character and are always followed by a colon.

GOTO, like GOSUB, can pass arguments via the stack. If arguments are passed, they are retrieved (POPped) in the reverse of the order in which they were passed.

Looping Control

Looping is controlled with the DO statement.

                    do while st > tol;      /* loop if true */

                    do until st <= tol;      /* loop if false */

       BREAK;                    Jump to the statement following ENDO.
       CONTINUE;                 Jump to the top of a DO loop.
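
A minimal sketch showing both statements inside one loop. The counter is incremented before CONTINUE so that the loop still advances. The variable names are illustrative.

    i = 0;
    total = 0;
    do while i < 10;
        i = i + 1;
        if i == 3;
            continue;         /* skip 3, back to the top of the loop */
        endif;
        if i > 5;
            break;            /* leave the loop, resume after ENDO   */
        endif;
        total = total + i;
    endo;
    print total;              /* 1 + 2 + 4 + 5 = 12 */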

See the Foreign Language Interface section of the GAUSS System and Graphics Manual for more details.


DOS Shell Commands

Workspace Management

Error Handling and Debugging

String Handling Routines

String Arrays