What is a Statistical Analysis System?

The term “statistical analysis system” refers to software that allows the user to perform statistical analysis on data sets. Another commonly used term for this type of software is statistical programming language. When capitalized, Statistical Analysis System (SAS) is also the proper name of one of the best-known software packages of this type.

A statistical analysis system provides the automation and processing power needed to manipulate and analyze data sets. These packages support the computation of both descriptive and inferential (also called inductive) statistics. Commonly used descriptive computations include measures of central tendency, frequency distributions, and measures of association. Inferential analysis that can be performed with a statistical analysis system includes statistical hypothesis testing, such as the t-test, the z-test, and the chi-square test. Many statistical analysis systems also support other methods, such as analysis of variance (ANOVA) and its relatives, and various types of regression analysis.
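As a rough illustration of the kinds of computation listed above, the following Python sketch calculates a few descriptive statistics and the test statistic for one hypothesis test, using only the standard library. The sample data and the function names (describe, frequency_distribution, pearson_r, welch_t) are hypothetical and invented for this example. They are not taken from any particular statistical package, and the t-test variant shown (Welch's unequal-variance form) is just one of several.

```python
import math
import statistics
from collections import Counter

# Hypothetical sample data, invented purely for illustration.
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.5, 5.3]
group_b = [4.4, 4.8, 5.0, 4.6, 4.9, 4.3, 4.7]
ratings = ["low", "high", "medium", "high", "low", "high"]


def describe(values):
    """Central tendency and spread for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def frequency_distribution(categories):
    """Count how often each category occurs."""
    return dict(Counter(categories))


def pearson_r(xs, ys):
    """Pearson correlation coefficient, a common measure of association."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)


def welch_t(xs, ys):
    """Welch's t statistic for comparing the means of two independent samples."""
    var_x, var_y = statistics.variance(xs), statistics.variance(ys)
    standard_error = math.sqrt(var_x / len(xs) + var_y / len(ys))
    return (statistics.mean(xs) - statistics.mean(ys)) / standard_error


print(describe(group_a))
print(frequency_distribution(ratings))
print(pearson_r(group_a, group_b))
print(welch_t(group_a, group_b))
```

This sketch stops at the test statistic. A full statistical analysis system would typically also report degrees of freedom and a p-value, and would handle missing data and larger data sets.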

Statistical analysis systems are used in a wide variety of settings. Natural and social scientists in academic and commercial research settings are among the most frequent users of these types of software packages. Businesses may also use a statistical analysis system for operations research, project management, and other business intelligence applications.

With some software packages, the command-line interface (CLI) is more often used, while others primarily feature a graphical user interface (GUI), often with drop-down menus. Most software packages provide both CLI and GUI capabilities, although the user may not be able to access all features from both interfaces. While a GUI is more familiar for non-technical users, using a CLI to create programs enables easier replication of analyses.
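To show why a written program makes an analysis easier to replicate, here is a minimal Python sketch of a scripted summary. The function names summarize and main, the CSV layout, and the column name used in the note below are all assumptions made for this example. The idea is that every step is recorded in the script, so running it again on the same file repeats the analysis exactly.

```python
import argparse
import csv
import io
import statistics


def summarize(csv_text, column):
    """Return the count, mean, and median of one numeric column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def main(argv=None):
    """Command-line entry point: read a CSV file and print the summary."""
    parser = argparse.ArgumentParser(description="Summarize one CSV column.")
    parser.add_argument("path", help="path to a CSV file with a header row")
    parser.add_argument("column", help="name of the numeric column to summarize")
    args = parser.parse_args(argv)
    with open(args.path, newline="") as handle:
        result = summarize(handle.read(), args.column)
    for name, value in result.items():
        print(f"{name}: {value}")
```

Saved as a hypothetical file summarize.py with a call to main() added at the bottom, this could be run from a command line as `python summarize.py data.csv score`. The same steps performed through menus in a GUI would have to be repeated by hand each time.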

Many statistical software packages make use of fourth-generation programming languages (4GLs). Because of their higher level of abstraction and more natural syntax, data manipulation and analysis in a 4GL is generally quicker and easier than in lower-level programming languages. Before the development of 4GLs, computer-assisted statistical analysis was cumbersome and required greater programming expertise.

A large number of statistical analysis software applications with varied interfaces, capabilities, and extensions are available. Proprietary software applications remain popular, but many open-source software applications are also widely used. Virtually all statistical software packages will run on Windows operating systems, and most also have macOS and Linux versions. A few applications are compatible with Unix operating systems as well.