Package ‘MMST’
February 15, 2013
Type Package
Title DATASETS FROM MMST
Version 0.6-1.1
Date 2010-07-04
Author Keith Halbert
Maintainer Keith Halbert
Description The datasets from Modern Multivariate Statistical Techniques by Alan Julian Izenman are contained in this package. The documentation descriptions show the page numbers of references to the data set within the text. See the text for detailed descriptions of the datasets. Also included in this package is a function for exporting these datasets en masse.
License GPL (>= 2)
LazyLoad yes
Repository CRAN
Date/Publication 2011-02-11 16:58:30
Depends R (>= 2.10)
NeedsCompilation no
R topics documented:
MMST-package
AirlineDistances
alontop
baseball
bodyfat
boston
BritishTowns
bupa
cleveland
color.stimuli
COMBO17
covertype
detergent
diabetes
ecoli
foetal
food
geyser
gilgaied.soil
glass
Hidalgo1872
ionosphere
iris
letter.recognition
leukemia
lloyd
MEG
MMST.out
morse
ncifinal
norwaypaper1
pendigits
pet
pima
primate.scapulae
psych24r
root.stocks
satimage
shoplifting
shuttle
soldat
sonar
spambase
steganography
swiss.roll
SwissBankNotes
tobacco
tumors
turtles
ushighways
vehicle
wdbc
wine
x498.matrix
yeast
Index
MMST-package DATASETS FROM MMST
Description
Data sets for Modern Multivariate Statistical Techniques, by A. Izenman (2008).
Details
Package: MMST
Type: Package
Version: 0.6-1
Date: 2010-07-04
License: GPL (>= 2)
LazyLoad: yes
Author(s)
Keith Halbert
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
See Also
MMST.out
AirlineDistances MMST AIRLINE DISTANCES DATA
Description
airline distances, 464, 467, 481, 482, 484
Usage
data(AirlineDistances)
Format
A distance matrix of the distance in kilometers between the following 18 cities: Beijing, Cape Town, Hong Kong, Honolulu, London, Melbourne, Mexico, Montreal, Moscow, New Delhi, New York, Paris, Rio de Janeiro, Rome, San Francisco, Singapore, Stockholm, and Tokyo.
Source
National Geographic Society (1995), National Geographic Atlas of the World, Rev 6th Edition, National Geographic
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
alontop MMST ALONTOP DATA
Description
colon cancer, 19, 20, 443, 444, 446
Usage
data(alontop)
Format
A data frame with 62 observations (tissue samples) on 92 numeric variables (a subset of a larger set of more than 6500 genes).
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Alon, U., Barkai, N., Notterman, D., Gish, K., Ybarra, S., Mack, D., and Levine, A. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays, Proceedings of the National Academy of Sciences, 96, 6745-6750.
baseball MMST BASEBALL DATA
Description
Major League Baseball salaries, 307, 308, 309, 368
Usage
data(baseball)
Format
A data frame with 337 observations on the following 18 variables.
salary a numeric vector
BA a numeric vector
OBP a numeric vector
Runs a numeric vector
Hits a numeric vector
X2B a numeric vector
X3B a numeric vector
HR a numeric vector
RBI a numeric vector
BB a numeric vector
SO a numeric vector
SB a numeric vector
E a numeric vector
FAE a numeric vector
FA a numeric vector
AE a numeric vector
A a numeric vector
Name a character vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Watnik, M.R. (1998). Pay for play: are baseball salaries based on performance? Journal of Statistics Education, 6.
bodyfat MMST BODYFAT DATA
Description
bodyfat, 116, 125, 128, 146, 148, 150, 151, 154
Usage
data(bodyfat)
Format
A data frame with 252 observations on the following 15 variables.
density a numeric vector
bodyfat a numeric vector
age a numeric vector
weight a numeric vector
height a numeric vector
neck a numeric vector
chest a numeric vector
abdomen a numeric vector
hip a numeric vector
thigh a numeric vector
knee a numeric vector
ankle a numeric vector
biceps a numeric vector
forearm a numeric vector
wrist a numeric vector
Source
http://lib.stat.cmu.edu/datasets/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
boston MMST BOSTON HOUSING DATA
Description
Boston housing, 158
Usage
data(boston)
Format
A data frame with 506 observations on the following 14 variables.
crim a numeric vector
zn a numeric vector
indus a numeric vector
chas a numeric vector
nox a numeric vector
rm a numeric vector
age a numeric vector
dis a numeric vector
rad a numeric vector
tax a numeric vector
ptratio a numeric vector
black a numeric vector
lstat a numeric vector
medv a numeric vector
Source
http://lib.stat.cmu.edu/datasets/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
BritishTowns MMST BRITISHTOWNS DATA
Description
British towns, 504
Usage
data(BritishTowns)
Format
A distance matrix of the distances in kilometers between the following 48 British towns: Aberdeen, Aberystwyth, Barnstaple, Birmingham, Brighton, Bristol, Cambridge, Cardiff, Carlisle, Carmarthen, Colchester, Dorchester, Dov, Edinburgh, Exeter, Fort.William, Glasgow, Gloucester, Guildford, He, Holyhead, Hull, Inverness, Kendal, Leeds, Lincoln, Liverpool, Maidstone, Manchester, Middlesborough, Newcastle, Northampton, Norwich, Nottingham, Oxford, Penzance, Perth, Plymouth, Preston, Salisbury, Sheffield, Shre, Southampton, Stoke, Stranraer, Taunton, York, and London.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
bupa MMST BUPA LIVER DISORDERS DATA
Description
BUPA liver disorders, 258, 260, 348, 387, 508
Usage
data(bupa)
Format
A data frame with 345 observations on the following 7 variables.
mcv a numeric vector
alkphos a numeric vector
sgpt a numeric vector
sgot a numeric vector
gammagt a numeric vector
drinks a numeric vector
group a factor with levels 1 2
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
cleveland MMST CLEVELAND DATA
Description
Cleveland heart-disease, 284, 286, 287, 289, 291, 314, 368
Usage
data(cleveland)
Format
A data frame with 296 observations on the following 15 variables.
age a numeric vector
gender a factor with levels fem male
cp a factor with levels abnang angina asympt notang
trestbps a numeric vector
chol a numeric vector
fbs a factor with levels fal true
restecg a factor with levels abn hyp norm
thalach a numeric vector
exang a factor with levels fal true
oldpeak a numeric vector
slope a factor with levels down flat up
ca a numeric vector
thal a factor with levels fix norm rev
diag1 a factor with levels buff sick
diag2 a factor with levels H S1 S2 S3 S4
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
color.stimuli MMST COLOR.STIMULI DATA
Description
perceptions of color, 468, 469, 503
Usage
data(color.stimuli)
Format
A distance matrix of colors represented by the following wavelengths (nanometers): 434, 445, 465, 472, 490, 504, 537, 555, 584, 600, 610, 628, 651, and 674.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Ekman, G. (1954). Dimensions of color vision, Journal of Psychology, 38, 467-474.
COMBO17 MMST COMBO17 DATA
Description
COMBO-17 galaxy photometric catalogue, 216, 219, 235
Usage
data(COMBO17)
Format
A data frame with 3462 observations on 65 numeric variables.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Wolf, C., Meisenheimer, M., Kleinheinrich, M., Borch, A., Dye, S., Gray, M., Wisotski, L., Bell, E.F., Rix, H.-W., Cimatti, A., Hasinger, G., and Szokoly, G. (2004). A catalogue of the Chandra Deep Field South with multi-colour classification and photometric redshifts from COMBO-17, Astronomy & Astrophysics, arXiv:astro-ph/0403666v1.
covertype MMST COVERTYPE DATA
Description
covertype, 279
Usage
data(covertype)
Format
A data frame with 581012 observations on 55 variables.
Source
http://kdd.ics.uci.edu/databases/covertype/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
detergent MMST DETERGENT DATA
Description
laundry detergent, 156
Usage
data(detergent)
Format
A data frame with 12 observations on 1173 variables. There are five Y variables, representing four compounds in an aqueous solution (the fifth Y variable is the amount of water in the solution). The X input variables consist of mid-infrared spectrum values recorded as the absorbances at 1168 equally spaced frequencies in the detergent.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
diabetes MMST DIABETES DATA
Description
diabetes, 272, 278, 348, 391
Usage
data(diabetes)
Format
A data frame with 145 observations on the following 6 variables.
glucose.area a numeric vector
insulin.area a numeric vector
SSPG a numeric vector
relative.weight a numeric vector
fasting.plasma.glucose a numeric vector
class a numeric vector
Source
http://lib.stat.cmu.edu/datasets/Andrews/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Andrews, D.F. and Herzberg, A.M. (1985). Data, New York: Springer.
ecoli MMST ECOLI DATA
Description
e-coli, 273, 279, 348, 391
Usage
data(ecoli)
Format
A data frame with 336 observations on the following 9 variables.
label a character vector
mvg a numeric vector
gvh a numeric vector
lip a numeric vector
chg a numeric vector
aac a numeric vector
alm1 a numeric vector
alm2 a numeric vector
site a factor with levels cp im imL imS imU om omL pp
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
foetal MMST FOETAL DATA
Description
cutaneous potential recordings of a pregnant woman, 554, 556, 592
Usage
data(foetal)
Format
A data frame with 2500 observations of ECG points. The first variable is a simple timestep, the next five channels are measured near the fetus (abdominal signals) and the last three channels were placed on the mother’s thorax (chest).
timestep a numeric vector
ab1 a numeric vector
ab2 a numeric vector
ab3 a numeric vector
ab4 a numeric vector
ab5 a numeric vector
th1 a numeric vector
th2 a numeric vector
th3 a numeric vector
Source
http://homes.esat.kuleuven.be/~smc/daisy/daisydata.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
de Lathauwer, L., de Moor, B., Vandewalle, J. (2000). Fetal electrocardiogram extraction by blind source subspace separation, IEEE Transactions on Biomedical Engineering, 47, 567-573. Proceedings of the IEEE SP/Athos Workshop on Higher-Order Statistics, Girona, Spain, pp. 134-138.
food MMST FOOD DATA
Description
nutritional value of food, 196, 198, 206, 208, 462, 612, 613, 631
Usage
data(food)
Format
A data frame with 961 observations on the following 7 variables.
Fat.grams a numeric vector
Food.energy.calories a numeric vector
Carbohydrates.grams a numeric vector
Protein.grams a numeric vector
Cholesterol.mg a numeric vector
weight.grams a numeric vector
Saturated.fat.grams a numeric vector
Source
http://www.ntwrks.com/~mikev/chart1.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
geyser MMST GEYSER DATA
Description
Old Faithful Geyser, 99, 100, 409, 410
Usage
data(geyser)
Format
A data frame with 107 observations on the following 2 variables.
X1 a numeric vector
X2 a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Weisberg, S. (1985). Applied Linear Regression, Second Edition, New York: Wiley.
gilgaied.soil MMST GILGAIED SOIL DATA
Description
gilgaied soil, 271, 278, 367
Usage
data(gilgaied.soil)
Format
A data frame with 48 observations on the following 11 variables.
pH a numeric vector
N a numeric vector
BD a numeric vector
P a numeric vector
Ca a numeric vector
Mg a numeric vector
K a numeric vector
Na a numeric vector
cond a numeric vector
Block.no. a factor with levels 1 2 3 4
Group.no. a factor with levels 1 2 3 4 5 6 7 8 9 10 11 12
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Horton, I.F., Russell, J.S., and Moore, A.W. (1968). Multivariate-covariance and canonical analysis: a method for selecting the most effective discriminators in a multivariate situation, Biometrics, 24, 845-858.
glass MMST GLASS DATA
Description
forensic glass, 273, 348, 391, 508, 550
Usage
data(glass)
Format
A data frame with 214 observations on the following 10 variables.
RI a numeric vector
Na a numeric vector
Mg a numeric vector
Al a numeric vector
Si a numeric vector
K a numeric vector
Ca a numeric vector
Ba a numeric vector
Fe a numeric vector
type a factor with levels 1 2 3 5 6 7
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Hidalgo1872 MMST HIDALGO 1872 STAMP DATA
Description
Hidalgo postage stamps, 93, 96, 98
Usage
data(Hidalgo1872)
Format
A data frame with 485 observations on the following 3 variables.
thickness a numeric vector
thicknessA a numeric vector
thicknessB a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Izenman, A.J. and Sommer, C.J. (1988). Philatelic mixtures and multimodal densities, Journal of the American Statistical Association, 83, 941-953.
ionosphere MMST IONOSPHERE DATA
Description
ionosphere, 258, 260, 348, 387
Usage
data(ionosphere)
Format
A data frame with 351 observations on 34 continuous variables and 1 factor, classifying the observation as Good (show some type of structure in the ionosphere) or Bad (pass through the ionosphere).
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
iris MMST IRIS DATA
Description
iris, 235, 274, 278, 348, 391
Usage
data(iris)
Format
A data frame with 150 observations on the following 5 variables. This is R.A. Fisher’s classic dataset.
sepal.length a numeric vector
sepal.width a numeric vector
petal.length a numeric vector
petal.width a numeric vector
type a factor with levels Iris-setosa Iris-versicolor Iris-virginica
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
letter.recognition MMST LETTER.RECOGNITION DATA
Description
letter recognition, 274, 348, 391
Usage
data(letter.recognition)
Format
A data frame with 20000 observations on the following 17 variables. V1 through V16 are primitive numeric attributes, scaled to fit into a range of integer values from 0 to 15.
letter a factor with levels A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
V1 a numeric vector
V2 a numeric vector
V3 a numeric vector
V4 a numeric vector
V5 a numeric vector
V6 a numeric vector
V7 a numeric vector
V8 a numeric vector
V9 a numeric vector
V10 a numeric vector
V11 a numeric vector
V12 a numeric vector
V13 a numeric vector
V14 a numeric vector
V15 a numeric vector
V16 a numeric vector
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
leukemia MMST LEUKEMIA DATA
Description
leukemia (ALL/AML), 451, 453, 461
Usage
data(leukemia)
Format
A data frame with 72 observations on 7140 variables.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Golub, T.R., Slonim, D.K., Tamayo, P., Huard, M., Gaasenbeek, J.P., Mesirov, J.P., Coller, H., Loh, M.L., Downing, J.R., Caligiuri, M.A., Bloomfield, C.D., and Lander, E.S. (1999). Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science, 286, 531-537.
lloyd MMST LLOYD’S BANK DATA
Description
employee careers at Lloyds Bank, 477, 478, 489
Usage
data(samp05)
data(samp25)
data(samp05d)
data(samp25d)
Format
There are two data frames and two distance matrices.
Details
Four data sets are used in this analysis; the data function must be called once for each of the four to load them all.
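As a minimal sketch of the repeated loading described above, the loop below calls data() once per data set name taken from the Usage section. The guard with requireNamespace and the variable name lloyd_sets are illustrative assumptions, added so the snippet does nothing harmful when MMST is not installed.

```r
# Names of the four Lloyds Bank data sets, as listed under Usage.
lloyd_sets <- c("samp05", "samp25", "samp05d", "samp25d")

if (requireNamespace("MMST", quietly = TRUE)) {
  for (nm in lloyd_sets) {
    data(list = nm, package = "MMST")   # one data() call per data set
  }
  # Report the class of each loaded object (data frames and distance matrices).
  print(sapply(lloyd_sets, function(nm) class(get(nm))[1]))
} else {
  message("MMST is not installed; nothing was loaded.")
}
```

The list argument of data() also accepts a character vector, so the loop could be collapsed into a single call if preferred.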
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Stovel, K., Savage, M., and Bearman, P. (1996). Ascription into Achievement: Models of careersystems at Lloyds Bank, 1890-1970. American Journal of Sociology, 102, 358-399.
MEG MMST MEG DATA
Description
identifying artifacts in MEG recordings, 569
Usage
data(MEG)
Format
A data frame with 122 observations on 17730 unnamed variables.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Vigario, R., Jousmaki, V., Hamalainen, M., Hari, R., and Oja, E. (1998). Independent component analysis for identification of artifacts in magnetoencephalographic recordings, In: Advances in Neural Information Processing Systems, 10, pp. 229-235, Cambridge, MA: MIT Press.
MMST.out MMST DATA SET OUTPUT
Description
Function to output data sets for Modern Multivariate Statistical Techniques, by A. Izenman (2008), to a single destination.
Usage
MMST.out(dest.folder, datasets = 'all')
Arguments
dest.folder String containing path to destination folder for files
datasets Vector of strings, each component being the name of a desired dataset (default is to output all the data sets contained in the package)
Details
The datasets will be tab delimited with file extension .txt. This task could be done manually using write.table, and this is what the user should do if they are particular about the format of the exported dataset. This function exists so that every dataset in the book can be exported easily at a single stroke.
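For readers who want the manual route, here is a rough sketch of writing one data frame to a tab-delimited .txt file with write.table. It uses R's built-in datasets::iris as a stand-in so that it runs without MMST installed. The helper name export_one and the choices quote = FALSE and row.names = FALSE are assumptions made for illustration; they are not a description of what MMST.out does internally.

```r
# Hypothetical helper: write one data frame as a tab-delimited .txt file.
export_one <- function(x, name, dest.folder) {
  path <- file.path(dest.folder, paste0(name, ".txt"))
  write.table(x, file = path, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(path)
}

# Export into a fresh temporary folder, then read the file back to check it.
dest <- tempfile("mmst-export-")
dir.create(dest)
out_path <- export_one(datasets::iris, "iris", dest)
back <- read.delim(out_path)   # read.delim expects tab-separated input with a header
dim(back)
```

Changing the arguments passed to write.table (separator, quoting, row names) is the place to adjust the output format.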
Value
NULL
Note
The datasets of class dist are exported as symmetric matrices
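To illustrate what exporting a dist object as a symmetric matrix can look like, the sketch below builds a small stand-in distance object (three made-up points, not MMST data), expands it with as.matrix, and writes the result as tab-delimited text. This shows the general technique only; the exact steps taken by MMST.out are not documented here.

```r
# Stand-in distance object built from three points on a line (not MMST data).
pts <- matrix(c(0, 3, 7), ncol = 1, dimnames = list(c("a", "b", "c"), NULL))
d <- dist(pts)        # class "dist": stores only the lower triangle
m <- as.matrix(d)     # full square matrix with a zero diagonal
isSymmetric(m)

# Write the square matrix, keeping row and column labels.
dist_path <- tempfile(fileext = ".txt")
write.table(m, file = dist_path, sep = "\t", quote = FALSE)
```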
Author(s)
Keith Halbert
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
See Also
write.table
Examples
## Not run:
MMST.out('C:/output') ## exports all the book's datasets
MMST.out('C:/output', 'bodyfat') ## exports single dataset
MMST.out('C:/output', c('bodyfat', 'tobacco')) ## exports two datasets
## End(Not run)
morse MMST MORSE DATA
Description
confusion of Morse-code signals, 469, 470, 503, 504
Usage
data(morse)
Format
A data frame with 36 numeric observations on variables representing each of the 36 alphanumeric characters.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Rothkopf, E.Z. (1957). A measure of stimulus similarity and errors in some paired-associate learning, Journal of Experimental Psychology, 53, 94-101.
ncifinal MMST NCIFINAL DATA
Description
National Cancer Institute, 461
Usage
data(ncifinal)
Format
A data frame with 5244 observations on 62 variables.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
http://www.cancer.gov/
norwaypaper1 MMST NORWAYPAPER1 DATA
Description
Norwegian paper quality, 166, 167, 190, 193, 194
Usage
data(norwaypaper1)
Format
A data frame with 29 observations on the following 22 variables.
Y1 a numeric vector
Y2 a numeric vector
Y3 a numeric vector
Y4 a numeric vector
Y5 a numeric vector
Y6 a numeric vector
Y7 a numeric vector
Y8 a numeric vector
Y9 a numeric vector
Y10 a numeric vector
Y11 a numeric vector
Y12 a numeric vector
Y13 a numeric vector
X1 a numeric vector
X2 a numeric vector
X3 a numeric vector
X4 a numeric vector
X5 a numeric vector
X6 a numeric vector
X7 a numeric vector
X8 a numeric vector
X9 a numeric vector
Source
http://lib.stat.cmu.edu/datasets/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Aldrin, M. (1996). Moderate projection pursuit regression for multivariate response data, Computational Statistics & Data Analysis, 21, 501-531.
pendigits MMST PENDIGITS DATA
Description
pen-based handwritten digit recognition, 211, 234, 274, 348, 391, 631
Usage
data(pendigits)
Format
A data frame with 10992 observations on 36 unnamed variables.
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
pet MMST PET DATA
Description
PET yarns, 130, 133, 134, 136, 137, 142, 144, 156
Usage
data(pet)
Format
A data frame with 28 observations on 270 variables.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Swierenga, H., de Weijer, A.P., van Wijk, R.J., and Buydens, L.M.C. (1999). Strategy for constructing robust multivariate calibration models, Chemometrics and Intelligent Laboratory Systems, 49, 1-17.
pima MMST PIMA DATA
Description
Pima Indians diabetes, 292, 294, 296, 298, 299, 301, 302, 314, 368, 549
Usage
data(pima)
Format
A data frame with 532 observations on the following 9 variables.
npregnant a numeric vector
glucose a numeric vector
diastolic.bp a numeric vector
skinfold.thickness a numeric vector
bmi a numeric vector
pedigree a numeric vector
age a numeric vector
classdigit a factor with levels 0 1
class a factor with levels diabetic normal
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
primate.scapulae MMST PRIMATE.SCAPULAE DATA
Description
primate scapulae, 274, 279, 280, 348, 391, 420, 421, 461
Usage
data(primate.scapulae)
Format
A data frame with 105 observations on the following 11 variables.
genus a numeric vector
AD.BD a numeric vector
AD.CD a numeric vector
EA.CD a numeric vector
Dx.CD a numeric vector
SH.ACR a numeric vector
EAD a numeric vector
beta a numeric vector
gamma a numeric vector
class a factor with levels Gorilla Homo Hylobates Pan Pongo
classdigit a factor with levels 1 2 3 4 5
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Ashton, K.H., Oxnard, C.E., and Spence, T.F. (1965). Scapular shape and primate classification, Proceedings of the Zoological Society, London, 145, 125-142.
psych24r MMST PSYCH24R DATA
Description
psychological tests, 587, 588, 595
Usage
data(psych24r)
Format
A data frame with 301 observations on the following 31 variables.
Case a numeric vector
Sex a factor with levels F M
Age a numeric vector
Grp a numeric vector
V1 a numeric vector
V2 a numeric vector
V3 a numeric vector
V4 a numeric vector
V5 a numeric vector
V6 a numeric vector
V7 a numeric vector
V8 a numeric vector
V9 a numeric vector
V10 a numeric vector
V11 a numeric vector
V12 a numeric vector
V13 a numeric vector
V14 a numeric vector
V15 a numeric vector
V16 a numeric vector
V17 a numeric vector
V18 a numeric vector
V19 a numeric vector
V20 a numeric vector
V21 a numeric vector
V22 a numeric vector
V23 a numeric vector
V24 a numeric vector
V25 a numeric vector
V26 a numeric vector
group a factor with levels GRANT PASTEUR
Source
http://www.psych.yorku.ca/friendly/lab/files/psy6140/data/psych24r.sas
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
root.stocks MMST ROOT STOCKS DATA
Description
root-stocks of apple trees, 193
Usage
data(root.stocks)
Format
A data frame with 104 observations on the following 5 variables.
type a factor with levels I II III IV IX V VI VII X XII XIII XV XVI
Y1 a numeric vector
Y2 a numeric vector
Y3 a numeric vector
Y4 a numeric vector
Source
http://lib.stat.cmu.edu/datasets/Andrews/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Andrews, D.F. and Herzberg, A.M. (1985). Data, New York: Springer.
satimage MMST SATIMAGE DATA
Description
Landsat satellite image, 428, 431, 436, 438, 461
Usage
data(satimage)
Format
A data frame with 4435 observations on 37 variables.
Source
http://www.liacc.up.pt/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
shoplifting MMST SHOPLIFTING DATA
Description
shoplifting in the Netherlands, 634, 635, 646, 647
Usage
data(shoplifting)
Format
A data frame with 18 observations on the following 13 variables.
clothing a numeric vector
accessories a numeric vector
tobacco a numeric vector
writing a numeric vector
books a numeric vector
records a numeric vector
goods a numeric vector
sweets a numeric vector
toys a numeric vector
jewelry a numeric vector
perfume a numeric vector
tools a numeric vector
other a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
van der Heijden, P.G.M., de Falguerolles, A., and de Leeuw, J. (1989). A combined approach to contingency table analysis using correspondence analysis and log-linear analysis, Applied Statistics, 38, 249-292.
shuttle MMST SHUTTLE DATA
Description
shuttle, 274, 348, 391
Usage
data(shuttle)
Format
A data frame with 43500 observations on 10 unnamed numeric variables.
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
soldat MMST SOLDAT DATA
Description
aqueous solubility in drug discovery, 514, 515
Usage
data(soldat)
Format
A data frame with 5631 observations on 72 input variables and 1 output variable.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Culp, M., Johnson, K., and Michailidis, G. (2006). ada: an R package for stochastic boosting, Journal of Statistical Software, 17, 2.
sonar MMST SONAR DATA
Description
sonar, 259, 260, 348, 387
Usage
data(sonar)
Format
A data frame with 208 observations on 61 unnamed variables.
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
spambase MMST SPAMBASE DATA
Description
spambase, 259, 260, 278, 348, 385, 387, 512, 549
Usage
data(spambase)
Format
A data frame with 4601 observations on the following 59 variables.
make a numeric vector
address a numeric vector
all a numeric vector
xd a numeric vector
our a numeric vector
over a numeric vector
remove a numeric vector
internet a numeric vector
order a numeric vector
mail a numeric vector
receive a numeric vector
will a numeric vector
people a numeric vector
report a numeric vector
addresses a numeric vector
free a numeric vector
business a numeric vector
email a numeric vector
you a numeric vector
credit a numeric vector
your a numeric vector
font a numeric vector
x000 a numeric vector
money a numeric vector
hp a numeric vector
hpl a numeric vector
george a numeric vector
x650 a numeric vector
lab a numeric vector
labs a numeric vector
telnet a numeric vector
x857 a numeric vector
data a numeric vector
x415 a numeric vector
x85 a numeric vector
technology a numeric vector
x1999 a numeric vector
parts a numeric vector
pm a numeric vector
direct a numeric vector
cs a numeric vector
meeting a numeric vector
original a numeric vector
project a numeric vector
re a numeric vector
edu a numeric vector
table a numeric vector
conference a numeric vector
x. a numeric vector
x.. a numeric vector
x...1 a numeric vector
x..1 a numeric vector
x..2 a numeric vector
x..3 a numeric vector
crla a numeric vector
crll a numeric vector
crrt a numeric vector
classdigit a factor with levels 0 1
class a factor with levels email spam
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
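Examples
The two label columns above can be compared directly. The following is a minimal sketch, not part of the package: it assumes the column names listed under Format, and when MMST is not installed it falls back to a four-row made-up stand-in whose values and digit-to-name pairing are purely illustrative.

```r
# Use the packaged data when MMST is installed; otherwise fall back to a
# tiny made-up stand-in with a few of the documented column names.
# The stand-in rows are NOT real spambase observations.
if (requireNamespace("MMST", quietly = TRUE)) {
  data(spambase, package = "MMST")
} else {
  spambase <- data.frame(
    free       = c(0.00, 0.32, 0.00, 1.10),
    money      = c(0.00, 0.00, 0.00, 0.43),
    classdigit = factor(c(0, 1, 0, 1)),
    class      = factor(c("email", "spam", "email", "spam"))
  )
}

# Cross-tabulate the two label columns to see how the digit codes line up
# with the named classes.
label_map <- table(digit = spambase$classdigit, name = spambase$class)
print(label_map)

# Mean of one word-frequency column within each class.
free_by_class <- tapply(spambase$free, spambase$class, mean)
print(free_by_class)
```

The cross-tabulation is the safer way to learn which digit corresponds to which class name, since the Format listing gives the levels of each factor but not their pairing.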
steganography MMST STEGANOGRAPHY DATA
Description
steganography, 344, 345
Usage
data(steganography)
Format
A data frame with 1000 observations on 73 variables.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Kahn, D. (1996). The history of steganography, Proceedings of Information Hiding, First International Workshop, Cambridge, U.K.
swiss.roll MMST SWISS ROLL DATA
Description
Swiss roll, 598, 617, 619, 620, 622, 623
Usage
data(swiss.roll)
Format
A data frame with 20000 observations on the following 5 variables.
X1 a numeric vector
X2 a numeric vector
X3 a numeric vector
Y1 a numeric vector
Y2 a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
SwissBankNotes MMST SWISSBANKNOTES DATA
Description
Swiss bank notes, 235
Usage
data(SwissBankNotes)
Format
A data frame with 200 observations on the following 6 variables.
length a numeric vector
height.left a numeric vector
height.right a numeric vector
inner.lower a numeric vector
inner.upper a numeric vector
diagonal a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
tobacco MMST TOBACCO DATA
Description
chemical composition of tobacco, 183, 187
Usage
data(tobacco)
Format
A data frame with 25 observations on the following 9 variables.
Y1.BurnRate a numeric vector
Y2.PercentSugar a numeric vector
Y3.PercentNicotine a numeric vector
X1.PercentNitrogen a numeric vector
X2.PercentChlorine a numeric vector
X3.PercentPotassium a numeric vector
X4.PercentPhosphorus a numeric vector
X5.PercentCalcium a numeric vector
X6.PercentMagnesium a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Anderson, R.L. and Bancroft, T.A. (1952). Statistical Theory in Research, New York: McGraw-Hill.
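Examples
The variable names suggest three responses (Y1 to Y3) and six predictors (X1 to X6). Under that assumption, which the Format listing does not state explicitly, a multivariate linear regression could be sketched as below. When MMST is not installed the code substitutes random numbers with the documented shape and names; those values are placeholders, not tobacco measurements.

```r
# Use the packaged data when MMST is installed; otherwise build a random
# placeholder with the documented dimensions (25 x 9) and column names.
if (requireNamespace("MMST", quietly = TRUE)) {
  data(tobacco, package = "MMST")
} else {
  set.seed(1)
  tobacco <- as.data.frame(matrix(rnorm(25 * 9), nrow = 25))
  names(tobacco) <- c("Y1.BurnRate", "Y2.PercentSugar", "Y3.PercentNicotine",
                      "X1.PercentNitrogen", "X2.PercentChlorine",
                      "X3.PercentPotassium", "X4.PercentPhosphorus",
                      "X5.PercentCalcium", "X6.PercentMagnesium")
}

# Assumption: columns starting with "Y" are responses, "X" are predictors.
responses  <- as.matrix(tobacco[, grep("^Y", names(tobacco))])
predictors <- tobacco[, grep("^X", names(tobacco))]

# A matrix on the left-hand side makes lm() fit one regression per response,
# all sharing the same design matrix.
fit <- lm(responses ~ ., data = predictors)

# Rows: intercept plus one per predictor; columns: one per response.
coef_matrix <- coef(fit)
print(dim(coef_matrix))
```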
tumors MMST TUMORS DATA
Description
four childhood tumors, 541, 545, 550
Usage
data(tumors)
Format
A data frame with 2308 observations on 90 variables.
Source
http://research.nhgri.nih.gov/microarray/Supplement/
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Khan, J., Wei, J.S., Ringner, M., Saal, L.H., Ladanyi, M., Westermann, F., Berthold, F., Schwab, M., Antonescu, C.R., Peterson, C., and Meltzer, P.S. (2001). Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks, Nature Medicine, 7, 673-679.
turtles MMST TURTLES DATA
Description
turtle carapaces, 234
Usage
data(turtles)
Format
A data frame with 48 observations on the following 4 variables.
sex a factor with levels f m
length a numeric vector
width a numeric vector
height a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
ushighways MMST USHIGHWAYS DATA
Description
U.S. highways, 106
Usage
data(ushighways)
Format
A data frame with 221 observations on the following 4 variables.
Interstate a numeric vector
State a factor with levels AL AR CA CO CT DE FL GA IA ID IL IN KS KY LA MA MD ME MI MN MO MS NC NE NH NJ NY OH OK OR PA RI SC SD TN TX UT VA VT WA WI WV WY
Approx.Miles a numeric vector
Location a character vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Rand McNally (1992), Rand McNally 1993 Business Traveler’s Road Atlas and Guide to Major Cities, Rand McNally
vehicle MMST VEHICLE DATA
Description
vehicle, 274, 302, 304, 348, 391
Usage
data(vehicle)
Format
A data frame with 564 observations on the following 20 variables.
Comp a numeric vector
Circ a numeric vector
Dcirc a numeric vector
RR a numeric vector
PrAxisAR a numeric vector
MaxLAR a numeric vector
ScatterR a numeric vector
Elong a numeric vector
PrAxisRect a numeric vector
MaxLRect a numeric vector
SvarMajAxis a numeric vector
SvarMinAxis a numeric vector
SradGyration a numeric vector
SkewMajAxis a numeric vector
SkewMinAxis a numeric vector
KurtMinAxis a numeric vector
KurtMajAxis a numeric vector
Hratio a numeric vector
classdigit a factor with levels 1 2 3 4
class a factor with levels bus opel saab van
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
wdbc MMST WDBC DATA
Description
Wisconsin diagnostic breast cancer, 239, 241, 246, 247, 249, 251, 255, 256, 260, 279, 348, 387, 462, 508, 550
Usage
data(wdbc)
Format
A data frame with 569 observations on the following 32 variables.
id a numeric vector
class a factor with levels B M
radius.mv a numeric vector
texture.mv a numeric vector
peri.mv a numeric vector
area.mv a numeric vector
smooth.mv a numeric vector
comp.mv a numeric vector
scav.mv a numeric vector
ncav.mv a numeric vector
symt.mv a numeric vector
fracd.mv a numeric vector
radius.sd a numeric vector
texture.sd a numeric vector
peri.sd a numeric vector
area.sd a numeric vector
smooth.sd a numeric vector
comp.sd a numeric vector
scav.sd a numeric vector
ncav.sd a numeric vector
symt.sd a numeric vector
fracd.sd a numeric vector
radius.ev a numeric vector
texture.ev a numeric vector
peri.ev a numeric vector
area.ev a numeric vector
smooth.ev a numeric vector
comp.ev a numeric vector
scav.ev a numeric vector
ncav.ev a numeric vector
symt.ev a numeric vector
fracd.ev a numeric vector
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Street, W.N., Wolberg, W.H., and Mangasarian, O.L. (1993). Nuclear feature extraction for breast tumor diagnosis, IS&T/SPIE International Symposium on Electronic Imaging: Science and Technology (San Jose, CA), 1905, 861-870.
wine MMST WINE DATA
Description
wine, 275, 278, 348, 391
Usage
data(wine)
Format
A data frame with 178 observations on the following 15 variables.
Alcohol a numeric vector
MalicAcid a numeric vector
Ash a numeric vector
AlcAsh a numeric vector
Mg a numeric vector
Phenols a numeric vector
Flav a numeric vector
NonFlavPhenols a numeric vector
Proa a numeric vector
Color a numeric vector
Hue a numeric vector
OD a numeric vector
Proline a numeric vector
classdigit a factor with levels 1 2 3
class a factor with levels Barbera Barolo Grignolino
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
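Examples
As an illustration of working with the numeric measurements and the class factor together, the sketch below standardises the numeric columns, runs a principal component analysis, and averages the first two component scores per class. This is one possible exploratory step, not an analysis taken from the text. When MMST is not installed, a small random stand-in with three of the documented columns is used; its values are placeholders, not wine measurements.

```r
# Use the packaged data when MMST is installed; otherwise build a small
# random stand-in (placeholder values only, three of the 13 measurements).
if (requireNamespace("MMST", quietly = TRUE)) {
  data(wine, package = "MMST")
} else {
  set.seed(1)
  wine <- data.frame(
    Alcohol = rnorm(9, mean = 13, sd = 0.8),
    Hue     = rnorm(9, mean = 1, sd = 0.2),
    Proline = rnorm(9, mean = 750, sd = 300),
    class   = factor(rep(c("Barbera", "Barolo", "Grignolino"), each = 3))
  )
}

# Keep only numeric columns; the factor columns are excluded automatically.
is_measure <- vapply(wine, is.numeric, logical(1))

# scale. = TRUE standardises each column, which matters here because the
# measurements are on very different scales.
pca <- prcomp(wine[, is_measure], scale. = TRUE)

# First two principal component scores, labelled by class.
scores <- data.frame(pca$x[, 1:2], class = wine$class)
print(aggregate(cbind(PC1, PC2) ~ class, data = scores, FUN = mean))
```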
x498.matrix MMST 498 MATRIX DATA
Description
mapping the protein universe, 484, 486
Usage
data(x498.matrix)
Format
A distance matrix mapping 498 unnamed variables representing proteins.
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Hou, J., Sims, G.E., Zhang, C., and Kim, S.-H. (2003). A global representation of the protein fold space, Proceedings of the National Academy of Sciences, 100, 2386-2390.
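Examples
A distance matrix can be fed to classical multidimensional scaling to obtain low-dimensional coordinates. The sketch below assumes that x498.matrix can be coerced to a square numeric matrix of pairwise distances; the Format entry does not confirm the storage class, so the coercion step is an assumption. When MMST is not installed, distances between 20 random points are used as a placeholder.

```r
# Use the packaged data when MMST is installed; otherwise compute distances
# between 20 random 3-D points as a placeholder (not protein data).
if (requireNamespace("MMST", quietly = TRUE)) {
  data(x498.matrix, package = "MMST")
  # Assumption: the object coerces to a square numeric distance matrix.
  d <- as.dist(as.matrix(x498.matrix))
} else {
  set.seed(1)
  d <- dist(matrix(rnorm(20 * 3), nrow = 20))
}

# Classical (metric) multidimensional scaling into two dimensions.
coords <- cmdscale(d, k = 2)

# One row of 2-D coordinates per object in the distance matrix.
print(dim(coords))
print(head(coords))
```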
yeast MMST YEAST DATA
Description
yeast, 275, 279, 348, 391, 508
Usage
data(yeast)
Format
A data frame with 1484 observations on the following 10 variables.
yeast a character vector
mcg a numeric vector
gvh a numeric vector
alm a numeric vector
mit a numeric vector
erl a numeric vector
pox a numeric vector
vac a numeric vector
nuc a numeric vector
site a factor with levels CYT ERL EXC ME1 ME2 ME3 MIT NUC POX VAC
Source
http://archive.ics.uci.edu/ml/datasets.html
References
A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
Index
∗Topic IO
MMST-package, MMST.out
∗Topic datasets
AirlineDistances, alontop, baseball, bodyfat, boston, BritishTowns, bupa, cleveland, color.stimuli, COMBO17, covertype, detergent, diabetes, ecoli, foetal, food, geyser, gilgaied.soil, glass, Hidalgo1872, ionosphere, iris, letter.recognition, leukemia, lloyd, MEG, MMST-package, morse, ncifinal, norwaypaper1, pendigits, pet, pima, primate.scapulae, psych24r, root.stocks, satimage, shoplifting, shuttle, soldat, sonar, spambase, steganography, swiss.roll, SwissBankNotes, tobacco, tumors, turtles, ushighways, vehicle, wdbc, wine, x498.matrix, yeast
Aliases
MMST (MMST-package)
samp05 (lloyd), samp05d (lloyd), samp25 (lloyd), samp25d (lloyd)