Type: Package
Title: Datasets for the Book Graphical Data Analysis with R
Version: 0.93
Date: 2015-05-02
Author: Antony Unwin
Maintainer: Antony Unwin <unwin@math.uni-augsburg.de>
Description: Datasets used in the book 'Graphical Data Analysis with R' (Antony Unwin, CRC Press 2015).
Depends: R (≥ 2.10)
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Suggests: ggplot2
LazyData: yes
NeedsCompilation: no
Packaged: 2015-05-02 09:42:33 UTC; antonyunwin
Repository: CRAN
Date/Publication: 2015-05-02 14:11:23
Top performances in the Decathlon from 1985 to 2006.
Description
The point scoring system for the Decathlon last changed in 1985. Best annual performances of 6800 points and over for a twenty-one-year period after the new rules were introduced were downloaded from the excellent Estonian website Decathlon2000. Hand-timed performances were not included. Names with accents have been simplified.
Usage
data(Decathlon)
Format
A data frame with 7968 observations on the following 24 variables.
Totalpoints
the total points achieved over all 10 events
DecathleteName
Decathlete's name
Nationality
Decathlete's nationality
m100
Time for the 100 metres (secs)
Longjump
Distance jumped (metres)
Shotput
Distance putting the shot (metres)
Highjump
Height jumped (metres)
m400
Time for the 400 metres (secs)
m110hurdles
Time for the 110 metres hurdles (secs)
Discus
Distance throwing the discus (metres)
Polevault
Height achieved (metres)
Javelin
Distance throwing the javelin (metres)
m1500
Time for the 1500 metres (secs)
yearEvent
Year of performance
P100m
Points for performance in 100 metres
Plj
Points for performance in long jump
Psp
Points for performance in putting the shot
Phj
Points for performance in high jump
P400m
Points for performance in 400 metres
P110h
Points for performance in 110 metres hurdles
Ppv
Points for performance in pole vault
Pdt
Points for performance in discus
Pjt
Points for performance in javelin
P1500
Points for performance in 1500 metres
Source
Examples
data(Decathlon, package="GDAdata")
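# Select the ten points columns; the unanchored pattern "P.*" would also match Polevault
summary(Decathlon[, setdiff(grep("^P", names(Decathlon), value=TRUE), "Polevault")])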
library(ggplot2)
ggplot(Decathlon, aes(Plj)) + geom_histogram()
ggplot(Decathlon, aes(P100m, Plj)) + geom_point()
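Since Totalpoints is described as the total over all 10 events, one possible consistency check is to sum the ten points columns and compare the result with the stored total. The following is a minimal sketch under that assumption; `event_point_cols` and `check_total_points` are hypothetical names introduced here and are not part of GDAdata.

```r
# Hypothetical helper names for this sketch; they are not provided by GDAdata.
# The ten per-event points columns, as listed in the Format section.
event_point_cols <- c("P100m", "Plj", "Psp", "Phj", "P400m",
                      "P110h", "Pdt", "Ppv", "Pjt", "P1500")

check_total_points <- function(d, cols = event_point_cols) {
  # Row-wise sum of the event points; missing values propagate to the sum
  recomputed <- rowSums(d[, cols, drop = FALSE])
  data.frame(Totalpoints = d$Totalpoints,
             Recomputed  = recomputed,
             Difference  = d$Totalpoints - recomputed)
}

# Run on the real data only if the package is installed
if (requireNamespace("GDAdata", quietly = TRUE)) {
  data(Decathlon, package = "GDAdata")
  chk <- check_total_points(Decathlon)
  # Number of rows where the stored total differs from the recomputed sum
  print(sum(chk$Difference != 0, na.rm = TRUE))
}
```

Any rows with a non-zero difference would be worth inspecting individually before using Totalpoints and the event points together.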
Figures for the trade between England and the East Indies in the 18th century.
Description
The data have been estimated from the graphic in the first edition of Playfair's Commercial and Political Atlas by the website 'Me, myself, and BI'.
Usage
data(EastIndiesTrade)
Format
A data frame with 81 observations on the following 3 variables.
Year
the data go from 1700 to 1780
Exports
Exports from England to the East Indies (millions of pounds)
Imports
Imports to England from the East Indies (millions of pounds)
Source
http://blog.bissantz.com/vis-a-vis
Examples
data(EastIndiesTrade, package="GDAdata")
library(ggplot2)
ggplot(EastIndiesTrade, aes(x=Year, y=Exports-Imports)) + geom_line()
Star data useful for drawing a Hertzsprung-Russell diagram.
Description
Hertzsprung-Russell diagrams plot star luminosity (brightness) against temperature (colour). The first one was drawn just over 100 years ago. The dataset is the Yale Trigonometric Parallax Dataset and this version can be found on the webpage of the Astronomy Department of Case Western Reserve University.
Usage
data(HRstars)
Format
A data frame with 6220 observations on the following 5 variables.
ID
star ID number
V
apparent V magnitude
BV
observed B-V color
Para
observed parallax (in arcsec)
Uncert
uncertainty in parallax (in milliarcsec)
Source
http://burro.astr.cwru.edu/Academics/Astr221/HW/HW5/HW5.html
Examples
data(HRstars, package="GDAdata")
with(HRstars, hist(BV))
with(HRstars, hist(V))
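The dataset holds apparent magnitudes, whereas a Hertzsprung-Russell diagram needs a luminosity measure. A common choice is the absolute magnitude M, obtained from the apparent magnitude V and the parallax p in arcseconds via M = V + 5 + 5 log10(p), because the distance in parsecs is 1/p. The sketch below assumes that Para is in arcseconds, as documented above; `abs_magnitude` and the column `AbsMag` are hypothetical names introduced for illustration.

```r
# Hypothetical helper for this sketch; not part of GDAdata.
# Distance d = 1 / p parsecs, so M = V - 5 * log10(d / 10) = V + 5 + 5 * log10(p)
abs_magnitude <- function(V, Para) {
  # Non-positive parallaxes give no usable distance; return NA for them
  p <- ifelse(Para > 0, Para, NA_real_)
  V + 5 + 5 * log10(p)
}

# Run on the real data only if the package is installed
if (requireNamespace("GDAdata", quietly = TRUE)) {
  data(HRstars, package = "GDAdata")
  HRstars$AbsMag <- abs_magnitude(HRstars$V, HRstars$Para)
  # Brighter stars have smaller magnitudes, so the y axis is reversed
  with(HRstars, plot(BV, AbsMag, pch = ".",
                     ylim = rev(range(AbsMag, na.rm = TRUE)),
                     xlab = "B-V colour", ylab = "Absolute magnitude"))
}
```

Stars with large values of Uncert have poorly determined distances, so it may be sensible to filter on that variable before interpreting the plot.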
Data from the long jump final in the 1968 Mexico Olympics.
Description
The best long jumps by the 16 finalists in the 1968 Mexico Olympics. Each athlete jumped up to six times, though the winner of the Gold Medal, Bob Beamon, only jumped twice.
Usage
data(MexLJ)
Format
A data frame with 16 observations on the following variable.
Jump
Distance jumped measured in metres
Source
http://en.wikipedia.org/wiki/Athletics_at_the_1968_Summer_Olympics_-_Men's_long_jump
Examples
data(MexLJ, package="GDAdata")
with(MexLJ, summary(Jump))
with(MexLJ, hist(Jump,breaks=seq(7.25,9,0.25)))
World Speed Skiing Competition, Verbier 21st April, 2011
Description
There were separate Speed Skiing competitions for men (79 participants) and women (12 participants).
Usage
data(SpeedSki)
Format
A data frame with 91 observations on the following 10 variables.
Rank
Finishing position by sex
Bib
Start number
FIS.Code
Skier's international skiing ID number
Name
Skier's name
Year
Skier's year of birth
Nation
Skier's nationality
Speed
Speed achieved in km/hr
Sex
Female or Male
Event
Speed Downhill, Speed Downhill Junior or Speed One
no.of.runs
Number of runs
Source
http://www.fis-ski.com/de/606/612.html?sector=SS&raceid=262 (men)
http://www.fis-ski.com/de/606/612.html?sector=SS&raceid=263 (women)
Examples
data(SpeedSki, package="GDAdata")
with(SpeedSki, summary(Speed))
library(ggplot2)
ggplot(SpeedSki, aes(Speed)) + geom_histogram(binwidth=5)
Nutritional value of food.
Description
Nutritional value of different foods based on standard serving sizes.
Usage
data(foodnames)
Format
A data frame with 961 observations on the following 9 variables.
Name
name of food (not unique)
Measure
serving description
Fat.grams
grams of fat in a standard serving
Food.energy.calories
calories per serving
Carbohydrates.grams
grams of carbohydrates per serving
Protein.grams
grams of protein per serving
Cholesterol.mg
cholesterol in mg per serving
weight.grams
weight in grams of a standard serving
Saturated.fat.grams
grams of saturated fat per serving
Source
The data are used in A. Izenman (2008), Modern Multivariate Statistical Techniques, Springer
and are available on the accompanying website
http://astro.temple.edu/~alan/MMST/
Examples
data(foodnames, package="GDAdata")
summary(foodnames)
library(ggplot2)
ggplot(foodnames, aes(Fat.grams, Saturated.fat.grams)) + geom_point()
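Because all nutrient values are per serving and serving weights differ between foods, comparisons across foods can be easier after rescaling to a common weight. The sketch below rescales selected columns to values per 100 grams using weight.grams; `per_100g` and the `.per100g` suffix are hypothetical names introduced here, and the rescaling itself is a suggestion rather than something the package provides.

```r
# Hypothetical helper for this sketch; not part of GDAdata.
per_100g <- function(d, nutrient_cols, weight_col = "weight.grams") {
  w <- d[[weight_col]]
  # Zero or negative weights would give Inf or meaningless values; treat as missing
  w[!is.na(w) & w <= 0] <- NA
  # Rescale each requested column from "per serving" to "per 100 grams"
  scaled <- as.data.frame(lapply(d[nutrient_cols], function(x) x * 100 / w))
  names(scaled) <- paste0(nutrient_cols, ".per100g")
  scaled
}

# Run on the real data only if the package is installed
if (requireNamespace("GDAdata", quietly = TRUE)) {
  data(foodnames, package = "GDAdata")
  food100 <- per_100g(foodnames, c("Fat.grams", "Protein.grams"))
  print(summary(food100))
}
```

Note that Cholesterol.mg rescaled in this way is in mg per 100 grams, not grams.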
The Guardian University League Table 2013
Description
The Guardian newspaper in the UK publishes a ranking of British universities each year and it reported these data in May, 2012 as a guide for 2013.
Usage
data(uniranks)
Format
A data frame with 120 observations on the following 13 variables.
Rank
Rank of the University
Institution
University name
UniGroup
Universities can be a member of one of five groups (1994 Group, Guild HE, Million+, Russell, University Alliance) or none
HesaCode
University's Higher Education Statistics Agency code
AvTeachScore
Average Teaching Score
NSSTeaching
University's National Student Survey teaching score
NSSOverall
University's NSS overall score
SpendPerStudent
University expenditure per student (depends on subject)
StudentStaffRatio
Student to Staff ratio
CareerProspects
Proportion of graduates in appropriate level employment or full-time study within six months of graduation
ValueAddScore
“Based upon a sophisticated indexing methodology that tracks students from enrolment to graduation, qualifications upon entry are compared with the award that a student receives at the end of their studies.” (Guardian)
EntryTariff
Value dependent on the average points needed to get on the university's courses
NSSFeedback
University's NSS feedback score
Source
http://www.theguardian.com/news/datablog/2012/may/22/university-guide-2013-guardian-data
Examples
data(uniranks, package="GDAdata")
with(uniranks, table(UniGroup))
library(ggplot2)
ggplot(uniranks, aes(x=NSSTeaching, y=NSSFeedback)) + geom_point()
ggplot(uniranks, aes(x=UniGroup, y=SpendPerStudent)) + geom_boxplot()