-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathyahtzee_Nour.Rmd
325 lines (218 loc) · 9.59 KB
/
yahtzee_Nour.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
---
title: "Loops and basic functions"
author: ""
date: "September 27, 2017"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
In this project we will consider loops and basic functions. To begin we will work on using loops to simulate a famous dice game: Yahtzee.
# Part 1: Yahtzee
In the game of Yahtzee, five dice are tossed, and various combinations of numbers, similar to poker hands, are assigned point values. In the game, dice can be selected and re-tossed, but we will focus on calculating the probabilities for the first toss only. We will also deal only with the "lower half" of the score card in the game. For the interested student, continuing this project to account for the complete rules of play would be an entertaining challenge.
Anyone not familiar with Yahtzee should try a web search for the rules of the game. Some sites have applets that let you play online.
All you really need to know for this lesson, though, is which combinations are counted. We will call these "hands," as the combinations in poker are called. The hands in Yahtzee are:
- Three of a Kind (three of one number and two others that are different)
- Full House (three of one number and two of another number)
- Four of a Kind (four of one number and one other)
- Yahtzee (five of the same number)
- Small Straight (four consecutive numbers)
- Large Straight (five consecutive numbers)
- Chance (anything that does not fit the above patterns)
We begin by simulating dice rolls. For example we can simulate one roll of a die to be:
```{r}
sample(1:6, size=1, replace=T)
```
### Problem 1: Dice Rolls
Using a loop create a matrix of 10 rolls all consisting of 5 dice in each roll.
```{r}
x <- matrix(, nrow=10, ncol=5)
for (i in 1:10)
{
x[i,] <- sample(1:6, size=5, replace=T)
}
x
```
### Prolem 2: Sorting Dice Rolls
Your output will look similar to what is show below.
```
x
[,1] [,2] [,3] [,4] [,5]
[1,] 1 5 2 4 1
[2,] 2 3 3 4 6
[3,] 4 5 5 2 5
[4,] 4 1 4 1 5
[5,] 4 4 3 5 3
[6,] 3 4 3 5 3
[7,] 3 4 2 6 2
[8,] 4 3 5 2 2
[9,] 4 4 4 6 2
[10,] 6 6 1 4 6
```
We will need to sort this. The following function is called a *Bubble Sort*. Comment on every part of the code to explain what it is doing:
```
bubble_sort = function(vec) {
no_passes = 0 # set counter
while(1) {
no_swaps = 0 # set counter for the number of columes to break the loop when it ends counting the colums.
for (j in 1 : (length(vec) - 1 - no_passes)) { # locate the first integer in the range of vector length
if (vec[j] > vec[j + 1]) { # compare each two consecutive numbers in the row.
s = vec[j] # store the fisrt number if it passed the condition of being bigger than the next number.
vec[j] = vec[j+1] # swap the location between the two intgers.
vec[j+1] = s # store the next integer in the memory.
no_swaps = no_swaps + 1 # counter of swaps .
}
}
no_passes = no_passes + 1 # counter of passes
if(no_swaps == 0) break # the break works when there is zero swaps happening.
}
vec
}
bubble_sort(x[1,])
```
```{r}
bubblesort <- function(x) {
# x is initially the input vector and will be
# modified to form the output
if (length(x) < 2) return (x)
# last is the last element to compare with
for(last in length(x):2) {
for(first in 1:(last - 1)) {
if(x[first] > x[first + 1]) { # swap the pair
save <- x[first]
x[first] <- x[first + 1]
x[first + 1] <- save
}
}
}
return (x)
}
```
### Problem 3: Applying the Sort with dice rolls
Apply Bubble sort to each row of your matrix to sort the data. (You may wish to do this within your loop so the rows are sorted as they get added.)
```{r}
for (i in 1:10) {
x[i,] = bubblesort(x[i,])
}
x
````
### Problem 4: Looking for Yahtzee Combinations
Sorting the dice means that all dice that are equal will be next to each other. Thus, to check for a Yahtzee, all we need is to find out if x1=x5. If x1 and x5 are the same, it is not possible (in sorted order) for the numbers in between to be different.
Some examples of Four of a Kind (after sorting) are:
```
1 2 2 2 2
2 2 2 2 3
```
As you can see, either x1=x4 or x2=x5. If it is not a Yahtzee, then these two conditions will identify Four of a Kind.
When it comes to Three of a Kind, we run into a little complication. If we follow the strategy used for Four of a Kind, we would check if x1=x3, x2=x4, or x3=x5. Consider the following examples:
```
1 1 1 2 3
1 2 2 2 5
2 4 5 5 5
```
These would all be correctly identified. But what about:
```
1 1 1 2 2
3 3 5 5 5
```
As you can see, these would all fulfill the first and third conditions proposed above, but they should be classified as Full House. Therefore we also need to check for a Full House in these cases. The following identification routine checks for these types of Hands. At each stage, we have to be very careful that all possibilities are accounted for.
```{r}
x <- bubblesort(sample(1:6, size=5, replace=T)) #1. yahtzee > 4ofKind > large straight > full house > small straight > 3ofKind > chance
x <- c(2,3,4,5,5)
if (x[1]==x[5]){
hand="Yahtzee"
} else if (x[1]==x[4] | x[2]==x[5]) {
hand="4ofKind"
}else if (x[5]== x[4]+1 & x[4]==x[3]+1 & x[3]==x[2]+1 & x[2]==x[1+1]) {
hand = "Large Straight" # if(x[1]==1 & sum(x)==15 | x[1]==2 & sum(x)==17)
} else if (x[1]==x[3] & x[4]==x[5] | x[3]==x[5] & x[1] == x[2]){
hand= "Full House"
}else if (x[5]== x[4]+1 & x[4]==x[3]+1 & x[3]==x[2]+1 | x[4]==x[3]+1 & x[3]==x[2]+1 & x[2]==x[1+1]) {
hand = "small Straight"
} else if (x[1]==x[3]| x[2]==x[4] | x[3]==x[5]){
hand="3ofKind"
} else
hand="Chance"
hand
```
The above code lays this out for Yahtzee and 4 of a kind. Find statements that work for the rest. Make sets of x and test them to make sure they work.
### Problem 5: Simulating 100,000 dice rolls
```{r}
for (i in 1:100000){
x <- bubblesort(sample(1:6, size=5, replace=T))
if (x[1]==x[5]){
hand[i]="Yahtzee"
} else if (x[1]==x[4] | x[2]==x[5]) {
hand[i]="4ofKind"
}else if (x[5]== x[4]+1 & x[4]==x[3]+1 & x[3]==x[2]+1 & x[2]==x[1+1]) {
hand[i] = "Large Straight" # if(x[1]==1 & sum(x)==15 | x[1]==2 & sum(x)==17)
} else if (x[1]==x[3] & x[4]==x[5] | x[3]==x[5] & x[1] == x[2]){
hand[i]= "Full House"
}else if (x[5]== x[4]+1 & x[4]==x[3]+1 & x[3]==x[2]+1 | x[4]==x[3]+1 & x[3]==x[2]+1 & x[2]==x[1]+1) {
hand[i] = "small Straight"
} else if (x[1]==x[3]| x[2]==x[4] | x[3]==x[5]){
hand[i]="3ofKind"
} else
hand[i]="Chance" }
table(hand) / 100000
```
Put this all together and simulate this game 100,000 times and check your probabilities. You should get the following:
--------------------------------
Hand Probability
----------------- --------------
Yahtzee .0008
Four of a Kind .0193
Three of a Kind .1543
Full House .0386
Large Straight .0309
Small Straight .1235
Chance .6326
-------------------------------------------------------
## Part II: PHP 2560 Only
Standardizing a variable means subtracting the mean, and then dividing by the standard deviation. Let’s use a loop to standardize the numeric columns in the [Western Collaborative Group Study](https://clinicaltrials.gov/ct2/show/NCT00005174). This study began in 1960 with 3154 men ages 39-59, who were employed in one of 11 California based companies. They were followed until 1969 during this time, 257 of these men developed coronary heart disease (CHD). You can read this data in with the code below. You can access this dataset with the following code:
```
suppressMessages(library(foreign))
wcgs <- read.dta("https://drive.google.com/uc?export=download&id=0B8CsRLdwqzbzYWxfN3ExQllBQkU")
```
The data has the following variables:
WCGS has the following variables:
-----------------------------------------------------------
Name Description
------- -------------------------------------------
id Subject identification number
age Age in years
height Height in inches
weight Weight in lbs.
sbp Systolic blood pressure in mm
dbp Diastolic blood pressure in mm Hg
chol Fasting serum cholesterol in mm
behpat Behavior
1 A1
2 A2
3 B3
4 B4
ncigs Cigarettes per day
dibpat Behavior
1 type A
2 type B
chd69 Coronary heart disease
1 Yes
0 no
typechd Type of CHD
1 myocardial infarction or death
2 silent myocardial infarction
3 angina perctoris
time169 Time of CHD event or end of follow-up
arcus Arcus senilis
0 absent
1 present
bmi Body Mass Index
-----------------------------------------------------------
### Question 6: Standardize Function
A. Create a function called standardize.me() that takes a numeric vector as an argument, and returns the standardized version of the vector.
B. Assign all the numeric columns of the original WCGS dataset to a new dataset called WCGS.new.
C. Using a loop and your new function, standardize all the variables WCGS.new dataset.
D. What should the mean and standard deviation of all your new standardized variables be? Test your prediction by running a loop
### Question 7: Looping to Calculate
A. Using a loop, calculate the mean weight of the subjects separated by the type of CHD they have.
B. Now do the same thing, but now don’t use a loop