# Rise of Kingdoms: Saving gold keys

Here it is. The answer to the age-old question. The question that shows up at least once a day on Reddit and Discord: "Should I save my gold keys and open them all at once?" The assumption is that opening gold keys together in a batch gives more or better rewards than opening them one at a time. In this post we will test whether that is true. As always, you can find the code and dataset for this analysis in my GitHub repo here.

## Samples and expected distributions

We'll begin with a quick primer on expected/population distributions and how they relate…

# Rise of Kingdoms: Show Your Love Event

In a previous post I discussed the point requirements of the last holiday event (Valentine's Day - Pledge of Thorns). Here, I will do the same analysis with the new holiday event (Show Your Love) to confirm those findings. I'll be drawing comparisons between the two events throughout. The supporting code and data for this analysis can be found here.

## Data collection for event

I began by using 800 items (Carnations) and recorded how many points each item provided. Below (left) you can see the frequency distribution of the points provided for each item, as well as the distribution from…

# Rise of Kingdoms: Holiday Events

Rise of Kingdoms holiday events come around every few months. The last one involved collecting ornaments for the Christmas tree, and the current Valentine's Day event requires collection of roses. Using a rose gives you points, and you need a certain number of points to complete the event. We know how many points are needed in total to complete the event (2888 in the case of Valentine's Day), but the problem is that each rose gives you a varying number of points due to critical hits. For example, using one rose could give you 1 point or it could give…
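A small Monte Carlo sketch shows how varying points per rose make the number of roses needed uncertain. Only the 2888-point target comes from the text. The point values and probabilities below are hypothetical placeholders, not the game's actual critical-hit rates, and `roses_needed` and `simulate` are illustrative names.

```python
import random

TARGET_POINTS = 2888  # total needed for the Valentine's Day event (from the text)

# HYPOTHETICAL values for illustration only: not the game's real rates.
POINT_VALUES = [1, 2, 5]
PROBABILITIES = [0.7, 0.2, 0.1]


def roses_needed(target, rng):
    """Use roses one at a time until the running total reaches the target."""
    total = 0
    roses = 0
    while total < target:
        total += rng.choices(POINT_VALUES, weights=PROBABILITIES)[0]
        roses += 1
    return roses


def simulate(n_runs=200, seed=0):
    """Return the number of roses used in each of n_runs simulated events."""
    rng = random.Random(seed)
    return [roses_needed(TARGET_POINTS, rng) for _ in range(n_runs)]


runs = simulate()
print("min:", min(runs), "mean:", sum(runs) / len(runs), "max:", max(runs))
```

Swapping in measured point frequencies for the placeholder values would turn this into an estimate of how many roses a real event is likely to require.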

# Overwatch Data Visualization

In a previous post I talked about an Overwatch dataset I've been collecting from my ranked games. Before running any statistical analysis on the Overwatch data, it is usually a good idea to explore and visualize the dataset. This helps us get a general sense of data patterns, which can help generate hypotheses that can then be tested with more formal statistical models. My own personal philosophy about data visualization/exploration is to approach it with targeted questions of interest. It is all too easy to fall down the rabbit hole of plotting absolutely everything, without a sense of what the…

# Overwatch Ranked Data

I started playing Overwatch towards the end of Season 2 and I thought it would be interesting to start collecting data from my own ranked games. The goal was to maintain a dataset of Overwatch ranked data that I could analyze to better understand how skill rating (SR) changes as a function of, for example, win/loss streaks and medals. I have made the dataset public in case anyone wants to run their own statistics or visualizations. However, I only have a few seasons' worth of data as I took a break from the game after Season 4, and picked it up again…

# Colour image classification (CIFAR-10) using a CNN

As I mentioned in a previous post, a convolutional neural network (CNN) can be used to classify colour images in much the same way as grey scale classification. The way to achieve this is by utilizing the depth dimension of our input tensors and kernels. In this example I'll be using the CIFAR-10 dataset, which consists of 32x32 colour images belonging to 10 different classes. You can see a few examples of each class in the following image from the CIFAR-10 website: Although previously I've talked about the Lasagne and nolearn packages (here and here), extending those to colour images…

# Visualizing Convolutional Neural Networks using nolearn

We previously talked about Convolutional Neural Networks (CNN) and how to use them to recognize handwritten digits using Lasagne. While we can manually extract kernel parameters to visualize weights and activation maps (as discussed in the previous post), the nolearn package offers an easy way to visualize different elements of CNNs. nolearn is a wrapper around Lasagne (which itself is a wrapper around Theano), and offers some nice visualization options such as plotting occlusion maps to help diagnose the performance of a CNN model. Additionally, nolearn offers a very high level API that makes model training even simpler than with Lasagne. In…

# Handwritten digit recognition with a CNN using Lasagne

Following my overview of Convolutional Neural Networks (CNN) in a previous post, now let's build a CNN model to 1) classify images of handwritten digits, and 2) see what is learned by this type of model. Handwritten digit recognition is the 'Hello World' example of the CNN world. I'll be using the MNIST database of handwritten digits, which you can find here. The MNIST database contains grey scale images of size 28x28 (pixels), each containing a handwritten number from 0-9 (inclusive). The goal: given a single image, how do we build a model that can accurately recognize the number that…

# Overview of Convolutional Neural Networks (CNN)

Regular feed-forward artificial neural networks (ANN), like the type featured below, allow us to learn higher order non-linear features, which typically results in improved prediction accuracy over simpler models like logistic regression. However, artificial neural networks have a number of limitations that make them less suited to certain types of problems. For example, imagine a case where we wanted to classify images of handwritten digits. An image is just a 2D array of pixel intensity values, so a small 28x28 pixel image has a total of 784 pixels. If we wanted to classify this using an ANN, we would flatten…
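The flattening step can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a placeholder all-zero image and row-by-row ordering. The helper `flat_index` is a hypothetical name, not something from the original post.

```python
HEIGHT, WIDTH = 28, 28

# Placeholder greyscale image: a 2D grid of pixel intensities (all zeros here).
image = [[0 for _ in range(WIDTH)] for _ in range(HEIGHT)]

# Flatten row by row into a single feature vector.
flat = [pixel for row in image for pixel in row]


def flat_index(row, col, width=WIDTH):
    """Position of pixel (row, col) in the row-major flattened vector."""
    return row * width + col


print(len(flat))           # 784
print(flat_index(1, 0))    # 28: first pixel of the second row
print(flat_index(27, 27))  # 783: last pixel
```

Note that pixels sitting directly above one another in the image end up 28 positions apart in the vector, so the 2D neighbourhood structure is no longer explicit in the input.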

# XOR Logic Gate – Neural Networks (3/3)

(Part 3 of a series on logic gates) We have previously discussed OR logic gates and the importance of bias units in AND gates. Here, we will introduce the XOR gate and show why logistic regression can't model the non-linearity required for this particular problem. As always, the full code for these examples can be found in my GitHub repository here. XOR gates output True if either of the inputs is True, but not both. It acts like a more specific version of the OR gate:

| Input 1 | Input 2 | Output |
|---------|---------|--------|
| 0       | 0       | 0      |
| 0       | 1       | 1      |
| 1       | 0       | 1      |

…
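The "either, but not both" rule can be written down directly and compared with OR. This is a minimal sketch of the gate definitions only, not the neural network code from the post, and the function names `or_gate` and `xor_gate` are chosen here for illustration.

```python
def or_gate(a, b):
    """Return 1 if at least one input is 1."""
    return int(bool(a) or bool(b))


def xor_gate(a, b):
    """Return 1 if exactly one input is 1 (either, but not both)."""
    return int(bool(a) != bool(b))


# Print both gates side by side for every input combination.
print("in1 in2 | OR XOR")
for a in (0, 1):
    for b in (0, 1):
        print(f"  {a}   {b} |  {or_gate(a, b)}   {xor_gate(a, b)}")
```

The two gates agree on every input pair except the one where both inputs are 1.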