# Free Football Predictions Generated by Neural Networks

## Next 10 football predictions

Our primary task now is to integrate our software with a new provider. The mobile version is almost done and we expect to launch it this Sunday. As requested on prosoccer. The page is ugly and lacks a mobile version, but our team is working hard to rewrite all the football scripts and the prediction model, so design is less important for now. Please note that there are no tips or odds for now.

Just the percentages for each outcome. The beta version will be online within 2 weeks. The development of www. Mathematics is the basis for almost all sports, including professional football. We decided to develop unique software that can predict the outcome of a match using several well-known prediction models. The software uses a huge soccer database of over , football results for prediction modeling.

All these calculations are based on the Poisson distribution as a football betting system.

A new window will open, where we need to set the learning parameters: learning rate and momentum. The next thing we should do is determine the values of these parameters. Learning rate is one of the parameters that governs how fast a neural network learns and how effective the training is. Let us assume that the weight of some synapse in the partially trained network is 0.

When the network is presented with a new training sample, the training algorithm requires the synapse to change its weight to 0. If we update the weight straightaway, the neural network will definitely learn the new sample, but it tends to forget all the samples it had learnt previously.

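This is because the current weight encodes everything the network has learned so far. So we do not change the weight directly to 0. Instead, we move it only a fraction of the way toward the requested value. So, the weight of the synapse gets changed to 0.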
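The partial update described above can be sketched in a few lines of Python. The starting weight, the requested weight and the learning rate used here are hypothetical illustration values, not the figures from the original example, and `blend_weight` is a name introduced only for this sketch.

```python
def blend_weight(current, requested, learning_rate):
    """Move the weight only a fraction of the way toward the requested value."""
    return current + learning_rate * (requested - current)


# Hypothetical values chosen only for illustration.
current_weight = 0.25
requested_weight = 0.75
learning_rate = 0.5

new_weight = blend_weight(current_weight, requested_weight, learning_rate)
print(new_weight)
```

With a learning rate of 1 the weight jumps straight to the requested value (the "forgetful" case), and with a learning rate of 0 it never moves at all.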

Proceeding this way, all the training samples are presented in some random order. Learning rate is a value ranging from zero to unity. Choosing a value very close to zero requires a large number of training cycles.

This makes the training process extremely slow. On the other hand, if the learning rate is very large, the weights diverge, the objective error function oscillates heavily, and the network reaches a state where no useful training takes place. The momentum parameter is used to prevent the system from converging to a local minimum or saddle point. A high momentum parameter can also help to increase the speed of convergence of the system.

However, setting the momentum parameter too high can create a risk of overshooting the minimum, which can cause the system to become unstable. A momentum coefficient that is too low cannot reliably avoid local minima, and can also slow down the training of the system.
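The effect of these two parameters can be illustrated with a small experiment. The sketch below is not the training algorithm of the tool used in this article; it is plain gradient descent with a momentum term on a toy objective, f(w) = w², chosen as an assumption because its minimum (w = 0) is known. All parameter values are hypothetical.

```python
def minimize(learning_rate, momentum, steps=200, start=1.0):
    """Gradient descent with momentum on the toy objective f(w) = w**2."""
    w = start
    velocity = 0.0
    for _ in range(steps):
        gradient = 2.0 * w  # derivative of w**2
        # Keep part of the previous step (momentum) and add the new gradient step.
        velocity = momentum * velocity - learning_rate * gradient
        w += velocity
    return w


slow = minimize(learning_rate=0.001, momentum=0.0)     # tiny steps, still far from 0
boosted = minimize(learning_rate=0.001, momentum=0.9)  # same rate, momentum speeds it up
diverged = minimize(learning_rate=1.1, momentum=0.0)   # rate too large, weight blows up

print(abs(slow), abs(boosted), abs(diverged))
```

After the same number of steps, the run with momentum ends much closer to the minimum than the run without it, while the run with the oversized learning rate moves further away on every step.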

There are two stopping criteria: the maximum error and the maximum number of learning iterations, both of which are intuitively clear. We can see in the pictures below that training was unsuccessful. After iterations the neural network failed to learn the problem with an error less than 0. We can test this network, but the error will be greater than expected.
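How the two stopping criteria interact can be sketched as a simple loop. This is an illustration under stated assumptions, not the tool's own code: `train` and `make_halving_step` are hypothetical names, and the "training pass" is a stand-in whose error simply halves on every call.

```python
def train(step, max_error, max_iterations):
    """Call step() until the error is small enough or the iteration budget runs out."""
    error = float("inf")
    iteration = 0
    while iteration < max_iterations:
        error = step()  # one pass over the training set, returns the total error
        iteration += 1
        if error <= max_error:
            return iteration, error, True  # stopped by the error criterion
    return iteration, error, False  # stopped by the iteration limit


def make_halving_step(start_error=1.0):
    """Hypothetical stand-in for a real training pass: the error halves each call."""
    state = {"error": start_error}

    def step():
        state["error"] /= 2.0
        return state["error"]

    return step


print(train(make_halving_step(), max_error=0.01, max_iterations=1000))
print(train(make_halving_step(), max_error=0.01, max_iterations=5))
```

The third element of the result tells the two cases apart: `True` means the error target was reached, `False` means training ran out of iterations first, which corresponds to the unsuccessful runs described in this article.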

After the network is trained, we click 'Test' to see the total error and all the individual errors. The individual errors are also quite large. Let's look at the last result.

Values of output are 0. With this information we can conclude that this neural network is not good enough. So let us try something else. In the network window, click the Randomize button and then click the Train button. That means that we will set a value of 0. Increasing the value of the learning rate, we find that the objective error function oscillates more and the network reaches a state where no useful training takes place. In the table below we present the results of the other trainings for the first architecture, covering the next three sessions.

No graphs are given for the other trainings. Training results for the first architecture. Based on the data in Table 1, it can be seen that regardless of the training parameters the error does not fall below the specified level, even if we train the network for a different number of iterations.

This may be due to the small number of hidden neurons. In the following solution we will increase the number of hidden neurons. The next neural network will have the same number of input and output neurons but a different number of neurons in the hidden layer. We will use 4 hidden-layer neurons. The network is named PremierLeague2. The first training run of the second architecture will start with extremely low values of learning rate and momentum.

First click the 'Train' button. In the 'Set Learning parameters' dialog, in the field 'Set Stopping criteria', enter 0. This was done so that the graphical display of the training of this network would be clearer. In the field 'Set Learning parameters', enter 0. After entering these values, click the 'Train' button. During the testing we unsuccessfully trained the neural network named PremierLeague2. The summary of the results is shown in Table 2.

From the graphs above it can be seen that there are no large shifts in the prediction from iteration to iteration. More precisely, the fluctuations are very small and the values are around 0. The reason for such small fluctuations is that the learning rate is very close to zero.

Also, because the learning rate is so small, the neural network is not able to learn quickly. In addition, the small value of momentum slows down the training of the system. As in the last attempt, we will try extreme values of learning rate and momentum, this time extremely high ones. Compared to the previous training, we will just replace the values of learning rate and momentum. For learning rate we will enter 0.

The other options will be the same as in the previous training. In the picture below we see the distinction between small and large values of the learning parameters. We set the momentum parameter too high and created a risk of overshooting the minimum, which caused the system to become unstable.

In addition, the learning rate is very large, so the weights diverge, the objective error function oscillates heavily, and the network reaches a state where no useful training takes place. In the previous two attempts we used extreme values of the learning parameters, so this time we will use the recommended values.

The following useful conclusion can be drawn from this training. We can see that the architecture with four hidden neurons is not appropriate for this training set, because continuing to train the neural network does not bring us to the desired maximum error. The error is still much higher than the desired level. The oscillations are smaller than in the second training, which was expected because the training parameters are smaller than in the previous case, but on the other hand the neural network is not able to learn quickly and training is slow, just as in the first training.

In the table below we present the results of all trainings for the second architecture, covering the previous three sessions. Training results for the second architecture. After several tries with different architectures and parameters we got the results given in Table 3. There is an interesting pattern in the data. If we look at the number of hidden neurons and the total net error, we can see that a higher number of neurons leads to a lower total net error.

This neural network will contain 16 neurons in the hidden layer, as we see in the picture below, and the same options as the previous networks. First we will try the recommended values for learning rate and momentum.

During the testing we successfully trained the neural network named PremierLeague6. The summary of the results is shown in the final table at the end of this article. The total net error descends slowly, but with high oscillation, and training finally stops when the error reaches a level lower than the given 0.

Total Mean Square Error measures the average of the squares of the "errors". The error is the amount by which the value implied by the estimator differs from the quantity to be estimated.

A mean square error of zero, meaning that the estimator predicts observations of the parameter with perfect accuracy, is the ideal, but is practically never possible. The unbiased model with the smallest mean square error is generally interpreted as best explaining the variability in the observations. The test showed that the total mean square error is 0.
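The definition above translates directly into code. The following is a minimal sketch that averages the squared differences over every output value of every sample; the exact normalisation used by the training tool may differ, and the target and output values are hypothetical.

```python
def mean_square_error(targets, outputs):
    """Average of the squared differences over all output values of all samples."""
    squared = [
        (t - o) ** 2
        for target_row, output_row in zip(targets, outputs)
        for t, o in zip(target_row, output_row)
    ]
    return sum(squared) / len(squared)


# Hypothetical desired outputs and network outputs for two matches.
targets = [[1, 0, 0], [0, 1, 0]]
outputs = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]

mse = mean_square_error(targets, outputs)
print(mse)
```

A network that reproduces every target exactly gives a value of zero, which is the ideal case mentioned above.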

The goal of experimental design is to construct experiments in such a way that when the observations are analyzed, the mean square error is close to zero relative to the magnitude of at least one of the estimated treatment effects. Now we need to examine all the individual errors for every single instance and check if there are any extreme values.

When you have a large data set, individual testing requires a lot of time. Instead of testing every observation, we will randomly choose 5 observations to be subjected to individual testing.
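Picking the observations at random, rather than by hand, can be done with the standard library. In this sketch the data set size of 100 and the seed are assumptions made only for the example, and `pick_observations` is a hypothetical helper name.

```python
import random


def pick_observations(n_observations, k=5, seed=None):
    """Return k distinct observation numbers (1-based), chosen at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, n_observations + 1), k))


# Hypothetical data set size; a fixed seed makes the choice repeatable.
chosen = pick_observations(n_observations=100, k=5, seed=42)
print(chosen)
```

Sampling without replacement guarantees that the same observation is never tested twice.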

The following three tables show the input, output and error values for the 5 randomly selected observations. These values are taken from the Test Results window. In the introduction we mentioned that a result can belong to one of three groups. So if the home team won, the output would be 1, 0, 0; if the away team won, it would be 0, 0, 1; and if they played a draw, the output would be 0, 1, 0. Ideally, the output values after the test would be the same as the output values before testing.
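The three-way output coding described above can be written down as a pair of small helpers. The function names are hypothetical, and reading the network's answer as "the class with the largest output" is an assumption of this sketch, not something the original states.

```python
OUTCOMES = ("home", "draw", "away")


def encode(outcome):
    """Turn a match result into the 3-value target vector used for training."""
    if outcome not in OUTCOMES:
        raise ValueError(f"unknown outcome: {outcome!r}")
    return [1 if outcome == name else 0 for name in OUTCOMES]


def decode(outputs):
    """Read a network output as the outcome with the largest value (assumed rule)."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return OUTCOMES[best]


print(encode("home"), encode("draw"), encode("away"))
print(decode([0.1, 0.2, 0.7]))
```

Comparing `decode(network_output)` with the known result is one simple way to check whether an individual observation was classified correctly.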

As with other statistical methods, classification using neural networks includes errors that arise during the approximation. The individual errors between the original and the estimated values are shown in Table 4. For observations 3 and 63 we can say that there is a reasonable mistake in classification.

Therefore, we will continue training the neural network by increasing the learning rate to 0. At the beginning we said that the goal is to quickly find the smallest network that converges and then refine the answer by working back from there. Since we have found the smallest neural network, do the following: After iterations, the total net error is 0.

But what is most interesting are the error values of the observations. They are given in Table 4. Because this network learned the data perfectly, the individual errors are equal to zero, as we see in Table 4.

If you do not get the desired results, continue to gradually increase the training parameters. The neural network will definitely learn the new sample, and it will not forget all the samples it had learnt previously. When the training is complete, you will want to check the network performance.

A learning neural network is expected to extract rules from a finite set of examples. It is often the case that the neural network memorizes the training data well, but fails to generate correct output for some of the new test data.

Therefore, it is desirable to come up with some form of regularization. One form of regularization is to split the training set into a new training set and a validation set. After each step through the new training set, the neural network is evaluated on the validation set.

The network with the best performance on the validation set is then used for actual testing. To do this, you compute the validation error rate periodically during training and stop training when the validation error rate starts to go up. However, validation error is not a good estimate of the generalization error if your initial set consists of a relatively small number of instances. Our initial set, which we named PremierLeague , consists of only instances.
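The validation-based stopping rule described above can be sketched as follows. The `patience` parameter, which tolerates a few non-improving checks before stopping, is an assumption added for this example, and the validation errors are hypothetical numbers rather than results from this article.

```python
def best_epoch(validation_errors, patience=3):
    """Return (index, error) of the best validation check, stopping once the
    error has failed to improve for `patience` consecutive checks."""
    best_index, best_error = 0, float("inf")
    for index, error in enumerate(validation_errors):
        if error < best_error:
            best_index, best_error = index, error
        elif index - best_index >= patience:
            break  # validation error keeps rising: stop training here
    return best_index, best_error


# Hypothetical validation errors measured after each pass over the training set.
errors = [0.9, 0.6, 0.4, 0.35, 0.37, 0.41, 0.5, 0.2]
print(best_epoch(errors))
```

Note that the last value in the list is never reached: once training has stopped, any later improvement is simply not observed, which is the usual trade-off of this technique.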

Such a small number of instances is insufficient to perform validation. In this case, instead of validation, we will use generalization as a form of regularization. One way to get an appropriate estimate of the generalization error is to run the neural network on a test set of data that is not used at all during the training process.

The generalization error is usually defined as the expected value of the square of the difference between the learned function and the exact target. In the following examples we will check the generalization error. From example to example, we will increase the number of instances in the training set and decrease the number of instances in the set used for testing.
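Splitting the data into a training part and a held-out test part can be sketched like this. The helper name, the data set size of 100 and the list of fractions are assumptions for illustration; only the idea of growing the training share from one example to the next comes from the text.

```python
import random


def split_instances(instances, train_fraction, seed=0):
    """Shuffle the instances and split them into a training and a test part."""
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical data set of 100 numbered instances.
instances = list(range(100))
for fraction in (0.7, 0.8, 0.9):
    train_set, test_set = split_instances(instances, fraction)
    print(fraction, len(train_set), len(test_set))
```

Because the test part is never shown to the network during training, the error measured on it serves as an estimate of the generalization error.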

The first group will be called PremierLeague70, and the second PremierLeague. Unlike in the previous training, there is now no need to create a new neural network. The advanced training techniques consist of examining the performance of existing architectures using new training and test sets. We found satisfactory results using the architecture PremierLeague6.

For the rest of this article we will use not only this architecture, but also the training parameters that previously brought us the desired results with it. But before you open the existing architecture, create the new training sets. Name the first training set PremierLeague70 and the second one PremierLeague. Now open the neural network PremierLeague6, select the training set PremierLeague70, and in the new network window press the 'Train' button.

The parameters that we now need to set are the same as in the previous training attempt: we will not limit the maximum number of iterations, and we will check 'Display error graph', as we want to see how the error changes throughout the iteration sequence. Then press the 'Train' button again and see what happens. Although the problem contained fewer instances, it took iterations to train this network.

Because it managed to converge to a total net error of 0.

## Test the network

After successfully training the neural network, we can test it to discover whether the results will be as good as in the previous testing.