Based on the count of datasets added to cart, get up to a combined 45% discount! Check out discounts on historical and in-season datasets.
Football / Soccer Data in Excel Spreadsheets
1. Which leagues are covered in the football/soccer datasets at BigDataBall?
Our dataset currently focuses on the “Big 5” European leagues: the English Premier League, German Bundesliga, Italian Serie A, French Ligue 1, and Spanish La Liga. It includes detailed team-level match statistics exclusively for league matches in these top-tier leagues.
2. What type of data is included in the dataset?
Our datasets include all main statistics for each football match in the covered leagues, starting from the 2019-2020 season. This covers offensive and defensive stats: goals scored, total shots, expected goals (xG), tackles, interceptions, cards, and many more. The datasets also feature betting-related information such as moneyline odds, over/unders, and Asian handicaps, alongside Elo ratings for each team. Finally, they include match-specific information such as team lineups, referee, and venue.
3. How are the betting odds in the dataset calculated?
The odds in our dataset are closing odds, calculated as the average of the odds offered by top bookmakers.
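As a rough illustration of what averaging closing odds looks like, here is a minimal Python sketch. It assumes decimal odds and a plain arithmetic mean; the bookmaker prices, the `home`/`draw`/`away` keys, and both function names are hypothetical and are not the dataset's actual columns or method. The second function shows one common way to turn the averaged odds into probabilities.

```python
from statistics import mean


def average_closing_odds(bookmaker_odds):
    """Average the decimal closing odds for each outcome across bookmakers."""
    outcomes = bookmaker_odds[0].keys()
    return {o: mean(book[o] for book in bookmaker_odds) for o in outcomes}


def implied_probabilities(odds):
    """Turn decimal odds into probabilities that sum to 1 (margin removed)."""
    raw = {o: 1.0 / price for o, price in odds.items()}
    total = sum(raw.values())
    return {o: p / total for o, p in raw.items()}


# Hypothetical closing prices from three bookmakers (illustrative only).
books = [
    {"home": 2.10, "draw": 3.40, "away": 3.60},
    {"home": 2.05, "draw": 3.50, "away": 3.70},
    {"home": 2.15, "draw": 3.30, "away": 3.50},
]

average = average_closing_odds(books)
probabilities = implied_probabilities(average)
print(average)
print(probabilities)
```

Dividing each inverse price by their sum is the simplest way to strip the bookmaker margin; other normalisation methods exist and may suit some markets better.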
4. What is the Elo rating mentioned in the dataset?
The Elo rating in our dataset is each team’s Elo rating at the start of the match. Elo is a widely recognized metric that measures a team’s strength from its past results. It is useful for tracking team performance trends and is used extensively in sports analytics and betting.
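For readers new to Elo, the sketch below shows the textbook Elo formulas: the expected score from two ratings and the rating update after a result. It is an assumption that this matches the ratings in the dataset; football Elo variants often add home advantage or a goal-difference multiplier, and the `scale` and `k` values here are illustrative defaults only.

```python
def elo_expected_score(rating_a, rating_b, scale=400.0):
    """Expected score of team A against team B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def elo_update(rating_a, rating_b, score_a, k=20.0):
    """Return both teams' new ratings after a match; score_a is 1, 0.5 or 0."""
    expected_a = elo_expected_score(rating_a, rating_b)
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta


# Two hypothetical teams: the higher-rated side is expected to score more.
print(elo_expected_score(1700, 1500))
print(elo_update(1500, 1500, 1.0))
```

Because the dataset gives the rating at the start of each match, the expected score can be computed directly from two columns without replaying past results.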
5. What do “Expected Goals (xG)” and “Expected Assists (xA)” mean in the dataset?
Expected Goals (xG) is a statistical measure used to assess the quality of scoring opportunities. It assigns a probability to each goal-scoring chance, indicating how likely it is that the chance would be scored. A high xG value suggests a high likelihood of scoring. This metric helps understand how many goals a team or player should have scored on average, given the quality and quantity of the shots taken.
Expected Assists (xA) measures the likelihood that a given pass will become an assist. It considers factors such as the type of pass, the location from where it was made, and the subsequent actions of the receiver. xA provides insight into a player’s playmaking abilities, indicating their effectiveness in creating goal-scoring opportunities for teammates.
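To make the xG definition above concrete: a team's xG for a match is simply the sum of the scoring probabilities of its shots. The sketch below uses made-up per-shot probabilities, and the function names are hypothetical. The second function additionally assumes shots are independent, which is a simplification.

```python
from math import prod


def expected_goals(shot_probabilities):
    """Total xG: the sum of each shot's probability of being scored."""
    return sum(shot_probabilities)


def prob_at_least_one_goal(shot_probabilities):
    """Chance of scoring at least once, assuming shots are independent."""
    return 1.0 - prod(1.0 - p for p in shot_probabilities)


# Hypothetical shot probabilities for one team in one match.
shots = [0.76, 0.08, 0.12, 0.03, 0.31]

print(expected_goals(shots))
print(prob_at_least_one_goal(shots))
```

xA can be summed the same way, with each value being the xG of the shot that followed a given pass.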
6. What can you do with BigDataBall’s football datasets?
Predict match outcomes or scores:
* Train machine learning models such as random forests or neural networks on features like shots, possession, xG, and past results to predict the match outcome (win/draw/loss)
* Tune models using training and test sets to identify best parameters and features
* Evaluate model accuracy on unseen data to test predictive ability
* Extend models to predict exact scorelines using similar features
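As a starting point for the scoreline idea above, here is a minimal sketch of an independent Poisson model. This is a simple statistical baseline, not one of the machine learning models named in the list. The goal rates passed in are hypothetical; in practice they might be estimated from a team's recent xG or goals.

```python
from math import exp, factorial


def poisson_pmf(k, rate):
    """Probability of exactly k goals when goals follow a Poisson(rate)."""
    return exp(-rate) * rate ** k / factorial(k)


def scoreline_probabilities(home_rate, away_rate, max_goals=10):
    """Probability of each (home, away) scoreline, assuming independent teams."""
    return {
        (h, a): poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }


def outcome_probabilities(grid):
    """Collapse the scoreline grid into home win / draw / away win."""
    return {
        "home": sum(p for (h, a), p in grid.items() if h > a),
        "draw": sum(p for (h, a), p in grid.items() if h == a),
        "away": sum(p for (h, a), p in grid.items() if h < a),
    }


# Hypothetical expected goal rates for the home and away side.
grid = scoreline_probabilities(home_rate=1.6, away_rate=1.1)
outcomes = outcome_probabilities(grid)
most_likely = max(grid, key=grid.get)
print(outcomes, most_likely)
```

A baseline like this gives a reference accuracy that a tuned random forest or neural network should beat on unseen matches before it is worth trusting.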
Analyze team strengths/weaknesses:
* Aggregate stats like shots, tackles, possession over a season by team
* Compare averages vs opponents to identify strengths/weaknesses
* See if teams excel at shooting, keeping possession, set pieces etc
* Identify areas for improvement based on weaker metrics
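The aggregation steps above can be sketched in plain Python as follows. The row layout and the `team`, `shots`, and `tackles` keys are hypothetical stand-ins for whatever columns the spreadsheet uses, and the comparison here is against the league average rather than against specific opponents.

```python
from collections import defaultdict


def season_averages(rows, stats):
    """Per-team average of each stat over all of the team's matches."""
    totals = defaultdict(lambda: defaultdict(float))
    games = defaultdict(int)
    for row in rows:
        games[row["team"]] += 1
        for stat in stats:
            totals[row["team"]][stat] += row[stat]
    return {
        team: {stat: totals[team][stat] / games[team] for stat in stats}
        for team in games
    }


def versus_league(averages, stats):
    """Each team's average minus the league-wide average, per stat."""
    league = {
        stat: sum(a[stat] for a in averages.values()) / len(averages)
        for stat in stats
    }
    return {
        team: {stat: a[stat] - league[stat] for stat in stats}
        for team, a in averages.items()
    }


# Hypothetical team-level match rows (one row per team per match).
rows = [
    {"team": "A", "shots": 10, "tackles": 20},
    {"team": "A", "shots": 14, "tackles": 16},
    {"team": "B", "shots": 8, "tackles": 22},
    {"team": "B", "shots": 6, "tackles": 24},
]
stats = ["shots", "tackles"]
averages = season_averages(rows, stats)
print(versus_league(averages, stats))
```

Positive differences point to relative strengths and negative ones to areas for improvement, although for some stats a higher count is not necessarily better.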
Evaluate player performance:
* Track stats like goals, assists, pass %, tackles for each player
* Rank players within position or across league based on contribution
* Compare players to teammates in similar positions to quantify impact
* Build player ratings/indexes based on key stats that define good performance
Simulate seasons:
* Use Elo or other team ratings to represent team strengths
* Simulate matches between teams based on ratings and historical match data
* Update ratings after each simulated match
* Run hundreds of simulations and track the final table, points, and milestones
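The simulation loop described above might look like the sketch below. Everything in it is an illustrative assumption: the team names and ratings are made up, the fixed `draw_rate` and the way the remaining probability is split are a crude simplification, home advantage is ignored, and ties on points are broken by dictionary order.

```python
import random


def expected_score(rating_home, rating_away):
    """Textbook Elo expected score for the home side (no home advantage)."""
    return 1.0 / (1.0 + 10 ** ((rating_away - rating_home) / 400.0))


def simulate_season(start_ratings, rng, k=20.0, draw_rate=0.25):
    """Play a double round-robin once and return each team's points."""
    ratings = dict(start_ratings)  # copy so every simulation starts fresh
    points = {team: 0 for team in ratings}
    for home in list(ratings):
        for away in list(ratings):
            if home == away:
                continue
            exp_home = expected_score(ratings[home], ratings[away])
            p_home = exp_home * (1.0 - draw_rate)  # crude split, an assumption
            draw = rng.random()
            if draw < p_home:
                score, points[home] = 1.0, points[home] + 3
            elif draw < p_home + draw_rate:
                score = 0.5
                points[home] += 1
                points[away] += 1
            else:
                score, points[away] = 0.0, points[away] + 3
            # Update ratings after each simulated match.
            delta = k * (score - exp_home)
            ratings[home] += delta
            ratings[away] -= delta
    return points


def title_odds(start_ratings, n_sims=500, seed=0):
    """Share of simulated seasons in which each team finishes top."""
    rng = random.Random(seed)
    titles = {team: 0 for team in start_ratings}
    for _ in range(n_sims):
        points = simulate_season(start_ratings, rng)
        titles[max(points, key=points.get)] += 1
    return {team: count / n_sims for team, count in titles.items()}


# Hypothetical starting ratings for a four-team league.
ratings = {"Alpha": 1900, "Beta": 1500, "Gamma": 1500, "Delta": 1500}
odds = title_odds(ratings)
print(odds)
```

Seeding the random generator keeps runs reproducible, which makes it easier to compare different rating inputs or `k` values.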
Study betting odds:
* Gather odds data, match stats, and actual results
* Analyze market movements compared to team metrics
* Build models to predict odds or find mispriced bets
* Backtest models historically to evaluate profitability
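A backtest of the kind described above can be as simple as the flat-stake sketch below. The `model_prob`, `odds`, and `won` keys are hypothetical, the odds are assumed to be decimal, and the sample bets are invented purely to show the arithmetic.

```python
def backtest_flat_stakes(bets, stake=1.0, min_edge=0.0):
    """Bet a flat stake whenever the model sees positive expected value."""
    profit = 0.0
    placed = 0
    for bet in bets:
        # Expected profit per unit staked according to the model.
        edge = bet["model_prob"] * bet["odds"] - 1.0
        if edge > min_edge:
            placed += 1
            profit += stake * (bet["odds"] - 1.0) if bet["won"] else -stake
    roi = profit / (placed * stake) if placed else 0.0
    return {"bets": placed, "profit": profit, "roi": roi}


# Hypothetical historical bets: model probability, closing odds, and result.
history = [
    {"model_prob": 0.55, "odds": 2.0, "won": True},
    {"model_prob": 0.30, "odds": 3.0, "won": False},  # negative edge, skipped
    {"model_prob": 0.40, "odds": 3.0, "won": False},
    {"model_prob": 0.60, "odds": 1.9, "won": True},
]

result = backtest_flat_stakes(history)
print(result)
```

For a fair test, the model probabilities should come from a model trained only on matches played before each bet; otherwise the backtest overstates profitability.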
Gain tactical insights:
* Look at lineup data like formations, positions played over time
* Identify patterns in setups, roles, and relationships
* See how tactics have evolved in terms of style, shape, personnel
* Relate to match stats to quantify tactical impact
Analyze referee performance:
* Track referee assignments and match stats like cards, penalties
* Compare rates of disciplinary actions to averages
* Assess if certain refs have biases or affect outcomes
* Relate to team playstyles to see if mismatch affects decisions
DISCOUNTS
Discounts for Historical Datasets
(1) BULK PURCHASE DISCOUNT
Get a bigger discount as you add more historical seasons to your cart!
• 2 to 5 seasons → 5% OFF
• 6 to 10 seasons → 10% OFF
• 11 seasons and more → 15% OFF
(2) MEMBER DISCOUNT
Additional 30% OFF on historical datasets: Log in, or sign up for a free account, to use your member discount.
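The tiers above can be written out as a small calculation. This sketch assumes the bulk and member discounts simply add together, which is consistent with the "up to a combined 45%" figure quoted on this page (15% + 30%); how the discounts are actually combined at checkout is not stated here, and the function names are hypothetical.

```python
def bulk_discount_percent(seasons):
    """Bulk purchase discount for a number of historical seasons in the cart."""
    if seasons >= 11:
        return 15
    if seasons >= 6:
        return 10
    if seasons >= 2:
        return 5
    return 0


def total_discount_percent(seasons, is_member):
    """Bulk plus member discount, assuming the two are additive."""
    member = 30 if is_member else 0
    return bulk_discount_percent(seasons) + member


print(total_discount_percent(seasons=11, is_member=True))
print(total_discount_percent(seasons=3, is_member=False))
```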