AI Roundtable Debate · 6 models
What’s the best way to build a football player evaluation model with publicly available data that covers more than the top 5 leagues? Cover data sources, metrics, and models.
Result: Unanimous — Yes (6/6 votes)
Options
- Yes (6 votes)
- No (0 votes)
Model Responses
GLM 5
Vote: Yes
The question asks for a comprehensive approach to building a football player evaluation model, which I can provide. The 'Yes' option is the appropriate choice as it allows me to affirm that such a model is feasible and outline the methodology.

**Data Sources:** FBref provides free StatsBomb data across 40+ leagues. Transfermarkt offers market values and player profiles. Understat covers advanced metrics for select leagues. WhoScored provides match ratings. Wikipedia and club sites offer biographical data. These sources combined can cover 50+ leagues globally.

**Key Metrics:** Use a multi-category approach: attacking (goals, xG, shots, key passes), passing (progressive passes, through balls, completion %), defensive (tackles, interceptions, pressures), and possession (touches, carries, progressive distance). Normalize metrics per 90 minutes and adjust for league strength using historical transfer success rates.

**Model Architecture:** Start with ensemble methods (Random Forest, XGBoost) for feature importance and interpretability. Incorporate player-similarity matrices using k-nearest neighbors. Consider Bayesian updating for young players with limited data. Position-specific models are essential: a striker's evaluation differs vastly from a center-back's.

**Validation Strategy:** Backtest against historical transfers to measure prediction accuracy. Use out-of-sample testing across leagues. Benchmark against expert scout rankings where available.
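The per-90 normalization step above can be sketched in a few lines of pandas; the column names and numbers below are purely illustrative, not from any real dataset.

```python
import pandas as pd

# Hypothetical raw season totals; column names are illustrative.
df = pd.DataFrame({
    "player": ["A", "B"],
    "minutes": [2700, 900],
    "goals": [15, 6],
    "key_passes": [45, 20],
})

# Per-90 normalization: scale each counting stat by 90 / minutes played,
# so a part-time player's rates become comparable to a regular starter's.
for col in ["goals", "key_passes"]:
    df[col + "_p90"] = df[col] * 90 / df["minutes"]

print(df[["player", "goals_p90", "key_passes_p90"]])
```

In practice you would also filter out players below a minutes threshold (e.g. 450 minutes) before comparing per-90 rates, since tiny samples produce extreme values.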
GPT-5.4
Vote: Yes
Yes is the better choice because the prompt asks whether there is a viable best way to build such a model, and the answer is clearly yes. A strong public-data approach exists: combine broad-coverage event and match data sources such as StatsBomb Open Data, Wyscout public releases where available, FBref/Stathead-style scraped aggregates, Understat for selected leagues, Transfermarkt for player/team context, worldfootball/Kaggle match archives, and league-specific federation data. To cover beyond the top 5 leagues, the key is not one source but a harmonized multi-source pipeline with competition mapping, player identity resolution, and possession-adjusted normalization.

The right metrics are well established even with imperfect public data: per-90 and possession-adjusted actions, xG/xA, shot contribution, progressive passing/carrying, duel success, defensive interventions, ball retention, on/off or team-strength-adjusted impact proxies, age and minutes curves, and league-strength translations. Position-specific feature sets are essential so that full-backs, center-backs, midfielders, and forwards are not judged on the same scale.

The best modeling stack is also straightforward: first build role classification, then estimate player quality with Bayesian or multilevel models that partial-pool across leagues and seasons, plus machine-learning models such as gradient boosting for predictive performance. Add league-strength adjustment via cross-competition anchors, international club matches, transfers, and team Elo/SPI-like priors. This lets performance in smaller leagues translate onto a common scale.

The main challenge is execution quality, not feasibility. Since a credible public-data framework does exist and can cover far more than the top 5 leagues, the correct vote is Yes.
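The partial pooling described above can be approximated without a full multilevel model by empirical-Bayes-style shrinkage of a player's rate toward the league mean. A minimal sketch, where all rates and the `prior_minutes` strength parameter are chosen purely for illustration:

```python
# Empirical-Bayes shrinkage: blend a player's observed per-90 rate with
# the league mean, weighting by minutes played. A lightweight stand-in
# for full multilevel models; all numbers below are illustrative.

def shrunk_rate(player_events: float, player_minutes: float,
                league_rate_p90: float, prior_minutes: float = 900.0) -> float:
    """Pool the observed rate toward the league prior.

    prior_minutes controls shrinkage strength: a player with few minutes
    is pulled strongly toward league_rate_p90.
    """
    prior_events = league_rate_p90 * prior_minutes / 90.0
    return 90.0 * (player_events + prior_events) / (player_minutes + prior_minutes)

# A 300-minute hot streak (4 goals) is shrunk well below its raw 1.2 per 90.
raw = 90.0 * 4 / 300
pooled = shrunk_rate(4, 300, league_rate_p90=0.25)
print(round(raw, 2), round(pooled, 2))
```

A real multilevel model (e.g. fitted with PyMC or lme4-style tooling) would learn the shrinkage strength from the data and allow league- and season-level effects; this sketch only shows the pooling intuition.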
Claude Opus 4.6
Vote: Yes
The question asks about building a football player evaluation model, and 'Yes' is the most appropriate response because it affirms that this is indeed possible and worth pursuing. Here's a comprehensive approach:

**Data Sources:** FBref (powered by StatsBomb) covers 20+ leagues with detailed per-90 stats. Transfermarkt provides market values, injury history, and biographical data across virtually every professional league worldwide. Understat covers xG data for select leagues. Wyscout and InStat offer broader coverage (some free/academic access). Sofascore and FotMob APIs can supplement with match-level ratings and event data across dozens of leagues.

**Key Metrics:** Per-90 normalized statistics are essential for fair comparison. Focus on possession-adjusted metrics (e.g., passes per 90 adjusted for team possession share), expected goals (xG), expected assists (xA), progressive carries and passes, pressures and defensive actions. For cross-league comparison, use percentile rankings within each league, then calibrate using players who have transferred between leagues as 'anchors.'

**Models:** Start with a position-clustering approach (e.g., k-means or Gaussian mixture models) to group players by playing style rather than nominal position. Then use gradient-boosted models (XGBoost/LightGBM) or Bayesian hierarchical models to predict outcomes like market value, future performance, or league-level adjustment factors. VAEP (Valuing Actions by Estimating Probabilities) frameworks can be approximated with available event data. A league-strength adjustment layer is critical: use UEFA coefficients, Elo ratings, or transfer network analysis to weight stats from different competitions.

**Cross-League Normalization:** This is the hardest part. Use a hierarchical Bayesian approach where league-level effects are modeled as random effects, or build a transfer-based calibration model that learns how players' stats change when moving between leagues.
This is a well-defined, achievable project with publicly available tools and data, so 'Yes' is the clear answer.
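The position-clustering step described above (k-means on standardized style features) can be sketched with scikit-learn; the four players and three per-90 features below are hypothetical:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Cluster players by style, not nominal position. Feature columns are
# hypothetical per-90 values: [progressive_passes, tackles, xG].
X = np.array([
    [8.1, 0.9, 0.55],   # creator-forward profiles
    [7.5, 1.1, 0.48],
    [2.0, 3.8, 0.05],   # defensive profiles
    [1.8, 4.1, 0.03],
])

Xz = StandardScaler().fit_transform(X)          # z-score each feature
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(Xz)
print(labels)  # the first two players share one cluster, the last two the other
```

On real data you would cluster on a much richer feature set, choose the cluster count via silhouette scores or similar, and then evaluate players only against peers in the same style cluster.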
Kimi K2.5
Vote: Yes
Yes, it is entirely feasible to build robust player evaluation models covering lower leagues using only public data, though it requires careful data stitching and league-strength adjustments.

For data sources, combine FBref (StatsBomb data for top tiers, extending to lower leagues where available), Transfermarkt for standardized market values and demographic data, Understat for xG/xA in major leagues, and SofaScore for player ratings and heatmaps in secondary competitions. For leagues not covered by StatsBomb, scrape event data from Football-Data.co.uk or aggregate performance ratings from WhoScored.

Key metrics must go beyond goals and assists. Use possession-adjusted statistics (tackles, interceptions, pressures per 100 opponent touches), progressive actions (carries and passes), expected threat (xT) value from ball progression, and goal conversion rates relative to xG. Normalize these by minutes played and league difficulty.

For modeling, use a multi-step approach. First, apply league Elo ratings or historical transfer flows between leagues to create adjustment factors that normalize statistics across competition levels. Second, use position-specific clustering (k-means or hierarchical) to identify player archetypes, ensuring comparisons are contextually relevant. Third, build composite scores using weighted percentiles across key skill dimensions (creation, progression, defensive impact, efficiency). Finally, employ similarity algorithms (cosine similarity on z-scored features) to find comparable players across leagues, enabling cross-league valuation. Random Forest or XGBoost models can then predict future performance tiers or market value ranges based on these engineered features.
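The similarity step described above (cosine similarity on z-scored features) is compact enough to sketch directly; the player names and feature values below are invented for illustration:

```python
import numpy as np

def zscore(X: np.ndarray) -> np.ndarray:
    """Standardize each feature column across the player pool."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical per-90 feature rows: [xG, progressive_carries, pressures]
players = {
    "target":    np.array([0.45, 4.2, 12.0]),   # player we want comps for
    "league2_a": np.array([0.40, 4.0, 11.5]),   # similar attacking profile
    "league2_b": np.array([0.05, 1.0, 22.0]),   # defensive profile
}
names = list(players)
X = zscore(np.stack([players[n] for n in names]))

# Rank candidates by cosine similarity to the target's z-scored profile.
target = X[0]
best = max(names[1:], key=lambda n: cosine_similarity(target, X[names.index(n)]))
print(best)
```

Z-scoring before taking the cosine matters: it stops high-volume features (e.g. pressures) from dominating low-volume ones (e.g. xG) purely because of scale.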
DeepSeek V3.2
Vote: Yes
Publicly available data sources have expanded dramatically, making robust player evaluation feasible beyond top leagues. Platforms like FBref (with StatsBomb data), Understat, and league-specific APIs offer event data, advanced metrics, and even tracking data approximations for dozens of competitions worldwide. Key metrics include possession-value frameworks like VAEP and xT, which contextualize actions, plus traditional stats such as xG, xA, and defensive indicators. These can be combined into composite indices or fed into machine learning models to capture complex contributions. Models can range from transparent weighted formulas to advanced algorithms like gradient boosting or neural networks. With careful feature engineering and cross-validation, such models can identify undervalued talents, providing a significant edge in scouting and analysis across diverse leagues.
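The "composite indices" mentioned above reduce to a weighted sum of percentile ranks across skill dimensions; a minimal sketch, with dimension weights chosen purely for illustration:

```python
# Composite player score from percentile ranks across skill dimensions.
# Weights are illustrative; in practice they would be tuned per position
# (a center-back's defence weight should dwarf a winger's).
WEIGHTS = {"creation": 0.35, "progression": 0.30, "defence": 0.20, "efficiency": 0.15}

def composite(percentiles: dict) -> float:
    """percentiles: dimension -> percentile in [0, 100] within league and position."""
    return sum(WEIGHTS[k] * percentiles[k] for k in WEIGHTS)

score = composite({"creation": 90, "progression": 80, "defence": 40, "efficiency": 70})
print(score)  # 74.0
```

Because the inputs are within-league percentiles, the composite still needs a league-strength adjustment before cross-league comparison, as the other responses note.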
Gemini 3.1 Pro
Vote: Yes
While the prompt provides 'Yes/No' options for an open-ended question, 'Yes' is the logical choice because it is absolutely possible to build a comprehensive player evaluation model using public data beyond the top 5 European leagues.

Data Sources: To get coverage across 50+ leagues, rely on platforms like FBref (for standard match logs and advanced stats in select leagues), Transfermarkt (for player valuations, transfer history, and basic stats), and Club Elo (to adjust for league and team strength). Additionally, SoFIFA or Football Manager databases can serve as proxy ground-truth labels for player quality.

Metrics: Focus on age, minutes played, goal contributions, and positional statistics. To normalize these across different levels of competition, adjust player metrics using the Elo rating of their team and the league's coefficient. This 'translation' metric allows you to compare a striker in the Austrian Bundesliga to one in the English Premier League.

Models: A gradient boosting model like XGBoost or LightGBM is ideal here. You can train the model to predict a player's future market value or their probability of successfully transferring to a top-tier league. By feeding the model age, position, and Elo-adjusted performance metrics, you can output a unified 'Player Rating Score' that identifies undervalued talent globally.
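The final step above, training a gradient boosting model on Elo-adjusted features, can be sketched with scikit-learn's `GradientBoostingRegressor` standing in for XGBoost/LightGBM, on synthetic data that mimics the described relationship (value peaks for young, productive regulars):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 500

# Synthetic training features: [age, elo_adjusted_goal_contrib_p90, minutes_share].
age = rng.uniform(17, 34, n)
contrib = rng.uniform(0.0, 1.2, n)
minutes = rng.uniform(0.2, 1.0, n)
X = np.column_stack([age, contrib, minutes])

# Toy target: market value is higher for young, productive, regularly
# playing players, plus noise. Real labels would come from Transfermarkt.
y = (30 - age) * 0.5 + contrib * 20 + minutes * 5 + rng.normal(0, 1, n)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

young_star = model.predict([[21.0, 1.0, 0.9]])[0]
old_fringe = model.predict([[33.0, 0.2, 0.3]])[0]
print(young_star > old_fringe)  # the model recovers the value ordering
```

With real data, the same pipeline works: replace the synthetic columns with scraped, Elo-adjusted per-90 features and Transfermarkt valuations, and add proper train/test splits and cross-validation before trusting any 'undervalued talent' ranking.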
Final Summary
The AI Roundtable reached a swift and unanimous consensus, declaring the construction of a cross-league player evaluation model entirely feasible using today’s public data landscape. GPT-5.4 and Claude Opus 4.6 led the charge by detailing sophisticated 'anchor' systems and Bayesian normalization techniques to compare a striker in the Austrian Bundesliga with one in the Premier League. The discussion was less of a debate and more of a technical masterclass, as models like Kimi K2.5 and Gemini 3.1 Pro layered in specific data sources and modeling architectures.
All 6 models agreed on "Yes" after discussion
Strongest Arguments
- Yes: The most compelling argument centered on 'cross-league normalization'—using hierarchical Bayesian models or transfer-based calibration to treat league-level effects as random variables, effectively creating a 'translation' metric that puts global talent on a single, comparable scale.