PHBench
PHBench: The Open Benchmark for Predicting Series A Funding from Product Hunt Launch Signals
PHBench is a comprehensive open benchmark designed to predict Series A funding outcomes using signals from Product Hunt launches. Featuring a dataset of over 67,000 launches across seven years, it provides a leaderboard for machine learning models and LLMs to identify high-potential startups.
2026-05-17
PHBench Product Information
PHBench: An Open Benchmark for Predicting Series A Success via Product Hunt Signals
In venture capital, spotting the next breakout startup is often described as finding a needle in a haystack. PHBench is an open benchmark built to address this problem: models predict Series A funding outcomes from the signals available in the first 24 hours of a Product Hunt launch. With seven years of historical data, PHBench lets researchers and data scientists train, rank, and evaluate models on their ability to identify future startup winners.
Identifying a future Series A startup at the moment of its public launch is notoriously difficult. According to PHBench data, only 0.78% of Product Hunt launches raise a Series A within 18 months of their launch date. PHBench replaces guesswork with a measurable framework: its most advanced predictive models achieve a 4.7× lift over random selection.
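To make those two figures concrete, here is a minimal arithmetic sketch. It assumes lift is defined as precision among the launches a model selects divided by the base rate, which is a common definition but is not spelled out in the text above. The helper name `implied_precision` is illustrative and not part of PHBench.

```python
def implied_precision(base_rate: float, lift: float) -> float:
    """Precision implied by a given lift over random selection.

    Assumption: lift = precision / base_rate. PHBench may define
    lift differently (for example, at a specific cutoff).
    """
    return base_rate * lift


base_rate = 0.0078  # 0.78% of launches raise a Series A within 18 months
lift = 4.7          # reported lift over random selection

precision = implied_precision(base_rate, lift)
print(f"Implied precision among selected launches: {precision:.2%}")
```

Under that assumption, roughly 3.7% of the launches flagged by such a model would go on to raise a Series A, compared with 0.78% when picking at random.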
What Is PHBench?
PHBench is an open-source benchmark and dataset specifically engineered for the prediction of Series A funding based on signals captured during a startup's first 24 hours on the Product Hunt platform. It serves as a rigorous testing ground for machine learning models, including Gradient Boosting Machines (GBMs) and Large Language Models (LLMs), to determine which features of a launch truly correlate with long-term financial success.
The core of PHBench is a dataset of 67,292 launches spanning 2019 to 2025. Within this pool, the benchmark tracks 528 verified "winners": startups that raised a Series A within 18 months of their launch. A hash-pinned test set and manually audited labels keep model evaluations reproducible, citable, and grounded in real-world outcomes.
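A hash-pinned test set can be checked before evaluation by comparing the file's digest with a published value. The sketch below assumes SHA-256 as the hash function. The text above does not say which algorithm PHBench uses or where the digest is published, so the function names and the choice of algorithm are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(path: Path, expected_hex: str) -> bool:
    """True if the file's digest matches the pinned (published) digest.

    Assumption: the pin is a hex-encoded SHA-256 digest.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A caller would pass the local copy of the test file and the digest published alongside the benchmark, and refuse to evaluate if `verify_pinned` returns False.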
Features of the PHBench Framework
PHBench is more than a dataset: it also provides an evaluation framework for startup success prediction. Its key features are:
1. Extensive Historical Dataset
PHBench covers a seven-year period (2019–2025), capturing the evolution of the startup ecosystem. This long-term view allows models to account for shifts in market trends, such as the rise of AI-focused products.
2. Rigorous Evaluation Metrics
The PHBench Leaderboard ranks models based on several critical performance metrics, including:
- F0.5 Score: The F-beta score with β = 0.5, which weights precision more heavily than recall.
- Average Precision (AP): Summarizes the precision-recall curve, measuring the quality of the ranked predictions.
- AUC (Area Under the Curve): Evaluates the model's ability to distinguish Series A winners from non-winners.
- Lift: Measures how much better the model performs than a random baseline.
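The four metrics above can be sketched in plain Python on a toy example. This is an illustration under stated assumptions, not PHBench's scoring code: AUC is taken to mean the area under the ROC curve, lift is computed as precision in the top k divided by the base rate, and the average precision helper assumes no tied scores. The function names and toy data are hypothetical.

```python
from typing import Sequence


def f_beta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 0.5) -> float:
    """F-beta from binary labels and predictions; beta < 1 favors precision."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of precision-at-rank over the positives (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Chance that a random positive outscores a random negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def lift_at_k(y_true: Sequence[int], scores: Sequence[float], k: int) -> float:
    """Precision among the k highest-scored items divided by the base rate."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    precision = sum(y_true[i] for i in order[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return precision / base_rate


# Toy data: 8 launches, 2 of which are "winners".
y_true = [1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
y_pred = [1 if s >= 0.75 else 0 for s in scores]  # arbitrary threshold

print("F0.5:", f_beta(y_true, y_pred))
print("AP:", average_precision(y_true, scores))
print("AUC:", roc_auc(y_true, scores))
print("Lift@2:", lift_at_k(y_true, scores, k=2))
```

F0.5 needs hard predictions and therefore a threshold, while AP, AUC, and lift are computed from the ranking of scores. On a 0.78% positive rate, the ranking-based metrics are usually the more informative ones.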
3. Feature Engineering and Signal Identification
PHBench identifies which Product Hunt signals are predictive and which are merely "noise." The team has engineered 61 distinct features, identifying 12 key signals that provide significant predictive power for Series A outcomes.
4. Transparent Methodology
Every label in the PHBench dataset is manually audited to ensure accuracy. The methodology is fully documented, and the benchmark is an extension of the VCBench project, maintaining high standards for startup data integrity.
Predictive Signals vs. Noise in PHBench
One of the most valuable aspects of PHBench is its analysis of which launch-day signals actually matter for future Series A funding. Using XGBoost gain importance, the benchmark draws a clear distinction between "Signal" and "Noise."
High-Importance Signals
- Daily Rank on Launch: The most critical signal, providing a 3.50× lift. PHBench findings show that products finishing in the top 3 on their launch day are significantly more likely to raise a Series A. The "rank_bucket" feature captures this effect non-linearly: moving from #4 to #1 provides more lift than moving from #10 to #4.
- Maker Follower Count (Log): The established audience and reputation of the product's makers.
- B2B Topic Cluster: Startups categorized within B2B sectors show a higher correlation with Series A success.
- AI Topic × Year Interaction: Reflects the increasing importance of artificial intelligence in recent funding cycles.
- Votes-per-Comment Ratio: A measure of engagement quality over quantity.
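The signals listed above could be derived from raw launch fields along the following lines. This is a hypothetical sketch: the bucket boundaries, field names, zero-comment handling, and the encoding of the AI × year interaction are all assumptions made for illustration, and PHBench's actual feature definitions may differ.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Launch:
    """Hypothetical raw launch record; field names are illustrative."""
    daily_rank: Optional[int]  # None if the launch was not ranked
    maker_followers: int
    votes: int
    comments: int
    is_ai_topic: bool
    year: int


def rank_bucket(rank: Optional[int]) -> str:
    """Coarse, non-linear bucketing of daily rank (assumed boundaries)."""
    if rank is None:
        return "unranked"
    if rank == 1:
        return "top_1"
    if rank <= 3:
        return "top_3"
    if rank <= 10:
        return "top_10"
    return "other"


def build_features(launch: Launch) -> dict:
    """Map one launch to a small feature dictionary."""
    return {
        "rank_bucket": rank_bucket(launch.daily_rank),
        # log1p keeps zero-follower makers finite and compresses heavy tails.
        "log_maker_followers": math.log1p(launch.maker_followers),
        # Guard against division by zero for launches with no comments.
        "votes_per_comment": launch.votes / max(launch.comments, 1),
        # Interaction encoded as one categorical value per (AI, year) pair.
        "ai_x_year": f"ai_{launch.year}" if launch.is_ai_topic else "non_ai",
    }


example = Launch(daily_rank=2, maker_followers=999, votes=300,
                 comments=60, is_ai_topic=True, year=2024)
features = build_features(example)
print(features)
```

Bucketing the rank lets a model assign a separate effect to each tier, which is one simple way to express that a move near the top of the ranking matters more than the same-sized move further down.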
Launch Noise (Non-Predictive Factors)
Some of the most commonly watched metrics on Product Hunt have no statistically significant effect on Series A predictions. These include:
- Raw Upvote Count (Log): High upvotes do not always equate to venture-scale business potential.
- Launch Day of the Week: Whether a product launches on a Tuesday or a Sunday does not impact its long-term funding prospects.
- Tagline Word Count: The length of the marketing hook has no predictive value for Series A success.
Use Case for PHBench
PHBench serves multiple stakeholders within the technology and finance ecosystems:
For Venture Capitalists (VCs)
VCs can use the weekly predictions generated by PHBench models to source high-potential leads. By identifying launches with the highest confidence scores for future Series A funding, investment teams can focus their outreach on startups that have already demonstrated significant "alpha" during their public debut.
For Machine Learning Researchers
Researchers can use the PHBench leaderboard to compare architectures. The benchmark includes baselines for Logistic Regression (LR), XGBoost (XGB), LightGBM (LGBM), and zero-shot LLMs such as Google's Gemini. It poses a real-world classification challenge with highly imbalanced classes (0.78% positive rate).
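To give a sense of how imbalanced the task is, the sketch below derives the positive rate and a negative-to-positive weight from the dataset-level counts quoted in this article (528 winners out of 67,292 launches). A ratio like this is commonly used to up-weight the positive class in gradient-boosted models, for example through XGBoost's `scale_pos_weight` parameter. Whether the PHBench baselines use such weighting is not stated here, and in practice the counts should come from the training split only.

```python
def positive_class_weight(n_pos: int, n_total: int) -> float:
    """Ratio of negatives to positives, a common weight for the rare class."""
    if n_pos <= 0 or n_pos >= n_total:
        raise ValueError("need at least one positive and one negative example")
    return (n_total - n_pos) / n_pos


n_total = 67_292  # launches in the dataset
n_pos = 528       # verified Series A winners

base_rate = n_pos / n_total
weight = positive_class_weight(n_pos, n_total)

print(f"Positive rate: {base_rate:.2%}")
print(f"Negatives per positive: {weight:.1f}")
```

With about 126 negatives for every positive, a model that predicts "no Series A" for every launch is more than 99% accurate, which is why the leaderboard relies on precision-oriented and ranking metrics rather than accuracy.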
For Startup Founders
Founders can analyze the signals prioritized by PHBench to understand what market-leading indicators look like. While upvotes are a common goal, the benchmark highlights that daily rank and maker reputation are much stronger indicators of institutional interest.
FAQ
Q: How many launches are included in the PHBench dataset? A: The dataset includes 67,292 launches tracked over a seven-year period from 2019 to 2025.
Q: What is the primary goal of the PHBench models? A: The models aim to predict whether a startup will successfully raise a Series A funding round within 18 months of its Product Hunt launch.
Q: Which model currently leads the PHBench leaderboard? A: The Top-3 Ensemble model, created by the authors, holds the top spot with an F0.5 score of 0.284 and an AUC of 0.840.
Q: Does a high upvote count guarantee a Series A? A: No. PHBench classifies raw upvote count as "noise": it has no significant effect on the likelihood of Series A funding.
Q: Is the test set public? A: The file phbench_public_test.csv is used for final submissions and is held out to ensure the integrity of the leaderboard rankings.
Q: How can I cite this research? A: You can cite the work as Ihlamur et al. (2026), "PHBench: A Benchmark for Predicting Startup Series A Funding from Product Hunt Launch Signals," available on arXiv (2605.02974).








