Daily Briefing · Sunday, May 10, 2026

Capital and benchmark updates dominate a mixed cycle

14 items · 4 desks · 7 min read

Research (5)

DeEscalWild: A Real-World Benchmark for Automated De-Escalation Training with SLMs

Effective de-escalation is critical for law enforcement safety and community trust, yet traditional training methods lack scalability and realism. While Large Language Models (LLMs) enable dynamic, open-ended simulations, their substantial computational footprint renders them impractical for deployment on the lightweight, portable hardware required for immersive field training. Small Language Mode

RESEARCH

Frequency-Enhanced Diffusion Models: Curriculum-Guided Semantic Alignment for Zero-Shot Skeleton Action Recognition

Human action recognition is pivotal in computer vision, with applications ranging from surveillance to human-robot interaction. Despite the effectiveness of supervised skeleton-based methods, their reliance on exhaustive annotation limits generalization to novel actions. Zero-Shot Skeleton Action Recognition (ZSAR) emerges as a promising paradigm, yet it faces challenges due to the spectral bias o

RESEARCH

AI CFD Scientist: Toward Open-Ended Computational Fluid Dynamics Discovery with Physics-Aware AI Agents

Recent LLM-based agents have closed substantial portions of the scientific discovery loop in software-only machine-learning research, in chemistry, and in biology. Extending the same loop to high-fidelity physical simulators is harder, because solver completion does not imply physical validity and many failure modes appear only in field-level imagery rather than in solver logs. We present AI CFD S

RESEARCH

AceGRPO: Adaptive Curriculum Enhanced Group Relative Policy Optimization for Autonomous Machine Learning Engineering

Autonomous Machine Learning Engineering (MLE) requires agents to perform sustained, iterative optimization over long horizons. While recent LLM-based agents show promise, current prompt-based agents for MLE suffer from behavioral stagnation due to frozen parameters. Although Reinforcement Learning (RL) offers a remedy, applying it to MLE is hindered by prohibitive execution latency and inefficient

RESEARCH

Prediction-Based Markov Violation Scores for Detecting Non-Markovian Observations in Reinforcement Learning

Reinforcement learning algorithms assume that observations satisfy the Markov property, yet real-world sensors frequently violate this assumption through correlated noise, latency, or partial observability. Standard performance metrics conflate Markov breakdowns with other sources of suboptimality, leaving practitioners without tools to detect such violations. This paper introduces a prediction-ba

RESEARCH

Funding (3)

the United Arab Emirates

Capital is concentrating around a small set of AI-adjacent names rather than spreading across many rounds. The pattern points to large, strategically structured checks at the top of the market, with little evidence of broad early-stage dispersion.

FUNDING

OpenAI and SoftBank

FUNDING

OpenAI nonprofit

FUNDING

Talent Moves (3)

professor

Named moves are sparse, but the desk still matters when a person or role shift signals institutional reorganization. The current bundle is too thin to show a broader hiring or departure pattern.

TALENT MOVES

first witness

TALENT MOVES

first witness

TALENT MOVES

Benchmarks (3)

Grok-1 — aa_intelligence_index

Fresh benchmark entries span general intelligence, multimodal coding, and verified software repair, keeping evaluation pressure on both general and task-specific systems. The mix matters because it shows where score gains are still being recorded and where the ceiling is moving.

BENCHMARKS

GUIRepair + o3 (2025-04-16) — swe_bench_multimodal

BENCHMARKS

OpenHands + 4x Scaled (2024-02-03) — swe_bench_verified

BENCHMARKS