Research
Live Terminal · Updated 3mo ago
Papers: 10
Total Citations: 5.9K
Median Citations: 489.5
With Code: 0
Most Cited: 1.3K
Trending Papers (by published at)
#1
cs.CV · cs.CL · cs.AI
1.3K cites · ▲ 28/7d
Gemini 2.0: A Family of Highly Capable Multimodal Models
Gemini Team, Oriol Vinyals, Jeff Dean · 4mo ago
We present Gemini 2.0, a family of multimodal models achieving state-of-the-art performance across text, image, video, and audio understanding with native tool use and long-context…
#2
cs.RO · cs.CV
892 cites · ▲ 31/7d
World Models for Autonomous Driving: A Comprehensive Survey
Dragomir Anguelov, Yuning Chai · 5mo ago
This survey covers 340+ papers on world models applied to autonomous driving, categorizing approaches by architecture, training paradigm, and evaluation methodology, identifying ke…
#3
cs.LG · cs.AI
847 cites · ▲ 42/7d
Mamba-2: Linear-Time Sequence Modeling with Selective State Spaces
Albert Gu, Tri Dao · 3mo ago · ICML 2026
We present Mamba-2, extending selective state space models with improved hardware-aware algorithms, achieving transformer-quality results at linear time complexity across language,…
#4
cs.CL · cs.AI
623 cites · ▲ 89/7d
DeepSeek-V3: Scaling Open-Source Language Models to Frontier Performance
Daya Guo, Dejian Yang, Haowei Zhang · 3mo ago
DeepSeek-V3 demonstrates that open-weight models can match proprietary frontier systems through aggressive mixture-of-experts scaling, novel training stability techniques, and high…
#5
cs.AI · cs.LG
534 cites · ▲ 22/7d
The Alignment Tax: Measuring Performance Degradation from Safety Training
Liane Lovitt, Yuntao Bai, Sam McCandlish · 3mo ago
We systematically quantify the capability loss incurred by RLHF and constitutional training across 12 frontier models, finding that modern techniques reduce the tax to under 2% on…
#6
cs.LG · cs.DC
445 cites · ▲ 19/7d
Efficient Inference on Consumer Hardware: Quantization Beyond 4-bit
Song Han, Ji Lin, William Dally · 4mo ago · MLSys 2026
We demonstrate 2-bit quantization of 70B+ parameter models with less than 1% quality loss through a novel mixed-precision scheme, enabling frontier-class inference on consumer GPUs…
#7
cs.AI · cs.CL
412 cites · ▲ 112/7d
Constitutional AI 2.0: Scalable Alignment via Principled Self-Supervision
Amanda Askell, Yuntao Bai, Deep Ganguli · 3mo ago
We introduce CAI 2.0, eliminating the need for human feedback in alignment training while achieving stronger safety guarantees through recursive constitutional evaluation and autom…
#8
cs.CL
356 cites · ▲ 15/7d
Scaling Laws for Neural Machine Translation Revisited
Angela Fan, Mike Lewis · 4mo ago · ACL 2026
We revisit scaling laws for machine translation and find that previously established power-law relationships break down above 100B parameters, requiring new architectural innovatio…
#9
cs.AI
298 cites · ▲ 34/7d
Reward Hacking in RLHF: A Comprehensive Taxonomy and Mitigation Framework
Stuart Russell, Deep Ganguli, Cassidy Laidlaw · 3mo ago
We present the first comprehensive taxonomy of reward hacking failure modes in RLHF systems, documenting 47 distinct attack vectors and proposing a unified mitigation framework val…
#10
cs.LG
189 cites · ▲ 95/7d
Mixture-of-Experts at Scale: Lessons from Training a Trillion-Parameter Model
Guillaume Lample, Timothée Lacroix · 3mo ago
We report on engineering and scientific lessons from training a 1T-parameter sparse MoE model, including novel load balancing, expert specialization dynamics, and inference serving…