ITI and ARI lead a coalition urging Congress to enact AI Safety Institute legislation this year. This is a call for legislative action rather than an enacted law.
Ambiguous signal — this appears to be a media article about the U.K.’s AI safety approach, but no specific binding regulation, consultation, or official guidance issuance is stated in the provided record.
Broadband Breakfast reports that the AI Safety Institute has been renamed the Center for AI Standards and Innovation. The record does not specify any binding regulatory change or effective date.
Ambiguous signal — a news report describes the Trump administration rebranding an AI safety institute as part of an oversight move, but the record does not specify whether any binding regulation, consultation, or formal guidance was issued.
News report on Anthropic launching Claude Gov for national security customers and a Trump administration rebranding of an AI Safety Institute. The record does not specify any binding regulation, consultation, or formal enforcement action.
Steering is a widely used technique for controlling large language models, yet its effects are often unstable and hard to predict. Existing theoretical accounts are largely based on the Linear Representation Hypothesis (LRH). While LRH assumes that concepts can be orthogonalized for lossless control, this idealized mapping fails in real representations and cannot account for the observed unpredict
Foundational Large Language Models (LLMs) demonstrate proficiency on a wide range of general tasks, and achieve remarkable results on various specialized tasks via domain-expert LLMs. With the ever-growing list of available LLMs, inference routers are being proposed to select the most appropriate LLM for each prompt. However, existing routing methods either optimize cost across weak-to-strong gene
Biomedical abstracts play a critical role in downstream NLP applications, such as information retrieval, biocuration, and biomedical knowledge discovery. However, a non-trivial number of biomedical articles do not have abstracts, diminishing the utility of these articles for downstream tasks. We propose DPR-BAG (Divide, Prompt, and Refine for Biomedical Abstract Generation), a training-free, zero-
Large language models (LLMs) are increasingly deployed through hosted APIs, making model extraction a practical threat to model ownership and service security. However, individual extraction queries often resemble benign requests, and existing evaluations often focus on single-query anomaly scoring or pure benign-versus-attacker user settings. We formulate model extraction monitoring as benign-cal
Policy-gradient methods usually optimize expected return, but many real-world applications care about distributional properties of returns: tail risk, outlier robustness, or best-of-K discovery. We introduce OrderGrad, a family of likelihood-ratio and reparameterization gradient estimators for order-statistic objectives. OrderGrad optimizes finite-sample L-statistics, i.e., weighted averages of so
Late-stage capital is concentrating around infrastructure and frontier AI, with public-market scale at one end and large private checks at the other. The mix suggests investors are still rewarding compute-heavy platforms over narrower application bets.
The talent desk is light on real movement: role labels outnumber hires or exits that reshape teams. That makes named-person changes the main signal to watch for whether teams are being reorganized or simply re-described.
Benchmark activity is narrow but useful: one model family is posting fresh scores across core reasoning and knowledge tests. The value today is less in a leaderboard shake-up than in the baseline it sets for comparison.