NIST page describing AI research on identifying and managing harmful bias in AI systems. The record appears informational/non-binding rather than a finalized rule or enforcement action.
NIST announced that CAISI signed a CRADA with OpenMined to enable secure AI evaluations. The record appears to be an announcement rather than a binding regulation or formal enforcement action.
NIST news item about CAISI signing an MOU with GSA to boost AI evaluation science for federal procurement. The record appears to be an announcement rather than a binding regulation or enforcement action.
NIST published a new report addressing challenges in monitoring deployed AI systems. The record appears to be advisory/non-binding research rather than a binding regulation.
NIST blog post discussing how AI standards can support trustworthiness in AI-enabled healthcare contexts. Non-binding informational guidance; no clear regulatory or enforcement action stated.
Hierarchical reinforcement learning (RL) has the potential to enable effective decision-making over long timescales. Existing approaches, while promising, have yet to realize the benefits of large-scale training. In this work, we identify and solve several key challenges in scaling online hierarchical RL to high-throughput environments. We propose Scalable Option Learning (SOL), a highly scalable
In federated learning (FL), local personalization of models has received significant attention, yet personalized fine-tuning of foundation models remains underexplored. In particular, there is a lack of understanding in the literature on how to personalize foundation models in settings where there exists heterogeneity not only in data, but also in tasks and modalities across the clients. To address
Human action recognition is pivotal in computer vision, with applications ranging from surveillance to human-robot interaction. Despite the effectiveness of supervised skeleton-based methods, their reliance on exhaustive annotation limits generalization to novel actions. Zero-Shot Skeleton Action Recognition (ZSAR) emerges as a promising paradigm, yet it faces challenges due to the spectral bias o
Autonomous Machine Learning Engineering (MLE) requires agents to perform sustained, iterative optimization over long horizons. While recent LLM-based agents show promise, current prompt-based agents for MLE suffer from behavioral stagnation due to frozen parameters. Although Reinforcement Learning (RL) offers a remedy, applying it to MLE is hindered by prohibitive execution latency and inefficient
Recent LLM-based agents have closed substantial portions of the scientific discovery loop in software-only machine-learning research, in chemistry, and in biology. Extending the same loop to high-fidelity physical simulators is harder, because solver completion does not imply physical validity and many failure modes appear only in field-level imagery rather than in solver logs. We present AI CFD S
Capital is concentrating around a single platform company, with repeated entries pointing to an outsized financing event. That kind of clustering matters because it can reset expectations for late-stage AI funding and adjacent deal flow.
The talent desk is thin, but even sparse movement at a major platform deserves attention when it touches executive leadership. Named leadership changes are the kind that can alter product cadence, hiring, and internal authority.
Benchmark activity is narrow but directional: one system is posting scores across multiple GAIA levels, giving a clearer read on agentic task performance. The significance lies in surface coverage, not just a single score.