
Benchmark scores shift as research broadens retrieval

14 items · 4 desks · 7 min read

APPSI-139: A Parallel Corpus of English Application Privacy Policy Summarization and Interpretation

Privacy policies are essential for users to understand how service providers handle their personal data. However, these documents are often long and complex, as well as filled with technobabble and legalese, causing users to unknowingly accept terms that may even contradict the law. While summarizing and interpreting these privacy policies is crucial, there is a lack of high-quality English parall

RESEARCH

FinCARDS: Card-Based Analyst Reranking for Financial Document Question Answering

Financial question answering (QA) over long corporate filings requires evidence to satisfy strict constraints on entities, financial metrics, fiscal periods, and numeric values. However, existing LLM-based rerankers primarily optimize semantic relevance, leading to unstable rankings and opaque decisions on long documents. We propose FinCards, a structured reranking framework that reframes financia

RESEARCH

Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction

Agentic web search increasingly faces two distinct demands: deep reasoning over a single target, and structured aggregation across many entities and heterogeneous sources. Current systems struggle on both fronts. Breadth-oriented tasks demand schema-aligned outputs with wide coverage and cross-entity consistency, while depth-oriented tasks require coherent reasoning over long, branching search tra

RESEARCH

CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model

Electroencephalography (EEG) provides real-time insights into brain activity and supports diverse applications in neuroscience. While EEG foundation models (EFMs) have emerged to address the scalability issues of task-specific models, current approaches still yield clinically uninterpretable and weakly discriminative representations, inefficiently capturing global dependencies and neglecting impor

RESEARCH

Context Matters: Peer-Aware Student Behavioral Engagement Measurement via VLM Action Parsing and LLM Sequence Classification

Understanding student behavior in the classroom is essential to improve both pedagogical quality and student engagement. Existing methods for predicting student engagement typically require substantial annotated data to model the diversity of student behaviors, yet privacy concerns often restrict researchers to their own proprietary datasets. Moreover, the classroom context, represented in peers'

RESEARCH