Upcube Research — Overview & Focus Areas

Pioneering research toward useful, safe, long-context systems

We explore models and methods that help people solve real problems: assistants that work over a 256K context, cite sources, call tools, speak naturally, and generate on-brand visuals, all while staying reviewable and safe.

256K context · Grounded search & citations · Tool calling & agents · Voice & images · Safety & evaluations

Focus areas

Language & Interaction (Upcube-Instruct)

  • Long-context dialogue, structured outputs (JSON/Markdown/LaTeX).
  • Grounded answers with quote-level citations and uncertainty notes.
  • Multi-turn state tracking across help, research, and build flows.
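
As an illustration of quote-level citations with uncertainty notes, here is a minimal sketch that checks whether each quoted span really appears in the source it names. The `Citation` record, the `check_citations` helper, and the whitespace/case normalization are assumptions made for this example, not a description of the Upcube pipeline.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    source_id: str  # hypothetical identifier of the cited document
    quote: str      # the exact span the answer claims to quote


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so layout differences do not matter."""
    return " ".join(text.split()).lower()


def check_citations(citations: list[Citation], sources: dict[str, str]) -> list[dict]:
    """Mark a citation as supported only if its quote appears in the named source."""
    results = []
    for c in citations:
        source_text = sources.get(c.source_id, "")
        supported = bool(c.quote.strip()) and normalize(c.quote) in normalize(source_text)
        results.append({
            "source_id": c.source_id,
            "quote": c.quote,
            "supported": supported,
            # An uncertainty note is attached whenever the quote cannot be verified.
            "note": None if supported else "quote not found in cited source",
        })
    return results


sources = {"doc-1": "The cache is flushed every 30 seconds.\nWrites are batched."}
report = check_citations(
    [Citation("doc-1", "flushed every 30 seconds"), Citation("doc-1", "flushed hourly")],
    sources,
)
```

Exact substring matching is deliberately strict: a paraphrase fails the check and is surfaced with a note rather than passed silently.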

Reasoning & Planning (Upcube-Base + Agents)

  • Tool choice & multi-step execution for real tasks.
  • Program synthesis, code repair, and data pipelines.
  • Evaluation suites for plan quality, robustness, and cost.
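
Multi-step tool execution can be pictured as a loop that runs each planned step, retries on failure, and records what happened. The sketch below assumes a tiny in-memory tool registry; the tool names, the `run_plan` helper, and the retry policy are hypothetical and only illustrate the idea.

```python
from typing import Callable

# Hypothetical tool registry: names and functions are illustrative only.
TOOLS: dict[str, Callable[..., object]] = {
    "add": lambda a, b: a + b,
    "divide": lambda a, b: a / b,
    "upper": lambda text: text.upper(),
}


def run_plan(plan: list[dict], max_retries: int = 1) -> list[dict]:
    """Run steps in order, log every call, and stop at the first step that fails."""
    log = []
    for step in plan:
        name, args = step["tool"], step["args"]
        entry = {"tool": name, "args": args, "attempts": 0, "result": None, "error": None}
        log.append(entry)
        tool = TOOLS.get(name)
        if tool is None:
            entry["error"] = f"unknown tool: {name}"  # nothing to retry
            break
        for attempt in range(1, max_retries + 2):  # first try plus retries
            entry["attempts"] = attempt
            try:
                entry["result"] = tool(**args)
                entry["error"] = None
                break
            except Exception as exc:  # broad on purpose: any tool failure is logged
                entry["error"] = f"{type(exc).__name__}: {exc}"
        if entry["error"] is not None:
            break
    return log


log = run_plan([
    {"tool": "add", "args": {"a": 2, "b": 3}},
    {"tool": "divide", "args": {"a": 1, "b": 0}},
    {"tool": "upper", "args": {"text": "never reached"}},
])
```

The returned log doubles as a trace for plan-quality and cost evaluation: it shows which step failed, after how many attempts, and that later steps were not executed.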

Vision & Image Generation

  • Photorealistic & stylized outputs with aspect/style controls.
  • Brand anchors & captioning for accessibility.
  • Auditability: prompt/seed/config provenance.

Voice & Audio

  • Low-latency, full-duplex conversations with barge-in.
  • Expressive synthesis and controllable prosody.
  • ASR robustness, diarization, and privacy modes.

Scaling & Efficiency

  • Mixture-of-Experts with ~32B activated parameters (≈1T total) for efficiency.
  • Token-efficient pretraining; stability via optimizer & clipping.
  • Throughput/latency work on vLLM, TensorRT-LLM, and SGLang.
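
To make the Mixture-of-Experts figures concrete: ~32B activated out of ≈1T total means roughly 3.2% of the weights are used at a time. The sketch below shows that arithmetic next to a generic top-k router. Top-k gating with a softmax over the selected experts is a common MoE design and is assumed here for illustration; the text does not state which routing scheme is used, and the expert count and scores are made up.

```python
import math


def top_k_route(logits: list[float], k: int) -> dict[int, float]:
    """Pick the k highest-scoring experts and softmax-normalize their weights."""
    chosen = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in chosen)  # subtract the max for numerical stability
    exps = {i: math.exp(logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Figures from the text: ~32B activated out of ~1T total parameters.
activated_fraction = 32e9 / 1e12  # 0.032, about 3.2% of the weights

# Hypothetical router scores for 8 experts; only 2 of them are selected.
weights = top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 0.9], k=2)
```

Only the selected experts run, which is why compute cost tracks the activated parameter count rather than the total.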

Safety, Alignment & Governance

  • Policy-tuned prompts, refusal calibration, and evals.
  • Red-teaming for sensitive domains; incident playbooks.
  • Role-based controls, audit logs, and data governance.
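
Role-based controls and audit logs fit together naturally: every permission check can write a record whether or not it succeeds. A minimal sketch follows, assuming a static role table and an in-memory log; the role names, actions, and the `authorize` helper are hypothetical.

```python
import json
from datetime import datetime, timezone

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS = {
    "viewer": {"read_report"},
    "analyst": {"read_report", "run_eval"},
    "admin": {"read_report", "run_eval", "export_data"},
}

audit_log: list[str] = []  # append-only JSON lines; a real system would persist these


def authorize(user: str, role: str, action: str) -> bool:
    """Return whether the role allows the action, recording the decision either way."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_log.append(json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "role": role,
        "action": action,
        "allowed": allowed,
    }))
    return allowed
```

Logging denials as well as approvals is the design choice worth noting: refused attempts are often the most useful entries during an incident review.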

Recent research highlights

Upcube context upgrade (Update)
New weights unlock 256K context across chat, search, voice, and images: longer threads, larger inputs, clearer answers.

Agentic evaluation toolkit (Toolkit)
Benchmarks for tool selection, plan repair, and recovery from API errors, reporting success@k and cost curves.

Grounded search improvements (Methods)
Better quote attribution, duplicate collapse, and coverage metrics (precision/recall vs. curated corpora).
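
The success@k and cost-curve reporting mentioned for the agentic evaluation toolkit can be sketched as follows. The toolkit's exact definition is not given here, so this example assumes the usual combinatorial estimator: from n recorded attempts with c successes, the chance that at least one of k sampled attempts succeeds is 1 - C(n-c, k) / C(n, k). The function names and the flat per-attempt cost are assumptions.

```python
from math import comb


def success_at_k(n: int, c: int, k: int) -> float:
    """Estimate P(at least one of k attempts succeeds) from n attempts with c successes."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k failures exist, so the estimate becomes 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


def cost_curve(n: int, c: int, cost_per_attempt: float,
               ks: list[int]) -> list[tuple[int, float, float]]:
    """Pair each k with its success estimate and the cost of running k attempts."""
    return [(k, success_at_k(n, c, k), k * cost_per_attempt) for k in ks]


# Hypothetical task: 10 recorded attempts, 3 succeeded, each attempt costs 0.02 units.
curve = cost_curve(n=10, c=3, cost_per_attempt=0.02, ks=[1, 2, 5])
```

Plotting the second column against the third shows how much extra success each additional attempt buys.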

For full lists of papers, notes, datasets, and leaderboards, see the research index.

How we do research

Training & Post-training

  • Mixture-of-Experts with routing tuned for stability and cost.
  • Post-training with preference data and rubric-based feedback.
  • On-policy rollouts for verifiable rewards (math, coding) and rubric-judged tasks.
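
A verifiable reward for math tasks can be as simple as parsing the final answer and comparing it exactly with a reference. The sketch below assumes completions end with an `Answer:` line and that answers are rational numbers; both conventions, and the function names, are hypothetical and chosen only to keep the example small.

```python
import re
from fractions import Fraction
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker (a hypothetical output convention)."""
    matches = re.findall(r"Answer:\s*([^\n]+)", completion)
    return matches[-1].strip() if matches else None


def math_reward(completion: str, reference: str) -> float:
    """1.0 if the final answer equals the reference as an exact rational, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    try:
        return 1.0 if Fraction(answer) == Fraction(reference) else 0.0
    except (ValueError, ZeroDivisionError):
        return 0.0  # unparseable answers earn no reward


reward = math_reward("Halving one gives a half.\nAnswer: 0.5", "1/2")
```

Comparing as `Fraction` values means "0.5" and "1/2" count as the same answer, which avoids penalizing formatting differences. Rubric-judged tasks would need a different scorer.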

Evaluations

  • Knowledge, reasoning, math/STEM, coding, and tool-use suites.
  • Grounding: citation precision/recall, quote accuracy, coverage.
  • Safety: refusal calibration, bias audits, privacy leak tests.
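
Citation precision and recall follow the standard definitions once citations are represented as comparable items. In this sketch each citation is a (claim, source) pair checked against a curated gold set; that representation and the identifiers are assumptions for illustration.

```python
def citation_precision_recall(predicted: set, gold: set) -> tuple[float, float]:
    """Precision = correct / predicted; recall = correct / gold (0.0 when a set is empty)."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical (claim_id, source_id) pairs.
predicted = {("c1", "doc-a"), ("c1", "doc-b"), ("c2", "doc-c")}
gold = {("c1", "doc-a"), ("c2", "doc-c"), ("c2", "doc-d"), ("c3", "doc-e")}
precision, recall = citation_precision_recall(predicted, gold)
```

Here two of the three predicted citations are correct (precision 2/3) and two of the four gold citations are found (recall 1/2).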

Reproducibility & Reporting

  • Seeds, configs, and environment manifests captured per run.
  • Action logs: tools called, parameters, and evidence paths.
  • Exportable artifacts: HTML/PDF reports and .yaml pipelines.
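
Capturing seeds, configs, and environment details per run might look like the sketch below. The manifest fields and helper names are assumptions; a fuller manifest would likely also record package versions and hardware.

```python
import hashlib
import json
import platform
import random
import sys


def build_manifest(seed: int, config: dict) -> dict:
    """Capture the seed, config, a config hash, and basic environment details for one run."""
    # Sorting keys makes the hash independent of the order the config was written in.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return {
        "seed": seed,
        "config": config,
        "config_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


def seeded_sample(seed: int, n: int) -> list[float]:
    """Stand-in for a run: the same seed reproduces the same draws."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


manifest = build_manifest(seed=7, config={"lr": 0.1, "steps": 10})
```

Storing the manifest next to the run's outputs lets a later reader confirm that two results came from the same configuration before comparing them.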

Deployment & Systems

  • Inference engines: vLLM, SGLang, KTransformers, TensorRT-LLM.
  • Latency-first voice paths and streaming JSON outputs.
  • Multi-channel delivery with RBAC and audit trails.
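
Streaming JSON outputs need a framing convention so a client can parse objects as they arrive. The sketch below assumes newline-delimited JSON, one object per line; the actual wire format is not specified in this overview, and the event fields are made up.

```python
import json
from typing import Iterable, Iterator


def iter_json_lines(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per complete line, buffering partial lines across chunks."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():  # final object without a trailing newline
        yield json.loads(buffer)


# Chunks deliberately split in the middle of an object, as a network stream might.
chunks = ['{"type": "tok', 'en", "text": "Hi"}\n{"type": ', '"done"}']
events = list(iter_json_lines(chunks))
```

Because objects are emitted as soon as their line is complete, a latency-sensitive consumer such as a voice path can act on the first event without waiting for the stream to end.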

Research perspective

“Safely aligning capable systems is a scientific challenge and an engineering discipline. Our aim is practical reliability: assistants that cite sources, ask before acting, and make it easy to see what happened and why.”
— Upcube Research Team

Streams & artifacts

Language & Interaction
  • Dialogue state tracking & memory compression for 256K context.
  • Structured outputs with schema adherence and fallbacks.
  • Cross-doc grounding and contradiction detection.
Reasoning & Planning
  • Plan synthesis, tool graphs, and execution recovery.
  • Program-aided reasoning for math and code tasks.
  • Cost-aware routing and partial-credit metrics.
Vision & Image
  • Style anchoring and text-to-figure generation.
  • Safety filters and brand-guard constraints.
  • Captioning and alt-text quality metrics.
Voice & Audio
  • Streaming ASR with diarization and punctuation repair.
  • Expressive TTS with controllable prosody and timing.
  • Barge-in reliability and latency budgets.
Safety & Alignment
  • Policy tuning, red-teaming, and scenario playbooks.
  • Refusal calibration and uncertainty surfacing.
  • Data minimization, retention windows, and audits.
Systems & Efficiency
  • Throughput, cache strategies, and memory planning.
  • Routing stability and optimizer research.
  • Traceability across distributed inference.
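
The "structured outputs with schema adherence and fallbacks" item under Language & Interaction can be illustrated with a small validator: accept the model's JSON only if it matches the expected shape, otherwise return a flagged default. The schema, field names, and fallback values below are hypothetical.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
SCHEMA = {"title": str, "score": float, "tags": list}


def fallback() -> dict:
    """Default object returned when the output cannot be trusted."""
    return {"title": "", "score": 0.0, "tags": [], "valid": False}


def parse_structured(raw: str) -> dict:
    """Return the parsed object if it matches SCHEMA, otherwise a flagged fallback."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return fallback()
    if not isinstance(data, dict):
        return fallback()
    for field, expected in SCHEMA.items():
        # Strict check: a missing field or a wrong type (e.g. an int score) is rejected.
        if not isinstance(data.get(field), expected):
            return fallback()
    return {**{field: data[field] for field in SCHEMA}, "valid": True}


ok = parse_structured('{"title": "Report", "score": 0.9, "tags": ["q3"]}')
bad = parse_structured('{"title": "Report", "score": "high", "tags": []}')
```

The `valid` flag lets downstream code tell a genuine result from a fallback instead of treating the default as real data.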

Get involved

Collaborations

  • Joint evaluations and red-team exercises.
  • Domain-specific datasets and ground-truth building.
  • Tool/plugin ecosystems for agentic tasks.

Open materials

  • Research notes, benchmarks, and example pipelines.
  • Reproduction guides and config packs.
  • Safety checklists and incident templates.

Contact Upcube Research
Where to find us

Upcube Inc.
New York, NY 10005 · USA
upcubeco@gmail.com
