Projects

Internal research projects.

A small number of focused projects pushing specific frontiers — formal verification for multi-agent safety, quantum error correction at higher temperatures, and autonomous materials discovery. The discipline is to run few projects, run them deeply, and abandon them on stated criteria rather than on attention drift.

The discipline of focus

Why three projects, not thirty.

The default failure mode for a research-driven company is to spread effort across many promising directions until none of them clears the threshold from promising to load-bearing. The history of industrial research is full of cautions on this point. Bell Labs in its productive decades held a small number of focused programs at any given time (solid-state physics, the transistor effort, the Unix group, the information-theory line under Shannon) and a much larger penumbra of free exploration that was deliberately buffered from delivery pressure. PARC, in its peak decade, ran on the same shape: a small number of named programs (Smalltalk, the Alto, the laser printer) with sustained engineering investment, surrounded by exploratory work that did not carry quarterly deliverables. Both labs are studied today not because they ran many threads but because they ran few threads well.

The pattern is older than industrial research and broader than computing. Brooks's essay No Silver Bullet (Brooks, 1986) makes the case in software-engineering terms: past productivity gains had come from attacking accidental complexity, and further gains would have to come from attacking essential complexity, for which no single technique promised an order-of-magnitude improvement. Apik's reading of that argument: the open problems on which we work are essential-complexity problems. Spreading effort thin against them produces shallow work that looks like progress on a quarterly cadence and does not aggregate. The opposite discipline (three projects, sustained over multi-year horizons, with explicit success and abandonment criteria) is the structure under which we expect essential-complexity work to land.

This is the position from which the project portfolio below was selected. It is also the position from which the portfolio is bounded: we deliberately decline to add a fourth or fifth project until one of the existing three ships, sunsets, or spins out. The constraint produces a roadmap that looks slower and converts more of what it starts into finished work.

What counts as a project

The bar.

An Apik project is not a research strand and not a product. A research strand is an open question on a pillar page — for example, "scalable interpretability" on the AI Safety pillar — that an individual or a small team carries on an ongoing basis. A product is a customer-facing surface that ships under our Responsible Development Policy and lives or dies by deployment metrics. A project sits between the two: it is an engineering investment with a multi-year horizon and a deliverable that is neither a paper nor a customer-facing release, but rather an artifact — a verified policy core, a higher-temperature qubit substrate, a closed-loop materials platform — on which subsequent products and research depend.

Three conditions have to hold for a research investment to graduate from strand to project. First, the work has to cross the threshold from research into engineering, meaning the central uncertainty is no longer "does this work in principle" but "does this work at the engineering tolerance the application requires." Second, the work has to be load-bearing for at least one of the three planetary-scale outcomes Apik is built around: AI safety, abundance economics, and the civilizational stack. Work that improves a benchmark by three points but does not materially change the disposition of any of those three outcomes is a paper, not a project. Third, the work has to have an abandonment criterion, a stated condition under which we would call it and shut it down, that is specific enough to be argued with at year three.

Why these three

Each project, named.

Aegis — verified envelopes around agentic action.

The thesis: behavioral evaluation of large models tells us what a system did on a sample of inputs, not what it can do across an input space. Formal verification of a frontier model, end-to-end, remains out of reach; verifying a small, mechanically checkable policy layer around the model's externally visible action surface is not. Project Aegis takes that claim seriously. The technical reference point is the formal verification of the seL4 microkernel by Klein et al. (2009): a small privileged interface, mechanically checked, on which a much larger and unverified system runs. Aegis applies that pattern to agent-mediated side effects (tool calls, file writes, network egress, shell access), with the goal of shipping a verified policy core that mediates them.
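As an illustration of the mediation pattern only, here is a minimal Python sketch of a default-deny policy layer sitting between an agent and its side effects. Everything in it is hypothetical: the names `Action`, `is_allowed`, and `mediate`, the allowlists, and the write root are assumptions made for the example, not Aegis's design, and ordinary Python like this is not verified code. What it shows is how small the decision surface can be relative to the system it guards.

```python
import posixpath
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    TOOL_CALL = "tool_call"
    FILE_WRITE = "file_write"
    NETWORK_EGRESS = "network_egress"
    SHELL = "shell"


@dataclass(frozen=True)
class Action:
    kind: Kind
    target: str  # tool name, file path, hostname, or command line


# Hypothetical policy data, chosen for illustration only.
ALLOWED_TOOLS = frozenset({"search", "calculator"})
ALLOWED_HOSTS = frozenset({"api.example.com"})
WRITE_ROOT = "/workspace"


def is_allowed(action: Action) -> bool:
    """Default-deny decision: True only for explicitly permitted actions."""
    if action.kind is Kind.TOOL_CALL:
        return action.target in ALLOWED_TOOLS
    if action.kind is Kind.NETWORK_EGRESS:
        return action.target in ALLOWED_HOSTS
    if action.kind is Kind.FILE_WRITE:
        # Normalize first so "../" segments cannot escape the write root.
        path = posixpath.normpath(action.target)
        return path.startswith(WRITE_ROOT + "/")
    # Shell access and any other kind fall through to deny.
    return False


def mediate(action: Action, execute):
    """Run execute(action) only if the policy allows it; otherwise raise."""
    if not is_allowed(action):
        raise PermissionError(f"denied: {action.kind.value} -> {action.target}")
    return execute(action)
```

In this sketch the agent never touches a tool, file, or socket directly; it hands an `Action` to `mediate`, and only the few lines of `is_allowed` would need to be specified and checked. A write to `/workspace/../etc/passwd` is denied because the path is normalized before the prefix test.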

Q-Core — error correction at higher temperatures.

The thesis: large-scale fault-tolerant quantum computing requires logical qubits with error rates well below the physical-qubit error rate. Recent experiments have operated an error-correcting code below its threshold, the regime in which adding physical qubits suppresses the logical error rate (Acharya et al., 2024), but reaching application-scale logical error rates is still projected to cost on the order of thousands of physical qubits per logical qubit, all held at cryogenic operating temperatures. The cost structure of fault-tolerant machines is therefore dominated by cryogenic plant and routing density. A substrate change that raises the operating temperature even modestly, from millikelvin to a few kelvin, would change the cost structure substantially. Project Q-Core is our investment in that substrate question. Success is not a working fault-tolerant computer; it is a substrate demonstration that meaningfully changes the engineering envelope.
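To make the overhead figure concrete, the following back-of-envelope sketch uses the common surface-code heuristic that, below threshold, each increase of the code distance by two divides the logical error rate by a suppression factor. The function names and every number passed to them are illustrative assumptions, not measurements from the cited work or from Q-Core. The qubit count assumes a rotated surface-code patch of distance d, with d² data qubits and d² − 1 measurement qubits.

```python
def physical_qubits(distance: int) -> int:
    """Qubits in one rotated surface-code patch: d^2 data + (d^2 - 1) measure."""
    return 2 * distance * distance - 1


def logical_error(distance: int, eps_ref: float, d_ref: int, lam: float) -> float:
    """Heuristic scaling: each step of +2 in distance divides the error by lam."""
    return eps_ref / lam ** ((distance - d_ref) / 2)


def min_distance(target: float, eps_ref: float, d_ref: int, lam: float) -> int:
    """Smallest distance d_ref, d_ref + 2, ... whose modeled error meets target."""
    if lam <= 1.0:
        raise ValueError("no suppression when lam <= 1 (at or above threshold)")
    d = d_ref
    while logical_error(d, eps_ref, d_ref, lam) > target:
        d += 2
    return d


# Illustrative inputs: 1e-3 logical error per cycle at distance 7,
# suppression factor 2, target of 1e-6 per cycle.
d = min_distance(target=1e-6, eps_ref=1e-3, d_ref=7, lam=2.0)
print(d)                   # 27
print(physical_qubits(d))  # 1457
```

Under these assumed inputs a single logical qubit already needs on the order of a thousand physical qubits, and stricter targets or weaker suppression push the count higher. That per-qubit multiplier is why the temperature at which the physical qubits must be held weighs so heavily on the cost of the machine.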

Synthesis — closed-loop autonomous materials discovery.

The thesis: the rate-limiting step in materials science is not the design of candidate compounds but the loop time between design, synthesis, characterization, and re-design. Closed-loop autonomous platforms, robots that perform synthesis and characterization without a human in the loop, have demonstrated round-the-clock cadence on specific synthetic chemistries (Burger et al., 2020, the Liverpool mobile chemist; Coley and colleagues' autonomous synthesis platforms). Project Synthesis applies the same loop pattern to the materials substrates the civilizational stack depends on: better photovoltaics, longer-cycling batteries, lower-emission cement, structural materials for off-world deployment. Success is a platform that closes the discovery loop on a chosen substrate at a sustained cadence, with the generated data publishable at the same level as the artifact.
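The design, synthesis, characterization, and re-design loop can be sketched in a few lines. The example below is a toy under stated assumptions: `simulated_measurement` stands in for the robot and its instruments, its peak at 0.62 is made up, and the re-design rule is a simple random perturbation of the best candidate rather than the planner any cited platform uses. It shows only the shape of the loop and the habit of logging every run, including the ones that did not improve.

```python
import random


def simulated_measurement(composition: float) -> float:
    """Stand-in for synthesis + characterization; the peak at 0.62 is made up."""
    return 1.0 - (composition - 0.62) ** 2


def closed_loop(measure, rounds: int, step: float = 0.1, seed: int = 0):
    """Design -> synthesize/characterize -> re-design, with no human in the loop."""
    rng = random.Random(seed)
    best_x = 0.5
    best_y = measure(best_x)
    history = [(best_x, best_y)]  # every run is logged, improvements or not
    for _ in range(rounds):
        # Design: perturb the best-known candidate, clamped to [0, 1].
        candidate = min(1.0, max(0.0, best_x + rng.uniform(-step, step)))
        # Synthesize and characterize (simulated here).
        result = measure(candidate)
        history.append((candidate, result))
        # Re-design: adopt the candidate only if it beats the best so far.
        if result > best_y:
            best_x, best_y = candidate, result
    return best_x, best_y, history


best_x, best_y, history = closed_loop(simulated_measurement, rounds=200)
```

The returned `history` is the counterpart of the publishable data the paragraph describes: it holds one record per run, so the number of rounds completed per day is directly the loop time the thesis identifies as rate-limiting.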

Open by default

What publishes, what stays internal.

The default disposition of project output at Apik is publication. Reference architectures, evaluation methodologies, negative results, and engineering writeups publish on the news surface and, where appropriate, into peer-reviewed venues. The reasoning is direct: a research-driven company that does not publish is functionally a closed lab, and closed labs concentrate the information asymmetry the transparency framework is designed to counteract.

The exception clause is narrow. We hold work internal when capability uplift from publication exceeds the disclosure benefit: concretely, when releasing a specific evaluation harness or a specific verification technique would materially help a less safety-disciplined actor. The clause is invoked rarely and decided by the safety council under the documented procedure in the Responsible Development Policy; the existence of a held-back artifact is itself disclosed even when the artifact is not. The discipline owes more to Anthropic's RSP-style framing of disclosure than to the older "publish everything or publish nothing" binary; the question is always which fork increases total safety, and the answer is usually publication.

Velocity

Success, sunset, and spin-out.

Each project carries three named milestones at planning time.

Reference architecture publishable at 18 months: a written description of the system, its assumptions, its evaluations, and its known limitations, at the level of detail external researchers can engage with.

Demonstrated component at 36 months: a working artifact (a verified policy core, an operating qubit substrate, a closed-loop discovery run) that an external party could in principle reproduce.

Spin-out, integration, or sunset by year five: the project either becomes a product line, becomes infrastructure other projects depend on, finds a better home, or is shut down with a written postmortem.

The fifth-year clause is the one that matters most. Most industrial research programs fail not by failing visibly but by drifting past their stated horizons. The discipline of writing the abandonment criterion at the start, and reading it against current state on a fixed cadence, is the antidote. Sunsetting a project well — with a postmortem published, with the team rotated to other projects, with the IP and lessons absorbed — is treated as a successful outcome at Apik, distinguishable from a successful demonstration only by which side of the abandonment criterion the result fell on.
