The SGLang and Miles teams announced day-zero open-source support for DeepSeek V4, covering both inference and reinforcement-learning training of V4's hybrid sparse-attention architecture, manifold-constrained hyper-connections, and FP4 expert weights. The serving stack pulls together ShadowRadix, a prefix cache that natively handles V4's hybrid attention layers; HiSparse, a hierarchical-memory sparse-attention engine that extends the KV cache to CPU; multi-token-prediction speculative decoding with in-graph metadata; Flash Compressor for IO-aware exact compression; Lightning TopK; and a hierarchical multi-stream overlap scheduler. On a thirty-thousand-token decode benchmark (a passage from Dream of the Red Chamber), SGLang reports a meaningful throughput edge over the closest open-source competitor under best-effort speculative-decoding configurations.

On the training side, Miles implements V4 in Megatron-LM and supports the full DP / TP / SP / EP / PP / CP parallelism matrix, with custom kernels for the hybrid attention and the manifold-constrained hyper-connections, plus an end-to-end mixed-precision RL loop. The most important practical point is that verified-RL pipelines on V4 land on launch day rather than waiting weeks for community ports: the open-weights release stays usable from day one. The post also previews a near-term roadmap focused on FP4 inference kernels for non-Hopper hardware (specifically Huawei Ascend and Blackwell variants), the missing piece behind the Ascend-runnable claims that drove last week's coverage.

Read alongside the prior coverage of V4 itself, this is the day the model became fully deployable in production on the open stack, and it sets the floor for what 'open-weights serving' looks like for any subsequent frontier-scale Mixture-of-Experts release. The throughput lead matters less than the breadth: ShadowRadix and HiSparse both target features (prefix caching with hybrid attention, large-prompt KV that exceeds GPU memory) that previously required either Anthropic-style proprietary infra or substantial engineering on the part of the user. By bundling them with the Miles RL stack on day zero, the open stack closes most of the practical gap between open-weights V4 and a hosted V4 endpoint, removing one of the structural advantages closed labs still held over open-weights deployments.
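
The post does not describe HiSparse internals, but as a rough mental model of what "extending the KV cache to CPU" means in practice, the sketch below implements a toy two-tier block cache: hot KV blocks stay on the GPU, cold blocks get evicted to pinned CPU memory, and a lookup pulls an offloaded block back on demand. Every name here (`TieredKVCache`, `max_gpu_blocks`, the block layout) is invented for illustration and is not HiSparse or SGLang code.

```python
# Toy two-tier KV block cache: GPU = hot tier, pinned CPU memory = cold tier.
# Illustrative only; not the HiSparse implementation.
from collections import OrderedDict
import torch

class TieredKVCache:
    def __init__(self, max_gpu_blocks: int = 64,
                 device: str = "cuda" if torch.cuda.is_available() else "cpu"):
        self.max_gpu_blocks = max_gpu_blocks
        self.device = device
        self.gpu = OrderedDict()   # block_id -> GPU tensor, in LRU order
        self.cpu = {}              # block_id -> pinned CPU tensor

    def put(self, block_id, kv_block):
        """Store a finished KV block; evict the least recently used GPU block
        to pinned CPU memory if the GPU tier is full."""
        if len(self.gpu) >= self.max_gpu_blocks:
            old_id, old_block = self.gpu.popitem(last=False)   # oldest entry
            self.cpu[old_id] = old_block.cpu().pin_memory()
        self.gpu[block_id] = kv_block.to(self.device)

    def get(self, block_id):
        """Fetch a KV block for attention, promoting it back to the GPU tier
        if it had been offloaded."""
        if block_id in self.gpu:
            self.gpu.move_to_end(block_id)   # refresh LRU position
            return self.gpu[block_id]
        block = self.cpu.pop(block_id)       # cold hit: copy back to GPU
        self.put(block_id, block)
        return self.gpu[block_id]

if __name__ == "__main__":
    cache = TieredKVCache(max_gpu_blocks=2)
    for i in range(4):
        # One 256-token block of K and V for 8 heads of dim 128.
        cache.put(i, torch.randn(2, 256, 8, 128))
    kv = cache.get(0)   # block 0 was evicted to CPU; this pulls it back
    print(kv.device)
```

The point of the tiering is the one the post highlights: prompts whose KV footprint exceeds GPU memory remain servable, with the sparse-attention engine deciding which blocks are worth keeping hot.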
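
Likewise, the post names multi-token-prediction speculative decoding without describing the verification step. The standard greedy acceptance loop looks roughly like the sketch below: the MTP head drafts k tokens, the main model scores the whole draft in one forward pass, and the longest agreeing prefix is kept plus one bonus token from the main model. `draft_model` and `target_model` are placeholders, not SGLang APIs, and this says nothing about the "in-graph metadata" optimization.

```python
# Schematic greedy speculative decoding; assumes batch size 1.
# Not SGLang's implementation; sampling-based acceptance would add a
# rejection-sampling correction instead of exact token matching.
import torch

@torch.no_grad()
def speculative_step(target_model, draft_model, tokens, k: int = 4):
    # 1. Draft: propose k tokens autoregressively with the cheap MTP head.
    draft = tokens.clone()
    for _ in range(k):
        next_tok = draft_model(draft)[:, -1].argmax(dim=-1, keepdim=True)
        draft = torch.cat([draft, next_tok], dim=-1)

    # 2. Verify: one main-model forward over the drafted sequence; the
    #    argmax at position i is the target's choice for position i + 1.
    target_pred = target_model(draft).argmax(dim=-1)          # [1, n_ctx + k]

    # 3. Accept the longest prefix where draft and target agree, then
    #    append one token from the target itself (always >= 1 token of progress).
    n_ctx = tokens.shape[1]
    drafted = draft[:, n_ctx:]                                 # the k proposals
    agreed = target_pred[:, n_ctx - 1 : n_ctx - 1 + k]         # target's picks there
    matches = int((drafted == agreed).long().cumprod(dim=-1).sum())
    accepted = draft[:, : n_ctx + matches]
    bonus = target_pred[:, n_ctx + matches - 1 : n_ctx + matches]
    return torch.cat([accepted, bonus], dim=-1)
```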