Saturday, April 25, 2026

If this page is useful, please consider supporting it

Friday, April 24, 2026

MrDocs in the Wild

The questions changed. For a long time, people asked about MrDocs in the abstract: what formats will it support, how will it handle templates, when will it be ready. Then, gradually, the questions became specific. Jean-Louis Leroy, the author of Boost.OpenMethod, became one of our most active sources of feedback. His library exercises corners of C++ that most projects never touch, which means MrDocs gets tested in ways we would not have anticipated. He wanted to know why his template specializations were not sorted correctly. He wanted macro support because Boost libraries rely heavily on macros. He hit a crash when his doc comments contained HTML tables. These are not theoretical questions about a tool that might exist someday. These are questions from someone who already generated documentation with MrDocs and needs it to work better. In our previous post, we described MrDocs transitioning from prototype to product. This post is about what happened when MrDocs went into the wild.
Contents:
- Real Projects, Real Problems
  - The Demo Page
  - Breadcrumbs Without a Navigation File
  - Coordinating Two Independent Extensions
  - Edge Cases in the Wild
  - Rendering and Output
  - Under the Hood
  - The MrDocs Website
- Exploring the Unknowns
  - Reflection: Replacing Boilerplate with Introspection
  - First Steps Toward Extensions
  - Why We Discarded MrDocs-as-Compiler
- Contributor Experience
  - Automating PR Reviews
  - CI Infrastructure
  - Test Infrastructure
- Acknowledgments and Reflections

Real Projects, Real Problems

[Mindmap: user feedback grouped into first impressions (unstyled demos, custom stylesheets), navigation (orphaned pages, breadcrumbs), AST edge cases (parameter packs, friend targets, detail namespaces), rendering (description ordering, code blocks, anchor links), and runtime (JS engine switch, compiler fallback).]

The Demo Page

Right after the previous post, where we announced the MVP and encouraged people to try MrDocs, we noticed the demos page was not doing us any favors. Someone shared MrDocs on a developer community and the website started getting traffic. The landing page looked polished, but visitors clicked through to the demos and saw raw, unstyled HTML: no fonts, no spacing, no colors. The HTML generator produced correct semantic markup, and that is technically the point: users are supposed to customize the output with their own stylesheets. But on the demos page, there was no stylesheet at all, and the result looked broken rather than customizable. The custom stylesheet system added five configuration options (stylesheets, linkcss, copycss, no-default-styles, stylesdir) so projects can match their own branding. A bundled default CSS now ships with MrDocs, and it was refined to remove gradients in favor of solid, readable backgrounds.
Stylesheet commits:
- 5fe30c1 feat: custom stylesheets
- 33d985c chore: version is 0.8.0

Breadcrumbs Without a Navigation File

MrDocs generates thousands of reference pages, one per C++ symbol. We maintain an Antora extension, the antora-cpp-reference-extension, that integrates these pages into Antora-based documentation sites. But the generated pages end up orphaned from the navigation tree. Users found the navigation confusing: clicking on “boost” in the breadcrumb did not go where expected, and reference pages had no trail showing where they belonged in the hierarchy. The obvious fix would be to list every page in Antora’s nav.adoc, but maintaining a navigation file with thousands of entries that changes every time a symbol is added or removed is not practical. Worse, Antora renders the navigation file in the sidebar, so listing every reference page would flood the UI with thousands of entries. We discussed the problem extensively with the Antora maintainer on the Antora community chat. His position was clear: Antora was designed so that pages must be in the navigation file. Programmatic editing of navigation is not supported. That was not acceptable for us. We needed breadcrumbs that work for thousands of generated pages without polluting the sidebar or requiring a hand-maintained navigation file. The Antora author’s position was reasonable from his perspective (Antora is a general-purpose documentation tool, not a reference generator), but our use case was fundamentally different from what Antora was designed for. The antora-cpp-reference-extension now builds breadcrumbs independently from the navigation file. MrDocs generates reference pages in a directory structure that mirrors the C++ namespace hierarchy (boost/urls/segments_view.adoc lives inside boost/urls/). The extension uses this structure to reconstruct the breadcrumb trail: each directory maps to a namespace, and the page title (which is the symbol name) becomes the last breadcrumb entry.
The result reads naturally: Reference > boost > urls > segments_view. Zero changes to the nav file. The sidebar stays clean. Breadcrumbs appear automatically and update when symbols are added or removed.

Breadcrumb and reference extension commits (antora-cpp-reference-extension):
- ae95eb2 feat: synthesize reference breadcrumbs without nav files
- 10a4019 feat: add auto base URL detection
- 6a6c08b docs: auto-base-url option
- 4f7c79f refactor: enhance release asset validation

Coordinating Two Independent Extensions

The antora-cpp-reference-extension generates reference pages and breadcrumbs. The antora-cpp-tagfiles-extension resolves cross-library symbol links (so a reference to boost::system::error_code in Boost.URL’s docs links to the correct page in Boost.System’s docs). These are two independent Antora extensions running as separate jobs. The problem was that the reference extension generates tagfiles as a side effect of producing reference pages, and the tagfiles extension needs the most recent version of those tagfiles to resolve links correctly. MrDocs changes the tagfiles every time the corpus changes. Manually keeping them in sync was not sustainable: committing tagfiles to the repository meant they were always stale by the time the next build ran. We made the extensions coordinate directly. The reference extension now hands its tagfile to the tagfiles extension at build time, so the links always reflect the current state of the documentation. The reference extension also gained auto base URL detection, removing the need for manual path configuration when switching between development and production builds.
Extension coordination commits:

antora-cpp-reference-extension:
- 6e8ffcb feat: antora-cpp-tagfiles-extension coordination
- 8f12576 chore: version is 0.1.0

antora-cpp-tagfiles-extension:
- 98eba40 feat: antora-cpp-reference-extension coordination
- 5a1723c feat: add global log level control for missing symbols
- 453f01b chore: version is 0.1.0

Edge Cases in the Wild

As more libraries adopted MrDocs, edge cases in C++ symbol extraction surfaced. Boost.Beast exposed a duplicate ellipsis in parameter pack rendering (#1108, #1129):

Before: T& emplace(Args...&&... args)
After:  T& emplace(Args&&... args)

Boost.OpenMethod revealed that friend targets were not resolving correctly. Boost.Buffers uncovered a problem with detail namespaces: when a class inherits from a base in a hidden namespace, the inherited members appeared in the documentation but their doc comments were lost (#1107). We fixed this so derived classes inherit documentation from hidden bases. Unnamed structs also sparked an extended design discussion. When C++ code declares constexpr struct {} f{};, MrDocs needs a stable, unique name for hyperlinks. The team established a collaborative design process using shared documents, with Peter Dimov contributing an insight about C compatibility (typedef struct {} T; makes the struct named in C++).

AST and metadata commits:
- c85be75 fix: remove duplicate ellipsis in parameter pack expansion
- c3dbded fix(ast): prevent TU parent from including unmatched globals
- 76b7b43 fix(ast): canonicalize friend targets
- 05f5852 fix(metadata): copy impl-defined base docs
- 35cf1f6 fix: UsingSymbol is SymbolParent
- c406d57 fix: preserve extraction mode when copying members from derived classes
- 4e7ef04 fix: prevent infinite recursion when extracting non-regular base class
- 0a69301 fix: extract and fix some special member function helpers

Rendering and Output

Users noticed that the manual description of a symbol was buried below long member tables (#1105).
On a class with many members, you had to scroll past the entire member listing before finding the author’s explanation of what the class does. We moved the description to appear immediately after the synopsis, matching what cppreference does. Other rendering issues included HTML code blocks not wrapped in pre tags, anchor links appearing when the wrapper element was missing, and the Handlebars template engine accumulating special name re-mappings that conflated different symbols.

Rendering and output commits:
- d90eae6 fix: hide anchor links when wrapper is not included
- 92491de fix: manual description comes before member lists
- 58bf524 fix: remove all special name re-mappings for Handlebars
- 2a75692 fix: HTML code blocks not wrapped in pre tags
- 1ebff32 fix: bottomUpTraverse() skips ListBlock items
- 7b118b1 fix: missing @return command in doc comment

Under the Hood

We fixed a compiler fallback issue where MrDocs failed when the compilation database referenced a compiler that was not available on the current machine, and corrected sanitizer flag propagation so that UBSan and TSan flags do not unnecessarily propagate to dependency builds.

Build and toolchain commits:
- 235f5c8 fix: fall back to system compilers when database compiler is unavailable
- f320581 fix: don’t pass sanitizer to dependency builds for UBSan/TSan

The MrDocs Website

While we were fixing the generated output, Robert Beeston and Julio Estrada were redesigning the MrDocs website. Robert led the design direction, working with a team to develop a visual identity that balances a distinctive retro aesthetic with modern readability, including a dark theme. Julio handled the implementation: mobile-responsive layout, UI styling improvements, cleaner backgrounds and styles, Open Graph and Twitter meta tags for social sharing, and a close button for the docs navigation on smaller screens. For a documentation tool, the website is the first thing potential users see.
Having a polished, memorable landing page matters more than it might for other kinds of projects.

Exploring the Unknowns

The team made a deliberate choice: instead of following a traditional feature roadmap, we would focus on areas of uncertainty (#1113). These were open questions that blocked multiple design decisions at once:

- MrDocs-as-compiler (#1073): should MrDocs emit “object” files for later “linking,” like a compiler?
- Scripting extensions (#1128, #881): how should users extend and transform documentation output?
- Plugins (#58, #1044): how should third-party code register new generators?
- JSON-only MrDocs: should we add a JSON output format alongside (or replacing) the existing XML structured output?
- Reflection (#1114): how do we reduce the maintenance burden of the growing metadata model?
- Cross-linking (#1072): how do we reference symbols in other libraries?

The motivation was practical. Each Boost library that adopted MrDocs had its own needs that could not be met by the core tool alone. Boost.URL has implementation_defined namespaces with internal code that should be hidden or transformed in the documentation. Boost.Capy has detail types that should be presented as user-facing types. Coroutines are represented as types in the AST but should be documented as functions. We want MrDocs to be smart enough, with project-specific extensions, that library authors do not have to add workarounds to their source code just to get the documentation right. Rather than hard-coding solutions for each library, the unknowns framework asked: what general mechanisms would let every library solve its own documentation problems?
[Mindmap: the unknowns — scripting extensions (JS helpers, Lua), plugins (generator API, DLL loading), reflection (Boost.Describe, MrDocs.Describe), cross-linking (tagfiles, Antora coordination), JSON-only MrDocs, and MrDocs-as-compiler.]

Reflection: Replacing Boilerplate with Introspection

MrDocs models many kinds of C++ symbols: functions, classes, namespaces, enums, typedefs, concepts, and more. Each symbol type has metadata, and every piece of code that touches that metadata had to enumerate all fields by hand. Adding a single field to a symbol type meant updating it in:

- Schema files that describe the metadata format
- Generators (HTML, AsciiDoc, XML) that produce the output
- Templates that render individual pages
- Operators like comparison functions, merge functions (e.g., merging symbols from different translation units when only one is documented), and equality checks
- Documentation describing the metadata
- The code itself that populates and transforms the metadata

That is roughly ten to fifteen places per field, and missing one caused CI failures that blocked everyone. This was one of the unknowns we identified: how to reduce the maintenance burden as the data model grows. Worse, downstream users who had their own templates and extensions also had to learn about the new fields and update everything accordingly. Gennaro Prota, with his strong background in generic programming and metaprogramming, took ownership of the reflection problem.
The work progressed through several stages:

1. Integrate Boost.Describe into the metadata system, replacing hand-written serialization functions
2. Add $meta.type and $meta.bases to all DOM objects so templates can introspect the corpus
3. Replace the XML generator with a reflection-based one (no more hand-maintained XML output)
4. Build a custom reflection system (MrDocs.Describe) tailored to our needs
5. Replace per-type operators with a single generic template

The result eliminated the second step entirely: adding a new field to a symbol type no longer requires touching ten other files. The description drives everything, and the serialization, comparison, and merge logic derive from it automatically. Boost.Describe and Boost.Mp11 are private dependencies that do not appear in public headers. Along the way, Gennaro also added function object support, fixed Markdown inline formatting, and fixed missing dependent array bounds in the output.

Reflection and metadata commits:

Reflection (Gennaro Prota):
- d490880 refactor(metadata): integrate Boost.Describe
- c4dd89a feat: add $meta.type and $meta.bases to all DOM objects
- d4a64ef fix: replace the XML generator with a reflection-based one
- 6ce961f refactor: add custom reflection facilities (MrDocs.Describe)
- eb68494 refactor: migrate all reflection consumers to MrDocs.Describe
- 8f5391b refactor: replace per-type merge() one-liners with a single generic template
- e749144 feat: make the reflection consumers public
- 1ed76ad refactor: replace most per-type tag_invoke overloads with a single generic template
- 0246935 refactor: replace per-type operator==() and operator () with a single generic template

Features and fixes (Gennaro Prota):
- 93a5032 feat: add function object support
- f35ebcd fix: rendering of Markdown inline formatting and bullet lists
- 4ae305b fix: missing dependent array bounds in the output
- 72fba40 test: add golden tests for a partial class template specialization

The reflection work is the foundation for everything that comes next: the
upcoming Lua scripting, and the metadata transformation pipeline.

First Steps Toward Extensions

MrDocs supports two extension points: JavaScript for Handlebars template helpers, and Lua for more powerful scripting. The JavaScript engine had been Duktape, but Duktape is no longer actively maintained and only supports ES5.1. We needed a replacement. We evaluated several alternatives (#881):

| Engine      | JS Support             | Windows/MSVC        | Size        | License              |
|-------------|------------------------|---------------------|-------------|----------------------|
| QuickJS     | ES2023                 | No (clang-cl only)  | ~370 KB     | MIT                  |
| PrimJS      | ES2019                 | No (POSIX only)     | ~370 KB     | MIT                  |
| JerryScript | ES5.1 + ES2022 subset  | Yes                 | ~200 KB     | Apache 2.0           |
| Escargot    | ES2025 subset          | Yes                 | ~400-500 KB | LGPL 2.1             |
| MuJS        | ES5.1                  | Yes                 | ~200-300 KB | ISC                  |
| Moddable XS | ES2025 (~99%)          | Yes (via SDK)       | ~100-300 KB | Apache/GPL/LGPL      |
| mJS         | Restricted ES6         | Yes                 | ~50-60 KB   | GPL 2.0 / Commercial |
| Elk         | Minimal ES6            | Yes                 | ~20-30 KB   | GPL 2.0 / Commercial |

We first experimented with QuickJS, which had the best ES support. But it requires C11 features, as well as __int128, that plain MSVC does not support. On Windows, users would need Clang with the Visual Studio runtime. PrimJS was POSIX-only. We settled on JerryScript: it supports Windows and MSVC natively, has a small footprint (~200 KB), and covers enough of ES2022 for template helpers. Unlike most alternatives in the table, JerryScript was designed from the ground up to be embedded in other applications, which makes it more like Lua and less like engines that target browsers or standalone runtimes. The JavaScript helpers extension was a single commit but a large one: 85 files changed, 4,287 insertions.
The work included:

- Replacing Duktape with JerryScript across the entire codebase, including build scripts, CMake recipes, and third-party patches
- Rewriting the C++ JavaScript bindings (JavaScript.hpp and JavaScript.cpp) with shared context lifetime, safer value accessors, and clearer error messages
- A layered addon system where projects provide JavaScript helpers in a directory structure (generator/common/helpers/ for shared helpers, generator/html/helpers/ for format-specific ones). Multiple addon directories can be layered, so a project’s helpers override or extend the defaults.
- Golden tests for extension output (js-helper/, js-helper-layering/) to verify that helpers produce the expected documentation
- 1,335 lines of new JavaScript binding tests covering the engine lifecycle, value conversion, error handling, and helper registration

Combined with the public API for registering custom generators, MrDocs now supports customization beyond templates. A library like Boost.Capy could write an extension that transforms its coroutine types into function documentation, without any changes to MrDocs itself.

[Pipeline diagram: Clang AST → Extraction → Corpus → Transformation Extensions → Handlebars Generators → Documentation Templates → HTML / AsciiDoc, with Template Extensions feeding back into the templates and an XML output branching off the transformation stage.]

The vision for extensions has two layers:

- Transformation extensions operate on the corpus between extraction and generation. A library could transform its internal types into the documentation structure it wants. This layer is not yet implemented.
- Template extensions (JavaScript helpers) operate inside the Handlebars templates that produce HTML and AsciiDoc output. This is the layer we shipped.
- Lua scripts, for more powerful scripting in both layers (planned)

Extension and generator commits:
- 0f3ecb4 feat: javascript helpers extension (85 files, 4,287 insertions)
- 930a5ea fix: jerry_port_context_free wrong signature causes silent corruption
- 8da0930 feat(lib): public API for generator registration
- 788c1ba feat(generators): tables for symbols have headers

Why We Discarded MrDocs-as-Compiler

One unknown we explored and deliberately discarded was MrDocs-as-compiler (#1073). The idea, proposed by Peter Dimov, was to treat MrDocs like a compiler: emit “object” files per translation unit, then “link” them to produce the final reference. CMake would invoke MrDocs as if it were Clang, with identical command-line options. We spent time studying tools that work this way: clang-tidy, clang-doc, include-what-you-use. What we found is that tricking CMake into thinking MrDocs is a real compiler is not trivial. Every tool that tries this approach ends up needing either a coordinator binary (reimplementing what MrDocs already has) or CMake helper scripts. Both add workflows rather than simplifying them. The experience from the Boost ecosystem reinforced this: no Boost project uses any of these compiler-like tools for static analysis, and the reason is complexity. People who find the compilation database workflow too involved are going to be even less inclined to adopt a tool that requires them to pretend to be a compiler. We decided to keep MrDocs as a single-step tool that reads a compilation database and produces output, rather than splitting it into a multi-binary pipeline that would need its own coordination layer.

Contributor Experience

As more people contributed to MrDocs, the gap between “clone the repo” and “submit a useful PR” needed closing. The biggest change was the bootstrap script, which reduced the entire build setup to a single python bootstrap.py command (covered in a separate post).
Beyond the bootstrap, we split the contributor guide into focused sections, added reference documentation for MrDocs comment syntax (so contributors know what @copydoc, @see, and other commands do), and created a run_all_tests script that runs the full test suite locally without needing to understand the CMake test configuration.

Onboarding commits:
- b103cba docs(reference): mrdocs comments
- 9b7ec24 feat(util): run_all_tests script
- 5902699 docs: update packages
- 302f0a6 docs: split contribute.adoc guide

Automating PR Reviews

MrDocs PRs tend to be large and hard to review. A single PR might touch the AST visitor, the Handlebars templates, the Antora extension, the CI configuration, and hundreds of golden test files (when an intentional change to the output format updates the expected output for every test case). We found ourselves making the same review comments over and over. We set up Danger.js to catch these patterns before human reviewers see the PR. The most important check detects when source code changes do not include corresponding tests: if someone changes extraction logic but does not update the golden tests, or changes a template without updating the expected output, Danger flags it. Beyond that, Danger:

- Categorizes all file changes into scopes (source, tests, golden-tests, docs, CI, build, tooling) and generates a summary table showing churn per scope
- Validates commit messages against the Conventional Commits format
- Warns when a single commit exceeds 2,000 lines of source churn (encouraging smaller, reviewable slices)
- Flags mismatched commit types (e.g., a feat: commit that only touches test files suggests test: instead)
- Rejects PR descriptions under 40 characters
- Ignores the test check for refactor-only PRs where the tests are expected to remain unchanged

Even when there are no warnings, the scope summary table gives reviewers an immediate sense of what a large PR touches.
On a PR that changes 500 lines of source and 3,000 lines of golden tests, the table makes it clear that the bulk of the diff is expected test output, not new logic.

Danger.js commits:
- 6f5f6e9 ci: setup danger.js
- 5429b2e ci(danger): align report table and add top-files summary
- 240921d ci(danger): split PR target ci workflows
- 08c46b6 ci(danger): correct file delta calculation in reports
- 2cfd081 ci(danger): adjust large commit threshold
- 71845c8 ci(danger): map root files into explicit scopes
- 17b0a57 ci(danger): ignore test check for refactor-only PRs
- 6481fd3 ci(danger): simplify CI naming
- fd7d248 ci(danger): omit empty sections from report
- 7502961 ci(danger): categorize util/bootstrap as build scope
- 57e191e ci(danger): better markdown format

CI Infrastructure

We integrated Codecov for tracking test coverage across PRs and switched from GCC to Clang for coverage (more accurate AST-based measurement). CI speed was a recurring concern: we skipped remote documentation generation on PRs, sped up release demos, and skipped long tests that were not catching new bugs. LLVM cache keys were unified to avoid redundant builds, and CTest timeouts were increased for sanitizer jobs that run significantly slower. Matheus Izvekov contributed the Clang coverage switch, fixed an infinite recursion in extraction, and moved the project to use system libs by default.
CI infrastructure commits:
- ed6b3bc ci: add codecov configuration
- 5426a0a ci: use clang for coverage
- d629173 fix(ci): unify redundant LLVM cache keys
- 36a3b51 ci: update actions to v1.9.1
- 7b2103a ci: increase CTest timeout for MSan jobs
- 086becc ci: increase the ctest timeout to 9000
- adb6821 ci(cpp-matrix): remove the optimized-debug factor
- 9507a38 ci: simplify CI workflow and upgrade cpp-actions to @develop
- 9a5bd3c ci: skip remote documentation generation on PRs
- 637011f ci: detect and report demo generation failures
- 084322d ci: speed up release demos on PRs
- 471951d ci: skip long tests to speed up CI
- a5f160b ci: increase test coverage for the new XML generator
- b1fc43c ci: exclude Reflection.hpp from coverage
- a1f9a82 ci: accept any g++-14 version
- c136a46 ci(website): preserve roadmap directory during deployment
- 4763d86 revert(ci): remove premature roadmap report step
- 3462996 ci: revert coverage changes
- 8b2c3e9 ci: align llvm-sanitizer-config with archive basename
- fdff573 ci: gitignore CI node_modules
- 757d446 fix(ci): update the fmt branch reference from master to main
- a3366b0 fix(ci): name rolling release packages after the branch

Test Infrastructure

MrDocs uses golden tests: the expected output for every test case is stored as a file, and the test runner compares the actual output against it. The most important change was adding multipage golden tests. Previously, all golden tests were single-page, but many bugs only manifested in multi-page output (cross-references between pages, navigation links, index generation). We were missing these entirely because we had no way to test them. We also added output normalization (so platform differences do not cause false failures) and regression categories so tests can be grouped and run selectively. A run_ci_with_act.py script lets contributors run the full CI pipeline locally using act.
Test infrastructure commits:
- bf78b1b test: support multipage golden tests
- d7ad1ce test: output normalization
- ccd7f71 test: check int tests results in ctest
- 681b0cd chore: assign categories to regression tests
- 9146125 test: cover additional paths in DocCommentFinalizer.cpp
- 8326417 test: run_ci_with_act.py script
- 5527e9c test: testClang_stdCxx default is C++26
- 0dfdb02 test: --bad is disabled by default

Acknowledgments and Reflections

Going into the wild changed MrDocs. The edge cases, the customization requests, and the integration feedback shaped the direction more than any internal roadmap could. Gennaro Prota drove the reflection integration that reduces the maintenance burden across the entire codebase. Matheus Izvekov hardened CI with coverage, sanitizers, and warnings-as-errors, and migrated dependency management to the bootstrap script. Julio Estrada and Robert Beeston delivered the polished public face of MrDocs. Agustín Bergé contributed AST and metadata fixes, including base member shadowing and alias SFINAE detection. Jean-Louis Leroy provided detailed feedback from Boost.OpenMethod that drove multiple improvements.

The most requested feature we have not solved yet is macro support (#1127). Macros are expanded before parsing and do not appear in the AST; supporting them would require preprocessor-level integration with Clang. The work ahead also includes Lua scripting, metadata transforms, and deeper reflection, all direct responses to what users told us they need.

The biggest lesson from this period is that the problems worth solving are the ones users bring. We spent time on an unknowns framework to decide what to explore, but the most impactful work came from people who showed up with a broken demo page, a missing breadcrumb, or a duplicate ellipsis in their generated docs. The complete set of changes is available in the MrDocs repository.

📝 The C++ Alliance

Thursday, April 23, 2026

The CLion 2026.2 Roadmap: Simplified Debugger Configuration and the Ability to Use Multiple Zephyr West Profiles

We’ve begun work on our next major release, version 2026.2, which we plan to introduce in a few months. After reviewing your feedback and our strategic goals, we’ve decided to focus on improving build tools, including Bazel, as well as project formats, the embedded experience, and the debugger. Here are our more specific priorities: Read […]

📝 CLion : A Cross-Platform IDE for C and C++ | The JetBrains Blog

Wednesday, April 22, 2026

C++ Code Intelligence for GitHub Copilot CLI (Preview)

We recently brought C++ code understanding tools to GitHub Copilot in Visual Studio and VS Code. These tools provide precise, semantic understanding of your C++ code to GitHub Copilot using the same IntelliSense engine that powers code navigation in the IDE. Until now, these capabilities have been tied to GitHub Copilot in Visual Studio and […] The post C++ Code Intelligence for GitHub Copilot CLI (Preview) appeared first on C++ Team Blog .

📝 C++ Team Blog

Tuesday, April 21, 2026

CapyPDF is approaching feature sufficiency

In the past I have written many blog posts on implementing various PDF features in CapyPDF. Typically they explain the feature being implemented, how confusing the documentation is, what perverse undocumented quirks one has to work around to get things working, and so on. To save the effort of me writing and you reading yet another post of the same type, let me just say that you can now use CapyPDF to generate PDF forms that have widgets like text fields and radio buttons. What makes this post special is that forms and widget annotations were pretty much the last major missing PDF feature. Does that mean that it supports everything? No. Of course not. There is a whole bunch of subtlety to consider. Let's start with the fact that the PDF spec is massive, close to 1000 pages. Among its pages are features that are either not used or have been replaced by other features and deprecated. The implementation principle of CapyPDF thus far has been "implement everything that needs special tracking, but only to the minimal level needed". This seems complicated but is in fact quite simple. As an example, the PDF spec defines over 20 different kinds of annotations. Specifying them requires tracking each one and writing out appropriate entries in the document metadata structures. However, once you have implemented that for one annotation type, the same code will work for all annotation types. Thus CapyPDF has only implemented a few of the most common annotations, and the rest can be added later when someone actually needs them. Many objects have lots of configuration options which are defined by adding keys and values to existing dictionaries. Again, only the most common ones are implemented; the rest are mostly a matter of adding functions to set those keys. There is no cross-referencing code that needs to be updated or so on.
If nobody ever needs to specify the color with which a trim box should be drawn in a prepress preview application, there's no point in spending effort to make it happen. The API should be mostly done, especially for drawing operations. The API for widgets probably needs to change, especially since form submission actions are not done. I don't know if anything actually uses those, though. That work can be done based on user feedback.📝Nibble Stew
Meeting C++ 2025

I have been meaning to write up this conference for a very long time. The call for papers for this year (2026) runs until 4th June. Here's the link: https://meetingcpp.com/mcpp/submittalk/ . So, last year's conference was great, as ever. Anthony Williams gave the opening keynote, called "Software and Safety". I had to look that up, because my notes say "Think Harder/Think outside the box". His talk encouraged us to do just that. He discussed UB, pointing out that it often comes from unexpected events, and talked through how to avoid the unexpected. He pointed out validation is a design problem, and used the analogy of Swiss cheese: you can have several holes, but with a thick enough slice of cheese, you won't be able to see right through it. So, having several layers of testing helps. For example, in addition to unit tests, using sanitizers and fuzz testing is useful. To continue this theme, I went to see Anders Schau Knatten talking about "Real-time Safety - Guaranteed by the Compiler!" next. He went through Clang's non-blocking and non-allocating attributes, which I've not used before. He talked about real-time sanitizers (RTSan) too. He always finds neat, short examples that stick in my head for a while afterwards. I haven't got around to trying these out yet - I really should. After lunch I went to "Using std::generator<> in Practice" by Nicolai Josuttis. He talked about using them to write state machines and showed how much less boilerplate code you need to swap between states. As ever, he gave simple examples and pointed out various important things to be aware of, like getting in trouble when using references. Later I went to see Anders again, talking about "The Two Memory Models". He explained atomics and memory ordering. My notes make no sense - I'll have to rewatch the talk. I do remember coming away and feeling like this had made more sense than usual though. Memory ordering is a deep topic. I gave the "Center keynote" the next day.
I used the title "Stoopid questions", and dug into why people are often shy about speaking up when they don't understand something. There is, after all, no such thing as a stupid question. I suspect many people end up mentoring or training others and haven't been taught how to teach. I am a trained secondary school maths teacher, so have had some input on this. Teaching is hard work, and can be emotionally draining. But learning is draining too. If you can find engaging or fun examples, that helps. I got loads of questions afterwards, so I clearly got people thinking. I went to "The Missing Step: Making Data Oriented Design One Million Times Faster" by Andrew Drakeford next. I've seen a variant of this before, but he had added more details and things to think about. The punchline is that optimising code might require a bit of thinking and some maths :-) I don't have notes after that. My head was still spinning after giving my talk. But I did take photos. I went to "(Don't) use coroutines for this" by Ivan Cukic on the last day. He talked about "prog" C++ a while ago, so talked about "punk" C++ this time. For example, trying to get a stack-based, rather than heap-based, coroutine, and various other ways to subvert the original intentions of this relatively new language feature. Lots of fun, and lots of C++ covered, beyond just coroutines. If you can make people laugh by being slightly subversive, you are winning. Steve Love gave a talk called "CMake for the impatient" later. I can use CMake but don't have it 100% clear in my head. Many people give in-depth talks, so it was nice to see a more beginner-level one for a change. We've all got gaps in our knowledge, after all. If you don't know CMake, this talk is a good place to start. I did go to several other talks, but I'll leave you to watch the videos. One last talk I'll mention is a lightning talk, by Rahel Natalie Engel. 
It was called "Let them eat cake", wherein she introduced a language based on C++ designed for teaching, called "Cat Pie". The talk is here: https://www.youtube.com/watch?v=gQ6grpbhW8k and you can try it out online here: https://catpie.compscicomp.de/ . The idea is to get a cat to move around a maze and eat some pie. The "Help" button takes you to a cheat sheet, and you can read more details here: https://dl.gi.de/server/api/core/bitstreams/d10b2056-9d7b-4552-bac7-a510dd2522e3/content . You can find the conference videos on YouTube . If you can't think of a talk to propose, you could volunteer instead. Or get someone to buy a ticket for you for this year. If you need help persuading work to let you go, reach out to someone to get help explaining why going to a conference like this is such a great learning opportunity. I really should get on with my proposal for this year now.📝BuontempoConsulting
Boost.URL: Audited, Constexpr, and Polished
We had been putting off the Boost.URL security review for a while. There was always something more urgent. When the review finally happened, it confirmed what we hoped: the core parsing logic held up well. Around the same time, a constexpr feature request that we had been dismissing suddenly became a cross-library collaboration when other Boost maintainers started applying changes to their own libraries. And while working on Boost.Beast2 integration, we noticed friction in common URL operations that led us to clear a backlog of usability improvements. Security Review Round 1: 1,207 Findings (February 2, 2026) Round 2: 27 Findings (February 17, 2026) Round 3: 15 Findings (April 2, 2026) Compile-Time URL Parsing The Conversation That Changed Everything Error Handling at Compile Time The -Wmaybe-uninitialized Problem The Shared Library Problem The Result Usability Improvements Convenience Functions C++20 Integration Performance Acknowledgments and Reflections
Security Review
The C++ Alliance arranges professional security audits for the libraries we maintain. The results for Boost.Beast (2020) and Boost.JSON (2021) are publicly available. For Boost.URL, we always had the plan but kept delaying because there was so much other work to do first. That delay turned out to be a good thing: we found and fixed issues ourselves first, so the reviewers could focus on the subtle problems. Laurel Lye Systems Engineering conducted three rounds of assessment. Each finding was manually reviewed against the source code and categorized as a confirmed bug (fixed), a false positive, or a deliberate design choice. For every confirmed bug, we also proposed new test cases to prevent regressions.
Round 1: 1,207 Findings (February 2, 2026)
The first assessment was the broadest. Of 1,207 findings, 15 were confirmed bugs resulting in fix commits. 
The vast majority were false positives or by-design patterns:

Verdict          CRITICAL  HIGH  MEDIUM  LOW  INFO  Total
FIXED                   1     9       0    2     3     15
FALSE POSITIVE          3    47      46  186   110    392
BY DESIGN               0   129     445  170    56    800
Total                   4   185     491  358   169  1,207

The single CRITICAL fix was a loop condition in url_base that dereferenced *it before checking it != end. Three other CRITICAL findings were false positives: the audit flagged raw-pointer writes in the format engine, but these use a two-phase measure/format design that guarantees the buffer is pre-sized correctly. Most false positives fell into recognizable themes:

BOOST_ASSERT as sole bounds check (29 HIGH findings): internal _unsafe functions rely on preconditions validated by the public API. The _unsafe suffix signals the contract. This is the standard Boost/STL pattern (std::vector::operator[] vs at()).

Non-owning view lifetime safety (27 HIGH findings): string_view and url_view types do not own their data. The audit flagged potential use-after-free, but lifetime management is the caller’s responsibility by design.

Atomic reference counting (multiple findings across all rounds): the audit tool did not recognize the #ifdef BOOST_URL_DISABLE_THREADS guard that switches between std::atomic and plain std::size_t.

Round 1 fix commits
bcdc891 CRITICAL: url_base loop condition order
ec15fce HIGH: encode() UB pointer arithmetic for small buffers
81fcb95 HIGH: LLONG_MIN negation UB in format
42c8fe7 HIGH: ci_less::operator() return type
76279f5 HIGH: incorrect noexcept in segments_base::front() and back()
d4ae92d HIGH: recycled_ptr::get() nullptr when empty
8d98fe6 LOW: decode() noexcept on throwing template

The proportion of false positives to confirmed bugs was large enough that we discussed a second round with Laurel Lye, where we shared the false positive categories we had identified so they could be more targeted.
Round 2: 27 Findings (February 17, 2026)
The second assessment was more targeted. 
The reviewers had learned from our Round 1 triage and produced fewer false positives:

Verdict         HIGH  MEDIUM  LOW  INFO  Total
FIXED              7       3    1     1     12
FALSE POSITIVE     2       2    0     0      4
BY DESIGN          0       0    1     1      2
ALREADY FIXED      0       5    4     0      9
Total              9      10    6     2     27

9 of the 27 findings had already been fixed in Round 1 commits. The new confirmed bugs included a heap overflow in format center-alignment padding (lpad = w / 2 used total width instead of padding amount), an infinite loop in decode_view::ends_with with empty strings, and an OOB read in ci_is_less on mismatched-length strings. Both rounds are tracked in PR #982.

Round 2 fix commits
d06df88 HIGH: format center-alignment padding
4fe2438 HIGH: decode_view::ends_with with empty string
f5727ed HIGH: stale pattern n.path after colon-encoding
d045d71 HIGH: ci_is_less OOB read
88efbae HIGH: recycled_ptr copy self-assignment
fe4bdf6 MEDIUM: url move self-assignment
ab5d812 MEDIUM: encode_one signed char right-shift
b662a8f MEDIUM: encode() noexcept on throwing template
5bc52ed LOW: port_rule has_number for port zero at end of input
9c9850f INFO: ci_equal arguments by const reference
4f466ce test: public interface boundary and fuzz tests

Round 3: 15 Findings (April 2, 2026)
The third round was the shortest and the most precise. Of 15 findings, 4 were confirmed bugs and 11 were false positives. No CRITICAL findings. The false positives were the same recurring themes (atomic refcounting, pre-validated format strings, preconditions guaranteed by callers).

Verdict         HIGH  MEDIUM  LOW  Total
FIXED              0       1    3      4
FALSE POSITIVE     4       6    1     11
Total              4       7    4     15

The confirmed bugs were more subtle: a decoded-length calculation error in segments_iter_impl::decrement() that only manifested during backward iteration over percent-encoded paths, two noexcept specifications on functions that allocate std::string (which can throw bad_alloc), and a memcpy with null source when size is zero (undefined behavior per the C standard, even though it copies nothing). 
This round is tracked in PR #988.

Round 3 fix commits
3ca2d71 MEDIUM: segments_iter_impl decoded-length in decrement()
b1f6f8e LOW: param noexcept on throwing constructor
d42c748 LOW: string_view_base noexcept on throwing operator std::string()
f963383 LOW: url_view memcpy with null source when size is zero

The progression from 1,207 findings to 27 to 15 shows the reviewers learning the peculiarities of our codebase. The ratio of false positives dropped with each round, and the confirmed bugs got more subtle. %%{init: {"theme": "base", "themeVariables": {"primaryColor": "#e4eee8", "primaryBorderColor": "#affbd6", "primaryTextColor": "#000000", "lineColor": "#baf9d9", "secondaryColor": "#f0eae4", "tertiaryColor": "#ebeaf4", "fontSize": "14px"}}}%% mindmap root((Confirmed Bugs)) UB in edge cases encode_one right-shift LLONG_MIN negation pointer arithmetic Self-assignment url move recycled_ptr copy OOB reads ci_is_less decode_view ends_with Incorrect noexcept encode / decode segments_base front/back param constructor string_view_base operator Iterator bugs segments decoded-length Null pointer recycled_ptr get url_view memcpy
Compile-Time URL Parsing
constexpr URL parsing has been one of the most recurring requests since the library’s inception. Every few months someone would ask about it, and every few months we would decide the refactoring cost was too high. The parsing engine is heavily buffer-oriented, and moving enough code into headers for constexpr evaluation required careful refactoring without breaking the shared library build. When we finally prototyped it, the diff touched thousands of lines, but most of those were code being moved from source files to headers rather than new logic. The actual new code was limited to alternative code paths to bypass non-literal types and refactoring url_view_base to eliminate a self-referencing pointer that prevented constexpr evaluation. 
Still, given the size of the change, we initially marked it as unactionable and moved on to the security review. On top of the refactoring cost, there were blockers outside our control. Our parsing code depended on boost::optional (not a literal type, no constexpr constructors), boost::variant2 (not literal when containing optional), and boost::system::result (could not be constructed with a custom error_code in constexpr because error_category::failed() is virtual). Without changes to those libraries, constexpr URL parsing was not possible regardless of how much we refactored our own code.
The Conversation That Changed Everything
Then Peter Dimov, the maintainer of Boost.System and Boost.Variant2, joined the conversation. We had assumed that system::result could not be constexpr in C++14 because it wraps error_code, which uses virtual functions. Peter pointed out that system::result is already a literal type in C++14 when T is literal and the error code is not custom. Boost.URL uses a custom error code category, and constructing a result from a custom error_code requires calling error_category::failed(), which is virtual and therefore not constexpr before C++20. Peter offered to fix this in Boost.System (#141, af53f17) for C++20 so that custom error codes would also work at compile time.
Allowing constexpr virtual functions in C++20
Peter Dimov is also one of the authors of P1064: “Allowing Virtual Function Calls in Constant Expressions”, the C++ committee proposal that made constexpr virtual functions possible in C++20. The paper uses error_code and error_category as the motivating example.
That shifted the problem. Instead of building our own constexpr_result type to bypass the entire error handling system, we could use system::result directly in C++20. The scope of the refactoring shrank, and we focused on C++20 as the initial target. The remaining blocker was that system::result requires T to be a literal type, and we use boost::optional heavily in our parsing code. 
boost::optional was not a literal type. Andrzej Krzemieński, the Boost.Optional maintainer, started working on it. The conversation went back and forth on the C++14 constraints: std::addressof is not constexpr until C++17, mandatory copy elision is only available in C++17, and there were questions about what subset of constructors could realistically become constexpr in C++14. After several iterations (including a feature/constexpr branch), the constexpr implementation landed on develop. With optional becoming literal, boost::variant2 containing optional could also become literal. All three blockers were now resolved. Peter had fixed Boost.System, Andrzej had fixed Boost.Optional, and we contributed fixes to Boost.Variant2. There was no going back: we could no longer dismiss the constexpr feature after three library maintainers had already done their part. %%{init: {"theme": "base", "themeVariables": {"primaryColor": "#f7f9ff", "primaryBorderColor": "#9aa7e8", "primaryTextColor": "#1f2a44", "lineColor": "#b4bef2", "secondaryColor": "#fbf8ff", "tertiaryColor": "#ffffff", "fontSize": "14px"}}}%% flowchart TD A[Boost.URL constexpr parsing] --> B[Boost.Optional] A --> C[Boost.Variant2] A --> D[Boost.System] B --> E[boost::optional constexpr] C --> F[boost::variant2::variant constexpr] D --> G[boost::system::result constexpr] D --> H[boost::system::error_code constexpr]

Cross-library commits for constexpr support

Boost.URL (PR #976, PR #981)
0a2c39f feat: constexpr URL parsing for C++20
b9db439 build: remove -Wno-maybe-uninitialized from GCC flags (see below)
59b4540 fix: suppress GCC false-positive -Wmaybe-uninitialized in tuple_rule (see below)

Boost.Optional (issue #143, PR #145)
3df2337 make optional constexpr in C++14
046357c add more robust constexpr support
88e2378 add -Wmaybe-uninitialized pragma (see below)

Boost.Variant2 (PR #57)
b6ce8ac add missing -Wmaybe-uninitialized pragma (see below)

Boost.System (issue #141)
af53f17 add constexpr to virtual functions on 
C++20 or later

Error Handling at Compile Time
Boost.URL attaches source location information to error codes for better diagnostics at runtime. In a constexpr context, BOOST_CURRENT_LOCATION is not available, so the BOOST_URL_CONSTEXPR_RETURN_EC macro branches on __builtin_is_constant_evaluated(): at compile time it returns the error enum directly, at runtime it attaches the source location.

#if defined(BOOST_URL_HAS_CXX20_CONSTEXPR)
# define BOOST_URL_CONSTEXPR_RETURN_EC(ev) \
    do { \
        if (__builtin_is_constant_evaluated()) { \
            return (ev); \
        } \
        return [](auto e) { \
            BOOST_URL_RETURN_EC(e); \
        }(ev); \
    } while(0)
#endif

The -Wmaybe-uninitialized Problem
GCC’s -Wmaybe-uninitialized flagged code inside boost::optional and boost::variant2 union storage constructors. The root cause was neither library. Boost.URL’s parsing code constructs a variant2::variant that contains an optional alternative. At -O3, GCC inlines the entire chain: parse function → variant construction → variant2 storage → optional storage → union constructor. After inlining, GCC sees a union with a dummy_ member and a value_ member, and it cannot prove which member is active. It conflates the “uninitialized dummy” path with the “initialized value” path. The in_place_index_t dispatch guarantees which member is initialized, but GCC’s data flow analysis loses track across the nested layers. -fsanitize=address makes it worse by changing inlining thresholds. The compiler blames the wrong library: the root cause is in variant2’s union storage, but when variant2 contains an optional, GCC reports the warning in optional’s code. The pragma has to go where GCC reports it, not where the issue originates. We contributed pragmas to both Boost.Optional and Boost.Variant2, and replaced Boost.URL’s blanket -Wno-maybe-uninitialized flag with targeted pragmas. 
This particular false positive requires GCC 14+, -O3, ASan, on x86_64 Linux, with a variant2::variant containing a boost::optional, constructed through a system::result dereference. Change any one of those conditions and the warning disappears. This leaves an open question for the Boost ecosystem: when a false positive surfaces because library A’s optimizer behavior interacts with library B’s union storage and gets reported in library C’s code, who is responsible for the pragma? For now, we placed pragmas where GCC reports the issue, but the underlying problem recurs every time a new combination of types triggers the same inlining pattern.
The Shared Library Problem
Making URL parsing constexpr means the parsing functions must be available in headers. But Boost.URL is a compiled library, and on MSVC, __declspec(dllexport) on a class exports all members, including inline and constexpr ones. This causes LNK2005 (duplicate symbol) errors for any class that mixes compiled and header-only members. Each class must follow exactly one of two policies:

(a) Fully compiled: class BOOST_URL_DECL C. All members in .cpp files. No inline or constexpr members.
(b) Fully header-only: class BOOST_SYMBOL_VISIBLE C. All inline/constexpr/template. No .cpp file.

We documented the full rationale in config.hpp. We suspect other C++ libraries have not encountered this because they either do not test shared library builds as extensively as we do, or they are header-only.
The Result
Boost.URL can now parse URLs at compile time under C++20 (PR #976). All parse functions (parse_uri, parse_uri_reference, parse_relative_ref, parse_absolute_uri, and parse_origin_form) are fully constexpr. A malformed URL literal becomes a compile error rather than a runtime failure:

// Parsed and validated at compile time.
// A malformed literal would fail to compile.
constexpr url_view api_base = parse_uri("https://api.example.com/v2").value();

Pre-parsed constexpr URL views also serve as zero-cost constants: because all parsing happens during compilation, components like scheme, host, and port are available at runtime with no parsing overhead. This is useful for applications that compare against well-known endpoints, pre-populate configuration defaults, or build routing tables without paying for string parsing at startup. The constexpr feature taught us that dismissing a request because the cost seems too high for one library misses the bigger picture. Once Peter Dimov and the other maintainers got involved, the cost was shared and the scope shrank. In the Boost ecosystem, a feature that seems expensive in isolation can become practical when the dependencies cooperate.
Usability Improvements
While integrating Boost.URL into Boost.Beast2, the Beast2 authors noticed friction in common operations that worked correctly but required more code than they should. At the same time, several community issues had been open for a while. We used this as an opportunity to address both.
Convenience Functions
The most requested feature was get_or for query containers: look up a query parameter by key and return a default value if it is not present.

Before:
auto it = url.params().find("page");
auto page = it != url.params().end() ? (*it).value : "1";

After:
auto page = url.params().get_or("page", "1");

We also added standalone decode functions for working with individual URL components without constructing a full URL object:

auto plain = decode("My%20Stuff");
assert(plain && *plain == "My Stuff");
auto n = decoded_size("Program%20Files");
assert(n && *n == 13);

C++20 Integration
enable_borrowed_range is now specialized for 10 Boost.URL view types (segments_view, params_view, decode_view, and others). Unlike a std::vector, which owns its data, Boost.URL views point into the URL’s buffer without owning it. 
When a temporary view is destroyed, its iterators still point to valid memory. enable_borrowed_range tells the compiler this is safe, so algorithms like std::ranges::find can return iterators from temporary views without the compiler rejecting the code:

segments_view::iterator it;
{
    segments_view ps("/path/to/file.txt");
    it = ps.begin();
} // iterator is still valid (points to external buffer)
assert(*it == "path");

The grammar system gained user-provided RangeRule support. Custom grammar rules for parsing URL components satisfy a concept requiring first() and next() methods returning system::result<value_type>:

struct my_range_rule
{
    using value_type = core::string_view;

    system::result<value_type>
    first(char const*& it, char const* end) const noexcept;

    system::result<value_type>
    next(char const*& it, char const* end) const noexcept;
};

The motivation was performance and API clarity (#943). Previously, grammar::range always type-erased the rule through a recycled_ptr with string storage. Stateless rules were paying for storage they did not need. With user-provided RangeRule, range detects empty rules and avoids the type-erasure overhead entirely.
Performance
Component offsets in url_impl changed from size_t to uint32_t, reducing the size of every URL object on 64-bit platforms. The maximum URL size is capped at UINT32_MAX - 1 (enforced by a static_assert). Constructing a segments_view or segments_encoded_view from a URL is now a constant-time operation: offsets are computed directly from iterator indices without scanning the path. 
Other improvements

Fixes
a87998a params_iter_impl::decrement() computed incorrect decoded key/value sizes when a query parameter’s value contains literal = characters (PR #978, #972)
60c281a decode_view::remove_prefix/remove_suffix asserted n <= size() instead of preventing undefined behavior (PR #978, #973)
01e0571 decode_view was forward-declared but not complete when pct_string_view::operator*() was declared (PR #963)
cbaf493 parse_query guard for empty string_view inputs from null data (PR #949)
161cf73 example router is now move-only (PR #959)
13f0110 natvis: add visualizers for segments (PR #962)

Refactors
e809ee4 token_rule_t now uses the empty base optimization via empty_value and provides conditional default construction (PR #964)

Documentation
32c3ddc new design rationale page
000476c restore library-detail.adoc with shorter description
Legacy QuickBook documentation removed in favor of Antora-based docs
8c7c4c7 plus scheme convention documented
6d396a4 format examples show full URL
e4e6644 SVG diagrams with medium brightness backgrounds
c93553c simplify SVG documentation images
e618e69 avoid shadow warnings while improving param_view docs
4f63aea antora-downloads-extension integration
7f08ce2 update antora extensions
67bcd2d build script sets root dirs
888cd8c MrDocs-generated tagfiles for cross-referencing with other Boost libraries

Tests
e946887 URL with ? in query string (PR #978, #926)
3228399 URL natvis instantiations

Most of these improvements came from real usage. The Beast2 integration exposed friction that we would not have found from inside the library, and the community issues represented patterns that multiple users had independently hit. The best usability feedback comes from people who are actually building something with the library. 
Acknowledgments and Reflections The constexpr work benefited from the contributions of Peter Dimov (Boost.System, Boost.Variant2) and Andrzej Krzemieński (Boost.Optional), who applied fixes to their libraries so that Boost.URL could proceed. The Beast2 usability feedback came from the Beast2 authors as they integrated Boost.URL into the new design. The work on Boost.URL has shifted. The problems we are solving now (edge cases found by professional auditors, compiler limitations for constexpr, usability friction from real integrations) are different from the problems we used to solve. They are smaller and more specific, but they matter more because real people hit them. The complete set of changes is available in the Boost.URL repository.📝The C++ Alliance

Monday, April 20, 2026

My BeCPP talk video posted: “C++ — Growing in a world of competition, safety, and AI”
BeCPP just posted this video of my talk at their March 30 Symposium. This is the first time I’ve given this material on camera — it’s an extension of themes in my New Year’s Eve blog post, with major updates because some big industry changes happened in the first quarter of 2026. This talk is different … Continue reading My BeCPP talk video posted: “C++ — Growing in a world of competition, safety, and AI” →📝Sutter’s Mill

Sunday, April 19, 2026