Devirtualize storage hot path: mark MemoryStorage/OfflineStorageHandler final + TU-local statics#1488
Devirtualize storage hot path: mark MemoryStorage/OfflineStorageHandler final + TU-local statics#1488bmehta001 wants to merge 6 commits into
Conversation
Mark the leaf concrete implementations of IHttpClient and IOfflineStorage final
so the compiler can devirtualize (and often inline) calls made through them:
HttpClient_{Curl,WinInet,WinRt,Apple,Android,CAPI} and OfflineStorage_Room /
MemoryStorage / OfflineStorageHandler.
Each was verified to have no subclass anywhere in the tree (lib + tests +
wrappers). Deliberately NOT marked:
- OfflineStorage_SQLite -- tests/unittests/OfflineStorageTests_SQLite.cpp
subclasses it (OfflineStorage_SQLiteNoAutoCommit).
- TelemetrySystemBase -- base of TelemetrySystem / AITelemetrySystem.
Validated: NDK aarch64 -fsyntax-only on OfflineStorage_Room, OfflineStorageHandler
and MemoryStorage.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This PR optimizes hot-path virtual dispatch by marking leaf concrete implementations of the IHttpClient and IOfflineStorage interfaces as final, enabling better compiler devirtualization/inlining in the SDK’s HTTP upload and offline-storage code paths.
Changes:
- Mark concrete HTTP client implementations (
HttpClient_*) asfinal. - Mark concrete offline storage implementations/handler (
OfflineStorage_*,MemoryStorage,OfflineStorageHandler) asfinal.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| lib/offline/OfflineStorageHandler.hpp | Marks OfflineStorageHandler as final to enable devirtualization through known concrete type usage. |
| lib/offline/OfflineStorage_Room.hpp | Marks OfflineStorage_Room as final to allow devirtualization in Android Room-backed storage path. |
| lib/offline/MemoryStorage.hpp | Marks MemoryStorage as final to enable devirtualization for in-memory storage operations. |
| lib/http/HttpClient_WinRt.hpp | Marks HttpClient_WinRt as final to improve call-site optimization on WinRT HTTP path. |
| lib/http/HttpClient_WinInet.hpp | Marks HttpClient_WinInet as final to improve call-site optimization on WinInet HTTP path. |
| lib/http/HttpClient_Curl.hpp | Marks HttpClient_Curl as final to improve call-site optimization on Curl HTTP path. |
| lib/http/HttpClient_CAPI.hpp | Marks HttpClient_CAPI as final to improve call-site optimization for C API-backed HTTP client. |
| lib/http/HttpClient_Apple.hpp | Marks HttpClient_Apple as final to improve call-site optimization on Apple HTTP path. |
| lib/http/HttpClient_Android.hpp | Marks HttpClient_Android as final to improve call-site optimization on Android HTTP path. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Companion to the `final` devirtualization in this PR: give internal linkage to translation-unit-local symbols so the compiler can inline / drop them and keep them out of the (static-archive and shared-object) symbol table. This helps the static-lib consumption path that -fvisibility=hidden does not fully cover, since hidden visibility only trims the dynamic export table while these symbols keep external linkage across TUs. lib/api/capi.cpp: mark the 10 file-local C-API dispatch helpers static (remove_client, mat_open_core, mat_open, mat_open_with_params, mat_log, mat_close, mat_pause, mat_resume, mat_upload, mat_flushAndTeardown). Verified each is called only from the single exported entry point evt_api_call_default (and each other) within capi.cpp, and appears in no header and no other translation unit. capi_get_client stays external (it is MAT::capi_get_client, declared in a header and used by the CAPI HTTP client). Consistent with the file's existing static mtx/clients. lib/shared/dllmain.cpp: mark thread_count static -- a file-scope mutable global mutated only inside DllMain in this TU. get_platform_uuid (sysinfo_sources.cpp) was deliberately NOT touched: it has no caller in-tree, so marking it static would trip -Wunused-function under -Werror. Verified: NDK clang aarch64 -fsyntax-only on capi.cpp is clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
I believe most of our storage/upload calls go through My only concern is the tradeoff here: if this does not actually devirtualize the main hot paths, we may be restricting downstream/internal users from subclassing these concrete types without much benefit. |
Addresses Copilot review comment. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per review feedback (lalitb), measured the devirtualization effect of the
'final' markers (vendored Linux shared libmat.so, -O2 -ffunction-sections
-fvisibility=hidden -Wl,--gc-sections, no LTO), base vs this PR:
- Final binary size: NO change (.text byte-size identical). Devirtualization
swaps an indirect call for a same-width direct call -- perf, not size.
- 'final' devirtualizes only where the concrete type is known: object-level
indirect-call counts dropped in OfflineStorageHandler.o (70->64) and
MemoryStorage.o (8->3) -- a class calling its own virtual methods and
OfflineStorageHandler's concrete MemoryStorage member. Call sites routed
through IHttpClient/IOfflineStorage base references were unchanged
(LogManagerImpl/TelemetrySystem/OfflineStorage_SQLite/HttpClient_Curl TUs
byte-identical).
So 'final' on the HttpClient_* clients and OfflineStorage_Room bought no
measured devirtualization while still restricting subclassing. Drop it from
those; keep it only on MemoryStorage and OfflineStorageHandler (internal
lib/offline impl types, not extension points), where it does help. The
internal-linkage (static) cleanups in capi.cpp and dllmain.cpp are orthogonal
and retained.
Files: lib/http/HttpClient_{Android,Apple,CAPI,Curl,WinInet,WinRt}.hpp,
lib/offline/OfflineStorage_Room.hpp
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
@lalitb, You are right. Devirtualization is only happening on concrete-typed call sites. Object-level indirect-call counts: My main reason for my recent series of PRs is performance (slight impact) and decreasing binary size (no impact), so none of this PR is actually necessary. I will narrow this PR to keep |
Newer Clang (macOS-latest, LLVM 19+) enables -Wunnecessary-virtual-specifier, which errors under -Werror when a 'virtual' method that does not override a base method lives inside a 'final' class (it can never be overridden). Marking OfflineStorageHandler and MemoryStorage final left three such methods (DeleteRecordsByKeys, isKilled, GetReservedCount) still declared virtual, breaking the macOS debug build. Remove the now-redundant virtual specifier; these become non-virtual, consistent with the devirtualization goal. No subclasses or overrides of these methods exist. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Devirtualization: mark the two storage hot-path classes
finalMarks the concrete leaf storage impls
MemoryStorageandOfflineStorageHandlerfinalso the compiler can devirtualize (and often inline) virtual calls where theconcrete type is known.
Measured effect (this informed the scope)
Built vendored Linux shared
libmat.sowith-O2 -ffunction-sections -fdata-sections -fvisibility=hidden -Wl,--gc-sections(no LTO), base vs this PR:Final binary size: no change. Stripped
libmat.soand.textare identical insize (186,514 B
.texton both). Devirtualization swaps an indirectcall *(vtable)for a same-width direct
call rel32, so it is a perf change, not a size one.finalonly devirtualizes concrete-typed call sites (object-level indirect-callcounts):
OfflineStorageHandler.oMemoryStorage.oOfflineStorage_SQLite.o,LogManagerImpl.o,TelemetrySystem.o,HttpClient_Curl.oThe eliminated calls are a
finalclass calling its own virtual methods (e.g.OfflineStorageHandler→DeleteRecords/ReleaseRecords/DeleteRecordsByKeys/Flush)and
OfflineStorageHandler's concreteMemoryStoragemember. Calls routed throughIHttpClient/IOfflineStoragebase references are not devirtualized.Scope
Per review feedback,
finalis kept only onMemoryStorageandOfflineStorageHandler— internal
lib/offline/impl types (not public extension points) — where it measurablydevirtualizes. It was dropped from
HttpClient_Curl/WinInet/WinRt/Apple/Android/CAPIand
OfflineStorage_Room, where it bought no measured devirtualization while stillrestricting subclassing.
Deliberately not marked:
OfflineStorage_SQLite(subclassed byOfflineStorage_SQLiteNoAutoCommitin tests),TelemetrySystemBase(base ofTelemetrySystem/AITelemetrySystem).Internal linkage (
static) — retainedOrthogonal hygiene: give internal linkage to translation-unit-local symbols so the
compiler can inline/drop them and keep them out of the symbol table. This helps the
static-library consumption path (e.g. vcpkg);
-fvisibility=hiddenonly trims ashared object's dynamic export table.
lib/api/capi.cpp— the 10 file-local C-API dispatch helpers (remove_client,mat_open_core,mat_open,mat_open_with_params,mat_log,mat_close,mat_pause,mat_resume,mat_upload,mat_flushAndTeardown), each called only withincapi.cppand in no header/other TU.
MAT::capi_get_clientstays external.lib/shared/dllmain.cpp—thread_count(file-scope global mutated only inDllMain).Deliberately not touched:
get_platform_uuid(sysinfo_sources.cpp) has no in-treecaller, so
staticwould trip-Wunused-functionunder-Werror.Verified: NDK clang
aarch64 -fsyntax-onlyclean; whole-tree cross-reference confirms noneof the internalized symbols are referenced from any header or other TU.
Co-authored-by: Copilot 223556219+Copilot@users.noreply.github.com