Original proposal by @jkotas:
Generating JSON serializers via reflection at runtime has a non-trivial startup cost. This has been identified as a bottleneck during prototyping of fast, small, cloud-first microservices:
Repro: https://gist.github.com/jkotas/b0671e154791e287c38a627ca81d7197
The JSON serializer generated using reflection at runtime has a startup cost of ~30 ms. The manually written JSON serializer has a startup cost of ~1 ms.
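The linked gist holds the actual repro. As a rough illustration only of what "manually written" means here, the sketch below writes one object with Utf8JsonWriter and never inspects the type through reflection. The Reading type and the ManualSerializer class are hypothetical names chosen for this example, not taken from the gist.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.Json;

// Hypothetical payload type used only for this illustration.
public sealed class Reading
{
    public DateTime Date { get; set; }
    public int TemperatureC { get; set; }
    public string Summary { get; set; } = "";
}

public static class ManualSerializer
{
    // Writes each property explicitly, so no reflection-based
    // metadata has to be built at startup.
    public static string Serialize(Reading value)
    {
        using var stream = new MemoryStream();
        using (var writer = new Utf8JsonWriter(stream))
        {
            writer.WriteStartObject();
            writer.WriteString("Date", value.Date);
            writer.WriteNumber("TemperatureC", value.TemperatureC);
            writer.WriteString("Summary", value.Summary);
            writer.WriteEndObject();
            writer.Flush(); // push buffered bytes into the stream
        }
        return Encoding.UTF8.GetString(stream.ToArray());
    }
}

public static class Program
{
    public static void Main()
    {
        var reading = new Reading
        {
            Date = new DateTime(2020, 1, 1),
            TemperatureC = 25,
            Summary = "Hot"
        };
        Console.WriteLine(ManualSerializer.Serialize(reading));
    }
}
```

The trade-off is that every property has to be written by hand and kept in sync with the type, which is the work a compile-time generator would take over.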
Edited by @kevinwkt and @layomia:
Background
There are comprehensive documents detailing the needs and benefits of generating JSON serializers at compile time. These benefits include improved startup time, reduced private memory usage, faster serialization and deserialization throughput, and being ILLinker-friendly because reflection is avoided at runtime. There is also an opportunity to reduce the size of the trimmed System.Text.Json.dll after source generation and linker trimming, because code paths that use reflection can potentially be removed, along with unused built-in JsonConverter<T>s such as those for Uri, UInt64, etc.
After discussing several approaches and their pros and cons, we decided to implement this feature using Roslyn source generators. Implementation details and code/usage examples can be seen in the design document. This document outlines the roadmap for the initial experiment and highlights actionable items.
This project requires numerous API changes, and the design is still being iterated on, which is why we will use the dotnet/runtimelab repository instead of dotnet/runtime. The main goal is to get something up and running while changing the implementation and iterating on public API without committing to dotnet/runtime master. We hope to share the project and get feedback for a potential release in .NET 6.0. The project will be consumable through a prerelease package until then. Progress can be tracked through the JSON Code Gen project board in dotnet/runtimelab.
Approach
The project has three main parts: type discovery, source code generation, and integration of the generated source code with user applications.
Type discovery
Type discovery can be approached in two ways: an implicit model (where the user does not have to specify which types to generate code for) and an explicit model (where the user specifies, through code or configuration, which types to generate code for).
Various implicit approaches have been discussed, such as generating source for all partial classes or scanning for calls into the serializer using Roslyn syntax trees. These models can be revisited in the future as the value and feasibility of the approach become clearer based on user feedback. Note that the downsides of such a model include failing to generate source for types that need it, or generating source for types that do not, because of a bug or an edge case we didn’t consider.
The proposed approach for type discovery requires the user to explicitly indicate serializable types. This model supports indicating both owned and non-owned types. A new JsonSerializableAttribute will be used to detect these types. There are two patterns for JsonSerializableAttribute. The first consists of applying the attribute to a type that the user owns, and the second consists of the user passing a non-owned serializable type into the constructor of the attribute.
We believe that an explicit model using attributes is a simple first approach to the problem. Within the Roslyn source generator, we parse the syntax tree to find usages of the JsonSerializableAttribute. The output of this phase is a list of input types for the generator, which then generates code recursively for each type in all the object graphs.
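The two attribute patterns and the recursive walk over the object graph can be sketched roughly as follows. This is an illustration under stated assumptions, not the generator's implementation. The attribute is a local stand-in declared in the sketch itself rather than the real System.Text.Json type. Discovery uses runtime reflection in place of Roslyn syntax-tree analysis. The Order, Customer, ExternalTypes and Discovery names are hypothetical.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Reflection;

namespace TypeDiscoverySketch
{
    // Local stand-in for the proposed attribute (assumption, not the real API).
    [AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct, AllowMultiple = true)]
    public sealed class JsonSerializableAttribute : Attribute
    {
        public JsonSerializableAttribute() { }
        public JsonSerializableAttribute(Type type) { Type = type; }

        // Null when the attribute marks the type it is applied to.
        public Type? Type { get; }
    }

    // Pattern 1: the attribute is applied to a type the user owns.
    [JsonSerializable]
    public partial class Order
    {
        public int Id { get; set; }
        public Customer Customer { get; set; } = new Customer();
    }

    public class Customer
    {
        public string Name { get; set; } = "";
    }

    // Pattern 2: a non-owned type is passed to the attribute's constructor.
    [JsonSerializable(typeof(Version))]
    public static class ExternalTypes { }

    public static class Discovery
    {
        // Collects the root types indicated by the attribute.
        public static HashSet<Type> FindRoots(Assembly assembly)
        {
            var roots = new HashSet<Type>();
            foreach (Type type in assembly.GetTypes())
                foreach (var attr in type.GetCustomAttributes<JsonSerializableAttribute>(false))
                    roots.Add(attr.Type ?? type);
            return roots;
        }

        // Walks public instance properties to find every type reachable from the roots.
        public static HashSet<Type> Expand(IEnumerable<Type> roots)
        {
            var seen = new HashSet<Type>();
            var pending = new Stack<Type>(roots);
            while (pending.Count > 0)
            {
                Type current = pending.Pop();
                if (!seen.Add(current)) continue;                               // already visited
                if (current.IsPrimitive || current == typeof(string)) continue; // leaf type
                foreach (PropertyInfo p in current.GetProperties(BindingFlags.Public | BindingFlags.Instance))
                    pending.Push(p.PropertyType);
            }
            return seen;
        }
    }

    public static class Program
    {
        public static void Main()
        {
            var roots = Discovery.FindRoots(typeof(Program).Assembly);
            foreach (Type t in Discovery.Expand(roots))
                Console.WriteLine(t.FullName);
        }
    }
}
```

Here Customer is never annotated, yet it ends up in the expanded set because Order references it. That is the sense in which code is generated for each type in the object graph rather than only for the annotated roots.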
Source code generation
The design for the generated source focuses mainly on performance gains and on extending existing JsonSerializer functionality. Performance is improved in two ways. The first is start-up (warm-up) performance, for both CPU and memory: the costly reflection that builds up a type metadata cache moves from runtime to compile time. This type metadata is then represented as JsonTypeInfo classes that can be used for (de)serialization at runtime. The second is throughput: generating an instance of each type’s JsonTypeInfo (its metadata) avoids the initial metadata-dictionary lookup on calls to the serializer. These instances will be passed to new (de)serialize overloads.
We will use the types discovered in the type discovery phase and recurse through the type graph in order to source generate the functions mentioned above within each JsonTypeInfo and register them inside the user-facing wrapper, JsonSerializerContext.
Generated source code integration
Integration of the generated metadata source code with user apps is still under discussion. In the proposed approach, the generator creates a context class (JsonSerializerContext) which takes an options instance and contains references to the generated JsonTypeInfos for each type seen above. This relies on the new serializer overloads mentioned earlier, which accept the JsonTypeInfo instances retrieved from the context. An example of the overload and usage can be seen here, while examples and details of the end-to-end approach can be seen in the design document.
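For orientation, the sketch below shows a context class and the JsonTypeInfo-accepting overloads in use. It assumes the API shape that later shipped with System.Text.Json in .NET 6, where JsonSerializableAttribute is applied to a partial class deriving from JsonSerializerContext. The prototype described in this issue may differ in its details. Reading and AppJsonContext are hypothetical names, and the example needs the source generator that ships with the .NET 6 or later SDK.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

namespace ContextSketch
{
    // Hypothetical payload type used only for this illustration.
    public class Reading
    {
        public int TemperatureC { get; set; }
        public string Summary { get; set; } = "";
    }

    // The source generator fills in this partial class at compile time,
    // adding a Default instance and one JsonTypeInfo<T> property per listed type.
    [JsonSerializable(typeof(Reading))]
    internal partial class AppJsonContext : JsonSerializerContext
    {
    }

    public static class Program
    {
        public static void Main()
        {
            var reading = new Reading { TemperatureC = 25, Summary = "Hot" };

            // Overloads that take the generated metadata directly, so the
            // serializer does not need to look it up or build it via reflection.
            string json = JsonSerializer.Serialize(reading, AppJsonContext.Default.Reading);
            var roundTripped = JsonSerializer.Deserialize(json, AppJsonContext.Default.Reading);

            Console.WriteLine(json);
            Console.WriteLine(roundTripped?.Summary);
        }
    }
}
```

Passing the metadata explicitly at each call site is what ties the generated code to the application: the serializer receives the JsonTypeInfo instead of discovering it from the runtime type.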
Action items
- JSON serialization source generator: Compile-time source generation for System.Text.Json #45448
Progress of this effort can be observed through the JSON Code Gen project board in dotnet/runtimelab.
The source generator (System.Text.Json.SourceGeneration.dll) and updated System.Text.Json.dll can be consumed via an experimental NuGet package. Issues can be logged at https://github.com/dotnet/runtimelab/issues/new?labels=area-JsonCodeGen with the area-JsonCodeGen label.
cc @jkotas @davidfowl @stephentoub @mjsabby @terrajobst @pranavkm @ericstj @layomia @steveharter @chsienki