Serialization Proposal 1 by iriswebb · Pull Request #52 · embedded-graphics/bdf

iriswebb · 2026-06-05T02:41:35Z

These additions provide a way to represent and render a parsed BDF font as a slice of u8s, and tooling for that conversion in the font converter and previewer.

The format for serialization attempts to be as close to the BdfFont struct as possible while saving space, using 17 bytes per character:

* Header (12 Bytes):
 -    Ascent  (pixels, u16 BE)
 -    Descent (pixels, u16 BE)
 -    Replacement Character (index into character table, u32 BE)
 -    Character Table Length (entries, u32 BE)

* Glyph Table (17 Bytes Per Entry):
 -    corresponding codepoint (u32 BE)
 -    top_left.x (i16 BE)
 -    top_left.y (i16 BE)
 -    size.width (u16 BE)
 -    size.height (u16 BE)
 -    device_width (pixels, u8)
 -    data index  (bytes from start of data, u32 BE)
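
To make the layout above concrete, here is a minimal decoding sketch. It assumes the byte offsets follow the field order listed above; the struct and function names (Header, GlyphEntry, read_header, read_glyph) are illustrative and not taken from the PR.

```rust
/// Sizes taken from the layout described above.
const HEADER_SIZE: usize = 12;
const GLYPH_SIZE: usize = 17;

/// Hypothetical decoded header (names are illustrative, not from the PR).
#[derive(Debug, PartialEq)]
struct Header {
    ascent: u16,
    descent: u16,
    replacement_index: u32,
    glyph_count: u32,
}

/// Hypothetical decoded glyph table entry.
#[derive(Debug, PartialEq)]
struct GlyphEntry {
    codepoint: u32,
    top_left: (i16, i16),
    size: (u16, u16),
    device_width: u8,
    data_index: u32,
}

/// Reads the 12-byte header, returning `None` if the input is too short.
fn read_header(data: &[u8]) -> Option<Header> {
    let h = data.get(..HEADER_SIZE)?;
    Some(Header {
        ascent: u16::from_be_bytes([h[0], h[1]]),
        descent: u16::from_be_bytes([h[2], h[3]]),
        replacement_index: u32::from_be_bytes([h[4], h[5], h[6], h[7]]),
        glyph_count: u32::from_be_bytes([h[8], h[9], h[10], h[11]]),
    })
}

/// Reads the `index`-th 17-byte glyph entry without panicking on short input.
fn read_glyph(data: &[u8], index: usize) -> Option<GlyphEntry> {
    let start = HEADER_SIZE.checked_add(index.checked_mul(GLYPH_SIZE)?)?;
    let end = start.checked_add(GLYPH_SIZE)?;
    let g = data.get(start..end)?;
    Some(GlyphEntry {
        codepoint: u32::from_be_bytes([g[0], g[1], g[2], g[3]]),
        top_left: (
            i16::from_be_bytes([g[4], g[5]]),
            i16::from_be_bytes([g[6], g[7]]),
        ),
        size: (
            u16::from_be_bytes([g[8], g[9]]),
            u16::from_be_bytes([g[10], g[11]]),
        ),
        device_width: g[12],
        data_index: u32::from_be_bytes([g[13], g[14], g[15], g[16]]),
    })
}
```

Both readers return None instead of panicking when the input is shorter than the layout requires.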

rfuest · 2026-06-05T19:43:14Z

Thanks for the PR, I agree that the storage of the glyphs should be optimized to save space. But I would prefer not to implement manual serialization and deserialization if possible.

My idea was to store the glyphs as a struct of arrays instead of an array of BdfGlyph structs, which should prevent padding/alignment of the glyph struct from wasting space. Something like this (untested quick mockup):

struct BdfFont<'a, Coord, Size, Index> {
    top_left: &'a [Coord],
    size: &'a [Size],
    device_width: &'a [Size],
    data_index: &'a [Index],
}

const TOP_LEFT: [i8; 256 * 2] = ...;
const SIZE: [u8; 256 * 2] = ...;
const WIDTH: [u8; 256] = ...;
const DATA_INDEX: [u16; 256] = ...;

pub const MY_FONT: BdfFont<'static, i8, u8, u16> = BdfFont {
    top_left: &TOP_LEFT,
    size: &SIZE,
    device_width: &WIDTH,
    data_index: &DATA_INDEX,
};

The advantage is that no manual serialization is necessary and that each font can use the minimum data size required, because of the type parameters in BdfFont. The eg-font-converter would determine the smallest type that can represent all values of one kind.
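
As a rough illustration of how a converter might pick the smallest type for one kind of value, here is a sketch. The IntType enum and both functions are hypothetical and do not exist in eg-font-converter.

```rust
/// Hypothetical description of the integer type a generated array would use.
#[derive(Debug, PartialEq, Clone, Copy)]
enum IntType {
    I8,
    I16,
    I32,
    U8,
    U16,
    U32,
}

impl IntType {
    /// Type name as it would appear in generated Rust source.
    fn rust_name(self) -> &'static str {
        match self {
            IntType::I8 => "i8",
            IntType::I16 => "i16",
            IntType::I32 => "i32",
            IntType::U8 => "u8",
            IntType::U16 => "u16",
            IntType::U32 => "u32",
        }
    }
}

/// Smallest signed type that can hold every value (e.g. all top_left coordinates).
fn smallest_signed(values: &[i32]) -> IntType {
    let fits = |lo: i32, hi: i32| values.iter().all(|&v| lo <= v && v <= hi);
    if fits(i32::from(i8::MIN), i32::from(i8::MAX)) {
        IntType::I8
    } else if fits(i32::from(i16::MIN), i32::from(i16::MAX)) {
        IntType::I16
    } else {
        IntType::I32
    }
}

/// Smallest unsigned type that can hold every value (e.g. all data indices).
fn smallest_unsigned(values: &[u32]) -> IntType {
    let max = values.iter().copied().max().unwrap_or(0);
    if max <= u32::from(u8::MAX) {
        IntType::U8
    } else if max <= u32::from(u16::MAX) {
        IntType::U16
    } else {
        IntType::U32
    }
}
```

The converter would run this once per field over all glyphs and emit the resulting type names as the type arguments of the generated font constant.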

But my proposal only works for fonts embedded in the source code, a serialization format would still be useful for other applications which load fonts at runtime from some external storage. If you want to continue with this PR I would suggest to look into using a serialization format like https://github.com/jamesmunns/postcard.

iriswebb · 2026-06-05T21:35:08Z

Representing fonts in source code would make it harder to use those fonts in other languages. For example, generating custom fonts for u8g2_fonts requires compiling C source code since that's the only format u8g2's tooling generates.

Using postcard, would it be possible to index serialized data without de-serializing it? For a large font, creating a whole copy of it could use up the memory of a machine.

iriswebb · 2026-06-08T03:40:07Z

Although the serialization in this PR tries to replicate the BdfFont struct as closely as possible, for fonts with large numbers of glyphs (like unifont, which has ~57000 in the BDF release) there could be a more optimized way of storing characters.

As of now, finding a glyph is O(n), and there are 17 bytes of metadata per glyph. This makes rendering slow for CJK characters and emojis, and means that the metadata of a font can get larger than the bitmap data.

Sorting the list of glyphs would allow for a binary search, but storing ranges of contiguous sections of glyphs would make the search functionally O(1), since fonts usually group glyphs together in large blocks.

This would also remove the need to store the corresponding character of each glyph, making the character struct 13 bytes per glyph and potentially saving up to 200 kB for a font like unifont.

I propose:

* ascent
* descent
* replacement
* section_count
* character_count

SECTION TABLE (sorted):
* starting char
* ending char
* starting index into glyph table

GLYPH TABLE (sorted):
* Bounding Box (8 bytes)
* Index from start of data block (4 bytes)
* Kerning (1 byte)
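
A lookup over the proposed section table could be sketched as follows. The Section struct and its field names are assumptions based on the list above. This sketch uses a binary search over the sections, so the cost is logarithmic in the number of sections rather than strictly constant, which is still very small when a font has only a few large blocks.

```rust
use std::cmp::Ordering;

/// Hypothetical section entry: an inclusive range of code points whose glyphs
/// are stored contiguously in the glyph table.
struct Section {
    start: u32,
    end: u32,
    first_glyph: u32,
}

/// Returns the glyph table index for `c`, or `None` if no section contains it.
/// `sections` must be sorted by `start` and must not overlap.
fn glyph_index(sections: &[Section], c: char) -> Option<u32> {
    let cp = c as u32;
    sections
        .binary_search_by(|s| {
            if cp < s.start {
                Ordering::Greater // section lies after the code point
            } else if cp > s.end {
                Ordering::Less // section lies before the code point
            } else {
                Ordering::Equal
            }
        })
        .ok()
        .map(|i| sections[i].first_glyph + (cp - sections[i].start))
}
```

Because the code point is implied by the position inside a section, the glyph entries no longer need their own 4-byte code point field.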

iriswebb · 2026-06-10T23:00:43Z

This recent commit refactored the SerializedBdfFont rendering into a new trait, ProportionalFont, and a new struct, ProportionalTextStyle.

rfuest · 2026-06-12T18:04:57Z

+    primitives::Rectangle,
+};
+
+/// * Header (12 Bytes):


In case this format is also used to load a font at runtime it would be a good idea to add some validation that it is a valid font file. A magic value, a version, and the file size should be enough.
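
A validation step along these lines might look like the sketch below. The magic bytes, the version number, and the 9-byte prefix layout are invented for illustration; the PR does not define them.

```rust
/// All of these values are hypothetical; the PR does not define them.
const MAGIC: [u8; 4] = *b"EGBF";
const VERSION: u8 = 1;
/// magic (4 bytes) + version (1 byte) + total file size (u32 BE)
const PREFIX_SIZE: usize = 9;

#[derive(Debug, PartialEq)]
enum PrefixError {
    TooShort,
    BadMagic,
    UnsupportedVersion(u8),
    SizeMismatch,
}

/// Checks the prefix and returns the remaining bytes (header, glyph table, bitmaps).
fn check_prefix(data: &[u8]) -> Result<&[u8], PrefixError> {
    if data.len() < PREFIX_SIZE {
        return Err(PrefixError::TooShort);
    }
    if &data[..4] != &MAGIC[..] {
        return Err(PrefixError::BadMagic);
    }
    if data[4] != VERSION {
        return Err(PrefixError::UnsupportedVersion(data[4]));
    }
    // The declared size covers the whole file, including this prefix.
    let declared = u32::from_be_bytes([data[5], data[6], data[7], data[8]]) as usize;
    if declared != data.len() {
        return Err(PrefixError::SizeMismatch);
    }
    Ok(&data[PREFIX_SIZE..])
}
```

Storing the total size lets a loader detect truncated files before it reads any glyph entry.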

rfuest · 2026-06-12T18:26:43Z

Using postcard, would it be possible to index serialized data without de-serializing it? For a large font, creating a whole copy of it could use up the memory of a machine.

You are correct that this would be an issue for a variable integer size format like postcard. It would need to be a hybrid approach, where at least the glyph index table would use fixed integer sizes for more efficient lookup. I'm OK with leaving the manual serialization/deserialization.

As of now, finding a glyph is O(n), and there are 17 bytes of metadata per glyph. This makes rendering slow for CJK characters and emojis, and means that the metadata of a font can get larger than the bitmap data.

Sorting the list of glyphs would allow for a binary search, but storing ranges of contiguous sections of glyphs would make the search functionally O(1), since fonts usually group glyphs together in large blocks

Using binary search for glyph lookup was always the idea for the non-serialized version. I would keep the non-grouped version for this PR to keep things simpler.
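
With fixed-size entries, a binary search can run directly on the serialized bytes without copying or deserializing the table. The sketch below assumes the 17-byte entry layout from the PR description, a table sorted by code point, and a slice that contains only the glyph table (header already removed); the function names are illustrative.

```rust
/// Entry size from the PR description; the code point is the first 4 bytes (u32 BE).
const GLYPH_SIZE: usize = 17;

/// Reads the code point of the `index`-th entry. Panics if `index` is out of range,
/// so callers must keep `index < table.len() / GLYPH_SIZE`.
fn codepoint_at(table: &[u8], index: usize) -> u32 {
    let o = index * GLYPH_SIZE;
    u32::from_be_bytes([table[o], table[o + 1], table[o + 2], table[o + 3]])
}

/// Binary search over the serialized glyph table; returns the entry index for `c`.
fn find_glyph(table: &[u8], c: char) -> Option<usize> {
    let target = c as u32;
    let mut lo = 0;
    let mut hi = table.len() / GLYPH_SIZE; // number of complete entries
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        let cp = codepoint_at(table, mid);
        if cp == target {
            return Some(mid);
        }
        if cp < target {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    None
}
```

For a font with about 57000 glyphs this needs at most 16 comparisons per lookup.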

Co-authored-by: Ralf Fuest <mail@rfuest.de>

iriswebb · 2026-06-12T18:41:13Z

Suggestions applied, I will push another commit to resolve the errors shortly

iriswebb · 2026-06-13T02:11:55Z

It is not possible to generate a reference to the bitmap data for each glyph using the function EgBdfOutput::new without changing the return value or the data structure back to an index

rfuest · 2026-06-13T13:58:29Z

It is not possible to generate a reference to the bitmap data for each glyph using the function EgBdfOutput::new without changing the return value or the data structure back to an index

My comment mentioned this, but I didn't explain this very well. The glyphs field inside the BdfFont cannot use the new version of BdfGlyph directly any longer. It will need to continue to use the old version of BdfGlyph with the start_index attribute, called something like GlyphData. The lookup function will convert between GlyphData and the new version of BdfGlyph (with the slice reference to the bitmap data).

iriswebb · 2026-06-13T16:06:15Z

BdfFont is becoming similar to the bdf::Font the parser generates

start_index indexes the bit the glyph data starts at, not the byte, which makes it harder to generate a slice from it.

iriswebb · 2026-06-13T16:29:42Z

Padding each glyph's data to a byte boundary seems to work.

rfuest · 2026-06-16T17:44:42Z

start_index indexes the bit the glyph data starts at, not the byte, which makes it harder to generate a slice from it.
...
Padding each glyph's data to a byte boundary seems to work.

Sorry, I forgot that the pixel data of the glyphs was tightly packed without any padding. Starting each glyph at a byte boundary seems like a good solution.
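
With every glyph starting at a byte boundary, the conversion from the stored GlyphData to a BdfGlyph that borrows its bitmap can be sketched like this. The field names are assumptions, start_index is taken to be a byte offset, and pixels are assumed to stay tightly packed within a glyph, with padding only at its end.

```rust
/// Stored form (hypothetical fields): refers to the bitmap by a byte offset.
struct GlyphData {
    character: char,
    width: u32,
    height: u32,
    start_index: usize,
}

/// Returned form: borrows the glyph's bitmap directly.
struct BdfGlyph<'a> {
    character: char,
    width: u32,
    height: u32,
    data: &'a [u8],
}

struct BdfFont<'a> {
    /// Sorted by `character` so that binary search works.
    glyphs: &'a [GlyphData],
    /// Bitmap data of all glyphs, each glyph padded to a byte boundary.
    data: &'a [u8],
}

impl<'a> BdfFont<'a> {
    /// Looks up a glyph and converts `GlyphData` into a `BdfGlyph` with a slice.
    fn glyph(&self, c: char) -> Option<BdfGlyph<'a>> {
        let i = self.glyphs.binary_search_by_key(&c, |g| g.character).ok()?;
        let g = &self.glyphs[i];
        // Bits are packed within a glyph; round up to whole bytes.
        let len = (g.width as usize * g.height as usize + 7) / 8;
        let end = g.start_index.checked_add(len)?;
        let data = self.data.get(g.start_index..end)?;
        Some(BdfGlyph {
            character: g.character,
            width: g.width,
            height: g.height,
            data,
        })
    }
}
```

A 3x3 glyph, for example, occupies 9 bits and therefore 2 bytes, so at most 7 bits per glyph are spent on padding.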

rfuest · 2026-06-16T17:52:59Z

+    /// The raw u8 data of the serialized font
+    pub data: &'a [u8],
+}
+impl<'a> SerializedBdfFont<'a> {


The deserialization code contains too many magic values, and it shouldn't panic on malformed or truncated input data.

As a first step I would add some constants and getters, like:

const HEADER_SIZE: usize = 12;
const GLYPH_SIZE: usize = 17;

impl<'a> SerializedBdfFont<'a> {
    ...

    fn bitmap_data(&self) -> &[u8] {
        &self.data[HEADER_SIZE + self.character_count() * GLYPH_SIZE..]
    }

    fn glyph_data(&self, index: usize) -> Option<&[u8; GLYPH_SIZE]> {
        let offset = HEADER_SIZE + index * GLYPH_SIZE;
        self.data
            .get(offset..offset + GLYPH_SIZE)
            .map(|slice| slice.try_into().unwrap())
    }

    fn character_table(&self, index: usize) -> Option<DisplayBdfGlyph<'_>> {
        let glyph = self.glyph_data(index)?;

        let corresponding_character =
            char::from_u32(u32::from_be_bytes(glyph[0..4].try_into().unwrap()))?;
        let top_left_x = i16::from_be_bytes(glyph[4..6].try_into().unwrap());
        ...
    }

    ...
}

iriswebb · 2026-06-16T18:42:13Z

Verification can be done using the newtype pattern, which would make checking the data easier:

    /// Constructs a serialized font without first verifying the data
    pub const fn from_unverified_data(data: &'a [u8]) -> Self {
        Self { data }
    }

    /// Verifies data in a way that prevents panics and returns a serialized font if the data is valid
    ///
    /// TODO: Make this const
    pub fn verify_data(data: &'a [u8]) -> Option<Self> {

iriswebb · 2026-06-16T19:14:57Z

Commit 5c1c7d6

rfuest · 2026-06-18T15:43:46Z

+    /// Constructs a serialized font without first verifying the data
+    pub const fn from_unverified_data(data: &'a [u8]) -> Self {
+        Self { data }
+    }


I would prefer not to have a constructor that doesn't check the data. It should be possible to make the other constructor const and provide two variants: one which returns a Result and one which panics to make it easier to define font constants.

The verification process is noticeably slower than constructing it from unverified data. Even if the verification function was const, it would slow down loading fonts from non-const values (like from external storage).

The consequence of loading corrupted font data would just be a panic; there aren't any memory errors that could propagate silently to other parts of the program.
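
The two-constructor variant suggested in this thread could look roughly like the sketch below. The struct name, the error strings, and the checks performed are assumptions (only the table length is validated here), and panicking with a message inside a const fn needs a reasonably recent compiler.

```rust
const HEADER_SIZE: usize = 12;
const GLYPH_SIZE: usize = 17;

/// Hypothetical stand-in for the PR's SerializedBdfFont.
#[derive(Debug, PartialEq, Clone, Copy)]
struct SerializedFont<'a> {
    data: &'a [u8],
}

impl<'a> SerializedFont<'a> {
    /// Checked constructor for data loaded at runtime.
    pub const fn try_new(data: &'a [u8]) -> Result<Self, &'static str> {
        if data.len() < HEADER_SIZE {
            return Err("truncated header");
        }
        // Character table length: the last u32 (BE) of the header.
        let count = u32::from_be_bytes([data[8], data[9], data[10], data[11]]) as usize;
        if (data.len() - HEADER_SIZE) / GLYPH_SIZE < count {
            return Err("truncated glyph table");
        }
        Ok(Self { data })
    }

    /// Panicking constructor, convenient for font constants: invalid data
    /// becomes a compile-time error when used in a const context.
    pub const fn new(data: &'a [u8]) -> Self {
        match Self::try_new(data) {
            Ok(font) => font,
            Err(msg) => panic!("{}", msg),
        }
    }
}
```

For a font constant the verification runs at compile time, so its cost only matters for fonts loaded at runtime through try_new.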

rfuest · 2026-06-18T15:45:02Z

+    /// Verifies data in a way that prevents panics and returns a serialized font if the data is valid
+    ///
+    /// TODO: Make this const
+    pub fn verify_data(data: &'a [u8]) -> Result<Self, &'static str> {


This is a constructor and should be named accordingly:

Suggested change

-    pub fn verify_data(data: &'a [u8]) -> Result<Self, &'static str> {
+    pub fn new(data: &'a [u8]) -> Result<Self, &'static str> {

rfuest · 2026-06-18T15:48:42Z

+        for i in 0..font.character_count() {
+            let offset = Self::HEADER_SIZE + (i * Self::CHARACTER_TABLE_ENTRY_SIZE) as usize;
+            // Corresponding Character Invalid
+            char::from_u32(font.get_be_u32(offset + Self::CODEPOINT_OFFSET))
+                .ok_or("Invalid character")?;
+
+            // Data index not within bitmap data
+            let idx = font.get_be_u32(offset + Self::IDX_OFFSET) as usize;
+
+            if idx > data.len() {
+                return Err("Invalid bitmap index");
+            }
+        }


It should be possible to make this part work in a const fn. I would add a helper function to get the glyph data, which could also be reused in character_table. Something like:

const fn glyph_data(&self, index: usize) -> &[u8; CHARACTER_TABLE_ENTRY_SIZE] {
    let (_header, data) = self
        .data
        .split_at(Self::HEADER_SIZE + index * CHARACTER_TABLE_ENTRY_SIZE);
    data.first_chunk().unwrap()
}

The get_be_... functions can be moved outside the struct impl to take data as a parameter, which makes them more flexible.
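
As free functions, the big-endian readers might look like this sketch. The names be_u16 and be_u32 and the Option return type are assumptions; the PR's get_be_... methods may differ.

```rust
/// Reads a big-endian u16 at `offset`, or `None` if it would run past the end.
const fn be_u16(data: &[u8], offset: usize) -> Option<u16> {
    if data.len() < 2 || offset > data.len() - 2 {
        return None;
    }
    Some(u16::from_be_bytes([data[offset], data[offset + 1]]))
}

/// Reads a big-endian u32 at `offset`, or `None` if it would run past the end.
const fn be_u32(data: &[u8], offset: usize) -> Option<u32> {
    if data.len() < 4 || offset > data.len() - 4 {
        return None;
    }
    Some(u32::from_be_bytes([
        data[offset],
        data[offset + 1],
        data[offset + 2],
        data[offset + 3],
    ]))
}
```

Because they take the data as a parameter, the same helpers work on the whole file, on a single glyph entry, or inside a const constructor.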

rfuest · 2026-06-18T16:05:21Z

+            descent: self.get_be_u16(Self::DESCENT_OFFSET) as u32,
+            line_height: self.get_be_u16(Self::ASCENT_OFFSET) as u32
+                + self.get_be_u16(Self::DESCENT_OFFSET) as u32,
+        }


See note about as vs from above.

Co-authored-by: Ralf Fuest <mail@rfuest.de>

iriswebb · 2026-06-18T18:01:37Z

Commits d90124c, 386937e, and 33dc8a8

Basic Serializer

35871b0

Refactor Proportional Trait

ab5f0eb

remove original BDF renderer and replace it with the generalized proportional one

941ceed

rfuest requested changes Jun 12, 2026

rfuest reviewed Jun 12, 2026

Comment thread eg-font-converter/src/serializer.rs

iriswebb and others added 2 commits June 12, 2026 13:37

Update eg-bdf/src/proportional.rs

e17f5da

Co-authored-by: Ralf Fuest <mail@rfuest.de>

Update eg-font-converter/src/serializer.rs

dc96dcc

Co-authored-by: Ralf Fuest <mail@rfuest.de>

Fix errors in eg-bdf

bd23ac5

fix converter and viewer

7eb585f

rfuest requested changes Jun 16, 2026

Better error handling and verification, remove magic numbers

5c1c7d6

rfuest requested changes Jun 18, 2026

iriswebb and others added 3 commits June 18, 2026 11:11

Update eg-bdf/src/serialized.rs

d90124c

Co-authored-by: Ralf Fuest <mail@rfuest.de>

Update eg-bdf/src/serialized.rs

386937e

Co-authored-by: Ralf Fuest <mail@rfuest.de>

Make more things const, factor out indexing functions

33dc8a8
