In case of python-only solutions a clustering by an attribute like the category provides an efficient way to storage and retrieve features. With masking rows can be selected.
Memorry mapped numpy arrays are pretty fast if using a fast SSD. Otherwise, it's limited by i/o speed. Multhreading also works but is very limited by i/o speed limits.
keywords: sqlite numpy database postgresql