The problem is that there was no such standard on IBM-compatible PCs. On gaming computers like the Commodore 64 and consoles like the Super NES, the hardware's video modes supported scrollable background layers and sprites, and almost all games used them because that gave a major speedup and left more CPU power for the actual game logic. Early PC video adapters (CGA, EGA) just supported text modes for displaying text on screen and general graphics modes, where an array of color values was submitted to the adapter to display an image on the screen.
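To illustrate why a plain array of color values is hard to split back into layers, here is a minimal sketch of a game that composes its own frame in software. The names `compose` and `TRANSPARENT` and the tiny 4x2 "screen" are made up for illustration and do not come from any real engine.

```python
TRANSPARENT = None  # hypothetical marker for "no pixel here" inside a sprite


def compose(background, sprites, width, height):
    """Flatten a background and a list of (sprite, x, y) into one pixel array."""
    # Start from a copy of the background so the input is not modified.
    frame = [[background[y][x] for x in range(width)] for y in range(height)]
    for sprite, sx, sy in sprites:
        for dy, row in enumerate(sprite):
            for dx, color in enumerate(row):
                if color is TRANSPARENT:
                    continue
                x, y = sx + dx, sy + dy
                if 0 <= x < width and 0 <= y < height:
                    frame[y][x] = color  # the background pixel is overwritten
    return frame


# Scene A: a uniform background plus a one-pixel sprite at (1, 0).
background_a = [[1, 1, 1, 1], [1, 1, 1, 1]]
frame_a = compose(background_a, [([[9]], 1, 0)], 4, 2)

# Scene B: no sprite at all, the same pixel is simply part of the background.
background_b = [[1, 9, 1, 1], [1, 1, 1, 1]]
frame_b = compose(background_b, [], 4, 2)

# Both scenes produce an identical frame; the adapter only ever sees this result.
print(frame_a == frame_b)
```

Two different scenes end up as the same array, so from the adapter's side there is nothing left that says which pixels belonged to which layer.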
Wikipedia says that the succeeding VGA standard supported smooth hardware scrolling, but I have no idea whether this opens up a possibility to extract a scrolling layer from the data that is submitted for display in each drawing cycle. Even so, that might not mean that multiple layers could be distinguished, because each game engine was programmed differently. I don't know any of this for sure, though, because I never programmed a VGA game. This might be a question best directed to a developer of DOSBox, for example.
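As a rough sketch of what hardware scrolling means in principle: the adapter displays a window into a larger video memory, and the game scrolls by changing the offset at which that window starts instead of redrawing the pixels. The model below is heavily simplified and is an assumption on my part (one value per pixel, linear memory, a hypothetical `visible_window` helper); real VGA memory layout and registers are more involved.

```python
def visible_window(vram, pitch, start, width, height):
    """Return the rows the display would show for a given start offset.

    vram   -- flat list of pixel values (simplified, one value per pixel)
    pitch  -- number of pixels per row in memory (wider than the screen)
    start  -- offset of the top-left visible pixel
    """
    rows = []
    for y in range(height):
        row_start = start + y * pitch
        rows.append(vram[row_start:row_start + width])
    return rows


pitch = 8
vram = list(range(pitch * 4))  # 8x4 pixels in memory, values 0..31

# Same memory contents, only the start offset differs by one pixel.
frame_at_0 = visible_window(vram, pitch, start=0, width=4, height=2)
frame_at_1 = visible_window(vram, pitch, start=1, width=4, height=2)

print(frame_at_0)
print(frame_at_1)
```

In this model an emulator could at least notice that only the start offset changed between two frames, but that says something about one scrolling plane at most, not about sprites or further layers drawn into it.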