Remembering More Memory: XMS And A Real Hack

Last time we talked about how the original PC has a limit of 640 kB for your programs and 1 MB in total. But of course those restrictions chafed. People demanded more memory, and there were workarounds to provide it.

However, the workarounds were made to primarily work with the old 8088 CPU. Expanded memory (EMS) swapped pages of memory into page frames that lived above the 640 kB line (but below 1 MB). The system would work with newer CPUs, but those newer CPUs could already address more memory. That led to new standards, workarounds, and even a classic hack.

XMS

If you had an 80286 or above, you might be better off using extended memory (XMS). This took advantage of the fact that the CPU could address more memory. You didn’t need a special board to load 4MB of RAM into an 80286-based PC. You just couldn’t get to with MSDOS. In particular, the memory above 1 MB was — in theory — inaccessible to real-mode programs like MSDOS.

Well, that’s not strictly true in two cases. One, you’ll see in a minute. The other case is because of the overlapping memory segments on an 8088, or in real mode on later processors. Address FFFF:000F was the top of the 1 MB range.

PCs with more than 20 bits of address space ran into problems since some programs “knew” that memory access above that would wrap around. That is FFFF:0010, on an 8088, is the same as 0000:0000. They would block A20, the 21st address bit, by default. However, you could turn that block off in software, although exactly how that worked varied by the type of motherboard — yet another complication.

XMS allowed MSDOS programs to allocate and free blocks of memory that were above the 1 MB line and map them into that special area above FFFF:0010, the so-called high memory area (HMA). Continue reading “Remembering More Memory: XMS And A Real Hack”

Remembering Memory: EMS, And TSRs

You often hear that Bill Gates once proclaimed, “640 kB is enough for anyone,” but, apparently, that’s a myth — he never said it. On the other hand, early PCs did have that limit, and, at first, that limit was mostly theoretical.

After all, earlier computers often topped out at 64 kB or less, or — if you had some fancy bank switching — maybe 128 kB. It was hard to justify the cost, though. Before long, though, 640 kB became a limit, and the industry found workarounds. Mercifully, the need for these eventually evaporated, but for a number of years, they were a part of configuring and using a PC.

Why 640 kB?

The original IBM PC sported an Intel 8088 processor. This was essentially an 8086 16-bit processor with an 8-bit external data bus. This allowed for cheaper computers, but both chips had a strange memory addressing scheme and could access up to 1 MB of memory.

In fact, the 8088 instructions could only address 64 kB, very much like the old 8080 and Z80 computers. What made things different is that they included a number of 16-bit segment registers. This was almost like bank switching. The 1 MB space could be used 64 kB at a time on 16-byte boundaries.

So a full address was a 16-bit segment and a 16-bit offset. Segment 0x600D, offset 0xF00D would be written as 600D:F00D. Because each segment started 16-bytes after the previous one, 0000:0020, 0001:0010, and 0002:0000 were all the same memory location. Confused? Yeah, you aren’t the only one.

Continue reading “Remembering Memory: EMS, And TSRs”

BASIC Classroom Management

While we don’t see it used very often these days, BASIC was fairly revolutionary in bringing computers to the masses. It was one of the first high-level languages to catch on and make computers useful for those who didn’t want to (or have time) to program them in something more complex. But that doesn’t mean it wasn’t capable of getting real work done — this classroom management software built in the language illustrates its capabilities.

Written by [Mike Knox], father of [Ethan Knox] aka [norton120], for his classroom in 1987, the programs were meant to automate away many of the drudgeries of classroom work. It includes tools for generating random seating arrangements, tracking attendance, and other direct management tasks as well as tools for the teacher more directly like curving test grades, tracking grades, and other tedious tasks that normally would have been done by hand at that time. With how prevalent BASIC was at the time, this would have been a powerful tool for any educator with a standard desktop computer and a floppy disk drive.

Since most people likely don’t have an 80s-era x86 machine on hand capable of running this code, [Ethan] has also included a docker container to virtualize the environment for anyone who wants to try out his father’s old code. We’ve often revisited some of our own BASIC programming from back in the day, as our own [Tom Nardi] explored a few years ago.

640k Was Never Enough For Anyone: How DOS Broke Free

On modern desktop and laptop computers, there is rarely a need to think about memory. We all have many gigabytes of the stuff, and it’s just there. Our operating system does the heavy lifting of working out what goes where and what needs to be paged to disk, and we just get on with reading Hackaday, that noblest of computing pursuits. This was not always the case though, and for early PCs in particular the limitations of the 8086 processor gave the need for some significant gymnastics in search of an extra few kilobytes. [Julio Merion] has an interesting run-down of the DOS memory map, and how memory expansion happened on computers physically unable to see much of it.

The 8086 has a 20-bit address bus, giving it access to a maximum of 1 megabyte. When IBM made the PC they needed space for the BIOS, the display, and the various accessory ROMs intended to come with expansion cards. Thus they allocated a maximum 640k of the map for RAM, and many early machines shipped with much less than that. The quote from Bill Gates about 640k being enough for anyone is probably apocryphal, but it was pretty clear as the 1980s wore on that more would be needed. The post goes into how memory expansion worked, with a 64k page mapped to switchable RAM on a card, and touches on how DOS managed extended memory above 1 Mb on the later processors that supported it. We dimly remember there also being a device driver that would map the unused graphics memory as EMS when the graphics card was running in text mode, but such horrors are best left behind.

Of course, some of the tricks to boost RAM were nothing but snake oil.

8086 header: Thomas Nguyen, CC BY-SA 4.0

Finding Undocumented 8086 Instructions Via Microcode

Video gamers know about cheat codes, but assembly language programmers are often in search of undocumented instructions. One way to find them is to map out all of a CPU’s opcodes and where there are holes, try those values, and see what happens. Not good enough for [Ken Shirriff]. He prefers examining the CPU’s microcode and deducing what each part of it does.

Microcode is a feature of many modern CPUs. The CPU runs several “microcode” instructions to process a single opcode. For the Intel 8086, there are 512 micro instructions, each with 21 bits. Each instruction has two parts: a part that moves a source to a destination and another that performs some other operation, such as an ALU operation. [Ken] explains it all in the post, including several hidden registers you can’t see, but the microcode can.

Searching for holes in the opcode table.

Some of the undocumented instructions are probably not useful. They are either impractical or duplicate a function you can already do another way. Not all of the instructions are there for technical reasons. For example, opcode D6, commonly known as SALC for “Set AL to Carry”, seems to exist only as a trap for anyone making a carbon copy of Intel’s microcode. When other companies like NEC made 8086 clones, having an undocumented instruction would strongly suggest they just copied Intel’s intellectual property (in NECs case, they didn’t).

Other cases happen where an instruction just doesn’t make sense. For example, you can pop all segment registers, and though it is not documented, you can deduce that POP CS should be opcode 0F. The problem is there is no sane reason to pop CS off the stack. The instruction works; it just isn’t useful. The opcodes from 60-6F are conditional jumps that are no different from the instructions at 70-7F because of decoding. There is no reason to document both identical instruction ranges.

The plot thickens when you go to two-byte instructions. You’ll find plenty of instructions of dubious value. You don’t hear much about undocumented instructions anymore. Why? Because modern CPUs have enough circuitry to dedicate some to detecting illegal instructions and halting the CPU. But the 8086 was squeezed too tight to allow for such a luxury. Good thing for people like us who enjoy solving puzzles.

You can still get a modern CPU to tell you more about instructions even if it won’t run them. Even the 80286 had some secret opcodes.

It Isn’t WebAssembly, But It Is Assembly In Your Browser

You might think assembly language on a PC is passe. After all, we have a host of efficient high-level languages and plenty of resources. But there are times you want to use assembly for some reason. Even if you don’t, the art of writing assembly language is very satisfying for some people — like an intricate logic puzzle. Getting your assembly language fix on a microcontroller is usually pretty simple, but on a PC there are a lot of hoops to jump. So why not use your browser? That’s the point of this snazzy 8086 assembler and emulator that runs in your browser. Actually, it is not native to the browser, but thanks to WebAssembly, it works fine there, too.

No need to set up strange operating system environments or link to an executable file format. Just write some code, watch it run, and examine all the resulting registers. You can do things using BIOS interrupts, though, so if you want to write to the screen or whatnot, you can do that, too.

The emulation isn’t very fast, but if you are single-stepping or watching, that’s not a bad thing. It does mean you may want to adjust your timing loops, though. We didn’t test our theory, but we expect this is only real mode 8086 emulation because we don’t see any protected mode registers. That’s not a problem, though. For a learning tool, you’d probably want to stick with real mode, anyway. The GitHub page has many examples, ranging from a sort to factorials. Just the kind of programs you want for learning about the language.

Why not learn on any of a number of other simulated processors? The 8086 architecture is still dominant, and even though x86_64 isn’t exactly the same, there is a lot of commonalities. Besides, you have to pretend to be an 8086, at least through part of the boot sequence.

If you’d rather compile “real” programs, it isn’t that hard. There are some excellent tutorials available, too.

String Operations The Hard(ware) Way

One of the interesting features of the 8086 back in 1978 was the provision for “string” instructions. These took the form of prefixes that would repeat the next instruction a certain number of times. The next instruction was meant to be one of a few string instructions that operated on memory regions and updated pointers to the memory region with each repeated operation. [Ken Shirriff] examines the 8086 die up close and personal to explain how the 8086 microcode pulled this off and it is a great read, as usual.

In general, the string instructions wanted memory pointers in the SI and DI registers and a count in CX. The flags also have a direction bit that determines if the SI and DI registers will increase or decrease on each execution. The repeat prefix could also have conditions on it. In other words, a REP prefix will execute the following string instruction until CX is zero. The REPZ and REPNZ prefixes would do the same but also stop early if the zero flag was set (REPZ) or not set (REPNZ) after each operation. The instructions can work on 8-bit data or 16-bit data and oddly, as [Ken] points out — the microcode is the same either way.

[Ken] does a great job of explaining it all, so we won’t try to repeat it here. But it is more complicated than you’d initially expect. Partially this is because the instruction can be interrupted after any operation. Also, changing the SI and DI registers not only have to account for increment or decrement, but also needs to understand the byte or word size in play. Worse still, an unaligned word had to be broken up into two different accesses. A lot of logic to put in a relatively small amount of silicon.

Even if you never design a microcoded CPU, the discussion is fascinating, and the microphotography is fun to look at, too. We always enjoy [Ken’s] posts on little CPUs and big computers.