Posts from: October '11
A wise guy once said, „Assembly language programming is an extravagant waste of human talent and should be avoided whenever possible“.
While the previous week I was playing around with SimNow, ucbench and assembly language for x86, this weekend (partly as a tribute to the recently deceased master, Dennis Ritchie (R.I.P., dmr!)), I went for a higher-level language on a lower-level CPU: C for PIC.
That Peter Norton quote at the top is soooooo true¹ when applied to my last PIC project; it is partly finished, but there's still a lot of not-very-hard-but-obtrusively-massive code left to be written. Thus, the mentioned waste comes easily to mind, and the temptation to avoid it is very hard to resist. So, to start clean, I went from scratch, on a new (but easier) project, new breadboard and all. I'm using the HI-TECH C compiler, and this is the first test program that I wrote:
Click (H.264 format; opens fine in VLC).
Whether the ways of C can be successfully applied to the "big" project remains to be checked, as even the tiny program in the video uses some 10% of the code memory and around 10% of the data memory. Yet, there's hope - I wasn't sure I'd have the mental persistence to write that amount of assembly :)
There was a fun moment with the electronics I bought: I needed a good 3.3V low-dropout regulator, and, after some research, I settled on the MCP1703. I bought it along with some other parts, but didn't check the purchases until I got home. I didn't see the regulator in the bag at first, and thought they must have forgotten to put it in. Then... I saw it... It's just unreal: I was expecting something like this, while it turned out like that instead (see the arrow)... You should have seen what it's like to solder pins to that thing! Having completed this feat, I bet I could now solder wires to a grain of sand, too... :)
¹ even though that smart ass didn't write even a single line of anything, except assembly, for the first few versions of Norton Commander :)
I decided to write a tutorial about AMD SimNow™, which I bumped into recently. While the tool is nice, the introductory articles about it are somewhat lacking, so you just RTFM, RTFM and RTFM until your head explodes. I found some of the things quite nonintuitive, hence this tutorial.
So, SimNow is, in a nutshell, a complete simulator of a PC (even an imaginary one). If you were told in your CS classes that CPUs are just software (which is correct), and that they are tested (by software) for hundreds of thousands of hours before they are implemented on actual silicon (also correct), then this is one of the programs that does this sort of simulation.
SimNow works on a very low level - analyzing, for example, how the CPU-to-memory transfers are going (the exact speed and timings are emulated as well), or how HyperTransport is performing, etc.
The price to be paid for that kind of thoroughness is speed: e.g., a real 3200 MHz CPU emulates an (imaginary) 2.0 GHz part at an emulation speed of around 50-100 MIPS. So the simulation moves slower than real time (unlike other technologies such as DOSBox, VMware, etc.). The emulated CPU is assumed to be fixed-speed, and since simulating it at full speed is impossible, time "stretches" from the host's viewpoint: a "sleep 1s" would take 10 seconds (wall time) if the current emulation speed is 200 MIPS (roughly a tenth of what a 2.0 GHz part would do at about one instruction per cycle). So, if you're watching a movie inside the emulator, it will "play" without a hitch, but it will display quite a bit slower. Other emulators/virtualization solutions usually go the other way - there, the guest movie player will notice the CPU is slow and activate frame dropping, etc.
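The time stretching can be put into numbers with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the 2000 MIPS figure assumes the 2.0 GHz guest retires about one instruction per cycle, which is what makes the "10 seconds at 200 MIPS" example above work out.

```python
def wall_seconds(guest_seconds, guest_mips, emulation_mips):
    """Host wall-clock time needed to simulate `guest_seconds` of guest time."""
    if emulation_mips <= 0:
        raise ValueError("emulation_mips must be positive")
    # How many times slower than the (fixed-speed) guest the simulation runs.
    slowdown = guest_mips / emulation_mips
    return guest_seconds * slowdown


# Assumption: a 2.0 GHz guest at ~1 instruction per cycle, i.e. ~2000 MIPS.
GUEST_MIPS = 2000.0

if __name__ == "__main__":
    for mips in (50, 100, 200):
        stretched = wall_seconds(1.0, GUEST_MIPS, mips)
        print(f"{mips:>4} MIPS: a guest 'sleep 1' takes about {stretched:.0f} s of wall time")
```

Under the same assumption, the 50-100 MIPS range quoted above would mean a 20x to 40x stretch.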
That said, you'd guess that using SimNow requires a hell of a lot of waiting. Thus, you're better off using a lightweight and optimized guest OS. Creating a virtual HDD image is not very easy to do, so the simpler way is to mount .ISO CD images and run one of the lightweight LiveCD Linux distros - I used Puppy Linux 5.28. This OS fits the task quite well: it doesn't complain about a CPU it has never heard of, it boots in a reasonable time (still 10 minutes, but that's considered quick), and after booting everything works fast enough to be usable. You'd probably be annoyed by the lack of mouse cursor sync between guest and host - some of the issues can be mitigated by twiddling with Puppy's mouse sensitivity settings.
SimNow allows you to "assemble" any machine. Just drop in the hardware components from the catalogue (video cards, southbridges, CPUs,...). However, tying them all together can be a tedious task. An easier way is to just load and run one of the prefabricated "configurations" which come with the installation. For example, the Bulldozer is implemented in only one config, named vp_bd_phase1.bsd.
There's still one hitch you need to resolve before emulating your LiveCD: the BIOS's boot order has the HDD before any CDs, and you'd get an "Invalid (or missing) operating system" error if you launch it right away; you need to go inside the BIOS (which is... well..., real) and fix that. Restart and voilà!
The next part is not well documented and took me quite a while to figure out. It's about how to connect our emulated machine to the world. SimNow doesn't support Copy/Paste between guest and host, and it has no "shared folders" thing, but it has virtual networking. The latter is implemented in a way that may seem striking and overcomplicated at first, but there's a reason behind it. From the guest's side, it is simple: it just sees an E1000 network adapter. The host, on the other hand, expects to have a connection to a "mediator" server process (which has to be run manually). The mediator is the actual "exit point" of the emulated traffic to the world. I.e., if the guest decides to ping google.com, the packet is first passed to the emulated E1000 interface, then SimNow tunnels the packet to the mediator (which can be situated on a different machine if you will), and then the packet appears on the mediator's side, potentially after a NAT step, or directly, as if it originated from a bridged adapter.
The bridged scenario is the easiest, so I'll describe that. Before running SimNow, you need to launch the mediator program; the arguments are "-p <port>" (e.g., "./mediator -p 8888"). The mediator uses libpcap, and thus requires superuser privileges. After that, launch your simulation, and use the command prompt of SimNow (this is usually the console where you launched ./simnow from) to instruct it to connect to your mediator. Use "setMediatorHost localhost:8892" (the number needs to be <port> + 4). After that, enter "linkConnect down" and "linkConnect auto" - this will tell the E1000 that "the cable is connected" (in our case the "cable" goes to our real LAN).
This establishes the communication, so you can start working with your VM. Yet, I had some issues, for an unknown reason: I was unable to ping the guest from the host, even though other machines in the same LAN had no problems!
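Since the "+ 4" port offset is easy to get wrong, here is a small Python sketch that just prints the commands for a given mediator port. The command strings are the ones quoted above; the helper names are hypothetical, and the script does not talk to SimNow or the mediator in any way, it only builds the text you would type.

```python
# Per the description above: SimNow must be pointed at <port> + 4.
MEDIATOR_PORT_OFFSET = 4


def mediator_command(port):
    """Shell command that starts the mediator (needs superuser privileges)."""
    return f"./mediator -p {port}"


def simnow_console_commands(port, host="localhost"):
    """Commands to enter at the SimNow console, in order."""
    return [
        f"setMediatorHost {host}:{port + MEDIATOR_PORT_OFFSET}",
        "linkConnect down",
        "linkConnect auto",
    ]


if __name__ == "__main__":
    port = 8888
    print("On the host, as root:")
    print("  " + mediator_command(port))
    print("Then, at the SimNow console:")
    for line in simnow_console_commands(port):
        print("  " + line)
```

If the mediator runs on another machine, pass its name as `host` instead of the default `localhost`.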
Thus, to copy a file to the guest, I first uploaded the file to some other machine in my LAN, and then downloaded it from the guest with wget. Luckily, Puppy has both wget and unzip :)
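For the "other machine in my LAN" step, any web server will do. One possible way (an assumption for illustration, not necessarily the setup used here) is Python's built-in http.server, run on the intermediate machine from the directory that holds the file. The function name and the port 8000 are arbitrary choices.

```python
import functools
import http.server
import threading


def serve_directory(directory, port=8000, bind=""):
    """Serve `directory` over plain HTTP in a background thread.

    Returns the server object; call .shutdown() on it to stop serving.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((bind, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server


if __name__ == "__main__":
    import time

    srv = serve_directory(".", port=8000)
    print("Serving the current directory on port", srv.server_address[1])
    try:
        while True:
            time.sleep(1)
    except KeyboardInterrupt:
        srv.shutdown()
```

On the guest, the file can then be fetched with wget from http://<address-of-that-machine>:8000/<file-name>, where both placeholders depend on your LAN.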
After all that info, you might be wondering - why would you need to do all that? Well, in my case, that was the only way to test my XOP instruction implementation of ucbench - without actually buying a Bulldozer, which was (at the time) kinda impossible, given it was before its launch date! The vp_bd_phase1.bsd config supports both XOP and FMA4, at their actual speeds. As an example of the usefulness of the complete simulation (which SimNow does) - when I benchmarked my new code on Puppy under SimNow, it showed a performance increase of around 50%. At first, I thought this was a bit too much, but then the real Bulldozer showed the same figure, so the simulation was pretty accurate!
One more thing: the simulator works on 64-bit AMD processors only (it's explicitly stated in the reqs). Interestingly enough, the situation is more or less the same in the Intel camp, only it's unofficial: Intel also has a CPU emulator for yet-to-be-released processor cores, which I tried (before the Sandy Bridge launch, around March). The requirements just list "an Intel-compatible processor". Well, if you are brave enough to run it on an AMD CPU, it appears to start and works for some time without complaining, and then it crashes "mysteriously" with a segfault. The same program, on a 14-times-weaker Intel CPU (Atom vs. Thuban), works without a hitch :D. Cross-company backstabbing ftw :]