Playstation 2 Programming
Posted: March 16, 2006
11 years ago I was playing around with PS2DEV, writing Playstation 2 programs in C. Recently I added the full Playstation 2 Emotion Engine MIPS R5900 and vector unit instruction sets to naken_asm. I've been writing some sample programs in assembly and dropping them in the git repo in the /samples/playstation2/ directory. I figured I'd document what I'm doing in case I need to look at it later, or maybe it could help others learn Playstation 2 programming. I've also added some macros and definitions to naken_asm that might make life easier. I left all the old PS2DEV programs at the bottom of this page.
Below is rotation.asm from the samples directory running. The rotations and 3D projections are done with vector unit 1. Further down on this page is an explanation on how I got this working.
Most of the information I used for learning to write these sample programs came from http://lukasz.dk/. Lukasz has links to several old Playstation 2 demos including Duke's 3stars.asm demo. I used code from the 3stars.asm demo to figure out how to get the Playstation 2 video initialization working.
Several PDFs are also important and can be found at http://hwdocs.webs.com/ps2. These PDFs document the R5900 instruction set, the vector unit instruction sets, DMA, and the Graphics Synthesizer (GS) modules. Sony released these PDFs with the Playstation 2 Linux kit (which I got with my PS2).
The main CPU of the Emotion Engine is an R5900, which implements a MIPS instruction set with 128 bit wide registers. As pointers the registers are typically used as 32 bit, for math (shifts, adds, logic, etc) they can be used as 64 bit, and for load / store they can be used to move 128 bit data around or, with the R5900's own vector instructions, as 128 bit vector registers. Note the R5900 has its own vector instructions that are completely different from the VU0 and VU1 vector unit instruction sets. In some ways the R5900 vector instructions are more generic and more useful.
There is also an R3000 CPU running 10 times slower that can be used to run Playstation 1 software, but it's also the way the Playstation 2 does sound, DVD-ROM access, file I/O, gamepads, etc. The sound chip was upgraded with more memory and now has two cores instead of just one. The EE CPU communicates with this CPU (the IOP) through SIF (subsystem interface) DMA calls, but the IOP is also memory mapped to the EE (I believe this is mostly undocumented).
Posting all the code would probably make this page too long, so the best thing to do is link to rotation.asm in the naken_asm samples directory. The first part of this program, up to the "while_1:" main loop, is basically just setting up the video hardware. The screen is set up for a 640x448 interlaced display (the code looks like it's 224 lines, but since it's interlaced, the drawing area is 448). After the video hardware is set up, there are some drawing registers to set up. Unfortunately in the rotation.asm code this is mixed in with the screen clearing code, but in Java Grinder's output, the init code was separated out:
.align 128
_screen_init:
  dc64 GIF_TAG(13, 1, 0, 0, FLG_PACKED, 1), REG_A_D
  dc64 SETREG_FRAME(0, 10, FMT_PSMCT24, 0), REG_FRAME_1
  dc64 SETREG_FRAME(210, 10, FMT_PSMCT24, 0), REG_FRAME_2
  dc64 SETREG_ZBUF(420, 0, 0), REG_ZBUF_1
  dc64 SETREG_ZBUF(420, 0, 0), REG_ZBUF_2
  dc64 SETREG_XYOFFSET(1000 << 4, 1000 << 4), REG_XYOFFSET_1
  dc64 SETREG_XYOFFSET(1000 << 4, 1000 << 4), REG_XYOFFSET_2
  dc64 SETREG_SCISSOR(0,639,0,447), REG_SCISSOR_1
  dc64 SETREG_SCISSOR(0,639,0,447), REG_SCISSOR_2
  dc64 1, REG_PRMODECONT
  dc64 1, REG_COLCLAMP
  dc64 0, REG_DTHE
  dc64 SETREG_ALPHA(ALPHA_SRC, ALPHA_FB, ALPHA_SRC, ALPHA_FB, 0x80), REG_ALPHA_1
  dc64 SETREG_ALPHA(ALPHA_SRC, ALPHA_FB, ALPHA_SRC, ALPHA_FB, 0x80), REG_ALPHA_2
_screen_init_end:
Data sent from the main CPU to the Graphics Synthesizer is sent using these GIF packets. It's basically a 16 byte header along with x number of 16 byte [ 8 byte value, 8 byte register ] combinations. Well, at least in packed mode (as shown above) it's 8 bytes per value, 8 bytes to name a register. In reglist mode it can be done with just 8 bytes per value. An example of a reglist transfer can be found in reglist.asm. Starting at line 258, there are two GIF packets here, one to set the primitive register to a triangle and then another GIF packet that has 8 byte register values (the registers are defined in the packet header). The registers are described in the GS_Users_Manual.pdf file.
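To make the packet layout a little more concrete, here is a rough Python sketch (Python rather than assembly, only so the arithmetic is easy to run) of what a GIF_TAG() style macro might compute for the first 8 bytes of the header. The bit positions and the REG_A_D value are my reading of the GS manual and should be treated as assumptions; gif_tag() and packed_packet_bytes() are hypothetical helper names, not part of naken_asm.

```python
# Hypothetical helpers; field positions are assumptions taken from my
# reading of the GS manual, not copied from naken_asm.
FLG_PACKED = 0
REG_A_D = 0x0E  # assumed descriptor meaning "8 byte value + 8 byte register"


def gif_tag(nloop, eop, pre, prim, flg, nreg):
    """Pack the low 64 bits of a GIF tag from its fields."""
    return ((nloop & 0x7FFF)           # number of loops
            | ((eop & 1) << 15)        # end of packet flag
            | ((pre & 1) << 46)        # PRIM field enable
            | ((prim & 0x7FF) << 47)   # PRIM data
            | ((flg & 3) << 58)        # data format (packed, reglist, image)
            | ((nreg & 0xF) << 60))    # number of register descriptors


def packed_packet_bytes(nloop, nreg):
    """16 byte header plus one 16 byte entry per register per loop."""
    return 16 + 16 * nloop * nreg


tag = gif_tag(13, 1, 0, 0, FLG_PACKED, 1)
print(hex(tag), packed_packet_bytes(13, 1))
```

With 13 loops of a single A+D register, the packet is the 16 byte header plus 13 entries of 16 bytes each.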
GIF packets are typically sent to the GS through a DMA transfer. Sample source for this can be found in the rotation.asm file after the comment "Set up draw environment". Basically it just sets a size (how many 16 byte units need to be transferred), sets the address in main memory where the GIF packet is, and then sets D2_CHCR to 0x101 to kick off the transfer. One thing to be careful of here is that the DMA grabs from main memory, which means if the data cache of the MIPS chip holds data for this memory region that hasn't been written back to main memory yet, the DMA will send stale (old) data to the GS. Flushing the cache can either be done with a system call or (probably a lot more efficiently) with CPU instructions:
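The size arithmetic for the DMA can be sketched in Python like this. The names for the two bits in 0x101 are my assumption about what they mean (bit 0 as the direction, bit 8 as the start flag), and dma_qwc() is a hypothetical helper.

```python
# Assumed meaning of the two bits that make up the 0x101 written to D2_CHCR.
CHCR_DIR_FROM_MEMORY = 1 << 0
CHCR_STR_START = 1 << 8


def dma_qwc(start_addr, end_addr):
    """Number of 16 byte units between two labels in main memory."""
    size = end_addr - start_addr
    if size <= 0 or size % 16 != 0:
        raise ValueError("packet size must be a positive multiple of 16")
    return size // 16


# A packet with a 16 byte header and 13 entries of 16 bytes (224 bytes).
print(dma_qwc(0x100000, 0x100000 + 224))
print(hex(CHCR_DIR_FROM_MEMORY | CHCR_STR_START))
```

In the assembly this is the same idea as subtracting the packet's start label from its end label and dividing by 16.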
;; Flush cache with syscall
  lui $a0, 0
  li $v1, FlushCache
  syscall

;; Flush 128 bytes of the
;; cache with CPU
  li $v1, vif_packet_1_start
  sync.l
  cache dhwoin, 0($v1)
  cache dhwoin, 64($v1)
  sync.l
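For packets bigger than 128 bytes, one cache instruction is needed per cache line. A small Python sketch of working out which offsets are needed, assuming 64 byte cache lines (which is what the 0 and 64 offsets above imply); cache_line_offsets() is a hypothetical helper:

```python
CACHE_LINE = 64  # assumed line size, implied by the offsets 0 and 64 above


def cache_line_offsets(start, length):
    """Offsets (relative to start) that touch every cache line in the range."""
    if length <= 0:
        return []
    first = start - (start % CACHE_LINE)
    last = (start + length - 1) // CACHE_LINE * CACHE_LINE
    return list(range(first - start, last - start + 1, CACHE_LINE))


# An aligned 128 byte packet needs two cache instructions.
print(cache_line_offsets(0x100000, 128))
# An unaligned packet of the same size spans three cache lines.
print(cache_line_offsets(0x100010, 128))
```

This is one reason to keep packets aligned, as the .align directive in the init packet does.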
The GIF packet above is going to set up both drawing contexts. Basically a drawing context allows for 2 buffers so while the video chip is drawing one context to the TV screen, the user can be drawing new triangles on the second context. For Java Grinder this is how double buffering is being done. In this setup, being 640x448 interlaced, one context is drawing the even video beam lines and the second context is drawing the odd ones.
The FRAME_1/2 registers in the GIF packet set up the video frames. The first argument is the base memory location (all base memory locations are multiplied by 2048), the 10 means the frame buffer width is 640 pixels (640 / 64 = 10), and the pixel format sets the color depth: the FMT_PSMCT24 shown above stores 8 bits each of red, green, and blue, while FMT_PSMCT32 also stores an 8 bit alpha for a full 32 bit pixel.
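The two conversions for the FRAME arguments can be sketched in Python. The helper names are hypothetical, and I'm assuming the frame width has to be a multiple of 64 pixels:

```python
def frame_width_units(width_pixels):
    """Frame buffer width argument: the width in units of 64 pixels."""
    if width_pixels % 64 != 0:
        raise ValueError("width is assumed to be a multiple of 64")
    return width_pixels // 64


def frame_base_location(base):
    """Base argument multiplied by 2048, as the GS does internally."""
    return base * 2048


print(frame_width_units(640))    # width argument for a 640 pixel wide frame
print(frame_base_location(210))  # location of the second frame from the packet
```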
The ZBUF_1/2 registers set up the Z buffer which is described in more detail below.
The XYOFFSET_1/2 registers map the primitive drawing coordinates to the screen coordinates. When a triangle is drawn the (X,Y) coordinates can be anywhere from (0,0) to (4096,4096). Setting XYOFFSET to 1000 means that pixels inside the triangle being drawn that are not between (1000,1000) and (1640,1448) will be dropped. A pixel at (1000,1000) is mapped to (0,0) on the actual TV screen.
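As a quick Python sketch of that mapping, using the offset of 1000 and the 640x448 scissor area from the init packet (to_screen() and is_visible() are hypothetical names):

```python
OFFSET_X = 1000
OFFSET_Y = 1000
WIDTH = 640    # scissor covers 0..639
HEIGHT = 448   # scissor covers 0..447


def to_screen(x, y):
    """Map primitive coordinates to screen coordinates."""
    return (x - OFFSET_X, y - OFFSET_Y)


def is_visible(x, y):
    """True if the pixel survives the scissor test."""
    sx, sy = to_screen(x, y)
    return 0 <= sx < WIDTH and 0 <= sy < HEIGHT


print(to_screen(1000, 1000), is_visible(1000, 1000))
print(to_screen(1640, 1448), is_visible(1640, 1448))
```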
GIF Packet For Drawing A Triangle
Here is a sample GIF packet that would draw a triangle:
draw_triangle:
  dc64 GIF_TAG(7, 1, 0, 0, FLG_PACKED, 1), REG_A_D
  dc64 SETREG_PRIM(PRIM_TRIANGLE_STRIP, 1, 0, 0, 0, 0, 0, 0, 0), REG_PRIM
  dc64 SETREG_RGBAQ(0,255,0,0x80,0x3f80_0000), REG_RGBAQ
  dc64 SETREG_XYZ2(1800 << 4, 2000 << 4, 128), REG_XYZ2
  dc64 SETREG_RGBAQ(255,0,0,0x80,0x3f80_0000), REG_RGBAQ
  dc64 SETREG_XYZ2(1800 << 4, 2110 << 4, 128), REG_XYZ2
  dc64 SETREG_RGBAQ(0,0,255,0x80,0x3f80_0000), REG_RGBAQ
  dc64 SETREG_XYZ2(1900 << 4, 2110 << 4, 128), REG_XYZ2
Note since this is packed, the length of the packet is 7 (a single 128 bit [value, reg] to set the PRIM register to a triangle with gouraud shading turned on, and two 128 bit [value, reg] entries for each vertex). Since each vertex has its own color and gouraud shading is turned on, the Playstation 2 hardware will draw this triangle with each color at each vertex while filling in the rest of the triangle with colors in between. This can be seen in the triangle in the video above.
Another thing to note is the coordinates of the vertex. They are shifted left by 4 bits because the Playstation 2 expects vertex coordinates to be fixed point numbers. Each X and Y coordinate is 16 bits, so the upper 12 bits represent the whole part and the lower 4 bits are the fraction (value / 16).
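A Python sketch of the conversion, assuming each coordinate is a 16 bit value with 4 fraction bits (the helper names are hypothetical):

```python
def to_gs_fixed(value):
    """Convert a coordinate to fixed point with 4 fraction bits."""
    return int(round(value * 16)) & 0xFFFF


def from_gs_fixed(fixed):
    """Convert the fixed point coordinate back to a float."""
    return fixed / 16


print(to_gs_fixed(1800), 1800 << 4)  # whole numbers are just a shift
print(to_gs_fixed(1000.5))           # half a pixel sets the fraction bits to 8
print(from_gs_fixed(16008))
```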
There are several primitives including points, lines, line strip, triangles, triangle fan, triangle strip, and sprite.
Another interesting thing to mention is that the vertexes are, in an odd way, really only (x, y) for the position on the screen. The Z value is really the Z buffer value. The Playstation 2 hardware doesn't do transformations of an (x, y, z) 3D point to an (x, y) 2D image; that has to be done by the programmer. All the code I used to do the transformations runs in vector unit 1 (VU1). Read further below for more information on the Z buffer.
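To give an idea of what kind of math the programmer is responsible for, here is a generic rotation and perspective projection in Python. This is not the VU1 code from rotation.asm, just a textbook sketch; the viewing distance of 256 and the screen center of (1320, 1224) (the middle of a 640x448 area starting at offset 1000) are made up for illustration.

```python
import math


def rotate_y(point, angle):
    """Rotate an (x, y, z) point around the Y axis by angle radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)


def project(point, distance=256.0, center=(1320.0, 1224.0)):
    """Perspective projection of (x, y, z) onto primitive coordinates.

    Assumes z > 0 (the point is in front of the viewer).
    """
    x, y, z = point
    scale = distance / z
    return (center[0] + x * scale, center[1] + y * scale)


corner = rotate_y((100.0, 50.0, 0.0), math.radians(30))
# Push the point away from the viewer before projecting it.
moved = (corner[0], corner[1], corner[2] + 512.0)
print(project(moved))
```

The resulting (x, y) would then be converted to fixed point and placed in an XYZ2 entry, with Z used only for the Z buffer.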
There is a register in the GS called TEST_1 (and TEST_2 for the second context). This register can decide on a pixel by pixel basis if the current pixel needs to be drawn or should be discarded. If this register is not set up properly, most likely you will not see anything on the screen.
It seems most of the values in this register are for alpha, although I ended up not needing or using them. The important thing here is the Z buffer. The Z buffer is, in an odd way, a scratch pad area to assist in 3D drawing. As explained above, the Playstation 2 GS really doesn't know anything about 3D; it draws triangles based on the (x, y) coordinates for each vertex. Each vertex also gets a Z value though.
When a pixel is drawn, not only can the GS draw the pixel to the frame buffer, but it can also put the value of Z into the Z buffer. If the Z buffer already has a value there that is greater than the current pixel's Z value, using a setting in TEST_1/2 the pixel can be dropped.
This feature can be used to give depth to triangles.. hiding one behind another. Or even hiding parts of one behind another if the Z value of one vertex is lower than the currently drawn triangle, but the Z value of another vertex is higher.
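The test described above can be modeled in a few lines of Python. This is only a simulation of the idea with a tiny made-up frame buffer; it assumes the "keep the pixel if its Z is greater than or equal to what's stored" setting, and draw_pixel() is a hypothetical name.

```python
def draw_pixel(frame, zbuf, index, color, z):
    """Draw one pixel unless the Z buffer already holds a greater value."""
    if zbuf[index] > z:
        return False  # something with a higher Z is already there, drop it
    frame[index] = color
    zbuf[index] = z
    return True


frame = ["black"] * 4
zbuf = [0] * 4

draw_pixel(frame, zbuf, 1, "red", 128)   # drawn, the Z buffer was empty
draw_pixel(frame, zbuf, 1, "blue", 64)   # dropped, hidden behind red
draw_pixel(frame, zbuf, 1, "green", 200) # drawn, in front of red
print(frame, zbuf)
```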
Textures are basically just pictures that can be drawn on top of a set of triangles. Textures can be placed in memory as 16 bit images (Alpha = 1 bit, red = 5 bits, green = 5 bits, blue = 5 bits), 24 bit images, or 32 bit images. 32 bit images are identical to 24 bit except they have an alpha value (transparency).
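Converting a regular 8 bit per channel color to the 16 bit format could look like this in Python. I'm assuming red sits in the lowest 5 bits and the alpha bit is the top bit; check the GS manual before relying on that ordering. pack_16bit() is a hypothetical helper.

```python
def pack_16bit(red, green, blue, alpha_bit):
    """Pack 8 bit color channels into a 1:5:5:5 pixel (assumed bit order)."""
    return ((red >> 3)
            | ((green >> 3) << 5)
            | ((blue >> 3) << 10)
            | ((alpha_bit & 1) << 15))


print(hex(pack_16bit(255, 0, 0, 1)))      # pure red with the alpha bit set
print(hex(pack_16bit(255, 255, 255, 0)))  # white with the alpha bit clear
```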
Textures can have transparencies based on either their alpha value or there is a setting that can make any pixel that is black completely see through. More on that in the Alpha Channels section.
Textures are transferred from main memory to the Graphics Synthesizer in a GIF packet of type IMAGE using the BITBLT register in the GS. An example of this is in the texture.asm sample program.
Note that the texture here is sent in 2 GIF packets that are strung together using the EOP (end of packet) flag. The first GIF tag should have EOP set to 0 to let the GS know that there is another GIF packet that needs to be processed with this one. The GIF packet with the actual image has EOP set to 1.
The primitive that uses the texture needs, for every vertex, a description of where the image is drawn on the primitive. This ends up being a floating point value from 0.0 to 1.0 and is shown in the TRIANGLE_FAN primitive in texture.asm. This primitive is actually 2 triangles put together to make a square, and each vertex has an ST register value that tells the hardware how to stretch the image around the triangles. In this case since it's a square (the simplest case) I was able to use the values (1.0, 0.0), (0.0, 0.0), (0.0, 1.0), and (1.0, 1.0).
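One way to picture the ST values is as a fraction of the way across the image. The Python sketch below maps an (s, t) pair to a pixel in a 64x64 texture; the texture size is made up and the real hardware's sampling and rounding will differ, so this is only meant to show the idea.

```python
TEX_WIDTH = 64   # made-up texture size for illustration
TEX_HEIGHT = 64

# The four ST values used for the square, in the order given above.
FAN_ST = [(1.0, 0.0), (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]


def st_to_texel(s, t):
    """Map ST values in 0.0..1.0 to a pixel position in the texture."""
    x = min(int(s * TEX_WIDTH), TEX_WIDTH - 1)
    y = min(int(t * TEX_HEIGHT), TEX_HEIGHT - 1)
    return (x, y)


for s, t in FAN_ST:
    print((s, t), "->", st_to_texel(s, t))
```

Each corner of the square lands on a corner of the image, and everything in between gets stretched to fit.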
Alpha channels are a way to make pixels transparent or translucent / semitransparent. The Playstation 2 actually has a lot of options for doing alpha channels. The ALPHA register is what controls how alpha blending is done. In Java Grinder this defaults to (((RGB_SRC - RGB_FB) * ALPHA_SRC) >> 7) + RGB_FB. This means if the primitive's alpha channel and texture controls are set to alpha mode, the GS will take the RGB value of the texture, subtract what's currently in the frame buffer from it, multiply the difference by the texture's alpha value, divide by 128, and add the result back to the pixel currently in the frame buffer. Everything is done pixel for pixel and separately for R, G, and B. Without textures, only the alpha bit in the PRIM register should need to be set to turn on alpha channels. Each pixel can of course have its own alpha value.
For textures there are a few registers that can affect the texture's alpha values. First is TEXA. When AEM is set, any pixel that is black can be treated as either transparent or visible (I believe this depends on the TA0 and TA1 fields). Actually, I believe TA0 and TA1 allow specific alpha values to be set whether the pixel is black or not. When AEM is 0, in 16 bit mode TA0 and TA1 set the alpha value for the pixel based on whether the A bit of the pixel is set or not. In 24 bit mode TA0 will set the alpha value for all the pixels in the texture.
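The default blend described above is easy to try out in Python. The clamp to 0..255 at the end is my addition to keep the result a valid color; I'm not claiming the hardware rounds exactly the same way. The function names are hypothetical.

```python
def blend_channel(src, fb, alpha):
    """Blend one color channel: (((src - fb) * alpha) >> 7) + fb."""
    result = (((src - fb) * alpha) >> 7) + fb
    return max(0, min(255, result))


def blend_pixel(src_rgb, fb_rgb, alpha):
    """Apply the blend separately to R, G, and B."""
    return tuple(blend_channel(s, f, alpha) for s, f in zip(src_rgb, fb_rgb))


print(blend_pixel((200, 100, 0), (100, 100, 100), 0x80))  # alpha 0x80: all source
print(blend_pixel((200, 100, 0), (100, 100, 100), 0x40))  # alpha 0x40: half way
print(blend_pixel((200, 100, 0), (100, 100, 100), 0x00))  # alpha 0: frame buffer
```

An alpha of 0x80 acts like 1.0, which is why the RGBAQ entries in the triangle packet use 0x80.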
To get textures to have alpha channels working, I had to set TCC in the TEX0 register for 16 bit textures. I actually haven't tried 24 bit alpha channels at all yet.
I believe the TEXFLUSH register needs to be used to synchronize drawing of a texture, but I haven't used it yet.
The vector units in the Playstation 2 are quite... unique. There are two of them. One (called VU0) is connected to the MIPS core and one (VU1) is connected to the Graphics Synthesizer. Both vector units are VLIW (very long instruction word) which means (in this case) each instruction is 64 bits wide, where the 64 bits are composed of two 32 bit instructions that get executed in parallel. In the source code, the left instruction is called the upper instruction and the instruction to the right of it is the lower. The upper instructions do most of the FPU vector math (32 bit * 4 registers) and the lower do more integer work (16 bit registers).
A neat feature of the Playstation 2 vector unit that I haven't seen with other vector instruction sets is the math instructions allow the user to decide which parts of the register get updated. For example if the following upper / lower instructions were coded:
  sub.xyzw vf03, vf01, vf02    nop
  sub.xyz  vf04, vf01, vf02    nop
If the starting values of the 128 bit vector registers were:
  vf01 = [ 3.0, 4.0, 5.0, 6.0]
  vf02 = [ 1.0, 1.0, 1.0, 1.0]
  vf04 = [ 7.0, 8.0, 9.0, 3.0]
After the 2 instructions execute (note that the lower instruction is just a nop, since there's nothing interesting that can be done while these instructions run), the ending values of the registers are:
  vf03 = [ 2.0, 3.0, 4.0, 5.0]
  vf04 = [ 2.0, 3.0, 4.0, 3.0]
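The same example can be modeled in Python to show how the field mask works. vu_sub() is a hypothetical function that only imitates the behavior of the masked subtract; fields not named in the mask keep the destination's old value.

```python
FIELDS = "xyzw"


def vu_sub(dest, a, b, mask="xyzw"):
    """Subtract b from a, updating only the fields named in mask."""
    return [a[i] - b[i] if FIELDS[i] in mask else dest[i] for i in range(4)]


vf01 = [3.0, 4.0, 5.0, 6.0]
vf02 = [1.0, 1.0, 1.0, 1.0]
vf03 = [0.0, 0.0, 0.0, 0.0]
vf04 = [7.0, 8.0, 9.0, 3.0]

vf03 = vu_sub(vf03, vf01, vf02, "xyzw")
vf04 = vu_sub(vf04, vf01, vf02, "xyz")  # w keeps its old value
print(vf03, vf04)
```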
Some things to be careful of:
VU0 instructions can be called directly from the MIPS core as long as the VU0 is sitting idle. VU0 has 4k of code memory and 4k of data memory. To run a program the MIPS core needs to load the VU0 code memory with a subroutine (can be done with DMA or directly through memory mapping) and call that subroutine. The MIPS core can also put data in the VU0 data memory and read it back (again both with DMA or memory mapped).
A fairly simple VU0 program can be found in the Java Grinder repository: test_vu0.asm. This program expects the MIPS core to write to its 4k data memory segment the number of pixels to modify (divided by 4, since VU0 will modify 4 pixels (aka 16 bytes) at a time) and the color to set those pixels to. When the MIPS core kicks the VU0 to start, this program will load the count into the vi01 register, and into vf01 it will load the 32 bit pixel color in all 4 components (x, y, z, and w). The vi03 register is used to point at the next 4 pixels to write to. The program ends with a nop[E] instruction telling the vector unit to halt.
VU1 works the same way, with some extra instructions. VU1 has, for example, an xgkick instruction that can send a GIF packet directly to the GS. Java Grinder uses VU1 to take in a GIF packet from the MIPS core, do rotations on vertexes, 3D projections, and translations. It then kicks the new GIF packet directly to the GS.
The drawing contexts allow a kind of.. page flipping to be done. This means that while the GS is reading from memory and drawing context 1 to the screen, the GS can be drawing primitives / textures to the memory of context 2. Since the display is interlaced, in Java Grinder context 1 is the even lines being drawn on the screen and context 2 is the odd. Without this feature a primitive could be drawn on the screen at a location where the video beam is already done drawing, so maybe half of the primitive shows up, or maybe none of it shows up on the screen at all.
The SIF is barely mentioned at all in the Sony PDFs. I think their thinking was that the Playstation 2 devkit would have a high level C API around playing sound, I/O, gamepad controllers, etc. This ended up being really frustrating to figure out. I figured out how to get some code uploaded and executed on the IOP by looking through the source code of 3 different Playstation 2 emulators. I left sample code for what I did in the iop_example.asm sample program. It ended up that basically the code had to be sent to the IOP through DMA channel 6 (SIF1) with an EE DMA tag and an IOP DMA tag (the 16 extra bytes that come before the R3000 code).
One interesting thing I did find out by looking through the emulators was that the IOP's RAM is mapped into the EE's address space at location 0xbc010000. This means uploading a sound file to the IOP can be done without the SIF. It also appears that the SPU2 and the SPU2 DMA channel in the IOP are memory mapped to the EE.
One more thing I'm noticing while trying to run iop_example.asm and Java Grinder code on a real Playstation 2: it seems that the SIF1 DMA registers are causing a memory error... even just setting them all to 0 to "reset" them. Also, accessing 0xbc010000 causes a memory error in user mode. It seems that memory region is only available in kernel mode. I did end up getting the whole thing working in Playstation2.cxx in the function add_spu_functions() in the assembly function _upload_sound_data without using the IOP or SIF (or DMA). The way it works is, the SPU2 FIFO is loaded with up to thirty-two 16 bit pieces of data, then the register at 0x019a (the SPU2 control register) is set for manual transfer, which causes the FIFO to empty out. The register at 0x0344 is a status register where the bit at bitmask 0x0400 lets the program know the FIFO is busy.
The reason I was trying to get the SIF to work was to get a sound file to play. There is an SPU2 PDF that explains how all the registers work, but oddly doesn't tell you the addresses. I got a list of the register addresses from both the PS2SDK source code (iop/include/spu2regs.h) and the PCSX2 source code (plugins/spu2-x/src/regs.h). The trick here was to look at the block diagrams.. and read the text. The part I kind of missed was the fact there are two SPU2 cores and SPU2 core 0 feeds its output into SPU2 core 1. In order to get sound to play from core 0, SPU2 core 1's mixer and volume registers must be set.
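The flow of that upload loop can be sketched in Python with a fake SPU2 standing in for the real registers. FakeSpu2 and its methods are entirely hypothetical and exist only to make the sketch runnable; the real code writes to the FIFO and to the registers at 0x019a and 0x0344 directly.

```python
FIFO_WORDS = 32  # the FIFO takes up to thirty-two 16 bit values at a time


def fifo_chunks(samples):
    """Split the sound data into FIFO sized pieces."""
    return [samples[i:i + FIFO_WORDS] for i in range(0, len(samples), FIFO_WORDS)]


class FakeSpu2:
    """Stand-in for the SPU2 hardware, purely illustrative."""

    def __init__(self):
        self.fifo = []
        self.memory = []

    def write_fifo(self, word):
        self.fifo.append(word & 0xFFFF)

    def start_manual_transfer(self):
        # Real code would set the control register at 0x019a here.
        self.memory.extend(self.fifo)
        self.fifo = []

    def is_busy(self):
        # Real code would test bitmask 0x0400 of the register at 0x0344.
        return False


def upload_sound_data(spu2, samples):
    """Fill the FIFO, kick off the transfer, wait, and repeat."""
    for chunk in fifo_chunks(samples):
        for word in chunk:
            spu2.write_fifo(word)
        spu2.start_manual_transfer()
        while spu2.is_busy():
            pass


spu2 = FakeSpu2()
upload_sound_data(spu2, list(range(70)))
print(len(spu2.memory), [len(c) for c in fifo_chunks(list(range(70)))])
```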
PS2DEV Homebrew (March 2006)
I've recently started learning how to program Sony's Playstation 2 console. This has been some of the most fun programming I've done in a while. This is my first time even dealing with 3D graphics on this level. To get these programs running I got a modified Playstation 2 (with a DMS4) that would boot CD-RWs with my software. I also have a USBExtreme kit which boots homebrew. These demos also run on the PCSX2 emulator.
The SDK and Documentation
I decided to start with simple C until I understand the hardware better and then move to assembly language. I downloaded a script from Dan Peori's http://www.oopo.net/consoledev/ site that downloaded all the components and patches I needed and installed the development kit. This guy has tons of great examples and documentation on programming the Playstation 2. Another great site for Playstation 2 programming is http://www.ps2dev.org/.
This is my first simple test program which draws 3 moving triangles on the screen and my name spinning in 3D in the center of the screen. This is for the most part a modified version of Dan Peori's introduction to 3D on the PS2.
http://www.youtube.com/watch?v=DhdJ39Y01nU. Kind of a crappy video :(. I should have moved the camera closer... my name and the weird hand-drawn smiley face in there are hardly visible. The Toxic logo came from the DMS4 and wasn't a part of my programming.
Download Source and ELF: test2-2006-03-16.tar.gz
ps2mandel mandelbrot generator
This program will draw a Mandelbrot on the screen using code from my old Mandel Server program. I wrote this because a few years ago I tried using my mandelserver program on Playstation 2 Linux, but for some reason it was extremely slow. I think the floating point may have been done in software or something. This time it's quite fast actually, although not quite fast enough to be real time. After booting this program, the joystick pad and left joystick can be used to rotate the image. The right joystick will move inside the Mandelbrot and the X button will zoom in.
http://www.youtube.com/watch?v=QbjIsUzE9cA. The Toxic logo came from the DMS4.
Download Source and ELF: ps2mandel-2006-03-16.tar.gz
Copyright 1997-2018 - Michael Kohn