<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DevBlog &#187; SAM Coupe</title>
	<atom:link href="http://simonowen.com/blog/category/sam-coupe/feed/" rel="self" type="application/rss+xml" />
	<link>http://simonowen.com/blog</link>
	<description>Stuff and nonsense</description>
	<lastBuildDate>Thu, 17 Jun 2010 22:39:54 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Space Invaders emulator</title>
		<link>http://simonowen.com/blog/2009/12/10/space-invaders-emulator/</link>
		<comments>http://simonowen.com/blog/2009/12/10/space-invaders-emulator/#comments</comments>
		<pubDate>Thu, 10 Dec 2009 20:56:18 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[Emulator]]></category>
		<category><![CDATA[Release]]></category>
		<category><![CDATA[SAM Coupe]]></category>

		<guid isPermaLink="false">http://simonowen.com/blog/?p=122</guid>
		<description><![CDATA[I thought it was about time I added the Space Invaders &#8220;emulator&#8221; (binary port?) to my website, as I&#8217;d not touched it in over 3 years. Most of the work to get it running was done, with just sound and display rotation left to add. While mulling over the tricky display code I moved on [...]]]></description>
			<content:encoded><![CDATA[<p>I thought it was about time I added the Space Invaders &#8220;emulator&#8221; (binary port?) to my website, as I&#8217;d not touched it in over 3 years.  Most of the work to get it running was done, with just sound and display rotation left to add.  While mulling over the tricky display code I moved on to other projects and it was pretty much forgotten about.</p>
<p>It&#8217;s still unfinished but I&#8217;ve cleaned up the code, prepared a bootable disk, and refreshed myself on the technical details.  It was an interesting contrast to the Pac-Man project I&#8217;d worked on previously.  As before, the challenge was to modify as little of the original ROM as possible, with a virgin copy of the ROM patched at runtime.</p>
<p><strong>CPU</strong></p>
<p>The <em>Space Invaders</em> arcade machine uses an <a href="http://en.wikipedia.org/wiki/Intel_8080">Intel 8080</a> CPU running at just under 2MHz.  The Z80 was released 2 years after the 8080 and was designed to be object-code compatible, so the Invaders code runs on SAM (almost) unmodified.  The Z80 also added many new features, including: IX/IY index registers, alternate register sets, multiple interrupt modes, CB/ED extended instruction sets, and the relative jump instructions <strong>JR [cc]</strong> and <strong>DJNZ</strong>.</p>
<p>The 8080 has a single interrupt mode equivalent to the Z80&#8217;s IM0, where an instruction is supplied on the bus at interrupt time.  The Invaders hardware supplies both <strong>RST 08</strong> and <strong>RST 10</strong> instructions at a frequency of 60Hz, which drive the overall game logic, including the attract screen.  SAM lacks the extra hardware, but they can both be simulated using IM2 and a line interrupt, without modifying the ROM.</p>
<p>I/O ports 1 to 6 are used for coin and button inputs, as well as a hardware bit-shifter circuit.  The shifter takes a 16-bit value (written to port 4 in low/high order), and a left-shift count (written to port 2).  Reading from port 3 returns just the high byte of the result &#8212; more on this later.</p>
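<p>As a rough illustration, the shifter described above can be modelled in a few lines of Python.  This is only a sketch of the behaviour as stated here: the class and method names are made up, and limiting the shift count to 3 bits is an assumption.</p>

```python
class BitShifter:
    """Model of the Invaders shift hardware as described above."""

    def __init__(self):
        self.value = 0   # 16-bit shift register
        self.count = 0   # left-shift count

    def write_count(self, n):
        # Port 2: assume only the low 3 bits are used as the shift count.
        self.count = n & 0x07

    def write_data(self, byte):
        # Port 4: the previous high byte drops to the low byte and the new
        # byte becomes the high byte, so two writes load low then high.
        self.value = (self.value >> 8) | ((byte & 0xFF) << 8)

    def read_result(self):
        # Port 3: high byte of the 16-bit value after shifting left.
        return ((self.value << self.count) >> 8) & 0xFF
```

<p>Writing &#038;AB then &#038;CD to port 4 with a shift count of 4 would read back as &#038;DA from port 3 in this model.</p>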
<p>As we&#8217;re running the ROM code natively, trapping the I/O requires patching the instructions that make the requests.  The only I/O instructions supported by the 8080 are <strong>IN A,(n)</strong> and <strong>OUT (n),A</strong>, which include the port number as an immediate operand.  This allows us to use a simple loop to find and patch instructions that access ports 1 to 6 (later checked manually to ensure no false-positive matches).  Each occurrence is replaced by a <strong>RST 08</strong> instruction, with the original operand modified to include a flag indicating whether the original instruction was IN or OUT.  We could have used separate RST calls for each, but that requires duplicating the RST handler and modifying more of the original ROM.</p>
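<p>The patch loop might look something like the sketch below, written in Python rather than Z80 for readability.  The opcodes are the standard 8080/Z80 encodings, but the choice of bit 7 as the IN/OUT flag and the function name are assumptions made for the example, not details taken from the real patcher.</p>

```python
IN_A_N = 0xDB    # IN A,(n)
OUT_N_A = 0xD3   # OUT (n),A
RST_08 = 0xCF
OUT_FLAG = 0x80  # hypothetical flag bit marking an OUT in the operand


def patch_io(rom, ports=range(1, 7)):
    """Return (patched_rom, offsets) with port accesses replaced by RST 08."""
    patched = bytearray(rom)
    hits = []
    for i in range(len(rom) - 1):
        opcode, port = rom[i], rom[i + 1]
        if opcode in (IN_A_N, OUT_N_A) and port in ports:
            patched[i] = RST_08
            patched[i + 1] = port | (OUT_FLAG if opcode == OUT_N_A else 0)
            hits.append(i)
    return bytes(patched), hits
```

<p>The returned offsets are what would be checked by hand, since a raw byte scan like this can also match data or the operands of other instructions.</p>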
<p>Since we&#8217;re simulating the interrupt calls, we have control over how the original <strong>RST 08</strong> and <strong>RST 10</strong> handlers are invoked.  The ROM code for both starts with 4 register push instructions, which can be moved to our own interrupt handler, freeing the space for our I/O hook.</p>
<p><strong>DISPLAY</strong></p>
<p>Space Invaders uses a monochrome bitmapped display with a linear layout, similar to SAM&#8217;s mode 2.  The display resolution is 224&#215;256, but like most portrait arcade games the display hardware works in landscape mode.  Fitting the 256&#215;224 (rotated) area on SAM&#8217;s 256&#215;192 screen means we lose 4 character columns from the width of the play area.</p>
<p>As with SAM&#8217;s mode 2 (and the Spectrum), drawing to a non-character aligned position requires bit shifting of data.  Invaders uses this for more control over the vertical position of the invaders, as well as the smooth scrolling of player and invader bullets.  The hardware shifting circuit makes easy work of this, which is a good thing considering the slow CPU speed!  That said, the invader pack does only move one invader at a time, keeping the per-frame drawing to a minimum.</p>
<p>The Invaders display is stored at &#038;2400-3fff, which isn&#8217;t compatible with the 16K boundary requirement for SAM&#8217;s mode 2.  That means redirecting ALL display writes to a suitable upper memory location; something difficult to do from a centralised point in the code.  About the only option is to identify ROM routines accessing the display and provide alternative implementations.</p>
<p>Copying the first 6K of Invaders display to a SAM mode 2 screen in upper memory confirmed the game was running, but revealed another issue &#8212; the bit order within display bytes was reversed compared to SAM, requiring each byte be flipped before writing.  The byte rotation could be avoided by rotating the display in the opposite direction, but that would leave scanline rows in reverse order, requiring a much larger display mapping table to correct.</p>
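<p>One common way to flip the bit order is a 256-entry lookup table, sketched here in Python.  Whether the SAM code uses a table or shifts the bits at runtime is not stated, so treat this as one possible approach.</p>

```python
# 256-entry lookup table: REVERSED[b] is b with its 8 bits in mirror order.
REVERSED = bytes(int(f"{b:08b}"[::-1], 2) for b in range(256))


def flip_block(data):
    """Mirror the bit order of every byte in a block of display data."""
    return bytes(data).translate(REVERSED)
```

<p>On a Z80 the same table would typically sit on a 256-byte boundary, so a display byte can be used directly as the low byte of the lookup address.</p>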
<p>To map the display accesses to a SAM-compatible location we offset the high byte of the address.  Subtracting an additional 2 from this value also pulls the display up (well, left!) by two columns, centralising the game area on the SAM display.  This clips a character from each side of the title area, and half an invader at the left and right edges, but it&#8217;s only a small difference.  The movement range for the player turret is more limited so it&#8217;s unaffected.</p>
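<p>The address mapping can be sketched as follows.  The SAM screen address of &#038;8000 is purely an assumed value for the example; the Invaders display address and the extra offset of 2 come from the text above.</p>

```python
INVADERS_SCREEN = 0x2400   # start of the arcade display RAM
SAM_SCREEN = 0x8000        # hypothetical mode 2 screen address
SAM_SCREEN_SIZE = 0x1800   # 192 lines of 32 bytes
CENTRE_PAGES = 2           # extra high-byte offset: 2 * 256 bytes = 16 lines

HIGH_OFFSET = (SAM_SCREEN >> 8) - (INVADERS_SCREEN >> 8) - CENTRE_PAGES


def map_address(addr):
    """Map an Invaders display address to SAM, or None if it is clipped."""
    mapped = addr + (HIGH_OFFSET << 8)   # only the high byte changes
    if SAM_SCREEN <= mapped < SAM_SCREEN + SAM_SCREEN_SIZE:
        return mapped
    return None
```

<p>With these values the first and last 512 bytes of the 7K Invaders display fall outside the 6K SAM screen, which corresponds to the two clipped character columns on each side.</p>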
<p>The game now looked great, but play-testing revealed some issues.  When the invader pack reaches the edge of the display it&#8217;s supposed to lower and turn back, but that wasn&#8217;t happening.  Also, player bullets were passing through the invaders without hitting them.  It turned out that collision detection was done by checking the display contents, but it was still reading from the original display location.  Hooking an extra couple of routines to look at the new display area soon fixed that.</p>
<p>A final change was to add a splash of colour to match the original machine.  As the video hardware didn&#8217;t support colour, cellophane strips were added to areas of the monitor: green for lives, bases and player turret, red for the flying saucer at the top.  An equivalent effect can be achieved in the SAM version using blocks of mode 2 attributes, which are unaffected by the display data writes.</p>
<p>Rotating the display to the normal SAM orientation remains a challenge.  My original approach was to apply rotation and scaling to each display write, preserving the original layout.  That meant scaling/masking/combining each byte, so the iconic graphics would suffer some scaling distortion.  A better approach might be to relocate some areas of the display, as I did with the score and fruit areas in my Pac-Man emulator.  It still requires rotation, but only within simple 8 pixel blocks.  Writes from some hook reimplementations could also be optimised for full block writes.</p>
<p><strong>SOUND</strong></p>
<p>The sound effects in the original game are generated using analogue circuits rather than a sound chip, which makes them difficult to emulate in a traditional sense.  Most Space Invaders emulators use sound samples taken from the original machine instead.  I haven&#8217;t implemented the sound yet, but will attempt to create approximate effects with the SAM sound chip.</p>
<p>The source code and bootable disk image are <a href="http://simonowen.com/sam/invaders/">now available</a> on my website, but you&#8217;ll need to provide your own Space Invaders ROM image.</p>
]]></content:encoded>
			<wfw:commentRss>http://simonowen.com/blog/2009/12/10/space-invaders-emulator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SAM/IP</title>
		<link>http://simonowen.com/blog/2007/12/11/samip/</link>
		<comments>http://simonowen.com/blog/2007/12/11/samip/#comments</comments>
		<pubDate>Tue, 11 Dec 2007 23:27:35 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[SAM Coupe]]></category>
		<category><![CDATA[Trinity]]></category>

		<guid isPermaLink="false">http://simonowen.com/blog/2007/12/11/samip/</guid>
		<description><![CDATA[The SAM port of uIP seems to be on hold at the moment, so I&#8217;ve been looking at other IP stacks to use until it&#8217;s ready. The most appealing is Mark Rison&#8217;s CPC/IP, not least because it&#8217;s written in Z80 and should work without extensive changes. It also comes with a number of built-in client [...]]]></description>
			<content:encoded><![CDATA[<p>The SAM port of <a href="http://www.sics.se/~adam/uip/index.php/Main_Page">uIP</a> seems to be on hold at the moment, so I&#8217;ve been looking at other IP stacks to use until it&#8217;s ready.  The most appealing is Mark Rison&#8217;s <a href="http://www.cepece.info/cpcip/">CPC/IP</a>, not least because it&#8217;s written in Z80 and should work without extensive changes.  It also comes with a number of built-in client (telnet, finger, host, ping) and server (web, tftp, dns) modules.</p>
<p>So far I&#8217;ve modified the source so it assembles with <a href="http://www.intensity.org.uk/samcoupe/pyz80.html">pyz80</a>.  A global search and replace made quick work of changing the label format from &#8220;.label&#8221; to &#8220;label:&#8221;, but I had to change many data statements manually.  Strings were often combined with other single bytes in defb statements, but the Comet format used by pyz80 doesn&#8217;t allow that, requiring defm be used instead.</p>
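<p>For illustration, the label change is the sort of thing a single regular expression can handle.  This Python sketch assumes label definitions always start in the first column with a dot, which may not hold for every line of the real source.</p>

```python
import re

# A label in the original source: a dot at the start of a line, then a name.
LABEL = re.compile(r"^\.([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)


def convert_labels(source):
    """Rewrite '.label' definitions as 'label:'."""
    return LABEL.sub(r"\1:", source)
```

<p>The data statements are harder to convert mechanically, which fits with them needing manual changes.</p>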
<p>The existing code is nice and modular, but there are CPC-specific ROM calls sprinkled throughout it.  All those need to be changed before a test run on SAM, to avoid us unexpectedly jumping into the middle of nowhere!  I changed the stdio.z module to use SAM&#8217;s ROM calls for character output, leaving the cursor control and keyboard input doing nothing for now.  I also replaced the serial module with a dummy ethernet module, with no-op versions of the required interface functions.  They will be fleshed out with calls to the Trinity driver when the rest of the code is ready.</p>
<p>Those changes are enough for a basic run on SAM, without being connected to a real network.  Here&#8217;s what you see when it&#8217;s launched:<br />
<a href='http://simonowen.com/blog/wp-content/uploads/2007/12/cpcip.png' title='CPC/IP on SAM'><img src='http://simonowen.com/blog/wp-content/uploads/2007/12/cpcip_small.png' alt='CPC/IP on SAM' /></a></p>
<p>The program continues polling for serial and keyboard input in a main loop.  The CPC version uses a 300Hz timer to poll for new data from the serial link, which is buffered for later reading from the main loop.  Each received byte is passed into either the SLIP or PPP module (whichever was configured during the build), which builds up complete datagrams.  These are then passed into the &#8216;ip_handle&#8217; function inside ip.z for processing.</p>
<p>The SAM implementation will read complete datagrams from the Trinity driver, so they can be passed straight into &#8216;ip_handle&#8217;.  This vastly simplifies the setup above, but introduces a new requirement: ARP.  SLIP and PPP push datagrams back into the link and let the remote end deal with routing.  With ethernet we need to determine the hardware addresses for delivery, for both local and routed traffic.</p>
<p>I&#8217;ll need to write a new arp.z module to sit between the Trinity driver and the IP module.  Outgoing traffic for hosts already in the ARP cache can be sent immediately.  Anything for as-yet-unknown targets must be buffered, and a who-has ARP request made for the address owner to reply.  Once a reply is received, an entry for it is added to the local ARP cache and data buffered for that host is sent.  If no reply is received (ideally after multiple attempts), data for the target will be discarded.  We must also reply to incoming ARP requests for our own address so other hosts can talk to us.</p>
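<p>The outgoing side of that logic could be sketched like this in Python.  The class, the callback names and the retry limit of 3 are all invented for the example; the real module will be Z80 and is likely to be structured quite differently.</p>

```python
class ArpLayer:
    """Toy ARP cache with a queue of datagrams awaiting resolution."""

    def __init__(self, send_frame, send_request, max_tries=3):
        self.send_frame = send_frame      # callable(mac, datagram)
        self.send_request = send_request  # callable(ip): who-has broadcast
        self.max_tries = max_tries
        self.cache = {}                   # ip -> mac
        self.pending = {}                 # ip -> [tries, queued datagrams]

    def send(self, ip, datagram):
        if ip in self.cache:
            self.send_frame(self.cache[ip], datagram)
        elif ip in self.pending:
            self.pending[ip][1].append(datagram)
        else:
            self.pending[ip] = [1, [datagram]]
            self.send_request(ip)

    def on_reply(self, ip, mac):
        self.cache[ip] = mac
        _, queued = self.pending.pop(ip, (0, []))
        for datagram in queued:
            self.send_frame(mac, datagram)

    def on_timeout(self, ip):
        entry = self.pending.get(ip)
        if entry is None:
            return
        if entry[0] >= self.max_tries:
            del self.pending[ip]          # give up and discard the data
        else:
            entry[0] += 1
            self.send_request(ip)
```

<p>Answering incoming who-has requests for our own address is left out of the sketch, as it needs no cache lookup.</p>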
<p>I&#8217;m still torn between using the SAM ROM routines for I/O and something based on the terminal code I wrote for the <a href="http://simonowen.com/blog/2007/03/19/apple-1-emulator/">Apple 1 emulator</a>.  The ROM code would give the same output flexibility as in BASIC, but the general ROM code is a bit on the slow side.  My own code could be tailored for a specific mode, either mode 2 for speed or mode 3 for hi-res.  It might be easier to stick with the ROM code for now, and change it if it&#8217;s too slow.</p>
]]></content:encoded>
			<wfw:commentRss>http://simonowen.com/blog/2007/12/11/samip/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Trinity Ethernet</title>
		<link>http://simonowen.com/blog/2007/11/30/trinity-ethernet/</link>
		<comments>http://simonowen.com/blog/2007/11/30/trinity-ethernet/#comments</comments>
		<pubDate>Fri, 30 Nov 2007 00:09:54 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[SAM Coupe]]></category>
		<category><![CDATA[Trinity]]></category>

		<guid isPermaLink="false">http://simonowen.com/blog/2007/11/30/trinity-ethernet/</guid>
		<description><![CDATA[After a break of a few months, I&#8217;m almost back on the development wagon. I did the odd project tweak during that time but haven&#8217;t spent any quality time working on new features. Last month I picked up one of the first Quazar Trinity boards. Since then I&#8217;ve been working on the ethernet side, [...]]]></description>
			<content:encoded><![CDATA[<p>After a break of a few months, I&#8217;m almost back on the development wagon.  I did the odd project tweak during that time but haven&#8217;t spent any quality time working on new features.</p>
<p>Last month I picked up one of the first Quazar Trinity boards.  Since then I&#8217;ve been working on the ethernet side, which is based around a Microchip ENC28J60 chip.  The Trinity board also includes EEPROM and MMC/SD board features, but I&#8217;m leaving those for another time.  My first task was to write a simple network driver, to allow sending and receiving raw packets from BASIC.</p>
<p>Trinity uses the SAM port range &#038;DC to &#038;DF.  The first of these is the microcontroller, which acts as a central hub for all the board&#8217;s features.  The other ports are used for the EEPROM, Ethernet and MMC/SD card, and each needs to be enabled through the microcontroller before it can be used.  Port &#038;DE is used for the ENC ethernet chip, and once enabled we can read and write to the chip directly.  Well, almost directly as the link uses the SPI bus.</p>
<p>If you&#8217;re as clueless about electronics as I am you probably won&#8217;t have come across the SPI (Serial Peripheral Interface) bus.  It&#8217;s a full duplex link where each byte written is paired with a read back from the device.  Since reads can&#8217;t be performed without a write, Trinity stores the value read for later.  Reading from SAM reads only the stored value, without accessing the ENC.</p>
<p>SPI introduces a lag between writing a value and reading any result generated by the write, since the stored value is what was read <em>before</em> the write completed.  An additional dummy (zero) write is needed for the actual result to be available for reading.  The lag also means block reads require a dummy write before reading each byte.  Fortunately, the latest Trinity firmware provides an auto-null-writing feature to simplify and optimise this.</p>
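<p>The read lag is easier to see in a small model.  In this Python sketch the <code>EchoDevice</code> is a made-up SPI slave that answers each byte with the previous byte plus one; it stands in for the ENC only to show the timing, and none of the names are from the real driver.</p>

```python
class EchoDevice:
    """Stand-in SPI slave: answers each byte with the previous byte + 1."""

    def __init__(self):
        self.next_out = 0x00

    def exchange(self, byte_in):
        byte_out = self.next_out
        self.next_out = (byte_in + 1) & 0xFF
        return byte_out


class SpiPort:
    """Port that latches whatever was clocked in during the last write."""

    def __init__(self, device):
        self.device = device
        self.latch = 0x00

    def write(self, byte):              # like OUT to the port
        self.latch = self.device.exchange(byte)

    def read(self):                     # like IN from the port: no SPI traffic
        return self.latch


def query(port, command):
    port.write(command)                 # latch now holds a stale byte
    port.write(0x00)                    # dummy write clocks the answer in
    return port.read()
```

<p>Reading straight after the command write returns the stale value; only after the dummy write does the latch hold the response to the command.</p>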
<p>The ENC itself has a banked register setup, arranged as 4 banks of 32 registers.  The final 5 registers in each bank are common across all banks, and are used for status registers and bank selection.  All ENC features are accessed through these registers, including reading and writing from the internal 8K data buffer.  The buffer is used for both transmitting and receiving, with a user-defined portion of it configured as a circular receive buffer.  The remaining space is unmanaged and available for transmission storage.</p>
<p>In its power-on state the ENC will see but not receive anything.  It has no hardware address set, no space allocated for the receive buffer, and the packet filter is set to ignore everything.  The driver initialisation is responsible for setting up all of those, and any other register where the defaults are not suitable.  Before we do that it&#8217;s wise to ask the Trinity microcontroller to reset the ENC chip back to a known state.</p>
<p>I started my experimentation from BASIC as it was quicker to tweak the ENC registers and see results than launching the assembler for each change.  Colin supplied a sample disk with macros to access the board, with most containing a couple of OUTs and maybe an IN.  I added to them for higher level functions, such as setting the MAC address and writing blocks of data to the ENC buffer.  Once I was happy this was working I was ready to port it to Z80.</p>
<p>I chose to use 6.5K of the 8K buffer for receiving, with 1.5K left for sending.  That&#8217;s just enough space to send a single full-size ethernet frame.  The packet filter was set to receive packets addressed to our MAC address, as well as anything broadcast to the whole subnet.  Writing a zero to the packet filter register disables it, so all local network traffic is seen.  Couple that with packet decoding and you have an easy network sniffer.</p>
<p>My driver development wasn&#8217;t all smooth sailing, with a few bumps along the way.  The first was my early attempts to write and read the MAC address values, to ensure my new Z80 code was working.  It turns out the subset of ENC registers starting with &#8216;M&#8217; (which includes the MAC registers) have an extra lag on top of SPI, and require double-reading before they return the correct result.  I was also stung by a documented ENC issue with the transmit logic getting stuck under certain conditions.  A bug in my work-around meant I would still occasionally hang during transmits.</p>
<p>Even with the driver initialised and reception enabled, we&#8217;re still not quite ready to handle a test ping from another machine on the network.  Responding to requests requires CRC calculations in the return packets, which involved more work than I wanted to do for a test setup.  That will be the job of a full IP stack.  It&#8217;s marginally easier to send a ping request <em>from</em> SAM, since the request can be pre-calculated and it&#8217;s only the remote host that needs to worry about dynamic responses.</p>
<p>Even pinging an IP address from SAM is surprisingly involved:</p>
<ol>
<li>Use local IP and netmask to determine whether target IP is on our subnet (if not, send to gateway machine for further routing)</li>
<li>Check local ARP cache for the target IP (if found, goto 5)</li>
<li>Broadcast who-has ARP request to find the MAC of the IP</li>
<li>Wait for ARP reply, then add MAC to local ARP cache</li>
<li>Construct ECHO REQUEST ICMP packet</li>
<li>Send unicast packet to target MAC</li>
</ol>
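<p>Step 1 in the list above amounts to masking both addresses and comparing them.  Here is a minimal Python sketch using the standard <code>ipaddress</code> module; the function name is invented, and on SAM this would just be a few AND and compare instructions.</p>

```python
import ipaddress


def next_hop(local_ip, netmask, gateway, target_ip):
    """Return the IP whose MAC is needed to deliver a packet to target_ip."""
    network = ipaddress.ip_network(f"{local_ip}/{netmask}", strict=False)
    if ipaddress.ip_address(target_ip) in network:
        return target_ip     # same subnet: deliver directly
    return gateway           # otherwise hand it to the gateway
```

<p>The address returned is the one that steps 2 to 4 then need to resolve to a MAC.</p>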
<p>Fortunately, we can strip this down for the sake of a simple test.  We&#8217;re using a local target so step 1 is unnecessary.  We can also hard-code the MAC of the target machine, to also skip steps 2 to 4.  An ICMP ECHO request packet can then be constructed with fixed details and pre-calculated CRCs.  I used Ethereal on my PC to sniff a request sent with a zero CRC, which was expected to fail, then completed the correct CRC with what it reported.</p>
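<p>For reference, the check value in the ICMP header is the 16-bit ones&#8217; complement Internet checksum from RFC 1071 rather than a true CRC.  The Python sketch below shows how the pre-calculated value could be produced; the identifier and sequence numbers passed in are arbitrary example values.</p>

```python
import struct


def internet_checksum(data):
    """16-bit ones' complement sum of big-endian words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd lengths
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def echo_request(ident, seq, payload=b""):
    """Build an ICMP ECHO REQUEST (type 8, code 0) with its checksum set."""
    header = struct.pack("!BBHHH", 8, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, ident, seq) + payload
```

<p>Running the same checksum over a correctly filled-in packet gives zero, which is how a receiver validates it.</p>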
<p>To send a reply, the target machine will perform the same steps as above, with an ICMP ECHO REPLY packet.  As SAM is currently unable to reply to ARP requests we must use the &#8220;arp&#8221; command on the target machine to add a static entry linking SAM&#8217;s IP with its MAC address.  In my case that meant running the following command in Windows XP:</p>
<p><code>arp -s 10.0.0.88 02:A4:92:E4:D3:20</code></p>
<p>The test MAC address I used was formed from bits of the string &#8220;TRINITY&#8221;, with a few unused zero bits at the end.  Bits 0 and 1 of the first byte are flags, but the rest can be pretty much anything.  I&#8217;ve set flag bit 1 to mark the address as &#8220;locally administered&#8221;, to avoid the (rather unlikely!) clash with existing network devices.  To avoid clashes with other Trinity boards, Colin will be assigning unique addresses to each one sold.  For convenience, the MAC and other network settings will ultimately be stored on the EEPROM.</p>
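<p>The exact packing is not spelled out above, but encoding each letter as a 5-bit value (A=1 to Z=26) after the flag byte, and padding with zero bits, does reproduce the address from the <code>arp</code> command.  The Python sketch below shows that guess alongside the two flag tests; all function names are made up.</p>

```python
def mac_from_text(text, first_byte=0x02):
    """Pack letters as 5-bit codes (A=1) after the flag byte; pad with zeros."""
    bits = "".join(f"{ord(c) - ord('A') + 1:05b}" for c in text.upper())
    bits = bits.ljust(40, "0")[:40]            # 5 remaining bytes
    tail = int(bits, 2).to_bytes(5, "big")
    return bytes([first_byte]) + tail


def is_locally_administered(mac):
    return bool(mac[0] & 0x02)                 # flag bit 1 of the first byte


def is_multicast(mac):
    return bool(mac[0] & 0x01)                 # flag bit 0 of the first byte
```

<p>Seven letters use 35 of the 40 available bits, leaving 5 zero bits at the end.</p>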
<p>The rigid setup above was enough to show that I could ping my PC from SAM, and have the echo reply read from the receive buffer.  What we needed now was a proper IP stack to plug my driver into&#8230;</p>
<p>While I was working on the driver, Adrian Brown was busy porting Adam Dunkels&#8217; <a href="http://www.sics.se/~adam/uip/index.php/Main_Page">uIP</a> stack from C to Z80.  He&#8217;s made quick work of it too, with ARP and ICMP already working well enough to ping from PC to SAM without the need for any of my cheating (ping times are typically 7-8ms).  Once TCP is ready we&#8217;ll have enough for some real applications!  Web server anyone?</p>
]]></content:encoded>
			<wfw:commentRss>http://simonowen.com/blog/2007/11/30/trinity-ethernet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SID Player v1.1</title>
		<link>http://simonowen.com/blog/2007/06/07/sid-player-v11/</link>
		<comments>http://simonowen.com/blog/2007/06/07/sid-player-v11/#comments</comments>
		<pubDate>Thu, 07 Jun 2007 22:29:10 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[SAM Coupe]]></category>
		<category><![CDATA[sidplay]]></category>

		<guid isPermaLink="false">http://simonowen.com/blog/2007/06/07/sid-player-v11/</guid>
		<description><![CDATA[I&#8217;ve updated SAM SID Player to version 1.1, addressing some issues with the original version: Updated 6502 core The recent core enhancements mean it&#8217;s now possible to trap SID writes from all instructions, without the need for hard-coded checks. Control register re-triggering now works correctly in all tunes rather than just the few previously supported [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve updated SAM SID Player to version 1.1, addressing some issues with the original version:</p>
<ol>
<li><strong>Updated 6502 core</strong><br />
The recent core enhancements mean it&#8217;s now possible to trap SID writes from <em>all</em> instructions, without the need for hard-coded checks.  Control register re-triggering now works correctly in all tunes rather than just the few previously supported cases.</p>
<p>Unfortunately, limited program space prevents the full 65C02 core being used, so the extra instructions have been replaced by NOPs of the appropriate size.  This is still better than the previous behaviour of failing if an undocumented instruction was encountered.  The updated core also includes a bug fix to the indexed indirect addressing using X, which wasn&#8217;t performing the indirect lookup correctly.</li>
<li><strong>Additional playback rates</strong><br />
The previous version supported only 50Hz playback using SAM&#8217;s frame interrupt.  This worked well with most tunes (taken from PAL C64 titles), but it made 60Hz NTSC tunes (such as <em>Fairlight</em>) sound sluggish, and anything requiring 100Hz or above sounded awful.</p>
<p>Generating 60Hz on a 50Hz machine is a bit of a challenge, requiring synchronisation to 6 different points across the frame, advancing to the <em>previous</em> point in the <em>next</em> frame to achieve the correct playback rate.  In our case it also needs to work without stealing too much CPU time from the 6502 emulation running in the background.  The 6 sync points divide the 312 raw display lines into 52-line segments.  The first point is simply the frame interrupt, which is nice and easy.  With a 1-line adjustment, the final 4 points fall on the main screen area, and can be synchronised to using line interrupts at screen lines: 35, 87, 139 and 191.</p>
<p>That just leaves the second point at display line 52, which is 68-52=16 lines above the main screen area.  Busy-looping from the frame interrupt would waste 1/6 of the total frame time, and the point is too early for a line interrupt&#8230; but not for another technique.  MIDI writes are output at a fixed 31.25Kbaud, and generate an interrupt to signal when the transfer has completed, even if there&#8217;s no device present to receive the write.  Using a cascading sequence of MIDI writes starting from the frame interrupt, we can regain control at the required point without having to wait for it.  There is some interrupt processing overhead, but any remaining time is free for the main 6502 emulation.</p>
<p>A number of SID tunes also use the C64 programmable timer to generate custom speeds, which can be used to make the playback speed independent of PAL/NTSC model.  50/60Hz timers are supported the same way as PAL/NTSC tunes, as described above.  100Hz is used by a few tunes, and can be supported by adding a single line interrupt in the middle of the frame (312/2-68 = line 88).</p>
<p>The tune playback speed is detected automatically, using the <em>speed</em> bit array in the SID tune header and the active C64 timer frequency, with 50Hz used for other cases.  If the playback rate is close enough to one of the supported speeds then it will be used instead.  You can also override the playback speed with the following keys: 1=100Hz, 5=50Hz and 6=60Hz.</li>
<li><strong>Large tune support</strong><br />
To simplify relocating the SID tune, the previous version required the tune be loaded at 49152 with a maximum size of 16K.  This could be expanded to 28K by allowing the tune to be loaded directly after the 4K player code at 36864.  That <em>still</em> doesn&#8217;t give enough space to load the 49K <em>Ghouls n Goblins</em> SID, which fills most of the available C64 RAM.</p>
<p>The new version now works with tunes up to the full 64K, including those that span the I/O area from &#038;d000-dfff (which is where the SID player code runs).  On the first playback the tune is relocated to the correct address, with subsequent plays using the existing player to save time.  As with the previous version, a fresh copy of the SID player code is copied for each playback, to minimise the risk of tune players overwriting parts of it.</li>
<li><strong>Keyboard control tweaks</strong><br />
The new version adds a mask for keys to ignore during playback, allowing the caller to limit the key selection causing the player to terminate.  This allows the Next key to be ignored when there is no next tune to play, etc.</li>
</ol>
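<p>The timing arithmetic behind the 60Hz playback rate in item 2 can be checked with a short Python sketch.  The line counts come from the text above; the MIDI byte count additionally assumes the usual 10 bits on the wire per byte (start, 8 data, stop), so treat that figure as an estimate.</p>

```python
LINES_PER_FRAME = 312      # raw display lines in a 50Hz frame
TOP_BORDER = 68            # display lines above the main screen area
POINTS = 6                 # sync points per frame for 60Hz playback


def sync_points():
    """Display line of each sync point: 0, 52, 104, ..."""
    step = LINES_PER_FRAME // POINTS
    return [i * step for i in range(POINTS)]


def tick_lines(ticks):
    """(frame, display line) of successive 60Hz ticks.

    60Hz ticks are 5/6 of a frame apart, so each tick lands one sync
    point earlier than the last one, in the following frame.
    """
    period = LINES_PER_FRAME * 50 // 60        # 260 lines
    return [divmod(n * period, LINES_PER_FRAME) for n in range(ticks)]


def midi_bytes_for(lines, baud=31250, bits_per_byte=10):
    """Complete MIDI byte times that fit within `lines` display lines."""
    line_time = 1 / (50 * LINES_PER_FRAME)     # seconds per display line
    byte_time = bits_per_byte / baud
    return int(lines * line_time / byte_time)
```

<p>Subtracting the 68-line border and the 1-line adjustment from the last four sync points gives screen lines 35, 87, 139 and 191, and about 10 cascaded MIDI writes cover the 52 lines up to the second point.</p>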
<p>I&#8217;ve updated the <a href="http://simonowen.com/sam/sidplay/">sidplay page</a> with the new source code, which can be assembled directly to a disk image using <a href="http://www.intensity.org.uk/samcoupe/pyz80.html">pyz80</a>.</p>
<p>You can also download a <a href="http://simonowen.com/sam/sidplay/sidplay.zip">sample disk</a> (175K) containing 37 sample SID tunes.  You&#8217;ll need a Quazar <a href="http://www.samcoupe.com/hardsid.htm">SID interface</a> board for your SAM to hear anything, of course!</p>
]]></content:encoded>
			<wfw:commentRss>http://simonowen.com/blog/2007/06/07/sid-player-v11/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Apple 1 emulator</title>
		<link>http://simonowen.com/blog/2007/03/19/apple-1-emulator/</link>
		<comments>http://simonowen.com/blog/2007/03/19/apple-1-emulator/#comments</comments>
		<pubDate>Mon, 19 Mar 2007 07:46:17 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[Emulator]]></category>
		<category><![CDATA[Release]]></category>
		<category><![CDATA[SAM Coupe]]></category>

		<guid isPermaLink="false">http://blog.simonowen.com/2007/03/19/apple-1-emulator/</guid>
		<description><![CDATA[This will probably be my last emulator for a while so I can return to normal projects. I&#8217;d wanted to emulate the Apple 1 for quite a while, and didn&#8217;t think it should take more than a couple of hours to make a usable emulator. The Apple 1 is a surprisingly simple device, with 1MHz [...]]]></description>
			<content:encoded><![CDATA[<p>This will probably be my last emulator for a while so I can return to normal projects.  I&#8217;d wanted to emulate the <a href="http://en.wikipedia.org/wiki/Apple_1">Apple 1</a> for quite a while, and didn&#8217;t think it should take more than a couple of hours to make a usable emulator.</p>
<p>The Apple 1 is a surprisingly simple device, with a 1MHz 6502 CPU, 4K RAM (expandable to 32K) and a tiny 256-byte monitor ROM.  Slots on the main board allowed for add-on ROMs for BASIC, cassette functions, assembler, etc.  The <a href="http://simonowen.com/sam/apple1emu/a1man.pdf">user manual</a> includes comprehensive hardware details plus a <em>fully commented</em> disassembly of the monitor ROM.</p>
<p>There aren&#8217;t many original Apple 1 devices anymore, but there are a few modern replicas available.  Most of them use the 65C02 CPU, so it&#8217;s probably just as well I added support for it recently!  The original ROMs don&#8217;t use the extended instructions but a 3rd party assembler supports them, so it was worth having them covered.</p>
<p>Input and output is via a dumb terminal style interface, supporting only upper-case letters and a slightly cut-down symbol set.  I/O speed is tied to the terminal display, giving a 60 characters/second maximum on the original.  Both input and output have data and control ports, with the latter used to indicate whether the terminal is busy outputting a character or has a key available for input.</p>
<p>For the emulation, the limited terminal output speed means at most 1 character (plus cursor) needs to be drawn each interrupt.  Until that is processed the terminal appears busy and the running program will wait before outputting more.  SAM&#8217;s 50Hz interrupt frequency reduces the output speed slightly, but not by enough to worry about.  Adding a line interrupt in the middle of the frame (312/2-68 = line 88) doubles this to 100Hz very easily, so Sym-1 and Sym-2 can be used to change the terminal speed.</p>
<p>When a character is written or a key is read, the terminal must update the control ports with the new status.  This must happen as part of the read/write to prevent the running program doing anything further until it has been processed.  The 6502 core was enhanced to trap memory writes for the Orao emulator, so the display could be updated immediately.  The Apple 1 emulation also needs the same enhancement for memory <em>reads</em>, so it can update the input control port.</p>
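<p>A simplified model of the trapped terminal ports is sketched below in Python.  The register addresses follow the Apple 1 manual, where the display busy flag reads back as bit 7 of the display data register; the class and method names are invented, and the real emulator does this in Z80 inside the 6502 core hooks.</p>

```python
KBD, KBDCR, DSP = 0xD010, 0xD011, 0xD012   # Apple 1 PIA registers


class Terminal:
    def __init__(self):
        self.keys = []          # typed characters waiting to be read
        self.pending = None     # character waiting to be drawn
        self.screen = []        # characters drawn so far

    def read(self, addr):
        """Called by the 6502 core on a trapped memory read."""
        if addr == KBDCR:
            return 0x80 if self.keys else 0x00   # bit 7: key available
        if addr == KBD:
            # Reading the key also clears the key-available status.
            return (ord(self.keys.pop(0)) | 0x80) if self.keys else 0x80
        if addr == DSP:
            return 0x80 if self.pending is not None else 0x00  # bit 7: busy
        return 0x00

    def write(self, addr, value):
        """Called by the 6502 core on a trapped memory write."""
        if addr == DSP:
            self.pending = chr(value & 0x7F)     # busy until next interrupt

    def interrupt(self):
        """One call per frame/line interrupt draws at most one character."""
        if self.pending is not None:
            self.screen.append(self.pending)
            self.pending = None
```

<p>Because the status changes inside the trapped access itself, a program polling the port sees the terminal as busy straight after its write, with no window in which it could output a second character.</p>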
<p>The output terminal is 40&#215;24 characters, giving a maximum character set size of 6&#215;8 pixels for a 256&#215;192 mode 2 SAM screen mode.  Mode 3 would have allowed up to 12&#215;8 thin pixels, but there wasn&#8217;t really much benefit from the extra resolution, and the 24K display was 4 times slower to scroll.  Perhaps the only drawback in using mode 2 is that Spectrum-style masking and rotating is needed to draw each character.</p>
<p>Input is entirely character based, and doesn&#8217;t need support for multiple simultaneous key presses (just Shift for symbols).  For that reason I decided to use SAM&#8217;s ROM keyboard scanner rather than rolling my own version.  All I had to do was page the ROM in and ask for the next key, with the ROM debouncing the input and buffering fast typing.  The returned key symbols are then converted to Apple 1 keys, converting lower-case to upper-case, and mapping a few special keys including Delete, Tab and Escape.</p>
<p>Normal monitor use runs at 100% speed, as most of the time is spent waiting for key presses.  To test the underlying speed I ran an empty loop in BASIC:  FOR X=1 TO 1000 : NEXT X  which takes 1.2 seconds on the original device and 10 seconds in my emulator.  That works out at about 12% of full speed, in line with the 10-15% seen for the Orao emulator, and probably applies to anything else that uses my 6502 core.</p>
<p>The completed emulator (plus source code) is <a href="http://simonowen.com/sam/apple1emu/">now available</a> on my website.</p>
]]></content:encoded>
			<wfw:commentRss>http://simonowen.com/blog/2007/03/19/apple-1-emulator/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Orao emulator</title>
		<link>http://simonowen.com/blog/2007/03/05/orao-emulator/</link>
		<comments>http://simonowen.com/blog/2007/03/05/orao-emulator/#comments</comments>
		<pubDate>Mon, 05 Mar 2007 23:46:01 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[Emulator]]></category>
		<category><![CDATA[Release]]></category>
		<category><![CDATA[SAM Coupe]]></category>

		<guid isPermaLink="false">http://blog.simonowen.com/2007/03/05/orao-emulator/</guid>
		<description><![CDATA[This emulator started as a quick test of my 6502 core, to see if it could run the Orao ROMs. I half expected it to fail due to lack of decimal mode or interrupt support, neither of which were implemented in the SID player core. It took just 20 minutes of hacking the SID player [...]]]></description>
			<content:encoded><![CDATA[<p>This emulator started as a quick test of my 6502 core, to see if it could run the <a href="http://en.wikipedia.org/wiki/Orao_(computer)">Orao</a> ROMs.  I half expected it to fail due to lack of decimal mode or interrupt support, neither of which were implemented in the <a href="http://simonowen.com/sam/sidplay/">SID player</a> core.  It took just 20 minutes of hacking the SID player source code to reach the point where I could see the flashing input cursor, and it would have been a crime not to continue&#8230;</p>
<p>Keyboard input was transplanted from the matrix scanning in the <a href="http://simonowen.com/sam/galemu/">Galaksija emulator</a>, though due to the weird memory mapping layout, the Orao table needs 3-byte (address+value) entries for each key.  The bulk of the addresses were taken from the <a href="http://www.foing.hr/~fng_josip/orao.htm">Windows Orao emulator</a> source, though there were a few minor errors that I&#8217;ve corrected (one of them stopping Up working in Boulder Dash).</p>
<p>I considered updating the display during interrupt processing, but the large (256&#215;256 = 8K) display size was too much to do <em>every</em> frame.  Splitting the frame into 8 or 16 strips would have reduced the impact on the CPU emulation, but the piecemeal updates would have been too obvious and laggy.  It seemed better to update the display live by catching writes to the display memory.  Unlike native running Z80-based emulators, we have full control of the 6502 CPU and can filter the writes as they happen.</p>
<p>One approach was to modify any instruction that could write to the display, but that would require a lot of duplicate code.  Fortunately, each of the writes formed the target address in HL, where it remained until the point it jumped back to main_loop (next instruction fetch).  I simply had to define a new looping point, and use that instead of main_loop for any display write candidates.  Zero-page writes (&#038;0000-&#038;00ff) couldn&#8217;t affect the display (&#038;6000-&#038;7fff), so they were ignored.</p>
<p>Display writes were filtered using a simple:<br />
<code>LD   A,H<br />
CP   &#038;60<br />
JR   NC,screen_or_up</code><br />
Using JR instead of JP meant the fall through case was only 7 tstates instead of 10 tstates.  The total display write checking overhead was 4+7+7 = 18 tstates (plus contention) for normal RAM writes, which didn&#8217;t seem <em>too</em> bad.  Further address filtering could also be done for sound writes at &#038;8800, without further slowing of the normal RAM write path.  No other addresses were of interest to us, so they were ignored.</p>
<p>At first glance the Orao display seems perfectly suited to SAM&#8217;s mode 2 layout, with both using linear addressing, 32 bytes per line, and 8 pixels per byte.  The biggest difference is Orao&#8217;s 256 lines vs SAM&#8217;s 192, where clipping or scaling of the display would be needed.  Unfortunately, the bit order within each Orao display byte is also reversed compared to SAM, ruling out a simple memory copy to update the display.</p>
<p>Lookup tables to the rescue!  Using a 256-byte table for the bit-reversing was a no-brainer, but the display mapping was more awkward.  My first thought was to use a line mapping table, mapping from Orao line to SAM line, with &#038;c0 entries for lines that weren&#8217;t visible.  That still required too much arithmetic to look up an address, then add the line offset from the low 5 bits of the original address.  Whatever I used would be done for <em>every byte</em> written to the display, so it had to be fast.</p>
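<p>As a rough illustration of the bit-reversing table, here is a minimal Python sketch rather than the emulator's Z80 code.  The names <code>REV_TABLE</code> and <code>reverse_bits</code> are made up for this example.</p>

```python
# Build a 256-entry table where entry i holds i with its 8 bits mirrored,
# so a single indexed read replaces any per-bit shifting at run time.
REV_TABLE = bytes(int(f"{i:08b}"[::-1], 2) for i in range(256))


def reverse_bits(value):
    """Return the byte with bit 7 swapped with bit 0, bit 6 with bit 1, and so on."""
    return REV_TABLE[value]
```

<p>Reversing a byte twice gives back the original value, which makes for a quick sanity check of the table.</p>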
<p>The 12K needed for the mode 2 display meant there wasn&#8217;t room in the normal 64K address space, so I was already paging to access it.  That left over 16K of spare space in the 32K paging window.  Rather than looking up display lines, I had enough space to map every byte on the Orao display to the final SAM address.  This also gave the flexibility needed to pan any 192-line view of the original 256-line display, and even to scale the original display to fit, without any additional overhead.</p>
<p>As with the 6502 instruction handler addresses, the display table was ordered with address low bytes in the lower half and the address high bytes in the upper half.  That allowed a SET/RES instruction to switch halves during the lookup, which is twice the speed of using add/sub on the high byte instead.  Orao display bytes outside the visible area are mapped to SAM line 192, just beyond the visible display.</p>
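<p>The table layout described above can be sketched in Python along these lines.  This is only an illustration under assumptions: the screen base address, the <code>top</code> parameter choosing which 192 lines are shown, and all function names are hypothetical, and the real table lives in paged SAM memory rather than in two byte arrays.</p>

```python
BYTES_PER_LINE = 32          # both displays use 32 bytes per line
ORAO_LINES = 256
SAM_LINES = 192
OFFSCREEN_LINE = SAM_LINES   # hidden bytes land on line 192, just past the visible area


def build_tables(top=32, screen_base=0x0000):
    """Return (low, high) tables mapping each Orao display offset to a SAM address.

    top is the first Orao line shown; screen_base is an assumed SAM screen address.
    """
    low = bytearray(ORAO_LINES * BYTES_PER_LINE)
    high = bytearray(ORAO_LINES * BYTES_PER_LINE)
    for offset in range(len(low)):
        orao_line, column = divmod(offset, BYTES_PER_LINE)
        sam_line = orao_line - top
        if sam_line not in range(SAM_LINES):
            sam_line = OFFSCREEN_LINE
        address = screen_base + sam_line * BYTES_PER_LINE + column
        # Low bytes go in one half and high bytes in the other, as in the post.
        high[offset], low[offset] = divmod(address, 256)
    return low, high


def sam_address(low, high, offset):
    """Recombine the two table halves into the final SAM display address."""
    return high[offset] * 256 + low[offset]
```

<p>Changing <code>top</code> and rebuilding the tables pans the visible window, with no extra cost on each display write.</p>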
<p>Everything seemed perfect at this point, until I realised I needed to preserve the 6502 PC value in DE.  The core also crammed 6502 registers into almost every other Z80 register, leaving little room to juggle paging, the original address and a new address lookup.  The only register-based option to preserve DE was to use IX, at a cost of 16 tstates each way.  That was still 4 tstates faster than pushing DE around the block, once stack memory contention was included.</p>
<p>Here&#8217;s the final screen write code:<br />
<code><br />
ld  ixh,d                  ; save the 6502 PC (held in DE) in IX<br />
ld  ixl,e<br />
ld  e,(hl)                 ; E = byte just written to Orao display memory<br />
ld  d,rev_table/256        ; DE = entry in the bit-reversing table<br />
ld  a,screen_page+rom0_off<br />
out (lmpr),a               ; page in the SAM screen and address table<br />
ld  a,(de)                 ; A = bit-reversed display byte<br />
ld  d,(hl)                 ; D = SAM address high byte (upper half of table)<br />
res 5,h                    ; switch to the lower half of the table<br />
ld  e,(hl)                 ; E = SAM address low byte<br />
ld  (de),a                 ; write the byte to the SAM display<br />
ld  a,low_page+rom0_off<br />
out (lmpr),a               ; restore the normal paging<br />
ld  d,ixh                  ; restore the 6502 PC<br />
ld  e,ixl</code></p>
<p>The 6502 core got a few additional upgrades along the way, with the first being a boost to 65C02 support.  This added a new addressing mode, and a handful of new instructions (many sorely lacking from the base 6502).  A side-effect of this was that undocumented instructions were guaranteed NOPs (1 to 3 bytes in length), so I didn&#8217;t have to worry about the hybrid undocumented instructions in the original chip.</p>
<p>Decimal mode was finally added too, in just 20 bytes of extra code.  I simply needed to optionally execute a DAA after the adc/sbc calculations to make the necessary adjustment.  The DAA was patched with a NOP when in normal binary calculation mode, for the non-BCD behaviour.</p>
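<p>To show why a single adjustment step after the binary add is enough, here is a small Python model of the idea.  It is a sketch only: the <code>adc</code> function and its <code>decimal</code> argument are invented for illustration (the argument stands in for the patched DAA/NOP), and flags other than carry are ignored.</p>

```python
def adc(a, operand, carry, decimal):
    """Add operand and carry to a, returning (result, carry_out).

    In decimal mode a DAA-style correction is applied after the binary add.
    Only the carry flag is modelled in this sketch.
    """
    total = a + operand + carry
    if decimal:
        # Fix the low digit if the low nibbles summed past 9.
        if (a % 16) + (operand % 16) + carry > 9:
            total += 0x06
        # Fix the high digit if the sum no longer fits two BCD digits.
        if total > 0x99:
            total += 0x60
    return total % 256, int(total > 0xFF)
```

<p>For example, adding &#038;01 to &#038;19 gives &#038;1A in binary mode, but &#038;20 once the decimal correction is applied.</p>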
<p>There were no interrupts to handle for the Orao, but I completed the implementations of BRK (call maskable interrupt handler) and RTI (return from interrupt), so they could be used if anything tried.  As they&#8217;re untested, I set the emulator border colour to green to show it has been used.  This actually happens when running Space Invaders, but due to a suspected corrupt image.  The BRK instruction is &#038;00, so it&#8217;s quite likely to get called if execution jumps to a random memory location.</p>
<p>As a result of the 65C02 change the emulator now runs Manic Miner correctly, a game which crashes under the Windows versions due to incorrect undocumented instruction handling.  The decimal mode addition also fixes the score updating in Manic Miner and the timer count-down in Boulder Dash.  Space Invaders will need redumping from the original tape for it to work correctly.</p>
<p>The final emulation speed is typically 10-15% of the original machine&#8217;s speed, dropping further during heavy display writes due to the screen write overhead described above.  The mix of 6502 instructions also makes a difference, with heavy indexing requiring more calculations for the CPU emulation.  It still runs surprisingly quickly considering everything it&#8217;s doing, and on a machine produced only a few years later.</p>
<p>The completed emulator is <a href="http://simonowen.com/sam/oraoemu/">now available</a> on my site, with the source code following soon.</p>
]]></content:encoded>
			<wfw:commentRss>http://simonowen.com/blog/2007/03/05/orao-emulator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Galaksija emulator</title>
		<link>http://simonowen.com/blog/2007/02/21/galaksija-emulator/</link>
		<comments>http://simonowen.com/blog/2007/02/21/galaksija-emulator/#comments</comments>
		<pubDate>Wed, 21 Feb 2007 23:02:36 +0000</pubDate>
		<dc:creator>Simon</dc:creator>
				<category><![CDATA[Emulator]]></category>
		<category><![CDATA[Release]]></category>
		<category><![CDATA[SAM Coupe]]></category>

		<guid isPermaLink="false">http://blog.simonowen.com/2007/02/21/galaksija-emulator/</guid>
		<description><![CDATA[I&#8217;ve spent most of the last week porting Tomaz Kac&#8217;s Galaksija emulator from Spectrum to SAM, with version 1.0 now available on my site. Basic SAM support was trivial, requiring just an OUT to LMPR to page in the ROMs, and another OUT to VMPR to set video mode 1. The Spectrum key matrix is [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve spent most of the last week porting Tomaz Kac&#8217;s Galaksija emulator from Spectrum to SAM, with version 1.0 <a href="http://simonowen.com/sam/galemu/">now available</a> on my site.</p>
<p>Basic SAM support was trivial, requiring just an OUT to LMPR to page in the ROMs, and another OUT to VMPR to set video mode 1.  The Spectrum key matrix is compatible so the keyboard worked, and SAM&#8217;s break button could even be used for hard break in Galaksija BASIC too.  Proper SAM support required a number of changes&#8230;</p>
<p>The first was to use mode 2 instead of mode 1, mainly to avoid the <a href="http://simonowen.com/sam/articles/mode1/">mode 1 contention</a> slowdown.  This only gave roughly a 10% speedup, which was rather disappointing.  A chunk of the extra time was eaten away by needing ADD HL,DE to move between screen lines in the linear layout instead of INC H (for most) on the Spectrum.  In the worst case of drawing all 512 characters it requires an extra 512*11*(12-4) = 45056 tstates!  It <i>is</i> still faster overall, and worth the change from mode 1.</p>
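<p>The worst-case figure can be checked with a few lines of Python.  The per-instruction costs are assumptions for this sketch: ADD HL,DE is taken as 12 tstates (its 11 tstates rounded up to a multiple of 4 on SAM) and INC H as 4 tstates, and the constant names are made up.</p>

```python
CHARACTERS = 32 * 16   # every cell on the Galaksija text screen
LINE_MOVES = 12 - 1    # moves between the 12 pixel rows of one character
ADD_HL_DE = 12         # assumed SAM cost: 11 tstates rounded up to a multiple of 4
INC_H = 4              # cost of most line moves in the Spectrum layout


def extra_tstates(characters=CHARACTERS):
    """Worst-case extra tstates per redraw from using ADD HL,DE instead of INC H."""
    return characters * LINE_MOVES * (ADD_HL_DE - INC_H)
```

<p>Each line move costs 8 tstates more, so a single character adds 88 tstates and a full screen of 512 characters adds 45056.</p>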
<p>The Galaksija display is a text mode of 32&#215;16 characters, with each character 8&#215;13 pixels in size.  Tomaz had already changed the font to be 8&#215;12 to fit the Spectrum height, which was perfect for SAM use too.  Having 2 characters span 3 vertical blocks made it easier to draw on the Spectrum version, but made no difference to mode 2 on the SAM &#8211; each character could be treated completely separately.</p>
<p>The biggest difference from the Spectrum version was to draw only changed characters between frames.  This gave a huge speed boost in most cases, and even with the extra comparisons it was faster than the old version as long as no more than 80% of the screen was being updated.  Further speedups were made by using a table for character address lookups, rather than tracking the address in a register (which required some arithmetic).  A stack method to access the table allows a simple POP to fetch the next address, or discard it for skipped characters.</p>
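<p>The changed-character approach might look something like this in Python.  It is a hedged sketch, not a translation of the Z80 code: <code>redraw_changed</code>, the <code>draw</code> callback and the address list are all hypothetical, with an iterator standing in for POPping addresses from the table.</p>

```python
def redraw_changed(previous, current, addresses, draw):
    """Draw only characters that differ from the last frame; return how many were drawn."""
    drawn = 0
    next_address = iter(addresses)       # stands in for POPping from the address table
    for index, char in enumerate(current):
        address = next(next_address)     # fetched even when the character is skipped
        if char != previous[index]:
            draw(address, char)
            previous[index] = char       # remember what is now on screen
            drawn += 1
    return drawn
```

<p>Calling it a second time with the same screen contents draws nothing, which is where the speed gain comes from on mostly static displays.</p>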
<p>The speed gained meant that many games now ran too fast, or they ran with variable speed depending on how much of the display had changed.  That required adding a throttling mechanism to limit the maximum speed, which was implemented by limiting the number of unchanged characters that could be skipped during drawing.  This technique worked well for the SAM version, but not so well for the Spectrum.  In some cases <i>adding</i> small delays to the Spectrum drawing code increased the running speed!  This is related to the long interrupt processing and the game&#8217;s own frame synchronisation, but I still don&#8217;t fully understand the details.</p>
<p>In a last minute change I replaced the SAM throttling code to use HPEN to track the number of frames that had elapsed during drawing.  Rather than looping over all 512 characters as 2 blocks of 256, I split it into 4 blocks of 128.  At the end of each block I read HPEN and compare it with the previous reading; if the new value is lower, I know drawing has spilled over into a new frame.  At the end, if 2+ frames have passed then I&#8217;m running too slow, otherwise there&#8217;s time to spare.  In the latter case I poll HPEN to wait for the raster to reach a specific point, before leaving the interrupt handler and allowing the game to continue.  The actual screen position was determined by trial and error, to ensure that most games ran at the correct speed.</p>
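<p>The wrap detection can be modelled in a few lines of Python.  This is a simplified sketch with made-up names: <code>count_frames</code> takes a list of raster line readings in the order they were sampled, plus the reading taken before drawing started.</p>

```python
def count_frames(hpen_readings, start):
    """Count how many times the raster wrapped, given line readings taken in order."""
    frames = 0
    previous = start
    for current in hpen_readings:
        if previous > current:   # the line number went backwards: a new frame began
            frames += 1
        previous = current
    return frames
```

<p>A reading that lands further down the screen than the previous one looks the same whether or not a whole frame has passed in between, so the blocks need to be short enough to finish well within a frame.</p>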
<p>Galaksija software was loaded from tape, and the Spectrum version used the ROM routines at &#038;0556 to load blocks.  The same technique could have been used for the SAM version, but since nobody uses tapes anymore it wouldn&#8217;t have been much use.  Instead I chose to exit to SAM BASIC to handle the loading, which was easier than paging DOS and using hook codes to do it directly.  Adding a simple menu and .GTP tape image parsing was just a few dozen lines of code too.</p>
<p>Full source code and build instructions are included with the disk image download, if you&#8217;re interested in knowing more.</p>
]]></content:encoded>
			<wfw:commentRss>http://simonowen.com/blog/2007/02/21/galaksija-emulator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
