%% Source file: doc/general/ede.tex @ 29:83e80c2c489c
%% Changeset message: separated working emu code from broken emu code; wrote dbg interface
%% Changeset author: james <jb302@eecs.qmul.ac.uk>
%% Changeset date: Sun, 13 Apr 2014 22:42:57 +0100
%% Changeset parent: a542cd390efd
%% LyX 2.0.6 created this file.  For more info, see http://www.lyx.org/.
%% Do not edit unless you really know what you are doing.
\documentclass[english]{article}
\usepackage{lmodern}
\renewcommand{\sfdefault}{lmss}
\renewcommand{\ttdefault}{lmtt}
\renewcommand{\familydefault}{\sfdefault}
\usepackage[T1]{fontenc}
\usepackage[utf8]{luainputenc}
\usepackage{listings}
\usepackage{color}
\usepackage{graphicx}

\makeatletter

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% LyX specific LaTeX commands.
%% A simple dot to overcome graphicx limitations
\newcommand{\lyxdot}{.}

\makeatother

\usepackage{babel}
\begin{document}

\title{EDE: ELB816 Development Environment}

\author{James Bowden (110104485)}

\maketitle
\begin{abstract}
The ELB816 Development Environment consists of an assembler, emulator and debugger for the ELB816 microprocessor system. This report details the design and usage of each of its elements.
\end{abstract}
\newpage{}

\tableofcontents{}

\newpage{}

\part{Introduction and Specification}

\bigskip

\section{Motivations}

The ELB816 architecture is designed to be a ``simple to understand 8-bit microprocessor system to help learn about microprocessor electronics.''

\bigskip

The combination of an ELB816 emulator, debugger and assembler could be used as a set of tools for learning or teaching microprocessor programming without the intricacies of real-world commercial microprocessors getting in the way of a fundamental understanding of the subject.

\bigskip

A PC-based emulator would allow students to quickly develop and debug programs written in a simple assembly language on any modern desktop or laptop, and an MCS-51 port running on an 8052 would allow them to test programs in an actual circuit.

\section{Project Aims}
\begin{itemize}
\item Develop an assembler for the ELB816 assembly language.
\item Develop an emulated programmable microprocessor system based on the ELB816 architecture.
\item Develop a debugger that allows interactive debugging of programs running on the emulator.
\end{itemize}

\section{Methodology}

\subsection{Assembler}
\begin{description}
\item [{Language:}] Python
\item [{Priority:}] First
\end{description}
The assembler will be developed before anything else so that it can subsequently be used to assemble test programs during development of the emulator.

\newpage{}

\subsection{Emulator}
\begin{description}
\item [{Language:}] C
\item [{Priority:}] Second
\end{description}
The emulator will use only standard libraries to ensure that it is portable between compilers and platforms, specifically GCC for x86 and Keil C51 for Intel MCS-51. It will first be developed on Linux to facilitate rapid development, and will be ported to MCS-51 once it is complete.

\subsection{Debugger}
\begin{description}
\item [{Language:}] C/Python
\item [{Priority:}] Second
\end{description}
The debug interface will be developed alongside the emulator. It will consist of a simple text-based interface built into the emulator that reads commands using C's \lstinline[basicstyle={\ttfamily}]!stdio.h! library. This means that on Linux the commands will be issued over \lstinline[basicstyle={\ttfamily}]!STDIN!, and on the MCS-51 version they will be issued over a serial interface. Python will be used to provide a cleaner interface for common debug procedures such as writing programs to memory and setting break-points.

\bigskip

The remainder of this report is split into three parts, one for each component of the project, and will attempt to demonstrate the design and usage of each of these components.

\newpage{}

\part{Assembler}

The assembler is written in pure Python 2 using only the standard library. It assembles the assembly language described in the ELB816 specification, with a few minor differences. These differences are:
\begin{itemize}
\item In-line arithmetic must be wrapped in round brackets, i.e. it must start with '(' and end with ')'.
This is a limitation of the program's design, and changing it would require a large amount of code to be rewritten.
\item The only directives that have been implemented are \lstinline[basicstyle={\ttfamily}]!ORG!, \lstinline[basicstyle={\ttfamily}]!EQU!, \lstinline[basicstyle={\ttfamily}]!DB! and \lstinline[basicstyle={\ttfamily}]!DS!. The other directives listed in the specification have not been implemented, but their omission is only due to time constraints and they could easily be implemented in a later version.
\item Macros have not been implemented, also due to time constraints.
\end{itemize}
The assembler consists of two files:
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!language.py!, which contains the language definition in an index and some functions to help encode instructions.
\item \lstinline[basicstyle={\ttfamily}]!assembler.py!, which contains the first and second pass functions and handles opening source files and writing binary files.
\end{itemize}
The following sections detail the design and behavior of the assembler. Note, however, that these are abstract, high-level descriptions that give an overview of the entire process without fully explaining minor routines. The full source code is attached in the Appendix and should be consulted for a deeper understanding of the program's operation. The final section is a short programmer's manual demonstrating the assembler's features.

\newpage{}

\section{Data Structures}
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!reserved arguments!
\end{itemize}
This structure contains a list of string representations of the reserved word arguments for the instruction set. These all equate to registers or register pointers.
The full list is as follows:

\begin{lstlisting}[basicstyle={\ttfamily},captionpos=b,frame=tb,framexbottommargin=1em,framextopmargin=1em,keywordstyle={\color{blue}},tabsize=4]
a, c, bs, ie, flags, r0, r1, r2, r3, dptr, dpl, dph,
sp, sph, spl, @a+pc, @a+dptr, @dptr
\end{lstlisting}

\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!relative instructions!
\end{itemize}
This structure contains a list of string representations of the mnemonics of instructions that use relative addressing. The full list is as follows:

\begin{lstlisting}[basicstyle={\ttfamily},captionpos=b,frame=tb,framexbottommargin=1em,framextopmargin=1em,keywordstyle={\color{blue}},tabsize=4]
djnz, cjne, sjmp, jz, jnz, jc, jnc, jpo, jpe, js, jns
\end{lstlisting}

\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!instruction index!
\end{itemize}
This structure contains an index of all possible instructions in the instruction set, along with the corresponding opcode and instruction width. It is implemented using a combination of Python's dictionary, tuple and list objects. Its structure is demonstrated below:

\begin{lstlisting}[basicstyle={\ttfamily},captionpos=b,frame=tb,framexbottommargin=1em,framextopmargin=1em,keywordstyle={\color{blue}},tabsize=4]
mnemonic: (arg type, arg type, ...): [opcode, width]
\end{lstlisting}

Each mnemonic has an entry in the parent index, which returns another index of the possible argument formats for that mnemonic with their corresponding opcode and width. Argument types can either be one of the reserved arguments or one of the following values: \lstinline[basicstyle={\ttfamily}]!address!, \lstinline[basicstyle={\ttfamily}]!pointer!, \lstinline[basicstyle={\ttfamily}]!data! or \lstinline[basicstyle={\ttfamily}]!label!. Width is given in bytes, i.e. \lstinline[basicstyle={\ttfamily}]!width = 3! means 1 byte of opcode and 2 bytes of arguments.
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!label index!
\end{itemize}
This structure is used to store an index of label definitions.
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!equate index!
\end{itemize}
This structure is used to store an index of equated strings.

\newpage{}

\section{Functions}
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!first_pass(source file)!
\end{itemize}
This function pre-processes a source file and stores it in a format containing the data that the \lstinline[basicstyle={\ttfamily}]!second_pass()! function needs to assemble it. It processes labels and \lstinline[basicstyle={\ttfamily}]!EQU! directives by storing strings and their corresponding values in indexes and replacing any subsequent appearances of the string with the value. It prepares \lstinline[basicstyle={\ttfamily}]!ORG! and \lstinline[basicstyle={\ttfamily}]!DB! statements for \lstinline[basicstyle={\ttfamily}]!second_pass()!. It uses the \lstinline[basicstyle={\ttfamily}]!tokenize()! function to determine the argument symbols and operand bit string. Finally, it uses the \lstinline[basicstyle={\ttfamily}]!instruction index! to determine the instruction width.
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!second_pass(asm, label index)!
\end{itemize}
This function takes the pre-processed assembly code and \lstinline[basicstyle={\ttfamily}]!label index! output by \lstinline[basicstyle={\ttfamily}]!first_pass()! as input. First it checks for \lstinline[basicstyle={\ttfamily}]!ORG! and \lstinline[basicstyle={\ttfamily}]!DB! statements and handles them if necessary. Then it replaces any labels that were used before they were defined and were therefore not replaced by \lstinline[basicstyle={\ttfamily}]!first_pass()!. It uses the \lstinline[basicstyle={\ttfamily}]!instruction index! to determine the opcode and the width of the instruction, then it writes the opcode and operand to the file.
If the combined width of the opcode and operand is greater than the instruction width, the function raises an error.
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!tokenize(mnemonic, arguments)!
\end{itemize}
This function processes an instruction in order to produce a hashable symbol that represents the format of its arguments. This symbol is used to look up opcodes in the \lstinline[basicstyle={\ttfamily}]!instruction index!. It also detects string representations of numbers in the arguments and packs the operands into a C-style struct representation, which is returned along with the symbol. It does this with the help of the \lstinline[basicstyle={\ttfamily}]!stoi()! function and Python's \lstinline[basicstyle={\ttfamily}]!struct! module.
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!stoi(string)!
\end{itemize}
This is a general-purpose function that is used throughout the code, although mainly in the \lstinline[basicstyle={\ttfamily}]!tokenize()! function. It takes a string as input and tries to convert it to an integer using Python's integer literal syntax. It can recognize decimal, octal, hexadecimal and binary numbers, which are denoted by different prefixes. If it receives a string that it cannot represent as an integer, it returns the string 'NaN' (Not a Number).

\bigskip

Below is an abstract representation of each function's logical process. \lstinline[basicstyle={\ttfamily}]!first_pass()! and \lstinline[basicstyle={\ttfamily}]!second_pass()! are represented in pseudo-code, whereas \lstinline[basicstyle={\ttfamily}]!stoi()! and \lstinline[basicstyle={\ttfamily}]!tokenize()! are more easily understood when represented as flowcharts.
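
\bigskip

As a rough illustration of the behavior described for \lstinline[basicstyle={\ttfamily}]!stoi()!, the conversion can be sketched in a few lines of Python. This is a minimal sketch and not the code from the Appendix: it assumes that the prefixes follow Python's own literal syntax (\lstinline[basicstyle={\ttfamily}]!0x!, \lstinline[basicstyle={\ttfamily}]!0o! and \lstinline[basicstyle={\ttfamily}]!0b!) and relies on the built-in \lstinline[basicstyle={\ttfamily}]!int()! with base 0 to select the base from the prefix. The prefixes accepted by the real function may differ.

\begin{lstlisting}[basicstyle={\small\ttfamily},captionpos=b,frame=tb,framexbottommargin=1em,framextopmargin=1em,keywordstyle={\color{blue}},language=Python,showstringspaces=false,tabsize=4]
# Illustrative sketch only; the prefix handling is an assumption.
def stoi(string):
    try:
        # Base 0 makes int() infer the base from the prefix:
        # none = decimal, 0x = hexadecimal, 0o = octal, 0b = binary.
        return int(string, 0)
    except ValueError:
        # Labels, register names and other words are not numbers.
        return 'NaN'


if __name__ == '__main__':
    for text in ['42', '0x1F', '0o17', '0b101', 'start']:
        print('%s -> %s' % (text, stoi(text)))
\end{lstlisting}

Returning the string 'NaN' instead of raising an exception would let a caller such as \lstinline[basicstyle={\ttfamily}]!tokenize()! separate numeric operands from labels and reserved words with a simple comparison.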
\newpage{}

\subsection{\lstinline[basicstyle={\ttfamily}]!first_pass!}

\begin{lstlisting}[basicstyle={\small\ttfamily},captionpos=b,frame=tb,framexbottommargin=3em,framextopmargin=3em,keywordstyle={\color{blue}},language=Python,showstringspaces=false,tabsize=4]
first_pass(source file):
    address = 0
    for statement in source file:
        remove comments
        for word in statement:
            if word is in equate index:
                replace word with equated value
            else if word is in label index:
                replace word with address at label

        if first word == 'org':
            address = second word
        else if last character of first word == ':':
            remove ':'
            add word = address to label index
            next statement
        else if second word == 'equ':
            add first word = third word to equate index
            next statement

        mnemonic = first word
        arguments = [second word ... last word]
        symbol, constant = tokenize(arguments)

        if mnemonic == 'db':
            address = address + width of constant
            next statement

        width = instruction index[mnemonic][symbol][width]
        address = address + width
        append [mnemonic, arguments, symbol, constant] to asm
    return asm, label index
\end{lstlisting}

\newpage{}

\subsection{\lstinline[basicstyle={\ttfamily}]!second_pass! 
}

\begin{lstlisting}[basicstyle={\small\ttfamily},breaklines=true,captionpos=b,frame=tb,framexbottommargin=3em,framextopmargin=3em,keywordstyle={\color{blue}},language=Python,tabsize=4]
second_pass(file, asm, label index):
    address = 0
    for line in asm:
        file offset = address
        mnemonic, arguments, symbol, constant = line

        if mnemonic == 'org':
            address = first argument
            next line
        else if mnemonic == 'db':
            write constant to file
            address = address + width of constant
            next line

        for argument in arguments:
            if argument is a label:
                replace argument with address at label
                symbol, data = tokenize(argument)
                append data to constant

        op, width = instruction index[mnemonic][symbol]
        write op to file
        if width of constant - width + 1 > 0:
            raise error
        else:
            write constant to file
        address = address + width
    return file
\end{lstlisting}

\newpage{}

\subsection{\lstinline[basicstyle={\ttfamily}]!tokenize!}

\bigskip

\includegraphics[scale=0.57]{/home/jmz/qm/ede/doc/images/assembler/tokenize}

\newpage{}

\subsection{\lstinline[basicstyle={\ttfamily}]!stoi!}

\bigskip

\begin{description}
\item [{\includegraphics[scale=0.7]{/home/jmz/qm/ede/doc/images/assembler/stoi}}]~
\end{description}
\newpage{}

\section{Assembly language manual}

\newpage{}

\part{Emulator}

\section{Core microprocessor emulation}

The core of the emulator is written in C using only standard libraries. It executes the machine code output by the assembler according to the ELB816 specification. It consists of the following files:
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!iset.c! and \lstinline[basicstyle={\ttfamily}]!iset.h!
\end{itemize}
These files contain the emulator's instruction functions and function look-up table.
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!mem.c! and \lstinline[basicstyle={\ttfamily}]!mem.h!
\end{itemize}
These files contain the emulator's memory structure and memory access functions.
\begin{itemize}
\item \lstinline[basicstyle={\ttfamily}]!emu.c!
\end{itemize}
This file contains the program's \lstinline[basicstyle={\ttfamily}]!main()! function. It initializes the emulator and executes the program's fetch/decode/execute cycle.

\bigskip

Below is a high-level description of the content of each of these files, which should demonstrate how the emulator works. There is also a large amount of material relevant to the emulator's design in the appendix, which will be referenced when applicable.

\subsection{\lstinline[basicstyle={\ttfamily}]!iset.c! and \lstinline[basicstyle={\ttfamily}]!iset.h!}

Each mnemonic in the ELB816 instruction set has a function defined in these files. Each function is responsible for executing all of the instructions that use its corresponding mnemonic. The function look-up table is an array of pointers to these functions, where a pointer's position in the array corresponds to the opcode of the instruction to be executed.

\bigskip

\newpage{}

\subsection{\lstinline[basicstyle={\ttfamily}]!mem.c! and \lstinline[basicstyle={\ttfamily}]!mem.h!}

The figures below illustrate the emulator's memory layout as defined in the \lstinline[basicstyle={\ttfamily}]!mem.h! header file.

\bigskip

\lstinline[basicstyle={\ttfamily}]!mem.c! contains functions that can be used to access this memory from the rest of the code.

\newpage{}

\subsection{\lstinline[basicstyle={\ttfamily}]!emu.c!}

This file contains the emulator's set-up and control procedures. It includes all of the project's header files and controls the execution of the functions contained in them.

\bigskip

It first executes a number of initialization procedures and then passes control over to the main fetch/decode/execute cycle. This procedure is shown below as a flowchart. To understand it, you must be familiar with C's function pointer syntax.

\bigskip

\centerline{\includegraphics[scale=0.7]{/home/jmz/qm/ede/doc/images/emulator/fetch_decode_exe}}

\newpage{}

\section{Peripherals}

\newpage{}

\end{document}