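Goal: build an assembler that translates programs written in the Hack assembly language into binary code that the Hack hardware platform can execute.
Analysis: at its core this chapter is a text-processing exercise: map a given .asm text file to a .hack file (one 16-bit binary word per line) according to fixed rules.
Let's look at the .asm file first. A line can be one of the following:
1) An instruction: either an A-instruction or a C-instruction.
2) Constants and symbols: constants are easy to handle, but user-defined symbols also need memory allocated for them.
3) A comment: text starting with "//" is treated as a comment and ignored.
4) A blank line: ignored.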
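As a rough illustration of the four cases above, here is a minimal sketch that classifies a single raw line. The helper name `classify` and the `'SKIP'` tag are my own assumptions for this example; they are not part of the book's API.

```python
def classify(raw_line):
    """Return the kind of one raw .asm line (hypothetical helper)."""
    # Drop any trailing comment, then the surrounding whitespace.
    text = raw_line.split('//', 1)[0].strip()
    if not text:
        return 'SKIP'          # blank line or comment-only line
    if text.startswith('@'):
        return 'A_COMMAND'     # @value or @symbol
    if text.startswith('(') and text.endswith(')'):
        return 'L_COMMAND'     # (LABEL) pseudo-command
    return 'C_COMMAND'         # dest=comp;jump


for sample in ['// adds 1 to R0', '', '@R0', 'M=M+1  // increment', '(LOOP)', '0;JMP']:
    print(repr(sample), '->', classify(sample))
```

Stripping the comment before looking at the first character means inline comments and indented lines are handled the same way as clean ones.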
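To lower the difficulty, the book first asks for a version without symbol support. That is a good way of thinking in general: reduce a complex problem to a simpler one you already know how to solve.
The book also provides the API for each module. Since this is text processing, Python is the natural language choice.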
Implementation:
Parser module:
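<span style="font-size:12px;">
def hasMoreCommand(line):
    # readline() returns '' only at end of file
    return 1 if line else 0

def advance(fl):
    # return the next line of the file
    return fl.readline()

def clear(line):
    index = line.find('//')
    if index != -1:         # remove the comment, even if it starts at column 0
        line = line[:index]
    return line.strip()     # remove surrounding whitespace, including '\r' and '\n'

def commandType(line):
    # expects a line that has already been passed through clear()
    if line.startswith('('):
        return 'L_COMMAND'
    elif line.startswith('@'):
        return 'A_COMMAND'
    else:
        return 'C_COMMAND'

def symbol(line):
    # '(LOOP)' -> 'LOOP', '@i' -> 'i', '@21' -> '21'
    return line.strip(' ()@\n')

def dest(line):
    index = line.find('=')
    if index == -1:         # no '=' means no destination, e.g. 'D;JGT'
        return 'null'
    return line[:index].strip()

def comp(line):
    line = line.split(';')[0]            # drop the jump field, if any
    return line.split('=')[-1].strip()   # drop the dest field, if any

def jump(line):
    index = line.find(';')
    if index == -1:         # no ';' means no jump, e.g. 'D=M'
        return 'null'
    return line[index + 1:].strip()
</span>
Code module: this module handles C-commands, returning the binary code of the comp, dest and jump fields.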
<span style="font-size:12px;">
import Parser

def dest(line):
    # 3-bit destination field: d1 d2 d3 = A D M
    destDict = {'null':'000', 'M':'001', 'D':'010',
                'MD':'011', 'A':'100', 'AM':'101',
                'AD':'110', 'AMD':'111'}
    return destDict[Parser.dest(line)]

def comp(line):
    # 7-bit computation field: the a-bit followed by c1..c6
    compDict = {'0':'0101010', '1':'0111111', '-1':'0111010',
                'D':'0001100', 'A':'0110000', '!D':'0001101',
                '!A':'0110001', '-D':'0001111', '-A':'0110011',
                'D+1':'0011111', 'A+1':'0110111', 'D-1':'0001110',
                'A-1':'0110010', 'D+A':'0000010', 'D-A':'0010011',
                'A-D':'0000111', 'D&A':'0000000', 'D|A':'0010101',
                'M':'1110000', '!M':'1110001', '-M':'1110011',
                'M+1':'1110111', 'M-1':'1110010', 'D+M':'1000010',
                'D-M':'1010011', 'M-D':'1000111', 'D&M':'1000000',
                'D|M':'1010101'}
    return compDict[Parser.comp(line)]

def jump(line):
    # 3-bit jump field
    jumpDict = {'null':'000', 'JGT':'001', 'JEQ':'010',
                'JGE':'011', 'JLT':'100', 'JNE':'101',
                'JLE':'110', 'JMP':'111'}
    return jumpDict[Parser.jump(line)]
</span>
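To make the C-command layout concrete (three leading 1 bits, then 7 comp bits, 3 dest bits and 3 jump bits), here is a small self-contained sketch. The function name `encode_c` is hypothetical, and the tables are deliberately cut down to the few entries the examples need.

```python
# Excerpts of the three lookup tables; only the entries used below.
COMP = {'M+1': '1110111', 'D': '0001100', '0': '0101010'}
DEST = {'null': '000', 'M': '001', 'D': '010'}
JUMP = {'null': '000', 'JGT': '001', 'JMP': '111'}


def encode_c(instruction):
    """Encode 'dest=comp;jump' (dest and jump are optional) as a 16-bit string."""
    rest, _, jump = instruction.partition(';')
    dest, eq, comp = rest.partition('=')
    if not eq:                      # no '=': the whole left part is comp
        dest, comp = 'null', rest
    jump = jump.strip() or 'null'   # no ';': there is no jump
    return '111' + COMP[comp.strip()] + DEST[dest.strip()] + JUMP[jump]


print(encode_c('D=M+1'))   # 1111110111010000
print(encode_c('D;JGT'))   # 1110001100000001
print(encode_c('0;JMP'))   # 1110101010000111
```

Splitting on ';' first and on '=' second keeps all three fields independent, so a command that has both a destination and a jump is decoded correctly as well.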
The following module implements the symbol-free assembler; by default it writes its output to prog.hack.
<span style="font-size:12px;">
import sys
import Parser
import Code

fileName = sys.argv[1]
rfile = open(fileName, 'r')
wfile = open('prog.hack', 'w')
line = Parser.advance(rfile)
while Parser.hasMoreCommand(line):
    line = Parser.clear(line)    # remove the comment and surrounding whitespace
    if line:                     # skip blank and comment-only lines
        cType = Parser.commandType(line)
        if cType == 'A_COMMAND':
            AD = Parser.symbol(line)
            AB = bin(int(AD))[2:]
            AString = AB.zfill(15)
            wfile.write('0' + AString + '\n')
        elif cType == 'C_COMMAND':
            destString = Code.dest(line)
            compString = Code.comp(line)
            jumpString = Code.jump(line)
            wfile.write("111" + compString + destString + jumpString + "\n")
        # L_COMMAND lines should not occur in a symbol-free program; they produce no output
    line = Parser.advance(rfile)
rfile.close()
wfile.close()
</span>
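The A-command branch above boils down to writing a 0 bit followed by the value as a 15-bit binary number. A minimal sketch of the same step using `format`, with a range check added; the function name `encode_a` is an assumption for this example.

```python
def encode_a(value):
    """Encode the constant of '@value' as a 16-bit A-instruction string."""
    if not 0 <= value < 2 ** 15:
        # Only 15 bits are available after the leading 0 opcode bit.
        raise ValueError('A-instruction value out of range: %d' % value)
    return format(value, '016b')   # zero-padded to 16 binary digits


print(encode_a(2))       # 0000000000000010
print(encode_a(16384))   # 0100000000000000
```

Because the value is below 2^15, the leftmost of the 16 digits is always 0, which is exactly the A-instruction opcode bit.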
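Now for the version with symbol support:
The problem: an assembly program is allowed to use a label symbol before it is defined.
To solve this, the source text is read twice. On the first pass, every time a pseudo-command (a label declaration such as (LOOP)) is encountered, an entry mapping that symbol to the address of the next instruction is added to the symbol table. On the second pass, every time a symbolic A-instruction is encountered, the symbol is looked up in the symbol table; if it is not there, a new entry mapping the symbol to the next available RAM address is added. These addresses start at 16.
Everything else is much the same as in the symbol-free version; only the SymbolTable module is new.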
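The two passes can be sketched on their own, independent of file handling. This is only an illustration under a few assumptions: the input is a list of lines that have already been cleaned (no comments, no blanks), `resolve_symbols` is a hypothetical helper name, and `PREDEFINED` is just an excerpt of the predefined symbols.

```python
PREDEFINED = {'SP': 0, 'R0': 0, 'R1': 1, 'SCREEN': 16384, 'KBD': 24576}  # excerpt


def resolve_symbols(lines):
    """Replace every '@symbol' with '@address' and drop the (LABEL) lines."""
    table = dict(PREDEFINED)

    # Pass 1: a label maps to the ROM address of the next real instruction.
    rom_address = 0
    for line in lines:
        if line.startswith('('):
            table[line[1:-1]] = rom_address
        else:
            rom_address += 1

    # Pass 2: unknown symbols are variables, allocated from RAM address 16 up.
    next_ram = 16
    resolved = []
    for line in lines:
        if line.startswith('('):
            continue                      # labels generate no code
        if line.startswith('@') and not line[1:].isdigit():
            name = line[1:]
            if name not in table:
                table[name] = next_ram
                next_ram += 1
            line = '@' + str(table[name])
        resolved.append(line)
    return resolved


program = ['@i', 'M=1', '(LOOP)', '@END', 'D;JGT',
           '@LOOP', '0;JMP', '(END)', '@END', '0;JMP']
print(resolve_symbols(program))
# ['@16', 'M=1', '@6', 'D;JGT', '@2', '0;JMP', '@6', '0;JMP']
```

Note that `@END` is used before `(END)` appears, which is why the labels have to be collected in a separate first pass.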
SymbolTable:
<span style="font-size:12px;">
def Constructor():
    # predefined symbols of the Hack platform
    symbolDict = {'SP':0, 'LCL':1, 'ARG':2, 'THIS':3,
                  'THAT':4, 'R0':0, 'R1':1, 'R2':2,
                  'R3':3, 'R4':4, 'R5':5, 'R6':6,
                  'R7':7, 'R8':8, 'R9':9, 'R10':10,
                  'R11':11, 'R12':12, 'R13':13, 'R14':14,
                  'R15':15, 'SCREEN':16384, 'KBD':24576}
    return symbolDict

def addEntry(symbolDict, symbol, address):
    symbolDict[symbol] = address

def contains(symbolDict, symbol):
    # dict.has_key() only exists in Python 2; 'in' works in both Python 2 and 3
    return symbol in symbolDict

def GetAddress(symbolDict, symbol):
    return symbolDict[symbol]
</span>
Assembler:
<span style="font-size:12px;">
import sys
import Parser
import Code
import SymbolTable

fileName = sys.argv[1]
smtbl = SymbolTable.Constructor()

rfile = open(fileName, 'r')
counter = 0    # ROM address of the next instruction
line = Parser.advance(rfile)
while Parser.hasMoreCommand(line):    # first pass: complete the symbol table
    line = Parser.clear(line)         # remove the comment and surrounding whitespace
    if line:                          # skip blank and comment-only lines
        if Parser.commandType(line) == 'L_COMMAND':
            sym = Parser.symbol(line)
            SymbolTable.addEntry(smtbl, sym, counter)
        else:
            counter += 1
    line = Parser.advance(rfile)      # read the next line
rfile.close()

rfile = open(fileName, 'r')
wfile = open('prog.hack', 'w')
j = 16    # next free RAM address for variables
line = Parser.advance(rfile)
while Parser.hasMoreCommand(line):    # second pass: generate the code
    line = Parser.clear(line)
    if line:
        cType = Parser.commandType(line)
        if cType == 'A_COMMAND':
            Acmd = Parser.symbol(line)
            if Acmd.isdigit():                         # Acmd is a constant
                bn = bin(int(Acmd))[2:]
            elif SymbolTable.contains(smtbl, Acmd):    # Acmd is a symbol already in the table
                dec = SymbolTable.GetAddress(smtbl, Acmd)
                bn = bin(dec)[2:]
            else:                                      # Acmd is a new variable
                bn = bin(j)[2:]
                SymbolTable.addEntry(smtbl, Acmd, j)
                j += 1
            AString = bn.zfill(15)
            wfile.write('0' + AString + '\n')
        elif cType == 'C_COMMAND':
            destString = Code.dest(line)
            compString = Code.comp(line)
            jumpString = Code.jump(line)
            wfile.write("111" + compString + destString + jumpString + "\n")
        # L_COMMAND lines generate no code
    line = Parser.advance(rfile)
rfile.close()
wfile.close()
</span>
With that, our assembler is complete!
Original post: http://blog.csdn.net/u013573789/article/details/44892389