Chapter 6: The Assembler

Date: 2015-04-06 08:51:02

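Goal: build an assembler that translates programs written in the Hack assembly language into binary code that the Hack hardware platform can execute.

Analysis: this chapter is essentially text processing. A given .asm file is mapped to a .hack binary file according to fixed rules.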

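We start by analyzing the .asm file. A line can be one of the following:

1) Instructions: either A-instructions or C-instructions.

2) Constants and symbols: constants are easy to handle, but user-defined symbols also need memory allocated for them.

3) Comments: text beginning with "//" is treated as a comment and ignored.

4) Blank lines: ignored.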
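
As a rough illustration of the cases above, the sketch below sorts raw lines into kinds. The function name classify_line and the 'SKIP' tag are made up for this example and are not part of the book's API; label declarations of the form (Xxx) are included because the Parser module later reports them as L_COMMAND.

```python
def classify_line(raw):
    """Return (kind, text) for one raw line of Hack assembly."""
    text = raw.split('//', 1)[0].strip()   # drop any comment, then whitespace
    if not text:
        return ('SKIP', '')                # blank or comment-only line
    if text.startswith('('):
        return ('L_COMMAND', text.strip('()'))   # label declaration (Xxx)
    if text.startswith('@'):
        return ('A_COMMAND', text[1:])     # constant or symbol after '@'
    return ('C_COMMAND', text)             # dest=comp;jump


for sample in ['// comment only', '', '@21  // load 21', '(LOOP)', 'D=M;JGT']:
    print(classify_line(sample))
```

Stripping the comment before testing for emptiness means that comment-only lines, indented comments, and blank lines all take the same path.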

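To reduce the difficulty, the book asks for a symbol-less version first. This is a good way of thinking in general: reduce a complex problem to a simpler one you already know how to solve.

The book also gives the API for each module. Since this is text processing, Python is the natural choice of language.

Implementation:

Parser module: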

<span style="font-size:12px;">def hasMoreCommand(line):
	if not line:			# readline() returns '' only at end of file
		return 0
	return 1

def advance(fl):			# return the next line
	line = fl.readline()
	return line

def clear(line):
	index = line.find('//')
	if index != -1:			# remove the comment, wherever it starts
		line = line[0:index]
	return line.strip()		# remove surrounding whitespace, then return

def commandType(line):
	if line.find('(') > -1:
		return 'L_COMMAND'
	elif line.find('@') > -1:
		return 'A_COMMAND'
	else:
		return 'C_COMMAND'

def symbol(line):
	line = line.strip(' ()@\n')
	return line

def dest(line):
	index = line.find('=')
	if index == -1:			# no '=' means no destination, e.g. "0;JMP"
		return 'null'
	return line[0:index].strip()

def comp(line):
	line = line[line.find('=') + 1:]	# drop "dest=" if present
	index = line.find(';')
	if index != -1:						# drop ";jump" if present
		line = line[0:index]
	return line.strip()

def jump(line):
	index = line.find(';')
	if index == -1:			# no ';' means no jump, e.g. "D=M"
		return 'null'
	return line[index + 1:].strip()</span>

Code module:

This module handles C-instructions. It returns the binary codes for the comp, dest, and jump fields.

<span style="font-size:12px;">import Parser

def dest(line):
	destDict = {'null':'000', 'M':'001', 'D':'010', 
				'MD':'011', 'A':'100', 'AM':'101',
				'AD':'110', 'AMD':'111'}
	return destDict[Parser.dest(line)]

def comp(line):
	compDict = {'0':'0101010', '1':'0111111', '-1':'0111010', 
				'D':'0001100', 'A':'0110000', '!D':'0001101', 
				'!A':'0110001', '-D':'0001111', '-A':'0110011', 
				'D+1':'0011111', 'A+1':'0110111', 'D-1':'0001110',
				'A-1':'0110010', 'D+A':'0000010', 'D-A':'0010011', 
				'A-D':'0000111', 'D&A':'0000000', 'D|A':'0010101', 
				'M':'1110000', '!M':'1110001', '-M':'1110011', 
				'M+1':'1110111', 'M-1':'1110010', 'D+M':'1000010', 
				'D-M':'1010011', 'M-D':'1000111', 'D&M':'1000000', 
				'D|M':'1010101'}
	return compDict[Parser.comp(line)]

def jump(line):
	jumpDict = {'null':'000', 'JGT':'001', 'JEQ':'010', 
				'JGE':'011', 'JLT':'100', 'JNE':'101', 
				'JLE':'110', 'JMP':'111'}
	return jumpDict[Parser.jump(line)]</span>
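
To make the field layout concrete, here is a small self-contained sketch of how a C-instruction is assembled: three fixed 1 bits, then 7 comp bits, 3 dest bits, and 3 jump bits. The tables are deliberately partial (the full ones are in the Code module above), and the names split_c and encode_c are hypothetical helpers for this example only.

```python
# Partial tables, just enough for the examples below.
DEST = {'null': '000', 'M': '001', 'D': '010', 'MD': '011'}
COMP = {'0': '0101010', 'D': '0001100', 'M': '1110000',
        'D+1': '0011111', 'M+1': '1110111'}
JUMP = {'null': '000', 'JGT': '001', 'JMP': '111'}


def split_c(line):
    """Split 'dest=comp;jump' into mnemonics; missing parts become 'null'."""
    dest, sep, rest = line.partition('=')
    if not sep:                      # no '=': the line has no dest part
        dest, rest = 'null', line
    comp, _, jump = rest.partition(';')
    return dest.strip(), comp.strip(), (jump.strip() or 'null')


def encode_c(line):
    """Return the 16-bit string for one C-instruction."""
    dest, comp, jump = split_c(line)
    return '111' + COMP[comp] + DEST[dest] + JUMP[jump]


print(encode_c('MD=M+1'))    # 1111110111011000
print(encode_c('0;JMP'))     # 1110101010000111
print(encode_c('D=M;JGT'))   # 1111110000010001
```

Every comp code must be exactly 7 bits wide. A shorter entry would silently produce a 15-bit output line.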

assembler module:

This module implements the symbol-less assembler. It writes its output to prog.hack.

<span style="font-size:12px;">import sys
import Parser
import Code

fileName = sys.argv[1]

with open(fileName, 'r') as rfile, open('prog.hack', 'w') as wfile:
	line = Parser.advance(rfile)
	while Parser.hasMoreCommand(line):
		line = Parser.clear(line)				# remove comment and surrounding whitespace

		if line and not line.startswith('//'):	# skip blank and comment-only lines
			cType = Parser.commandType(line)

			if cType == 'A_COMMAND':
				AD = Parser.symbol(line)
				AB = bin(int(AD))[2:]
				AString = AB.zfill(15)
				wfile.write('0' + AString + '\n')
			elif cType == 'C_COMMAND':
				destString = Code.dest(line)
				compString = Code.comp(line)
				jumpString = Code.jump(line)
				wfile.write("111" + compString + destString + jumpString + "\n")
			# an L_COMMAND emits no code; labels are not resolved in this version

		line = Parser.advance(rfile)</span>
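
The A-instruction branch above builds its output from bin() and zfill(15). An equivalent formulation, sketched below, uses format() with a 16-bit width and adds a range check. The helper name encode_a is an assumption for illustration, not part of the book's API.

```python
def encode_a(value):
    """Encode a non-negative integer as a 16-bit Hack A-instruction."""
    if not 0 <= value < 2 ** 15:     # only 15 bits are available for the value
        raise ValueError('A-instruction value out of range: %d' % value)
    return format(value, '016b')     # leading 0 is the A-instruction opcode


print(encode_a(21))       # 0000000000010101
print(encode_a(16384))    # 0100000000000000
```

Because the value is limited to 15 bits, the top bit of the padded result is always 0, which is what marks the word as an A-instruction.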

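Next, the version with symbol support:

The problem: an assembly program may use a label symbol before that symbol is defined.

To solve this, the source text is read twice. On the first pass, each time a pseudo-command (a label declaration such as (Xxx)) is encountered, an entry mapping that symbol to the address of the next instruction is added to the symbol table. On the second pass, each time a symbolic A-instruction is encountered, the symbol is looked up in the table; if it is not there, a new entry is added that maps the symbol to the next available RAM address. These addresses start at 16.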
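
The two passes can be sketched in isolation, without any binary encoding. The function below only rewrites symbols to numeric addresses. Its name, resolve_symbols, and the reduced set of predefined symbols are assumptions made to keep the example short.

```python
def resolve_symbols(lines):
    """Drop labels and replace every @symbol with @address."""
    # Partial predefined table; the real one also has LCL, ARG, R2..R15, etc.
    table = {'SP': 0, 'R0': 0, 'R1': 1, 'SCREEN': 16384, 'KBD': 24576}
    cleaned = [raw.split('//', 1)[0].strip() for raw in lines]
    cleaned = [text for text in cleaned if text]

    # Pass 1: a label (Xxx) maps to the address of the next real instruction.
    rom = 0
    for text in cleaned:
        if text.startswith('('):
            table[text.strip('()')] = rom
        else:
            rom += 1

    # Pass 2: unknown symbols are variables, allocated from RAM address 16.
    next_ram = 16
    out = []
    for text in cleaned:
        if text.startswith('('):
            continue                     # labels emit no instruction
        if text.startswith('@') and not text[1:].isdigit():
            name = text[1:]
            if name not in table:
                table[name] = next_ram
                next_ram += 1
            text = '@%d' % table[name]
        out.append(text)
    return out


program = ['@i', 'M=1', '(LOOP)', '@i', 'D=M', '@END', 'D;JGT',
           '@LOOP', '0;JMP', '(END)', '@END', '0;JMP']
print(resolve_symbols(program))
# ['@16', 'M=1', '@16', 'D=M', '@8', 'D;JGT', '@2', '0;JMP', '@8', '0;JMP']
```

In this sample, @END appears before (END) is declared. A single pass would mistake it for a variable and give it a RAM address, which is why the labels are collected first.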

Everything else is much the same as in the symbol-less version; the only addition is the SymbolTable module.

SymbolTable:

<span style="font-size:12px;">def Constructor():
	symbolDict = {'SP':0, 'LCL':1, 'ARG':2, 'THIS':3, 
					'THAT':4, 'R0':0, 'R1':1, 'R2':2, 
					'R3':3, 'R4':4, 'R5':5, 'R6':6, 
					'R7':7, 'R8':8, 'R9':9, 'R10':10, 
					'R11':11, 'R12':12, 'R13':13, 'R14':14,
					'R15':15, 'SCREEN':16384, 'KBD':24576}
	return symbolDict

def addEntry(symbolDict, symbol, address):
	symbolDict[symbol] = address

def contains(symbolDict, symbol):
	return symbol in symbolDict		# dict.has_key() no longer exists in Python 3

def GetAddress(symbolDict, symbol):
	return symbolDict[symbol]</span>

assembler:

<span style="font-size:12px;">import sys
import Parser
import Code
import SymbolTable

fileName = sys.argv[1]
smtbl = SymbolTable.Constructor()

counter = 0								# ROM address of the next instruction
with open(fileName, 'r') as rfile:		# first pass: record label addresses
	line = Parser.advance(rfile)
	while Parser.hasMoreCommand(line):
		line = Parser.clear(line)
		if line and not line.startswith('//'):	# skip blank and comment-only lines
			if Parser.commandType(line) == 'L_COMMAND':
				sym = Parser.symbol(line)
				SymbolTable.addEntry(smtbl, sym, counter)
			else:
				counter += 1			# only A- and C-instructions occupy ROM
		line = Parser.advance(rfile)

j = 16									# next free RAM address for variables
with open(fileName, 'r') as rfile, open('prog.hack', 'w') as wfile:	# second pass
	line = Parser.advance(rfile)
	while Parser.hasMoreCommand(line):
		line = Parser.clear(line)		# remove comment and surrounding whitespace
		if line and not line.startswith('//'):
			cType = Parser.commandType(line)
			if cType == 'A_COMMAND':
				Acmd = Parser.symbol(line)
				if Acmd.isdigit():								# numeric constant
					bn = bin(int(Acmd))[2:]
				elif SymbolTable.contains(smtbl, Acmd):			# symbol already in the table
					dec = SymbolTable.GetAddress(smtbl, Acmd)
					bn = bin(dec)[2:]
				else:											# new variable: allocate RAM
					bn = bin(j)[2:]
					SymbolTable.addEntry(smtbl, Acmd, j)
					j += 1
				AString = bn.zfill(15)
				wfile.write('0' + AString + '\n')
			elif cType == 'C_COMMAND':
				destString = Code.dest(line)
				compString = Code.comp(line)
				jumpString = Code.jump(line)
				wfile.write("111" + compString + destString + jumpString + "\n")
			# an L_COMMAND emits no code in the second pass
		line = Parser.advance(rfile)</span>

The other two modules are the same as in the symbol-less version.

With that, our assembler is complete!

Original post: http://blog.csdn.net/u013573789/article/details/44892389