Computer Science Department

CSC 413
Assignment 2 - Modify the Lexer
Due Date
March 2, before midnight
March 4, before midnight, is the late submission deadline (for 75% credit)
Note that the due date applies to the last commit timestamp into the main branch of your repository.
Overview
The purpose of this assignment is to extend the Lexer component of our x language compiler to be able to handle
additional tokens, to supplement our understanding of Compilers and Lexical Analysis.
You are provided with the Lexer code, which will be automatically cloned into your GitHub repository when you begin
the assignment via the GitHub assignment link.
Submission
Your assignment will be submitted using GitHub. Only the “main” branch of your repository will be graded.
Late submission is determined by the last commit time on the “main” branch. You are required to submit a
documentation PDF named “documentation.pdf” in a “documentation” folder at the root of your project.
Please refer to the documentation requirements posted on iLearn. The organization and appearance of this
document are critical. Please use spelling and grammar checkers - your ability to communicate about
software and technology is almost as important as your ability to write software!
We will test your program using the following commands:
1. git clone your-repository-name
2. cd your-repository-name
3. find . -name "*.class" -type f -delete
4. find . -name "*.jar" -type f -delete
5. javac lexer/setup/TokenSetup.java
6. java lexer.setup.TokenSetup
7. javac lexer/Lexer.java
8. java lexer.Lexer filename.x
Requirements
You will be extending the Lexer in order to be able to process additional tokens, as well as to improve the output of
the Lexer.
1. The current implementation of Lexer reads a hardcoded file. Lexer must be updated to allow input via a filename
provided as a command line argument:
java lexer.Lexer sample_files/simple.x
Note that the main method is currently commented out - you should uncomment and update this method. In the
event that no filename is supplied, a usage instruction should be displayed:
> java lexer.Lexer
usage: java lexer.Lexer filename.x
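The argument handling described in requirement 1 could be sketched as follows. This is only an illustration: the class name LexerArgsSketch and the helper parseFilename are hypothetical and are not part of the provided code, and the real main method in lexer.Lexer would construct the Lexer with the filename instead of printing it.

```java
import java.util.Optional;

// Hypothetical stand-in for the argument handling in lexer.Lexer's main method.
class LexerArgsSketch {
    // Usage text exactly as shown in the requirement.
    static final String USAGE = "usage: java lexer.Lexer filename.x";

    // Returns the first argument as the filename, or an empty Optional if none was given.
    static Optional<String> parseFilename(String[] args) {
        if (args == null || args.length == 0) {
            return Optional.empty();
        }
        return Optional.of(args[0]);
    }

    public static void main(String[] args) {
        Optional<String> filename = parseFilename(args);
        if (!filename.isPresent()) {
            System.out.println(USAGE);
            return;
        }
        // Assumption: the real Lexer would be created with this filename here.
        System.out.println("would lex: " + filename.get());
    }
}
```

Keeping the check in a small helper makes the "no filename" case easy to test without touching the file system.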
2. Our compiler must be updated to accommodate additional tokens. The tokens file must be updated, and
TokenSetup must be run to regenerate the Tokens and TokenTypes classes.
1. Greater: >
2. GreaterEqual: >=
3. HashDelimeter: #
4. LeftBracket: [
5. RightBracket: ]
6. Utf16String: utf16string (this is the type)
7. Utf16StringLit: any utf16string literal, which is two consecutive escape sequences, each consisting of a
backslash, followed by a lower case u, followed by 4 hexadecimal digits (0-9, a-f, A-F).
Valid examples: \uD83D\uDC7D \ud83D\uDc7D
Invalid examples: \uDf12\dF \uR123\uZ123
8. TimestampType: timestamp (this is the type)
9. TimestampLit: any timestamp expressed as yyyy~MM~dd~hh:mm:ss, where each y, M, d, h, m, and s
is a digit in the range 0-9, and 01 <= MM <= 12, 01 <= dd <= 31, 00 <= hh <= 23, 01 <= mm <= 59, and 00
<= ss <= 59.
Valid example: 2022~02~15~12:15:22
Invalid examples: 2022~14~15~39:15:22 123~14~15~39:15:22
10. Reserved words
1. Begin: begin
2. End: end
3. In: in
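The rules for the two literal tokens above can be checked against the listed examples with regular expressions. The sketch below is one way to express those rules, not necessarily how the Lexer should scan them (the provided Lexer may read character by character). The class LiteralPatterns and its method names are hypothetical, and the minutes range is taken literally from the requirement as 01-59.

```java
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the provided Lexer code.
class LiteralPatterns {
    // Two escape units in a row: a backslash, a lowercase u, then 4 hex digits.
    static final Pattern UTF16_STRING_LIT =
            Pattern.compile("(\\\\u[0-9a-fA-F]{4}){2}");

    // yyyy~MM~dd~hh:mm:ss with the ranges exactly as listed in the requirement.
    static final Pattern TIMESTAMP_LIT = Pattern.compile(
            "[0-9]{4}"                        // yyyy: any four digits
            + "~(0[1-9]|1[0-2])"              // MM: 01-12
            + "~(0[1-9]|[12][0-9]|3[01])"     // dd: 01-31
            + "~([01][0-9]|2[0-3])"           // hh: 00-23
            + ":(0[1-9]|[1-5][0-9])"          // mm: 01-59, as written in the requirement
            + ":[0-5][0-9]");                 // ss: 00-59

    // True only if the whole text is a valid utf16string literal.
    static boolean isUtf16StringLit(String text) {
        return UTF16_STRING_LIT.matcher(text).matches();
    }

    // True only if the whole text is a valid timestamp literal.
    static boolean isTimestampLit(String text) {
        return TIMESTAMP_LIT.matcher(text).matches();
    }

    public static void main(String[] args) {
        // The string literals below double each backslash so that the Java
        // compiler does not treat them as Unicode escapes.
        System.out.println(isUtf16StringLit("\\uD83D\\uDC7D"));
        System.out.println(isUtf16StringLit("\\uR123\\uZ123"));
        System.out.println(isTimestampLit("2022~02~15~12:15:22"));
        System.out.println(isTimestampLit("2022~14~15~39:15:22"));
    }
}
```

Using matches() rather than find() requires the entire candidate text to fit the pattern, which is what rejects inputs such as a three-digit year.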
3. The Token class must be updated to include the number of the line on which a token was found (for subsequent
error reporting, etc.).
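As a rough illustration of requirement 3, a token that carries its line number might look like the class below. TokenSketch is a hypothetical, simplified type; the provided Token class has its own fields and constructor, so only the idea of adding a line number field and accessor carries over.

```java
// Hypothetical, simplified token type for illustration only.
class TokenSketch {
    private final int leftPosition;   // column of the first character
    private final int rightPosition;  // column of the last character
    private final int lineNumber;     // 1-based line on which the token was found
    private final String kind;        // e.g. "Identifier", "Int"

    TokenSketch(int leftPosition, int rightPosition, int lineNumber, String kind) {
        this.leftPosition = leftPosition;
        this.rightPosition = rightPosition;
        this.lineNumber = lineNumber;
        this.kind = kind;
    }

    int getLeftPosition() {
        return leftPosition;
    }

    int getRightPosition() {
        return rightPosition;
    }

    int getLineNumber() {
        return lineNumber;
    }

    String getKind() {
        return kind;
    }
}
```

One plausible approach is for the code that reads source lines to keep a counter that is incremented on each new line, and to pass its current value in whenever a token is created.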
4. Lexer output must be updated for readability, and to include the line number from the Token, as well as the type
of the token created. (Note that the initial debug text that shows the file information has been removed!). The
format for each of the token lines is:
1. 11 columns, left aligned, for the token description, then a space
2. left:, then a space
3. 8 columns, left aligned, for the left position, then a space
4. right:, then a space
5. 8 columns, left aligned, for the right position, then a space
6. line:, then a space
7. 8 columns, left aligned, for the line number, then a space
8. The symbol
> java lexer.Lexer sample_files/simple.x
READLINE: program { int i int j
program     left: 0        right: 6        line: 1        Program
{           left: 8        right: 8        line: 1        LeftBrace
int         left: 10       right: 12       line: 1        Int
i           left: 14       right: 14       line: 1        Identifier
int         left: 16       right: 18       line: 1        Int
j           left: 20       right: 20       line: 1        Identifier
READLINE:    i = i + j + 7
/* Remainder of output omitted for brevity, see Appendix A */
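The column layout listed in requirement 4 maps directly onto a format string. The following is a minimal sketch, assuming the token description, positions, line number, and symbol name are already available as plain values; the class TokenLineFormat and its method are hypothetical names.

```java
// Hypothetical formatter for one token line of the Lexer output.
class TokenLineFormat {
    // %-11s : token description, left aligned in 11 columns
    // %-8d  : left position, right position, and line number, each left aligned in 8 columns
    // %s    : the symbol, printed as-is
    static final String FORMAT = "%-11s left: %-8d right: %-8d line: %-8d %s";

    static String format(String description, int left, int right, int line, String symbol) {
        return String.format(FORMAT, description, left, right, line, symbol);
    }

    public static void main(String[] args) {
        System.out.println(format("program", 0, 6, 1, "Program"));
        System.out.println(format("int", 10, 12, 1, "Int"));
    }
}
```

With this layout, the label left: always starts at column 12, right: at column 27, line: at column 43, and the symbol at column 58 (counting from 0).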
5. Lexer output must be updated to include a printout, with line numbers, of each of the lines read in from the source
file. These line numbers should be printed in 3 columns, right aligned. Note that when an error is encountered,
the error should be reported as usual, and the lines of the source file should be output, with line numbers, up to
and including the error line.
  1: program { int i int j
  2:    i = i + j + 7
  3:    j = write(i)
  4: }
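The source listing in requirement 5 can be sketched as below, assuming the lines read so far have been collected in order. The class SourceListingSketch and the method numberedUpTo are hypothetical; the lastLine parameter models the "up to and including the error line" rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that formats source lines with right-aligned line numbers.
class SourceListingSketch {
    // Formats lines 1..lastLine (inclusive), each prefixed by a 3-column,
    // right-aligned line number, a colon, and a space.
    static List<String> numberedUpTo(int lastLine, String... lines) {
        List<String> out = new ArrayList<>();
        int limit = Math.min(lastLine, lines.length);
        for (int i = 0; i < limit; i++) {
            out.add(String.format("%3d: %s", i + 1, lines[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] source = {
            "program { int i int j",
            "   i = i + j + 7",
            "   j = write(i)",
            "}"
        };
        // No error: list every line of the file.
        for (String line : numberedUpTo(source.length, source)) {
            System.out.println(line);
        }
    }
}
```

If an error were reported on line 2, calling numberedUpTo(2, source) would list only the first two lines.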
Appendix A
The complete output for simple.x (the indentation you see for the file output is to allow for three digit line numbers):
> java lexer.Lexer sample_files/simple.x
READLINE: program { int i int j
program     left: 0        right: 6        line: 1        Program
{           left: 8        right: 8        line: 1        LeftBrace
int         left: 10       right: 12       line: 1        Int
i           left: 14       right: 14       line: 1        Identifier
int         left: 16       right: 18       line: 1        Int
j           left: 20       right: 20       line: 1        Identifier
READLINE:    i = i + j + 7
i           left: 3        right: 3        line: 2        Identifier
=           left: 5        right: 5        line: 2        Assign
i           left: 7        right: 7        line: 2        Identifier
+           left: 9        right: 9        line: 2        Plus
j           left: 11       right: 11       line: 2        Identifier
+           left: 13       right: 13       line: 2        Plus
7           left: 15       right: 15       line: 2        INTeger
READLINE:    j = write(i)
j           left: 3        right: 3        line: 3        Identifier
=           left: 5        right: 5        line: 3        Assign
write       left: 7        right: 11       line: 3        Identifier
(           left: 12       right: 12       line: 3        LeftParen
i           left: 13       right: 13       line: 3        Identifier
)           left: 14       right: 14       line: 3        RightParen
READLINE: }
}           left: 0        right: 0        line: 4        RightBrace
  1: program { int i int j
  2:    i = i + j + 7
  3:    j = write(i)
  4: }
