COMP6714 Project
This project will take considerable time to complete, so we strongly recommend that you start working as early as
possible. Read the specs carefully at least 2-3 times before you start coding.

Instructions
1. This notebook contains instructions for .
2. You are required to complete your implementation for part-1 in a file project.py provided along with this notebook. Please
do not change the name of the file.
3. You are not allowed to print unnecessary output. We will not consider any output printed to the screen. All results should be
returned in appropriate data structures via the corresponding functions.
4. You can submit your implementation for Project via give.
5. For each question, we have provided detailed instructions along with question headings. In case of problems, you can post
your query @ Piazza.
6. You are allowed to add other functions and/or import modules (you may have to for this project), but you are not allowed to define
global variables. Only functions are allowed in project.py.
7. You should not import unnecessary or non-standard modules/libraries. Loading such libraries at test time will lead to errors and hence
0 marks for your project. If you are not sure, please ask @ Piazza.
Allowed Libraries:


Part One - Group Varint Encoding

Input Format:
The function encode() should receive one argument:
posting_list which is a list of integers, where each integer represents a document ID (all the document IDs are sorted).
Output Format:
Your output should be a bytearray, which is the group varint encoding for posting_list .
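The spec does not spell out the byte layout, so the following is only a minimal sketch inferred from the toy example for this part. It assumes that document IDs are first converted to gaps (the first gap being the first ID itself), that gaps are grouped four at a time behind one header byte holding four 2-bit length fields (byte count minus one, first gap in the top two bits), and that each gap is stored in little-endian byte order. How a final group with fewer than four gaps should be written is not stated; here the unused header fields are simply left at zero and no bytes are emitted for them, which is an assumption.

```python
def encode(posting_list):
    """Group varint sketch: gaps, four per group, little-endian payload."""
    encoded = bytearray()
    # Turn the sorted document IDs into gaps; the first gap is the first ID.
    previous_ids = [0] + posting_list[:-1]
    gaps = [doc_id - prev for prev, doc_id in zip(previous_ids, posting_list)]

    for start in range(0, len(gaps), 4):
        group = gaps[start:start + 4]
        header = 0
        payload = bytearray()
        for slot, gap in enumerate(group):
            # Bytes needed for this gap (assumed to be 1 to 4); zero takes one byte.
            n_bytes = max(1, (gap.bit_length() + 7) // 8)
            # Store (n_bytes - 1) in two bits; the first gap uses the top two bits.
            header |= (n_bytes - 1) << (6 - 2 * slot)
            payload += gap.to_bytes(n_bytes, "little")
        # Assumption: a short final group leaves its unused header fields at 0.
        encoded.append(header)
        encoded += payload
    return encoded
```

For posting_list = [1, 16, 527, 131598] the gaps are 1, 15, 511 and 131071, which need 1, 1, 2 and 3 bytes, so the header is 00000110 and the sketch returns the eight bytes shown in the toy example.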
In [1]:
def encode(posting_list):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let posting_list be:
In [2]:
posting_list = [1, 16, 527, 131598]
In [3]:
encoded_list = encode(posting_list)
In [6]:
[bin(code)[2:].zfill(8) for code in encoded_list]
Out[6]: ['00000110',
 '00000001',
 '00001111',
 '11111111',
 '00000001',
 '11111111',
 '11111111',
 '00000001']

Part Two - Group Varint Decoding

Input Format:
The function decode() should receive one argument:
encoded_list, which is a bytearray that corresponds to the encoded binary sequence.
Output Format:
Your output should be a list of integers, where each integer represents a document ID that is decoded from the encoded list.
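A possible decoder, again only a sketch under assumptions read off the toy example rather than stated in the spec: each group starts with a header byte whose four 2-bit fields give the byte count minus one of the following gaps (first gap in the top two bits), gaps are little-endian, and document IDs are recovered as running sums of the gaps. It also assumes that a final group with fewer than four gaps simply ends when the data runs out.

```python
def decode(encoded_list):
    """Group varint decoding sketch: read a header, then up to four gaps."""
    decoded = []
    doc_id = 0  # running sum of gaps
    pos = 0
    while pos < len(encoded_list):
        header = encoded_list[pos]
        pos += 1
        for slot in range(4):
            if pos >= len(encoded_list):
                # Assumption: a short final group ends with the data.
                break
            # Two bits per gap, first gap in the top two bits of the header.
            n_bytes = ((header >> (6 - 2 * slot)) & 0b11) + 1
            gap = int.from_bytes(encoded_list[pos:pos + n_bytes], "little")
            pos += n_bytes
            doc_id += gap
            decoded.append(doc_id)
    return decoded
```

Applied to bytearray(b'\x06\x01\x0f\xff\x01\xff\xff\x01'), the header 00000110 gives lengths 1, 1, 2 and 3, the gaps are 1, 15, 511 and 131071, and the running sums are [1, 16, 527, 131598].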
In [55]:
def decode(encoded_list):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let encoded_list be:
In [66]:
encoded_list = bytearray(b'\x06\x01\x0f\xff\x01\xff\xff\x01')
In [67]:
decoded_list = decode(encoded_list)
In [9]:
decoded_list
Out[9]: [1, 16, 527, 131598]

Part Three - Evaluation
In this part, you need to implement a function that computes the F1 score and MAP from the given information.
Input Format:
The function evaluation() should receive two arguments:
rel_list is a list of 0s and 1s, where 0 indicates that the corresponding document is irrelevant, and 1 indicates that the corresponding
document is relevant.
total_rel_doc is an integer that indicates the total number of documents relevant to the query.
Output Format:
Your output should be two float numbers, where the first one is the F1 score, and the second one is the MAP.
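The spec does not give the formulas, so the sketch below rests on the usual definitions, stated here as assumptions: precision is the fraction of relevant documents in rel_list, recall is the number of relevant documents in rel_list divided by total_rel_doc, F1 is their harmonic mean, and MAP for a single query is the average precision, that is, the sum of the precision values at each relevant rank divided by total_rel_doc.

```python
def evaluation(rel_list, total_rel_doc):
    """Return (F1, average precision) for one ranked result list (sketch)."""
    retrieved = len(rel_list)
    hits = sum(rel_list)

    # Precision and recall over the whole returned list.
    precision = hits / retrieved if retrieved else 0.0
    recall = hits / total_rel_doc if total_rel_doc else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    # Average precision: precision at each rank that holds a relevant document.
    # Assumption: the sum is normalised by total_rel_doc, not by the hits found.
    seen = 0
    precision_sum = 0.0
    for rank, rel in enumerate(rel_list, start=1):
        if rel:
            seen += 1
            precision_sum += seen / rank
    average_precision = precision_sum / total_rel_doc if total_rel_doc else 0.0

    return f1, average_precision
```

With the toy input for this part (relevant documents at ranks 1, 2, 9, 11, 15 and 20, and total_rel_doc = 8), precision is 0.3 and recall is 0.75, so F1 is about 0.4286, which agrees with the displayed 0.43 when rounded to two decimals. The average precision is about 0.4163, in line with the second toy output.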
In [71]:
def evaluation(rel_list, total_rel_doc):
    pass
Toy Example for Illustration
Here, we provide a small toy example for this part:
Let rel_list and total_rel_doc be:
In [94]:
rel_list = [1,1,0,0,0,0,0,0,1,0,1,0,0,0,1,0,0,0,0,1]
total_rel_doc = 8
In [95]:
F1_score, MAP = evaluation(rel_list, total_rel_doc)
In [96]:
F1_score
Out[96]: 0.43
In [92]:
MAP
Out[92]: 0.4162878787878788

Project Submission and Feedback
For project submission, you are required to submit a Python file named project.py via give:
You can submit the file with give cs6714 proj1 project.py. The file size is limited to 1MB.