CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging:
three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then in addition to NLTK (https://www.nltk.org/), you will also need to install the conllu package (https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested, you can download the entire set of treebanks from https://universaldependencies.org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from a training corpus from Universal Dependencies. Do not forget to involve the start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
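For example, emission distributions could be estimated and smoothed along the following lines (a sketch; the toy corpus and the bins value are illustrative, not prescribed):

```python
from nltk import FreqDist, WittenBellProbDist

# Toy stand-in for a tagged training corpus: each sentence is a list
# of (word, POS tag) pairs. Real data would come from a treebank.
sentences = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
]

tags = {t for sent in sentences for (_, t) in sent}

# One Witten-Bell-smoothed emission distribution P(w | t) per tag.
# `bins` reserves probability mass for unseen words; 1e5 is a guess.
emissions = {}
for tag in tags:
    words = [w for sent in sentences for (w, t) in sent if t == tag]
    emissions[tag] = WittenBellProbDist(FreqDist(words), bins=1e5)

print(emissions["NOUN"].prob("dog"))    # seen word: relatively large
print(emissions["NOUN"].prob("zebra"))  # unseen word: small but nonzero
```

The same pattern, with tag bigrams in place of (word, tag) pairs, gives smoothed transition distributions.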
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each i = 1, …, n, in this order:

t̂_i = argmax_{t_i} P(t_i | t̂_{i−1}) · P(w_i | t_i)

assuming t̂_0 is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker ⟨/s⟩ is not even used here.
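A minimal sketch of this eager tagger, with toy hand-set probability tables standing in for the smoothed estimates (all probability values and names here are illustrative):

```python
TAGS = ["DET", "NOUN", "VERB"]

def trans(prev, t):
    # Toy transition probabilities P(t | prev); 0.01 stands in
    # for the smoothed mass on unseen transitions.
    table = {("<s>", "DET"): 0.8, ("DET", "NOUN"): 0.8,
             ("NOUN", "VERB"): 0.7}
    return table.get((prev, t), 0.01)

def emis(t, w):
    # Toy emission probabilities P(w | t).
    table = {("DET", "the"): 0.6, ("NOUN", "dog"): 0.3,
             ("VERB", "barks"): 0.2}
    return table.get((t, w), 0.001)

def eager_tag(words):
    # Choose each tag greedily, given only the previously chosen tag.
    prev = "<s>"
    chosen = []
    for w in words:
        best = max(TAGS, key=lambda t: trans(prev, t) * emis(t, w))
        chosen.append(best)
        prev = best
    return chosen

print(eager_tag(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

Because each choice is committed to immediately, an early mistake can propagate through the rest of the sentence; the Viterbi algorithm below avoids this.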
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
t̂_1 ⋯ t̂_n = argmax_{t_1 ⋯ t_n} ( ∏_{i=1}^{n} P(t_i | t_{i−1}) · P(w_i | t_i) ) · P(t_{n+1} | t_n)

where the tokens of the input sentence are w_1 ⋯ w_n, and t_0 = ⟨s⟩ and t_{n+1} = ⟨/s⟩ are the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
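A sketch of Viterbi in log space, again with toy hand-set tables standing in for the estimated, smoothed distributions (all values are illustrative):

```python
import math

TAGS = ["DET", "NOUN", "VERB"]

def log_trans(prev, t):
    # Toy transition probabilities P(t | prev), returned in log space.
    table = {("<s>", "DET"): 0.8, ("DET", "NOUN"): 0.8,
             ("NOUN", "VERB"): 0.7, ("VERB", "</s>"): 0.6}
    return math.log(table.get((prev, t), 0.01))

def log_emis(t, w):
    # Toy emission probabilities P(w | t), returned in log space.
    table = {("DET", "the"): 0.6, ("NOUN", "dog"): 0.3,
             ("VERB", "barks"): 0.2}
    return math.log(table.get((t, w), 0.001))

def viterbi(words):
    # delta[t]: best log probability of any tag sequence for the
    # prefix seen so far that ends in tag t.
    delta = {t: log_trans("<s>", t) + log_emis(t, words[0]) for t in TAGS}
    backpointers = []
    for w in words[1:]:
        new_delta, back = {}, {}
        for t in TAGS:
            best = max(TAGS, key=lambda p: delta[p] + log_trans(p, t))
            back[t] = best
            new_delta[t] = delta[best] + log_trans(best, t) + log_emis(t, w)
        delta = new_delta
        backpointers.append(back)
    # Fold in the transition to the end-of-sentence marker </s>,
    # then recover the best sequence by following backpointers.
    last = max(TAGS, key=lambda t: delta[t] + log_trans(t, "</s>"))
    tags = [last]
    for back in reversed(backpointers):
        tags.append(back[tags[-1]])
    return list(reversed(tags))

print(viterbi(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

Working with sums of log probabilities rather than products of probabilities keeps the values well within floating-point range even for long sentences.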
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token individually. That is, for each i, we compute:
t̂_i = argmax_{t_i} Σ_{t_1 ⋯ t_{i−1} t_{i+1} ⋯ t_n} ( ∏_{k=1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n)
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
t̂_i = argmax_{t_i} ( Σ_{t_1 ⋯ t_{i−1}} ∏_{k=1}^{i} P(t_k | t_{k−1}) · P(w_k | t_k) ) · ( Σ_{t_{i+1} ⋯ t_n} ( ∏_{k=i+1}^{n} P(t_k | t_{k−1}) · P(w_k | t_k) ) · P(t_{n+1} | t_n) )
The computation of forward values is very similar to the Viterbi algorithm, so you may want to copy and adapt the code you already have, replacing statements that maximise with corresponding statements that sum values together. The computation of backward values is similar to the computation of forward values.
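As an illustration, the forward and backward computations can be sketched as follows. For brevity this toy version works with plain probabilities, whereas a real implementation should work with log probabilities; the probability tables are illustrative stand-ins for the smoothed estimates:

```python
TAGS = ["DET", "NOUN", "VERB"]

def trans(prev, t):
    # Toy transition probabilities P(t | prev); values are illustrative.
    table = {("<s>", "DET"): 0.8, ("DET", "NOUN"): 0.8,
             ("NOUN", "VERB"): 0.7, ("VERB", "</s>"): 0.6}
    return table.get((prev, t), 0.01)

def emis(t, w):
    # Toy emission probabilities P(w | t).
    table = {("DET", "the"): 0.6, ("NOUN", "dog"): 0.3,
             ("VERB", "barks"): 0.2}
    return table.get((t, w), 0.001)

def posterior_tags(words):
    n = len(words)
    # Forward values: alpha[i][t] = P(w_1 .. w_i, tag at position i is t).
    alpha = [{t: trans("<s>", t) * emis(t, words[0]) for t in TAGS}]
    for i in range(1, n):
        alpha.append({t: sum(alpha[i - 1][p] * trans(p, t) for p in TAGS)
                         * emis(t, words[i]) for t in TAGS})
    # Backward values: beta[i][t] = P(w_{i+1} .. w_n, </s> | tag at i is t).
    beta = [None] * n
    beta[n - 1] = {t: trans(t, "</s>") for t in TAGS}
    for i in range(n - 2, -1, -1):
        beta[i] = {t: sum(trans(t, s) * emis(s, words[i + 1]) * beta[i + 1][s]
                          for s in TAGS) for t in TAGS}
    # The individually most probable tag maximises alpha * beta per position.
    return [max(TAGS, key=lambda t: alpha[i][t] * beta[i][t]) for i in range(n)]

print(posterior_tags(["the", "dog", "barks"]))  # ['DET', 'NOUN', 'VERB']
```

Note that the forward recurrence is the Viterbi recurrence with max replaced by sum, exactly as described above.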
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
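A generic version of that trick can be sketched as follows (this is a sketch, not a copy of the provided file):

```python
import math

def logsumexp(log_values):
    # Compute log(sum_i exp(x_i)) without underflow, by factoring out
    # the maximum: log(sum exp(x_i)) = m + log(sum exp(x_i - m)).
    m = max(log_values)
    if m == float("-inf"):  # all probabilities are zero
        return m
    return m + math.log(sum(math.exp(x - m) for x in log_values))

# Three tiny probabilities whose direct exponentiation underflows:
# math.exp(-1000.0) == 0.0, yet their sum is representable in log space.
print(logsumexp([-1000.0, -1001.0, -1002.0]))  # about -999.59
```

After subtracting the maximum, at least one exponent is exactly 0, so the sum inside the logarithm is always at least 1 and never underflows to zero.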
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
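Computing the percentage of correctly guessed tags can be as simple as the following sketch (the argument names and toy inputs are illustrative):

```python
def accuracy(predicted, gold):
    # Percentage of tokens whose predicted tag matches the gold tag,
    # across all sentences of a test corpus. Both arguments are lists
    # of tag sequences, one sequence per sentence.
    correct = sum(p == g
                  for ps, gs in zip(predicted, gold)
                  for p, g in zip(ps, gs))
    total = sum(len(gs) for gs in gold)
    return 100.0 * correct / total

print(accuracy([["DET", "NOUN"]], [["DET", "VERB"]]))  # 50.0
```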
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but please do not include the entire set of hundreds of treebanks from Universal Dependencies, because this would be a huge waste of disk space and bandwidth for the marker.
Marking is in line with the General Mark Descriptors (see pointers below).
• Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but nonetheless demonstrates some understanding of POS tagging.
• Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1.
• Evidence of a competent attempt addressing most requirements (up to 13 marks) could be fully correct code in good style, implementing Algorithms 1 and 2, and a brief report.
• Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be a good implementation of Algorithms 1 and 2, plus an informative report discussing meaningful experiments.
• Evidence of an excellent attempt with no significant defects (up to 18 marks) requires an excellent implementation of all three algorithms, and a report that discusses thorough experiments and analysis of inherent properties of the algorithms, as well as awareness of the linguistic background discussed in the lectures.
• An exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis, and reflection in the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding rather than for lots of busywork. Above all, this practical is about understanding language and how language works from the perspective of the HMM model.
• Avoid Python virtual environments. These blow up the size of the files that markers need to download. If you feel the need for Python virtual environments, then you are probably overdoing it and mistaking this practical for a software engineering project, which it most definitely is not. The code that you upload would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will likely have installed already, but avoid anything more exotic. Assume a version of Python3 no newer than the one on the lab machines; the marker may not have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.