


CS5012 Mark-Jan Nederhof Practical 1
Practical 1: Part of speech tagging: three algorithms
This practical is worth 50% of the coursework component of this module. Its due
date is Wednesday 6th of March 2024, at 21:00. Note that MMS is the definitive source
for deadlines and weights.
The purpose of this assignment is to gain understanding of the Viterbi algorithm,
and its application to part-of-speech (POS) tagging. The Viterbi algorithm will be
related to two other algorithms.
You will also get to see the Universal Dependencies treebanks. The main purpose
of these treebanks is dependency parsing (to be discussed later in the module), but
here we only use their part-of-speech tags.
Getting started
We will be using Python3. On the lab (Linux) machines, you need the full path
/usr/local/python/bin/python3, which is set up to work with NLTK. (Plain
python3 won’t be able to find NLTK.)
If you run Python on your personal laptop, then in addition to NLTK
(https://www.nltk.org/), you will also need to install the conllu package
(https://pypi.org/project/conllu/).
To help you get started, download gettingstarted.py and the other Python
files, and the zip file with treebanks from this directory. After unzipping, run
/usr/local/python/bin/python3 gettingstarted.py. You may, but need not, use
parts of the provided code in your submission.
The three treebanks come from Universal Dependencies. If you are interested,
you can download the entire set of treebanks from
https://universaldependencies.org/.
Parameter estimation
First, we write code to estimate the transition probabilities and the emission
probabilities of an HMM (Hidden Markov Model), on the basis of (tagged) sentences from
a training corpus from Universal Dependencies. Do not forget to include the
start-of-sentence marker ⟨s⟩ and the end-of-sentence marker ⟨/s⟩ in the estimation.
The code in this part is concerned with:
• counting occurrences of one part of speech following another in a training corpus,
• counting occurrences of words together with parts of speech in a training corpus,
• relative frequency estimation with smoothing.
As discussed in the lectures, smoothing is necessary to avoid zero probabilities for
events that were not witnessed in the training corpus. Rather than implementing a
form of smoothing yourself, you can for this assignment take the implementation of
Witten-Bell smoothing in NLTK (among the implementations of smoothing in NLTK,
this seems to be the most robust one). An example of use for emission probabilities is
in file smoothing.py; one can similarly apply smoothing to transition probabilities.
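The counting steps listed above can be sketched as follows. This is a minimal sketch, assuming each training sentence is available as a list of (word, tag) pairs; the names count_events and relative_frequency and the toy corpus are hypothetical and not part of the provided files. The relative-frequency step here is deliberately unsmoothed, so unseen events get no probability at all; in a submission, each per-tag counter would instead be handed to a smoothed distribution such as NLTK's WittenBellProbDist.

```python
from collections import Counter, defaultdict

START, END = "<s>", "</s>"


def count_events(tagged_sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(Counter)  # transitions[prev_tag][tag]
    emissions = defaultdict(Counter)    # emissions[tag][word]
    for sentence in tagged_sentences:
        # Pad the tag sequence with the sentence boundary markers.
        tags = [START] + [tag for _, tag in sentence] + [END]
        for prev_tag, tag in zip(tags, tags[1:]):
            transitions[prev_tag][tag] += 1
        for word, tag in sentence:
            emissions[tag][word] += 1
    return transitions, emissions


def relative_frequency(counter):
    """Unsmoothed estimate: count of an event divided by the total count."""
    total = sum(counter.values())
    return {event: count / total for event, count in counter.items()}


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN")],
]
transitions, emissions = count_events(toy_corpus)
noun_transitions = relative_frequency(transitions["NOUN"])
```

In the toy corpus, NOUN is followed once by VERB and once by ⟨/s⟩, so both transitions receive probability 0.5, while a transition such as NOUN to DET is absent from the table. That absence is exactly the zero-probability problem that smoothing addresses.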
Three algorithms for POS tagging
Algorithm 1: eager algorithm
First, we implement a naive algorithm that chooses the POS tag for the i-th token
on the basis of the chosen (i − 1)-th tag and the i-th token. To be more precise, we
determine for each $i = 1, \ldots, n$, in this order:
$$\hat{t}_i = \operatorname*{argmax}_{t_i} P(t_i \mid \hat{t}_{i-1}) \cdot P(w_i \mid t_i)$$
assuming $\hat{t}_0$ is the start-of-sentence marker ⟨s⟩. Note that the end-of-sentence marker
⟨/s⟩ is not even used here.
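The eager algorithm can be sketched in a few lines. This is only an illustration under stated assumptions: the probability tables TRANS and EMIT are hypothetical toy values rather than estimates from a treebank, and the constant FLOOR is a crude stand-in for the small probability that a smoothed model would assign to unseen events.

```python
START = "<s>"
FLOOR = 1e-6  # assumption: stand-in for smoothed probability of unseen events

# Hypothetical toy parameters, not estimated from any treebank.
TRANS = {
    ("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.1,
    ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.2,
}
EMIT = {
    ("DET", "the"): 0.9,
    ("NOUN", "dog"): 0.5, ("NOUN", "barks"): 0.1,
    ("VERB", "barks"): 0.6,
}


def trans_prob(prev_tag, tag):
    return TRANS.get((prev_tag, tag), FLOOR)


def emit_prob(tag, word):
    return EMIT.get((tag, word), FLOOR)


def eager_tag(words, tags):
    """Choose each tag from the previously chosen tag and the current word."""
    chosen = []
    prev_tag = START
    for word in words:
        best = max(tags, key=lambda t: trans_prob(prev_tag, t) * emit_prob(t, word))
        chosen.append(best)
        prev_tag = best  # the decision is final; it is never revisited
    return chosen


eager_result = eager_tag(["the", "dog", "barks"], ["DET", "NOUN", "VERB"])
```

Because each decision is committed immediately, an early mistake cannot be repaired by evidence that appears later in the sentence.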
Algorithm 2: Viterbi algorithm
Now we implement the Viterbi algorithm, which determines the sequence of tags for a
given sentence that has the highest probability. As discussed in the lectures, this is:
$$\hat{t}_1 \cdots \hat{t}_n = \operatorname*{argmax}_{t_1 \cdots t_n} \left( \prod_{i=1}^{n} P(t_i \mid t_{i-1}) \cdot P(w_i \mid t_i) \right) \cdot P(t_{n+1} \mid t_n)$$
where the tokens of the input sentence are $w_1 \cdots w_n$, and $t_0$ = ⟨s⟩ and $t_{n+1}$ = ⟨/s⟩ are
the start-of-sentence and end-of-sentence markers, respectively.
To avoid underflow for long sentences, we need to use log probabilities.
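A possible shape for the Viterbi computation in log space is sketched below. The toy tables TRANS and EMIT, the FLOOR constant for unseen events, and the function names are all assumptions made for illustration; a real implementation would take its probabilities from the smoothed estimates described earlier.

```python
import math

START, END = "<s>", "</s>"
FLOOR = 1e-6  # assumption: stand-in for smoothed probability of unseen events

# Hypothetical toy parameters, not estimated from any treebank.
TRANS = {
    ("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.1,
    ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.2,
    ("NOUN", "</s>"): 0.3, ("VERB", "</s>"): 0.7,
}
EMIT = {
    ("DET", "the"): 0.9,
    ("NOUN", "dog"): 0.5, ("NOUN", "barks"): 0.1,
    ("VERB", "barks"): 0.6,
}


def log_trans(prev_tag, tag):
    return math.log(TRANS.get((prev_tag, tag), FLOOR))


def log_emit(tag, word):
    return math.log(EMIT.get((tag, word), FLOOR))


def viterbi(words, tags):
    """Return the tag sequence with the highest joint log probability."""
    if not words:
        return []
    # best[i][t]: highest log probability of tagging words[0..i] ending in tag t.
    best = [{t: log_trans(START, t) + log_emit(t, words[0]) for t in tags}]
    back = [{}]  # back[i][t]: the tag at position i-1 on that best path
    for word in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: best[-1][p] + log_trans(p, t))
            scores[t] = best[-1][prev] + log_trans(prev, t) + log_emit(t, word)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # The final transition to the end-of-sentence marker decides the last tag.
    last = max(tags, key=lambda t: best[-1][t] + log_trans(t, END))
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


viterbi_result = viterbi(["the", "dog", "barks"], ["DET", "NOUN", "VERB"])
```

Products of probabilities become sums of log probabilities, so the scores stay in a numerically safe range even for long sentences, and the argmax is unchanged because the logarithm is monotonic.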
Algorithm 3: individually most probable tags
We now write code that determines the most probable part of speech for each token
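individually. That is, for each $i$, we compute:
$$\hat{t}_i = \operatorname*{argmax}_{t_i} \sum_{t_1 \cdots t_{i-1} t_{i+1} \cdots t_n} \left( \prod_{k=1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n)$$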
To compute this effectively, we need to use forward and backward values, as discussed
in the lectures on the Baum-Welch algorithm, making use of the fact that the above is
equivalent to:
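$$\hat{t}_i = \operatorname*{argmax}_{t_i} \left( \sum_{t_1 \cdots t_{i-1}} \prod_{k=1}^{i} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot \left( \sum_{t_{i+1} \cdots t_n} \left( \prod_{k=i+1}^{n} P(t_k \mid t_{k-1}) \cdot P(w_k \mid t_k) \right) \cdot P(t_{n+1} \mid t_n) \right)$$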
The computation of forward values is very similar to the Viterbi algorithm, so you
may want to copy and change the code you already had, replacing statements that
maximise by corresponding statements that sum values together. Computation of
backward values is similar to computation of forward values.
See logsumexptrick.py for a demonstration of the use of log probabilities when
probabilities are summed, without getting underflow in the conversion from log probabilities to probabilities and back.
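The forward and backward computation with summation in log space might look like the sketch below. It is an independent illustration, not the content of the provided logsumexptrick.py, which may be organised differently. The toy tables, the FLOOR constant for unseen events, and all function names are assumptions.

```python
import math

START, END = "<s>", "</s>"
FLOOR = 1e-6  # assumption: stand-in for smoothed probability of unseen events

# Hypothetical toy parameters, not estimated from any treebank.
TRANS = {
    ("<s>", "DET"): 0.8, ("<s>", "NOUN"): 0.2,
    ("DET", "NOUN"): 0.9, ("DET", "VERB"): 0.1,
    ("NOUN", "VERB"): 0.6, ("NOUN", "NOUN"): 0.2,
    ("NOUN", "</s>"): 0.3, ("VERB", "</s>"): 0.7,
}
EMIT = {
    ("DET", "the"): 0.9,
    ("NOUN", "dog"): 0.5, ("NOUN", "barks"): 0.1,
    ("VERB", "barks"): 0.6,
}


def log_trans(prev_tag, tag):
    return math.log(TRANS.get((prev_tag, tag), FLOOR))


def log_emit(tag, word):
    return math.log(EMIT.get((tag, word), FLOOR))


def logsumexp(values):
    """Compute log(sum(exp(v))) without leaving the safe numeric range."""
    largest = max(values)
    if largest == -math.inf:
        return -math.inf
    return largest + math.log(sum(math.exp(v - largest) for v in values))


def most_probable_tags(words, tags):
    """Pick, for each position, the tag with the highest forward * backward value."""
    n = len(words)
    if n == 0:
        return []
    # forward[i][t]: log of the summed probability of all prefixes ending in t at i.
    forward = [{t: log_trans(START, t) + log_emit(t, words[0]) for t in tags}]
    for i in range(1, n):
        forward.append({
            t: logsumexp([forward[i - 1][p] + log_trans(p, t) for p in tags])
            + log_emit(t, words[i])
            for t in tags
        })
    # backward[i][t]: log of the summed probability of all suffixes after t at i,
    # including the final transition to the end-of-sentence marker.
    backward = [None] * n
    backward[n - 1] = {t: log_trans(t, END) for t in tags}
    for i in range(n - 2, -1, -1):
        backward[i] = {
            t: logsumexp([
                log_trans(t, nxt) + log_emit(nxt, words[i + 1]) + backward[i + 1][nxt]
                for nxt in tags
            ])
            for t in tags
        }
    return [max(tags, key=lambda t: forward[i][t] + backward[i][t]) for i in range(n)]


marginal_result = most_probable_tags(["the", "dog", "barks"], ["DET", "NOUN", "VERB"])
```

Subtracting the largest value before exponentiating keeps at least one term equal to exp(0) = 1, so the sum cannot underflow to zero even when every log probability is very negative.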
Evaluation
Next, we write code to determine the percentages of tags in a test corpus that are
guessed correctly by the above three algorithms. Run experiments for the training
and test corpora of the three included treebanks, and possibly for treebanks of more
languages (but not for more than 5; aim for quality rather than quantity). Compare
the performance of the three algorithms.
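The accuracy computation can be sketched as below, assuming a tagger is any function that maps a list of words to a list of tags of the same length. The function name tagging_accuracy, the always-NOUN baseline, and the toy test corpus are hypothetical and serve only to show the calculation.

```python
def tagging_accuracy(gold_sentences, tagger):
    """Percentage of tokens whose predicted tag equals the gold tag."""
    correct = 0
    total = 0
    for sentence in gold_sentences:
        words = [word for word, _ in sentence]
        gold_tags = [tag for _, tag in sentence]
        predicted_tags = tagger(words)
        correct += sum(p == g for p, g in zip(predicted_tags, gold_tags))
        total += len(gold_tags)
    return 100.0 * correct / total if total else 0.0


def always_noun(words):
    """Trivial baseline used here only to exercise the evaluation code."""
    return ["NOUN"] * len(words)


# Hypothetical toy test corpus, for illustration only.
toy_test = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN")],
]
baseline_accuracy = tagging_accuracy(toy_test, always_noun)
```

Accuracy is counted over tokens rather than over sentences, so long sentences contribute proportionally more to the figure.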
You get the best experience out of this practical if you also consider the languages of
the treebanks. What do you know (or what can you find out) about the morphological
and syntactic properties of these languages? Can you explain why POS tagging is more
difficult for some languages than for others?
Requirements
Submit your Python code and the report.
It should be possible to run your implementation of the three algorithms on the
three corpora simply by calling from the command line:
python3 p1.py
You may add further functionality, but then add a README file to explain how to run
that functionality. You should include the three treebanks needed to run the code, but
please do not include the entire set of hundreds of treebanks from Universal
Dependencies, because this would be a huge waste of disk space and
bandwidth for the marker.
Marking is in line with the General Mark Descriptors (see pointers below). Evidence of an acceptable attempt (up to 7 marks) could be code that is not functional but
nonetheless demonstrates some understanding of POS tagging. Evidence of a reasonable attempt (up to 10 marks) could be code that implements Algorithm 1. Evidence
of a competent attempt addressing most requirements (up to 13 marks) could be fully
correct code in good style, implementing Algorithms 1 and 2 and a brief report. Evidence of a good attempt meeting nearly all requirements (up to 16 marks) could be
a good implementation of Algorithms 1 and 2, plus an informative report discussing
meaningful experiments. Evidence of an excellent attempt with no significant defects
(up to 18 marks) requires an excellent implementation of all three algorithms, and a
report that discusses thorough experiments and analysis of inherent properties of the
algorithms, as well as awareness of linguistic background discussed in the lectures. An
exceptional achievement (up to 20 marks) in addition requires exceptional understanding of the subject matter, evidenced by experiments, their analysis and reflection in
the report.
Hints
Even though this module is not about programming per se, a good programming style
is expected. Choose meaningful variable and function names. Break up your code into
small functions. Avoid cryptic code, and add code commenting where it is necessary for
the reader to understand what is going on. Do not overengineer your code; a relatively
simple task deserves a relatively simple implementation.
You cannot use any of the POS taggers already implemented in NLTK. However,
you may use general utility functions in NLTK such as ngrams from nltk.util, and
FreqDist and WittenBellProbDist from nltk.
When you are reporting the outcome of experiments, the foremost requirement is
reproducibility. So if you give figures or graphs in your report, explain precisely what
you did, and how, to obtain those results.
Considering current class sizes, please be kind to your marker, by making their task
as smooth as possible:
• Go for quality rather than quantity. We are looking for evidence of understanding
rather than for lots of busywork. Especially understanding of language and how
language works from the perspective of the HMM model is what this practical
should be about.
• Avoid Python virtual environments. These blow up the size of the files that
markers need to download. If you feel the need for Python virtual environments,
then you are probably overdoing it, and mistake this practical for a software
engineering project, which it most definitely is not. The code that you upload
would typically consist of three or four .py files.
• You could use standard packages such as numpy or pandas, which the marker will
likely have installed already, but avoid anything more exotic. Assume a version
of Python3 that is the one on the lab machines or older; the marker may not
have installed the latest bleeding-edge version yet.
• We strongly advise against letting the report exceed 10 pages. We do not expect
an essay on NLP or the history of the Viterbi algorithm, or anything of the sort.
• It is fine to include a couple of graphs and tables in the report, but don’t overdo
it. Plotting accuracy against any conceivable hyperparameter, just for the sake
of producing lots of pretty pictures, is not what we are after.