
            Date: 2024-04-03



            University of Macau
            CISC3025 - Natural Language Processing
            Project#3, 2023/2024
            (Due date: 18th April)
            Person Name ('Named Entity') Recognition
            This is a group project with two students at most. You need to enroll in a group here. In this project,
            you will be building a maximum entropy model (MEM) for identifying person names in newswire
            texts (Label=PERSON or Label=O). We have provided all of the machinery for training and testing
            your MEM, but we have left the feature set woefully inadequate. Your job is to modify the code
            for generating features so that it produces a much more sensible, complete, and higher-performing
            set of features.
            NOTE: In this project, we expect you to design a web application for demonstrating your final
            model. Your web page must provide at least the following basic function: 1) the user inputs a
            sentence; 2) the page outputs the named entity recognition results. More functionality in your
            web application is highly encouraged. For example, you can integrate the previous project’s work,
            i.e., text classification, into your project (it would be very cool!).
            You NEED to submit:
            • Runnable program
            o You need to implement a Named Entity Recognition model based on the given starter
            code
            • Model file
            o Once you have finished designing your features and the program works well, it
            will dump a model file (‘model.pkl’) automatically. We will use it to evaluate
            your model.
            • Web application
            o You also need to develop a web application (freestyle, no restriction on programming
            languages) to demonstrate your NER model or even more NLP functions.
            o You will need to learn how to call your Python project from the web
            application.
            • Report
            o You should finish a report to introduce your work on this project. Your report should
            contain the following content:
            § Introduction;
            § Description of the methods, implementation, and additional consideration to
            optimize your model;
            § Evaluations and discussions about your findings;
            § Conclusion and future work suggestions.
            • Presentation
            o You need to give an 8-minute presentation in class to introduce your work, followed
            by a 3-minute Q&A session. The content of the presentation may refer to the report.
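As one possible way to call a Python model from a web page, the sketch below uses only the standard library's wsgiref module. It is a minimal illustration, not part of the starter code: the predict_sentence stand-in, the app and serve names, the "sentence" query parameter, and the port are all assumptions. In a real submission the stand-in would be replaced by a call into your trained MEM model.

```python
import json
from urllib.parse import parse_qs
from wsgiref.simple_server import make_server


def predict_sentence(sentence):
    """Hypothetical stand-in for the real model call.

    It labels capitalized tokens as PERSON only so that the example runs
    without the trained model; it is NOT the MEM classifier.
    """
    return [(w, "PERSON" if w[:1].isupper() else "O") for w in sentence.split()]


def app(environ, start_response):
    """WSGI application: read ?sentence=... and return labels as JSON."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    sentence = query.get("sentence", [""])[0]
    body = json.dumps(predict_sentence(sentence)).encode("utf-8")
    headers = [
        ("Content-Type", "application/json; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(port=8000):
    """Start a local development server (blocks until interrupted)."""
    with make_server("localhost", port, app) as server:
        server.serve_forever()
```

Calling serve() and then opening http://localhost:8000/?sentence=Peter+said+hello in a browser would return the token/label pairs as JSON. Any web framework or language is allowed by the assignment, so this is only one of many options.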
            Starter Code
            In the starter code, we have provided you with three simple starter features, but you should be able
            to improve substantially on them. We recommend experimenting with orthographic information,
            gazetteers, and the surrounding words, and we also encourage you to think beyond these
            suggestions.
            The file you will be modifying is MEM.py
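To make the gazetteer and surrounding-word suggestions concrete, here is a small sketch of how such features might be generated. The FIRST_NAMES set, the context_features helper, the feature names, and the boundary sentinels are all hypothetical; a real gazetteer would be loaded from a name list, and the result would be merged into the dictionary built by the starter code's feature function.

```python
# Hypothetical mini-gazetteer; a real one would be read from a name list file.
FIRST_NAMES = {"peter", "jenny", "mary", "john"}


def context_features(words, position):
    """Return gazetteer and neighbouring-word features for words[position]."""
    features = {}
    current_word = words[position]

    # Gazetteer lookup (case-insensitive).
    if current_word.lower() in FIRST_NAMES:
        features['in_first_name_gazetteer'] = 1

    # Neighbouring words, with sentinels at the ends of the sequence.
    if position > 0:
        prev_word = words[position - 1].lower()
    else:
        prev_word = '<START>'
    if position + 1 < len(words):
        next_word = words[position + 1].lower()
    else:
        next_word = '<END>'
    features['prev_word=%s' % prev_word] = 1
    features['next_word=%s' % next_word] = 1
    return features
```

Because each feature name embeds the neighbouring word, every distinct neighbour gets its own weight during training.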
            Adding Features to the Code
            You will create the features for the word at the given position, with the given previous label. You
            may condition on any word in the sequence (and its relative position), not just the current word
            because they are all observed. You may not condition on any labels other than the previous one.
            You need to give a unique name for each feature. The system will use this unique name in training
            to set the weight for that feature. At test time, the system will use the name of this feature
            and its weight to make a classification decision.
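The role of the previous label can be illustrated with a left-to-right tagging loop, which could also serve as a starting point for a sentence-level prediction function. This is a sketch under stated assumptions: greedy_tag, toy_features, and toy_prob_person are hypothetical names, the initial label 'O' and the 0.5 threshold are arbitrary choices, and the toy scorer stands in for the trained classifier. If the starter classifier exposes an NLTK-style prob_classify method, prob_person_fn could wrap it, but that depends on the starter code.

```python
def greedy_tag(words, feature_fn, prob_person_fn, start_label='O'):
    """Tag words left to right, feeding each prediction in as the next previous label."""
    labels = []
    previous_label = start_label  # assumed label before the first word
    for position in range(len(words)):
        feats = feature_fn(words, previous_label, position)
        label = 'PERSON' if prob_person_fn(feats) > 0.5 else 'O'
        labels.append(label)
        previous_label = label
    return list(zip(words, labels))


def toy_features(words, previous_label, position):
    """Tiny feature function in the same shape as the starter code's."""
    feats = {'prev_label': previous_label}
    if words[position][:1].isupper():
        feats['Titlecase'] = 1
    return feats


def toy_prob_person(feats):
    """Toy scorer standing in for the trained model's P(PERSON)."""
    score = 0.1
    if feats.get('Titlecase'):
        score += 0.5
    if feats.get('prev_label') == 'PERSON':
        score += 0.2
    return score


print(greedy_tag(["said", "Peter", "Blackburn", "."], toy_features, toy_prob_person))
```

Note that only the single previous label is passed along, which matches the restriction that features may not condition on any other labels.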
            Types of features to include
            Your features should not just be the words themselves. The features can represent any property of
            the word, context, or additional knowledge.
            For example, the case of a word is a good predictor for a person's name, so you might want to add
            a feature to capture whether a given word was lowercase, Titlecase, CamelCase, ALLCAP, etc.
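A case feature of this kind could be computed with a small helper such as the one below. The case_shape name and the exact category definitions are assumptions made for illustration; in particular, a single capital letter is classed as ALLCAP here, and tokens without letters fall into 'other'.

```python
def case_shape(word):
    """Classify a token's capitalization pattern, ignoring non-letter characters."""
    letters = [ch for ch in word if ch.isalpha()]
    if not letters:
        return 'other'            # digits, punctuation, dates, ...
    if all(ch.islower() for ch in letters):
        return 'lowercase'
    if all(ch.isupper() for ch in letters):
        return 'ALLCAP'
    if letters[0].isupper() and all(ch.islower() for ch in letters[1:]):
        return 'Titlecase'
    if any(ch.isupper() for ch in letters[1:]):
        return 'CamelCase'        # an uppercase letter after the first one
    return 'other'                # e.g. letters without case


for token in ["lamb", "Peter", "BRUSSELS", "McDonald", "1996-08-22"]:
    print(token, case_shape(token))
```

Inside the feature function this might be used as features['case=%s' % case_shape(current_word)] = 1, so that each category receives its own weight.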
            def features(self, words, previous_label, position):
                features = {}
                """ Baseline Features """
                current_word = words[position]
                features['has_(%s)' % current_word] = 1
                features['prev_label'] = previous_label
                if current_word[0].isupper():
                    features['Titlecase'] = 1
                #===== TODO: Add your features here =======#
                #...
                #=============== TODO: Done ================#
                return features
            Imagine you saw the word “Jenny”. In addition to the feature for the word itself (as above), you
            could add a feature to indicate it was in Title case, like:
                if current_word[0].isupper():
                    features['Titlecase'] = 1
            You might encounter an unknown word in the test set, but if you know it begins with a capital letter
            then this might be evidence that helps with the correct prediction.
            Choosing the correct features is an important part of natural language processing. It is as much art
            as science: some trial and error is inevitable, but you should see your accuracy increasing as you
            add new types of features.
            The name of a feature is no different from an ID number: you can assign any name to a
            feature as long as it is unique. For example, you can use “case=Title” instead of “Titlecase”.
            Running the Program
            We have provided you with a training set and a development set. We will be running your programs
            on an unseen test set, so you should try to make your features as general as possible. Your goal
            should be to increase F1, the harmonic mean of precision and recall, on the dev set.
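For reference, the plain definition of F1 can be computed as in the sketch below. The precision_recall_f1 helper and the toy label lists are hypothetical, and run.py may compute its reported f_score differently (for instance, its Customization block defines a BETA value), so this only illustrates the harmonic-mean definition with PERSON as the positive class.

```python
def precision_recall_f1(gold, predicted, positive='PERSON'):
    """Compute precision, recall and F1 for one positive label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Toy example: one of three gold PERSON tokens is missed, no false alarms.
gold = ['O', 'PERSON', 'PERSON', 'O', 'PERSON']
pred = ['O', 'PERSON', 'O', 'O', 'PERSON']
print(precision_recall_f1(gold, pred))
```

In the toy example precision is 1.0 and recall is 2/3, so F1 is about 0.8: the harmonic mean is pulled toward the lower of the two values, which is why missed names hurt the score even when every predicted name is correct.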
            You can use three different command flags (‘-t’, ‘-d’, ‘-s’) to train, test, and show respectively.
            These flags can be used independently or jointly. If you run the program as it is, you should see the
            following training process:
            $ cd NER
            $ python run.py -t
            Training classifier...
              ==> Training (5 iterations)
                  Iteration    Log-Likelihood    Accuracy
                  ---------------------------------------
                          1          -0.69315       0.055
                          2          -0.09383       0.946
                          3          -0.08134       0.968
                          4          -0.07136       0.969
                      Final          -0.06330       0.969
            Afterward, it can print out your score on the dev set.
            $ python run.py -d
            Testing classifier...
            f_score = 0.8715
            accuracy = 0.9641
            recall = 0.7143
            precision = 0.9642
            You can also give it an additional flag, -s, and have it show verbose sample results. The first column
            is the word, the last two columns are your program's prediction of the word’s probability to be
            PERSON or O. The star ‘*’ indicates the gold result. This should help you do error analysis and
            properly target your features.
            $ python run.py -s
             Words            P(PERSON)      P(O)
            ----------------------------------------
             EU                 0.0544    *0.9456
             rejects            0.0286    *0.9714
             German             0.0544    *0.9456
             call               0.0286    *0.9714
             to                 0.0284    *0.9716
             boycott            0.0286    *0.9714
             British            0.0544    *0.9456
             lamb               0.0286    *0.9714
             .                  0.0281    *0.9719
             Peter             *0.4059     0.5941
             Blackburn         *0.5057     0.4943
             BRUSSELS           0.4977    *0.5023
             1996-08-22         0.0286    *0.9714
             The                0.0544    *0.9456
             European           0.0544    *0.9456
             Commission         0.0544    *0.9456
             said               0.0258    *0.9742
             on                 0.0283    *0.9717
             Thursday           0.0544    *0.9456
             it                 0.0286    *0.9714
            Where to make your changes?
            1. Function ‘features()’ in MEM.py
            2. You can modify the “Customization” part in run.py in order to debug more efficiently and
            properly. Note that your final submitted model should be trained for at least 20
            iterations.
            #====== Customization ======
            BETA = 0.5
            MAX_ITER = 5 # max training iteration
            BOUND = (0, 20) # the desired position bound of samples
            #==========================
            3. You may need to add a function “predict_sentence( )” in class MEM( ) to output predictions
            and integrate with your web application.
            Changes beyond these, if you choose to make any, should be done with caution.
            Grading
            The assignment will be graded based on your code, report, and, most importantly, final
            presentation.
            Tips
            • Start early! This project may take longer than the previous assignments if you are aiming for
            the perfect score.
            • Generalize your features. For example, if you're adding the above "case=Title" feature, think
            about whether there is any pattern that is not captured by the feature. Would the "case=Title"
            feature capture "O'Gorman"?
            • When you add a new feature, think about whether it would have a positive or negative weight
            for PERSON and O tags (these are the only tags for this assignment).
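Whether a Title-case feature captures "O'Gorman" depends on how Title case is defined. The sketch below compares three hypothetical definitions: the first-character check used by the baseline feature, a strict "one capital followed by lowercase letters" pattern, and a looser pattern that allows capitalized parts joined by an apostrophe or hyphen. The feature names, the regular expressions, and the capitalization_features helper are illustrative assumptions; for example, the patterns only handle ASCII letters and the straight apostrophe.

```python
import re

# Strict Title case: one capital followed only by lowercase letters.
STRICT_TITLE = re.compile(r"^[A-Z][a-z]+$")
# Name-like: capitalized parts joined by an apostrophe or a hyphen.
NAME_LIKE = re.compile(r"^[A-Z][a-z]*(?:['-][A-Z][a-z]+)*$")


def capitalization_features(word):
    """Return one feature per capitalization definition that the word satisfies."""
    features = {}
    if word[:1].isupper():
        features['first_char_upper'] = 1   # same test as the baseline feature
    if STRICT_TITLE.match(word):
        features['case=StrictTitle'] = 1
    if NAME_LIKE.match(word):
        features['case=NameLike'] = 1
    return features


for token in ["Peter", "O'Gorman", "Jean-Pierre", "BRUSSELS", "lamb"]:
    print(token, sorted(capitalization_features(token)))
```

Here "O'Gorman" fires the first-character and name-like features but not the strict one, while "BRUSSELS" fires only the first-character feature. Comparing such variants on the dev set is one way to check which definition generalizes best.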
