Research Article
A Low-Cost Named Entity Recognition Research Based on Active Learning
Algorithm 1
The pseudocode of AL-CRF.
Initialization: unlabelled dataset, initialization data number, additional number in iteration
// k-means clustering
select cluster centers randomly
do
  for each sample in the unlabelled dataset do
    for each cluster center do
      if the center is the nearest one to the sample then
        the cluster of the sample is that center's cluster
      end
    end
  end
  update each cluster center to the mean of its cluster
while the cluster centers change
output the clustered dataset
select the initial samples from the clustered dataset by stratified sampling
annotate the selected samples into the labelled set
train CRFs by the labelled set
calculate the F-value of CRFs
while the stopping criterion is not met do
  calculate entropy in the unlabelled set
  rank the unlabelled samples according to the entropy
  annotate top samples into the labelled set
  train CRFs by the labelled set
  calculate the F-value of CRFs
end
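The three data-selection steps of Algorithm 1 (k-means clustering, stratified sampling of the initial batch, and entropy-based ranking) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names are hypothetical, samples are assumed to be already represented as numeric feature tuples, and the CRF is replaced by an arbitrary `predict_proba` callable that returns one label distribution per token. Scoring a sentence by its mean token entropy is also an assumption, since the algorithm only states that samples are ranked by entropy.

```python
import math
import random


def kmeans(points, k, rng, max_iter=100):
    """Plain k-means on tuples of floats; returns one cluster index per point."""
    centers = rng.sample(points, k)  # random initial centers
    assignment = None
    for _ in range(max_iter):
        # Assign every point to its nearest center (squared Euclidean distance).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_assignment == assignment:  # no point changed cluster: converged
            break
        assignment = new_assignment
        # Move each center to the mean of its members (unchanged if empty).
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return assignment


def stratified_sample(assignment, n, rng):
    """Pick roughly n indices, drawing from each cluster in proportion to its size."""
    clusters = {}
    for idx, c in enumerate(assignment):
        clusters.setdefault(c, []).append(idx)
    total = len(assignment)
    chosen = []
    for members in clusters.values():
        quota = max(1, round(n * len(members) / total))  # at least one per cluster
        chosen.extend(rng.sample(members, min(quota, len(members))))
    return chosen


def entropy(probs):
    """Shannon entropy (natural log) of one label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool, predict_proba, m):
    """Rank pool items by mean token entropy (assumed score) and return the top m."""
    def score(sentence):
        dists = predict_proba(sentence)  # one label distribution per token
        return sum(entropy(d) for d in dists) / len(dists)
    return sorted(pool, key=score, reverse=True)[:m]
```

In an active-learning loop, `stratified_sample` would be called once to choose the initial batch to annotate, and `select_most_uncertain` once per iteration on the remaining unlabelled pool, with `predict_proba` backed by the marginal probabilities of the current model. Because of rounding and the one-per-cluster minimum, `stratified_sample` may return slightly more or fewer than `n` indices.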