Research Article

A Data-Driven Model for Automated Chinese Word Segmentation and POS Tagging

Table 5

Presentation of the four datasets.

Dataset | Data segmentation | Number of words (K) | Number of sentences (K)
MSR | Training set | 49418
MSR | Test set | 8.0348
CTB7 | Training set | 7831
CTB7 | Test set | 24510
CTB5 | Training set | 213178
CTB5 | Test set | 1074