JPS-daprinfo: A Dataset for Japanese Dialog Act Analysis and People-related Information Detection
We conducted a labeling work on a spoken Japanese dataset (I-JAS) for the text classification, which contains 50 interview dialogues of two-way Japanese conversation that discuss the participants' past present and future. Each dialogue is 30 minutes long. From this dataset, we selected the interview dialogues of native Japanese speakers as the samples. Given the dataset, we annotated sentences with 13 labels. The labeling work was conducted by native Japanese speakers who have experiences with data annotation. The total amount of the annotated samples is 20130.
READ FULL TEXT