
The Czech Court Decisions Corpus (CzCDC): Availability as the First Step

by Tereza Novotná, et al.

In this paper, we describe the Czech Court Decisions Corpus (CzCDC). CzCDC is a dataset of 237,723 decisions published by the Czech apex (or top-tier) courts, namely the Supreme Court, the Supreme Administrative Court and the Constitutional Court. All the decisions were published between 1st January 1993 and 30th September 2018. Court decisions are available on the webpages of the respective courts or via commercial databases of legal information. Researchers interested in these decisions therefore often have to turn either to the respective court or to a commercial provider, which leads to delays and additional costs. These are further exacerbated by the lack of an inter-court standard for the data format in which courts provide their decisions. Additionally, courts' databases often lack proper documentation. Our goal is to make the dataset of court decisions freely available online in a consistent (plain text) format to lower the cost associated with obtaining data for future research. We believe that simplified access to court decisions through the CzCDC could benefit other researchers. In this paper, we describe the processing of decisions before their inclusion in the CzCDC and present basic statistics of the dataset. The dataset contains plain texts of court decisions; the texts are not annotated for any grammatical or syntactic features.


