iLCM - A Virtual Research Infrastructure for Large-Scale Qualitative Data

05/11/2018
by Andreas Niekler, et al.

The iLCM project pursues the development of an integrated research environment for the analysis of structured and unstructured data in a "Software as a Service" (SaaS) architecture. The research environment addresses requirements for the quantitative evaluation of large amounts of qualitative data with text mining methods, as well as requirements for the reproducibility of data-driven research designs in the social sciences. To this end, the iLCM research environment comprises two central components. The first is the Leipzig Corpus Miner (LCM), a decentralized SaaS application for the analysis of large amounts of news texts that was developed in a previous Digital Humanities project. The second is an "Open Research Computing" (ORC) environment for executable script documents, so-called "notebooks", which extends the text mining tools implemented in the LCM. This novel integration makes it possible to combine generic, high-performance methods for processing large amounts of unstructured text data with individual program scripts that address specific research requirements in computational social science and digital humanities.
