Data Lakes for Digital Humanities

12/04/2020
by Jérôme Darmont, et al.

Traditional data in Digital Humanities projects come in various formats (structured, semi-structured, textual) and need substantial transformations (encoding and tagging, stemming, lemmatization, etc.) before they can be managed and analyzed. To fully master this process, we propose the use of data lakes as a solution to data siloing and big data variety problems. We describe data lake projects we currently run in close collaboration with researchers in humanities and social sciences, and discuss the lessons learned from running these projects.


research
05/05/2020

Role of Apache Software Foundation in Big Data Projects

With the increase in amount of Big Data being generated each year, tools...
research
10/10/2022

Revisiting Connotations of Digital Humanists: Exploratory Interviews

This ongoing study revisits the connotations of "digital humanists" and ...
research
06/19/2020

REBD:A Conceptual Framework for Big Data Requirements Engineering

Requirements engineering (RE), as a part of the project development life...
research
02/26/2018

Digital Archives as Big Data

Digital archives contribute to Big data. Combining social network analys...
research
08/19/2021

Challenges and Solutions for Utilizing Earth Observations in the "Big Data" era

The ever-growing need of data preservation and their systematic analysis...
research
01/21/2000

Take-home Complexity

We discuss the use of projects in first-year graduate complexity theory ...
research
11/26/2020

Early Life Cycle Software Defect Prediction. Why? How?

Many researchers assume that, for software analytics, "more data is bett...
