Florian Schmidt

  • Neural Document Embeddings for Intensive Care Patient Mortality Prediction

    We present an automatic mortality prediction scheme based on the unstructured textual content of clinical notes. Proposing a convolutional document embedding approach, our empirical investigation using the MIMIC-III intensive care database shows significant performance gains compared to previously employed methods such as latent topic distributions or generic doc2vec embeddings. These improvements are especially pronounced for the difficult problem of post-discharge mortality prediction.

    12/01/2016 ∙ by Paulina Grnarova, et al.

  • Deep State Space Models for Unconditional Word Generation

    Autoregressive feedback is considered a necessity for successful unconditional text generation using stochastic sequence models. However, such feedback is known to introduce systematic biases into the training and it obscures a principle of generation: committing to global information and forgetting local nuances. We show that a non-autoregressive deep state space model with a clear separation of global and local uncertainty can be built from only two ingredients: an independent noise source and a deterministic transition function. Recent advances on flow-based variational inference allow training an evidence lower bound without resorting to annealing, auxiliary losses or similar measures. The result is a highly interpretable generative model on par with a comparable autoregressive model on the task of word generation.

    06/12/2018 ∙ by Florian Schmidt, et al.

  • Grand Challenge: Real-time Destination and ETA Prediction for Maritime Traffic

    In this paper, we present our approach for solving the DEBS Grand Challenge 2018. The challenge asks for a prediction of (i) the destination and (ii) the arrival time of ships in a streaming fashion using geospatial data in the maritime context. Novel aspects of our approach include the use of ensemble learning based on Random Forest, Gradient Boosting Decision Trees (GBDT), XGBoost Trees and Extremely Randomized Trees (ERT) to predict the destination, while for the arrival time we propose the use of Feed-forward Neural Networks. In our evaluation, we were able to achieve an accuracy of 97 mins) for the ETA prediction.

    10/12/2018 ∙ by Oleh Bodunov, et al.

  • BrainSlug: Transparent Acceleration of Deep Learning Through Depth-First Parallelism

    Neural network frameworks such as PyTorch and TensorFlow are the workhorses of numerous machine learning applications ranging from object recognition to machine translation. While these frameworks are versatile and straightforward to use, the training of and inference in deep neural networks is resource (energy, compute, and memory) intensive. In contrast to recent works focusing on algorithmic enhancements, we introduce BrainSlug, a framework that transparently accelerates neural network workloads by changing the default layer-by-layer processing to a depth-first approach, reducing the amount of data required by the computations and thus improving the performance of the available hardware caches. BrainSlug achieves performance improvements of up to 41.1 user as they do not require hardware changes and only need tiny adjustments to the software.

    04/23/2018 ∙ by Nicolas Weber, et al.

  • Application-Agnostic Offloading of Packet Processing

    As network speed increases, servers struggle to serve all requests directed at them. This challenge is rooted in a partitioned data path where the split between the kernel space networking stack and user space applications induces overheads. To address this challenge, we propose Santa, a new architecture to optimize the data path by enabling server applications to partially offload packet processing to a generic rule processor. We exemplify Santa by showing how it can drastically accelerate kernel-based packet processing - a currently neglected domain. Our evaluation of a broad class of applications, namely DNS, Memcached, and HTTP, highlights that Santa can substantially improve the server performance by a factor of 5.5, 2.1, and 2.5, respectively.

    04/01/2019 ∙ by Oliver Hohlfeld, et al.

  • On the Fly Orchestration of Unikernels: Tuning and Performance Evaluation of Virtual Infrastructure Managers

    Network operators are facing significant challenges meeting the demand for more bandwidth, agile infrastructures, and innovative services, while keeping costs low. Network Functions Virtualization (NFV) and Cloud Computing are emerging as key trends of 5G network architectures, providing flexibility, fast instantiation times, support of Commercial Off The Shelf hardware and significant cost savings. NFV leverages Cloud Computing principles to move the data-plane network functions from expensive, closed and proprietary hardware to the so-called Virtual Network Functions (VNFs). In this paper we deal with the management of virtual computing resources (Unikernels) for the execution of VNFs. This functionality is performed by the Virtual Infrastructure Manager (VIM) in the NFV MANagement and Orchestration (MANO) reference architecture. We discuss the instantiation process of virtual resources and propose a generic reference model, starting from the analysis of three open source VIMs, namely OpenStack, Nomad and OpenVIM. We improve the aforementioned VIMs by introducing support for special-purpose Unikernels, aiming to reduce the duration of the instantiation process. We evaluate some performance aspects of the VIMs, considering both stock and tuned versions. The VIM extensions and performance evaluation tools are available under a liberal open source licence.

    09/17/2018 ∙ by Pier Luigi Ventre, et al.

  • Representation Learning for Resource Usage Prediction

    Creating a model of a computer system that can be used for tasks such as predicting future resource usage and detecting anomalies is a challenging problem. Most current systems rely on heuristics and overly simplistic assumptions about the workloads and system statistics. These heuristics are typically a one-size-fits-all solution so as to be applicable in a wide range of applications and systems environments. With this paper, we present our ongoing work of integrating systems telemetry ranging from standard resource usage statistics to kernel and library calls of applications into a machine learning model. Intuitively, such an ML model approximates, at any point in time, the state of a system and allows us to solve tasks such as resource usage prediction and anomaly detection. To achieve this goal, we leverage readily available information that does not require any changes to the applications run on the system. We train recurrent neural networks to learn a model of the system under consideration. As a proof of concept, we train models specifically to predict future resource usage of running applications.

    02/02/2018 ∙ by Florian Schmidt, et al.

  • Support for Error Tolerance in the Real-Time Transport Protocol

    Streaming applications often tolerate bit errors in their received data well. This is contrasted by the enforcement of correctness of the packet headers and payload by network protocols. We investigate a solution for the Real-time Transport Protocol (RTP) that is tolerant to errors by accepting erroneous data. It passes potentially corrupted stream data payloads to the codecs. If errors occur in the header, our solution recovers from these by leveraging the known state and expected header values for each stream. The solution is fully receiver-based and incrementally deployable, and as such requires neither support from the sender nor changes to the RTP specification. Evaluations show that our header error recovery scheme can recover from almost all errors, with virtually no erroneous recoveries, up to bit error rates of about 10

    12/20/2013 ∙ by Florian Schmidt, et al.