Achieving Guidance in Applied Machine Learning through Software Engineering Techniques

by   Lars Reimann, et al.
University of Bonn

Development of machine learning (ML) applications is hard. Producing successful applications requires, among others, being deeply familiar with a variety of complex and quickly evolving application programming interfaces (APIs). It is therefore critical to understand what prevents developers from learning these APIs, using them properly at development time, and understanding what went wrong when it comes to debugging. We look at the (lack of) guidance that currently used development environments and ML APIs provide to developers of ML applications, contrast these with software engineering best practices, and identify gaps in the current state of the art. We show that current ML tools fall short of fulfilling some basic software engineering gold standards and point out ways in which software engineering concepts, tools and techniques need to be extended and adapted to match the special needs of ML application development. Our findings point out ample opportunities for research on ML-specific software engineering.


page 1

page 2

page 3

page 4


Software Engineering Practices for Machine Learning

In the last couple of years we have witnessed an enormous increase of ma...

Machine Learning Application Development: Practitioners' Insights

Nowadays, intelligent systems and services are getting increasingly popu...

On Misbehaviour and Fault Tolerance in Machine Learning Systems

Machine learning (ML) provides us with numerous opportunities, allowing ...

Machine Learning at Microsoft with ML .NET

Machine Learning is transitioning from an art and science into a technol...

Mutation Testing framework for Machine Learning

This is an article or technical note which is intended to provides an in...

A Survey on Software Engineering Practices in Brazilian Startups

Today's significant technological advancement allows early-stage softwar...

1. Introduction

The success of deep learning and machine learning (ML) in general in recent years is attributed to the availability of large datasets, more efficient algorithms, and specialized hardware

(Chollet, 2017). Based on the current hype, more and more companies try to integrate ML into their products, leading to an ever increasing demand of data scientists. Software developers who already know the targeted application domain could fill this gap but typically lack a scientific ML background. Therefore, they need guidance that helps them correctly use ML.

We define guidance to encompass all means to facilitate or enforce correct usage of a tool or an API or proper adherence to a workflow. Guidance includes, on one hand, preventing, detecting, explaining and fixing erroneous usage and, on the other hand, communicating best practices and helping apply them. In a traditional software engineering (SE) context, guidance can help developers when learning an API or the use of a tool, developing a program using the API or tool, and debugging the program when something went wrong.

This paper analyses how current ML tools and APIs fail to provide guidance to developers and presents ideas for improvement. Sec. 2 reviews a typical workflow of ML application development, points out differences to traditional software engineering, and elaborates the desirable guidance for each step of the ML workflow. Sec. 3 reviews how the main APIs, languages, and development environments currently used for ML application development fail to provide guidance. Sec. 4 discusses how better guidance can be offered by applying and adapting various SE concepts to the specific requirements of ML application development.

2. ML Workflows are Different

Due to space constraints, we focus without loss of generality on supervised learning, the currently most widely-used ML workflow

(Chollet, 2017). The task in supervised learning is to learn a function on the basis of some training data, which is a (typically very large) set of examples containing features and labels. Given the features, should ideally produce the labels. If the labels are from a continuous set, we call it a regression problem, if the labels are from a discrete set we call it a classification problem.

A traditional SE workflow involves activities such as requirement engineering, design, implementation, refactoring, and testing. In an ML workflow, most of these activities are performed in a quite specific way (e.g. Sec. 2.1 and 2.2), if at all111What, for instance, is the equivalent of refactoring in ML?. Moreover, an ML workflow involves additional essential activities, such as Data Engineering (Sec. 2.3), Model Engineering (Sec. 2.4), and Model Quality Engineering (Sec. 2.5).

2.1. Evaluation-Focussed Requirements Engineering

Requirements Engineering (RE) for ML reflects the fact that ML development includes empirical evaluation of results as part of the indispensable Model Quality Engineering activity (Sec. 2.5). Accordingly, the definition of quality metrics and the required level of quality for each metric is not an optional best practice but an essential part of the workflow. One quality metric we might define upfront, for instance, is that at least accuracy must be achieved. This gives as clear metric to assess the usefulness of a model and a stopping criteria for the workflow.

Requirements Engineering Guidance

Desirable guidance for RE for ML includes:

  • Automated suggestions of suitable quality metrics based on problem (regression / classification) and data.

  • Automated checking if a comparable task has been solved before and achieved the desired quality.

2.2. Test-Driven Development the ML Way

Before we can start learning, we must first ensure that a large amount of data is available and split it into a

  • a training set that will be used for the actual learning / training (Sec. 2.4),

  • a validation set that will be used for assessing whether the training results are good enough and to fine-tune the model222Alternatively, we can use cross-validation (Chollet, 2017). (Sec. 2.5).

  • a test set, that will be used to get a last measure of how well the model performs on real-world data before pushing the model into production (Sec. 2.5).

Test-Driven Development Guidance

Desirable guidance for Test-Driven Development in ML includes:

  • Automated suggestion whether to use cross-validation or a specific validation set based on desired quality, training time and the amount of data available.

  • Automated suggestion of what percentage of the data should go into each of the sets.

  • Automated verification of best practices for a ‘good’ split of the data. For instance, the distribution of labels and significant features in the three sets should be as similar as possible (stratified sampling (Geron, 2017)).

  • Automatically hiding the test set away until needed, so it does not influence decisions in subsequent steps of the workflow like the choice of the model (Chollet, 2017).

2.3. Data Engineering

Next, we must ensure that the available training data is of sufficient quality. If not, we need to perform data engineering, which involves

  • handling of missing values,

  • turning categorical into numerical values,

  • normalization of numerical values,

  • assessing the statistical relevance of available features,

  • selecting the features used for learning, possibly combining several features.

Data engineering is typically the most time consuming step in an ML project but also the most important one: If the data used for learning is of poor quality, so will be the learned function .

Data Engineering Guidance

Desirable guidance for data engineering includes:

  • Automated detection of missing values and automated suggestions for handling them, ideally specific to the type of data or learning task at hand. For instance, for numeric values, missing values could be substituted by the mean of the other values. This is not possible for a feature that represents street names.

  • Automatic conversion of categorical into numerical values, or at least suggestions for possible conversions, again taking into account the specific data types and application semantics. For instance, if categorical values can be ordered (e.g. ‘poor’, ‘middle class’, ‘rich’, ‘billionaire’), we can replace them by an increasing sequence of numbers (label encoding). Otherwise, we can perform a one-hot encoding

    , which maps each different value to a sparse vector

    (Geron, 2017).

  • Automated normalization of numerical values.

  • Automated computation of the entropy of the distribution of a feature or of any other measure for the statistical relevance of the available features.

  • Automated suggestions for sensible feature combinations in a specific application domain.

  • Ideally, any data processing step performed on the training data, should be automatically applied also to the validation and test set in order to avoid inconsistencies or tedious and error-prone manual application of each step on each dataset.

2.4. Model Engineering

The next step in the ML workflow is the selection of an existing ML API and the choice of a supported learning algorithm, such as decision trees or neural networks (NNs). At first glance, this might appear similar to the choice of proper algorithms and data structures in SE. However, it was shown by Wolpert in (Wolpert, 1996) that lacking specific assumptions no learning algorithm performs better than any other. This makes it impossible to provide a general guideline to always use, say NNs.

Choosing a learning algorithm instead requires significant background in ML and a careful consideration of the application goal333Do we need explainability? and the properties of the data available after data engineering. Even with this knowledge, though, the final choice of the proper learning algorithm is typically the outcome of a lot of experimentation, involving several iterations of the entire ML development workflow.

Given a choice of an algorithm, a developer then writes a training program and starts the learning process on the prepared training data. Depending on the circumstances (hardware, data size, learning algorithm, etc.) this step can take a few seconds or several days. Its outcome is a model representing the learned function . It is worth noting that, unlike in SE, the learned model is the final output of the workflow, not the written program.

Model Engineering Guidance

Desirable guidance for model engineering includes:

  • Automated suggestions of suitable learning algorithms based on the available data and the application domain.

  • Well-documented APIs and all kinds of teaching and training material that help quickly learn how to use them.

  • Automated checking of correct API use, ideally during program development (static checking) but at least at runtime. Given that a single training run can take hours and days and the entire workflow involves many iterations, the importance of static checking cannot be overstated. Costs incurred in statically detecting and fixing errors in the training program that would be prohibitive in traditional SE appear small in an ML context.

  • Help in understanding and fixing errors, ranging from sensible error messages, ideally enriched with suggestions how to resolve the error, to automated quickfixes.

2.5. Model Quality Engineering

Next, the performance of the model is evaluated by running it on the validation set (Sec. 2.2), to get a measure of how well the model performs on data not seen during training. By measuring the metrics established in the RE phase (Sec. 2.1) we check that the model learned the characteristics of the training data well enough (no underfitting) and did not simply learn the training data by heart (no overfitting). The gathered metrics are used to fine-tune the training program by setting different hyperparameters

that control the behaviour of the learning algorithm. For example, the maximum number of nodes in a decision tree can be a hyperparameter. A low value can lead to underfitting, while a high value can lead to overfitting. With the improved settings, a new training is started and the entire process is repeated from there. If the results are still not satisfactory, one might go further back to select a different learning algorithm (Sec.

2.4) or even rethink and enhance data engineering (Sec. 2.3).

Finally, when the model has passed the validation stage, we use the test set (Sec. 2.2) to get a last measure of how well the model performs on unseen data before pushing the model into production. We cannot use the validation set for this, since we selected the model that works best on the validation set, which introduces a risk of overfitting (Geron, 2017).

Model Quality Engineering Guidance

Desirable guidance for quality engineering includes:

  • Automated experiment management, that is, saving of the models that achieved the highest performance values during validation along with the related training program and fully data-engineered datasets (including training, validation and test data) in order to support reproducibility of results.

  • Automated detection of overfitting and related suggestions how to combat it, e.g. by inserting drop-out layers (Hinton et al., 2012) in a neural network.

  • Automated detection of underfitting and related suggestions how to combat it, e.g. by choosing a more powerful learning algorithm.

3. State of the Art

The highly explorative and iterative workflow described in Sec. 2 is supported by high-level APIs for ML (Sec. 3.1), dynamic languages (Sec. 3.2), and dynamic development environments (Sec. 3.3). In this section, we review each of them with respect to the guidance they provide.

3.1. Efficient, High-Level APIs

Based on a survey of ML APIs by Nguyen and others from 2019 (Nguyen et al., 2019) we investigated scikit-learn (Pedregosa et al., 2011)

, a representative of an API for ‘classical’ ML, as well as Keras

(Chollet and others, 2015)

and PyTorch

(Paszke et al., 2019), which are APIs for deep learning. These APIs provide high-level abstractions to apply and adapt learning algorithms (model engineering), to ensure sufficient prediction quality (model quality engineering) and even to perform data engineering, thereby providing all functionality needed to rapidly iterate the ML workflow from Sec. 2. With respect to guidance, though, we identified several shortcomings:

Criticism: Hidden and Inconsistent Constraints in the API Documentation

The broadness of these APIs makes it difficult to use them correctly, especially since many constraints are hard to understand from the documentation. By looking just at the documentation of the support vector machine learning algorithm for classification (SVC) in scikit-learn

444 we can already identify four different categories of constraints documented only in natural language or documented inconsistently:

  • type constraints limit values of expressions in any context,

  • dependencies limit values depending on other values,

  • temporal constraints limit the order of operations,

  • execution context constraints express restrictions specific to the used programming language or executions platform.

Figure 1. Type constraints (scikit-learn)

Hidden and inconsistent type constraints are illustrated in Fig. 1: According to the first line of the documentation, the constructor parameter kernel of the SVC must be of type string. However, the rest of the text (see highlighting), reveals that only the five string literals ‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’ and ‘precomputed’ are allowed and that we can additionally pass a callable, thus something that is not a string. Reading on, we learn that the signature of the callable is further restricted: It must have a single parameter that must be an array of shape n_samples by n_samples.

Figure 2. Dependent API elements (scikit-learn)

A dependency constraint is shown in Fig. 2: The value of the constructor parameter degree is only relevant if the constructor parameter kernel has the value ‘poly’.

Figure 3. Temporal constraint (scikit-learn)

A temporal constraint is shown in Fig. 3: We must set the attribute probability to ‘True’ prior to the first call of the fit method (which starts the learning process).

Figure 4. Constraint of execution context (scikit-learn)

An execution context constraint is shown in Fig. 4: Verbose output may not work in a multithreaded environment.

Criticism: Lack of Constraint Checking and Helpful Error Messages

Let us now investigate what happens if we run a program that does not comply with some of the restrictions shown above. We begin with the simple program in List. 1. There, we create an SVC with the invalid kernel ‘line’ (see Fig. 1) — the programmer probably meant ’linear’. But despite the wrong value, the code executes without error. The programmer therefore adds more code to preprocess the data and initiate training, which leads to the program state in List. 2. Executing this program throws the error shown in List. 3:

Listing 1. Incorrect, but throws no error (scikit-learn)

Listing 2. Throws a misleading error (scikit-learn)

Listing 3. Error thrown by program in List. 2 (scikit-learn)

There are multiple issues with this behaviour:

  • No precondition checking in API function. First, the error is not thrown by the API at the place where the illegal value was set, misleading the programmer to think that the first two lines are correct.

  • Misleading, generic error messages. A generic error is thrown by the runtime system when stumbling over the illegal value, which is much too late. At this point, the error message and shown stacktrace further mislead the programmer. Since lines 3 and 4 contain Python lists that indeed do not contain the string ‘line’, the programmer could try to find a bug there. Going back to line 2, which apparently executed flawlessly, requires deep internal knowledge of the API, which is unlikely for anybody except its authors. One could argue that the string literal ‘line’ shows up in line 2, which could point the programmer to the true issue. But note that the code for data preprocessing is normally much longer than shown here, so the line where the error originated might not be visible in the context where it manifests itself.

  • No hint how to recover. Finally, once the programmer has found the true bug, he must consult the documentation to find the correct value since the generic error message does not specify the valid inputs.

Criticism: No Checking of ML Best Practices

We only looked at constraints of the API so far. For general ML best practices (Sec. 2), the situation is even more dire: For them none of the APIs provides any checking or feedback. In addition, due to the differences between the APIs, each one would have to implement checking of these best practices separately, even though they apply regardless of the API that is used.

Criticism: No Static Checking

In either case, even the best documentation and ideal runtime error checking still requires developers to wait for (possibly long) training runs just to find out that an error occured. The effect is an annoying loop of editing, running, debugging, fixing and re-running the training program in order to see if the fix was successful — the runtime acts like a test oracle. Ideally, the developer would get guidance while he writes the program but this is poorly supported in the programming languages used for ML development, which brings us to the next point.

3.2. Languages

The above-mentioned study by Nguyen, et al. from 2019 (Nguyen et al., 2019) found that Python is the most popular language for ML. This can be attributed to its readability and to the fact that it is an interpreted language and, thus, well suited for highly interactive and explorative development. Moreover, Python has a vast ecosystem of scientific APIs that are useful for ML. Being a general-purpose programming language, Python provides the flexibility needed to implement new learning algorithms and highly customized ML-workflows. However, much of this flexibility — and therefore complexity of the language — is not needed if one is just using the functionality of an API for data, model and quality engineering to accomplish standard tasks.

Criticism: Static Checking is Hard

Because Python is such a dynamic, general-purpose programming language, it is difficult to statically check the constraints of an API. Since the 2015 release of version 3.5, Python at least supports optional type hints, defined in PEP 484 (van Rossum et al., 2015). Still, the authors stress that “Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention”. Thus, taking advantage of type hints requires the use of an external tool such as the static type checker mypy555 However, of the three ML APIs we observed, only the most recent one, PyTorch, includes type hints. In addition, dependencies, temporal constraints, execution context constraints and ML best practices are beyond the scope of type hints. Even a simple constraint such as ”the value must be a float between 0 and 1” cannot be expressed.

The impact of static type checking was quantified (among others) in a study by Fischer and Hanenberg from 2015 (Fischer and Hanenberg, 2015), where they found the development time of a program in the statically typed JavaScript-superset TypeScript666 to be significantly shorter than in the dynamically typed JavaScript itself. We argue that static checking of non-type errors has a similar positive effect.

3.3. Development Environments

Last but not least, notebooks for Python and other languages, e.g. Jupyter Notebooks777, have become a popular learning and development environment for ML and have recently been integrated into the generic Python IDE PyCharm888

In a notebook, a program can be split into cells that can be evaluated independently without deleting the results created by other cells. Results, including complex visualizations, are shown directly in the notebook next to the cell. It is possible to include text cells between code cells for documentation purposes and to structure the notebook. This feature also makes notebooks ideal for creating executable teaching materials for novices. Due to their interactivity, they are used in practice as a development environment for ML, especially for data exploration. We claim that the popularity of notebooks, even when used within PyCharm, points to the lack of IDEs that are adapted to the needs of ML development.

Criticism: No Support for Data Engineering

Notebooks and generic Python IDEs such as PyCharm (henceforth called altogether ‘ML tools’) provide no specific support for data engineering. There are no built-in automated analyses of a dataset, as suggested in Sec. 2.3, or visualizations of them. Developers must rely on external tools or the functionality provided through APIs.

Criticism: No Support for Introspection

While developing a training program, developers need to inspect the current state of data or models. There is no support for this in current ML tools. Instead, developers need to call specific API functions that provide introspection support. Keras, for example, has a summary method that displays the layers of an NN, the shape of their outputs and the number of trainable parameters, which is a good measure of how long training is going to take and whether the model is likely to overfit or underfit. Fig. 5 shows an example output of this method.

Figure 5. Output of the summary method (Keras)

Although introspection is not conceptually part of the learning program, we must mix the two concerns in the notebook environment. A dedicated ML IDE could provide a clearer separation of concerns.

Criticism: No Experiment Management

For experiment management developers must also invent their own solutions or choose external tools. A dedicated IDE could fulfill this need instead and also improve the discoverability of past experiments by automatically comparing the current task and available data to a database, thereby preventing duplicate work.

4. The Simple-ML Approach

The Simple-ML project999 is dedicated to the development of solutions for the above-mentioned gaps in the state of the art. Its core ideas, summarized in Fig. 6, are: (1) the development of a unified ML API, (2) the development of a domain specific language (DSL) for ML, (3) the development of a dedicated ML IDE, and (4) last but not least, as a common basis for (2) and (3), the development of metamodels of the ML domain, ML APIs, and application-specific datasets.

Figure 6. Our Simple-ML Approach

Towards a Unified ML API

In its first release, the unified ML API will generalize the functionality of scikit-learn, Keras and PyTorch. Development of an own API lets us (a) ensure the quality of its documentation, and (b) ensure the existence of runtime checks where no static checking is possible. This enables checking of compliance with general, API-independent ML best practices once and for all. The unified API will be based on adapters to the different existing APIs and will shield the higher levels from their details. In particular, it will serve as a unified target for code generation by the compiler of the DSL.

A DSL with ML-aware Static Analysis Capabilities

The Simple-ML DSL, will provide ML-specific abstractions as first-class language elements, and will make them available via a textual and a visual syntax. This will make it easy to learn by developers who want to experiment with ML but are not experienced programmers. In addition, the DSL will statically catch errors related to all the types of constraints identified in Sec. 3.1.

Under the hood the DSL compiler will generate code for a general purpose programming language such as Python, using the implemented adapters to target an individual framework. This avoids locking users into the Simple-ML DSL. If they eventually need the flexibility of a general purpose programming language, they can take the generated code and modify it as desired.

Using Ontologies to Store Metadata

ML-specific error detection will be done based on metadata in the form of ontologies. We will create ontologies that describe ML concepts, such as learning algorithms and quality metrics, to capture information like the hyperparameters of learning algorithms or whether a learning algorithm and a quality metric are compatible. For this we consider improving the MEX ontology (Esteves et al., 2015).

The ML-specific metadata will be complemented by metadata about available datasets. Each dataset will be associated with a semantic description of its features (Gottschalk et al., 2019)

. For each feature, we will store metainformation like its semantic category (e.g. the fact that it represents the name of a person), or the associated unit of measurement (e.g. whether a distance is measured in metres or miles). We will also store general statistical properties of a feature, like minimum and maximum values, standard deviation etc.

Using this metadata we can provide the desirable data engineering guidance identified in Sec. 2.3. For instance, we will be able to quickly detect if data is normalized and normalize it automatically if desired, or provide targeted suggestions for feature combinations based on the semantic categories. For model engineering (Sec. 2.4) we can also provide guidance, e.g. by suggesting suitable models given the data and the task. Ontologies can also provide the technical basis for model quality engineering guidance (Sec. 2.5).

An IDE to Tie Everything Together

Further guidance will be provided by an ML-specific IDE that will support the entire ML workflow (Sec. 2) through GUI features. The error checking and fixing capabilities of the DSL will be used to display errors and to provide quickfixes while the training program is being written. The ontology-based metadata will be used for elaborate suggestions of preprocessing steps during data engineering (Sec. 2.3), for suggestions of suitable models during model engineering (Sec. 2.4), and for experiment management and suggestions of model improvements during quality engineering (Sec. 2.5). The GUI of the IDE will also provide introspection (Sec. 3.3) capabilities, allowing us to eliminate introspection from the user-visible part of the unified ML API, thus making it easier to learn.

5. Conclusion

This paper introduced the concept of guidance as the means offered by an API or tool to facilitate or enforce correct usage or to help follow a predetermined workflow or established best practices. Guidance speeds up learning, development and debugging.

We discussed key differences between ML and SE workflows, derived the desirable ML-specific guidance and analyzed how existing APIs, languages and tools used for ML application development fail to provide most of it. As a possible solution, we identified static analysis, meta-modeling, and support by an IDE with an advanced GUI as key SE concepts, whose combined use could provide the missing guidance. Finally, we sketched the Simple-ML project, which will implement this solution in order make ML available as a well-understood tool for software developers.

This work was partially funded by the Federal Ministry of Education and Research (BMBF), Germany under Simple-ML (01IS18054).


  • F. Chollet et al. (2015) Keras. Note: Cited by: §3.1.
  • F. Chollet (2017) Deep learning with python. 1st edition, Manning Publications Co., USA. External Links: ISBN 978-1-61729-443-3 Cited by: §1, 4th item, §2, footnote 2.
  • D. Esteves, D. Moussallem, C. B. Neto, T. Soru, R. Usbeck, M. Ackermann, and J. Lehmann (2015) MEX vocabulary: a lightweight interchange format for machine learning experiments. In Proceedings of the 11th International Conference on Semantic Systems, SEMANTICS ’15, New York, NY, USA, pp. 169––176. External Links: ISBN 9781450334624, Link, Document Cited by: §4.
  • L. Fischer and S. Hanenberg (2015) An empirical investigation of the effects of type systems and code completion on API usability using TypeScript and JavaScript in MS visual studio. In DLS 2015 - Proceedings of the 11th Symposium on Dynamic Languages, pp. 154–167. External Links: Document, ISBN 9781450336901 Cited by: §3.2.
  • A. Geron (2017)

    Hands-on machine learning with scikit-learn and tensorflow

    1st edition, O’Reilly. External Links: ISBN 978-1491962299 Cited by: 3rd item, 2nd item, §2.5.
  • S. Gottschalk, N. Tempelmeier, G. Kniesel, V. Iosifidis, B. Fetahu, and E. Demidova (2019) Simple-ml: towards a framework for semantic data analytics workflows. In

    Semantic Systems. The Power of AI and Knowledge Graphs

    , M. Acosta, P. Cudré-Mauroux, M. Maleshkova, T. Pellegrini, H. Sack, and Y. Sure-Vetter (Eds.),
    Cham, pp. 359–366. External Links: ISBN 978-3-030-33220-4 Cited by: §4.
  • G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov (2012) Improving neural networks by preventing co-adaptation of feature detectors. CoRR abs/1207.0580. External Links: Link, 1207.0580 Cited by: 2nd item.
  • G. Nguyen, S. Dlugolinsky, M. Bobák, V. Tran, Á. López García, I. Heredia, P. Malík, and L. Hluchý (2019) Machine Learning and Deep Learning frameworks and libraries for large-scale data mining: a survey. Artificial Intelligence Review 52 (1), pp. 77–124. External Links: Document, ISSN 15737462 Cited by: §3.1, §3.2.
  • A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, A. Desmaison, A. Kopf, E. Yang, Z. DeVito, M. Raison, A. Tejani, S. Chilamkurthy, B. Steiner, L. Fang, J. Bai, and S. Chintala (2019) PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32, H. Wallach, H. Larochelle, A. Beygelzimer, F. d Alché-Buc, E. Fox, and R. Garnett (Eds.), pp. 8024–8035. External Links: Link Cited by: §3.1.
  • F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and É. Duchesnay (2011) Scikit-learn: Machine learning in Python. Journal of Machine Learning Research 12, pp. 2825–2830. External Links: 1201.0490, ISSN 15324435 Cited by: §3.1.
  • G. van Rossum, J. Lehtosalo, and Ł. Langa (2015) PEP 484 – Type Hints — External Links: Link Cited by: §3.2.
  • D. H. Wolpert (1996) The Lack of a Priori Distinctions between Learning Algorithms. Neural Computation 8 (7), pp. 1341–1390. External Links: Document, ISSN 08997667 Cited by: §2.4.