I Introduction
Chiral magnets with Dzyaloshinskii-Moriya (DM) interactions [1, 2] present a rich set of phases in which topologically nontrivial structures arise. Among them, one finds skyrmions [3, 4], with unit topological charge; antiskyrmions, their counterparts with opposite charge; and other objects with fractional charge, such as merons. These structures have been observed experimentally in a variety of materials [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]. The interest in these objects goes beyond the determination of their fundamental properties, as they have applications in the field of spintronics [20, 21, 22, 23, 24, 14].
One of the critical theoretical tools for studying chiral magnets is the implementation of Monte Carlo simulations for a discretized version of them. It has been shown in Ref. [25] that a 3D cubic spin-lattice model with ferromagnetic and DM interactions can correctly reproduce the experimentally-determined phase diagram for materials that support Bloch skyrmions. An exploration of the consequences of varying both the strength and the internal structure of the DM interaction was performed in Ref. [26]. Changing the structure of the DM interaction in this system generates Néel skyrmions and antiskyrmions, while its strength controls the size of the corresponding objects.
As these simulations produce more varied and complex states and are used to map larger parameter spaces, it becomes crucial to develop automatic methods for dealing with the data they generate. The supervised machine learning framework provides an excellent toolbox to do this. As long as they are adequately selected, machine learning models can be trained with a relatively small set of samples and then used to analyze any new data generated through simulations automatically. In particular, Convolutional Neural Networks (CNNs) have been successfully applied to this type of task: to identify the phase
[27, 28, 29], the topological charge [30] or the DM interaction [31] from 2D images of a spin lattice; and to find the phase from videos of the lattice [32].

In this work, we use a CNN in a multi-label approach to identify features such as skyrmions, merons, helical and ferromagnetic states, or hexagonal arrangements of the skyrmions. Several of these features can coexist in the same sample. The corresponding phase can then be inferred from the features. Furthermore, this approach accounts for states that mix two or more phases and allows for identifying features that may only appear in a small region of the image. We also apply this model to predict the features of the final state of a simulation from images of intermediate states. We then use it to construct a phase diagram by running the simulations for very short times and using the CNN to predict the final phase.
The rest of this paper is organized as follows. In Section II we review the Monte Carlo simulations that we perform for the generation of the training and test data. The data set itself, and the particular CNN that we use for it are described in Section III. We show our results in Section IV, and give our conclusions in Section V.
II Simulations
The system we simulate is a chiral magnet with ferromagnetic and DM interactions. From a coarse-grained perspective, it can be described by a local continuum Hamiltonian for the magnetisation field. We discretise it on a 3D cubic spin lattice with periodic boundary conditions, placing a unit vector $\hat{n}_r$ at each lattice position $r$. To leading order in the lattice spacing, the Hamiltonian of this discrete system consists of the following nearest-neighbour interactions:

$$H = -\sum_r \sum_{\mu \in \{x, y, z\}} \left[ J \, \hat{n}_r \cdot \hat{n}_{r+\hat{\mu}} + K \, \hat{D}(\hat{\mu}) \cdot \left( \hat{n}_r \times \hat{n}_{r+\hat{\mu}} \right) \right] - B \sum_r \hat{n}_r^z, \qquad (1)$$
where $\hat{D}(\hat{\mu})$ is the DM interaction vector, $B$ is the magnitude of an external magnetic field applied along the $z$ direction, and $J$ and $K$ are parameters controlling the strength of the ferromagnetic and DM interactions, respectively. All these parameters of the lattice system are dimensionless and proportional to the corresponding parameters of the physical system. Following Ref. [25], we correct the anisotropies generated by the finite lattice spacing by introducing next-to-nearest-neighbour interactions of the same form as the nearest-neighbour ones in Eq. (1), with adequate coefficients.
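As an illustration, the nearest-neighbour energy of a periodic lattice can be evaluated with `np.roll`. This is a minimal sketch, not the GPU implementation used in the paper; it omits the next-to-nearest-neighbour correction terms, and the DM vectors hard-coded here ($\hat{D}(\hat{x}) = \hat{x}$, $\hat{D}(\hat{y}) = -\hat{y}$, $\hat{D}(\hat{z}) = 0$) are an assumed antiskyrmion-type structure:

```python
import numpy as np

def lattice_energy(n, J, K, B):
    """Nearest-neighbour energy of a chiral magnet on a periodic lattice.

    n: array of unit spins with shape (N, N, N, 3).
    Assumed antiskyrmion-type DM vectors: D(x)=+x, D(y)=-y, D(z)=0.
    """
    D = [np.array([1.0, 0.0, 0.0]),
         np.array([0.0, -1.0, 0.0]),
         np.array([0.0, 0.0, 0.0])]
    E = 0.0
    for axis in range(3):
        nb = np.roll(n, -1, axis=axis)              # forward neighbour (periodic)
        E -= J * np.sum(n * nb)                     # ferromagnetic exchange
        E -= K * np.sum(np.cross(n, nb) @ D[axis])  # DM term along this axis
    E -= B * np.sum(n[..., 2])                      # Zeeman coupling along z
    return E
```

For a fully polarised state the DM term vanishes and the energy reduces to the exchange and Zeeman contributions, which provides a quick sanity check.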
The DM interaction vectors $\hat{D}(\hat{\mu})$ take different forms for different materials. Here, we select the structure that generates antiskyrmions, which is [26]:

$$\hat{D}(\hat{x}) = \hat{x}, \qquad \hat{D}(\hat{y}) = -\hat{y}, \qquad \hat{D}(\hat{z}) = 0. \qquad (2)$$
When we extract data from the simulation to train the CNN, we will only use the $z$ component of $\hat{n}$. Since Bloch/Néel skyrmions and antiskyrmions have similar distributions for the $z$ component, the choice we have made for the DM interaction does not lead to a loss of generality: the approach we use in Section III would perform similarly for Bloch/Néel skyrmions.
At a finite temperature $T$, the probability of finding the system in a state with energy $E$ is proportional to $e^{-E/T}$. As for the other parameters, $T$ is here dimensionless and proportional to the physical temperature of the system. To reproduce this probability distribution in our simulation, we use the Metropolis algorithm: the system is initialized in a random state and then updated iteratively by choosing a random spin and changing it to a new random direction with probability
$$p = \min\left(1, \, e^{-\Delta E / T}\right), \qquad (3)$$
where $\Delta E$ is the difference between the system's energy after the potential change and its current energy. To speed up the process, we divide the lattice into three non-interacting sublattices and update all spins in each sublattice in parallel, using a GPU, as described in Ref. [26].
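The acceptance rule of Eq. (3) can be sketched as follows (a minimal single-spin version; the three-sublattice GPU parallelisation of Ref. [26] is not reproduced here):

```python
import math
import random

def metropolis_accept(delta_E, T):
    """Accept a proposed spin update with probability min(1, exp(-dE/T)).

    Moves that lower the energy (delta_E <= 0) are always accepted;
    moves that raise it are accepted with Boltzmann probability.
    """
    return delta_E <= 0 or random.random() < math.exp(-delta_E / T)
```

For `delta_E = T`, the long-run acceptance rate approaches $e^{-1} \approx 0.37$, which is a convenient check of the implementation.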
A rescaling of the parameters $J$, $K$, $B$, and $T$ by the same factor leaves the system invariant. We are thus free to fix the value of any one of them without loss of generality. We pick $J = 1$. We set the remaining parameters $K$, $B$, and $T$ to constant values inside a given simulation, but vary them across different simulations.
We perform two types of procedures during a simulation, which we call thermalization stages and averaging stages. A thermalization stage consists of 1000 lattice sweeps, where a lattice sweep is an update of all of the spins in the lattice. An averaging stage involves taking the average of 200 lattice configurations, with 50 sweeps performed between each consecutive pair. In a simulation, we perform thermalization and averaging stages alternately, for a total of 20 each. The number of sweeps we take in both kinds of stage is small compared to their typical size in other applications. As a result, the sequence of averaged configurations obtained from the averaging stages gives us a collection of snapshots showing the dynamical evolution of the system as the final state is formed.
III Data preparation and training
To generate our data set, we run 1000 Monte Carlo simulations, with the values of $K$, $B$, and $T$ for each simulation drawn from uniform distributions over fixed intervals. As described in Section II, we obtain 20 equally-spaced 3D snapshots for each simulation. We perform an additional averaging stage over 2000 configurations for the final state. For each snapshot (and for the final state), we take a 2D slice of the lattice at constant $z$ and keep only the $z$ component of the corresponding spins. This is enough information to identify the relevant objects in each case. We refer to the set of 20 2D snapshots plus the final 2D image obtained in this way as a sample.

To each sample, we attach a set of labels, chosen from the five presented in Table 1 and representing the features observed in the final image. One or more of these labels can be assigned to the same sample. This is represented as a vector of 5 binary components for each sample, which we call the label vector. A component of the label vector being 1 means that the corresponding label is present, while a value of 0 indicates its absence. We remove 50 samples that do not present a clear structure with features matching any of the ones in Table 1. We then standardise the entire data set via RobustScaler from the scikit-learn package.
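The label vector described above amounts to a simple mapping (a minimal illustration; the label names follow Table 1, and the function name is ours):

```python
# Label order as in Table 1.
LABELS = ["antiskyrmion", "meron", "helix", "ferromagnetic", "hexagonal"]

def label_vector(features):
    """Encode a set of observed features as a 5-component binary vector."""
    return [1 if label in features else 0 for label in LABELS]
```

A sample containing a hexagonal lattice of antiskyrmions would thus carry both the antiskyrmion and hexagonal labels in its vector.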
We perform a separate training procedure for each snapshot index from 1 to 20. Among the 950 samples we have, we use 190 for testing purposes and 20% of the remaining samples for validation. We augment the training data by a factor of 10 by applying shifts by random amounts in the horizontal and the vertical directions, wrapping around the edges. This has several related advantages. First, it increases the effective size of the data set by a factor of 10, the number of different shifts performed. Second, it encourages the CNN to learn the translational symmetry that the system possesses. Finally, it prevents overfitting to particular characteristics localised at specific positions in the snapshots selected for training.
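The wrap-around shift augmentation can be implemented with `np.roll`, which translates an image periodically (a sketch under the assumption of square 2D snapshots; the function name is ours):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(sample, n_shifts=10):
    """Generate n_shifts copies of a 2D snapshot, each translated by a
    random amount in both directions with periodic wrap-around."""
    h, w = sample.shape
    out = []
    for _ in range(n_shifts):
        dy, dx = rng.integers(0, h), rng.integers(0, w)
        out.append(np.roll(sample, shift=(dy, dx), axis=(0, 1)))
    return out
```

Because the lattice has periodic boundary conditions, each shifted copy is a physically equivalent configuration, so the augmentation adds no spurious information.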
The network architecture consists of a simple convolution block followed by a fully connected layer. The convolution block is built from a convolutional layer with 16 filters and a stride of 4 in each direction of the 2D image. The convolutional layer uses a ReLU activation function, and its weights are regularised via L2 with a weight of 0.01. It is followed by a max-pooling layer, which keeps the pixel with the maximum value within a square window of pixels. The max-pooling layer is sandwiched between two batch-normalisation layers. The flattened output of the convolution block is then fed into a fully connected layer with 16 nodes which, again, uses a ReLU activation function and the same L2 regulariser on its weights. Finally, the fully connected layer is followed by a dropout layer with a 30% rate. The network produces a five-dimensional output through a sigmoid activation function.
For training, we use the cross-entropy loss function, which we minimize using the Adam algorithm [33], with a learning rate that decays by a factor of 2 whenever the validation loss has not improved for 20 epochs. We perform a maximum of 200 training epochs for each snapshot, with the training being terminated if the validation loss has not improved for 50 epochs. As mentioned above, to preserve the simplicity of the application, we used one architecture for all snapshots. However, since the last snapshots contain more information than the initial ones, we observed overtraining after a small number of epochs when training on the earlier snapshots. Hence, we kept the network weights from before overtraining set in.
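The learning-rate decay and early-stopping schedule described above amounts to the following bookkeeping, shown here as an illustrative replay over a list of per-epoch validation losses. The decay factor (2), decay patience (20 epochs) and stopping patience (50 epochs) are taken from the text; the initial learning rate `lr0` is an assumed placeholder, as its value is not given here:

```python
def train_schedule(val_losses, lr0=1e-3, decay_patience=20, stop_patience=50):
    """Replay LR decay and early stopping over recorded validation losses.

    Returns (last_epoch_reached, final_learning_rate).
    lr0 is a hypothetical starting value, not the paper's setting.
    """
    lr, best, since_best = lr0, float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best % decay_patience == 0:
                lr /= 2                      # halve LR after 20 stagnant epochs
            if since_best >= stop_patience:
                return epoch, lr             # early stop after 50 stagnant epochs
    return len(val_losses) - 1, lr
```

With a loss that stops improving after the first epoch, the learning rate is halved twice (at 20 and 40 stagnant epochs) before training stops at 50.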
Table 1: The labels and their descriptions. (The Examples column of the original table contained images.)

Label          Description
antiskyrmion   A circular region of spins anti-aligned with the external magnetic field.
meron          A wall of spins (anti-aligned with the external magnetic field) that ends.
helix          At least two contiguous walls of spins anti-aligned with the external magnetic field.
ferromagnetic  A region of spins aligned with the external magnetic field, either filling the full snapshot, or at least having a size larger than the typical distance between antiskyrmions.
hexagonal      An arrangement of antiskyrmions forming a hexagonal lattice filling the entire snapshot.
IV Results
In Fig. 1, we show how the final accuracy changes as the network is trained on different snapshots from the simulations. We have performed the training independently for each snapshot, with the same architecture for all of them. In order to estimate the bias originating from the optimisation algorithm, the training is repeated 10 times for every snapshot with independent initialisations. We present the results as central points representing the mean accuracy over the 10 runs, together with error bars showing one standard deviation from the mean. Each label is assigned if the corresponding network output is above a threshold of 50%. (Note that the network output is not a calibrated certainty measure: an output of 90% does not mean that the network is certain of the labelling. We simply use the network output as a threshold on a complex observable constructed by the network.) The colours distinguish the accuracies for the different labels defined in Table 1, where antiskyrmion, meron, helix, ferromagnetic and hexagonal are represented by red, green, blue, purple and orange, respectively.

As Fig. 1 shows, the performance of the network for most labels improves as it is trained on snapshots generated with longer Monte Carlo times, until the snapshot index is around 5-10. This automatically identifies the point at which the simulation can be stopped: at snapshot 10, the network can already predict each label with the same accuracy as for the final one.
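The 50% thresholding and the per-label accuracy can be sketched as follows (illustrative; the function names are ours):

```python
import numpy as np

def predict_labels(outputs, threshold=0.5):
    """Turn sigmoid outputs (samples x 5 labels) into binary label vectors."""
    return (np.asarray(outputs) > threshold).astype(int)

def per_label_accuracy(outputs, targets, threshold=0.5):
    """Fraction of samples where each label is predicted correctly."""
    preds = predict_labels(outputs, threshold)
    return (preds == np.asarray(targets)).mean(axis=0)
```

Computing the accuracy per label, rather than requiring the whole label vector to match, is what makes the multi-label comparison in Fig. 1 possible.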
For a fixed snapshot index, there is a hierarchy in the network's performance across the different classes. With its arrangement of aligned spins, the ferromagnetic label is the easiest to identify, giving an accuracy of around 99% for all snapshots. The antiskyrmion, meron and helix features come next, all of them reaching a similar final accuracy of approximately 95-96%. The detection of the hexagonal arrangement is less precise, with a maximal accuracy close to 90%. This is to be expected, since it is the only label that depends not on the local features of the images but on their global structure.
We illustrate the evolution of the network predictions for three samples in Fig. 2. The first row shows the remarkable performance of the network for the early snapshots: to the naked eye, the first snapshot looks like it contains antiskyrmions with strong deformations, which could even be tagged as merons, and no hexagonal structure. Intuitively, it is not clear from this what the evolution will be. Despite this, the network can predict that the final phase will only contain antiskyrmions, not merons, and that they will be arranged in a hexagonal lattice. Finally, we stress the importance of training the network on the snapshot index on which it is expected to be used: the network for the final image, which achieves high accuracy in its own domain, performs worse here than the one specialised in the first snapshot index, giving the incorrect [antiskyrmion, meron] prediction.
In the second row of Fig. 2, one can see a first snapshot with structures that noise makes hard to classify by hand. One of them appears to be a wall of spins in the bottom left of the image, which could be misidentified as a bimeron being formed. This ends up disappearing in later snapshots. The network correctly predicts this by attaching a single [antiskyrmion] label to it. However, in this case, one needs to wait until snapshot 10 for the network to predict the hexagonal arrangement of antiskyrmions. Finally, in the third row of Fig. 2, we show an example with meron/helical features which is correctly classified from the beginning, despite the noise present there.

As an example of the practical application of our approach, we use it to generate a phase diagram from the results of simulations that are stopped at early times, and compare it to the results of the full simulations. First, we generate samples in the same way as the training set, but this time for an equally-spaced grid of points in $(T, B)$ space at a fixed value of $K$. The temperature varies from 0.05 to 1.25, in steps of 0.05, while the external magnetic field goes from 0 to 0.4, in steps of 0.02. Then, we apply the previously-trained networks for snapshot indices 6 and 20 to the corresponding snapshots and generate a phase diagram. To obtain smoother borders for each region in both diagrams, we average the predictions over each block in the grid. We then assign a colour/label to each point as follows:
- Points with a network output above 0.5 for the antiskyrmion label are coloured in red and labelled "Antiskyrmion".
- If the outputs for both the antiskyrmion and the ferromagnetic labels are above 0.5, both antiskyrmions and regions of significant size with spins aligned with the external magnetic field are found, meaning that the antiskyrmions are not closely packed. Such a region is labelled "Antiskyrmion gas", and a lighter red colour is used.
- Points whose only network output above 0.5 is the ferromagnetic one are labelled "Ferromagnetic" and assigned a grey colour.
- Similarly, points whose only network output above 0.5 is the helical one are labelled "Helical" and assigned a blue colour.
- Finally, we select a lower threshold of 0.25 for the meron output. If this is reached, independently of other labels, the corresponding point is labelled "Meron" and coloured light salmon. This shows a region where merons might appear, in conjunction with antiskyrmions, helical-phase walls, or both. The lower threshold is chosen to make this region more visible: for a threshold of 0.5, it is smaller, and it almost disappears at snapshot index 19.
- The points that do not fit any of these criteria are left white, signalling that they could not be confidently classified in any of these phases.
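The assignment rules above can be collected into a single function (a sketch; the function name and the "Unclassified" return value for white points are ours):

```python
def assign_phase(out, meron_threshold=0.25):
    """Map a 5-vector of network outputs to a phase-diagram label.

    out is ordered as in Table 1:
    [antiskyrmion, meron, helix, ferromagnetic, hexagonal].
    """
    anti, meron, helix, ferro, hexagonal = out
    if meron > meron_threshold:        # meron rule applies regardless of other labels
        return "Meron"
    if anti > 0.5 and ferro > 0.5:     # antiskyrmions plus large aligned regions
        return "Antiskyrmion gas"
    if anti > 0.5:
        return "Antiskyrmion"
    if ferro > 0.5 and helix <= 0.5 and hexagonal <= 0.5:
        return "Ferromagnetic"         # ferromagnetic is the only label above 0.5
    if helix > 0.5 and ferro <= 0.5 and hexagonal <= 0.5:
        return "Helical"               # helical is the only label above 0.5
    return "Unclassified"              # left white in the diagram
```

The order of the checks encodes the precedence of the rules: the meron criterion overrides the others, and the mixed antiskyrmion-ferromagnetic case is tested before the pure ones.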
We show the resulting phase diagrams in Fig. 3. Since the schedule consists of thermalising at constant values of $T$ and $B$, starting from a random lattice configuration, these diagrams correspond to what may be obtained experimentally by rapidly cooling from a high temperature directly into the target point.
We first notice the following differences between the phase diagrams for the two selected snapshots: the shrinking of the "Meron" region and the appearance of the "Antiskyrmion gas" one. The shrinking of the "Meron" region can be explained by the presence, in the early stages, of structures that are classified as merons but later disappear in favour of antiskyrmions. This is partially mitigated by training the network on the early snapshots, as shown in some of the examples in Fig. 2, but the effect persists to a smaller degree. Another feature that can only be found at larger Monte Carlo times is the mixed antiskyrmion-ferromagnetic states, the "Antiskyrmion gas", which is not detectable at early stages due to noise. We conclude that the simulations need to be run for the entire time we considered in order to resolve these details.
However, the rest of both diagrams approximately agree. Concretely, the regions for the helical and ferromagnetic phases and the regions where antiskyrmions are detected are similar. This means that, to identify where these features emerge, one only needs to run the simulation up to snapshot 6. Since these are the most critical features, and the only ones present in a typical phase diagram, running short simulations and using the CNN to predict the final state is a viable option for many applications.
V Conclusions
We have shown that a multi-label machine learning approach allows us to identify complex structures and mixed states in the results of 3D Monte Carlo simulations of spin lattices with DM interactions. We have trained a CNN using 2D lattice slices to detect antiskyrmions, merons, helical-phase walls, regions with ferromagnetic arrangements of spins, and hexagonal lattices of antiskyrmions. We have only used the $z$ component of the lattice spins, which is similar among antiskyrmions and Bloch/Néel skyrmions. Although we have applied this approach to a version of the DM interactions that supports antiskyrmions, it would perform similarly for skyrmions of both types.
In addition to directly identifying these features, we have used this framework to predict their emergence from the early stages of the simulation. The CNNs trained on the first few snapshots, with labels given by the final configuration, can predict final features even in cases in which it is not intuitively apparent from an inspection by the naked eye of the corresponding images.
One of the applications of the early-snapshot CNNs is in shortening simulation times. Thanks to them, one can stop the simulation at early stages and predict the relevant features. As an example, we have constructed phase diagrams for snapshots 6 and 20 at a fixed value of $K$. Although long simulation times are needed to resolve fine details, such as mixtures of merons and antiskyrmions or whether the antiskyrmions are in a crystal or a gas phase, we find that the main phases (antiskyrmion, helical, and ferromagnetic) can be detected with significantly shorter simulations.
References
 Dzyaloshinsky [1958] I. Dzyaloshinsky, A thermodynamic theory of weak ferromagnetism of antiferromagnetics, Journal of Physics and Chemistry of Solids 4, 241 (1958).
 Moriya [1960] T. Moriya, Anisotropic Superexchange Interaction and Weak Ferromagnetism, Phys. Rev. 120, 91 (1960).
 Skyrme [1962] T. H. R. Skyrme, A Unified Field Theory of Mesons and Baryons, Nucl. Phys. 31, 556 (1962).
 Bogdanov and Yablonskii [1989] A. N. Bogdanov and D. Yablonskii, Thermodynamically stable "vortices" in magnetically ordered crystals. The mixed state of magnets, Zh. Eksp. Teor. Fiz. 95, 178 (1989).
 Mühlbauer et al. [2009] S. Mühlbauer, B. Binz, F. Jonietz, C. Pfleiderer, A. Rosch, A. Neubauer, R. Georgii, and P. Böni, Skyrmion lattice in a chiral magnet, Science 323, 915 (2009).
 Münzer et al. [2010] W. Münzer et al., Skyrmion lattice in the doped semiconductor FeCoSi, Phys. Rev. B 81, 041203 (2010), arXiv:0903.2587 [cond-mat.str-el].
 Yu et al. [2010] X. Yu, Y. Onose, N. Kanazawa, J. H. Park, J. Han, Y. Matsui, N. Nagaosa, and Y. Tokura, Real-space observation of a two-dimensional skyrmion crystal, Nature 465, 901 (2010).
 Yu et al. [2011] X. Yu, N. Kanazawa, Y. Onose, K. Kimoto, W. Zhang, S. Ishiwata, Y. Matsui, and Y. Tokura, Near room-temperature formation of a skyrmion crystal in thin-films of the helimagnet FeGe, Nature Materials 10, 106 (2011).
 Tokunaga et al. [2015] Y. Tokunaga, X. Yu, J. White, H. M. Rønnow, D. Morikawa, Y. Taguchi, and Y. Tokura, A new class of chiral materials hosting magnetic skyrmions beyond room temperature, Nature Communications 6, 1 (2015).
 Woo et al. [2016] S. Woo, K. Litzius, B. Krüger, M.-Y. Im, L. Caretta, K. Richter, M. Mann, A. Krone, R. M. Reeve, M. Weigand, P. Agrawal, I. Lemesh, M.-A. Mawass, P. Fischer, M. Kläui, and G. S. D. Beach, Observation of room-temperature magnetic skyrmions and their current-driven dynamics in ultrathin metallic ferromagnets, Nature Materials 15, 501 (2016), arXiv:1502.07376 [cond-mat.mtrl-sci].
 Fujima et al. [2017] Y. Fujima, N. Abe, Y. Tokunaga, and T. Arima, Thermodynamically stable skyrmion lattice at low temperatures in a bulk crystal of lacunar spinel GaV4Se8, Phys. Rev. B 95, 180410 (2017).
 Koshibae and Nagaosa [2016] W. Koshibae and N. Nagaosa, Theory of antiskyrmions in magnets, Nature Communications 7, 1 (2016).
 Hoffmann et al. [2017] M. Hoffmann, B. Zimmermann, G. P. Müller, D. Schürhoff, N. S. Kiselev, C. Melcher, and S. Blügel, Antiskyrmions stabilized at interfaces by anisotropic Dzyaloshinskii-Moriya interactions, Nature Communications 8, 308 (2017), arXiv:1702.07573 [cond-mat.mes-hall].
 Huang et al. [2017] S. Huang, C. Zhou, G. Chen, H. Shen, A. K. Schmid, K. Liu, and Y. Wu, Stabilization and current-induced motion of antiskyrmion in the presence of anisotropic Dzyaloshinskii-Moriya interaction, Phys. Rev. B 96, 144412 (2017), arXiv:1709.07156 [cond-mat.mtrl-sci].
 Camosi et al. [2018] L. Camosi, N. Rougemaille, O. Fruchart, J. Vogel, and S. Rohart, Micromagnetics of antiskyrmions in ultrathin films, Phys. Rev. B 97, 134404 (2018), arXiv:1712.04743 [cond-mat.mtrl-sci].
 Kovalev and Sandhoefner [2018] A. A. Kovalev and S. Sandhoefner, Skyrmions and antiskyrmions in quasi-two-dimensional magnets, Frontiers in Physics 6, 98 (2018).
 Böttcher et al. [2018] M. Böttcher, S. Heinze, S. Egorov, J. Sinova, and B. Dupé, B-T phase diagram of Pd/Fe/Ir(111) computed with parallel tempering Monte Carlo, New Journal of Physics 20, 103014 (2018), arXiv:1707.01708 [cond-mat.mtrl-sci].
 Jena et al. [2020] J. Jena, B. Göbel, T. Ma, V. Kumar, R. Saha, I. Mertig, C. Felser, and S. S. P. Parkin, Elliptical Bloch skyrmion chiral twins in an antiskyrmion system, Nature Communications 11, 1115 (2020).
 Yu [2021] X. Yu, Magnetic imaging of various topological spin textures and their dynamics, Journal of Magnetism and Magnetic Materials 539, 168332 (2021).
 Fert et al. [2013] A. Fert, V. Cros, and J. Sampaio, Skyrmions on the track, Nature Nanotechnology 8, 152 (2013).
 Tomasello et al. [2014] R. Tomasello, E. Martinez, R. Zivieri, L. Torres, M. Carpentieri, and G. Finocchio, A strategy for the design of skyrmion racetrack memories, Scientific Reports 4, 6784 (2014), arXiv:1409.6491 [cond-mat.mes-hall].
 Song et al. [2020] K. M. Song, J.-S. Jeong, B. Pan, X. Zhang, J. Xia, S. Cha, T.-E. Park, K. Kim, S. Finizio, J. Raabe, et al., Skyrmion-based artificial synapses for neuromorphic computing, Nature Electronics 3, 148 (2020), arXiv:1907.00957 [physics.app-ph].
 Pinna et al. [2020] D. Pinna, G. Bourianoff, and K. Everschor-Sitte, Reservoir Computing with Random Skyrmion Textures, Physical Review Applied 14, 054020 (2020).
 Zázvorka et al. [2019] J. Zázvorka, F. Jakobs, D. Heinze, N. Keil, S. Kromin, S. Jaiswal, K. Litzius, G. Jakob, P. Virnau, D. Pinna, K. Everschor-Sitte, L. Rózsa, A. Donges, U. Nowak, and M. Kläui, Thermal skyrmion diffusion used in a reshuffler device, Nature Nanotechnology 14, 658 (2019).
 Buhrandt and Fritz [2013] S. Buhrandt and L. Fritz, Skyrmion lattice phase in three-dimensional chiral magnets from Monte Carlo simulations, Physical Review B 88, 195137 (2013).
 Criado et al. [2021] J. C. Criado, P. D. Hatton, S. Schenk, M. Spannowsky, and L. A. Turnbull, Simulating magnetic antiskyrmions on the lattice, (2021), arXiv:2109.15020 [cond-mat.str-el].
 Salcedo-Gallo et al. [2020] J. Salcedo-Gallo, C. Galindo-González, and E. Restrepo-Parra, Deep learning approach for image classification of magnetic phases in chiral magnets, Journal of Magnetism and Magnetic Materials 501, 166482 (2020).
 Singh and Han [2019] V. K. Singh and J. H. Han, Application of machine learning to two-dimensional Dzyaloshinskii-Moriya ferromagnets, Physical Review B 99, 174426 (2019).
 Albarracín and Rosales [2022] F. Albarracín and H. Rosales, Machine learning techniques to construct detailed phase diagrams for skyrmion systems, arXiv preprint arXiv:2201.05683 (2022).
 Matthies et al. [2022] T. Matthies, A. F. Schäffer, T. Posske, R. Wiesendanger, and E. Y. Vedmedenko, Topological characterization of dynamic chiral magnetic textures using machine learning, arXiv preprint arXiv:2201.01629 (2022).
 Kawaguchi et al. [2021] M. Kawaguchi, K. Tanabe, K. Yamada, T. Sawa, S. Hasegawa, M. Hayashi, and Y. Nakatani, Determination of the Dzyaloshinskii-Moriya interaction using pattern recognition and machine learning, npj Computational Materials 7, 1 (2021).
 Wang et al. [2021] W. Wang, Z. Wang, Y. Zhang, B. Sun, and K. Xia, Learning order parameters from videos of skyrmion dynamical phases with neural networks, Physical Review Applied 16, 014005 (2021).
 Kingma and Ba [2014] D. P. Kingma and J. Ba, Adam: A method for stochastic optimization, CoRR abs/1412.6980 (2014).