Reference documentation for deal.II version Git 73c87d96ef 20211130 22:54:44 +0100

#include <deal.II/matrix_free/matrix_free.h>
Public Types  
enum  TasksParallelScheme { none = internal::MatrixFreeFunctions::TaskInfo::none, partition_partition, partition_color, color = internal::MatrixFreeFunctions::TaskInfo::color } 
Public Member Functions  
AdditionalData (const TasksParallelScheme tasks_parallel_scheme=partition_partition, const unsigned int tasks_block_size=0, const UpdateFlags mapping_update_flags=update_gradientsupdate_JxW_values, const UpdateFlags mapping_update_flags_boundary_faces=update_default, const UpdateFlags mapping_update_flags_inner_faces=update_default, const UpdateFlags mapping_update_flags_faces_by_cells=update_default, const unsigned int mg_level=numbers::invalid_unsigned_int, const bool store_plain_indices=true, const bool initialize_indices=true, const bool initialize_mapping=true, const bool overlap_communication_computation=true, const bool hold_all_faces_to_owned_cells=false, const bool cell_vectorization_categories_strict=false, const bool allow_ghosted_vectors_in_loops=true, const bool use_fast_hanging_node_algorithm=true)  
AdditionalData (const AdditionalData &other)  
AdditionalData &  operator= (const AdditionalData &other) 
Collects the options for initialization of the MatrixFree class. The first parameter specifies the MPI communicator to be used, the second the parallelization options in shared memory (taskbased parallelism, where one can choose between no parallelism and three schemes that avoid that cells with access to the same vector entries are accessed simultaneously), the third with the block size for task parallel scheduling, the fourth the update flags that should be stored by this class.
The fifth parameter specifies the level in the triangulation from which the indices are to be used. If the level is set to numbers::invalid_unsigned_int, the active cells are traversed, and otherwise the cells in the given level. This option has no effect in case a DoFHandler is given.
The parameter initialize_plain_indices
indicates whether the DoFInfo class should also allow for access to vectors without resolving constraints.
The two parameters initialize_indices
and initialize_mapping
allow the user to disable some of the initialization processes. For example, if only the scheduling that avoids touching the same vector/matrix indices simultaneously is to be found, the mapping needs not be initialized. Likewise, if the mapping has changed from one iteration to the next but the topology has not (like when using a deforming mesh with MappingQEulerian), it suffices to initialize the mapping only.
The two parameters cell_vectorization_categories
and cell_vectorization_categories_strict
control the formation of batches for vectorization over several cells. It is used implicitly when working with hpadaptivity but can also be useful in other contexts, such as in local time stepping where one would like to control which elements together form a batch of cells. The array cell_vectorization_categories
is accessed by the number given by cell>active_cell_index() when working on the active cells with mg_level
set to numbers::invalid_unsigned_int
and by cell>index() for the level cells. By default, the different categories in cell_vectorization_category
can be mixed and the algorithm is allowed to merge lower category numbers with the next higher categories if it is necessary inside the algorithm, in order to avoid partially filled SIMD lanes as much as possible. This gives a better utilization of the vectorization but might need special treatment, in particular for face integrals. If set to true
, the algorithm will instead keep different categories separate and not mix them in a single vectorized array.
Definition at line 181 of file matrix_free.h.
enum MatrixFree::AdditionalData::TasksParallelScheme 
Collects options for task parallelism. See the documentation of the member variable MatrixFree::AdditionalData::tasks_parallel_scheme for a thorough description.
Definition at line 188 of file matrix_free.h.

inline 
Constructor for AdditionalData.
Definition at line 214 of file matrix_free.h.

inline 
Copy constructor.
Definition at line 253 of file matrix_free.h.

inline 
Copy assignment.
Definition at line 281 of file matrix_free.h.
TasksParallelScheme MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::tasks_parallel_scheme 
Set the scheme for task parallelism. There are four options available. If set to none
, the operator application is done in serial without shared memory parallelism. If this class is used together with MPI and MPI is also used for parallelism within the nodes, this flag should be set to none
. The default value is partition_partition
, i.e. we actually use multithreading with the first option described below.
The first option partition_partition
is to partition the cells on two levels in onionskinlike partitions and forming chunks of tasks_block_size after the partitioning. The partitioning finds sets of independent cells that enable working in parallel without accessing the same vector entries at the same time.
The second option partition_color
is to use a partition on the global level and color cells within the partitions (where all chunks within a color are independent). Here, the subdivision into chunks of cells is done before the partitioning, which might give worse partitions but better cache performance if degrees of freedom are not renumbered.
The third option color
is to use a traditional algorithm of coloring on the global level. This scheme is a special case of the second option where only one partition is present. Note that for problems with hanging nodes, there are quite many colors (50 or more in 3D), which might degrade parallel performance (bad cache behavior, many synchronization points).
partition_partition
option in case of usual DG elements, no comprehensive tests have been performed for systems of more general elements, like combinations of continuous and discontinuous elements that add face integrals to all terms. Definition at line 344 of file matrix_free.h.
unsigned int MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::tasks_block_size 
Set the number of socalled macro cells that should form one partition. If zero size is given, the class tries to find a good size for the blocks based on MultithreadInfo::n_threads() and the number of cells present. Otherwise, the given number is used. If the given number is larger than one third of the number of total cells, this means no parallelism. Note that in the case vectorization is used, a macro cell consists of more than one physical cell.
Definition at line 355 of file matrix_free.h.
UpdateFlags MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::mapping_update_flags 
This flag determines the mapping data on cells that is cached. This class can cache data needed for gradient computations (inverse Jacobians), Jacobian determinants (JxW), quadrature points as well as data for Hessians (derivative of Jacobians). By default, only data for gradients and Jacobian determinants times quadrature weights, JxW, are cached. If quadrature points or second derivatives are needed, they must be specified by this field (even though second derivatives might still be evaluated on Cartesian cells without this option set here, since there the Jacobian describes the mapping completely).
Definition at line 368 of file matrix_free.h.
UpdateFlags MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::mapping_update_flags_boundary_faces 
This flag determines the mapping data on boundary faces to be cached. Note that MatrixFree uses a separate loop layout for face integrals in order to effectively vectorize also in the case of hanging nodes (which require different subface settings on the two sides) or some cells in the batch of a VectorizedArray of cells that are adjacent to the boundary and others that are not.
If set to a value different from update_general (default), the face information is explicitly built. Currently, MatrixFree supports to cache the following data on faces: inverse Jacobians, Jacobian determinants (JxW), quadrature points, data for Hessians (derivative of Jacobians), and normal vectors.
boundary_operation
in the MatrixFree::loop(), this field must be set to a value different from UpdateFlags::update_default. Definition at line 388 of file matrix_free.h.
UpdateFlags MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::mapping_update_flags_inner_faces 
This flag determines the mapping data on interior faces to be cached. Note that MatrixFree uses a separate loop layout for face integrals in order to effectively vectorize also in the case of hanging nodes (which require different subface settings on the two sides) or some cells in the batch of a VectorizedArray of cells that are adjacent to the boundary and others that are not.
If set to a value different from update_general (default), the face information is explicitly built. Currently, MatrixFree supports to cache the following data on faces: inverse Jacobians, Jacobian determinants (JxW), quadrature points, data for Hessians (derivative of Jacobians), and normal vectors.
face_operation
in the MatrixFree::loop(), this field must be set to a value different from UpdateFlags::update_default. Definition at line 408 of file matrix_free.h.
UpdateFlags MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::mapping_update_flags_faces_by_cells 
This flag determines the mapping data for faces in a different layout with respect to vectorizations. Whereas mapping_update_flags_inner_faces
and mapping_update_flags_boundary_faces
trigger building the data in a facecentric way with proper vectorization, the current data field attaches the face information to the cells and their way of vectorization. This is only needed in special situations, as for example for blockJacobi methods where the full operator to a cell including its faces are evaluated. This data is accessed by FEFaceEvaluation::reinit(cell_batch_index, face_number)
. However, currently no coupling terms to neighbors can be computed with this approach because the neighbors are not laid out by the VectorizedArray data layout with an arrayofstructofarraytype data structures.
Note that you should only compute this data field in case you really need it as it more than doubles the memory required by the mapping data on faces.
If set to a value different from update_general (default), the face information is explicitly built. Currently, MatrixFree supports to cache the following data on faces: inverse Jacobians, Jacobian determinants (JxW), quadrature points, data for Hessians (derivative of Jacobians), and normal vectors.
Definition at line 436 of file matrix_free.h.
unsigned int MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::mg_level 
This option can be used to define whether we work on a certain level of the mesh, and not the active cells. If set to invalid_unsigned_int (which is the default value), the active cells are gone through, otherwise the level given by this parameter. Note that if you specify to work on a level, its dofs must be distributed by using dof_handler.distribute_mg_dofs(fe);
.
Definition at line 446 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::store_plain_indices 
Controls whether to enable reading from vectors without resolving constraints, i.e., just read the local values of the vector. By default, this option is enabled. In case you want to use FEEvaluationBase::read_dof_values_plain, this flag needs to be set.
Definition at line 454 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::initialize_indices 
Option to control whether the indices stored in the DoFHandler should be read and the pattern for task parallelism should be set up in the initialize method of MatrixFree. The default value is true. Can be disabled in case the mapping should be recomputed (e.g. when using a deforming mesh described through MappingEulerian) but the topology of cells has remained the same.
Definition at line 465 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::initialize_mapping 
Option to control whether the mapping information should be computed in the initialize method of MatrixFree. The default value is true. Can be disabled when only some indices should be set up (e.g. when only a set of independent cells should be computed).
Definition at line 474 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::overlap_communication_computation 
Option to control whether the loops should overlap communications and computations as far as possible in case the vectors passed to the loops support nonblocking data exchange. In most situations, overlapping is faster in case the amount of data to be sent is more than a few kilobytes. If less data is sent, the communication is latency bound on most clusters (pointtopoint latency is around 1 microsecond on good clusters by 2016 standards). Depending on the MPI implementation and the fabric, it may be faster to not overlap and wait for the data to arrive. The default is true, i.e., communication and computation are overlapped.
Definition at line 488 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::hold_all_faces_to_owned_cells 
By default, the face part will only hold those faces (and ghost elements behind faces) that are going to be processed locally. In case MatrixFree should have access to all neighbors on locally owned cells, this option enables adding the respective faces at the end of the face range.
Definition at line 497 of file matrix_free.h.
std::vector<unsigned int> MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::cell_vectorization_category 
This data structure allows to assign a fraction of cells to different categories when building the information for vectorization. It is used implicitly when working with hpadaptivity but can also be useful in other contexts, such as in local time stepping where one would like to control which elements together form a batch of cells.
This array is accessed by the number given by cell>active_cell_index() when working on the active cells with mg_level
set to numbers::invalid_unsigned_int and by cell>index() for the level cells.
triangulation.n_active_cells()
or triangulation.n_cells(level)
when filling data. Definition at line 515 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::cell_vectorization_categories_strict 
By default, the different categories in cell_vectorization_category
can be mixed and the algorithm is allowed to merge lower categories with the next higher categories if it is necessary inside the algorithm. This gives a better utilization of the vectorization but might need special treatment, in particular for face integrals. If set to true
, the algorithm will instead keep different categories separate and not mix them in a single vectorized array.
Definition at line 526 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::allow_ghosted_vectors_in_loops 
Assert that vectors passed to the MatrixFree loops are not ghosted. This variable is primarily intended to reveal bugs or performance problems caused by vectors that are involuntarily in ghosted mode, by adding a check that this is not the case. In terms of correctness, the MatrixFree::loop() and MatrixFree::cell_loop() methods support both cases and perform similar operations. In particular, ghost values are always updated on the source vector within the loop, and the difference is only in whether the initial nonghosted state is restored.
Definition at line 538 of file matrix_free.h.
bool MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::use_fast_hanging_node_algorithm 
Flag that allows to disable the fast hangingnode algorithm.
Definition at line 543 of file matrix_free.h.
MPI_Comm MatrixFree< dim, Number, VectorizedArrayType >::AdditionalData::communicator_sm 
Sharedmemory MPI communicator. Default: MPI_COMM_SELF.
Definition at line 548 of file matrix_free.h.