In this section, we describe the application of VISMO-YY to its intended purpose, i.e., in-situ visualization, and evaluate its performance in practical simulations on the Yin-Yang grid.
Simulation model
The target simulation, into which in-situ visualization is integrated, is again thermal convection in a spherical shell, but this time without rotation. We solve the thermal convection of a fluid confined between two concentric spheres: an outer sphere of radius \(r = r_o\) and an inner sphere of radius \(r=r_i\). The viscosity and thermal diffusivity of the fluid are constant, and their ratio (the Prandtl number) is unity. We assume a central gravity force and fixed temperatures \(T_o\) and \(T_i\) on the boundaries at \(r=r_o\) and \(r=r_i\), respectively. Because \(T_o<T_i\), thermal convection sets in when the Rayleigh number Ra of the system exceeds its critical value.
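For reference, writing the shell thickness as \(d=r_o-r_i\), the temperature difference as \(\Delta T=T_i-T_o\), the kinematic viscosity as \(\nu\), the thermal diffusivity as \(\kappa\), the thermal expansion coefficient as \(\alpha\), and the gravitational acceleration as \(g\), these control parameters take the standard form (the precise non-dimensionalization adopted in the code may differ):
\[
\mathrm{Ra}=\frac{\alpha g\,\Delta T\,d^{3}}{\nu\kappa},\qquad \mathrm{Pr}=\frac{\nu}{\kappa}=1.
\]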
Figure 4 shows an in-situ slicer rendering and a volume rendering applied to the convection simulation on the Yin-Yang grid. Here, the convection layer is relatively deep: \(r_i=0.3\) and \(r_o=1.0\). The total grid size is \(N_r\times N_\theta \times N_\phi \times 2= 201\times 202\times 606\times 2\), where \(N_r\), \(N_\theta\), and \(N_\phi\) are the grid sizes in the radial, latitudinal, and longitudinal directions, respectively. The final factor of 2 accounts for the Yin and Yang components. Figure 4a shows an in-situ slicer rendering; multiple slices can be placed in VISMO-YY.
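For a rough sense of the data volume this grid implies, the following short Python sketch (not part of VISMO-YY; the 8-byte double-precision assumption is ours, made only for illustration) counts the grid points and the memory occupied by one scalar field:

```python
# Grid size of the runs shown in Figs. 4 and 5.
Nr, Nt, Np = 201, 202, 606     # radial, latitudinal, longitudinal grid sizes
points = Nr * Nt * Np * 2      # factor 2: Yin and Yang components
print(f"{points:,} grid points")                      # 49,209,624 (~4.9e7)

# Memory for one scalar field, assuming 8-byte double precision
# (an assumption for illustration, not a statement about the code's storage).
print(f"{points * 8 / 1e9:.2f} GB per scalar field")  # ~0.39 GB
```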
The inner spherical surface, depicted as the gray ball in Fig. 4a, can be switched on and off via the configuration file. VISMO-YY can draw the inner and/or outer spheres as a solid surface or a mesh.
Figure 4b shows an in-situ volume rendering of the enstrophy density. The semitransparent appearance of the volume rendering helps us observe the three-dimensional distribution of the scalar field.
One feature of VISMO-YY not mentioned thus far is the background color. It is sometimes useful to change the background, which is black by default, to a lighter color. In Fig. 4, we set the background to gray in the configuration file. The gray background makes it easier to grasp the three-dimensional structure of the convection cell visualized by the bluish volume rendering.
Figure 5a shows a visualization of the convection flow with arrow glyphs. In VISMO-YY, arrow glyphs are placed on a plane specified in the configuration file. In this case, two perpendicular planes are set: one on the equatorial plane and the other on a meridional plane. The density of the arrows and the size of each glyph can be controlled through the configuration file. The two planes of arrow glyphs reveal part of the flow distribution.
Applying multiple visualization methods at once is straightforward with VISMO-YY. By specifying the methods and their parameters in the configuration file, we automatically obtain multiple in-situ visualizations superimposed in a single image with correct depth ordering and occlusion. Figure 5b depicts an example of such a visualization, in which the isosurface rendering of the enstrophy density is combined with the arrow-glyph rendering of the flow velocity. The pale green objects represent the isosurfaces. The density of the arrow glyphs in Fig. 5b is lower than that in Fig. 5a.
Visualization performance
The coupled simulation code that generated Figs. 4 and 5 was executed on an SGI Standard-Depth Server C2112-4GP3 (CPU: Intel Xeon E5-2650v3, 2.3 GHz, 10 cores \(\times\) 2; memory: 128 GB per node) using 512 CPU cores on 32 nodes. The total number of MPI processes was 256, with 2 OpenMP threads per MPI process. We calculated 50,000 steps. VISMO-YY visualized the data every 1,000 steps and output four kinds of visualization images with a resolution of \(512\times 512\) pixels.
The times required for the simulation and the visualization were 73,223 s and 283 s, respectively; only about 0.38% of the total time was spent on visualization. This low cost demonstrates that VISMO-YY is a practical library for in-situ visualization.
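Explicitly, the fraction of the total run time spent on visualization is
\[
\frac{283\ \mathrm{s}}{73{,}223\ \mathrm{s}+283\ \mathrm{s}}\approx 3.9\times 10^{-3},
\]
i.e., less than 0.4% of the total time.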
The visualization cost depends on the image resolution. In general, the rendering time of VISMO-YY depends linearly on the total number of pixels in the images. To confirm this, Fig. 6a plots the rendering time for isosurface visualization as a function of the total number of pixels, using the same data as in Fig. 2. Figure 6a verifies the linear dependence on the total number of pixels. According to this estimate, even if we applied VISMO-YY visualizations with an image size of \(2048\times 2048\) pixels, generating the images would take only about \(0.38\times 16 \approx 6\%\) of the total simulation time. Another critical indicator of VISMO-YY’s performance is its parallel scaling. We measured the strong scaling of VISMO-YY for the same convection simulation with a relatively small total computational grid of \(N_r\times N_\theta \times N_\phi \times 2=401\times 402\times 1206\times 2\), applying in-situ visualization of \(v_r\) isosurfaces. Figure 6b summarizes the results for different numbers of cores, from 64 to 1024. Fixing the number of MPI processes at 64, we increased the number of OpenMP threads from 1 to 16. In the 1024-core case, the visualization took only \(0.056\%\) of the total simulation time. This graph shows that VISMO-YY achieves practically good parallel performance up to 1024 cores, even with this moderate grid size. For larger simulations, VISMO-YY is expected to scale linearly to higher degrees of parallelism.
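The resolution estimate above can be reproduced with the following back-of-the-envelope Python sketch (not part of VISMO-YY; the timings are the measured values quoted above, and the linear-in-pixels assumption follows Fig. 6a):

```python
# Extrapolate the in-situ visualization overhead to a larger image resolution,
# assuming the rendering time grows linearly with the total number of pixels.
T_SIM = 73_223.0   # measured simulation time [s]
T_VIS = 283.0      # measured visualization time at 512x512 [s]

def overhead(width, height, base=(512, 512)):
    """Estimated fraction of the total run time spent on visualization."""
    scale = (width * height) / (base[0] * base[1])   # pixel-count ratio
    t_vis = T_VIS * scale                            # linear-in-pixels model
    return t_vis / (T_SIM + t_vis)

print(f"{overhead(512, 512):.2%}")     # ~0.4 %, as measured
print(f"{overhead(2048, 2048):.2%}")   # ~6 %, consistent with the estimate above
```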
The test simulations and in-situ visualization results presented so far are relatively simple. Here, we demonstrate the feasibility of VISMO-YY in a much larger simulation. The target is the same problem (thermal convection in a spherical shell), but at a larger scale and with higher parallelization. The spherical shell is deeper (\(r_i=0.1\) and \(r_o=1.0\)), which leads to a higher Rayleigh number. The grid size is \(N_r\times N_\theta \times N_\phi \times 2 = 1023\times 1002\times 3006\times 2\), and the number of MPI processes is 15,360. The computation was performed on Plasma Simulator (SX-Aurora TSUBASA A412-8) at the National Institute for Fusion Science, Japan. Figure 7a shows a snapshot of the isosurface rendering of the radial velocity, \(v_r=\pm c\), produced by VISMO-YY. The isosurface level \(c\) is set to half of the maximum value of \(|v_r|\) at the snapshot time. The positive flow \(v_r=+c\) is rendered as pink surfaces and the negative flow \(v_r=-c\) as light blue surfaces. The image size of the in-situ visualization is \(1024\times 1024\) pixels. We produced one in-situ visualization image per 1800 steps of numerical integration of the convection. The 1800 integration steps took 299 s, while the in-situ visualization with VISMO-YY took 57 s.

We also performed the simulation with the same grid size (\(N_r\times N_\theta \times N_\phi \times 2 = 1023\times 1002\times 3006\times 2\)) on an HPE Apollo 2000 Gen10 (Xeon Gold 6248, 2.5 GHz, 40 cores per node) for 1800 steps, with 512 MPI processes and 4 OpenMP threads per MPI process. The simulation took about 12,000 s, whereas only 3.0 s was needed on this system to generate the in-situ isosurface visualization of \(v_r\) at the same image resolution (\(1024\times 1024\)). The performance ratio (57 s / 3.0 s) indicates that there is plenty of room to optimize VISMO-YY for vector-type computer systems.
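As an illustration of this point, the visualization overheads implied by the numbers above can be tabulated with a short Python sketch (all timings are the measured values quoted in the text):

```python
# Visualization overhead of the large run (1800 steps, one 1024x1024
# isosurface image) on the two systems described above.
runs = {
    "SX-Aurora TSUBASA A412-8": {"t_sim": 299.0,   "t_vis": 57.0},
    "HPE Apollo 2000 Gen10":    {"t_sim": 12000.0, "t_vis": 3.0},
}

for name, t in runs.items():
    frac = t["t_vis"] / (t["t_sim"] + t["t_vis"])
    print(f"{name}: visualization fraction = {frac:.2%}")
# -> about 16 % on the vector system versus well below 0.1 % on the x86
#    cluster, which is why we see room for optimization on vector systems.
```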
For comparison with post-hoc visualization, we attempted the same isosurface visualization of \(v_r\) for the large-scale simulation data mentioned above. We used ParaView for this comparison, aiming at the same visualization as in Fig. 7a. The purpose of the comparison is to roughly estimate the cost of post-hoc visualization when a special graphics system is unavailable. It is possible that a more powerful hardware system would significantly improve the visualization performance of ParaView.
For this post-hoc visualization test, we used a DELL Precision 5550, a mobile workstation with an Intel Core i7-10850H and 32 GB of memory. The data size of \(v_r\) (downsized to single precision) for just one time step is \(4~\mathrm{B}\times N_r\times N_\theta \times N_\phi \times 2 \sim 24.65~\mathrm{GB}\). Transferring the data from Plasma Simulator to the local PC took more than 11 minutes for a single time step. We applied the ParaView visualization on the Yin-Yang grid using the vtkStructuredGrid format, which requires three additional floating-point values per grid point, i.e., the x, y, and z coordinates. Therefore, the total data size needed for the isosurface visualization amounts to \(4\times 24.65=98.6\) GB, which exceeds the main memory of the workstation.
Giving up on fully reproducing Fig. 7a with ParaView, we visualized only part of the Yin-grid data, discarding the Yang part and truncating the grid to \(N_r\times N_\theta \times N_\phi =1023\times 1002\times 1850\). The total size of the extracted data, including the coordinate information in the vtkStructuredGrid format, is about 28 GB, which is close to the practical limit of the workstation.
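The data-size estimates above can be reproduced with the following minimal Python sketch (4-byte single precision is assumed; GB here means \(10^9\) bytes):

```python
# Data sizes for the post-hoc visualization test with vtkStructuredGrid.
BYTES = 4                        # single-precision values
Nr, Nt, Np = 1023, 1002, 3006    # radial, latitudinal, longitudinal grid sizes

field = BYTES * Nr * Nt * Np * 2 / 1e9            # v_r on the full Yin-Yang grid
print(f"v_r only          : {field:.2f} GB")      # ~24.65 GB

# vtkStructuredGrid also stores x, y, z for every grid point -> 4 arrays total.
print(f"v_r + coordinates : {4 * field:.2f} GB")  # ~98.6 GB

# Extracted Yin-only subset loaded into ParaView (phi truncated to 1850).
subset = 4 * BYTES * Nr * Nt * 1850 / 1e9
print(f"Yin subset        : {subset:.1f} GB")     # ~30 GB, i.e., about 28 GiB
```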
It took about 70 s to load the 28 GB of data into ParaView and about 40 s to render the isosurfaces shown in Fig. 7b, i.e., about 110 s in total to obtain one snapshot. If we were to visualize the whole Yin-Yang data set (98.6 GB), it would take about \(110\times 3 = 330\) s. The 57 s achieved by VISMO-YY stands in stark contrast.
Note that the above test involves just one scalar field. If we were to visualize multiple scalar fields at once, or to apply vector-field visualizations such as stream lines or arrow glyphs, the memory burden of post-hoc visualization would become much more serious. In short, post-hoc visualization at this scale is impractical, even if a workstation with larger main memory were available. It would be almost impossible to adopt the post-hoc approach for visualizing many time steps due to the storage and time costs.