Understanding the Earth’s subsurface is crucial for everything from identifying natural resources to predicting seismic activity. However, traditional methods for geophysical exploration have long been hampered by their immense computational demands, often requiring months or even years of processing time. Now, a new software stack called DeepWave, developed by researchers from the Brazilian Synchrotron Light Laboratory (LNLS), the University of Campinas (UNICAMP), and the Center for Computing in Engineering and Sciences (CCES), is set to transform this field.
DeepWave, unveiled at the 2024 IEEE 36th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), represents a significant leap forward in integrating generative artificial intelligence (AI) with seismic surveying techniques. By leveraging advanced machine learning frameworks, DeepWave dramatically boosts the computational efficiency of geophysical exploration, promising faster and more accurate insights into our planet’s hidden structures.
The Challenge: Peering Beneath the Surface
For decades, geophysicists have relied on seismic surveys to create images of the Earth’s interior. This involves generating pulses (waves) that travel through the Earth and are then captured by receivers on the surface after reflecting off different geological layers. Algorithms like Reverse-Time Migration (RTM) and Full-waveform Inversion (FWI) are then used to simulate wave propagation and construct detailed subsurface images.
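For orientation, the simplest model these simulations solve is the standard constant-density acoustic wave equation (shown here as general background, not as DeepWave’s specific formulation), where u is the pressure wavefield, c(x) the subsurface velocity model, and s the source term; elastic formulations add several more coupled fields:

```
\frac{1}{c(\mathbf{x})^{2}}\,\frac{\partial^{2} u(\mathbf{x},t)}{\partial t^{2}}
  = \nabla^{2} u(\mathbf{x},t) + s(\mathbf{x},t)
```

RTM and FWI repeatedly solve equations of this kind over large 3D grids, which is where the computational cost comes from.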
While these techniques provide invaluable information, they come at an enormous computational cost. Simulating elastic wave propagation, handling the complex interdependence of the wavefield’s physical properties, and accurately modeling amplitudes can make FWI, for instance, extremely resource-intensive, often forcing simplifying assumptions that compromise accuracy. This computational burden has historically limited the scale and speed of geophysical analysis, making detailed subsurface imaging a prolonged and expensive endeavor.
AI to the Rescue: A New Wave of Efficiency
The advent of generative artificial intelligence has opened new avenues for accelerating these complex geophysical algorithms. By synthesizing high-fidelity data in a controlled manner and simplifying wave simulation equations without losing generality, AI can significantly reduce both data acquisition costs and the computational workload of seismic imaging.
However, applying existing machine learning (ML) model parallelism solutions to geophysical inverse problems has presented its own set of challenges. Many current ML platforms are designed for models with billions of parameters, which are often not suitable for geophysical applications, and they face limitations when it comes to parallel execution in private clusters.
DeepWave: A Masterclass in Parallelization
This is where DeepWave shines. The software stack was specifically designed to overcome these hurdles by enhancing traditional methods for solving inverse problems in geophysics. The DeepWave team has meticulously combined powerful machine learning frameworks like JAX, Flax, and Alpa to implement sophisticated parallelization strategies.
At its core, DeepWave focuses on optimizing image-to-image translation networks. This means that instead of directly running computationally intensive simulations, DeepWave can “translate” one type of seismic data into another, for example, converting acoustic seismic data into more complex elastic seismic data. This translation, powered by generative adversarial networks (GANs) like Pix2Pix, offers a more economical and efficient way to generate realistic seismic data, especially the highly costly elastic data.
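For intuition, the Pix2Pix objective pairs an adversarial term with an L1 reconstruction penalty that keeps the translated image close to its ground-truth pair. The toy NumPy sketch below is illustrative only, not DeepWave’s actual code; the function names are hypothetical, and the λ = 100 weight is the default from the original Pix2Pix paper:

```python
import numpy as np

def pix2pix_generator_loss(d_fake, fake, target, lam=100.0):
    """Toy Pix2Pix-style generator loss (illustrative sketch)."""
    # Adversarial term: the generator wants the discriminator's score
    # on its output, d_fake, pushed toward 1.
    adv = -np.mean(np.log(d_fake + 1e-8))
    # L1 term: keep the translated image close to the ground-truth target
    # (e.g., synthesized elastic data close to true elastic data).
    l1 = np.mean(np.abs(fake - target))
    return adv + lam * l1

fake = np.zeros((8, 8))
target = np.zeros((8, 8))
loss = pix2pix_generator_loss(np.array([0.5]), fake, target)
print(round(float(loss), 3))  # only the adversarial term remains: -log(0.5) ≈ 0.693
```

The heavy L1 weight is what makes the translation faithful to the paired data rather than merely plausible-looking.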
The Power Behind DeepWave: JAX, Flax, and Alpa
DeepWave’s efficiency stems from its intelligent integration of several key technologies:
- JAX: A Python library developed by Google, JAX is a powerhouse for high-performance numerical computing. It allows for Just-in-Time (JIT) compilation of Python functions and automatic differentiation, making it ideal for machine learning workloads. JAX simplifies the process of moving computations to accelerator devices like GPUs and TPUs, automatically handling data movements.
- Flax & Optax: While JAX provides the fundamental building blocks, Flax and Optax offer the higher-level tools needed to construct and train deep learning models efficiently. Flax supplies the neural-network layers and modules used to build convolutional networks, while Optax provides gradient-based optimizers for training. This pairing significantly reduces the “boilerplate” code typically required for deep learning implementations.
- XLA GSPMD: This compiler-based technique enables the automatic parallelization of machine learning models. It allows users to specify how tensors (multi-dimensional arrays of data) should be partitioned across multiple devices, supporting various parallelism patterns like data parallelism (replicating the model across devices), model parallelism (partitioning layer weights), and spatial partitioning (slicing input data).
- Alpa: Building upon GSPMD, Alpa addresses its limitations by automatically partitioning programs into stages, determining optimal sharding annotations, and executing them in a distributed environment. Alpa’s hierarchical parallelism approach intelligently maps pipeline stages to cluster nodes and utilizes intra-operator parallelism among GPUs on the same node, minimizing execution and data exchange times through advanced optimization techniques.
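The first two items above can be made concrete with a few lines of core JAX. The toy regression loss below is illustrative rather than DeepWave’s model, but the jit/grad pattern is exactly how such frameworks compile and differentiate training steps:

```python
import jax
import jax.numpy as jnp

# Toy mean-squared-error loss; jax.grad differentiates it automatically,
# and jax.jit compiles the result with XLA for CPU/GPU/TPU execution.
def loss(w, x, y):
    pred = x @ w
    return jnp.mean((pred - y) ** 2)

grad_loss = jax.jit(jax.grad(loss))  # compiled gradient function

w = jnp.zeros(3)
x = jnp.ones((4, 3))
y = jnp.ones(4)
print(grad_loss(w, x, y))  # gradient of the loss with respect to w
```

An optimizer from Optax would then consume these gradients to update `w`, but the transform composition shown here is the core of the workflow.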
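Data parallelism, one of the patterns listed above, can be sketched with JAX’s `pmap`: the leading axis of the batch is split across local devices, and every device runs the same function on its own shard. This is a minimal single-step illustration, not DeepWave’s actual pipeline:

```python
import jax
import jax.numpy as jnp

# Each device computes on its own shard of the batch; the leading axis of
# `batch` must equal the number of local devices.
def local_step(shard):
    return jnp.mean(shard ** 2)

n = jax.local_device_count()
batch = jnp.arange(float(n * 4)).reshape(n, 4)  # one row per device
per_device = jax.pmap(local_step)(batch)        # one partial result per device
print(per_device.shape)  # (n,)
```

In a real training loop the per-device gradients would then be averaged across devices (e.g., with `jax.lax.pmean`) before the shared model weights are updated.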
Real-World Impact: Faster Processing, Deeper Insights
The integration of these frameworks within DeepWave allows for the distribution of ML models across computational clusters, making it feasible to train complex models with the massive datasets characteristic of seismic imaging. This distributed training is crucial for tackling the “large model size” challenge that previously hindered the application of deep learning in geophysical contexts.
The results are compelling. Experiments conducted on the Ogbon supercomputer at SENAI CIMATEC’s Supercomputing Center demonstrated substantial improvements in processing speed and resource management. For instance, when evaluating data parallelism using Alpa on a single node with multiple GPUs, the Pix2Pix model achieved a speedup of 1.96x, and the U-Net model saw an even greater speedup of 2.84x compared to their respective baselines. In multi-node environments, using four nodes with a total of 16 GPUs, DeepWave achieved a 2.0x speedup for both Pix2Pix and U-Net models compared to a single-node baseline.
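As a sanity check on numbers like these, speedup and parallel efficiency follow directly from wall-clock times. The timings below are hypothetical stand-ins used only to show the arithmetic:

```python
# Illustrative arithmetic only; the timings are hypothetical.
def speedup(t_baseline, t_parallel):
    return t_baseline / t_parallel

def efficiency(s, n_workers):
    # Fraction of ideal linear scaling actually achieved.
    return s / n_workers

# A 2.0x speedup on four nodes corresponds to 50% node-level efficiency.
print(efficiency(speedup(100.0, 50.0), 4))  # 0.5
```

Sub-linear scaling like this is typical once inter-node communication costs enter the picture.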
This means that complex geophysical algorithms like Full-waveform Inversion (FWI), which are vital for detailed subsurface analysis, can now be executed with significantly reduced computational demands without sacrificing accuracy. The result is faster processing of large seismic datasets, yielding deeper insights into the Earth’s subsurface structures with fewer computational resources.
While the team did encounter some limitations with Alpa in certain multi-node model-parallelism configurations, particularly with the cyclic structure inherent in GANs, their workaround of selectively disabling the less resource-intensive parts of the network shows a pragmatic commitment to practical solutions.
A New Standard for Geophysical Exploration
The work by the DeepWave team (Allan Pinto, Gustavo Leite, Marcio Pereira, Hervé Yviquel, Sandro Rigo, and Guido Araujo) is setting a new standard for geophysical research and exploration. By making advanced deep learning models accessible and efficient for seismic data interpretation, DeepWave promises to accelerate our understanding of the Earth, leading to more efficient resource exploration, improved natural hazard prediction, and a deeper appreciation of our planet’s complex geological processes. This innovative software stack is not just a technological advancement; it’s a catalyst for a new era of discovery beneath our feet.