Download Seismic Data

In this notebook, we download one day of continuous seismic data from a public data center using obspy's FDSN mass downloader.

[1]:
from BPMF.config import cfg
import glob
import obspy as obs
import os

from datetime import datetime, timedelta
from obspy.clients.fdsn import mass_downloader
[2]:
DATE = obs.UTCDateTime("2012-07-26")
DATA_BUFFER_SEC = 500.

Defining and Initializing the Data Folder Architecture

[3]:
ROOTDIRPATH_DATA = cfg.INPUT_PATH
dirpath_data = os.path.join(ROOTDIRPATH_DATA, str(DATE.year), DATE.strftime("%Y%m%d"))
dirpath_raw_waveforms = os.path.join(dirpath_data, "raw")
dirpath_resp_files = os.path.join(dirpath_data, "resp")

# Create the directories if needed
if not os.path.isdir(dirpath_raw_waveforms):
    os.makedirs(dirpath_raw_waveforms)
if not os.path.isdir(dirpath_resp_files):
    os.makedirs(dirpath_resp_files)
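
The folder layout built above can be checked without touching the real data directory. The sketch below reproduces it under a temporary root. `build_day_dirs` is a hypothetical helper name, not part of BPMF, and `exist_ok=True` is used here in place of the explicit `os.path.isdir` checks.

```python
import os
import tempfile
from datetime import date


def build_day_dirs(root, day):
    """Create <root>/<YYYY>/<YYYYMMDD>/{raw,resp} and return both paths."""
    dirpath_day = os.path.join(root, str(day.year), day.strftime("%Y%m%d"))
    dirpath_raw = os.path.join(dirpath_day, "raw")
    dirpath_resp = os.path.join(dirpath_day, "resp")
    for dirpath in (dirpath_raw, dirpath_resp):
        # exist_ok=True makes the call safe to repeat
        os.makedirs(dirpath, exist_ok=True)
    return dirpath_raw, dirpath_resp


with tempfile.TemporaryDirectory() as tmp_root:
    raw_dir, resp_dir = build_day_dirs(tmp_root, date(2012, 7, 26))
    # paths relative to the root, to show the layout only
    raw_rel = os.path.relpath(raw_dir, tmp_root)
    resp_rel = os.path.relpath(resp_dir, tmp_root)
    created = os.path.isdir(raw_dir) and os.path.isdir(resp_dir)
    print(raw_rel, resp_rel, created)
```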

Data Selection

obspy’s FDSN mass downloader takes a Domain instance and a Restrictions instance to identify the data to request from the data center.

[4]:
# Geographical restrictions
domain = mass_downloader.RectangularDomain(
    minlatitude=40.60,
    maxlatitude=40.76,
    minlongitude=30.20,
    maxlongitude=30.44,
)

# Time and station restrictions
restrictions = mass_downloader.Restrictions(
    starttime=DATE - DATA_BUFFER_SEC,
    endtime=DATE + timedelta(days=1.) + DATA_BUFFER_SEC,
    network="YH",
    location="*",
    channel="BH*,HH*",
    station="SAUV,SPNC,DC08,DC07,DC06,DD06,DE07,DE08",
    reject_channels_with_gaps=False,
    minimum_length=0.0,
    minimum_interstation_distance_in_m=500.0,
    channel_priorities=["HH[ZNE]", "BH[ZNE]"],
)

# Downloader instance
downloader = mass_downloader.MassDownloader(providers=["IRIS"])
[2023-07-06 14:30:07,633] - obspy.clients.fdsn.mass_downloader - INFO: Initializing FDSN client(s) for IRIS.
[2023-07-06 14:30:07,941] - obspy.clients.fdsn.mass_downloader - INFO: Successfully initialized 1 client(s): IRIS.
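
The time window set in the restrictions above is the target day padded by the buffer on both sides. As a quick check, the sketch below recomputes the window with the standard library only (plain `datetime` objects stand in for `obs.UTCDateTime`; the variable names are illustrative).

```python
from datetime import datetime, timedelta

day_start = datetime(2012, 7, 26)
buffer_sec = 500.0

# the requested window is the full day plus the buffer on each side
window_start = day_start - timedelta(seconds=buffer_sec)
window_end = day_start + timedelta(days=1) + timedelta(seconds=buffer_sec)
window_length_sec = (window_end - window_start).total_seconds()

print(window_start.isoformat())
print(window_end.isoformat())
print(window_length_sec)
```

The start and end times obtained this way, 23:51:40 on July 25 and 00:08:20 on July 27, are the ones that appear in the names of the downloaded miniSEED files.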

Download the Data

[5]:
# don't hesitate to run this cell several times to make
# sure all the available data were downloaded:
# some requests may fail because the data center is busy
downloader.download(
    domain,
    restrictions,
    mseed_storage=dirpath_raw_waveforms,
    stationxml_storage=dirpath_resp_files,
    threads_per_client=4,
)
[2023-07-06 14:30:10,708] - obspy.clients.fdsn.mass_downloader - INFO: Total acquired or preexisting stations: 0
[2023-07-06 14:30:10,709] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Requesting reliable availability.
[2023-07-06 14:30:11,159] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully requested availability (0.45 seconds)
[2023-07-06 14:30:11,161] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Found 8 stations (24 channels).
[2023-07-06 14:30:11,199] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Will attempt to download data from 8 stations.
[2023-07-06 14:30:11,205] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Status for 24 time intervals/channels before downloading: NEEDS_DOWNLOADING
[2023-07-06 14:31:33,072] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 3 channels (of 3)
[2023-07-06 14:32:02,464] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 3 channels (of 3)
[2023-07-06 14:32:10,280] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 3 channels (of 3)
[2023-07-06 14:32:11,203] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 3 channels (of 3)
[2023-07-06 14:33:22,927] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 1 channels (of 1)
[2023-07-06 14:33:25,448] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 1 channels (of 1)
[2023-07-06 14:33:34,430] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 3 channels (of 3)
[2023-07-06 14:33:59,062] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 3 channels (of 3)
[2023-07-06 14:34:35,089] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 1 channels (of 1)
[2023-07-06 14:35:25,673] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded 3 channels (of 3)
[2023-07-06 14:35:25,674] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Launching basic QC checks...
[2023-07-06 14:35:26,452] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Downloaded 229.3 MB [746.52 KB/sec] of data, 0.0 MB of which were discarded afterwards.
[2023-07-06 14:35:26,453] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Status for 24 time intervals/channels after downloading: DOWNLOADED
[2023-07-06 14:35:26,804] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.DC07.xml'.
[2023-07-06 14:35:26,809] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.DC06.xml'.
[2023-07-06 14:35:26,811] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.DC08.xml'.
[2023-07-06 14:35:27,044] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.DE07.xml'.
[2023-07-06 14:35:27,047] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.DD06.xml'.
[2023-07-06 14:35:27,051] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.DE08.xml'.
[2023-07-06 14:35:27,282] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.SPNC.xml'.
[2023-07-06 14:35:27,351] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully downloaded '../BPMF_data/2012/20120726/resp/YH.SAUV.xml'.
[2023-07-06 14:35:27,386] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Downloaded 8 station files [2.0 MB] in 0.9 seconds [2307.39 KB/sec].
[2023-07-06 14:35:27,387] - obspy.clients.fdsn.mass_downloader - INFO: ============================== Final report
[2023-07-06 14:35:27,388] - obspy.clients.fdsn.mass_downloader - INFO: 0 MiniSEED files [0.0 MB] already existed.
[2023-07-06 14:35:27,389] - obspy.clients.fdsn.mass_downloader - INFO: 0 StationXML files [0.0 MB] already existed.
[2023-07-06 14:35:27,391] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Acquired 24 MiniSEED files [229.3 MB].
[2023-07-06 14:35:27,391] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Acquired 8 StationXML files [2.0 MB].
[2023-07-06 14:35:27,392] - obspy.clients.fdsn.mass_downloader - INFO: Downloaded 231.3 MB in total.
[5]:
{'IRIS': <obspy.clients.fdsn.mass_downloader.download_helpers.ClientDownloadHelper at 0x7f6ec5a83ee0>}
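
Re-running the download cell by hand can also be automated. The following is only a sketch of one possible stopping rule: repeat the request until the number of files on disk stops growing. `run_until_stable` is a hypothetical helper, and the data center is replaced by a toy function so the block runs offline.

```python
def run_until_stable(download_once, count_files, max_attempts=5):
    """Call download_once() until count_files() stops increasing."""
    n_previous = count_files()
    for attempt in range(1, max_attempts + 1):
        download_once()
        n_current = count_files()
        if n_current == n_previous:
            # nothing new arrived during this attempt: stop retrying
            return attempt, n_current
        n_previous = n_current
    return max_attempts, n_previous


# toy stand-in for a data center that delivers files in three batches
batches = [10, 8, 6]
store = []


def fake_download():
    if batches:
        store.extend(range(batches.pop(0)))


attempts, n_files = run_until_stable(fake_download, lambda: len(store))
print(attempts, n_files)
```

In the notebook, `download_once` could wrap the `downloader.download(...)` call and `count_files` could count the entries returned by `glob.glob` in the raw waveform folder. Note that this rule needs one extra attempt to confirm that nothing new arrives, and that it stops early if a whole attempt fails.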
[7]:
# check the files in dirpath_raw_waveforms
glob.glob(os.path.join(dirpath_raw_waveforms, "*"))
[7]:
['../BPMF_data/2012/20120726/raw/YH.DC08..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC07..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SAUV..HHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC08..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SPNC..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE08..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE07..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE07..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DD06..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DD06..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC06..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC06..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC06..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE08..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE08..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE07..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DD06..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC07..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SAUV..HHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC08..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC07..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SAUV..HHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SPNC..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SPNC..BHZ__20120725T235140Z__20120727T000820Z.mseed']
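
The file names listed above follow the pattern `NET.STA.LOC.CHA__start__end.mseed`, so the channels available for each station can be summarized from the names alone. The sketch below assumes that pattern holds for every file; `parse_mseed_name` and `channels_per_station` are hypothetical helpers, and the example paths are a subset copied from the listing.

```python
import os
from collections import defaultdict


def parse_mseed_name(path):
    """Split 'NET.STA.LOC.CHA__start__end.mseed' into its SEED codes."""
    basename = os.path.basename(path)
    seed_id = basename.split("__")[0]
    network, station, location, channel = seed_id.split(".")
    return network, station, location, channel


def channels_per_station(paths):
    """Map 'NET.STA' to the sorted list of channel codes found in paths."""
    summary = defaultdict(list)
    for path in paths:
        network, station, _, channel = parse_mseed_name(path)
        summary[f"{network}.{station}"].append(channel)
    return {sta: sorted(chans) for sta, chans in summary.items()}


suffix = "__20120725T235140Z__20120727T000820Z.mseed"
example_paths = [
    "../BPMF_data/2012/20120726/raw/YH.DC08..BHZ" + suffix,
    "../BPMF_data/2012/20120726/raw/YH.SAUV..HHN" + suffix,
    "../BPMF_data/2012/20120726/raw/YH.DC08..BHE" + suffix,
    "../BPMF_data/2012/20120726/raw/YH.SAUV..HHE" + suffix,
    "../BPMF_data/2012/20120726/raw/YH.DC08..BHN" + suffix,
    "../BPMF_data/2012/20120726/raw/YH.SAUV..HHZ" + suffix,
]

summary = channels_per_station(example_paths)
for station_id, channels in sorted(summary.items()):
    print(station_id, channels)
```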

Clean Up the Downloaded Data

Even though mass_downloader.Restrictions handles channel priorities, failed data requests can cause the same component to be downloaded on more than one channel, for example both HH and BH (check the file list in the output of the previous cell). To avoid duplicated data and to keep the memory footprint low, we now remove these duplicates.

[8]:
from fnmatch import filter as fnfilter
[9]:
# keep only HH channels when HH and BH channels were downloaded
data_list = glob.glob(os.path.join(dirpath_raw_waveforms, "*mseed"))
station_list = glob.glob(os.path.join(dirpath_resp_files, "*xml"))
for sta in station_list:
    sta_id, _ = os.path.splitext(os.path.basename(sta))
    print(sta_id)
    BH_channels = fnfilter(data_list, os.path.join(dirpath_raw_waveforms, f"{sta_id}.*.BH*__*.mseed"))
    for fname in BH_channels:
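        # build the name of the matching HH file from the basename
        # (slicing the full path is off by one because of the path separator)
        net, sta_code, loc, cha_rest = os.path.basename(fname).split(".", 3)
        hh_basename = ".".join([net, sta_code, loc, "HH" + cha_rest[2:]])
        if os.path.isfile(os.path.join(dirpath_raw_waveforms, hh_basename)):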
            # the HH channel exists as well
            # get rid of the BH channel
            os.remove(fname)
data_list = glob.glob(os.path.join(dirpath_raw_waveforms, "*mseed"))
YH.DC07
YH.DC08
YH.DC06
YH.DD06
YH.SPNC
YH.DE08
YH.SAUV
YH.DE07
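
Before deleting anything, the selection rule can be tried on a list of file names as a dry run. The sketch below works on bare file names (no directories) and removes nothing from disk. `bh_duplicates` is a hypothetical helper, and the SAUV BH file in the example is made up for illustration, since no such file was downloaded in this run.

```python
def bh_duplicates(filenames):
    """Return the BH files whose HH counterpart is also in `filenames`."""
    available = set(filenames)
    to_remove = []
    for name in filenames:
        # NET.STA.LOC.<channel and time stamps>
        network, station, location, rest = name.split(".", 3)
        if not rest.startswith("BH"):
            continue
        hh_name = ".".join([network, station, location, "HH" + rest[2:]])
        if hh_name in available:
            # same component exists at the HH rate: the BH file is redundant
            to_remove.append(name)
    return to_remove


suffix = "__20120725T235140Z__20120727T000820Z.mseed"
names = [
    "YH.SAUV..BHZ" + suffix,  # made-up duplicate, for illustration only
    "YH.SAUV..HHZ" + suffix,
    "YH.DC08..BHZ" + suffix,
]

redundant = bh_duplicates(names)
print(redundant)
```

Only the BH file of the station that also has an HH file is flagged; the BH file of DC08 is kept because it has no HH counterpart.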
[10]:
data_list
[10]:
['../BPMF_data/2012/20120726/raw/YH.DC08..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC07..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SAUV..HHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC08..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SPNC..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE08..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE07..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE07..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DD06..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DD06..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC06..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC06..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC06..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE08..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE08..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DE07..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DD06..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC07..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SAUV..HHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC08..BHN__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.DC07..BHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SAUV..HHZ__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SPNC..BHE__20120725T235140Z__20120727T000820Z.mseed',
 '../BPMF_data/2012/20120726/raw/YH.SPNC..BHZ__20120725T235140Z__20120727T000820Z.mseed']

Plot the Raw Data

[11]:
traces = obs.Stream()
for fname in data_list:
    traces += obs.read(fname)
traces
[11]:
24 Trace(s) in Stream:

YH.DC08..BHZ | 2012-07-25T23:51:40.000000Z - 2012-07-27T00:08:20.000000Z | 50.0 Hz, 4370001 samples
...
(22 other traces)
...
YH.SPNC..BHZ | 2012-07-25T23:51:40.000000Z - 2012-07-27T00:08:20.000000Z | 50.0 Hz, 4370001 samples

[Use "print(Stream.__str__(extended=True))" to print all Traces]
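
The number of samples reported for each trace can be cross-checked against the requested window. This is a small arithmetic sketch that assumes a gap-free trace sampled at exactly 50 Hz; the variable names are illustrative.

```python
sampling_rate_hz = 50.0
day_sec = 24 * 3600
buffer_sec = 500.0

# one full day plus the buffer before and after
duration_sec = day_sec + 2 * buffer_sec
# both end points are included, hence the extra sample
n_samples = int(round(duration_sec * sampling_rate_hz)) + 1
print(n_samples)  # 4370001
```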
[12]:
%config InlineBackend.figure_formats = ["svg"]

fig = traces.plot(equal_scale=False)
[Figure: output of traces.plot(equal_scale=False), image tutorial_notebooks_1_download_data_16_0.svg]