ITER demonstrates fast data transfer to Japan and the US
The ability to move experimental data instantly and securely will determine how thousands of researchers participate in near-real time when the machine comes online in the 2030s.
While Member scientists around the world will follow ITER experiments from their home countries, they will not be able to operate plant systems remotely. They will, however, be able to analyze data within seconds of an experiment and provide feedback to operators.
“It’s a kind of indirect participation,” explains Denis Stepanov, Computing Coordinating Engineer in ITER’s Control Program. “We quickly extract scientific data from the plant network and make it widely available so researchers can run calculations and feed results back during operations.”
Building the global data backbone
At the heart of this arrangement is the onsite Scientific Data and Computing Centre and its backup data centre in Marseille, about 50 km away. The Marseille centre has a dual purpose: it holds a redundant copy of all data generated by ITER and will serve as the distribution point to partners worldwide.
“By locating our backup and distribution hub in Marseille, we can protect the master data stored at ITER while providing high-speed, secure access for our international partners,” says Peter Kroul, Computing Center Officer at ITER.
The Cadarache site is connected to the Marseille centre via a redundant pair of dedicated 400 Gbps lines. In turn, the centre is connected, via the French network RENATER, to the pan-European GÉANT, which provides access to other research and education networks—including ESnet (USA) and SINET (Japan). This overall structure ensures that, even during intensive experimental campaigns, data can move at full speed while the primary plant network remains isolated and protected.
To move terabytes of data efficiently across 10,000 kilometres of fibre optics, the team needs software and hardware that can handle diverse systems without running the risk of vendor lock-in. “We cannot dictate what technologies our partners use on their side,” says Kroul. “So we built something flexible—able to connect to whatever they have, while still achieving high parallelization and efficiency even on high-latency links.”
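Latency is the crux of that efficiency problem: a single transfer stream can only keep as much data "in flight" as its window allows, so its throughput is capped by the window size divided by the round-trip time. The back-of-the-envelope Python sketch below uses an assumed round-trip figure for a France-Japan path (not a value measured in the ITER tests) to show why many parallel streams are needed to fill a 100 Gbps link.

```python
# Back-of-the-envelope only: why long, high-latency links need parallel streams
# or very large windows. The RTT is an assumed figure for a France-Japan path,
# not a measured value from the ITER tests.

def bdp_bytes(bandwidth_gbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes that must be 'in flight' to keep the pipe full."""
    return (bandwidth_gbps * 1e9 / 8) * (rtt_ms / 1e3)

link_gbps = 100    # one of the 100 Gbps links used in the Japan campaign
rtt_ms = 250       # assumed round-trip time between southern France and Rokkasho

print(f"BDP: {bdp_bytes(link_gbps, rtt_ms) / 1e9:.2f} GB in flight to saturate the link")

# A single stream with, say, a 64 MB window tops out at window / RTT,
# far below 100 Gbps, so many streams must run in parallel.
window_bytes = 64 * 2**20
stream_gbps = window_bytes * 8 / (rtt_ms / 1e3) / 1e9
print(f"One 64 MB-window stream: ~{stream_gbps:.1f} Gbps; "
      f"streams needed: ~{link_gbps / stream_gbps:.0f}")
```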
The result is ITER.sync, a high-performance, open-source-based data-replication framework developed at the Scientific Data and Computing Centre. Drawing on the principles of rsync but heavily optimized, ITER.sync automatically parallelizes data streams, tunes network parameters, and maintains near-saturation speeds even over long-distance connections where latency is high.
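ITER.sync itself is not publicly documented in detail, but the general technique it builds on, splitting large files into byte ranges and pushing them over many concurrent streams, can be sketched in a few lines. The Python below is purely illustrative: the chunk size, stream count, and the send_range() helper are assumptions standing in for whatever transport the real framework uses.

```python
# Illustrative sketch of the chunk-and-parallelize technique behind tools like
# ITER.sync. This is NOT the actual ITER.sync code: the chunk size, stream count,
# and send_range() transport are assumptions.

import os
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 256 * 2**20   # 256 MB per chunk (illustrative tuning parameter)
N_STREAMS = 16             # parallel workers; real tools size this to the link's BDP

def send_range(path: str, offset: int, length: int) -> int:
    """Hypothetical transport: read one byte range and hand it to a network stream."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    # ... in a real tool, `data` would be written to one of N_STREAMS open connections ...
    return len(data)

def parallel_transfer(path: str) -> int:
    """Split a file into byte ranges and transfer them concurrently."""
    size = os.path.getsize(path)
    ranges = [(off, min(CHUNK_SIZE, size - off)) for off in range(0, size, CHUNK_SIZE)]
    with ThreadPoolExecutor(max_workers=N_STREAMS) as pool:
        sent = pool.map(lambda r: send_range(path, *r), ranges)
        return sum(sent)

if __name__ == "__main__":
    # Hypothetical pulse file; any large local file would do for a dry run.
    print(parallel_transfer("/tmp/shot_000123.h5"), "bytes read and 'sent'")
```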
ITER.sync was designed to operate alongside tools already used by some of the partners, such as the Massively Multi-Connection File Transfer Protocol (MMCFTP) developed by Japan's National Institute of Informatics (NII).
Global data network put to the test
This summer, ITER engineers carried out two large-scale data-transfer campaigns: one with Japan's Remote Experimentation Centre (REC) in Rokkasho and the other with the DIII-D National Fusion Facility in San Diego (United States). For the tests, ITER simulated its projected data acquisition scenarios.
The campaign in Japan, conducted from mid-August to early September, built on a 2016 demonstration that reached 10 Gbps, the maximum bandwidth available at the time. The new tests achieved two simultaneous 100 Gbps links, a twenty-fold increase. Engineers demonstrated continuous throughput, multi-path transfers, and resilience by simulating a submarine-cable outage between Marseille and Rokkasho. Both ITER.sync and MMCFTP were used in the tests, providing valuable insight into data transfer strategies and the specific tuning required for long-distance transfers.
It is expected that only a fraction of the data will be needed in near-real time by remote experimentalists. This data will be transferred as soon as it reaches primary storage. The bulk of the data, however—which needs to be available for off-line analysis—will be transferred via quiet overnight syncs. This second scenario was also tested.
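In practical terms, the two scenarios amount to a simple routing rule on the replication side: datasets flagged for near-real-time analysis are pushed the moment they reach primary storage, while everything else waits for the overnight window. The sketch below illustrates that split; the priority flag, the 22:00 window, and the replicate_now() call are illustrative assumptions rather than ITER's actual configuration.

```python
# Sketch of the two transfer scenarios: datasets flagged for near-real-time remote
# analysis are replicated as soon as they reach primary storage, the rest are queued
# for the overnight bulk sync. The priority flag, the 22:00 window, and
# replicate_now() are illustrative assumptions, not ITER's actual configuration.

from dataclasses import dataclass, field
from datetime import datetime, time

@dataclass
class Dataset:
    path: str
    priority: bool                      # needed by remote experimentalists within seconds

def replicate_now(ds: Dataset) -> None:
    print(f"replicating {ds.path}")     # stand-in for the actual transfer call

@dataclass
class ReplicationScheduler:
    overnight_queue: list = field(default_factory=list)

    def on_new_dataset(self, ds: Dataset) -> None:
        if ds.priority:
            replicate_now(ds)                 # push immediately over the high-speed links
        else:
            self.overnight_queue.append(ds)   # defer to the quiet overnight window

    def run_overnight_sync(self, now: datetime) -> None:
        if now.time() >= time(22, 0):   # assumed start of the quiet window
            for ds in self.overnight_queue:
                replicate_now(ds)
            self.overnight_queue.clear()
```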
“The key was to test not just network speed but the whole chain—hardware, software, and reliability,” says Stepanov. “Building the technical link is one challenge, but coordinating with all the network providers across Europe and Asia is just as complex. It takes time, alignment, and trust.”
In parallel, the ITER computing centre completed its full-scale data challenge with ESnet and the DIII-D fusion facility at General Atomics in San Diego (United States), supported by a trans-Atlantic link operated at 100 Gbps. Over ten full-scale runs, the teams achieved consistent end-to-end performance close to the link’s theoretical maximum. The test also demonstrated interoperability between ITER’s IBM Spectrum Scale storage and DIII-D’s BeeGFS-based Science DMZ infrastructure, again confirming ITER.sync’s ability to bridge heterogeneous environments.
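That interoperability is less surprising than it may sound: both IBM Spectrum Scale and BeeGFS present ordinary POSIX file systems to the applications above them, so a replication tool that works in terms of paths, byte ranges, and checksums does not need to know which parallel file system sits underneath. A minimal illustration, with hypothetical paths, is an end-to-end checksum comparison like the one sketched below.

```python
# Why heterogeneous storage interoperates: both IBM Spectrum Scale and BeeGFS expose
# ordinary POSIX file systems, so an end-to-end integrity check only needs paths and
# checksums. The paths below are hypothetical.

import hashlib
from pathlib import Path

def sha256_of(path: Path, block_size: int = 8 * 2**20) -> str:
    """Checksum a file in blocks, independent of the file system underneath."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(block_size):
            digest.update(chunk)
    return digest.hexdigest()

# The same function runs unchanged on the Spectrum Scale mount at ITER and the
# BeeGFS mount at DIII-D; equal digests confirm the replica matches the source.
source  = Path("/gpfs/iter/shots/000123/magnetics.h5")       # hypothetical ITER-side path
replica = Path("/beegfs/diii-d/iter/000123/magnetics.h5")    # hypothetical DIII-D-side path
# print(sha256_of(source) == sha256_of(replica))
```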
“These results show that ITER’s international data ecosystem will scale and be ready for the operations we will face in the 2030s,” says Kroul. “We can already, with current technology, ensure that scientific data moves efficiently and reliably between ITER and partner institutions worldwide.”