Idemitsu Kosan and GRID complete world’s first POC applying deep reinforcement learning to vessel routing plan optimization


Idemitsu Kosan Co., Ltd (HQ: Tokyo-to Chiyoda-ku, Representative: Shunichi Kito, Trade name: Idemitsu Showa Shell, herein “Idemitsu Kosan”) and GRID Inc. (HQ: Tokyo-to Minato-ku, Representative: Masaru Sogabe, herein “GRID”), along with MITSUI & CO., LTD. (HQ: Tokyo-to Chiyoda-ku, Representative: Tatsuo Yasunaga, herein “Mitsui & Co.”) have announced the successful completion of the proof-of-concept (POC) experiment for their plan to apply deep reinforcement learning AI technology to the optimization of coastal ship routing.

This POC aimed to use AI techniques to both automate and optimize the creation of vessel routing plans, which until now has depended entirely on the knowledge of skilled workers with significant experience in the petroleum industry. The experiment included the creation of a simulator that recreates marine delivery of petroleum from refineries to oil tanks, and of an AI model that optimizes the vessel routing plan.

The vessel routing plan produced by the AI model demonstrated a stable supply of petroleum, as well as an increase in delivery efficiency of up to 20% [1] when compared to actual past data. Based on this result, the companies expect to achieve reductions in shipping costs, standardization of vessel route planning, and reductions in fuel consumption, leading to a lower environmental impact. Additionally, the experiment reduced the time required for route planning by a factor of 60, from approximately 1 month down to a few minutes [2].

This will significantly help vessel route planners, allowing them to rely on this technology to easily compare multiple routing plans and select the optimal one. The optimization model factored in numerous constraints and conditions, such as vessel operation efficiency [3], stowage balance, time at sea, cargo handling time, and overall vessel operation time. The plans generated by the AI model were also confirmed to be feasible in reviews by veteran route planners and ship operation companies.

The deep reinforcement learning technology at the core of this POC has previously been applied with great success to games such as Go and chess, but is often considered difficult to apply to real-world problems that contain extremely large numbers of combinations. In particular, this POC handled a route optimization problem with 10^800 possible combinations, significantly larger than the 10^360 possible combinations in a game of Go.

To select an optimal route from this vast solution space, this POC used a combination of algorithms in addition to the deep reinforcement learning mentioned above. The companies involved in this project expect this unprecedented achievement to be the first step in optimizing the entire supply chain. As the next step, they plan to validate the system with an increased number of refineries, oil tanks and vessels, and to proceed with system development. They plan to begin production system operation in 2021.

[1] This was the largest improvement from Idemitsu Kosan’s POC test environment.

[2] Time required varies depending on the computational environment. The shortest time is listed here.

[3] Calculated as the product of total operation time, cargo ratio and ratio of time with cargo.
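
As an illustration of the vessel operation efficiency figure described in note 3, the product can be sketched as follows. This is only a minimal sketch: the function name, parameter names, the choice of hours as the time unit, and the sample values are assumptions made for illustration, since the announcement does not state units or exact definitions.

```python
def vessel_operation_efficiency(total_operation_hours, cargo_ratio, loaded_time_ratio):
    """Product of total operation time, cargo ratio, and ratio of time with cargo.

    All names and units here are hypothetical; the announcement only
    states that the metric is the product of these three quantities.
    """
    if total_operation_hours < 0:
        raise ValueError("total_operation_hours must be non-negative")
    # Both ratios are assumed to be fractions between 0 and 1.
    for name, value in (("cargo_ratio", cargo_ratio),
                        ("loaded_time_ratio", loaded_time_ratio)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return total_operation_hours * cargo_ratio * loaded_time_ratio


# Hypothetical vessel: 600 hours of operation, loaded to 75% of capacity,
# and carrying cargo for half of its operating time.
print(vessel_operation_efficiency(600, 0.75, 0.5))
```

Under this reading, a plan scores higher when a vessel sails fuller or spends a larger share of its operating time carrying cargo.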
