
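LNG Terminal Clustering

LNG Terminals

Liquefied Natural Gas (LNG)

Natural gas is a gaseous fossil fuel composed mainly of methane. It is found mainly in oil fields and natural gas fields, with smaller amounts occurring in coal seams. Its main component is methane $(CH_4)$, together with some heavier hydrocarbons such as ethane $(C_2H_6)$, propane $(C_3H_8)$ and butane $(C_4H_{10})$, acid gases such as carbon dioxide $(CO_2)$ and hydrogen sulfide $(H_2S)$, and other impurities such as dust and organic sulfur compounds. Natural gas is colorless and odorless, so a mercaptan is usually added as an odorant to help detect leaks. It is not toxic and is essentially harmless to humans, but at high concentrations it can still cause asphyxiation.

Natural gas emerged as a separate branch of the global energy supply sector in the early 1960s, and its influence in the energy field has grown steadily since. It now accounts for 22% of global energy supply. Economic, political and ecological factors have together driven this rapid growth: Australia aims to ride the rapid economic growth of Asia and develop its maritime trade; the United States aims to reduce its dependence on foreign oil and energy supplies and develop its own resources; and there is a worldwide effort to cut the gas emissions of fossil fuels. Compared with diesel, using liquefied natural gas (LNG) as a marine fuel reduces nitrogen oxides by about 90% and carbon dioxide by up to 20%, and avoids sulfur dioxide and fine particulates entirely [1]. Det Norske Veritas (DNV), the Norwegian classification society, therefore forecast about 1,000 LNG-powered ships by 2020, nearly 15% of expected newbuild orders, a shift driven largely by the sharp fall in natural gas prices that followed global shale gas production. Using natural gas involves difficulties in transport and storage. Depending on the boundary conditions, pipeline transport is usually economical over distances of 4,000-5,000 km. Where geography is difficult, for example when supplying islands (such as Japan and Taiwan) or crossing mountain ranges, pipelines become hard to build and far more expensive. By the middle of the 20th century, liquefying natural gas and shipping it over long distances had therefore begun to take shape.

Liquefied natural gas (LNG) is natural gas that has been cooled into liquid form so that it can be stored or transported conveniently and safely without pressurization. Its volume is about 1/600 of that of the same mass of natural gas at standard conditions. LNG is odorless, colorless, non-toxic and non-corrosive; its hazards are flammability after vaporization, freezing and asphyxiation. Liquefaction first removes certain components such as dust, acid gases, helium, water and heavy hydrocarbons, then cools the gas to about −162 °C (−260 °F), where it condenses to a liquid at close to atmospheric pressure. The maximum transport pressure is set at around 25 kPa (4 psi) [2].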

image-20211209185855527

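Figure 1.1 Regional distribution of natural gas potential

LNG Vessels, Terminals and the Maritime Transport System

In 1956 Constock Liquid Methane Corporation was founded, with Continental Oil Company of the United States holding 60% of the shares and William Wood Prince's Union Stock Yards company holding 40%. The project was carried out together with the British Gas Council, with the aim of building an LNG tanker and testing LNG storage tanks. Constock was responsible for the engineering. To save time and money, the company decided to convert an existing ship and to concentrate on the main issue, the development of low-temperature LNG tanks. The design team finally chose aluminium and 9% nickel steel as tank materials, both of which performed well in tests. Meanwhile Gamble Brothers, a specialist company from Louisville, Kentucky, carried out further development work on balsa wood insulation and designed a scheme in which the insulation blocks direct contact with the LNG, and Arthur D. Little of Cambridge, Massachusetts designed the cargo tanks. On 25 July 1959 the "Methane Pioneer", fully loaded with LNG, set sail across the Atlantic to the United Kingdom. The voyage greatly broadened the designers' knowledge base and proved that shipping LNG by sea was feasible. From then on, industrial-scale operation was unstoppable.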

image-20211209192548867

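Figure 1.2 LNG Vessel structure

In the 1980s, planning began for a dedicated LNG receiving terminal at the Jadebusen near Wilhelmshaven, Germany. The terminal's owner was Deutsche Flüssigerdgas Terminal Gesellschaft (DFTG), whose majority shareholder was Ruhrgas AG (later E.ON). The plan comprised three storage tanks, each with an LNG capacity of 80,000 m³; the terminal was designed for an LNG intake of 120,000 m³/h and a regasification capacity of 1.2 million m³/h. The chosen tank system has an open-top inner container and a closed prestressed-concrete outer container, which is protected from direct contact with LNG by a layer of polyurethane foam covering the base slab and the full height of the wall. The inner tank is 62 m in diameter and 28 m high; the outer tank is 66 m in diameter with an overall height of 41 m. The system was designed for a working pressure of 200 mbar, with the safety valves opening at 300 mbar. Linde was responsible for process engineering, Noell for the steel inner container and DYWIDAG for the concrete outer container. The design proved successful, and over the following decades the LNG market expanded rapidly in stages: either the capacity of an export or receiving terminal increased, or a new country became an exporter or importer [3].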

image-20211210032020138

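Figure 1.3 LNG Receiving Terminal

The LNG transport chain consists of the five parts shown in Figure 1.4. The most important part of the whole process, and the part this experiment focuses on, is transport by sea: this covers the navigation status information of LNG vessels and the distribution of LNG receiving and export terminals.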

image-20211210025845603

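Figure 1.4 LNG Chain

LNG Terminals

LNG terminals are divided into export terminals / liquefaction plants and receiving terminals. They are the endpoints of the maritime LNG transport system and the places where shipping routes and berths converge.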

image-20211211130400469

Figure 1.5 Australia Gladstone LNG Export Terminal

image-20211211130153109

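Figure 1.6 Japan Chita LNG Receiving Terminal

Current State of LNG Development [4]

The 2021 World LNG Report published by the IGU shows that at the end of 2020, including the 35 new vessels delivered in 2020, the global LNG carrier fleet consisted of 572 active vessels, among them 37 floating storage and regasification units (FSRUs) and 4 floating storage units (FSUs). The fleet grew by 7% compared with 2019, while the number of LNG voyages grew by only 1%, below expectations, mainly because of the demand disruption caused by the COVID-19 pandemic. The report also expects 64 new LNG carriers to enter service in 2021. In addition, it presents the main routes of the maritime LNG transport network, shown in Figure 1.8.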

image-20211210032417795 image-20211210033655022
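Figure 1.7 LNG Vessels information by the end of 2020    Figure 1.8 Major LNG Shipping Routes, 2020  

As of February 2021, global LNG regasification capacity reached a new peak of 850.1 million tonnes per annum (MTPA). Driven by strong demand for natural gas, global LNG receiving capacity continues to grow. The expansion of import capacity was driven mainly by existing LNG markets, including China, India, Chinese Taipei, the United States (Puerto Rico) and Brazil. New import markets appeared for the first time since 2018: Myanmar added its first regasification terminals in 2020 and Croatia in early 2021. In 2020, four new terminals and four expansion projects at existing terminals were completed, adding 19.0 MTPA of global regasification capacity.

Most of the LNG receiving capacity added in 2020 came from the Asian and Asia-Pacific markets; India and Myanmar each added a new terminal, demonstrating the region's marked growth. In late 2020 and early 2021, two new terminals in Brazil and Croatia came into service, and floating regasification terminals are also increasing. Mature import markets will drive most of the growth in regasification capacity, especially in Asia, where China and India are building a large number of regasification projects to support strong gas demand. In the near future, many new LNG importers, including Ghana, El Salvador, Cyprus and Nicaragua, are also expected to add substantially to regasification capacity. All of these markets are in the late stages of building their first LNG receiving terminal, and these terminals are planned to come online within the next two years. Several other new markets, including Côte d'Ivoire, Morocco and Germany, also plan to add regasification capacity. However, many of these markets have experienced delays in project development because of challenges such as financing and regulations on infrastructure construction. Despite these challenges, as LNG-to-power projects continue to develop, the global LNG market will keep adding one or two new LNG importers each year.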

image-20211210032417795 image-20211210033655022
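 Figure 1.9 LNG Receiving Terminals information  Figure 1.10 Global LNG Receiving Terminal Locations

Experimental Environment

  • Windows 10 Professional 18363.1474 OS
  • Intel(R) Core(TM) i7-9750H CPU @ 2.60GHz 2.59GHz
  • 16GB RAM
  • Windows command line
  • g++ ( x86_64-posix-seh-rev0, Built by MinGW-W64 project ) 8.1.0 with
    • -std=c++11 as the language standard
    • -lpthread for the threading library

Notes:

  • Some figures in this report were produced with visualization tools such as Power BI, GitHub graphics libraries and Google Earth (none of them is needed to run the code). The visualization of the clustering results is built with HTML/JavaScript; the author's code is in the folder Project/Visualization Code and can be run in a local browser.
  • The clustering algorithm in this experiment is implemented in the author's own C++ code, which includes high-performance multithreaded I/O. It only needs to be compiled and run from the command line in a standard C++ environment; no dependency packages have to be installed.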

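Data Preprocessing

Inspecting the data set lng2.csv gives the following data format:

| MMSI | Unix timestamp (s) | AIS Navigation Status | Velocity (knot) | Longitude | Latitude | Draft |
| --- | --- | --- | --- | --- | --- | --- |
| 205421000 | 1620717008 | 0 | 10 | -92.663696 | 24.532267 | 0 |
| 205421000 | 1620717669 | 0 | 9 | -92.665520 | 24.534849 | 90 |
| 205421000 | 1620719003 | 0 | 10 | -92.669563 | 24.539316 | 0 |
| ... | ... | ... | ... | ... | ... | ... |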

Data Description

  • MMSI: the Maritime Mobile Service Identity is a nine-digit number sent by a ship's radio communication system over its radio channel. It uniquely identifies individual stations and group-call stations.

  • Unix timestamp: the number of seconds elapsed since 1 January 1970 (00:00 UTC/GMT), not counting leap seconds.

  • AIS Navigation Status:

    Navigation Status Description
    0 Under way using engine
    1 At anchor
    2 Not under command
    3 Restricted manoeuvrability
    4 Constrained by her draught
    5 Moored
    6 Aground
    7 Engaged in Fishing
    8 Under way sailing
    9 Reserved for future amendment of Navigational Status for HSC
    10 Reserved for future amendment of Navigational Status for WIG
    11 Reserved for future use
    12 Reserved for future use
    13 Reserved for future use
    14 AIS-SART is active
    15 Not defined (default)
  • Velocity: the vessel's speed, usually given in knots. 1 knot ≈ 0.514444 m/s.

  • Longitude: in decimal degrees (°). Positive values are E, negative values are W.

  • Latitude: in decimal degrees (°). Positive values are N, negative values are S.

  • Draft: the greatest depth of the submerged part of the hull. Different vessels have different drafts, and the draft of a single vessel also varies with its load and with the salinity of the water it is in.
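The field descriptions above can be collected into a small record type. The following is a minimal sketch, not the layout used in Cluster.h: the names `AisRecord`, `knots_to_mps`, `longitude_hemisphere` and `latitude_hemisphere` are hypothetical and only illustrate the units and sign conventions listed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record mirroring the seven CSV columns (not the struct from Cluster.h).
struct AisRecord {
    std::int32_t mmsi;       // nine-digit Maritime Mobile Service Identity
    std::int64_t unix_time;  // seconds since 1970-01-01 00:00 UTC
    int nav_status;          // AIS navigation status, 0..15
    int velocity_knots;      // speed in knots
    double longitude;        // decimal degrees, positive = E, negative = W
    double latitude;         // decimal degrees, positive = N, negative = S
    int draft;               // raw draft value as stored in the data set
};

// One knot is one nautical mile (1852 m) per hour, about 0.514444 m/s.
constexpr double kMetersPerSecondPerKnot = 1852.0 / 3600.0;

inline double knots_to_mps(double knots) {
    return knots * kMetersPerSecondPerKnot;
}

// Hemisphere letters following the sign convention of the data set.
inline char longitude_hemisphere(double lon) { return lon >= 0.0 ? 'E' : 'W'; }
inline char latitude_hemisphere(double lat) { return lat >= 0.0 ? 'N' : 'S'; }
```

With the first sample row of the table, `knots_to_mps(10)` is about 5.14 m/s and the position lies in the W and N hemispheres.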

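Noise and Anomalous Data Handling

The data were loaded into a database on a self-hosted server for testing and inspection, and visualized with C++ and Python projects and some optimization software. This gives the overall shape of the data and some classification details, shown below. (The exploratory part depends on a self-hosted MySQL database and GitHub projects and is not part of the preprocessing code actually used in this experiment, so its code is omitted here.)

image-20211210051727912

image-20211211130918720

Figures 2.1 and 2.2 Distribution of data points
  • **Noise / anomalous data 1:** Some isolated noise points and isolated anomalous points were observed. Two of them were plotted in Google Earth, with the following result:

    image-20211210051803073 Figure 2.3 Isolated anomalous point image-20211210052042407 Figure 2.4 Isolated noise point

    Solution: if the experiment ran on a server, a density-based clustering algorithm such as the DBSCAN family would be preferred over a distance-based one. Because the experiment runs on a personal computer, and efficiency and performance are important criteria, the choice has to balance memory limits and time complexity. Since both kinds of isolated points are extremely few, an optimized Mini Batch K-Means algorithm is recommended. The optimized algorithm combines the accuracy of DBSCAN with the performance of K-Means: during preprocessing, the point density within each longitude/latitude cell is computed and used as a reference for the clusters, which gave the best results in this experiment.

  • **Anomalous data 2:** Database queries show that some records have an excessively large or excessively small draft:

    image-20211210052042407 Figure 2.5 Records with excessive draft image-20211211131839074 Figure 2.6 Records with too small a draft

    Further statistics:

    Draft Number
    $x>300$ 9195
    $10 \le x \le 300$ 5430144
    $0 \le x<10$ 2457580

    The draft data have therefore been contaminated for some reason. The visualization shows that these vessels exhibit no anomalous behavior in their other fields.

    image-20211211132647507 Figure 2.6 Vessels with anomalous draft data

    **Solution:** During terminal clustering, the other fields of these records are treated as valid; during terminal type classification, which relies on draft data, these records are discarded.

  • **Anomalous data 3:** Database queries show that some vessels have an anomalous navigation status (they travel at some speed for a period of time while reporting a fixed state such as at anchor or moored).

    image-20211211134058028

    Figure 2.6 Vessels with anomalous navigation status

    **Solution:** For such vessels the longitude and latitude mostly change continuously between neighboring positions, so their navigation status is judged to be anomalous and these records are excluded whenever navigation status matters.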
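The two record-level checks described above (draft out of range, and a fixed navigation status combined with a non-zero speed) can be sketched as follows. This is an illustration under assumptions, not the code used in Cluster.cpp: the bucket limits are taken from the statistics table above, and the names `DraftClass`, `classify_draft` and `is_status_inconsistent` are hypothetical.

```cpp
#include <cassert>

// Draft buckets taken from the statistics table; the unit is whatever the data set stores.
enum class DraftClass { TooSmall, Plausible, TooLarge };

inline DraftClass classify_draft(int draft) {
    if (draft > 300) return DraftClass::TooLarge;
    if (draft >= 10) return DraftClass::Plausible;
    return DraftClass::TooSmall;  // 0 <= draft < 10 (negative input would also land here)
}

// A record is treated as status-inconsistent when it reports a fixed state
// (1 = at anchor, 5 = moored) while still moving.
inline bool is_status_inconsistent(int nav_status, int velocity_knots) {
    const bool fixed_state = (nav_status == 1 || nav_status == 5);
    return fixed_state && velocity_knots > 0;
}
```

A record flagged by `classify_draft` would still take part in clustering and only be skipped when draft is needed, while a record flagged by `is_status_inconsistent` would be skipped wherever navigation status matters.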

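The corresponding data handling is described in the point-gathering and classification steps of section 4, LNG Terminal Clustering.

LNG Terminal Clustering

Related Data Handling

AIS Navigation Status [5]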

  • 0 Under way using engine A vessel is considered to be underway when it meets the following criteria:

    • It is not aground.
    • It is not at anchor.
    • It is not attached to a dock, the shore, or any other stationary object.

    This navigational status message refers to machinery vessels in motion.

  • 1 At anchor A vessel is at anchor when it is held in position by an anchor on the bottom of a body of water, which prevents the vessel from drifting away from the desired position (e.g. while waiting for a berth, in heavy weather, receiving fuel oil, loading and unloading cargo, or for maintenance). The "at anchor" state begins when the anchor holds firmly on the seabed and the ship is kept in a certain position. The vessel is considered to be "underway" as soon as the anchor is weighed or dragged along the seabed. A vessel fixed at a dock is not at anchor; that state is called moored.

  • 2 Not under command The term “not under command” means a vessel which through some exceptional circumstance is unable to manoeuvre and is therefore unable to keep out of the way of another vessel.

  • 3 Restricted Manoeuvrability Manoeuvring characteristics include turning, yaw-checking, course-keeping and stopping abilities of the vessel. The term "restricted manoeuvrability" means the vessel is unable to keep out of the way of another vessel. It also includes:

    • A vessel engaged in laying, servicing, or picking up a navigational mark, submarine cable or pipeline.
    • A vessel engaged in dredging, surveying or underwater operations.
    • A vessel engaged in replenishment or transferring persons, provisions or cargo while underway.
    • A vessel engaged in the launching or recovery of aircraft.
    • A vessel engaged in mine clearance operations.
    • A vessel engaged in a towing operation such as severely restricts the towing vessel and her tow in their ability to deviate from their course.
  • 4 Constrained by draught A power-driven vessel which is severely restricted in the ability to deviate from the course it is following because of the draught in relation to the available depth and width of the navigable water.

  • 5 Moored Securing a vessel at a pier or elsewhere by several lines or cables to limit the movement.

    • Multi-Buoy Moorings (MBM), conventional buoy moorings – A facility whereby a tanker is usually moored by a combination of the ship anchors forward and mooring buoys aft and held on a fixed heading.
    • Single Point Mooring (SPM) – A facility whereby the tanker is secured by the bow to a single buoy or structure and is free to swing with the prevailing wind and current. Three types are common: Catenary Anchor Leg Mooring, Single Anchor Leg Mooring and Turret Mooring.
  • 6 Aground A vessel that ran aground onto or on a shore, reef, or the bottom of a body of water.

  • 7 Engaged in Fishing The term “engaged in fishing” means any vessel fishing with nets, lines, trawls or other fishing apparatus.

  • 8 Under way sailing A vessel is considered to be underway when it meets the following criteria:

    • It is not aground
    • It is not at anchor
    • It is not attached to a dock, the shore, or any other stationary object.

    This navigational status message refers to all ships using wind power.
  • 9 Reserved for future amendment of Navigational Status for HSC (high-speed craft)

  • 10 Reserved for future amendment of Navigational Status for WIG (Wing-in-ground craft)

  • 11 - 13 Reserved for future use The AIS standard describes the navigational status as a number. This number is transmitted using 4 bits, so values from 0 to 15 can be represented (2^4 possibilities). Because not all 16 values are assigned at the moment, some of them are reserved for the future and can be assigned later if necessary.

  • 14 AIS-SART is active An AIS-SART (Automatic Identification System - Search And Rescue Transmitter) is a device that sends a position-based emergency message based on the Automatic Identification System (AIS) protocol. The position and time synchronisation of the AIS-SART is done by a built-in GPS module.

    The position of the lifeboat in distress or of a person with an AIS-SART in their life jacket is transmitted via the AIS receiver or AIS transmitter/receiver to the PC or to the chart plotter as a serial protocol and is then visible on the plotter or on the PC. This allows any ship with AIS on board to initiate an immediate rescue operation, which significantly increases the chances of survival.

  • 15 Undefined = default (also used by AIS-SART, MOB-AIS and EPIRB-AIS under test)
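Since the status is a 4-bit number, the sixteen values listed above map naturally onto an enumeration. The sketch below is one possible encoding; the enumerator names and the helper functions are hypothetical, and mapping out-of-range input to `Undefined` is an assumption of this sketch rather than something the AIS standard prescribes.

```cpp
#include <cassert>
#include <cstdint>

// The sixteen 4-bit navigation status values listed above.
enum class NavStatus : std::uint8_t {
    UnderWayUsingEngine = 0,
    AtAnchor = 1,
    NotUnderCommand = 2,
    RestrictedManoeuvrability = 3,
    ConstrainedByDraught = 4,
    Moored = 5,
    Aground = 6,
    EngagedInFishing = 7,
    UnderWaySailing = 8,
    ReservedHsc = 9,
    ReservedWig = 10,
    Reserved11 = 11,
    Reserved12 = 12,
    Reserved13 = 13,
    AisSartActive = 14,
    Undefined = 15
};

// Four bits can only represent 0..15.
inline bool is_valid_nav_status(int raw) { return raw >= 0 && raw <= 15; }

// Assumption of this sketch: anything outside 0..15 is treated as Undefined.
inline NavStatus to_nav_status(int raw) {
    return is_valid_nav_status(raw) ? static_cast<NavStatus>(raw) : NavStatus::Undefined;
}

// Values 9..13 are reserved and carry no operational meaning yet.
inline bool is_reserved(NavStatus s) {
    const int v = static_cast<int>(s);
    return v >= 9 && v <= 13;
}
```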

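Point-Gathering Process

When clustering to find terminals and berths, the AIS navigation status table suggests that only records with the status "At anchor" (1) or "Moored" (5) are normally of interest. In this experiment, however, a large number of records carry the default status ("Undefined", 15). By inspection and SQL queries we obtained the following counts for all data points within the range $[22.58^{\circ} N,22.60^{\circ} N],[114.42^{\circ} E,114.44^{\circ} E]$:

Nav_status Number
0 35
5 2
15 1767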

image-20211211141909792

Figure 2.7 Related data points

These data uniquely and precisely delineate the Guangdong Dapeng receiving terminal (GuangDong Dapeng Receiving Terminal):

image-20211211142147618

图2.8 GuangDong Dapeng Receiving Station

Because the vast majority of these points have navigation status 15, this default status cannot be ignored during point gathering.
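The selection rule this leads to can be written as a small predicate. The sketch below keeps records with status 1, 5 or 15 and, as the read loop in Cluster.cpp does, additionally requires a reported speed of zero. The function names `is_berth_candidate` and `in_dapeng_box` are hypothetical, and the bounding box is simply the range quoted above.

```cpp
#include <cassert>

// Keep anchored (1), moored (5) and undefined (15) records,
// and only when the reported speed is zero.
inline bool is_berth_candidate(int nav_status, int velocity_knots) {
    const bool status_ok = (nav_status == 1 || nav_status == 5 || nav_status == 15);
    return status_ok && velocity_knots == 0;
}

// Bounding-box test for the Dapeng example; the bounds are the ones quoted in the text.
inline bool in_dapeng_box(double lat, double lon) {
    return lat >= 22.58 && lat <= 22.60 && lon >= 114.42 && lon <= 114.44;
}
```

Dropping status 15 from `is_berth_candidate` would discard 1767 of the 1804 points counted in the table above, which is why the default status has to stay in the filter.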

Clustering Algorithm (Main Part)

Mini Batch K-Means

The main idea of the Mini Batch K-Means algorithm is to use small random batches of data of a fixed size, so that they can be stored in memory. In each iteration a new random sample is drawn from the dataset and used to update the clusters, and this is repeated until convergence. Each mini batch updates the clusters using a convex combination of the values of the prototypes and the data, applying a learning rate that decreases with the number of iterations. This learning rate is the inverse of the number of data points assigned to a cluster during the process. As the number of iterations increases, the effect of new data is reduced, so convergence can be detected when no changes in the clusters occur over several consecutive iterations.

The empirical results suggest that it can obtain a substantial saving of computational time at the expense of some loss of cluster quality, but no extensive study of the algorithm has been done to measure how the characteristics of the datasets, such as the number of clusters or their size, affect the partition quality.

The algorithm takes small randomly chosen batches of the dataset for each iteration. Each data point in the batch is assigned to a cluster, depending on the previous locations of the cluster centroids. It then updates the locations of the cluster centroids based on the new points from the batch. The update is a gradient descent update, which is significantly faster than a normal Batch K-Means update [6].

**In short, the main idea of Mini Batch K-Means is to split the data into random batches and then iterate to gather points. Each iteration takes a new sample from the set of batches and uses it to update the clusters, and this is repeated until convergence. The algorithm can be combined with several other optimizations: borrowing from density-based clustering (DBSCAN), the batches can be formed by density; borrowing from dynamic clustering, K is not fixed in advance but determined by the program from the batch sizes, so that the number of clusters changes across iterations.** This approach combines the accuracy of density-based clustering with the efficiency of prototype (distance-based) clustering, and it fits the conditions of this experiment.
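The update rule described above (assign each sample using the previous centroid positions, then move the centroid by a convex combination with learning rate 1/count) can be sketched as a single step. This illustrates the textbook mini-batch update only; it is not the density-batched variant implemented in Cluster.cpp, and the names `Point`, `Centroid`, `nearest` and `mini_batch_step` are hypothetical. Drawing the random batch is left to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

struct Point { double x, y; };

struct Centroid {
    Point c;            // current prototype position
    std::size_t count;  // number of samples assigned so far
};

inline double sq_dist(const Point& a, const Point& b) {
    const double dx = a.x - b.x, dy = a.y - b.y;
    return dx * dx + dy * dy;
}

// Index of the centroid closest to p (cs must be non-empty).
inline std::size_t nearest(const std::vector<Centroid>& cs, const Point& p) {
    assert(!cs.empty());
    std::size_t best = 0;
    double best_d = std::numeric_limits<double>::infinity();
    for (std::size_t j = 0; j < cs.size(); ++j) {
        const double d = sq_dist(cs[j].c, p);
        if (d < best_d) { best_d = d; best = j; }
    }
    return best;
}

// One mini-batch step: assign every sample using the centroid positions from
// before the step, then move each centroid towards its samples with a
// per-centroid learning rate eta = 1 / count.
inline void mini_batch_step(std::vector<Centroid>& cs, const std::vector<Point>& batch) {
    std::vector<std::size_t> assign(batch.size());
    for (std::size_t i = 0; i < batch.size(); ++i) assign[i] = nearest(cs, batch[i]);
    for (std::size_t i = 0; i < batch.size(); ++i) {
        Centroid& k = cs[assign[i]];
        k.count += 1;
        const double eta = 1.0 / static_cast<double>(k.count);
        k.c.x = (1.0 - eta) * k.c.x + eta * batch[i].x;  // convex combination
        k.c.y = (1.0 - eta) * k.c.y + eta * batch[i].y;
    }
}
```

Because eta shrinks as a centroid accumulates samples, later batches move it less and less, which is the convergence behavior described above.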

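Empirical results indicate that, at a slight cost in accuracy, the algorithm yields a very significant speed-up for a class of projects with the following characteristics:

  • Low entanglement between clusters
  • The initial batches of data points are easy to form, and the data points are strongly correlated
  • Large-scale data with a comparatively small number of resulting clusters

Implementation

To take full advantage of the performance of C++, the experiment uses the author's own C++ implementation of Mini Batch K-Means as the clustering algorithm.

Classification Process

In practice, LNG shipping traffic at export and receiving terminals is usually large, whereas anchorages are smaller. Export traffic is generally larger than receiving traffic, but because this is affected by ports and ocean currents in different countries and regions, it is not used to tell the two apart. Points near an export terminal show drafts both increasing and decreasing over time; at a receiving terminal drafts only decrease; at an anchorage the draft does not change (after anchoring). The type of a cluster is therefore determined jointly by its size, the change of draft over time and the navigation status.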
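The draft rule above can be sketched as a function over the draft samples of one cluster. The thresholds (a cumulative change of at least 10 draft units, and consecutive samples of the same vessel less than 10800 s apart) follow Cluster.cpp; the names `SiteType`, `DraftSample` and `classify_site` are hypothetical, and the input is assumed to contain only records whose draft already passed the validity filter. The size thresholds applied afterwards are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum class SiteType { Anchorage = 0, Receiving = 1, Export = 2 };

struct DraftSample {
    int mmsi;
    long long unix_time;
    int draft;
};

// Classify one cluster from the draft history of the vessels seen in it.
inline SiteType classify_site(std::vector<DraftSample> samples) {
    // Group by vessel, then order each vessel's samples by time.
    std::sort(samples.begin(), samples.end(),
              [](const DraftSample& a, const DraftSample& b) {
                  if (a.mmsi != b.mmsi) return a.mmsi < b.mmsi;
                  return a.unix_time < b.unix_time;
              });
    long long gained = 0, lost = 0;
    for (std::size_t j = 1; j < samples.size(); ++j) {
        const DraftSample& prev = samples[j - 1];
        const DraftSample& cur = samples[j];
        // Only compare samples of the same vessel taken less than 3 hours apart.
        if (cur.mmsi != prev.mmsi || cur.unix_time - prev.unix_time >= 10800) continue;
        if (cur.draft > prev.draft) gained += cur.draft - prev.draft;
        else lost += prev.draft - cur.draft;
    }
    if (gained >= 10) return SiteType::Export;   // vessels take on cargo here
    if (lost >= 10) return SiteType::Receiving;  // vessels only discharge here
    return SiteType::Anchorage;                  // draft stays essentially flat
}
```

Checking `gained` first reflects the rule that an export terminal may show both increases and decreases, while a receiving terminal shows decreases only.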

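Environment Setup and Command-Line Instructions

  • C / C++ MinGW-w64 environment setup: MinGW-w64 installation tutorial (the Windows version of GCC, the well-known C/C++ compiler)

  • All files other than the lab report and lng2.csv (including the .h files with the data definitions) are already placed in the Project folder. To use the program, put lng2.csv into the Project folder.

  • Commands to run the program

    // First change into the Project folder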
    g++ -std=c++11 Cluster.cpp -o Cluster -lpthread
    Cluster
    image-20211213203701300

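The output file lng_results_list.json is generated in the project directory.

Note: the displayed runtime of 4.28 s is the main reference. Of this, high-performance file I/O takes about 1.75 s and terminal classification about 500 ms. The core algorithm is implemented in a way that balances accuracy and performance. If a larger random error is tolerated, performance can be improved further; the author has obtained quite good results with the total runtime held at about 2.48 s.

Core Source Code (Cluster.cpp)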

#include "Cluster.h"

inline ll get_key(double x, double y)
{
    ll a = (ll)round(10000 * x) + 1600000, b = (ll)round(10000 * y) + 900000;
    return a * 10000000 + b;
}
inline double distance(nodes &x, cluster &y)
{
    return sqrt(pow((x.latitude - y.avg_lati), 2) + pow(x.longtitude - y.avg_longti, 2));
}
inline double distance(nodes &x, nodes &y)
{
    return fabs(x.latitude - y.latitude) + fabs(x.longtitude - y.longtitude);
}
inline double distance(cluster &x, cluster &y)
{
    return sqrt(pow((x.avg_lati - y.avg_lati), 2) + pow(x.avg_longti - y.avg_longti, 2));
}
inline bool cmp(int x, int y)
{
    if (vessel[x].mmsi == vessel[y].mmsi)
        return vessel[x].unix_time < vessel[y].unix_time;
    return vessel[x].mmsi < vessel[y].mmsi;
}
inline bool cmp2(nodes &x, nodes &y)
{
    if (x.longtitude == y.longtitude)
        return x.latitude < y.latitude;
    return x.longtitude < y.longtitude;
}
inline bool cmp3(cluster x, cluster y)
{
    if (x.avg_longti == y.avg_longti)
        return x.avg_lati < y.avg_lati;
    return x.avg_longti < y.avg_longti;
}
void pre_treat()
{
    R int cnt = 0, pre;
    for (int i = 1; i <= m; ++i)
        while (i <= m && batch[i].number <= 5) // drop sparse cells; stop once the live range [1, m] is exhausted
        {
            swap(batch[i], batch[m]);
            m--;
        }

    sort(batch + 1, batch + m + 1, cmp2);

    pre = 1;
    batch[m + 1].longtitude = 1000.0;
    for (int i = 1; i <= m; ++i)
    {
        if (distance(batch[i], batch[i + 1]) < 5e-4)
        {
            batch[i + 1].longtitude = (batch[i].longtitude * batch[i].number + batch[i + 1].longtitude * batch[i + 1].number) / (double)(batch[i + 1].number + batch[i].number);
            batch[i + 1].latitude = (batch[i].latitude * batch[i].number + batch[i + 1].latitude * batch[i + 1].number) / (double)(batch[i + 1].number + batch[i].number);
            batch[i + 1].number += batch[i].number;
            batch[i].number = 0;
        }
        else
        {
            for (int j = pre; j < i; ++j)
                batch[i].node_list.insert(batch[i].node_list.end(), batch[j].node_list.begin(), batch[j].node_list.end());
            pre = i + 1;
        }
    }

    cnt = 0;
    for (int i = 1; i <= m; ++i)
        if (batch[i].number > 0)
            batch[++cnt] = batch[i];
    m = cnt;

    cnt = 0;
    for (int i = 1; i <= m; ++i)
    {
        if (batch[i].number >= 500)
            ord[++cnt] = i;
    }
    K = cnt;

    for (int i = 1; i <= K; ++i)
    {
        clst[i].avg_longti = batch[ord[i]].longtitude;
        clst[i].avg_lati = batch[ord[i]].latitude;
        clst[i].number = 0;
    }
}
void Mini_Batch_K_Means()
{
    R bool find = false;
    R int tar = 0, cntk;
    R double dist, tmp, avg_x, avg_y;
    while (!find)
    {
        times++;
        if (times > 50)
            break;
        for (int i = 1; i <= m; ++i)
        {
            dist = inf;
            for (int j = 1; j <= K; ++j)
            {
                tmp = distance(batch[i], clst[j]);
                if (dist > tmp)
                {
                    tar = j;
                    dist = tmp;
                }
            }
            clst[tar].batch_list.push_back(i);
            clst[tar].number += batch[i].number;
            clst[tar].sum_longti += batch[i].longtitude * batch[i].number;
            clst[tar].sum_lati += batch[i].latitude * batch[i].number;
        }
        find = true;
        for (int i = 1; i <= K; ++i)
            if (clst[i].number > 0)
            {
                avg_x = clst[i].sum_longti / clst[i].number;
                avg_y = clst[i].sum_lati / clst[i].number;
                if (fabs(avg_x - clst[i].avg_longti) > 1e-6 || fabs(avg_y - clst[i].avg_lati) > 1e-6)
                    find = false;
                clst[i].avg_longti = avg_x;
                clst[i].avg_lati = avg_y;
            }
            else
                clst[i].avg_longti = clst[i].avg_lati = inf;
        if (!find || times == 1)
            for (int i = 1; i <= K; ++i)
            {
                clst[i].sum_longti = clst[i].sum_lati = 0.0;
                clst[i].number = 0;
                clst[i].batch_list.clear();
            }
    }

    sort(clst + 1, clst + K + 1, cmp3);

    for (int i = 1; i < K; ++i) // merge each cluster only with an existing successor
    {
        if (distance(clst[i], clst[i + 1]) < 5e-3)
        {
            clst[i + 1].avg_longti = (clst[i].avg_longti * clst[i].number + clst[i + 1].avg_longti * clst[i + 1].number) / (double)(clst[i].number + clst[i + 1].number);
            clst[i + 1].avg_lati = (clst[i].avg_lati * clst[i].number + clst[i + 1].avg_lati * clst[i + 1].number) / (double)(clst[i].number + clst[i + 1].number);
            clst[i + 1].number += clst[i].number;
            clst[i + 1].batch_list.insert(clst[i + 1].batch_list.end(), clst[i].batch_list.begin(), clst[i].batch_list.end());
            clst[i].number = 0;
        }
    }
    cntk = 0;
    for (int i = 1; i <= K; ++i)
        if (clst[i].number > 0)
            clst[++cntk] = clst[i];
    K = cntk;
}
void classfication()
{

    int lst, cnt1, cnt2, cntk;
    for (int i = 1; i <= K; ++i)
    {
        lst = 0;
        cnt1 = 0;
        cnt2 = 0;

        int lb = clst[i].batch_list.size();
        for (int j = 0; j < lb; ++j)
        {
            int cur = clst[i].batch_list[j], ln = batch[cur].node_list.size();
            for (int k = 0; k < ln; ++k)
            {
                int now = batch[cur].node_list[k];
                if (vessel[now].draft > 50 && vessel[now].draft < 300)
                    mylist[++lst] = now;
            }
        }
        sort(mylist + 1, mylist + lst + 1, cmp);

        int d_type = 0;
        for (int j = 2; j <= lst; ++j)
        {
            if (vessel[mylist[j]].mmsi == vessel[mylist[j - 1]].mmsi && vessel[mylist[j]].unix_time - vessel[mylist[j - 1]].unix_time < 10800)
            {
                if (vessel[mylist[j]].draft > vessel[mylist[j - 1]].draft)
                {
                    cnt2 += vessel[mylist[j]].draft - vessel[mylist[j - 1]].draft;
                    if (cnt2 >= 10)
                    {
                        d_type = 2;
                        break;
                    }
                }
                else if (vessel[mylist[j]].draft < vessel[mylist[j - 1]].draft)
                {
                    cnt1 += vessel[mylist[j - 1]].draft - vessel[mylist[j]].draft;
                    if (cnt1 >= 10)
                        d_type = 1;
                }

                else
                    d_type = max(d_type, 0);
            }
        }
        clst[i].type = d_type;
    }
    cntk = 0;
    for (int i = 1; i <= K; ++i)
        if ((clst[i].number >= 500 && clst[i].type == 0) || (clst[i].number >= 2000 && clst[i].type > 0))
            clst[++cntk] = clst[i];
    K = cntk;
}
int main()
{
    clock_t start_time = clock();
    FILE *fp = fopen("lng_results_list.json", "w");

    R int mmsi, unix_time, status, velocity, draft, pos;
    R ll key;
    R double u, v;

    io::CSVReader<7> in("lng2.csv");
    while (in.read_row(mmsi, unix_time, status, velocity, u, v, draft))
    {
        vessel[++n].mmsi = mmsi;
        vessel[n].unix_time = unix_time;
        vessel[n].draft = draft;
        if (!(status == 1 || status == 5 || status == 15) || velocity != 0)
            n--;
        else
        {
            key = get_key(u, v);
            if (mp.find(key) == mp.end())
            {
                pos = ++m;
                mp[key] = pos;
                batch[pos].longtitude = round(u * 10000.0) / 10000.0;
                batch[pos].latitude = round(v * 10000.0) / 10000.0;
            }
            else
                pos = mp[key];
            batch[pos].node_list.push_back(n);
            batch[pos].number++;
        }
    }

    pre_treat();

    Mini_Batch_K_Means();

    classfication();

    int insta = 0, outsta = 0;
    fprintf(fp, "[\n");
    for (int i = 1; i <= K; ++i)
    {
        fprintf(fp, "    {");
        fprintf(fp, "\"code\":%d,\"latitude\":%lf,\"longtitude\":%lf,", i, clst[i].avg_lati, clst[i].avg_longti);
        if (clst[i].type > 0)
        {
            fprintf(fp, "\"isLNG\":true,");
            if (clst[i].type == 1)
                fprintf(fp, "\"IN\":true}");
            else
                fprintf(fp, "\"IN\":false}");
        }
        else
            fprintf(fp, "\"isLNG\":false,\"IN\":None}");
        if (i < K)
            fprintf(fp, ",");
        fprintf(fp, "\n");
        if (clst[i].type == 1)
            insta++;
        else if (clst[i].type == 2)
            outsta++;
    }
    fprintf(fp, "]\n");
    fclose(fp);
    double runtime = (double)(clock() - start_time) / CLOCKS_PER_SEC;
    printf("Runtime = %.2lfs\nTerminal Number = %d, Total Number = %d\n\n", runtime, insta + outsta, K);
    printf("Receiving Terminal:%d\n", insta);
    printf("Export Terminal:%d\n", outsta);
    printf("Anchorage ground:%d\n\n", K - insta - outsta);
    printf("Terminal information are in lng_results_list.json\n");
    return 0;
}

Results and Performance Analysis

Run Results

  • Log output

    Runtime = 4.28s
    Terminal Number = 181, Total Number = 261
    
    Receiving Terminal:60
    Export Terminal:121
    Anchorage ground:80
    
    Terminal information are in lng_results_list.json
    
  • Visualization

    The visualization is implemented with HTML/JavaScript using ECharts; the code is in the Project/Visualization Code folder.

    image-20211213214831715

    Figure 5.1 Clustering result
  • lng_results_list.json (sorted primarily by the "longtitude" property)

    [
        {"code":1,"latitude":27.880066,"longtitude":-97.294783,"isLNG":true,"IN":false},
        {"code":2,"latitude":27.524410,"longtitude":-96.917330,"isLNG":false,"IN":None},
        {"code":3,"latitude":28.939251,"longtitude":-95.308109,"isLNG":true,"IN":false},
        {"code":4,"latitude":28.819865,"longtitude":-95.233990,"isLNG":false,"IN":None},
        {"code":5,"latitude":28.804832,"longtitude":-95.220117,"isLNG":false,"IN":None},
        {"code":6,"latitude":28.772783,"longtitude":-95.184969,"isLNG":false,"IN":None},
        {"code":7,"latitude":28.869171,"longtitude":-95.149945,"isLNG":true,"IN":false},
        {"code":8,"latitude":28.691024,"longtitude":-95.121548,"isLNG":false,"IN":None},
        {"code":9,"latitude":28.830466,"longtitude":-95.119258,"isLNG":true,"IN":true},
        {"code":10,"latitude":29.160701,"longtitude":-94.650470,"isLNG":true,"IN":false},
        {"code":11,"latitude":28.648235,"longtitude":-94.614462,"isLNG":false,"IN":None},
        {"code":12,"latitude":29.743305,"longtitude":-93.870446,"isLNG":true,"IN":false},
        {"code":13,"latitude":29.566262,"longtitude":-93.715063,"isLNG":false,"IN":None},
        {"code":14,"latitude":29.383547,"longtitude":-93.673015,"isLNG":true,"IN":false},
        {"code":15,"latitude":30.038946,"longtitude":-93.333249,"isLNG":true,"IN":false},
        {"code":16,"latitude":29.230151,"longtitude":-93.250394,"isLNG":true,"IN":false},
        {"code":17,"latitude":27.922899,"longtitude":-82.421611,"isLNG":true,"IN":true},
        {"code":18,"latitude":32.081380,"longtitude":-80.990144,"isLNG":true,"IN":false},
        {"code":19,"latitude":31.940499,"longtitude":-80.611428,"isLNG":false,"IN":None},
        {"code":20,"latitude":9.485927,"longtitude":-79.975795,"isLNG":false,"IN":None},
        {"code":21,"latitude":9.450592,"longtitude":-79.972050,"isLNG":false,"IN":None},
        {"code":22,"latitude":9.469649,"longtitude":-79.959711,"isLNG":false,"IN":None},
        {"code":23,"latitude":9.496569,"longtitude":-79.954679,"isLNG":false,"IN":None},
        {"code":24,"latitude":9.428227,"longtitude":-79.952580,"isLNG":false,"IN":None},
        {"code":25,"latitude":9.507104,"longtitude":-79.941821,"isLNG":false,"IN":None},
        {"code":26,"latitude":9.414667,"longtitude":-79.938953,"isLNG":true,"IN":false},
        {"code":27,"latitude":9.459706,"longtitude":-79.930999,"isLNG":true,"IN":false},
        {"code":28,"latitude":9.489780,"longtitude":-79.930532,"isLNG":true,"IN":false},
        {"code":29,"latitude":9.529239,"longtitude":-79.929744,"isLNG":false,"IN":None},
        {"code":30,"latitude":9.339237,"longtitude":-79.911384,"isLNG":true,"IN":true},
        {"code":31,"latitude":9.480623,"longtitude":-79.849950,"isLNG":false,"IN":None},
        {"code":32,"latitude":8.834909,"longtitude":-79.507258,"isLNG":true,"IN":false},
        {"code":33,"latitude":10.282647,"longtitude":-75.557228,"isLNG":false,"IN":None},
        {"code":34,"latitude":-30.750473,"longtitude":-71.793370,"isLNG":true,"IN":false},
        {"code":35,"latitude":12.469551,"longtitude":-70.134162,"isLNG":false,"IN":None},
        {"code":36,"latitude":12.124520,"longtitude":-68.926097,"isLNG":false,"IN":None},
        {"code":37,"latitude":-39.169800,"longtitude":-62.476049,"isLNG":true,"IN":true},
        {"code":38,"latitude":10.302722,"longtitude":-61.725412,"isLNG":false,"IN":None},
        {"code":39,"latitude":10.293171,"longtitude":-61.723914,"isLNG":false,"IN":None},
        {"code":40,"latitude":10.307580,"longtitude":-61.716124,"isLNG":true,"IN":true},
        {"code":41,"latitude":10.189513,"longtitude":-61.701362,"isLNG":true,"IN":false},
        {"code":42,"latitude":10.636475,"longtitude":-61.663949,"isLNG":false,"IN":None},
        {"code":43,"latitude":10.580755,"longtitude":-61.640109,"isLNG":false,"IN":None},
        {"code":44,"latitude":-34.470593,"longtitude":-57.929099,"isLNG":true,"IN":true},
        {"code":45,"latitude":-22.783341,"longtitude":-43.131529,"isLNG":true,"IN":true},
        {"code":46,"latitude":-11.912431,"longtitude":-37.835950,"isLNG":true,"IN":false},
        {"code":47,"latitude":28.481929,"longtitude":-16.219605,"isLNG":false,"IN":None},
        {"code":48,"latitude":28.142752,"longtitude":-15.400791,"isLNG":true,"IN":false},
        {"code":49,"latitude":37.934651,"longtitude":-8.857580,"isLNG":true,"IN":true},
        {"code":50,"latitude":38.473022,"longtitude":-8.796976,"isLNG":false,"IN":None},
        {"code":51,"latitude":43.463941,"longtitude":-8.239334,"isLNG":true,"IN":true},
        {"code":52,"latitude":43.474163,"longtitude":-8.230597,"isLNG":true,"IN":false},
        {"code":53,"latitude":43.473626,"longtitude":-8.224115,"isLNG":true,"IN":false},
        {"code":54,"latitude":43.470997,"longtitude":-8.192405,"isLNG":false,"IN":None},
        {"code":55,"latitude":36.596306,"longtitude":-6.306248,"isLNG":true,"IN":false},
        {"code":56,"latitude":51.697097,"longtitude":-5.086536,"isLNG":true,"IN":true},
        {"code":57,"latitude":51.697719,"longtitude":-5.078414,"isLNG":true,"IN":false},
        {"code":58,"latitude":51.699786,"longtitude":-5.001625,"isLNG":true,"IN":true},
        {"code":59,"latitude":48.385950,"longtitude":-4.458907,"isLNG":true,"IN":true},
        {"code":60,"latitude":48.383071,"longtitude":-4.454250,"isLNG":true,"IN":false},
        {"code":61,"latitude":50.465441,"longtitude":-3.354441,"isLNG":false,"IN":None},
        {"code":62,"latitude":43.380649,"longtitude":-3.107285,"isLNG":true,"IN":false},
        {"code":63,"latitude":47.109097,"longtitude":-2.462008,"isLNG":false,"IN":None},
        {"code":64,"latitude":47.300086,"longtitude":-2.138290,"isLNG":true,"IN":false},
        {"code":65,"latitude":35.816823,"longtitude":-0.257758,"isLNG":true,"IN":false},
        {"code":66,"latitude":35.814300,"longtitude":-0.229437,"isLNG":true,"IN":false},
        {"code":67,"latitude":39.628675,"longtitude":-0.207064,"isLNG":true,"IN":false},
        {"code":68,"latitude":51.431986,"longtitude":0.704780,"isLNG":true,"IN":true},
        {"code":69,"latitude":51.429549,"longtitude":1.707779,"isLNG":false,"IN":None},
        {"code":70,"latitude":41.342754,"longtitude":2.162533,"isLNG":true,"IN":false},
        {"code":71,"latitude":41.305234,"longtitude":2.199753,"isLNG":false,"IN":None},
        {"code":72,"latitude":51.304267,"longtitude":2.485908,"isLNG":true,"IN":false},
        {"code":73,"latitude":36.799927,"longtitude":3.117857,"isLNG":false,"IN":None},
        {"code":74,"latitude":51.352941,"longtitude":3.213632,"isLNG":true,"IN":false},
        {"code":75,"latitude":52.018075,"longtitude":3.503835,"isLNG":false,"IN":None},
        {"code":76,"latitude":51.975354,"longtitude":3.990816,"isLNG":false,"IN":None},
        {"code":77,"latitude":51.971937,"longtitude":4.051806,"isLNG":false,"IN":None},
        {"code":78,"latitude":51.969014,"longtitude":4.077981,"isLNG":true,"IN":false},
        {"code":79,"latitude":51.929109,"longtitude":4.203044,"isLNG":false,"IN":None},
        {"code":80,"latitude":43.450799,"longtitude":4.854249,"isLNG":true,"IN":true},
        {"code":81,"latitude":43.413452,"longtitude":4.898769,"isLNG":true,"IN":false},
        {"code":82,"latitude":59.359040,"longtitude":5.300134,"isLNG":false,"IN":None},
        {"code":83,"latitude":58.924215,"longtitude":5.577367,"isLNG":true,"IN":false},
        {"code":84,"latitude":36.883371,"longtitude":6.943444,"isLNG":true,"IN":false},
        {"code":85,"latitude":4.424942,"longtitude":7.143408,"isLNG":true,"IN":false},
        {"code":86,"latitude":59.112439,"longtitude":9.626531,"isLNG":false,"IN":None},
        {"code":87,"latitude":43.764253,"longtitude":10.042597,"isLNG":true,"IN":false},
        {"code":88,"latitude":55.467441,"longtitude":10.528351,"isLNG":true,"IN":false},
        {"code":89,"latitude":57.422457,"longtitude":10.660841,"isLNG":false,"IN":None},
        {"code":90,"latitude":57.420134,"longtitude":10.666428,"isLNG":false,"IN":None},
        {"code":91,"latitude":57.667734,"longtitude":10.667493,"isLNG":true,"IN":false},
        {"code":92,"latitude":44.506521,"longtitude":12.434730,"isLNG":false,"IN":None},
        {"code":93,"latitude":45.119488,"longtitude":12.521118,"isLNG":true,"IN":true},
        {"code":94,"latitude":54.195221,"longtitude":14.159482,"isLNG":true,"IN":true},
        {"code":95,"latitude":45.198187,"longtitude":14.468214,"isLNG":false,"IN":None},
        {"code":96,"latitude":45.201500,"longtitude":14.533769,"isLNG":true,"IN":true},
        {"code":97,"latitude":35.885264,"longtitude":14.842888,"isLNG":true,"IN":true},
        {"code":98,"latitude":58.905269,"longtitude":18.036454,"isLNG":true,"IN":false},
        {"code":99,"latitude":-32.419789,"longtitude":20.368200,"isLNG":false,"IN":None},
        {"code":100,"latitude":55.666106,"longtitude":21.135673,"isLNG":true,"IN":true},
        {"code":101,"latitude":37.889047,"longtitude":23.463216,"isLNG":true,"IN":false},
        {"code":102,"latitude":38.821661,"longtitude":26.913655,"isLNG":true,"IN":true},
        {"code":103,"latitude":60.432812,"longtitude":27.124785,"isLNG":true,"IN":false},
        {"code":104,"latitude":40.989838,"longtitude":27.987477,"isLNG":true,"IN":true},
        {"code":105,"latitude":40.712410,"longtitude":29.462175,"isLNG":true,"IN":false},
        {"code":106,"latitude":31.463463,"longtitude":30.221187,"isLNG":false,"IN":None},
        {"code":107,"latitude":31.456882,"longtitude":30.228521,"isLNG":false,"IN":None},
        {"code":108,"latitude":30.900661,"longtitude":32.317936,"isLNG":true,"IN":false},
        {"code":109,"latitude":36.727090,"longtitude":35.953325,"isLNG":true,"IN":true},
        {"code":110,"latitude":29.072674,"longtitude":48.165666,"isLNG":true,"IN":false},
        {"code":111,"latitude":28.717013,"longtitude":48.408910,"isLNG":true,"IN":true},
        {"code":112,"latitude":28.756745,"longtitude":48.621381,"isLNG":false,"IN":None},
        {"code":113,"latitude":25.934289,"longtitude":51.601007,"isLNG":true,"IN":false},
        {"code":114,"latitude":25.933232,"longtitude":51.609828,"isLNG":true,"IN":false},
        {"code":115,"latitude":25.937317,"longtitude":51.613509,"isLNG":true,"IN":false},
        {"code":116,"latitude":25.933488,"longtitude":51.617462,"isLNG":true,"IN":false},
        {"code":117,"latitude":25.898686,"longtitude":51.643328,"isLNG":true,"IN":false},
        {"code":118,"latitude":25.898951,"longtitude":51.649479,"isLNG":true,"IN":false},
        {"code":119,"latitude":25.837491,"longtitude":51.834371,"isLNG":true,"IN":false},
        {"code":120,"latitude":25.229771,"longtitude":53.066077,"isLNG":true,"IN":false},
        {"code":121,"latitude":25.405464,"longtitude":55.030728,"isLNG":true,"IN":false},
        {"code":122,"latitude":25.031658,"longtitude":55.067605,"isLNG":true,"IN":true},
        {"code":123,"latitude":25.250434,"longtitude":55.266617,"isLNG":true,"IN":false},
        {"code":124,"latitude":25.253006,"longtitude":55.271438,"isLNG":true,"IN":false},
        {"code":125,"latitude":25.269326,"longtitude":56.514527,"isLNG":true,"IN":false},
        {"code":126,"latitude":25.377144,"longtitude":56.552549,"isLNG":true,"IN":false},
        {"code":127,"latitude":19.662634,"longtitude":57.715449,"isLNG":false,"IN":None},
        {"code":128,"latitude":22.666497,"longtitude":59.411992,"isLNG":true,"IN":false},
        {"code":129,"latitude":24.772280,"longtitude":67.302040,"isLNG":true,"IN":false},
        {"code":130,"latitude":71.275995,"longtitude":72.098934,"isLNG":true,"IN":false},
        {"code":131,"latitude":21.672094,"longtitude":72.509587,"isLNG":true,"IN":false},
        {"code":132,"latitude":20.554981,"longtitude":72.643327,"isLNG":true,"IN":true},
        {"code":133,"latitude":9.891448,"longtitude":76.189660,"isLNG":true,"IN":true},
        {"code":134,"latitude":13.275536,"longtitude":80.346539,"isLNG":true,"IN":true},
        {"code":135,"latitude":21.557573,"longtitude":91.816250,"isLNG":true,"IN":true},
        {"code":136,"latitude":21.524212,"longtitude":91.818552,"isLNG":true,"IN":false},
        {"code":137,"latitude":16.652400,"longtitude":96.259801,"isLNG":false,"IN":None},
        {"code":138,"latitude":12.656447,"longtitude":98.437111,"isLNG":false,"IN":None},
        {"code":139,"latitude":12.641026,"longtitude":101.162330,"isLNG":true,"IN":false},
        {"code":140,"latitude":2.375478,"longtitude":101.893253,"isLNG":true,"IN":false},
        {"code":141,"latitude":2.366129,"longtitude":101.920883,"isLNG":true,"IN":false},
        {"code":142,"latitude":1.139366,"longtitude":103.614799,"isLNG":true,"IN":false},
        {"code":143,"latitude":1.300065,"longtitude":103.652446,"isLNG":true,"IN":false},
        {"code":144,"latitude":1.296662,"longtitude":103.657739,"isLNG":true,"IN":false},
        {"code":145,"latitude":1.231620,"longtitude":103.668441,"isLNG":true,"IN":false},
        {"code":146,"latitude":1.301094,"longtitude":103.674041,"isLNG":true,"IN":false},
        {"code":147,"latitude":1.229527,"longtitude":103.675713,"isLNG":true,"IN":false},
        {"code":148,"latitude":1.296941,"longtitude":103.678689,"isLNG":true,"IN":false},
        {"code":149,"latitude":1.192349,"longtitude":103.682159,"isLNG":true,"IN":false},
        {"code":150,"latitude":1.176157,"longtitude":103.703779,"isLNG":true,"IN":false},
        {"code":151,"latitude":1.164495,"longtitude":103.724831,"isLNG":true,"IN":false},
        {"code":152,"latitude":1.468564,"longtitude":103.826970,"isLNG":true,"IN":false},
        {"code":153,"latitude":1.468898,"longtitude":103.832511,"isLNG":true,"IN":false},
        {"code":154,"latitude":1.455094,"longtitude":103.874799,"isLNG":true,"IN":false},
        {"code":155,"latitude":1.451687,"longtitude":103.878490,"isLNG":true,"IN":false},
        {"code":156,"latitude":1.205413,"longtitude":104.057618,"isLNG":false,"IN":None},
        {"code":157,"latitude":1.309607,"longtitude":104.103224,"isLNG":true,"IN":false},
        {"code":158,"latitude":1.310022,"longtitude":104.171321,"isLNG":true,"IN":false},
        {"code":159,"latitude":1.473660,"longtitude":104.359154,"isLNG":true,"IN":false},
        {"code":160,"latitude":1.579437,"longtitude":104.405187,"isLNG":true,"IN":false},
        {"code":161,"latitude":1.538764,"longtitude":104.412441,"isLNG":true,"IN":false},
        {"code":162,"latitude":1.866053,"longtitude":104.984581,"isLNG":false,"IN":None},
        {"code":163,"latitude":-5.889858,"longtitude":106.018898,"isLNG":false,"IN":None},
        {"code":164,"latitude":-5.962233,"longtitude":106.769626,"isLNG":false,"IN":None},
        {"code":165,"latitude":-5.949732,"longtitude":106.770927,"isLNG":false,"IN":None},
        {"code":166,"latitude":-5.977813,"longtitude":106.805668,"isLNG":true,"IN":true},
        {"code":167,"latitude":9.605667,"longtitude":107.162767,"isLNG":false,"IN":None},
        {"code":168,"latitude":21.588691,"longtitude":108.359991,"isLNG":true,"IN":true},
        {"code":169,"latitude":19.780357,"longtitude":109.095258,"isLNG":true,"IN":false},
        {"code":170,"latitude":21.449139,"longtitude":109.536533,"isLNG":true,"IN":true},
        {"code":171,"latitude":3.349755,"longtitude":112.915473,"isLNG":true,"IN":true},
        {"code":172,"latitude":3.264579,"longtitude":113.060946,"isLNG":true,"IN":false},
        {"code":173,"latitude":21.838995,"longtitude":113.276914,"isLNG":true,"IN":true},
        {"code":174,"latitude":22.950950,"longtitude":113.551717,"isLNG":true,"IN":true},
        {"code":175,"latitude":22.504522,"longtitude":113.838135,"isLNG":true,"IN":false},
        {"code":176,"latitude":22.498577,"longtitude":113.839000,"isLNG":true,"IN":true},
        {"code":177,"latitude":22.501337,"longtitude":113.844555,"isLNG":true,"IN":false},
        {"code":178,"latitude":22.121228,"longtitude":114.107977,"isLNG":true,"IN":false},
        {"code":179,"latitude":22.581062,"longtitude":114.429968,"isLNG":true,"IN":true},
        {"code":180,"latitude":-21.535438,"longtitude":115.098298,"isLNG":true,"IN":false},
        {"code":181,"latitude":6.447975,"longtitude":115.435611,"isLNG":true,"IN":false},
        {"code":182,"latitude":-20.813326,"longtitude":115.488995,"isLNG":true,"IN":false},
        {"code":183,"latitude":-32.206025,"longtitude":115.712776,"isLNG":false,"IN":None},
        {"code":184,"latitude":-20.748519,"longtitude":115.774408,"isLNG":true,"IN":false},
        {"code":185,"latitude":6.120320,"longtitude":116.030503,"isLNG":false,"IN":None},
        {"code":186,"latitude":6.096143,"longtitude":116.033462,"isLNG":false,"IN":None},
        {"code":187,"latitude":6.088649,"longtitude":116.049408,"isLNG":false,"IN":None},
        {"code":188,"latitude":22.924325,"longtitude":116.368743,"isLNG":true,"IN":true},
        {"code":189,"latitude":-20.288502,"longtitude":116.675607,"isLNG":true,"IN":false},
        {"code":190,"latitude":-20.495004,"longtitude":116.746435,"isLNG":false,"IN":None},
        {"code":191,"latitude":-20.592671,"longtitude":116.761411,"isLNG":true,"IN":false},
        {"code":192,"latitude":0.002911,"longtitude":117.619914,"isLNG":false,"IN":None},
        {"code":193,"latitude":38.754041,"longtitude":117.715004,"isLNG":true,"IN":false},
        {"code":194,"latitude":38.927673,"longtitude":117.876234,"isLNG":true,"IN":false},
        {"code":195,"latitude":38.907393,"longtitude":118.519066,"isLNG":true,"IN":true},
        {"code":196,"latitude":38.863413,"longtitude":118.758674,"isLNG":false,"IN":None},
        {"code":197,"latitude":25.183878,"longtitude":119.010019,"isLNG":true,"IN":true},
        {"code":198,"latitude":35.574470,"longtitude":119.762282,"isLNG":true,"IN":false},
        {"code":199,"latitude":23.388913,"longtitude":120.318071,"isLNG":true,"IN":false},
        {"code":200,"latitude":14.541774,"longtitude":120.814518,"isLNG":false,"IN":None},
        {"code":201,"latitude":32.548879,"longtitude":121.431106,"isLNG":true,"IN":true},
        {"code":202,"latitude":31.593292,"longtitude":121.445188,"isLNG":true,"IN":false},
        {"code":203,"latitude":31.586158,"longtitude":121.463236,"isLNG":true,"IN":false},
        {"code":204,"latitude":31.299257,"longtitude":121.712817,"isLNG":true,"IN":false},
        {"code":205,"latitude":31.314304,"longtitude":121.777600,"isLNG":false,"IN":None},
        {"code":206,"latitude":31.312205,"longtitude":121.783200,"isLNG":false,"IN":None},
        {"code":207,"latitude":38.895929,"longtitude":121.957775,"isLNG":true,"IN":true},
        {"code":208,"latitude":29.894505,"longtitude":122.090150,"isLNG":true,"IN":true},
        {"code":209,"latitude":30.600565,"longtitude":122.104518,"isLNG":true,"IN":false},
        {"code":210,"latitude":30.074562,"longtitude":122.275945,"isLNG":true,"IN":false},
        {"code":211,"latitude":29.675061,"longtitude":122.689491,"isLNG":false,"IN":None},
        {"code":212,"latitude":34.732917,"longtitude":126.372519,"isLNG":false,"IN":None},
        {"code":213,"latitude":34.731115,"longtitude":126.380096,"isLNG":false,"IN":None},
        {"code":214,"latitude":37.337524,"longtitude":126.583266,"isLNG":true,"IN":true},
        {"code":215,"latitude":37.337581,"longtitude":126.591422,"isLNG":true,"IN":false},
        {"code":216,"latitude":37.003662,"longtitude":126.775132,"isLNG":true,"IN":true},
        {"code":217,"latitude":36.998863,"longtitude":126.782206,"isLNG":true,"IN":false},
        {"code":218,"latitude":34.759973,"longtitude":127.856510,"isLNG":true,"IN":false},
        {"code":219,"latitude":34.948617,"longtitude":128.437968,"isLNG":true,"IN":true},
        {"code":220,"latitude":34.965864,"longtitude":128.441321,"isLNG":true,"IN":false},
        {"code":221,"latitude":34.989356,"longtitude":128.470626,"isLNG":true,"IN":false},
        {"code":222,"latitude":34.909239,"longtitude":128.598810,"isLNG":true,"IN":false},
        {"code":223,"latitude":34.909400,"longtitude":128.604805,"isLNG":false,"IN":None},
        {"code":224,"latitude":34.902464,"longtitude":128.608861,"isLNG":true,"IN":false},
        {"code":225,"latitude":34.884135,"longtitude":128.699632,"isLNG":true,"IN":false},
        {"code":226,"latitude":34.878761,"longtitude":128.703719,"isLNG":false,"IN":None},
        {"code":227,"latitude":34.872887,"longtitude":128.713148,"isLNG":false,"IN":None},
        {"code":228,"latitude":34.878693,"longtitude":128.715765,"isLNG":true,"IN":true},
        {"code":229,"latitude":34.889846,"longtitude":128.719123,"isLNG":false,"IN":None},
        {"code":230,"latitude":34.858895,"longtitude":128.802087,"isLNG":false,"IN":None},
        {"code":231,"latitude":35.476470,"longtitude":129.402952,"isLNG":true,"IN":true},
        {"code":232,"latitude":35.483931,"longtitude":129.404018,"isLNG":true,"IN":false},
        {"code":233,"latitude":35.508074,"longtitude":129.432858,"isLNG":false,"IN":None},
        {"code":234,"latitude":35.522407,"longtitude":129.439194,"isLNG":false,"IN":None},
        {"code":235,"latitude":35.528488,"longtitude":129.442240,"isLNG":false,"IN":None},
        {"code":236,"latitude":35.525940,"longtitude":129.444643,"isLNG":false,"IN":None},
        {"code":237,"latitude":33.156614,"longtitude":129.701106,"isLNG":false,"IN":None},
        {"code":238,"latitude":-12.481264,"longtitude":130.772395,"isLNG":true,"IN":false},
        {"code":239,"latitude":33.932799,"longtitude":130.775646,"isLNG":true,"IN":true},
        {"code":240,"latitude":33.276176,"longtitude":131.712148,"isLNG":true,"IN":true},
        {"code":241,"latitude":34.743024,"longtitude":134.641395,"isLNG":true,"IN":true},
        {"code":242,"latitude":34.564189,"longtitude":135.309003,"isLNG":false,"IN":None},
        {"code":243,"latitude":34.556492,"longtitude":135.408022,"isLNG":true,"IN":true},
        {"code":244,"latitude":34.688294,"longtitude":136.594942,"isLNG":false,"IN":None},
        {"code":245,"latitude":34.999137,"longtitude":136.697915,"isLNG":true,"IN":true},
        {"code":246,"latitude":34.971593,"longtitude":136.818886,"isLNG":true,"IN":false},
        {"code":247,"latitude":35.023249,"longtitude":138.499277,"isLNG":true,"IN":false},
        {"code":248,"latitude":38.002629,"longtitude":139.230472,"isLNG":true,"IN":true},
        {"code":249,"latitude":35.400288,"longtitude":139.634914,"isLNG":true,"IN":true},
        {"code":250,"latitude":35.417316,"longtitude":139.681096,"isLNG":true,"IN":false},
        {"code":251,"latitude":35.461059,"longtitude":139.718486,"isLNG":true,"IN":true},
        {"code":252,"latitude":35.469955,"longtitude":139.738706,"isLNG":true,"IN":true},
        {"code":253,"latitude":35.345009,"longtitude":139.821525,"isLNG":true,"IN":true},
        {"code":254,"latitude":35.473776,"longtitude":139.967808,"isLNG":true,"IN":true},
        {"code":255,"latitude":36.478900,"longtitude":140.626287,"isLNG":true,"IN":true},
        {"code":256,"latitude":37.849911,"longtitude":140.956141,"isLNG":true,"IN":true},
        {"code":257,"latitude":-9.335175,"longtitude":146.956489,"isLNG":true,"IN":false},
        {"code":258,"latitude":-23.762741,"longtitude":151.184737,"isLNG":true,"IN":false},
        {"code":259,"latitude":-23.777941,"longtitude":151.195480,"isLNG":true,"IN":false},
        {"code":260,"latitude":-23.783611,"longtitude":151.201261,"isLNG":true,"IN":false},
        {"code":261,"latitude":-23.843264,"longtitude":151.533465,"isLNG":false,"IN":None}
    ]

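Performance Analysis

Program optimizations

  • High-performance I/O using concurrent multithreading
  • A Mini Batch K-Means algorithm refined over several rounds of optimization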
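To make the second item concrete, here is a minimal sketch of the basic Mini Batch K-Means update rule in pure Python. It is not the optimized implementation measured in this report. The function name `mini_batch_kmeans`, its parameters, and the use of plain Euclidean distance on (latitude, longitude) pairs are all assumptions made for illustration; a great-circle distance may be more appropriate for real positions.

```python
import random


def closest(centers, p):
    """Index of the center nearest to point p (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda j: (centers[j][0] - p[0]) ** 2 + (centers[j][1] - p[1]) ** 2,
    )


def mini_batch_kmeans(points, k, init_centers=None, batch_size=32,
                      iterations=100, seed=0):
    """Cluster 2-D points with the basic mini-batch k-means update.

    points       -- list of (lat, lon) tuples (hypothetical input layout)
    init_centers -- optional list of k starting centers; if omitted,
                    k points are sampled at random
    """
    rng = random.Random(seed)
    if init_centers is None:
        centers = [list(p) for p in rng.sample(points, k)]
    else:
        centers = [list(c) for c in init_centers]
    counts = [0] * k  # number of points each center has absorbed so far

    for _ in range(iterations):
        # Draw a small random batch instead of scanning the full dataset.
        batch = [rng.choice(points) for _ in range(batch_size)]
        # Assign every batch point before moving any center.
        nearest = [closest(centers, p) for p in batch]
        for p, j in zip(batch, nearest):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-center learning rate
            centers[j][0] += eta * (p[0] - centers[j][0])
            centers[j][1] += eta * (p[1] - centers[j][1])
    return centers
```

Because each iteration touches only `batch_size` points, the cost per iteration does not depend on the dataset size, which is the main reason this family of algorithms suits large position datasets.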

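Program performance

Running the program 60 times gave the following results:

  • Average total run time: 4.32 s
    • Average I/O time: 1.73 s
    • Average Mini Batch K-Means time: 1.38 s
    • Average classification time: 1.21 s
  • Memory usage: 166.8 MB (peak)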
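Per-stage averages like these can be collected with a small harness. The sketch below is one possible way to do it and is not the measurement code used for this report; `benchmark` and the stage names are hypothetical. Note that `tracemalloc` only tracks memory allocated by Python itself and slows execution, so its peak figure is not directly comparable to a process-level number such as the one above.

```python
import time
import tracemalloc
from statistics import mean


def benchmark(stages, runs=60):
    """Run each stage `runs` times and report mean wall-clock seconds.

    stages -- dict mapping a stage name to a zero-argument callable
              (hypothetical interface, for illustration only)
    """
    timings = {name: [] for name in stages}
    tracemalloc.start()
    for _ in range(runs):
        for name, fn in stages.items():
            start = time.perf_counter()
            fn()
            timings[name].append(time.perf_counter() - start)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    report = {name: mean(values) for name, values in timings.items()}
    # The total is the sum of the per-stage means.
    report["total"] = sum(report.values())
    report["peak_bytes"] = peak
    return report
```

The same relationship holds for the figures reported here: 1.73 s + 1.38 s + 1.21 s = 4.32 s.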

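If random error is acceptable, a variant that trades accuracy for speed (randomized initial clusters and a larger critical EPS threshold for batch planning) averages 2.48 s per run.

For a dataset approaching ten million records, this is a fairly good result.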

Footnotes

  1. Design and Construction of LNG Storage Tanks, First Edition. Josef Rötzer.

  2. Liquefied natural gas from Wikipedia. Retrieved 9 December 2021.

  3. Design and Construction of LNG Storage Tanks by Josef Rötzer, published in 2019.

  4. 2021 World LNG Report, published by IGU in 2021.

  5. MarineTraffic. Retrieved 10 December 2021.

  6. https://www.geeksforgeeks.org . Retrieved 11 December 2021.