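Analyzing how Elasticsearch index files are stored
The main question here: at what granularity are FST files created?
First, use Kibana to pick a shard of some index; this walkthrough uses the logstash-2023.05.30 index as the example.
Check how its shards are distributed:
GET /_cat/shards/logstash-2023.05.30?v
index               shard prirep state      docs   store ip             node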
logstash-2023.05.30 3     p      STARTED 1520736 408.1mb 10.138.40.73  10.138.40.73-node1
logstash-2023.05.30 5     p      STARTED 1520888 409.9mb 10.138.40.74  10.138.40.74-node1
logstash-2023.05.30 6     p      STARTED 1518331 408.2mb 10.138.40.221 10.138.40.221-node1
logstash-2023.05.30 4     p      STARTED 1518186 409.3mb 10.138.204.194 10.138.204.194-node1
logstash-2023.05.30 1     p      STARTED 1519231 408.8mb 10.138.40.220 10.138.40.220-node1
logstash-2023.05.30 2     p      STARTED 1519970 409.9mb 10.138.204.195 10.138.204.195-node1
logstash-2023.05.30 0     p      STARTED 1520024 410.6mb 10.138.204.193 10.138.204.193-node1
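
Picking the node that hosts a given shard can also be done programmatically. The following is only a minimal sketch: the helper names `parse_cat_shards` and `node_for_shard` are hypothetical, and it assumes every shard is assigned (rows for unassigned shards have empty columns, which this simple whitespace split would not handle).

```python
def parse_cat_shards(text):
    """Parse the text output of `GET /_cat/shards/<index>?v` into one dict per row."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()  # index, shard, prirep, state, docs, store, ip, node
    return [dict(zip(header, line.split())) for line in lines[1:]]


def node_for_shard(rows, shard, prirep="p"):
    """Return the node name hosting the given shard copy, or None if not found."""
    for row in rows:
        if row["shard"] == str(shard) and row["prirep"] == prirep:
            return row["node"]
    return None


# Two rows copied from the output above.
SAMPLE = """\
index               shard prirep state      docs   store ip             node
logstash-2023.05.30 3     p      STARTED 1520736 408.1mb 10.138.40.73  10.138.40.73-node1
logstash-2023.05.30 0     p      STARTED 1520024 410.6mb 10.138.204.193 10.138.204.193-node1
"""

rows = parse_cat_shards(SAMPLE)
print(node_for_shard(rows, 0))
```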
 
The analysis below uses shard 0, which lives on 10.138.204.193.
To locate the storage directory, first find the index's ID (its UUID):
GET /logstash-2023.05.30/_settings
{
  "logstash-2023.05.30" : {
    "settings" : {
      "index" : {
        "codec" : "best_compression",
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "refresh_interval" : "60s",
        "number_of_shards" : "7",
        "provided_name" : "logstash-2023.05.30",
        "creation_date" : "1685376005206",
        "number_of_replicas" : "0",
        "uuid" : "FYWtFGTIS2CLB8yJhFXG9g",  // this is the index ID
        "version" : {
          "created" : "7130499"
        }
      }
    }
  }
}
Log in to the machine and find the directory that stores this index's files:
/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g
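
The directory layout seen here (data path, then `nodes/0/indices/<index uuid>/<shard>/index`) can be written down as a small helper. This is a sketch based only on the paths observed in this walkthrough on ES 7.13.4; the function name `shard_index_dir` and the `node_ordinal` parameter are hypothetical, and other versions may lay out the data path differently.

```python
from pathlib import PurePosixPath


def shard_index_dir(data_path, index_uuid, shard, node_ordinal=0):
    """Build the Lucene index directory of one shard from the layout observed above."""
    return (
        PurePosixPath(data_path)
        / "nodes" / str(node_ordinal)   # the "nodes/0" level seen in the path
        / "indices" / index_uuid        # index UUID taken from _settings
        / str(shard) / "index"          # shard number, then the Lucene files
    )


print(shard_index_dir("/data3/10.138.204.193-node1", "FYWtFGTIS2CLB8yJhFXG9g", 0))
```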
 
List the files under that directory:
root@prd-paas-es-01:/data3/10.138.204.193-node1/nodes/0/indices/FYWtFGTIS2CLB8yJhFXG9g# tree -C -s
.
├── [       4096]  0
│   ├── [      20480]  index
│   │   ├── [        158]  _17f.fdm
│   │   ├── [   25578562]  _17f.fdt
│   │   ├── [       1939]  _17f.fdx
│   │   ├── [       4636]  _17f.fnm
│   │   ├── [    7981735]  _17f.kdd
│   │   ├── [      20898]  _17f.kdi
│   │   ├── [        716]  _17f.kdm
│   │   ├── [    7945983]  _17f_Lucene80_0.dvd
│   │   ├── [       3916]  _17f_Lucene80_0.dvm
│   │   ├── [    6230127]  _17f_Lucene84_0.doc
│   │   ├── [    3875001]  _17f_Lucene84_0.pos
│   │   ├── [    7448815]  _17f_Lucene84_0.tim
│   │   ├── [     108786]  _17f_Lucene84_0.tip
│   │   ├── [       1637]  _17f_Lucene84_0.tmd
│   │   ├── [        593]  _17f.si
│   │   ├── [        158]  _3uv.fdm
│   │   ├── [   33652243]  _3uv.fdt
│   │   ├── [       2555]  _3uv.fdx
│   │   ├── [       4636]  _3uv.fnm
│   │   ├── [   10520395]  _3uv.kdd
│   │   ├── [      27689]  _3uv.kdi
│   │   ├── [        716]  _3uv.kdm
│   │   ├── [   10573208]  _3uv_Lucene80_0.dvd
│   │   ├── [       3916]  _3uv_Lucene80_0.dvm
│   │   ├── [    8298061]  _3uv_Lucene84_0.doc
│   │   ├── [    5154427]  _3uv_Lucene84_0.pos
│   │   ├── [    9716222]  _3uv_Lucene84_0.tim
│   │   ├── [     142063]  _3uv_Lucene84_0.tip
│   │   ├── [       1620]  _3uv_Lucene84_0.tmd
│   │   ├── [        593]  _3uv.si
│   │   ├── [        158]  _5bg.fdm
│   │   ├── [   16433011]  _5bg.fdt
│   │   ├── [       1259]  _5bg.fdx
│   │   ├── [       4636]  _5bg.fnm
│   │   ├── [    5158094]  _5bg.kdd
│   │   ├── [      13396]  _5bg.kdi
│   │   ├── [        716]  _5bg.kdm
│   │   ├── [    5140762]  _5bg_Lucene80_0.dvd
│   │   ├── [       3916]  _5bg_Lucene80_0.dvm
│   │   ├── [    4005897]  _5bg_Lucene84_0.doc
│   │   ├── [    2583880]  _5bg_Lucene84_0.pos
│   │   ├── [    4873082]  _5bg_Lucene84_0.tim
│   │   ├── [      70979]  _5bg_Lucene84_0.tip
│   │   ├── [       1593]  _5bg_Lucene84_0.tmd
│   │   ├── [        593]  _5bg.si
│   │   ├── [        158]  _60h.fdm
│   │   ├── [   24664753]  _60h.fdt
│   │   ├── [       1886]  _60h.fdx
│   │   ├── [       4636]  _60h.fnm
│   │   ├── [    7640438]  _60h.kdd
│   │   ├── [      19996]  _60h.kdi
│   │   ├── [        716]  _60h.kdm
│   │   ├── [    7754954]  _60h_Lucene80_0.dvd
│   │   ├── [       3916]  _60h_Lucene80_0.dvm
│   │   ├── [    6147241]  _60h_Lucene84_0.doc
│   │   ├── [    3998559]  _60h_Lucene84_0.pos
│   │   ├── [    7254035]  _60h_Lucene84_0.tim
│   │   ├── [     105673]  _60h_Lucene84_0.tip
│   │   ├── [       1719]  _60h_Lucene84_0.tmd
│   │   ├── [        593]  _60h.si
│   │   ├── [        200]  _7jq.fdm
│   │   ├── [   63208093]  _7jq.fdt
│   │   ├── [       4692]  _7jq.fdx
│   │   ├── [       4636]  _7jq.fnm
│   │   ├── [   19306117]  _7jq.kdd
│   │   ├── [      51562]  _7jq.kdi
│   │   ├── [        716]  _7jq.kdm
│   │   ├── [   20228561]  _7jq_Lucene80_0.dvd
│   │   ├── [       3916]  _7jq_Lucene80_0.dvm
│   │   ├── [   15606568]  _7jq_Lucene84_0.doc
│   │   ├── [    9581341]  _7jq_Lucene84_0.pos
│   │   ├── [   17383473]  _7jq_Lucene84_0.tim
│   │   ├── [     272615]  _7jq_Lucene84_0.tip
│   │   ├── [       1592]  _7jq_Lucene84_0.tmd
│   │   ├── [        593]  _7jq.si
│   │   ├── [        437]  _82w.cfe
│   │   ├── [    4489379]  _82w.cfs
│   │   ├── [        408]  _82w.si
│   │   ├── [        437]  _87w.cfe
│   │   ├── [    4932636]  _87w.cfs
│   │   ├── [        408]  _87w.si
│   │   ├── [        437]  _8ao.cfe
│   │   ├── [   13905317]  _8ao.cfs
│   │   ├── [        408]  _8ao.si
│   │   ├── [        437]  _8ls.cfe
│   │   ├── [   20181047]  _8ls.cfs
│   │   ├── [        408]  _8ls.si
│   │   ├── [        437]  _8nq.cfe
│   │   ├── [    1234712]  _8nq.cfs
│   │   ├── [        408]  _8nq.si
│   │   ├── [        437]  _8oa.cfe
│   │   ├── [     872798]  _8oa.cfs
│   │   ├── [        408]  _8oa.si
│   │   ├── [        437]  _8pp.cfe
│   │   ├── [    1593677]  _8pp.cfs
│   │   ├── [        408]  _8pp.si
│   │   ├── [        437]  _8r5.cfe
│   │   ├── [     914008]  _8r5.cfs
│   │   ├── [        408]  _8r5.si
│   │   ├── [        437]  _8rf.cfe
│   │   ├── [     940473]  _8rf.cfs
│   │   ├── [        408]  _8rf.si
│   │   ├── [        437]  _8rz.cfe
│   │   ├── [    1315312]  _8rz.cfs
│   │   ├── [        408]  _8rz.si
│   │   ├── [        437]  _8s9.cfe
│   │   ├── [    1121692]  _8s9.cfs
│   │   ├── [        408]  _8s9.si
│   │   ├── [        437]  _8sk.cfe
│   │   ├── [     243476]  _8sk.cfs
│   │   ├── [        408]  _8sk.si
│   │   ├── [       1678]  segments_6
│   │   └── [          0]  write.lock
│   ├── [       4096]  _state
│   │   ├── [        186]  retention-leases-2865.st
│   │   └── [        125]  state-0.st
│   └── [       4096]  translog
│       ├── [         55]  translog-29.tlog
│       └── [         88]  translog.ckp
└── [       4096]  _state
    └── [       1230]  state-2.st

5 directories, 118 files
 
With the file listing in hand, look at the segment information next:
GET /logstash-2023.05.30/_segments
// For readability, only the segments of shard 0 are shown
{
  "_shards": {"total": 7, "successful": 7, "failed": 0},
  "indices": {
    "logstash-2023.05.30": {
      "shards": {
        "0": [
          {
            "routing": {"state": "STARTED", "primary": true, "node": "4hEWcF8hRFWTEkQxlKQmqg"},
            "num_committed_segments": 17,
            "num_search_segments": 17,
            "segments": {
              "_17f": {
                "generation": 1563, "num_docs": 210331, "deleted_docs": 0, "size_in_bytes": 59203502, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": false,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_3uv": {
                "generation": 4999, "num_docs": 278411, "deleted_docs": 0, "size_in_bytes": 78098502, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": false,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_5bg": {
                "generation": 6892, "num_docs": 132645, "deleted_docs": 0, "size_in_bytes": 38291972, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": false,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_60h": {
                "generation": 7793, "num_docs": 199809, "deleted_docs": 0, "size_in_bytes": 57599273, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": false,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_7jq": {
                "generation": 9782, "num_docs": 520420, "deleted_docs": 0, "size_in_bytes": 145654675, "memory_in_bytes": 5204,
                "committed": true, "search": true, "version": "8.8.2", "compound": false,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_82w": {
                "generation": 10472, "num_docs": 15416, "deleted_docs": 0, "size_in_bytes": 4490224, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_87w": {
                "generation": 10652, "num_docs": 16837, "deleted_docs": 0, "size_in_bytes": 4933481, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8ao": {
                "generation": 10752, "num_docs": 48855, "deleted_docs": 0, "size_in_bytes": 13906162, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8ls": {
                "generation": 11152, "num_docs": 70903, "deleted_docs": 0, "size_in_bytes": 20181892, "memory_in_bytes": 5140,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8nq": {
                "generation": 11222, "num_docs": 3954, "deleted_docs": 0, "size_in_bytes": 1235557, "memory_in_bytes": 6924,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8oa": {
                "generation": 11242, "num_docs": 2785, "deleted_docs": 0, "size_in_bytes": 873643, "memory_in_bytes": 6820,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8pp": {
                "generation": 11293, "num_docs": 5194, "deleted_docs": 0, "size_in_bytes": 1594522, "memory_in_bytes": 7060,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8r5": {
                "generation": 11345, "num_docs": 2936, "deleted_docs": 0, "size_in_bytes": 914853, "memory_in_bytes": 6748,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8rf": {
                "generation": 11355, "num_docs": 2920, "deleted_docs": 0, "size_in_bytes": 941318, "memory_in_bytes": 6836,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8rz": {
                "generation": 11375, "num_docs": 4304, "deleted_docs": 0, "size_in_bytes": 1316157, "memory_in_bytes": 6820,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8s9": {
                "generation": 11385, "num_docs": 3647, "deleted_docs": 0, "size_in_bytes": 1122537, "memory_in_bytes": 6892,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              },
              "_8sk": {
                "generation": 11396, "num_docs": 657, "deleted_docs": 0, "size_in_bytes": 244321, "memory_in_bytes": 7620,
                "committed": true, "search": true, "version": "8.8.2", "compound": true,
                "attributes": {"Lucene87StoredFieldsFormat.mode": "BEST_COMPRESSION"}
              }
            }
          }
        ]
      }
    }
  }
}
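Comparing the segments with the files in the shard directory shows a one-to-one correspondence: each of the 17 segment names (_17f through _8sk) is the prefix of that segment's files on disk.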
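
The correspondence can be checked mechanically. The sketch below groups file names by their segment prefix and also decodes a segment name into its generation: in the data above the name after the underscore is the generation written in base 36 (for example `_17f` and generation 1563). The helper names `segment_generation` and `group_by_segment` are hypothetical, and the file list is a small excerpt of the listing above.

```python
from collections import defaultdict


def segment_generation(name):
    """Decode a segment name such as "_17f" into its generation (base 36)."""
    return int(name.lstrip("_"), 36)


def group_by_segment(filenames):
    """Group Lucene file names by segment prefix, skipping segments_N and write.lock."""
    groups = defaultdict(list)
    for filename in filenames:
        if not filename.startswith("_"):
            continue  # not a per-segment file
        stem = filename.split(".", 1)[0]           # "_17f_Lucene84_0" or "_17f"
        segment = "_" + stem[1:].split("_", 1)[0]  # keep only the segment name
        groups[segment].append(filename)
    return dict(groups)


files = ["_17f.fdt", "_17f_Lucene84_0.tim", "_17f_Lucene84_0.tip", "_17f.si",
         "_8sk.cfe", "_8sk.cfs", "_8sk.si", "segments_6", "write.lock"]

for segment, members in group_by_segment(files).items():
    print(segment, segment_generation(segment), members)
```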
Check the version of ES and of the Lucene it uses:
GET /
{
  "name" : "10.138.204.193-node1",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "XWDyVuo6TgK4yUp2XWD3lw",
  "version" : {
    "number" : "7.13.4",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "c5f60e894ca0c61cdbae4f5a686d9f08bcefc942",
    "build_date" : "2021-07-14T18:33:36.673943207Z",
    "build_snapshot" : false,
    "lucene_version" : "8.8.2",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
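So what do the various file extensions in the shard directory actually mean? The file-extension table in the Lucene documentation answers that.

Source of the table: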
https://lucene.apache.org/core/8_8_2/core/org/apache/lucene/codecs/lucene87/package-summary.html#package.description
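According to the table, the file extensions related to the FST are .tip and .tim: .tim is the term dictionary, and .tip is the term index into it, which is where the FST itself is stored. Both files carry a segment-name prefix and exist once per segment (for compound segments they are packed inside the .cfs file), so FST files are created at segment granularity.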
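
To see this per-segment granularity in a directory listing, one can map each segment to its .tip file and note which segments are compound. This is a rough sketch: `fst_locations` and `SEGMENT_FILE` are hypothetical names, and the regular expression only assumes the naming pattern visible in the listing above (`_<segment>[_<suffix>].<ext>`).

```python
import re

# "_17f_Lucene84_0.tip" -> segment "_17f", extension "tip"; "_82w.cfs" -> "_82w", "cfs"
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)(?:_[^.]+)?\.(\w+)$")


def fst_locations(filenames):
    """Return ({segment: .tip file}, {compound segments}) for a list of file names."""
    tip_files, compound = {}, set()
    for filename in filenames:
        match = SEGMENT_FILE.match(filename)
        if not match:
            continue  # segments_N, write.lock, ...
        segment, ext = match.groups()
        if ext == "tip":
            tip_files[segment] = filename  # standalone term index file
        elif ext == "cfs":
            compound.add(segment)          # term index is packed inside the .cfs
    return tip_files, compound


listing = ["_17f_Lucene84_0.tim", "_17f_Lucene84_0.tip", "_17f.si",
           "_82w.cfe", "_82w.cfs", "_82w.si", "segments_6", "write.lock"]

tip_files, compound = fst_locations(listing)
print(tip_files)
print(compound)
```

Run against the full listing, this should report one .tip file for each of the five non-compound segments and list the remaining twelve as compound.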
