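Amazon Kinesis is a service for processing streaming data in near real time. Its two main offerings are Kinesis Data Streams and Kinesis Video Streams.
- A Kinesis data stream is made up of 1..n shards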
- producer -> data record -> stream -> shard -> consumer
- producer, e.g. applications, client, SDK, Kinesis Producer Library (KPL), Kinesis Agent, etc.
- consumer, e.g. apps (SDK, KCL), Lambda, Kinesis Data Firehose, Kinesis Data Analytics, etc.
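- Data records on both the write and the read side contain a partition key and a data blob. Records read by a consumer also carry a sequence number (assigned at the shard level)
- Records that share a partition key are routed to the same shard (ordering)
- e.g. with user_id as the partition key, all of a user's records go to the same shard and keep their order
- Retention ranges from 1 to 365 days; once data has been written to a Kinesis data stream, it cannot be deleted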
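
The partition-key routing described above can be sketched in a few lines. Kinesis hashes the partition key with MD5 into a 128-bit integer and picks the shard whose hash-key range contains that value. The sketch below assumes the hash-key space is split evenly across shards (which stops being true after a shard split or merge); `shard_for_key` and the sample keys are hypothetical names used only for illustration.

```python
import hashlib

MAX_HASH = 2**128  # MD5 produces a 128-bit hash key


def shard_for_key(partition_key: str, shard_count: int) -> int:
    """Return the index of the shard whose hash-key range contains the key.

    Assumption: the 128-bit hash-key space is divided evenly across shards.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    range_size = MAX_HASH // shard_count
    # Clamp so the few values left over by integer division land in the last shard.
    return min(hash_key // range_size, shard_count - 1)


# The same key always maps to the same shard, which is what preserves ordering.
for user_id in ["user_1", "user_2", "user_1"]:
    print(user_id, "-> shard", shard_for_key(user_id, 4))
```

Because the mapping depends only on the key, a very hot key (for example a single very active user) cannot be spread over several shards; choosing a key with many distinct values helps keep shards evenly loaded.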
- Capacity Modes
- provisioned mode
- The number of shards must be adjusted manually
- Each shard can ingest up to 1 MB/sec or 1,000 records/sec
- Each shard can be read at one of two rates, depending on the consumer type
- 2 MB/sec (shared) across all consumers
- 2 MB/sec (enhanced) per consumer
- Billed per shard per hour
- on-demand mode
- Default provisioned write capacity for the stream is 4 MB/sec or 4,000 records/sec
- Scales the number of shards automatically, based on the peak throughput observed over the previous 30 days
- Billed per stream per hour, plus data in/out per GB
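
For provisioned mode, the per-shard limits listed above can be turned into a rough shard-count estimate. This is a minimal sketch, not an official sizing formula: it takes the largest of the three constraints (write MB/sec, write records/sec, shared read MB/sec). The function name `shards_needed` and its parameters are hypothetical, and the read figure assumes shared consumers; with enhanced fan-out each consumer gets its own 2 MB/sec per shard, so the read constraint applies per consumer.

```python
import math

# Per-shard limits in provisioned mode, as quoted in the notes.
WRITE_MB_PER_SHARD = 1
WRITE_RECORDS_PER_SHARD = 1000
READ_MB_PER_SHARD = 2  # shared by all non-enhanced consumers


def shards_needed(write_mb_per_sec: float,
                  records_per_sec: float,
                  read_mb_per_sec: float) -> int:
    """Estimate the minimum shard count that satisfies every per-shard limit."""
    return max(
        1,  # a stream always has at least one shard
        math.ceil(write_mb_per_sec / WRITE_MB_PER_SHARD),
        math.ceil(records_per_sec / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_per_sec / READ_MB_PER_SHARD),
    )


# Example workload: 5 MB/sec written, 3,000 records/sec, 8 MB/sec read in total.
print(shards_needed(5, 3000, 8))
```

In this example the write bandwidth is the binding constraint; a workload with many small records would instead be limited by the records/sec figure.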
- Kinesis Data Analytics: analyzes streaming data and writes the results to sinks (Kinesis Data Streams or Kinesis Data Firehose)
- for SQL applications
- for Apache Flink (aka Amazon Managed Service for Apache Flink)
Reference
- https://www.udemy.com/course/aws-certified-solutions-architect-associate-saa-c03