【Posted】: 2021-03-22 02:38:27
【Question】:
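My use case is that I write entries to files throughout the day. I can either compress each entry as it is written, or compress the whole file after the fact. These files can get quite large (about 10 GB uncompressed), and I write to several files at the same time. One other consideration is that I can split the files at a finer granularity, which would ease the disk-space buffer problem of per-file compression. There may be no clear right or wrong answer here; I am just checking whether there are other factors I should consider.
Once compressed, the files are uploaded to some storage medium for archival and possible later analysis.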
Compress per row
| Pros | Cons |
|---|---|
| More space-efficient while writing | More complicated to implement |
| More space-efficient when reading, since I can decompress at per-entry granularity | Uses more disk space overall than compressing an entire file |
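
The size trade-off in the table above can be checked with a small experiment. This is only a sketch: the log-line format and the choice of `zlib` are assumptions for illustration, not something taken from the question, and real entries may compress very differently.

```python
import zlib

def compress_entries(entries):
    """Compress every entry on its own, so each one can be read back alone."""
    return [zlib.compress(entry) for entry in entries]

def compress_whole(entries):
    """Compress all entries as a single newline-joined stream."""
    return zlib.compress(b"\n".join(entries))

# Hypothetical, repetitive log-style entries (assumed format).
entries = [
    f"2021-03-22T02:38:{i % 60:02d} INFO request id={i} status=200".encode()
    for i in range(1000)
]

blobs = compress_entries(entries)
per_entry_size = sum(len(blob) for blob in blobs)
whole_size = len(compress_whole(entries))

# Per-entry compression allows random access to a single entry...
single = zlib.decompress(blobs[5])

# ...but short inputs give the compressor little redundancy to exploit,
# so the total is larger than compressing everything at once.
print(f"per entry: {per_entry_size} bytes, whole stream: {whole_size} bytes")
```

Running something like this on a sample of the real entries is probably the quickest way to see how much disk space per-row compression would actually cost.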
Compress per file
| Pros | Cons |
|---|---|
| Better compression per file, since there is more data to compress at once | Requires a larger buffer of disk space to hold the day's writes before compressing |
| Simpler to implement: write to the file normally and compress it afterwards with simple Linux tools | |
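
For the per-file route, compressing after the fact does not require loading a 10 GB file into memory. A minimal sketch, assuming gzip as the format and a hypothetical helper named `compress_file` (the question itself only mentions "simple Linux tools"):

```python
import gzip
import shutil
import tempfile
from pathlib import Path

def compress_file(src: Path, chunk_size: int = 1024 * 1024) -> Path:
    """Stream src into src.gz in fixed-size chunks and return the new path."""
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
    return dst

# Small demonstration on a temporary file (stand-in for a day's log file).
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "entries.log"
    original = b"entry line\n" * 10_000
    log_path.write_bytes(original)

    gz_path = compress_file(log_path)
    compressed_size = gz_path.stat().st_size

    # Read the archive back to confirm the round trip is lossless.
    with gzip.open(gz_path, "rb") as f:
        restored = f.read()
```

Note that the uncompressed file and the `.gz` file exist side by side until the original is deleted, which is the disk-space buffer listed under Cons; splitting the output into smaller files reduces that peak.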
【Discussion】:
Tags: performance file architecture compression