How to optimize Redis cache space

Scenario setup
1. We need to store a POJO in the cache. The class is defined as follows:
import java.io.Serializable;
import java.math.BigDecimal;
import java.util.Date;

public class TestPojo implements Serializable {
    private String testStatus;
    private String userPin;
    private String investor;
    private Date testQueryTime;
    private Date createTime;
    private String bizInfo;
    private Date otherTime;
    private BigDecimal userAmount;
    private BigDecimal userRate;
    private BigDecimal applyAmount;
    private String type;
    private String checkTime;
    private String preTestStatus;

    // Getters and setters omitted.

    public Object[] toValueArray() {
        Object[] array = {testStatus, userPin, investor, testQueryTime, createTime, bizInfo,
                otherTime, userAmount, userRate, applyAmount, type, checkTime, preTestStatus};
        return array;
    }

    public TestPojo fromValueArray(Object[] valueArray) {
        // The concrete data types are lost in the array form and must be restored here.
        throw new UnsupportedOperationException("type restoration is not shown in the original");
    }
}
2. Use the following instance as test data:
TestPojo pojo = new TestPojo();
pojo.setApplyAmount(new BigDecimal("200.11"));
pojo.setBizInfo("xx");
pojo.setUserAmount(new BigDecimal("1000.00"));
pojo.setTestStatus("success");
pojo.setCheckTime("2023-02-02");
pojo.setInvestor("abcd");
pojo.setUserRate(new BigDecimal("0.002"));
pojo.setTestQueryTime(new Date());
pojo.setOtherTime(new Date());
pojo.setPreTestStatus("processing");
pojo.setUserPin("abcdefghij");
pojo.setType("y");
// createTime is deliberately left unset, so it stays null.
Conventional approach
System.out.println(JSON.toJSONString(pojo).length());
Serializing directly to JSON prints length=284. This is the simplest and also the most common approach. The data looks like this:
{"applyAmount":200.11,"bizInfo":"xx","checkTime":"2023-02-02","investor":"abcd","otherTime":"2023-04-10 17:45:17.717","preTestStatus":"processing","testQueryTime":"2023-04-10 17:45:17.717","testStatus":"success","type":"y","userAmount":1000.00,"userPin":"abcdefghij","userRate":0.002}
Notice that this contains a lot of useless data: the property names do not need to be stored at all.
Improvement 1: drop the property names
System.out.println(JSON.toJSONString(pojo.toValueArray()).length());
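Using an array structure instead of an object structure removes the property names and prints length=144, cutting the data size by about 50%. The data looks like this: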
["success","abcdefghij","abcd","2023-04-10 17:45:17.717",null,"xx","2023-04-10 17:45:17.717",1000.00,0.002,200.11,"y","2023-02-02","processing"]
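The step from 284 to 144 characters can be checked by hand. In the object form, every non-null property pays for its name plus two quotes and a colon; the array form drops all of that but has to spell out the one null explicitly. The sketch below only redoes that arithmetic. The class name KeyOverhead and its helper are illustrative and not part of the original code.

```java
import java.util.List;

final class KeyOverhead {
    // Characters spent on property names in a JSON object:
    // each name costs its own length plus two quotes and one colon.
    static int keyOverhead(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length() + 3;
        }
        return total;
    }

    public static void main(String[] args) {
        // The 12 non-null properties of the sample object (createTime is null and left out).
        List<String> names = List.of(
                "applyAmount", "bizInfo", "checkTime", "investor", "otherTime",
                "preTestStatus", "testQueryTime", "testStatus", "type",
                "userAmount", "userPin", "userRate");
        int overhead = keyOverhead(names);
        System.out.println("characters spent on names: " + overhead); // 145
        // What remains is values and punctuation; the array form adds "null," (5 characters).
        System.out.println("array form estimate: " + (284 - overhead + 5)); // 144
    }
}
```

More than half of the JSON object is therefore spent on names that are identical for every cached record.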
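Notice that null does not need to be stored, and that the timestamps are serialized as strings. These poor serialization choices inflate the data, so we should pick a better serialization tool.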
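Improvement 2: use a better serialization tool
// We keep a JSON-like data model, but serialize it with a third-party binary tool (MessagePack)
System.out.println(new ObjectMapper(new MessagePackFactory()).writeValueAsBytes(pojo.toValueArray()).length);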
A better serialization tool compresses the fields and uses sensible data formats. It prints length=92, roughly 36% smaller than the previous step.
This is binary data, so Redis has to be accessed in binary mode. Converted to a string, it prints as follows:
��success�abcdefghij�abcd��j�6��xx��j�6��?`bm��@i��q�y�2023-02-02�processing
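The replacement characters in that printout hint at why the value must stay a byte array: bytes that are not valid text are changed by the conversion and cannot be recovered. The sketch below demonstrates this with the first bytes of a MessagePack array (0x9D is the header of a 13-element array, 0xA7 the header of a 7-byte string). The class name BinarySafety is illustrative, and the choice of UTF-8 as the text encoding is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BinarySafety {
    // Decodes the bytes as UTF-8 text and encodes the text again.
    static byte[] roundTripThroughString(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Array header, string header, then the ASCII text "success".
        byte[] raw = {(byte) 0x9D, (byte) 0xA7, 's', 'u', 'c', 'c', 'e', 's', 's'};
        byte[] viaString = roundTripThroughString(raw);

        // The two header bytes are not valid UTF-8, so they come back altered.
        System.out.println("unchanged after round trip: " + Arrays.equals(raw, viaString));
        System.out.println(raw.length + " bytes in, " + viaString.length + " bytes out");
    }
}
```

In practice this means using the byte-oriented commands of whichever Redis client is in use, rather than its String-based ones.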
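Following this line of thought further, we find that choosing the data types by hand gives an even more aggressive optimization: smaller data types yield further savings.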
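Improvement 3: optimize the data types
In the example above, the three fields testStatus, preTestStatus and investor are really enumerated strings. Replacing String with a simpler data type (such as byte or int) saves more space. Likewise, checkTime can be represented as a long instead of a string, so that the serializer emits fewer bytes.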
// toInt and toLong are conversion helpers whose implementation the original does not show.
public Object[] toValueArray() {
    Object[] array = {toInt(testStatus), userPin, toInt(investor), testQueryTime, createTime,
            bizInfo, otherTime, userAmount, userRate, applyAmount, type, toLong(checkTime),
            toInt(preTestStatus)};
    return array;
}
After this manual adjustment, with smaller data types in place of String, it prints length=69.
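The article does not show the conversion helpers, so here is one possible sketch of them. Everything in it is an assumption made for illustration: the class name SmallerTypes, the status table (including the value "failed"), and the choice to store checkTime as days since 1970-01-01. Any stable mapping would do, as long as readers and writers agree on it.

```java
import java.time.LocalDate;
import java.util.List;

final class SmallerTypes {
    // Hypothetical code table: the position in this list is the stored code.
    // Only append new values at the end, or already cached data will decode wrongly.
    private static final List<String> STATUS_CODES = List.of("processing", "success", "failed");

    static Integer toInt(String status) {
        if (status == null) {
            return null;
        }
        int code = STATUS_CODES.indexOf(status);
        if (code < 0) {
            throw new IllegalArgumentException("unknown status: " + status);
        }
        return code;
    }

    static String fromInt(Integer code) {
        return code == null ? null : STATUS_CODES.get(code);
    }

    // Stores an ISO date (yyyy-MM-dd) as the number of days since 1970-01-01.
    static Long toLong(String isoDate) {
        return isoDate == null ? null : LocalDate.parse(isoDate).toEpochDay();
    }

    static String fromLong(Long epochDay) {
        return epochDay == null ? null : LocalDate.ofEpochDay(epochDay).toString();
    }

    public static void main(String[] args) {
        System.out.println(toInt("success"));      // 1
        System.out.println(toLong("2023-02-02"));  // 19390
        System.out.println(fromLong(19390L));      // 2023-02-02
    }
}
```

Small integers like these fit into one to three bytes in MessagePack, whereas the original strings cost one byte per character plus a header.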
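Improvement 4: consider zip compression
Beyond the points above, zip compression can shrink the data further. It works well when the content is large or highly repetitive; if the stored value is an array of TestPojo, zip compression may be a good fit.
For content smaller than 30 bytes, zip compression may increase the size rather than reduce it. When there is little repetition it brings no noticeable gain, and it costs CPU time.
After the optimizations above, zip compression is no longer a must. Its effect can only be judged by testing against real data.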
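A quick way to run such a test is to compress representative payloads and compare sizes. The sketch below uses GZIP from java.util.zip as a stand-in for "zip" compression; the class name GzipCheck and the sample payloads are illustrative, and real measurements should use the actual cached bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipCheck {
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzip.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A tiny payload: the fixed GZIP header and trailer outweigh any savings.
        byte[] small = "abcdefghij".getBytes(StandardCharsets.UTF_8);
        // A highly repetitive payload, similar to an array of near-identical records.
        byte[] repetitive = "success,abcdefghij,abcd;".repeat(200).getBytes(StandardCharsets.UTF_8);

        System.out.println(small.length + " -> " + compress(small).length);
        System.out.println(repetitive.length + " -> " + compress(repetitive).length);
    }
}
```

The small payload comes out larger than it went in, while the repetitive one shrinks to a fraction of its size, which matches the caveats above.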
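Final implementation
The improvement steps above illustrate the approach, but deserialization loses type information and is tedious to handle, so the deserialization side needs attention as well.
When the cached objects are predefined, every field can be handled by hand. In practice, manual serialization is therefore recommended: it gives fine-grained control, the best compression and the lowest performance overhead.
The following msgpack implementation can serve as a reference. It is test code; please wrap it in better packer and unpacker utilities of your own: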
<dependency>
    <groupId>org.msgpack</groupId>
    <artifactId>msgpack-core</artifactId>
    <version>0.9.3</version>
</dependency>
// Methods of TestPojo. Required imports:
// import org.msgpack.core.MessageBufferPacker;
// import org.msgpack.core.MessagePack;
// import org.msgpack.core.MessageUnpacker;

public byte[] toByteArray() throws Exception {
    MessageBufferPacker packer = MessagePack.newDefaultBufferPacker();
    toByteArray(packer);
    packer.close();
    return packer.toByteArray();
}

// Writes every field in a fixed order; a null field is written as nil.
public void toByteArray(MessageBufferPacker packer) throws Exception {
    if (testStatus == null) { packer.packNil(); } else { packer.packString(testStatus); }
    if (userPin == null) { packer.packNil(); } else { packer.packString(userPin); }
    if (investor == null) { packer.packNil(); } else { packer.packString(investor); }
    // Dates are written as epoch milliseconds.
    if (testQueryTime == null) { packer.packNil(); } else { packer.packLong(testQueryTime.getTime()); }
    if (createTime == null) { packer.packNil(); } else { packer.packLong(createTime.getTime()); }
    if (bizInfo == null) { packer.packNil(); } else { packer.packString(bizInfo); }
    if (otherTime == null) { packer.packNil(); } else { packer.packLong(otherTime.getTime()); }
    // BigDecimal values are written as strings so that no precision is lost.
    if (userAmount == null) { packer.packNil(); } else { packer.packString(userAmount.toString()); }
    if (userRate == null) { packer.packNil(); } else { packer.packString(userRate.toString()); }
    if (applyAmount == null) { packer.packNil(); } else { packer.packString(applyAmount.toString()); }
    if (type == null) { packer.packNil(); } else { packer.packString(type); }
    if (checkTime == null) { packer.packNil(); } else { packer.packString(checkTime); }
    if (preTestStatus == null) { packer.packNil(); } else { packer.packString(preTestStatus); }
}

public void fromByteArray(byte[] byteArray) throws Exception {
    MessageUnpacker unpacker = MessagePack.newDefaultUnpacker(byteArray);
    fromByteArray(unpacker);
    unpacker.close();
}

// Reads the fields in exactly the order they were written; nil leaves the field untouched.
public void fromByteArray(MessageUnpacker unpacker) throws Exception {
    if (!unpacker.tryUnpackNil()) { this.setTestStatus(unpacker.unpackString()); }
    if (!unpacker.tryUnpackNil()) { this.setUserPin(unpacker.unpackString()); }
    if (!unpacker.tryUnpackNil()) { this.setInvestor(unpacker.unpackString()); }
    if (!unpacker.tryUnpackNil()) { this.setTestQueryTime(new Date(unpacker.unpackLong())); }
    if (!unpacker.tryUnpackNil()) { this.setCreateTime(new Date(unpacker.unpackLong())); }
    if (!unpacker.tryUnpackNil()) { this.setBizInfo(unpacker.unpackString()); }
    if (!unpacker.tryUnpackNil()) { this.setOtherTime(new Date(unpacker.unpackLong())); }
    if (!unpacker.tryUnpackNil()) { this.setUserAmount(new BigDecimal(unpacker.unpackString())); }
    if (!unpacker.tryUnpackNil()) { this.setUserRate(new BigDecimal(unpacker.unpackString())); }
    if (!unpacker.tryUnpackNil()) { this.setApplyAmount(new BigDecimal(unpacker.unpackString())); }
    if (!unpacker.tryUnpackNil()) { this.setType(unpacker.unpackString()); }
    if (!unpacker.tryUnpackNil()) { this.setCheckTime(unpacker.unpackString()); }
    if (!unpacker.tryUnpackNil()) { this.setPreTestStatus(unpacker.unpackString()); }
}
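The repeated null checks are the part most worth wrapping in helpers. The sketch below shows the shape such helpers could take. To stay free of third-party dependencies it uses java.io.DataOutputStream instead of msgpack, so the byte counts differ from the msgpack figures; only the pattern (a presence marker followed by the value, fields in a fixed order) carries over. The class name NullableCodec and all method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Date;

final class NullableCodec {
    // Writes a one-byte presence flag, then the value only if it is present.
    static void writeString(DataOutputStream out, String value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeUTF(value);
        }
    }

    static String readString(DataInputStream in) throws IOException {
        return in.readBoolean() ? in.readUTF() : null;
    }

    // Dates travel as epoch milliseconds, as in the msgpack code above.
    static void writeDate(DataOutputStream out, Date value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeLong(value.getTime());
        }
    }

    static Date readDate(DataInputStream in) throws IOException {
        return in.readBoolean() ? new Date(in.readLong()) : null;
    }

    // Encodes two sample fields in a fixed order; the reader must use the same order.
    static byte[] encode(String testStatus, Date createTime) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeString(out, testStatus);
            writeDate(out, createTime);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decodes the two sample fields back into {testStatus, createTime}.
    static Object[] decode(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String testStatus = readString(in);
            Date createTime = readDate(in);
            return new Object[] {testStatus, createTime};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = encode("success", null);
        Object[] fields = decode(bytes);
        System.out.println(bytes.length + " bytes, testStatus=" + fields[0] + ", createTime=" + fields[1]);
    }
}
```

With helpers of this kind, each field takes one line on the write side and one on the read side, and the types are restored explicitly instead of being guessed from the data.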
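Extending the scenario
Suppose we store data for 200 million users, each with 40 fields, where every field key is 6 bytes long and the fields are managed individually.
The natural choice is a hash, but a hash stores the field keys, which takes extra space. Since the field keys are unnecessary data, the same reasoning suggests replacing the hash with a list.
Measured with the official Redis tools, the list structure needs 144 GB while the hash structure needs 245 GB (when more than 50% of the attributes are empty, test whether this still holds).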
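Two things follow from this setup and can be sketched in a few lines. First, the field names alone account for a fixed number of raw bytes. Second, a list only works if every reader and writer agrees on a fixed position per field, which is also why empty fields still need a placeholder. The class name ListLayout, the field names and the empty-string placeholder below are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ListLayout {
    // Hypothetical 6-byte field names in their agreed list order.
    // Append new fields at the end only, so existing positions stay valid.
    static final List<String> FIELD_ORDER = List.of("field1", "field2", "field3");

    // Raw bytes spent on field names when every user stores every field in a hash.
    static long rawKeyBytes(long users, int fieldsPerUser, int keyLength) {
        return users * fieldsPerUser * keyLength;
    }

    // The list index that replaces the field name, e.g. for LINDEX or LSET.
    static int indexOf(String field) {
        int index = FIELD_ORDER.indexOf(field);
        if (index < 0) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        return index;
    }

    // Lays the values out by position; a missing field still occupies a slot.
    static List<String> toPositional(Map<String, String> fields) {
        List<String> values = new ArrayList<>();
        for (String name : FIELD_ORDER) {
            values.add(fields.getOrDefault(name, ""));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(rawKeyBytes(200_000_000L, 40, 6)); // 48000000000
        System.out.println(indexOf("field3"));                // 2
        System.out.println(toPositional(Map.of("field1", "a", "field3", "c"))); // [a, , c]
    }
}
```

The names alone come to 48 GB of raw bytes, a little under half of the measured 101 GB gap; the remainder presumably comes from per-entry overhead in the hash encoding. The placeholder slots are the reason the comparison should be repeated when most fields are empty.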
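In the cases above, a few very simple measures, amounting to only a few lines of code, reduce space usage by more than 70%. They are well worth adopting where data volumes are large and performance requirements are high:
• Use arrays instead of objects (if many fields are empty, pair this with a serialization tool that compresses null)
• Use a better serialization tool
• Use smaller data types
• Consider zip compression
• Use a list instead of a hash (if many fields are empty, test and compare first)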
