排行榜-排序 - 阿毛's笔记

以下是lua版本二分插入排序

-- lua实现
function binary_insert(list, value)
    local left = 1
    local right = #list
    local mid = -1
    while left <= right do
        mid = math.floor((left + right)/2)
        if list[i] == value then
            left = mid
            right = mid
            break
        elseif list[i] < value then
            left = mid + 1
        elseif list[i] > value then
            right = mid - 1
        end
    end
    table.insert(list, left, value)
    return left

按本人所经历的多个游戏项目经验，二分法插入足够支撑最大长度在几万的实时单区榜。有些网上开发者则说能支撑到10w+，本人没实测过，暂不敢相信。二分法插入的性能问题，在于table.insert(list, left, value)要后移left之后的节点，以下是lua-5.1.4/src/ltablib.c中table.insert的代码，case 3:将后部分的元素逐个后移

-- lua-5.1.4/src/ltablib.c
static int tinsert (lua_State *L) {
  int e = aux_getn(L, 1) + 1;  /* first empty element */
  int pos;  /* where to insert new element */
  switch (lua_gettop(L)) {
    case 2: {  /* called with only 2 arguments */
      pos = e;  /* insert new element at the end */
      break;
    }
    case 3: {
      int i;
      pos = luaL_checkint(L, 2);  /* 2nd argument is the position */
      if (pos > e) e = pos;  /* grow array if necessary */
      for (i = e; i > pos; i--) {  /* move up elements */
        lua_rawgeti(L, 1, i-1);
        lua_rawseti(L, 1, i);  /* t[i] = t[i-1] */
      }
      break;
    }
    default: {
      return luaL_error(L, "wrong number of arguments to " LUA_QL("insert"));
    }
  }
  luaL_setn(L, 1, e);  /* new size */
  lua_rawseti(L, 1, pos);  /* t[pos] = v */
  return 0;
}

不过以上的二分插入代码的性还可以再改进一下，

lua数组对应C层对象是TValue *array;，使用realloc分配连续内存，所以理论能通过取巧手段，即在C层用memcpy的方式将内存整体后移，导出接口供Lua层使用。考虑操作内存有些危险，所以不是主要考虑的解决方法。
当要刷新自己的榜内分数时，只需要前面部分的节点交换位置（类似插入排序）。从运营角度看刷榜的角色集中在头部的活跃玩家，加上刷新排名的幅度会非常小，所以实际情况下需要移动的节点数量很少。

但考虑到几倍于单区榜数量的跨服排行榜，就得需要更稳定的数据结构，一种可行解决办法就是网上常常谈及的skiplist结构，redis zset内部就是用到了skiplist，据官方介绍redis zset的查询/插入效率均为O(logN)，实际还是是链式结构，所以插入元素不需要移动前/后的节点。zset的插入/更新指令是ZADD key score member，假设在游戏中有关卡榜DungeonRank，id等于1234的角色刷新到第99关卡，

ZADD DungeonRank 99 "1234"

获取id等于1234的排名，则使用另一个指令ZRANK key member

ZRANK DungeonRank "1234"

然而排行榜很多情况下，需要多条件复合排序，比如当多个角色刷新到第99关卡时，通关消耗时间少排在前面。按ZADD文档上的介绍，当64bit浮点数（有效精度52bit）score数值相同时，默认将以member字符串编码排序，但member需要来存储角色id

Elements with the same scoreWhile the same element can’t be repeated in a sorted set since every element is unique, it is possible to add multiple different elements having the same score. When multiple elements have the same score, they are ordered lexicographically (they are still ordered by score as a first key, however, locally, all the elements with the same score are relatively ordered lexicographically).The lexicographic ordering used is binary, it compares strings as array of bytes.If the user inserts all the elements in a sorted set with the same score (for example 0), all the elements of the sorted set are sorted lexicographically, and range queries on elements are possible using the command ZRANGEBYLEX (Note: it is also possible to query sorted sets by range of scores using ZRANGEBYSCORE).

本人想了两个解决办法

将redis的zset代码独立一份出来，修改zset的排序判断逻辑，让调用端可以自定义【排序比较函数】。 这是非常好的通用接口设计，而且大多数开发者都习惯这么使用。缺点就对zset的内容数据结构了解清楚，否则容易改错。在写本文时，在github搜了一下，发现有不少类似的项目
将所有复合条件映射成有相对关系的数值。 这种做法要求调用层预先规划好，条件信息的bit偏移位置及长度。假设游戏中关卡的最大长度不超过1000，即最多占用空间10个bit（2^10=1024），单场通关时间不得超出10分钟，即最多占用空间4个bit（2^4=16）。关卡的排序优先级高于通关时间，即将关卡10个bit比通关时间4个bit，放在更高的字节偏移上。不过有可能出现64bit（有效精度52位） score不够用的问题，大多数排行榜经常会用到64bit的UTC时间戳作为排序条件，即相同条件下先提交的分数排在前面。仅一个UTC时间戳就已经占满整个score值的内存空间，没有额外的空间存储其它条件信息。回到本段落的句首我写到的“映射成有相对关系的数值”，实际是没有必要存储UTC时间戳这个绝对值，而是寻找“相对关系”。我们往排行榜业务的需求思考，排行榜是不是有重置周期日期，能不能存储相对于重置日期的时间戳？或者相对开服日期的时间戳？这时会考虑到游戏产品的运营生命期，8年需要占用28bit，17年需要占用29bit, 136需要占用32bit。

原文:
https://lizijie.github.io/2021/10/23/%E6%8E%92%E8%A1%8C%E6%A6%9C-%E6%8E%92%E5%BA%8F.html
作者github:
https://github.com/lizijie