Sharding data across Service Fabric partitions

Service Fabric gives you two mechanisms out of the box for resolving which partition you hit when calling a Reliable Service. We'll ignore singleton partitions, as they won't help us with sharding.

  • Named Partition - This is a fixed name for each partition configured at deploy time.
  • Ranged Partition - This uses an Int64 range to decide which partition a numbered key falls in.

More information can be found in Microsoft's Service Fabric partitioning documentation.

Named Partitioning

A named partition allows you to specify explicitly which partition you want to access at runtime. A common example is to specify A-Z named partitions and use the first letter of your data as the key. This splits your data into 26 partitions.

<Service Name="TestService">
<StatefulService ServiceTypeName="TestServiceType"
TargetReplicaSetSize="3"
MinReplicaSetSize="2">
<NamedPartition>
<Partition Name="a"/>
<Partition Name="b"/>
...
<Partition Name="z"/>
</NamedPartition>
</StatefulService>
</Service>
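
For illustration, resolving a client for one of these partitions might look like the sketch below. This assumes a string-key ServiceProxy overload analogous to the Int64 one used later in this post (newer SDK versions wrap the key in a ServicePartitionKey instead):

// Hypothetical sketch: route an email address to the partition named
// after its first letter.
var partitionName = email.Substring(0, 1).ToLowerInvariant();
var client = ServiceProxy.Create<ITestService>(
    partitionName,
    new Uri("fabric:/App/TestService"));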

The advantage of this approach is that it is simple and you know which partition your data belongs in without a lookup. Unfortunately, as we will test later, you are unlikely to get an even distribution of your data across the partitions.

Ranged Partitioning

With a ranged partition, Service Fabric by default uses the entire Int64 range as the key space. It divides that space into evenly sized ranges, or buckets, one per partition, based on the partition count.

<Service Name="TestService">
<StatefulService ServiceTypeName="TestServiceType"
TargetReplicaSetSize="3"
MinReplicaSetSize="2">
<UniformInt64Partition PartitionCount="26"
LowKey="-9223372036854775808"
HighKey="9223372036854775807" />
</StatefulService>
</Service>
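
To make the bucketing concrete, here is a rough sketch of how a key might map to a partition index. Service Fabric does this internally, and its exact rounding may differ; the GetBucket helper is mine, not part of the SDK:

// Illustrative only: map a signed Int64 key to one of partitionCount
// evenly sized buckets covering the full Int64 range.
static int GetBucket(long key, int partitionCount)
{
    // Shift into unsigned space so the range is contiguous:
    // long.MinValue maps to 0, long.MaxValue to ulong.MaxValue.
    var shifted = unchecked((ulong)key + 0x8000000000000000UL);
    var bucketSize = ulong.MaxValue / (ulong)partitionCount + 1;
    return (int)(shifted / bucketSize);
}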

However, to be able to look up a partition we need a function that reduces our data to an integer value. To use the configuration above, we can hash our strings into an Int64.

var md5 = MD5.Create();
// Hash the string, then take the first 8 bytes of the digest as an Int64.
var hash = md5.ComputeHash(Encoding.ASCII.GetBytes(value));
var key = BitConverter.ToInt64(hash, 0);

var client = ServiceProxy.Create<ITestService>(
    key,
    new Uri("fabric:/App/TestService"));
  1. Hash the value to a fixed length byte array.
  2. Convert the array to an Int64.
  3. Create the client with the calculated key to connect to the service on that partition.

Ranged Partition with Consistent Hashing

Rather than relying on the ranges, you can fix your keys to the partition numbers themselves and plug in your own hash algorithm to resolve the partition.

<Service Name="TestService">
<StatefulService ServiceTypeName="TestServiceType"
TargetReplicaSetSize="3"
MinReplicaSetSize="2">
<UniformInt64Partition PartitionCount="26"
LowKey="0"
HighKey="25" />
</StatefulService>
</Service>

We now have a key range limited to 0-25 rather than the entire Int64 range. We can resolve a client connected to a partition in the same way; however, this time we need to compute a key that fits in this smaller range. I'm using the jump consistent hash implementation in hydra.

// Compute a shard in the 0-25 range, then connect to that partition.
var shard = new JumpSharding().GetShard(value, 26);
var client = ServiceProxy.Create<ITestService>(
    shard,
    new Uri("fabric:/App/TestService"));
  1. Call GetShard with the value and the number of partitions to distribute across.
  2. Create the client with the calculated key to connect to the service on that partition.
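
For reference, jump consistent hash (Lamping & Veach, 2014) is only a few lines. The sketch below is a generic version rather than hydra's exact code, and it assumes an MD5-based conversion from string to ulong seed:

using System;
using System.Security.Cryptography;
using System.Text;

// Maps a ulong key to a shard in [0, shardCount). Growing shardCount
// only moves roughly 1/n of the keys to new shards.
static int JumpHash(ulong key, int shardCount)
{
    long b = -1, j = 0;
    while (j < shardCount)
    {
        b = j;
        key = key * 2862933555777941757UL + 1;
        j = (long)((b + 1) * ((1L << 31) / (double)((key >> 33) + 1)));
    }
    return (int)b;
}

// Turn a string into a ulong seed (assumed; hydra may hash differently).
static ulong Seed(string value)
{
    using (var md5 = MD5.Create())
        return BitConverter.ToUInt64(md5.ComputeHash(Encoding.ASCII.GetBytes(value)), 0);
}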

Distribution

To benchmark the distribution we have a list of around 17,000 real email addresses. This should give us an idea of how each sharding strategy distributes the data across 26 partitions. Another advantage of the two Int64-based methods is that they work with any number of partitions.

We are looking for an even number of accounts allocated to each partition.
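
A sketch of how such a benchmark can run is below; the emails list is a stand-in, and the GetBucket and Seed helpers are the hypothetical ones sketched earlier, not the exact code behind these numbers:

using System;
using System.Collections.Generic;

// Count how many addresses land in each of the 26 partitions for a
// given sharding strategy, then print the per-partition totals.
static void PrintDistribution(IEnumerable<string> emails, Func<string, int> shard)
{
    var counts = new int[26];
    foreach (var email in emails)
        counts[shard(email)]++;

    for (var i = 0; i < counts.Length; i++)
        Console.WriteLine($"{i}: {counts[i]}");
}

// Alphabet: first letter of the address (assumes addresses start a-z).
PrintDistribution(emails, e => char.ToLowerInvariant(e[0]) - 'a');

// Jump consistent hash via hydra.
PrintDistribution(emails, e => new JumpSharding().GetShard(e, 26));

// Int64 ranging, reusing the GetBucket and Seed sketches from earlier.
PrintDistribution(emails, e => GetBucket(unchecked((long)Seed(e)), 26));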



Partition | Alphabet | Consistent Hash | Ranging
----------|----------|-----------------|--------
0         | 1569     | 684             | 650
1         | 912      | 682             | 730
2         | 1027     | 647             | 646
3         | 1175     | 662             | 701
4         | 513      | 687             | 700
5         | 415      | 665             | 658
6         | 581      | 653             | 684
7         | 466      | 693             | 637
8         | 405      | 657             | 690
9         | 1714     | 681             | 699
10        | 643      | 654             | 669
11        | 608      | 696             | 681
12        | 1800     | 734             | 665
13        | 526      | 717             | 647
14        | 213      | 693             | 613
15        | 793      | 693             | 676
16        | 31       | 654             | 683
17        | 1039     | 681             | 713
18        | 1562     | 661             | 665
19        | 803      | 708             | 747
20        | 46       | 653             | 709
21        | 268      | 693             | 666
22        | 301      | 678             | 679
23        | 55       | 702             | 675
24        | 134      | 670             | 708
25        | 136      | 737             | 744

We can see from these results that sharding on the first character of an email produces wildly different partition sizes, which is not what we want! Both the jump hash and integer ranging methods produced very even partition sizes.

Conclusion

Based on these results I would use the ranged partitioning method: it provides good balancing and is fast to compute. An additional advantage is that you do not need to know the partition count in the code; just map your data to an Int64 and Service Fabric will do the rest.

Posted in: development
Tagged with: azure, .NET, C#, servicefabric

