The Science and Mathematics of Blockchain Cuties Breeding

By Daniel Goldman | The B.C.U. Times | 25 Jun 2019

Genetics

Cutie genetics are fairly different from regular genetics. The BC wiki already covers most of the basics needed for this discussion. The main point to cover here is that a gene refers to a specific trait, and alleles are specific versions of a gene. In human biology, for instance, there is a gene (and this is an oversimplification) that codes for eye color, and multiple alleles for that gene that code for blue eyes, green eyes, brown eyes, etc. So when I mention an allele, I mean a specific version of the gene, or in this case, a specific hexadecimal value ranging from 0 to F.

Probability

To get started, it’s important to understand what “random” means. A random process is not a process with no rhyme or reason. A random process can have a great deal of structure, and we can learn a lot about that process. The only thing is, we cannot know, in advance, exactly what the result of the process will be.

One of the most basic examples of a random process is flipping a coin. Because physics is involved, a real coin flip is not truly random or unbiased, but in most cases we simplify matters by assuming it is. The probability of either a head or a tail on a flip is 0.5, or 50%. The coin example is a good one because in BC breeding, there is a 50% chance of a gene coming from the mother and a 50% chance of it coming from the father.

A success in a trial or experiment is getting the desired outcome. For instance, if you want to find the probability of getting a head, a head is a success. To calculate the probability of an event when all outcomes are equally likely, we count the number of ways in which we can get a success, and divide it by the total number of potential outcomes. In the case of a coin, there is just one way to get a success, and there are two potential outcomes (head or tail), so the probability is 1/2 or 50%.

One iteration of a random process is called a “trial” or an “experiment.” By knowing the kind of random process, we can start to discuss multiple trials. The easiest cases are when each trial is independent. Two independent trials do not impact each other. Coin flips are generally considered independent.

One of the most common examples of a process where trials aren’t independent is having a jar of items and removing some of those items each time. For instance, if there are 10 blue marbles and 50 red marbles, then the probability of selecting a blue marble is 10/60, since there are 10 ways to get a success, while there are 60 different outcomes. However, once we take a marble out, the numbers change, and so the trials are not independent.
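To make the marble example concrete, here is a minimal Python sketch of how the odds shift after one draw. The variable names are my own, and it assumes the removed marble is not put back in the jar.

```python
from fractions import Fraction

blue, red = 10, 50
total = blue + red

# First draw: 10 ways to succeed out of 60 possible outcomes.
p_first_blue = Fraction(blue, total)

# Second draw, assuming the first marble is NOT returned to the jar.
# The probability now depends on what the first draw was.
p_second_blue_after_blue = Fraction(blue - 1, total - 1)
p_second_blue_after_red = Fraction(blue, total - 1)

print(p_first_blue)              # 1/6
print(p_second_blue_after_blue)  # 9/59
print(p_second_blue_after_red)   # 10/59
```

Because the second probability changes with the first result, the two draws are not independent.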

Luckily, breeding involves independent trials. The probability of getting one gene from a given parent doesn’t depend on the others. The only issue is dealing with recessives, but I’ll cover that in a minute. For now, how do you figure out probabilities involving multiple trials? What if you want to know the probability of getting heads on 5 specific flips out of 19, or in terms of breeding, getting the 5 alleles that you need to get the desired trait?

That’s the probability of getting the first allele, and the second, and the third, and the fourth, and the fifth. So long as trials are independent, you can just multiply the probabilities of each success, which in this case gives 0.5⁵. That gets low pretty quickly. The probability of getting all five alleles, ignoring recessives, is only about 3%. So does that mean you can breed 100 and know you’ll get 3? Nope. Three is only the average you would expect over many batches of 100. You cannot know ahead of time what the results of a random process will be. Maybe you’ll end up with 10 successes, or maybe you’ll get none.
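Here is a rough Python sketch of that calculation. The helper `prob_exactly` is a name I made up for illustration; it uses the standard binomial formula, which assumes every breed is an independent trial with the same chance of success and ignores recessives.

```python
from math import comb

P_ALLELE = 0.5   # chance a given allele comes from the parent you want
N_ALLELES = 5    # alleles needed for the desired trait

# Independent events: multiply the individual probabilities.
p_success = P_ALLELE ** N_ALLELES
print(p_success)  # 0.03125, or about 3%

def prob_exactly(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# The number of successes in 100 breeds is itself random.
for k in (0, 3, 10):
    print(f"P(exactly {k} successes in 100 breeds) = {prob_exactly(k, 100, p_success):.4f}")
```

The loop shows that 0, 3, and 10 successes are all possible outcomes of the same 100 breeds, just with different probabilities.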

Successful Breeding

But we can calculate the probability of getting at least one successful breeding in a given number of trials. But how? If we want to know the probability of at least one success, it seems we would need to calculate the probability of exactly one, two, three, …, 100 successes, and then add them up, right? Luckily there’s a way around this method. It involves another basic rule in probability.

Two events A and B are complementary if exactly one of them must occur: either A occurs or B occurs, but never both. So for instance, a coin must come up heads or tails, but it cannot be both. These kinds of events have the following relationship: P(A) = 1 - P(B). So what’s the complement of at least one success? It’s zero successes, or all failures. So now we need to know the probability of a failure. Luckily we can apply our rule again. A success and a failure are complementary, so P(failure) = 1 - P(success).

So the probability of a failure is 1 - P(success), which is 1 - 0.5⁵. There are 100 trials, and for zero successes each one must be a failure, so we raise that whole quantity to the 100th power: (1 - 0.5⁵)¹⁰⁰. Then we subtract that from one. I could make this tutorial look prettier if Medium had LaTeX support, but the result is about 95.8%, so we’ve got a 95.8% chance of getting at least one success out of 100 breeds.
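The complement trick is easy to check in a few lines of Python. This is just a sketch; the function name `prob_at_least_one` is my own, and it assumes independent breeds that each succeed with the same probability.

```python
def prob_at_least_one(p_success, n_trials):
    """P(at least one success) = 1 - P(every trial fails)."""
    p_failure = 1 - p_success           # complement of a single success
    p_all_fail = p_failure ** n_trials  # independent trials: multiply
    return 1 - p_all_fail               # complement of "zero successes"

p_five_alleles = 0.5 ** 5
result = prob_at_least_one(p_five_alleles, 100)
print(f"{result:.1%}")  # 95.8%
```

You can change the second argument to see how quickly the odds drop with fewer breeds.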

Of course, if some of the alleles match in both positions, it gets even easier, as you only need to worry about the ones that are different. However, you also have to worry about recessive traits.

There’s a 4% chance that a recessive #2 gene ends up as the dominant gene and a 20% chance that a recessive #1 gene becomes the dominant gene. I already mentioned the rule to calculate probabilities of compound events when you want to know the probability of both occurring, but what about the probability of either occurring?

The probability of event A or event B occurring is equal to the sum of the two probabilities, less the probability of them both occurring. If A and B are mutually exclusive, though, the probability of them both occurring is zero, which makes things easier. In this case, the recessive #1 and recessive #2 traits cannot both become dominant, so it’s just the sum of each probability: the probability of either recessive trait becoming a dominant trait is 0.20 + 0.04 = 0.24.

So now we can reevaluate the probability of getting a given allele from the parent we want. Instead of 50%, we need to ensure that (1) the dominant trait hasn’t been replaced and (2) that it is transferred to the offspring. So we have P(stays dominant AND transfers). Well, staying dominant is the complement of changing, so that’s 1 - 0.24 or 76%. The probability of the allele transferring is 50%, so the product is 0.76 × 0.5, or 38%.

We can then redo our earlier analysis. The probability of getting 5 desired alleles is 0.38⁵, or about 0.8%. That’s why it’s so useful to have purebreds if you can create them.
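The whole recessive adjustment fits in a short Python sketch. The constant names are mine, and the 20% and 4% figures are the ones quoted in this article, so treat the output as only as reliable as those numbers.

```python
P_RECESSIVE_1 = 0.20  # chance recessive #1 replaces the dominant allele
P_RECESSIVE_2 = 0.04  # chance recessive #2 replaces the dominant allele
P_FROM_PARENT = 0.5   # chance the allele comes from the parent you want

# Mutually exclusive events: just add the probabilities.
p_replaced = P_RECESSIVE_1 + P_RECESSIVE_2

# Staying dominant is the complement of being replaced.
p_stays_dominant = 1 - p_replaced

# Stays dominant AND transfers: multiply.
p_allele = p_stays_dominant * P_FROM_PARENT

# All five desired alleles, each an independent trial.
p_all_five = p_allele ** 5

print(f"per allele: {p_allele:.2f}")   # per allele: 0.38
print(f"all five: {p_all_five:.2%}")   # all five: 0.79%
```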

Combinatorics

I’ll finish this discussion with a short piece on combinatorics, which is largely the study of counting possible arrangements and collections. It’s useful when figuring out probabilities, because it allows us to calculate how many successful and total outcomes there are for a given scenario.

In some ways, it’s from combinatorics that we get some of our other rules, including the rule for calculating the probability of A and B occurring. If you have two slots, and slot one has m options and the other has n options, then the total number of combinations you can have is m*n. It’s a useful rule. The school restaurant menu example is a common one to help explain it. Suppose a cafeteria offers 2 appetizers, 5 entrees, 3 desserts, and 3 drinks. How many different meal combinations can you make? It’s simply the product of all of those options, because the choices are all independent, so you have 2*5*3*3 = 90 options.
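If you don’t trust the multiplication rule, you can check it by brute force. This small Python sketch counts the meals both ways; the `menu` dictionary is just my own stand-in for the cafeteria example.

```python
from itertools import product
from math import prod

# Number of choices in each course.
menu = {"appetizers": 2, "entrees": 5, "desserts": 3, "drinks": 3}

# Multiplication rule: multiply the number of options per slot.
by_rule = prod(menu.values())

# Brute force: list every possible meal and count them.
by_enumeration = sum(1 for _ in product(*(range(n) for n in menu.values())))

print(by_rule, by_enumeration)  # 90 90
```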

That’s one reason why a genome with only 19 dominant traits is still a lot. Each gene can take any value from 0 to F, which means 16 alleles. So that’s 16¹⁹ combinations. Even taking into account that most of those combinations do little, and that some traits are compound traits requiring a specific set of allele combinations, that’s still millions of potential combinations with unique features! By the way, if all 16 alleles were available for all 19 genes, there could be more genetically unique cuties than grains of sand on Earth (about 7.5*10²² cuties vs. an estimated 7.5*10¹⁸ grains of sand).
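Python handles integers this large exactly, so the count is easy to verify. In this sketch the sand figure is simply the estimate quoted above, not something I have checked independently.

```python
ALLELES_PER_GENE = 16  # hexadecimal values 0 through F
DOMINANT_GENES = 19

# Multiplication rule again: 16 choices in each of 19 slots.
genomes = ALLELES_PER_GENE ** DOMINANT_GENES

GRAINS_OF_SAND = 7.5e18  # estimate quoted in this article

print(genomes)                   # 75557863725914323419136
print(f"{genomes:.2e}")          # 7.56e+22
print(genomes > GRAINS_OF_SAND)  # True
```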

Summary

With a basic understanding of genetics, probability theory, and combinatorics, getting a handle on BC cuties can be a lot easier. Of course, breeding still takes a lot of time, careful evaluation of genomes, note keeping, etc., but it’s important to start with theory rather than just diving in and trying things out at random. I’ll probably be updating this document over time, and if there are questions and comments, I may expand this article into a series. If you don't yet have a BC account, feel free to use my referral link.

Resources