What we care about is:

1. Two legitimate users inadvertently generating the same address.
2. An attacker deliberately trying to generate collisions with the addresses of existing unspent outputs.

We can't reduce the risks of these to zero, but we can reduce them to negligible levels. Let's consider case 1 first.

I don't have a good way to find the total number of addresses that have ever been used, but we can get an upper bound by taking the size of the blockchain and dividing it by the size of an address.

The blockchain is now about GB; dividing by 20 (the size of an address hash in bytes) gives us an upper bound of about 7.

In reality the blockchain contains a lot more than just addresses, so I expect the real number to be lower than this.

The probability of an accidental collision can now be approximated by the equation.
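
As a rough sketch of this estimate, one standard choice is the birthday-problem approximation p ≈ 1 - exp(-n(n-1) / (2·2^160)), where n is the number of addresses in use and 2^160 is the number of possible 20-byte address hashes. The function names and the example chain size below are assumptions made for illustration, not measured figures.

```python
import math

ADDRESS_BYTES = 20        # size of a 160-bit address hash
ADDRESS_SPACE = 2 ** 160  # number of possible address hashes


def address_upper_bound(chain_bytes: int) -> int:
    """Upper bound on addresses ever used: chain size / address size."""
    return chain_bytes // ADDRESS_BYTES


def accidental_collision_probability(n: int, space: int = ADDRESS_SPACE) -> float:
    """Birthday approximation: p = 1 - exp(-n(n-1) / (2 * space))."""
    exponent = n * (n - 1) / (2 * space)
    # expm1 keeps precision when the exponent is extremely small
    return -math.expm1(-exponent)


# Hypothetical chain size, for illustration only (not a measured figure).
example_chain_bytes = 100 * 10**9
n = address_upper_bound(example_chain_bytes)
print(n, accidental_collision_probability(n))
```

Because the approximation depends on n squared, even a generous overestimate of n leaves the result far below anything worth worrying about.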

In case 2, the malicious actor is attempting to find an address collision with an "unspent" output. Finding a collision with an output that has already been spent doesn't help him. Of course, the attacker can try many times.

Let's assume that generating an address takes the same effort as attempting to hash a block; in reality it takes more. Let's also assume that the attacker has as much hashing power as the whole Bitcoin network put together and that they run their attack for a century.
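
These assumptions can be turned into a small calculation. The sketch below assumes each attempt hits one of the targeted unspent addresses with probability (number of targets) / 2^160, and that attempts are independent. The function name, the hash rate, and the target count are hypothetical placeholders; plug in whatever figures you consider realistic.

```python
import math

ADDRESS_SPACE = 2 ** 160                          # possible address hashes
SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 3600    # attack duration


def attack_success_probability(attempts_per_second: float,
                               target_addresses: int,
                               seconds: float = SECONDS_PER_CENTURY) -> float:
    """Chance that at least one attempt collides with a target address."""
    attempts = attempts_per_second * seconds
    # Expected number of hits if every attempt is an independent draw
    expected_hits = attempts * target_addresses / ADDRESS_SPACE
    # 1 - exp(-expected_hits), computed stably for tiny values
    return -math.expm1(-expected_hits)


# Hypothetical inputs, for illustration only.
hash_rate = 1e18          # attempts per second (assumed)
unspent_targets = 10**9   # addresses holding unspent outputs (assumed)
print(attack_success_probability(hash_rate, unspent_targets))
```

The result scales linearly with both the hash rate and the number of targets, so it is easy to see how much either figure would have to grow before the attack became plausible.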

