How long would it take to break an RSA-2048 key on one computer that handles 2.4 billion instructions per second, given that there are approximately
32317006071311007300714876688669951960444102669715484032130345427524655138867890893197201411522913463688717960921898019494119559150490921095088152386448283120630877367300996091750197750389652106796057638384067568276792218642619756161838094338476170470581645852036305042887575891541065808607552399123930385521914333389668342420684974786564569494856176035326322058077805659331026192708460314150258592864177116725943603718461857357598351152301645904403697613233287231227125684710820209725157101726931323469678542580656697935045997268352998638215525166389437335543602135433229604645318478604952148193555853611059596230656
different key combinations?
About $$8 \times 10^{15}$$ years, give or take a trillion years. It's still less time than it will take CPhill to find that Roman ZERO.
Yes, it is an actual calculation. However, it is crude.
I will post an explanation and clarification in the next 12 hours.
The above is an approximation, and a crude one at that. Your hardware is very fast; however, even if it were a thousand times faster, the only significant difference in this absurd context would be to reduce the value from 550,000 times the age of the universe to 550 times the age of the universe.
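Assuming roughly 13.8 billion years for the age of the universe (a figure not stated in the original post), the 550,000 multiple lines up with the estimate given above:

$$550{,}000 \times 13.8\times10^{9}\ \text{years} \approx 7.6\times10^{15}\ \text{years} \approx 8\times10^{15}\ \text{years}$$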
To give a more accurate assessment, information is needed on how you want to attempt to factor the number. What method? Brute force? For brute force, the answer above is close enough.
The instruction count for an addition or a multiplication is small while the values fit in a machine word, but it grows rapidly once the values are larger than 64 bits (assuming a 64-bit computer). Along with the arithmetic itself, the computer has to keep track of the values and their addresses, and past the 64-bit mark more memory is needed just to keep track of the memory already in use. It becomes like a snake consuming itself via its tail.
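As a rough illustration of that overhead (this sketch is not from the original post; the operand sizes and trial count are arbitrary), here is a short Python snippet that times multiplication as operands grow past the native 64-bit word size:

```python
import random
import time

def time_mul(bits, trials=2000):
    """Average the wall-clock time of multiplying two random n-bit integers."""
    a = random.getrandbits(bits) | 1
    b = random.getrandbits(bits) | 1
    start = time.perf_counter()
    for _ in range(trials):
        a * b            # beyond 64 bits this is multi-word (arbitrary-precision) arithmetic
    return (time.perf_counter() - start) / trials

for bits in (64, 1024, 8192, 65536):
    print(f"{bits:6d}-bit multiply: {time_mul(bits) * 1e9:10.1f} ns")
```

On any ordinary desktop the per-operation cost climbs steeply once the operands stop fitting in a single machine word, which is the point being made above.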
Computational milestones of general factorization:
Factoring of 542 bits (163 digits), 1993
Factoring of 774 bits (233 digits), 2000
Factoring of 809 bits (244 digits), 2003
Factoring of 911 bits (275 digits), 2006
Factoring of 1039 bits (313 digits), 2007
Then a factoring of 1061 bits (320 digits), between early 2011 and 4 August 2012, by a group headed by Greg Childers at CSU Fullerton, using the NFS@Home BOINC project for about 300 CPU-years of sieving; the linear algebra was run on the Trestles cluster at SDSC and the Lonestar cluster at TACC. (Source: wikipedia.org/wiki/Integer_factorization_records)
Consider that these used the most advanced algorithms of their day and supercomputer arrays, not single computers. Keep in mind this is just the computational time; it does not include the preliminary research and hardware setup time.
The most recent individual record is from September 2013, when the 696-bit RSA-210 was factored by Ryan Propper (ibid). (This still was not done on a single computer.) It was a brilliant piece of work, but the fact is, there is an element of chance involved; the numbers sometimes conveniently converge.
Repeating the process for a different composite value requires the same “average” amount of time. Solving one does not speed up the process for solving another.
The best general-purpose factoring algorithm is the General Number Field Sieve (GNFS). It speeds up the process by a factor of 1,000 to 10,000 or more, but only if parallel processing is used; implementing it on a single computer would not save any time (it might for much smaller numbers). The general idea is that it allows a "deconstruction" of the work on a large number into sections, which are processed in parallel and then reassembled.
Considering the General Number Field Sieve, there is a great deal involved in factoring large numbers. The processing power required increases exponentially with the size of the number. (The P vs NP asymmetry shows clearly here: the solution, if known, is easy to verify; if unknown, it is very difficult to find.)
There must be an initial effort by very intelligent, seasoned mathematicians to choose proper polynomials and other parameters; even they have a difficult time with this, and it requires weeks of preliminary testing to verify general accuracy.
GNFS consists of two parts: the sieve and the linear reduction (the matrix step). The sieve can be run in parallel over hundreds or thousands of machines. The linear reduction involves computing with a matrix whose size exceeds, by several orders of magnitude, the memory of even a computer array with terabytes of RAM.
The matrix contains residues and is sparsely populated. There are algorithms to compress it and to compute in this format; however, this too adds computational overhead. (One brilliant approach is to oversample with the sieve, which speeds up the process by producing a matrix that is more efficient to solve.)
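As a toy illustration of that sparse format (this is not the block Lanczos/Wiedemann code an NFS package actually uses), each row of the GF(2) matrix can store only the positions of its 1s, so that adding two rows becomes a symmetric difference of index sets:

```python
# Each matrix row over GF(2) keeps only the column indices that hold a 1.
row_a = {0, 5, 17, 203}
row_b = {5, 44, 203, 999}

# Row addition over GF(2) is XOR: entries present in both rows cancel,
# entries present in exactly one row survive.
row_sum = row_a ^ row_b
print(sorted(row_sum))   # [0, 17, 44, 999]
```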
This number is large enough that the consensus estimate is 7 to 10 years of advances in computational power before it will even be feasible to attempt, followed by 2 to 5 years to complete.
Your computer, if optimized, would require months to factor a 512-bit number. If 4 months were the average, and each additional bit doubled that time, then 513 bits would take 4 months × 2^1 and 2048 bits would take 4 months × 2^1536. It starts to become clear why the time is 550,000 times greater than the age of the universe. (It is not melodrama, though it sounds like it.)
You could program the computer to "guess" the primes. If this is an RSA key, then the primes will be relatively far apart (read an RSA text to understand what "relatively far apart" means). Generate random large odd numbers, then use the Rabin-Miller primality test to (almost) confirm whether each one is prime. Do this until two large primes are found, then multiply them. If the product equals the RSA modulus, the solution is known, the totient can be calculated, etc., etc.
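A minimal sketch of that guess-and-multiply idea, using Python's built-in big integers; the `bits`, `rounds`, and `attempts` parameters are placeholders, and the test is an ordinary Rabin-Miller implementation rather than anything optimized:

```python
import random

def is_probable_prime(n, rounds=40):
    """Rabin-Miller: (almost) confirms that n is prime, as described above."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % small == 0:
            return n == small
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False
    return True

def random_prime(bits=1024):
    """Generate random large odd numbers until one passes the primality test."""
    while True:
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate

def guess_by_multiplying(N, attempts=1000, bits=1024):
    """Guess two large primes, multiply them, and compare with the RSA modulus N."""
    for _ in range(attempts):
        p, q = random_prime(bits), random_prime(bits)
        if p * q == N:
            return p, q   # the totient (p-1)(q-1) can now be computed, etc.
    return None           # overwhelmingly the likely outcome
```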
Or consider: after finding a large prime, divide the RSA modulus by it; if the quotient is another prime, you have found the solution. Other variations keep track of the primes already found and compare them sequentially, moving up the array. This sounds easier than it is. Here is why.
A 2048-bit number is 617 decimal digits long. A value this size has approximately 7.04E613 prime numbers below it, many of them hundreds of digits long, so the storage space for such a list is enormous. Here is another perspective, comparing against the probability of correctly guessing the primes. Here, all odd numbers are considered, before any primality test.
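The 7.04E613 count follows from the prime number theorem applied to a 617-digit bound (the value is an approximation):

$$\pi\!\left(10^{617}\right) \approx \frac{10^{617}}{\ln\left(10^{617}\right)} = \frac{10^{617}}{617\ln 10} \approx 7.04\times10^{613}$$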
What is the probability of choosing 1 of the 2 correct primes among all odd numbers?
2^2047 = the number of odd numbers below 2^2048.
1.75E8 = the reciprocal of the probability of winning the US Powerball lottery with one ticket.
$$\frac{\log(2^{2047})}{2\log(1.75\times10^{8})}=37.38$$
This is the number of times the Powerball numbers would have to be guessed correctly in sequential events, i.e. it is equivalent to winning the Powerball lottery more than 37 times in a row, with a single ticket for each event.
Could this happen? Yes. Is it likely to? NO!
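For anyone who wants to re-check the 37.38 figure, this short snippet evaluates the expression exactly as written above (it only reproduces the arithmetic; it does not re-derive the model):

```python
from math import log10

odd_count = 2047 * log10(2)        # log10 of the number of odd values below 2^2048
powerball = log10(1.75e8)          # log10 of the reciprocal Powerball probability

print(odd_count / (2 * powerball)) # ~ 37.38
```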
Another major step in reducing the search space is in the search for p or q. Except in the case p = q, where p^2 = N (never acceptable for RSA), p lies in the bottom half of the range and q in the top half; the center of the search space is SQRT(N).
This is logically obvious, but knowing it does not change the time required to find the complement. Either N is divided by p and the result is tested for primality, or values of q are generated and iterated up or down until it is confirmed that no prime q exists for the selected prime p. How far to search before moving on to another prime p is another part of the guessing process, and there are many variations on it.
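A small sketch of the divide-and-test variant, with sympy.isprime standing in for a primality test (the Rabin-Miller routine from the earlier sketch would serve as well) and a tiny modulus used only so the example runs instantly; a real modulus would be 2048 bits:

```python
from math import isqrt
from sympy import isprime   # stand-in primality test for this sketch

def complement_of(N, p):
    """If the candidate prime p divides N and the quotient is prime, return that quotient."""
    if N % p != 0:
        return None
    q = N // p               # q automatically lands in the top half, above sqrt(N)
    return q if isprime(q) else None

# Toy demonstration with a small semiprime:
N = 10403                    # 101 * 103
center = isqrt(N)            # sqrt(N), the center of the search space
for p in range(center, 1, -1):       # walk candidate p values downward from sqrt(N)
    if isprime(p):
        q = complement_of(N, p)
        if q is not None:
            print(p, q)              # 101 103
            break
```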
If this were easy then it would not be secure.
------------
I would appreciate another mathematician verifying the probability in the lottery analogy (and any other math in this text).
~~D~~
DavidQD requested commentary from a mathematician on his summary of the difficulty to break RSA-2048 via the number field sieve (NFS). I run NFS software as a hobby, and am well-versed in its use. I happen to be a mathematician also, but my knowledge of NFS is empirical, rather than via academic interests.
1. There are two flavors of NFS: The special NFS (SNFS) and general (GNFS). SNFS is usable only if the input composite has a certain known form (such as a Mersenne number, 2^p-1). SNFS is quite a lot faster than GNFS, but RSA keys cannot take advantage of SNFS. The factorization records mentioned in David's post are SNFS records, so aren't relevant to an attempt on an RSA key.
2. The largest GNFS factorization I know of is RSA-768, 232 digits. A summary of the effort is at http://www.loria.fr/~zimmerma/records/rsa768.txt.
3. The difficulty of a GNFS job roughly doubles every 5 digits in length of the composite. It does not double for every bit added to the length of the input. (A rough extrapolation using this rule appears after point 8 below.)
4. A fast home desktop can factor an RSA-512 key in 7-10 days, not 4 months. I have a Haswell-E 6-core and can complete such a job in just under a week; a quad-core at 3.5 GHz should take 10 days.
5. The estimate of 7-10 years of technological and software advances followed by 2-5 years for a factorization to be carried out is for RSA-1024, not RSA-2048. There is no hope of cracking RSA-2048 via the number field sieve in our lifetimes; a quantum computer or a breakthrough in integer factorization that renders this form of security useless would be required. The 2015 state-of-the-art software package is CADO-NFS, which is freely available. This package may be capable of an attempt on RSA-1024, but the matrix-solving step is a hurdle.
6. It does not take a mathematician to run the first step of polynomial selection; the selection is a matter of letting automated software search for a period of time to find one that allows the longest step (sieving) to run faster. Anyone could produce a polynomial for RSA-1024 in a day, but spending a few hundred CPU-years will yield much greater reductions in sieve time. Polynomials are ranked roughly by score, which is generated by the searching software; the scores are rough estimates (say, within 5% of actual sieve-performance ranking) of how well the sieve step will run, but anyone can fire up the sieve to see how fast a polynomial works. The poly-search step is traditionally run for 3-4% of expected project length, as finding a poly that sieves 4% faster would shorten the sieve step by 4%, making up the time spent on poly search. It is not known how to find a "best" polynomial; the search is akin to pulling needles out of a haystack, and it's always possible that a longer search produces a polynomial that saves sieve time.
7. More info: Poly search and the sieve step are highly parallel; any number of machines can participate, with the simplest of organizational structure to distribute workload. Collecting data is slightly less trivial; an individual machine might produce a couple of GB/month in data from the sieve step that needs to be collected in one place. If 10,000 machines are recruited for a year to crack RSA-1024, collecting the data is nontrivial: the sieve data will be measured in petabytes. Unfortunately, the matrix step that comes last must be performed by a single machine; it can be multi-threaded, but the data exchanged among threads is much too great for a massively parallel effort. RSA-768's matrix was 192M by 192M, and took 13 months on a 144-core cluster to solve. RSA-1024 will have ~1000 times as much data, a matrix of perhaps 4G by 4G and a larger memory footprint than is available today.
8. More reading at the NFS@home website, or mersenneforum.org's factoring subforum, or the CADO-NFS software website.
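As a rough illustration of points 3 and 5, using nothing but the every-5-digits doubling rule and the 232-digit RSA-768 record quoted above (an extrapolation, not a rigorous complexity estimate): RSA-2048 has 617 digits, 385 more than RSA-768, which corresponds to 77 doublings of difficulty:

$$2^{(617-232)/5} = 2^{77} \approx 1.5\times10^{23}$$

That is roughly 1.5 × 10^23 times the RSA-768 effort, which is why an NFS attempt on RSA-2048 is out of reach no matter how the sieving is distributed.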