Computational Work on the Sieve of Eratosthenese to Find Prime Numbers
I have just read about the Sieve of Eratosthenese from this Scientific American article from Facebook and was inspired to give it a try to see how it works, especially because the math and programming seemed simple enough. I worked with a dataset of 11000. I also tried up to 100,000. My browser to freaks out with a dataset larger than that.
I have also tried to setup my process so there would be no numbers that would be worked on more than once for each round of multiple removal. I am calling this the Square Prime Reduction Method (SPRM).
In order to do so, I would look for a pattern that would identify the remaining numbers. The farther along I worked the harder it became. Below I will list my notes, the pattern used for cumulative reduction, and the amount of dataset reduction, and the dataset size required for pattern recognition. This is the result of about a day of work.
See my notes here and the results of my scripting work below.
Square Prime Reduction Method (SPRM)
Using this cumulative mutiple removal method it seems that the beginning for each reduction round is the square of the next prime number (i.e. for 2 it is 4; for 3 it is 9; for 5 it is 25; etc). For each reduction round I:
 start at the square of the next prime
 run a clean process as if I were removing all of the mutliples which showed zeros for the already removed ones
 find the pattern for its remaining multiples
 code to remove the remaining multiples
With each successfully processed prime we are able to further reduce the pool. To reveal all primes under 1000 I would have to follow this process up to the square of the prime of 31 which is 961.
Prime Multiple Reduction (PMR)
I started with a dataset of 11000 to see if I could get it to work. Starting with 2 I processed the removal or the multiples of each prime number as I encountered them as I moved up the list. It also seemed to scale fine for 10,000 and 100,000 and my observations and tenous rules seem to work as the dataset scales.
Factors of 2
First, in any given number set we can get rid of half of the numbers because they are even and will be evenly divisible by 2, so that speeds up things considerably.
Pattern Complexity: Obvious
Pattern: Ummm...well... Every other number! =)
Dataset Reduction: This resulted in a reducing the possible number pool by 49.9% (499).
Factors of 3
Second, we can get rid of every third number because they will evenly divisible by 3.
Pattern Complexity: Obvious  1 extra step
Pattern: In writing the code for this I noticed the following pattern after removing the even numbers: start with 3 and then add 6 and remove that, since the previous value was divisible by 2.
Dataset Reduction: This resulted in a reducing the possible number pool by about 16.6% (166).
Factors of 5
Then, removing factors of 5 becomes slightly more tricky since we have removed numbers divisible by prime numbers 2 and 3 creating quite a few holes in its multiple list.
Pattern Complexity: Easy  2 extra steps
Pattern: The pattern for multiples of 5: start at 5, increment by 30, remove that value and then 10 value.
Dataset Reduction: This resulted in a reducing the possible number pool by 6.6% (66). Not a whole lot, but we are getting closer.
Factors of 7
Removing factors of 7 was much more difficult since we have removed a broad swath of numbers ~ 72.7% of the number pool. Stay on Target!
Pattern Complexity: Moderate  8 extra steps
Pattern: To find the pattern for multiples of 7 I copied the result of a straight removal by 7 and put it into a text file and looked for how the number of zeros (already removed multiples) stacked up between the number of those numbers remaining which were divisible by 7. There was a visual pattern and from there I had to figure it out. I found a repeating pattern of previously removed multiples: 3/1/3/1/3/5/1/5. Turning that into a computational model was fun!
My second idea, which was used, was the opposite sort of implementation than what I used for the multiples of 5. With 5 I went up and then back. With 7 seven I programmatically worked my way forward to the next iteration.
Dataset Reduction: This resulted in a reducing the possible number pool by 3.2% (32).
Factors of 11
Because I needed to jump the max number out from 1,000 to 10,000 in order to try to see a pattern attempting to remove factors of 11 was too difficult to fully pursue,
Pattern Complexity: Difficult  with the numbers and observations below I expect the pattern would be near 32 and 512 extra steps.
Pattern: I was able to see a vague pattern in about 5,000 items, but was not going to pursue that due to lack of time and quality pattern recognition and visual analysis tools.
Dataset Reduction: ??  given the numbers which I will display below I expect a reduction of about 1.98%
Observations
Larger Dataset Needed for Pattern Recognition
As you go further and further through the primes it seems like we need a larger dataset to compensate for the reduced info in the dataset in order to see the pattern. I needed the just over:
 for 3: I do not remeber exactly, but I expect that the pattern was obvious near 20 items.
 for 5: I do not remeber exactly, but I expect that the pattern was obvious near 75 items.
 for 7: about 500 to recognize the pattern
 for 11: I bumped it up to 10,000 and a pattern of sort was seen near 5,000
Possible rule: To find the pattern for each successive prime we will need a dataset 10 times larger than the previous one. With this I am not really taking into account the first two since the pattern is super simple to find because there is minimal processing and with a datapoint reduction which is very easy to compensate for.
Or perhaps the dataset has to do with the number of multiples removed and the remaining number of datapoints remaining from the prime's square value.
That is supremely annoying. I would need better tools to play with larger numbers. =O
Increasing Complexity in Pattern
As each square is processed it will become more and more difficult to process the pattern to prevent processing already removed numbers. As mentioned above the dataset requirements grow in order for us to see the pattern, but also so does the work required to process the pattern to remove subsequent multiples. This is also biased by by my limitations, biases, and inexperience with such work. I definitely need to get through processing the pattern for 7 before I really state anything, but, since I am stopping here, I will throw out the following tenuous rule:
Possible Rule: Based on my very limited work the complexity increase with each prime processed is the one of these 3 ideas: the square of the previous number of complexity steps * 2; the cube of the previous number of complexity steps; or number of complexity steps * 4.
This was more evident from processing 5 and 7:
 2  0 extra programmatic steps
 3  1 extra programmatic step
 5  2 extra programmatic steps
 7  8 extra programmatic steps
 11  I expect near 32, 128, or 512 extra programmatic steps
Keep in mind, this is probably not close to wholly accurate given my limited data and processing (up to 7), but it might give you a place to start with in understanding the complexity of finding the multiples in each successive prime generation using this technique. I really need to find out how many extra steps are required for 11 to have a better idea about my observations.
Diminishing Returns with each Processing Generation
When we start with 2 we remove half of the datapoints and as we process each generation we reduce fewer and fewer data points which will seem to result in diminishing returns for each generation.
 2  49.9%
 3  16.6%
 5  6.6%
 7  3.3%
 11  I expect this will be near 1.98%
I expect there is a mathematical relationship to be found here, but I am not sure what yet. I also expect that for any given dataset that proportionaly numbers will be found. At some point, depending on the size of your dataset, may not be worth the extra work to try to find a pattern and just working on a brute force technique to process.
Possible Rule: Law of Diminishing Returns  the % of multiples removed as compared to the previous generation is roughly equal to the delta removed by the previous generation + 10%. This average increase may also be increase be a few % points too assuming that the delta from 235 has any real bearing on further results.
Square Prime
Possible Rule: It appears that once you get done removing a prime's multiples from a dataset (and all previous primes' multiples) then all numbers up until the square of the next prime will be prime numbers.
Closing
Working through the primes through 7 gives us to a total reduction of 76.5 of the dataset up to 1000. =O My primes are correct up the the next square which is for 11 @ 121. I would need to process that and then prime numbers list would be correct until the next square  13 @ 169. I am expecting that this will be the patttern if we were to continue. Since I need a really large dataset to see the pattern for 11 I think I will stop now. Hopefully, someone will find this useful. =) I am sure there are much more complex mathematical relations here than I able to discern with my limited knowledge (not a mathemetcian), time, tools, and experience.
Possible Rules
Here is the collection of the potential observed rules from above so we can have them all in one place:
 Increasing Dataset Size: To find the pattern for each successive prime we will need a dataset 10 times larger than the previous one.
 Increasing Pattern Complexity: The increase with each prime processed is the one of these 3 ideas: the square of the previous number of complexity steps * 2; the cube of the previous number of complexity steps; or number of complexity steps * 4.
 Law of Diminishing Returns: the % of multiples removed as compared to the previous generation is roughly equal to the delta removed by the previous generation + 10%
 Square Prime Rule: Once you get done removing a prime's multiples from a dataset (and all previous pime's multiples) then all numbers up until the square of the next prime will be prime numbers.
Data Analysis
The info for primes 11, 13, 17 are predicted based on my limited observational projections here:
prime 
# of nonrepeating
datapoints reduced 
% of total datapoints
reduced 
% delta in reduction
from previous gen 
dataset required to see pattern 
extra programmatic
steps 
prime 
# of nonrepeating
datapoints reduced 
% of total datapoints
reduced 
% delta in reduction
from previous gen 

extra programmatic
steps 
1 
 
 
 
 
 
2 
499 
49.90% 
49.90% 
 
0 
3 
166 
16.60% 
33.27% 
30 
1 
5 
66 
6.60% 
39.76% 
75 
2 
7 
33 
3.30% 
50.00% 
500 
8 
11 
19.8 
1.98% 
60.00% 
5000 
512 or 128 or 32 
13 
13.86 
1.39% 
70.00% 
50000 
134,217,728 or 524,288.00 or 128 
17 
11.088 
1.11% 
80.00% 
500000 
2.41785E+24 or 36,028,797,018,964,000 or 512 