STATISTICS (UPDATED 7/30/2009)
NOT EVERYTHING THAT LOOKS IMPRESSIVE IS IMPRESSIVE
The ever-present question that arises whenever a new matrix is found is (or should be), "Is this find significant, or the result of simple (often high) probability?" The more professional researchers generally turn to the Monte Carlo technique to arrive at an answer. Dr. Robert Haralick, who wrote the Foreword of Ark Code, has a discussion about probability on his site at http://www.torah-code.org/probability/probability.shtml. He also has a discussion about scrambling Torah texts to produce what are called "Monkey Texts" at http://www.torah-code.org/monkey_texts.shtml. This relates to the Monte Carlo technique. As applied to Torah Codes, one scrambles the Torah text 10,000, 100,000, or more times, and then compares how key terms meet in the real Torah text with how they meet in the scrambled texts. It matters greatly whether the Torah text is scrambled at the letter level, word level, or verse level to produce a Monkey Text. Because much is often made of key words meeting certain phrases, it is probably better to scramble at the verse level and then compare the results against the original text.
If one scrambles the Torah text 10,000 times and finds 1,000 better (tighter, smaller) meetings between key terms in the Monkey Texts, then the find in the real Torah text had, at best, about one chance in ten of being there. This is NOT significant. But if, in the 10,000 trials, there were only 100 meetings of terms as good or better in the Monkey Texts, then the find in Torah might have had only about one chance in a hundred of being there. In that case it begins to become interesting. A problem with the Monte Carlo technique when I began my research was that it was (a) often very slow due to the need to check such a high number of scrambled texts, each with 304,805 letters, and (b) it required computer software that was not available to the general public.
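The verse-level Monte Carlo procedure described above can be sketched in a few lines of Python. This is only a minimal illustration: `score_fn` is a hypothetical stand-in for whatever "tightness of meeting" measure the researcher uses (smaller = tighter), and is not part of any published tool.

```python
import random

def verse_scramble(verses, rng):
    """Build a 'Monkey Text': the same verses joined in a random order."""
    shuffled = list(verses)
    rng.shuffle(shuffled)
    return "".join(shuffled)

def monte_carlo_p_value(verses, score_fn, real_score, trials=10_000, seed=0):
    """Estimate how often a scrambled text scores as well as or better
    (i.e., as tight or tighter) than the real text.
    score_fn(text) -> a tightness score, where smaller is better."""
    rng = random.Random(seed)
    as_good = 0
    for _ in range(trials):
        monkey = verse_scramble(verses, rng)
        if score_fn(monkey) <= real_score:
            as_good += 1
    return as_good / trials
```

With 10,000 trials, a return value of 0.10 corresponds to the "one chance in ten" (not significant) case above, and 0.01 to the "one chance in a hundred" case that begins to become interesting.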
As such, I developed an alternate method that focuses on: the number of rows and columns in a matrix; the number of ways a key term could fit in a matrix of that size (vertically, horizontally, diagonally, forwards, and backwards*); the frequency of the term at an ELS when limited to the number of ways it could fit in such a matrix; the percent of total Torah employed in the matrix; the chi-square value; and the combined total probability for everything found (which must then be adjusted for those things sought and NOT found). For those who are interested in examining the entire process, please buy a copy of ARK CODE and examine Appendices A and B (pages 183 to 221). Note: In the case of ELS maps, the probabilities derived must then be adjusted for the requirement that items found on the matrix appear at the correct course angles, corresponding to what is seen on real-world maps.
*ROFFMAN SKIP FORMULA.
The number of ways that a term can fit into a matrix (horizontally, vertically, or diagonally, either forwards or backwards) is determined as follows:
(1) Let the number of skips possible in a forward direction on a row of length (r), where r = the number of columns in the matrix, be equal to "Sr."
(2) Likewise, let the number of skips possible in a vertical direction on a column of length (c), where c = the number of rows in the matrix, be equal to "Sc."
(3) The Roffman Skip Formula for total skips is as follows:
Skips = 2(Sr + Sc + 2[Sr][Sc]) = 2Sr + 2Sc + 4SrSc.
An example of skip value determined through use of the above tables and formula follows: Find the number of skips possible for a 4-letter word in a Matrix 28 columns by 11 rows.
Solution: Skip Tables for words ranging between 3 and 8 letters are posted on this site. For a 4-letter word use Table 1B. On it find that 28 columns = 9 possible skips forward. Thus Sr = 9. Now note that 11 rows = 3 possible skips vertically. Thus Sc = 3. Now apply the formula: Skips = 2(Sr + Sc + 2[Sr][Sc]) = 2(9 + 3 + 2[9][3]) = 2(12 + 54) = 2(66) = 132 SKIPS. Now, let us suppose that the 4-letter term occurred at skip 100. To get an idea of how likely such a term is to be found at an ELS, search a range of 132 skips, such as from skip 101 to skip 232. The number of "hits" for this term is then divided by the number of letters in the Control (if the Control is scrambled Torah, it has the same number of letters as Torah, i.e., 304,805). The quotient is the Word Frequency Per Letter. This is multiplied by the number of letters on each matrix to yield the Word Expectancy Per Matrix. It is inherent in this procedure that the larger the number of letters in the matrix, the larger the number of placements possible for any given key word at any ELS. After determining Word Expectancy Per Matrix, we apply the Poisson Equation to find the probability that the term is present at least once. This is necessary to determine a true probability for each word. Just because a word is likely to appear once per plot does not imply it will always be there. Words may average out to many times per plot area without actually being in a given plot of that area. Of course, if the expected frequency is sufficiently high we eventually reach a probability like .9999999, which we simply round off as 1.0.
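The Roffman Skip Formula and the worked example above can be checked in a few lines of Python (the values Sr = 9 and Sc = 3 are taken from the skip tables as described in the text):

```python
def roffman_skips(sr, sc):
    """Roffman Skip Formula: Skips = 2(Sr + Sc + 2*Sr*Sc),
    where sr = forward skips possible per row and
    sc = vertical skips possible per column."""
    return 2 * (sr + sc + 2 * sr * sc)

# Worked example: a 4-letter word in a 28-column x 11-row matrix,
# where Table 1B gives Sr = 9 and Sc = 3.
print(roffman_skips(9, 3))  # 132
```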
HOW TO FIND THE CHANCE OF A TERM APPEARING AT LEAST ONCE*
1. FIND PROBABILITY IT DOES NOT OCCUR BY POISSON EQUATION.
f(x) = (lambda^x)(e^(-lambda))/x!, so f(0) = e^(-lambda), where lambda = the expected frequency per matrix
2. 1 ‑ f(0) = THE PROBABILITY OF OCCURRING AT LEAST ONCE.
(where f(0) = the probability it will not occur)
3. On an Excel or Works spreadsheet, head columns as follows: A: whatever identifies the calculation; B: skips used on the matrix; C: number of hits (on CodeFinder or similar software) in the skip range; D: divide by the 304,805 letters in Torah or Control; E: the quotient equals Frequency Per Letter; F: the quotient in E multiplied by the letters on the matrix = Word Expectancy; G: Poisson Equation = 1-EXP(-F#), where # equals the row number of the item in Column F on the spreadsheet. If you want to know the odds of the item being on the matrix, head Column H accordingly; the value of Column H is the reciprocal of the probability found in Column G by the Poisson Equation.
* Note: While this author (Barry S. Roffman) discovered the Roffman Skip Formula, my son (an MIT geophysics graduate), Rabbi Robert Roffman, is the author of the spreadsheets and the man who first introduced use of the Poisson Equation into my research.
There is some indication that when a row skip or row split function for the axis term is employed, the true value of an open text match must be the value computed by standard means divided by the row split. The lowest ELS of Ark of the Covenant at skip -306 (cylinder circumference 306 letters) had about one chance in 2,931 of being in a 104-letter matrix with "Egyptians were burying." At skip -306 there was no row split function enabled on CodeFinder. Had it been enabled and a row split of 2 been used (with a cylinder circumference of 153 letters), then if the matrix size (area) were the same, I would have divided 2,931 by 2 to arrive at a value of 1 chance in 1,465. In this case, however, the matrix with circumference 153 would have been larger, because the match on the matrix with circumference 306 was already about as tight as it could be with the row skip function disabled. There is also a discussion about dividing the value of a matrix by the number of passes through the Torah made by CodeFinder on a wrapped (rounded torus) search before acquiring an axis term. See the permutation experiment.
SPECIAL CASE SKIPS
Finally, when computing the value of a priori open text terms on a matrix, it is my practice to employ only the frequency of the term at skip +1 (in unwrapped Torah) in column C of my spreadsheet. However, if the a priori term appears at skip -1, N (parallel to, in the same direction as, and at the skip of the axis term), or -N (parallel to, in the opposite direction from, and at the skip of the axis term), then in column C I list the frequency (number of hits) as the total hits at skips +1, -1, N, and -N (with wrapped Torah allowed if that was required to find the axis term). These skips are considered special because they seem to leap out at the eye of the researcher and make the case for deliberate encoding seem more plausible.