13 Comments

The age in the age column is not always off by 1-3 years relative to the age at vaccination, because some people were vaccinated in 2023 after their 2023 birthday, in which case the difference is 0. The difference between the age column and the age at vaccination is between 0 and 3 years, and the most common difference is 2 years:

```
> t=as.data.frame(data.table::fread("nz-record-level-data-4M-records.csv"))
> for(i in grep("date",colnames(t)))t[,i]=as.Date(t[,i],"%m-%d-%Y")
> library(lubridate);table(t$age-(t$date_of_birth%--%t$date_time_of_service)%/%years())

     0       1       2      3
256801 1042553 2571549 322535
```

You said that Kirsch told you that "the D.O.B.’s were off by no more than 14 days". However on Kirsch's S3 server, the file `data-transparency/Code/time-series analysis/obfuscation_algorithm.txt` says: "For each person, a non-zero date offset was chosen from a gaussian distribution with sigma=7 and all of the dates for that record were offset for that same amount, so the differences between dates are identical." It's followed by this Python code: `date_delta = 0; while date_delta == 0: date_delta = int(random.normalvariate(0,1) * 7)`. The file doesn't mention anything about a 14-day limit, and only about 95.5% of a normal distribution falls within 2 standard deviations from the mean.
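If that snippet is taken at face value, a non-trivial share of the offsets should land beyond 14 days. Here is a rough R re-expression of the quoted Python logic (my own sketch, not code from the S3 server) to estimate that share:

```r
# Rough R equivalent of the quoted Python snippet: a non-zero offset drawn as
# int(normalvariate(0,1) * 7). trunc() mirrors Python's int(), which truncates toward zero.
set.seed(1)
delta = trunc(rnorm(1e7) * 7)
delta = delta[delta != 0]   # the while loop redraws until the offset is non-zero
mean(abs(delta) > 14)       # share of offsets beyond 14 days: roughly 3-4%
```

So unless the offsets were clipped somewhere else, the algorithm described in that file would leave a few percent of records more than 14 days off.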


And yes, I confirm the difference as ranging from 0 to 3 years. It's good to know, and to confirm, that the mode will be around 2. An observation I look forward to seeing with my own eyes. Thanks


Grateful for this awesome rabbit hole you are translating for us. Is it possible to group the ages into ranges, perhaps 10-year intervals, and then look at the cusps or extremes in each range? For example, some of the 9- and 10-year-olds may belong in the 11-20-year-old category. If that's not discernible from the data, those could be added to the anomalies and taken out of the equations, unless some of the categories have too many anomalies and would give a skewed result, in which case they could be divided equally between two ranges, which would then have to be done for every range. I believe that would give you enough mean averages, which could then be formatted to compare apples to apples against similar data from around the world?
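Something like this, perhaps (a rough R sketch using the age column from the snippet above; the 10-year break points and labels are just an illustrative choice, not something dictated by the data):

```r
# One possible 10-year grouping of the age column (break points are an arbitrary choice)
t$age_group = cut(t$age, breaks = seq(0, 120, by = 10), right = FALSE,
                  labels = paste(seq(0, 110, 10), seq(9, 119, 10), sep = "-"))
table(t$age_group, useNA = "ifany")   # counts per 10-year range; NA = ages outside 0-119
```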


Very cool. Have you found what Igor could be talking about? In 586 rows the DOB isn't the same for the same victim? Of those, I guess he is saying 226 rows are off by several decades? This will take me a while with my old Aztec method. lol
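Maybe a check along these lines would reproduce those counts (a rough R sketch; I'm assuming the person identifier column is called `mrn`, and the "several decades" threshold is only my guess):

```r
# Per-person DOB consistency check (assumes the person-ID column is named `mrn`)
library(data.table)
d = as.data.table(t)
dob = d[, .(n_dob  = uniqueN(date_of_birth),
            spread = as.numeric(max(date_of_birth) - min(date_of_birth))),  # spread in days
        by = mrn]
nrow(dob[n_dob > 1])          # people whose rows disagree on date_of_birth
nrow(dob[spread > 365 * 20])  # of those, how many differ by a couple of decades or more
```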


Accurate assessments. The most salient point is that the quaxcines are all harmful and none are safe or effective. The rest of the analysis may be interesting for some, though it is true that evidence has never changed anyone's mind.


Can you rate the New Zealand data from 1 to 10? Do you think this data can one day be used for justice, or will a judge call it a nothing burger? 🙂


Thank you very much for this explanation. Your video was really clear and I found it most helpful, thank you.

Comment removed
Dec 28

I really agree with you. I remember when the data was first released, I saw all the great analysts wondering what Steve meant by “All the information is randomized”. I have some analysis background but am nowhere near the level required to work on this dataset, and I also wondered what Steve did to the data.

Also, when Steve released it, he was asking people to verify his output, so he MUST provide all the information regarding all the data manipulations he did to the dataset; otherwise he is wasting people's time. That is the minimum requirement, but he did not do it.

I try not to get frustrated as I know I tend to be more prone to making mistakes when I lose my cool.

Thank you for sharing your thoughts!

Comment removed
Dec 29 (edited)

Thank you.

I think this data requires very special skills to draw the right deductions from it. Correct statistical analysis skills are essential, but first of all you need programming skills and software to check and process the raw dataset: it has 4 million rows (Excel has a limit of about 1 million rows, so it doesn't work in Excel), so you need to be able to use SQL, R, etc. to inspect the data quality and then process the data into summarised forms (tables and graphs).

For example, my understanding is that Prof. Norman Fenton does not have the programming skills, so he relied on the outputs provided by Steve and also on Steve's word that the raw data had all been checked and was of good quality. But Steve then told everyone that Prof. Fenton is the best statistician in the world and that he'd checked and agreed with Steve's analyses and conclusions (this reminds me of the circular-logic fallacy). My understanding is that Steve then told Barry that it all checked out, and so Barry went public.

Please correct me if I am wrong on any of these points, thank you.

Ref: Prof Fenton’s substack “The New Zealand vaccine data: what I actually saw and analysed and what the limitations are”

https://wherearethenumbers.substack.com/p/the-new-zealand-vaccine-data-what

P.S. I am not sure if it's sabotage yet; I will wait for more evidence. I have followed Steve for 2.5 years, and he has been in fighting mode the whole time. Being in this mode (and for so long) can make people more prone to making mistakes. Many of Steve's followers also keep encouraging him to fight.

Comment removed
Dec 30

Did you get jabbed?

Comment removed
Dec 31

Thank you and Happy New Year.

The company I worked for had two separate departments, Data Processing and Analysts; I was in the latter as a junior. I saw my colleagues competing to produce the coolest charts, while I'd spend time sitting with the Data people to understand what they had done to "my" data. Sometimes I'd discover significant issues, so I'd ask them to please adjust and resend. This takes time, so I was not as efficient as my colleagues. Also, the data is often fine, so my double-checking was "a waste of time", and it also ate into the time available for producing fancy outputs. In the end, fancy charts impress the clients more and earn more for the company, so... the Analysts have to just trust the Data people. The problem is that the Data team doesn't know what or how the Analysts plan to use the data they processed, so misunderstandings and mistakes may arise. IMHO this is more of a system issue than an individual issue (or maybe both, but the pressure from the system is probably the bigger driver).
