Revisiting the Badge Number - Production Number Correlation Problem

tinyman392 · Feb 9, 2020

So it's been stated that the Civic Type R badge numbers seem to be assigned to cars randomly at the factory. This post isn't meant to challenge that, as it appears to be true based on the data gathered. However, there is a statement floating around that the badge numbers have no correlation with production number or year. This is the part I felt was completely false. The data below shows that the badge numbers do correlate with production year, and that it is certainly possible to predict production year given a badge number. So while it is impossible to predict the next badge number for production (they are chosen at random), or to predict a badge number given a production number, it certainly is possible to correlate badge number with year and even predict the year with a good amount of accuracy.

The above paragraph is basically your tl;dr for this incredibly long post.

Background

I joined the CTR forums around March of 2018, and noticed a few things as people started posting badge numbers. The first is that all the earlier badge numbers posted (mine included) seemed to be mainly < 10k, with a few higher than 10k. Eventually, we began seeing numbers higher than 20k, and those numbers began coming in droves shortly after the first one appeared. Then eventually 30k, and again, they came in full force. We're starting to hit 40k now, and those seem to be trickling in pretty quickly as well. Also of note is that when we started seeing 20k numbers pop up, numbers < 20k kind of stopped showing up (or slowly began rolling off). The same thing happened when we broke the 30k barrier.

This sort of pattern kind of screams correlation to my eyes. But every post about the badge numbers seems to state that they are completely random and have zero correlation. That isn't the pattern I saw above. At that time it was only 2018, so data was scarce: I could only see some 2017 data and a little 2018 data. Well, 2019 eventually came around, and now we have some data for that as well and a good amount of 2018 data too!

Finding Correlation

I took all of the badge numbers from fk8registry.com and copied them into a file. fk8registry.com offered the data as a 3-column table containing badge number, color, and year. There are a total of 1023, 1215, and 398 entries for years 2017, 2018, and 2019, respectively (at time of writing). I went ahead and computed the following for each year:

The average badge number
The standard deviation amongst the badge numbers
The 95% confidence interval for the badge numbers
The minimum badge number
The maximum badge number
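For anyone who wants to reproduce these per-year numbers, here is a minimal sketch using only the Python standard library. The rows are made-up placeholders in the same 3-column layout, not registry data, and the 95% interval is assumed to be the usual normal-approximation interval on the mean (1.96 standard errors), which may not be exactly how the figures in this post were computed.

```python
import math
import statistics
from collections import defaultdict

# Hypothetical rows in the 3-column layout (badge number, color, year).
# These values are placeholders for illustration, not real registry entries.
rows = [
    (100, "white", 2017),
    (200, "red", 2017),
    (300, "black", 2017),
    (9000, "blue", 2018),
    (11000, "grey", 2018),
]


def summarize_by_year(rows):
    """Return mean, stdev, 95% CI half-width, min and max badge per year."""
    by_year = defaultdict(list)
    for badge, _color, year in rows:
        by_year[year].append(badge)

    summary = {}
    for year, badges in sorted(by_year.items()):
        mean = statistics.mean(badges)
        # Sample standard deviation needs at least two values.
        stdev = statistics.stdev(badges) if len(badges) > 1 else 0.0
        # Assumption: normal-approximation 95% interval on the mean.
        ci_half = 1.96 * stdev / math.sqrt(len(badges))
        summary[year] = {
            "n": len(badges),
            "mean": mean,
            "stdev": stdev,
            "ci95": (mean - ci_half, mean + ci_half),
            "min": min(badges),
            "max": max(badges),
        }
    return summary


summary = summarize_by_year(rows)
for year, stats in summary.items():
    print(year, stats)
```

With the real table, the only change would be loading the rows from the copied file instead of the inline list.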

[Plot: average badge number per production year, with standard deviation and minimum/maximum badge numbers]

The plot above shows the results of the analysis pretty clearly. The larger, colored dots represent the average badge number for each production year of CTR. The grey line is the standard deviation amongst the badge numbers. The smaller grey dots represent the minimum and maximum badge numbers for each year. It's pretty clear that the badge numbers increase with production year, and thus there is a correlation. The same trend is visible in the maximum badge numbers. The 95% confidence intervals are not shown on the plot because they wouldn't be visible (they range between 130 and 339). Suffice to say, they are that tight.

Predicting Model Year from Badge Number

The dataset from fk8registry contains 2636 FK8 badge numbers at time of writing. When this many datapoints are available, it may be possible to build a model to predict stuff. In this case, stuff = production year. More specifically, I took the badge numbers for each individual production year and split them at a 1:4 ratio into testing and training sets, so 80% of the data would reside in the training set while 20% would reside in the testing set. I could then use scikit-learn to build a decision tree model (default parameters, didn't need to tune a thing) to predict production year given only the badge number.
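A rough sketch of that workflow is below, assuming scikit-learn is installed. The badge numbers are synthetic (drawn from made-up, overlapping ranges per year), so the accuracy it prints says nothing about the real registry data. The helper names `make_synthetic_badges` and `train_and_score` are hypothetical, and the stratified split is one way to get an 80/20 split within each year.

```python
import random

from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_badges(seed=0):
    """Build fake (badge, year) data; the ranges and counts are invented."""
    rng = random.Random(seed)
    # year -> (lowest badge, highest badge, number of samples); all assumed.
    spec = {
        2017: (1, 12_000, 300),
        2018: (8_000, 26_000, 300),
        2019: (20_000, 36_000, 150),
    }
    X, y = [], []
    for year, (low, high, n) in spec.items():
        for _ in range(n):
            X.append([rng.randint(low, high)])  # single feature: badge number
            y.append(year)
    return X, y


def train_and_score(X, y, seed=0):
    """80/20 split stratified by year, then a default decision tree."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    model = DecisionTreeClassifier(random_state=seed)
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)  # exact-year accuracy on held-out data
    return model, accuracy, len(y_test)


X, y = make_synthetic_badges()
model, accuracy, n_test = train_and_score(X, y)
print(f"held-out samples: {n_test}, exact-year accuracy: {accuracy:.2f}")
print("predicted year for badge 500:", model.predict([[500]])[0])
```

Stratifying on the year keeps the 2017/2018/2019 proportions the same in both sets, which matters when one year has far fewer entries than the others.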

Using the testing set (a blind set not used during training), I took a look at how successful the model was at predicting the exact production year (2017, 2018, or 2019) of the vehicle with a given badge number, and at predicting within 1 year (2017/2018 or 2018/2019). It was 95% successful at predicting the exact production year (guessing most prevalent year would be 46%) and 100% successful at predicting within a year (guessing most prevalent 2 years would be 84%). This result shows that it is possible to predict production year given the badge number.
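The two guessing baselines follow directly from the entry counts given earlier (1023, 1215, and 398). The short check below reproduces them; it assumes the baselines are measured on the full set of 2636 entries rather than on the 20% test split, which should be nearly identical if the split preserves the per-year proportions.

```python
# Entry counts per production year, as quoted in the post.
counts = {2017: 1023, 2018: 1215, 2019: 398}
total = sum(counts.values())

# Baseline 1: always guess the single most common year.
best_single = max(counts.values()) / total

# Baseline 2: always guess the most common pair of adjacent years.
pairs = {(year, year + 1): counts[year] + counts[year + 1] for year in (2017, 2018)}
best_pair = max(pairs.values()) / total

print(f"total entries: {total}")
print(f"most prevalent year: {best_single:.1%}")
print(f"most prevalent two adjacent years: {best_pair:.1%}")
```

This gives roughly 46% for the single-year guess and 84.9% for the two-year guess, in line with the 46% and 84% quoted above once the latter is rounded down.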

Simulating Badge Numbers

So, how can a set of badge numbers that's drawn from a parts bin produce a strong pattern if they are selected at random? Well, given a random selection of stuff under a set of rules, patterns can, but don't always, appear. Numberphile on YouTube has an excellent video on this subject entitled Chaos Game. It's worth a watch if you've got a few minutes to blow (it is very straightforward, easy to understand, and may blow your mind).

I went ahead and tried to simulate how badge numbers would be placed on CTRs. To simulate badge numbers being selected for artificial cars, I assumed the following were true:

A just-in-time manufacturing system is in place, which results in badges being produced and placed into a parts bin. When parts in the bin run low, more parts are produced
Badge numbers are drawn randomly
Batches of 8000 badges are created
When the parts bin has 800 badges (10% of a batch) remaining, 8000 more are produced
Badges are produced sequentially (badge 233 is made before 234, etc.)
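The rules above can be sketched in a few lines of plain Python. This is not the original simulation code, just one reading of the rules: the function name `simulate_badges`, the fixed seed, and the choice to start numbering at 1 and to check the bin level before each draw are all assumptions.

```python
import random


def simulate_badges(n_vehicles=40_000, batch_size=8_000, refill_at=800, seed=0):
    """Return the badge number given to each vehicle, in production order."""
    rng = random.Random(seed)
    parts_bin = []   # badges currently available at the line
    next_badge = 1   # badges are produced sequentially
    badges = []

    for _ in range(n_vehicles):
        # Produce a new batch whenever the bin is at or below the threshold
        # (this also fills the empty bin before the first vehicle).
        if len(parts_bin) <= refill_at:
            parts_bin.extend(range(next_badge, next_badge + batch_size))
            next_badge += batch_size

        # Draw one badge uniformly at random: swap it to the end, then pop.
        i = rng.randrange(len(parts_bin))
        parts_bin[i], parts_bin[-1] = parts_bin[-1], parts_bin[i]
        badges.append(parts_bin.pop())

    return badges


badges = simulate_badges()
print("first ten badges drawn:", badges[:10])
print("highest badge among the first 7200 vehicles:", max(badges[:7200]))
print("highest badge overall:", max(badges))
```

Under this reading, the first refill happens after 7200 vehicles (8000 minus the 800 left in the bin), so every badge drawn before that point comes from the first batch.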

40000 simulated vehicles were badged and the results are plotted below (actual CTR production counts are supposed to be around 35k at time of writing).

[Plot: simulated badge number (Y-axis) vs. production number (X-axis) for the simulated vehicles]

The X-axis shows the production number of a simulated vehicle while the Y-axis shows its badge number. Note how the badge numbers of the first 8000 vehicles are indeed uniformly random. However, after the parts bin is refilled, there is a relatively uniform random selection from the 8000-16000 range, with a small number of selections from the first batch (0-8000). With each new batch being produced, the badges from the previous batches are selected at a much lower rate: the older the batch, the lower its chance of being chosen.

The Pearson correlation coefficient was computed between the badge numbers and their production numbers, giving a PCC = 0.94 with a 2-tailed p-value = 0.0. This basically shows that in a badging system similar to what the CTR most likely goes through, the badge numbers are indeed heavily correlated with production number despite being chosen at random from a parts bin. Additionally, the simulation supports the idea that as the model years increase, the badge numbers are expected to as well. Although it is impossible to predict the exact production number given a badge number, it's very possible to get a general idea of production year from badge number.

Granted, the rules above may not be exactly what CTR production does (batch sizes can technically vary, and the point at which the bin is replenished may be different). You'll still get the very dense "rectangles" as each new batch of badges is made, and a Pearson correlation coefficient > 0.5 (showing correlation). The earlier the parts bin is replenished, the lower the PCC.
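To see how the refill point affects the correlation, the sketch below computes the Pearson coefficient by hand for two refill thresholds. It is a self-contained re-implementation under the same assumed rules, not the original code; the second threshold of 4000 is an arbitrary value picked for illustration, and exact coefficients will vary with the random seed.

```python
import math
import random


def simulate_badges(n_vehicles, batch_size, refill_at, seed=0):
    """Badge number per vehicle, drawn at random from a batch-refilled bin."""
    rng = random.Random(seed)
    parts_bin, next_badge, badges = [], 1, []
    for _ in range(n_vehicles):
        if len(parts_bin) <= refill_at:
            parts_bin.extend(range(next_badge, next_badge + batch_size))
            next_badge += batch_size
        i = rng.randrange(len(parts_bin))
        parts_bin[i], parts_bin[-1] = parts_bin[-1], parts_bin[i]
        badges.append(parts_bin.pop())
    return badges


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


n_vehicles = 40_000
production_numbers = list(range(1, n_vehicles + 1))

# Refill when 800 badges remain (the rule used in the post).
r_late = pearson(production_numbers, simulate_badges(n_vehicles, 8_000, 800))
# Refill earlier, when 4000 remain (hypothetical value for comparison).
r_early = pearson(production_numbers, simulate_badges(n_vehicles, 8_000, 4_000))

print(f"refill at  800 remaining: r = {r_late:.3f}")
print(f"refill at 4000 remaining: r = {r_early:.3f}")
```

Refilling earlier leaves more old badges mixed in with each new batch, so low numbers keep showing up on later cars and the coefficient drops.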

RedGiant217 · Feb 9, 2020

I agree with this. They may be randomly pulled from a parts bin, but the badges are not randomly produced.

jetydosa · Feb 9, 2020

There is definitely a correlation between badge # and VIN#. The higher the badge #, the higher the VIN # - GENERALLY.

I was tracking this via Excel spreadsheet for a while - every CTR VIN I could find via dealer websites that also had pics of the badge #, I put in a spreadsheet. Then I realized I needed to get a life.....LOL

lturner · Feb 9, 2020

jetydosa said:
There is definitely a correlation between badge # and VIN#. The higher the badge #, the higher the VIN # - GENERALLY.

I was tracking this via Excel spreadsheet for a while - every CTR VIN I could find via dealer websites that also had pics of the badge #, I put in a spreadsheet. Then I realized I needed to get a life.....LOL

HA! Way too many to keep track of, unlike the ITR.

WindJunkie · Feb 10, 2020

Thanks @tinyman392, great post

There is someone on this forum who always chimes in and states badge numbers are arbitrary and completely inaccurate (I'm not naming who this is, but I think most of the regular CivicX forum members know who this is).

Just like tinyman, I did my own research and found that the manufactured date on the inside of the driver door panel 100% correlates to the badge number, which speaks to tinyman's research. For example, my car was manufactured January 2019, badge number is 24,174. My brother's CTR was manufactured March 2019, and has a badge number of 25,5xx.

This data from tinyman and my method of matching badge numbers to manufacture date is ample evidence for me to believe that the badge number, while it may not be precise, is pretty damn close to the actual number produced (worldwide, since inception).

Gfrank · Jun 9, 2020

Do you think early production cars are worth more? Mine is a 2017, number 1058.

HRace · Jun 9, 2020

Gfrank said:
Do you think early production cars are worth more? Mine is a 2017, number 1058.

I can’t say with certainty, but when selling a vehicle, I imagine that outside of badge number 00001, vehicle condition will ultimately determine the value of the car more than badge number, unless a buyer has some strong demand for the exact number you have. Generally, if all the items (of any product) were the same, the lower production numbers might have increased value. But there are simply too many Type Rs, the specs changed a few times with the minor model updates, and once you include vehicle condition, it’s hard to imagine the badge number dictating value. Just my $0.02, I could be way wrong.

Gfrank · Jun 9, 2020

Thought I would ask.. This forum is awesome with a lot of very useful information. Thinking of trading for a 2020. Just a thought maybe my lower number was worth more on trade. Hard decision no one knows how they drive in comparison..

HRace · Jun 9, 2020

Wish I could help more. I have driven both, but unfortunately one year apart. I can tell you that I feel they are very close. I’d take either, any day, but would choose the 2020; not due to driving dynamic changes, I just like the few extra bits they changed and added. Otherwise, if my only way to get a Type R was to own a 17-19, I would never pass that up. It depends on the amount you’d owe and whether it’s worth getting a 2020 vs adding whatever you want from the 2020 car (Honda Sensing obviously should be considered impossible to add). I doubt a dealer would give you anything extra for your low badge number, and I think a private party buyer would also just cross-shop vehicle condition and price, likely ignoring badge number in the final decision.

Dom9lives · Jun 9, 2020

As long as there's no duplicates who the hell cares

Harlaquin · Jun 9, 2020

I think y'all have entirely too much time on your hands. At the end of the day, unless it's number 1 or the last one made, no one cares, and it affects the value exactly zero. My car was number 141, a fairly low number. Of the 27 dealers I took it to looking for trade value, exactly zero gave a shit about the badge number. Those badges are marketing, plain and simple, that's all.

Gfrank · Jun 9, 2020

Yes.......I understand that now.. This forum is great....Thanks for the information

b2point0h · Jun 9, 2020

Excellent work @tinyman392!!

Your data analysis speaking of standard deviation and Confidence level reminds me of my work in avionics and predicting satellite beam Tx and Rx values

tinyman392 · Jun 9, 2020

Gfrank said:
Do you think early production cars are worth more? Mine is a 2017, number 1058.

Honestly, probably not. There is a lot of production of the Type R, so unless you have number 0 or number 1, ours probably aren’t that special.

2020s may have lowered our value as well (though some feel it increased it). That said, enjoy the car, it’s a ton of fun!

Edit: ours, not just yours
