350 Hours of Listening – (results so far)

The last of my Kanji. The first page is a list of all the Kanji I’d “missed” after rechecking a few times.

Hey guys. So I’m actually finished with the first leg of my Kanji journey, that is, the 2,136 Kanji plus or minus maybe 20 Kanji that I added along the way. Because there are so many Kanji to learn, even after I scanned through my lists several times to see if I missed any, about 50 (yes, fifty!) “slipped through the cracks”, so I’ve memorized those now and will be focusing my efforts solidly on vocabulary acquisition for learned Kanji while slowly acquiring more Kanji to get to a total of 3,000 (which I’ll break down in another post). During this learning process I cannot emphasize enough to future learners that tracking what you are doing is ridiculously essential. It lets you know where you are, keeps you accountable, and keeps you solidly on the path you’ve chosen to take.


Okay, on to today’s topic.

All of today’s post relates to what I call the Golden Number Theory, which concerns listening.

Before we get into that theory, let me remind you of the idea of what we call large data. Learning a language (though often not classified as such) is one big, giant exercise in memorization. Your goal is to be able to read, recognize, recall and reproduce hundreds (or thousands) of words and grammar patterns while also memorizing how these elements sound in spoken speech (whether live or in media), which I call sonic recognition.

Sonic Recognition

There is a difference between saying the single word “Apple” and the sentence “What kind of apple is that?” Due to the nature of phonetics and what is called prosody (the patterns of stress and intonation in a language), words sound different when mixed with other words. Sometimes this difference is stark for a brand new learner of a language. This is why “native speech” is the ultimate test of comprehension, because in addition to varying types of prosody, the speed with which these compressions and intonations happen will be (initially) too fast for your mind to track because you have not yet memorized these sonic shapes, compressions and intonations.

Take the sentence “What kind of apple is that?” When I say it, it sounds more like “wha kind oh apul is that?” This type of pronunciation is consistent with the compressions and intonations I’ve used my entire life. I am also able to recognize many varieties of these compressions and intonations from other speakers (because our base alphabet is the same). This is why even though Australians sound different from Irish who sound different from Americans who sound different from Jamaicans, we can for the most part all understand one another when speaking with reasonable clarity in English. Sonic recognition is something you must train through listening extensively, as it is the only way to learn these shapes in real time. It’s the difference between Jamaicans saying “water” like “wata”, dropping the ‘r’, and Americans saying “wahdur”, dropping the ‘t’. Such phenomena easily explain why many new speakers of a language experience extreme frustration when thrust into speaking: they are ill-armed to process native speech and unaware of their limitations. Only through high exposure, for hundreds of hours, will the brain begin to master the recognition of these phonetics.

So because our task involves large data, we must understand that true mastery requires more work than we think we need to do. But it doesn’t have to be “unusually hard work”. It is more work, in terms of raw time and input, than one might assume, but it tends to fit within certain parameters for all learners. By comparison, studying words and learning Kanji takes far more effort than passively listening to audio in the beginning. But it also takes quite a bit of effort eventually to actively try and process speech. Not to fear, we have a huge advantage in terms of listening. We are designed to casually listen to things and absorb a large sum of that information subliminally. This means we can get a lot of value from the activity without having to pay direct attention to what we are doing at all times.

This leads me back to the Golden Number Theory. 

The Golden Number, as I see it, is a base number of listening hours that overlaps with study and greatly unlocks your ability to process the language. Notice I didn’t say “shift”, or “increase”. I used “greatly” on purpose. When you hit this number, your shift in comprehension is great, or even extreme, like a quantum leap. One situation led me to this theory initially, when I was learning German.

The German Observation

I was at a point in learning German (in 2019, around the 4.5 month mark) where I said: “this is completely pointless, absolutely impossible, work relegated for mad geniuses and it’s just TOO MUCH!” I said this to myself after trying TONS of methodologies to kickstart my listening and overall comprehension after learning 4,500 words over a five week period. After this “frustration point”, about two weeks later, I had a monstrous and (at the time) unexplained leap in my overall comprehension. Out of the blue, I noticed I was able to listen to German teachers speak about learning German, in German, when I could not only two weeks before. I was also able to watch quite a number of the speeches in a TEDx-type series called Gedankentanken and follow along with what the presenters were saying, when only two weeks before, I couldn’t. I even experienced my first realtime “laugh along moment”, where a speaker said something funny and I laughed at the same time the crowd did. It was mind blowing and I scratched my head asking, “WTF happened?”

In retrospect I believe I’d hit the lower bracket of the Golden Number for listening, which I theorize automatically gives you a massive jump in comprehension, by say, 30-50%. Thinking of it as a visual scene, one goes from “noisy subway” to “a slightly busy street”. Bernd Sebastian Kamps, author of The Word Brain, pretty much confirmed my theory when I happened upon his free ebook recently. Reading his work was like reading my own research (which finally made me feel I wasn’t crazy lol). In terms of listening, he says that 1000 hours of listening is fundamental to advanced command of your target language. Many learners say this number is somewhere between 350, 500 and 1000 depending on the language. To add to this statement, what I say relative to research is:

1000 hours of listening is fundamental to advanced command of your target language as you build other foundations through simultaneous progressive overlap in a certain way, manner and time.

These activities, which are done in this certain way, manner and time period, stimulate a factor which I call Brain Power X.

We all have the ability to learn any language built into us from birth. Brain Power X is the software that does that in a staggered manner over about four years (which is the average time it takes a young child to gain fluency in a language pretty much anywhere on earth). I personally believe that Brain Power X relies on certain types of measurable data, for example, listening. But not just “raw listening”, which is listening without any context, but what I see as “adaptive listening”, which is a lot of listening that is being done while acquiring other data about the language in a specific way. This creates what one can call “progressive overlap” and the pieces start joining together. But what I theorized and observed in realtime, is that not only do I think Brain Power X is real, but you can ‘trigger’ it with data input in the correct way in a predictable period of time. 

What is Brain Power X?

I see it as the brain’s ability to take incredible amounts of data and start to organize it into comprehensible information. I note later in this post that, just like with German, in Japanese some parts of grammar (that I have never learned) are now starting to become “understood” without my knowing why. I do not mean 100% comprehension. I just mean chunks of conversations you just “get” without knowing what was really said or how to say it yourself. You just understand “what” was said. It is Brain Power X that is the basis for a lot of what I’m doing now, where I am able to predict a certain window of time (with certain activity) that triggers BPX. Once BPX starts activating, certain things start happening. But let me go deeper into my listening so far relative to BPX.

Current Listening Journey

I’ve clocked 350 hours of listening so far, which really should have been 500+ hours, but last month was really testing (I’ve been having health challenges and some days couldn’t muster up the strength for heavy immersion). What’s fascinating to me is that the process, at its current stage, feels exactly as it did when I was learning German. This is made all the more familiar since I’m watching similar movies and media (albeit in Japanese), but the way I feel inside is similar because the same things have begun to happen in the same way. I understand far more, things have sharpened and there is this sense of being “two steps behind”, where you are almost at the point of comprehension but not fully (a phenomenon which lets you know you are getting damn close to a monstrous breakthrough).

Let’s break down the time by numbers relative to what I’ve observed as ability. This is non-scientific, just my personal observations, broken up like this.

0-100 hours – basically zero comprehension. Can hear words and some phrases but comprehension is very low and cannot follow all native speech prosody.

100-200 hours – words and patterns become much sharper, words and phrases start to “pop out”, previously watched media becomes more comprehensible when rewatched, the brain gets more comfortable with higher load.

200-300 hours – sentences begin to make sense in very small chunks. Hundreds of common words become obvious (even if their meaning is unknown). Lots of small phrases get memorized. Hundreds of “micro breakthroughs”, “micro recognitions” and “micro sharpening” events start to happen. Speech prosody sharpens more. A good example of this is in Star Trek TNG, which I’ve been watching. It didn’t take 300 hours, but I remember when I got very comfortable with the individual voices of the Japanese voice actors. It became easy to distinguish (even if I wasn’t watching directly) the voices of Captain Picard, Worf, Data and all the other cast. If a new person was on the episode, I would also hear their voice and know it was not a common cast member.

300-400 hours (where I am presently) – chains of common sentences become fully comprehensible, really fast speech prosody becomes much easier to track (that is, you can hear almost every word of bullet speech, breathy talking, or heavy native prosody). During this phase some grammar ‘self-organizes’ and is comprehensible (without learning the grammar patterns) and a lot of compressions and intonations that you didn’t catch before start to get very, very sharp. Previously, with German, I had reached this stage with a vocabulary of over 4,000 words. I will say more about this stage, German and Japanese in the paragraphs that follow.
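
As a rough illustration, the brackets above can be written down as a small lookup, together with the hours left before the 1000-hour mark. This is only a sketch: the labels are shortened paraphrases of the list, treating each bracket as including its lower bound and excluding its upper bound is an assumption (the list overlaps at 100, 200 and 300), and the function names and the 5-hours-a-day pace in the usage lines are made up for illustration.

```python
import math

# Hour brackets from the list above; labels are shortened paraphrases.
# Assumption: lower bound inclusive, upper bound exclusive.
STAGES = [
    (0, 100, "basically zero comprehension"),
    (100, 200, "words and phrases start to pop out"),
    (200, 300, "sentences make sense in very small chunks"),
    (300, 400, "chains of common sentences become fully comprehensible"),
]

TARGET_HOURS = 1000  # the 1000-hour figure quoted earlier in the post


def stage_for(hours):
    """Return the bracket label for a listening total, or None if it is outside the listed brackets."""
    for low, high, label in STAGES:
        if low <= hours < high:
            return label
    return None


def hours_remaining(hours, target=TARGET_HOURS):
    """Hours still needed to reach the target (never negative)."""
    return max(target - hours, 0)


def days_to_target(hours, hours_per_day, target=TARGET_HOURS):
    """Whole days needed at a given (hypothetical) daily listening pace."""
    return math.ceil(hours_remaining(hours, target) / hours_per_day)


print(stage_for(350))          # chains of common sentences become fully comprehensible
print(hours_remaining(350))    # 650
print(days_to_target(350, 5))  # 130, at a hypothetical 5 hours a day
```

Nothing past 400 hours is listed because I have not observed those stages in Japanese yet, so the lookup simply returns nothing there.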

When I was learning German, I spent five weeks learning maybe 4,500 words and had what I called a “comprehension explosion”. When I started listening to media at that point, I could hear almost every word that was said. I didn’t understand everything, but I could hear a lot of it quite clearly, when it was previously noise. Soon afterwards (maybe a month), I didn’t have much issue watching TV shows like Dark on Netflix and following along. I watched all 3 seasons in German (with German subs). The current difference with Japanese is that in order to focus heavily on effectively learning thousands of words, I needed to learn (first) the 2,136 Jouyou Kanji from the official government list. So I’m currently approaching phase II, the vocabulary phase, but experience-wise my listening ability feels quite similar to how it did with German. The main thing that happens first is basic speech becomes near 100% comprehensible. Chit-chat, small talk, those kinds of things become very easy to follow. Funny voices and weird accents start to organize and are a bit easier to follow. The data is adding up and the brain is doing its thing.

You see, immersion takes a LOT of discipline, and it is important to listen to stuff that you can watch over and over, because if you choose the wrong media you will get bored and your brain will do its best to frustrate you and kill your journey. Quite a few times I’ve found it difficult to figure out “what to watch”, even if it’s playing in the background, because I’m tired. But one of my rules is that all activity towards the goal benefits it, and time wasted figuring out what to watch is time wasted NOT watching/listening to something.

I’m not “sitting and watching” all this media. It plays picture-in-picture, plays in the background, or I rewatch movies I know the lines for by heart. It can be “fun”, but because I know I’m aiming for 1000 hours and STILL have thousands of words to learn, it is challenging. BUUUUUUT… because of Brain Power X, this ability within all of us, inputting all this data while learning other things like words, grammar etc. triggers some software in your brain to self-organize. My listening schedule looks like this:

My choices are based on things that I’ve previously watched that are enjoyable, easy to follow, I know the storylines and have tons of seasons to guarantee repetitive themes, words and contexts.

Star Trek: The Next Generation (Five seasons watched)

Buttload of Native TV Shows and Game Shows on my iPad

Podcasts (maybe 30 @2 hrs each)

Star Trek: Voyager (Five seasons watched)

Movies: Maybe 20 so far. Total 350 hours. 

So because I’m watching Star Trek, words like “space”, “planet”, “captain”, “subspace”, “engineering”, “ship”, “transport”, “ensign”, “lieutenant”, “sick”, “go”, “fire”, “order”, “enemy”, “peace”, “alien”, “earth”, “message”, “sound”, “reply”, “federation” and “representative” are said almost ten times an episode. There are also hundreds of other common words being said that I have an awareness of, which means the brain, having stored all this sonic data from hundreds of hours, will begin to quickly associate newly learned stuff with what it has already memorized to a degree. This is why a “comprehension explosion” happens and also, as I am theorizing, why we can predict this within a certain time span, with a certain type of active overlap.

I was watching Transformers yesterday and this guy said “俺は文明人ではない?” ore wa bunmeijin dewanai? The subtitle said “we aren’t animals are we?”, but I know “animals” is not bunmeijin (文明人), so I looked it up and it was “civilized person”. So he was basically saying, “we’re civilized aren’t we?”. Stuff like this starts to happen constantly and really fast (in fact it’s been happening far earlier, I just haven’t been completely focused on vocabulary so haven’t had the time to pay attention to all the new words, as it would eat into Kanji time). So once I looked up this new word, I had it basically memorized because the context is so strong. In fact, most of the time I can hear them say something different from the subs (which is a good thing). The point is, this is Brain Power X at work. This is when it starts to do its thing, when it gets enough data. It’s transforming things like Optimus Prime.

I’ve probably learned about 1000 words without much effort, and I probably knew 200-300 before, though not well. Now that my primary focus is going to be vocabulary, I expect a comprehension explosion in about four weeks: about a 30% jump in total comprehension, where I start to get very comfortable with understanding longer and longer sections of what I’m listening to. The reason why I said this is so familiar is because this happened very quickly with German. I didn’t have access to any German films and I was watching tons of bootleg streams, which I had to watch without subtitles. So I was forced to really pay attention, and to my surprise, with certain films I could follow along very well. But I also raged at content I could barely understand, because I didn’t understand that BPX takes some time. As you start to be able to process the language, it doesn’t mean you are used to all the parameters yet. A person learning English in America for the first time will struggle to understand it, and probably have a heart attack when they try to understand Australians and Irish people. Even though we use the same base phonetics, our prosody varies with territory and the brain must be familiar with these variations to properly recognize it. These variations are just ‘more data’. I am not learning Japanese prosody from Star Trek alone. I spent the better part of my early journey listening only to Japanese podcasts and TV shows before getting stressed and running to VPNville so I could get more content. This meant my “training” was based on native prosody, which actually helps with TV voices and video game voices, which are clearer. Jumping between the two gives you a nice range of phonetic variation, so I can hear people make jokes on a game show or talk ridonculously fast with a funky accent, but I am not “lost”. Meaning I can hear what they are saying, and know that as I improve my vocabulary, I will eventually be able to understand them.

Being “lost”, by this definition, means I cannot hear what they are saying (prosody) and I cannot understand what they are communicating (meaning). One step logically follows the other, and knowing that I can hear every word a native speaker is saying while speaking quite fast means one step in the logical chain is solved until I get more ‘data’ (vocabulary) to fill in the gaps. So I see “not understanding” in this context as a fabulous achievement. It is much better than what I call “complete noise”, where you are just lost.

Phonetic variation can only be learned through experimental interactions through media (or in real life). A man with a very deep voice will sound different from a woman with a high voice even though they are using the same phonetics. People have different accents, talk at different speeds and so on.

What this means for Japanese

Brain Power X is already kicking in and I haven’t started “hardcore vocabulary” yet. I still have some revising for the Kanji before I start pushing there. Vocabulary study inevitably overlaps with a ton of grammar, so at the same time I learn words I can work through all the sections of N5, N4, N3 and N2 (my goal) over the next 8 weeks. Since everything “feels the same”, the result should be the same. Just like German, the same things are happening in the same way, the same insights, the same observations, the same relative shifts in comprehension and so on. Even when people are speaking goddamn fast it just sounds like someone speaking fast, but I can hear what they are saying, I just don’t know all the words yet.

So lastly, I’m documenting this because this stage is actually very exciting. It means that my brain is starting to get enough data to do its thing. Sure, I might have another six weeks of solid work ahead, but once BPX activates, you make such a giant leap that it’s almost as if you can’t remember where you were. I keep emphasizing this because the leap happens not only in comprehension, but also in speaking ability. I found that after I was able to understand native speakers far better, without any practice I was able to speak far better as well. It nearly blew my mind.

This confirmed to me that I had activated “Brain Power X” and I wanted to ensure what I’d learned was not a fluke. Not only is it not a fluke, but it is repeatable, which means this process can be done with any language, given the right set of circumstances. I’ve actually already realized what’s going to happen, as if I’m looking into the future. My mind is so primed that I’m ready to tackle French again, because with my present knowledge I now believe I should be able to take myself to an incredibly advanced level in about 8 weeks tops. I mean incredibly advanced based on all my data and research and certain “hardcore training tactics” that trigger BPX.

I’ll dive more into BPX later as I make more personal gains. But if you can choose a set of activities that guarantee the activation of Brain Power X in a certain time, then you have guaranteed the prediction of either fluency, or a certain ability within a certain time. 

So I can tell you definitively that you can learn 2,136 Kanji in 8-10 weeks and I can show you how. I am also trying to definitively tell you (with hard data) when to expect these shifts in your mind and comprehension explosions. Doing so would allow you to map your own path and know where you are at each stage. Since I believe I can make this work with a Romance language in just 8 weeks, a dedicated learner wouldn’t need more than 2 months to become really advanced in, say, French or another Romance language, or in German. However, this is just a projection; it might be closer to 90 days, which is still incredibly short. But this would be a 90 day window of guaranteed ability with duplicatable steps.

If I can help someone to fluency in that predicted window (regardless of where they are) then I would really have achieved what I’ve been puzzling over for a decade. We all have our obsessions and it is interesting that my dots have started to connect in such a way. Theory and implementation on this scale place incredible demands on time and energy, but that makes victory even sweeter!

Incredible work lies ahead, but I press on!

Welcome to the world of mad scientist Marcus Bird!




About marcusbird

Writer, Designer, Filmmaker