Originally written on Nov. 11, 2017.
Last updated Dec. 28, 2017.
Caveat emptor: I'm just a learner of Japanese. Please contact me with corrections, questions, and comments!
All external links go to Wikipedia pages for more reading unless it's a link to a document or YouTube video.
On Japanese-language learning forums, questions often pop up about Japanese 'accent' or 'pitch-accent'. The Japanese pitch-accent system isn't taught in classes and learners go on to Japan and seem to be understood well enough without it, leading many learners to assume that it's not important.
This guide serves as a high-level overview for learners about the Japanese pitch-accent system and how it relates to English stress and Chinese tone.
Before we begin, let's understand the difference between phonetics and phonology--at a very basic level. When we talk about phonology, we're talking about properties of a speech system (something abstract). When we talk about phonetics, we're talking about pronunciation (something concrete).
In English, there is the word ''bottle''. The word is roughly encoded in a speaker's brain as the sequence of sounds [b a t l]--three consonants and a vowel. This sequence of sounds is part of the phonology of the word. Anything dealing with phonology only exists in an abstract sense.
Now in this abstract encoding of "bottle", one of the (abstract) sounds is [t]. When that [t] is pronounced, there are many different ways it could be pronounced. In my dialect (Midwestern American English), it's generally what is called a 'flap'–a sound made by throwing the tongue up against the roof of the mouth, which is a very different sound from the [t] in 'talk'. We could represent it with this symbol: 'ɾ'. A [t] can also be pronounced with a ''proper'' 't', made by pressing the tongue behind the teeth. It can also be pronounced with a glottal stop 'ʔ'--the same sound as the burpy pop sound in 'uh-oh'.
So 'botl', 'boɾl', and 'boʔl' are all acceptable English pronunciations of the word ''bottle''. In other words: these variations are all acceptable phonetic variants of the word ''bottle''.
Does that make sense? When we talk about phonology, we're talking about abstract properties. When we talk about phonetics, we're talking about physical properties.
Words are produced in a certain way because of their abstract properties but there is often phonetic variation that is not differentiated at the abstract level. (As an aside, this phonetic variation often is socially-indexed. It could be used to signal information about gender, socio-economic status, or dialect group.)
We can't talk about pitch accents without talking about pitch. Pitch is the fundamental frequency or F0 of the vibration of the vocal folds. We have fine-tuned automatic control over the pitch of our voice. It generally falls in the range of 50 to 500 Hz.
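To make "fundamental frequency" a little more concrete, here is a minimal sketch of one common way to estimate F0 from a signal: find the time lag at which the signal best matches a shifted copy of itself (autocorrelation). The function name `estimate_f0`, the 8000 Hz sample rate, and the synthetic 200 Hz sine wave standing in for a voice are all assumptions made for illustration; real speech is much messier and real pitch trackers do a lot more work.

```python
import math

def estimate_f0(samples, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate F0 in Hz by picking the lag with the highest autocorrelation.

    Only lags matching the 50-500 Hz range mentioned above are searched.
    """
    min_lag = int(sample_rate / f_max)  # shortest period considered
    max_lag = int(sample_rate / f_min)  # longest period considered
    max_lag = min(max_lag, len(samples) - 1)

    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Multiply the signal by a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None:
        raise ValueError("signal is too short for the requested pitch range")
    return sample_rate / best_lag

# Hypothetical test signal: 50 ms of a pure 200 Hz tone.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
f0 = estimate_f0(tone, sample_rate)
print(f"estimated F0: {f0:.1f} Hz")
```

The estimate should land close to the 200 Hz that went in. Restricting the search to the 50 to 500 Hz window is a simple way to avoid picking up lags that no human voice would produce.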
Men generally have lower pitch and a lower pitch range than women due to having longer and thicker vocal folds, but there may also be differences in pitch that are culturally driven (e.g. Pépiot (2014) found that French women have higher average pitch than American women).
Pitch is used for lots of things. When you sing, you change the pitch of your voice to hit particular notes. In English, to ask a question, you raise your pitch at the end of a sentence, while to make a statement, you gradually lower and level off your pitch. When excited or angry, you may say a whole sentence with higher-than-normal pitch.
Confusingly, 'pitch accent' is used in two different ways. When talking about word-level phonology, 'pitch accent' refers to a system like Japanese where words or syllables are abstractly marked with an accent. When talking about pronunciation, 'pitch accent' just means 'a rapid movement in pitch', like a slap by the pitch.
Stressed syllables in English generally carry a 'pitch accent' (a pitch movement) but this is different from the pitch accent system in Japanese. So watch out for that.
In terms of both phonology and phonetics, pitch accent in Japanese is not unlike Chinese tone or pitch in English stress--all involve modulating the speaker's pitch based on properties of the word. However, these pitch manipulations manifest differently in each language and are used for different purposes.
In Chinese, each syllable has a ''tone value'' (the number of possible tones depends on the variety of Chinese we're talking about–e.g. 4 for Mandarin). Each tone is a different kind of pitch movement: high flat pitch, falling pitch, rising pitch, low flat pitch, etc.
In English, each syllable has a stress status: stressed or unstressed. Words are arranged in alternating stressed (S) and unstressed (U) syllables like S.U.S.U.S.U. Consider the word ''Apalachicola'', pronounced: A.pa.LA.chi.CO.la where stressed syllables are capitalized and unstressed syllables are in lowercase.
Although multi-syllabic words carry multiple stresses, there is always one syllable that is more stressed than any other--it is said to carry 'primary stress'.
In general, a pitch accent (i.e. a pitch movement) falls on a syllable carrying primary stress (although any stressed syllable is eligible to have a pitch accent).
English pitch accents are peaky. A pitch accent generally causes the speaker's pitch to jump up to a high value and then drop down again.
Side note: English pitch can be hard to hear. English stress also manifests in vowel quality (accented syllables have clearly pronounced vowels while the vowel in 'uhhhh' can always replace the vowel in unaccented syllables: ''A.puh.LA.chuh.CO.luh'') and in duration--stressed syllables are longer than unstressed syllables. You can more easily hear stress if you exaggerate the word as if surprised--this also is a way to clearly see the one syllable with primary stress--marked here in bold. ''A.puh.LA.chuh.COOOO.luh!?!?!?!'' ''CAN.dy!??!'' ''E.le.VA.tor!!??!''
In Japanese, each mora has an accent status (high or low). When we talk about 'status' here, we're talking about a property of the word and not about its pronunciation. Each word can either have one accented mora (H) or no accent (no H). This contrasts with Chinese, where in theory all tones can appear in all/most combinations (I don't speak Chinese), and with English, where multisyllabic words must have at least one accent and can have multiple pitch accents ('university' has stress on both the first syllable and the third, and of the two stress locations one or both may carry a pitch accent).
So in Japanese, words tend to have one high plateau and are low before and after the plateau (more on this later), unlike English which is peaky as discussed in the English section. From my zero experience with Chinese, I would guess that Chinese can't be described as either peaky or plateauy.
In each language, different stress/tone patterns result in different words. In Chinese: ma1, ma2, ma3, ma4: ''mother'', ''hemp'', ''horse'', and ''scold''. In English: ob.JECT (verb) vs OB.ject (noun). In Japanese: hashi, ha'shi, hashi' with the patterns LL, HL, and LH, meaning ''edge'', ''chopsticks'', and ''bridge'' respectively.
In all of these languages, you can generally survive without using tones/accents properly–context helps a lot in disambiguating meaning–but #1 you'll be marked as a foreigner who doesn't speak the language properly and #2 you may encounter communication problems.
Anecdotally: A friend of mine, a non-native English speaker, once asked me if I liked "AH.spur.AH.gus"--it took me a while to figure out that they were asking about "uh.SPAIR.uh.GUS" or "Asparagus". They reversed the stressed and unstressed pattern of the word and my brain couldn't parse it.
There are basically three pronunciation patterns: 1) accent initial words, 2) non-initially accented words, and 3) accentless words. For the first case, the phonetic pattern is HLLLLL, the second pattern is LHHHHHHL, and the third pattern is LHHHHH. As I said earlier: Japanese pitch accents (movements) are plateaus--you can clearly see that here.
To be clear, the use of L and H here in the phonetics section is different than in the phonology section–it is confusing, but you'll see it elsewhere, so I used it here too.
The accent status of a word in Japanese is not just a matter of putting an accent on a given mora (which is, on the other hand, how English stress works). There are many different accent patterns for words in Japanese but there are only three broad differences in how words in Japanese are pronounced.
So in Japanese a word with an initial accent starts high and then drops and stays low. For a word that has a post-initial accent, the word starts low, rises and stays high through the accented mora, and then drops; if the accent is on the last mora, the drop lands on whatever follows the word (such as a particle). For words that are unaccented, they start low and then become high with the pitch staying high until the end of the word.
For a concrete example the word ha'shi (HL) would be /pronounced/ with the pitch contour: HL; and the word hashi' (LH) would be pronounced with the pitch contour: LHL; and the word hashi (LL) would be pronounced with the pattern LH.
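The three pronunciation patterns can be sketched as a small function that turns an abstract accent position into a phonetic H/L contour. This is only a rough illustration, not a full model of Japanese: the function name `pitch_contour`, the convention that accent 0 means "unaccented" and accent k means "accent on the k-th mora", and the optional one-mora particle are all assumptions introduced here. It also assumes the usual textbook description of Tokyo Japanese, in which the pitch drops right after the accented mora.

```python
def pitch_contour(n_morae, accent, with_particle=False):
    """Return a string of H/L marks, one per mora.

    accent == 0 means the word is unaccented; accent == k (1-based)
    means the k-th mora carries the accent. If with_particle is True,
    one extra mora is added for a following particle.
    """
    if n_morae < 1:
        raise ValueError("a word needs at least one mora")
    if not 0 <= accent <= n_morae:
        raise ValueError("accent must be 0 or a mora position in the word")

    total = n_morae + (1 if with_particle else 0)
    contour = []
    for i in range(1, total + 1):
        if accent == 1:
            high = (i == 1)            # initial accent: high, then low
        elif accent == 0:
            high = (i > 1)             # unaccented: low, then high to the end
        else:
            high = (1 < i <= accent)   # low start, high through the accent
        contour.append("H" if high else "L")
    return "".join(contour)

# The three 'hashi' words from the example above.
for label, accent, particle in [("ha'shi", 1, False),
                                ("hashi'", 2, True),
                                ("hashi", 0, False)]:
    print(label, pitch_contour(2, accent, with_particle=particle))
```

With these inputs the function gives HL for ha'shi, LHL for hashi' (the final L sits on the following particle), and LH for unaccented hashi, matching the contours described above.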
There is a dialect of Japanese that doesn't use pitch accent. If I don't use pitch accent, won't it sound like I have their accent?
(I can't take credit for this question and answer, although I provide it here with some minor embellishment)
No. For two reasons.
#1 When you speak Japanese, unless you are trying to mimic the way Japanese people speak, you'll be using your own intonational speech patterns. As an English speaker, I use stress in Japanese words because my English phonology tells me that words have to have stress.
#2 Even if you did somehow correctly speak Japanese with no accent, you're missing all of the other features of the accent-less dialect, so no one will think you learned Japanese in that region. Also, note that the 'accentless' pattern is not accentless. It appears to not have contrastive accent patterns. If it didn't have accents, words would be produced with the pitch contour LLLL, but instead they're LLLH--low and flat except for a final rise.